Title | Δ |
Do I need to free constant memory assigned using cudaMemcpyToSymbol? | +3.28 |
different thread blocks definition | 0.00 |
How CUDA constant memory allocation works? | 0.00 |
Peak Bandwidth for CUDA Surface Memory? | 0.00 |
NVIDIA Visual Profiler, Debug and Release modes in Visual Studio 2010 | -4.47 |
cuda kernels not executing concurrently | 0.00 |
the Kernel delay increase by increasing the blocksPerGrid and threa... | 0.00 |
cuda "invalid argument" error on second kernel | 0.00 |
CUDA Debugging: "No value at target location", I clearly... | 0.00 |
create OpenCL project in Visual Studio 2010 | 0.00 |
OpenGL shader debugging. NVIDIA Parallel nSight? | +3.19 |
Cuda zero-copy performance | 0.00 |
Develop a Cuda DLL working with different Runtime versions | 0.00 |
CUDA measure execution time per gpu core | 0.00 |
CUDA performance improves when running more threads than there are... | +3.46 |
Is register overflowing a possible cause of a CUDA_EXCEPTION_5, War... | 0.00 |
Strategies for timing CUDA Kernels: Pros and Cons? | -4.30 |
Shared memory allocation in CUDA | 0.00 |
Why only one of the warps is executed by a SM in cuda? | +4.82 |
Uncoalesced float2 CUDA kernel | 0.00 |
Nsight 2.2 sometimes works sometimes doesn't | 0.00 |
not able to use printf in cuda kernel function | +4.73 |
Difference on creating a CUDA context | 0.00 |
Disabling TDR for CUDA in Windows 8 | 0.00 |
Is my GTX680 really performing | +3.26 |
CUDA: Passing parameters to host compiler during Nsight session | +3.39 |
Unexpectedly large cmem[2] usage in CUDA code | 0.00 |
Cuda Shared memory shown as register in Nsight | 0.00 |
Cuda: Where do the built-in variables reside? (threadIdx, blockIdx,... | 0.00 |
Understanding counters in CUDA profiler | +3.57 |
CUDA disable L1 cache only for one variable | -0.17 |
What does a high branch efficiency and low control flow efficiency... | 0.00 |
Calculating achieved bandwidth and flops/Gflops, and evaluate CUDA... | 0.00 |
Scalar variables and registers : CUDA | 0.00 |
Difference in time reported by NVVP and counters | 0.00 |
Define struct array in function | -1.31 |
Nsight profile experiments not running | 0.00 |
Clear uint3 in CUDA using cudaMemset | 0.00 |
Do we need two GPUs to debug CUDA code? | 0.00 |
How to use L2 Cache in CUDA | 0.00 |
How is cudaMemset implemented? | 0.00 |
CUDA threads, SMX, SP and blocks, how do they work? | 0.00 |
driver.Context.synchronize()- what else to take into consideration... | -0.14 |
Time between Kernel Launch and Kernel Execution | 0.00 |
'Flush records'-Warning in Parallel Nsight profiling results | 0.00 |
Trouble measuring the elapsed time of a CUDA program and CUDA kernels | 0.00 |
location of cudaEventRecord and overlapping ops, when second kernel... | 0.00 |
How to measure Streaming Multiprocessor use/idle times in CUDA? | 0.00 |
In CUDA, how can we call a device function in another translation u... | 0.00 |
Misaligned Shared or Local Address | 0.00 |