| Title | Δ |
|---|---|
| Do I need to free constant memory assigned using cudaMemcpyToSymbol? | +3.28 |
| different thread blocks definition | 0.00 |
| How CUDA constant memory allocation works? | 0.00 |
| Peak Bandwidth for CUDA Surface Memory? | 0.00 |
| NVIDIA Visual Profiler, Debug and Release modes in Visual Studio 2010 | -4.47 |
| cuda kernels not executing concurrently | 0.00 |
| the Kernel delay increase by increasing the blocksPerGrid and threa... | 0.00 |
| cuda "invalid argument" error on second kernel | 0.00 |
| CUDA Debugging: "No value at target location", I clearly... | 0.00 |
| create OpenCL project in Visual Studio 2010 | 0.00 |
| OpenGL shader debugging. NVIDIA Parallel nSight? | +3.19 |
| Cuda zero-copy performance | 0.00 |
| Develop a Cuda DLL working with different Runtime versions | 0.00 |
| CUDA measure execution time per gpu core | 0.00 |
| CUDA performance improves when running more threads than there are... | +3.46 |
| Is register overflowing a possible cause of a CUDA_EXCEPTION_5, War... | 0.00 |
| Strategies for timing CUDA Kernels: Pros and Cons? | -4.30 |
| Shared memory allocation in CUDA | 0.00 |
| Why only one of the warps is executed by a SM in cuda? | +4.82 |
| Uncoalesced float2 CUDA kernel | 0.00 |
| Nsight 2.2 sometimes works sometimes doesn't | 0.00 |
| not able to use printf in cuda kernel function | +4.73 |
| Difference on creating a CUDA context | 0.00 |
| Disabling TDR for CUDA in Windows 8 | 0.00 |
| Is my GTX680 really performing | +3.26 |
| CUDA: Passing parameters to host compiler during Nsight session | +3.39 |
| Unexpectedly large cmem[2] usage in CUDA code | 0.00 |
| Cuda Shared memory shown as register in Nsight | 0.00 |
| Cuda: Where do the built-in variables reside? (threadIdx, blockIdx,... | 0.00 |
| Understanding counters in CUDA profiler | +3.57 |
| CUDA disable L1 cache only for one variable | -0.17 |
| What does a high branch efficiency and low control flow efficiency... | 0.00 |
| Calculating achieved bandwidth and flops/Gflops, and evaluate CUDA... | 0.00 |
| Scalar variables and registers : CUDA | 0.00 |
| Difference in time reported by NVVP and counters | 0.00 |
| Define struct array in function | -1.31 |
| Nsight profile experiments not running | 0.00 |
| Clear uint3 in CUDA using cudaMemset | 0.00 |
| Do we need two GPUs to debug CUDA code? | 0.00 |
| How to use L2 Cache in CUDA | 0.00 |
| How is cudaMemset implemented? | 0.00 |
| CUDA threads, SMX, SP and blocks, how do they work? | 0.00 |
| driver.Context.synchronize()- what else to take into consideration... | 0.00 |
| Time between Kernel Launch and Kernel Execution | 0.00 |
| 'Flush records'-Warning in Parallel Nsight profiling results | 0.00 |
| Trouble measuring the elapsed time of a CUDA program and CUDA kernels | 0.00 |
| location of cudaEventRecord and overlapping ops, when second kernel... | 0.00 |
| How to measure Streaming Multiprocessor use/idle times in CUDA? | 0.00 |
| In CUDA, how can we call a device function in another translation u... | 0.00 |
| Misaligned Shared or Local Address | 0.00 |