| Title | Δ |
| --- | --- |
| How to allocate arrays of arrays in structure with CUDA Fortran? | -0.35 |
| Real scaled Sparse matrix vector multiplication in Cusp? | 0.00 |
| passing a unified memory pointer to kernel slows down the program | 0.00 |
| Numba python CUDA vs. cuBLAS speed difference on simple operations | 0.00 |
| How to turn host memory into page-locked using CUDA driver API | 0.00 |
| Why do my in-kernel dynamic memory allocations fail for larger grid... | 0.00 |
| CUDA: Forgetting kernel launch configuration does not result in NVC... | 0.00 |
| cuda - Is PyCuda slower than pure Cuda in the process of allocating... | 0.00 |
| What is the fastest way to perform vector-by-vector dot products fo... | 0.00 |
| pyCuda, issues sending multiple single variable arguments | +0.32 |
| Parallel reduction #5 unroll the last warp | +0.31 |
| Is it a bad idea to swap PyCuda DeviceAllocation objects? | 0.00 |
| Using CUDA Optionally on Linux | 0.00 |
| CUDA device runtime api cudaMemsetAsync doesn't work | 0.00 |
| nested thrust::fill does not work for varied input values | 0.00 |
| cuda global pointer allocation in different source file | 0.00 |
| Does nvcc support "-pthread" option internally? | 0.00 |
| Re-declaration of cmath functions in CUDA's math_functions.h | 0.00 |
| Iterating through a 2D array in PyCUDA | 0.00 |
| Cuda Fortran Device variable initialisation scope | 0.00 |
| PyCUDA kernel function | 0.00 |
| calculate Matrix Vector multiplication with python in cuda | 0.00 |
| How to build code containing cuda function and c++ template function | 0.00 |
| How can I write the memory pointer in CUDA | 0.00 |
| How to remove a nested loop with CUDA Thrust for an all-pair distan... | 0.00 |
| When using inline PTX asm() instructions, what does 'volatile... | 0.00 |
| cuda 2D layered tex size: too large? | 0.00 |
| More than 65535 blocks in CUDA | 0.00 |
| Using cuBLAS with complex numbers from Thrust | 0.00 |
| Cuda optim for maxwell | 0.00 |
| a CUDA error When a large array is used as input data | 0.00 |
| How to calculate logarithm in GPU(python3.5+numba+CUDA8.0) | 0.00 |
| Theano search for CUDA 7.5 files but I have CUDA 8 installed. How t... | +0.98 |
| C++/CUDA: Calculating maximum gridSize and blockSize dynamically | 0.00 |
| Compiler error when including <chrono> in a CUDA program - ev... | +0.99 |
| Why the call to thrust::inclusive_scan is much slower than subseque... | +1.00 |
| Two values can't be printed correctly (python3.5+numba+CUDA8.0) | 0.00 |
| math_functions.hpp not found when using CUDA with Eigen | 0.00 |
| why can't I get the right sum of 1D array with numba (cuda pyth... | 0.00 |
| declare cuda __constant__ memory in a class | 0.00 |
| The version ('80000') of the host compiler ('Apple clan... | -0.79 |
| Why do I need to declare CUDA variables on the Host before allocati... | +1.01 |
| cusp::extract_diagonal not found | 0.00 |
| loading a shared library (cuda) in python via ctypes, cannot dynami... | 0.00 |
| When should I favor a more specific atomic operation over using ato... | +0.31 |
| Compiling my CUDA program with libraries provided in toolkit | 0.00 |
| How do I know that cudaMemcpyAsync is done reading host memory? | 0.00 |
| math function in cuda | 0.00 |
| Insert into host_vector using thrust | 0.00 |
| Can thrust::gather be used "in-place"? | 0.00 |