Title | Δ |
Global bitwise shift of 128, 256, 512 bit registry using intrinsics? | -1.94 |
__lzcnt returns 31 - (# of leading zeros) | 0.00 |
Produce loops without cmp instruction in GCC | +1.46 |
How do I perform absolute value on double using intrinsics? | 0.00 |
Emulating shifts on 32 bytes with AVX | +3.13 |
SSE2 intrinsics - comparing 2 __m128i's containing 4 int32'... | 0.00 |
Compress mask using AVX intrinsics | -2.89 |
Optimization of OpenCL code? | -0.54 |
how to pass parameters to const value in intel SSE intrinsics? | -2.79 |
Clang/GCC Compiler Intrinsics without corresponding compiler flag | +4.35 |
Stream intrinsic degrades performance | +3.72 |
bitcount implementation in opencl | +3.29 |
fortran intrinsic function sqrt() behaviour | +3.81 |
Sum 4 integer from a 128 bit __m128 Intel Intrinsic | 0.00 |
opencl - Kernel limitations | 0.00 |
OpenCL clCreateKernel throws CL_INVALID_PROGRAM_EXECUTABLE | 0.00 |
Arithmetic shift using intel intrinsics | +4.06 |
Error in compile-time arguments with AMD | -0.47 |
Only storing 2 first floats of a __m128 variable in C | +4.02 |
Java/Open CL/Aparapi: What to kind of performance to expect from wh... | +0.05 |
Trigonometric functions in constant array initializers in OpenCL | 0.00 |
An optimized implementation of the Heaviside function | 0.00 |
SSE half loads (_mm_loadh_pi / _mm_loadl_pi) issue warnings | 0.00 |
Best assembly or compilation for minimum of three values | +3.88 |
Best Practices of using GPU with OpenCL | +0.05 |
Double Precision variations in OpenCL printf | 0.00 |
Large for loop crashing in GeForce Nvidia GT 610 | +0.13 |
using square root function (sqrt) with doubles in OpenCL | 0.00 |
Intrinsics for CPUID like informations? | -1.30 |
No OpenCL on Cygwin but it is installed on Windows - how to install? | 0.00 |
undefined reference to `__lzcnt16? | 0.00 |
Visual C++ comma operator and sse intrinsics | 0.00 |
SSE 4 popcount for 16 8-bit values? | +4.82 |
SSE HADDPS error: '__m256' does not name a type? | -4.02 |
SSE shifting instruction causes weird output (-1.#IND00) in subsequ... | 0.00 |
printf makes error and don't show the result | +0.41 |
OpenCL Device not Exist | 0.00 |
How to get max compute units with the C++ wrapper? | 0.00 |
Why should I cast to cl_platform_info when using getInfo? | 0.00 |
OpenCL tasks hang at submitted | 0.00 |
Opencl and HD5850 | +3.86 |
sin, cos, tan not accurate | +3.66 |
substitutions for cl_khr_int64_base_atomics | 0.00 |
Running OpenCL kernel on multiple GPUs? | +4.58 |
OpenCL: struct field initialization from inline function does not w... | 0.00 |
Direct Cpu Threads or OpenCL | -4.00 |
OpenCL computation freezes the screen | -3.99 |
_mm_crc32_u64 poorly defined | +4.00 |
gcc and sin/cos/transcendental functions precision like in Windows | 0.00 |