StackRating

An Elo-based rating system for Stack Overflow
Rating Stats for wim

Rating: 1518.79 (34,472nd)
Reputation: 2,311 (72,882nd)
Title Δ
Efficient conditional ceiling and floor in HLSL 0.00
fastest stride 2 gather 0.00
Floating point compare of absolute values in AVX 0.00
How to properly compare an integer and a floating-point value? +0.38
How can a SSE2 function be missing from the header it is supposed t... 0.00
Illegal Instruction with mm_cmpeq_epi8_mask +2.36
What is the fastest inverse of _mm_movemask_ps()? 0.00
Determine if rounding occurred for a floating-point operation in C/... 0.00
Fastest way to expand bits in a field to all (overlapping + adjacen... +0.09
Is it possible to convince clang to auto-vectorize this code withou... 0.00
Why does the vhaddps instruction add in such an involved way? -1.87
Comparing two values in the form (a + sqrt(b)) as fast as possible? -1.22
AVX2 Transpose of a matrix represented by 8x __m256i registers 0.00
Is there an function in AVX512 like _mm512_sign_epi16 (__m512i a, _... -0.86
What's the point of _mm_cmpgt_sd and other similar methods? +4.78
AVX2 instructions latency and throughput 0.00
Intel vector instruction to zero-extend 8 4-bit values packed in a... +4.07
BMI for generating masks with AVX512 +4.09
How to use _mm_extract_epi8 function? -3.09
Auto-vectorize shuffle instruction 0.00
What's the fastest way to perform an arbitrary 128/256/512 bit... 0.00
Fast way to get a close power-of-2 number (floating-point) +4.18
Is it possible to sum every 3 neighbouring elements in an array and... +4.17
What is the fastest way to convert a large c-array of char8 to shor... +1.84
Why can GCC not vectorize this function and loop? 0.00
How do I broadcast the lowest word of a __m256i? 0.00
How to generate simd code for math function "exp" using o... 0.00
AVX or other's set instruction that can extract a specific bit,... +4.19
How to bit scan forward and reverse a __uint128_t (128bit)? 0.00
How to copy the lower part of the lanes of a XMM register with libs... 0.00
Using Intel Compiler SVML `__m128 _mm_sincos_ps ()` Effectively 0.00
How to efficiently convert an 8-bit bitmap to array of 0/1 integers... -3.95
Rotating (by 90°) a bit matrix (up to 8x8 bits) within a 64-bi... 0.00
Reproduce _mm256_sllv_epi16 and _mm256_sllv_epi8 in AVX2 +4.05
Fast and accurate iterative generation of sine and cosine for equal... -3.10
Computing 8 horizontal sums of eight AVX single-precission floating... -4.03
Convert signed short to float in C++ SIMD -3.25
Count leading zeros in __m256i word +1.59
Accurate vectorizable implementation of acosf() -2.68
Multishift operation -3.13
Fastest Implementation of Exponential Function Using AVX +4.44
Using a variable to index a simd vector with _mm256_extract_epi32()... +4.62
AVX2 expand contiguous elements to a sparse vector based on a condi... 0.00
sqrt of uint64_t vs. int64_t -4.11
Most efficient way to find the index of the only '1' bit in... +1.06
AVX vs. SSE: expect to see a larger speedup 0.00
C/C++ intrinsic for assembly VMOVD 0.00
Combine packed nibbles into packed bytes 0.00
Efficient way to set first N or last N bits of __m256i to 1, the re... 0.00
Finding the most frequently occurring element in an SSE register +3.49