The problem: there are 64-bit values with some data bits and some metadata bits; the metadata includes a k-bit field describing a "type" (k >= 0). The type field is located in the lower 32 bits.
The procedure processes two types, one denoted with code 3 and the other with code 5. When all items are of type 3, we can use a fast AVX2 path; if any item is of type 5, we have to call an additional function (a virtual method, to be precise).
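To make the dispatch decision concrete, here is a minimal scalar sketch in C++. It is not the AVX2 code from the post, only a reference for the question the vectorized version has to answer: "are all items of type 3?". The field position and width (`kTypeShift`, `kTypeMask`) are assumptions made for illustration, and the names `type_of`, `all_type_3`, `choose_path` and `Path` are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Assumed layout (not given in the post): a 3-bit type field in bits 0..2,
// i.e. somewhere in the lower 32 bits of each 64-bit value.
constexpr unsigned kTypeShift = 0;
constexpr std::uint64_t kTypeMask = 0x7;
constexpr std::uint64_t kTypeFast = 3;  // items that the fast path can handle

// Extract the type field from a single 64-bit item.
inline std::uint64_t type_of(std::uint64_t item) {
    return (item >> kTypeShift) & kTypeMask;
}

// Returns true when every item has type 3. Differences are accumulated
// with OR, so the loop body has no branch; a SIMD version can do the
// same thing on several items per instruction.
inline bool all_type_3(const std::vector<std::uint64_t>& items) {
    std::uint64_t diff = 0;
    for (std::uint64_t item : items) {
        diff |= type_of(item) ^ kTypeFast;
    }
    return diff == 0;
}

enum class Path { Fast, Slow };

// Fast: all items are type 3. Slow: at least one item has another type
// (type 5 in the post) and needs the extra function call.
inline Path choose_path(const std::vector<std::uint64_t>& items) {
    return all_type_3(items) ? Path::Fast : Path::Slow;
}
```

Calling `choose_path` on a batch yields `Path::Fast` only if no item deviates from type 3; any other result means the additional function has to be called for the offending items.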
Sunday, 22 March 2015
1 comment:
Thank you for these enlightening SIMD posts! A few errata:
> auto A_type = convert(A_lo_type, A_hi_type); // PACKSSQD
PACKSSQD does not exist
> Version 2:
> auto any_5 = A_type + B_type + packed_dword(0x7fffff80) // PADDD x 2
0x7fffff80 -> 0x7fffffc0
> MOVMSK_PS
MOVMSKPS
> PSLRD
PSLLD