Channel: Active questions tagged gcc - Stack Overflow

Vectorize __float128 dot product with SIMD/AVX


In C++11 (on Linux, with gcc, on an Intel Xeon), I have two __float128* arrays A and B of fixed size that fit entirely in cache. Does anyone know of, or can anyone provide, code that computes the __float128 dot product of those arrays (i.e. the sum of their element-wise products) using SIMD/AVX acceleration where possible?
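For reference, the operation in question can be sketched as a plain scalar baseline. This is only an illustration of the sum of element-wise products, not a vectorized solution: on x86-64, gcc performs __float128 arithmetic in software, so no AVX instruction operates on this type directly. The function name `dot_f128` and the four-way unrolling are assumptions made for this sketch, and whether the unrolling helps at all would have to be measured.

```cpp
#include <cstddef>

// Scalar baseline for the __float128 dot product (hypothetical helper name).
// __float128 is a GCC extension; its arithmetic on x86-64 is emulated in
// software, so this loop is not turned into AVX code by the compiler.
__float128 dot_f128(const __float128* a, const __float128* b, std::size_t n)
{
    // Four independent partial sums. Note that this changes the summation
    // order compared with a single running sum, which can change the
    // rounding of the final result.
    __float128 s0 = 0, s1 = 0, s2 = 0, s3 = 0;

    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }

    // Handle the remaining 0 to 3 elements.
    for (; i < n; ++i) {
        s0 += a[i] * b[i];
    }

    return (s0 + s1) + (s2 + s3);
}
```

It would be called as `dot_f128(A, B, n)` with `n` the common array length; something that beats this baseline is what I am looking for.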

Unfortunately, neither MKL nor (afaik) any other efficient BLAS library supports __float128, so this acceleration would somewhat reduce the massive slowdown of __float128 versus double, to the point where we could actually use it.

We need __float128 for numerical stability reasons, so anything less precise is unfortunately not an option.

