I'm trying to figure out how to get GCC or Clang to use [base+index*scale]
addressing to index two unrelated arrays, like this, to save an add
instruction:
    ;; rdi is srcArr, rsi is srcArr - dstArr, rax is len of arrays
    .Loop:
        mov ecx, dword ptr [rdi + 4*rsi]
        mov dword ptr [rdi], ecx
        add rdi, 4
        cmp rdi, rax
        jb .Loop
The C++ code below achieves this, but because dst and src are unrelated, subtracting them is UB, even though the resulting offset is only ever added back to dst before anything is dereferenced. (I think this will nonetheless work on all x86-64 hardware.)
    #include <cstddef>
    #include <cstring>

    void offset(float* __restrict__ dst, const float* __restrict__ src, size_t n) {
        const float *enddst = dst + n;
        const std::ptrdiff_t offset = src - dst;
        while (dst < enddst) {
            std::memcpy(dst, /* src */ dst + offset, sizeof(float));
            dst++;
        }
    }
(Ignore the silly loop body, it's just for illustration as something using the addresses.)
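For comparison, here is a minimal sketch of the plain, UB-free baseline I'm trying to improve on. The name copy_indexed is only illustrative. It indexes both arrays with a shared counter and never subtracts unrelated pointers, so I'd expect the compiler to need either two base registers plus an index, or two separately incremented pointers, rather than one pointer and a fixed offset.

```cpp
#include <cstddef>
#include <cstring>

// Illustrative baseline (hypothetical name): same silly loop body as above,
// but both arrays are addressed through a shared index, so no subtraction
// of unrelated pointers is involved.
void copy_indexed(float* __restrict__ dst, const float* __restrict__ src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        // Copy one float from src[i] to dst[i].
        std::memcpy(&dst[i], &src[i], sizeof(float));
    }
}
```

This has the same observable behaviour as the function above; the only difference I care about is the addressing the compiler ends up choosing for the loop.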
Is there a way to do this without UB?