With GCC 9.4.0, the code

    static inline double k(double const *restrict a) {
        return a[-1] + a[0] + a[1];
    }

    static void f(double const *restrict a, double *restrict b, int n) {
        for (int i = 0; i < n; ++i) {
            b[i] = k(a);
        }
    }

    static inline void swap(double **a, double **b) {
        double *tmp = *a;
        *a = *b;
        *b = tmp;
    }

    void g(double *a, double *b, int n, int nt) {
        for (int t = 0; t < nt; ++t) {
            f(a, b, n);
            swap(&a, &b);
        }
    }

results in `loop versioned for vectorization because of possible aliasing` for the loop in `f` when compiled using `gcc -march=native -Ofast -fopt-info-vec`, suggesting that the `restrict` qualifiers in `f` are being ignored.
If I instead manually inline `k` into `f`:

    static void f(double const *restrict a, double *restrict b, int n) {
        for (int i = 0; i < n; ++i) {
            b[i] = a[-1] + a[0] + a[1];
        }
    }

the loop is vectorized without this versioning.
In both cases GCC reports that all of the functions are automatically inlined into each other.
The same occurs if I instead use `__restrict` and compile with `g++`.
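For reference, the `__restrict` variant looks roughly like the sketch below. Only the spelling of the qualifier changes; `__restrict` is a compiler extension (accepted by GCC in both C and C++), not a standard keyword, and the caller is assumed to pass `a` pointing at least one element into its buffer so that `a[-1]` is valid.

```c
/* Same kernel as above, with the __restrict extension in place of restrict
   so that the file can also be built with g++. */
static inline double k(double const *__restrict a) {
    /* Reads one element on either side of a[0]. */
    return a[-1] + a[0] + a[1];
}

static void f(double const *__restrict a, double *__restrict b, int n) {
    /* This is the loop for which the vectorizer report is of interest. */
    for (int i = 0; i < n; ++i) {
        b[i] = k(a);
    }
}

static inline void swap(double **a, double **b) {
    double *tmp = *a;
    *a = *b;
    *b = tmp;
}

void g(double *a, double *b, int n, int nt) {
    /* Alternate the roles of the two buffers on every step. */
    for (int t = 0; t < nt; ++t) {
        f(a, b, n);
        swap(&a, &b);
    }
}
```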
Why does this happen? Is there a cleanly portable way (using only standard language features) of getting GCC to respect the `restrict` qualifiers in the first case, so that I can benefit from vectorization without versioning and without separating `f` and `g` into different translation units?