I can't figure out how to convince gcc/clang that my pointers don't alias. From what I can see, restrict
is only honored when it is specified on function parameters and is ignored otherwise. Here is my code:
#if defined(_MSC_VER) || defined(__cplusplus)
#define restrict __restrict
#endif
struct s {
    int sz;
    int *a;
    int *b;
};

struct s_r {
    int sz;
    int *restrict a;
    int *restrict b;
};

void foo_dumb_struct(struct s *s, int c) {
    int sz = s->sz;
    for(int i = 0; i != sz; ++i) {
        s->a[i] = s->b[0] + c;
    }
}

void foo_restricted_arrays(int sz,
                           int *restrict a, int *restrict b,
                           int c) {
    for(int i = 0; i != sz; ++i) {
        a[i] = b[0] + c;
    }
}

void foo_restricted_struct(struct s_r *s, int c) {
    int sz = s->sz;
    for(int i = 0; i != sz; ++i) {
        s->a[i] = s->b[0] + c;
    }
}

void foo_restricted_subcall(struct s *s, int c) {
    foo_restricted_arrays(s->sz, s->a, s->b, c);
}

void foo_restricted_cast(struct s *s, int c) {
    int sz = s->sz;
    int *restrict a = s->a;
    int *restrict b = s->b;
    for(int i = 0; i != sz; ++i) {
        a[i] = b[0] + c;
    }
}

icc is fine with this code, but gcc and clang re-read b[0] on every iteration in foo_restricted_struct and foo_restricted_cast, on every architecture I could test with godbolt. Whenever restrict is used on function parameters (including nested functions or C++ lambdas), it works and the extra load is removed. https://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html suggests that this once worked the way I want it to, but I'm not sure their gcc wasn't customized specifically for Cell.
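To show why the re-read matters, here is a minimal sketch of what the compiler has to assume without a no-alias guarantee: a store to a[i] may change b[0]. The function foo_hoisted_struct and the overlapping test buffers are hypothetical and only for illustration; they are not part of my real code.

```c
struct s {
    int sz;
    int *a;
    int *b;
};

/* Same loop as foo_dumb_struct above: b[0] is read on every iteration. */
void foo_dumb_struct(struct s *s, int c) {
    int sz = s->sz;
    for (int i = 0; i != sz; ++i) {
        s->a[i] = s->b[0] + c;
    }
}

/* Hypothetical hand-hoisted variant: b[0] is read once before the loop.
 * This is the transformation I would like restrict to permit. */
void foo_hoisted_struct(struct s *s, int c) {
    int sz = s->sz;
    int v = s->b[0] + c;
    for (int i = 0; i != sz; ++i) {
        s->a[i] = v;
    }
}

/* Fills buf with {1, 0, 0, 0} and runs the chosen variant with a and b
 * both pointing at buf, i.e. with deliberately aliased pointers. */
void run_aliased(int buf[4], int hoisted) {
    buf[0] = 1;
    buf[1] = buf[2] = buf[3] = 0;
    struct s overlapping = {4, buf, buf};
    if (hoisted) {
        foo_hoisted_struct(&overlapping, 10);
    } else {
        foo_dumb_struct(&overlapping, 10);
    }
}
```

With aliased pointers the two variants disagree: the re-reading loop leaves {11, 21, 21, 21} in the buffer, while the hoisted loop leaves {11, 11, 11, 11}. So the compiler may only hoist the load if it believes a and b do not overlap, which is exactly what I am trying to tell it with restrict.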
Is my usage of restrict wrong, or do gcc/clang implement restrict only for function parameters and nothing else?