I need to implement the algorithm from this paper.
On page 5, there is this function:
    void Split(double a, double *x, double *y) {
        double z = a * ((1 << r) + 1);  /* r is a constant defined elsewhere */
        *x = z - (z - a);
        *y = a - *x;
    }
With -O3 (which I need for the rest of the program), gcc sees that z - (z - a) can be simplified to a, but only when the function is inlined:
    Split(double, double*, double*):
            movsd   .LC0(%rip), %xmm1
            mulsd   %xmm0, %xmm1
            movapd  %xmm1, %xmm2
            subsd   %xmm0, %xmm2
            subsd   %xmm2, %xmm1
            subsd   %xmm1, %xmm0
            movsd   %xmm1, (%rdi)
            movsd   %xmm0, (%rsi)
            ret
    .LC3:
            .string "%lf -- %lf\n"
    main:
            subq    $8, %rsp
            movl    $.LC3, %edi
            movl    $2, %eax
            movsd   .LC1(%rip), %xmm1
            movsd   .LC2(%rip), %xmm0
            call    printf
            xorl    %eax, %eax
            addq    $8, %rsp
            ret
    .LC0:
            .long   67108864
            .long   1099956224
    .LC1:
            .long   0
            .long   -1104234414
    .LC2:
            .long   -335544320
            .long   1073920081
The assembly for the function itself is not optimized, but in the caller (main), all intermediate operations vanish.
How can I have this function inlined without this optimization being applied to it?
Do I really have to write inline assembly? Or is there a compiler hint?
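To make the concern concrete, here is a minimal sketch of what the function would degenerate to if z - (z - a) were rewritten to a. The name SplitSimplified is hypothetical, and R = 26 is an assumption read off the constant .LC0 in the listing (2^26 + 1); the paper's value of r may differ.

```c
/* Assumption: r = 26, read off the constant .LC0 (2^26 + 1) in the listing. */
enum { R = 26 };

/* Reference version, as written in the question. */
static void Split(double a, double *x, double *y) {
    double z = a * ((1 << R) + 1);
    *x = z - (z - a);   /* high part: a rounded to fewer significant bits */
    *y = a - *x;        /* low part: the rounding error of that step */
}

/* Hypothetical: the same function after z - (z - a) is replaced by a. */
static void SplitSimplified(double a, double *x, double *y) {
    *x = a;             /* nothing is split off */
    *y = a - *x;        /* always 0.0 for finite a */
}
```

With strict IEEE arithmetic, Split returns a nonzero low part for an input such as 2.34, while SplitSimplified always returns x equal to a and y equal to zero, which defeats the purpose of the function.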
[EDIT]: I can add a volatile qualifier to z, which prevents the optimization. But then the compiler is not able to vectorize properly.
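For reference, the volatile workaround mentioned in the edit looks roughly like this. It is only a sketch: the name SplitVolatile is hypothetical, and R = 26 is again an assumption taken from the constant .LC0 in the listing.

```c
/* Assumption: r = 26, read off the constant .LC0 (2^26 + 1) in the listing. */
enum { R = 26 };

/* Split with z marked volatile. Every read of z must go through memory,
   so the compiler cannot fold z - (z - a) into a, even when inlined. */
static inline void SplitVolatile(double a, double *x, double *y) {
    volatile double z = a * ((1 << R) + 1);
    *x = z - (z - a);
    *y = a - *x;
}
```

The forced store and reloads of z are also what gets in the way of vectorization when this is called in a loop.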