Sometimes gcc uses a 32-bit register where I would expect it to use a 64-bit register. For example, the following C code:
unsigned long long
div(unsigned long long a, unsigned long long b){
    return a/b;
}
is compiled with the -O2 option to (leaving out some boilerplate):
div:
movq %rdi, %rax
xorl %edx, %edx
divq %rsi
ret
For the unsigned division, the register %rdx needs to be 0. This can be achieved by means of xorq %rdx, %rdx, but xorl %edx, %edx seems to have the same effect.
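To make concrete what I mean by "%rdx needs to be 0", here is a minimal C sketch of how I understand divq: it divides the 128-bit value %rdx:%rax by its operand, leaving the quotient in %rax and the remainder in %rdx. The name model_divq is a hypothetical helper of mine, not anything gcc emits, and unsigned __int128 is a GCC/Clang extension that I am assuming is available (64-bit targets).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models divq as I understand it.
   Dividend is the 128-bit value rdx:rax; returns the quotient (rax)
   and stores the remainder (the new rdx) through *remainder. */
static uint64_t model_divq(uint64_t rdx, uint64_t rax, uint64_t divisor,
                           uint64_t *remainder)
{
    /* The real instruction faults on a zero divisor or when the quotient
       does not fit in 64 bits; the quotient fits exactly when rdx < divisor. */
    assert(divisor != 0 && rdx < divisor);

    unsigned __int128 dividend = ((unsigned __int128)rdx << 64) | rax;
    *remainder = (uint64_t)(dividend % divisor);
    return (uint64_t)(dividend / divisor);
}
```

With rdx = 0 the model reduces to plain a/b, which is what the C function asks for. With any stale nonzero value in rdx the quotient changes, so the upper half has to be cleared one way or another before divq.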
At least on my machine there was no performance gain (i.e. speed up) for xorl over xorq.
I actually have more than one question:
- Why does gcc prefer the 32-bit version?
- Why does gcc stop at xorl and not use xorw?
- Are there machines for which xorl is faster than xorq?
- Should one always prefer 32-bit registers/operations over 64-bit registers/operations where possible?