Recently I tried to fit my code into a small ATtiny13 with 1 kB of flash. While optimizing, I discovered something that seemed weird to me. Take this example code:
#include <avr/interrupt.h>

int main() {
    TCNT0 = TCNT0 * F_CPU / 58000;
}
It makes no sense, of course, but the interesting thing is the output size: it produces 248 bytes.
Quick explanation of the code: F_CPU is a constant defined by the -DF_CPU=... switch for avr-gcc, and TCNT0 is an 8-bit register (on the ATtiny13). In the real program I assign the result of the expression to a uint16_t, but I still observed the same behaviour.
If part of the expression is wrapped in parentheses:
TCNT0 = TCNT0 * (F_CPU / 58000);
the output file size is 70 bytes. That is a huge difference, but the results of these two operations are the same (right?).
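One way to check whether the two forms really give the same result is to evaluate them on a PC. The following is only a minimal sketch: F_CPU_ASSUMED is a hypothetical clock value picked for illustration (it is not the value from my build), the function names are made up, and int32_t stands in for the 32-bit signed arithmetic that the __divmodsi4 call suggests avr-gcc is using.

```c
#include <stdint.h>

/* Hypothetical clock frequency, chosen only for illustration. */
#define F_CPU_ASSUMED 1200000L

/* Mirrors TCNT0 * F_CPU / 58000: multiply first, then divide.
   The division depends on the run-time value t, so it cannot be
   done at compile time. */
static int32_t scale_left_to_right(uint8_t t) {
    return (int32_t)t * F_CPU_ASSUMED / 58000;
}

/* Mirrors TCNT0 * (F_CPU / 58000): the parenthesised quotient is a
   constant, truncated by integer division, then multiplied by t. */
static int32_t scale_prefolded(uint8_t t) {
    return (int32_t)t * (F_CPU_ASSUMED / 58000);
}
```

With the assumed 1200000 Hz clock, scale_left_to_right(100) gives 2068 while scale_prefolded(100) gives 2000, because 1200000 / 58000 is truncated to 20 before the multiplication. So with integer arithmetic the two expressions are not guaranteed to be equal.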
I looked at the generated assembly and, although I don't understand assembly very well, I can see that the version without parentheses adds some labels such as:
00000078 <__divmodsi4>:
  78: 05 2e    mov r0, r21
  7a: 97 fb    bst r25, 7
  7c: 16 f4    brtc .+4 ; 0x82 <__divmodsi4+0xa>
  7e: 00 94    com r0
  80: 0f d0    rcall .+30 ; 0xa0 <__negsi2>
  82: 57 fd    sbrc r21, 7
  84: 05 d0    rcall .+10 ; 0x90 <__divmodsi4_neg2>
  86: 14 d0    rcall .+40 ; 0xb0 <__udivmodsi4>
  88: 07 fc    sbrc r0, 7
  8a: 02 d0    rcall .+4 ; 0x90 <__divmodsi4_neg2>
  8c: 46 f4    brtc .+16 ; 0x9e <__divmodsi4_exit>
  8e: 08 c0    rjmp .+16 ; 0xa0 <__negsi2>
And much more. I only learned x86 assembly a while ago, but as far as I remember, there was a single mnemonic for division. Why does avr-gcc add so much code in the first example?
Another question is why the compiler does not fold the right-hand part of the expression into a constant when both numbers are known at compile time.