I have a simple C program:
int main()
{
    unsigned int counter = 0;
    ++counter;
    ++counter;
    ++counter;
    return 0;
}
I am using the following compile flags:
arm-none-eabi-gcc -c -mcpu=cortex-m4 -march=armv7e-m -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16 -DPART_TM4C123GH6PM -O0 -ffunction-sections -fdata-sections -g -gdwarf-3 -gstrict-dwarf -Wall -MD -std=c99 -c -MMD -MP -MF"main.d" -MT"main.o" -o"main.o" "../main.c"
(some -I directives removed for brevity)
Note that I'm deliberately using -O0 to disable optimisations because I'm interested in learning what the compiler does to optimise.
This compiles into the following assembly for ARM Cortex-M4:
6 unsigned int counter = 0;
00000396: 2300 movs r3, #0
00000398: 607B str r3, [r7, #4]
7 ++counter;
0000039a: 687B ldr r3, [r7, #4]
0000039c: 3301 adds r3, #1
0000039e: 607B str r3, [r7, #4]
8 ++counter;
000003a0: 687B ldr r3, [r7, #4]
000003a2: 3301 adds r3, #1
000003a4: 607B str r3, [r7, #4]
9 ++counter;
000003a6: 687B ldr r3, [r7, #4]
000003a8: 3301 adds r3, #1
000003aa: 607B str r3, [r7, #4]
Why are there so many ldr r3, [r7, #4] and str r3, [r7, #4] instructions generated? And why does r7 even need to be involved? Can't we just use r3?