gcc: strange asm generated for simple loop

m68k-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CFLAGS = -Wall -Werror -ffreestanding -nostdlib -O2 -m68000 -mshort

I am confused about why gcc generates such (seemingly) suboptimal code for a simple for loop over a const array.

const unsigned int pallet[16] = {
  0x0000,
  0x00E0,
  0x000E,
  ...
  0x0000
};

...

volatile unsigned long * const VDP_DATA = (unsigned long *) 0x00C00000;

...

for(int i = 0; i < 16; i++) {
  *VDP_DATA = pallet[i];
}

Results in:

 296:   41f9 0000 037e  lea 37e <pallet+0x2>,%a0
 29c:   223c 0000 039c  movel #924,%d1
 2a2:   4240            clrw %d0
 2a4:   0280 0000 ffff  andil #65535,%d0
 2aa:   23c0 00c0 0000  movel %d0,c00000 <_etext+0xbffc2c>
 2b0:   b288            cmpl %a0,%d1
 2b2:   6712            beqs 2c6 <main+0x46>
 2b4:   3018            movew %a0@+,%d0
 2b6:   0280 0000 ffff  andil #65535,%d0
 2bc:   23c0 00c0 0000  movel %d0,c00000 <_etext+0xbffc2c>
 2c2:   b288            cmpl %a0,%d1
 2c4:   66ee            bnes 2b4 <main+0x34>

My main concern:

Why is there a seemingly useless compare at 2b0 after the first element is written? It can never be equal on the first pass, and nothing ever branches back to it. The result is that the code for the first iteration is simply duplicated.
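
For reference, here is a rough C rendering of the control flow in the listing, next to the loop as written in the source. It is a reading of the disassembly, not something gcc documents. The port_write helper and its log buffer are hypothetical stand-ins for the store to 0x00C00000 so the sketch can run on a host, and uint16_t stands in for the 16-bit unsigned int that -mshort gives.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the VDP data port: each "port write" is appended
   to a log instead of being stored to 0x00C00000. */
static uint32_t port_log[32];
static size_t port_writes;

static void port_write(uint32_t value)
{
    port_log[port_writes++] = value;
}

/* Source-level form: one test per iteration, at the top of the loop. */
static void copy_plain(const uint16_t *table, size_t count)
{
    for (size_t i = 0; i < count; i++)
        port_write(table[i]);            /* 16-bit value zero-extended to 32 bits */
}

/* Shape of the listing: first iteration peeled off, then a bottom-tested loop.
   Assumes count >= 1, which holds for the fixed-size table. */
static void copy_peeled(const uint16_t *table, size_t count)
{
    const uint16_t *p = table + 1;       /* lea 37e <pallet+0x2>,%a0 */
    const uint16_t *end = table + count; /* movel #924,%d1 (0x39c, pallet + 32 bytes) */

    port_write(table[0]);                /* 2a2..2aa (the listing uses the constant 0) */
    if (p == end)                        /* 2b0/2b2: the compare in question */
        return;
    do {
        port_write(*p++);                /* 2b4..2bc */
    } while (p != end);                  /* 2c2/2c4 */
}
```

Both functions produce the same sequence of writes. The peeled form only adds the extra test, which can never succeed when the table has 16 entries.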

  • Is there a better way to write this dead-simple loop so that gcc won't produce this strange code?
  • Are there any compiler flags/optimizations I can take advantage of? -O3 simply unrolls the loop, which I don't want either, as space is a bigger concern than speed in this part of the code.
  • Maybe I'm being too scrupulous, but I just figured this wouldn't be the most difficult code to generate. I was expecting something more along the lines of (probably wrong but you get the idea):
lea pallet,%a0
movel #7,%d0
1:
movel %a0@+,c00000
dbra %d0,1b

I get that I would have to be more explicit in my code to get it to write in long chunks. My main point is: why can't gcc figure out my intention, i.e. that I just want to dump this data to this address?
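
As a sketch of what "more explicit" could look like in C: the loop below packs two 16-bit entries into each 32-bit write and counts down with a bottom-tested loop. It assumes the port accepts two entries per long write, which the assembly above implies but which is not confirmed here. port_write_long and its log are hypothetical host-side stand-ins for the store to the port, and nothing here guarantees that gcc 7.4 turns the loop into dbra.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the port, so the sketch can run on a host. */
static uint32_t long_log[16];
static size_t long_writes;

static void port_write_long(uint32_t value)
{
    long_log[long_writes++] = value;
}

/* Pack two 16-bit entries into one 32-bit write, earlier entry in the high
   word (the order a big-endian long read of the table would give).
   pairs must be >= 1. */
static void copy_pairs(const uint16_t *table, size_t pairs)
{
    do {
        uint32_t hi = *table++;
        uint32_t lo = *table++;
        port_write_long((hi << 16) | lo);
    } while (--pairs != 0);              /* count-down, tested at the bottom */
}
```

For the 16-entry table this would be called with pairs set to 8, matching the counter value of 7 that dbra needs for eight iterations.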

Another observation:

clrw %d0, then andil #65535,%d0, then movel %d0,c00000. Why not just clrl and move?
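
One possible reading, offered as an assumption rather than a confirmed explanation: with -mshort, unsigned int is 16 bits, so every store to the 32-bit port needs a zero-extension, and andil #65535 is that extension. For the first element the load was folded to a constant word 0 (clrw), but the extension was kept. The sketch below models both steps on a host. The function names are made up for illustration.

```c
#include <stdint.h>

/* Zero-extension of a 16-bit value to 32 bits, which gives the same result
   as masking a 32-bit register with 0xFFFF. */
static uint32_t widen(uint16_t value)
{
    return (uint32_t)value;
}

/* Models the first two instructions on a 32-bit register d0 that starts
   with arbitrary contents. */
static uint32_t clrw_then_andil(uint32_t d0)
{
    d0 &= 0xFFFF0000u;   /* clrw %d0: clears only the low word */
    d0 &= 0x0000FFFFu;   /* andil #65535,%d0: clears the high word */
    return d0;           /* always 0, the same value clrl %d0 would leave */
}
```

Whatever d0 held before, the pair leaves it at zero, which is why a single clrl looks like an equivalent and shorter choice.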

