'AMD64 -- nopw assembly instruction?

In this compiler output, I'm trying to understand how machine-code encoding of the nopw instruction works:

00000000004004d0 <main>:
  4004d0:       eb fe                   jmp    4004d0 <main>
  4004d2:       66 66 66 66 66 2e 0f    nopw   %cs:0x0(%rax,%rax,1)
  4004d9:       1f 84 00 00 00 00 00

There is some discussion about "nopw" at http://john.freml.in/amd64-nopl. Can anybody explain the meaning of 4004d2-4004e0? From looking at the opcode list, it seems that 66 .. codes are multi-byte expansions. I feel I could probably get a better answer to this here than I would unless I tried to grok the opcode list for a few hours.


That asm output is from the following (insane) code in C, which optimizes down to a simple infinite loop:

long i = 0;

main() {
    recurse();
}

recurse() {
    i++;
    recurse();
}

When compiled with gcc -O2, the compiler recognizes the infinite recursion and turns it into an infinite loop; it does this so well, in fact, that it actually loops in the main() without calling the recurse() function.


editor's note: padding functions with NOPs isn't specific to infinite loops. Here's a set of functions with a range of lengths of NOPs, on the Godbolt compiler explorer.



Solution 1:[1]

The assembler (not the compiler) pads code up to the next alignment boundary with the longest NOP instruction it can find that fits. This is what you're seeing.

Solution 2:[2]

I would guess this is just the branch-delay instruction.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 R.. GitHub STOP HELPING ICE
Solution 2 Oliver Charlesworth