'Why is 0 moved to stack when using return value?

I'm experimenting disassembling clang binaries of simple C programs (compiled with -O0), and I'm confused about a certain instruction that gets generated.

Here are two empty main functions with standard arguments, one of which returns value and other does not:

// return_void.c
void main(int argc, char** argv)
{
}

// return_0.c
int main(int argc, char** argv)
{
    return 0;
}

Now, when I disassemble their assemblies, they look reasonably different, but there's one line that I don't understand:

return_void.bin:
(__TEXT,__text) section
_main:
0000000000000000    pushq   %rbp
0000000000000001    movq    %rsp, %rbp
0000000000000004    movl    %edi, -0x4(%rbp)
0000000000000007    movq    %rsi, -0x10(%rbp)
000000000000000b    popq    %rbp
000000000000000c    retq

return_0.bin:
(__TEXT,__text) section
_main:
0000000100000f80    pushq   %rbp                
0000000100000f81    movq    %rsp, %rbp          
0000000100000f84    xorl    %eax, %eax          # We return with EAX, so we clean it to return 0
0000000100000f86    movl    $0x0, -0x4(%rbp)    # What does this mean?
0000000100000f8d    movl    %edi, -0x8(%rbp)
0000000100000f90    movq    %rsi, -0x10(%rbp)
0000000100000f94    popq    %rbp
0000000100000f95    retq

It only gets generated when I use the function is not void, so I thought that it might be another way to return 0, but when I changed the returned constant, this line didn't change at all:

// return_1.c
int main(int argc, char** argv)
{
    return 1;
}

empty_return_1.bin:
(__TEXT,__text) section
_main:
0000000100000f80    pushq   %rbp
0000000100000f81    movq    %rsp, %rbp
0000000100000f84    movl    $0x1, %eax           # Return value modified
0000000100000f89    movl    $0x0, -0x4(%rbp)    # This value is not modified
0000000100000f90    movl    %edi, -0x8(%rbp)
0000000100000f93    movq    %rsi, -0x10(%rbp)
0000000100000f97    popq    %rbp
0000000100000f98    retq

Why is this line getting generated and what is it's purpose?



Solution 1:[1]

clang is making space on the stack for the arguments (registers edi and rsi) and puts the value 0 on the stack, too, for some reason. I assume that clang compiles your code to an SSA-representation like this:

int main(int argc, char** argv)
{
    int a;

    a = 0;
    return a;
}

This would explain why a stack slot is allocated. If clang does constant propagation, too, this would explain why eax is zeroed out instead of being loaded from -4(%rbp). In general, don't think too much about dubious constructs in unoptimized assembly. After all, you forbade the compiler from removing useless code.

Solution 2:[2]

movl   $0x0,-0x4(%rbp)

This instruction stores 0 at %rbp - 4. It seems that clang allocates a hidden local variable for an implicit return value from main.

From the clang mailing list:

Yes. We allocate an implicit local variable to hold the return value; return statements then just initialize the return slot and jump to the epilogue, where the slot is loaded and returned. We don't use a phi because the control flow for getting to the epilogue is not necessarily as simple as a simple branch, due to cleanups in local scopes (like C++ destructors).

Implicit return values like main's are handled with an implicit store in the prologue.

Source: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2012-February/019767.html

Solution 3:[3]

According the the standard (for hosted environments), 5.1.2.2.1, main is required to returrn an int result. So do not expect defined behavior if violating this.

Furthermore, main is actually _not required to explicitly return 0; this is implicitly returned if it reaches the end of the function. (Note this is only for main, which also does not have a prototype.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 fuz
Solution 2
Solution 3 too honest for this site