How to copy the value at a certain address in memory to a register in gcc AT&T style
I want to copy the value at a certain address in memory to a register using AT&T style assembly. I know this shouldn't be hard, and I think in Intel style it's something like:
mov rdi, [0xdeadbeef]
But I don't know much about the AT&T style (or assembly in general). I searched about it but all the examples about mov that I got didn't include this one.
So can anyone tell me what that instruction looks like?
Also, where can I find a complete list of x86_64 assembly instructions in AT&T style?
Solution 1:[1]
To copy the value at a certain address in memory to a register in 32-bit mode we use
mov edi, [0xdeadbeef] ; Intel
movl 0xdeadbeef, %edi # AT&T
In AT&T syntax, any literal that is not prefixed by $ is a memory address; a $ prefix makes it an immediate.
But in x86-64, a normal memory operand can only hold a 32-bit displacement, which is sign-extended to 64 bits, so movq 0xdeadbeef, %rdi can't encode this address the way the 32-bit code above does. The only instruction that takes a 64-bit immediate is mov (movabs in gas), which can assign a 64-bit constant to any register, or move the value at a 64-bit absolute address to Areg (al/ax/eax/rax):
mov rax, [0xdeadbeef] ; Intel
movabs 0xdeadbeef, %rax # AT&T
If you really need to move the value from a 64-bit absolute address to a register other than Areg, you must use indirect addressing instead:
mov rdi, 0xdeadbeef ; Intel
mov rdi, [rdi]
movq $0xdeadbeef, %rdi # AT&T
movq (%rdi), %rdi
Or, if you want the value copied to both rax and rdi:
mov rax, [0xdeadbeef] ; Intel
mov rdi, rax
movabs 0xdeadbeef, %rax # AT&T
movq %rax, %rdi
Here the q suffix means quadword (64-bit) operands.
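Since the question is about gcc, the indirect-addressing pattern can also be sketched as GCC-style inline asm. This is a minimal sketch, assuming x86-64 and a GCC-compatible compiler; the function name load_indirect is made up for illustration, and the compiler (not the programmer) picks which registers stand in for %rdi. On other targets the sketch falls back to a plain C dereference.

```c
#include <stdint.h>

/* Hypothetical helper: load the 64-bit value stored at `addr` through a
 * register, i.e. the "movq (%reg), %reg" pattern in AT&T syntax. */
static inline uint64_t load_indirect(const uint64_t *addr)
{
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t value;
    /* %1 is a register holding the address, %0 receives the loaded value.
     * The compiler may pick the same register for both, as in
     * "movq (%rdi), %rdi". */
    __asm__ volatile ("movq (%1), %0"
                      : "=r"(value)
                      : "r"(addr)
                      : "memory");
    return value;
#else
    /* Portable stand-in for targets without x86-64 inline asm. */
    return *addr;
#endif
}
```

Calling load_indirect(&x) on a uint64_t variable x returns the value of x. Loading from a fixed address such as 0xdeadbeef works the same way, but only if that address is actually mapped in the process.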
In AT&T syntax the size of memory operands is determined from the last character of the instruction mnemonic. Mnemonic suffixes of b, w, l and q specify byte (8-bit), word (16-bit), long (32-bit) and quadruple word (64-bit) memory references. Intel syntax accomplishes this by prefixing memory operands (not the instruction mnemonics) with byte ptr, word ptr, dword ptr and qword ptr. Thus, Intel mov al, byte ptr foo is movb foo, %al in AT&T syntax.
In 64-bit code, movabs can be used to encode the mov instruction with the 64-bit displacement or immediate operand.
https://sourceware.org/binutils/docs/as/i386_002dVariations.html
More information about the 64-bit mov instruction is here: Difference between movq and movabsq in x86-64. As you can see, there's no plain mov form that takes a zero-extended 32-bit absolute address and loads a 64-bit register, so even in the rare cases where the address fits in 32 bits, like 0xdeadbeef, you need movabs Areg, moffs64 (or an address-size override, as Solution 2 shows).
Solution 2:[2]
Normally mov rdi, [0x123456] is fine; in AT&T syntax that's mov 0x123456, %rdi.
In this special case, your address 0xdeadbeef is outside the low 2GiB so you can't use a normal 32-bit absolute address. But it's within the low 4GiB, so you can use a 32-bit address-size override to get a 32-bit zero-extended address instead of needing movabs with a full 64-bit absolute address (moffs), or moving an imm64 to a register to set up for mov (%rdi), %rdi
NASM syntax:
a32 mov rdi, [a32 abs 0xdeadbeef]
GAS AT&T syntax:
addr32 mov 0xdeadbeef, %rdi
Both assemble to the same machine code, which objdump disassembles as:
67 48 8b 3c 25 ef be ad de mov 0xdeadbeef(,%eiz,1),%rdi
32-bit absolute [disp32] uses a SIB with no index (the longer of the two redundant encodings in 32-bit machine code for a [disp32] absolute addressing mode), so that's probably why it disassembles that way. The shorter of the two encodings was repurposed for x86-64 to be [RIP+rel32].
An address-size prefix costs 1 extra byte, but does execute efficiently on existing CPUs. It does not cause an LCP stall on Intel CPUs unless you use it on movabs, because the length of the rest of the instruction is the same with or without it. (Unlike in 32-bit mode where it overrides the interpretation of disp32 to be disp16, and ModRM to be 16-bit style with no optional SIB).
The other option is mov $imm32, %r32 (5 bytes) to get the address zero-extended that way. This is 2 separate instructions but actually smaller machine code size: 8 total bytes vs. 9 for mov with an absolute 32-bit address. It will still decode to 2 uops, so it's less efficient than the single-instruction load.
401009: bf ef be ad de mov $0xdeadbeef,%edi
40100e: 48 8b 3f mov (%rdi),%rdi
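The two-instruction sequence above depends on a write to a 32-bit register zero-extending into the full 64-bit register. A minimal sketch of that behaviour with GCC-style inline asm follows, assuming x86-64 and a GCC-compatible compiler; the function name is hypothetical, and on other targets the sketch substitutes an equivalent C assignment.

```c
#include <stdint.h>

/* Hypothetical helper: start with all 64 bits set, then write only the
 * 32-bit half of the register with "movl $imm32, %r32". */
static inline uint64_t mov_imm32_zero_extends(void)
{
    uint64_t reg = UINT64_MAX;
#if defined(__x86_64__) && defined(__GNUC__)
    /* %k0 names the 32-bit form of operand 0 (e.g. %edi for %rdi).
     * Writing it clears bits 63..32 of the full register. */
    __asm__ ("movl $0xdeadbeef, %k0" : "+r"(reg));
#else
    /* Portable stand-in: same result as the zero-extending write. */
    reg = (uint64_t)0xdeadbeefu;
#endif
    return reg;
}
```

The returned value has the upper 32 bits cleared, which is why the register can then be used directly as a 64-bit pointer in mov (%rdi), %rdi.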
Alternatives in NASM syntax for full 64-bit addresses, as in
Load from a 64-bit address into other register than rax
mov rsi, 0x000000efdeadbeef ; address into register
mov rsi, [rsi]
mov rax, [qword 0x00000000deadbeef] ; moffs64 load into RAX, then copy
mov rdi, rax
AT&T Disassembly:
401011: 48 be ef be ad de ef 00 00 00 movabs $0xefdeadbeef,%rsi
40101b: 48 8b 36 mov (%rsi),%rsi
40101e: 48 a1 ef be ad de 00 00 00 00 movabs 0xdeadbeef,%rax
401028: 48 89 c7 mov %rax,%rdi
If you omit the qword in [qword 0xdeadbeef], NASM warns "warning: dword data exceeds bounds" and emits:
# without forcing qword address encoding for NASM, it truncates to a disp32
48 8b 04 25 ef be ad de mov rax,QWORD PTR ds:0xffffffffdeadbeef
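The ffffffffdeadbeef in that disassembly comes from sign-extending the 32-bit displacement. The arithmetic, together with the address ranges discussed in this answer, can be sketched in plain C. This is only an illustration: the function and enum names are invented here, and the classification just restates the low-2GiB / low-4GiB / full-64-bit reasoning rather than anything an assembler exposes.

```c
#include <stdint.h>

/* Hypothetical labels for the ways to reach an absolute address. */
enum abs_encoding {
    ABS_DISP32,          /* plain mov with a sign-extended disp32       */
    ABS_ADDR32,          /* disp32 plus an address-size (addr32) prefix */
    ABS_MOFFS64_OR_REG   /* movabs moffs64 into Areg, or address in reg */
};

/* Address the CPU computes from a disp32: the low 32 bits of `addr`,
 * sign-extended to 64 bits. */
static inline uint64_t sign_extend_disp32(uint64_t addr)
{
    uint64_t low = addr & 0xffffffffULL;
    return (low & 0x80000000ULL) ? (low | 0xffffffff00000000ULL) : low;
}

static inline enum abs_encoding classify_abs_address(uint64_t addr)
{
    if (sign_extend_disp32(addr) == addr)
        return ABS_DISP32;         /* low 2GiB (or top 2GiB) */
    if (addr <= 0xffffffffULL)
        return ABS_ADDR32;         /* rest of the low 4GiB   */
    return ABS_MOFFS64_OR_REG;     /* needs a full 64-bit address */
}
```

With these definitions, sign_extend_disp32(0xdeadbeef) gives 0xffffffffdeadbeef, matching the disassembly, while 0x123456, 0xdeadbeef and 0xefdeadbeef fall into the three categories in order.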
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
