Category "assembly"

Can modern x86 implementations store-forward from more than one prior store?

In the case that a load overlaps two earlier stores (and the load is not fully contained in the oldest store), can modern Intel or AMD x86 implementations forwa

Why is a conditional move not vulnerable to Branch Prediction Failure?

After reading this post (answer on StackOverflow) (at the optimization section), I was wondering why conditional moves are not vulnerable to Branch Prediction

Why is XCHG reg, reg a 3 micro-op instruction on modern Intel architectures?

I'm doing micro-optimization on a performance critical part of my code and came across the sequence of instructions (in AT&T syntax): add %rax, %rbx mov %r

Generating 1 random number within 0-256 range in x86 8086 tasm(16 bit) [duplicate]

At this point I have been learning assembly for about 6 months. My current project is a random number generator. I need to generate 1 random n

Converting ASCII hex number to 32-bit binary integer in x86

So I'm reading the user's 8-digit input and saving it into a variable. For example: Enter an 8-digit hex number: 1ABC5678 So, then I loop through the 1ABC5678 h

What is the difference, if any, between LONG and FAR jumps in Assembly?

I'm looking at some practice code for assembly, and the assignment is basically to replace one jump point with another. The original jmp is a SHORT jmp, and th

Can x86's MOV really be "free"? Why can't I reproduce this at all?

I keep seeing people claim that the MOV instruction can be free in x86, because of register renaming. For the life of me, I can't verify this in a single tes

Printing hex from dx with nasm

I actually want to print the content of the dx register with nasm. Thereby the content is a 16 bit hex digit such as 0x12AB. Therefore I've first implemented a

Adding 2D arrays in Assembly (x86)

I have to add two 3*3 arrays of words and store the result in another array. Here is my code: .data a1 WORD 1,2,3 WORD 4,2,3 WORD 1,4,3 a2 WORD 4, 3, 8

Understanding bubble vs stall vs repeated decode/fetch

I'm really confused on the difference between bubbles, stalls, and repeated decoding/fetching. My text is the Patterson text, 3rd edition. Example 1: add $3,

Micro fusion and addressing modes

I have found something unexpected (to me) using the Intel® Architecture Code Analyzer (IACA). The following instruction using [base+index] addressing add

unknown pseudo-op: `.globl_start'

I have some assembly code written for 32-bit machines but I need to run that on x86-64 bit architecture. Please suggest ways to achieve this. I'm compiling usi

How to add two numbers, integer and a float in NASM?

I have this code that is supposed to add two numbers, a float (3.25) and an integer (2). EDITED: extern _printf, _scanf global _main section .bss num1: resb 4 s

How can I multiply 64 bit operands and get 128 bit result portably?

For x64 I can use this: { uint64_t hi, lo; // hi,lo = 64bit x 64bit multiply of c[0] and b[0] __asm__("mulq %3\n\t" : "=d" (hi), "=a" (lo)

Error in simple g++ inline assembler

I'm trying to write a "hello world" program to test inline assembler in g++. (still learning AT&T syntax) The code is: #include <stdlib.h> #include &

GCC assembly and unsupported instruction `mov'

I'm trying to compile the following assembly code in level2.s movl $0x0000000054756825, %rdi movl $0x000000000040198c, $(0x0000000055685ff8) ; do I need $ for

Assembler HCS12 how does register with index work with TST-instruction?

Hello this is my test code: LDX #$2000 LDY #$1000 LDD #$0000 la: ADDD #1 MOVB 1, X+, 1, Y+

two pass assembler fix

I'm working on a 2 pass assembler and have been looking at sample codes online to familiarise myself. I found the following code but there appears to be a probl