In the case that a load overlaps two earlier stores (and the load is not fully contained in the oldest store), can modern Intel or AMD x86 implementations forwa
After reading this post (an answer on StackOverflow, in the optimization section), I was wondering why conditional moves are not vulnerable to Branch Prediction
I'm doing micro-optimization on a performance critical part of my code and came across the sequence of instructions (in AT&T syntax): add %rax, %rbx mov %r
At this point I have been learning assembly for about 6 months. My current project is a random number generator. I need to generate 1 random n
So I'm reading the user's 8-digit input and saving it into a variable. For example: Enter an 8-digit hex number: 1ABC5678 So, then I loop through the 1ABC5678 h
I'm looking at some practice code for assembly, and the assignment is basically to replace one jump point with another. The original jmp is a SHORT jmp, and th
I keep seeing people claim that the MOV instruction can be free in x86, because of register renaming. For the life of me, I can't verify this in a single tes
I want to print the content of the dx register with NASM. The content is a 16-bit hex value such as 0x12AB. Therefore I've first implemented a
I have to add two 3*3 arrays of words and store the result in another array. Here is my code: .data a1 WORD 1,2,3 WORD 4,2,3 WORD 1,4,3 a2 WORD 4, 3, 8
I'm really confused about the difference between bubbles, stalls, and repeated decoding/fetching. My text is the Patterson text, 3rd edition. Example 1: add $3,
I have found something unexpected (to me) using the Intel® Architecture Code Analyzer (IACA). The following instruction using [base+index] addressing add
I have some assembly code written for 32-bit machines but I need to run it on the x86-64 architecture. Please suggest ways to achieve this. I'm compiling usi
I have this code that is supposed to add two numbers, a float (3.25) and an integer (2). EDITED: extern _printf, _scanf global _main section .bss num1: resb 4 s
For x64 I can use this: { uint64_t hi, lo; // hi,lo = 64bit x 64bit multiply of c[0] and b[0] __asm__("mulq %3\n\t" : "=d" (hi), "=a" (lo)
I'm trying to write a "hello world" program to test inline assembler in g++. (still learning AT&T syntax) The code is: #include <stdlib.h> #include &
I'm trying to compile the following assembly code in level2.s movl $0x0000000054756825, %rdi movl $0x000000000040198c, $(0x0000000055685ff8) ; do I need $ for
Hello, this is my test code: LDX #$2000 LDY #$1000 LDD #$0000 la: ADDD #1 MOVB 1, X+, 1, Y+
I'm working on a 2 pass assembler and have been looking at sample codes online to familiarise myself. I found the following code but there appears to be a probl