'Looking for approaches to optimize a simple byte-at-a-time strcpy in MIPS assembly
I am trying to optimize the following block of mips code and am struggling to find ways to reduce overall instructions or other ways of optimization. Thanks!
.data
Str: .asciiz "ABC"
Strempty: .asciiz " "
.text
.globl main
main:
jal strcpy #Jump to the strcpy subroutine
li $v0,4 #Code for printing strings
la $a0,Strempty #syscall to print the content of the copied string
syscall
li $v0,10 #Terminates the program
syscall
strcpy:
la $a0,Str #Get address of string y
la $a1, Strempty #Get address of string x
addi $sp,$sp,-4 #Adjust stack pointer to make room for doubleword
sw $s5,0($sp) #Push $s5 - i.e push contents of $s5 to the top of the stack
add $s5,$zero,$zero #i=0
L1:
add $t0,$s5,$a0 #$t0 = addr of y[i]
lbu $t1,0($t0) #t1 = y[i]
add $t2,$s5,$a1 #$t2 = addr of x[i]
sb $t1,0($t2) #x[i] = y[i]
beq $t1,$zero,L2 #Branches out when the null character is reached
addi $s5,$s5,1 #i=i+1
j L1 #Next iteration for L1
L2:
lw $s5,0($sp) #restore $s5
addi $sp,$sp,4 #pop the doubleword from the stack pointer
jr $ra #return to main
Solution 1:[1]
Don't use
$s5— use$t4or$v0instead, these are free to use there without the save/restore.$s5is a call preserved register, whereas the$ts and$vs are call clobbered (so go ahead and clobber one). In fact, you should use registers$a0and$a1as the parameter registers.Don't use the stack at all — this is a leaf function and doesn't require any stack space given there are registers immediately available to use.
Use pointers instead of indexing — try to do a pointer version in C first so you can follow it, but the gist is that indexing requires repeated addition, and using pointers instead will reduce that a bit, from 3 additions in the loop to 2 of them. The pointer version will not have need of counter/index variable
i. Modify$a0and$a1in place; and eliminatei.Don't load your parameter values inside the function — let them come as passed by
main.As @Peter says, broadly speaking if we have an unconditional backward branch at the end of the loop and a conditional forward branch somewhere inside the loop that terminates it, we can use the conditional branch to accomplish the looping, and thus eliminate the backward branch. This will become obvious after getting rid of the index increment as the sequence will be:
beq $t1, $0, L2
j L1
L2:
And that sequence can be inverted: change the beq into bne and the branch target label from L2 to L1, then remove the j L1. That conditional branch will both perform the exit test and branching backwards to continue the loop.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
