Why does the compiler copy RDI to another register, and then copy it back to RDI inside a loop?

I'm analysing a piece of inefficient code, but some of the generated assembly confuses me.

Original code:

#include <string.h>

void lowwer(char *str) {
  for (int i = 0; i < strlen(str); ++i) {
    str[i] -= ('A' - 'a');
  }
}

Assembly code (generated by clang 13 with the -Og option):

lowwer:
  pushq %r14 # save callee-saved registers
  pushq %rbx
  pushq %rax
  # guard for the do-while loop
  cmpb  $0, (%rdi) # compare str[0] with '\0' (i.e. check if strlen(str) == 0)
  je    .LBB0_3
  # loop initialization
  movq  %rdi, %r14 # %r14 = str
  xorl  %ebx, %ebx # i = 0 (xorl zeroes all of %rbx and has a more compact encoding)
.LBB0_2:                                # =>This Inner Loop Header: Depth=1
  addb  $32, (%r14,%rbx) # subtract -32 from str[i] ('A' - 'a' = -32)
  addq  $1, %rbx # ++i
  movq  %r14, %rdi # seems meaningless here?
  callq strlen@PLT
  cmpq  %rbx, %rax # check i < strlen(str)
  ja    .LBB0_2
.LBB0_3: # end
  addq  $8, %rsp # ???
  popq  %rbx # restore callee-saved registers
  popq  %r14
  retq

1. What is the instruction movq %r14, %rdi doing? It seems meaningless, because %r14 holds the string pointer and %rdi already holds the same value.
2. What is the purpose of the instruction addq $8, %rsp? I can't see what it is for.
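
For comparison, here is a minimal sketch of the same loop with the strlen call hoisted out of it. It is not part of the original code: the name lowwer_hoisted is hypothetical. The point it illustrates is that, under the System V AMD64 calling convention, %rdi is an argument register that a callee is allowed to overwrite, so its value cannot be assumed to survive callq strlen. That is why the loop above keeps str in the callee-saved %r14 and copies it back into %rdi before every call. With no call inside the loop, the compiler would no longer need that copy there. The sketch assumes the loop never shortens the string; the original re-checks the length on every iteration, so the two versions can differ if a byte becomes 0 after the addition (for example a byte with value 0xE0).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of lowwer(): strlen is called once, before the loop.
 * Assumption: no byte in str turns into '\0' when 32 is added to it, so the
 * length measured up front stays valid for the whole loop. */
void lowwer_hoisted(char *str) {
  size_t len = strlen(str);         /* the only call; %rdi is needed just once */
  for (size_t i = 0; i < len; ++i) {
    str[i] -= ('A' - 'a');          /* 'A' - 'a' is -32, so this adds 32 */
  }
}
```

Called on a writable buffer such as char buf[] = "HELLO";, this leaves "hello" in buf, the same result the original function gives for that input.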


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
