Pointer arithmetic in C when used as a target array for strcat()
When studying string manipulation in C, I've come across an effect that's not quite what I would have expected with strcat(). Take the following little program:
#include <stdio.h>
#include <string.h>
int main()
{
    char string[20] = "abcde";
    strcat(string + 1, "fghij");
    printf("%s", string);
    return 0;
}
I would expect this program to print out bcdefghij. My thinking was that in C, strings are arrays of characters, and the name of an array is a pointer to its first element, i.e., the element with index zero. So the variable string is a pointer to a. But if I calculate string + 1 and use that as the destination array for concatenation with strcat(), I get a pointer to a memory address that's one array element (1 * sizeof(char), in this case) away, and hence a pointer to the b. So my thinking was that the target destination is the array starting with b (and ending with the invisible null character), and to that the fghij is concatenated, giving me bcdefghij.
But that's not what I get - the output of the program is abcdefghij. It's the exact same output as I would get with strcat(string, "fghij"); - the addition of the 1 to string is ignored. I also get the same output with an addition of another number, e.g. strcat(string + 4, "fghij");, for that matter.
Can somebody explain to me why this is the case? My best guess is that it has to do with the binding precedence of the + operator, but I'm not sure about this.
Edit: I increased the size of the original array with char string[20] so that it will, in any case, be big enough to hold the concatenated string. Output is still the same, which I think means the array overflow is not key to my question.
Solution 1:[1]
You will get an output of abcdefghij, because your call to strcat hasn't changed the address of string (and nor can you change that – it's fixed for the duration of the block in which it is declared, just like the address of any other variable). What you are passing to strcat is the address of the second element of the string array: but that is still interpreted as the address of a nul-terminated string, to which the call appends the second (source) argument. Appending that second argument's content to string, string + 1 or string + n will produce the same result in the string array, so long as there is a nul-terminator at or after the n index.
To print the value of the string that you actually pass to the strcat call (i.e., starting from the 'b' character), you can save the return value of the call and print that:
#include <stdio.h>
#include <string.h>
int main()
{
    char string[20] = "abcde";
    char* result = strcat(string + 1, "fghij"); // strcat will return the "string + 1" pointer
    printf("%s", result); // bcdefghij
    return 0;
}
Solution 2:[2]
char string[] = "abcde";
strcat(string + 1, "fghij");
Append five characters to a full string array. Booom. Undefined behavior.
Adding an offset to the destination is at most a performance optimization: it tells strcat that the string is known to be at least that many characters long, so those characters can be skipped when searching for the null terminator.
You seem to believe that a string is a thing of its own and not an array, and strcat is doing something to its first argument. That's not how that works. Strings are arrays*; and strcat is modifying the array contents.
*Somebody's going to come by and claim that heap allocated strings are not arrays. OP is not dealing with heap yet.
Solution 3:[3]
Arrays are non-modifiable lvalues. For example, you may not write
char string[20] = "abcde";
char string2[] = "fghij";
string = string2; /* error: an array cannot be assigned to */
Used in expressions, arrays are, with rare exceptions, implicitly converted to pointers to their first elements.
If you write, for example, string + 1, the address of the array is not changed.
In this call
strcat(string + 1, "fghij");
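strcat searches for the terminating null character starting from the second element of the array string and appends the new characters there, so the array is overwritten from the old terminator onward.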
In this statement
printf("%s", string);
the whole array is output starting from its first character (again, the array designator used as an argument is converted to a pointer to its first element).
You could write for example
printf("%s", string + 1);
In this case the array is output starting from its second element.
Solution 4:[4]
These are just two pointers to different parts of the same memory inside the same array. There is nothing in your code which creates a second array. "the name of an array is a pointer to its first element" well, not really, it decays into a pointer to its first element whenever used in an expression. So in case of string + 1, this decay first happens to the string operand and then you get pointer arithmetic afterwards. You can actually never do pointer arithmetic on array types, only on decayed pointers. Details here: Do pointers support "array style indexing"?
As for strcat, it basically does two things: call strlen on the destination string to find where it ends, then call strcpy to copy the new string to the position where the null terminator was stored. It's the very same thing as typing strcpy(&dst[strlen(dst)], src); where dst is the first (destination) argument and src the second (source) argument.
Therefore it won't matter if you pass string + 1 or string, because in either case strcat will look for the null terminator and nothing else.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Joshua |
| Solution 3 | Vlad from Moscow |
| Solution 4 | Lundin |
