'Signed to unsigned conversion in C - is it always safe?
Suppose I have the following C code.
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
What implicit conversions are going on here, and is this code safe for all values of u and i? (Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
Solution 1:[1]
Conversion from signed to unsigned does not necessarily just copy or reinterpret the representation of the signed value. Quoting the C standard (C99 6.3.1.3):
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
For the two's complement representation that's nearly universal these days, the rules do correspond to reinterpreting the bits. But for other representations (sign-and-magnitude or ones' complement), the C implementation must still arrange for the same result, which means that the conversion can't just copy the bits. For example, (unsigned)-1 == UINT_MAX, regardless of the representation.
In general, conversions in C are defined to operate on values, not on representations.
To answer the original question:
unsigned int u = 1234;
int i = -5678;
unsigned int result = u + i;
The value of i is converted to unsigned int, yielding UINT_MAX + 1 - 5678. This value is then added to the unsigned value 1234, yielding UINT_MAX + 1 - 4444.
(Unlike unsigned overflow, signed overflow invokes undefined behavior. Wraparound is common, but is not guaranteed by the C standard -- and compiler optimizations can wreak havoc on code that makes unwarranted assumptions.)
Solution 2:[2]
Referring to The C Programming Language, Second Edition (ISBN 0131103628),
- Your addition operation causes the int to be converted to an unsigned int.
- Assuming two's complement representation and equally sized types, the bit pattern does not change.
- Conversion from unsigned int to signed int is implementation dependent. (But it probably works the way you expect on most platforms these days.)
- The rules are a little more complicated in the case of combining signed and unsigned of differing sizes.
Solution 3:[3]
When one unsigned and one signed variable are added (or any binary operation) both are implicitly converted to unsigned, which would in this case result in a huge result.
So it is safe in the sense of that the result might be huge and wrong, but it will never crash.
Solution 4:[4]
When converting from signed to unsigned there are two possibilities. Numbers that were originally positive remain (or are interpreted as) the same value. Number that were originally negative will now be interpreted as larger positive numbers.
Solution 5:[5]
As was previously answered, you can cast back and forth between signed and unsigned without a problem. The border case for signed integers is -1 (0xFFFFFFFF). Try adding and subtracting from that and you'll find that you can cast back and have it be correct.
However, if you are going to be casting back and forth, I would strongly advise naming your variables such that it is clear what type they are, eg:
int iValue, iResult;
unsigned int uValue, uResult;
It is far too easy to get distracted by more important issues and forget which variable is what type if they are named without a hint. You don't want to cast to an unsigned and then use that as an array index.
Solution 6:[6]
What implicit conversions are going on here,
i will be converted to an unsigned integer.
and is this code safe for all values of u and i?
Safe in the sense of being well-defined yes (see https://stackoverflow.com/a/50632/5083516 ).
The rules are written in typically hard to read standards-speak but essentially whatever representation was used in the signed integer the unsigned integer will contain a 2's complement representation of the number.
Addition, subtraction and multiplication will work correctly on these numbers resulting in another unsigned integer containing a twos complement number representing the "real result".
division and casting to larger unsigned integer types will have well-defined results but those results will not be 2's complement representations of the "real result".
(Safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result.)
While conversions from signed to unsigned are defined by the standard the reverse is implementation-defined both gcc and msvc define the conversion such that you will get the "real result" when converting a 2's complement number stored in an unsigned integer back to a signed integer. I expect you will only find any other behaviour on obscure systems that don't use 2's complement for signed integers.
https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation https://msdn.microsoft.com/en-us/library/0eex498h.aspx
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Samuel Liew |
| Solution 3 | Mats Fredriksson |
| Solution 4 | Tim Ring |
| Solution 5 | Taylor Price |
| Solution 6 | Community |
