Comparing unsigned integer with negative literals
I have this simple C program.
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
bool foo (unsigned int a) {
    return (a > -2L);
}
bool bar (unsigned long a) {
    return (a > -2L);
}
int main() {
    printf("foo returned = %d\n", foo(99));
    printf("bar returned = %d\n", bar(99));
    return 0;
}
Output when I run this -
foo returned = 1
bar returned = 0
Recreated in godbolt here
My question is: why does foo(99) return true but bar(99) return false?
To me it makes sense that bar would return false. For simplicity, let's say longs are 8 bits; then (using two's complement for signed values):
99 == 0110 0011
-2 == unsigned 254 == 1111 1110
So clearly the CMP instruction will see that 1111 1110 is bigger and return false.
But I don't understand what is going on behind the scenes in the foo function. The assembly for foo seems to be hardcoded to always return 1 (mov eax,0x1). I would have expected foo to do something similar to bar. What is going on here?
Solution 1:[1]
In the first function
bool foo (unsigned int a) {
    return (a > -2L);
}
both operands of the expression a > -2L have the type long. The first operand is converted to long by the usual arithmetic conversions, because the rank of long is greater than the rank of unsigned int and, on the system used, long can represent all values of unsigned int. The positive value 99L is clearly greater than the negative value -2L.
The first function could produce the result 0 provided that sizeof( long ) is equal to sizeof( unsigned int ). In that case the type long cannot represent all values of the type unsigned int, so the usual arithmetic conversions convert both operands to the type unsigned long.
For example, running the function foo using MS VS 2019, where sizeof( long ) is 4, the same as sizeof( unsigned int ), gives the result 0.
Here is a demonstration program written in C++ that shows why a call of the function foo built with MS VS 2019 can return 0.
#include <iostream>
#include <iomanip>
#include <type_traits>
int main()
{
    unsigned int x = 0;
    long y = 0;
    std::cout << "sizeof( unsigned int ) = " << sizeof( unsigned int ) << '\n';
    std::cout << "sizeof( long ) = " << sizeof(long) << '\n';
    std::cout << "std::is_same_v<decltype( x + y ), unsigned long> is "
              << std::boolalpha
              << std::is_same_v<decltype( x + y ), unsigned long>
              << '\n';
}
The program output is
sizeof( unsigned int ) = 4
sizeof( long ) = 4
std::is_same_v<decltype( x + y ), unsigned long> is true
That is, in general the result of the first function is implementation-defined.
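The dependence on the platform can be sketched in C as well. This is a minimal illustration, not part of the original answer: the helper names long_holds_all_uint and predicted_foo_99 are hypothetical, and foo is copied from the question. The idea is that foo(99) should return true exactly when long can represent every unsigned int value.

```c
#include <limits.h>
#include <stdbool.h>

/* Same function as in the question. */
bool foo(unsigned int a) {
    return (a > -2L);
}

/* Hypothetical helper: true when long can represent every unsigned int
   value. Both limits are widened to unsigned long long first so that this
   check is not itself affected by the conversions being studied. */
bool long_holds_all_uint(void) {
    return (unsigned long long)LONG_MAX >= (unsigned long long)UINT_MAX;
}

/* Hypothetical helper: the value foo(99) is expected to return here.
   If long holds all unsigned int values, a becomes 99L and 99L > -2L.
   Otherwise both operands become unsigned long and 99 is not greater
   than ULONG_MAX - 1. */
bool predicted_foo_99(void) {
    return long_holds_all_uint();
}
```

On a typical 64-bit Linux target (long wider than unsigned int) both foo(99) and predicted_foo_99() should be true; with a 4-byte long, as in MS VS 2019, both should be false.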
In the second function
bool bar (unsigned long a) {
    return (a > -2L);
}
both operands have the type unsigned long. Again this is due to the usual arithmetic conversions: the ranks of the types unsigned long and signed long are equal, so the operand of type signed long is converted to unsigned long. And -2L converted to unsigned long is greater than 99.
Solution 2:[2]
The reason for this has to do with the rules of integer conversions.
In the first case, you compare an unsigned int with a long using the > operator, and in the second case you compare an unsigned long with a long.
These operands must first be converted to a common type using the usual arithmetic conversions. These are spelled out in section 6.3.1.8p1 of the C standard, with the following excerpt focusing on integer conversions:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
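As a rough check of which rule a given compiler applies, the sketch below uses the C11 _Generic selection to report the common type of the two operand pairs. The macros IS_LONG and IS_UNSIGNED_LONG and the function names are hypothetical, introduced only for this illustration. The expression a + -2L is used instead of a > -2L because a comparison always yields int, while addition yields the converted common type.

```c
/* Hypothetical macros: evaluate to 1 when the expression has the given type. */
#define IS_LONG(x)          _Generic((x), long: 1, default: 0)
#define IS_UNSIGNED_LONG(x) _Generic((x), unsigned long: 1, default: 0)

/* Operand types as in foo(): unsigned int and long. */
int foo_common_type_is_long(void) {
    unsigned int a = 99;
    return IS_LONG(a + -2L);
}

int foo_common_type_is_unsigned_long(void) {
    unsigned int a = 99;
    return IS_UNSIGNED_LONG(a + -2L);
}

/* Operand types as in bar(): unsigned long and long. */
int bar_common_type_is_unsigned_long(void) {
    unsigned long a = 99;
    return IS_UNSIGNED_LONG(a + -2L);
}
```

For bar the common type should be unsigned long on every conforming implementation. For foo exactly one of the two functions should return 1, depending on whether long can represent all values of unsigned int.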
In the case of comparing an unsigned int with a long, the fourth rule quoted above applies (the signed type can represent all values of the unsigned type). long has higher rank and (assuming long is 64 bit and int is 32 bit) can hold all values that an unsigned int can, so the unsigned int operand a is converted to a long. Since the value in question is in the range of long, section 6.3.1.3p1 dictates how the conversion happens:
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
So the value is preserved and we're left with 99 > -2 which is true.
In the case of comparing an unsigned long with a long, the third rule quoted above applies (the unsigned type has rank greater than or equal to the rank of the signed type). Both types are of the same rank with different signs, so the long constant -2L is converted to unsigned long. -2 is outside the range of an unsigned long, so a value conversion must happen. This conversion is specified in section 6.3.1.3p2:
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
So the long value -2 will be converted to the unsigned long value 2^64 - 2, assuming unsigned long is 64 bit. So we're left with 99 > 2^64 - 2, which is false.
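The wraparound can be checked without assuming a particular width, since "one more than the maximum value" minus 2 is always ULONG_MAX - 1. The following is a small sketch under that reading of 6.3.1.3p2. The helper names minus_two_as_unsigned_long and check_wraparound are hypothetical, and bar is copied from the question.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Same function as in the question. */
bool bar(unsigned long a) {
    return (a > -2L);
}

/* Hypothetical helper: -2L after conversion to unsigned long,
   the value bar() actually compares against. */
unsigned long minus_two_as_unsigned_long(void) {
    return (unsigned long)-2L;
}

/* Hypothetical helper: asserts the consequences described in the text. */
void check_wraparound(void) {
    /* (ULONG_MAX + 1) - 2 == ULONG_MAX - 1, whatever the width of long. */
    assert(minus_two_as_unsigned_long() == ULONG_MAX - 1UL);
    /* 99 is far below that value, so bar(99) is false. */
    assert(!bar(99));
    /* Only ULONG_MAX itself is greater than ULONG_MAX - 1. */
    assert(bar(ULONG_MAX));
}
```

With a 64-bit unsigned long, ULONG_MAX - 1 is the 2^64 - 2 mentioned above.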
Solution 3:[3]
I think what is happening here is implicit conversion by the compiler. When you compare two different primitive types, the compiler converts one operand to the type of the other. I believe the rule is, roughly, that the type able to hold the larger value is used as the common type. So in foo() your argument is implicitly converted to a signed long and the comparison works as expected. In bar() your argument is an unsigned long, which has a larger maximum value than signed long. Here the compiler converts -2L to unsigned long, which turns it into a very large number.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | akatz |
