Wrong result on addition of numbers larger than epsilon using numpy.float128

I take epsilon to be the smallest number that can be added to 1 and still give a result different from 1.

When I add such a number to 1 and print the result, I get 1 instead of 1+epsilon.

I've implemented a getEpsilon function. I added a print statement for debugging.

The function is implemented as follows:

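import numpy as np

def getEpsilon():
    a = np.float128(1)
    b = np.float128(1)
    c = np.float128(2)
    # Halve b until adding it to a no longer changes a.
    while a + b != a:
        b = b / c
        d = a + b
        # Debug output: the current b and the sum a + b.
        print(f"b={b:3.50f}, d={d:3.50f}")
    # The last b was too small, so step back by one halving.
    return b * c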

After some iterations of the while loop the printed value of d is exactly 1, but a + b != a still evaluates to True.

This is the output:

b=0.5000000000000000000000000, d=1.5000000000000000000000000
b=0.2500000000000000000000000, d=1.2500000000000000000000000
...
b=0.0000000000000004440892099, d=1.0000000000000004440892099
b=0.0000000000000002220446049, d=1.0000000000000002220446049
b=0.0000000000000001110223025, d=1.0000000000000000000000000
b=0.0000000000000000555111512, d=1.0000000000000000000000000
...
b=0.0000000000000000001084202, d=1.0000000000000000000000000
b=0.0000000000000000000542101, d=1.0000000000000000000000000

Why does a + b != a behave differently from d = a + b?

It looks like some operation is being done in 64-bit precision instead of 128-bit.
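
One way to narrow this down is to separate the arithmetic from the printing. One explanation that would fit the output above (an assumption, not something confirmed in this post) is that formatting with a format spec converts the value to a 64-bit Python float first, while a + b != a is compared in extended precision. The sketch below assumes np.longdouble is the type that np.float128 aliases on this platform, and uses np.format_float_positional as a formatter that works on the NumPy scalar directly. The variable names are only illustrative.

```python
import numpy as np

# Assumption: np.longdouble is the type behind np.float128 on this platform.
a = np.longdouble(1)
# Spacing between 1 and the next representable value in this type.
b = np.finfo(np.longdouble).eps
d = a + b

# Comparison carried out entirely in the extended type.
still_distinct = bool(d != a)

# The same value after an explicit conversion to a 64-bit Python float.
collapsed_as_float = float(d) == 1.0

# Formatting the NumPy scalar itself, without converting to a Python float.
text = np.format_float_positional(d, precision=25, unique=False)

print(still_distinct, collapsed_as_float, text)
```

If still_distinct is True while collapsed_as_float is also True, the sum itself is held in more than 64 bits, and only a conversion to a 64-bit float makes it look like 1.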

If I repeat it with np.float64 instead, the last two lines of the output are:

b=0.0000000000000002220446049, d=1.0000000000000002220446049
b=0.0000000000000001110223025, d=1.0000000000000000000000000
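
As a cross-check that does not depend on printed digits, the loop result can be compared with np.finfo(...).eps for each type. This is a minimal sketch: get_epsilon is a hypothetical variant of getEpsilon that takes the type as a parameter and drops the debug print, and np.longdouble is again assumed to stand in for np.float128.

```python
import numpy as np

def get_epsilon(dtype):
    """Halve b until 1 + b equals 1 in the given type, then step back once."""
    one = dtype(1)
    two = dtype(2)
    b = dtype(1)
    while one + b != one:
        b = b / two
    return b * two

for dtype in (np.float64, np.longdouble):
    measured = get_epsilon(dtype)
    reference = np.finfo(dtype).eps
    # Compare the values directly rather than their formatted text.
    print(dtype.__name__, bool(measured == reference))
```

If the comparison holds for both types, the loop itself is working in the precision of the type it was given, whatever the formatted output suggests.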


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source