Python: data type conversion of numpy array containing NaN
I am running into a behaviour that I cannot explain.
Problem description
I expected that converting an array containing NaN to an integer data type would raise the "classic" ValueError: cannot convert float NaN to integer.
Unfortunately, this does not happen.
At first glance, the reason seems to be that a NaN stored in a numpy array is converted to numpy.float64 instead of "remaining" a float, as per the numpy.ndarray documentation.
In fact, consider this example:
import numpy as np
arr = np.array([np.nan, 1])
print(type(arr[0])) # Output: <class 'numpy.float64'>
# Converting numpy.float64 NaN to other data types
## numpy integers
print( np.int8(arr[0])) # Output: 0
print(np.int16(arr[0])) # Output: 0
print(np.int32(arr[0])) # Output: -2147483648 (i.e. -2 ** 31)
print(np.int64(arr[0])) # Output: -9223372036854775808 (i.e. -2 ** 63)
## numpy unsigned integers
print( np.uint8(arr[0])) # Output: 0
print(np.uint16(arr[0])) # Output: 0
print(np.uint32(arr[0])) # Output: 0
print(np.uint64(arr[0])) # Output: 9223372036854775808
## numpy floats
print(np.float16(arr[0])) # Output: nan
print(np.float32(arr[0])) # Output: nan
print(np.float64(arr[0])) # Output: nan
# Converting Python float NaN (np.nan) to other data types
## np.float and np.int are aliases of float and int respectively (both are deprecated since NumPy 1.20 and removed in NumPy 1.24)
type(np.nan) == type(np.float(np.nan)) # Output: True
print(np.int(np.nan)) # Output: ValueError: cannot convert float NaN to integer
## numpy integers
print( np.int8(np.nan)) # Output: ValueError: cannot convert float NaN to integer
print(np.int16(np.nan)) # Output: ValueError: cannot convert float NaN to integer
print(np.int32(np.nan)) # Output: ValueError: cannot convert float NaN to integer
print(np.int64(np.nan)) # Output: ValueError: cannot convert float NaN to integer
## numpy unsigned integers
print( np.uint8(np.nan)) # Output: ValueError: cannot convert float NaN to integer
print(np.uint16(np.nan)) # Output: ValueError: cannot convert float NaN to integer
print(np.uint32(np.nan)) # Output: ValueError: cannot convert float NaN to integer
print(np.uint64(np.nan)) # Output: ValueError: cannot convert float NaN to integer
## numpy floats
print(np.float16(np.nan)) # Output: nan
print(np.float32(np.nan)) # Output: nan
print(np.float64(np.nan)) # Output: nan
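As a side check on how the two float types relate, here is a minimal sketch. It only uses the built-in issubclass and int and the numpy scalar method item(); the variable names are illustrative. It is a quick probe, not an explanation of the integer results above.

```python
import numpy as np

arr = np.array([np.nan, 1])
x = arr[0]  # a numpy.float64 scalar

# numpy.float64 is a subclass of Python's built-in float
is_subclass = issubclass(np.float64, float)

# .item() unwraps the numpy scalar into a plain Python float
py_value = x.item()

# The built-in int() rejects NaN, whichever float type holds it
try:
    int(x)
    raised = False
except ValueError:
    raised = True

print(is_subclass, type(py_value), raised)
```

So the built-in int() still raises for a numpy.float64 NaN; only the numpy integer constructors behave differently.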
Questions
- Why are the elements of arr converted to numpy.float64 instead of float in the first place?
- In what exactly do Python's built-in float and numpy.float64 differ?
- Why do np.int32(arr) and np.int64(arr) contain the "smallest possible int" while np.int8(arr) and np.int16(arr) contain zeros? (Similar question for unsigned integer types)
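For reference, the error I was hoping for can be produced with an explicit check before the cast. The following is a minimal sketch, assuming a hypothetical helper named to_int_strict (it is not part of numpy); it relies only on np.asarray, np.issubdtype, np.isfinite and ndarray.astype.

```python
import numpy as np

def to_int_strict(values, dtype=np.int64):
    """Cast to an integer dtype, raising if any value is NaN or infinite.

    Hypothetical helper for illustration; not a numpy function.
    """
    arr = np.asarray(values)
    # Only floating-point input can hold NaN or inf
    if np.issubdtype(arr.dtype, np.floating) and not np.isfinite(arr).all():
        raise ValueError("cannot convert non-finite values (NaN or inf) to integer")
    return arr.astype(dtype)

print(to_int_strict(np.array([1.0, 2.0])))
```

Calling to_int_strict(np.array([np.nan, 1])) raises ValueError instead of silently returning an arbitrary integer. The check also rejects infinities, which is a design choice of this sketch rather than something the question requires.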
Edit
I see this behaviour across multiple platforms and Python/numpy versions; see the three examples below.
import platform
import sys
import numpy
print('Platform:', platform.platform())
print('Python version:', sys.version)
print('numpy.__version__:', numpy.__version__)
WSL
Platform: Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Python version: 3.9.7 (default, Mar 3 2022, 13:49:04)
[GCC 9.3.0]
numpy.__version__: 1.21.5
Windows
Platform: Windows-10-10.0.22593-SP0
Python version: 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:26:21) [MSC v.1929 64 bit (AMD64)]
numpy.__version__: 1.21.0
Online W3Schools
Platform: Linux-4.19.0-18-amd64-x86_64-with-glibc2.29
Python version: 3.8.2 (default, Mar 13 2020, 10:14:16)
[GCC 9.3.0]
numpy.__version__: 1.18.2
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow