How to convert a numpy array with dtype=object to a numpy array of int?

Now I have a numpy array:

In [1]: import numpy as np

In [2]: a = np.ones(10) * (1 << 64)

In [3]: a
Out[3]: 
array([1.8446744073709552e+19, 1.8446744073709552e+19,
       1.8446744073709552e+19, 1.8446744073709552e+19,
       1.8446744073709552e+19, 1.8446744073709552e+19,
       1.8446744073709552e+19, 1.8446744073709552e+19,
       1.8446744073709552e+19, 1.8446744073709552e+19], dtype=object)

All elements of the array a are Python floats. I want to convert it to a numpy array with integer elements. I can do it this way:

In [4]: np.array([int(x) for x in a])
Out[4]: 
array([18446744073709551616, 18446744073709551616, 18446744073709551616,
       18446744073709551616, 18446744073709551616, 18446744073709551616,
       18446744073709551616, 18446744073709551616, 18446744073709551616,
       18446744073709551616], dtype=object)

However, this seems clumsy and not very numpy-like. numpy arrays have an astype method, but it fails here:

In [5]: a.astype(int)
---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
Input In [5], in <cell line: 1>()
----> 1 a.astype(int)

OverflowError: Python int too large to convert to C long

Is there a better way to convert the array a to integers?



Solution 1:[1]

As Nechoj has pointed out:

The largest integer type numpy offers is 64 bits wide, so the best you could do is a.astype(np.int64). But this is not big enough for your numbers: np.int64 tops out at 2**63 - 1, and your values are 2**64.
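To check these size limits, here is a minimal sketch using np.iinfo. The variable names are only illustrative; the value 1 << 64 is taken from the question.

```python
import numpy as np

target = 1 << 64  # the value stored in the question's array, i.e. 2**64

int64_max = np.iinfo(np.int64).max    # 2**63 - 1
uint64_max = np.iinfo(np.uint64).max  # 2**64 - 1

# Neither 64-bit integer type can hold 2**64.
fits_int64 = target <= int64_max
fits_uint64 = target <= uint64_max

print(int64_max)    # 9223372036854775807
print(uint64_max)   # 18446744073709551615
print(fits_int64, fits_uint64)  # False False
```

Even the unsigned type falls short by exactly one, so any exact representation of these values has to stay in dtype=object.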

But there is a workaround if you do not need the full precision. Just choose a different scale, so that the resulting numbers are small enough to be representable as int64, e.g.

a_in_quadrillions = (a / 10**15).astype(float).round().astype(int)
a_in_quadrillions
array([18447, 18447, 18447, 18447, 18447, 
       18447, 18447, 18447, 18447, 18447])

The detour via float is necessary, because round() does not accept the object data type. If you do not care about the final digit, there is a shorter solution instead:

a_in_quadrillions = (a / 10**15).astype(int)
a_in_quadrillions
array([18446, 18446, 18446, 18446, 18446, 
       18446, 18446, 18446, 18446, 18446])

Note that this does not round the numbers; it truncates them, discarding the fractional part.
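The two variants can be compared side by side in a self-contained sketch. It assumes the array is built explicitly as dtype=object holding Python floats (to mirror the array in the question), and it uses np.int64 rather than int so the result type does not depend on the platform.

```python
import numpy as np

# Assumption: an object array of Python floats, mirroring the question's array.
a = np.array([float(1 << 64)] * 10, dtype=object)

# Rescale to quadrillions and move from object dtype to float64.
scaled = (a / 10**15).astype(float)   # each value is about 18446.744

rounded = scaled.round().astype(np.int64)  # round to nearest, then convert
truncated = scaled.astype(np.int64)        # conversion drops the fractional part

print(rounded[0], truncated[0])  # 18447 18446
```

Which variant is appropriate depends on whether an error of up to one unit in the last kept digit matters for your use case.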

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Arne