ValueError: Cannot convert non-finite values (NA or inf) to integer

df.dtypes
name         object
rating       object
genre        object
year          int64
released     object
score       float64
votes       float64
director     object
writer       object
star         object
country      object
budget      float64
gross       float64
company      object
runtime     float64
dtype: object

Then, when I try to convert the column using:

df['budget'] = df['budget'].astype("int64")

it says:

ValueError                                Traceback (most recent call last)
<ipython-input-23-6ced5964af60> in <module>
      1 # Change Datatype for Columns
----> 2 df['budget'] = df['budget'].astype("int64")
      3 
      4 #df['column_name'].astype(np.float).astype("Int32")
      5 #df['gross'] = df['gross'].astype('int64')

~\anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors)
   5696         else:
   5697             # else, only a single dtype is given
-> 5698             new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors)
   5699             return self._constructor(new_data).__finalize__(self)
   5700 

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in astype(self, dtype, copy, errors)
    580 
    581     def astype(self, dtype, copy: bool = False, errors: str = "raise"):
--> 582         return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
    583 
    584     def convert(self, **kwargs):

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in apply(self, f, filter, **kwargs)
    440                 applied = b.apply(f, **kwargs)
    441             else:
--> 442                 applied = getattr(b, f)(**kwargs)
    443             result_blocks = _extend_blocks(applied, result_blocks)
    444 

~\anaconda3\lib\site-packages\pandas\core\internals\blocks.py in astype(self, dtype, copy, errors)
    623             vals1d = values.ravel()
    624             try:
--> 625                 values = astype_nansafe(vals1d, dtype, copy=True)
    626             except (ValueError, TypeError):
    627                 # e.g. astype_nansafe can fail on object-dtype of strings

~\anaconda3\lib\site-packages\pandas\core\dtypes\cast.py in astype_nansafe(arr, dtype, copy, skipna)
    866 
    867         if not np.isfinite(arr).all():
--> 868             raise ValueError("Cannot convert non-finite values (NA or inf) to integer")
    869 
    870     elif is_object_dtype(arr):

ValueError: Cannot convert non-finite values (NA or inf) to integer


Solution 1:[1]

Assuming the budget column does not contain infinite values, the likely cause is NaN values. A float column can hold NaN, but a NumPy integer column such as int64 cannot.

You can:

  1. Drop the NaN values before converting.
  2. Or, if you want to keep the missing values and have a recent version of pandas (0.24 or later), convert to the nullable integer type, which accepts missing values (note the capital I):

df['budget'] = df['budget'].astype("Int64")
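
A minimal sketch of both options, side by side. The DataFrame below is made-up sample data, not the dataset from the question, and the names dropped and nullable are only illustrative:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real 'budget' column comes from the asker's file.
df = pd.DataFrame({"budget": [19000000.0, np.nan, 4500000.0]})

# Option 1: drop rows with a missing budget, then cast to NumPy int64.
dropped = df.dropna(subset=["budget"]).copy()
dropped["budget"] = dropped["budget"].astype("int64")

# Option 2: keep the missing rows and cast to the nullable Int64 dtype.
nullable = df.copy()
nullable["budget"] = nullable["budget"].astype("Int64")

print(dropped["budget"].dtype)   # int64
print(nullable["budget"].dtype)  # Int64
print(nullable)                  # the missing row is shown as <NA>
```

Option 1 changes the number of rows, while option 2 keeps every row and marks the gap as missing. The cast to Int64 expects whole-number floats, so a column with fractional values may need rounding first.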

Solution 2:[2]

Try this; note the capital "I" in Int64:

df['budget'] = df['budget'].astype("Int64") 

You might have some NaN values in this column, which would explain the error.

From pandas docs:

Changed in version 1.0.0: Now uses pandas.NA as the missing value rather than numpy.nan

Follow the link to find out more:

https://pandas.pydata.org/pandas-docs/stable/user_guide/integer_na.html

Or you could fill the NaN/NA values with 0 and then call .astype("int64"):

df['budget'] = df['budget'].fillna(0).astype("int64")
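
One caveat with filling: after the cast, a filled 0 looks the same as a real budget of 0. A rough sketch of one way to keep that information, using made-up sample data and a hypothetical helper column named budget_missing:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the asker's actual file.
df = pd.DataFrame({"budget": [19000000.0, np.nan, 4500000.0]})

# Remember which rows were missing before overwriting them with 0.
df["budget_missing"] = df["budget"].isna()

# Fill the gaps, then cast; every value is finite, so int64 is accepted.
df["budget"] = df["budget"].fillna(0).astype("int64")

print(df["budget"].dtype)  # int64
print(df)
```

Whether 0 is a sensible placeholder depends on how the column is used later; sums are unaffected by it, but averages are not.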

Solution 3:[3]

Check whether the column contains any null values. If there are none, try apply() instead of astype():

df['budget'] = df['budget'].apply(int)
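
The check for null values could look something like the sketch below. The error message names both NA and inf, so this counts both. The sample data and the variable names n_missing, n_infinite and bad_rows are assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one NaN and one inf.
df = pd.DataFrame({"budget": [19000000.0, np.nan, np.inf, 4500000.0]})

n_missing = df["budget"].isna().sum()       # counts NaN / None
n_infinite = np.isinf(df["budget"]).sum()   # counts +inf and -inf

print(n_missing, n_infinite)

# Rows that would block a cast to int64.
bad_rows = df[~np.isfinite(df["budget"])]
print(bad_rows)
```

If both counts are zero, a plain astype("int64") should no longer raise this particular error.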

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    Mohammad
Solution 2    Tomer Poliakov
Solution 3    Pulkit Chandel