Convert pandas values to int when the column contains NaN values [duplicate]
I am reading a dataframe from Excel. The sheet contains empty cells.
I want to convert all the values (numbers) to int, but this cannot be done directly because of the NaN values.
A possible workaround is described in:
convert into int data in pandas
import pandas as pd
import numpy as np
ind = list(range(5))
values = [1.0,np.nan,3.0,4.0,5.0]
df5 = pd.DataFrame(index=ind, data={'users':values})
df5
Then replace the NaN values with -1, which is an int, cast the column to int, and put the NaN values back:
df5 = df5.replace(np.nan,-1)
df5 = df5.astype('int')
df5 = df5.replace(-1, np.nan)
But the last operation turns the data back into float.
Why? How should I do it?
I don't want decimal values, since "users" are people.
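To make the "why" concrete, here is a minimal sketch that tracks the column dtype after each step of the workaround. It assumes the same toy data as above; the variable names (`as_int`, `restored`, and so on) are only illustrative. NaN is a floating-point value and NumPy's `int64` has no way to represent it, so pandas upcasts the column to `float64` as soon as NaN is written back.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"users": [1.0, np.nan, 3.0, 4.0, 5.0]})

# Step 1: fill the gaps with a sentinel so a plain integer cast is possible.
as_int = df.fillna(-1).astype("int64")
dtype_after_cast = as_int["users"].dtype

# Step 2: put NaN back. NaN is a float, so pandas upcasts the whole column.
restored = as_int.replace(-1, np.nan)
dtype_after_restore = restored["users"].dtype

print(dtype_after_cast, dtype_after_restore)  # int64 float64
```

So the round trip through -1 cannot work with the default NumPy-backed integer dtype; the column needs a dtype that can hold both integers and a missing-value marker.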
Solution 1:[1]
See https://stackoverflow.com/a/51997100/11103175. pandas has a nullable integer dtype, 'Int64' (note the capital I), which lets an integer column keep its missing values instead of falling back to float.
You can specify the dtype when you create the dataframe or after the fact:
import pandas as pd
import numpy as np
ind = list(range(5))
values = [1.0,np.nan,3.0,4.0,5.0]
df5 = pd.DataFrame(index=ind, data={'users': values}, dtype='Int64')
# Or convert an existing dataframe after the fact:
# df5 = df5.astype('Int64')
df5
Giving:
users
0 1
1 <NA>
2 3
3 4
4 5
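As a rough sketch of the "after the fact" route, the snippet below converts an existing float column with `astype('Int64')` and checks that the result behaves like an integer column with a gap. The data is the same toy example; the names `column_dtype`, `missing_count`, and `total_users` are just for illustration. Note that the cast only succeeds when every non-missing float is a whole number.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"users": [1.0, np.nan, 3.0, 4.0, 5.0]})

# Convert the existing float column; NaN becomes the missing marker pd.NA.
df["users"] = df["users"].astype("Int64")

column_dtype = str(df["users"].dtype)
missing_count = int(df["users"].isna().sum())
# Reductions skip missing values by default, so the sum stays an integer.
total_users = int(df["users"].sum())

print(column_dtype, missing_count, total_users)  # Int64 1 13
```

The usual missing-data helpers such as `isna()`, `fillna()`, and `dropna()` work on this dtype, so downstream code written for NaN typically needs little or no change.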
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
