Python: Pandas df.fillna() function changes all data types to object
I want to fill the null values in the features of my dataframe. But after filling every feature, the data type of each feature I filled changed to "object".
I have a dataframe with these data types:
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 umur 7832 non-null float64
1 jenis_kelamin 7840 non-null object
2 pekerjaan 7760 non-null object
3 provinsi 7831 non-null object
4 gaji 7843 non-null float64
5 is_menikah 7917 non-null object
6 is_keturunan 7917 non-null object
7 berat 7861 non-null float64
8 tinggi 7843 non-null float64
9 sampo 7858 non-null object
10 is_merokok 7917 non-null object
11 pendidikan 7847 non-null object
12 stress 7853 non-null float64
I use fillna() to fill the null values in every feature:
# Categorical feature imputation
df['jenis_kelamin'].fillna(df['jenis_kelamin'].mode()[0], inplace = True)
df['pekerjaan'].fillna(df['pekerjaan'].mode()[0], inplace = True)
df['provinsi'].fillna(df['provinsi'].mode()[0], inplace = True)
df['sampo'].fillna(df['sampo'].mode()[0], inplace = True)
df['pendidikan'].fillna(df['pendidikan'].mode()[0], inplace = True)
# Numeric feature imputation
df['umur'].fillna(df['umur'].mean, inplace = True)
df['gaji'].fillna(df['gaji'].mean, inplace = True)
df['berat'].fillna(df['berat'].mean, inplace = True)
df['tinggi'].fillna(df['tinggi'].mean, inplace = True)
df['stress'].fillna(df['stress'].mean, inplace = True)
But after that, every feature's data type has changed to object:
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 umur 7917 non-null object
1 jenis_kelamin 7917 non-null object
2 pekerjaan 7917 non-null object
3 provinsi 7917 non-null object
4 gaji 7917 non-null object
5 is_menikah 7917 non-null object
6 is_keturunan 7917 non-null object
7 berat 7917 non-null object
8 tinggi 7917 non-null object
9 sampo 7917 non-null object
10 is_merokok 7917 non-null object
11 pendidikan 7917 non-null object
12 stress 7917 non-null object
I think converting every feature with astype() would work, but is there a more efficient way to fill the null values without changing the data type?
Solution 1:[1]
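I think you are missing the parentheses on .mean(), so fillna() fills the missing entries with the method object itself instead of the computed mean. A column that holds both floats and a method object can only be stored as object, which is why the numeric columns change dtype.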
You want, for example:
df['umur'].fillna(df['umur'].mean(), inplace = True)
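If you want to fill every column in one pass, here is a minimal sketch of one way to do it. The toy data and the names `fill_values` and `filled` are made up for illustration; only the two column names come from the question. It calls mean() (with parentheses) for numeric columns and takes the first mode for the rest, then passes the resulting dict to DataFrame.fillna(), so the numeric columns should keep their float64 dtype. Assigning the result to a new variable also avoids calling fillna() with inplace=True on a column selection, which newer pandas versions warn about.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data reusing two column names from the question.
df = pd.DataFrame({
    "umur": [20.0, np.nan, 40.0],
    "jenis_kelamin": ["L", None, "L"],
})

# Build one fill value per column: the mean for numeric columns
# (note the parentheses on mean()), the first mode for all others.
fill_values = {}
for col in df.columns:
    if pd.api.types.is_numeric_dtype(df[col]):
        fill_values[col] = df[col].mean()
    else:
        fill_values[col] = df[col].mode()[0]

# DataFrame.fillna() accepts a dict mapping column name -> fill value.
filled = df.fillna(fill_values)
print(filled.dtypes)
```

Because each fill value already matches the kind of data in its column, no astype() call should be needed afterwards.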
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
