Replace string values in pandas rows with numbers with the help of a for loop
I have 10 unique values in one column of my dataframe. For example, consider the following:
df['categories'].unique()
Output:
Electronic
Computers
Mobile Phone
Router
Food
I want to replace 'Electronic' with 1, 'Computers' with 2, 'Mobile Phone' with 3, 'Router' with 4 and 'Food' with 5. The expected output must be
df['categories'].unique()
Expected output:
1
2
3
4
5
I tried looping over df['categories'].unique(), but I'm unable to get it to work. Can anyone help me with this?
Solution 1:[1]
You could try this:
df['categories'] = df['categories'].astype('category').cat.codes
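Note that `cat.codes` starts at 0 and numbers the categories in sorted order, so it may not reproduce the exact 1 to 5 numbering asked for in the question. The following is a minimal sketch of two alternatives, an explicit mapping and `pd.factorize`, which both number values by order of first appearance. The sample dataframe is made up for illustration, since the original data is not shown.

```python
import pandas as pd

# Hypothetical sample data; the original dataframe is not shown in the question.
df = pd.DataFrame({
    "categories": ["Electronic", "Computers", "Mobile Phone",
                   "Router", "Food", "Computers", "Food"]
})

# Option A: build an explicit mapping from the order of first appearance,
# starting at 1, then apply it with Series.map.
mapping = {name: code
           for code, name in enumerate(df["categories"].unique(), start=1)}
mapped = df["categories"].map(mapping)

# Option B: pd.factorize also numbers values by first appearance, but from 0,
# so add 1 to shift the codes.
codes, uniques = pd.factorize(df["categories"])
factorized = pd.Series(codes + 1, index=df.index)

# For comparison: cat.codes follows the sorted category order, starting at 0.
cat_codes = df["categories"].astype("category").cat.codes
```

With the explicit mapping you can also hand-write the dictionary (for example `{'Electronic': 1, 'Computers': 2, ...}`) if the numbering must not depend on row order.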
Solution 2:[2]
scikit-learn provides similar functionality.
This approach is a good fit when you are building a predictive model and the specific codes do not matter.
For example, it makes no difference to you whether the "Computers" category gets a code of 1, 2 or 5.
from sklearn.preprocessing import OrdinalEncoder
enc = OrdinalEncoder()
df['categories'] = enc.fit_transform(X=df[['categories']]).astype('int')
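If the specific codes do matter, `OrdinalEncoder` accepts a `categories` argument that fixes the order. The sketch below assumes a made-up sample dataframe and a hypothetical `category_code` column name; it also shows how the fitted encoder can translate codes back to labels.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical sample data; the original dataframe is not shown in the question.
df = pd.DataFrame({
    "categories": ["Electronic", "Computers", "Mobile Phone",
                   "Router", "Food", "Computers"]
})

# By default the encoder sorts the categories; passing `categories`
# pins the order so that position 0 is 'Electronic', 1 is 'Computers', etc.
order = ["Electronic", "Computers", "Mobile Phone", "Router", "Food"]
enc = OrdinalEncoder(categories=[order])

# fit_transform returns a 2-D float array; flatten it and cast to int.
encoded = enc.fit_transform(df[["categories"]]).astype("int").ravel()

# Shift by 1 so that the codes start at 1 as in the question.
df["category_code"] = encoded + 1

# The fitted encoder can map the 0-based codes back to the original labels.
decoded = enc.inverse_transform(encoded.reshape(-1, 1)).ravel()
```

Keeping the encoder object around is the main advantage over a plain mapping when the same encoding has to be applied to new data later.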
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | SergFSM |
| Solution 2 |
