Dropping duplicate rows ignoring case (lowercase or uppercase)
I have a DataFrame with one column (Col). I'm trying to remove duplicate records regardless of letter case, for example:
import pandas as pd
df = pd.DataFrame({'Col': ['Appliance Identification', 'Natural Language', 'Social networks',
                           'natural language', 'Personal robot', 'Social Networks', 'Natural language']})
print(df)
Output:
Col
0 Appliance Identification
1 Natural Language
2 Social networks
3 natural language
4 Personal robot
5 Social Networks
6 Natural language
Expected Output:
Col
0 Appliance Identification
1 Social networks
2 Personal robot
3 Natural language
How can duplicates be dropped case-insensitively?
Solution 1:[1]
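Convert the values to lowercase, flag the repeats with Series.duplicated, and invert the mask with ~ for boolean indexing. Only the comparison is lowercased, so the first occurrence of each value is kept with its original capitalization: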
# Compare lowercased values; the kept rows retain their original strings
df = df[~df['Col'].str.lower().duplicated()]
print(df)
Col
0 Appliance Identification
1 Natural Language
2 Social networks
4 Personal robot
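The result above keeps the original index labels (0, 1, 2, 4) and the first spelling of each value. If a renumbered index or a different surviving spelling is wanted, the same idea can be wrapped in a small helper. This is only a sketch: the function name drop_duplicates_ignore_case and its parameters are made up for illustration, and it uses str.casefold() rather than str.lower(), which is an assumption that stricter Unicode-aware matching is acceptable.

```python
import pandas as pd


def drop_duplicates_ignore_case(df, column, keep="first"):
    """Hypothetical helper: drop rows whose `column` value repeats, ignoring case."""
    # Build a case-folded comparison key; the original strings are not modified.
    key = df[column].str.casefold()
    # duplicated() flags every repeat of a key except the one selected by `keep`.
    mask = ~key.duplicated(keep=keep)
    # reset_index(drop=True) renumbers the surviving rows starting from 0.
    return df[mask].reset_index(drop=True)


df = pd.DataFrame({'Col': ['Appliance Identification', 'Natural Language', 'Social networks',
                           'natural language', 'Personal robot', 'Social Networks', 'Natural language']})

first_kept = drop_duplicates_ignore_case(df, 'Col')              # keeps the earliest spelling
last_kept = drop_duplicates_ignore_case(df, 'Col', keep='last')  # keeps the latest spelling
print(first_kept)
print(last_kept)
```

With keep='first' the surviving spellings are 'Natural Language' and 'Social networks'; with keep='last' they are 'Social Networks' and 'Natural language'. The expected output in the question mixes the two, so a single keep setting will not reproduce it exactly.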
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jezrael |
