Problems lowercasing each word in a pandas dataframe column with lists of strings [duplicate]
As the title says, I'm trying to lowercase each element in a list of strings on a dataframe column.
Example of what I have:
df
A
0 [Verapamil hydrochloride]
1 [Simvastatin]
2 [Sulfamethoxazole, Trimethoprim]
Example of what I want to have:
df
A
0 [verapamil hydrochloride]
1 [simvastatin]
2 [sulfamethoxazole, trimethoprim]
I tried using:
df['A'].apply(lambda x: [w.lower() for w in x])
but it outputs:
TypeError: 'float' object is not iterable
When I check individual values, I don't find any floats:
type(df['A'][0])
#Out: list
type(df['A'][0][0])
#Out: str
I'm doing this because I want to compare the lists later using set(): the other lists may hold the same strings in lowercase, and their elements may also be in a different order.
I don't really know what to do, because I can't find the reason for that error. Is there an alternative?
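The comparison described in the question could be sketched roughly as follows. The helper name same_items and the sample lists are made up for illustration, and note that a set also drops duplicate entries, which may or may not be acceptable here:

```python
def same_items(left, right):
    """Return True if both lists hold the same strings, ignoring case and order."""
    return {s.lower() for s in left} == {s.lower() for s in right}


# Same drugs, different case and order
print(same_items(["Sulfamethoxazole", "Trimethoprim"],
                 ["trimethoprim", "sulfamethoxazole"]))

# Different drugs
print(same_items(["Simvastatin"], ["verapamil hydrochloride"]))
```

With a helper like this, the lowercasing happens at comparison time, so the column itself would not need to be modified first.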
Solution 1:[1]
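import pandas as pd
# Load the data from the CSV file
df = pd.read_csv('DCI.csv')
# Cast every cell to str so that non-string values (such as NaN floats) no longer break lower()
df['ActiveSubstances'] = df['ActiveSubstances'].astype(str)
# Lowercase the string in each cell
df['ActiveSubstances'] = df['ActiveSubstances'].str.lower()
print(df)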
Output
ActiveSubstances
0 ['verapamil hydrochloride']
1 ['verapamil hydrochloride']
2 ['verapamil hydrochloride']
3 ['simvastatin']
4 ['simvastatin']
... ...
192520 ['doxepin hydrochloride']
192521 ['doxepin hydrochloride']
192522 ['ethosuximide']
192523 ['fludrocortisone acetate']
192524 ['sulfamethoxazole', 'trimethoprim']
[192525 rows x 1 columns]
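Converting the column to str and then applying lower() solves it. The error most likely comes from missing values: pandas stores them as NaN, which is a float, so the list comprehension fails on those rows even though the first rows are lists. Note that after astype(str) each cell is a single string such as "['simvastatin']" rather than a list, and missing values become the string 'nan'.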
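If the cells need to stay as lists (for example, for the set() comparison mentioned in the question), one option is to skip the non-list values instead of casting everything to str. This is a minimal sketch, assuming the column holds real Python lists and that the float in the traceback is a NaN from a missing row; the sample data and the helper name lower_list are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the NaN stands in for a missing row
df = pd.DataFrame({
    "A": [
        ["Verapamil hydrochloride"],
        ["Simvastatin"],
        np.nan,
        ["Sulfamethoxazole", "Trimethoprim"],
    ]
})

# Count the Python types in the column to spot the rows that are not lists
print(df["A"].map(type).value_counts())


def lower_list(value):
    """Lowercase every string in a list; return any other value unchanged."""
    if isinstance(value, list):
        return [w.lower() for w in value]
    return value


df["A"] = df["A"].apply(lower_list)
print(df)
```

Checking the types first shows whether floats are really present further down the column. The isinstance guard leaves missing rows as NaN, so they can be dropped or filled separately.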
Solution 2:[2]
You can call the string method lower() on each string (Python strings have no lowercase() method):
variable.lower()
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | user2736738 |
| Solution 2 | MEHUL VERMA |
