python pandas: filter out rows with multiple conditions
I need to fix a CSV file. When I read it with pandas it shows only one column, but the data actually contains multiple columns.
So I split the column:
df = df['test_column'].str.split(' ', expand=True)
and got 168 columns.
I renamed the columns:
df = df.set_axis(range(1, 169), axis=1)  # inplace= for set_axis was removed in pandas 2.0
I am currently checking each column to see whether it is completely empty or contains values, with:
a = 122  # just a column name
df = df[df[a].notnull()]
print(df[a].to_string())
The problem is that even when a specific cell is empty, its row is still shown. I assume those cells contain just spaces (" ").
So how can I combine multiple conditions?
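A quick way to see why the `notnull()` filter lets these rows through (toy data; the column name 122 is chosen to match the question, the values are made up): a whitespace-only string is a perfectly valid string, not a missing value, so pandas keeps it.

```python
import pandas as pd

# Hypothetical column: one real value, one whitespace-only cell, one true missing
df = pd.DataFrame({122: ["x", " ", None]})

kept = df[df[122].notnull()]
# The " " row survives the filter: only the None row is dropped.
print(kept)
```

This is exactly the symptom in the question: `notnull()` filters out `NaN`/`None`, but not strings that merely look empty.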
Solution 1:[1]
IIUC, first replace all empty or whitespace-only strings with missing values:
import numpy as np

# removed the ' ' separator; by default str.split splits on arbitrary whitespace
df = df['test_column'].str.split(expand=True)
# start column labels at 1
df.columns += 1
# turn empty or whitespace-only cells into NaN
df = df.replace(r'^\s*$', np.nan, regex=True)
a = 122  # just a column name
df = df[df[a].notna()]  # keep rows that have a real value in column 122
print(df)
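Put together as a runnable sketch on made-up data (the column contents and the inspected column number here are hypothetical, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical single-column frame standing in for the asker's CSV
df = pd.DataFrame({"test_column": ["a b c", "d  e", "f"]})

# Split on arbitrary whitespace into separate columns
df = df["test_column"].str.split(expand=True)

# Number the columns from 1 instead of 0
df.columns += 1

# Blank or whitespace-only cells become NaN so null checks work
df = df.replace(r"^\s*$", np.nan, regex=True)

a = 3  # pick a column to inspect
non_empty = df[df[a].notna()]  # only rows with a real value in column 3
print(non_empty)
```

Here only the first row has a third token, so `non_empty` contains a single row; the cells that `str.split` left as `None` are treated as missing without any extra work.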
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Stack Overflow |
