Pandas: filter a DataFrame by the type of data in a column
I have a DataFrame. Here is part of it:
member_id event_duration domain category
0 299819 17 element.yandex.ru None
1 299819 0 mozilla.org Программы
2 299819 4 vbmail.ru None
3 299819 aaa vbmail.ru None
How can I filter df by type, i.e. keep only the rows where event_duration is numeric?
Usually I filter with str.contains. Would it be fine to specify something like
df[df.event_duration.astype(int) == True]
?
Solution 1:[1]
If all the other row values are valid, i.e. none of them are NaN, you can convert the column to numeric using to_numeric with errors='coerce'. This turns strings that cannot be parsed into NaN, and you can then filter those rows out using notnull:
In [47]:
df[pd.to_numeric(df['event_duration'], errors='coerce').notnull()]
Out[47]:
member_id event_duration domain category
0 299819 17 element.yandex.ru None
1 299819 0 mozilla.org Программы
2 299819 4 vbmail.ru None
This:
df[df.event_duration.astype(int) == True]
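won't work, because astype(int) raises a ValueError when it reaches a string that cannot be converted, such as 'aaa'. Even if every value converted cleanly, == True would only match rows where the value is 1.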
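A minimal sketch of the two behaviours described above, assuming event_duration was read in as strings. The sample frame is rebuilt by hand from the question, and the names numeric, valid and astype_failed are only illustrative.

```python
import pandas as pd

# Sample frame mirroring the question; event_duration holds strings,
# one of which ("aaa") is not a number.
df = pd.DataFrame({
    "member_id": [299819, 299819, 299819, 299819],
    "event_duration": ["17", "0", "4", "aaa"],
    "domain": ["element.yandex.ru", "mozilla.org", "vbmail.ru", "vbmail.ru"],
})

# Unparseable values become NaN; notnull() then marks the rows that parsed.
numeric = pd.to_numeric(df["event_duration"], errors="coerce")
valid = df[numeric.notnull()]

# astype(int) raises on the bad row instead of skipping it.
try:
    df["event_duration"].astype(int)
    astype_failed = False
except ValueError:
    astype_failed = True
```

Note that the filtered frame still holds the original string values; assign the to_numeric result back to the column if numeric values are needed afterwards.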
Solution 2:[2]
Solution 3:[3]
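You can use a regex as well. The pattern ^\d+$ matches values that consist only of digits; this assumes every value in the column is a string: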
df[df["event_duration"].str.contains(r"^\d+$")]
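If the column holds a mix of real integers and strings, .str.contains returns NaN for the non-string elements, and the resulting mask cannot be used for boolean indexing. One possible workaround, sketched here on a hypothetical mixed column, is to cast to str before matching.

```python
import pandas as pd

# Hypothetical mixed column: real ints alongside strings.
s = pd.Series([17, "0", 4, "aaa"], dtype=object)

# Cast every element to str first so the regex sees all of them.
mask = s.astype(str).str.contains(r"^\d+$")
digits_only = s[mask]
```

The pattern only accepts unsigned integers, so values such as "-3" or "1.5" would be filtered out as well.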
Solution 4:[4]
Best solution:
df["event_duration"].transform(lambda x: x.fillna('') if x.dtype == 'float64' else x.fillna(0))
df["event_duration"].transform(lambda x: x.replace('orange','5') if x.dtype == 'object' else x.fillna(0))
You can find the set of distinct string values in a column that should contain integers:
s = {x for x in df["event_duration"] if isinstance(x, str)}
s
For example, the output might be:
apple
mango
Then you can filter those values out like this:
df[df["event_duration"] != 'apple']
# or
df[~df["event_duration"].isin(s)]  # drop the ~ to keep only the string rows
Or coerce the invalid values to NaN with something like this:
df["event_duration"] = pd.to_numeric(df["event_duration"], errors='coerce')
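The steps above can be put together roughly as follows. This is a sketch on made-up data (the 'apple' and 'mango' values follow the example output above), and the names bad_values, clean and coerced are illustrative only.

```python
import pandas as pd

# Hypothetical data: an integer column polluted with two string values.
df = pd.DataFrame({"event_duration": [17, "apple", 4, "mango", 0]})

# Collect the distinct string values found in the column.
bad_values = {x for x in df["event_duration"] if isinstance(x, str)}

# Keep only the rows whose value is not one of those strings.
clean = df[~df["event_duration"].isin(bad_values)]

# Alternative: coerce instead of filtering; the strings become NaN
# and the resulting Series has a float dtype.
coerced = pd.to_numeric(df["event_duration"], errors="coerce")
```

Filtering keeps the column's original object dtype, while coercing gives a numeric column with gaps, so which one fits depends on whether the bad rows should disappear or just be marked.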
Solution 5:[5]
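Some of the above answers seem overly complex. When a column holds mixed data types, a per-element isinstance check is usually enough. The following keeps the rows whose value is a str; prefix the mask with ~ to keep the non-string rows instead: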
df[df['event_duration'].apply(lambda x: isinstance(x, str))]
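A short sketch of the mask and its negation, assuming the column really contains Python objects of different types (here ints plus one string). If every value was loaded as a string, this check cannot tell them apart and a to_numeric based approach is likely a better fit.

```python
import pandas as pd

# Hypothetical mixed column: three ints and one string.
df = pd.DataFrame({"event_duration": [17, 0, 4, "aaa"]})

# True where the element is a str, False otherwise.
is_str = df["event_duration"].apply(lambda x: isinstance(x, str))

strings_only = df[is_str]    # rows holding strings
non_strings = df[~is_str]    # rows holding anything else
```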
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | EdChum |
Solution 2 | Vaasha |
Solution 3 | vks |
Solution 4 | |
Solution 5 | DavidWalker |