'Pandas delete all rows that are not a 'datetime' type
I've got a large file with login information for a list of users. The problem is that the file includes other information in the Date column. I would like to remove all rows that are not of type datetime in the Date column. My data resembles
df:
| Name | Date |
|---|---|
| name_1 | 2012-07-12 22:20:00 |
| name_1 | 2012-07-16 22:19:00 |
| name_1 | 2013-12-16 17:50:00 |
| name_1 | 4345 # type = 'int' |
| ... | # type = 'float' |
| name_2 | 2010-01-11 19:54:00 |
| name_2 | 2010-02-06 12:10:00 |
| ... | |
| name_2 | 2012-07-18 22:12:00 |
| name_2 | 4521 |
| ... | |
| name_5423 | 2013-11-23 10:21:00 |
| ... | |
| name_5423 | 7532 |
I've tried modifying the solution to
Finding non-numeric rows in dataframe in pandas?
Remove rows where column value type is string Pandas
and How-should-I-delete-rows-from-a-DataFrame-in-Python-Pandas
to fit my needs.
The problem is that whenever I attempt the change I either get an error or the entire dataframe gets deleted
Solution 1:[1]
Use pd.to_datetime with parameter errors='coerce' to make non-dates into NaT null values. Then you can drop those rows
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
df = df.dropna(subset=['Date'])
df
Solution 2:[2]
We could use the same method as @piRSquared to make non-datetime values to NaT values and use notna (i.e. select non-NaN values) to create a boolean Series and filter df:
df = df[pd.to_datetime(df['Date'], errors='coerce').notna()]
Note that unlike @piRSquared's method, it doesn't modify the dtype of "Date" column. Whether it's a desirable behavior or not is up to you.
Output:
Name Date
0 name_1 2012-07-12 22:20:00
1 name_1 2012-07-16 22:19:00
2 name_1 2013-12-16 17:50:00
5 name_2 2010-01-11 19:54:00
6 name_2 2010-02-06 12:10:00
8 name_2 2012-07-18 22:12:00
11 name_5423 2013-11-23 10:21:00
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | piRSquared |
| Solution 2 |

