Convert date + time strings to epoch milliseconds in dataframe column (when present)
I have a dataframe with a column called "snapshot_timestamp" where the time is in this format: 2022-05-01 23:45:47.428 (year, month, day, hour, minutes, seconds, milliseconds). Sometimes the value isn't populated, and it's currently an NA value.
I want to convert every value in that column to epoch milliseconds.
I've tried a bunch of different ways but so far I can't get it quite right.
The closest I got was:
df['snapshot_timestamp'] = pd.to_datetime(df['snapshot_timestamp']).values.astype(np.int64)
But that seems to do weird things (negative result) when there was no value for that row.
I also tried this, but as far as I could tell it didn't do anything at all?
df['snapshot_timestamp'].apply(lambda x: parser.parse(x).timestamp() * 1000 if not pd.isna(x) else pd.NA)
I'm still trying to wrap my head around what's possible with dataframes, so any help here would be greatly appreciated!
Solution 1:[1]
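Try the following. Rather than wrapping the conversion in a try/except, check explicitly for missing values (NaT) and convert only the real timestamps.
import pandas as pd
import numpy as np
def apply_ms(x):
    # Missing values (NaT/NaN) are returned as pd.NA instead of being converted
    if pd.isna(x):
        return pd.NA
    # Timestamp.value is nanoseconds since the epoch; integer-divide to get milliseconds
    return x.value // 10**6
df = pd.DataFrame(["2022-05-01 23:45:47.428"])
df[0] = pd.to_datetime(df[0])
df.loc[1] = np.nan
# The nullable Int64 dtype keeps the milliseconds as integers next to the missing value
print(df[0].apply(apply_ms).astype("Int64"))
Should return something like:
0    1651448747428
1             <NA>
Name: 0, dtype: Int64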
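As a vectorized alternative to the row-wise `apply` above, here is a minimal sketch that converts the whole column at once. It assumes the column is named `snapshot_timestamp` as in the question and that the naive timestamps should be treated as UTC; the helper name `to_epoch_ms` is hypothetical. Missing values end up as `<NA>` in a nullable `Int64` column, rather than the large negative number that a raw `int64` cast produces for `NaT`.

```python
import pandas as pd


def to_epoch_ms(column: pd.Series) -> pd.Series:
    """Convert date-time strings to epoch milliseconds, keeping missing values as <NA>."""
    # Missing or unparseable entries become NaT instead of raising an error.
    timestamps = pd.to_datetime(column, errors="coerce")
    # Whole milliseconds elapsed since the Unix epoch; NaT rows become NaN at this step.
    elapsed = (timestamps - pd.Timestamp("1970-01-01")) // pd.Timedelta(milliseconds=1)
    # The nullable integer dtype keeps the values as integers and stores NaN as <NA>.
    return elapsed.astype("Int64")


# Sample data mirroring the question: one populated row and one missing row.
df = pd.DataFrame({"snapshot_timestamp": ["2022-05-01 23:45:47.428", None]})
df["snapshot_timestamp"] = to_epoch_ms(df["snapshot_timestamp"])
print(df)
```

With the sample row from the question, the first value should come out as 1651448747428 and the empty row as `<NA>`. If the timestamps are actually in a local time zone, they would need to be localized and converted to UTC before the subtraction.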
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
