Convert date + time strings to epoch milliseconds in dataframe column (when present)

I have a dataframe with a column called "snapshot_timestamp" where the time is in this format: 2022-05-01 23:45:47.428 (year, month, day, hour, minutes, seconds, milliseconds). Sometimes the value isn't populated and is NA instead.

I want to convert the entire column to epoch milliseconds, leaving the missing rows alone.

I've tried a bunch of different ways but so far I can't get it quite right.

The closest I got was:

    df['snapshot_timestamp'] = pd.to_datetime(df['snapshot_timestamp']).values.astype(np.int64)

But that gives a strange negative result for any row that had no value.

I also tried this, but as far as I could tell it didn't do anything at all?

    df['snapshot_timestamp'].apply(lambda x: parser.parse(x).timestamp() * 1000 if not pd.isna(x) else pd.NA)

I'm still trying to wrap my head around what's possible with dataframes, so any help here would be greatly appreciated!



Solution 1:[1]

Try the below. Rather than wrapping the conversion in a try/except, it checks explicitly for missing values (NaT) and leaves them alone.

    import pandas as pd
    import numpy as np

    def apply_ms(x):
        # Leave missing values (NaT/NaN) untouched
        if pd.isna(x):
            return x
        # Timestamp.value is nanoseconds since the epoch; convert to milliseconds
        return x.value // 10**6


    df = pd.DataFrame(["2022-05-01 23:45:47.428"])
    df[0] = pd.to_datetime(df[0])
    df.loc[1] = np.nan

    print(df[0].apply(apply_ms))

This should give you 1651448747428 for the first row and leave the empty row as NaT.
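If you would rather avoid a row-by-row apply, a vectorized version might look like the sketch below. It assumes the strings are naive timestamps that should be treated as UTC, and the two-row dataframe is made-up sample data rather than the real one. The negative number in the question is most likely the int64 value pandas stores internally for NaT, so the idea here is to keep missing rows as <NA> in a nullable integer column instead of casting them to a plain int64.

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataframe
df = pd.DataFrame({"snapshot_timestamp": ["2022-05-01 23:45:47.428", None]})

# Parse the strings; missing entries become NaT
ts = pd.to_datetime(df["snapshot_timestamp"])

# Whole milliseconds since the Unix epoch; NaT rows come out as NaN
ms = (ts - pd.Timestamp("1970-01-01")) // pd.Timedelta(milliseconds=1)

# The nullable Int64 dtype keeps missing rows as <NA> instead of a sentinel number
df["snapshot_timestamp"] = ms.astype("Int64")

print(df["snapshot_timestamp"].tolist())
```

Note that the result is assigned back to the column. Calling apply (or any other conversion) without assigning the result leaves the dataframe unchanged, which is probably why the second attempt in the question appeared to do nothing.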

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1