Date and time conversion using Pandas & Python to create Ts

I'm trying to compute Ts from my existing date and time data, which looks like this:

(Pdb) df[0][:7] 
0    [Data & Time]
1     Jan 01 08:00:01.193
2     Jan 01 08:00:01.275
3     Jan 01 08:00:01.294
4     Jan 01 08:00:01.335
5     Jan 01 08:00:01.357
6     Jan 01 08:00:01.392
Name: 0, dtype: object

Using the data above, I'm planning to insert another column, 'Ts', with values like df[0][i] - df[0][i-1].

So the plan is:

  1. Split the data into two columns: Jan 01 08:00:01.193 -> 01/01 and 08:00:01.193
  2. Insert a new column 'Ts', which makes 3 columns in total.

I'm looking into regex to split the column, but I'm wondering whether it's possible to split all the rows without using a for loop. I've learned a little about vectorization, but I can't see how to combine it with regex.



Solution 1:[1]

Given a properly formatted DataFrame:

             timestamp
0  Jan 01 08:00:01.193
1  Jan 01 08:00:01.275
2  Jan 01 08:00:01.294
3  Jan 01 08:00:01.335
4  Jan 01 08:00:01.357
5  Jan 01 08:00:01.392
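
If the data still looks like the question's df[0], with the header text sitting in the first row, something like the sketch below might get it into this shape. The `raw` frame and the column name `timestamp` are assumptions chosen to match the example; the real data may need different cleanup.

```python
import pandas as pd

# Hypothetical raw input mirroring the question: a single unnamed column (0)
# whose first row holds the header text instead of a timestamp.
raw = pd.DataFrame({0: [
    "[Data & Time]",
    "Jan 01 08:00:01.193",
    "Jan 01 08:00:01.275",
    "Jan 01 08:00:01.294",
]})

# Drop the header row, rename the column, and renumber the index from 0.
df = (
    raw.iloc[1:]
       .rename(columns={0: "timestamp"})
       .reset_index(drop=True)
)
print(df)
```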

We can first convert it to datetime:

df.timestamp = pd.to_datetime(df.timestamp, format='%b %d %H:%M:%S.%f')

Then make the two new columns:

df['day/month'] = df.timestamp.dt.strftime('%d/%m')
df['time'] = df.timestamp.dt.time

Output:

                timestamp day/month             time
0 1900-01-01 08:00:01.193     01/01  08:00:01.193000
1 1900-01-01 08:00:01.275     01/01  08:00:01.275000
2 1900-01-01 08:00:01.294     01/01  08:00:01.294000
3 1900-01-01 08:00:01.335     01/01  08:00:01.335000
4 1900-01-01 08:00:01.357     01/01  08:00:01.357000
5 1900-01-01 08:00:01.392     01/01  08:00:01.392000
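
The 'Ts' column from the question (each row minus the previous one) can probably be built without a loop as well. A minimal sketch, assuming the timestamp column has already been converted as above; the column name `Ts_seconds` is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"timestamp": [
    "Jan 01 08:00:01.193",
    "Jan 01 08:00:01.275",
    "Jan 01 08:00:01.294",
]})
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%b %d %H:%M:%S.%f")

# Vectorized equivalent of df[i] - df[i-1]; the first row has no
# predecessor, so it becomes NaT.
df["Ts"] = df["timestamp"].diff()

# The same interval as a float number of seconds (NaN in the first row).
df["Ts_seconds"] = df["Ts"].dt.total_seconds()

print(df[["timestamp", "Ts", "Ts_seconds"]])
```

Series.diff() returns Timedelta values here, so the .dt accessor is available on the result if a plain number is easier to work with.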

Edit:

By the way, these aren't regex patterns. They're strftime() and strptime() format codes.
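
To see the same codes outside pandas, here is a small sketch using only the standard library's datetime module. Note that %b matches the abbreviated month name in the current locale, so "Jan" assumes an English or C locale.

```python
from datetime import datetime

# %b = abbreviated month name, %d = day of month, %H:%M:%S = 24-hour time,
# %f = fractional seconds (1 to 6 digits, stored as microseconds).
fmt = "%b %d %H:%M:%S.%f"

# strptime parses a string into a datetime; with no year in the format,
# the year defaults to 1900.
parsed = datetime.strptime("Jan 01 08:00:01.193", fmt)

# strftime goes the other way and formats a datetime as a string.
day_month = parsed.strftime("%d/%m")

print(parsed, day_month)
```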

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1