'Python Pandas - Finding the number of events between two dates (1 data frame and 1 list)

I have a csv file with 2 datetimes (pre-start and pre-end) in each row, as well as a list of datetimes (install_list).

I am trying to iterate through the csv file and add a column that returns the total number of dates from the install_list that are between the pre-start time and the pre-end time in each row.

I am using the code below, but it is returning the total number of items in the list for each row in the csv.

example: File 1 = start time, end time List 1 = install time

Desired Result for Each Row = IF Install Time >= Start Time AND Install Time <= End Time, SUM(Installs)

Col1 (Start Time): 1/1/21 12:00:00 PM

Col2 (End Time): 1/1/21 12:10:00 PM

Install Time List = [1/1/21 12:05:00 PM, 1/1/21 12:11:00 PM]

Desired Result for Row1/Col3 = 1

Code Below:

import datetime
import pandas as pd
from collections import Counter

df_post_logs = pd.read_csv('logs_merged.csv',index_col=False)
df_installs = pd.read_csv('install_merge.csv',index_col=False)

'''Convert UTC to EST on Installs Add Column'''

df_installs['conversion date'] = pd.to_datetime(df_installs['conversion date'],infer_datetime_format='%Y-%m-%d')
df_installs['conversion time'] = pd.to_datetime(df_installs['conversion time'],infer_datetime_format='%H:%S:%M')

utc_datetime = df_installs['conversion time']
est_datetime = utc_datetime - datetime.timedelta(hours=5)


df_installs['utc datetime'] = utc_datetime
df_installs['est datetime'] = est_datetime

'''Add Column 10 Minutes Pre-Spot Time to Post Logs/10 Minutes Post Time to Spot'''

df_post_logs['Air Date'] = pd.to_datetime(df_post_logs['Air Date'],infer_datetime_format='%Y-%m-%d')
df_post_logs['Air Time'] = pd.to_datetime(df_post_logs['Air Time'],infer_datetime_format='%H:%S:%M')

timestamp = df_post_logs['Air Time']

df_post_logs['timestamp'] = timestamp
df_post_logs['pre spot time start'] = timestamp - datetime.timedelta(minutes=10, seconds=1)
df_post_logs['pre spot time end'] = timestamp - datetime.timedelta(seconds=1)
df_post_logs['post spot time'] = timestamp + datetime.timedelta(minutes=10)

'''SUM of Installs between pre-spot time'''

install_list = pd.to_datetime(df_installs['est datetime']).to_list()

for pre_spot_start in df_post_logs['pre spot time start']:
    pre_spot_start_time = pre_spot_start

for pre_spot_end in df_post_logs['pre spot time end']:
    pre_spot_end_time = pre_spot_end

for pre_spot_end in df_post_logs['pre spot time end']:
    pre_spot_end_time = pre_spot_end

pre_spot_install = 0

for row in df_post_logs:
    for date in install_list:
        if date >= pre_spot_start_time and date <= pre_spot_end_time:
            pre_spot_install = pre_spot_install+1

df_post_logs['Pre Spot Install'] = pre_spot_install

df_post_logs.to_csv('Test.csv')


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source