'Compute new pandas column for the number of time a date intersects a list of date ranges

I have actually solved the problem, but I am looking for advice for a more elegant / pandas-orientated solution.

I have a pandas dataframe of linkedin followers with a date field. The data looks like this:

    Date    Sponsored followers Organic followers   Total followers
0   2021-05-30  0   105 105
1   2021-05-31  0   128 128
2   2021-06-01  0   157 157
3   2021-06-02  0   171 171
4   2021-06-03  0   133 133

I have a second dataframe that contains the start and end dates for paid social campaigns. What I have done is create a list of tuples from this dataframe, where the first element in the tuple is the start date, and the second is the end date, i converted these dates to datetimes as such:

[(datetime.date(2021, 7, 8), datetime.date(2021, 7, 9)),
 (datetime.date(2021, 7, 12), datetime.date(2021, 7, 13)),
 (datetime.date(2021, 7, 13), datetime.date(2021, 7, 14)),
 (datetime.date(2021, 7, 14), datetime.date(2021, 7, 15)),
 (datetime.date(2021, 7, 16), datetime.date(2021, 7, 18)),
 (datetime.date(2021, 7, 19), datetime.date(2021, 7, 21)),
 (datetime.date(2021, 7, 30), datetime.date(2021, 8, 2)),
 (datetime.date(2021, 7, 30), datetime.date(2021, 8, 2)),
 (datetime.date(2021, 7, 30), datetime.date(2021, 8, 2)),
 (datetime.date(2021, 8, 9), datetime.date(2021, 8, 12)),
 (datetime.date(2021, 8, 12), datetime.date(2021, 8, 15)),
 (datetime.date(2021, 9, 3), datetime.date(2021, 9, 7)),
 (datetime.date(2021, 10, 22), datetime.date(2021, 11, 21)),
 (datetime.date(2021, 10, 29), datetime.date(2021, 11, 10)),
 (datetime.date(2021, 10, 29), datetime.date(2021, 11, 2)),
 (datetime.date(2021, 11, 3), datetime.date(2021, 11, 4)),
 (datetime.date(2021, 11, 5), datetime.date(2021, 11, 8)),
 (datetime.date(2021, 11, 9), datetime.date(2021, 11, 12)),
 (datetime.date(2021, 11, 12), datetime.date(2021, 11, 16)),
 (datetime.date(2021, 11, 11), datetime.date(2021, 11, 12)),
 (datetime.date(2021, 11, 25), datetime.date(2021, 11, 27)),
 (datetime.date(2021, 11, 26), datetime.date(2021, 11, 28)),
 (datetime.date(2021, 12, 8), datetime.date(2021, 12, 11))]

In order to create a new column in my main dataframe (which is a count of how many campaigns falls on any given day), I loop through each row in my dataframe, and then each element in my list using the following code:

is_campaign = []

for date in df['Date']:
    count = 0
    for date_range in campaign_dates:
        if date_range[0] <= date <= date_range[1]:
            count += 1
    is_campaign.append(count)

df['campaign'] = is_campaign

Which gives the following result:

df[df['campaign']!=0]

Date    Sponsored followers Organic followers   Total followers campaign
39  2021-07-08  0   160 160 1
40  2021-07-09  17  166 183 1
43  2021-07-12  0   124 124 1
44  2021-07-13  16  138 154 2
45  2021-07-14  22  158 180 2
... ... ... ... ... ...
182 2021-11-28  31  202 233 1
192 2021-12-08  28  357 385 1
193 2021-12-09  29  299 328 1
194 2021-12-10  23  253 276 1
195 2021-12-11  25  163 188 1

Any advice on how this could be done in a more efficient way, and specifically using pandas functionality would be appreciated.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source