Comparing current value to all future values in pandas

I have a dataframe similar to this, with roughly 10,000 to 100,000 rows:

import pandas as pd

data = [['2000-01-01', 10], ['2000-01-02', 15], ['2000-01-03', 14],
        ['2000-01-04', 13], ['2000-01-05', 17], ['2000-01-06', 16],
        ['2000-01-09', 19], ['2000-01-10', 20], ['2000-01-11', 18]]

df = pd.DataFrame(data, columns=['Date', 'Value'])

This creates the following data:

Date Value
2000-01-01 10
2000-01-02 15
2000-01-03 14
2000-01-04 13
2000-01-05 17
2000-01-06 16
2000-01-09 19
2000-01-10 20
2000-01-11 18

I want to compare each value to all the values ahead of it, and find the last instance where the value is equal to or lower than the current value. Output should look like this:

Date        Value  Latest Date Equal or Below Value
2000-01-01  10     2000-01-01
2000-01-02  15     2000-01-04
2000-01-03  14     2000-01-04
2000-01-04  13     2000-01-04
2000-01-05  17     2000-01-06
2000-01-06  16     2000-01-06
2000-01-09  19     2000-01-11
2000-01-10  20     2000-01-11
2000-01-11  18     2000-01-11

Any help appreciated.



Solution 1:[1]

One way using pandas.Series.expanding with idxmin:

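# Index by the dates as integer timestamps (astype("int64") replaces the
# deprecated .view(int)), then reverse so an expanding window covers later rows
s = pd.Series(df["Value"].values,
              index=pd.to_datetime(df["Date"]).astype("int64")).iloc[::-1]
# idxmin returns the integer index label (the date) of the window's minimum
s = s.expanding().apply(lambda x: (x - x.iloc[0]).idxmin())
df["Latest Date"] = pd.to_datetime(s).values[::-1]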

Output:

         Date  Value Latest Date
0  2000-01-01     10  2000-01-01
1  2000-01-02     15  2000-01-04
2  2000-01-03     14  2000-01-04
3  2000-01-04     13  2000-01-04
4  2000-01-05     17  2000-01-06
5  2000-01-06     16  2000-01-06
6  2000-01-09     19  2000-01-11
7  2000-01-10     20  2000-01-11
8  2000-01-11     18  2000-01-11

Explanation:

Comparing each element with all the elements after it is the same as applying an expanding window in reverse order. That's why I reversed the series with s.iloc[::-1].

Furthermore, pandas.Series.expanding can only handle apply functions that return a numeric result, so the dates are converted to integers and used as the index. idxmin then returns that integer label, which is converted back to a datetime at the end.
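
One caveat worth noting: subtracting the constant x.iloc[0] does not move the minimum, so the expanding/idxmin code above returns the date of the smallest remaining value. That matches the expected output for the sample data, but it can differ from the literal "last value equal to or below the current one" on other inputs (for example, with values 10, 5, 8 the first row's literal answer is the row holding 8, not 5). Below is a minimal sketch of the literal definition using a reversed cumulative minimum and a binary search. It is an alternative technique, not the answer's method; the function name last_equal_or_below is made up for illustration, and the sketch assumes the values contain no NaN.

```python
import numpy as np
import pandas as pd

def last_equal_or_below(values):
    """For each position i, return the largest j >= i with values[j] <= values[i]."""
    v = np.asarray(values)
    # suffix_min[j] = min(v[j:]); this array is non-decreasing in j.
    suffix_min = np.minimum.accumulate(v[::-1])[::-1]
    # The last j with suffix_min[j] <= v[i] is also the last j with v[j] <= v[i],
    # and a sorted array lets searchsorted find it by binary search.
    return np.searchsorted(suffix_min, v, side="right") - 1

df = pd.DataFrame({
    "Date": pd.to_datetime(["2000-01-01", "2000-01-02", "2000-01-03",
                            "2000-01-04", "2000-01-05", "2000-01-06",
                            "2000-01-09", "2000-01-10", "2000-01-11"]),
    "Value": [10, 15, 14, 13, 17, 16, 19, 20, 18],
})

pos = last_equal_or_below(df["Value"])          # integer row positions
df["Latest Date"] = df["Date"].to_numpy()[pos]  # look up the matching dates
print(df)
```

This runs in O(n log n) and avoids a Python-level lambda per row, which may matter at 100,000 rows.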

Solution 2:[2]

I use this collection of datetime snippets, and it works well:

import time
from datetime import datetime, timedelta, timezone

# Example inputs; substitute your own values
timestamp = 1643328000.0              # seconds since the epoch
datestring = "2022-01-28 00:00:00"

# Build naive datetimes, so no timezone info is attached
date_from_ts = datetime.fromtimestamp(timestamp)
date_from_iso = datetime.fromisoformat(datestring)

# Make a fixed-offset local timezone from time.timezone (ignores DST)
local_tz = timezone(timedelta(seconds=-time.timezone))

# Attach different timezones as you wish (naive input is read as local time)
utc_time = datetime.fromisoformat(datestring).astimezone(timezone.utc)
local_time = datetime.fromisoformat(datestring).astimezone(local_tz)
print(utc_time.isoformat(timespec='seconds'))
print(local_time.isoformat(timespec='seconds'))

# Shift a time by adding some minutes or hours
base_time = datetime.fromtimestamp(timestamp).astimezone(local_tz)
new_time = base_time + timedelta(minutes=30)

Check whether one time is later or earlier than the other:

print(base_time > new_time)   # False
print(base_time < new_time)   # True

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Chris
Solution 2