'Is there a way to get Pandas DataFrame values based on timestamp from another column in Python?

I have rain data and sensor data that is collected on 15min intervals. What I want to do is only collect sensor data 72 hours after the last rain drop has fallen. If rain is observed between that time, the counter resets until 72 hours dry time is observed.

I converted the data to timestamp data but can't figure out the logic for the above. Links to example data as well as example tables below.

Timestamp Precipitation(mm)
2021-04-01 00:15 6
2021-04-01 00:30 0
Timestamp Sensor Depth (mm)
2021-04-01 00:15 12
2021-04-01 00:30 4

example rain data example sensor data

import pandas as pd
import matplotlib.pyplot as plt
import os
from datetime import datetime, date, time

file = pd.read_csv('example_sensor.csv')
rain = pd.read_csv('example_rain.csv')

east1_df = pd.DataFrame(file)


east1_df['Timestamp'] = pd.to_datetime(east1_df['Timestamp'], format='%Y-%m-%d %H:%M')
east1_df.index=east1__df['Timestamp']

rain['Timestamp'] = pd.to_datetime(rain['Timestamp'], format='%Y-%m-%d %H:%M')
rain.index=rain['Timestamp']#pd.DatetimeIndex([east1_spring_df['Timestamp']], dtype='datetime64[ns]', freq=None)


Solution 1:[1]

I am not aware of a pandas functionality to achieve this. However, there is a way to do this with numpy. You would just need to extract the data from the dataframe.

Using a boxcar function one can filter for events which span a certain period by convolving it with the rainfall data.

Here's a minimal example on how to achieve this using numpy:

import numpy as np
from datetime import datetime, timedelta

def datetime_range(start, end, delta):
    result = []
    current = start
    while current < end:
        result.append(current)
        current += delta
    return result

def create_boxcar(dry_hours, delta_minutes):
    n_dry = dry_hours * 60 // delta_minutes
    return np.ones(n_dry) / n_dry

def create_data(delta_minutes):
    stamps = np.array(datetime_range(datetime(2022, 2, 23), datetime(2022, 2, 28), timedelta(minutes=delta_minutes)))
    rainfall = np.random.randn(len(stamps))-1  # shifted normal distribution
    rainfall[rainfall < 0] = 0  # coerce negative values to zero
    sensor = np.arange(len(stamps))  # just a ramp
    return stamps, rainfall, sensor

delta_minutes = 15
stamps, rainfall, sensor = create_data(delta_minutes)

# get dry regions
no_rainfall = (rainfall == 0).astype(int)
# create boxcar filter with desired length
dry_hours_before_read = 3
box_filter = create_boxcar(dry_hours_before_read, delta_minutes)
# get regions with desired dry period:
# Convolve boxcar and data, look for a result of 1,
# i.e full overlap of boxcar and no_rainfall
readout_region = np.convolve(no_rainfall, box_filter, 'same') == 1

# get timestamps and values during dry period
timestamp_dry_enough = stamps[readout_region]
sensor_dry_enough = sensor[readout_region]

After that manipulation, you could feed that information back to the dataframe for further pandas-based filtering:

east1_df[f'no rain for {dry_hours_before_read} hours'] = readout_region

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1