'Is there a way to get Pandas DataFrame values based on timestamp from another column in Python?
I have rain data and sensor data that is collected on 15min intervals. What I want to do is only collect sensor data 72 hours after the last rain drop has fallen. If rain is observed between that time, the counter resets until 72 hours dry time is observed.
I converted the data to timestamp data but can't figure out the logic for the above. Links to example data as well as example tables below.
| Timestamp | Precipitation(mm) |
|---|---|
| 2021-04-01 00:15 | 6 |
| 2021-04-01 00:30 | 0 |
| Timestamp | Sensor Depth (mm) |
|---|---|
| 2021-04-01 00:15 | 12 |
| 2021-04-01 00:30 | 4 |
example rain data example sensor data
import pandas as pd
import matplotlib.pyplot as plt
import os
from datetime import datetime, date, time
file = pd.read_csv('example_sensor.csv')
rain = pd.read_csv('example_rain.csv')
east1_df = pd.DataFrame(file)
east1_df['Timestamp'] = pd.to_datetime(east1_df['Timestamp'], format='%Y-%m-%d %H:%M')
east1_df.index=east1__df['Timestamp']
rain['Timestamp'] = pd.to_datetime(rain['Timestamp'], format='%Y-%m-%d %H:%M')
rain.index=rain['Timestamp']#pd.DatetimeIndex([east1_spring_df['Timestamp']], dtype='datetime64[ns]', freq=None)
Solution 1:[1]
I am not aware of a pandas functionality to achieve this. However, there is a way to do this with numpy. You would just need to extract the data from the dataframe.
Using a boxcar function one can filter for events which span a certain period by convolving it with the rainfall data.
Here's a minimal example on how to achieve this using numpy:
import numpy as np
from datetime import datetime, timedelta
def datetime_range(start, end, delta):
result = []
current = start
while current < end:
result.append(current)
current += delta
return result
def create_boxcar(dry_hours, delta_minutes):
n_dry = dry_hours * 60 // delta_minutes
return np.ones(n_dry) / n_dry
def create_data(delta_minutes):
stamps = np.array(datetime_range(datetime(2022, 2, 23), datetime(2022, 2, 28), timedelta(minutes=delta_minutes)))
rainfall = np.random.randn(len(stamps))-1 # shifted normal distribution
rainfall[rainfall < 0] = 0 # coerce negative values to zero
sensor = np.arange(len(stamps)) # just a ramp
return stamps, rainfall, sensor
delta_minutes = 15
stamps, rainfall, sensor = create_data(delta_minutes)
# get dry regions
no_rainfall = (rainfall == 0).astype(int)
# create boxcar filter with desired length
dry_hours_before_read = 3
box_filter = create_boxcar(dry_hours_before_read, delta_minutes)
# get regions with desired dry period:
# Convolve boxcar and data, look for a result of 1,
# i.e full overlap of boxcar and no_rainfall
readout_region = np.convolve(no_rainfall, box_filter, 'same') == 1
# get timestamps and values during dry period
timestamp_dry_enough = stamps[readout_region]
sensor_dry_enough = sensor[readout_region]
After that manipulation, you could feed that information back to the dataframe for further pandas-based filtering:
east1_df[f'no rain for {dry_hours_before_read} hours'] = readout_region
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
