pandas interpolation and extrapolation by timestamp, per id
I want to linearly interpolate and extrapolate `strength` over the timestamps, separately for each id.
The timestamps run from 1383260400000 to 1383343800000 (epoch milliseconds) in 10-minute steps, and every other id (ids range from 1 to 2025) has the same gaps.
Expected output:
| timestamp | id | strength |
|---|---|---|
| 1383260400000 | 1 | -0.3803901328171995 |
| 1383261000000 | 1 | -0.42196042219455937 |
| 1383261600000 | 1 | Linear interpolated data |
| 1383262200000 | 1 | Linear interpolated data |
| 1383262800000 | 1 | Linear interpolated data |
| 1383263400000 | 1 | Linear interpolated data |
| 1383264000000 | 1 | Linear interpolated data |
| 1383264600000 | 1 | Linear interpolated data |
| 1383265200000 | 1 | -0.460714706261982 |
Some ids are also missing values at the start timestamp (1383260400000) or the end timestamp (1383343800000), so those rows have to be extrapolated:
| timestamp | id | strength |
|---|---|---|
| 1383260400000 | 3 | Linear interpolated data |
| 1383261000000 | 3 | Linear interpolated data |
| 1383261600000 | 3 | Linear interpolated data |
| 1383262200000 | 3 | Linear interpolated data |
| 1383262800000 | 3 | some values |
| 1383263400000 | 3 | some values |
| ... | ... | ... |
| 1383343800000 | 3 | Linear interpolated data |
I want to interpolate the gaps in the middle and extrapolate the missing values at both ends.
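To make the edge case concrete, here is a minimal sketch of linear interpolation plus linear extrapolation for a single id, using NumPy only. The helper name `linear_fill` is hypothetical, and the sketch assumes `x` is sorted in increasing order and that the edge slope is taken from the two nearest valid points, which is one reasonable reading of "linear extrapolation" but not the only one.

```python
import numpy as np

def linear_fill(x, y):
    """Fill NaNs in y: linear interpolation inside, linear extrapolation at the edges.

    Hypothetical helper; assumes x is sorted in increasing order.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ok = ~np.isnan(y)
    if ok.sum() < 2:
        # Fewer than two valid points cannot define a line; return unchanged.
        return y.copy()
    xv, yv = x[ok], y[ok]
    # np.interp handles the interior; outside the valid range it clamps
    # to the first/last valid value, which is overwritten below.
    out = np.interp(x, xv, yv)
    left = x < xv[0]
    right = x > xv[-1]
    # Extend the line through the first two valid points to the left.
    out[left] = yv[0] + (x[left] - xv[0]) * (yv[1] - yv[0]) / (xv[1] - xv[0])
    # Extend the line through the last two valid points to the right.
    out[right] = yv[-1] + (x[right] - xv[-1]) * (yv[-1] - yv[-2]) / (xv[-1] - xv[-2])
    return out

# Toy data: valid values only at x=1 and x=3.
filled = linear_fill([0, 1, 2, 3, 4, 5], [np.nan, 1.0, np.nan, 3.0, np.nan, np.nan])
```

In the real data, `x` would be the millisecond timestamps of one id on the full 10-minute grid and `y` the matching `strength` values, with NaN where a row is missing.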
Here is the code someone else wrote:
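    import pandas as pd

    def interpolation():
        # Full 10-minute grid from 1383260400000 to 1383343800000 (epoch ms)
        r = pd.date_range(pd.to_datetime(1383260400000, unit='ms'),
                          pd.to_datetime(1383343800000, unit='ms'),
                          freq='10Min')
        # df3['strength'] = df3['strength'].replace(0, np.nan)
        ids = df3['id'].unique()
        mux = pd.MultiIndex.from_product([r, ids], names=['timestamp', 'id'])
        # The keyword is limit_direction (the original had limited_direction).
        # limit_area='inside' only fills NaNs surrounded by valid values, so
        # leading and trailing gaps stay NaN and become 0 via fillna(0).
        f = lambda x: x.interpolate(limit_area='inside', limit_direction='both')
        # Assumes df3['timestamp'] is already datetime64 so it matches the grid r
        df4 = (df3.set_index(['timestamp', 'id'])
                  .reindex(mux)
                  .groupby('id')['strength']
                  .transform(f).fillna(0)
                  .reset_index())
        return df4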
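For comparison, here is a hedged sketch of the same reindex-and-fill pipeline with `limit_area` dropped, so that the edges are filled as well. The sample frame and the function name `fill_per_id` are hypothetical stand-ins for `df3`. Note the assumption: with `method='linear'` and `limit_direction='both'`, pandas holds the nearest valid value at the edges rather than extending the slope, so this is not true linear extrapolation.

```python
import pandas as pd

# Hypothetical sample standing in for df3; timestamps are epoch milliseconds.
df3 = pd.DataFrame({
    "timestamp": [1383260400000, 1383261600000, 1383261000000, 1383261600000],
    "id": [1, 1, 3, 3],
    "strength": [0.0, 2.0, 5.0, 7.0],
})

def fill_per_id(df, start_ms, end_ms, step="10min"):
    # Regular grid covering the whole requested range.
    grid = pd.date_range(pd.to_datetime(start_ms, unit="ms"),
                         pd.to_datetime(end_ms, unit="ms"), freq=step)
    # One row per (id, timestamp) pair, including the missing ones.
    mux = pd.MultiIndex.from_product([df["id"].unique(), grid],
                                     names=["id", "timestamp"])
    out = (df.assign(timestamp=pd.to_datetime(df["timestamp"], unit="ms"))
             .set_index(["id", "timestamp"])
             .reindex(mux))
    # Interpolate within each id; limit_direction='both' also fills the
    # leading and trailing NaNs, using the nearest valid value.
    out["strength"] = (out.groupby(level="id")["strength"]
                          .transform(lambda s: s.interpolate(method="linear",
                                                             limit_direction="both")))
    return out.reset_index()

result = fill_per_id(df3, 1383260400000, 1383262200000)
```

If the edges need to follow the slope of the neighbouring points instead, the per-id fill step could be swapped for a custom function applied to each group, while the grid construction and `reindex` stay the same.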
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
