How to calculate frequency (freq) when using seasonal_decompose()
I am trying to separate the seasonality, trend and residual components of the time series in 'XYZ.csv' (sales data collected over 2 years).
[XYZ.csv contains 2 columns: date and sales. The date column is set as the index in the code.]
import pandas as pd
import statsmodels.api as sm
df = pd.read_csv('XYZ.csv')
df.date=pd.to_datetime(df.date)
df.set_index('date',inplace=True)
res = sm.tsa.seasonal_decompose(df.colA.interpolate(), freq=?, model='additive')
resplot= res.plot()
observed = res.observed
seasonality = res.seasonal
This code works fine. My only difficulty is understanding how to calculate the frequency for this time series. Is there a predefined way to do it? Thanks in advance for any help or suggestions!
Solution 1:[1]
A very brute-force approach is to search for the period that minimizes the residuals by trying every candidate period:
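res_vs_lag = {}
# Try every candidate period and record the total absolute residual
for p in range(1, 250):
    # interpolate() fills gaps, as in the question; seasonal_decompose rejects NaNs
    res = sm.tsa.seasonal_decompose(df.colA.interpolate(), period=p, model='additive')
    # resid is NaN at the series edges; sum() skips those values
    res_vs_lag[p] = res.resid.abs().sum()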
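Then you can plot the resulting series and look for pronounced dips (very short periods give small residuals trivially, because the moving-average trend then follows the data closely):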
pd.Series(res_vs_lag).plot()
A more elegant approach would rely on autocorrelation or spectral analysis (https://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.pacf.html).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | dokteurwho |
