How to calculate frequency (freq) when using seasonal_decompose()

I am trying to separate the seasonality, trend and residual components of the time series in 'XYZ.csv' (sales data collected over 2 years).

[XYZ.csv contains 2 columns - date and sales. Date has been set as an index within the code.]

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv('XYZ.csv')
df.date = pd.to_datetime(df.date)
df.set_index('date', inplace=True)

res = sm.tsa.seasonal_decompose(df.colA.interpolate(), freq=?, model='additive')
resplot = res.plot()
observed = res.observed
seasonality = res.seasonal

This code works fine. My only difficulty is understanding how to calculate the frequency for this time series, and whether there is a predefined way to do it. Thanks in advance for any help or suggestions!



Solution 1:[1]

A very brute-force approach is to search for the period that minimizes the residuals by trying every candidate period:

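res_vs_lag = {}
# Recent statsmodels releases name this argument `period`; older ones used `freq`.
for p in range(1, 250):
    # Interpolate first: seasonal_decompose does not handle missing values.
    res = sm.tsa.seasonal_decompose(df.colA.interpolate(), period=p, model='additive')
    # Use the mean, not the sum: the NaN edges of resid grow with p, so sums are not comparable.
    res_vs_lag[p] = res.resid.abs().mean()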

Then you can plot the resulting series:

pd.Series(res_vs_lag).plot()

A more elegant approach would rely on autocorrelation or spectral analysis; see, for example, the statsmodels partial autocorrelation function (https://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.pacf.html).
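To make the autocorrelation idea concrete, here is a minimal sketch. It is not part of the original answer. It uses plain NumPy rather than statsmodels so that it runs on its own, and the helper name `estimate_period` and the synthetic `sales` array are made up for illustration. The idea is to remove a linear trend, compute the sample autocorrelation, and take the lag of the highest local peak as a candidate period.

```python
import numpy as np


def estimate_period(values, max_lag=None):
    """Return the lag of the strongest autocorrelation peak, or None if there is none."""
    x = np.asarray(values, dtype=float)
    n = x.size
    # Remove a straight-line trend so slow drift does not dominate the autocorrelation.
    t = np.arange(n)
    slope, intercept = np.polyfit(t, x, 1)
    x = x - (slope * t + intercept)
    if max_lag is None:
        max_lag = n // 2
    # Sample autocorrelation for lags 0..max_lag, scaled so that acf[0] == 1.
    full = np.correlate(x, x, mode="full")[n - 1:]
    acf = full[: max_lag + 1] / full[0]
    # Interior local maxima: lags whose autocorrelation beats both neighbours.
    lags = np.arange(1, max_lag)
    is_peak = (acf[1:-1] > acf[:-2]) & (acf[1:-1] > acf[2:])
    peaks = lags[is_peak]
    if peaks.size == 0:
        return None
    return int(peaks[np.argmax(acf[peaks])])


# Synthetic stand-in for the sales column: 104 weeks of daily data with a
# weekly cycle, a linear trend and a little noise (all values are made up).
rng = np.random.default_rng(0)
days = np.arange(728)
noise = rng.normal(scale=0.1, size=days.size)
sales = 100 + 0.05 * days + 10 * np.sin(2 * np.pi * days / 7) + noise

period = estimate_period(sales)
print(period)
```

On real data you would pass something like `df.colA.interpolate().to_numpy()` and use the result as the `period` argument of `seasonal_decompose`. Real sales data is much noisier than this example, so treat the returned lag as a candidate and check it against a plot of the autocorrelation (`statsmodels.tsa.stattools.acf` computes the same quantity) before relying on it.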

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 dokteurwho