'Cannot add integral value to Timestamp without freq' error for ARIMA model although re-indexed with frequency
I'm trying to do a time series prediction using an ARIMA model on this series:
1960-01-01 12.7
1961-01-01 12.1
1962-01-01 12.7
1963-01-01 12.8
1964-01-01 12.3
1965-01-01 13.0
1966-01-01 12.5
1967-01-01 12.9
1968-01-01 12.9
1969-01-01 13.3
1970-01-01 13.2
1971-01-01 13.0
1972-01-01 12.6
1973-01-01 12.2
1974-01-01 12.4
1975-01-01 12.7
1976-01-01 12.6
1977-01-01 12.2
1978-01-01 12.5
1979-01-01 12.2
1980-01-01 12.2
1981-01-01 12.2
1982-01-01 12.1
1983-01-01 12.3
1984-01-01 11.7
1985-01-01 11.8
1986-01-01 11.5
1987-01-01 11.2
1988-01-01 11.0
1989-01-01 10.9
1990-01-01 10.8
1991-01-01 10.8
1992-01-01 10.6
1993-01-01 10.4
1994-01-01 10.2
1995-01-01 10.2
1996-01-01 10.2
1997-01-01 10.0
1998-01-01 9.8
1999-01-01 9.8
2000-01-01 9.6
2001-01-01 9.3
2002-01-01 9.4
2003-01-01 9.5
2004-01-01 9.1
2005-01-01 9.1
2006-01-01 9.0
2007-01-01 9.0
2008-01-01 9.0
2009-01-01 9.3
2010-01-01 9.2
2011-01-01 9.1
2012-01-01 9.4
2013-01-01 9.4
2014-01-01 9.2
2015-01-01 9.6
Name: Death rate, crude (per 1,000 people), dtype: float64
I use the following code to generate candidate (p, d, q) orders, fit a model for each one, record its AIC, and pick the order with the lowest AIC. I then use that order for prediction.
import datetime
import warnings
import itertools
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.metrics import mean_squared_error as mse

def MAPE(A, F):
    # Mean absolute percentage error between actual (A) and forecast (F) series
    Av = np.array(A.values)
    Fv = np.array(F.values)
    mape = np.mean(np.abs((Av - Fv) / Av)) * 100
    mape = np.around(mape, decimals=2)
    return mape
# Generate pdq combinations
p= d= q= range(7)
pdq = list(itertools.product(p, d, q))
# Choose min pdq corresponding to min AIC
warnings.filterwarnings('ignore')
param_aic = {}
for param in pdq:
    try:
        mod = sm.tsa.ARIMA(cmortS, order=param)
        result = mod.fit()
        param_aic[param] = result.aic
    except Exception:
        # Skip orders for which the model fails to fit
        continue
min_aic = min(param_aic.values())
min_param = ()
for pm, aic in param_aic.items():
    if aic == min_aic:
        min_param = pm
# Run the model with min pdq
model = sm.tsa.ARIMA(cmortS, order= min_param)
results = model.fit()
#Forecast validation
tp = ''
if min_param[1] > 0:
    tp = 'levels'
else:
    tp = 'linear'
train_sz = int(len(cmortS)*0.66)
train = cmortS[:train_sz]
tst = cmortS[train_sz:]
pred_strt = tst.index[0]
tst_pred = results.predict(start= pred_strt, typ= tp)
mserror = mse(tst, tst_pred)
mserror = np.round(mserror, decimals= 5)
mp = MAPE(tst, tst_pred)
print('Model order: {}, MAPE: {}%, mse: {}'.format(min_param, mp, mserror))
# Prediction
end_yr = '2050'
end_dt = pd.to_datetime(end_yr, format= '%Y')
strt_dt = pd.to_datetime('2014', format= '%Y')
Var_pred = results.predict(start= strt_dt, end= end_dt, typ = tp)
Var_pred
When I run it, I get the following error:
ValueError: Cannot add integral value to Timestamp without freq.
I still get the same error even after reindexing the series with a date range created with freq='AS'.
How can I solve that?
Solution 1:[1]
Changing the final few lines of your code to pass full datetime strings for the start and end dates should resolve the error:
# Prediction
strt_date = pd.to_datetime('2014-01-01 01:00:00')
end_date = pd.to_datetime('2050-01-01 01:00:00')
Var_pred = results.predict(start = strt_date, end = end_date, typ = tp)
Var_pred
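The wording of the error suggests that a timestamp is being advanced by a whole number of steps while the step size (the frequency) is unknown. As a minimal sketch, assuming the index was built from plain dates, the snippet below checks whether a series index actually carries a frequency and attaches one if it does not. The four-row series is a hypothetical stand-in for cmortS, and "YS" is the pandas year-start alias (the same offset as the "AS" used in the question). Whether this alone removes the error depends on the pandas and statsmodels versions in use, which the question does not state.

```python
import pandas as pd

# Hypothetical stand-in for the first rows of cmortS (values taken from the question)
dates = pd.to_datetime(["1960-01-01", "1961-01-01", "1962-01-01", "1963-01-01"])
cmortS = pd.Series([12.7, 12.1, 12.7, 12.8], index=dates)

# An index built from a list of dates has no frequency attached
print(cmortS.index.freq)

# asfreq conforms the series to a regular year-start grid and returns
# a new series whose index carries that frequency
cmortS_freq = cmortS.asfreq("YS")
print(cmortS_freq.index.freq)

# With a frequency available, stepping one period past the last
# observation is well defined
next_stamp = cmortS_freq.index[-1] + cmortS_freq.index.freq
print(next_stamp)
```

If the second print still shows no frequency for your own data, the dates are probably not evenly spaced on a year-start grid, and that would need fixing before the model is fitted.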
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Halee |
