The `start` argument could not be matched to a location related to the index of the data

I don't know why the `start` argument of my prediction won't work. I tried a few changes to the pd.to_datetime call, but none of them helped.

This is my code:

pred = results.get_prediction(start=pd.to_datetime('2018-06-01'), dynamic=False)
pred_ci = pred.conf_int()
ax = y['2015':].plot(label='observed')
pred.predicted_mean.plot(ax=ax, label='One-step ahead Forecast', alpha=.7, figsize=(14, 4))
ax.fill_between(pred_ci.index,
                pred_ci.iloc[:, 0],
                pred_ci.iloc[:, 1], color='k', alpha=.2)
ax.set_xlabel('Date')
ax.set_ylabel('Retail_sold')
plt.legend()
plt.show()

The error log always refers to my time format. Before starting the analysis I had to resample my data from daily to monthly, but I don't know why my data can't be read using pd.to_datetime.

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 1546300800000000000

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

KeyError: Timestamp('2019-01-01 00:00:00')

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 1546300800000000000

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
11 frames
pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

KeyError: Timestamp('2019-01-01 00:00:00')

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

KeyError: Timestamp('2019-01-01 00:00:00')

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/statsmodels/tsa/base/tsa_model.py in _get_prediction_index(self, start, end, index, silent)
    522             start, start_index, start_oos = self._get_index_label_loc(start)
    523         except KeyError:
--> 524             raise KeyError('The `start` argument could not be matched to a'
    525                            ' location related to the index of the data.')
    526         if end is None:

KeyError: 'The `start` argument could not be matched to a location related to the index of the data.'

I used Google Colab and Python 3.7.

Does anyone have a solution to my problem?



Solution 1:[1]

The underlying problem here is that your data doesn't have an index with an associated frequency, because your data skips days (for example going from 2016/2/5 to 2016/2/14).
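
A minimal sketch of how to check for this and repair it, using pandas only. The series is synthetic, and the names `daily` and `y`, the dropped range, and the choice of month-start labels (`"MS"`) are assumptions for illustration, not taken from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical daily series standing in for the real data.
days = pd.date_range("2016-01-01", "2018-12-31", freq="D")
rng = np.random.default_rng(0)
daily = pd.Series(rng.normal(size=len(days)), index=days)

# Simulate skipped days in the middle of the series.
daily = daily.drop(daily.index[35:44])
print(daily.index.freq)  # no frequency is attached to an index with gaps

# Resample to monthly values labelled at month start; the result has a frequency.
y = daily.resample("MS").mean()
print(y.index.freq)

# The start date must be one of the index labels (or extend the index regularly).
start = pd.Timestamp("2018-06-01")
print(start in y.index)
```

If the monthly index is labelled at month end instead, `2018-06-01` is not one of its labels, which may be what triggers the error after resampling.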

Solution 2:[2]

In a similar case, my problem was that I was passing the data as a list, and I needed to convert it to a pd.Series:

data_to_predict = pd.Series(input_data, index=myIndex)
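
A small self-contained sketch of that conversion. The values and the monthly index are made up, and `input_data` and `my_index` are hypothetical names:

```python
import pandas as pd

# Hypothetical raw values held in a plain Python list.
input_data = [10.0, 12.0, 11.0, 13.0]

# A regular monthly index (month-start labels) of the same length.
my_index = pd.date_range("2018-03-01", periods=len(input_data), freq="MS")

# Wrapping the list in a Series gives the model dates to match `start` against.
data_to_predict = pd.Series(input_data, index=my_index)
print(data_to_predict.index.freqstr)
print(pd.Timestamp("2018-06-01") in data_to_predict.index)
```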

Solution 3:[3]

You should set the dataset's index to the time column, like this:

df["Time"] = pd.to_datetime(df['Time'], infer_datetime_format=True)
df = df.set_index(["Time"])
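
A runnable sketch of the same idea on a toy frame. The column values are invented, and the `asfreq("MS")` step is an added assumption that the rows are already one per month start. Note that `infer_datetime_format` is deprecated in pandas 2.0 and later, so it is left out here:

```python
import pandas as pd

# Hypothetical frame whose dates arrive as strings.
df = pd.DataFrame({
    "Time": ["2018-04-01", "2018-05-01", "2018-06-01", "2018-07-01"],
    "Retail_sold": [10.0, 12.0, 11.0, 13.0],
})

# Parse the strings and use the result as the index.
df["Time"] = pd.to_datetime(df["Time"])
df = df.set_index("Time")

# Attach an explicit monthly frequency to the index.
df = df.asfreq("MS")
print(type(df.index).__name__)
print(df.index.freqstr)
```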

Solution 4:[4]

You can try passing integer positions instead of dates:

predictions = results.predict(start=train_data.shape[0], end=(train_data.shape[0] + test_data.shape[0] - 1), dynamic=False)
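
To see which positions those expressions produce, here is a sketch on a synthetic monthly series. The series `y` and the train/test split date are assumptions; only the `start` and `end` arithmetic mirrors the line above:

```python
import pandas as pd

# Hypothetical monthly series covering 2016-01 through 2018-12.
index = pd.date_range("2016-01-01", periods=36, freq="MS")
y = pd.Series(range(36), index=index, dtype=float)

# Assumed split: train up to May 2018, test from June 2018 onward.
train_data = y[:"2018-05-01"]
test_data = y["2018-06-01":]

# Integer positions of the first and last test observations.
start = train_data.shape[0]
end = train_data.shape[0] + test_data.shape[0] - 1
print(start, end)

# The same start position, looked up from the date.
print(y.index.get_loc(pd.Timestamp("2018-06-01")))
```

Because the positions are plain integers, no date lookup is performed on the index.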

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 user1670642
Solution 2 Enrique Benito Casado
Solution 3 RiveN
Solution 4 allexiusw