'How to predict next X values in my model?

I have this program that is used to predict outflow of a reservoir and I cannot seem to get it to predict the next day worth of data. The data is in 30 minute increments and would expect 48 points of data to represent the next 24 hours. Is there something I am doing incorrect or have missed?

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import datetime as dt
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout

# Prepare data
data = pd.read_csv('data.csv')
data["Date"] = pd.to_datetime(data["Date"])

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data["Outlet"].values.reshape(-1, 1))

prediction_days = 30

x_train, y_train = [], []

for x in range(prediction_days, len(scaled_data)):
    x_train.append(scaled_data[x-prediction_days:x, 0])
    y_train.append(scaled_data[x, 0])

x_train, y_train = np.array(x_train), np.array(x_train)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

# Create neural network
model = Sequential()
model.add(LSTM(units=50, return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=50))
model.add(Dropout(0.2))
model.add(Dense(units=1))

model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x_train, y_train, epochs=25)

# Test the model
test_start = dt.datetime(2022, 1, 1)
test_end = dt.datetime.now()
actual_outlet = data[(data['Date'] >= test_start) & (data['Date'] <= test_end)]['Outlet']
total_dataset = pd.concat((data['Outlet'], actual_outlet), axis=0)

model_inputs = total_dataset[len(total_dataset) - len(actual_outlet) - prediction_days:].values
model_inputs = model_inputs.reshape(-1, 1)
model_inputs = scaler.transform(model_inputs)

x_test = []

for x in range(prediction_days, len(model_inputs)):
    x_test.append(model_inputs[x-prediction_days:x, 0])

x_test = np.array(x_test)
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))

prediction_outlet = model.predict(x_test)
prediction_outlet = scaler.inverse_transform(prediction_outlet)
print(len(prediction_outlet))

The length of my prediction_outlet comes to 1056. Have I missed some steps?



Solution 1:[1]

Your test data has 22 days of data since you are taking the slice from test_start to today. since you have 48 data points for each day it adds up to 1056 data points to predict. So you are making the right length of predictions. On the other hand, since you don't have it ordered some of the dates are mixed up. In one instance you have half of the data points from Feb-02 and the rest from Jan-13 which will be very confusing for the network. You might want to order them and make sure you don't take missing dates in your slices.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Omer Faruk Yolal