ARIMA forecasting flat line
I browsed most of the posts about similar problems, but I couldn't solve mine. I am new to R and I am trying to forecast power consumption based on past values. There isn't much data (8 days at 5-minute intervals). A sample is as follows: (image: time and power consumption sample data)
# Libraries required
library(dplyr)
library(tseries)
library(lubridate)
library(ggplot2)
library(forecast)
# Code start
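# fileEncoding = "UTF-8-BOM" strips the byte-order mark that otherwise turns the first column name into "ï..Time"
power_use <- read.csv("Power Use.csv", fileEncoding = "UTF-8-BOM")
time_vector <- dmy_hm(power_use$Time)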
ac13 = power_use$power.shelly1pm.8CAAB57782C4
# Fill missing data using "Impute"
# Find the columns with the missing data
list_na <- colnames(power_use)[ apply(power_use, 2, anyNA) ]
list_na
# Compute the mean of each column in the data while ignoring missing data
avg_missing <- apply(power_use[,colnames(power_use) %in% list_na],2,mean,na.rm = TRUE)
# Replace NA values with the mean value computed in the previous step
power_use_replaced <- power_use %>%
  mutate(ac13_mean_replaced = ifelse(is.na(ac13), avg_missing[14], ac13))
sum(is.na(power_use_replaced$power.shelly1pm.8CAAB57782C4))
sum(is.na(power_use_replaced$ac13_mean_replaced))
head(power_use_replaced)
# De-trending by first differencing (a lag-1 diff() does not remove seasonality)
AC13 <- power_use_replaced$ac13_mean_replaced
plot(time_vector,AC13,type= "l")
noSnoT = diff(AC13)
noSnoT
plot(time_vector[2:length(time_vector)], noSnoT, type = "l", xlab = "Time", ylab = "Energy in kWh")
# Dickey-Fuller Test (run on the differenced series)
Stationarity = adf.test(noSnoT, alternative = "stationary")
if (Stationarity$p.value <= 0.01) {
  print("The differenced AC13 series is stationary")
}
# ACF and PACF (partial autocorrelation function)
acf(AC13) # A slowly decaying ACF suggests that AC13 is not stationary
pacf(AC13)
acf(noSnoT)
pacf(noSnoT)
# Splitting data into training and validation period
train_per <- 0.8
limit <- floor(train_per * length(noSnoT))
train_noSnoT <- noSnoT[1:limit]
train_noSnoT
validation_noSnoT <- noSnoT[(limit+1):length(noSnoT)]
validation_noSnoT
arima_noSnoT = arima(train_noSnoT, order=c(0,0,1))
arima_noSnoT
checkresiduals(arima_noSnoT)
forecast_noSnoT = forecast(arima_noSnoT,h=length(validation_noSnoT), level=c(95))
forecast_noSnoT
plot(forecast_noSnoT)
Can anyone enlighten me as to how to solve my problem? I am getting a flat line over the validation period, which does not match the data in that period.
Solution 1:[1]
Searching for the right order of an ARIMA model by hand is generally a good approach, but I would rather recommend a tool like auto.arima() from the forecast package, which searches for the order automatically (by default it picks the model with the lowest AICc).
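A minimal sketch of the auto.arima() route, assuming the forecast package is installed. The series `y` is simulated and only stands in for the power readings; the 80/20 split mirrors the one in the question.

```r
library(forecast)

set.seed(42)
# Hypothetical stand-in for the power readings: an AR(1) process around 100
y <- 100 + as.numeric(arima.sim(model = list(ar = 0.7), n = 500))

# Same 80/20 split as in the question
limit <- floor(0.8 * length(y))
train <- y[1:limit]
valid <- y[(limit + 1):length(y)]

# auto.arima() chooses p, d and q itself (including the differencing order),
# so it is given the undifferenced training series
fit <- auto.arima(train)
fc  <- forecast(fit, h = length(valid), level = 95)

summary(fit)  # prints the selected order and the fitted coefficients
```

Calling plot(fc) afterwards draws the forecast with its 95% interval. Note that the selected model can still produce a forecast that flattens out over a long horizon.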
The order should be determined by the forecasting accuracy evaluated on test data, not by the appearance of the curve. Rob Hyndman has said multiple times that a flat forecast can, on many occasions, give better results than one with seasonality and trend.
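One way to compare candidates on held-out data is accuracy() from the forecast package. The sketch below uses simulated data (an assumption, not the original readings) and compares a deliberately flat forecast from meanf() with an auto.arima() fit.

```r
library(forecast)

set.seed(123)
# Hypothetical stand-in series; replace with the real readings
y <- 100 + as.numeric(arima.sim(model = list(ar = 0.7), n = 500))

limit <- floor(0.8 * length(y))
train <- y[1:limit]
valid <- y[(limit + 1):length(y)]
h <- length(valid)

fc_flat  <- meanf(train, h = h)                 # flat line at the training mean
fc_arima <- forecast(auto.arima(train), h = h)  # automatically selected ARIMA

# accuracy() reports error measures for the training set and, when the
# actual values are supplied, for the test set as well
acc_flat  <- accuracy(fc_flat, valid)
acc_arima <- accuracy(fc_arima, valid)

rbind(flat  = acc_flat["Test set", c("RMSE", "MAE")],
      arima = acc_arima["Test set", c("RMSE", "MAE")])
```

The model with the lower test-set error is the better forecaster on this split, whatever its curve looks like.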
If you want to avoid flat forecasts, I would recommend models like Facebook's prophet or linear_reg() from parsnip, as they tend to capture the curve's dynamics better.
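A rough sketch of the regression route with parsnip, assuming the package is installed. The data are simulated, and the sine/cosine time-of-day features (`sin_day`, `cos_day`) are an illustrative choice for 5-minute data with a daily cycle, not something taken from the question.

```r
library(parsnip)

set.seed(1)
# Hypothetical data: 3 days of 5-minute readings (288 per day) with a daily cycle
n_per_day <- 288
n   <- 3 * n_per_day
idx <- seq_len(n)
sim <- data.frame(
  power   = 100 + 30 * sin(2 * pi * idx / n_per_day) + rnorm(n, sd = 5),
  sin_day = sin(2 * pi * idx / n_per_day),
  cos_day = cos(2 * pi * idx / n_per_day)
)

# First two days for training, last day for validation
train <- sim[1:(2 * n_per_day), ]
test  <- sim[(2 * n_per_day + 1):n, ]

# Linear regression on the time-of-day features via parsnip's lm engine
spec   <- set_engine(linear_reg(), "lm")
lm_fit <- fit(spec, power ~ sin_day + cos_day, data = train)

# predict() returns a tibble with a .pred column
pred <- predict(lm_fit, new_data = test)
rmse <- sqrt(mean((test$power - pred$.pred)^2))
rmse
```

Because the predictors vary with the time of day, the forecast follows the daily shape instead of settling on a constant. Whether that beats ARIMA on the real data still has to be checked on the validation period.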
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Leonhard Geisler |
