Predicting a simple autoregressive model with a fully connected network

The question is at the end, so you can jump straight to it. I just wanted to share my process in case someone wants to give me general advice.

I started learning how to use LSTM layers and tried to build a simple predictor for the following AR model:

import random


class AR_model:
    """Iterator that yields `length` samples of an AR(3) series."""

    def __init__(self, length=100):
        self.time = 0
        self.first_value = 0
        self.a1 = 0.6
        self.a2 = -0.5
        self.a3 = -0.2
        self.Xt = self.first_value
        self.Xt_minus_1 = 0
        self.Xt_minus_2 = 0
        self.length = length

    def __iter__(self):
        return self

    def __next__(self):
        # stop once `length` samples have been produced
        if self.time == self.length:
            raise StopIteration

        new_value = (self.a1 * self.Xt
                     + self.a2 * self.Xt_minus_1
                     + self.a3 * self.Xt_minus_2
                     + random.uniform(0, 0.1))  # noise term Ut

        # shift the stored lags by one step
        self.Xt_minus_2 = self.Xt_minus_1
        self.Xt_minus_1 = self.Xt
        self.Xt = new_value

        self.time += 1

        return new_value

which basically corresponds to the following series:
X_t = a1 * X_{t-1} + a2 * X_{t-2} + a3 * X_{t-3} + U_t
where a1 = 0.6, a2 = −0.5, a3 = −0.2 and U_t (i.i.d.) ∼ Uniform(0, 0.1)
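
As a quick sanity check on this process, here is a minimal sketch (an addition, not part of my original code) that simulates the same recursion with the standard library and compares the sample mean with the stationary mean implied by the equation, E[U] / (1 - a1 - a2 - a3) = 0.05 / 1.1 ≈ 0.045. The function name simulate_ar3, the seed, and the burn-in length are assumptions made for illustration.

```python
import random
import statistics

A1, A2, A3 = 0.6, -0.5, -0.2  # coefficients from the question


def simulate_ar3(n, seed=0, burn_in=100):
    """Return n samples of X_t = A1*X_{t-1} + A2*X_{t-2} + A3*X_{t-3} + U_t."""
    rng = random.Random(seed)      # seeded for reproducibility (assumption)
    x1 = x2 = x3 = 0.0             # X_{t-1}, X_{t-2}, X_{t-3}
    samples = []
    for t in range(n + burn_in):
        x = A1 * x1 + A2 * x2 + A3 * x3 + rng.uniform(0, 0.1)
        x1, x2, x3 = x, x1, x2     # shift the lags by one step
        if t >= burn_in:           # drop the start-up transient
            samples.append(x)
    return samples


series = simulate_ar3(20000)
sample_mean = statistics.fmean(series)
theoretical_mean = 0.05 / (1 - A1 - A2 - A3)   # E[U] = 0.05 for Uniform(0, 0.1)
print(f"sample mean      = {sample_mean:.4f}")
print(f"theoretical mean = {theoretical_mean:.4f}")
```

Because the noise is drawn from Uniform(0, 0.1) rather than a zero-mean distribution, the series settles around a small nonzero mean instead of zero.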

I used an LSTM followed by a linear layer, with the following forward method:

def forward(self, input):
    # input: [Batch x seq_length x input_size]

    x, _ = self.lstm(input)
    # x:     [Batch x seq_length x hidden_state]

    x = x[:, -1, :]
    # taking only the last time step   x:     [Batch x hidden_state]

    x = self.linear(x)
    # x:     [Batch x 1]
    return x
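
I did not show how the training batches were built. One plausible way (an assumption, not necessarily what my code does) to get inputs of shape [Batch x seq_length x input_size] with input_size = 1 is to slice each signal into sliding windows and use the value right after each window as the target. The helper name make_windows is hypothetical, and plain lists are used so the sketch runs without any framework.

```python
def make_windows(series, seq_length):
    """Split a 1-D series into (inputs, targets) for one-step-ahead prediction.

    inputs:  [batch][seq_length][1]   (input_size = 1)
    targets: [batch][1]
    """
    inputs, targets = [], []
    for start in range(len(series) - seq_length):
        window = series[start:start + seq_length]
        inputs.append([[value] for value in window])   # add the input_size axis
        targets.append([series[start + seq_length]])   # value right after the window
    return inputs, targets


demo = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
x, y = make_windows(demo, seq_length=3)
print(len(x), len(x[0]), len(x[0][0]))  # batch, seq_length, input_size
print(x[0], y[0])
```

The nested lists can then be converted to tensors before being passed to the forward method.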

The best result seems OK (picture of results, 91 steps), with the following hyperparameters:

signal_count = 50   
signal_length = 200  
hidden_state = 200  
learning_rate = 0.1 

I also tried it on sine and triangle waves:
sin wave 20 steps
tri wave 75 steps
The triangle wave might have worked with a deeper network, but I didn't bother to try.

Question 1

It makes sense to me that for a simple AR model such as:
X_t = a1 * X_{t-1} + a2 * X_{t-2} + a3 * X_{t-3} + U_t
where a1 = 0.6, a2 = −0.5, a3 = −0.2 and U_t (i.i.d.) ∼ Uniform(0, 0.1)

it should be possible to get a good prediction with a simple one-layer fully connected network with three inputs, where the inputs are the last three values of the AR series.

But I just get terrible results. Even when I remove the noise from the AR model, I still get bad results. Am I wrong to think this? I didn't post the code because I think it's a conceptual problem. If someone asks, I will post it.
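
For reference, a single fully connected layer with three inputs, one output, a bias, and no activation is a linear regression on the last three values, so ordinary least squares gives the lowest training squared error such a layer can reach. The following is a minimal NumPy sketch under that assumption. It is not the network I trained, and the names generate_series and fit_linear_layer, the seed, and the sample count are illustrative choices.

```python
import numpy as np

A = np.array([0.6, -0.5, -0.2])  # a1, a2, a3 from the question


def generate_series(n, rng):
    """Generate n samples of the AR(3) series with Uniform(0, 0.1) noise."""
    x = np.zeros(n + 3)                      # three leading zeros as initial lags
    noise = rng.uniform(0.0, 0.1, size=n + 3)
    for t in range(3, n + 3):
        # x[t-3:t][::-1] is [X_{t-1}, X_{t-2}, X_{t-3}]
        x[t] = A @ x[t - 3:t][::-1] + noise[t]
    return x[3:]


def fit_linear_layer(series):
    """Least-squares weights and bias for X_t from [X_{t-1}, X_{t-2}, X_{t-3}]."""
    lags = np.column_stack([series[2:-1], series[1:-2], series[:-3]])
    design = np.column_stack([lags, np.ones(len(lags))])   # extra column = bias
    target = series[3:]
    solution, *_ = np.linalg.lstsq(design, target, rcond=None)
    return solution[:3], solution[3]


rng = np.random.default_rng(0)
series = generate_series(5000, rng)
weights, bias = fit_linear_layer(series)
print("weights:", np.round(weights, 3))   # should land near 0.6, -0.5, -0.2
print("bias:   ", round(float(bias), 3))  # should land near E[U] = 0.05
```

Comparing a trained layer's weights against these values is one way to tell whether a poor result comes from the model class or from the training setup.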

Question 2

For the above AR model, what simple predictor would you recommend, not necessarily one based on deep learning? Friends have recommended a Kalman filter and Markov-based methods, but I haven't really checked them out yet.

Thank you for reading



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source