Keras SimpleRNN by-hand debugging
I plan to port my Keras model to embedded C code. As a first step, I'm validating in Python whether my hand calculations are correct. With weights taken directly from a trained model, my results contain errors, and I'm looking for help determining whether the cause is an incorrect initial state, a mistake in the calculation, or something unaccounted for.
I have some model:
model = models.Sequential()
model.add(layers.SimpleRNN(units=48, input_shape = (n_historical,6), activation='selu'))
model.add(layers.Dense(units=4, activation="linear"))
that has been trained and validated. I have recreated and validated the SELU activation function by hand:
from math import exp

def selu(x, alpha = 1.6732632423543772848170429916717, scale = 1.0507009873554804934193349852946):
    if x >= 0:
        return scale * x
    else:
        return scale * alpha * (exp(x) - 1)
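The same activation can be vectorized with NumPy (a sketch using the same alpha/scale constants), which is handy for checking many values at once against the scalar version:

```python
import numpy as np

def selu_np(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    # elementwise: scale*x for x >= 0, scale*alpha*(exp(x)-1) otherwise
    x = np.asarray(x, dtype=np.float64)
    return np.where(x >= 0, scale * x, scale * alpha * (np.exp(x) - 1.0))
```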
I obtain the weights and test data from the dataset that I have used during training:
set_of_weights = model.get_weights()
(train_Y, train_X), (test_Y, test_X) = rnn_dataset(n_historical = 5)
Printing the shape of each weight array, I can determine their roles:
(6, 48)    # Input -> RNN node
(48, 48)   # RNN_prev -> RNN node
(48,)      # RNN bias
(48, 4)    # RNN -> output
(4,)       # output bias
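Those shapes map to one matrix product per term, so a single time step can be written compactly with NumPy. This is a sketch with random weights standing in for `set_of_weights`; the names `W_in`, `W_rec`, etc. are my own, not Keras API:

```python
import numpy as np

rng = np.random.default_rng(0)
W_in  = rng.standard_normal((6, 48))   # input    -> RNN  (set_of_weights[0])
W_rec = rng.standard_normal((48, 48))  # RNN_prev -> RNN  (set_of_weights[1])
b_rnn = rng.standard_normal(48)        # RNN bias         (set_of_weights[2])
W_out = rng.standard_normal((48, 4))   # RNN -> output    (set_of_weights[3])
b_out = rng.standard_normal(4)         # output bias      (set_of_weights[4])

def selu_np(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    return np.where(x >= 0, scale * x, scale * alpha * (np.exp(x) - 1.0))

x_t    = rng.standard_normal(6)        # one input vector
h_prev = np.zeros(48)                  # previous hidden state

h_t = selu_np(x_t @ W_in + h_prev @ W_rec + b_rnn)  # shape (48,)
y_t = h_t @ W_out + b_out                           # shape (4,), linear output
```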
Right now I'm using an all-zero initial state, which matches the Keras default:
values_RNN_initial_state =   [0] * 48
values_RNN_prev =   values_RNN_initial_state
values_RNN =        [0] * 48
values_O =          [0] * 4
Then for each input in my input sequence, I calculate the network, starting with the RNN cells:
        # first determine values rnn
        # it seems to be act(sum(I*wi) + sum(RNN_prev*w_rnn_prev) + bias) 
        for index_v_RNN in range(len(values_RNN)):
            s = 0
            # Node += input * weight
            for index_v_I in range(len(test_X[i][index_testdata])):
                s += test_X[i][index_testdata][index_v_I] * set_of_weights[0][index_v_I][index_v_RNN]
            # Node += RNN_prev * weight
            for index_v_RNN_prev in range(len(values_RNN_prev)):
                s += values_RNN_prev[index_v_RNN_prev] * set_of_weights[1][index_v_RNN_prev][index_v_RNN]
            #node output = selu(sum + bias)    
            values_RNN[index_v_RNN] = selu(s + set_of_weights[2][index_v_RNN])
After that, I calculate the output layer in a similar way:
        # secondly determine the values of the output layer
        # it seems to be linear(sum(Rnn & w_rnn_out) + bias) 
        for index_v_O in range(len(values_O)):
            s = 0
            for index_v_RNN in range(len(values_RNN)):
                s += values_RNN[index_v_RNN] * set_of_weights[3][index_v_RNN][index_v_O]
            values_O[index_v_O] = linear(s + set_of_weights[4][index_v_O])
Lastly, overwrite the previous RNN values with the current ones for the next iteration. Note that a plain `values_RNN_prev = values_RNN` only rebinds the name to the same list object, so the next time step would read partially updated values; copying the list avoids that aliasing:
        # copy, so the next time step sees a stable previous state
        values_RNN_prev = list(values_RNN)
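To localize whether a discrepancy comes from the recurrence or from the readout, the element-wise loop can be cross-checked against an equivalent matrix-form pass over a random sequence. This is a sketch; `selu_np` and the weight names are my own stand-ins, not the trained weights:

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, n_in, n_units, n_out = 5, 6, 48, 4
W_in  = rng.standard_normal((n_in, n_units))
W_rec = rng.standard_normal((n_units, n_units))
b_rnn = rng.standard_normal(n_units)
W_out = rng.standard_normal((n_units, n_out))
b_out = rng.standard_normal(n_out)
seq   = rng.standard_normal((n_steps, n_in))

def selu_np(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    return np.where(x >= 0, scale * x, scale * alpha * (np.exp(x) - 1.0))

# matrix-form reference
h = np.zeros(n_units)
for x_t in seq:
    h = selu_np(x_t @ W_in + h @ W_rec + b_rnn)
y_ref = h @ W_out + b_out

# element-wise loop, mirroring the hand calculation above
h_prev = [0.0] * n_units
for t in range(n_steps):
    h_new = [0.0] * n_units
    for j in range(n_units):
        s = sum(seq[t][i] * W_in[i][j] for i in range(n_in))
        s += sum(h_prev[k] * W_rec[k][j] for k in range(n_units))
        h_new[j] = float(selu_np(s + b_rnn[j]))
    h_prev = h_new            # fresh list each step: no aliasing
y_loop = [sum(h_prev[j] * W_out[j][o] for j in range(n_units)) + b_out[o]
          for o in range(n_out)]
```

If `y_ref` and `y_loop` agree (up to floating-point rounding), the loop logic is sound and any remaining error lies in how the real weights or inputs are fed in.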
Now if we compare the test data, inference with the Keras model, and the calculation described above:
The Keras model does not predict perfectly, but it is clearly much more stable than my hand calculation.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow