Keras SimpleRNN by-hand debugging

I plan to port my Keras model to embedded C code. Before doing that, I'm validating in Python whether my by-hand calculations are correct. With weights taken directly from the trained model, these calculations produce results that deviate from Keras. I'm looking for help to figure out whether there is an initial-state error, a mistake in the calculation, or something I haven't accounted for.

I have some model:

model = models.Sequential()
model.add(layers.SimpleRNN(units=48, input_shape = (n_historical,6), activation='selu'))
model.add(layers.Dense(units=4, activation="linear"))

that has been trained and validated. I have recreated and validated the SELU activation function by hand:

from math import exp

def selu(x, alpha = 1.6732632423543772848170429916717, scale = 1.0507009873554804934193349852946):
    if x >= 0:
        return scale * x
    else:
        return scale * alpha * (exp(x) - 1)
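
As an extra check that the constants match, the hand-written SELU can be compared with Keras' built-in activation on a few scalars (a quick sketch, assuming TensorFlow is importable in the validation script):

import tensorflow as tf

# compare the hand-written SELU with Keras' built-in one on a few scalars
for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    reference = float(tf.keras.activations.selu(tf.constant(x)))
    assert abs(selu(x) - reference) < 1e-5, (x, selu(x), reference)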

I obtain the weights from the trained model and the test data from the dataset that I used during training:

set_of_weights = model.get_weights()
(train_Y, train_X), (test_Y, test_X) = rnn_dataset(n_historical = 5)

From the shape of each weight array, I can determine how they relate to the network:

(6, 48)    # Input -> RNN node
(48, 48)   # RNN_prev -> RNN node
(48,)      # RNN bias
(48, 4)    # RNN -> output
(4,)       # output bias
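
From these shapes, each time step should compute h_t = selu(x_t · W_in + h_prev · W_rec + b_rnn), and the prediction is y = h_last · W_out + b_out. As a cross-check for the loop version below, here is a vectorized NumPy sketch of that forward pass (the names W_in, W_rec, b_rnn, W_out, b_out are just the entries of set_of_weights in the order listed above; each test sequence is assumed to have shape (n_historical, 6)):

import numpy as np

# unpack in the order reported by model.get_weights()
W_in, W_rec, b_rnn, W_out, b_out = set_of_weights

selu_v = np.vectorize(selu)   # element-wise version of the scalar SELU above

def forward(sequence, h0=None):
    # sequence: one test sample of shape (n_historical, 6); returns the 4 outputs
    h = np.zeros(48) if h0 is None else h0
    for x_t in np.asarray(sequence):
        h = selu_v(x_t @ W_in + h @ W_rec + b_rnn)   # SimpleRNN step
    return h @ W_out + b_out                         # linear Dense layer

If the weight order above is right, forward(test_X[i]) should match model.predict on the same sequence up to floating-point noise, which helps separate a math problem from a bookkeeping problem in the loops below.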

Right now I'm using an initial state which is all zero:

values_RNN_initial_state =   [0] * 48
values_RNN_prev =   values_RNN_initial_state
values_RNN =        [0] * 48
values_O =          [0] * 4

Then, for each input in the sequence, I calculate the network, starting with the RNN cells:

        # first determine the RNN cell values
        # it seems to be: act(sum(I * w_i) + sum(RNN_prev * w_rnn_prev) + bias)
        # (i indexes the test sequence, index_testdata the time step within it)
        for index_v_RNN in range(len(values_RNN)):
            s = 0
            # node += input * weight
            for index_v_I in range(len(test_X[i][index_testdata])):
                s += test_X[i][index_testdata][index_v_I] * set_of_weights[0][index_v_I][index_v_RNN]

            # node += RNN_prev * weight
            for index_v_RNN_prev in range(len(values_RNN_prev)):
                s += values_RNN_prev[index_v_RNN_prev] * set_of_weights[1][index_v_RNN_prev][index_v_RNN]

            # node output = selu(sum + bias)
            values_RNN[index_v_RNN] = selu(s + set_of_weights[2][index_v_RNN])
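
For reference, the two inner loops above are just dot products, so the whole layer for one time step can be cross-checked with a single vectorized expression (a sketch, assuming x_t = test_X[i][index_testdata] is the current input row and values_RNN_prev still holds the state from before this step):

import numpy as np

# vectorized equivalent of the per-unit loops for one time step
x_t = np.asarray(test_X[i][index_testdata])
h_prev = np.asarray(values_RNN_prev)          # state from before this time step
h_vec = np.vectorize(selu)(x_t @ set_of_weights[0]
                           + h_prev @ set_of_weights[1]
                           + set_of_weights[2])
# if the loops are right, h_vec matches values_RNN element-wise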

After that, calculate the output layer in a similar way (linear here is just the identity, linear(x) = x):

        # secondly determine the values of the output layer
        # it seems to be: linear(sum(RNN * w_rnn_out) + bias)
        for index_v_O in range(len(values_O)):
            s = 0
            for index_v_RNN in range(len(values_RNN)):
                s += values_RNN[index_v_RNN] * set_of_weights[3][index_v_RNN][index_v_O]
            values_O[index_v_O] = linear(s + set_of_weights[4][index_v_O])
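
Similarly, the output layer is a single matrix-vector product plus a bias, which gives another quick cross-check (same assumptions as above):

import numpy as np

# vectorized equivalent of the output-layer loop (linear activation is the identity)
o_vec = np.asarray(values_RNN) @ set_of_weights[3] + set_of_weights[4]
# if the loop is right, o_vec matches values_O element-wise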

Lastly, overwrite the previous RNN values with the current ones, ready for the next time step:

values_RNN_prev = values_RNN

Now, if we compare the test data, inference with the Keras model, and the calculation described above (see the comparison plot), the Keras model does not make a perfect prediction, but it is clearly much more stable than my by-hand calculation.
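
One way I could narrow down where the two start to diverge is to compare the hidden state after every time step instead of only the final prediction, e.g. with a throwaway Keras model that reuses the trained SimpleRNN weights and returns all time steps (a sketch, assuming n_historical, set_of_weights and test_X as above):

import numpy as np
from tensorflow.keras import layers, models

# debug model: same SimpleRNN, but returning the hidden state at every time step
debug_rnn = models.Sequential()
debug_rnn.add(layers.SimpleRNN(units=48, input_shape=(n_historical, 6),
                               activation='selu', return_sequences=True))
debug_rnn.set_weights(set_of_weights[:3])   # input kernel, recurrent kernel, bias

# shape (n_historical, 48): Keras' hidden state after each time step
keras_states = debug_rnn.predict(np.asarray(test_X[i])[np.newaxis, ...])[0]
# compare keras_states[t] with values_RNN after processing time step t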


