Keras SimpleRNN by hand debugging
I plan to transfer my Keras model to embedded C code. In order to do that, I'm first validating in Python whether my calculations are correct. With weights taken directly from the trained model, these calculations produce errors. I'm looking for help debugging whether there is an initial-state error, a mistake in the calculation, or something unaccounted for.
I have some model:
model = models.Sequential()
model.add(layers.SimpleRNN(units=48, input_shape = (n_historical,6), activation='selu'))
model.add(layers.Dense(units=4, activation="linear"))
that has been trained and validated. I have recreated and validated the SELU activation function by hand:
from math import exp

def selu(x, alpha = 1.6732632423543772848170429916717, scale = 1.0507009873554804934193349852946):
    if x >= 0:
        return scale * x
    else:
        return scale * alpha * (exp(x) - 1)
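For validating against Keras it can help to have a vectorized version of the same SELU, since it then applies elementwise to whole state vectors at once. A minimal sketch (the name `selu_np` and the use of NumPy are my additions, not part of the original code):

```python
import numpy as np

ALPHA = 1.6732632423543772848170429916717
SCALE = 1.0507009873554804934193349852946

def selu_np(x):
    """Elementwise SELU: scale * x for x >= 0, scale * alpha * (exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(x >= 0, SCALE * x, SCALE * ALPHA * (np.exp(x) - 1.0))

# Sanity checks: SELU is zero at zero and just a scaling for positive inputs
print(float(selu_np(0.0)))  # 0.0
print(float(selu_np(1.0)))  # ~1.0507
```

One subtlety when comparing against Keras: Keras computes in float32, so small discrepancies against a float64 hand calculation are expected and are not themselves a bug.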
I obtain the weights and test data from the dataset that I have used during training:
set_of_weights = model.get_weights()
(train_Y, train_X), (test_Y, test_X) = rnn_dataset(n_historical = 5)
By printing the shape of each weight array, I can determine what each one corresponds to:
(6, 48) # Input -> RNN node
(48, 48) # RNN_prev -> RNN node
(48,) # RNN bias
(48, 4) # RNN -> output
(4,) # output bias
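Unpacking `get_weights()` into named arrays makes the later indexing less error-prone than `set_of_weights[0]` through `set_of_weights[4]`. A sketch with zero-filled placeholders of the same shapes (the names `W_in`, `W_rec`, etc. are my own; in practice the arrays come straight from `model.get_weights()`):

```python
import numpy as np

# Placeholders with the shapes reported above; with a real model this is:
# W_in, W_rec, b_rnn, W_out, b_out = model.get_weights()
W_in  = np.zeros((6, 48))   # input -> RNN kernel
W_rec = np.zeros((48, 48))  # RNN_prev -> RNN recurrent kernel
b_rnn = np.zeros((48,))     # RNN bias
W_out = np.zeros((48, 4))   # RNN -> output kernel
b_out = np.zeros((4,))      # output bias

print([w.shape for w in (W_in, W_rec, b_rnn, W_out, b_out)])
# [(6, 48), (48, 48), (48,), (48, 4), (4,)]
```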
Right now I'm using an all-zero initial state, which matches the Keras default:
values_RNN_initial_state = [0] * 48
values_RNN_prev = values_RNN_initial_state
values_RNN = [0] * 48
values_O = [0] * 4
Then for each input in my input sequence, I calculate the network, starting with the RNN cells:
# first determine values rnn
# it seems to be act(sum(I*wi) + sum(RNN_prev*w_rnn_prev) + bias)
for index_v_RNN in range(len(values_RNN)):
    s = 0
    # Node += input * weight
    for index_v_I in range(len(test_X[i][index_testdata])):
        s += test_X[i][index_testdata][index_v_I] * set_of_weights[0][index_v_I][index_v_RNN]
    # Node += RNN_prev * weight
    for index_v_RNN_prev in range(len(values_RNN_prev)):
        s += values_RNN_prev[index_v_RNN_prev] * set_of_weights[1][index_v_RNN_prev][index_v_RNN]
    # node output = selu(sum + bias)
    values_RNN[index_v_RNN] = selu(s + set_of_weights[2][index_v_RNN])
After that, calculate the output layer in a similar way:
# secondly determine the values of the output layer
# it seems to be linear(sum(RNN * w_rnn_out) + bias)
for index_v_O in range(len(values_O)):
    s = 0
    for index_v_RNN in range(len(values_RNN)):
        s += values_RNN[index_v_RNN] * set_of_weights[3][index_v_RNN][index_v_O]
    # the linear activation is the identity, so no transformation is applied
    values_O[index_v_O] = s + set_of_weights[4][index_v_O]
Lastly, overwrite the previous RNN values with the current ones for the next iteration. Note that this must be a copy: a plain assignment (`values_RNN_prev = values_RNN`) makes both names refer to the same list, so writing into `values_RNN` during the next timestep would corrupt the previous state mid-calculation.
# lastly overwrite prev_rnn with a copy of rnn for the next timestep
values_RNN_prev = list(values_RNN)
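The whole hand calculation above collapses to a few matrix products in NumPy, which is much easier to compare against `model.predict` one timestep at a time. A sketch under the assumption that the weights are unpacked in `get_weights()` order (the names `rnn_forward`, `W_in`, `W_rec`, etc. are mine); because `h` is reassigned each step rather than mutated in place, there is no aliasing between current and previous state, unlike a plain `values_RNN_prev = values_RNN` list assignment:

```python
import numpy as np

def selu_np(x, alpha=1.6732632423543772848170429916717,
            scale=1.0507009873554804934193349852946):
    x = np.asarray(x, dtype=np.float64)
    return np.where(x >= 0, scale * x, scale * alpha * (np.exp(x) - 1.0))

def rnn_forward(sequence, W_in, W_rec, b_rnn, W_out, b_out):
    """sequence: (n_steps, n_features). Returns the dense-layer output after the
    last timestep, mirroring SimpleRNN(return_sequences=False) + Dense(linear)."""
    h = np.zeros(W_rec.shape[0])              # zero initial state, as in Keras
    for x_t in sequence:
        h = selu_np(x_t @ W_in + h @ W_rec + b_rnn)   # recurrent update
    return h @ W_out + b_out                  # linear output layer

# Smoke test with deterministic random weights of the shapes listed above
rng = np.random.default_rng(0)
seq = rng.standard_normal((5, 6))
out = rnn_forward(seq,
                  rng.standard_normal((6, 48)) * 0.1,
                  rng.standard_normal((48, 48)) * 0.1,
                  np.zeros(48),
                  rng.standard_normal((48, 4)) * 0.1,
                  np.zeros(4))
print(out.shape)  # (4,)
```

With the real weights substituted in, `rnn_forward(test_X[i], *model.get_weights())` should match `model.predict` up to float32 precision, which gives a reference point for locating where the element-by-element version diverges.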
Now if we compare the test data, inference with the Keras model, and the calculation described above:
The Keras model does not make a perfect prediction, but it is clearly much more stable than my hand calculation.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow