Learn checksum rule with Keras

I have a dataset where each record consists of 9 fixed digits followed by a 1-digit checksum computed by an unknown rule. I am trying to build a model with Keras to learn the rule, but training fails.

To debug, I generate test data with a known rule: the checksum is the value mod 10. Training still fails. I one-hot-encode the 9 digits to form a dataset of shape (N, 9, 10), feed it through dense layers, and use a cross-entropy loss.
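For concreteness, here is how a single sample is encoded under that test rule. This is a minimal NumPy sketch of the same encoding; `np.eye` stands in for Keras's `to_categorical`, and the example value is arbitrary:

```python
import numpy as np

# Hypothetical sample: a 9-digit value and its mod-10 checksum.
value = 123456789
chk = value % 10                                   # checksum rule for the test data
digits = [int(s) for s in str(value).rjust(9, '0')]  # left-pad to 9 digits

# One-hot encode each digit into a length-10 vector (same idea as to_categorical).
one_hot = np.eye(10)[digits]
print(one_hot.shape)  # (9, 10); stacking N samples gives (N, 9, 10)
print(chk)            # 9
```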

Here is my code:

import numpy as np
from keras import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.utils import to_categorical

# generate test data
test_input = []
test_output = []
for _ in range(10000):
    value = int(np.round(np.random.rand()*1E9,0))
    chk = value % 10
    no = str(value).rjust(9, '0')
    test_input.append(no)
    test_output.append(chk)

test_input = [[int(s) for s in c_no] for c_no in test_input]
test_input = to_categorical(test_input)
test_output = to_categorical(test_output)

# build model
model = Sequential()
model.add(Dense(50, input_shape=(9, 10), activation='relu'))
model.add(Dense(30, activation='relu'))
model.add(Dense(20, activation='relu'))
model.add(Flatten())
model.add(Dense(10))
model.summary()

# train model
epoch_num = 20
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
history = model.fit(test_input, test_output, epochs=epoch_num, verbose=2, batch_size=50, validation_split=0.2)

However, even with an easy checksum rule like this, the model cannot be trained successfully: the loss does not decrease and the accuracy stays around 0.1, i.e. chance level for 10 classes.

I'd like to know what mistake I made, thanks!



Solution 1:[1]

You just need to add a softmax activation function to the last Dense layer:

model.add(Dense(10, activation='softmax'))

With just this change, I got an accuracy of 1.0 by the 14th epoch. The output follows:

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense_8 (Dense)             (None, 9, 50)             550       
                                                                 
 dense_9 (Dense)             (None, 9, 30)             1530      
                                                                 
 dense_10 (Dense)            (None, 9, 20)             620       
                                                                 
 flatten_2 (Flatten)         (None, 180)               0         
                                                                 
 dense_11 (Dense)            (None, 10)                1810      
                                                                 
=================================================================
Total params: 4,510
Trainable params: 4,510
Non-trainable params: 0
_________________________________________________________________
Epoch 1/20
160/160 - 1s - loss: 2.2983 - accuracy: 0.1203 - val_loss: 2.2850 - val_accuracy: 0.1520 - 865ms/epoch - 5ms/step
Epoch 2/20
160/160 - 0s - loss: 2.2735 - accuracy: 0.1605 - val_loss: 2.2584 - val_accuracy: 0.1925 - 337ms/epoch - 2ms/step
Epoch 3/20
160/160 - 0s - loss: 2.2449 - accuracy: 0.2153 - val_loss: 2.2252 - val_accuracy: 0.2585 - 343ms/epoch - 2ms/step
Epoch 4/20
160/160 - 0s - loss: 2.2050 - accuracy: 0.2892 - val_loss: 2.1759 - val_accuracy: 0.3275 - 339ms/epoch - 2ms/step
Epoch 5/20
160/160 - 0s - loss: 2.1401 - accuracy: 0.3769 - val_loss: 2.0904 - val_accuracy: 0.4140 - 342ms/epoch - 2ms/step
Epoch 6/20
160/160 - 0s - loss: 2.0201 - accuracy: 0.4697 - val_loss: 1.9275 - val_accuracy: 0.5145 - 330ms/epoch - 2ms/step
Epoch 7/20
160/160 - 0s - loss: 1.7985 - accuracy: 0.5654 - val_loss: 1.6385 - val_accuracy: 0.6090 - 339ms/epoch - 2ms/step
Epoch 8/20
160/160 - 0s - loss: 1.4392 - accuracy: 0.6821 - val_loss: 1.2205 - val_accuracy: 0.7570 - 321ms/epoch - 2ms/step
Epoch 9/20
160/160 - 0s - loss: 1.0012 - accuracy: 0.8110 - val_loss: 0.7857 - val_accuracy: 0.8690 - 336ms/epoch - 2ms/step
Epoch 10/20
160/160 - 0s - loss: 0.6177 - accuracy: 0.9005 - val_loss: 0.4681 - val_accuracy: 0.9380 - 324ms/epoch - 2ms/step
Epoch 11/20
160/160 - 0s - loss: 0.3714 - accuracy: 0.9532 - val_loss: 0.2821 - val_accuracy: 0.9745 - 331ms/epoch - 2ms/step
Epoch 12/20
160/160 - 0s - loss: 0.2273 - accuracy: 0.9834 - val_loss: 0.1801 - val_accuracy: 0.9910 - 339ms/epoch - 2ms/step
Epoch 13/20
160/160 - 0s - loss: 0.1420 - accuracy: 0.9980 - val_loss: 0.1104 - val_accuracy: 0.9995 - 356ms/epoch - 2ms/step
Epoch 14/20
160/160 - 0s - loss: 0.0914 - accuracy: 1.0000 - val_loss: 0.0733 - val_accuracy: 1.0000 - 334ms/epoch - 2ms/step
Epoch 15/20
160/160 - 0s - loss: 0.0620 - accuracy: 1.0000 - val_loss: 0.0520 - val_accuracy: 1.0000 - 327ms/epoch - 2ms/step
Epoch 16/20
160/160 - 0s - loss: 0.0447 - accuracy: 1.0000 - val_loss: 0.0390 - val_accuracy: 1.0000 - 331ms/epoch - 2ms/step
Epoch 17/20
160/160 - 0s - loss: 0.0340 - accuracy: 1.0000 - val_loss: 0.0302 - val_accuracy: 1.0000 - 324ms/epoch - 2ms/step
Epoch 18/20
160/160 - 0s - loss: 0.0269 - accuracy: 1.0000 - val_loss: 0.0245 - val_accuracy: 1.0000 - 337ms/epoch - 2ms/step
Epoch 19/20
160/160 - 0s - loss: 0.0220 - accuracy: 1.0000 - val_loss: 0.0204 - val_accuracy: 1.0000 - 319ms/epoch - 2ms/step
Epoch 20/20
160/160 - 0s - loss: 0.0185 - accuracy: 1.0000 - val_loss: 0.0173 - val_accuracy: 1.0000 - 330ms/epoch - 2ms/step
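For context on why this one change fixes training: `categorical_crossentropy` expects a probability distribution over the 10 classes, and softmax converts the raw Dense outputs (logits) into one. A minimal NumPy sketch of what the added activation computes, with an arbitrary example logit vector:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax: shift by the max before exponentiating.
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # raw outputs of a Dense layer with no activation
probs = softmax(logits)
print(probs.sum())     # sums to 1: a valid probability distribution
print(probs.argmax())  # the largest logit stays the predicted class
```

Alternatively, Keras also supports keeping the final layer linear and passing `keras.losses.CategoricalCrossentropy(from_logits=True)` to `model.compile`, which applies the same transformation inside the loss.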

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ASLAN