Why is SparseCategoricalCrossentropy not working with this machine learning model?
I have a .csv database file which looks like this:
Day Hour N1 N2 N3 N4 N5 ... N14 N15 N16 N17 N18 N19 N20
0 1996-03-18 15:00 4 9 10 16 21 ... 48 62 66 68 73 76 78
1 1996-03-19 15:00 6 12 15 19 28 ... 63 64 67 69 71 75 77
2 1996-03-21 15:00 2 4 6 7 15 ... 52 54 69 72 73 75 77
3 1996-03-22 15:00 3 8 15 17 19 ... 49 60 61 64 67 68 75
4 1996-03-25 15:00 2 10 11 14 18 ... 55 60 61 66 67 75 79
... ... ... .. .. .. .. .. ... ... ... ... ... ... ... ...
13596 2022-01-04 22:50 17 18 22 26 27 ... 64 65 71 72 73 76 80
13597 2022-01-05 15:00 1 5 8 14 15 ... 47 54 59 67 70 72 76
13598 2022-01-05 22:50 6 7 14 15 16 ... 54 55 59 61 70 71 80
13599 2022-01-06 15:00 9 10 11 17 28 ... 51 55 65 67 72 76 78
13600 2022-01-06 22:50 1 2 6 9 11 ... 51 52 54 67 68 73 75
I have found this article: https://machinelearningmastery.com/how-to-develop-convolutional-neural-network-models-for-time-series-forecasting/
I am trying to develop a modified version of that 1D CNN model: using a softmax on the last layer, SparseCategoricalCrossentropy() as the loss function, and adding some new functions of my own to make it different.
This is my code so far and the model I am trying to build and use:
# multivariate output 1d cnn example
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' # or any {'0', '1', '2'}
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
from numpy import array
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import *
from tensorflow.keras.losses import *
from tensorflow.keras.layers import *
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.callbacks import ModelCheckpoint
# Define the Required Callback Function
class printlearningrate(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        optimizer = self.model.optimizer
        lr = K.eval(optimizer.lr)
        Epoch_count = epoch + 1
        print('\n', "Epoch:", Epoch_count, ', Learning Rate: {:.7f}'.format(lr))

printlr = printlearningrate()
# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences) - 1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
df = pd.read_csv('DrawsDB.csv')
print(df)
# df['Time'] = df[['Day', 'Hour']].agg(' '.join, axis=1)
df.insert(0, 'Time', df[['Day', 'Hour']].agg(' '.join, axis=1))
df.drop(columns=['Day', 'Hour'], inplace=True)
df.set_index('Time', inplace=True)
print(df)
numpy_array = df.to_numpy()
print(type(numpy_array))
print(numpy_array)
# choose a number of time steps
n_steps = 10
# convert into input/output
X, y = split_sequences(numpy_array, n_steps)
print(X.shape, y.shape)
# the dataset knows the number of features, e.g. 2
n_features = X.shape[2]
# Reduce learning rate when nothing happens to lower more the loss:
reduce_lr = ReduceLROnPlateau(monitor='loss', factor=0.9888888888888889,
                              patience=10, min_lr=0.0000001, verbose=1)
epochs = 10
# saving best model every epoch with ModelCheckpoint:
checkpoint_filepath = 'C:\\Path\\To\\Saved\\CheckPoint\\model\\'
model_checkpoint_callback = ModelCheckpoint(
    filepath=checkpoint_filepath,
    monitor='loss',
    save_best_only=True,
    save_weights_only=True,
    verbose=1)
# define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation=LeakyReLU(), input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation=LeakyReLU()))
model.add(Dense(n_features))
model.compile(optimizer=Nadam(lr=0.09), loss=SparseCategoricalCrossentropy(),
              metrics=['accuracy', mean_squared_error, mean_absolute_error, mean_absolute_percentage_error])
# fit model
model.fit(X, y, epochs=10, verbose=2, callbacks=[printlr, reduce_lr, model_checkpoint_callback])
The split_sequences function, as its name says, splits the database: it takes N consecutive rows as the input sample and uses the entire row N+1 as the output to predict.
However, I think there is a problem, because I get this error every time I try to run the Python script:
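To make the windowing concrete, here is a minimal sketch of what split_sequences produces on a toy array (the toy data and shapes are my own illustration, not the real draw data):

```python
import numpy as np

def split_sequences(sequences, n_steps):
    # slide a window of n_steps rows over the data;
    # each window's target is the entire next row
    X, y = [], []
    for i in range(len(sequences)):
        end_ix = i + n_steps
        if end_ix > len(sequences) - 1:
            break
        X.append(sequences[i:end_ix, :])
        y.append(sequences[end_ix, :])
    return np.array(X), np.array(y)

# toy data: 6 rows ("draws") x 3 columns ("numbers")
data = np.arange(18).reshape(6, 3)
X, y = split_sequences(data, n_steps=2)
print(X.shape)  # (4, 2, 3): 4 samples, 2 time steps, 3 features
print(y.shape)  # (4, 3): each target is a whole row, not a single class
```

Note that each target y is a full row of values, which matters for the error below.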
tensorflow.python.framework.errors_impl.InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [32,20] and labels shape [640]
[[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits
(defined at C:\Users\UserName\AppData\Local\Programs\Python\Python37\lib\site-packages\keras\backend.py:5114)
]] [Op:__inference_train_function_1228]
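The shapes in the message line up with what SparseCategoricalCrossentropy expects versus what it receives: one integer class label per sample against logits of shape (batch, n_classes). A pure-numpy sketch of the same computation (the helper sparse_ce is mine, written only to show the shape contract):

```python
import numpy as np

def sparse_ce(labels, logits):
    # labels: (batch,) integer class ids; logits: (batch, n_classes)
    assert labels.shape == (logits.shape[0],), "one integer label per sample"
    # softmax, then negative log-likelihood of the true class
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = e / e.sum(axis=1, keepdims=True)
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

logits = np.zeros((32, 20))                    # like the model's output
good = np.random.randint(0, 20, size=(32,))    # (batch,) -> fine
print(sparse_ce(good, logits))                 # -log(1/20), about 2.9957

bad = np.random.randint(0, 20, size=(32, 20))  # a whole row per sample
# bad.ravel() has 32 * 20 = 640 entries; TF flattens it the same way,
# hence "logits shape [32,20] and labels shape [640]"
print(bad.ravel().shape)
```

With a multivariate target of 20 numbers per sample, the labels simply do not match a per-sample classification loss.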
Any idea on how to fix this problem, please?
Thank you in advance!
Solution 1:
Assuming the labels are integers, they have the wrong shape for SparseCategoricalCrossentropy, which expects a single integer class label per sample. Check the docs.
Try converting your y to one-hot encoded labels:
y = tf.keras.utils.to_categorical(y, num_classes=20)
and change your loss function to CategoricalCrossentropy:
model.compile(optimizer=Nadam(lr=0.09), loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=['accuracy', mean_squared_error, mean_absolute_error, mean_absolute_percentage_error])
and it should work.
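The fix changes the labels from integer ids to one-hot vectors. A minimal numpy sketch of what to_categorical does to the shapes (to_one_hot is my stand-in name; note that the label values themselves must be smaller than num_classes, which matters here since the draw numbers go up to 80):

```python
import numpy as np

def to_one_hot(y, num_classes):
    # numpy stand-in for tf.keras.utils.to_categorical:
    # each integer label becomes a one-hot vector on a new last axis
    return np.eye(num_classes)[y]

y = np.array([3, 0, 2])               # (3,) integer labels
onehot = to_one_hot(y, num_classes=4)
print(onehot.shape)                   # (3, 4)
print(onehot[0])                      # [0. 0. 0. 1.]
```

CategoricalCrossentropy then compares these one-hot rows directly against the softmax output, so the first dimensions of labels and logits match again.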
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
