Whenever I try to train my ML model, the training loss increases exponentially

I am quite new to TensorFlow, so I'm sure I'm just making a simple mistake, but for the life of me I can't figure out what it is. I have a pre-generated dataset of images, and my goal is to estimate a scalar parameter for each image.

I managed to build a basic model with TF2 and Keras and had some success with it, but I want more fine-grained control over the training process, so I decided to try the lower-level TF2 API. I followed the tutorial on the TensorFlow website and was able to train the sample model on the sample dataset it provides. However, when I altered the code to train a model on my own dataset, I couldn't get it to work: every time I perform an optimization step, the training loss blows up by several orders of magnitude, and I have no idea what's causing it. Any help would be appreciated.

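For comparison, the basic Keras approach I mentioned above was essentially just compile-and-fit. I don't have that exact script any more, so the snippet below is reconstructed from memory (the choice of optimizer and number of epochs are guesses), but it looked roughly like this:

# Rough reconstruction of my earlier, plain-Keras attempt (from memory, so the
# optimizer and epoch count may not be exact). It uses the same `model`, `data`
# and `labels` that are defined in the code further down.
model.compile(optimizer="adam", loss="mse")
model.fit(data, labels, batch_size=32, epochs=5)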
Here is my current code, with the custom training step, for reference:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization,
                                     MaxPooling2D, Dropout, Flatten, Dense)
from tensorflow.keras.models import Model

# Loading Data
data = tf.convert_to_tensor(np.load("data.npy")[:4096])
labels = tf.convert_to_tensor(np.load("labels.npy")[:4096])  # all labels are floats with values between 1000 and 2000

# data.shape = TensorShape([4096, 512, 512, 1])
# labels.shape = TensorShape([4096, 1])
 
# Creating Model
inLayer = Input(shape=(512, 512, 1), name="input")
 
outLayer = Conv2D(16, (15, 15), activation="relu")(inLayer)
outLayer = BatchNormalization()(outLayer)
outLayer = MaxPooling2D((3, 3))(outLayer)
 
outLayer = Conv2D(16, (9, 9), activation="relu")(outLayer)
outLayer = BatchNormalization()(outLayer)
outLayer = MaxPooling2D()(outLayer)
 
outLayer = Conv2D(16, (5, 5), activation="relu")(outLayer)
outLayer = BatchNormalization()(outLayer)
outLayer = MaxPooling2D()(outLayer)
 
outLayer = Dropout(0.2)(outLayer)
outLayer = Flatten()(outLayer)
 
outLayer = Dense(1, name="output", activation="linear")(outLayer)
 
model = Model(inLayer, outLayer)
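# (For reference, if I'm reading the output shapes right, the feature maps
#  shrink 498 -> 166 -> 158 -> 79 -> 75 -> 37 through the conv/pool stack,
#  so Flatten feeds 37*37*16 = 21904 features into the final Dense(1).)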
 
# Creating Loss function
loss_object = tf.keras.losses.MeanSquaredError()
 
def loss(model, x, y, training):
    y_ = model(x, training=training)
    return loss_object(y_true=y, y_pred=y_)
 
# Creating Gradient function and optimizer
def grad(model, inputs, targets):
    with tf.GradientTape() as tape:
        loss_value = loss(model, inputs, targets, training=True)
    return loss_value, tape.gradient(loss_value, model.trainable_variables)
 
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
 
# Calculating one step of optimization
loss_value, grads = grad(model, data[:32], labels[:32])
 
print("Step: {}, Initial Loss: {}".format(optimizer.iterations.numpy(), loss_value.numpy()))
optimizer.apply_gradients(zip(grads, model.trainable_variables))
print("Step: {}, Loss: {}".format(optimizer.iterations.numpy(), loss(model, data[:32], labels[:32], training=True).numpy()))
 
 
##############################################
Output:
Step: 0, Initial Loss: 2211676.5
Step: 1, Loss: 158504091648.0
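
One thing I did notice (not sure whether it's relevant): the initial loss by itself looks plausible to me. If a freshly initialized network outputs values near zero, then with labels in the 1000-2000 range the mean squared error should be roughly the mean of the squared labels, i.e. on the order of 1500^2 ≈ 2.25e6, which is about what I see above. A quick check of that back-of-the-envelope reasoning:

# Back-of-the-envelope check of the *initial* loss only: assuming the untrained
# model outputs values near 0, the MSE is approximately the mean squared label.
print(np.mean(np.square(labels[:32].numpy())))  # same order of magnitude as 2211676.5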

