Whenever I try to train my ML model, the training loss increases exponentially
I am quite new to TensorFlow, so I'm sure I'm making a simple mistake, but for the life of me I can't figure out what it is. I have a pre-generated dataset of images, and my goal is to estimate a scalar parameter for each image.
I managed to use TF2 and Keras to create a basic ML model with some success, but I want finer-grained control over the training process, so I decided to write a custom training loop with the lower-level TF2 API. I followed the tutorial on the TensorFlow website and managed to train the sample model from the tutorial on the sample dataset it provides. However, when I altered the code to train a model on my own dataset, I couldn't get it to work: every time I perform an optimization step, the model's training loss increases exponentially, and I have no idea what's causing it. Any help would be appreciated.
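For context, the training-step pattern I took from the tutorial looks roughly like the sketch below (illustrative names only: train_step, loss_fn, x_batch, and y_batch are placeholders, not my real code; that follows in the next section):

import tensorflow as tf

def train_step(model, optimizer, loss_fn, x_batch, y_batch):
    # Record the forward pass so the tape can compute gradients.
    with tf.GradientTape() as tape:
        predictions = model(x_batch, training=True)
        loss_value = loss_fn(y_batch, predictions)
    # Differentiate the loss w.r.t. the trainable weights and apply the update.
    grads = tape.gradient(loss_value, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss_value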
Here is my code as reference:
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, MaxPooling2D, Dropout, Flatten, Dense
from tensorflow.keras.models import Model

# Loading Data
data = tf.convert_to_tensor(np.load("data.npy")[:4096])
labels = tf.convert_to_tensor(np.load("labels.npy")[:4096])  # all labels are floats with values between 1000 and 2000
# data.shape = TensorShape([4096, 512, 512, 1])
# labels.shape = TensorShape([4096, 1])
# Creating Model
inLayer = Input(shape=(512, 512, 1), name="input")
outLayer = Conv2D(16, (15, 15), activation="relu")(inLayer)
outLayer = BatchNormalization()(outLayer)
outLayer = MaxPooling2D((3, 3))(outLayer)
outLayer = Conv2D(16, (9, 9), activation="relu")(outLayer)
outLayer = BatchNormalization()(outLayer)
outLayer = MaxPooling2D()(outLayer)
outLayer = Conv2D(16, (5, 5), activation="relu")(outLayer)
outLayer = BatchNormalization()(outLayer)
outLayer = MaxPooling2D()(outLayer)
outLayer = Dropout(0.2)(outLayer)
outLayer = Flatten()(outLayer)
outLayer = Dense(1, name="output", activation="linear")(outLayer)
model = Model(inLayer, outLayer)
# Creating Loss function
loss_object = tf.keras.losses.MeanSquaredError()
def loss(model, x, y, training):
    y_ = model(x, training=training)
    return loss_object(y_true=y, y_pred=y_)
# Creating Gradient function and optimizer
def grad(model, inputs, targets):
    with tf.GradientTape() as tape:
        loss_value = loss(model, inputs, targets, training=True)
    return loss_value, tape.gradient(loss_value, model.trainable_variables)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
# Calculating one step of optimization
loss_value, grads = grad(model, data[:32], labels[:32])
print("Step: {}, Initial Loss: {}".format(optimizer.iterations.numpy(), loss_value.numpy()))
optimizer.apply_gradients(zip(grads, model.trainable_variables))
print("Step: {}, Loss: {}".format(optimizer.iterations.numpy(), loss(model, data[:32], labels[:32], training=True).numpy()))
##############################################
Output:
Step: 0, Initial Loss: 2211676.5
Step: 1, Loss: 158504091648.0
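In case it's useful, this is the minimal loop I run (reusing the data, labels, model, grad, and optimizer objects defined above) to watch the loss grow from step to step; the step count of 10 is arbitrary:

for step in range(10):
    # Same single-batch optimization step as above, repeated to show the trend.
    loss_value, grads = grad(model, data[:32], labels[:32])
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    print("Step: {}, Loss: {}".format(optimizer.iterations.numpy(), loss_value.numpy()))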