'Keras - Adding loss to intermediate layer while ignoring the last layer

I've created the following Keras custom model:

import tensorflow as tf
from tensorflow.keras.layers import Layer

class MyModel(tf.keras.Model):
    def __init__(self, num_classes):
        super(MyModel, self).__init__()
        self.dense_layer = tf.keras.layers.Dense(num_classes,activation='softmax')
        self.lambda_layer = tf.keras.layers.Lambda(lambda x: tf.math.argmax(x, axis=-1))

    
    def call(self, inputs):
        x = self.dense_layer(inputs)
        x = self.lambda_layer(x)
        return x

    # A convenient way to get model summary 
    # and plot in subclassed api
    def build_graph(self, raw_shape):
        x = tf.keras.layers.Input(shape=(raw_shape))
        return tf.keras.Model(inputs=[x], 
                              outputs=self.call(x))

The task is multi-class classification. Model consists of a dense layer with softmax activation and a lambda layer as a post-processing unit that converts the dense output vector to a single value (predicted class).

The train targets are a one-hot encoded matrix like so:

[
   [0,0,0,0,1]
   [0,0,1,0,0]
   [0,0,0,1,0]
   [0,0,0,0,1]
]

It would be nice if I could define a categorical_crossentropy loss over the dense layer and ignore the lambda layer while still maintaining the functionality and outputting a single value when I call model.predict(x).

Please note
My workspace environment doesn't allow me to use a custom training loop as suggested by @alonetogether excellent answer.

Solution 1:^[1]

You can try using a custom training loop, which is pretty straightforward IMO:

import tensorflow as tf
from tensorflow.keras.layers import Layer

class MyModel(tf.keras.Model):
    def __init__(self, num_classes):
        super(MyModel, self).__init__()
        self.dense_layer = tf.keras.layers.Dense(num_classes,activation='softmax')
        self.lambda_layer = tf.keras.layers.Lambda(lambda x: tf.math.argmax(x, axis=-1))

    
    def call(self, inputs):
        x = self.dense_layer(inputs)
        x = self.lambda_layer(x)
        return x

    # A convenient way to get model summary 
    # and plot in subclassed api
    def build_graph(self, raw_shape):
        x = tf.keras.layers.Input(shape=(raw_shape))
        return tf.keras.Model(inputs=[x], 
                              outputs=self.call(x))
        
n_classes = 5
model = MyModel(n_classes)
labels = tf.keras.utils.to_categorical(tf.random.uniform((50, 1), maxval=5, dtype=tf.int32))
train_dataset = tf.data.Dataset.from_tensor_slices((tf.random.normal((50, 1)), labels)).batch(2)
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.CategoricalCrossentropy()
epochs = 2
for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))
    for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):
        with tf.GradientTape() as tape:
            logits = model.layers[0](x_batch_train)
            loss_value = loss_fn(y_batch_train, logits)

        grads = tape.gradient(loss_value, model.trainable_weights)
        optimizer.apply_gradients(zip(grads, model.trainable_weights))

And prediction:

print(model.predict(tf.random.normal((1, 1))))

[3]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1

'Keras - Adding loss to intermediate layer while ignoring the last layer

Solution 1:[1]

Sources

Related Questions

Solution 1:^[1]