Picking the right metric for a model ending with a TimeDistributed layer
I am trying to train a model for an NER task with the model below. I am a bit confused about which metric to use here: I expected a classic CategoricalCrossentropy loss with an accuracy metric, but:
- the model reports an accuracy of zero during training and testing
- yet when I compute the accuracy manually, it is definitely not zero
I am not familiar with the TimeDistributed layer and suspect the issue comes from there, although the shape of the TimeDistributed layer's output and the shape of my targets are the same.
What am I missing?
See below my code:
def init_model():
    input_ids = tf.keras.layers.Input(shape=(SEQ_LEN,), dtype='int32')
    attention_mask = tf.keras.layers.Input(shape=(SEQ_LEN,), dtype='int32')
    x = backbone({'input_ids': input_ids,
                  'attention_mask': attention_mask})[0]
    backbone.trainable = False
    x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(units=512,
                                                           activation='tanh',
                                                           # recurrent_dropout=.2,
                                                           dropout=.2,
                                                           return_sequences=True))(x)
    # x = tf.keras.layers.LayerNormalization()(x)
    x_res = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(units=512,
                                                               activation='tanh',
                                                               # recurrent_dropout=.2,
                                                               dropout=.2,
                                                               return_sequences=True))(x)
    x = tf.keras.layers.add([x, x_res])
    output = tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(16, activation='softmax'))(x)
    model = tf.keras.models.Model(inputs={'input_ids': input_ids,
                                          'attention_mask': attention_mask},
                                  outputs=output)
    return model
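To back up the claim that the shapes match, here is a minimal stand-alone sketch (with a hypothetical Embedding layer standing in for the transformer backbone, and made-up sizes) showing that TimeDistributed(Dense) keeps the time axis, so the output is one 16-way distribution per token:

```python
import tensorflow as tf

SEQ_LEN = 8  # assumed value for illustration
inp = tf.keras.layers.Input(shape=(SEQ_LEN,), dtype='int32')
x = tf.keras.layers.Embedding(100, 32)(inp)  # stand-in for the backbone
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(16, return_sequences=True))(x)
out = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(16, activation='softmax'))(x)
model = tf.keras.models.Model(inp, out)
print(model.output_shape)  # (None, 8, 16): batch, time steps, tag probabilities
```

So the targets would need the same (batch, SEQ_LEN, 16) one-hot shape, which is what I have.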
and the compile step:
loss = tf.keras.losses.CategoricalCrossentropy(name='categorical_crossentropy')
metric = tf.keras.metrics.Accuracy(name='accuracy')
opt = tf.keras.optimizers.Adam()
model.compile(optimizer=opt, loss=loss, metrics=[metric])
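For context, here is a minimal sketch of the discrepancy I am seeing, on hypothetical one-hot targets and softmax-style predictions. tf.keras.metrics.Accuracy compares targets and predictions element-wise for exact equality, while tf.keras.metrics.CategoricalAccuracy compares their argmax per time step (which is roughly what my manual check does):

```python
import numpy as np
import tensorflow as tf

y_true = np.array([[0., 1., 0.],
                   [1., 0., 0.]])    # one-hot tag labels
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.7, 0.2, 0.1]])  # softmax probabilities

acc = tf.keras.metrics.Accuracy()
acc.update_state(y_true, y_pred)
print(acc.result().numpy())      # 0.0 -- no probability exactly equals 0 or 1

cat_acc = tf.keras.metrics.CategoricalAccuracy()
cat_acc.update_state(y_true, y_pred)
print(cat_acc.result().numpy())  # 1.0 -- argmax matches on both rows
```

This mirrors what I observe: zero from the compiled metric, non-zero when checked by hand.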
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow