Why is my accuracy zero after setting the backbone trainable?
I trained my model with a frozen backbone, like this:
model.get_layer('efficientnet-b0').trainable = False
Now I unfreeze the backbone, compile the model, start training, and get an accuracy close to zero. Why? How do I properly fine-tune the model?
Model:
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
inp1 (InputLayer) [(None, 256, 256, 3) 0
__________________________________________________________________________________________________
efficientnet-b0 (Functional) (None, None, None, 1 4049564 inp1[0][0]
__________________________________________________________________________________________________
global_average_pooling2d (Globa (None, 1280) 0 efficientnet-b0[0][0]
__________________________________________________________________________________________________
dropout (Dropout) (None, 1280) 0 global_average_pooling2d[0][0]
__________________________________________________________________________________________________
dense (Dense) (None, 512) 655872 dropout[0][0]
__________________________________________________________________________________________________
inp2 (InputLayer) [(None,)] 0
__________________________________________________________________________________________________
head/arcface (ArcMarginProduct) (None, 15587) 7980544 dense[0][0]
inp2[0][0]
__________________________________________________________________________________________________
softmax (Softmax) (None, 15587) 0 head/arcface[0][0]
==================================================================================================
Total params: 12,685,980
Trainable params: 8,636,416
Non-trainable params: 4,049,564
Solution 1:[1]
Of course the accuracy is zero. A 15k-class problem needs a huge dataset and a fairly complex model to learn well. Furthermore, hyperparameters such as the number of epochs, batch size, and learning rate matter a great deal. For instance, if you set batch size = 1 in a binary classification task, the accuracy will stay around 50% (on a balanced training dataset).
You should say more about your training dataset. It ought to be on the order of 1.5 million images; if it is not that large, you must rely on pre-trained models.
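To illustrate the pre-trained-model approach, here is a minimal Keras sketch of a first training phase with a frozen backbone. The 256x256 input size, the 512-unit dense layer, and the 15587-class output are taken from the model summary above; the ImageNet weights, the plain softmax head (standing in for the ArcMarginProduct head), the optimizer, and the loss are assumptions for illustration only:

```python
import tensorflow as tf
from tensorflow import keras

NUM_CLASSES = 15587  # taken from the model summary above

# Pre-trained backbone, frozen for the first training phase
# (ImageNet weights are an assumption here).
backbone = keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", input_shape=(256, 256, 3))
backbone.trainable = False

inputs = keras.Input(shape=(256, 256, 3))
x = backbone(inputs, training=False)          # keep BatchNorm in inference mode
x = keras.layers.GlobalAveragePooling2D()(x)
x = keras.layers.Dropout(0.2)(x)
x = keras.layers.Dense(512)(x)
# Plain softmax head used here instead of the ArcMarginProduct head
# from the question, to keep the sketch self-contained.
outputs = keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = keras.Model(inputs, outputs)
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```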
Solution 2:[2]
It's hard to train a model that can classify more than 1k classes (see ResNet, AlexNet, etc.) with an approach like this.
Your model's backbone already has the weights needed to represent your data. When you unfreeze efficientnet-b0 you allow gradients to propagate through the whole model, so every weight gets updated, and there is no guarantee that your optimizer will handle this well.
To train your model you need to look at the learning rate, the loss function, and the optimizer, and maybe change your architecture; a common fine-tuning recipe is sketched below.
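A typical recipe is to unfreeze the backbone, optionally keep the BatchNormalization layers frozen, and recompile with a learning rate one or two orders of magnitude lower than the one used to train the head. In the sketch below, `model` refers to the already-built model from the question, the learning rate and loss are assumed values, and `train_ds` / `val_ds` are placeholders for your existing input pipelines:

```python
from tensorflow import keras

# Phase 2: fine-tuning. Unfreeze the backbone, but recompile with a much
# smaller learning rate so the pre-trained weights are not wiped out by
# large gradient updates.
backbone = model.get_layer('efficientnet-b0')
backbone.trainable = True

# Optionally keep BatchNormalization layers frozen: letting their running
# statistics update on a comparatively small dataset often hurts accuracy.
for layer in backbone.layers:
    if isinstance(layer, keras.layers.BatchNormalization):
        layer.trainable = False

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),  # assumed value, much lower than the head-training rate
    loss="sparse_categorical_crossentropy",               # assumed; reuse whatever loss the frozen-backbone phase used
    metrics=["accuracy"],
)

# train_ds / val_ds are placeholders for your own tf.data pipelines.
model.fit(train_ds, validation_data=val_ds, epochs=10)
```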
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Alihdr |
| Solution 2 | dkagramanyan |