Advice needed: Mid-layer output as regularization (preserving feature extractor)
I would appreciate some advice on the following. I'll describe the problem in general first, then go into my specific use case (CV-related).
General explanation
I want to regularize the output of a mid-layer of a neural net as part of the loss, so that the first layers of my model don't stray too far from the (pretrained) weights they were initialized with.
Given an L-layer NN for a classification task and the pretrained model Z, let's say I wish for the first L/2 layers to stay roughly the same. Then perhaps I'd want my loss to be something along the lines of:

loss = CE(y_hat, y) + lambda * diff(f_{1..L/2}(x), Z_{1..L/2}(x))

where f_{1..L/2}(x) is the output of the first L/2 layers of my model and Z_{1..L/2}(x) is the output of the corresponding layers of Z.
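To make that concrete, here is a minimal sketch of what I have in mind, assuming a PyTorch setup where the model is split into `model.features` (the first ~L/2 layers) and `model.head`; those attribute names and `lam` are just placeholders for my actual code:

```python
import copy
import torch
import torch.nn.functional as F

# Frozen snapshot of the pretrained feature layers, taken before fine-tuning.
# `model.features` / `model.head` are placeholder names for my architecture.
reference = copy.deepcopy(model.features).eval()
for p in reference.parameters():
    p.requires_grad_(False)

lam = 0.1  # regularization strength (to be tuned)

def loss_fn(x, y):
    feats = model.features(x)          # mid-layer output of the current model
    with torch.no_grad():
        ref_feats = reference(x)       # same input through the pretrained copy
    logits = model.head(feats)
    task_loss = F.cross_entropy(logits, y)
    # Penalize divergence of the mid-layer output from the pretrained features,
    # e.g. 1 - cosine similarity, averaged over the batch.
    reg = (1 - F.cosine_similarity(feats.flatten(1), ref_feats.flatten(1), dim=1)).mean()
    return task_loss + lam * reg
```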
Ideas I have so far:
- diff = cosine similarity
- freeze layers up to L/2 (see the parameter-space sketch below)

Specific use case

My net is comprised of a feature extractor, initialized with that of a zero-shot model, plus a head. I want to fine-tune the model on my dataset without letting the feature extractor go wild.
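Here is a sketch of the parameter-space alternatives from the list above (freezing vs. pulling the weights back toward their pretrained values); again the `model.features` / `model.head` split and `mu` are placeholder assumptions:

```python
import copy
import torch

# Option A: hard constraint - freeze the feature extractor entirely and only
# train the head.
for p in model.features.parameters():
    p.requires_grad_(False)
optimizer = torch.optim.AdamW(model.head.parameters(), lr=1e-3)

# Option B: soft constraint - penalize drift of the feature-extractor weights
# from a detached snapshot taken at initialization.
init_weights = {name: p.detach().clone()
                for name, p in model.features.named_parameters()}
mu = 1e-3  # strength of the pull toward the pretrained weights (to be tuned)

def weight_drift_penalty():
    return sum(((p - init_weights[name]) ** 2).sum()
               for name, p in model.features.named_parameters())

# The total loss for option B would then be: task_loss + mu * weight_drift_penalty()
```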
Help appreciated regarding
- Ideas of how to achieve this
- Papers that address this

Thank you!