How to check the understanding of a trained model?
I'm currently training two models (BERT & MPNet) for a semantic textual similarity (STS) task using the SentenceTransformers library.
Now I want to check whether the base models and/or the trained models understand specific words/names that occur in the training dataset. I tried masked-token prediction and computing the similarity between those words/names and related category labels, but the results were hardly distinguishable.
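For concreteness, here is roughly what those two checks looked like (the word "Foobarix" and the category labels are placeholders for the actual names in my dataset, and the model names are just examples):

```python
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

# Placeholder word and categories; the real ones come from my training dataset.
word = "Foobarix"
categories = ["a medication", "a company", "a person", "a city"]

# Check 1: masked-token prediction with the BERT base model.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill_mask(f"{word} is a [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))

# Check 2: cosine similarity between the word and category labels,
# using a SentenceTransformers model (base or fine-tuned checkpoint).
model = SentenceTransformer("all-mpnet-base-v2")
word_emb = model.encode(word, convert_to_tensor=True)
cat_embs = model.encode(categories, convert_to_tensor=True)
print(util.cos_sim(word_emb, cat_embs))  # scores were hardly distinguishable
```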
Is there any way to check or even prove whether a model understands a specific word or sequence?