Category "bert-language-model"

Does Fine-tunning Bert Model in multiple times with different dataset make it more accuracy?

i'm totally new in NLP and Bert Model. What im trying to do right now is Sentiment Analysis on Twitter Trending Hashtag ("neg", "neu", "pos") by using DistilBer

why do pooler use tanh as a activation func in bert, rather than gelu?

class BERTPooler(nn.Module): def init(self, config): super(BERTPooler, self).init() self.dense = nn.Linear(config.hidden_size, config.hidden_size) self.activati