Category "language-model"

Torch shape mismatch error while training a GPT2 model

I am trying to train a GPT2 language model for text generation tasks. I am trying to include an additional embedding layer (with POS-tagging) on top of token em

When using padding in sequence models, is Keras validation accuracy valid/reliable?

I have a group of non-zero sequences with different lengths and I am using Keras LSTM to model these sequences. I use Keras Tokenizer to tokenize (tokens start

Pretraining a language model on a small custom corpus

I was curious if it is possible to use transfer learning in text generation, and re-train/pre-train it on a specific kind of text. For example, having a pre