Category "gpt"

Torch shape mismatch error while training a GPT2 model

I am trying to train a GPT2 language model for text generation tasks. I am trying to include an additional embedding layer (with POS-tagging) on top of token em

Speed up text generation of GPT-2 simple

Hi, I am trying to generate a 20-token text using GPT-2 simple. It is taking me around 15 seconds to generate the sentence. AI Dungeon is taking around 4 seconds