Category "transformer"

Temporal Fusion Transformer in SavedModel format

I am trying to save the model from here https://github.com/greatwhiz/tft_tf2/blob/master/README.md in SavedModel format (preferably with the Functional API). The so…
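
A minimal sketch of the SavedModel export path, using a stand-in Functional API model rather than the actual TFT from that repo (wrapping the repo's model as a tf.keras.Model is assumed; paths and layer sizes are illustrative):

    import tensorflow as tf

    # Stand-in Functional API model; in practice this would be the TFT
    # from the repo, wrapped as a tf.keras.Model
    inputs = tf.keras.Input(shape=(30, 5))   # (time steps, features)
    x = tf.keras.layers.LSTM(16)(inputs)
    outputs = tf.keras.layers.Dense(1)(x)
    model = tf.keras.Model(inputs, outputs)

    # Export in SavedModel format (a directory, not a single .h5 file)
    model.save("saved_model/tft", save_format="tf")

    # Reload later; custom layers would need `custom_objects` or
    # `@tf.keras.utils.register_keras_serializable`
    restored = tf.keras.models.load_model("saved_model/tft")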

Downloading transformers and BERT to your local machine

I am trying to replicate the code from this page. At my workplace we have access to the transformers and PyTorch libraries, but cannot connect to the internet from our Python…
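
A common workaround, as a hedged sketch: fetch the checkpoint once on a machine that does have internet access, save it with save_pretrained, copy the folder to the offline machine, and point from_pretrained at the local path (the directory name here is just an example):

    # On a machine WITH internet access: download and save locally
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")
    tokenizer.save_pretrained("./bert-base-uncased-local")
    model.save_pretrained("./bert-base-uncased-local")

    # Copy the directory to the offline machine, then load from the path
    tokenizer = AutoTokenizer.from_pretrained("./bert-base-uncased-local")
    model = AutoModel.from_pretrained("./bert-base-uncased-local")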

How to train a BERT model from scratch with Hugging Face?

I found an answer about training a model from scratch in this question: How to train BERT from scratch on a new domain for both MLM and NSP? One answer uses Trainer and…
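
For the MLM part, one possible sketch with Trainer: build a fresh BertConfig so the weights are randomly initialised, and let DataCollatorForLanguageModeling create the masked labels. The tiny model sizes, toy sentences, and reuse of a pretrained tokenizer are purely illustrative; on a genuinely new domain you would train your own tokenizer:

    from transformers import (BertConfig, BertForMaskedLM, BertTokenizerFast,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    # Fresh config -> randomly initialised weights (from scratch, MLM only;
    # the sizes here are deliberately tiny)
    config = BertConfig(vocab_size=30522, hidden_size=128,
                        num_hidden_layers=2, num_attention_heads=2)
    model = BertForMaskedLM(config)

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

    texts = ["example sentence one.", "example sentence two."]
    dataset = [tokenizer(t, truncation=True, max_length=64) for t in texts]

    # The collator randomly masks 15% of tokens and builds the MLM labels
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                               mlm=True, mlm_probability=0.15)
    args = TrainingArguments(output_dir="bert-scratch", num_train_epochs=1,
                             per_device_train_batch_size=2)
    trainer = Trainer(model=model, args=args,
                      train_dataset=dataset, data_collator=collator)
    trainer.train()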

Why does DETR need to set an empty class?

Why does DETR need to set an empty class? It has set a "Background" class, which means non-object. Why?
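
The short answer from the DETR paper: the decoder always emits a fixed number of predictions (e.g. 100 queries), usually far more than there are objects in the image, so every query the Hungarian matcher leaves unmatched needs an explicit "no object" (∅) class to predict, and the classification head therefore has num_classes + 1 logits. A toy sketch of that shape (all numbers are illustrative):

    import torch
    import torch.nn as nn

    num_queries, num_classes, hidden_dim = 100, 91, 256

    # DETR-style class head: one extra logit (index num_classes) for "no object"
    class_head = nn.Linear(hidden_dim, num_classes + 1)

    decoder_output = torch.randn(num_queries, hidden_dim)  # stand-in decoder output
    logits = class_head(decoder_output)                    # shape (100, 92)

    # Unmatched queries are trained to predict the last index,
    # i.e. the empty / background class
    no_object_label = num_classes
    print(logits.shape, no_object_label)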

What's the difference between a "self-attention mechanism" and a "fully-connected" layer?

I am confused by these two structures. In theory, the outputs of both are connected to all of their inputs. What magic makes the 'self-attention mechanism' more powerful…
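
One way to see the difference in code: a fully-connected layer mixes features with weights that are fixed after training, while self-attention computes its mixing coefficients from the input itself. A minimal single-head sketch (dimensions are arbitrary):

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    x = torch.randn(1, 4, 8)  # (batch, tokens, features)

    # Fully-connected layer: the weights are FIXED after training,
    # and each token is transformed independently with the same W
    fc = nn.Linear(8, 8)
    y_fc = fc(x)

    # Self-attention: the mixing weights DEPEND ON THE INPUT itself
    q_proj, k_proj, v_proj = nn.Linear(8, 8), nn.Linear(8, 8), nn.Linear(8, 8)
    q, k, v = q_proj(x), k_proj(x), v_proj(x)
    attn = torch.softmax(q @ k.transpose(-2, -1) / 8 ** 0.5, dim=-1)
    y_attn = attn @ v  # tokens are mixed with input-dependent coefficients

    print(attn[0])  # changes whenever x changes; the W of `fc` does not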

How to understand masked multi-head attention in the Transformer

I'm currently studying the code of the Transformer, but I cannot understand the masked multi-head attention of the decoder. The paper says that it is to prevent you from seeing the…
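
The mask is usually implemented by setting the attention scores for all future positions to -inf before the softmax, so each position can only attend to itself and earlier positions. A minimal sketch of such a causal mask (sequence length is arbitrary):

    import torch

    T = 5  # sequence length
    scores = torch.randn(T, T)  # raw attention scores (query x key)

    # Causal mask: position i may only attend to positions <= i, so the
    # decoder cannot "see" future tokens during training
    mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))

    weights = torch.softmax(scores, dim=-1)
    print(weights)  # the upper triangle is exactly 0 after the softmax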

AttributeError: 'GPT2TokenizerFast' object has no attribute 'max_len'

I am just using the Hugging Face transformers library and get the following message when running run_lm_finetuning.py: AttributeError: 'GPT2TokenizerFast' object has no attribute 'max_len'
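
This usually means the script targets an older transformers release: max_len was deprecated and then removed in transformers v4 in favour of model_max_length. One fix is pinning an older transformers version; another, sketched below, is editing the script to use the new attribute:

    from transformers import GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

    # Old scripts:  block_size = tokenizer.max_len   -> AttributeError in v4+
    # Updated:      use model_max_length instead
    block_size = tokenizer.model_max_length
    print(block_size)  # 1024 for GPT-2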

PyTorch Temporal Fusion Transformer prediction output length

I have trained a Temporal Fusion Transformer on some training data and would like to predict on some unseen data. To do so, I'm using the pytorch_forecasting TimeSeriesDataSet…
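
In pytorch_forecasting the forecast horizon is fixed by the max_prediction_length passed to the TimeSeriesDataSet, so predict() always returns exactly that many steps per series. A hedged sketch with a toy single-series frame (column names and lengths are illustrative; `tft` stands for an already trained TemporalFusionTransformer):

    import pandas as pd
    from pytorch_forecasting import TimeSeriesDataSet

    # Toy data: one series, 40 time steps (purely illustrative)
    df = pd.DataFrame({
        "time_idx": list(range(40)),
        "value": [float(i) for i in range(40)],
        "group": ["a"] * 40,
    })

    max_prediction_length = 6  # the model always forecasts exactly this many steps
    training = TimeSeriesDataSet(
        df[df.time_idx < 30],
        time_idx="time_idx",
        target="value",
        group_ids=["group"],
        max_encoder_length=24,
        max_prediction_length=max_prediction_length,
    )

    # For inference, rebuild the dataset over the new data with predict=True so
    # each series contributes one sample whose decoder covers its last 6 steps
    prediction_data = TimeSeriesDataSet.from_dataset(
        training, df, predict=True, stop_randomization=True
    )
    # preds = tft.predict(prediction_data.to_dataloader(train=False, batch_size=64))
    # -> shape (n_series, max_prediction_length); a longer horizon requires
    #    retraining with a larger max_prediction_length (or iterated forecasting)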