PyTorch: Difference between using the GATConv dropout parameter and a torch.nn.functional.dropout layer?

I was looking at the PyTorch Geometric documentation for the Graph Attention Network layer, GATConv.

Question: What is the difference between using the dropout parameter in the GATConv layer and including a dropout via torch.nn.functional.dropout? Are these different hyperparameters?

My attempt: From the definitions below, they seem to refer to different things (a code sketch follows the list):

  • The dropout from torch.nn.functional is defined as: "During training, randomly zeroes some of the elements of the input tensor with probability p using samples from a Bernoulli distribution."
  • The dropout from GATConv is defined as: "Dropout probability of the normalized attention coefficients which exposes each node to a stochastically sampled neighborhood during training."
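
In other words, `F.dropout` zeroes entries of whatever feature tensor you pass it, while `GATConv`'s `dropout` argument is applied inside the layer to the normalized attention coefficients, randomly dropping edges' contributions during aggregation. A minimal sketch of a model using both (assuming `torch_geometric` is installed; the `Net` class and its argument names are illustrative, not from the original question):

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class Net(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels,
                 feature_dropout=0.5, attention_dropout=0.6):
        super().__init__()
        self.feature_dropout = feature_dropout
        # GATConv's `dropout` is applied to the normalized attention
        # coefficients, so each node aggregates over a stochastically
        # sampled neighborhood during training.
        self.conv1 = GATConv(in_channels, hidden_channels, heads=8,
                             dropout=attention_dropout)
        self.conv2 = GATConv(hidden_channels * 8, out_channels, heads=1,
                             dropout=attention_dropout)

    def forward(self, x, edge_index):
        # F.dropout zeroes entries of the node-feature tensor itself.
        x = F.dropout(x, p=self.feature_dropout, training=self.training)
        x = F.elu(self.conv1(x, edge_index))
        x = F.dropout(x, p=self.feature_dropout, training=self.training)
        return self.conv2(x, edge_index)
```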

So would the GATConv dropout need to be treated as a separate hyperparameter in a grid-search cross-validation?
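
If the two are treated as independent hyperparameters, a grid search would enumerate their Cartesian product. A hedged sketch, reusing the `Net` class above (the candidate values, dataset dimensions, and the `train_and_eval` helper are placeholders, not part of the original question):

```python
from itertools import product

feature_dropouts = [0.3, 0.5]      # placeholder candidate values
attention_dropouts = [0.0, 0.6]    # placeholder candidate values

for p_feat, p_att in product(feature_dropouts, attention_dropouts):
    # Example dimensions; replace with your dataset's sizes.
    model = Net(in_channels=1433, hidden_channels=8, out_channels=7,
                feature_dropout=p_feat, attention_dropout=p_att)
    # score = train_and_eval(model)  # your own CV training loop here
```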


