Difference between CrossEntropyLoss and NLLLoss with log_softmax in PyTorch?
When I am building a classifier in PyTorch, I have two options:
- using `nn.CrossEntropyLoss` without any modification to the model, or
- using `nn.NLLLoss` with `F.log_softmax` added as the last layer of the model.

Which approach should one use, and why?
Solution 1:[1]
They're the same.
If you check the implementation of `nn.CrossEntropyLoss`, you will find that it calls `nll_loss` after applying `log_softmax` to the input:

```python
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
```
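The equivalence is easy to verify numerically. A minimal sketch (the tensor shapes, seed, and target values here are illustrative, not from the question):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # raw scores for 4 samples, 3 classes
targets = torch.tensor([0, 2, 1, 2])  # correct class index per sample

# Approach 1: CrossEntropyLoss applied directly to raw logits
ce = nn.CrossEntropyLoss()(logits, targets)

# Approach 2: log_softmax as the final step, then NLLLoss
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)

# Both approaches yield the same loss value
print(torch.allclose(ce, nll))
```

In practice this means the choice is stylistic: `nn.CrossEntropyLoss` keeps the model's output as raw logits, while the `log_softmax` + `nn.NLLLoss` variant makes the model emit log-probabilities, which can be convenient if you also need them at inference time.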
Solution 2:[2]
Cross-entropy and log-likelihood are two interpretations of the same formula. In the log-likelihood view, we maximize the probability (strictly, the likelihood) of the correct class, which is the same as minimizing the cross-entropy. You're right that both terms have created some ambiguity in the literature, and there are subtleties and caveats; I would highly suggest you go through this thread, where the topic has been discussed rigorously. You may find it useful.
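To make the equivalence concrete, a short derivation for a single example (standard definitions, not from the answer itself): let $p$ be the one-hot target distribution with correct class $y$, and $q$ the model's predicted distribution. Then

```latex
H(p, q) = -\sum_{c} p(c) \log q(c) = -\log q(y)
```

so minimizing the cross-entropy $H(p, q)$ is exactly minimizing the negative log-likelihood $-\log q(y)$ of the correct class.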
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Khalid Saifullah |
