Difference between CrossEntropyLoss and NLLLoss with log_softmax in PyTorch?

When building a classifier in PyTorch, I have two options:

  1. Using nn.CrossEntropyLoss without any modification to the model
  2. Using nn.NLLLoss with F.log_softmax added as the last layer of the model

Now, which approach should one use, and why?
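The two setups can be sketched side by side. This is a minimal illustration with a hypothetical single-layer model and random inputs, not code from either answer:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits_layer = nn.Linear(4, 3)        # hypothetical classifier: 4 features, 3 classes
x = torch.randn(2, 4)                 # a batch of 2 samples
target = torch.tensor([0, 2])         # class indices

# Option 1: feed raw logits to nn.CrossEntropyLoss
loss1 = nn.CrossEntropyLoss()(logits_layer(x), target)

# Option 2: apply log_softmax as the last step, then nn.NLLLoss
loss2 = nn.NLLLoss()(F.log_softmax(logits_layer(x), dim=1), target)

print(loss1.item(), loss2.item())     # the two numbers agree
```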



Solution 1:[1]

They're the same.

If you check the implementation of nn.CrossEntropyLoss, you will find that it calls nll_loss after applying log_softmax to its input:

return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
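That equivalence can be checked directly with the functional API; the random inputs below are only for illustration, and dim=1 is the class dimension:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 10)            # raw scores: 5 samples, 10 classes
target = torch.randint(0, 10, (5,))

# F.cross_entropy fuses the two steps performed in the quoted line
ce = F.cross_entropy(logits, target)
nll = F.nll_loss(F.log_softmax(logits, dim=1), target)

assert torch.allclose(ce, nll)
```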

Solution 2:[2]

Cross-entropy and log-likelihood are two interpretations of the same formula. In the log-likelihood view, we maximize the probability (more precisely, the likelihood) of the correct class, which is the same as minimizing cross-entropy. You're right that the two terms have created some ambiguity in the literature, and there are subtleties and caveats, so I would highly suggest you go through this thread, where the topic has been discussed rigorously. You may find it useful.

Cross-Entropy or Log-Likelihood in Output layer
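To sketch why the two views coincide: for a one-hot target distribution p over classes c and a predicted distribution q = softmax(z), the cross-entropy collapses to the negative log-probability of the correct class y:

```latex
H(p, q) = -\sum_{c} p_c \log q_c = -\log q_y
```

Minimizing this over the training data is therefore the same as maximizing the likelihood of the correct labels.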

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1
Solution 2: Khalid Saifullah