'Pytorch model doesn't converge with single GPU but works well on two same GPUs

I met a strange problem. I trained my model with one GPU(RTX Titan), and it doesn't converge. However, it worked well on two same GPUs with the same settings. There is nothing to do with the batch size. And I use the torch.fft and torch.Transformer layer. I use Python 3.8, Pytorch 1.71 and Cuda 10.1.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source