How to debug CUDA out of memory on PyTorch 0.4.1? (paper code)

I'm trying to make the code from this GitHub repo work with the default dataset it targets; however, I'm getting a CUDA out-of-memory (OOM) error out of nowhere. The error happens the first time a batch is fed to the model's encoder, so no previous iterations run: it fails right from the start.

What troubles me is that a) it happens immediately in the first forward pass through the model, so their TensorBoard summary writer does not log anything (a file is created, but the TensorBoard page cannot read it), and b) I'm running my tests on two 1080 Tis, so it is unlikely that I simply don't have enough memory. Even with three GPUs the error persists.

I read that the TensorBoard profiler can show memory usage, but it is only supported on PyTorch 1.8 and up, and this project uses 0.4.1. Can anyone share an idea on how to debug memory usage in this situation?
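The kind of instrumentation I have in mind is sprinkling a helper like the one below around the forward pass (a minimal sketch; the helper name and output format are my own, and it assumes `torch.cuda.memory_allocated()` and `torch.cuda.memory_cached()`, which I believe are available in the 0.4.x line):

```python
def mb(num_bytes):
    # Convert a raw byte count to mebibytes for readable logging.
    return num_bytes / (1024.0 ** 2)

def log_cuda_memory(tag):
    # Hypothetical helper: print allocated vs. cached CUDA memory so the
    # last tag seen before the OOM points at the offending step.
    # torch is imported lazily so the helper is a no-op-safe drop-in.
    import torch
    if torch.cuda.is_available():
        print("[%s] allocated: %.1f MB, cached: %.1f MB"
              % (tag,
                 mb(torch.cuda.memory_allocated()),
                 mb(torch.cuda.memory_cached())))

# Intended usage around the failing call, e.g.:
#   log_cuda_memory("before encoder")
#   hidden = encoder(batch)   # hypothetical call from the repo
#   log_cuda_memory("after encoder")
```

Would logging like this around the encoder call be a reasonable substitute for the profiler on such an old version?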

Thanks!

Edit: I forgot to add that changing the batch size did not help, so the problem is likely the model's size. Also, since I want to reproduce the paper's results, I probably should not change any of the hyperparameters.
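To back up the model-size suspicion, I was thinking of a rough back-of-the-envelope estimate of the parameter memory. In PyTorch itself this would be `sum(p.numel() * p.element_size() for p in model.parameters())`; below is a self-contained sketch with made-up layer shapes (the shapes are hypothetical, not from the repo), assuming float32 weights:

```python
def model_param_megabytes(param_shapes, bytes_per_element=4):
    # Estimate parameter memory in MB from a list of shape tuples,
    # assuming float32 (4 bytes per element) by default.
    total_elements = 0
    for shape in param_shapes:
        n = 1
        for dim in shape:
            n *= dim
        total_elements += n
    return total_elements * bytes_per_element / (1024.0 ** 2)

# Hypothetical encoder: an embedding table and two weight matrices.
shapes = [(50000, 512), (512, 2048), (2048, 512)]
print(model_param_megabytes(shapes))  # prints 105.65625
```

Of course the parameters are only part of the story: activations, gradients, and optimizer state typically dominate, so is there a way to account for those on 0.4.1 as well?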



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow