"No Such Process" consumes GPU memory

When I run nvidia-smi, nearly 20GB of GPU memory is unaccounted for (the listed processes total 17745MB, while Memory-Usage reports 37739MB):

[screenshot: nvidia-smi output]

Then I ran nvitop, which shows that an entry labeled No Such Process is actually holding my GPU memory. However, I cannot kill this PID:

>>> sudo kill -9 118238
kill: (118238): No such process

[screenshot: nvitop output]

How can I get rid of this ghost process without interrupting the others?



Solution 1:[1]

I found the solution in this answer: https://stackoverflow.com/a/59431785/6563277.

First, I ran sudo fuser -v /dev/nvidia* to list every process that has the GPU device files open, including those that nvidia-smi failed to show.

The output included some "ghost" Python processes. After I killed them, the GPU memory was freed.
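The comparison above can also be scripted. The following is only a rough sketch, not part of the original answer: it assumes that fuser (from psmisc) and nvidia-smi are both on the PATH, that the script runs as root so fuser can see other users' processes, and that it runs on the host rather than inside a container with its own PID namespace. The helper names parse_pids and hidden_gpu_pids are made up for illustration. It only prints candidate PIDs and does not kill anything, because legitimate processes (for example an X server or a driver daemon) may also hold /dev/nvidia* open without appearing as compute apps.

```python
import glob
import shutil
import subprocess


def parse_pids(text):
    """Return the set of integer PIDs found in whitespace-separated text."""
    return {int(tok) for tok in text.split() if tok.isdigit()}


def hidden_gpu_pids(fuser_stdout, smi_stdout):
    """PIDs that hold /dev/nvidia* open but are not listed by nvidia-smi."""
    return sorted(parse_pids(fuser_stdout) - parse_pids(smi_stdout))


def main():
    # Bail out quietly if the required tools or device files are missing.
    if not shutil.which("fuser") or not shutil.which("nvidia-smi"):
        print("fuser and nvidia-smi are both required")
        return
    devices = glob.glob("/dev/nvidia*")
    if not devices:
        print("no /dev/nvidia* device files found")
        return

    # fuser writes only the PIDs to stdout; file names and access
    # flags go to stderr, so stdout can be parsed directly.
    fuser = subprocess.run(["fuser", *devices], capture_output=True, text=True)

    # One PID per line for every compute process nvidia-smi knows about.
    smi = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )

    for pid in hidden_gpu_pids(fuser.stdout, smi.stdout):
        print(pid)


if __name__ == "__main__":
    main()
```

Each printed PID can then be inspected with ps -fp PID before deciding whether to kill it, which avoids interrupting jobs that are still doing useful work.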

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1