How do I set up Visdom on a remote server (HPC3, using a SLURM script)?

I am trying to visualise different loss function plots during training using Visdom. I am using an HPC3 system, which uses a SLURM script and command-line arguments to run the training.

I have already tried following this tutorial: https://gist.github.com/amoudgl/011ed6273547c9312d4f834416ab1d0c. When I run it, I get an error saying the port is already in use, even though I cannot actually open the link itself. I also tried the demo.py script provided in the tutorial as an example before changing my own code, but even that does not work for me.

I am not sure if it is the nature of SLURM scripts/commands or if I have not implemented it properly. My steps are as follows (following the tutorial above):

ssh -N -f -L localhost:8097:localhost:8097 [email protected] (on remote server terminal)

Then, using the sbatch command and the SLURM script, I run:

no_proxy:localhost python demo.py

Then on my local machine (using Terminal), I activate visdom:

visdom 

This has led to one of two outcomes. Either I cannot connect and get an error saying the port is already in use, or the page opens in my browser but nothing shows up.
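
One way to narrow down the "port already in use" error is to check, on each machine, whether something is already listening on port 8097. Below is a minimal sketch using only the Python standard library; the helper name `port_in_use` is made up for illustration and is not part of Visdom, and 8097 is assumed to be the port in question because it appears in the ssh command above.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration; not part of Visdom.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    port = 8097  # assumed: the port used in the ssh command above
    state = "in use" if port_in_use(port) else "free"
    print(f"Port {port} on localhost is {state}")
```

Running this on both the local machine and the remote server, before and after each step, should show where the port gets taken. Note that `ssh -L localhost:8097:...` itself listens on port 8097 on the machine where the ssh command is run, so a Visdom server started afterwards on that same machine with the same port would be expected to report the port as already in use.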

Any help would be greatly appreciated.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
