Category "slurm"

Bash script to send commands to remote ssh session

Is it possible to write a bash script that opens a remote node (i.e. through ssh and/or slurm) and starts an interactive session there after running some comman

Slurm parallel "steps": 25 independent runs, using 1 cpu each, at most 5 simultaneously

I was previously using HTCondor as a cluster scheduler. Now even after reading Slurm documentation, I have no idea how to parallelize... What I want to achieve

Start cannot spawn child process: No such file or directory

Hi, I get this message when I run my job in slurm. What does it mean? tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No suc

Per-node default partition in SLURM

I'm configuring a small cluster, controlled by SLURM. This cluster has one master node and two partitions. Users submit their jobs from worker nodes, I've rest

Scheduling more jobs than MaxArraySize

Let's say I have 6233 simulations to run. The commands are generated and stored in a file, one in each line. I would like to use Slurm to schedule and run these

Is it possible to configure the directory for sbatch's default output file?

Is there some way to configure an alternative default directory (other than the current directory) for sbatch to put the file slurm-%j.out (or slurm-%A_%a.out)

Dask-SLURMCluster: [Errno 104] Connection reset by peer

I'm running into a problem using an Xarray together with SLURMCluster from Dask. I'm using pandas_plink to load some data into an Xarray, then filtering it and ma

Running parallel jobs in slurm

I was wondering if I could ask something about running slurm jobs in parallel. (Please note that I am new to slurm and Linux and have only started using it 2 day

Limit the number of running jobs in SLURM

I am queuing multiple jobs in SLURM. Can I limit the number of parallel running jobs in slurm? Thanks in advance!

Error while running a slurm job through crontab which uses Intel MPI for parallelization

I am trying to run WRF (real.exe, wrf.exe) through crontab using compute nodes, but the compute nodes are not able to run the slurm job. I think there is some issue

What does it mean for a slurm job to crash with `bus error`?

When running a Python script via slurm srun --pty bash I get a cryptic error message Bus error: core dumped. I searched the slurm documentation and it doesn't m