Nextflow Fails When Using a GPU in Google Cloud Platform
I get an error when I run the following Nextflow script with the following config, and I would like to know why.
script.nf
process test_proc {
    container "nvidia/cuda:11.3.0-cudnn8-devel-ubi8@sha256:a4e84a99bfe1e402831e394fddc7f166a0461d664a6db5d091893cbd6d5147f3"
    executor "google-lifesciences"
    machineType "n1-highmem-16"
    accelerator 2, type: "nvidia-tesla-v100"
    containerOptions "--gpus all"

    """
    echo "hello"
    """
}
nextflow.config
google {
    project = "project-1111111"
    region = "europe-west4"
    lifeSciences {
        bootDiskSize = "200 GB"
        debug = true
        preemptible = false
        sshDaemon = true
    }
}
docker {
    enabled = true
    runOptions = "--user='root' --pull=always"
}
Command
nextflow run script.nf -c nextflow.config -bucket-dir gs://my_awesome_bucket/nextflow
Error
Error executing process > 'test_proc'
Caused by:
Process `test_proc` terminated with an error exit status (2)
Command executed:
echo "hello"
Command exit status: 2
Command output: (empty)
Command error: Execution failed: generic::unknown: installing drivers: installing GPU drivers and /proc/driver/nvidia does not exist: exit status 1
Work dir:
gs://selene-nextflow-workdir/nextflow/b1/df9f35ada1568fd99cf909ffd8425c
Tip: you can replicate the issue by changing to the process work dir and entering the command
bash .command.run
I have already tried many different nvidia/cuda images, working backward from the latest to older ones, but none of them worked.
What does work is commenting out the lines accelerator ... and containerOptions ... and using the latest Ubuntu image. This leads me to believe that my general Nextflow setup and installation are correct, and that this might be a driver issue, possibly on the Google Cloud Platform side.
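For reference, the variant that runs successfully looks roughly like the sketch below. It is the same process with the two GPU-related directives commented out and the container swapped. The tag "ubuntu:latest" is an assumption standing in for "the latest Ubuntu image"; everything else is copied from script.nf above.

```nextflow
// Sketch of the working, CPU-only variant of test_proc.
// Assumption: "ubuntu:latest" stands in for the Ubuntu image actually used.
process test_proc {
    container "ubuntu:latest"
    executor "google-lifesciences"
    machineType "n1-highmem-16"
    // accelerator 2, type: "nvidia-tesla-v100"   // disabled: no GPU requested
    // containerOptions "--gpus all"              // disabled: no GPU passthrough

    """
    echo "hello"
    """
}
```

Because this variant completes with the same nextflow.config and the same command, the failure appears to be tied to requesting the accelerator rather than to the rest of the setup.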
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
