Nextflow Fails When Using a GPU in Google Cloud Platform

Running the Nextflow script below with the config below fails with an error, and I would like to understand why.

script.nf

process test_proc {
    container "nvidia/cuda:11.3.0-cudnn8-devel-ubi8@sha256:a4e84a99bfe1e402831e394fddc7f166a0461d664a6db5d091893cbd6d5147f3"

    executor "google-lifesciences"
    machineType "n1-highmem-16"
    accelerator 2, type: "nvidia-tesla-v100"
    containerOptions "--gpus all"

    """
    echo "hello"
    """
}

nextflow.config

google {
    project = "project-1111111"
    region = "europe-west4"

    lifeSciences {
        bootDiskSize = "200 GB"
        debug = true
        preemptible = false
        sshDaemon = true
    }
}

docker {
    enabled = true
    runOptions = "--user='root' --pull=always"
}

Command

nextflow run script.nf -c nextflow.config -bucket-dir gs://my_awesome_bucket/nextflow

Error

Error executing process > 'test_proc'

Caused by: Process test_proc terminated with an error exit status (2)

Command executed:

echo "hello"

Command exit status: 2

Command output: (empty)

Command error: Execution failed: generic::unknown: installing drivers: installing GPU drivers and /proc/driver/nvidia does not exist: exit status 1

Work dir:
gs://selene-nextflow-workdir/nextflow/b1/df9f35ada1568fd99cf909ffd8425c

Tip: you can replicate the issue by changing to the process work dir and entering the command bash .command.run


I have already tried many different nvidia/cuda images, working backwards from the latest releases to older ones, but none of them worked.

What does work is commenting out the accelerator ... and containerOptions ... lines and using the latest Ubuntu image. This leads me to believe that my general Nextflow setup and installation are correct, and that this might be a driver issue, possibly on the Google Cloud Platform side.
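
For reference, the variant that runs successfully looks roughly like the sketch below. It is the process from script.nf with the two GPU lines commented out; the ubuntu:latest tag is an assumption standing in for "the latest Ubuntu image", and nextflow.config is unchanged.

```groovy
process test_proc {
    // Assumption: "the latest Ubuntu image" is taken to mean the ubuntu:latest tag
    container "ubuntu:latest"

    executor "google-lifesciences"
    machineType "n1-highmem-16"

    // GPU-related directives disabled; with these two lines commented out
    // the task completes and prints "hello"
    // accelerator 2, type: "nvidia-tesla-v100"
    // containerOptions "--gpus all"

    """
    echo "hello"
    """
}
```

Since the only differences from the failing run are the container image and the two GPU directives, the failure appears to be tied to requesting the accelerator rather than to the rest of the setup.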



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
