WSL2 Linux: Numba reports CUDA_ERROR_NO_DEVICE, but the OS can see the GPU

I am having trouble getting Numba to work on my WSL2 installation. I am trying to use the OpenPCDet library (https://github.com/open-mmlab/OpenPCDet), but it fails with a CUDA error: Numba does not seem to be able to see the GPU. I also ran numba -s to get more information; a subset of the output is shown here.

__CUDA Information__
Error: CUDA device intialisation problem. Message:Error at driver init:
[100] Call to cuInit results in CUDA_ERROR_NO_DEVICE:
Error class:

__ROC Information__
ROC available : False
Error initialising ROC due to : No ROC toolchains found.
No HSA Agents found, encountered exception when searching: Error at driver init:
NUMBA_HSA_DRIVER /opt/rocm/lib/libhsa-runtime64.so is not a valid file path. Note it must be a filepath of the .so/.dll/.dylib or the driver:

__Conda Information__
conda_build_version : not installed
conda_env_version : 4.12.0
platform : linux-64
python_version : 3.9.7.final.0
root_writable : True

Current Conda Env (Subset)
cudatoolkit 10.2.89 hfd86e86_1
cumm-cu102 0.2.8 pypi_0 pypi
numba 0.55.1 pypi_0 pypi
pytorch 1.7.1 py3.8_cuda10.2.89_cudnn7.6.5_0 pytorch
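The failure reported by numba -s can also be checked from inside Python, which makes it easier to confirm that the check runs in the same environment as OpenPCDet. The following is a minimal sketch, assuming Numba is installed in the active environment; `numba_cuda_status` is a hypothetical helper name, not part of Numba.

```python
def numba_cuda_status():
    """Return a short string describing whether Numba can use CUDA.

    Hypothetical helper for illustration; only `numba.cuda.is_available`
    is an actual Numba call.
    """
    try:
        # Imported lazily so the helper still works when Numba is missing.
        from numba import cuda
    except Exception as exc:  # usually ImportError
        return f"numba not importable: {exc}"

    try:
        # is_available() returns False when driver initialisation fails,
        # which is the situation shown in the numba -s output.
        return "available" if cuda.is_available() else "not available"
    except Exception as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    print(numba_cuda_status())
```

If this prints "not available" in the environment where the CUDA samples run fine, the problem is likely in how Numba locates or initialises the driver rather than in the GPU setup itself.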

To check whether WSL itself sees the GPU, I ran the nbody sample from this guide (https://ubuntu.com/blog/getting-started-with-cuda-on-ubuntu-on-wsl-2). I get:

Run "nbody -benchmark [-numbodies=]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies= (number of bodies (>= 1) to run in simulation)
-device= (where d=0,1,2.... for the CUDA device to use)
-numdevices= (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Windowed mode
Simulation data stored in video memory
Single precision floating point simulation
1 Devices used for simulation
GPU Device 0: "Pascal" with compute capability 6.1

Compute 6.1 CUDA device: [NVIDIA GeForce GTX 1080 Ti]
28672 bodies, total time for 10 iterations: 23.114 ms
= 355.669 billion interactions per second
= 7113.380 single-precision GFLOP/s at 20 flops per interaction
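Since the CUDA sample runs, the driver works for at least one process, so it may help to see which copy of libcuda gives which result. The sketch below calls cuInit directly through ctypes for a few candidate library paths. The paths are assumptions (/usr/lib/wsl/lib is where WSL2 usually exposes the Windows driver's libcuda, but this may differ per setup), and `probe_cuinit` and `probe_all` are hypothetical helper names.

```python
import ctypes

# CUresult values from the CUDA driver API.
CUDA_SUCCESS = 0
CUDA_ERROR_NO_DEVICE = 100

# Assumed candidate locations; adjust for the actual system.
CANDIDATES = [
    "/usr/lib/wsl/lib/libcuda.so.1",
    "/usr/lib/x86_64-linux-gnu/libcuda.so.1",
    "libcuda.so.1",  # let the dynamic loader search its default paths
]


def probe_cuinit(path):
    """Load one libcuda candidate and return cuInit's status code.

    Returns None when the library cannot be loaded or has no cuInit symbol.
    """
    try:
        lib = ctypes.CDLL(path)
        cu_init = lib.cuInit
    except (OSError, AttributeError):
        return None
    cu_init.argtypes = [ctypes.c_uint]
    cu_init.restype = ctypes.c_int
    return cu_init(0)  # the flags argument must be 0


def probe_all(paths=None):
    """Map each candidate path to its cuInit status (or None)."""
    if paths is None:
        paths = CANDIDATES
    return {path: probe_cuinit(path) for path in paths}
```

Calling `probe_all()` returns a dictionary of path to status, where 0 means cuInit succeeded and 100 corresponds to the CUDA_ERROR_NO_DEVICE seen in the numba -s output. If one path returns 0 while another returns 100, pointing Numba at the working file, for example through its NUMBA_CUDA_DRIVER environment variable, may be worth trying; that is a suggestion under the assumptions above, not something confirmed in this post.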

For further information, the nvidia-smi command gives the following output:

Sun May 22 16:37:56 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 512.15       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
| 23%   26C    P8    11W / 250W |   1671MiB / 11264MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Can somebody help with this? Thanks!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
