WSL2 Linux: Numba CUDA_ERROR_NO_DEVICE although the OS can see the GPU
I am having trouble getting Numba to work in my WSL2 installation. I am trying to use the OpenPCDet library (https://github.com/open-mmlab/OpenPCDet), but it fails with the error below: it seems Numba cannot see the CUDA device. I also ran numba -s to gather system information; a subset of its output is shown here.
__CUDA Information__
Error: CUDA device intialisation problem. Message:Error at driver init:
[100] Call to cuInit results in CUDA_ERROR_NO_DEVICE:
Error class:
ROC Information
ROC available : False
Error initialising ROC due to : No ROC toolchains found.
No HSA Agents found, encountered exception when searching:
Error at driver init:
NUMBA_HSA_DRIVER /opt/rocm/lib/libhsa-runtime64.so is not a valid file path. Note it must be a filepath of the .so/.dll/.dylib or the driver:
Conda Information
conda_build_version : not installed
conda_env_version : 4.12.0
platform : linux-64
python_version : 3.9.7.final.0
root_writable : True
Current Conda Env (Subset)
cudatoolkit 10.2.89 hfd86e86_1
cumm-cu102 0.2.8 pypi_0 pypi
numba 0.55.1 pypi_0 pypi
pytorch 1.7.1 py3.8_cuda10.2.89_cudnn7.6.5_0 pytorch
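For completeness, the failure can be reproduced without OpenPCDet at all. This is a minimal sketch (it assumes only that Numba is importable; cuda.is_available() returns a boolean rather than raising, so it is safe to run anywhere):

```python
# Minimal reproducer, independent of OpenPCDet: ask Numba directly
# whether it can see a CUDA device.
try:
    from numba import cuda
    available = cuda.is_available()  # False in my environment, despite the GPU
except ImportError:
    available = None  # numba not installed in this environment

print("Numba sees a CUDA device:", available)
```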
Also, just to check whether WSL sees a GPU device at all, I ran the nbody CUDA sample suggested on this page (https://ubuntu.com/blog/getting-started-with-cuda-on-ubuntu-on-wsl-2). I get:
Run "nbody -benchmark [-numbodies=]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies= (number of bodies (>= 1) to run in simulation)
-device= (where d=0,1,2.... for the CUDA device to use)
-numdevices= (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
Windowed mode
Simulation data stored in video memory
Single precision floating point simulation
1 Devices used for simulation
GPU Device 0: "Pascal" with compute capability 6.1
Compute 6.1 CUDA device: [NVIDIA GeForce GTX 1080 Ti]
28672 bodies, total time for 10 iterations: 23.114 ms
= 355.669 billion interactions per second
= 7113.380 single-precision GFLOP/s at 20 flops per interaction
To provide further information, nvidia-smi produces the following output:
Sun May 22 16:37:56 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Version: 512.15 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 On | N/A |
| 23% 26C P8 11W / 250W | 1671MiB / 11264MiB | 3% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Can somebody help with this? Thanks!
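Since the nbody sample runs fine, the driver itself is clearly functional; the question is why Numba's call to cuInit fails. As a further diagnostic I can mimic Numba's driver initialisation with ctypes (a sketch only: the /usr/lib/wsl/lib path is an assumption about a typical WSL2 setup, and on a machine without the driver it simply reports that nothing was found):

```python
import ctypes

def cuda_driver_status():
    """Try to load libcuda and call cuInit, the same driver entry
    point Numba uses.  The explicit WSL2 path in the candidate list
    is an assumption about a typical setup."""
    candidates = (
        "libcuda.so",
        "libcuda.so.1",
        "/usr/lib/wsl/lib/libcuda.so.1",
    )
    for name in candidates:
        try:
            lib = ctypes.CDLL(name)
        except OSError:
            continue  # this candidate is not loadable; try the next one
        # 0 = CUDA_SUCCESS, 100 = CUDA_ERROR_NO_DEVICE
        return name, lib.cuInit(0)
    return None, None  # no libcuda found at all

path, rc = cuda_driver_status()
print("driver:", path, "cuInit return code:", rc)
```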
Sources
Source: Stack Overflow, licensed under CC BY-SA 3.0 per its attribution requirements.
