InternalError: cudaGetDevice() failed. Status: initialization error when running TensorFlow

I recently got a new Windows computer for work that came with a GPU (NVIDIA Quadro P4200). I was hoping to run some old code I had, but this time taking advantage of the GPU. I am attempting to run an LSTM model for text classification using TensorFlow. Let me apologize in advance for what is probably too much information; I am just at a loss and don't know where the issue is. Also, admittedly, I have very little understanding of some of the more technical aspects of this.

I currently have the versions below:

\# Name                    Version                   Build  Channel  
tensorflow                2.6.0           gpu_py39he88c5ba_0  
tensorflow-base           2.6.0           gpu_py39hb3da07e_0  
tensorflow-estimator      2.6.0              pyh7b7c402_0  
tensorflow-gpu            2.6.0                h17022bd_0  

cudatoolkit               11.3.1               h59b6b97_2

cudnn                     8.2.1                cuda11.3_0

Also, when I run nvidia-smi, I get:

+-----------------------------------------------------------------------------+  
| NVIDIA-SMI 419.17       Driver Version: 419.17       CUDA Version: 10.1     |  
|-------------------------------+----------------------+----------------------+  
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |  
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |  
|===============================+======================+======================|  
|   0  Quadro P4200       WDDM  | 00000000:01:00.0 Off |                  N/A |  
| N/A   51C    P8     8W /  N/A |    111MiB /  8192MiB |      0%      Default |  
+-------------------------------+----------------------+----------------------+  

and when I run nvcc --version, I get:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:52:33_Pacific_Standard_Time_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0

In Windows Device Manager, the driver is listed as version 25.21.14.1917.
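For anyone comparing these numbers: the "CUDA Version" field in the nvidia-smi header is the highest CUDA version the installed driver supports, not the version of the toolkit that is installed. Below is a minimal sketch of a check that compares that value with a cudatoolkit version. The function names (`parse_nvidia_smi_header`, `version_tuple`, `toolkit_supported`) are made up for illustration, and the comparison is deliberately rough: it only looks at major.minor and ignores any finer compatibility rules.

```python
import re


def parse_nvidia_smi_header(text):
    """Return (driver_version, max_cuda_version) as strings, or None if absent."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", text
    )
    if match is None:
        return None
    return match.group(1), match.group(2)


def version_tuple(version):
    """Turn a dotted version string such as '11.3.1' into (11, 3, 1)."""
    return tuple(int(part) for part in version.split("."))


def toolkit_supported(max_cuda, toolkit):
    """Rough check: is the toolkit's major.minor no newer than the driver's limit?"""
    return version_tuple(toolkit)[:2] <= version_tuple(max_cuda)[:2]


# Header line copied from the nvidia-smi output in this post.
header = "| NVIDIA-SMI 419.17       Driver Version: 419.17       CUDA Version: 10.1     |"

driver, max_cuda = parse_nvidia_smi_header(header)
print(f"driver {driver} supports CUDA up to {max_cuda}")
print("cudatoolkit 11.3.1 ok:", toolkit_supported(max_cuda, "11.3.1"))
print("cudatoolkit 10.1 ok:", toolkit_supported(max_cuda, "10.1"))
```

With the values from this post, the check flags cudatoolkit 11.3.1 as newer than what driver 419.17 supports, while 10.1 passes.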

When I run my code, I get the following error:

InternalError: cudaGetDevice() failed. Status: initialization error

I have googled for solutions and found several suggesting I use different versions of cudatoolkit, cudnn and TensorFlow. I have tried several options, including reverting to cudatoolkit 10.1 and cudnn 7.6.5, which required an older version of TensorFlow and Python 3.8 (the setup above uses 3.9). When I made those changes, TensorFlow did not appear to detect my GPU at all.

My next step is probably to ask our IT department to update my GPU driver. However, I am a bit worried that this might not solve my problem (and might even make it worse?).

Update

I got everything to work! I guess it will be useful to keep this here in case others have similar issues.

So, the problem was indeed that I needed to use an older version of cudatoolkit: my driver (419.17) only supports CUDA up to 10.1, which is the "CUDA Version" that nvidia-smi reports above. What I did was install cudatoolkit 10.1. However, tensorflow-gpu 2.3 has to be installed with pip install rather than conda install. I am not sure why.

So, in short, to get the older version of TensorFlow to recognize my GPU, I installed cudatoolkit and cudnn with conda and everything else with pip, since all the other packages seemed to require pip.
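To confirm that TensorFlow actually recognizes the GPU after a reinstall like this, something along these lines can help. It is only a sketch: `describe_tensorflow_gpu` is a hypothetical helper name, and which keys `tf.sysconfig.get_build_info()` returns can differ between TensorFlow versions and platforms, so missing values are reported as None rather than treated as errors.

```python
def describe_tensorflow_gpu():
    """Collect what TensorFlow reports about its CUDA build and visible GPUs."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed (or failed to load) in this environment.
        return {"tensorflow_version": None}

    info = {
        "tensorflow_version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
    }

    # get_build_info() is not available in every TensorFlow release,
    # so look it up defensively instead of assuming it exists.
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    if get_build_info is not None:
        build = get_build_info()
        info["cuda_version"] = build.get("cuda_version")
        info["cudnn_version"] = build.get("cudnn_version")

    # An empty list here means TensorFlow does not see any GPU.
    info["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]
    return info


report = describe_tensorflow_gpu()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `gpus` comes back empty, or the CUDA version TensorFlow was built against is newer than what the driver supports, the mismatch is the first thing to look at.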



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
