tensorflow: Fail to find dnn implementation

I'm trying to run my Keras CuDNNGRU code on TensorFlow with a GPU, but it always fails with the error "Fail to find dnn implementation", even though I have already installed CUDA and cuDNN.

I have already reinstalled both CUDA and cuDNN several times and upgraded cuDNN from 7.2.1 to 7.5.0, but nothing fixed it. I also tried running the code both in a Jupyter Notebook and with the Python interpreter (in a terminal), with the same result. Here are the details of my hardware and software.

  1. Tesla V100 PCIE 16GB
  2. Ubuntu 18.04
  3. NVIDIA-SMI 384.183
  4. CUDA 9.0
  5. CuDNN 7.5.0
  6. Miniconda 3
  7. Python 3.6
  8. Tensorflow 1.12
  9. Keras 2.1.6

Here is my code.

encoder_LSTM = tf.keras.layers.CuDNNGRU(hidden_unit, return_sequences=True, return_state=True)
encoder_LSTM_rev = tf.keras.layers.CuDNNGRU(hidden_unit, return_state=True, return_sequences=True, go_backwards=True)

encoder_outputs, state_h = encoder_LSTM(x)
encoder_outputsR, state_hR = encoder_LSTM_rev(x)

And this is the error message.

2019-05-27 19:08:06.814896: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-05-27 19:08:06.814956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-27 19:08:06.814971: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 
2019-05-27 19:08:06.814978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N 
2019-05-27 19:08:06.815279: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14678 MB memory) -> physical GPU (device: 0, name: Tesla V100-PCIE-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
2019-05-27 19:08:08.050226: E tensorflow/stream_executor/cuda/cuda_dnn.cc:373] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2019-05-27 19:08:08.050350: E tensorflow/stream_executor/cuda/cuda_dnn.cc:381] Possibly insufficient driver version: 384.183.0
2019-05-27 19:08:08.050378: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at cudnn_rnn_ops.cc:1214 : Unknown: Fail to find the dnn implementation.
2019-05-27 19:08:08.050483: E tensorflow/stream_executor/cuda/cuda_dnn.cc:373] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2019-05-27 19:08:08.050523: E tensorflow/stream_executor/cuda/cuda_dnn.cc:381] Possibly insufficient driver version: 384.183.0
2019-05-27 19:08:08.050541: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at cudnn_rnn_ops.cc:1214 : Unknown: Fail to find the dnn implementation.
Traceback (most recent call last):
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: Fail to find the dnn implementation.
     [[{{node cu_dnngru/CudnnRNN}} = CudnnRNN[T=DT_FLOAT, direction="unidirectional", dropout=0, input_mode="linear_input", is_training=true, rnn_mode="gru", seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](cu_dnngru/transpose, cu_dnngru/ExpandDims, gradients/while/Shape/Enter_grad/zeros/Const, cu_dnngru/concat)]]
     [[{{node mean_squared_error/value/_37}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1756_mean_squared_error/value", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ta_skenario1.py", line 271, in <module>
    losss, op = sess.run([loss, optimizer], feed_dict={x:data,y_label:label,initial_input:begin_sentence})
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: Fail to find the dnn implementation.
     [[node cu_dnngru/CudnnRNN (defined at ta_skenario1.py:205)  = CudnnRNN[T=DT_FLOAT, direction="unidirectional", dropout=0, input_mode="linear_input", is_training=true, rnn_mode="gru", seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](cu_dnngru/transpose, cu_dnngru/ExpandDims, gradients/while/Shape/Enter_grad/zeros/Const, cu_dnngru/concat)]]
     [[{{node mean_squared_error/value/_37}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1756_mean_squared_error/value", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'cu_dnngru/CudnnRNN', defined at:
  File "ta_skenario1.py", line 205, in <module>
    encoder_outputs, state_h = encoder_LSTM(x)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/keras/layers/recurrent.py", line 619, in __call__
    return super(RNN, self).__call__(inputs, **kwargs)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 757, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/keras/layers/cudnn_recurrent.py", line 109, in call
    output, states = self._process_batch(inputs, initial_state)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/keras/layers/cudnn_recurrent.py", line 299, in _process_batch
    rnn_mode='gru')
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/ops/gen_cudnn_rnn_ops.py", line 116, in cudnn_rnn
    is_training=is_training, name=name)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/home/paperspace/.conda/envs/gpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

UnknownError (see above for traceback): Fail to find the dnn implementation.
     [[node cu_dnngru/CudnnRNN (defined at ta_skenario1.py:205)  = CudnnRNN[T=DT_FLOAT, direction="unidirectional", dropout=0, input_mode="linear_input", is_training=true, rnn_mode="gru", seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](cu_dnngru/transpose, cu_dnngru/ExpandDims, gradients/while/Shape/Enter_grad/zeros/Const, cu_dnngru/concat)]]
     [[{{node mean_squared_error/value/_37}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1756_mean_squared_error/value", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Any ideas? Thank you.

UPDATE: I tried downgrading cuDNN from 7.5.0 to 7.1.4, but the result remains the same.



Solution 1:[1]

Configuring your GPU to allow memory growth worked for me with TF 2.0. I found this solution a few months ago in another issue, when I had the same problem on a pre-TF 2.0 version; I can't remember where.

Add the following (before creating any sessions or models) and it should do the trick.

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)
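
The question itself uses TF 1.12 with a manual tf.Session, so the same option can be passed straight to that session; a minimal sketch under that assumption:

import tensorflow as tf

# Minimal sketch for TF 1.x (the question uses TF 1.12): let TensorFlow
# allocate GPU memory on demand instead of grabbing it all up front, which
# leaves room for cuDNN to create its handle.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

sess = tf.Session(config=config)
# ... build the graph and call sess.run(...) as in the question ...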

Solution 2:[2]

This worked for me in TensorFlow 2, as suggested here:

import tensorflow as tf
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], enable=True)
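
If there is more than one GPU, a small variation of the same API (a sketch; it must run before any op touches the GPU, otherwise TensorFlow raises a RuntimeError) enables growth on every visible device:

import tensorflow as tf

# Sketch: turn on memory growth for every GPU TensorFlow can see.
# This has to happen before any tensors or ops are placed on the GPU.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)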

Solution 3:[3]

Not sure if it helps, but in my case the problem was caused by using multiple Jupyter notebook files.

I was writing a simple neural network and decided to split the code into two notebooks, one for training and one for prediction (so that anyone without the resources/time to train the network could load my saved model from a file).

If I ran the two notebooks "together", i.e. first the training one and then the prediction one without shutting down the kernel of the first, I would get this error.

Shutting down the kernel of the first Jupyter notebook before using the second one solved my problem.

Solution 4:[4]

Did you test your installations (CUDA, cuDNN, tensorflow-gpu)?

Test CUDA: first check that:

$ nvcc -V

displays the right version of your CUDA toolkit. Then you can test it with the following procedure.

First (it takes a few minutes):

$ cd ~/NVIDIA_CUDA-9.0_Samples
$ make

and then:

$ cd ~/NVIDIA_CUDA-9.0_Samples/bin/x86_64/linux/release
$ ./deviceQuery

If you get "Result = PASS" at the end, you're all good!

Test cuDNN:

$ cp -r /usr/src/cudnn_samples_v7/ $HOME
$ cd $HOME/cudnn_samples_v7/mnistCUDNN
$ make clean && make
$ ./mnistCUDNN

You should get "Test passed!" as the result.

Test tensorflow-gpu:

If CUDA and cuDNN are working, you can test your TensorFlow installation with:

from tensorflow.python.client import device_lib
device_lib.list_local_devices()
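
For example, a quick sanity check (a sketch; tf.test.is_gpu_available() is the TF 1.x-era helper, deprecated in later TF 2 releases) that a GPU device actually shows up:

import tensorflow as tf
from tensorflow.python.client import device_lib

# Sketch: confirm TensorFlow actually sees the GPU. The device list should
# contain an entry like '/device:GPU:0'.
print([d.name for d in device_lib.list_local_devices()])
print(tf.test.is_gpu_available())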

I advise you to install TensorFlow in a conda environment using:

conda create --name tf_gpu tensorflow-gpu

For me (after a lot of problems) this worked very well.

Sources: gpu installation for Ubuntu 18.04, tensorflow-gpu installation

Solution 5:[5]

For anyone experiencing this issue with TF 2.0 and CUDA 10.0 with cuDNN 7: you are likely getting it because you accidentally upgraded cuDNN from 7.6.2 to something >7.6.5. Despite the TF docs stating that anything >=7.4.1 works, this is not the case! Downgrade cuDNN as follows:

sudo apt-get install --no-install-recommends \
  cuda-10-0 \
  libcudnn7=7.6.2.24-1+cuda10.0  \
  libcudnn7-dev=7.6.2.24-1+cuda10.0

For the future, you can put cuDNN updates on hold in Ubuntu/Debian by marking the packages with apt-mark:

sudo apt-mark hold libcudnn7 libcudnn7-dev
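
If you want to double-check which cuDNN version your TensorFlow wheel actually expects before pinning packages, newer releases expose it; a sketch, assuming TF 2.3 or later (this API does not exist in TF 2.0):

import tensorflow as tf

# Sketch (TF 2.3+ only): print the CUDA and cuDNN versions this TensorFlow
# build was compiled against, so the installed libcudnn7 package can be
# matched to them.
info = tf.sysconfig.get_build_info()
print(info.get('cuda_version'), info.get('cudnn_version'))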

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 James Dedon
Solution 2 Alexander Higgins
Solution 3
Solution 4 Baptiste Pouthier
Solution 5 weidler