ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory - python-3.x

I have installed CUDA 8.0 and cuDNN 5.1 on CentOS. When I then import tensorflow (Python 3.6), it raises the error above.
I have already set the environment variables below in /etc/profile. Has anyone else run into this problem?
export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64:$LD_LIBRARY_PATH
What also confuses me is that when I run nvcc -V, it shows:
Cuda compilation tools, release 8.0, V8.0.61
However, when I run ./deviceQuery from /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery, it reports for device 0 ("Tesla M40"):
CUDA Driver Version / Runtime Version 9.1 / 8.0
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = Tesla M40
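As a quick diagnostic, a minimal Python sketch (run with the same Python 3.6 interpreter that fails to import tensorflow) can confirm whether libcublas.so.9.0 is actually resolvable through LD_LIBRARY_PATH:
# Diagnostic sketch: does the dynamic loader see libcublas.so.9.0 at all?
import ctypes
import os

print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "<not set>"))
try:
    ctypes.CDLL("libcublas.so.9.0")  # dlopen lookup, roughly what importing TF triggers
    print("libcublas.so.9.0 found")
except OSError as exc:
    print("libcublas.so.9.0 NOT found:", exc)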

Check your version of TensorFlow using "pip3 list | grep tensorflow". If it is tensorflow-gpu 1.5.0, then the required CUDA version is 9.0 and cuDNN v7.
Look into the following link for more details:
https://github.com/tensorflow/tensorflow/releases
The TensorFlow installation guide needs to be updated.
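If you would rather check from Python than from pip, a small sketch using pkg_resources does the same thing without importing tensorflow itself (which is what fails here):
# Sketch: report the installed TensorFlow distribution without importing it.
import pkg_resources

for name in ("tensorflow-gpu", "tensorflow"):
    try:
        dist = pkg_resources.get_distribution(name)
        print(dist.project_name, dist.version)
    except pkg_resources.DistributionNotFound:
        pass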

I had the same problem. TensorFlow 1.5.0 is precompiled against CUDA 9.0 (September 2017), which is already outdated.
The newest CUDA version is 9.1 (December 2017), and sudo pip install tensorflow-gpu will not work with it. There are two solutions to the problem:
1.) Install CUDA 9.0 next to CUDA 9.1 (this worked for me; see the quick check sketched below)
2.) Build TensorFlow yourself from the git source code
Either way, do not forget to set the PATH and LD_LIBRARY_PATH variables for your operating system; otherwise your Python interpreter will keep raising the error message stated in the question.
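If you go with option 1, a quick Python check (assuming the default install prefix /usr/local/cuda-9.0) confirms the side-by-side install is visible to the interpreter before you try importing tensorflow again:
# Sketch: confirm a side-by-side CUDA 9.0 install is on LD_LIBRARY_PATH.
# Assumes the default prefix /usr/local/cuda-9.0.
import os

lib_dir = "/usr/local/cuda-9.0/lib64"
ld_path = os.environ.get("LD_LIBRARY_PATH", "")
print("CUDA 9.0 lib64 exists:   ", os.path.isdir(lib_dir))
print("lib64 on LD_LIBRARY_PATH:", lib_dir in ld_path.split(":"))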

Related

THC/THC.h: No such file or directory

I am trying to compile this with CUDA support for a project: https://github.com/CharlesShang/DCNv2. But every time I try, it gives me this error message:
THC/THC.h: No such file or directory
9 | #include <THC/THC.h>
I am using:
Arch Linux with kernel version 6.1.4
GTX 1080
python 3.6
pytorch 1.2.0
torchvision 0.4.0
cudatoolkit 10.0
gcc 7.5
I thought it might be an incompatibility between the CUDA and gcc versions, but I tried multiple combinations and none of them worked. At the moment I am using CUDA 10.0 with gcc 7.5, which should be compatible.
Any help is greatly appreciated.
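One way to see the exact toolchain combination the DCNv2 build will pick up is a short diagnostic sketch like the following (CUDA_HOME as exposed by torch.utils.cpp_extension; this only reports versions, it is not a fix):
# Sketch: print the toolchain the extension build will actually use.
import subprocess
import torch
from torch.utils.cpp_extension import CUDA_HOME

print("torch:", torch.__version__)
print("torch built for CUDA:", torch.version.cuda)
print("CUDA_HOME:", CUDA_HOME)
print(subprocess.check_output(["gcc", "--version"]).decode().splitlines()[0])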

Using CUDA 11.x but getting error: Unknown CUDA arch (8.6) or GPU not supported

I'm setting up a conda environment to use pytorch 1.4.0 (on Ubuntu 20.04.2), but getting the error message:
ValueError: Unknown CUDA arch (8.6) or GPU not supported
I know this has been asked before, but no answer fits my case. This answer suggests that the CUDA version is too old. However, I updated my CUDA version to the most recent, and get the same error message.
nvcc -V says I have CUDA 11 installed, and when I run nvidia-smi I get this info:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.84 Driver Version: 460.84 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
which, according to the NVIDIA docs, should be compatible.
Another auxiliary question: what does the "8.6" in CUDA arch (8.6) represent?
Specific versions of PyTorch work only with specific versions of CUDA.
If you are using CUDA 11.1, you'll need a fairly recent version of PyTorch; you need to either upgrade your PyTorch or downgrade your CUDA. (As for the auxiliary question: the "8.6" is your GPU's compute capability. 8.6 corresponds to Ampere cards such as the RTX 30-series, which were released after PyTorch 1.4, so that build does not know how to target them.)
It seems you can grab PyTorch v1.4 for CUDA 10.0 from here:
pip install torch==1.4.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html
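To confirm what the "8.6" refers to on your own machine, a short sketch reports the CUDA version PyTorch was built against and the compute capability of the installed GPU (exactly the number in the error message):
# Sketch: report the build CUDA version and the GPU's compute capability.
import torch

print("torch:", torch.__version__)
print("built for CUDA:", torch.version.cuda)
if torch.cuda.is_available():
    print("compute capability:", torch.cuda.get_device_capability(0))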

Tensorflow 2.2 and cudnn 8.0.3 not working together as they should. It still looks for cudnn 7.6.5 dll files

I have Tensorflow 2.2 and Cuda 10.1 with cuDnn 8.0.3
I am unable to run my scripts because it keeps looking for cuDnn 7 dll file: cudnn64_7.dll
I get the following:
Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
Even though I installed the newly published cuDnn 8.0.3 for Cuda 10.1 (see cuDNN 8.x support matrix)
I went back to cuDNN 7.6.5 but I was hoping to get the "5 times faster" cuDNN v8.0 as NVIDIA claims.
Any help or workarounds on how to get this done? Googling turns up literally fewer than five results, as it seems not many people have tried the new 8.0.3 (the one for 10.1) yet.
Had the same issue. Version 8.0.3 is the current and latest supported version of the library for CUDA 10.1. However, TensorFlow is built against an earlier version, so you have to use that instead.
To elaborate, if you check this page: https://www.tensorflow.org/install/source_windows#tested_build_configurations
+----------------------+----------------+-----------+-------------+-------+------+
| Version | Python version | Compiler | Build tools | cuDNN | CUDA |
+----------------------+----------------+-----------+-------------+-------+------+
| tensorflow_gpu-2.3.0 | 3.5-3.8 | MSVC 2019 | Bazel 3.1.0 | 7.6 | 10.1 |
+----------------------+----------------+-----------+-------------+-------+------+
So, unless you build TF locally, you have to use the supported cuDNN version (a quick check is sketched below).
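To confirm from Python which versions your installed wheel actually expects, a short sketch works; tf.sysconfig.get_build_info() is available from TF 2.3 onwards (on TF 2.2 the equivalent data lives in tensorflow.python.platform.build_info, with slightly different key names):
# Sketch: ask the installed wheel which CUDA/cuDNN it was built against
# (TF 2.3+ API; key names may differ on older releases).
import tensorflow as tf

info = tf.sysconfig.get_build_info()
print("CUDA: ", info.get("cuda_version"))
print("cuDNN:", info.get("cudnn_version"))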
That being said, however, if you check latest TF releases:
https://github.com/tensorflow/tensorflow/releases
you will then see the following TensorFlow 2.4.0-rc1 note:
TensorFlow pip packages are now built with CUDA11 and cuDNN 8.0.2.
You can use the release candidate version of TF, but then you also have to upgrade CUDA to 11 (presumably 11.0, since no minor version is mentioned) and use cuDNN v8.0.2 (July 24th, 2020) for CUDA 11.0.
Just tested: this setup works. You just have to make sure to install numpy version 1.19.3 in order to avoid the problem mentioned in these threads:
RuntimeError: The current Numpy installation fails to pass a sanity check due to a bug in the windows runtime
https://developercommunity.visualstudio.com/content/problem/1207405/fmod-after-an-update-to-windows-2004-is-causing-a.html

Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found

Installed Nvidia CUDA 11
Got the cuDNN 8.0 (I think)
Added the directory to PATH
Installed TensorFlow through (pip install tensorflow-gpu)
But I still get this error
Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
TF 2.4 supports CUDA 11
TF 2.3 needs CUDA 10.1
Just install CUDA 10.1 (you can have more than one installation).
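Before reinstalling anything, a small Windows-only sketch shows whether the DLL named in the error is actually reachable on PATH (extend the list with any other DLL names your log mentions):
# Sketch (Windows): probe the runtime DLL mentioned in the error message.
import ctypes

for dll in ("cudart64_101.dll",):
    try:
        ctypes.WinDLL(dll)
        print(dll, "found")
    except OSError:
        print(dll, "NOT found on PATH")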

Where is the CUDA toolkit located on Ubuntu?

I installed Nvidia's 375 driver and CUDA 8.0 on Ubuntu 16.04 from Nvidia's .deb package. I want to build TensorFlow with GPU support. This is the output of TensorFlow's configure script:
./configure
You have bazel 0.4.5 installed.
Please specify the location of python. [Default is /usr/bin/python3]:
Found possible Python library paths:
/usr/local/lib/python3.5/dist-packages
/usr/lib/python3/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python3.5/dist-packages]
Using python library path: /usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with MKL support? [y/N]
No MKL support will be enabled for TensorFlow
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n]
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] y
Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N]
No XLA support will be enabled for TensorFlow
Do you wish to build TensorFlow with VERBS support? [y/N]
No VERBS support will be enabled for TensorFlow
Do you wish to build TensorFlow with OpenCL support? [y/N]
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Do you want to use clang as CUDA compiler? [y/N]
nvcc will be used as CUDA compiler
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]:
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Invalid path to CUDA 8.0 toolkit. /usr/local/cuda/lib64/libcudart.so.8.0 cannot be found
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]:
The CUDA toolkit directory is not found at the default path, and I can't find it anywhere in /usr:
find /usr -type f -name '*cuda*'
/usr/src/linux-headers-4.4.0-151/include/linux/cuda.h
/usr/src/linux-headers-4.4.0-151/include/uapi/linux/cuda.h
/usr/src/linux-headers-4.4.0-142/include/linux/cuda.h
/usr/src/linux-headers-4.4.0-142/include/uapi/linux/cuda.h
/usr/lib/nvidia-384/bin/nvidia-cuda-mps-server
/usr/lib/nvidia-384/bin/nvidia-cuda-mps-control
/usr/lib/x86_64-linux-gnu/libicudata.so.55.1
/usr/share/man/man1/alt-nvidia-384-cuda-mps-control.1.gz
/usr/share/vim/vim74/syntax/cuda.vim
/usr/share/vim/vim74/indent/cuda.vim
/usr/include/linux/cuda.h
Did I miss something in the CUDA installation?
The .deb package I downloaded only installed the repository's metadata. As the documentation says (page 14), I had to install cuda after installing the package:
apt update
apt install cuda
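After apt install cuda finishes, a quick check confirms the toolkit landed where TensorFlow's configure script looks by default (/usr/local/cuda):
# Sketch: locate libcudart under the usual toolkit prefixes.
import glob

hits = glob.glob("/usr/local/cuda*/lib64/libcudart.so*")
print("\n".join(hits) if hits else "libcudart not found under /usr/local/cuda*")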
