Is it possible to use Theano with CUDA 6.5 and CuDNN 3.0? - theano

I ran a Python program with Theano, but it fails with:
ImportError: cuDNN not available: Version is too old. Update to v5, was 3007.
So, is it possible to use Theano with CUDA 6.5 and cuDNN 3.0? Currently, I don't have root privileges to install a newer version of CUDA (because the newer CUDA needs a newer driver).

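Clone Theano from its Git repository, use git checkout to switch to an older version that still accepts cuDNN v3, then install that checkout locally (in your user directory, which does not need root).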
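For reference, the 3007 in the error message is cuDNN's packed version number. A minimal sketch of how to read it, assuming the usual major*1000 + minor*100 + patch layout used by the cuDNN releases discussed on this page (decode_cudnn_version is a hypothetical helper name, not part of Theano or cuDNN):

```python
def decode_cudnn_version(packed):
    """Split a packed cuDNN version into (major, minor, patch).

    Assumption: packed == major * 1000 + minor * 100 + patch, which is the
    layout used by the cuDNN releases mentioned here (v3 through v8).
    """
    major, rest = divmod(packed, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


# The value reported in the ImportError above.
print(decode_cudnn_version(3007))  # (3, 0, 7)
```

So "was 3007" means cuDNN 3.0.7 was found, while that Theano release asks for at least v5.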

Related

How to tell PyTorch which CUDA version to take?

I have two versions of CUDA installed on my Ubuntu 16.04 machine: 9.0 and 10.1.
They are located in /usr/local/cuda-9.0 and /usr/local/cuda-10.1 respectively.
If I install PyTorch 1.6.0 (which needs CUDA 10.1) via pip (pip install torch==1.6.0), it uses version 9.0 and thus detects no GPUs. I have already changed my LD_LIBRARY_PATH to "/usr/local/cuda-10.1/lib64:/usr/local/cuda-10.1/cuda/extras/CUPTI/lib64", but PyTorch is still using CUDA 9.0.
How do I tell PyTorch to use CUDA 10.1?
Prebuilt torch wheels built against different versions of CUDA are available on the torch stable releases page. For example, you can install torch v1.9.0 built with CUDA v11.1 like this:
pip install --upgrade torch==1.9.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
But not all the combinations are available.
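The +cu111 suffix in the command above encodes the CUDA version without the dot. A small sketch of that naming convention follows; cuda_wheel_tag and pip_command are hypothetical helper names, and the code only builds the command string, it does not check that the requested combination actually exists on the releases page:

```python
def cuda_wheel_tag(cuda_version):
    """Turn a CUDA version such as '11.1' into a wheel suffix such as 'cu111'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def pip_command(torch_version, cuda_version):
    """Build the pip command for a CUDA-specific torch wheel (illustrative only)."""
    tag = cuda_wheel_tag(cuda_version)
    return (
        f"pip install --upgrade torch=={torch_version}+{tag} "
        "-f https://download.pytorch.org/whl/torch_stable.html"
    )


print(pip_command("1.9.0", "11.1"))
```

Whether a given torch/CUDA pair is published still has to be checked on the releases page itself.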

Can I train in tensorflow with separate CUDA version in anaconda environment

I need to train a model with tensorflow-gpu==2.3.0, which needs CUDA 10.1, but nvidia-smi reports the CUDA version as 10.0.
I created a conda environment with "conda create -n tf2-gpu tensorflow-gpu cudatoolkit=10.1".
After starting training, it throws tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version
How can I train with tensorflow-gpu in a conda environment that uses a different version of CUDA? I still need CUDA 10.0 on the system, because my other training setup depends on it.
Yes, you can create two virtual environments in Anaconda with different TensorFlow versions. Conda will install the CUDA and cuDNN versions that are compatible with the tensorflow-gpu version you specify.
You can find tensorflow-gpu build configuration details here to check supporting CUDA and cuDNN version.
Please check this similar issue link to create virtual environment in anaconda and to install specific tensorflow-gpu.
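The error text in the question says the driver is too old for the CUDA runtime in the environment: nvidia-smi reports 10.0, while the conda environment carries cudatoolkit 10.1. A minimal sketch of that comparison, under the simplifying assumption that the driver must support a CUDA version at least as new as the runtime (both function names are hypothetical):

```python
def parse_version(text):
    """Convert '10.1' into (10, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_can_run(driver_cuda, runtime_cuda):
    """Return True if the driver's supported CUDA version covers the runtime's.

    Simplifying assumption: the version reported by nvidia-smi must be at
    least as new as the cudatoolkit version installed in the environment.
    """
    return parse_version(driver_cuda) >= parse_version(runtime_cuda)


# Versions from the question: nvidia-smi reports 10.0, conda installed 10.1.
print(driver_can_run("10.0", "10.1"))  # False
```

Under that assumption, a separate conda environment isolates the toolkit but not the driver, so the options are a newer driver or a TensorFlow build that targets CUDA 10.0.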

Could not find any cudnn.h matching version '8' in any subdirectory

For a specific purpose, I am building TF 1.14 from source with CUDA 11.1, cuDNN 8.0.4 and TensorRT 7.2 on Ubuntu 16.04, but I am getting the error in the title.
I have verified that cuDNN is installed at /usr/include/cudnn.h and, following this Stack Overflow answer, I have copied the cudnn.h file to /usr/local/cuda/ and the "libcudnn8_8.0.4.30-1+cuda11.1_amd64.deb" and "libcudnn8-dev_8.0.4.30-1+cuda11.1_amd64.deb" files to /usr/local/cuda. Can anyone please help me?
As a side note, which I think is probably not the cause of this issue: there are two cuDNN and CUDA versions installed on the machine.
Robert, CUDA 11.1 and cuDNN 8.0.4 are not compatible with TF 1.14. First, I would recommend you upgrade to TensorFlow 2, but it is not necessary. I found it best to install TensorFlow using conda if you have Anaconda installed. The reason is that conda installs the right versions of the CUDA toolkit and cuDNN automatically. pip does not do that, so you have to download the right versions manually and change the PATH environment variable to point to the directories where you placed them. So first I would uninstall TensorFlow, then reinstall it with conda. If you need further information, there is a good guide located here.
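To see which cuDNN version a header actually declares, you can read its version macros. The sketch below parses CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL from header text; cudnn_version_from_header is a hypothetical helper. To my knowledge, cuDNN 8 keeps these macros in cudnn_version.h rather than cudnn.h, which may be why a build script that inspects only cudnn.h reports no version 8, but treat that as an assumption to verify on your machine.

```python
import re

# Matches lines such as "#define CUDNN_MAJOR 8".
_MACRO = re.compile(
    r"^\s*#define\s+CUDNN_(MAJOR|MINOR|PATCHLEVEL)\s+(\d+)", re.MULTILINE
)


def cudnn_version_from_header(header_text):
    """Return (major, minor, patch) from header text, or None if macros are missing."""
    found = {name: int(value) for name, value in _MACRO.findall(header_text)}
    if len(found) < 3:
        return None
    return found["MAJOR"], found["MINOR"], found["PATCHLEVEL"]


# Illustrative header fragment, not copied from a real installation.
sample = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 4
"""
print(cudnn_version_from_header(sample))  # (8, 0, 4)
```

On a real system you would pass in the contents of the header, for example the text read from /usr/include/cudnn.h, and try cudnn_version.h in the same directory if the result is None.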

PyTorch having trouble detecting CUDA

I am running a CNN in PyTorch. torch.cuda.is_available() returns False and no GPU is detected. However, I can run a Keras model on the GPU. Here is my system information:
OS: Ubuntu 18.04.3
Python 3.7.3 (Conda)
GPU: GTX1080Ti
Nvidia driver: 430.50
When I check nvidia-smi, the output said that the CUDA version is 10.1. However, the nvcc -V command tells me that it is CUDA 9.1.
I downloaded NVIDIA-Linux-x86_64-430.50.run from the official site and installed it from the command line. I installed CUDA 10.1 using the following commands recommended by the official site:
wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.243_418.87.00_linux.run
sudo sh cuda_10.1.243_418.87.00_linux.run
I installed PyTorch through pip install. What is wrong? Thanks in advance!
The default PyTorch 1.2 package depends on CUDA 10.0, but you have CUDA 9.1. The output of nvidia-smi only tells you the maximum CUDA version your driver supports, whereas nvcc reports the CUDA toolkit that is actually installed on your system. It seems that your installation of CUDA 10.1 was unsuccessful.
In addition to CUDA 10.0, PyTorch also supports CUDA 9.2, and I've found that the PyTorch package compiled for CUDA 10.0 also works with CUDA 10.1. So you can either upgrade your CUDA installation to 9.2 and install the PyTorch CUDA 9.2 package with
pip3 install torch==1.2.0+cu92 torchvision==0.4.0+cu92 -f https://download.pytorch.org/whl/torch_stable.html
Or get a working installation of CUDA 10.1. There are detailed Linux instructions here. (Note that you may have to remove previous installations of CUDA before installing a new one.)
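To compare what nvcc reports against what your PyTorch wheel expects, you can pull the release number out of the nvcc -V output. A minimal sketch follows; nvcc_release is a hypothetical helper and the sample line only imitates the format nvcc prints:

```python
import re


def nvcc_release(nvcc_output):
    """Extract the toolkit release, e.g. '9.1', from `nvcc -V` output."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


# Sample text in the format nvcc prints; the build number is illustrative.
sample = "Cuda compilation tools, release 9.1, V9.1.85"
print(nvcc_release(sample))  # 9.1
```

To run it against your own machine, feed it the text captured from nvcc -V, for instance via subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout.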
FYI, this answer is a hack which could mess up your conda env, but may work more easily than installing a fresh env. A consistency-checking tool would be really helpful because of all the people having exactly this problem. Matching anaconda's CUDA version with the system driver and the actual hardware and the other system environment settings is challenging to say the least and almost an art.
I found that Anaconda frequently guesses the wrong CUDA version to use. The best fix I have found is to surgically uninstall and reinstall just PyTorch with pip:
pip uninstall torch
pip install torch
Note that pip calls pytorch torch while conda calls it pytorch.
However, I also found that pip sometimes refuses to reinstall torch because it didn't get rid of the anaconda site package files. If that is the case you can very carefully remove them manually as:
rm -fr $HOME/miniconda3/envs/<ENV>/lib/python3.9/site-packages/torch/
rm -fr $HOME/miniconda3/envs/<ENV>/lib/python3.9/site-packages/torch-*.dist-info/
where <ENV> should be replaced with your environment name, and miniconda3 might be anaconda3 or something else depending on your installation.
Be very careful not to delete anything other than the torch files, or you may break something else. If that happens, you would be best served by creating yet another fresh environment.
After this pip install torch should work and torch.cuda.is_available() should return True. Unless there is another problem... YMMV.
Note that I recommend using miniconda because the full anaconda comes overloaded with packages and I find it quickly gets clogged and broken.
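Before running the rm commands above, it may be safer to list what would be removed. The following sketch only prints candidates and deletes nothing; find_torch_leftovers is a hypothetical helper, and the site-packages path you pass in is whatever matches your own environment:

```python
from pathlib import Path


def find_torch_leftovers(site_packages):
    """List the torch package directory and its dist-info directories, if present."""
    root = Path(site_packages)
    candidates = [root / "torch", *root.glob("torch-*.dist-info")]
    # Keep only directories that really exist; torchvision and others never match.
    return sorted(path for path in candidates if path.is_dir())


# Example call (adjust the path to your environment before use):
# for path in find_torch_leftovers("/path/to/envs/myenv/lib/python3.9/site-packages"):
#     print(path)
```

The patterns mirror the two rm commands, so anything printed here is what those commands would delete.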

torch.cuda.is_available() keeps switching to False

I have tried several solutions suggested for the case where a CUDA GPU is present and CUDA is installed, but torch.cuda.is_available() returns False. They helped, but only temporarily: torch.cuda.is_available() reported True, but after some time it switched back to False. I use CUDA 9.0.176 and a GTX 1080. What should I do to make the fix permanent?
I tried the following methods:
https://forums.fast.ai/t/torch-cuda-is-available-returns-false/16721/5
https://github.com/pytorch/pytorch/issues/15612
Note: When torch.cuda.is_available() works fine but at some point switches to False, I have to restart the computer, and then it works again (for some time).
torch.cuda.is_available() returns False because the PyTorch and cudatoolkit versions are incompatible.
As of June 2022, the current version of PyTorch is compatible with cudatoolkit=11.3, whereas the current CUDA toolkit version is 11.7. Source
Solution:
Uninstall PyTorch for a fresh installation. You cannot install an old version on top of a new version without forcing the installation (using pip install --upgrade --force-reinstall <package_name>).
Run conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch to install pytorch.
Install CUDA 11.3 version from https://developer.nvidia.com/cuda-11.3.0-download-archive.
You are good to go.
I also had torch.cuda.is_available() returning False.
After updating the Nvidia driver to the latest version, 436.48, it returns True. I had previously updated PyTorch to 1.2.0.
I have windows 10 and Anaconda.
Installed CUDA 9.1 using apt-get, following the instructions in this link:
https://cryptoandcoffee.com/mining-gems/cuda-9-0-install-ubuntu-16-04-apt-get/
Installed PyTorch using pip:
pip install torchvision (this will install both torch and torchvision)
Rebooted
Now try it:
~$ python -c 'import torch; print(torch.cuda.is_available())'
I saw this issue as well. The cause was that the CUDA version used by PyTorch was out of sync with the installed Nvidia driver. As in Joe's answer, the solution was to update the Nvidia drivers. Some other important background info to be aware of:
Each release of CUDA requires a minimum Nvidia driver version (see here for a compatibility table).
You can check your Nvidia driver version with nvidia-smi.
Pytorch comes pre-packaged with a version of CUDA that may be different from the version you installed on your computer.
The CUDA version that shows up when you run nvidia-smi is the highest version your driver supports; the toolkit you installed manually is the one reported by nvcc --version. Even if your driver is compatible with the toolkit you installed, it may be incompatible with the CUDA version bundled with PyTorch.
You can get the Pytorch CUDA version by printing the torch.version.cuda variable in ipython or in a Python program. This is the version that determines the needed Nvidia driver version.
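A short sketch of that check, gathering the version facts PyTorch reports about itself in one place. describe_torch_cuda is a hypothetical helper; it takes the imported torch module as an argument and uses only torch.__version__, torch.version.cuda and torch.cuda.is_available():

```python
def describe_torch_cuda(torch_module):
    """Collect the CUDA-related facts PyTorch reports about itself."""
    return {
        "torch_version": torch_module.__version__,
        # CUDA version the wheel was built with; None for CPU-only builds.
        "built_for_cuda": torch_module.version.cuda,
        "cuda_available": torch_module.cuda.is_available(),
    }


# Usage, in an environment where PyTorch is installed:
#   import torch
#   print(describe_torch_cuda(torch))
```

If built_for_cuda is newer than the version shown by nvidia-smi, the driver is the likely culprit and updating it is the first thing to try.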
