error message of WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu0 is not available - theano

I installed Theano with pip install theano, and the installation finished successfully. After running import theano, I got the following warning message:
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu0 is not available (error: Unable to get the number of gpus
What does it mean, and how can I resolve it? Thanks.

Related

anaconda ImportError: DLL load failed while importing cv2: The specified module could not be found

I am trying to install OpenCV on my laptop, but I keep getting ImportError: DLL load failed while importing cv2: The specified module could not be found.
I tried installing with conda install -c conda-forge opencv in an Anaconda env.
I get the same result if I use my normal Python 3.10 interpreter with opencv-python and opencv-contrib-python.
I also tried to compile the module myself, but cv2.cp310-win_amd64 gives me the same error as well...
Isn't there a prebuilt binary I can use? Or should I look for a different ML module?
Yesterday when I came home I tried installing opencv-python on my personal computer, and it worked instantly. I have a GTX 1070 in that PC and an RTX A1000 in the work laptop I need OpenCV on, so I thought that the CUDA cores might not be supported or something.
I found this tutorial https://machinelearningprojects.net/build-opencv-with-cuda-and-cudnn/
But after installing the NVIDIA SDK and cuDNN and compiling OpenCV from source, I still get the same error,
even after appending the OpenCV output folder and the CUDA SDK bin folder to Python's dll_path.
Disabling BUILD_SHARED_LIBS, as suggested here, also does nothing:
https://forum.opencv.org/t/opencv-w-cuda-build-seems-successful-but-import-cv2-fails/11328/3
This happened to me just this morning, and I fixed it by installing OpenCV with pip. First, I removed conda's opencv installation. In the environment where it is installed, type
conda remove opencv
Once it is removed, type
pip install opencv-python
Hope it helps
Using Python 3.6 with conda install -c conda-forge opencv=3.2.0, as described in "OpenCV-Python ImportError: DLL load failed: The specified module could not be found", solved the issue on my laptop with an RTX A1000. Weirdly, OpenCV works with Python 3.10 on my desktop with a GTX 1070.

AssertionError: Torch not compiled with CUDA enabled (despite several reinstallations)

Whenever I try to move a variable to CUDA in PyTorch (e.g. torch.zeros(1).cuda()), I get the error message "AssertionError: Torch not compiled with CUDA enabled". Besides, torch.cuda.is_available() returns False.
I have read several answers addressing this error, but for some reason several attempts to reinstall CUDA and PyTorch didn't change anything. Here are some of the settings I used:
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
conda install pytorch torchvision cudatoolkit=11 -c pytorch-nightly
conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
Yet the same error remains. What could be the issue?
Some settings:
I'm using Ubuntu 20.04, GPU is RTX 2080, nvidia-smi works fine (NVIDIA-SMI 460.91.03, Driver Version: 460.91.03, (max possible) CUDA Version: 11.2)
Try installing with pip
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
For a detailed explanation, see the thread "Pytorch for cuda 11.2".
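After reinstalling, it can help to check which build actually ended up in the environment. The following is a minimal sketch, not part of the original answer; the helper name describe_torch_build is made up for illustration. It reports torch.version.cuda, which is None for a CPU-only build, next to torch.cuda.is_available().

```python
import importlib.util


def describe_torch_build():
    """Return a dict describing the installed PyTorch build, or None if torch is absent."""
    # Hypothetical helper name; only standard torch attributes are read below.
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return {
        "torch_version": torch.__version__,
        # None here means the package was built without CUDA support
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    info = describe_torch_build()
    if info is None:
        print("torch is not installed in this interpreter")
    else:
        for key, value in info.items():
            print(f"{key}: {value}")
```

If built_with_cuda prints None, the "Torch not compiled with CUDA enabled" assertion is expected, and the fix is to install a CUDA build rather than to change the driver.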

No module found torch

I am using a Jetson Xavier NX kit with CUDA 10.2.89, OpenCV 4.1.1 and TensorRT 7.1.3, and I am trying to install PyTorch. I tried installing it with this line
conda install pytorch torchvision cpuonly -c pytorch
but when I run this line
import torch
it throws an error saying that the module is not installed.
How can I verify that PyTorch has been installed correctly?
Try this one
conda install -c pytorch pytorch
After running this command, enter yes if prompted to install the related packages. If there are no conflicts while installing the libraries, PyTorch will be installed.
To check that it is installed properly, start python from your command line and run import torch.
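A "module not found" error often means the package went into a different environment than the interpreter being run. As a rough sketch (the helper name module_location is hypothetical), the check below prints which interpreter is active and where torch would be imported from, without actually importing it:

```python
import importlib.util
import sys


def module_location(name):
    """Return the path a top-level module would be loaded from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # origin is the file backing the module (for a package, its __init__.py)
    return spec.origin


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("torch found at:", module_location("torch"))
```

If the interpreter path is not inside the conda environment where the install ran, activating that environment first is the likely fix.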

Can I train in tensorflow with separate CUDA version in anaconda environment

I need to train a model with tensorflow-gpu==2.3.0, which needs CUDA 10.1. But when I run nvidia-smi, it shows the CUDA version as 10.0.
I created a conda environment using "conda create -n tf2-gpu tensorflow-gpu cudatoolkit=10.1".
After starting training, it throws an error: tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version
How can I train using tensorflow-gpu in a conda environment with another version of CUDA? I still need CUDA 10.0 to be there, as my other training setup depends on it.
Yes, you can create two virtual environments in Anaconda with different TensorFlow versions. In each environment, the CUDA and cuDNN versions installed will be the ones compatible with the specified tensorflow-gpu.
You can find the tensorflow-gpu build configuration details here to check the supported CUDA and cuDNN versions.
Please check this similar issue for how to create a virtual environment in Anaconda and install a specific tensorflow-gpu.
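The error message in the question can be read as a version comparison: the CUDA version shown by nvidia-smi is the highest runtime the installed driver supports, and the cudatoolkit inside the conda environment must not exceed it. Here is a minimal sketch of that rule with hypothetical function names. It deliberately ignores NVIDIA's compatibility packages, which can relax the rule on some setups.

```python
def parse_version(text):
    """Turn a version string like '10.1' into a comparable tuple (10, 1)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports_runtime(driver_cuda, runtime_cuda):
    """True if the driver's maximum CUDA version is at least the runtime's version."""
    return parse_version(driver_cuda) >= parse_version(runtime_cuda)


if __name__ == "__main__":
    # Values from the question: nvidia-smi reports 10.0, the env has cudatoolkit=10.1
    print(driver_supports_runtime("10.0", "10.1"))
```

With the values from the question the check fails, which suggests that the driver needs updating rather than the conda environment. A newer driver can still run the older CUDA 10.0 setup.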

PyTorch having trouble detecting CUDA

I am running a CNN in PyTorch. torch.cuda.is_available() returns False and no GPU is detected. However, I can run a Keras model on the GPU. Here is my system information:
OS: Ubuntu 18.04.3
Python 3.7.3 (Conda)
GPU: GTX1080Ti
Nvidia driver: 430.50
When I check nvidia-smi, the output says that the CUDA version is 10.1. However, nvcc -V tells me that it is CUDA 9.1.
I downloaded NVIDIA-Linux-x86_64-430.50.run from the official site and installed it from the command line. I installed CUDA 10.1 using the following commands recommended by the official site:
wget http://developer.download.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.243_418.87.00_linux.run
sudo sh cuda_10.1.243_418.87.00_linux.run
I installed PyTorch through pip install. What is wrong? Thanks in advance!
The default PyTorch 1.2 package depends on CUDA 10.0, but you have CUDA 9.1. The output of nvidia-smi just tells you the maximum CUDA version your driver supports, while nvcc gives the version of the CUDA toolkit installed on your system. It seems that your installation of CUDA 10.1 was unsuccessful.
In addition to CUDA 10.0, PyTorch also supports CUDA 9.2, and I've found that the PyTorch package compiled for CUDA 10.0 also works with CUDA 10.1. So you can either upgrade your CUDA installation to 9.2 and install the PyTorch CUDA 9.2 package with
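To see the two numbers side by side, something like the following sketch could be used. It assumes that the nvidia-smi header contains the text "CUDA Version: X.Y" and that nvcc -V prints "release X.Y"; the function names are made up for illustration.

```python
import re
import shutil
import subprocess


def driver_cuda_version(nvidia_smi_output):
    """Extract the 'CUDA Version: X.Y' value from nvidia-smi output, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", nvidia_smi_output)
    return match.group(1) if match else None


def toolkit_cuda_version(nvcc_output):
    """Extract the 'release X.Y' value from nvcc -V output, or None if absent."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def run_if_present(command):
    """Return the command's stdout, or None if its executable is not on PATH."""
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, capture_output=True, text=True).stdout


if __name__ == "__main__":
    smi = run_if_present(["nvidia-smi"])
    nvcc = run_if_present(["nvcc", "-V"])
    print("driver supports up to CUDA:", driver_cuda_version(smi) if smi else None)
    print("toolkit on PATH is CUDA:", toolkit_cuda_version(nvcc) if nvcc else None)
```

A mismatch like 10.1 versus 9.1 means the nvcc found on PATH belongs to an older toolkit than the one you intended to install.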
pip3 install torch==1.2.0+cu92 torchvision==0.4.0+cu92 -f https://download.pytorch.org/whl/torch_stable.html
Or get a working installation of CUDA 10.1. There are detailed Linux instructions here. (Note that you may have to remove previous installations of CUDA before installing a new one.)
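The +cu92 suffix in the pip command encodes the CUDA version the wheel was built for. As a small sketch (cuda_tag_to_version is a hypothetical helper, not a PyTorch API), the suffix of a version string such as torch.__version__ can be decoded like this:

```python
import re


def cuda_tag_to_version(torch_version):
    """Map a version string such as '1.2.0+cu92' to '9.2'; return None without a +cuXY tag."""
    match = re.search(r"\+cu(\d{2,})$", torch_version)
    if match is None:
        return None
    digits = match.group(1)
    # The last digit is the minor version, the rest is the major version
    return f"{digits[:-1]}.{digits[-1]}"


if __name__ == "__main__":
    print(cuda_tag_to_version("1.2.0+cu92"))
```

A version string without such a suffix, for example one ending in +cpu or with no suffix at all, gives None here, so it tells you nothing either way about the CUDA build.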
FYI, this answer is a hack that could mess up your conda env, but it may be easier than installing a fresh env. A consistency-checking tool would be really helpful, given how many people have exactly this problem. Matching Anaconda's CUDA version with the system driver, the actual hardware, and the other system environment settings is challenging to say the least, and almost an art.
I found that Anaconda frequently guesses the wrong CUDA version to use. The best fix I have found is to surgically uninstall and reinstall just PyTorch with pip:
pip uninstall torch
pip install torch
Note that pip calls the package torch, while conda calls it pytorch.
However, I also found that pip sometimes refuses to reinstall torch because the Anaconda site-packages files were not removed. If that is the case, you can very carefully remove them manually with:
rm -fr $HOME/miniconda3/envs/<ENV>/lib/python3.9/site-packages/torch/
rm -fr $HOME/miniconda3/envs/<ENV>/lib/python3.9/site-packages/torch-*.dist-info/
where <ENV> should be replaced with your environment name, and miniconda might be anaconda or something else depending on your installation.
Be very careful not to delete anything other than the torch files, or you may mess something else up. If that happens, you would be best served by creating yet another fresh environment.
After this, pip install torch should work and torch.cuda.is_available() should return True, unless there is another problem... YMMV.
Note that I recommend using Miniconda because the full Anaconda comes overloaded with packages, and I find it quickly gets clogged and broken.
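Before running rm, it may be safer to list what would be removed. The sketch below is a dry run only and deletes nothing; the helper name leftover_torch_paths is made up. It looks for the torch package directory and torch-*.dist-info directories in the active environment's site-packages, on the assumption that site.getsitepackages() is available for your interpreter.

```python
import site
from pathlib import Path


def leftover_torch_paths(site_packages):
    """Return the torch directory and torch-*.dist-info directories found under site_packages."""
    root = Path(site_packages)
    # "torch-*" needs a hyphen right after "torch", so torchvision is not matched
    candidates = [root / "torch", *root.glob("torch-*.dist-info")]
    return sorted(path for path in candidates if path.exists())


if __name__ == "__main__":
    for directory in site.getsitepackages():
        for path in leftover_torch_paths(directory):
            print(path)
```

Run it with the environment's own python so that the paths printed are the ones for that environment.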
