Setting up a GPU environment in Google Colab - python-3.x

I'm trying to run my deep learning code in Google Colab. I have installed CUDA 10.0.130 and cuDNN 7.6.4 for TensorFlow 1.14.0, but tf.test.is_gpu_available() still returns False. I don't know what to do now; can somebody give me some instructions? Here is the output of !sudo lsb_release -a and !nvidia-smi

The supported and tested configurations for the GPU versions are listed at the link below.
For CUDA 10.1 and cuDNN 7.6, the supported version is tensorflow_gpu-2.3.0.
Also, for TF 1.x the CPU and GPU builds are separate packages.
So, to use the GPU version of TensorFlow, you should run
!pip install tensorflow_gpu==1.14.0
Ref- https://www.tensorflow.org/install/gpu#older_versions_of_tensorflow
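The pairing described above can be written down as a small lookup. This is only a sketch: `TESTED_CONFIGS`, `major_minor`, and `matches_tested_config` are hypothetical names, not a TensorFlow API. The 2.3.0 row comes from the answer above, while the 1.14.0 row is an assumption taken from the tested-configurations table at the referenced link, so check it against that page before relying on it.

```python
# Sketch: compare installed CUDA/cuDNN versions with TensorFlow's tested pairs.
# All names here are illustrative; none of them is part of TensorFlow.

TESTED_CONFIGS = {
    # package==version: (CUDA, cuDNN)
    "tensorflow_gpu==1.14.0": ("10.0", "7.4"),  # assumption: verify on the linked page
    "tensorflow_gpu==2.3.0": ("10.1", "7.6"),   # stated in the answer above
}


def major_minor(version):
    """Reduce '10.0.130' or '7.6.4' to its 'major.minor' prefix."""
    return ".".join(version.split(".")[:2])


def matches_tested_config(package, cuda, cudnn):
    """Return True if CUDA and cuDNN match the tested pair for the package."""
    tested_cuda, tested_cudnn = TESTED_CONFIGS[package]
    return (major_minor(cuda), major_minor(cudnn)) == (tested_cuda, tested_cudnn)


# The setup from the question: CUDA 10.0.130 with cuDNN 7.6.4 for TensorFlow 1.14.0.
print(matches_tested_config("tensorflow_gpu==1.14.0", "10.0.130", "7.6.4"))
```

A mismatch does not prove that a setup cannot work; it only means the combination is not the tested one.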

Related

Detectron2 Installation based on pytorch and cuda versions

I am fairly new to object detection and I'm trying to install Detectron2 on Colab. My PyTorch version is 1.12 and my CUDA version is 11.2. I looked at https://detectron2.readthedocs.io/en/latest/tutorials/install.html to find the most appropriate installation command, but I can't tell which one to use, since neither my PyTorch version nor my CUDA version is listed there.

UserWarning: CUDA initialization:

I have installed PyTorch 1.8.1+cu102 in a virtual environment on an HPC cluster.
torch.cuda.is_available()
is giving me the below output
UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 10010). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0
False
What could be wrong? I am not sure how I can update the driver. My requirements are:
torch==1.8.1+cu102
torch-cluster==1.5.9
torch-geometric==1.7.0
First, check which CUDA version your PyTorch release requires. You can find the CUDA versions that correspond to each PyTorch release at the link below.
https://pytorch.org/get-started/previous-versions/
After you find the version, check whether it is supported on your GPU device. You can find the list at the link below.
https://developer.nvidia.com/cuda-gpus
If there is no match, you need to change either the PyTorch requirement or your GPU device.
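The number in the warning can be decoded to see the mismatch directly. A minimal sketch, assuming the usual CUDA encoding of 1000 * major + 10 * minor for the reported version, and assuming the wheel suffix (`cu102`) is the major version followed by a single minor digit. `decode_cuda_version` and `parse_wheel_tag` are hypothetical helper names.

```python
def decode_cuda_version(encoded):
    """Decode an integer such as 10010 into (major, minor).

    Assumes the value is encoded as 1000 * major + 10 * minor.
    """
    return encoded // 1000, (encoded % 1000) // 10


def parse_wheel_tag(tag):
    """Parse a local version tag such as 'cu102' into (major, minor).

    Assumes the last digit is the minor version and the rest is the major version.
    """
    digits = tag[len("cu"):]
    return int(digits[:-1]), int(digits[-1])


driver_supports = decode_cuda_version(10010)  # value from the warning in the question
wheel_needs = parse_wheel_tag("cu102")        # from torch==1.8.1+cu102

print("driver supports up to CUDA %d.%d" % driver_supports)
print("wheel was built for CUDA %d.%d" % wheel_needs)
print("driver new enough:", driver_supports >= wheel_needs)
```

Under these assumptions the driver tops out at CUDA 10.1 while the wheel needs 10.2. On a cluster that usually means either asking the administrators for a driver update or installing a build for a CUDA version no newer than 10.1, if one is published for that PyTorch release.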

Can I use a newer version of torch with an older version of CUDA?

I'm setting up my Conda environment with a remote GPU to use Pytorch.
The driver on that GPU is only at version 396.54 (as reported by NVIDIA-SMI), so I can only use CUDA version 9.2.
However, I need a newer version of torch to be able to use some attributes.
I tried
conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=9.2
But this results in
print(torch.version.cuda) >> None
torch.cuda.is_available() >> False
There are two things I would check.
First, you may have unintentionally installed the CPU version of PyTorch, or already had it in your environment, before running the above command. Even if you install the GPU version of PyTorch, torch.cuda.is_available() will return False if the CPU version is already present. I therefore suggest checking this link:
Forum on why Pytorch is CPU version even after installing cudatoolkit version
Although I am pretty sure the first point is your problem, I suggest looking at this second one as well.
To understand how to install previous versions of PyTorch, refer to this link: https://pytorch.org/get-started/previous-versions/
After looking at this, I suggest creating a new conda env and running your conda install command there first.
Sarthak Jain
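The first check in the answer above can be scripted: `torch.version.cuda` is None on a CPU-only build, which matches the output in the question. A minimal sketch; `describe_build` is a hypothetical helper and its messages are only suggestions.

```python
def describe_build(cuda_version, cuda_available):
    """Summarise what torch.version.cuda and torch.cuda.is_available() imply."""
    if cuda_version is None:
        return "CPU-only build: reinstall a CUDA-enabled PyTorch package"
    if not cuda_available:
        return "CUDA build (%s) but no usable GPU: check the driver" % cuda_version
    return "CUDA build (%s) and a GPU is usable" % cuda_version


try:
    import torch  # optional: only needed to inspect the real environment
except ImportError:
    print("torch is not installed in this environment")
else:
    print(describe_build(torch.version.cuda, torch.cuda.is_available()))
```

Running this in a fresh conda env right after the install command shows whether the CPU package was picked up again.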

CUDA RuntimeError when trying to use detectron2 [duplicate]

On a Windows 10 PC with an NVidia GeForce 820M
I installed CUDA 9.2 and cudnn 7.1 successfully,
and then installed PyTorch using the instructions at pytorch.org:
pip install torch==1.4.0+cu92 torchvision==0.5.0+cu92 -f https://download.pytorch.org/whl/torch_stable.html
But I get:
>>> import torch
>>> torch.cuda.is_available()
False
Your graphics card does not support CUDA 9.0.
Since I've seen a lot of questions that refer to issues like this I'm writing a broad answer on how to check if your system is compatible with CUDA, specifically targeted at using PyTorch with CUDA support. Various circumstance-dependent options for resolving issues are described in the last section of this answer.
The system requirements to use PyTorch with CUDA are as follows:
Your graphics card must support the required version of CUDA
Your graphics card driver must support the required version of CUDA
The PyTorch binaries must be built with support for the compute capability of your graphics card
Note: If you install pre-built binaries (using either pip or conda) then you do not need to install the CUDA toolkit or runtime on your system before installing PyTorch with CUDA support. This is because PyTorch, unless compiled from source, is always delivered with a copy of the CUDA library.
1. How to check if your GPU/graphics card supports a particular CUDA version
First, identify the model of your graphics card.
Before moving forward ensure that you've got an NVIDIA graphics card. AMD and Intel graphics cards do not support CUDA.
NVIDIA doesn't do a great job of providing CUDA compatibility information in a single location. The best resource is probably this section on the CUDA Wikipedia page. To determine which versions of CUDA are supported:
Locate your graphics card model in the big table and take note of the compute capability version. For example, the GeForce 820M compute capability is 2.1.
In the bullet list preceding the table, check whether the required CUDA version is supported by the compute capability of your graphics card. For example, CUDA 9.2 is not supported for compute capability 2.1.
If your card doesn't support the required CUDA version then see the options in section 4 of this answer.
Note: Compute capability refers to the computational features supported by your graphics card. Newer versions of the CUDA library rely on newer hardware features, which is why we need to determine the compute capability in order to determine the supported versions of CUDA.
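The lookup described in this section amounts to a range check. The sketch below is illustrative only: the three ranges are transcribed from the Wikipedia list mentioned above and should be re-checked there, and `SUPPORTED_CAPABILITY` and `cuda_supports` are hypothetical names.

```python
# Illustrative only: ranges transcribed from the Wikipedia list mentioned above.
# Re-check them there before relying on this table.
SUPPORTED_CAPABILITY = {
    # CUDA version: (lowest, highest) supported compute capability
    "8.0": ((2, 0), (6, 2)),
    "9.2": ((3, 0), (7, 2)),
    "10.1": ((3, 0), (7, 5)),
}


def cuda_supports(cuda_version, capability):
    """Return True if the capability lies inside the range listed for the CUDA version."""
    lowest, highest = SUPPORTED_CAPABILITY[cuda_version]
    return lowest <= capability <= highest


geforce_820m = (2, 1)  # compute capability from the example above
for version in SUPPORTED_CAPABILITY:
    print("CUDA", version, "supports 2.1:", cuda_supports(version, geforce_820m))
```

Tuples compare element by element, so (2, 1) falls below the (3, 0) lower bound listed for CUDA 9.2, which agrees with the example in this section.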
2. How to check if your GPU/graphics driver supports a particular CUDA version
The graphics driver is the software that allows your operating system to communicate with your graphics card. Since CUDA relies on low-level communication with the graphics card, you need an up-to-date driver in order to use the latest versions of CUDA.
First, make sure you have an NVIDIA graphics driver installed on your system. You can acquire the newest driver for your system from NVIDIA's website.
If you've installed the latest driver version then your graphics driver probably supports every CUDA version compatible with your graphics card (see section 1). To verify, you can check Table 3 in the CUDA release notes. In rare cases I've heard of the latest recommended graphics drivers not supporting the latest CUDA releases. You should be able to get around this by installing the CUDA toolkit for the required CUDA version and selecting the option to install compatible drivers, though this usually isn't required.
If you can't, or don't want to upgrade the graphics driver then you can check to see if your current driver supports the specific CUDA version as follows:
On Windows
Determine your current graphics driver version (Source https://www.nvidia.com/en-gb/drivers/drivers-faq/)
Right-click on your desktop and select NVIDIA Control Panel. From the NVIDIA Control Panel menu, select Help > System Information. The driver version is listed at the top of the Details window. For more advanced users, you can also get the driver version number from the Windows Device Manager. Right-click on your graphics device under display adapters and then select Properties. Select the Driver tab and read the Driver version. The last 5 digits are the NVIDIA driver version number.
Visit the CUDA release notes and scroll down to Table 3. Use this table to verify your graphics driver is new enough to support the required version of CUDA.
On Linux/OS X
Run the following command in a terminal window
nvidia-smi
This should result in something like the following
Sat Apr  4 15:31:57 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21       Driver Version: 435.21       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 206...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   35C    P8    16W / 175W |    502MiB /  7974MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1138      G   /usr/lib/xorg/Xorg                           300MiB |
|    0      2550      G   /usr/bin/compiz                              189MiB |
|    0      5735      G   /usr/lib/firefox/firefox                       5MiB |
|    0      7073      G   /usr/lib/firefox/firefox                       5MiB |
+-----------------------------------------------------------------------------+
Driver Version: ###.## is your graphics driver version. In the example above the driver version is 435.21.
CUDA Version: ##.# is the latest version of CUDA supported by your graphics driver. In the example above the graphics driver supports CUDA 10.1 as well as all compatible CUDA versions before 10.1.
Note: The CUDA Version displayed in this table does not indicate that the CUDA toolkit or runtime are actually installed on your system. This just indicates the latest version of CUDA your graphics driver is compatible with.
To be extra sure that your driver supports the desired CUDA version you can visit Table 3 on the CUDA release notes page.
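For scripting this check, the two fields can be pulled out of the header line with a regular expression. A minimal sketch, assuming the header keeps the `Driver Version: ... CUDA Version: ...` layout shown above; `parse_versions` is a hypothetical helper, not part of any NVIDIA tool.

```python
import re
import shutil
import subprocess

# Matches the header row of the nvidia-smi table shown above.
HEADER = re.compile(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)")


def parse_versions(nvidia_smi_output):
    """Return (driver_version, max_cuda_version) as strings, or None if not found."""
    match = HEADER.search(nvidia_smi_output)
    return match.groups() if match else None


sample = "| NVIDIA-SMI 435.21 Driver Version: 435.21 CUDA Version: 10.1 |"
print(parse_versions(sample))  # ('435.21', '10.1')

# Only query the real tool when it is on PATH.
if shutil.which("nvidia-smi"):
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    print(parse_versions(result.stdout))
```

If the header has no numeric CUDA Version field, which may be the case with older drivers, the helper returns None and the release-notes table is the place to look instead.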
3. How to check if a particular version of PyTorch is compatible with your GPU/graphics card compute capability
Even if your graphics card supports the required version of CUDA, it's possible that the pre-compiled PyTorch binaries were not compiled with support for your compute capability. For example, in PyTorch 0.3.1 support for compute capability <= 5.0 was dropped.
First, verify that your graphics card and driver both support the required CUDA version (see Sections 1 and 2 above). The information in this section assumes that this is the case.
The easiest way to check if PyTorch supports your compute capability is to install the desired version of PyTorch with CUDA support and run the following from a python interpreter
>>> import torch
>>> torch.zeros(1).cuda()
If you get an error message that reads
Found GPU0 XXXXX which is of cuda capability #.#.
PyTorch no longer supports this GPU because it is too old.
then that means PyTorch was not compiled with support for your compute capability. If this runs without issue then you should be good to go.
Update: If you're installing an old version of PyTorch on a system with a newer GPU, it's possible that the old PyTorch release wasn't compiled with support for your compute capability. Assuming your GPU supports the version of CUDA used by PyTorch, you should be able to rebuild PyTorch from source with the desired CUDA version or upgrade to a more recent version of PyTorch that was compiled with support for the newer compute capabilities.
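As a complement to the `torch.zeros(1).cuda()` test, recent PyTorch releases can report which architectures a binary was compiled for. The sketch below assumes `torch.cuda.get_arch_list()` and `torch.cuda.get_device_capability()` are available in the installed release; `capability_to_arch` and `built_for` are hypothetical helpers. The exact-match test is deliberately conservative, since a binary may also run on GPUs that are not listed exactly.

```python
def capability_to_arch(capability):
    """Turn a (major, minor) compute capability into an 'sm_XY' architecture name."""
    major, minor = capability
    return "sm_%d%d" % (major, minor)


def built_for(capability, arch_list):
    """Conservative check: is the exact architecture in the binary's list?"""
    return capability_to_arch(capability) in arch_list


try:
    import torch  # optional: only needed to inspect the real environment
except ImportError:
    torch = None

if torch is not None and torch.cuda.is_available():
    capability = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()
    print(capability, archs, built_for(capability, archs))
```

A False result here is a hint to run the `torch.zeros(1).cuda()` test, not a verdict on its own.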
4. Conclusion
If your graphics card and driver support the required version of CUDA (section 1 and 2) but the PyTorch binaries don't support your compute capability (section 3) then your options are
Compile PyTorch from source with support for your compute capability (see here)
Install PyTorch without CUDA support (CPU-only)
Install an older version of the PyTorch binaries that support your compute capability (not recommended as PyTorch 0.3.1 is very outdated at this point). AFAIK compute capability older than 3.X has never been supported in the pre-built binaries
Upgrade your graphics card
If your graphics card doesn't support the required version of CUDA (section 1) then your options are
Install PyTorch without CUDA support (CPU-only)
Install an older version of PyTorch that supports a CUDA version supported by your graphics card (still may require compiling from source if the binaries don't support your compute capability)
Upgrade your graphics card
To solve this issue, the following method worked for me:
1- First, update Anaconda.
2- Select the install command that matches your system on the following site and run it in your notebook.
https://pytorch.org/
Example for Windows (this may take some time; be patient):
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
3- Find and install the latest graphics card driver for your system from the following site:
https://www.nvidia.com/Download/index.aspx
4- Check the CUDA level supported by your GPU; see this.
The same error can appear when your PyTorch build was compiled for a different CUDA version than the one you have installed. For example, my PyTorch build had CUDA 8.0 support, but I had CUDA 9.0 installed. To fix that I had to upgrade PyTorch to the cu90 build like this:
pip install torch_nightly -f https://download.pytorch.org/whl/nightly/cu90/torch_nightly.html
Reference: here
I also want to share my experience, especially in the WSL2 environment. See my post here.
Although I had installed the correct and latest drivers following the guide provided by NVIDIA here, WSL was not able to detect any GPU, either in PyTorch or anywhere else in the environment.
My GPU is an NVIDIA GeForce GTX 1650 Ti, which is not listed at the Wiki link above but is shown on the NVIDIA page.
Downgrading to an older driver version found at this NVIDIA link, namely Driver Version: 472.39, helped me out. Now PyTorch correctly detects the driver, and I can also run containers that require GPU access, since the GPU is correctly found and used.
I hope this helps someone in my situation.
OK, here's my experience.
My system is Ubuntu 20.04 with an NVIDIA GTX 1060 GPU.
When I opened the 'NVIDIA X Server Settings' application, I found that under PRIME Profiles
NVIDIA On-Demand or Intel (Power Saving Mode) was selected,
which made torch.cuda.is_available() return False.
I changed the GPU mode to 'NVIDIA (Performance Mode)' and then got True.
NVIDIA X Server Setting-GUI
I just faced the same problem with my GPU (latest available driver installed). None of the above helped, and hours of searching on Google brought no luck either. Here's what worked for me:
Delete all environments that were created in Anaconda. Uninstall Anaconda and delete all related folders in the "user" folder
Install Anaconda
Add conda-forge to channels https://conda-forge.org/docs/user/introduction.html
Run through the installation guide from NVIDIA https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html
Choose the proper configuration and run the conda installation https://pytorch.org/get-started/locally/
If the installation fails with the well-known "failed with initial frozen solve. Retrying with flexible solve." message, use pip3 install instead of conda
Enjoy your GPU in Jupyter Notebook:
import torch
torch.cuda.is_available()
True
Step 1.) Check your CUDA and GPU driver versions using
nvidia-smi
This will help you download the correct version of PyTorch for your hardware.
Step 2.) Check whether you have installed the GPU version of PyTorch by using
conda list pytorch
If you get a "cpu_" version of PyTorch, then you need to uninstall PyTorch and reinstall it with the commands below
conda uninstall pytorch
conda install pytorch torchvision cudatoolkit=11.3 -c pytorch -c conda-forge
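Step 2 can be automated by scanning the Build column of the `conda list` output. A rough sketch: `cpu_only_packages` is a hypothetical helper, and the sample build strings are assumptions about how CPU and CUDA packages are labelled, so compare them with what your own `conda list pytorch` prints.

```python
def cpu_only_packages(conda_list_output):
    """Return names of packages whose build string (third column) contains 'cpu'."""
    flagged = []
    for line in conda_list_output.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header comments
        fields = line.split()
        if len(fields) >= 3 and "cpu" in fields[2]:
            flagged.append(fields[0])
    return flagged


# Illustrative output only; real build strings will differ between releases.
cpu_sample = """\
# Name                    Version                   Build  Channel
pytorch                   1.10.0              py3.8_cpu_0    pytorch
"""
gpu_sample = """\
# Name                    Version                   Build  Channel
pytorch                   1.10.0   py3.8_cuda11.3_cudnn8.2.0_0    pytorch
"""

print(cpu_only_packages(cpu_sample))
print(cpu_only_packages(gpu_sample))
```

An empty list suggests a CUDA build is installed; a non-empty one points at the packages to uninstall before reinstalling.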
I had a similar issue with a GPU with MIG mode. I had to disable the MIG mode:
>> nvidia-smi -mig 0
As pointed out by ptrblck, it was enabled by default, but I hadn't created any MIG devices.
You can try to create them (user guide).
In my case, I had CUDA 12.0 with the torch==1.10.0 and torchvision==0.11.1 packages installed. I uninstalled CUDA 12.0 and installed CUDA 11.3.
The torch and torchvision packages must be compatible with the installed CUDA version. Anyone having this problem can check the site below for compatibility.
https://pytorch.org/get-started/previous-versions/

Installing Keras with the TensorFlow GPU version already installed on Windows 10

I have the following environment on my Windows 10 machine
Python : 3.6.0
Anaconda:4.3.1
Tensorflow:1.1.0
Screen Shots
OS:Windows 10-64bit
Now, when I try to install Keras on my system, I get a huge list of errors.
Detailed Error Log
Now I have two questions here.
Can I install Keras on my system when I already have the TensorFlow GPU version, which was really hard to install?
If Keras can be installed with this system configuration, will my TensorFlow GPU version still work properly afterwards?
