I'm using a laptop that has an Intel Corporation HD Graphics 5500 (rev 09) and an AMD Radeon R5 M255 graphics card.
Does anyone know how to set it up for deep learning, specifically fastai/PyTorch?
Update 3:
Since late 2020, the torch-mlir project has come a long way and now supports all major operating systems. Using torch-mlir, you can now use your AMD, NVIDIA, or Intel GPU with the latest version of PyTorch.
You can download the binaries for your OS from here.
Update 2:
Since October 21, 2021, you can use the DirectML version of PyTorch.
DirectML is a high-performance, hardware-accelerated, DirectX 12-based library that provides GPU acceleration for ML tasks. It supports all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.
Update:
For the latest version of PyTorch with DirectML, see torch-directml.
You can install the latest version using pip:
pip install torch-directml
For a detailed explanation of how to set everything up, see Enable PyTorch with DirectML on Windows.
A side note concerning pytorch-directml:
Microsoft has changed the way it releases pytorch-directml. It deprecated the old 1.8 version and now offers the new torch-directml (as opposed to the previously named pytorch-directml).
It is now installed as a plugin for the current version of PyTorch and works alongside it.
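As a quick smoke test (a minimal sketch, assuming the torch-directml API as documented by Microsoft), you can allocate a tensor on the DirectML device:

import torch
import torch_directml

dml = torch_directml.device()        # first DirectX 12-capable adapter
t = torch.ones(3, device=dml) * 2    # the multiply runs on the DirectML device
print(t.cpu())                       # tensor([2., 2., 2.])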
Old version:
The initial release of pytorch-directml (Oct 21, 2021):
Microsoft released pytorch-directml a few hours ago.
You can now install it (on Windows or in WSL) using the PyPI package:
pytorch-directml 1.8.0a0.dev211021
pip install pytorch-directml
So if you are on Windows or using WSL, you can hop in and give this a try!
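If I remember correctly, the old fork exposed the GPU as a "dml" device string, so a quick check looked roughly like this (a hedged sketch for the now-deprecated 1.8 package):

import torch

t = torch.ones(3).to("dml")   # "dml" device string used by the old pytorch-directml fork
print(t * 2)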
Update:
As of PyTorch 1.8 (March 04, 2021), AMD ROCm builds are available from PyTorch's official website. You can now easily install them on Linux, the same way you used to install the CUDA/CPU versions.
Currently, only pip packages are provided, and the Mac and Windows platforms are still not supported (I haven't tested with WSL2, though!).
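One thing worth knowing: the ROCm builds expose the GPU through the regular cuda API, so a quick sanity check (a minimal sketch) is:

import torch

print(torch.cuda.is_available())  # True on a working ROCm install; HIP is exposed via the cuda API
print(torch.version.hip)          # set on ROCm builds, None on CUDA builds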
Old answer:
You need to install the ROCm version. The official AMD instructions for building PyTorch are here.
There was previously a wheel package for ROCm, but it seems AMD doesn't distribute it anymore; instead, you need to build PyTorch from source, as the linked guide explains.
However, you may consult this page to build the latest PyTorch version: the unofficial page of ROCm/PyTorch.
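If I recall the guide correctly, the build boils down to "hipifying" the CUDA sources and then running the usual setup. Treat this as a rough sketch and follow the linked instructions for the exact ROCm prerequisites:

python tools/amd_build/build_amd.py
USE_ROCM=1 python setup.py install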
Update: In March 2021, PyTorch added support for AMD GPUs; you can just install and configure it like any other CUDA-based GPU. Here is the link.
I don't know about PyTorch, but even though Keras is now integrated with TF, you can use Keras on an AMD GPU with the PlaidML library (link), made by Intel. It's pretty cool and easy to set up, and it's handy to be able to switch Keras backends for different projects.
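Setup is a couple of lines; per the PlaidML README, you install it as the Keras backend before Keras is imported (a minimal sketch):

import plaidml.keras
plaidml.keras.install_backend()  # must run before importing keras
import keras                     # Keras now dispatches to PlaidML (and your AMD GPU)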
Related
I have multiple CUDA versions installed and I need all of them, so I can't uninstall any. The problem is that when I try to encode with the nvenc_h264 encoder, it doesn't work because it says there are multiple versions of CUDA.
I’m trying to choose the CUDA version, but I don’t find any parameter in ffmpeg documentation to do so.
Does anyone know how to choose the CUDA version?
I’m working with Linux 22 and the latest ffmpeg version with the NVIDIA libraries.
I've been trying to get CUDA to work with TensorFlow for a while now because the neural nets I've been building are now taking hours to train on my CPU, and it'd be great to get that big speed boost. However, whenever I try to use it with TensorFlow (it works with PyTorch, but I want to learn multiple APIs), it tells me that one of the .dll files needed to run CUDA doesn't exist, when it actually does.
I've downloaded and replaced that .dll with other versions from dll-files.com. I've tried uninstalling and reinstalling TensorFlow, CUDA, and cuDNN. I've tried different versions of CUDA, but that only caused all the .dll files to not be found (and yes, I did change the CUDA_PATH value). I've tried switching the PATH between C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0 and C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin to see if that changed anything.
If anyone could help with this, that would be much appreciated.
[Screenshots: the errors raised when running tf.test.is_gpu_available(), and the .dll file present on disk]
Try installing a different, older version of the CUDA toolkit on top of the version you already have. This fixed it for me, though I also had to copy the DLLs from the latest cuDNN release into the older CUDA toolkit's install directory.
Have you checked whether your TF version is compatible with your CUDA version?
Check the compatibility matrix here: https://www.tensorflow.org/install/source#tested_build_configurations
Unless you compile TF from source, CUDA 11 is not supported yet.
In any case, I would avoid downloading DLLs from the website you mentioned.
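To see which CUDA/cuDNN versions your TF build actually expects, you can query the build info (available in TF 2.3+):

import tensorflow as tf

print(tf.__version__)
info = tf.sysconfig.get_build_info()                # TF >= 2.3; keys present on GPU builds
print(info["cuda_version"], info["cudnn_version"])  # compare against nvcc --version and your cuDNN install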
This is really frustrating.
I want to install the latest version of Python (at the time of this issue: Python 3.8.1) on RHEL 8, (RHEL being one of the most widely used distributions of Linux).
I would like to type:
#dnf install python
and have it install the latest version of Python.
I can't do this, and I do not know why.
When I go to python.org and click on 'install for Linux' I get a link to the source code.
There are no instructions there as to what to do with the source code.
I do not understand why this is.
I don't want the source code, I want to install python 3.8.1 executables for my platform (RHEL 8).
I search on how to install python 3.8.1 from source and get a long list of dependencies that I have to install and a long list of steps.
Is this because it is a very rare thing for companies to run Python on Linux?
Can we get together here and make it easy for folks to install Python on Linux?
I'm willing to pay money out of my daily earnings to set up a RHEL 8 repo that carries Python 3.8, if IBM/Red Hat is not willing to do this.
Why does the official Python organization hate Linux?
Why does IBM / Redhat hate Python?
Can we bring the two together in peace and harmony so that they just get along?
This is very frustrating. I should be able to knock this task out in a few seconds, yet it has turned into hours.
Developers all over the world who want to install/run the latest version of Python on Linux (CentOS/RHEL) probably lose the same hours every day.
Python 3.8 is currently available as an Application Stream with the RHEL 8.2 beta. Since we support every new version of Python that we release for at least 3 years, we need to make sure it's stable before bringing it to RHEL and the many hardware architectures RHEL runs on. This also matters because customers expect technologies to be production grade. This table shows that, over the years, we have officially supported more than one Python version simultaneously. You can download the RHEL 8.2 beta from here. RHEL 8 was released with 2 versions of Python (2.7 and 3.6) because Python is an important technology for us. We've used it ourselves for many years in building RHEL components and, like others in this industry, we had to migrate it from 2.7 to 3.x.
FYI: new versions of Python and other components are released as Software Collections on RHEL 7, and as Application Streams on RHEL 8. The benefit of these is that the version of Python that's installed will have exactly the same packages and components on every system it's installed on. This simplifies things a lot (as you point out, it's complicated) and minimizes the "it works on my machine" issue.
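Concretely, on RHEL 8.2+ the Application Stream makes this a short dnf session (package name as documented for the python38 module):

dnf module list python38   # confirm the Application Stream is available
dnf install python38
python3.8 --version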
I've previously inferenced TensorFlow graphs from C++. Now I'm embarking on working out how to inference PyTorch graphs via C++.
My first question is, how can I know the recommended version of cuDNN to use with LibTorch, or if I'm doing my own PyTorch compile?
Determining the recommended CUDA version is easy. Upon going to https://pytorch.org/ and choosing the options under Quick Start Locally (PyTorch Build, Your OS, etc.) the site makes it pretty clear that CUDA 10.1 is recommended, but there is no mention of cuDNN version and upon Googling I'm unable to find a definitive answer for this.
From what I understand about PyTorch on Ubuntu, if you use the Python version you have to install the CUDA driver (e.g. so nvidia-smi works; version 440 currently), but a separate CUDA and cuDNN install is not actually required beyond the driver, because they are included in the pip3 package. Is this correct? If so, is there a command I can run in a Python script that shows the version of CUDA (expected to be 10.1) and cuDNN that the pre-compiled pip .whl uses? I suspect there is such a command, but I'm not familiar enough with PyTorch yet to know what it may be or how to look it up.
I've run into compile and inferencing errors using C++ with TensorFlow when I was not using the specific recommended version of cuDNN for a certain version of TensorFlow and CUDA, so I'm aware these versions can be sensitive and I have to make the right choices from the get-go. If anybody can assist in determining the recommended version of cuDNN for a certain version of PyTorch, that would be great.
CUDA is supported via the graphics card driver; AFAIK there's no separate "CUDA driver". The system graphics driver pretty much just needs to be new enough to support the CUDA/cuDNN versions for the selected PyTorch version. To the best of my knowledge, backwards compatibility is included in most drivers; for example, a driver that supports CUDA 10.1 (reported via nvidia-smi) will likely also support CUDA 8, 9, and 10.0.
If you installed with pip or conda, then a version of CUDA and cuDNN is included with the install. You can query the actual versions being used in Python with torch.version.cuda and torch.backends.cudnn.version().
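For example (a quick sanity check; the exact numbers depend on the wheel you installed):

import torch

print(torch.version.cuda)               # e.g. '10.1' - CUDA the wheel was built against
print(torch.backends.cudnn.version())   # e.g. 7603, i.e. cuDNN 7.6.3
print(torch.cuda.is_available())        # True if the system driver is new enough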
I'd like to begin learning CUDA, but I'm confused about the versions. The latest release of CUDA is 3 and I have the CUDA 3.0.1 driver on my system, but in theory my graphics card only supports 1.0. Can I use the features of the later versions, or do I need to stick to 1.0?
The latest public release is 3.2, but 4.0 is out in beta for registered developers. Compatibility is based on the features the hardware supports. You can use the latest version of the SDK, but you will need to compile for the feature set supported by your card and not attempt to use SDK features it does not support. You do this by setting the arch flag, as shown below.
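For example, to target a compute capability 1.0 card (source file name hypothetical):

nvcc -arch=sm_10 my_kernel.cu -o my_kernel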
There's some explanation here:
Fermi Compatibility Guide - NVIDIA
CUDA toolkit versions (3.1, 3.2, 4.0) are different from the graphics card's compute capability (1.0/1.1 for older GeForce cards, 1.2 for many mobile cards, 1.3 for slightly older GeForce cards, 2.0+ for the latest Fermi architecture). All the toolkits work with all CUDA-capable graphics cards. Although the complete functionality may not be available, you can still write functional CUDA code.