Set PyTorch to run on AMD GPU

According to the official docs, PyTorch now supports AMD GPUs, and the ROCm 4.2 build can be installed through pip. But I cannot find, on Google or in the official docs, how to force my DL training to use the GPU. What is the AMD equivalent of the following command?
torch.device('cuda' if torch.cuda.is_available() else 'cpu')

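ROCm builds of PyTorch expose AMD GPUs through the same torch.cuda interface (CUDA calls are translated to HIP), so torch.cuda.is_available() should work unchanged and the device is still named 'cuda'.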
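A minimal sketch of the point above, assuming a ROCm build of PyTorch is installed: the usual device-selection line stays the same. The helper name describe_backend is hypothetical, and torch.version.hip is read defensively on the assumption that it is only populated on ROCm builds.

```python
import importlib.util


def describe_backend(cuda_available, hip_version):
    """Map PyTorch's reported state to (device name, readable backend label).

    hip_version is expected to be None on CUDA or CPU-only builds.
    """
    if not cuda_available:
        return "cpu", "CPU only"
    if hip_version is not None:
        # ROCm builds still use the "cuda" device name.
        return "cuda", f"AMD ROCm (HIP {hip_version})"
    return "cuda", "NVIDIA CUDA"


# Only touch torch if it is installed, so the sketch runs anywhere.
if importlib.util.find_spec("torch") is not None:
    import torch

    name, backend = describe_backend(
        torch.cuda.is_available(), getattr(torch.version, "hip", None)
    )
    device = torch.device(name)
    print(f"Using device {device} ({backend})")
else:
    print("PyTorch is not installed; nothing to check.")
```

Models and tensors are then moved with .to(device) exactly as on an NVIDIA system.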

Related

NVIDIA GeForce GT 540M not appearing in physical device list

I have an ASUS laptop with a built-in NVIDIA GeForce GT 540M. I have installed the CUDA toolkit, the cuDNN package, and other utilities such as Visual C++. But when I run tf.config.experimental.list_physical_devices(), only the CPU is listed; no GPU appears. My question is: can I use this GPU for machine learning or not?
https://developer.nvidia.com/cuda-gpus
Under "CUDA-Enabled GeForce and TITAN Products" you can see that your GPU is CUDA-enabled. It has a compute capability of 2.1.
Every CUDA version requires a minimum compute capability.
https://tech.amikelive.com/node-930/cuda-compatibility-of-nvidia-display-gpu-drivers/
As you can see, most of the latest versions of CUDA require compute capability 3 or higher.
If you'd like to try installing an older version of CUDA, check the following link for a list of tested OS, TensorFlow, CUDA, and cuDNN combinations.
Which TensorFlow and CUDA version combinations are compatible?
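The compatibility check described in this answer can be sketched as a simple lookup. The table below is illustrative only: the minimum compute capability listed for each CUDA major version is an assumption that should be verified against NVIDIA's documentation, and the function name supported_cuda_versions is hypothetical.

```python
# Illustrative minimum compute capability per CUDA major version.
# These values are assumptions; check them against NVIDIA's documentation.
MIN_CAPABILITY_BY_CUDA = {
    8: (2, 0),
    9: (3, 0),
    10: (3, 0),
    11: (3, 5),
    12: (5, 0),
}


def supported_cuda_versions(capability):
    """Return the CUDA major versions whose minimum this capability meets.

    Only the lower bound is modelled; very new GPUs also need a
    sufficiently recent toolkit, which this table does not capture.
    """
    return [
        version
        for version, minimum in sorted(MIN_CAPABILITY_BY_CUDA.items())
        if capability >= minimum
    ]


gt_540m = (2, 1)  # compute capability quoted in the answer above
print(supported_cuda_versions(gt_540m))
```

Under these assumed values, a capability of 2.1 only satisfies the oldest entry in the table, which matches the answer's suggestion to look at older CUDA versions.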

How to build TensorFlow for my AMD Linux system?

I recently installed TensorFlow on my system. When fitting the model, I get this message in the Jupyter Notebook terminal:
This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
I read that I am supposed to build TensorFlow for my system, but I do not know how. Which settings do I need? Which compiler flags do I need? Here is my setup:
GPU: AMD RX 5700XT
CPU: AMD Ryzen 9 3900X
RAM: 64GB DDR4
OS: Ubuntu 20.04
Can someone give me advice on how to build TensorFlow for my system?
Most deep learning and machine learning frameworks and libraries use NVIDIA CUDA for GPU processing, so an NVIDIA graphics card is the safest choice.
Note: While AMD has some excellent graphics card models, their compatibility and support for ML tasks are still experimental, so we will need to stick to NVIDIA.
AMD does provide a ROCm-enabled TensorFlow library for AMD GPUs, based on the ROCm software stack. To learn more about this library, see: MIOpen - ROCm.
Based on these factors, the following graphics card families can be recommended:
GeForce 10 series
GeForce 16 series
GeForce 20 series
GeForce 30 series
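The message quoted in the question is logged at the informational level rather than raised as an error, and training proceeds without a rebuild. One way to hide it, sketched here on the assumption that filtering TensorFlow's C++ log output is acceptable, is to set the TF_CPP_MIN_LOG_LEVEL environment variable before TensorFlow is imported. The GPU listing at the end is only a convenience check.

```python
import importlib.util
import os

# Must be set before TensorFlow is imported: "1" filters out INFO messages,
# while warnings and errors are still shown.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "1"

# Only import TensorFlow if it is installed, so the sketch runs anywhere.
if importlib.util.find_spec("tensorflow") is not None:
    import tensorflow as tf

    # An empty list means TensorFlow will run on the CPU.
    print(tf.config.list_physical_devices("GPU"))
else:
    print("TensorFlow is not installed; nothing to check.")
```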

Which PyTorch version is CUDA 3.0 compatible?

I have an NVIDIA GeForce GTX 770, which has CUDA compute capability 3.0, but when I run PyTorch training on the GPU, I get the warning
Found GPU0 GeForce GTX 770 which is of cuda capability 3.0.
PyTorch no longer supports this GPU because it is too old.
The minimum cuda capability that we support is 3.5.
and subsequently the error RuntimeError: CUDA error: no kernel image is available for execution on the device.
Was there an older PyTorch version that supported graphics cards like mine with CUDA capability 3.0? If yes, which version, and where can I find this information? Is there a table somewhere that lists the supported CUDA versions and compute capabilities?
If it is relevant, I have CUDA 10.1 installed.
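A minimal sketch for checking a card against the minimum quoted in the warning above. The 3.5 threshold is taken from that warning and may differ between PyTorch releases; the helper name capability_report is hypothetical, and the source does not say which older release accepted capability 3.0.

```python
import importlib.util

MIN_SUPPORTED = (3, 5)  # minimum quoted in the warning above


def capability_report(name, capability, minimum=MIN_SUPPORTED):
    """Describe whether a GPU's compute capability meets the minimum."""
    major, minor = capability
    status = "meets" if tuple(capability) >= minimum else "is below"
    return (
        f"{name}: capability {major}.{minor} {status} "
        f"the minimum {minimum[0]}.{minimum[1]}"
    )


print(capability_report("GeForce GTX 770", (3, 0)))

# Query the real device only if PyTorch is installed and sees a GPU.
if importlib.util.find_spec("torch") is not None:
    import torch

    if torch.cuda.is_available():
        print(
            capability_report(
                torch.cuda.get_device_name(0),
                torch.cuda.get_device_capability(0),
            )
        )
```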

Different progress output on Google Colab vs PyCharm

I am running the same CNN model on the same dataset (with 50000 training examples) with exactly the same parameters on both Google Colab (I think it has a K80 GPU) and my own system (with a GTX 1080 GPU and an 8700K CPU). I am using batch_size=32 on both, but I am surprised to see that the training progress shown by Google Colab differs from the one shown on my own system (using PyCharm).
I can understand that the difference in accuracies may be due to different random initializations, but why does Google Colab show the progress of each epoch in terms of the number of batches, 1563/1563, while my machine shows it in terms of the number of examples in the training set, i.e. 50000/50000?
In both cases I am using tf.keras.
Does it have anything to do with the version? On my (Windows) machine, the tensorflow-gpu version is 2.1.0, whereas on the Google Colab (probably Linux) machine it is 2.2.0. I cannot upgrade the version on my Windows machine from 2.1.0 to 2.2.0; probably they are the same.
Please correct me if I am wrong.
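The two counters quoted above are consistent with each other: 1563 is the number of batches needed to cover 50000 examples at batch_size=32. A minimal sketch of that arithmetic follows. Whether the display counts batches or samples is assumed here to depend on the tf.keras version, as the question suspects; the source does not confirm it.

```python
import math

num_examples = 50000
batch_size = 32

# The last batch is smaller than the others, so round up.
steps_per_epoch = math.ceil(num_examples / batch_size)
last_batch_size = num_examples - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch)  # 1563
print(last_batch_size)  # 16
```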

Setting up Keras and TensorFlow to operate with AMD GPU

I am trying to set up Keras in order to run models using my GPU. I have a Radeon RX580 and am running Windows 10.
I realized that CUDA only supports NVIDIA GPUs and was having difficulty finding a way to get my code to run on the GPU. I tried downloading and setting up PlaidML, but afterwards
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
only printed that I was running on a CPU and that no GPU was available, even though the PlaidML setup was a success. I have read that PyOpenCL is needed but have not gotten a clear answer as to why or in what capacity. Does anyone know how to set up this AMD GPU to work properly? Any help would be much appreciated. Thank you!
To the best of my knowledge, PlaidML was not working because I did not have the required prerequisites such as OpenCL. Once I downloaded the Visual Studio C++ build tools, I was able to install PyOpenCL from a .whl file, which seemed to resolve the issue.
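One possible reason the device listing in the question showed only a CPU is that device_lib.list_local_devices() reports TensorFlow's own devices, whereas PlaidML acts as a separate Keras backend. A minimal sketch of selecting that backend, assuming the plaidml-keras package is installed and plaidml-setup has already been run; the helper name use_plaidml_backend is hypothetical.

```python
import importlib.util
import os


def use_plaidml_backend():
    """Point Keras at PlaidML; must run before keras is imported."""
    os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
    return os.environ["KERAS_BACKEND"]


backend = use_plaidml_backend()
print(f"KERAS_BACKEND={backend}")

# Import standalone Keras (not tensorflow.keras) only if both packages exist.
if (
    importlib.util.find_spec("plaidml") is not None
    and importlib.util.find_spec("keras") is not None
):
    import keras  # noqa: F401  (models built from here on use PlaidML)

    print("Keras imported with the PlaidML backend selected.")
else:
    print("PlaidML or Keras is not installed; backend variable set only.")
```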
