How to enable AMD Radeon graphics to train deep learning models? - keras

We can train deep learning models on a GPU. I know how to enable NVIDIA graphics cards for training; I just want to ask how we can use an AMD Radeon graphics card to train a deep learning model in a Jupyter Notebook...

Usually, we run deep learning models on Nvidia graphics cards because of the support for cuDNN and CUDA.
As far as I know, ROCm, AMD's GPU compute platform, now supports TensorFlow, Caffe, MXNet, etc.
You can try this platform (ROCm), but experiments run by others suggest that training speed and model performance are not as good as on an Nvidia GPU.
As an alternative, you can use Google Colab.
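As a quick sanity check (my addition, not part of the original answer), the sketch below assumes a TensorFlow 2.x ROCm build (the tensorflow-rocm package) is installed and simply verifies that the Radeon GPU is visible to TensorFlow:

import tensorflow as tf

# List the GPUs TensorFlow can see; with tensorflow-rocm installed,
# the Radeon card should show up here just like a CUDA device would.
gpus = tf.config.list_physical_devices('GPU')
print(gpus)

if gpus:
    print("Training can run on:", gpus[0].name)
else:
    print("No GPU detected - check the ROCm / tensorflow-rocm installation.")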

Related

TensorRT Nvidia - Mish activation not supported

Has anyone tried quantizing a model with Nvidia TensorRT? Some layers, such as the Mish/Softplus activations, are not supported. Does anyone have a solution for how to use those layers with quantized models?

Why use Caffe2 or Core-ML instead of LibTorch(.pt file) on iOS?

It seems like there are several ways to run PyTorch models on iOS.
PyTorch(.pt) -> onnx -> caffe2
PyTorch(.pt) -> onnx -> Core-ML (.mlmodel)
PyTorch(.pt) -> LibTorch (.pt)
PyTorch Mobile?
What is the difference between the above methods?
Why do people use Caffe2 or Core ML (.mlmodel), which require model format conversion, instead of LibTorch?
Core ML can use the Apple Neural Engine (ANE), which is much faster than running the model on the CPU or GPU. If a device has no ANE, Core ML can automatically fall back to the GPU or CPU.
I haven't really looked into PyTorch Mobile in detail, but I think it currently only runs on the CPU, not on the GPU. And it definitely won't run on the ANE because only Core ML can do that.
Converting models can be a hassle, especially from PyTorch which requires going through ONNX first. But you do end up with a much faster way to run those models.
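To make the conversion paths above concrete, here is a minimal sketch (my illustration, not from the answer), using a hypothetical torchvision model and a 1x3x224x224 example input; the resulting .onnx file would then go to an ONNX-to-Core-ML or ONNX-to-Caffe2 converter offline, while the traced .pt file is what LibTorch / PyTorch Mobile loads directly:

import torch
import torchvision

# Hypothetical model and example input, used only for tracing/export.
model = torchvision.models.mobilenet_v2(pretrained=True).eval()
example = torch.rand(1, 3, 224, 224)

# Path 1: PyTorch -> ONNX (then convert the .onnx file to Caffe2 or Core ML offline).
torch.onnx.export(model, example, "model.onnx", opset_version=11)

# Path 2: PyTorch -> TorchScript, loadable directly by LibTorch / PyTorch Mobile.
traced = torch.jit.trace(model, example)
traced.save("model.pt")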

Object detection slow and does not use GPU

I need to use the TensorFlow Object Detection API to do some classification connected with recognition.
My problem is that using the API for detection with a pretrained COCO model takes too much time and clearly does not use the GPU. I checked my tensorflow-gpu installation with different scripts and it works fine, but when I use this model for detection I only see an increase in CPU usage.
I checked different versions of TensorFlow (1.12, 1.14) and different combinations of CUDA Toolkit (9.0, 10.0) and cuDNN (7.4.2, 7.5.1, 7.6.1), but it is all the same; I also tried it on both Windows 7 and Ubuntu 16.04, with no difference. My project, however, requires much faster detection times.
System information:
System: Windows 7, Ubuntu 16.04
Tensorflow: 1.12, 1.14
GPU: GTX 970
Run the following Python code; if it detects the GPU then you can use the GPU for training, otherwise there is some problem:
from tensorflow.python.client import device_lib
# Prints every device TensorFlow can use; a working setup lists a device of type 'GPU'.
print(device_lib.list_local_devices())
One more thing: just because your CPU is being utilized does not mean the GPU is not at work. The CPU will always be busy, but GPU usage should also spike when you are training.
Paste the output of the above code in the comments if you are not sure about it.
Edit: After chatting with the OP in the comments, I see that the suggested code uses a pretrained model, so no training is happening here. You are using an existing model rather than training a new one, so no GPU is being used.
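As an additional check (my addition, not from the answer), TensorFlow 1.x, matching the versions in the question, can log where each op is placed, which shows directly whether the graph actually runs on the GPU:

import tensorflow as tf

# With log_device_placement=True, TF 1.x prints the device assigned to each op,
# so you can see whether ops land on /device:GPU:0 or stay on the CPU.
config = tf.ConfigProto(log_device_placement=True)
with tf.Session(config=config) as sess:
    a = tf.constant([1.0, 2.0], name='a')
    b = tf.constant([3.0, 4.0], name='b')
    print(sess.run(a + b))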

Is it possible to set GPU affinity for a mixed precision NN, with FP32 and FP16 going to different GPUs?

I have a GTX 1080 and an RTX 2080. I want to train using both, but since the RTX can handle FP16 twice as fast, I'd like to set it up so that the training is multi-GPU and the RTX handles the FP16 layers and the GTX handles the FP32 layers.
Is this possible under tensorflow, pytorch, or keras?
Tensorflow
In TF, it is possible to specify for each layer the device it should be executed on (GPU, CPU, or a specific GPU if you have multiple GPUs). This is done with the with tf.device('device_name') statement (you need to provide a meaningful device_name, e.g. '/gpu:0'). See the Using multiple GPUs section of the TF documentation.
Keras
Since this is possible in TF, you can also use it in Keras if you use TF as the Keras backend (Keras is just a high-level neural networks API).
Note that Keras also has a multi_gpu_model() function, but that only replicates the whole model across multiple GPUs; you cannot specify which layer to place on a specific GPU.
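As an illustration (my sketch, not from the answer, using TF 1.x-style layers), per-layer placement with tf.device could look like the following, assuming '/gpu:0' is the GTX 1080 for the FP32 layers and '/gpu:1' is the RTX 2080 for the FP16 layers:

import tensorflow as tf

# Hypothetical two-part graph: FP32 layers pinned to /gpu:0 (GTX 1080),
# FP16 layers pinned to /gpu:1 (RTX 2080). Device indices are assumptions.
x = tf.placeholder(tf.float32, shape=[None, 128])

with tf.device('/gpu:0'):
    h = tf.layers.dense(x, 256, activation=tf.nn.relu)  # FP32 layer

with tf.device('/gpu:1'):
    h16 = tf.cast(h, tf.float16)  # cast activations to FP16
    h16 = tf.layers.dense(h16, 256, activation=tf.nn.relu)  # FP16 layer

with tf.device('/gpu:0'):
    logits = tf.layers.dense(tf.cast(h16, tf.float32), 10)  # back to FP32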

Train object detection models based on CNN on CPU only

We have object detection models based on CNNs such as Fast R-CNN, Faster R-CNN, YOLO (You Only Look Once), and SSD (Single Shot Detector).
I have tried running Faster R-CNN using Caffe, but the backward pass is not implemented for CPU mode. Is there any CNN-based detection model that I can train using only the CPU?
Any help will be appreciated.
Faster R-CNN layers on CPU: https://github.com/neuleaf/faster-rcnn-cpu
SSD's original implementation already supports CPU training.
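As a small illustration (my addition, not from the answer), forcing CPU-only execution usually comes down to a single switch; the sketch below shows it for pycaffe and, for TensorFlow-based detectors, hiding the CUDA devices via an environment variable:

import os

# For TensorFlow-based detectors: hide all CUDA devices before importing TF,
# so every op falls back to the CPU.
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

# For Caffe-based detectors (e.g. Faster R-CNN or the original SSD code):
import caffe
caffe.set_mode_cpu()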
