TensorRT Nvidia - Mish activation not supported

Has anyone tried quantizing a model with Nvidia TensorRT? Some layers, such as the Mish/Softplus activation functions, are not supported. Does anyone have a solution for how to use those with quantized models?
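One possible workaround (a sketch under the assumption that the model goes through ONNX before TensorRT, not an official TensorRT feature) is to decompose Mish into elementwise ops the parser does accept (Exp, Log, Tanh, Mul), so no Mish/Softplus node ends up in the exported graph:

```python
import torch
import torch.nn as nn

# Hypothetical drop-in replacement: Mish(x) = x * tanh(softplus(x)), written out
# as x * tanh(log(1 + exp(x))) so the exported ONNX graph contains only basic
# elementwise ops instead of a Mish/Softplus node.
class DecomposedMish(nn.Module):
    def forward(self, x):
        # clamp before exp to avoid overflow; for large x, Mish(x) ~ x anyway
        return x * torch.tanh(torch.log1p(torch.exp(torch.clamp(x, max=20.0))))
```

You would swap this module in for every Mish activation before export/quantization; whether the quantized accuracy holds up still has to be checked per model.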

Related

Hybrid quantization of ONNX model possible?

I would like to quantize an ONNX model to uint8. The target hardware the model is meant to run on supports uint8 and int16 operations (for a few layers, float32 could probably also work).
From the onnxruntime documentation (documentation on onnxruntime quantization, example code for quantizing an ONNX model) I could not find any information regarding support for hybrid quantization.
Is it possible to apply hybrid quantization, where sensitive layers get a higher resolution? Or is it just 'all or nothing'?
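I'm not aware of a dedicated "hybrid quantization" mode either, but onnxruntime's quantization API does let you keep selected nodes in float while quantizing the rest, which gets you part of the way there. A minimal sketch (the node names are placeholders, and the exact keyword arguments may differ between onnxruntime versions):

```python
from onnxruntime.quantization import QuantType, quantize_dynamic

# Quantize most weights to uint8, but leave the listed nodes (placeholders for
# your accuracy-sensitive layers) at their original float precision.
quantize_dynamic(
    model_input="model.onnx",
    model_output="model_uint8.onnx",
    weight_type=QuantType.QUInt8,
    nodes_to_exclude=["Conv_sensitive_0", "Gemm_final"],
)
```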

How to enable AMD Radeon graphics to train deep learning models?

We can train deep learning models on a GPU. I know how to enable NVIDIA graphics to train a deep learning model; I just want to ask how we can use AMD Radeon graphics to train a deep learning model in a Jupyter Notebook...
Usually, we run deep learning models on Nvidia graphics cards thanks to the support of cuDNN and CUDA.
As far as I know, ROCm on AMD graphics cards now supports TensorFlow, Caffe, MXNet, etc.
You can try this platform (ROCm), but experiments run by others suggest that training speed and model performance are not as good as on an Nvidia GPU.
As an alternative, you can use Google Colab.
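If you do go the ROCm route, the setup check is the same as with CUDA. A rough sketch, assuming a ROCm-supported card and the tensorflow-rocm package (support varies by GPU and driver version):

```python
# pip install tensorflow-rocm   # assumption: your GPU and ROCm versions are supported
import tensorflow as tf

# With ROCm set up correctly, the AMD GPU shows up here just like a CUDA device
# would, and the training code itself needs no changes.
print(tf.config.list_physical_devices('GPU'))
```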

How can I use pytorch pre-trained model without installing pytorch?

I only want to use a pre-trained model from PyTorch without installing the whole package.
Can I just copy the model module from pytorch?
I'm afraid you cannot do that: in order to run the model, you need not only the trained weights (the '.pth.tar' file) but also the "structure" of the net, that is, the layers, how they are connected to each other, etc. This network structure is coded in Python and requires PyTorch to be installed.
A way of using PyTorch models without installing PyTorch is to export the model to ONNX format. Once the model is in ONNX format, it can be imported into ONNX Runtime and used for inference. This tutorial should help you out: PyTorch ONNX.
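As a minimal sketch (using torchvision's resnet18 purely as an example): export once on a machine that has PyTorch, then run inference anywhere with only onnxruntime and numpy installed.

```python
# --- Export step: run once on a machine where PyTorch IS installed ---
import torch
import torchvision

model = torchvision.models.resnet18(pretrained=True).eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "resnet18.onnx",
                  input_names=["input"], output_names=["output"])

# --- Inference step: only onnxruntime and numpy are needed, no PyTorch ---
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("resnet18.onnx")
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
logits = sess.run(["output"], {"input": x})[0]
print(logits.shape)  # (1, 1000)
```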

Is it possible to set GPU affinity for a mixed precision NN, with FP32 and FP16 going to different GPUs?

I have a GTX 1080 and an RTX 2080. I want to train using both, but since the RTX can handle FP16 twice as fast, I'd like to set it up so that the training is multi-GPU and the RTX handles the FP16 layers and the GTX handles the FP32 layers.
Is this possible under TensorFlow, PyTorch, or Keras?
TensorFlow
In TF, it is possible to specify, for each operation, the device on which it should be executed (CPU, GPU, or a specific GPU if you have multiple GPUs). This is done with a with tf.device('device_name') statement (you need to provide a meaningful device_name); see the sketch below and the Using multiple GPUs section of the docs.
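For example, a minimal sketch (TF 2.x; the '/GPU:0' and '/GPU:1' indices are assumptions, check tf.config.list_physical_devices('GPU') for your actual ordering):

```python
import tensorflow as tf

with tf.device('/GPU:0'):        # e.g. the GTX 1080: keep FP32 work here
    a = tf.random.normal([1024, 1024], dtype=tf.float32)
    b = tf.random.normal([1024, 1024], dtype=tf.float32)
    c32 = tf.matmul(a, b)        # FP32 matmul placed on GPU:0

with tf.device('/GPU:1'):        # e.g. the RTX 2080: do FP16 work here
    a16 = tf.cast(a, tf.float16) # tensors are copied across devices as needed
    b16 = tf.cast(b, tf.float16)
    c16 = tf.matmul(a16, b16)    # FP16 matmul placed on GPU:1
```

Keep in mind that splitting layers across GPUs this way adds transfer overhead between the cards, which can easily eat the FP16 speedup.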
Keras
Since this is possible in TF, you can also use it in Keras, if you use TF as the Keras backend (Keras is just a high-level neural networks API).
Note that Keras also has a multi_gpu_model() function, but that only replicates a whole model across multiple GPUs; you cannot specify which layer to put on a specific GPU.

What is the cuDNN implementation of RNN cells in TensorFlow?

To create RNN cells, there are classes like GRUCell and LSTMCell which can be used later to create RNN layers.
There are also two other classes, CudnnGRU and CudnnLSTM, which can be used directly to create RNN layers.
In the documentation it says that the latter classes have a cuDNN implementation. Why should I use (or not use) these cuDNN-implemented classes over the classical RNN implementations when creating an RNN model?
In short: CudnnGRU and CudnnLSTM must be run on a GPU, while the normal RNN implementations can run on either CPU or GPU. So if you have tensorflow-gpu, the cuDNN implementation of RNN cells will run faster.
CuDNNLSTM and CuDNNGRU are fast implementations backed by cuDNN. Both can only be run on the GPU, with the TensorFlow backend. cuDNN is a GPU-accelerated library of primitives for deep neural networks: it provides highly tuned implementations of standard routines such as forward and backward convolution, pooling, normalization, and activation layers, and is part of the NVIDIA Deep Learning SDK.
The cuDNN highlights include:
- Up to 3x faster training of ResNet-50 and GNMT on Tesla V100 vs. Tesla P100
- Improved NHWC support for pooling and strided convolution
- Improved performance for common workloads such as ResNet-50 and SSD, as batch normalization now supports the NHWC data layout with an added option to fuse batch norm with Add and ReLU operations
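As an illustration (a sketch assuming TF 1.x or standalone Keras, where the cuDNN cells are separate classes; in TF 2.x, tf.keras.layers.LSTM picks the cuDNN kernel automatically when its arguments allow it):

```python
import tensorflow as tf
from tensorflow.keras.layers import CuDNNLSTM, Dense, Embedding, LSTM

def make_model(use_cudnn):
    # CuDNNLSTM only runs on a GPU with cuDNN available; LSTM runs anywhere
    # (CPU or GPU) but is slower on GPU than the cuDNN-backed version.
    rnn = CuDNNLSTM(64) if use_cudnn else LSTM(64)
    return tf.keras.Sequential([
        Embedding(10000, 128),
        rnn,
        Dense(1, activation='sigmoid'),
    ])

model = make_model(use_cudnn=True)
model.compile(optimizer='adam', loss='binary_crossentropy')
```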
