How to deploy a GPT-like model to Triton Inference Server? - pytorch

Tutorials on deploying GPT-like models for inference on Triton typically look like this:
Preprocess the data as input_ids = tokenizer(text)["input_ids"]
Feed the input to the Triton inference server and get outputs_ids = model(input_ids)
Postprocess the outputs like this:
outputs = outputs_ids.logits.argmax(axis=2)
outputs = tokenizer.decode(outputs)
I use a fine-tuned GPT-2 model, and this method gives an incorrect result. The correct result is obtained with the model.decode(input_ids) method.
Is there a way to deploy a fine-tuned GPT-like Hugging Face model to Triton so that inference uses model.decode(input_ids) rather than model(input_ids)?
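A likely explanation, offered as a sketch rather than a confirmed diagnosis: model(input_ids) returns one row of logits per input position, so taking argmax over all of them yields a next-token guess for every existing position instead of a continuation of the text. Generation methods call the model in a loop and append one token at a time. The toy example below illustrates the difference in plain Python. The names toy_model, single_pass, and greedy_generate are hypothetical, and the "model" is a simple lookup rule, not GPT-2.

```python
from typing import List

VOCAB_SIZE = 5
EOS_ID = 4  # hypothetical end-of-sequence token id


def toy_model(input_ids: List[int]) -> List[List[float]]:
    """Stand-in for model(input_ids).logits: one row of logits per input position.

    The toy rule is "the next token is (token + 1) % VOCAB_SIZE".
    """
    logits = []
    for tok in input_ids:
        row = [0.0] * VOCAB_SIZE
        row[(tok + 1) % VOCAB_SIZE] = 1.0
        logits.append(row)
    return logits


def argmax(row: List[float]) -> int:
    """Index of the largest value in a row of logits."""
    return max(range(len(row)), key=row.__getitem__)


def single_pass(input_ids: List[int]) -> List[int]:
    """One forward pass plus argmax: a next-token guess for each input position."""
    return [argmax(row) for row in toy_model(input_ids)]


def greedy_generate(input_ids: List[int], max_new_tokens: int = 10) -> List[int]:
    """Greedy autoregressive decoding: repeatedly append the most likely next token."""
    ids = list(input_ids)
    for _ in range(max_new_tokens):
        next_id = argmax(toy_model(ids)[-1])  # only the last position matters
        ids.append(next_id)
        if next_id == EOS_ID:
            break
    return ids


print(single_pass([0, 1]))      # [1, 2]
print(greedy_generate([0, 1]))  # [0, 1, 2, 3, 4]
```

With Triton, a loop like greedy_generate would have to run either in the client, calling the server once per new token, or inside the served model itself, for example through Triton's Python backend. Which option fits depends on the deployment.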

Related

Prediction for pretrained model on handwritten text(images)-Pytorch

I have a problem making a prediction using a pre-trained model that contains an encoder and decoder for handwritten text recognition.
What I did is the following:
checkpoint = torch.load("Model/SPAN/SPAN-PT-RA_rimes.pt",map_location=torch.device('cpu'))
encoder_state_dict = checkpoint['encoder_state_dict']
decoder_state_dict = checkpoint['decoder_state_dict']
img = torch.LongTensor(img).unsqueeze(1).to(torch.device('cpu'))
global_pred = decoder_state_dict(encoder_state_dict(img))
This generates this error:
TypeError: 'collections.OrderedDict' object is not callable
I would highly appreciate your help! ^_^
encoder_state_dict and decoder_state_dict are not torch models; they are collections (dictionaries) of tensors that hold the pre-trained parameters from the checkpoint you loaded.
Feeding inputs (such as your transformed input image) to such a collection of tensors does not make sense. Instead, you should load these state_dicts (i.e., collections of pre-trained tensors) into the parameters of a model object that defines the network, using its load_state_dict method. See the torch.nn.Module class.
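The answer above can be sketched as follows. Encoder and Decoder here are tiny hypothetical stand-ins, and the checkpoint is built in memory so that the snippet runs on its own. With the real checkpoint you would instead instantiate the encoder and decoder classes that the checkpoint was trained with, whose definitions are not shown in the question.

```python
import io

import torch
from torch import nn


class Encoder(nn.Module):
    """Hypothetical stand-in; the real architecture must match the checkpoint."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 4, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.relu(self.conv(x))


class Decoder(nn.Module):
    """Hypothetical stand-in; the real architecture must match the checkpoint."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(4, 10, kernel_size=1)

    def forward(self, x):
        return self.conv(x)


# Build a stand-in checkpoint in memory with the same two keys as in the question.
buffer = io.BytesIO()
torch.save(
    {
        "encoder_state_dict": Encoder().state_dict(),
        "decoder_state_dict": Decoder().state_dict(),
    },
    buffer,
)
buffer.seek(0)

checkpoint = torch.load(buffer, map_location=torch.device("cpu"))

# Create the model objects first, then load the saved tensors into them.
encoder = Encoder()
decoder = Decoder()
encoder.load_state_dict(checkpoint["encoder_state_dict"])
decoder.load_state_dict(checkpoint["decoder_state_dict"])
encoder.eval()
decoder.eval()

# Assumption: a float image batch shaped (batch, channels, height, width).
img = torch.rand(1, 1, 32, 128)
with torch.no_grad():
    global_pred = decoder(encoder(img))
print(global_pred.shape)
```

Now the modules are callable, whereas the OrderedDict objects from the checkpoint are not.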

BART model inference results after converting from Hugging Face to ONNX

I followed the instructions to convert the BART-LARGE-CNN model to ONNX here (https://github.com/huggingface/transformers/blob/master/docs/source/serialization.rst) using the transformers.onnx script. The model was exported fine and I can run inference.
However, the inference results in 'last_hidden_state' appear to be logits (I think). How can I parse this output for summarization purposes?
Here are screenshots of what I've done.
This is the resulting output from those two states:
I have implemented fast-Bart, which essentially converts a BART model from PyTorch to ONNX with generate capabilities.
fast-Bart

How to deploy mlflow model with data preprocessing(text data)

I have developed a Keras text classification model. I preprocessed the data (tokenization), logged the trained model successfully (mlflow.keras.log_model), and served it using mlflow serve. Now, when predicting on text data, I need to preprocess it with the same tokenizer object that was used for training.
How can I preprocess the test data and get predictions from the served model?
You can log a custom python model:
https://www.mlflow.org/docs/latest/models.html#custom-python-models

Using pretrained models in Pytorch for Semantic Segmentation, then training only the fully connected layers with our own dataset

I am learning Pytorch and trying to understand how the library works for semantic segmentation.
What I've understood so far is that we can use a pre-trained model in PyTorch. I found an article that used such a model in .eval() mode, but I have not been able to find any tutorial on training such a model on our own dataset. I have a very small dataset and I need transfer learning to get results. My goal is to train only the FC layers with my own data. How is that achievable in PyTorch without complicating the code with OOP or so many .py files? I have had a hard time making sense of such repos on GitHub, as I am not the most proficient person when it comes to OOP. I have been using Keras for deep learning until recently, and there everything is easy and straightforward. Do I have the same options in PyTorch?
I appreciate any guidance on this. I need to run a piece of code that does the semantic segmentation and I am really confused about many of the steps I need to take.
Assume you start with a pretrained model called model. All of this occurs before you pass the model any data.
You want to find the layers you want to train by looking at all of them and then indexing them using model.children(). Running this command will show you all of the blocks and layers.
list(model.children())
Suppose you have now found the layers that you want to fine-tune (your FC layers, as you describe). If the layers you want to train are the last 5, you can select them with a negative slice. Every other layer will later have requires_grad set to False so that it does not train when you run the training algorithm.
list(model.children())[-5:]
Keep a reference to those layers:
layer_list = list(model.children())[-5:]
Rebuild the model without them using nn.Sequential:
model_small = nn.Sequential(*list(model.children())[:-5])
Set requires_grad params to False:
for param in model_small.parameters():
    param.requires_grad = False
Now you have a model called model_small that has all of the layers except the ones you want to train. You can now reattach the layers that you removed; their requires_grad attribute is True by default, so when you train the model, only the weights of those layers are updated.
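# Constructor arguments are omitted in this answer; supply values that match your model,
# e.g. nn.AdaptiveAvgPool2d(output_size) and nn.Linear(in_features, out_features).
model_small.avgpool_1 = nn.AdaptiveAvgPool2d()
model_small.lin1 = nn.Linear()
model_small.logits = nn.Linear()
model_small.softmax = nn.Softmax()
model = model_small.to(device)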
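As a runnable illustration of the steps above, here is a minimal sketch. The small nn.Sequential network stands in for a pretrained model, and the layer sizes, the split point (last 3 children instead of 5), and the names backbone and head are all assumptions made for the example. It uses a classification-style head, as the answer describes, not a full segmentation decoder.

```python
import torch
from torch import nn

# Hypothetical stand-in for a pretrained network.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)

# Keep everything except the last three children (pool, flatten, linear).
backbone = nn.Sequential(*list(model.children())[:-3])

# Freeze the kept layers so the optimizer never updates them.
for param in backbone.parameters():
    param.requires_grad = False

# Attach a fresh head; new layers have requires_grad=True by default.
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4))
model_small = nn.Sequential(backbone, head)

# Only the head's parameters should be listed as trainable.
trainable = [name for name, p in model_small.named_parameters() if p.requires_grad]
print(trainable)

# Pass only the trainable parameters to the optimizer.
optimizer = torch.optim.SGD(
    [p for p in model_small.parameters() if p.requires_grad], lr=0.01
)

# One dummy training step on random data to show that the setup runs.
x = torch.rand(2, 3, 32, 32)
output = model_small(x)
loss = output.sum()
loss.backward()
optimizer.step()
```

Building the head as its own nn.Sequential keeps the frozen and trainable parts clearly separated, which makes it easy to check which parameters the optimizer receives.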

Tensorflow - building a CNN model as described in the tutorial

I just completed the implementation of A Guide to TF Layers: Building a Convolutional Neural Network for the MNIST data set. The model trained successfully and reached an accuracy of 97.3%.
However, the tutorial does not mention how to use the newly trained model to supply my own images and see the predictions. Does anyone know how to use the output of training to make predictions? In the tmp/mnist_convnet_model$ folder there are some output files, such as .pbtxt, .meta, and index files, but I can't find instructions for using them to make predictions on my own images.
y_pred = tf.nn.softmax(your_final_layer)
y_pred_cls = tf.argmax(y_pred, axis=1)
and for prediction (assuming sess is your open tf.Session and x is the input placeholder)
feed_dict = {x: [your_image]}
classification = sess.run(y_pred_cls, feed_dict=feed_dict)
print(classification)
This applies to just about any model you create.
