I used to run this script using GPUs on GCP, but I am now trying to implement it using TPUs. As far as I understand, TPUs should now work fine with the transformers pipeline.
However, trying to set the device parameter throws RuntimeError: Cannot set version_counter for inference tensor
from transformers import pipeline
import torch
import torch_xla
import torch_xla.core.xla_model as xm

device = xm.xla_device()  # the TPU device I am trying to run on

classifier = pipeline("text-classification",
                      model='bhadresh-savani/distilbert-base-uncased-emotion',
                      return_all_scores=True,
                      device=device)

def detect_emotions(emotion_input):
    """Model Inference Section"""
    prediction = classifier(emotion_input)
    output = {}
    for emotion in prediction[0]:
        output[emotion["label"]] = emotion["score"]
    return output

detect_emotions('‘Rest in Power: The Trayvon Martin Story’ takes an emotional look back at the shooting that divided a nation')
How would this be rectified? What does this error even mean?
I want to export a roberta-base based language model to ONNX format. The model uses RoBERTa embeddings and performs a text classification task.
from torch import nn
import torch.onnx
import onnx
import onnxruntime
import torch
import transformers
import os  # needed for os.path.join below
From the logs:
pytorch: 1.10.2+cu113
CUDA: False
device: cpu
onnxruntime: 1.10.0
onnx: 1.11.0
PyTorch export
batch_size = 3
model_input = {
    'input_ids': torch.empty(batch_size, 256, dtype=torch.int).random_(32000),
    'attention_mask': torch.empty(batch_size, 256, dtype=torch.int).random_(2),
    'seq_len': torch.empty(batch_size, 1, dtype=torch.int).random_(256)
}
model_file_path = os.path.join("checkpoints", 'model.onnx')
torch.onnx.export(da_inference.model,       # model being run
                  model_input,              # model input (or a tuple for multiple inputs)
                  model_file_path,          # where to save the model (can be a file or file-like object)
                  export_params=True,       # store the trained parameter weights inside the model file
                  opset_version=11,         # the ONNX version to export the model to
                  operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK,
                  do_constant_folding=True, # whether to execute constant folding for optimization
                  input_names=['input_ids', 'attention_mask', 'seq_len'],  # the model's input names
                  output_names=['output'],  # the model's output names
                  dynamic_axes={'input_ids': {0: 'batch_size'},
                                'attention_mask': {0: 'batch_size'},
                                'seq_len': {0: 'batch_size'},
                                'output': {0: 'batch_size'}},
                  verbose=True)
I know there may be problems converting some operators from ATen (A Tensor Library for C++11) if they are included in the model architecture (see PyTorch Model Export to ONNX Failed Due to ATen).
The export succeeds if I set the parameter operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK, which means 'leave ATen operators as-is if they are not supported in ONNX'.
The PyTorch export function gives me the following warnings:
Warning: Unsupported operator ATen. No schema registered for this operator.
Warning: Shape inference does not support models with experimental operators: ATen
It looks like the only ATen operators in the model that are not converted to ONNX are the ones involving the LayerNorm.weight and LayerNorm.bias parameters (I have several layers like that):
%1266 : Float(3, 256, 768, strides=[196608, 768, 1], requires_grad=0, device=cpu) =
onnx::ATen[cudnn_enable=1, eps=1.0000000000000001e-05, normalized_shape=[768], operator="layer_norm"]
(%1265, %model.utterance_rnn.base.encoder.layer.11.output.LayerNorm.weight,
%model.utterance_rnn.base.encoder.layer.11.output.LayerNorm.bias)
# /opt/conda/lib/python3.9/site-packages/torch/nn/functional.py:2347:0
Then the model check passes OK:
model = onnx.load(model_file_path)
# Check that the model is well formed
onnx.checker.check_model(model)
# Print a human readable representation of the graph
print(onnx.helper.printable_graph(model.graph))
I also can visualize computation graph using Netron.
But when I try to perform inference using the exported ONNX model, it stalls with no logs or stdout, so this code hangs the system:
from typing import List
from onnxruntime import InferenceSession

use_gpu = False  # CPU-only in this environment (see the logs above)
model_file_path = os.path.join("checkpoints", "model.onnx")
sess_options = onnxruntime.SessionOptions()
sess_options.log_severity_level = 0
ort_providers: List[str] = ["CUDAExecutionProvider"] if use_gpu else ['CPUExecutionProvider']
session = InferenceSession(model_file_path, providers=ort_providers, sess_options=sess_options)
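For completeness, this is roughly how the session would then be called; the input names and shapes match the ones used for the export, and the feed values here are dummies, not my real data:
import numpy as np

# dummy feeds matching the exported input names and shapes (torch.int maps to int32)
feeds = {
    'input_ids': np.random.randint(0, 32000, size=(3, 256)).astype(np.int32),
    'attention_mask': np.random.randint(0, 2, size=(3, 256)).astype(np.int32),
    'seq_len': np.random.randint(0, 256, size=(3, 1)).astype(np.int32),
}

# request the exported output by name
outputs = session.run(['output'], feeds)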
Are there any suggestions to overcome this problem? From the official documentation I see that torch.onnx models exported this way are probably only runnable by Caffe2.
These layers are not inside the base frozen roberta model, so they are additional layers that I added myself. Is it possible to substitute the offending layers with similar ones and retrain the model?
Or is Caffe2 the best choice here, and onnxruntime will not do the inference?
Update: I retrained the model on the basis of BERT cased embeddings, but the problem persists. The same ATen operators are not converted to ONNX.
It looks like the LayerNorm.weight and LayerNorm.bias layers are only in the model above BERT. So, what are your suggestions to change these layers and enable ONNX export?
Have you tried to export after defining the operator for ONNX? Something along the lines of the following code by Huawei.
On another note, when loading a model you can technically override anything you want: setting a specific layer to an instance of your own class that inherits from the original keeps the same behavior (inputs and outputs) while letting you modify how it executes, as in the sketch below.
You can try to use this to save the model with the problematic operators changed, convert it to ONNX, and fine-tune it in that form (or even in PyTorch).
This generally seems best solved by the ONNX team, so a long-term solution might be to post a request for that specific operator on their GitHub issues page (but that will probably be slow).
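A minimal sketch of that override idea (the attribute path at the end is hypothetical; use whichever submodule your graph dump points at):
from torch import nn

class PatchedLayer(nn.Module):
    """Keeps the wrapped layer's inputs, outputs and trained parameters,
    but the body of forward() is free to be rewritten (e.g. to avoid ops
    that only export as ATen)."""
    def __init__(self, original: nn.Module):
        super().__init__()
        self.original = original  # trained weights stay attached here

    def forward(self, *args, **kwargs):
        # re-implement the problematic computation here;
        # delegating like this reproduces the original behavior exactly
        return self.original(*args, **kwargs)

# pick the actual offending submodule in your loaded model, then re-export
parent_module = ...  # hypothetical path, e.g. one encoder layer's output block
parent_module.LayerNorm = PatchedLayer(parent_module.LayerNorm)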
The best way to go is to rewrite the place in the model that uses these operators in a way that will convert; look at this for reference.
If, for example, the issue is layer norm, then you can write it yourself (see the sketch below). Another thing that sometimes helps is not setting the axes as dynamic, since some ops don't support that yet.
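If the problematic op really is layer norm, a hand-written version built only from basic tensor ops (mean, squared difference, sqrt) should export as plain ONNX nodes rather than the ATen fallback. The 768 shape and 1e-5 eps below are taken from the graph dump above; everything else is a sketch:
import torch
from torch import nn

class ManualLayerNorm(nn.Module):
    """LayerNorm over the last dimension using primitive ops only,
    so the exporter never needs the aten::layer_norm fallback."""
    def __init__(self, normalized_shape=768, eps=1e-5):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(normalized_shape))
        self.bias = nn.Parameter(torch.zeros(normalized_shape))
        self.eps = eps

    def forward(self, x):
        mean = x.mean(dim=-1, keepdim=True)
        var = ((x - mean) ** 2).mean(dim=-1, keepdim=True)  # biased variance, as layer norm uses
        return (x - mean) / torch.sqrt(var + self.eps) * self.weight + self.bias

# copy the trained parameters from the existing nn.LayerNorm before exporting, e.g.:
# manual_ln = ManualLayerNorm(); manual_ln.load_state_dict(old_layernorm.state_dict())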
I am trying to run my model on a Colab multi-core TPU, but I really don't know how to do it. I tried this tutorial notebook, but I got an error I can't fix, and I think there may be a simpler way to do it.
About my model:
from torch import nn
from transformers import XLMRobertaModel

class BERTModel(nn.Module):
    def __init__(self, ...):
        super().__init__()
        if ...:
            self.bert_model = XLMRobertaModel.from_pretrained(...)  # huggingface XLM-R
        elif ...:
            self.bert_model = others_model.from_pretrained(...)     # another huggingface model
        ...  # some other model parameters

    def forward(self, ...):
        bert_input = ...
        output = self.bert_model(bert_input)
        ...  # some function that processes output

    def other_function(self, ...):
        # just does some processing on output, like concatenating layers' embeddings, and returns ...
        ...

class MAINModel(nn.Module):
    def __init__(self, ...):
        super().__init__()
        print('Using model 1')
        self.bert_model_1 = BERTModel(...)
        print('Using model 2')
        self.bert_model_2 = BERTModel(...)
        self.linear = nn.Linear(...)

    def forward(self, ...):
        bert_input = ...
        bert_output = self.bert_model(bert_input)
        linear_output = self.linear(bert_output)
        return linear_output
Can you please tell me how to run a model like mine on a Colab TPU? I used Colab Pro to make sure RAM is not a big problem. Thank you so much.
I would work off the examples here: https://github.com/pytorch/xla/tree/master/contrib/colab
Maybe start with a simpler model like this: https://github.com/pytorch/xla/blob/master/contrib/colab/mnist-training.ipynb
In the pseudocode you shared, there is no reference to the torch_xla library, which is required to use PyTorch on TPUs. I'd recommend starting with one of the working Colab notebooks in the directory I shared and then swapping out parts of the model with your own. There are only a few (usually 3-4) places in the overall training code you need to modify to take a model that runs on GPUs with native PyTorch and run it on TPUs; see here for a description of some of the changes. The other big change is to wrap the default dataloader with a ParallelLoader, as shown in the example MNIST Colab I shared. A minimal sketch of those changes is below.
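Roughly, for a single-core run (MyModel, train_loader and loss_fn are placeholders for your own code; a multi-core run would additionally wrap this in xmp.spawn, as the MNIST notebook does):
import torch
import torch_xla.core.xla_model as xm
import torch_xla.distributed.parallel_loader as pl

device = xm.xla_device()            # change 1: TPU device instead of 'cuda'
model = MyModel().to(device)        # change 2: move the model to the XLA device
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# change 3: wrap the normal PyTorch DataLoader so batches land on the TPU
device_loader = pl.MpDeviceLoader(train_loader, device)

model.train()
for inputs, labels in device_loader:
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    xm.optimizer_step(optimizer)    # change 4: XLA-aware optimizer step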
If you hit any specific error in one of the Colabs, feel free to open an issue: https://github.com/pytorch/xla/issues
Suppose I have a trained model saved using pickle or joblib.
Let's say it's logistic regression or XGBoost.
I would like to host that model in AWS SageMaker as an endpoint without running a training job.
How can I achieve that?
# Let's say myBucketName contains model.pkl
import joblib

model = joblib.load('filename.pkl')
# X_test = NumPy array
model.predict(X_test)
I am not interested in sklearn_estimator.fit('S3 Train', 'S3 Validate'); I already have the trained model.
For Scikit Learn for example, you can get inspiration from this public demo https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_randomforest/Sklearn_on_SageMaker_end2end.ipynb
Step 1: Save your artifact (e.g. the joblib) compressed in S3 at s3://<your path>/model.tar.gz
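For example, a minimal sketch of packaging the joblib file and uploading it (the bucket and key here are placeholders):
import tarfile
import boto3

# SageMaker expects the artifact as a gzipped tarball containing model.joblib
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    tar.add('model.joblib')

# upload to the S3 location you will reference in model_data below
boto3.client('s3').upload_file('model.tar.gz', '<your bucket>', '<your path>/model.tar.gz')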
Step 2: Create an inference script with the deserialization function model_fn. (Note that you could also add the custom inference functions input_fn, predict_fn and output_fn, but for scikit-learn the defaults work fine; an optional sketch of one of them is shown after the script below.)
%%writefile inference_script.py
# the %%writefile line above is a Jupyter cell magic that writes this cell to a file, in case you're in Jupyter
import joblib
import os

def model_fn(model_dir):
    clf = joblib.load(os.path.join(model_dir, "model.joblib"))
    return clf
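Purely for illustration, a hedged sketch of what one of those optional hooks looks like for the scikit-learn container (the default already does roughly this, so you can skip it):
def predict_fn(input_data, model):
    # receives the deserialized request and the object returned by model_fn
    return model.predict(input_data)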
Step 3: Create a model associating the artifact with the right container
from sagemaker.sklearn.model import SKLearnModel

model = SKLearnModel(
    model_data='s3://<your path>/model.tar.gz',
    role='<your role>',
    entry_point='inference_script.py',
    framework_version='0.23-1')
Step 4: Deploy!
predictor = model.deploy(
    instance_type='ml.c5.large',  # choose the right instance type
    initial_instance_count=1)
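Once deployed, the predictor returned by model.deploy above can be used to invoke the endpoint; a minimal sketch (X_test stands in for your own NumPy feature array):
import numpy as np

X_test = np.random.rand(5, 4)            # placeholder input with your feature count
predictions = predictor.predict(X_test)  # serialized, sent to the endpoint, and deserialized for you
print(predictions)

# clean up when finished to stop paying for the instance
# predictor.delete_endpoint()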
I am new to machine learning and I have a simple question. I have trained a network on Fashion-MNIST and saved it, and I would now like to import the saved model for further processing instead of training the network each time I use the code. Can anyone help?
I have used an import function, but it always gives me an error:
model.save("model_1.h5py")    # save the model
model.import("model_2.h5py")  # when I try to import the model, this always gives me an invalid syntax error
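For reference, a minimal sketch of the usual tf.keras save/load pattern; import is a reserved Python keyword, so a saved model is read back with load_model instead (the .h5 file name here is just an example):
from tensorflow import keras

model.save("model_1.h5")                          # write the trained model to an HDF5 file
restored = keras.models.load_model("model_1.h5")  # read it back instead of retraining
restored.summary()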
PyTorch 1.0 has a feature to convert a model into a TorchScript program (serialized in a way) so that it can be executed in C++ without Python dependencies.
The details are in this tutorial.
https://pytorch.org/tutorials/advanced/cpp_export.html
This is how it is done:
import torch
import torchvision
# An instance of your model.
model = ...  # a UNET model from fastai, which has hooks as required by UNET
# An example input you would normally provide to your model's forward() method.
example = torch.rand(1, 3, 224, 224)
# Use torch.jit.trace to generate a torch.jit.ScriptModule via tracing.
traced_script_module = torch.jit.trace(model, example)
In my use case, I am using a UNET model for semantic segmentation. However, when I trace the model using this method, I get the following error:
Forward or backward hooks can't be compiled
The UNET model uses hooks to save intermediate features which are used at later layers in the network. Is there a way around it, or is this still a limitation of this new method, in that it cannot work with models using such hooks?
If you can use the UNET model from PyTorch Hub, it will work with TorchScript.
import torch
# downloading the model from torchhub
model = torch.hub.load('mateuszbuda/brain-segmentation-pytorch', 'unet',
                       in_channels=3, out_channels=1, init_features=32, pretrained=True)
# downloading the sample
import urllib
url, filename = ("https://github.com/mateuszbuda/brain-segmentation-pytorch/raw/master/assets/TCGA_CS_4944.png", "TCGA_CS_4944.png")
try: urllib.URLopener().retrieve(url, filename)
except: urllib.request.urlretrieve(url, filename)
# reading the sample and some prerequisites for transformation
import numpy as np
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
m, s = np.mean(input_image, axis=(0, 1)), np.std(input_image, axis=(0, 1))
preprocess = transforms.Compose([transforms.ToTensor(),transforms.Normalize(mean=m, std=s),])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0)
# creating the trace
traced_module = torch.jit.trace(model,input_batch)
# running the trace
traced_module(input_batch)
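Once the trace succeeds, a minimal sketch of serializing it for the C++ runtime described in the tutorial (the file name is arbitrary):
# save the traced module so torch::jit::load() can read it from C++
traced_module.save("unet_traced.pt")

# it can also be reloaded in Python to sanity-check the export
reloaded = torch.jit.load("unet_traced.pt")
reloaded(input_batch)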
PS: Neither torch.jit.trace nor torch.jit.script supports all torch functionality, so it is always tricky to use them with external libraries.
Maybe you could rewrite the model in C++, since the C++ API has almost the same interface as the Python version.