Serve online learning models with mlflow - python-3.x

It is not clear to me whether mlflow can serve a model that evolves continuously based on its previous predictions.
I need to query the model to make a prediction on a sample of data, which is the basic use of mlflow serve. However, I also want the model to update itself internally once it has seen the new data.
Is this possible, or does it need a feature request?

I think you should be able to do that by implementing a custom Python model or a custom flavor, as described in the documentation. In this case you create a class that inherits from mlflow.pyfunc.PythonModel and implement its predict method; inside that method you are free to do anything. Here is a simple example from the documentation:
import mlflow.pyfunc
class AddN(mlflow.pyfunc.PythonModel):
    def __init__(self, n):
        self.n = n
    def predict(self, context, model_input):
        return model_input.apply(lambda column: column + self.n)
This model can then be saved and loaded again just like a normal model:
# Construct and save the model
model_path = "add_n_model"
add5_model = AddN(n=5)
mlflow.pyfunc.save_model(path=model_path, python_model=add5_model)
# Load the model in `python_function` format
loaded_model = mlflow.pyfunc.load_model(model_path)
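To make the "anything goes inside predict" point concrete for the online-learning case, here is a minimal sketch of a model that updates its own state on every call. The class name RunningMeanModel and its running-mean logic are hypothetical and chosen only for illustration, and model_input is assumed to be a flat iterable of numbers rather than the DataFrame a real deployment would typically receive. The import fallback is there only so the sketch runs where mlflow is not installed.

```python
try:
    from mlflow.pyfunc import PythonModel
except ImportError:
    # Fallback so the sketch still runs where mlflow is not installed.
    PythonModel = object


class RunningMeanModel(PythonModel):
    """Hypothetical online model: predicts the mean of all values seen so far."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def predict(self, context, model_input):
        # Answer with the state learned before this request ...
        prediction = self.total / self.count if self.count else 0.0
        # ... then fold the new observations into the in-memory state.
        values = [float(v) for v in model_input]
        self.count += len(values)
        self.total += sum(values)
        return [prediction] * len(values)


model = RunningMeanModel()
first = model.predict(None, [2.0, 4.0])  # nothing seen yet
second = model.predict(None, [6.0])      # reflects the first call
print(first, second)
```

Note that the updated state lives only in the memory of the process holding the model object. It is lost on restart and is not shared between several serving workers unless you persist it yourself, for example by saving the model again.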

Related

Run Pytorch stacked model on Colab TPU

I am trying to run my model on a Colab multi-core TPU, but I really don't know how to do it. I tried this tutorial notebook, but I got an error that I can't fix, and I think there may be a simpler way to do it.
About my model:
class BERTModel(nn.Module):
    def __init__(self,...):
        super().__init__()
        if ...:
            self.bert_model = XLMRobertaModel.from_pretrained(...) # huggingface XLM-R
        elif ...:
            self.bert_model = others_model.from_pretrained(...) # huggingface XLM-R
        ... # some other model's parameters
    def forward(self,...):
        bert_input = ...
        output = self.bert_model(bert_input)
        ... # some function that process on output
    def other_function(self,...):
        # just doing some process on output. like concat layers's embedding and return ...
class MAINModel(nn.Module):
    def __init__(self,...):
        super().__init__()
        print('Using model 1')
        self.bert_model_1 = BERTModel(...)
        print('Using model 2')
        self.bert_model_2 = BERTModel(...)
        self.linear = nn.Linear(...)
    def forward(self,...):
        bert_input = ...
        bert_output = self.bert_model(bert_input)
        linear_output = self.linear(bert_output)
        return linear_output
Can you please tell me how to run a model like mine on a Colab TPU? I used Colab Pro to make sure RAM is not a big problem. Thank you so much.
I would work off the examples here: https://github.com/pytorch/xla/tree/master/contrib/colab
Maybe start with a simpler model like this: https://github.com/pytorch/xla/blob/master/contrib/colab/mnist-training.ipynb
In the pseudocode you shared, there is no reference to the torch_xla library, which is required to use PyTorch on TPUs. I'd recommend starting with one of the working Colab notebooks in the directory I shared and then swapping out parts of the model with your own. If you have a model that runs on GPUs using native PyTorch, there are a few places (usually 3-4) in the overall training code that you need to modify to run it on TPUs. See here for a description of some of the changes. The other big change is to wrap the default dataloader with a ParallelLoader, as shown in the example MNIST Colab I shared.
If you have any specific error you see in one of the Colabs, feel free to open an issue : https://github.com/pytorch/xla/issues

PyTorch - Save just the model structure without weights and then load and train it

I want to separate model structure authoring from training. The model author designs the model structure, saves the untrained model to a file, and then sends it to a training service, which loads the model structure and trains the model.
Keras has the ability to save the model config and then load it.
How can the same be accomplished with PyTorch?
You can write your own function to do that in PyTorch. Saving the weights is straightforward: simply call torch.save(model.state_dict(), 'weightsAndBiases.pth').
For saving the model structure, you can do this:
(Assume you have a model class named Network, and you instantiate yourModel = Network())
model_structure = {'input_size': 784,
                   'output_size': 10,
                   'hidden_layers': [each.out_features for each in yourModel.hidden_layers],
                   'state_dict': yourModel.state_dict()  # if you want to save the weights
                   }
torch.save(model_structure, 'model_structure.pth')
Similarly, we can write a function to load the structure.
def load_structure(filepath):
    structure = torch.load(filepath)
    model = Network(structure['input_size'],
                    structure['output_size'],
                    structure['hidden_layers'])
    # model.load_state_dict(structure['state_dict']) if you had saved weights as well
    return model
model = load_structure('model_structure.pth')
print(model)
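If only the structure needs to travel (no weights), the same idea also works with a plain JSON file, which is close in spirit to Keras's model config. The following is a minimal sketch under the assumption that both the author and the training service can import the same model class. TinyNet and the file name model_config.json are hypothetical stand-ins, not part of the answer above.

```python
import json
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for Network; must be importable on both sides."""

    def __init__(self, input_size, output_size, hidden_layers):
        super().__init__()
        sizes = [input_size, *hidden_layers, output_size]
        self.layers = nn.ModuleList(
            [nn.Linear(a, b) for a, b in zip(sizes[:-1], sizes[1:])]
        )

    def forward(self, x):
        for layer in self.layers[:-1]:
            x = torch.relu(layer(x))
        return self.layers[-1](x)


# Author side: describe the architecture only, no weights.
config = {"input_size": 784, "output_size": 10, "hidden_layers": [128, 64]}
config_path = os.path.join(tempfile.mkdtemp(), "model_config.json")
with open(config_path, "w") as f:
    json.dump(config, f)

# Training side: rebuild an untrained model from the description.
with open(config_path) as f:
    restored = TinyNet(**json.load(f))

output = restored(torch.zeros(2, 784))
print(restored)
```

The JSON file stays human-readable and carries no pickled objects, but it only describes models that the shared class can express through its constructor arguments.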
Edit:
Okay, the above was the case when you had access to source code for your class, or if the class was relatively simple so you could define a generic class like this:
import torch.nn as nn
import torch.nn.functional as F
class Network(nn.Module):
    def __init__(self, input_size, output_size, hidden_layers, drop_p=0.5):
        ''' Builds a feedforward network with arbitrary hidden layers.
        Arguments
        ---------
        input_size: integer, size of the input layer
        output_size: integer, size of the output layer
        hidden_layers: list of integers, the sizes of the hidden layers
        '''
        super().__init__()
        # Input to a hidden layer
        self.hidden_layers = nn.ModuleList([nn.Linear(input_size, hidden_layers[0])])
        # Add a variable number of more hidden layers
        layer_sizes = zip(hidden_layers[:-1], hidden_layers[1:])
        self.hidden_layers.extend([nn.Linear(h1, h2) for h1, h2 in layer_sizes])
        self.output = nn.Linear(hidden_layers[-1], output_size)
        self.dropout = nn.Dropout(p=drop_p)
    def forward(self, x):
        ''' Forward pass through the network, returns the output logits '''
        for each in self.hidden_layers:
            x = F.relu(each(x))
            x = self.dropout(x)
        x = self.output(x)
        return F.log_softmax(x, dim=1)
However, that will only work for simple cases, so I suppose that's not what you intended.
One option is to define the model architecture in a separate .py file and import it along with its other dependencies (if the architecture is complex), or to define the model inline where it is used.
Another option is converting your PyTorch model to ONNX and saving that.
A third option: in TensorFlow you can create a .pb file that defines both the architecture and the weights of the model, and in PyTorch you can do something similar this way:
torch.save(model, filepath)
This will save the model object itself, as torch.save() is just a pickle-based save at the end of the day.
model = torch.load(filepath)
This has limitations, however: your model class definition might not be picklable, for example (possible in some complicated models).
Because this is such an iffy workaround, the answer you'll usually get is: no, you have to declare the class definition before loading the trained model, i.e. you need access to the model class source code.
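As a rough illustration of the whole-object route, the sketch below saves and reloads a model built only from torch.nn classes, so every class definition is importable wherever torch is installed. The file name whole_model.pth is hypothetical. As far as I know, recent PyTorch releases make torch.load default to weights_only=True, so the sketch passes weights_only=False explicitly; that argument does not exist in older releases, and a full unpickle should only be used on files you trust.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Built purely from torch.nn classes, so unpickling needs no custom source file.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

path = os.path.join(tempfile.mkdtemp(), "whole_model.pth")
torch.save(model, path)  # pickles the module object itself

# Full unpickle of the saved module; only do this with files you trust.
loaded = torch.load(path, weights_only=False)
print(loaded)
```

A model built from your own class would still need that class to be importable at load time, which is exactly the limitation described above.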
Side notes:
An official answer by one of the core PyTorch devs on limitations of loading a pytorch model without code:
We only save the source code of the class definition. We do not save beyond that (like the package sources that the class is referring to).
import foo
class MyModel(...):
    def forward(input):
        foo.bar(input)
Here the package foo is not saved in the model checkpoint.
There are limitations on robustly serializing Python constructs. For example, the default picklers cannot serialize lambdas. There are helper packages that can serialize more Python constructs than the standard library, but they still have limitations. Dill is one such package.
Given these limitations, there is no robust way to have torch.load work without having the original source files.

example of doing simple prediction with pytorch-lightning

I have an existing model where I load some pre-trained weights and then do prediction (one image at a time) in pytorch. I am trying to basically convert it to a pytorch lightning module and am confused about a few things.
So currently, my __init__ method for the model looks like this:
self._load_config_file(cfg_file)
# just creates the pytorch network
self.create_network()
self.load_weights(weights_file)
self.cuda(device=0) # assumes GPU and uses one. This is probably suboptimal
self.eval() # prediction mode
From what I can gather from the Lightning docs, I can do pretty much the same, except for the cuda() call. So something like:
self.create_network()
self.load_weights(weights_file)
self.freeze() # prediction mode
So, my first question is whether this is the correct way to use lightning? How would lightning know if it needs to use the GPU? I am guessing this needs to be specified somewhere.
Now, for the prediction, I have the following setup:
def infer(frame):
    img = transform(frame)  # apply some transformation to the input
    img = torch.from_numpy(img).float().unsqueeze(0).cuda(device=0)
    with torch.no_grad():
        output = self.__call__(Variable(img)).data.cpu().numpy()
    return output
This is the bit that has me confused. Which functions do I need to override to make a lightning compatible prediction?
Also, at the moment, the input comes as a numpy array. Is that something that would be possible from the lightning module or do things always have to use some sort of a dataloader?
At some point, I want to extend this model implementation to do training as well, so want to make sure I do it right but while most examples focus on training models, a simple example of just doing prediction at production time on a single image/data point might be useful.
I am using 0.7.5 with pytorch 1.4.0 on GPU with cuda 10.1
LightningModule is a subclass of torch.nn.Module so the same model class will work for both inference and training. For that reason, you should probably call the cuda() and eval() methods outside of __init__.
Since it's just a nn.Module under the hood, once you've loaded your weights you don't need to override any methods to perform inference, simply call the model instance. Here's a toy example you can use:
import torchvision.models as models
from pytorch_lightning.core import LightningModule
class MyModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.resnet = models.resnet18(pretrained=True, progress=False)
    def forward(self, x):
        return self.resnet(x)
model = MyModel().eval().cuda(device=0)
And then to actually run inference you don't need a method, just do something like:
for frame in video:
    img = transform(frame)
    img = torch.from_numpy(img).float().unsqueeze(0).cuda(0)
    output = model(img).data.cpu().numpy()
    # Do something with the output
The main benefit of PyTorch Lightning is that you can also use the same class for training by implementing training_step(), configure_optimizers() and train_dataloader() on that class. You can find a simple example of that in the PyTorch Lightning docs.
Even though the above answer suffices, note the following line:
img = torch.from_numpy(img).float().unsqueeze(0).cuda(0)
Both the model and the image have to be put on the right GPU. On a multi-GPU inference machine, this becomes a hassle.
To solve this, .predict was also recently introduced; see more at https://pytorch-lightning.readthedocs.io/en/stable/deploy/production_basic.html
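Independently of Lightning, one way to avoid hard-coding the GPU index is to read the device from the model itself and move each input there. This is a minimal plain-PyTorch sketch. The infer helper and the nn.Linear stand-in model are hypothetical, and the input is assumed to be an already transformed numpy array.

```python
import numpy as np
import torch
import torch.nn as nn


def infer(model, frame):
    """Run one numpy frame through `model` on whatever device the model lives on."""
    device = next(model.parameters()).device
    img = torch.from_numpy(frame).float().unsqueeze(0).to(device)
    with torch.no_grad():
        return model(img).cpu().numpy()


# Pick the device once; the inference code never mentions it again.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(8, 3).eval().to(device)  # hypothetical stand-in for the real network

output = infer(model, np.zeros(8, dtype=np.float32))
print(output.shape)
```

Because a LightningModule is an nn.Module, the same helper should work unchanged with one.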

Define and Use new smoothing method in nltk language models

I'm trying to provide and test a new smoothing method for language models. I'm using the nltk tools and don't want to redefine everything from scratch. So is there any way to define and use my own smoothing method in nltk models?
Edit:
I'm trying to do something like this :
def my_smoothing_method(model):
    # some code using model (MLE) count
model = nltk.lm.MLE(n, smoothing_method=my_smoothing_method)
model.fit(train)
If you look at the definition of MLE (in nltk/lm/models.py), you can see that there is no option for a smoothing function (but there are other models in the same file; perhaps one of them fits your needs?).
The InterpolatedLanguageModel (see the same file) does accept a smoothing class, which must be a subclass of Smoothing and implement alpha_gamma(word, context) and unigram_score(word). Note that it takes a class (here called MySmoothing), not a function:
model = nltk.lm.models.InterpolatedLanguageModel(smoothing_cls=MySmoothing, order=n)
So if you really need to add functionality to the MLE class, you could do something like the following, although I am not sure it is a good idea:
class MLE_with_smoothing(LanguageModel):
    """Class for providing MLE ngram model scores.
    Inherits initialization from BaseNgramModel.
    """
    def unmasked_score(self, word, context=None):
        """Returns the MLE score for a word given a context.
        Args:
        - word is expected to be a string
        - context is expected to be something reasonably convertible to a tuple
        """
        freq = self.context_counts(context).freq(word)
        # Do some smoothing
        return
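For the InterpolatedLanguageModel route, a custom smoothing class could look roughly like the sketch below. FixedWeightSmoothing and its fixed interpolation weight are hypothetical, chosen only to show the two required methods; the import paths (nltk.lm.api, nltk.lm.models, nltk.lm.preprocessing) are the ones I believe current nltk releases use.

```python
from nltk.lm.api import Smoothing
from nltk.lm.models import InterpolatedLanguageModel
from nltk.lm.preprocessing import padded_everygram_pipeline


class FixedWeightSmoothing(Smoothing):
    """Hypothetical smoothing: mixes each order with the next lower one using a fixed weight."""

    def __init__(self, vocabulary, counter, weight=0.3):
        super().__init__(vocabulary, counter)
        self.weight = weight  # share of probability mass given to the lower order

    def unigram_score(self, word):
        # Plain relative frequency at the unigram level.
        return self.counts.unigrams.freq(word)

    def alpha_gamma(self, word, context):
        # alpha: discounted higher-order estimate; gamma: weight of the lower order.
        alpha = (1 - self.weight) * self.counts[context].freq(word)
        return alpha, self.weight


text = [["a", "b", "c"], ["a", "c", "b"]]
train, vocab = padded_everygram_pipeline(2, text)

model = InterpolatedLanguageModel(FixedWeightSmoothing, 2)
model.fit(train, vocab)

# For a context seen in training, the scores over the vocabulary should sum to 1.
total = sum(model.score(w, ["a"]) for w in model.vocab)
print(total)
```

The model combines the two return values as alpha + gamma * (lower-order score), so the only thing a new smoothing method has to decide is how to split the probability mass between the two orders.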

Why override Dataset instead of directly pass in input and labels, pytorch

Sorry if what I say here is wrong -- new to pytorch.
From what I can tell, there are two main ways of getting training data and passing it through a network. One is to override Dataset; the other is to prepare your data correctly and then iterate over it, as shown in this example: pytorch classification example
which does something like
rnn(input, hidden, output)
for i in range(input.size()[0]):
    output, hidden = rnn(input[i], hidden)
The other way would be to do something like
for epoch in range(epochs):
    for data, target in trainloader:
        # compute model, etc.
where, in this method, trainloader comes from doing something like
trainloader = DataLoader(my_data)
after overriding __getitem__ and __len__.
My question is: what are the differences between these methods, and why would you use one over the other? Also, it seems to me that overriding Dataset doesn't work for something that has, let's say, an input layer of 100 nodes and an output of 10 nodes, since __getitem__ needs to return a pair of (data, label). This seems like a case where I probably don't understand how to use Dataset very well, but that is why I'm asking in the first place. I think I read something about a collate function which might help in this scenario?
The Dataset class and the DataLoader class in PyTorch help us feed our own training data into the network. The Dataset class provides an interface for accessing all the training or testing samples in your dataset. To achieve this, you have to implement at least two methods, __getitem__ and __len__, so that each training sample can be accessed by its index. In the initialization part of the class, we load the dataset (as float type) and convert it into float torch tensors. __getitem__ returns the features and the target value.
What are the differences between these methods?
In PyTorch you can either prepare your data so that the PyTorch DataLoader can consume it directly and give you an iterable object, or you can write a custom Dataset (optionally with a custom collate function passed to the DataLoader) to perform custom operations, for example preprocessing text or images, or stacking frames from video clips.
The DataLoader behaves like an iterator, so we can loop over it and fetch a different mini-batch every time.
Basic Sample
from torch.utils.data import DataLoader
train_loader = DataLoader(dataset=train_data, batch_size=16, shuffle=True)
valid_loader = DataLoader(dataset=valid_data, batch_size=16, shuffle=True)
# To retrieve a sample mini-batch, one can simply run the command below —
# it will return a list containing two tensors:
# one for the features, another one for the labels.
next(iter(train_loader))
next(iter(valid_loader))
Custom Sample
import torch
from torch.utils.data import Dataset, DataLoader
class SampleData(Dataset):
    def __init__(self, data):
        # data is expected to be a pandas DataFrame whose last column is the target
        self.data = torch.FloatTensor(data.values.astype('float'))
    def __len__(self):
        return len(self.data)
    def __getitem__(self, index):
        target = self.data[index][-1]
        data_val = self.data[index][:-1]
        return data_val, target
train_dataset = SampleData(train_data)
valid_dataset = SampleData(valid_data)
device = "cuda" if torch.cuda.is_available() else "cpu"
kwargs = {'num_workers': 1, 'pin_memory': True} if device == 'cuda' else {}
train_loader = DataLoader(train_dataset, batch_size=train_batch_size, shuffle=True, **kwargs)
test_loader = DataLoader(valid_dataset, batch_size=test_batch_size, shuffle=False, **kwargs)
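The (data, label) pair returned by __getitem__ is not limited to a scalar label. As a minimal sketch of the 100-input / 10-output case raised in the question, each element of the pair can be a tensor of any shape, and the default collate function stacks them along a new batch dimension. VectorTargetData and the random tensors are hypothetical, used only for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class VectorTargetData(Dataset):
    """Hypothetical dataset: 100 input features and a 10-dimensional target per sample."""

    def __init__(self, features, targets):
        self.features = features
        self.targets = targets

    def __len__(self):
        return len(self.features)

    def __getitem__(self, index):
        # Still a (data, label) pair; the label just happens to be a vector.
        return self.features[index], self.targets[index]


features = torch.randn(32, 100)
targets = torch.randn(32, 10)
loader = DataLoader(VectorTargetData(features, targets), batch_size=8, shuffle=True)

batch_x, batch_y = next(iter(loader))
print(batch_x.shape, batch_y.shape)
```

No custom collate function is needed here because every sample has the same shape; one becomes useful mainly when samples vary in size, such as sequences of different lengths.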
Why would you use one over the other?
It depends solely on your use case and the amount of control you want. PyTorch gives you all the power, and it is up to you to decide how much of it you want to use. Suppose you are solving a simple image classification problem:
You can simply put all the images in a root folder, with one subfolder per class named after the class label. When training, we just need to specify the path to the root folder; a ready-made dataset such as torchvision's ImageFolder will pick up the images from each folder, and the DataLoader will batch them for training.
On the other hand, if you are classifying video clips or sequences within a large video file (generally known as video tagging), you need to write your own Dataset to load the frames from the video, stack them, and hand the result to the DataLoader.
You can find some useful links below for further reference:
https://pytorch.org/docs/stable/data.html
https://stanford.edu/~shervine/blog/pytorch-how-to-generate-data-parallel
https://pytorch.org/tutorials/beginner/data_loading_tutorial.html
