Save and load a PyTorch model - python-3.x

I am trying to train a PyTorch model on Colab, save the model parameters, and then load them on my local computer.
After training, the model parameters are saved as follows:
torch.save(Model.state_dict(), PATH)
and loaded as follows:
device = torch.device('cpu')
Model.load_state_dict(torch.load(PATH, map_location=device))
This raises the error:
AttributeError: 'Sequential' object has no attribute 'copy'
Does anyone know how to solve this issue?

Your question does not provide enough detail to be answered definitively. If you are trying to save and load your own model and have a class definition for it, see this well-known answer and clarify why it is not sufficient for your use case.
If you are loading a torch.nn.Sequential model then, as far as I know, simply loading the model directly and just using it should be sufficient. If it's not, post the error you get on the PyTorch forum.
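For what it's worth, the specific error is a hint: load_state_dict expects a state_dict (an OrderedDict of tensors) and calls .copy() on it, so 'Sequential' object has no attribute 'copy' suggests the file at PATH holds the whole pickled module rather than a state_dict. A hedged sketch of handling both cases (PATH and Model are the names from the question):
import torch
import torch.nn as nn

device = torch.device('cpu')
obj = torch.load(PATH, map_location=device)
if isinstance(obj, nn.Module):
    # the file held the whole pickled model; just use it directly
    Model = obj
else:
    # the file held a state_dict; load it into the existing architecture
    Model.load_state_dict(obj)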
For now, here is an example showing a sequential model being saved, loaded, and used without error:
# test for saving everything with torch.save
import torch
import torch.nn as nn
from pathlib import Path
from collections import OrderedDict

path = Path('~/data/tmp/').expanduser()
path.mkdir(parents=True, exist_ok=True)

num_samples = 3
Din, Dout = 1, 1
lb, ub = -1, 1

# sample some inputs and define a small sequential model
x = torch.distributions.Uniform(low=lb, high=ub).sample((num_samples, Din))
f = nn.Sequential(OrderedDict([
    ('f1', nn.Linear(Din, Dout)),
    ('out', nn.SELU())
]))
y = f(x)

# save the model together with the data (tensors converted to numpy)
x_np, y_np = x.detach().cpu().numpy(), y.detach().cpu().numpy()
db2 = {'f': f, 'x': x_np, 'y': y_np}
torch.save(db2, path / 'db_f_x_y')

# load everything back and run the model again
db3 = torch.load(path / 'db_f_x_y')
f3 = db3['f']
x3 = db3['x']
y3 = db3['y']
xx = torch.tensor(x3)
yy3 = f3(xx)
print(yy3)
There should be an official answer on how to save and load nn.Sequential models (see How does one save torch.nn.Sequential models in pytorch properly?), but for now torch.save and torch.load seem to work just fine.
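For completeness, a hedged sketch of the state_dict route for the same Sequential model (this assumes the identical architecture is re-created before loading):
# save only the parameters
torch.save(f.state_dict(), path / 'f_state_dict.pt')

# re-create the same architecture, then load the parameters into it
f2 = nn.Sequential(OrderedDict([
    ('f1', nn.Linear(Din, Dout)),
    ('out', nn.SELU())
]))
f2.load_state_dict(torch.load(path / 'f_state_dict.pt', map_location='cpu'))
print(f2(xx))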

Related

How do I get access to the "last_hidden_state" for code generation models in huggingface?

I'm trying to obtain the "last_hidden_state" (as explained here) for code generation models over here. I am unable to figure out how to proceed, other than manually downloading each code-generation model and checking whether its output has that attribute, using the following code:
import torch
from transformers import AutoTokenizer, AutoModelWithLMHead

tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = AutoModelWithLMHead.from_pretrained("codeparrot/codeparrot").to(device)

inputs = tokenizer("def hello_world():", return_tensors="pt")
inputs = {k: v.to(device) for k, v in inputs.items()}
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.keys())
So far, I tried this strategy on CodeParrot and InCoder with no success. Perhaps there is a better way to access the values of the hidden layers?
The hidden_states in the output of CodeGenForCausalLM is already the last_hidden_state of the CodeGen model. See: link.
There, hidden_states = transformer_outputs[0] is the output of CodeGenModel (link), and transformer_outputs[0] is the last_hidden_state:
if not return_dict:
    return tuple(v for v in [hidden_states, presents, all_hidden_states, all_self_attentions] if v is not None)

return BaseModelOutputWithPast(
    last_hidden_state=hidden_states,
    past_key_values=presents,
    hidden_states=all_hidden_states,
    attentions=all_self_attentions,
)
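More generally, a hedged sketch: transformers models can be asked to return hidden states explicitly, and the last entry of the returned tuple is the last_hidden_state (checkpoint name taken from the question):
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot")
model = AutoModelForCausalLM.from_pretrained("codeparrot/codeparrot")

inputs = tokenizer("def hello_world():", return_tensors="pt")
with torch.no_grad():
    # ask the model to return the hidden states from every layer
    outputs = model(**inputs, output_hidden_states=True)

# outputs.hidden_states is a tuple (embeddings + one entry per layer);
# the final entry is the last_hidden_state
last_hidden_state = outputs.hidden_states[-1]
print(last_hidden_state.shape)  # (batch, seq_len, hidden_size)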

Getting error while trying to save and apply existing machine learning model to new dataset?

I am trying to use this model: https://github.com/aninda052/Disasters-on-social-media-NLP/blob/master/Disasters%20on%20social%20media.ipynb
I searched for a way to save this model and use it with a new dataset in another application, found that pickle can be used, and added this to the code:
import pickle

model_tfidf = LogisticRegression(C=30.0, class_weight='balanced', solver='newton-cg',
                                 multi_class='multinomial', n_jobs=-1, random_state=5)
model_tfidf.fit(x_train_tfidf, y_train)
predicted_tfidf = model_tfidf.predict(x_test_tfidf)

Pkl_Filename = "Pickle_RL_Model.pkl"
with open(Pkl_Filename, 'wb') as file:
    pickle.dump(model_tfidf, file)
After that I created a new project to load and use this model; the code is:
from sklearn.feature_extraction.text import TfidfVectorizer
import pickle

with open('Pickle_RL_Model.pkl', 'rb') as file:
    Pickled_LR_Model = pickle.load(file)

x = ["hi disaster", "flood disaster", "cry sad bad ", "srong storm"]
tfd = TfidfVectorizer()
new_data_vec = tfd.fit_transform(x)
Ypredict = Pickled_LR_Model.predict(new_data_vec)
But I got an error saying:
X has 8 features per sample; expecting 16988
I don't know what I did wrong; any help please.
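A likely cause (a hedged guess, since the snippet above is all we have): fitting a brand-new TfidfVectorizer on the four new sentences builds a vocabulary of only 8 features, while the model was trained on the 16988 features of the original vectorizer. The usual fix is to pickle the fitted vectorizer together with the model and call transform (not fit_transform) on new data. A minimal sketch, with 'Pickle_RL_Vectorizer.pkl' as a hypothetical filename and tfidf_vectorizer as the vectorizer fitted on the training data:
import pickle

# when saving: persist the *fitted* vectorizer alongside the model
with open('Pickle_RL_Vectorizer.pkl', 'wb') as file:
    pickle.dump(tfidf_vectorizer, file)

# when loading: reuse the saved vectorizer and only transform the new text
with open('Pickle_RL_Vectorizer.pkl', 'rb') as file:
    tfd = pickle.load(file)

x = ["hi disaster", "flood disaster", "cry sad bad ", "srong storm"]
new_data_vec = tfd.transform(x)  # transform, not fit_transform
Ypredict = Pickled_LR_Model.predict(new_data_vec)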

Deploy Keras model on Spark

I have a trained Keras model (https://github.com/qubvel/efficientnet).
I have a large, continuously updating dataset I want to get predictions on, meaning I'd run my Spark job every 2 hours or so.
What is the way to implement this? MLlib does not support EfficientNet.
When searching online I saw this kind of implementation using sparkdl, but it does not support EfficientNet as a modelName parameter:
featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features", modelName="InceptionV3")
rf = RandomForestClassifier(labelCol="label", featuresCol="features")
My naive approach would be
import efficientnet.keras as efn
from pyspark.sql.functions import col

model = efn.EfficientNetB0(weights='imagenet')

from sparkdl import readImages
image_df = readImages("flower_photos/sample/")
image_df = image_df.withColumn("modelTags", efficient_net_udf(col("image.data")))
and creating a UDF that calls model.predict...
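For example, a hypothetical sketch of such a UDF (column names, target size, and the use of file paths instead of raw image bytes are all assumptions):
import numpy as np
import pandas as pd
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import ArrayType, FloatType

@pandas_udf(ArrayType(FloatType()))
def efficient_net_udf(paths: pd.Series) -> pd.Series:
    # load the model inside the UDF so each executor gets its own copy
    import efficientnet.keras as efn
    from keras.preprocessing.image import img_to_array, load_img
    model = efn.EfficientNetB0(weights='imagenet')
    batch = np.stack([img_to_array(load_img(p, target_size=(224, 224)))
                      for p in paths])
    preds = model.predict(batch)
    return pd.Series([p.tolist() for p in preds])

# usage sketch, assuming a DataFrame with a 'uri' column of image paths:
# image_df = image_df.withColumn("modelTags", efficient_net_udf(image_df["uri"]))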
Another method I saw is
from keras.preprocessing.image import img_to_array, load_img
import numpy as np
import os
from pyspark.sql.types import StringType
from sparkdl import KerasImageFileTransformer
import efficientnet.keras as efn

model = efn.EfficientNetB0(weights='imagenet')
model.save("kerasModel.h5")

def loadAndPreprocessKeras(uri):
    image = img_to_array(load_img(uri, target_size=(299, 299)))
    image = np.expand_dims(image, axis=0)
    return image

transformer = KerasImageFileTransformer(inputCol="uri", outputCol="predictions",
                                        modelFile='path/kerasModel.h5',
                                        imageLoader=loadAndPreprocessKeras,
                                        outputMode="vector")

files = [os.path.abspath(os.path.join("/data/myimages", f))
         for f in os.listdir("/data/myimages") if f.endswith('.jpg')]
uri_df = sqlContext.createDataFrame(files, StringType()).toDF("uri")
keras_pred_df = transformer.transform(uri_df)
What is the correct (and working) way to approach this?

How to reduce memory usage?

I am trying to generate a pickle file of the predictions on my dataset, but after executing the code for 6 hours my PC keeps running out of memory. I wonder if anyone can help me with this?
from keras.models import load_model
import sys
sys.setrecursionlimit(10000)
import pickle
import os
import cv2
import glob

dirlist = []
imgdirs = os.listdir('/chars/')
imgdirs.sort(key=float)
for imgdir in imgdirs:
    imglist = []
    for imgfile in glob.glob(os.path.join('/chars/', imgdir, '*.png')):
        img = cv2.imread(imgfile)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        model = load_model('mymodel.h5')
        predictions = model.predict(img)
        print('predicted model:', predictions)
        imglist.append(predictions)
    dirlist.append(imglist)

q = open("predict.pkl", "wb")
pickle.dump(dirlist, q)
q.close()
First of all, why do you reload your model for every prediction?
The code would be much faster if you loaded your model only once and then did the predictions.
Also, loading several pictures at once and predicting in batches would be a big speed boost.
Which out-of-memory error do you get?
One from TensorFlow (or whichever backend you're using), or one from Python?
My best guess is that load_model is loading the same model over and over into the same TensorFlow session until your resources are exhausted.
The solution, as stated above, is to just load the model once at the beginning.
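A hedged sketch of that advice applied to the code from the question (the reshape before model.predict is an assumption, since Keras models expect a batched array):
import os
import glob
import pickle
import cv2
import numpy as np
from keras.models import load_model

model = load_model('mymodel.h5')  # load the model once, outside all loops

dirlist = []
imgdirs = sorted(os.listdir('/chars/'), key=float)
for imgdir in imgdirs:
    files = glob.glob(os.path.join('/chars/', imgdir, '*.png'))
    imgs = [cv2.cvtColor(cv2.imread(f), cv2.COLOR_BGR2GRAY) for f in files]
    if not imgs:
        dirlist.append([])
        continue
    # one batched predict per directory instead of one per image
    batch = np.stack(imgs)[..., np.newaxis]
    predictions = model.predict(batch)
    dirlist.append(list(predictions))

with open("predict.pkl", "wb") as q:
    pickle.dump(dirlist, q)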

Tensorflow Scikit Flow get GraphDef for Android (save *.pb file)

I want to use my TensorFlow algorithm in an Android app. The TensorFlow Android example starts by downloading a GraphDef that contains the model definition and weights (in a *.pb file). Now this should come from my Scikit Flow algorithm (part of TensorFlow).
At first glance it seems easy: you just have to call classifier.save('model/'). But the files saved to that folder are not *.ckpt, *.def, and certainly not *.pb; instead you have to deal with a *.pbtxt and a checkpoint file (without an extension).
I've been stuck there for quite a while. Here is a code example that exports something:
# imports
import tensorflow as tf
import tensorflow.contrib.learn as skflow
import tensorflow.contrib.learn.python.learn as learn
from sklearn import datasets, metrics

# skflow example
iris = datasets.load_iris()
feature_columns = learn.infer_real_valued_columns_from_input(iris.data)
classifier = learn.LinearClassifier(n_classes=3, feature_columns=feature_columns,
                                    model_dir="modeltest")
classifier.fit(iris.data, iris.target, steps=200, batch_size=32)

iris_predictions = list(classifier.predict(iris.data, as_iterable=True))
score = metrics.accuracy_score(iris.target, iris_predictions)
print("Accuracy: %f" % score)
The files you get are:
checkpoint
graph.pbtxt
model.ckpt-1.meta
model.ckpt-1-00000-of-00001
model.ckpt-200.meta
model.ckpt-200-00000-of-00001
Many possible workarounds I found would require having the GraphDef in a variable (I don't know how to get that with Scikit Flow), or a TensorFlow session, which doesn't seem to be exposed when using Scikit Flow.
To save a .pb file, you need to extract the graph_def from the constructed graph. You can do that as follows:
import tensorflow as tf
from tensorflow.python.framework import graph_util
from tensorflow.python.platform import gfile

sess = tf.Session()
final_tensor_name = 'results:0'  # replace with the name of the final tensor in your graph

######### Build your graph and train #########
## Your tensorflow code to build the graph
##############################################

output_filename = 'output_graph.pb'
output_graph_def = sess.graph.as_graph_def()

with gfile.FastGFile(output_filename, 'wb') as f:
    f.write(output_graph_def.SerializeToString())
If you want to convert your trained variables to constants (to avoid using ckpt files to load the weights), you can use:
# note: convert_variables_to_constants expects node names (no ':0' suffix)
output_graph_def = graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), [final_tensor_name.split(':')[0]])
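The frozen graph_def can then be written out with the same gfile pattern as above (a short hedged sketch reusing output_filename from the earlier snippet):
# write the frozen graph (variables baked into constants) next to the original
with gfile.FastGFile('frozen_' + output_filename, 'wb') as f:
    f.write(output_graph_def.SerializeToString())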
Hope this helps!
