Tensorflow Scikit Flow get GraphDef for Android (save *.pb file) - scikit-learn

I want to use my TensorFlow algorithm in an Android app. The TensorFlow Android example starts by downloading a GraphDef that contains the model definition and weights (in a *.pb file). Now this file should come from my Scikit Flow algorithm (part of TensorFlow).
At first glance it seems easy: you just call classifier.save('model/'). But the files saved to that folder are not *.ckpt, *.def and certainly not *.pb. Instead you have to deal with a *.pbtxt and a checkpoint file (without an extension).
I have been stuck on this for quite a while. Here is a code example that exports something:
#imports
import tensorflow as tf
import tensorflow.contrib.learn as skflow
import tensorflow.contrib.learn.python.learn as learn
from sklearn import datasets, metrics
#skflow example
iris = datasets.load_iris()
feature_columns = learn.infer_real_valued_columns_from_input(iris.data)
classifier = learn.LinearClassifier(n_classes=3, feature_columns=feature_columns,model_dir="modeltest")
classifier.fit(iris.data, iris.target, steps=200, batch_size=32)
iris_predictions = list(classifier.predict(iris.data, as_iterable=True))
score = metrics.accuracy_score(iris.target, iris_predictions)
print("Accuracy: %f" % score)
The files you get are:
checkpoint
graph.pbtxt
model.ckpt-1.meta
model.ckpt-1-00000-of-00001
model.ckpt-200.meta
model.ckpt-200-00000-of-00001
Many of the possible workarounds I found require either having the GraphDef in a variable (I don't know how to get that with Scikit Flow) or a TensorFlow session, which doesn't seem to be required when using Scikit Flow.

To save a .pb file, you need to extract the graph_def from the constructed graph. You can do that as follows:
import tensorflow as tf
from tensorflow.python.framework import graph_util
from tensorflow.python.platform import gfile
sess = tf.Session()
final_tensor_name = 'results:0'  # Replace with the name of the final tensor in your graph
######### Build your graph and train ########
## Your TensorFlow code to build the graph
#############################################
output_filename = 'output_graph.pb'
# Serialize the graph structure (GraphDef) to a binary .pb file
output_graph_def = sess.graph.as_graph_def()
with gfile.FastGFile(output_filename, 'wb') as f:
    f.write(output_graph_def.SerializeToString())
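If you want to convert your trained variables to constants (to avoid using ckpt files to load the weights), you can use the call below. Note that it expects node names, so the ':0' output index is stripped from the tensor name:
output_graph_def = graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), [final_tensor_name.split(':')[0]])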
Hope this helps!

Related

Getting error while trying to save and apply existing machine learning model to new dataset?

I am trying to use this model: https://github.com/aninda052/Disasters-on-social-media-NLP/blob/master/Disasters%20on%20social%20media.ipynb
I searched for a way to save this model and use it with a new dataset in another application, and found that I could use pickle. I added it to the code like this:
import pickle
model_tfidf=LogisticRegression(C=30.0, class_weight='balanced', solver='newton-cg',
                               multi_class='multinomial', n_jobs=-1, random_state=5)
model_tfidf.fit(x_train_tfidf, y_train)
predicted_tfidf=model_tfidf.predict(x_test_tfidf)
Pkl_Filename = "Pickle_RL_Model.pkl"
with open(Pkl_Filename, 'wb') as file:
    pickle.dump(model_tfidf, file)
After that I created a new project to load and use this model. The code is:
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
import pandas as pd
import pickle
with open('Pickle_RL_Model.pkl', 'rb') as file:
    Pickled_LR_Model = pickle.load(file)
x=["hi disaster","flood disaster","cry sad bad ","srong storm"]
tfd=TfidfVectorizer()
new_data_vec=tfd.fit_transform(x)
Ypredict = Pickled_LR_Model.predict(new_data_vec)
but I got an error that said:
X has 8 features per sample; expecting 16988
I don't know what I did wrong. Any help, please?
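One likely reading of the error: the model was trained on the 16988-term vocabulary of the original TfidfVectorizer, while the new project fits a fresh vectorizer on four short strings and so produces only 8 features. A minimal sketch of one way around this, assuming scikit-learn is installed, is to fit the vectorizer and the classifier as a single pipeline and pickle that. The toy training texts and labels below are hypothetical stand-ins for the notebook's data, and `pickle.dumps`/`pickle.loads` are used in place of a file only to keep the sketch self-contained.

```python
import pickle
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the notebook's training set.
train_texts = ["flood destroyed the bridge", "storm hit the coast",
               "lovely sunny day", "great coffee this morning"]
train_labels = [1, 1, 0, 0]

# Fit vectorizer and classifier together so they share one vocabulary.
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression())
pipeline.fit(train_texts, train_labels)

# Persist the whole pipeline, not only the classifier.
blob = pickle.dumps(pipeline)

# In the other application: load it and predict directly on raw strings.
loaded = pickle.loads(blob)
new_texts = ["hi disaster", "flood disaster", "cry sad bad", "strong storm"]
predictions = loaded.predict(new_texts)
print(predictions)
```

The key point is that the loaded object only ever calls `transform` on new text with the vocabulary learned during training; it never refits. If the classifier is already pickled on its own, pickling the fitted vectorizer alongside it and calling its `transform` (not `fit_transform`) should have the same effect.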

Explanation of mathematics in dot file saved from Decision Tree Regression model

I am trying to solve a regression problem using the decision tree algorithm, and I want to understand the mathematics behind the dot file that is generated after saving the trained model. Note that none of the variables is categorical.
I have gone through this and this, but their values are categorical.
The code I have written for this purpose is given below:
import numpy as np
from sklearn import tree
# Each row is [feature_1, feature_2, feature_3, target]; all values are numeric.
data = np.array([[2,5,1,10],[3,7,2,12],[5,9,4,14],[6,3,3,16],[2,5,8,7],[1,1,1,1]])
feature = data[:,:-1]
target = data[:,-1]
target = np.reshape(target,(-1,1))
model_tree = tree.DecisionTreeRegressor()
model_tree.fit(feature, target)
# Write the fitted tree in Graphviz DOT format (returns None when out_file is given).
tree.export_graphviz(model_tree, out_file='manual_1.dot')
Here is the graph I got from the saved dot file.
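As a rough sketch of the arithmetic behind such a graph, assuming the default squared-error criterion: each node's `value` is the mean of the targets that reach it, its impurity (labelled `mse` or `squared_error` depending on the scikit-learn version) is the mean squared deviation from that mean, and a split is chosen to minimise the sample-weighted impurity of the two children. The helper names below are illustrative, not scikit-learn functions, and the candidate split shown is only an example, not necessarily the one the fitted tree picks.

```python
# Targets from the question's data (last column of `data`).
targets = [10, 12, 14, 16, 7, 1]

def node_value(ys):
    """Prediction stored in a node: the mean of its targets."""
    return sum(ys) / len(ys)

def node_mse(ys):
    """Node impurity: mean squared deviation of the targets from their mean."""
    mean = node_value(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)

def split_cost(left, right):
    """Sample-weighted impurity of the two children of a candidate split."""
    n = len(left) + len(right)
    return (len(left) * node_mse(left) + len(right) * node_mse(right)) / n

root_value = node_value(targets)
root_mse = node_mse(targets)

# Example candidate split: isolate the sample whose target is 1.
example_cost = split_cost([1], [10, 12, 14, 16, 7])

print(root_value, root_mse, example_cost)
```

For the root node this gives a value of 10.0 and an impurity of 146/6, about 24.33. Any split whose weighted child impurity is lower than that reduces the error, and the tree keeps splitting until its leaves are pure or a stopping rule applies.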

Save and load a Pytorch model

I am trying to train a PyTorch model on Colab, then save the model parameters and load them on my local computer.
After training, the model parameters are stored as below:
torch.save(Model.state_dict(),PATH)
loaded as below:
device = torch.device('cpu')
Model.load_state_dict(torch.load(PATH, map_location=device))
error:
AttributeError: 'Sequential' object has no attribute 'copy'
Does anyone know how to solve this issue?
Your question does not provide sufficient detail to be answered correctly. If you are trying to save and load your own model and have a class definition for it, see this well-known answer and clarify why it is not sufficient for your use case.
If you are loading a torch.nn.Sequential model then, as far as I know, simply loading the model directly and using it should be sufficient. If it is not, post the error you get on the PyTorch forum.
For now, look at my example showcasing loading a sequential model and then using it without error:
# test for saving everything with torch.save
import torch
import torch.nn as nn
from pathlib import Path
from collections import OrderedDict
import numpy as np
import pickle
path = Path('~/data/tmp/').expanduser()
path.mkdir(parents=True, exist_ok=True)
num_samples = 3
Din, Dout = 1, 1
lb, ub = -1, 1
x = torch.distributions.Uniform(low=lb, high=ub).sample((num_samples, Din))
f = nn.Sequential(OrderedDict([
    ('f1', nn.Linear(Din,Dout)),
    ('out', nn.SELU())
]))
y = f(x)
# save data torch to numpy
x_np, y_np = x.detach().cpu().numpy(), y.detach().cpu().numpy()
db2 = {'f': f, 'x': x_np, 'y': y_np}
torch.save(db2, path / 'db_f_x_y')
db3 = torch.load(path / 'db_f_x_y')
f3 = db3['f']
x3 = db3['x']
y3 = db3['y']
xx = torch.tensor(x3)
yy3 = f3(xx)
print(yy3)
There should be an official answer on how to save and load nn.Sequential models (see "How does one save torch.nn.Sequential models in pytorch properly?"), but for now torch.save and torch.load seem to work just fine.

How to reduce memory usage?

I am trying to generate a pickle file of the predictions on my dataset, but after running the code for 6 hours the PC runs out of memory again and again. Can anyone help me with this?
from keras.models import load_model
import sys
sys.setrecursionlimit(10000)
import pickle
import os
import cv2
import glob
dirlist = []
imgdirs = os.listdir('/chars/')
imgdirs.sort(key=float)
for imgdir in imgdirs:
    imglist = []
    for imgfile in glob.glob(os.path.join('/chars/', imgdir, '*.png')):
        img = cv2.imread(imgfile)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        model = load_model('mymodel.h5')
        predictions=model.predict(img)
        print('predicted model:', predictions)
        imglist.append(predictions)
    dirlist.append(imglist)
q = open("predict.pkl","wb")
pickle.dump(dirlist,q)
q.close()
First of all, why do you reload your model for every prediction?
The code would be much faster if you loaded your model only once and then ran the predictions.
Loading several pictures at once and predicting in batches would also be a big speed boost.
What out-of-memory error do you get?
Is it from TensorFlow (or whichever backend you're using) or from Python?
My best guess is that load_model is loading the same model over and over in the same TensorFlow session until your resources are exhausted.
The solution is, as stated above, to load the model just once at the beginning.
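The advice above can be sketched roughly as follows. This is only an illustration under assumptions: `predict_in_batches` and `MeanModel` are hypothetical names, `MeanModel` is a stand-in for the Keras model so the sketch runs without Keras or a model file, and all images are assumed to have the same shape so they can be stacked.

```python
import numpy as np

def predict_in_batches(model, images, batch_size=32):
    """Call model.predict once per batch and return one result per image."""
    results = []
    for start in range(0, len(images), batch_size):
        # Stack same-shaped images into one array of shape (batch, H, W).
        batch = np.stack(images[start:start + batch_size])
        results.extend(model.predict(batch))
    return results

class MeanModel:
    """Hypothetical stand-in for a Keras model: returns each image's mean pixel."""
    def predict(self, batch):
        return batch.reshape(len(batch), -1).mean(axis=1)

# Create the model ONCE, outside any loop. In the question's code this
# would be the single call to load_model('mymodel.h5').
model = MeanModel()

images = [np.full((4, 4), i, dtype=np.float32) for i in range(5)]
predictions = predict_in_batches(model, images, batch_size=2)
print(predictions)
```

With the real model, the image list would come from `cv2.imread` as in the question, and the only structural change is that the model is loaded before the loops rather than inside them.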

Using Keras/sklearn with CalibratedClassifierCV from sklearn.calibration

Is it possible to use Keras model objects with CalibratedClassifierCV from sklearn.calibration? Or is there another way to perform isotonic regression in sklearn or other Python packages without having to pass it a model object?
I tried using the sklearn wrapper for Keras, but it didn't work. Here is the doc for the CalibratedClassifierCV class.
You can train an isotonic regression a posteriori, after prediction. Let 'file1' be a CSV file containing your predictions pred and the real observed events obs on a subset of the data. Ideally, this subset has never been used before (not even in Keras training). Let 'file2' contain the predictions you want to calibrate (the Keras predictions for the test set).
import pandas as pd
from sklearn.isotonic import IsotonicRegression
never_seen=pd.read_csv('file1')
uncalibrated=pd.read_csv('file2')
ir = IsotonicRegression( out_of_bounds = 'clip' )
ir.fit( never_seen.pred,never_seen.obs )
p_calibrated = ir.transform( uncalibrated.pred )
