scikit learn upgrade causes failure when old models are loaded - python-3.x

I trained some data science models with scikit-learn v0.19.1. The models are stored in a pickle file. After upgrading to the latest version (v0.23.1), I get the following error when I try to load them:
File "../../Utils/WebsiteContentSelector.py", line 100, in build_page_selector
page_selector = pickle.load(pkl_file)
AttributeError: Can't get attribute 'DeprecationDict' on <module 'sklearn.utils.deprecation' from '/usr/local/lib/python3.6/dist-packages/sklearn/utils/deprecation.py'>
Is there a way to upgrade without retraining all my models (which is very expensive)?

You are loading a model with a newer version of sklearn than the one it was trained with, and pickled models are not guaranteed to be compatible across sklearn versions.
So, the options are:
Retrain the model with the current version of sklearn, if you have the training script and data
Or fall back to the older sklearn version reported in the warning message

It depends on the kind of sklearn model used. If it is a simple regression model, what you probably need is just the actual weights and bias (or intercept) values.
You can check these values in your model:
model.classes_
model.coef_
model.intercept_
They are NumPy arrays and can be pickled easily. You also need the same parameters that were passed to the model constructor. For example:
tol
max_iter
and so on. With this, in the upgraded version, a model created with the same parameters can be given the saved weights and intercept.
In this way, no retraining is needed and you can use the upgraded sklearn.

When lib versions are not backward compatible you can do the following:
Downgrade sklearn back to the original version
Load each model, extract and store its coefficients (which are model-specific - check documentation)
Upgrade sklearn, load coefficients and init models with them, save models
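The steps above can be sketched roughly as follows. This is only a sketch under several assumptions: the model is a LogisticRegression (other estimators keep their state in different attributes), its hyperparameters are JSON-serializable, and the helper names and the file name model_params.json are made up for illustration.

```python
import json


def export_linear_model(model, path=None):
    """Run under the OLD sklearn: dump hyperparameters and fitted arrays as JSON."""
    payload = {
        # Constructor arguments such as tol and max_iter
        "params": model.get_params(),
        # Fitted attributes; NumPy arrays become nested lists via tolist()
        "fitted": {
            name: getattr(model, name).tolist()
            for name in ("classes_", "coef_", "intercept_")
        },
    }
    if path is not None:
        with open(path, "w") as fh:
            json.dump(payload, fh)
    return payload


def restore_logistic_regression(path):
    """Run under the NEW sklearn: rebuild the estimator without refitting."""
    # Imported here so the export helper above has no dependencies of its own
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    with open(path) as fh:
        payload = json.load(fh)
    # Hyperparameters that were renamed or removed between versions
    # need to be adjusted by hand before this call.
    model = LogisticRegression(**payload["params"])
    for name, value in payload["fitted"].items():
        setattr(model, name, np.asarray(value))
    return model
```

In the old environment you would call export_linear_model(old_model, "model_params.json"), and after upgrading restore_logistic_regression("model_params.json"). It is worth comparing the predictions of both models on a few samples before saving the rebuilt one.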

Related

convert tensorflow.keras model to keras model

I have an EfficientNet model (tensorflow.keras==2.4) and would like to use innvestigate to inspect the results, but it requires keras==2.2.4
Training code:
tensorflow.keras.__version__ # 2.4
model = tf.keras.applications.EfficientNetB1(**params)
# do training
model.save('testModel')
I have the model saved as a file but cannot load it into Keras 2.2.4. This is the point where I'm stuck; I couldn't figure out what to do to convert the model.
Use Innvestigate:
keras.__version__ # 2.2.4
keras.model.load_model('testModel') # Error
# some more stuff...
I also found this thread, "How to load tf.keras models with keras", and might try it, but since EfficientNet has more than 350 layers it is not really applicable.
I don't know if it's actually possible to convert models between tensorflow.keras and keras, I appreciate all help I can get.
You were not able to load the model because of a version incompatibility between tensorflow and keras.
Your issue should be resolved once you upgrade keras and tensorflow to 2.5.

Is there any way to speed up the predicting process for tensorflow lattice?

I built my own model with the Keras Premade Models in tensorflow lattice using python3.7 and saved the trained model. However, when I use the trained model for predicting, each data point takes on the order of milliseconds, which seems very slow. Is there any way to speed up the predicting process for tfl?
There are multiple ways to improve speed, but they may involve a tradeoff with prediction accuracy. I think the three most promising options are:
Reduce the number of features
Reduce the number of lattices per feature
Use an ensemble of lattice models where every lattice model only gets a subset of the features and then average the predictions of the different models (as described here)
As the lattice model is a standard Keras model, I recommend trying OpenVINO. It optimizes your model by converting to Intermediate Representation (IR), performing graph pruning and fusing some operations into others while preserving accuracy. Then it uses vectorization in runtime. OpenVINO is optimized for Intel hardware, but it should work with any CPU.
It's rather straightforward to convert the Keras model to OpenVINO. The full tutorial on how to do it can be found here. Some snippets are below.
Install OpenVINO
The easiest way to do it is using PIP. Alternatively, you can use this tool to find the best way in your case.
pip install openvino-dev[tensorflow2]
Save your model as SavedModel
OpenVINO is not able to convert the HDF5 model, so you have to save it as SavedModel first.
import tensorflow as tf
from custom_layer import CustomLayer
model = tf.keras.models.load_model('model.h5', custom_objects={'CustomLayer': CustomLayer})
tf.saved_model.save(model, 'model')
Use Model Optimizer to convert SavedModel model
The Model Optimizer is a command-line tool that comes from OpenVINO Development Package. It converts the Tensorflow model to IR, a default format for OpenVINO. You can also try the precision of FP16, which should give you better performance without a significant accuracy drop (change data_type). Run in the command line:
mo --saved_model_dir "model" --data_type FP32 --output_dir "model_ir"
Run the inference
The converted model can be loaded by the runtime and compiled for a specific device, e.g., CPU or GPU (integrated into your CPU like Intel HD Graphics). If you don't know what the best choice for you is, use AUTO. If you care about latency, I suggest adding a performance hint (as shown below) to use the device that fulfills your requirement. If you care about throughput, change the value to THROUGHPUT or CUMULATIVE_THROUGHPUT.
from openvino.runtime import Core

# Load the network
ie = Core()
model_ir = ie.read_model(model="model_ir/model.xml")
compiled_model_ir = ie.compile_model(model=model_ir, device_name="AUTO", config={"PERFORMANCE_HINT": "LATENCY"})
# Get output layer
output_layer_ir = compiled_model_ir.output(0)
# Run inference on the input image (input_image must match the model's input shape)
result = compiled_model_ir([input_image])[output_layer_ir]
Disclaimer: I work on OpenVINO.

Porting pre-trained keras models and run them on IPU

I am trying to port two pre-trained Keras models to the IPU machine. I managed to load and run them using IPUStrategy.scope, but I don't know if I am doing it the right way. I have my pre-trained models in .h5 file format.
I load them this way:
def first_model():
    model = tf.keras.models.load_model("./model1.h5")
    return model
After searching your ipu.keras.models.py file I couldn't find any load methods to load my pre-trained models, which is why I used tf.keras.models.load_model().
Then I use this code to run:
import tensorflow as tf
from tensorflow.python import ipu

# Configure the system to use a single IPU
cfg = ipu.utils.create_ipu_config()
cfg = ipu.utils.auto_select_ipus(cfg, 1)
ipu.utils.configure_ipu_system(cfg)
ipu.utils.move_variable_initialization_to_cpu()
strategy = ipu.ipu_strategy.IPUStrategy()
with strategy.scope():
    model = first_model()
    print('compile attempt\n')
    model.compile("sgd", "categorical_crossentropy", metrics=["accuracy"])
    print('compilation completed\n')
    print('running attempt\n')
    res = model.predict(input_img)[0]
    print('run completed\n')
You can see the output here: link
So I have some difficulty understanding how, and whether, the system is working properly.
Basically, model.compile won't compile my model, but when I use model.predict the system first compiles and then runs. Why is that happening? Is there another way to run pre-trained Keras models on an IPU chip?
Another question I have is whether it's possible to load a pre-trained Keras model inside an ipu.keras.Model and then use model.fit/evaluate to further train and evaluate it, and then save it for future use?
One last question I have is about the compilation part of the graph. Is there a way to avoid recompilation of the graph every time I use model.predict() in a different strategy.scope()?
I use the TensorFlow 2.1.2 wheel
Thank you for your time
To add some context, the Graphcore TensorFlow wheel includes a port of Keras for the IPU, available as tensorflow.python.ipu.keras. You can access the API documentation for IPU Keras at this link. This module contains IPU-specific optimised replacements for the TensorFlow Keras classes Model and Sequential, plus more high-performance, multi-IPU classes, e.g. PipelineModel and PipelineSequential.
As per your specific issue, you are right when you mention that there are no IPU-specific ways to load pre-trained Keras models at present. I would encourage you, as you appear to have access to IPUs, to reach out to Graphcore Support. When doing so, please attach your pre-trained Keras model model1.h5 and a self-contained reproducer of your code.
Switching topic to the recompilation question: using an executable cache prevents recompilation. You can set that up with the environment variable TF_POPLAR_FLAGS='--executable_cache_path=./cache'. I'd also recommend taking a look at the following resources:
this tutorial gathers several considerations around recompilation and how to avoid it when using TensorFlow2 on the IPU.
Graphcore TensorFlow documentation here explains how to use the pre-compile mode on the IPU.
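If you prefer to set the executable cache from within the script rather than in the shell, something like the following should work. It is a minimal sketch: the helper name enable_executable_cache is made up, and I am assuming the variable has to be in the environment before TensorFlow is imported.

```python
import os


def enable_executable_cache(cache_dir="./cache"):
    """Add --executable_cache_path to TF_POPLAR_FLAGS, keeping any existing flags."""
    flag = f"--executable_cache_path={cache_dir}"
    existing = os.environ.get("TF_POPLAR_FLAGS", "")
    # Leave the variable alone if a cache path was already configured
    if "--executable_cache_path" not in existing:
        os.environ["TF_POPLAR_FLAGS"] = f"{existing} {flag}".strip()
    return os.environ["TF_POPLAR_FLAGS"]


# Assumption: call this before `import tensorflow` so the flag is picked up
enable_executable_cache("./cache")
```

Appending to the existing value, instead of overwriting it, keeps any other Poplar flags you may already have set.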

scikit-learn: use classifier from older version

I have a pickled scikit-learn classifier trained in version 0.14.1, and I want to use it on my mac which has scikit-learn version 0.17.1. When I try to run clf.score I get the following error:
AttributeError: 'SVC' object has no attribute '_dual_coef_'
Googling this it looks like the problem is with the difference in version between making the classifier and now wanting to test it. If retraining is not an option, is there a way to upgrade the classifier to work in the new version of scikit-learn?

sklearn: Regression models on sparse data?

Does python's scikit-learn have any regression models that work well with sparse data?
I was poking around and found this "sparse linear regression" module, but it seems outdated. (It's so old that scikit-learn was called 'scikits-learn' at the time, I think.)
Most scikit-learn regression models (linear ones such as Ridge, Lasso, ElasticNet, or non-linear ones, e.g. RandomForestRegressor) support both dense and sparse input data in recent versions of scikit-learn (0.16.0 is the latest stable version at the time of writing).
Edit: if you are unsure, check the docstring of the fit method of the class of interest.
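The docstring check can also be done programmatically. The helper below is only a rough heuristic of my own (the name fit_mentions_sparse is made up): it looks for the word "sparse" in the docstring of fit, so reading the docstring yourself is still the reliable way.

```python
import inspect


def fit_mentions_sparse(estimator_cls):
    """Return True if the docstring of `estimator_cls.fit` mentions sparse input."""
    doc = inspect.getdoc(estimator_cls.fit) or ""
    return "sparse" in doc.lower()


# Example usage (requires scikit-learn to be installed):
# from sklearn.linear_model import Ridge
# print(fit_mentions_sparse(Ridge))
```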
