Parse '.caffemodel.h5' for Lasagne - theano

I have trained a model from caffe in ".caffemodel.h5" format. I want to parse it to extract parameters and feed it to a lasagne model. How can I do it?

You will have to replicate the architecture of your caffemodel in terms of Lasagne layers first, by hand, or use a script that does this (a script that takes a caffe protobuf and translates it to lasagne exists AFAIK, or should be made if not. In any case it exists in sklearn-theano.)
By hand, you need to open the hf5 file using e.g. pytables, and then replicate the architecture in Lasagne.

Related

Convert timm model to huggingface

I have a (PyTorch) timm ViT-B/16 model that's been pre-trained on a bunch of domain specific data. I'd like to be able to load the parameters to an equivalent model created using the huggingface transformers library for usage with multi-modal data.
Googling hasn't really helped me locate a convenience function to do the conversion. Apart from going layer by layer and manually translating the keys of the state dictionary, is there any way to do this conversion?
And in case I'm missing something, if there's an intervening layer (say a BatchNorm) that doesn't have an equivalent in either model - is the conversion still useful?

Using AllenNLP Interpret with a HuggingFace model

I would like to use AllenNLP Interpret (code + demo) with a PyTorch classification model trained with HuggingFace (electra base discriminator). Yet, it is not obvious to me, how I can convert my model, and use it in a local allen-nlp demo server.
How should I proceed ?
Thanks in advance
If your task is binary classification, you can look at the BoolQ example in https://github.com/allenai/allennlp-models/blob/main/training_config/classification/boolq_roberta.jsonnet. You can change that configuration to use a different model (such as Electra).
We also just put some new documentation out for the Interpret functionality: https://guide.allennlp.org/interpret
To give you a more specific answer, I'll need to know some more details, like what the task is you're trying to solve, how you trained the original model, etc.

Can I combine two ONNX graphs together, passing the output from one as input to another?

I have a model, exported from pytorch, I'll call main_model.onnx. It has an input node I'll call main_input that expects a list of integers. I can load this in onnxruntime and send a list of ints and it works great.
I made another ONNX model I'll call pre_model.onnx with input pre_input and output pre_output. This preprocesses some text so input is the text, and pre_output is a list of ints, exactly as main_model.onnx needs for input.
My goal here is, using the Python onnx.helper tools, create one uber-model that accepts text as input, and runs through my pre-model.onnx, possibly some connector node (Identity maybe?), and then through main_model.onnx all in one big combined.onnx model.
I have tried using pre_model.graph.node+Identity connector+main_model.graph.node as nodes in a new graph, but the parameters exported from pytorch are lost this way. Is there a way to keep all those parameters and everything around, and export this one even larger combined ONNX model?
This is possible to achieve albeit a bit tricky. You can explore the Python APIs offered by ONNX (https://github.com/onnx/onnx/blob/master/docs/PythonAPIOverview.md). This will allow you to load models to memory and you'll have to "compose" your combined model using the APIs exposed (combine both the GraphProto messages into one - this is easier said than done - you' ll have to ensure that you don't violate the onnx spec while doing this) and finally store the new Graphproto in a new ModelProto and you have your combined model. I would also run it through the onnx checker on completion to ensure the model is valid post its creation.
If you have static size inputs, sclblonnx package is an easy solution for merging Onnx models. However, it does not support dynamic size inputs.
For dynamic size inputs, one solution would be writing your own code using ONNX API as stated earlier.
Another solution would be converting the two ONNX models to a framework(Tensorflow or PyTorch) using tools like onnx-tensorflow or onnx2pytorch. Then pass the outputs of one network as inputs of the other network and export the whole network to Onnx format.

What is the difference between a .ckpt and a .pth file in Pytorch?

I am following a code from GitHub that uses Pytorch.
The model is saved using :
model.save(ARGS.working_dir + '/model_%d.ckpt' % (epoch+1)).
What is the difference between using .pth and .ckpt in Pytorch?
There is no difference. the extension in Pytorch models that you see is something random. You can choose anything.
People usually use pth to indicate a PyTorcH model (and hence .pth). but then again its completely up to you on how you want to save your model.

Extracting fixed vectors from BioBERT without using terminal command?

If we want to use weights from pretrained BioBERT model, we can execute following terminal command after downloading all the required BioBERT files.
os.system('python3 extract_features.py \
--input_file=trial.txt \
--vocab_file=vocab.txt \
--bert_config_file=bert_config.json \
--init_checkpoint=biobert_model.ckpt \
--output_file=output.json')
The above command actually reads individual file containing the text, reads the textual content from it, and then writes the extracted vectors to another file. So, the problem with this is that it could not be scaled easily for very large data-sets containing thousands of sentences/paragraphs.
Is there is a way to extract these features on the go (using an embedding layer) like it could be done for the word2vec vectors in PyTorch or TF1.3?
Note: BioBERT checkpoints do not exist for TF2.0, so I guess there is no way it could be done with TF2.0 unless someone generates TF2.0 compatible checkpoint files.
I will be grateful for any hint or help.
You can get the contextual embeddings on the fly, but the total time spend on getting the embeddings will always be the same. There are two options how to do it: 1. import BioBERT into the Transformers package and treat use it in PyTorch (which I would do) or 2. use the original codebase.
1. Import BioBERT into the Transformers package
The most convenient way of using pre-trained BERT models is the Transformers package. It was primarily written for PyTorch, but works also with TensorFlow. It does not have BioBERT out of the box, so you need to convert it from TensorFlow format yourself. There is convert_tf_checkpoint_to_pytorch.py script that does that. People had some issues with this script and BioBERT (seems to be resolved).
After you convert the model, you can load it like this.
import torch
from transformers import *
# Load dataset, tokenizer, model from pretrained model/vocabulary
tokenizer = BertTokenizer.from_pretrained('directory_with_converted_model')
model = BertModel.from_pretrained('directory_with_converted_model')
# Call the model in a standard PyTorch way
embeddings = model([tokenizer.encode("Cool biomedical tetra-hydro-sentence.", add_special_tokens=True)])
2. Use directly BioBERT codebase
You can get the embeddings on the go basically using the code that is exctract_feautres.py. On lines 346-382, they initialize the model. You get the embeddings by calling estimator.predict(...).
For that, you need to format your format the input. First, you need to format the string (using code on line 326-337) and then apply and call convert_examples_to_features on it.

Resources