Custom Loss Function with Spacy Textcat - python-3.x

I've been looking around for a while now. I would like to know whether it's possible to modify/customize the loss function of the spaCy text categorizer.
Specifically, when distilling a model (for instance BERT), I want to add a regression component to the loss so it optimizes against the teacher's class probabilities instead of only the hard labels, but I don't understand where to look. I explored some of the spaCy code, but all I found is a function that returns the loss.
If someone knows where to look to inspect the loss function and change it (by writing a subclass, for instance), that would be great!
Thanks
Arnault

spaCy is ultimately built on top of Thinc, so if you want to do custom work, you should tinker with Thinc, not spaCy. spaCy typically allows you to initialize a pipe with a raw Thinc model.
This is especially true because spaCy's philosophy is to provide one implementation that works well, not a highly customizable framework.
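That said, for the specific textcat case, one place to hook in is the component's get_loss method. Below is a minimal, untested sketch assuming spaCy v3, where TextCategorizer.get_loss(examples, scores) returns a (loss, gradient) pair; storing the teacher probabilities in doc.user_data["teacher_probs"] and the 0.5 weight are made-up conventions for this example, not part of spaCy.

```python
# Sketch only: add a distillation-style regression term to spaCy's textcat loss.
# Assumes spaCy v3, where get_loss(examples, scores) returns (loss, d_scores).
# CPU-only sketch; on GPU you would use the model's ops instead of numpy.
import numpy
from spacy.pipeline import TextCategorizer


class DistillingTextCategorizer(TextCategorizer):
    def get_loss(self, examples, scores):
        # Hard-label loss and gradient from the stock implementation
        loss, d_scores = super().get_loss(examples, scores)
        # Regression term pulling the scores towards the teacher's probabilities,
        # which this sketch assumes were attached to each reference Doc
        teacher = numpy.asarray(
            [eg.reference.user_data["teacher_probs"] for eg in examples],
            dtype="float32",
        )
        d_soft = numpy.asarray(scores, dtype="float32") - teacher
        loss += 0.5 * float((d_soft ** 2).sum())
        d_scores = d_scores + 0.5 * d_soft
        return loss, d_scores
```

To actually use it, you would register the subclass as a pipeline factory (spaCy v3's @Language.factory decorator) so it can be added with nlp.add_pipe; the gradient returned here is what Thinc backpropagates through the model.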

Related

Convert timm model to huggingface

I have a (PyTorch) timm ViT-B/16 model that has been pre-trained on a large amount of domain-specific data. I'd like to load its parameters into an equivalent model created with the Hugging Face transformers library, for use with multi-modal data.
Googling hasn't really helped me locate a convenience function for the conversion. Apart from going layer by layer and manually translating the keys of the state dictionary, is there any way to do this?
And in case I'm missing something: if there's an intervening layer (say a BatchNorm) that doesn't have an equivalent in the other model, is the conversion still useful?
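For illustration, the manual route mentioned above might look roughly like the sketch below; the checkpoint names and renaming rules are placeholders, not a verified timm-to-transformers mapping, and every key has to be checked by hand.

```python
# Rough illustration of manually translating state-dict keys from a timm ViT
# to a transformers ViTModel. The renaming rules are examples only; the real
# mapping has to be verified key by key and shape by shape.
import timm
from transformers import ViTModel

timm_model = timm.create_model("vit_base_patch16_224", pretrained=False)
timm_sd = timm_model.state_dict()  # in practice, load your fine-tuned weights

hf_model = ViTModel.from_pretrained("google/vit-base-patch16-224")
hf_sd = hf_model.state_dict()

# Placeholder renaming rules (timm key -> transformers key)
rename = {
    "cls_token": "embeddings.cls_token",
    "pos_embed": "embeddings.position_embeddings",
    "patch_embed.proj.weight": "embeddings.patch_embeddings.projection.weight",
    "patch_embed.proj.bias": "embeddings.patch_embeddings.projection.bias",
}

converted = {}
for timm_key, tensor in timm_sd.items():
    hf_key = rename.get(timm_key)
    if hf_key in hf_sd and hf_sd[hf_key].shape == tensor.shape:
        converted[hf_key] = tensor

# strict=False reports what is still missing instead of raising
missing, unexpected = hf_model.load_state_dict(converted, strict=False)
print(f"copied {len(converted)} tensors, {len(missing)} still missing")
```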

How to get the inference compute graph of the pytorch model?

I want to hand-write a framework to perform inference of a given neural network. The network is quite complicated, so to make sure my implementation is correct, I need to know exactly how the inference process is executed on the device.
I tried to use torchviz to visualize the network, but what I got seems to be the back-propagation compute graph, which is really hard to understand.
Then I tried to convert the PyTorch model to ONNX format, but when I visualized it, the original layers of the model seemed to have been separated into very small operators.
I just want to get a result like this:
How can I get this? Thanks!
Have you tried saving the model with torch.save (https://pytorch.org/tutorials/beginner/saving_loading_models.html) and opening it with Netron? The last view you showed is a view of the Netron app.
You can also try the torchview package, which provides several features that are useful especially for large models. For instance, you can set the display depth (the depth in the nested hierarchy of modules).
It is also based on forward propagation.
github repo
Disclaimer: I am the author of the package.
Note: the tool accepts a PyTorch model as input.
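For what it's worth, a minimal sketch of the torchview route (assuming its draw_graph(model, input_size=..., depth=...) API, with a torchvision model standing in for yours):

```python
# Minimal sketch: render a forward-pass graph with torchview.
# torchvision's resnet18 stands in for the actual model.
import torchvision
from torchview import draw_graph

model = torchvision.models.resnet18()

# depth controls how far the nested module hierarchy is expanded
graph = draw_graph(model, input_size=(1, 3, 224, 224), depth=2)
graph.visual_graph.render("resnet18_forward", format="png")  # writes a PNG via graphviz
```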

What does this command do for a BERT transformer?

!pip install transformers
from transformers import InputExample, InputFeatures
What are InputExample and InputFeatures here?
thanks.
Check out the documentation.
Processors
This library includes processors for several traditional tasks. These processors can be used to process a dataset into examples that can be fed to a model.
And
class transformers.InputExample
A single training/test example for simple sequence classification.
As well as
class transformers.InputFeatures
A single set of features of data. Property names are the same names as the corresponding inputs to a model.
So basically, InputExample is just a raw input and InputFeatures is the (numerical) feature representation of that input that the model uses.
I couldn't find any tutorial explicitly explaining this, but you can check out Chapter 4 (From text to features) in this tutorial, where it is nicely explained with an example.
In my experience, the transformers library has an enormous number of classes and structures, so going too deep into the technical implementation makes it easy to get lost. For starters, I would recommend getting an idea of the broader picture by getting some example projects to work, as well as checking out their 🤗 Course.
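For a concrete (hedged) illustration of the raw-input-to-features step, using the tokenizer and the GLUE-style conversion helper that ships with transformers:

```python
# Illustration of InputExample (raw text + label) vs. InputFeatures (numeric
# tensors). Uses the GLUE conversion helper; in newer transformers releases
# these utilities are considered legacy, so the exact import path may differ.
from transformers import BertTokenizer, InputExample
from transformers import glue_convert_examples_to_features

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# InputExample: a raw sentence plus its label
example = InputExample(
    guid="train-1",
    text_a="The movie was great.",
    text_b=None,
    label="positive",
)

# InputFeatures: token ids, attention mask, etc. derived from the example
features = glue_convert_examples_to_features(
    [example],
    tokenizer,
    max_length=32,
    label_list=["negative", "positive"],
    output_mode="classification",
)

# input_ids are what the model actually consumes
print(features[0].input_ids[:10], features[0].label)
```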

Tensorflow NLP with BERT Preprocessing data

This is a specific question involving two TensorFlow text classification tutorials on tensorflow.org. Sorry if this is the wrong place to ask.
Basically, there are two tutorials, one is "Classify Text with BERT" https://www.tensorflow.org/text/tutorials/classify_text_with_bert
And the other is "Fine-tuning a BERT model"
https://www.tensorflow.org/text/tutorials/fine_tune_bert
Both tutorials describe preprocessing the data. In "Classify Text with BERT", they use a preprocessing model provided by TensorFlow Hub, but in "Fine-tuning a BERT model", they implement Python code that tokenizes and encodes the data, among other things. Basically, the latter method seems a lot more complicated than the former.
My question is: why does one tutorial use a provided preprocessing model while the other implements the preprocessing in Python code? Is there a difference between the two tutorials that requires their specific preprocessing methods?
Thank you!
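For reference, the Hub-based route from the first tutorial boils down to very little code; a rough sketch (assuming the preprocessing handle that tutorial uses) looks like this:

```python
# Sketch of the Hub-based preprocessing from "Classify Text with BERT".
# The handle below is the one shown in that tutorial; verify it before use.
import tensorflow_hub as hub
import tensorflow_text  # noqa: F401  (registers the ops the preprocess model needs)

preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"
)
encoder_inputs = preprocess(["this movie was great"])

# Produces input_word_ids / input_mask / input_type_ids -- the same tensors
# the second tutorial builds by hand with a tokenizer.
print(encoder_inputs["input_word_ids"].shape)
```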

Using AllenNLP Interpret with a HuggingFace model

I would like to use AllenNLP Interpret (code + demo) with a PyTorch classification model trained with Hugging Face (ELECTRA base discriminator). However, it is not obvious to me how I can convert my model and use it in a local AllenNLP demo server.
How should I proceed?
Thanks in advance
If your task is binary classification, you can look at the BoolQ example in https://github.com/allenai/allennlp-models/blob/main/training_config/classification/boolq_roberta.jsonnet. You can change that configuration to use a different model (such as Electra).
We also just put some new documentation out for the Interpret functionality: https://guide.allennlp.org/interpret
To give you a more specific answer, I'll need to know some more details, like what the task is you're trying to solve, how you trained the original model, etc.
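As a rough sketch of what Interpret looks like once you have a trained AllenNLP archive (the model.tar.gz path and the "text_classifier" predictor name below are assumptions for illustration):

```python
# Hedged sketch: run a saliency interpreter over a trained AllenNLP archive.
# "model.tar.gz" and the "text_classifier" predictor name are placeholders.
from allennlp.predictors import Predictor
from allennlp.interpret.saliency_interpreters import SimpleGradient

predictor = Predictor.from_path("model.tar.gz", predictor_name="text_classifier")
interpreter = SimpleGradient(predictor)

# Returns per-token gradient-based saliency scores for the input
result = interpreter.saliency_interpret_from_json({"sentence": "A great movie."})
print(result)
```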
