What does this command do for a BERT transformer? - Keras

!pip install transformers
from transformers import InputExample, InputFeatures
What are InputExample and InputFeatures here?
Thanks.

Check out the documentation.
Processors
This library includes processors for several traditional tasks. These
processors can be used to process a dataset into examples that can be
fed to a model.
And
class transformers.InputExample
A single training/test example for simple sequence classification.
As well as
class transformers.InputFeatures
A single set of features of data. Property names are the same names as
the corresponding inputs to a model.
So basically, InputExample is just a raw input, and InputFeatures is the (numerical) feature representation of that input that the model actually consumes.
I couldn't find any tutorial explicitly explaining this, but you can check out Chapter 4 (From text to features) in this tutorial, where it is nicely explained with an example.
In my experience, the transformers library has an absolute ton of classes and structures, so going too deep into the technical implementation makes it easy to get lost. For starters, I would recommend getting an idea of the broader picture by getting some example projects to work, as well as checking out their 🤗 Course.
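To make the distinction concrete, here is a minimal sketch of how the two classes relate (the sentence, label, and tokenizer checkpoint are just illustrative choices, not anything from the question):

# Raw input -> InputExample; tokenized/numerical form -> InputFeatures.
from transformers import BertTokenizer, InputExample, InputFeatures

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# A single raw example for binary sequence classification.
example = InputExample(guid="train-1", text_a="The movie was great!", label="1")

# Turn the raw text into the numerical inputs the model expects.
encoding = tokenizer(example.text_a, max_length=128,
                     padding="max_length", truncation=True)

features = InputFeatures(input_ids=encoding["input_ids"],
                         attention_mask=encoding["attention_mask"],
                         token_type_ids=encoding["token_type_ids"],
                         label=1)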

Related

How much data is needed to create useful word vectors using gensim?

I'm trying to do named entity recognition (NLP) on tweets to identify whether a tweet talks about diseases. For that purpose I'm currently trying to create my own word2vec vectors in gensim and import them for training a new NER model in spaCy. My question would be: how much data is needed in order to create a useful word vector for my purpose?
There's no fixed floor; it depends on your specific data/goals/parameters.
But the word2vec algorithm in general benefits from lots of data.
For word-tokens of interest, you'll want your training data to include many (at least dozens) realistic & subtly-contrasting examples of their usage in surrounding-word contexts.
The only real test of whether your data is sufficient: try it, tweak it, see if it gives useful results for your needs.
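As a starting point, a minimal train-and-inspect loop might look like this (a sketch assuming gensim 4.x and tweets already tokenized into word lists; the corpus here is made-up placeholder data):

from gensim.models import Word2Vec

# Placeholder corpus: in practice, each item is one tokenized tweet.
corpus = [["flu", "season", "is", "here"],
          ["new", "malaria", "treatment", "announced"]]

# min_count=1 only so this toy corpus survives; raise it on real data
# so that rare, noisy tokens are dropped.
model = Word2Vec(corpus, vector_size=100, window=5,
                 min_count=1, workers=4, epochs=10)

# Inspect the neighbours of a token of interest to judge vector quality.
print(model.wv.most_similar("flu", topn=5))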

How to get the inference compute graph of a PyTorch model?

I want to hand-write a framework to perform inference of a given neural network. The network is quite complicated, so to make sure my implementation is correct, I need to know exactly how the inference process is executed on the device.
I tried to use torchviz to visualize the network, but what I got seems to be the backpropagation compute graph, which is really hard to understand.
Then I tried to convert the PyTorch model to ONNX format, following the export instructions, but when I tried to visualize it, the original layers of the model had been separated into very small operators.
I just want to get a result like this:
How can I get this? Thanks!
Have you tried saving the model with torch.save (https://pytorch.org/tutorials/beginner/saving_loading_models.html) and opening it with Netron? The last view you showed is a view of the Netron app.
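In case it helps, the save-and-open flow is just a couple of lines (a sketch with a trivial placeholder model; the optional netron package can also launch the viewer from Python):

import torch
import torch.nn as nn

# Placeholder model; substitute your own network here.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
torch.save(model, "model.pt")  # save the whole model, not just the state_dict

# Optional: open the file in Netron directly, if the package is installed.
# import netron
# netron.start("model.pt")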
You can also try the torchview package, which provides several features (useful especially for large models). For instance, you can set the display depth (depth in the nested hierarchy of modules).
It is also based on the forward pass.
GitHub repo
Disclaimer: I am the author of the package.
Note: the tool accepts PyTorch models as input.
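Typical usage looks roughly like this (a sketch assuming torchvision is available to supply a demo model; depth controls how far nested modules are expanded):

import torchvision
from torchview import draw_graph

model = torchvision.models.resnet18()
graph = draw_graph(model, input_size=(1, 3, 224, 224), depth=2)
graph.visual_graph.render("resnet18_graph")  # writes an image via graphviz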

Tensorflow NLP with BERT Preprocessing data

So, this is a specific question involving two TensorFlow text classification tutorials on tensorflow.org. Sorry if this is the wrong place to ask.
Basically, there are two tutorials, one is "Classify Text with BERT" https://www.tensorflow.org/text/tutorials/classify_text_with_bert
And the other is "Fine-tuning a BERT model"
https://www.tensorflow.org/text/tutorials/fine_tune_bert
In these two tutorials, preprocessing the data is handled differently. In "Classify Text with BERT", they use a preprocessing model provided by TensorFlow Hub, but in "Fine-tuning a BERT model", they implement Python code that tokenizes and encodes the data, among other steps. Basically, the latter method seems a lot more complicated than the former.
My question is: why does one tutorial use a provided preprocessing model, while the other implements the Python code itself? Is there a difference between the two tutorials that requires them to use their specific preprocessing methods?
Thank you!
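(For reference, the Hub-based preprocessing in the first tutorial boils down to roughly this; the model URL is the one used in that tutorial and the input sentence is made up:)

import tensorflow as tf
import tensorflow_hub as hub

# The preprocessing model handles tokenization and input packing internally.
preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
encoder_inputs = preprocess(tf.constant(["this movie was great"]))
# encoder_inputs is a dict: input_word_ids, input_mask, input_type_ids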

TensorFlow and BERT: what are they exactly, and what's the difference between them?

I'm interested in NLP and came across TensorFlow and BERT. Both seem to be from Google, and both seem to be the best options for sentiment analysis as of today, but I don't understand what they are exactly or what the difference is between them... Can someone explain?
TensorFlow is an open-source library for machine learning that lets you build a deep learning model/architecture, whereas BERT is one such architecture. You can build many models using TensorFlow, including RNNs, LSTMs, and even BERT. Transformers like BERT are a good choice if you just want to deploy a model on your data and don't care about the deep learning field itself. For this purpose, I recommend the HuggingFace library, which provides a straightforward way to employ a transformer model in just a few lines of code. But if you want to take a deeper look at these models, I suggest you learn about the well-known deep learning architectures for text data, like RNN, LSTM, CNN, etc., and try to implement them using an ML library like TensorFlow or PyTorch.
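The "few lines of code" route looks roughly like this (a sketch; the pipeline downloads whatever default sentiment model the library currently ships):

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("I really enjoyed this film!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]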
BERT and TensorFlow are not two alternatives to choose between. Also, there are not only two, but many implementations of BERT, and most are basically equivalent.
The implementations that you mentioned are:
The original code by Google, in TensorFlow: https://github.com/google-research/bert
The implementation by HuggingFace, in PyTorch and TensorFlow, which reproduces the same results as the original implementation and uses the same checkpoints as the original BERT article: https://github.com/huggingface/transformers
These are the differences regarding different aspects:
In terms of results, there is no difference in using one or the other, as they both use the same checkpoints (same weights) and their results have been checked to be equal.
In terms of reusability, the HuggingFace library is probably more reusable, as it is designed specifically for that. It also gives you the freedom of choosing TensorFlow or PyTorch as the deep learning framework.
In terms of performance, they should be the same.
In terms of community support (e.g. asking questions in github or stackoverflow about them), HuggingFace library is better suited, as there are a lot of people using it.
Apart from BERT, the transformers library by HuggingFace has implementations for lots of models: OpenAI GPT-2, RoBERTa, ELECTRA, ...
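To illustrate the "same checkpoints" point, both frameworks load the same published weights through the same call (a sketch; bert-base-uncased is just the standard example checkpoint):

from transformers import BertModel, TFBertModel

pt_model = BertModel.from_pretrained("bert-base-uncased")    # PyTorch
tf_model = TFBertModel.from_pretrained("bert-base-uncased")  # TensorFlow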

Custom Loss Function with Spacy Textcat

I've been looking around for a while now. I would like to know if it's possible to modify/customize the loss function of the spaCy text categorizer.
I mean, when you want to distill a model (for instance BERT) and add a regression component to the loss function being optimized (regressing on the probabilities of each class instead of only the labels), I don't understand where I should look. I tried to explore some spaCy code, but there is only a function to get the loss.
If someone knows where to look to inspect the loss function and change it (by writing a subclass, for instance), that would be nice!
Thanks
Arnault
spaCy is ultimately built on top of Thinc, and therefore, if you want to do custom work, you should tinker with Thinc, not spaCy. spaCy typically allows you to initialize a pipe with a raw Thinc model.
This is especially true since spaCy's philosophy is to provide one implementation that works well, not necessarily a super-customizable framework.
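That said, if you want to stay inside spaCy, one place to hook in is the pipe's get_loss method. A minimal sketch, assuming spaCy v3 where TextCategorizer.get_loss(examples, scores) returns (loss, gradient_of_scores); the teacher-probability attribute and alpha below are hypothetical, not spaCy API:

from spacy.pipeline import TextCategorizer

class DistillTextCategorizer(TextCategorizer):
    def get_loss(self, examples, scores):
        # Default loss and gradient from the parent implementation.
        loss, d_scores = super().get_loss(examples, scores)
        # Hypothetical extra regression term against a teacher model's
        # class probabilities, stored on each example by your own code:
        # teacher = numpy.asarray([eg.reference._.teacher_probs for eg in examples])
        # d_scores += alpha * (scores - teacher)
        # loss += alpha * ((scores - teacher) ** 2).sum()
        return loss, d_scores

To actually use such a subclass in a pipeline, you would still need to register it as a pipe factory, which is where the Thinc/config machinery comes back in.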
