What are differences between AutoModelForSequenceClassification vs AutoModel - nlp

We can create a model from AutoModel(TFAutoModel) function:
from transformers import AutoModel
model = AutoModel.from_pretrained('distilbert-base-uncase')
In other hand, a model is created by AutoModelForSequenceClassification(TFAutoModelForSequenceClassification):
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification('distilbert-base-uncase')
As I know, both models use distilbert-base-uncase library to create models.
From name of methods, the second class( AutoModelForSequenceClassification ) is created for Sequence Classification.
But what are really differences in 2 classes? And how to use them correctly?
(I searched in huggingface but it is not clear)

The difference between AutoModel and AutoModelForSequenceClassification model is that AutoModelForSequenceClassification has a classification head on top of the model outputs which can be easily trained with the base model

Related

Difference between Simple transformers NERModel pipeline and Transformers FromPreTrained Pipeline

I want to fine tune a BERT NER model on my dataset and custom labels and I can't understand the difference between using Simple Transformers NER model: https://simpletransformers.ai/docs/ner-model/ where I can easily specify which model I want to train on my dataset or using Transformers' From Pre Trained: https://huggingface.co/docs/transformers/model_doc/auto
1- Simple Transformers NER Model:
model = NERModel('bert', 'd4data/biomedical-ner-all', labels=custom_labels args=train_args,use_cuda=False)
model.train_model(train_data, eval_data=dev_data)
result, model_outputs, preds_list = model.eval_model(test_data)
model = NERModel('bert', 'outputs/best_model', labels=custom_labels, args=train_args,use_cuda=False)
2- From Pre Trained
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("d4data/biomedical-ner-all")
model = AutoModelForTokenClassification.from_pretrained("d4data/biomedical-ner-all")
I Tried using Both approaches but I really don't understand the core difference between them, can someone please explain when do I use any approach?

Loading a GPU trained BERTopic model on CPU?

I trained a BERTopic model on a GPU, and now for visualization purposes I want to load it on a CPU.
But when I tried to do that I got:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
When I tried to use the suggested fix I got the same problem?
Saw some fix that suggests to save the model without its embeddings model, but don't want to retrain an resave unless its the last option, and would also love if someone could explain what's this embedding model and what's going on under the hood.
topic_model = torch.load(args.model, map_location=torch.device('cpu'))
When you want to save the BERTopic model without the embedding model, you can run the following:
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups
from sentence_transformers import SentenceTransformer
docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']
# Train the model
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
topic_model = BERTopic(embedding_model=embedding_model)
topics, probs = topic_model.fit_transform(docs)
# Save the model without the embedding model
topic_model.save("my_model", save_embedding_model=False)
This should prevent any issues with GPU/CPU if you are not using any of the cuML sub-models in BERTopic.
Saw some fix that suggests to save the model without its embeddings model, but don't want to retrain an resave unless its the last option, and would also love if someone could explain what's this embedding model and what's going on under the hood.
The embedding model is typically a pre-trained model that actually is not learning from the input data. There are options to make it learn during training but that requires a custom component in BERTopic. In other words, when you use a pre-trained model, it is no problem removing that pre-trained model when saving the topic model as there would be no need to re-train the model.
In other words, we would first save our topic model in our GPU environment without the embedding model:
topic_model.save("my_model", save_embedding_model=False)
Then, we load in our saved BERTopic model in our CPU environment and then pass the pre-trained embedding model:
from sentence_transformers import SentenceTransformer
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
topic_model = BERTopic.load("my_model", embedding_model=embedding_model )
You can learn more about the role of the embedding model here.

What is the different for torchvision.models.resnet and torch.hub.load?

There are two method for using resnet of pytorch.
methods 1:
import torch
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet50', pretrained=True)
model.eval()
methods 2:
import torch
net = models.resnet50(pretrained=True)
Are they load the same model. If not what is the difference?
The only difference that there is between your models if you load them in that way it's the number of layers, since you're loading resnet18 with Torch Hub and resnet50 with Models (thus, also the pretrained weights). They behave differently, you can see more about that in this paper.
Torch Hub also lets you publish pretrained models in your repository, but since you're loading it from 'pytorch/vision:v0.10.0' (which is the same repository from which Models is loading the neural networks), there should be no difference between:
model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
and
model = models.resnet18(pretrained=True)

How do I replace keras.utils.multi_gpu_model for training with multiple gpus

As of tensorflow 2.4 tensorflow.keras.utils.multi_gpu_model has been removed. I am looking for a way to replace this simple command to train with multiple gpus.
from tensorflow.keras.models import load_model
model = load_model("my_model.h5")
if gpus>1:
from tensorflow.keras.utils import multi_gpu_model
model = multi_gpu_model(model, gpus=gpus)
Where model is a loaded model that can be used to train or make predictions on multiple gpus.
One way to train with multiple gpus is to use a distributed strategy. The way I found that works pretty much as a drop in replacement is a MirroredStrategy
session = MirroredStrategy()
with session.scope():
model = load_model("my_model.h5")
This way when the model is used, inside of this block, it is used on multiple gpus.

Use gensim Random Projection in sklearn SVM

Is it possible to use a gensim Random Projection to train a SVM in sklearn?
I need to use gensim's tfidf implementation because it's better at dealing with large inputs and then want to put that into a random projection on which I will train my SVM. I'd also be happy to just pass the tfidf model generated by gensim to sklearn and use their random projection, if that makes things easier.
But so far I haven't found a way to get either model out of gensim into sklearn.
I have tried using gensim.matutils.corpus2cscbut of course that doesn't work: neither TfidfModel nor RpModel are corpi, so now I'm clueless at what to try next.
This is now very easy thanks to an awesome gensim contribution from Chinmaya Pancholi (see post here).
Simply import the sklearn wrapper from `gensim:
from gensim.sklearn_api import RpTransformer
Then, you can use the model to do analysis as you would any other sklearn classifier:
model = RpTransformer(num_topics=2)
clf = svm.SVC()
pipe = Pipeline([('features', model,), ('classifier', clf)])
pipe.fit(X_train, y_train)
One thing to be aware of, when using the gensim models, is that you still need to perform the dictionary and corpus steps. So instead of fitting your model on X_train, you'll have to do something along the following lines:
dictionary = Dictionary(X_train)
corpus_train = [dictionary.doc2bow(text) for text in X_train]
corpus_test = [dictionary.doc2bow(text) for text in X_test]
Then fit/predict your model on corpus_train or corpus_test.

Resources