Difference between Simple Transformers NERModel pipeline and Transformers from_pretrained pipeline - nlp

I want to fine-tune a BERT NER model on my dataset with custom labels, and I can't understand the difference between using the Simple Transformers NERModel (https://simpletransformers.ai/docs/ner-model/), where I can easily specify which model I want to train on my dataset, and using Transformers' from_pretrained (https://huggingface.co/docs/transformers/model_doc/auto).
1- Simple Transformers NER Model:
model = NERModel('bert', 'd4data/biomedical-ner-all', labels=custom_labels, args=train_args, use_cuda=False)
model.train_model(train_data, eval_data=dev_data)
result, model_outputs, preds_list = model.eval_model(test_data)
model = NERModel('bert', 'outputs/best_model', labels=custom_labels, args=train_args, use_cuda=False)
2- from_pretrained:
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("d4data/biomedical-ner-all")
model = AutoModelForTokenClassification.from_pretrained("d4data/biomedical-ner-all")
I tried both approaches, but I don't really understand the core difference between them. Can someone please explain when to use each approach?
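For comparison, it may help to see what the plain Transformers route actually requires: from_pretrained only loads the weights, so fine-tuning on custom labels still needs an explicit training setup (tokenization, a Trainer, training arguments), all of which NERModel.train_model wraps for you. A minimal sketch using the Trainer API, where tokenized_train and tokenized_dev are hypothetical pre-tokenized datasets:

from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("d4data/biomedical-ner-all")
model = AutoModelForTokenClassification.from_pretrained(
    "d4data/biomedical-ner-all",
    num_labels=len(custom_labels),  # custom_labels as in the question
    ignore_mismatched_sizes=True,   # re-initialize the classification head
)
training_args = TrainingArguments(output_dir="outputs", num_train_epochs=3)
trainer = Trainer(model=model, args=training_args,
                  train_dataset=tokenized_train,  # hypothetical pre-tokenized data
                  eval_dataset=tokenized_dev)
trainer.train()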

Related

Loading a GPU trained BERTopic model on CPU?

I trained a BERTopic model on a GPU, and now for visualization purposes I want to load it on a CPU.
But when I tried to do that I got:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
When I tried to use the suggested fix,
topic_model = torch.load(args.model, map_location=torch.device('cpu'))
I got the same error.
I saw a fix that suggests saving the model without its embedding model, but I don't want to retrain and resave unless it's the last option. I would also love it if someone could explain what this embedding model is and what's going on under the hood.
When you want to save the BERTopic model without the embedding model, you can run the following:
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups
from sentence_transformers import SentenceTransformer
docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']
# Train the model
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
topic_model = BERTopic(embedding_model=embedding_model)
topics, probs = topic_model.fit_transform(docs)
# Save the model without the embedding model
topic_model.save("my_model", save_embedding_model=False)
This should prevent any issues with GPU/CPU if you are not using any of the cuML sub-models in BERTopic.
As for what the embedding model is and what's going on under the hood:
The embedding model is typically a pre-trained model that does not actually learn from the input data. There are options to make it learn during training, but that requires a custom component in BERTopic. In other words, when you use a pre-trained embedding model, there is no problem removing it when saving the topic model, since there is no need to re-train it.
In other words, we would first save our topic model in our GPU environment without the embedding model:
topic_model.save("my_model", save_embedding_model=False)
Then, we load in our saved BERTopic model in our CPU environment and then pass the pre-trained embedding model:
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer

embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
topic_model = BERTopic.load("my_model", embedding_model=embedding_model)
You can learn more about the role of the embedding model here.

alternative for maximum entropy for NLP model

I am working on an NLP model where the model identifies the ARGs given the PREDICATE.
I am using MaxEnt for the model.
My model works fine: I trained it on a training dataset, created all the features, and then tested it on a test dataset.
I want to try this with some other package instead of MaxEnt.
Can someone suggest what else I can use?
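One avenue worth noting: maximum entropy classification is mathematically equivalent to multinomial logistic regression, so sklearn's LogisticRegression is a natural drop-in alternative. A minimal sketch, assuming NLTK-style feature dicts (the feature names below are purely hypothetical):

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical feature dicts, one per (predicate, candidate-argument) instance
train_features = [{"pred": "give", "path": "NP^S"}, {"pred": "give", "path": "NP^VP"}]
train_labels = ["ARG0", "ARG1"]

pipe = Pipeline([("vec", DictVectorizer()),          # one-hot encodes the dict features
                 ("clf", LogisticRegression(max_iter=1000))])
pipe.fit(train_features, train_labels)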

What are differences between AutoModelForSequenceClassification vs AutoModel

We can create a model from AutoModel(TFAutoModel) function:
from transformers import AutoModel
model = AutoModel.from_pretrained('distilbert-base-uncased')
On the other hand, a model can be created with AutoModelForSequenceClassification (TFAutoModelForSequenceClassification):
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained('distilbert-base-uncased')
As far as I know, both models use the distilbert-base-uncased checkpoint.
Judging by the names, the second class (AutoModelForSequenceClassification) is intended for sequence classification.
But what are the real differences between the two classes, and how do I use them correctly?
(I searched the Hugging Face docs, but it is not clear.)
The difference between AutoModel and AutoModelForSequenceClassification is that AutoModelForSequenceClassification adds a classification head on top of the base model's outputs, which can easily be trained together with the base model.
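A quick way to see the difference is to compare the outputs of the two classes (a minimal sketch; note that the two-label head in the second model is randomly initialized until you fine-tune it):

import torch
from transformers import AutoTokenizer, AutoModel, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
inputs = tokenizer("Hello, world!", return_tensors="pt")

# AutoModel returns the raw hidden states: (batch, seq_len, hidden_size)
base = AutoModel.from_pretrained("distilbert-base-uncased")
with torch.no_grad():
    hidden = base(**inputs).last_hidden_state

# AutoModelForSequenceClassification adds a head and returns logits: (batch, num_labels)
clf = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)
with torch.no_grad():
    logits = clf(**inputs).logits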

Compatibility between keras and tf.keras models

I am interested in training a model in tf.keras and then loading it with keras. I know this is not highly advised, but I am interested in using tf.keras to train the model because
it is easier to build input pipelines with tf.keras
I want to take advantage of the tf.data API
and I am interested in loading it with keras because
I want to use Core ML to deploy the model to iOS.
I want to use coremltools to convert my model, and coremltools only works with keras, not tf.keras.
I have run into a few roadblocks, because not all tf.keras layers can be loaded as keras layers. For instance, I've had no trouble with a simple DNN, since all of the Dense layer parameters are the same between tf.keras and keras. However, I have had trouble with RNN layers, because tf.keras has a time_major argument that keras does not. My RNN layers have time_major=False, which matches the keras behavior, but keras layers do not accept the argument at all.
My solution right now is to save the tf.keras model in a json file (for the model structure) and delete the parts of the layers that keras does not support, and also save an h5 file (for the weights), like so:
import json

model = ...  # model trained with tf.keras

# save the model structure as json, dropping layer arguments keras does not support
model_json = model.to_json()
with open('path_to_model_json.json', 'w') as json_file:
    json_ = json.loads(model_json)
    layers = json_['config']['layers']
    for layer in layers:
        if layer['class_name'] == 'SimpleRNN':
            del layer['config']['time_major']
    json.dump(json_, json_file)

# save the weights
model.save_weights('path_to_my_weights.h5')
Then I use coremltools to convert from keras to Core ML, like so:
import coremltools
import tensorflow as tf
from keras.initializers import glorot_uniform
from keras.utils import CustomObjectScope

with CustomObjectScope({'GlorotUniform': glorot_uniform()}):
    coreml_model = coremltools.converters.keras.convert(
        model=('path_to_model_json.json', 'path_to_my_weights.h5'),
        input_names=...,   # your input names
        output_names=...,  # your output names
        class_labels=...,  # your class labels
        custom_conversion_functions={'GlorotUniform': tf.keras.initializers.glorot_uniform},
    )
coreml_model.save('my_core_ml_model.mlmodel')
My solution appears to be working, but I am wondering if there is a better approach? Or, is there imminent danger in this approach? For instance, is there a better way to convert tf.keras models to coreml? Or is there a better way to convert tf.keras models to keras? Or is there a better approach that I haven't thought of?
Any advice on the matter would be greatly appreciated :)
Your approach seems good to me!
In the past, when I had to convert a tf.keras model to a keras model, I did the following:
Train model in tf.keras
Save only the weights: tf_model.save_weights("tf_model.hdf5")
Rebuild the model architecture in keras using keras layers (the same architecture as the tf.keras one)
Load the weights by layer name in keras: keras_model.load_weights("tf_model.hdf5", by_name=True)
This worked for me. Since I was using an out-of-the-box architecture (DenseNet169), replicating the tf.keras network in keras took very little work.
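A minimal sketch of those four steps, assuming the layer names match between tf.keras.applications and standalone keras.applications (they should, since both build on the same keras-applications code):

import tensorflow as tf
import keras

# 1) Build and train the model in tf.keras, then save only the weights
tf_model = tf.keras.applications.DenseNet169(weights=None, classes=10)
# ... train tf_model ...
tf_model.save_weights("tf_model.hdf5")

# 2) Rebuild the same architecture in standalone keras
keras_model = keras.applications.DenseNet169(weights=None, classes=10)

# 3) Load the weights by layer name
keras_model.load_weights("tf_model.hdf5", by_name=True)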

Use gensim Random Projection in sklearn SVM

Is it possible to use a gensim Random Projection to train a SVM in sklearn?
I need to use gensim's tfidf implementation because it's better at dealing with large inputs and then want to put that into a random projection on which I will train my SVM. I'd also be happy to just pass the tfidf model generated by gensim to sklearn and use their random projection, if that makes things easier.
But so far I haven't found a way to get either model out of gensim into sklearn.
I have tried using gensim.matutils.corpus2csc, but of course that doesn't work: neither TfidfModel nor RpModel are corpora, so now I'm clueless about what to try next.
This is now very easy thanks to an awesome gensim contribution from Chinmaya Pancholi (see post here).
Simply import the sklearn wrapper from gensim:
from gensim.sklearn_api import RpTransformer
Then you can use the model in a pipeline as you would any other sklearn transformer:
from sklearn import svm
from sklearn.pipeline import Pipeline

model = RpTransformer(num_topics=2)
clf = svm.SVC()
pipe = Pipeline([('features', model), ('classifier', clf)])
pipe.fit(X_train, y_train)
One thing to be aware of, when using the gensim models, is that you still need to perform the dictionary and corpus steps. So instead of fitting your model on X_train, you'll have to do something along the following lines:
from gensim.corpora import Dictionary

dictionary = Dictionary(X_train)
corpus_train = [dictionary.doc2bow(text) for text in X_train]
corpus_test = [dictionary.doc2bow(text) for text in X_test]
Then fit/predict your model on corpus_train or corpus_test.
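Putting it together, the final step is just (a minimal sketch; y_train holds your training labels, and note that gensim.sklearn_api was removed in gensim 4.x, so this assumes a gensim 3.x install):

pipe.fit(corpus_train, y_train)
predictions = pipe.predict(corpus_test)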