How to load BertForSequenceClassification model weights into a BertForTokenClassification model? - nlp

Initially, I fine-tuned a BERT base cased model on a text classification dataset, using the BertForSequenceClassification class for this.
from transformers import BertForSequenceClassification, AdamW, BertConfig
# Load BertForSequenceClassification, the pretrained BERT model with a single
# linear classification layer on top.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased",           # Use the 12-layer BERT model, with an uncased vocab.
    num_labels = 2,                # The number of output labels: 2 for binary classification.
                                   # You can increase this for multi-class tasks.
    output_attentions = False,     # Whether the model returns attention weights.
    output_hidden_states = False,  # Whether the model returns all hidden states.
)
Now I want to use the weights of this fine-tuned BERT model for Named Entity Recognition, for which I have to use the BertForTokenClassification class. I'm unable to figure out how to load the fine-tuned BERT weights into a new model created with BertForTokenClassification.
Thanks in advance.

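You can take the weights from the bert encoder inside the first model and load them into the bert encoder inside the second:
# config is assumed to be the BertConfig for the new model (for example, with the NER label count)
new_model = BertForTokenClassification(config=config)
new_model.bert.load_state_dict(model.bert.state_dict())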

This worked for me:
new_model = BertForTokenClassification.from_pretrained('/config path')
new_model.bert.load_state_dict(model.bert.state_dict())
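
Newer transformers releases appear to build the encoder inside BertForTokenClassification without the pooling layer, so a strict load_state_dict may complain about unexpected pooler.* keys. A minimal sketch of a tolerant transfer, assuming tiny randomly initialised models stand in for the fine-tuned checkpoint (tiny_config and its sizes are hypothetical, chosen only so that nothing has to be downloaded):

```python
import torch
from transformers import (
    BertConfig,
    BertForSequenceClassification,
    BertForTokenClassification,
)


def tiny_config(num_labels):
    # Hypothetical small config; a real run would reuse the fine-tuned model's config.
    return BertConfig(
        vocab_size=100,
        hidden_size=32,
        num_hidden_layers=2,
        num_attention_heads=2,
        intermediate_size=64,
        max_position_embeddings=64,
        num_labels=num_labels,
    )


# Stand-in for the fine-tuned sequence classifier (2 labels).
seq_model = BertForSequenceClassification(tiny_config(num_labels=2))

# New token classifier with its own label count (5 is an arbitrary example).
tok_model = BertForTokenClassification(tiny_config(num_labels=5))

# strict=False tolerates keys that exist on only one side (e.g. the pooler)
# and reports them instead of raising.
result = tok_model.bert.load_state_dict(seq_model.bert.state_dict(), strict=False)
print("missing keys:", result.missing_keys)
print("unexpected keys:", result.unexpected_keys)

# Sanity check: the shared encoder weights are now identical.
same = torch.equal(
    seq_model.bert.embeddings.word_embeddings.weight,
    tok_model.bert.embeddings.word_embeddings.weight,
)
print("embeddings copied:", same)
```

The token classification head (tok_model.classifier) is not touched by this transfer and still has to be trained on the NER data.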

Related

Performing MLM pretraining on BERT pretrained model to use model in Sentence Transformer for semantic similarity

I have an NLP use case: computing semantic similarity between sentences that are very specific to my domain.
I want to use the Sentence Transformers library for this, since it provides state-of-the-art results for this goal.
I have a BERT model specifically trained for the sBERT task, and I know I can fine-tune it with pairs of sentences as inputs and similarity scores as labels.
However, I would also like to continue BERT pretraining on this model with the masked language modeling (MLM) task.
Does it make sense to instantiate a BertForMaskedLM object from this model, which was already trained for the sentence transformer task, in order to continue its pretraining, and then load it as a SentenceTransformer model to fine-tune it on sentence pairs?
I would do it as follows, using the CamemBERT French NLP model from Hugging Face as an example:
For the MLM part:
from transformers import CamembertTokenizer, CamembertForMaskedLM, LineByLineTextDataset, DataCollatorForLanguageModeling, Trainer, TrainingArguments
tokenizer = CamembertTokenizer.from_pretrained("dangvantuan/sentence-camembert-large")
model = CamembertForMaskedLM.from_pretrained("dangvantuan/sentence-camembert-large")
dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path=LOCAL_DATASET_PATH,
    block_size=512
)
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)
training_args = TrainingArguments(
    output_dir=LOCAL_MODEL_PATH,
    overwrite_output_dir=True,
    num_train_epochs=25,
    save_steps=500,
    save_total_limit=2,
    seed=1,
    auto_find_batch_size=True
)
trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=dataset,
)
trainer.train()
trainer.save_model(LOCAL_MODEL_PATH + "/my_model")
To load it as a SentenceTransformer model:
from sentence_transformers import SentenceTransformer, models
word_embedding_model = models.Transformer(
    LOCAL_MODEL_PATH + "/my_model",
    tokenizer_name_or_path=tokenizer_path,
    max_seq_length=max_seq_length
)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
Thanks!

Check if the way of evaluating keras model via unseen data is correct

I studied Keras and created my first neural network model as follows:
from keras.layers import Dense
import keras
from keras import Sequential
from sklearn.metrics import accuracy_score
tr_X, tr_y = getTrainingData()
# NN Architecture
model = Sequential()
model.add(Dense(16, input_dim=tr_X.shape[1]))
model.add(keras.layers.advanced_activations.PReLU())
model.add(Dense(16))
model.add(keras.layers.advanced_activations.PReLU())
model.add(Dense(1, activation='sigmoid'))
# Compile the Model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the Model
model.fit(tr_X, tr_y, epochs=1000, batch_size=200, validation_split=0.2)
# ----- Evaluate the Model (Using UNSEEN data) ------
ts_X, ts_y = getTestingData()
yhat_classes = model.predict_classes(ts_X, verbose=0)[:, 0]
accuracy = accuracy_score(ts_y, yhat_classes)
print(accuracy)
I am not sure about the last part of my code, i.e., the model evaluation using model.predict_classes(), where new data are loaded via a custom method getTestingData(). My goal is to test the final model on new, UNSEEN data to evaluate its predictions. My question is: am I evaluating the model correctly?
Thank you.
Yes, that is correct. You can use predict or predict_classes to get the predictions on test data. If you need the loss & metrics directly, you can use the evaluate method by feeding ts_X and ts_y.
y_pred = model.predict(ts_X)
loss, accuracy = model.evaluate(ts_X, ts_y)
https://keras.io/models/model/#predict
https://keras.io/models/model/#evaluate
Difference between predict & predict_classes: What is the difference between "predict" and "predict_class" functions in keras?
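
For a single sigmoid output, predict returns probabilities, and class labels can be obtained by thresholding them at 0.5. A minimal sketch of that step, assuming the hard-coded y_prob array below stands in for model.predict(ts_X) (both arrays are made-up illustration data, not results from the model in the question):

```python
import numpy as np

# Hypothetical sigmoid outputs, shaped (n_samples, 1) like model.predict(ts_X).
y_prob = np.array([[0.91], [0.12], [0.55], [0.40]])

# Hypothetical ground-truth labels for the same four samples.
ts_y = np.array([1, 0, 0, 0])

# Threshold at 0.5 and drop the trailing axis to get a flat vector of 0/1 labels.
yhat_classes = (y_prob > 0.5).astype("int32")[:, 0]

# Accuracy is the fraction of predictions that match the labels.
accuracy = float(np.mean(yhat_classes == ts_y))
print(yhat_classes, accuracy)
```

The resulting yhat_classes can be passed to sklearn.metrics.accuracy_score in the same way as in the question.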

How to extract features from a layer of the pretrained ResNet model Keras

I trained a model with Resnet3D and I want to extract the outputs of one of its layers. I plan to use them as features for an SVM classifier. How can I extract them and put them into a NumPy array?
Load the weights with Keras:
model = Resnet3DBuilder.build_resnet_18((128, 96, 96, 3), nClass[0])
model.load_weights('drive/app/models/3d_resnet_modelq.hdf5')
Extract a layer:
dns = model.layers[-1].output
What should I do now?
If you just want to visualise the features, in pure Keras you can define a Model with the desired layer as output:
from keras.models import Model
model_cut = Model(inputs=model.inputs, outputs=model.layers[-1].output)
features = model_cut.predict(x) # Assuming you have your images in x
Note that in order for this to work, model must have been compiled at least once.
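
Once the features are available as a NumPy array, they can be flattened to one row per sample and passed to a scikit-learn SVM. A minimal sketch, assuming the random features array below stands in for model_cut.predict(x) and that labels holds one class per sample (the shapes, labels and linear kernel are all hypothetical choices):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in for model_cut.predict(x): 20 samples with an arbitrary feature-map shape.
features = rng.normal(size=(20, 4, 4, 8))

# Hypothetical binary labels, one per sample.
labels = np.array([0, 1] * 10)

# SVC expects a 2-D array of shape (n_samples, n_features).
X = features.reshape(len(features), -1)

clf = SVC(kernel="linear")
clf.fit(X, labels)
preds = clf.predict(X)
print(X.shape, preds.shape)
```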

How to print the weights of Keras embedding?

import gensim
# Load Google's pre-trained Word2Vec model.
model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
embed = model.get_keras_embedding()
print(embed.weights)
Output: []
How can I print the weights of the embedding layer in Keras? Currently it prints an empty list.

Modify layers in resnet model

I am trying to train a ResNet50 model for an image classification problem. I have loaded the pretrained 'imagenet' weights before training the model on my own dataset. I want to insert a layer (a mean subtraction layer) between the input layer and the first convolution layer.
model = ResNet50(weights='imagenet')
def mean_subtract(img):
    img = T.set_subtensor(img[:,0,:,:],img[:,0,:,:] - 123.68)
    img = T.set_subtensor(img[:,1,:,:],img[:,1,:,:] - 116.779)
    img = T.set_subtensor(img[:,2,:,:],img[:,2,:,:] - 103.939)
    return img / 255.0
I want to insert inputs = Lambda(mean_subtract, name='mean_subtraction')(inputs) next to the input layer and connect this to the first convolution layer of resnet model without losing the weights saved.
How do I do that?
Thanks!
Quick answer (Seems better than adding the function to the model)
Use the preprocessing function as described here: preprocessing images generated using keras function ImageDataGenerator() to train resnet50 model
Long answer
Since your function doesn't change shapes, you can put it in an outer model without changing the ResNet model (modifying an existing model may not be so simple; I always try to assemble new models from parts of other models if needed).
resnet_model = ResNet50(weights='imagenet')
inputs = Input((None, None, 3))
# It seems you're using (3, None, None) instead.
# Choose based on your "data_format", which by default is channels_last.
outputs = Lambda(mean_subtract, output_shape=not_necessary_with_tensorflow)(inputs)
outputs = resnet_model(outputs)
model = Model(inputs, outputs)
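
The arithmetic inside mean_subtract can be checked outside the model. A minimal NumPy sketch, assuming a channels_last layout (the question's code indexes channels first) and using broadcasting instead of set_subtensor; the function name and the test batch are hypothetical:

```python
import numpy as np

# Per-channel means taken from the question, in the same channel order.
CHANNEL_MEANS = np.array([123.68, 116.779, 103.939], dtype="float32")


def mean_subtract_channels_last(img):
    # Broadcasting subtracts each mean from its channel on the last axis,
    # then the result is scaled by 255 as in the question's function.
    return (img - CHANNEL_MEANS) / 255.0


# Hypothetical batch: 2 images of 4x4 pixels with 3 channels, all values 255.
batch = np.full((2, 4, 4, 3), 255.0, dtype="float32")
out = mean_subtract_channels_last(batch)

# The shape is unchanged, which is why the function can sit in front of the
# pretrained network without affecting its weights.
print(out.shape)
```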
