How to view the changes in a huggingface model after training?

I trained a BART model (facebook-cnn) for summarization and compared its summaries with those of the pretrained model:
model_before_tuning_1 = AutoModelForSeq2SeqLM.from_pretrained(model_name)
trainer = Seq2SeqTrainer(
model=model,
args=training_args,
data_collator=data_collator,
train_dataset=train_data,
eval_dataset=validation_data,
tokenizer=tokenizer,
compute_metrics=compute_metrics,
)
trainer.train()
The summaries from model and model_before_tuning_1 are different, but when I compare the model configs and/or print(model), I get exactly the same output for both.
How can I find out exactly which parameters the training has changed?

You can compare the state_dict of the two models, i.e. model.state_dict() and model_before_tuning_1.state_dict().
The state_dict contains the learnable parameters that change during training. For further details see: https://pytorch.org/tutorials/recipes/recipes/what_is_state_dict.html
Printing the models or the model config gives you the same result because the architecture does not change during training.
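As a rough sketch of that comparison, something like the following should work. The helper name changed_parameters and the two-layer toy model are my own stand-ins, not part of the question; with the real models you would pass model_before_tuning_1.state_dict() and model.state_dict() instead.

```python
import copy

import torch
from torch import nn


def changed_parameters(before, after, atol=0.0):
    """Return {name: max absolute difference} for tensors that differ."""
    changed = {}
    for name, old in before.items():
        new = after[name]
        # Move to CPU and cast to float so tensors on different devices
        # (or integer buffers) can still be compared.
        diff = (new.detach().cpu().float() - old.detach().cpu().float()).abs().max().item()
        if diff > atol:
            changed[name] = diff
    return changed


# Toy stand-in for the real models: a snapshot taken before "training".
model = nn.Sequential(nn.Linear(4, 3), nn.Linear(3, 2))
model_before = copy.deepcopy(model)

# Simulate a training step that only touches the last layer's weights.
with torch.no_grad():
    model[1].weight.add_(0.5)

diffs = changed_parameters(model_before.state_dict(), model.state_dict())
for name, max_diff in diffs.items():
    print(f"{name}: max |delta| = {max_diff:.4f}")
```

Only the entries that actually moved are reported, so frozen layers simply do not show up. A small atol can be passed if you want to ignore negligible floating-point drift.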

Related

Data collator not set in trainer class?

I am training a language model with a Hugging Face RoBERTa model and I am running into a problem during training. This is how I create the Trainer, passing a DataCollatorForLanguageModeling as data_collator:
trainer = Trainer(
model=model,
args=training_args,
data_collator=collator,
train_dataset=train_dataset,
eval_dataset=eval_dataset,
tokenizer=tokenizer
#prediction_loss_only=True,
)
However, when I call trainer.get_train_dataloader().collate_fn it is using a RemoveColumnsCollator. I think this is the reason why the training is not working.

I found out that this is a wrapper class around the data collator passed as an argument. The original collator is still available as an attribute of the wrapper:
trainer.get_train_dataloader().collate_fn.data_collator
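To illustrate the wrapper pattern, here is a simplified sketch. It is not the transformers source: ColumnDroppingCollator, stack_collator and the keep_columns argument are hypothetical names made up for this example. The idea is that the wrapper removes columns the model does not use and then delegates to the collator you passed in, so your collator is still the one building the batch.

```python
class ColumnDroppingCollator:
    """Hypothetical stand-in for a wrapper such as RemoveColumnsCollator."""

    def __init__(self, data_collator, keep_columns):
        self.data_collator = data_collator  # the collator the user passed in
        self.keep_columns = set(keep_columns)

    def __call__(self, features):
        # Drop every column that is not explicitly kept, then delegate.
        trimmed = [
            {k: v for k, v in feature.items() if k in self.keep_columns}
            for feature in features
        ]
        return self.data_collator(trimmed)


def stack_collator(features):
    """Toy collator: turn a list of dicts into a dict of lists."""
    return {key: [f[key] for f in features] for key in features[0]}


collate_fn = ColumnDroppingCollator(stack_collator, keep_columns=["input_ids"])
batch = collate_fn([
    {"input_ids": [1, 2], "text": "a b"},
    {"input_ids": [3, 4], "text": "c d"},
])
print(batch)
print(collate_fn.data_collator is stack_collator)
```

So seeing the wrapper in collate_fn does not by itself mean the collator was ignored; if training still fails, the cause is probably somewhere else.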

Should I use a Sequential model or the Functional API to model a neural network with two 2D input matrices?

Good morning,
I tried to use a Sequential model to create my neural network, which has multiple (concatenated) inputs. I would like to know whether I should use the Keras Functional API to create my model instead.
in1= loadtxt('in1.csv', delimiter=',')#2D matrix
in2= loadtxt('in2.csv', delimiter=',')#2D matrix
y= loadtxt('y.csv', delimiter=',') #2D matrix (output labels)
X_train=np.hstack((in1,in2))
y_train=y
model = Sequential()
model.add(Dense(nbinneuron, input_dim=2*nx,activation='tanh',kernel_initializer='normal'))
model.add(Dropout(0.5))
#output layer
model.add(Dense(2, activation='tanh'))
opt = Adadelta(learning_rate=0.01)
model.compile(loss='mean_squared_error', optimizer=opt, metrics=['mse'])
# fit the keras model on the dataset
history=model.fit(X_train, y_train,validation_data=(X_test, y_test), epochs=500,verbose=0)
...
Thanks in advance.

A Sequential Model can only have one input and one output. To build a model with multiple inputs (and/or multiple outputs), you need to use the Functional API.
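Note that the code in the question stacks the two matrices with np.hstack before training, so that model really has a single input and Sequential works for it. If you want to keep the two matrices as separate inputs, a minimal Functional API sketch could look like this. The values of nx and n_hidden and the random data are placeholders I made up, and I use the 'adadelta' string with default settings for brevity.

```python
import numpy as np
from tensorflow import keras

nx = 4        # assumed number of columns in each input matrix
n_hidden = 8  # assumed hidden width (nbinneuron in the question)

# Two separate inputs that are merged inside the model.
in1 = keras.Input(shape=(nx,), name="in1")
in2 = keras.Input(shape=(nx,), name="in2")
merged = keras.layers.Concatenate()([in1, in2])
hidden = keras.layers.Dense(n_hidden, activation="tanh")(merged)
hidden = keras.layers.Dropout(0.5)(hidden)
outputs = keras.layers.Dense(2, activation="tanh")(hidden)

model = keras.Model(inputs=[in1, in2], outputs=outputs)
model.compile(loss="mean_squared_error", optimizer="adadelta")

# Random placeholder data, only to show the call shape.
a = np.random.rand(16, nx).astype("float32")
b = np.random.rand(16, nx).astype("float32")
preds = model.predict([a, b], verbose=0)
print(preds.shape)
```

model.fit takes the inputs the same way, as a list [in1_array, in2_array]. Separate inputs are mostly worth it when each one should go through its own layers before being merged.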

LSTM model weights to train data for text classification

I built an LSTM model for text classification using Keras. Now I have new data to train on. Instead of appending it to the original data and retraining the model from scratch, I thought of continuing training from the existing model weights, i.e. letting the saved weights be updated with the new data.
However, irrespective of how much data I train on, the model does not predict the correct class (even if I give it the same sentence for prediction). What could be the reason?
Kindly help me.

Are you using the following to save the trained model?
model.save('model.h5')
model.save_weights('model_weights.h5')
And the following to load it?
from keras.models import load_model
model = load_model('model.h5') # Load the architecture together with the saved weights
model.load_weights('model_weights.h5') # Set the weights in place; do not reassign, load_weights does not return the model
# train on new data
model.compile...
model.fit...
The model loaded is the exact same as the model being saved here. If you are doing this, then there must be something different in the data (in comparison with what it is trained on).

Tensorflow - building a CNN model as described in the tutorial

I just completed the implementation of A Guide to TF Layers: Building a Convolutional Neural Network for the MNIST data set. The training ran successfully and gave an accuracy of 97.3%.
However, the tutorial does not mention how to use the newly trained model to supply my own images and see the predictions. Does anyone know how to use the output of the training to make predictions? I see that the tmp/mnist_convnet_model$ folder contains some output files such as .pbtxt, meta and index files, but I can't find instructions on how to use them for making predictions on my own images.

y_pred = tf.nn.softmax(your_final_layer)
y_pred_cls = tf.argmax(y_pred, axis=1)
and for prediction
feed_dict = {x: [your_image]}
# session is assumed to be the tf.Session that holds the trained variables
classification = session.run(y_pred_cls, feed_dict=feed_dict)
print(classification)
This applies to just about any model you create.

keras: load saved model weights in a model for evaluation

I have finished training my model. During training, I used ModelCheckpoint to save the weights of the best model:
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1,
save_best_only=True, mode='max')
After training, I load the model weights into a model for evaluation, but I found that the model does not give the best accuracy observed during training. I reload the model as follows:
model.load_weights(filepath) #load saved weights
model = Sequential()
model.add(Convolution2D(32, 7, 7, input_shape=(3, 128, 128)))
....
....
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
#evaluate the model
scores = model.evaluate_generator(test_generator,val_samples)
print("Accuracy = ", scores[1])
The highest accuracy saved by ModelCheckpoint is about 85%, but the re-compiled model only gives an accuracy of 16%.
Am I doing something wrong?
To be safe, is there any way to directly save the best model rather than only the model weights?

Putting model.load_weights(filepath) after compiling the model fixes the problem!!
But I am still curious about saving the best model during training
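The ordering fix described above can be sketched as follows: define (and compile) the model first, then load the weights into it. The tiny dense network, the build_model helper and the file name best.weights.h5 are placeholders of mine, not the CNN or the path from the question.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras


def build_model():
    """Toy architecture standing in for the CNN in the question."""
    model = keras.Sequential([
        keras.Input(shape=(8,)),
        keras.layers.Dense(4, activation="relu"),
        keras.layers.Dense(3, activation="softmax"),
    ])
    model.compile(loss="categorical_crossentropy", optimizer="sgd",
                  metrics=["accuracy"])
    return model


x = np.random.rand(5, 8).astype("float32")

trained = build_model()  # pretend this one has been trained
with tempfile.TemporaryDirectory() as tmp:
    filepath = os.path.join(tmp, "best.weights.h5")  # hypothetical file name
    trained.save_weights(filepath)

    restored = build_model()         # 1. define (and compile) the model
    restored.load_weights(filepath)  # 2. only then load the saved weights

# Both models should now give the same predictions on the same input.
same = np.allclose(trained.predict(x, verbose=0),
                   restored.predict(x, verbose=0))
print(same)
```

In the question, load_weights is called before model = Sequential() creates a fresh model, so the object that gets evaluated still has randomly initialised weights, which would explain the 16%.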

Two tips for making sure you're using the best model trained:
1. Add the val_acc to the file name
You can create your ModelCheckpoint like this:
checkpoint = ModelCheckpoint('my-model-{val_acc:.2f}.hdf5', monitor='val_acc', verbose=1,
save_best_only=True, mode='max')
That way, you'll have multiple files, and you will be able to make sure you pick the best model.
2. Read the training output
When you look at the output of Keras while fitting, you'll see:
Epoch 000XX: val_acc improved from 0.8 to 0.85, saving model to my-model-0.85.hdf5
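As far as I know, Keras fills in the placeholders in the file name with ordinary Python string formatting, using the epoch number and the logged metrics. The snippet below only imitates that step with str.format; the accuracy values are invented for illustration and are not real training results.

```python
template = "my-model-{val_acc:.2f}.hdf5"

# Hypothetical validation accuracies at the epochs where val_acc improved.
improvements = [0.80, 0.8512]

# Unused keyword arguments such as epoch are simply ignored by str.format.
names = [
    template.format(epoch=i + 1, val_acc=acc)
    for i, acc in enumerate(improvements)
]
print(names)

# The file belonging to the highest accuracy.
best = max(zip(improvements, names))[1]
print(best)
```

Because the value is rounded to two decimals, two different epochs can map to the same file name. Adding {epoch:02d} to the template avoids that.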

Let's say you have a bunch of data that you are training on and you decide to save the weights for your best iteration only. If you have not iterated through all of your data before you find your 'best' model weights, you are effectively throwing away data, and any later evaluation using the so-called best weights will not match your in-batch evaluation.
