I trained a BART model (facebook-cnn) for summarization and compared summaries with a pretrained model
model_before_tuning_1 = AutoModelForSeq2SeqLM.from_pretrained(model_name)
trainer = Seq2SeqTrainer(
model=model,
args=training_args,
data_collator=data_collator,
train_dataset=train_data,
eval_dataset=validation_data,
tokenizer=tokenizer,
compute_metrics=compute_metrics,
)
trainer.train()
Summaries from model() and model_before_tuning_1() are different but when i compare the model config and/or print(model) it gives exact same things for both.
How to know, what exact parameters have this training changed?
You can compare state_dict of the models. I.e. model.state_dict() and model_before_tuning_1.state_dict().
State_dict contains learnable parameters that change during traning. For further details see: https://pytorch.org/tutorials/recipes/recipes/what_is_state_dict.html
Otherwise, printing the models or model config gives you the same results because the architecure does not change during training.
Related
I am training a language model using a Hugging face model. I am using a RoBERTa model and I am getting a problem when training. This is how I create the Trainer class using a DataCollatorForLanguageModeling as data_collator.
trainer = Trainer(
model=model,
args=training_args,
data_collator=collator,
train_dataset=train_dataset,
eval_dataset=eval_dataset,
tokenizer=tokenizer
#prediction_loss_only=True,
)
However, when I call trainer.get_train_dataloader().collate_fn it is using a RemoveColumnsCollator. I think this is the reason why the training is not working.
I found out this is a wrapper class for the data collator passed as an argument. It is possible to find it by doing
trainer.get_train_dataloader().collate_fn.data_collator
Good morning,
i tried to use a sequential model to create my neural network which have a multiple input (concatenated). But i want to know if shall i use The Keras functional API to CREATE my model.
in1= loadtxt('in1.csv', delimiter=',')#2D matrix
in2= loadtxt('in2.csv', delimiter=',')#2D matrix
y= loadtxt('y.csv', delimiter=',') #2D matrix (output labels)
X_train=np.hstack((in1,in2))
y_train=y
model = Sequential()
model.add(Dense(nbinneuron, input_dim=2*nx,activation='tanh',kernel_initializer='normal'))
model.add(Dropout(0.5))
#output layer
model.add(Dense(2, activation='tanh'))
opt =Adalta(lr=0.01)
model.compile(loss='mean_squared_error', optimizer=opt, metrics=['mse'])
# fit the keras model on the dataset
history=model.fit(X_train, y_train,validation_data=(X_test, y_test), epochs=500,verbose=0)
...
thanks an advance
A Sequential Model can only have one input and one output. To build a model with multiple inputs (and/or multiple outputs), you need to use the Functional API.
I built a LSTM model for text classification using Keras. Now I have new data to be trained. instead of appending to the original data and retrain the model, I thought of training the data using the model weights. i.e. making the weights to get trained with the new data.
However, irrespective of the volume i train, the model is not predicting the correct classification (even if i give the same sentence for prediction). What could be the reason?
Kindly help me.
Are you using the following to save the trained model?
model.save('model.h5')
model.save_weights('model_weights.h5')
And the following to load it?
from keras.models import load_model
model = load_model('model.h5') # Load the architecture
model = model.load_weights('model_weights.h5') # Set the weights
# train on new data
model.compile...
model.fit...
The model loaded is the exact same as the model being saved here. If you are doing this, then there must be something different in the data (in comparison with what it is trained on).
I just completed the implementation of A Guide to TF Layers: Building a Convolutional Neural Network for the MNIST data set. The training model successfully ran and gave accuracy of 97.3%.
However, the tutorial does not mention how to use this new trained model to supply own images and see the predictions. Does anyone know how to use the output of the training model to make predictions? I see in the tmp/mnist_convnet_model$ folder, there are some output files like .pbtxt , meta files and index files. But I can't find instructions to use them for making predictions on my own images.
y_pred = tf.nn.softmax(your_final_layer)
y_pred_cls = tf.argmax(y_pred, dimension=1)
and for prediction
feed_dict = {x: [your_image]}
classification = tf.run(y_pred_cls, feed_dict)
print classification
This applies to just about any model you create
I finishing model training processing. During training, I used ModelCheckpint to save the weights of the best model by:
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1,
save_best_only=True, mode='max')
After training, I load the model weights in to a model for evaluation, but I found the model does not give the best accuracy observed during training. I reload the model as follows:
model.load_weights(filepath) #load saved weights
model = Sequential()
model.add(Convolution2D(32, 7, 7, input_shape=(3, 128, 128)))
....
....
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
#evaluate the model
scores = model.evaluate_generator(test_generator,val_samples)
print("Accuracy = ", scores[1])
The highest accuracy saved by Modelcheckpoint is about 85%, but the re-compiled model only gives an accuracy of 16%?
Is there something wrong I am doing?
To be safe, is there any way to directly save the best model rather than the model weights?
Putting model.load_weights(filepath) after compiling the model fixes the problem!!
But I am still curious about saving the best model during training
Two tips for making sure you're using the best model trained:
Add the val_acc to the file name
You can create your ModelCheckpoint like this:
checkpoint = ModelCheckpoint('my-model-{val_acc:.2f}.hdf5', monitor='val_acc', verbose=1,
save_best_only=True, mode='max')
That way, you'll have multiple files, and you would be able to make sure you pick the best model.
Read the training output
When you look at the output of Keras while fitting, you'll see:
Epoch 000XX: val_acc improved from 0.8 to 0.85, saving model to my-model-0.85.hdf5
Let's say you have a bunch of data that you are training on and you decide to save the weights for your best iteration only. Now, if you have not iterated through all of your data before you find your 'best' model weights you will be effectively throwing away data and any later evaluation using the so called best weights will not correlate to your in-batch evaluation.