I am new to deep learning and doing some classification problems.
I use EarlyStopping and ModelCheckpoint in my callbacks list, but when training starts, the ModelCheckpoint baseline is negative infinity, so it immediately overwrites 'best_model.h5'.
However, 'best_model.h5' already stores my last best model. I want to set the ModelCheckpoint baseline to the performance of that last best model on the data.
Can anyone help me?
es = EarlyStopping(monitor='val_accuracy', mode='max', verbose=1, patience=3)
mc = ModelCheckpoint('best_model.h5', monitor='val_accuracy', mode='max', save_best_only=True, verbose=1)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, validation_data=(x_valid, y_valid), batch_size=400,
          epochs=20, callbacks=[es, mc])
Do this:
mc = ModelCheckpoint('best_model-{epoch:04d}_{val_accuracy:.2f}.h5', monitor='val_accuracy', mode='max', save_best_only=True, verbose=1)
This will save each new best model with the epoch number and validation accuracy in the filename, instead of overwriting best_model.h5. This should later help you pick out and compare the best models.
I think your problem is that you wanted to use the accuracy of the previous run as a baseline before the first epoch. Going back to how a general machine learning problem works, I don't think an accuracy value from before the first iteration is a meaningful comparison point (the model has not yet been trained on the given dataset). If you do want a baseline, you could monitor the validation loss (val_loss) instead, if that is possible.
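If you really do want to carry the previous run's best score over as the starting threshold, recent versions of tf.keras (2.6 and later, if available in your install) accept an initial_value_threshold argument on ModelCheckpoint. A minimal sketch, assuming prev_best holds the val_accuracy of the model already stored in best_model.h5 (for example obtained by re-evaluating it on the validation set):
prev_best = 0.87  # hypothetical value for illustration; use your saved model's actual val_accuracy
mc = ModelCheckpoint('best_model.h5', monitor='val_accuracy', mode='max',
                     save_best_only=True, verbose=1,
                     initial_value_threshold=prev_best)
With this, best_model.h5 is only overwritten once val_accuracy actually exceeds prev_best.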
But if you only want to keep a log of your training process, you don't need to save the model at every epoch. You can use the history returned by fit() instead (with import matplotlib.pyplot as plt):
results = model.fit(x_train, y_train, validation_data=(x_valid, y_valid), batch_size=400,epochs=20, callbacks=[es, mc])
plt.figure(figsize=(8, 8))
plt.title("Learning curve")
plt.plot(results.history["loss"], label="loss")
plt.plot(results.history["val_loss"], label="val_loss")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.legend()
plt.savefig('loss.png')
plt.figure(figsize=(8, 8))
plt.title("Learning curve")
plt.plot(results.history["accuracy"], label="accuracy")          # use "acc" on older Keras versions
plt.plot(results.history["val_accuracy"], label="val_accuracy")  # use "val_acc" on older Keras versions
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.legend()
plt.savefig('acc.png')
Related
Is there a way to reload the weights from a certain epoch or the best weights from the model checkpoint files created by ModelCheckpoint once the training is over?
I have a model that trained for 10 epochs, with a checkpoint that only saved weights after each epoch. The final epoch's val_categorical_accuracy is a bit lower than that of epoch no. 5. I know I should have set save_best_only=True, but I missed that.
So now, is there a way to get the weights from the best epoch or the epoch number 5?
Also, does ModelCheckpoint overwrite the weights in the checkpoint file after each epoch?
What are my options here? Thanks for your help in advance.
Below is my implementation:
checkpoint_path = 'saved_model/cp.ckpt'
checkpoint_dir = os.path.dirname(checkpoint_path)
print(checkpoint_dir)
lstm_model.fit(X_train_seq_pad, y_train_cat,
epochs=100,
validation_data=(X_val_seq_pad, y_val_cat),
callbacks=[callbacks.EarlyStopping(monitor='val_loss', patience=3),
callbacks.ModelCheckpoint(filepath=checkpoint_path,
save_weights_only=True,
verbose=1)])
If filepath doesn't contain formatting options like {epoch}, then the file is overwritten by each newly saved model. In your case, that's why you can't get the weights from a specific epoch (e.g. epoch 5).
Your option here, however, is to use the formatting options in the ModelCheckpoint filepath at training time, such as:
tf.keras.callbacks.ModelCheckpoint(
filepath='model.{epoch:02d}-{val_loss:.4f}.h5',
save_freq='epoch', verbose=1, monitor='val_loss',
save_weights_only=True, save_best_only=False
)
This will save the model weights (in .h5 format) at each epoch to a separate, conveniently named file. Additionally, if we set save_best_only=True, it will save only the best weights, named in the same way.
Code Example
Here is one end-to-end working example for reference. We save the model weights at each epoch in a convenient way by using formatting options in the filepath parameter, as follows:
import numpy as np
import tensorflow as tf

img = tf.random.normal([20, 32], 0, 1, tf.float32)
tar = np.random.randint(2, size=(20, 1))
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10, input_dim = 32, activation= 'relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
callback_list = [
tf.keras.callbacks.ModelCheckpoint(
filepath='model.{epoch:02d}-{val_loss:.4f}.h5',
save_freq='epoch', verbose=1, monitor='val_loss',
save_weights_only=True, save_best_only=False
)
]
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(img, tar, epochs=5, verbose=2, validation_split=0.2,
callbacks=callback_list)
This saves the model weights at each epoch, and every weight file can be found on the local disk:
# model.epoch_number_score.h5
model.01-0.8022.h5
model.02-0.8014.h5
model.03-0.8005.h5
model.04-0.7997.h5
model.05-0.7989.h5
However, note that I used save_best_only=False; if we set it to True, you only get the best weights, named in the same way. Something like this:
# model.epoch_number_score.h5
model.01-0.8022.h5
model.03-0.8005.h5
model.05-0.7989.h5
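Once such per-epoch files exist, restoring the weights of a particular epoch is simply a matter of loading that file into a model with the same architecture. A minimal sketch, reusing the model and data from the example above and one of the filenames from the listing:
# load the weights saved at epoch 05 (filename taken from the listing above)
model.load_weights('model.05-0.7989.h5')
loss, acc = model.evaluate(img, tar, verbose=0)
print('restored epoch 05 - loss: %.4f, acc: %.4f' % (loss, acc))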
I need an explanation of the save_best_only option of ModelCheckpoint. If I have code like this:
model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
cp = [ModelCheckpoint(filepath=path+"/model-lstmMulti", verbose=1, save_best_only=True)]
history_callback = model.fit(X, y, epochs=350, verbose=1, callbacks=cp)
and then I want to see the accuracy of that best model:
acc_history = history_callback.history["acc"]
np.savetxt(path+"/acc_history.txt", np.asarray(acc_history))
I get an array, i.e. the accuracy of the model for every epoch. Why don't I get only one value, the accuracy of the best model?
ModelCheckpoint is a callback used to save the model file (.h5) during training. It doesn't affect the history returned by the fit() method. Just use np.max to get the best accuracy from the accuracy history.
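For example, a minimal sketch of pulling the best value (and the epoch it occurred at) out of the history object returned by fit():
import numpy as np

acc_history = history_callback.history["acc"]  # the key may be "accuracy" depending on the Keras version
best_acc = np.max(acc_history)                 # accuracy of the best epoch
best_epoch = np.argmax(acc_history) + 1        # epoch numbers in the logs are 1-based
print("best accuracy %.4f at epoch %d" % (best_acc, best_epoch))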
My task is to detect defective items in a factory, i.e. to classify goods as defective or fine. This leads to a problem where one class dominates the other (one class is 99.7% of the data), since defective items are very rare. Training accuracy is 0.9971 and validation accuracy is 0.9970, which sounds amazing.
But the problem is that the model predicts everything as class 0, which is fine goods. That means it fails to classify any defective goods.
How can I solve this problem? I have checked other questions and tried their suggestions, but I still have the same situation. The dataset has 122,400 rows and 5 features (x).
In the end, my confusion matrix on the test set looks like this:
array([[30508, 0],
[ 92, 0]], dtype=int64)
which does a terrible job.
My code is as below:
le = LabelEncoder()
y = le.fit_transform(y)
ohe = OneHotEncoder(sparse=False)
y = y.reshape(-1,1)
y = ohe.fit_transform(y)
scaler = StandardScaler()
x = scaler.fit_transform(x)
x_train, x_test, y_train, y_test = train_test_split(x,y,test_size = 0.25, random_state = 777)
#DNN Modelling
epochs = 15
batch_size =128
Learning_rate_optimizer = 0.001
model = Sequential()
model.add(Dense(5,
kernel_initializer='glorot_uniform',
activation='relu',
input_shape=(5,)))
model.add(Dense(5,
kernel_initializer='glorot_uniform',
activation='relu'))
model.add(Dense(8,
kernel_initializer='glorot_uniform',
activation='relu'))
model.add(Dense(2,
kernel_initializer='glorot_uniform',
activation='softmax'))
model.compile(loss='binary_crossentropy',
optimizer=Adam(lr = Learning_rate_optimizer),
metrics=['accuracy'])
history = model.fit(x_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(x_test, y_test))
y_pred = model.predict(x_test)
confusion_matrix(y_test.argmax(axis=1), y_pred.argmax(axis=1))
Thank you
It sounds like you have a highly imbalanced dataset; the model is only learning how to classify fine goods.
You can try one of the approaches listed here:
https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/
The best first attempt would be to take roughly equal portions of data from both classes, split them into train/validation/test sets, train the classifier, and then do thorough testing on your complete dataset. You can also apply data augmentation techniques to the minority class to get more data from the same set. Keep iterating, and maybe even change your loss function (or its weighting) to suit your situation.
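As one concrete, hedged option alongside resampling: Keras's fit() accepts a class_weight argument, and scikit-learn can compute balanced weights for you. A minimal sketch, reusing the variable names from the question's code:
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# y_train is one-hot encoded in the question, so recover the integer labels first
y_train_labels = y_train.argmax(axis=1)
weights = compute_class_weight(class_weight='balanced',
                               classes=np.unique(y_train_labels),
                               y=y_train_labels)
class_weight = dict(enumerate(weights))  # e.g. roughly {0: 0.5, 1: 170} for a 99.7/0.3 split

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test),
                    class_weight=class_weight)
This makes errors on the rare defective class cost much more during training, which usually stops the model from collapsing onto the majority class.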
I am a newbie in ML and am experimenting with emotion detection on text.
I have the ISEAR dataset, which contains tweets labeled with their emotion.
My current accuracy is 63% and I want to increase it to at least 70%, or maybe even more.
Here's the code:
inputs = Input(shape=(MAX_LENGTH, ))
embedding_layer = Embedding(vocab_size,
64,
input_length=MAX_LENGTH)(inputs)
# x = Flatten()(embedding_layer)
x = LSTM(32, input_shape=(32, 32))(embedding_layer)
x = Dense(10, activation='relu')(x)
predictions = Dense(num_class, activation='softmax')(x)
model = Model(inputs=[inputs], outputs=predictions)
model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['acc'])
model.summary()
filepath="weights-simple.hdf5"
checkpointer = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
history = model.fit([X_train], batch_size=64, y=to_categorical(y_train), verbose=1, validation_split=0.1,
shuffle=True, epochs=10, callbacks=[checkpointer])
That's a pretty general question; optimizing the performance of a neural network may require tuning many factors.
For instance:
The choice of optimizer: in NLP tasks rmsprop is also a popular optimizer
Tweaking the learning rate
Regularization, e.g. dropout, recurrent_dropout, batch normalization; this may help the model generalize better
More units in the LSTM
More dimensions in the embedding
You can try a grid search, e.g. training with different optimizers and evaluating on a validation set.
The data may also need some tweaking, such as:
Text normalization: a better representation of the tweets, e.g. removing unnecessary tokens such as # and @ symbols
Shuffling the data before the fit, since Keras's validation_split creates the validation set from the last records of the data
There is no simple answer to your question.
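Still, here is a minimal sketch combining a few of the model-side suggestions above (rmsprop, dropout/recurrent_dropout, a wider LSTM, a larger embedding); the specific numbers are arbitrary starting points to tune, not recommended values:
from keras.layers import Input, Embedding, LSTM, Dense, Dropout
from keras.models import Model

inputs = Input(shape=(MAX_LENGTH,))
x = Embedding(vocab_size, 128, input_length=MAX_LENGTH)(inputs)  # larger embedding dimension
x = LSTM(64, dropout=0.2, recurrent_dropout=0.2)(x)              # more units plus regularization
x = Dense(32, activation='relu')(x)
x = Dropout(0.3)(x)                                              # extra dropout before the output layer
predictions = Dense(num_class, activation='softmax')(x)

model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',                               # alternative optimizer to try
              loss='categorical_crossentropy',
              metrics=['acc'])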
I trained a Keras Sequential model and loaded it again later. The two copies of the model give different accuracy.
I have come across a similar question but was not able to solve the problem.
Sample code:
Loading and training the model
model = gensim.models.FastText.load('abc.simple')
X,y = load_data()
Vectors = np.array(vectors(X))
X_train, X_test, y_train, y_test = train_test_split(Vectors, np.array(y),
test_size = 0.3, random_state = 0)
X_train = X_train.reshape(X_train.shape[0],100,max_tokens,1)
X_test = X_test.reshape(X_test.shape[0],100,max_tokens,1)
# data for input to our model
print(X_train.shape)
model2 = train()
score = model2.evaluate(X_test, y_test, verbose=0)
print(score)
Training Accuracy is 90%.
Saved the Model
# Saving Model
model_json = model2.to_json()
with open("model_architecture.json", "w") as json_file:
json_file.write(model_json)
model2.save_weights("model_weights.h5")
print("Saved model to disk")
But after I restarted the kernel, loaded the saved model, and ran it on the same set of data, the accuracy dropped.
#load json and create model
json_file = open('model_architecture.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
#load weights into new model
loaded_model.load_weights("model_weights.h5")
print("Loaded model from disk")
# evaluate loaded model on test data
loaded_model.compile(loss='binary_crossentropy', optimizer='rmsprop',
metrics=['accuracy'])
score = loaded_model.evaluate(X_test, y_test, verbose=0)
print(score)
Accuracy dropped to 75% on the same set of data.
How can I make it consistent?
I have tried the following, but it did not help:
from keras.backend import manual_variable_initialization
manual_variable_initialization(True)
I even saved the whole model at once (weights and architecture), but that did not solve the issue either.
Not sure if your issue has been solved, but for future readers:
I had exactly the same problem with saving and loading the weights. On loading the model, the accuracy and loss changed greatly, from 68% accuracy down to 2%. In my experiment I am using TensorFlow as the backend with Keras Embedding, LSTM and Dense layers. My issue was solved by fixing the seed for Keras, which uses the NumPy random generator, and, since I am using TensorFlow as the backend, I also fixed the TensorFlow seed.
These are the lines I added at the top of my file where the model is also defined.
from numpy.random import seed
seed(42)  # fix the NumPy seed used by Keras
import tensorflow as tf
tf.random.set_seed(42)  # fix the TensorFlow seed
I hope this helps.
For more information, have a look at this: https://machinelearningmastery.com/reproducible-results-neural-networks-keras/
I had the same problem due to a silly mistake of mine: after loading the model, the shuffle option in my data generator (useful for training) was set to True instead of False. After changing it to False the model predicted as expected. It would be nice if Keras could take care of this automatically. This is the critical part of my code:
pred_generator = pred_datagen.flow_from_directory(
directory='./ims_dir',
target_size=(100, 100),
color_mode="rgb",
batch_size=1,
class_mode="categorical",
shuffle=False,
)
model = load_model(logpath_ms)
pred=model.predict_generator(pred_generator, steps = N, verbose=1)
My code worked once I scaled my dataset before re-evaluating the model. I had applied this preprocessing before saving the model, but forgot to repeat it when I loaded the model and wanted to evaluate it again. After doing that, the accuracy came out as it should \o/
model_saved = keras.models.load_model('tuned_cnn_1D_HAR_example.h5')
trainX, trainy, testX, testy = load_dataset()
trainX, testX = scale_data(trainX, testX, True)
score = model_saved.evaluate(testX, testy, verbose=0)
print("%s: %.2f%%" % (model_saved.metrics_names[1], score[1]*100))
Inside my scale_data function I used StandardScaler().
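For reference, a minimal sketch of what such a scale_data helper might look like; this is an assumption about its implementation based on the description above, and it presumes 3D input of shape (samples, timesteps, features) as in the HAR example:
from sklearn.preprocessing import StandardScaler

def scale_data(trainX, testX, standardize):
    # fit the scaler on the training windows only, then apply it to both sets
    if not standardize:
        return trainX, testX
    flat_train = trainX.reshape(-1, trainX.shape[-1])
    flat_test = testX.reshape(-1, testX.shape[-1])
    scaler = StandardScaler().fit(flat_train)
    trainX = scaler.transform(flat_train).reshape(trainX.shape)
    testX = scaler.transform(flat_test).reshape(testX.shape)
    return trainX, testX
The key point is that the same scaling has to be applied both when the model is trained and again whenever it is reloaded and evaluated.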