I am finishing a model training process. During training, I used ModelCheckpoint to save the weights of the best model:
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1,
                             save_best_only=True, mode='max')
After training, I load the saved weights into a model for evaluation, but the model does not give the best accuracy observed during training. I reload the model as follows:
model.load_weights(filepath)  # load saved weights
model = Sequential()
model.add(Convolution2D(32, 7, 7, input_shape=(3, 128, 128)))
....
....
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
# evaluate the model
scores = model.evaluate_generator(test_generator, val_samples)
print("Accuracy = ", scores[1])
The highest accuracy saved by ModelCheckpoint is about 85%, but the recompiled model only gives an accuracy of about 16%. What am I doing wrong?
To be safe, is there any way to save the best model directly, rather than just the model weights?
Putting model.load_weights(filepath) after building and compiling the model fixes the problem!
But I am still curious about saving the best model during training.
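For reference, the working order looks like this (a minimal sketch reusing the names from the question; sgd, filepath, test_generator and val_samples are defined as before):

# build the architecture first
model = Sequential()
model.add(Convolution2D(32, 7, 7, input_shape=(3, 128, 128)))
....
....
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
model.load_weights(filepath)  # load weights only after the model is built and compiled

scores = model.evaluate_generator(test_generator, val_samples)
print("Accuracy = ", scores[1])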
Two tips for making sure you're using the best model trained:
1. Add the val_acc to the file name
You can create your ModelCheckpoint like this:
checkpoint = ModelCheckpoint('my-model-{val_acc:.2f}.hdf5', monitor='val_acc', verbose=1,
                             save_best_only=True, mode='max')
That way you'll have multiple files, and you can be sure to pick the best model.
2. Read the training output
When you look at the output of Keras while fitting, you'll see:
Epoch 000XX: val_acc improved from 0.8 to 0.85, saving model to my-model-0.85.hdf5
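As for saving the best full model directly rather than just the weights: by default (save_weights_only=False), ModelCheckpoint writes the entire model (architecture, weights and optimizer state), which you can reload without rebuilding anything. A minimal sketch, assuming X_train/y_train and X_val/y_val are your arrays:

from keras.callbacks import ModelCheckpoint
from keras.models import load_model

checkpoint = ModelCheckpoint('best-model.hdf5', monitor='val_acc',
                             save_best_only=True, mode='max')
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          callbacks=[checkpoint])

best_model = load_model('best-model.hdf5')  # no need to redefine or recompile the model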
Let's say you are training on a large dataset and save the weights for your best iteration only. If training stops before the model has iterated through all of your data, the 'best' weights were chosen from an incomplete pass, so you are effectively throwing away data, and any later evaluation using those so-called best weights will not correlate with your in-batch evaluation.
Related
I'm training my model with ModelCheckpoint (save_best_only=True) and EarlyStopping (patience=5), using my test data for validation. When I then call evaluate() on the same test data with the same callbacks, I get the metrics of the last epoch's model, not the "best model" ones. How can I use evaluate() and get my best model's metrics? I want to run evaluate() on out-of-time data to get these metrics for comparison, and I need to be sure it is using the best model.
model.fit(x_train, y_train, callbacks=[stopper, checkpoint], validation_data=(x_test, y_test))
model.evaluate(x_test, y_test, callbacks=[stopper, checkpoint])
I expect the metrics from epoch 4 (the best epoch), but I get the metrics from epoch 9 (the last one).
After training, you need to load the best model and then evaluate it. Also, you don't need the callbacks in model.evaluate(); ModelCheckpoint and EarlyStopping only act during fit().
So the code could be something like this:
# callback to save the best model
checkpoint_path = 'best_model.h5'
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                save_best_only=True)
# train
model.fit(x_train,
          y_train,
          callbacks=[stopper, checkpoint],
          validation_data=(x_test, y_test))
# load best model
model.load_weights(checkpoint_path)
# evaluate
model.evaluate(x_test, y_test)
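Another option, if your tf.keras version supports it (recent versions do), is to let EarlyStopping restore the best weights at the end of training, so the in-memory model already holds them when you call evaluate(). A sketch under that assumption:

stopper = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                           patience=5,
                                           restore_best_weights=True)

model.fit(x_train, y_train,
          callbacks=[stopper, checkpoint],
          validation_data=(x_test, y_test))

# the model now holds the best-epoch weights, not the last-epoch ones
model.evaluate(x_test, y_test)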
I am trying to add data augmentation as a layer to a model, but I am getting the following error:
TypeError: The added layer must be an instance of class Layer. Found: <tensorflow.python.keras.preprocessing.image.ImageDataGenerator object at 0x7f8c2dea0710>
data_augmentation = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=30, horizontal_flip=True)

model = Sequential()
model.add(data_augmentation)
model.add(Dense(1028, input_shape=(final_features.shape[1],)))
model.add(Dropout(0.7, input_shape=(final_features.shape[1],)))
model.add(Dense(n_classes, activation='softmax', kernel_regularizer='l2'))
model.compile(optimizer=adam,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
history = model.fit(final_features, y,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_split=0.2,
                    callbacks=[lrr, EarlyStop])
I have also tried this way:
data_augmentation = Sequential(
    [
        preprocessing.RandomFlip("horizontal"),
        preprocessing.RandomRotation(0.1),
        preprocessing.RandomZoom(0.1),
    ]
)

model = Sequential()
model.add(data_augmentation)
model.add(Dense(1028, input_shape=(final_features.shape[1],)))
model.add(Dropout(0.7, input_shape=(final_features.shape[1],)))
model.add(Dense(n_classes, activation='softmax', kernel_regularizer='l2'))
model.compile(optimizer=adam,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
history = model.fit(final_features, y,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_split=0.2,
                    callbacks=[lrr, EarlyStop])
It gives an error:
ValueError: Input 0 of layer sequential_7 is incompatible with the layer: expected ndim=4, found ndim=2. Full shape received: [128, 14272]
Could you please advise how I can use augmentation in Keras?
In your first case, you are using ImageDataGenerator as a layer, which it is not: as the name says, it is just a generator that applies random transformations to images (image augmentation) before feeding them to the network. So the images are augmented on the CPU and then fed to the neural network, which can run on the GPU if you have one.
Generators are also commonly used to avoid loading huge datasets into memory, since they let you load only the batches about to be used.
In the second case, you are properly using image augmentation as layers of your model. The difference here is that the augmentation runs as part of the model, so if you have a GPU available, for instance, those operations will run on the GPU.
The problem with your second case is in the model itself (in fact, the model is also wrong in the first approach; you just get an error earlier there because of the incorrect use of ImageDataGenerator, before execution ever reaches the model).
Note that you are using images as input, so the input should be of shape (height, width, channels), but you are starting your model with a Dense layer, which expects a single array of shape (n_features,).
If your model needs to start with a Dense layer (strange, but it may be fine in some cases), then you first need a Flatten layer to convert images of shape (h, w, c) into vectors of shape (h*w*c,). This change will fix your second approach for sure.
That said, you don't need to specify the input shape on every single layer: doing it in your first layer is enough.
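Putting both fixes together, a corrected version of your second approach could look like this; it's only a sketch that assumes your inputs really are images, and img_height, img_width and n_classes are placeholders you'd fill in:

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers.experimental import preprocessing

model = Sequential([
    # augmentation runs inside the model (on GPU if one is available)
    preprocessing.RandomFlip("horizontal", input_shape=(img_height, img_width, 3)),
    preprocessing.RandomRotation(0.1),
    preprocessing.RandomZoom(0.1),
    Flatten(),  # (h, w, c) -> (h*w*c,), so the Dense layers receive vectors
    Dense(1028),
    Dropout(0.7),
    Dense(n_classes, activation='softmax', kernel_regularizer='l2'),
])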
Last, but not least: are you sure this model is being fed images? Judging from your fit call, it looks like you are training on previously extracted features, which are probably vectors (that makes sense with your current model architecture, but makes no sense combined with image augmentation).
Please provide more details about your data to clarify this point.
Good morning,
I tried to use a Sequential model to create my neural network, which has multiple (concatenated) inputs. I want to know whether I should use the Keras Functional API to create my model.
in1 = loadtxt('in1.csv', delimiter=',')  # 2D matrix
in2 = loadtxt('in2.csv', delimiter=',')  # 2D matrix
y = loadtxt('y.csv', delimiter=',')      # 2D matrix (output labels)

X_train = np.hstack((in1, in2))
y_train = y

model = Sequential()
model.add(Dense(nbinneuron, input_dim=2*nx, activation='tanh', kernel_initializer='normal'))
model.add(Dropout(0.5))
# output layer
model.add(Dense(2, activation='tanh'))

opt = Adadelta(lr=0.01)
model.compile(loss='mean_squared_error', optimizer=opt, metrics=['mse'])
# fit the keras model on the dataset
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=500, verbose=0)
...
Thanks in advance.
A Sequential Model can only have one input and one output. To build a model with multiple inputs (and/or multiple outputs), you need to use the Functional API.
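A minimal sketch with the Functional API for your case (two inputs concatenated; nx, nbinneuron, in1, in2 and y_train as in your code):

from tensorflow.keras.layers import Input, Dense, Dropout, concatenate
from tensorflow.keras.models import Model

input1 = Input(shape=(nx,))
input2 = Input(shape=(nx,))
merged = concatenate([input1, input2])

hidden = Dense(nbinneuron, activation='tanh', kernel_initializer='normal')(merged)
hidden = Dropout(0.5)(hidden)
output = Dense(2, activation='tanh')(hidden)

model = Model(inputs=[input1, input2], outputs=output)
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mse'])

# the two inputs are now passed as a list instead of being hstack-ed
history = model.fit([in1, in2], y_train, epochs=500, verbose=0)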
I am a bit confused about how Keras fits models. In general, Keras models are fitted simply by calling model.fit(...), something like the following:
model.fit(X_train, y_train, epochs=300, batch_size=64, validation_data=(X_test, y_test))
My question is: because I passed the testing data via the argument validation_data=(X_test, y_test), does that mean each epoch is independent? In other words, I understand that at each epoch, Keras trains the model on the (shuffled) training data and then tests the trained model on the provided validation_data. If that's the case, then no matter how many epochs I choose, I only ever take the results of the last epoch!
If this scenario is correct, why do we need multiple epochs? Unless the epochs are dependent somehow, with each epoch starting from the weights of the previous epoch. Correct?
Thank you
When Keras fits your model, it passes through the whole dataset at each epoch, in steps of your batch_size.
For example, if you have a dataset of 1000 items and a batch_size of 8, the weights of your model are updated using 8 items at a time, until the model has seen your entire dataset.
At the end of that epoch, the model makes predictions on your validation set.
If we ran only one epoch, the weights would be updated from only a single pass over the data (because the model "saw" the complete dataset only once).
But to minimize the loss function via backpropagation, we need to update those weights many times to approach the optimal loss, so we pass through the whole dataset multiple times; in other words, multiple epochs. Crucially, each epoch continues from the weights left by the previous one; nothing is reset between epochs.
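Conceptually, fit() is just a nested loop where the weights carry over across epochs; a pseudocode sketch (the function names here are illustrative, not real Keras internals):

initialize_weights(model)                           # happens once, before the first epoch
for epoch in range(epochs):
    for batch in split(training_data, batch_size):  # e.g. 1000 items / 8 = 125 batches
        loss = forward_pass(model, batch)
        update_weights(model, loss)                 # updates the same weights in place
    report_metrics(model, validation_data)          # the val_ numbers printed each epoch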
I hope that's clear; ask if you need more information.
I'm running some larger models and want to keep intermediate results.
Therefore, I'm trying to use checkpoints to save the best model after each epoch.
This is my code:
model = Sequential()
model.add(LSTM(700, input_shape=(X_modified.shape[1], X_modified.shape[2]), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(700, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(700))
model.add(Dropout(0.2))
model.add(Dense(Y_modified.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Save the checkpoint in the /output folder
filepath = "output/text-gen-best.hdf5"
# Keep only a single checkpoint, the best over validation accuracy.
checkpoint = ModelCheckpoint(filepath,
                             monitor='val_acc',
                             verbose=1,
                             save_best_only=True,
                             mode='max')

model.fit(X_modified, Y_modified, epochs=100, batch_size=50, callbacks=[checkpoint])
But I am still getting the warning after the first epoch:
/usr/local/lib/python3.6/site-packages/keras/callbacks.py:432: RuntimeWarning: Can save best model only with val_acc available, skipping.
'skipping.' % (self.monitor), RuntimeWarning)
Adding metrics=['accuracy'] to the model was the solution in other SO questions (e.g. Unable to save weights while using pre-trained VGG16 model), but here the warning remains.
You are trying to checkpoint the model using the following code:
# Save the checkpoint in the /output folder
filepath = "output/text-gen-best.hdf5"
# Keep only a single checkpoint, the best over test accuracy.
checkpoint = ModelCheckpoint(filepath,
                             monitor='val_acc',
                             verbose=1,
                             save_best_only=True,
                             mode='max')
ModelCheckpoint uses the monitor argument to decide whether to save the model. In your code it is val_acc, so the callback will save the weights whenever there is an increase in val_acc.
Now, in your fit call,
model.fit(X_modified, Y_modified, epochs=100, batch_size=50, callbacks=[checkpoint])
you haven't provided any validation data. ModelCheckpoint can't save the weights because the quantity it monitors is never computed.
To checkpoint based on val_acc, you must provide some validation data, like this:
model.fit(X_modified, Y_modified, validation_data=(X_valid, y_valid), epochs=100, batch_size=50, callbacks=[checkpoint])
If you don't want to use validation data, for whatever reason, but still want checkpointing, you have to change ModelCheckpoint to work from acc or loss instead, like this:
# Save the checkpoint in the /output folder
filepath = "output/text-gen-best.hdf5"
# Keep only a single checkpoint, the best over training accuracy.
checkpoint = ModelCheckpoint(filepath,
                             monitor='acc',
                             verbose=1,
                             save_best_only=True,
                             mode='max')
Keep in mind that you have to change mode to min if you are going to monitor the loss.
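For example, to checkpoint on the lowest training loss with the same setup:

checkpoint = ModelCheckpoint(filepath,
                             monitor='loss',
                             verbose=1,
                             save_best_only=True,
                             mode='min')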
The metric is missing not because accuracy isn't computed, but because you have no validation data. Add some through the validation_data parameter of fit, or use validation_split.
I had the same issue; just change 'val_acc' to 'val_accuracy'.
You just have to change monitor='val_acc' to monitor='val_accuracy', since your metric is metrics=['accuracy'] (newer Keras versions report it as val_accuracy).