In machine learning tutorials that use Keras, the code to train the model is typically this single call:
model.fit(X_train,
          Y_train,
          epochs=5,
          batch_size=128,
          verbose=1,
          validation_split=0.1)
This looks easy when the training data X_train and Y_train (NumPy ndarrays) are small. In practice, though, the training data can run into gigabytes, which may be too large to fit into the computer's RAM.
How do you feed data into model.fit() when the training data is too large to hold in memory?
Keras has a simple solution for this: use Python generators, so your data is loaded lazily, one batch at a time. If you are working with images, you can also use the ImageDataGenerator (a sketch follows the generator example below).
import numpy as np

def generate_data(x, y, batch_size):
    while True:  # loop forever; Keras decides when an epoch ends
        for start in range(0, len(x), batch_size):
            end = start + batch_size
            # lazily build and yield one batch of inputs and targets
            yield np.array(x[start:end]), np.array(y[start:end])
model.fit_generator(
    generator=generate_data(x, y, batch_size),
    steps_per_epoch=num_batches,
    validation_data=generate_data(x_val, y_val, batch_size),
    validation_steps=num_batches_test)
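If your data is images on disk, the ImageDataGenerator mentioned above can stream batches for you. A minimal sketch (the directory path, image size, and class_mode here are placeholder assumptions):

from keras.preprocessing.image import ImageDataGenerator

# stream batches from a directory instead of loading everything into RAM
datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = datagen.flow_from_directory('data/train',  # placeholder path
                                              target_size=(128, 128),
                                              batch_size=128,
                                              class_mode='categorical')

model.fit_generator(train_generator,
                    steps_per_epoch=len(train_generator),
                    epochs=5)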
Related
I'm training my model using ModelCheckpoint save_best_only=True and EarlyStopping patience=5. I use my test data for validation. When applying evaluate() using the same test data and the same callbacks, I get the same metrics as the last epoch model's val_ metrics, not the "best model" ones. How can I use evaluate() and get my best model's metrics? I want to use evaluate() on out-of-time data to get these metrics for comparison and need to make sure it is using the best model.
model.fit(x_train, y_train, callbacks=[stopper, checkpoint], validation_data=(x_test, y_test))
model.evaluate(x_test, y_test, callbacks=[stopper, checkpoint])
I expect metrics from epoch 4, but get metrics from epoch 9.
After training, you need to load the best model and then evaluate it. You also do not need the callbacks in model.evaluate(): ModelCheckpoint and EarlyStopping only take effect during training.
So the code could look something like this:
# callback to save the best model
checkpoint_path = 'best_model.h5'
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                save_best_only=True)

# train
model.fit(x_train,
          y_train,
          callbacks=[stopper, checkpoint],
          validation_data=(x_test, y_test))
# load best model
model.load_weights(checkpoint_path)
# evaluate
model.evaluate(x_test, y_test)
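Alternatively, if your Keras/TensorFlow version is recent enough to support it, EarlyStopping can restore the best weights itself when training stops, so the model left in memory is already the best one and evaluate() reports its metrics. A sketch of that variant:

# restore_best_weights reloads the weights from the best epoch once
# training stops, so no separate load step is needed before evaluate()
stopper = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                           patience=5,
                                           restore_best_weights=True)

model.fit(x_train, y_train,
          callbacks=[stopper, checkpoint],
          validation_data=(x_test, y_test))

model.evaluate(x_test, y_test)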
I am training an autoencoder neural network for work. I am using an image NumPy array dataset as input (16110 samples in total) and want to split it into training and test sets using the autoencoder.fit command below. While training, Keras reports "Train on 12856 samples, validate on 3254 samples".
However, I need to save both the training and the testing data into separate files. How can I do that?
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=5)
mc = ModelCheckpoint('best_model.h5', monitor='val_loss', mode='min', save_best_only=True)
history = autoencoder.fit(dataNoise, dataNoise, epochs=30, batch_size=256, shuffle=True,
                          callbacks=[es, mc], validation_split=0.2)
You can use the train_test_split function from sklearn. See the code below:
from sklearn.model_selection import train_test_split

train_split = 0.9  # set this as the fraction you want for training
train_noise, valid_noise = train_test_split(dataNoise, train_size=train_split,
                                            shuffle=True, random_state=123)
Now use train_noise as both the input and the target (it is an autoencoder) and valid_noise as the validation data in model.fit, as in the sketch below.
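To also save the two splits into separate files, as asked, one option is NumPy's save (a sketch; the file names are placeholders), and the held-out split is then passed explicitly to fit instead of using validation_split:

import numpy as np

# persist both splits so they can be reloaded later
np.save('train_noise.npy', train_noise)
np.save('valid_noise.npy', valid_noise)

# for an autoencoder the input is also the target
history = autoencoder.fit(train_noise, train_noise,
                          epochs=30, batch_size=256, shuffle=True,
                          callbacks=[es, mc],
                          validation_data=(valid_noise, valid_noise))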
My question is: I ran a Keras model for 100 epochs (epochs=100) and then stopped for a while to let the CPU and GPU cool down.
I then ran another 100 epochs, and the loss kept decreasing from where it had stopped in the previous 100 epochs.
Does this work in all conditions?
For example, if I want to train my model for 1000 epochs, can I stop after every 100 epochs, wait until my CPU and GPU cool down, and then run the next 100 epochs?
It will not work in all conditions. For example, if you shuffle the data and use a validation split like this:
model.fit(x, y, epochs=1, verbose=1, validation_split=0.2, shuffle=True)
then over repeated fit() calls you can end up training on the entire dataset, which is not what you expect.
Furthermore, by doing multiple fits you erase the history information (accuracy, loss, etc. at each epoch) given by:
model.history
So callbacks that rely on this kind of cross-epoch state, like EarlyStopping, will not work properly across separate fit() calls (source code here).
Otherwise it works, because calling fit() again does not reset the Keras optimizer state, as you can see in the source code of the Keras optimizers (Adadelta optimizer).
However, I do not recommend doing this, because it could cause bugs in future development. A cleaner way is to create a custom callback that pauses between epochs, like this:
import time
import keras

class DelayCallback(keras.callbacks.Callback):
    def __init__(self, delay_value=10, epoch_to_complete=10):
        super(DelayCallback, self).__init__()
        self.delay_value = delay_value              # pause length, in seconds
        self.epoch_to_complete = epoch_to_complete  # pause every N epochs

    def on_epoch_begin(self, epoch, logs={}):
        if (epoch + 1) % self.epoch_to_complete == 0:
            print("cooling down")
            time.sleep(self.delay_value)
        return

model.fit(x_train, y_train,
          batch_size=32,
          epochs=20,
          verbose=1, callbacks=[DelayCallback()])
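If you prefer to stop the Python process entirely between cooling breaks, a rough sketch (the file name is a placeholder) is to save the full model, which includes the optimizer state, and resume later with initial_epoch so the epoch counter continues where it left off:

from keras.models import load_model

# first session: train the first 100 epochs, then save everything to disk
model.fit(x_train, y_train, epochs=100, initial_epoch=0)
model.save('checkpoint_epoch_100.h5')

# later session: reload and continue from epoch 100 up to epoch 200
model = load_model('checkpoint_epoch_100.h5')
model.fit(x_train, y_train, epochs=200, initial_epoch=100)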
This question already has answers here: How to get reproducible results in keras.
I trained a Keras model on MNIST, keeping the training and model hyperparameters the same. The training and validation data were exactly the same. I got five different accuracies (0.71, 0.62, 0.59, 0.52, 0.46) in different training sessions. The model was trained for 8 epochs from scratch every time.
This is the code:
from keras.models import Sequential
from keras.layers import Dense

def train():
    model = Sequential()
    model.add(Dense(32, input_dim=784))
    model.add(Dense(10, activation="softmax"))
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, batch_size=32, epochs=8, verbose=0)
    results = model.evaluate(x_test, y_test, verbose=0)
    print(results[1])

for i in range(5):
    print(i)
    train()
Results:
0
2019-01-23 13:26:54.475110: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
0.7192
1
0.6223
2
0.5976
3
0.5223
4
0.4624
It may simply be because the weights of the models are initialized randomly every time. Suppose I train two models with the same data and hyperparameters. Since they start with different weights, their loss and accuracy will differ. But after a certain number of epochs, both should converge to a point where their accuracies and losses look the same. That point could be a minimum of the loss, since the data is the same; otherwise, it could be a point from which both models follow the same path toward convergence.
In your case, training for a greater number of epochs might bring all the models to similar losses and accuracies.
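If you want the runs to be more comparable, a common first step (a sketch, assuming a TensorFlow backend; the exact call differs between TensorFlow 1.x and 2.x) is to fix the random seeds before building the model:

import random
import numpy as np
import tensorflow as tf

# fixing seeds makes weight initialization and shuffling repeatable;
# some GPU operations may still be non-deterministic
random.seed(42)
np.random.seed(42)
tf.random.set_seed(42)  # on TensorFlow 1.x use tf.set_random_seed(42)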
I am finishing a model training process. During training, I used ModelCheckpoint to save the weights of the best model:
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1,
                             save_best_only=True, mode='max')
After training, I load the saved weights into a model for evaluation, but the model does not give the best accuracy observed during training. I reload the model as follows:
model.load_weights(filepath) #load saved weights
model = Sequential()
model.add(Convolution2D(32, 7, 7, input_shape=(3, 128, 128)))
....
....
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

# evaluate the model
scores = model.evaluate_generator(test_generator, val_samples)
print("Accuracy = ", scores[1])
The highest accuracy saved by ModelCheckpoint is about 85%, but the reloaded model only gives an accuracy of 16%. Is there something wrong in what I am doing?
To be safe, is there any way to directly save the best model rather than just the model weights?
Putting model.load_weights(filepath) after building and compiling the model fixes the problem!
But I am still curious about saving the best model itself during training.
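On saving the best model itself rather than only its weights: ModelCheckpoint stores the full model (architecture, weights, and optimizer state) when save_weights_only is left at its default of False, and the file can then be reloaded without rebuilding or recompiling anything. A rough sketch reusing the names from the question above:

from keras.callbacks import ModelCheckpoint
from keras.models import load_model

# save_weights_only=False (the default) writes the whole model to the file
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1,
                             save_best_only=True, mode='max',
                             save_weights_only=False)

# ... after training ...
best_model = load_model(filepath)
scores = best_model.evaluate_generator(test_generator, val_samples)
print("Accuracy = ", scores[1])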
Two tips for making sure you're using the best model trained:
Add the val_acc to the file name
You can create your ModelCheckpoint like this:
checkpoint = ModelCheckpoint('my-model-{val_acc:.2f}.hdf5', monitor='val_acc', verbose=1,
                             save_best_only=True, mode='max')
That way, you'll have multiple files, and you can make sure you pick the best model.
Read the training output
When you look at the output of Keras while fitting, you'll see:
Epoch 000XX: val_acc improved from 0.8 to 0.85, saving model to my-model-0.85.hdf5
Let's say you have a bunch of data that you are training on, and you decide to save the weights for your best iteration only. If you have not iterated through all of your data before you find your "best" model weights, you are effectively throwing away data, and any later evaluation using those so-called best weights will not correlate with your in-batch evaluation.