I trained a Keras model on MNIST, keeping the training setup and model hyperparameters the same. The training and validation data were exactly the same. Across five training sessions I got five different accuracies: 0.71, 0.62, 0.59, 0.52, 0.46. The model was trained from scratch for 8 epochs every time.
This is the code:
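from keras.models import Sequential
from keras.layers import Dense

# x_train, y_train, x_test, y_test are assumed to be MNIST flattened to
# 784 features, with one-hot encoded labels

def train():
    # A fresh model (new random initial weights) is built on every call
    model = Sequential()
    model.add(Dense(32, input_dim=784))
    model.add(Dense(10, activation="softmax"))
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, batch_size=32, epochs=8, verbose=0)
    results = model.evaluate(x_test, y_test, verbose=0)
    print(results[1])  # test accuracy

for i in range(5):
    print(i)
    train()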
Results:
0
2019-01-23 13:26:54.475110: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
0.7192
1
0.6223
2
0.5976
3
0.5223
4
0.4624
This may simply be because the weights of the model are initialized randomly every time. Suppose I train two models with the same data and hyperparameters. Since they start from different weights, their loss and accuracy will differ at first. After enough epochs, however, both may converge to a point where their accuracies and losses are roughly equal. That point could be a minimum of the loss, since the data is the same. Otherwise, it could be a point from which both models follow the same path towards convergence.
In your case, training for more epochs might bring the losses and accuracies of all the models closer together.
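To make the effect of random initialization concrete, here is a minimal framework-free sketch. It is not the Keras model from the question: it fits a single weight on a convex toy problem, where convergence to the same point is guaranteed, which is not true of neural networks in general. The function `train` and its data are made up for illustration. In Keras, the analogous step would be to fix the random seeds before building the model.

```python
import random


def train(seed, epochs, lr=0.1):
    """Fit y = w * x by gradient descent from a seed-dependent starting weight."""
    rng = random.Random(seed)          # all randomness comes from this seed
    w = rng.uniform(-5.0, 5.0)         # random initial weight, like a fresh model
    data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]  # true weight is 3.0
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (w * x - y) * x   # derivative of the squared error w.r.t. w
            w -= lr * grad
    loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
    return w, loss


# Same seed: identical result on every run.
assert train(seed=0, epochs=1) == train(seed=0, epochs=1)

# Different seeds: different results after a short run ...
short_a, short_b = train(seed=0, epochs=1), train(seed=1, epochs=1)
# ... but nearly the same result after a long run on this convex problem.
long_a, long_b = train(seed=0, epochs=50), train(seed=1, epochs=50)
print(abs(short_a[0] - short_b[0]), abs(long_a[0] - long_b[0]))
```

The first printed gap is the run-to-run spread after one epoch; the second shows that the spread shrinks with more training on this particular problem.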
Related
Is there a way to calculate the training accuracy after completing the training process with Keras or TensorFlow?
model.history has all the information required.
For example, after training the model for 5 epochs, you can access the loss and accuracy as follows:
history = model.history.history
print(history)
{'loss': [0.2212433920122683, 0.097910506768773, 0.06874677832927555, 0.05441241036520029, 0.0430859369851804], 'accuracy': [0.9342333, 0.9698667, 0.97786665, 0.98211664, 0.9856]}
If you want the loss and accuracy from model.evaluate, you can do the following:
history2 = model.evaluate(x_test, y_test)
print(history2)  # output: [0.07691180044879548, 0.9772]
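As a small sketch of how the history dictionary can be read, the snippet below copies the per-epoch values printed above into a plain dict and pulls out the last epoch's training accuracy and the best epoch. The variable names `final_accuracy` and `best_epoch` are only illustrative. Note that, as far as I know, the per-epoch training metrics are averaged over the batches of that epoch while the weights are still changing, so calling model.evaluate on the training data after training may give a slightly different number.

```python
# Per-epoch metrics copied from the history output shown above.
history = {
    'loss': [0.2212433920122683, 0.097910506768773, 0.06874677832927555,
             0.05441241036520029, 0.0430859369851804],
    'accuracy': [0.9342333, 0.9698667, 0.97786665, 0.98211664, 0.9856],
}

# Training accuracy reported for the last epoch.
final_accuracy = history['accuracy'][-1]

# 1-based number of the epoch with the highest training accuracy.
best_epoch = max(range(len(history['accuracy'])),
                 key=lambda i: history['accuracy'][i]) + 1

print(f"final training accuracy: {final_accuracy:.4f}")
print(f"best epoch: {best_epoch}")
```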
I am a bit confused about how Keras fits models. In general, Keras models are fitted by simply calling model.fit(...), something like the following:
model.fit(X_train, y_train, epochs=300, batch_size=64, validation_data=(X_test, y_test))
My question is: because I passed the testing data through the argument validation_data=(X_test, y_test), does it mean that each epoch is independent? In other words, I understand that at each epoch, Keras trains the model on the (shuffled) training data and then tests the trained model on the provided validation_data. If that's the case, then no matter how many epochs I choose, I only get the results of the last epoch!
If this scenario is correct, why do we need multiple epochs? Unless the epochs are somehow dependent, with each epoch starting from the NN weights of the previous epoch, correct?
Thank you
When Keras fits your model, it passes through the whole dataset at each epoch in steps of batch_size items.
For example, if you have a dataset of 1000 items and a batch_size of 8, the weights of your model are updated using 8 items at a time until the model has seen the whole dataset (125 updates per epoch).
At the end of that epoch, the model makes predictions on your validation set.
If we ran only one epoch, each item would contribute to a weight update only once (because the model only "saw" the complete dataset one time).
But to minimize the loss function through backpropagation, we need to update those weights many times to approach the optimum, so we pass through the whole dataset multiple times; in other words, we run multiple epochs. Each epoch continues from the weights left by the previous one.
I hope that's clear; ask if you need more information.
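The relationship between batches, epochs and weights can be sketched without Keras. This toy loop is only an illustration under simplifying assumptions: a single weight `w` stands in for the model, the data follows y = 2 * x, and the batches are not shuffled (Keras shuffles by default). The point is that `w` is never reset between epochs, so each epoch starts where the previous one stopped.

```python
import math

num_items, batch_size = 1000, 8
steps_per_epoch = math.ceil(num_items / batch_size)   # weight updates per epoch

# Toy data for y = 2 * x, and a single weight standing in for the model.
xs = [i / num_items for i in range(num_items)]
ys = [2.0 * x for x in xs]
w, lr = 0.0, 0.01


def mean_loss(weight):
    """Mean squared error of the one-weight model over the whole dataset."""
    return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / num_items


losses = []
for epoch in range(3):
    # w is NOT reset here: each epoch starts from the previous epoch's weight.
    for step in range(steps_per_epoch):
        lo = step * batch_size
        bx, by = xs[lo:lo + batch_size], ys[lo:lo + batch_size]
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(bx, by)) / len(bx)
        w -= lr * grad                      # one update per batch
    losses.append(mean_loss(w))             # evaluated once per epoch

print(steps_per_epoch, losses)
```

The loss recorded after each epoch keeps dropping because the updates accumulate, which is why a single epoch is usually not enough.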
In machine learning tutorials using Keras, the code to train the machine learning model is this typical one-liner.
model.fit(X_train,
          Y_train,
          nb_epoch=5,  # Keras 1 argument name; Keras 2 renamed it to epochs
          batch_size=128,
          verbose=1,
          validation_split=0.1)
This seems easy when the training data X_train and Y_train are small. X_train and Y_train are NumPy ndarrays. In practice, the training data can run to gigabytes, which may be too large to fit into the computer's RAM.
How do you send data into model.fit() when the training data is too large?
There is a simple solution for that in Keras: you can use Python generators, so that your data is loaded lazily. If you have images, you can also use the ImageDataGenerator.
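import numpy as np

def generate_data(x, y, batch_size):
    # Loop forever; Keras stops each epoch after steps_per_epoch batches
    while True:
        for start in range(0, len(x), batch_size):
            end = start + batch_size
            # Only the current slice is materialized, so x and y can be lazy
            # array-likes (e.g. np.memmap) rather than in-memory arrays
            yield np.asarray(x[start:end]), np.asarray(y[start:end])

model.fit_generator(
    generator=generate_data(x, y, batch_size),
    steps_per_epoch=num_batches,
    validation_data=generate_data(x_val, y_val, batch_size),
    validation_steps=num_batches_test)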
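To show what "lazy loading" can look like when the data lives on disk, here is a minimal stdlib-only sketch of a generator that reads a CSV file row by row and never holds more than one batch in memory. The function name `csv_batches` and the file layout (features first, integer label in the last column) are assumptions made for this example, not anything Keras prescribes.

```python
import csv
import os
import tempfile


def csv_batches(path, batch_size):
    """Yield (features, labels) batches forever, reading the file row by row."""
    while True:                                  # reopen the file for each new pass
        with open(path, newline="") as f:
            features, labels = [], []
            for row in csv.reader(f):
                *xs, label = row                 # assumed layout: label in the last column
                features.append([float(v) for v in xs])
                labels.append(int(label))
                if len(features) == batch_size:
                    yield features, labels
                    features, labels = [], []
            if features:                         # final, smaller batch
                yield features, labels


# Usage on a small temporary file: 5 rows of 2 features plus a label.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    csv.writer(tmp).writerows([[i, i * 2, i % 2] for i in range(5)])

gen = csv_batches(tmp.name, batch_size=2)
first = next(gen)
second = next(gen)
third = next(gen)        # the leftover row forms a smaller batch
fourth = next(gen)       # wraps around to the start of the file
gen.close()              # closes the file held open inside the generator
os.remove(tmp.name)
```

To feed something like this to Keras you would presumably convert each batch to NumPy arrays before yielding it, and set steps_per_epoch to the number of batches per pass (here 3).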
I am using an LSTM for action recognition. My basic LSTM, implemented in Keras, gets 76% accuracy on the test data of one dataset (CAD60), but when I use another dataset, the model gets stuck at a fixed loss. It always predicts a single class.
What could be the problem, given that I am using exactly the same framework and features on both datasets? I even tried tuning the learning rate and changing the optimizer, but it didn't work.
model = Sequential() # input has shape (samples, timesteps, locations)
model.add(LSTM(128, batch_input_shape=(batch_size, timesteps, data_dim)))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
I have finished training my model. During training, I used ModelCheckpoint to save the weights of the best model:
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1,
                             save_best_only=True, mode='max')
After training, I load the model weights into a model for evaluation, but I found that the model does not reach the best accuracy observed during training. I reload the model as follows:
model.load_weights(filepath)  # load saved weights
model = Sequential()
model.add(Convolution2D(32, 7, 7, input_shape=(3, 128, 128)))
....
....
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
# evaluate the model
scores = model.evaluate_generator(test_generator, val_samples)
print("Accuracy = ", scores[1])
The highest accuracy saved by ModelCheckpoint is about 85%, but the re-compiled model only gives an accuracy of 16%.
Am I doing something wrong?
To be safe, is there any way to directly save the best model rather than the model weights?
Putting model.load_weights(filepath) after compiling the model fixes the problem!
But I am still curious about saving the best model during training.
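One plausible reading of why the order matters: in the snippet above, the weights are loaded into whatever `model` referred to before, and then `model` is rebound to a brand-new network with fresh random weights. The sketch below reproduces that with a hypothetical stand-in class, `TinyModel`, which is not a Keras class; it only mimics "random weights at construction" and "load_weights overwrites them".

```python
import random

random.seed(0)  # only to make this demonstration repeatable


class TinyModel:
    """Hypothetical stand-in for a Keras model: starts with random weights."""

    def __init__(self):
        self.weights = [random.random() for _ in range(3)]

    def load_weights(self, saved):
        self.weights = list(saved)


saved_best = [0.1, 0.2, 0.3]          # pretend these came from the checkpoint file

# Order from the question: load first, then rebuild the model.
model = TinyModel()
model.load_weights(saved_best)
model = TinyModel()                   # new object, fresh random weights
wrong_order = model.weights

# Fixed order: build the model first, then load into that same object.
model = TinyModel()
model.load_weights(saved_best)
right_order = model.weights

print(wrong_order == saved_best, right_order == saved_best)
```

With the first ordering the evaluated model never sees the checkpointed weights, which would be consistent with the near-chance accuracy reported.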
Two tips for making sure you're using the best model trained:
Add the val_acc to the file name
You can create your ModelCheckpoint like this:
checkpoint = ModelCheckpoint('my-model-{val_acc:.2f}.hdf5', monitor='val_acc', verbose=1,
                             save_best_only=True, mode='max')
That way, you'll have multiple files, and you would be able to make sure you pick the best model.
Read the training output
When you look at the output of Keras while fitting, you'll see:
Epoch 000XX: val_acc improved from 0.8 to 0.85, saving model to my-model-0.85.hdf5
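The two tips can be illustrated with a small simulation of the save-best-only policy. This is a rough sketch of the logic, not the Keras implementation: `run_checkpointing` is a made-up helper, the validation accuracies are hypothetical, and nothing is written to disk. It does show how the `{val_acc:.2f}` placeholder in the file name is filled in with ordinary Python string formatting.

```python
def run_checkpointing(val_accs, template='my-model-{val_acc:.2f}.hdf5'):
    """Return the file names a save-best-only, mode='max' policy would write."""
    best = float('-inf')
    saved = []
    for epoch, val_acc in enumerate(val_accs, start=1):
        if val_acc > best:                       # mode='max': higher is better
            name = template.format(val_acc=val_acc)
            print(f"Epoch {epoch:05d}: val_acc improved from {best} to {val_acc}, "
                  f"saving model to {name}")
            best = val_acc
            saved.append(name)
    return saved


# Hypothetical validation accuracies for five epochs.
files = run_checkpointing([0.70, 0.80, 0.78, 0.85, 0.84])
best_file = files[-1]    # the last file written corresponds to the best epoch
```

One caveat of putting a rounded metric in the name: two epochs whose accuracies round to the same two decimals would map to the same file name.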
Suppose you are training on a large amount of data and decide to save the weights of your best iteration only. If you have not iterated through all of your data before you find your 'best' weights, you are effectively throwing away data, and any later evaluation using the so-called best weights will not match your in-batch evaluation.