Why is accuracy different between Keras model.fit and model.evaluate? - keras

I am trying to fit a Keras model and use both the history object and the evaluate function to see how well the model performs. The code is below:
from keras.optimizers import Adam

optimizer = Adam(lr=learning_rate)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

for epoch in range(start_epochs, start_epochs + epochs):
    history = model.fit(X_train, y_train, verbose=0, epochs=1,
                        batch_size=batch_size,
                        validation_data=(X_val, y_val))
    print(history.history)
    score = model.evaluate(X_train, y_train, verbose=0)
    print('Training accuracy', model.metrics_names, score)
    score = model.evaluate(X_val, y_val, verbose=0)
    print('Validation accuracy', model.metrics_names, score)
To my surprise, the accuracy and loss results on the training set differ between history and evaluate. As the results for the validation set are equal, it looks like a blunder on my side, but I cannot find anything. I have given the output for the first four epochs below. I got the same results for the metric 'mse': the training set differs, the test set is equal. Does anybody have any idea?
{'val_loss': [13.354823187591416], 'loss': [2.7036468725265874], 'val_acc': [0.11738484422572477], 'acc': [0.21768202061048531]}
Training accuracy ['loss', 'acc'] [13.265716915499048, 0.1270430906536911]
Validation accuracy ['loss', 'acc'] [13.354821096026349, 0.11738484398216939]
{'val_loss': [11.733116257598105], 'loss': [1.8158155931229045], 'val_acc': [0.26745913783295899], 'acc': [0.34522040671733062]}
Training accuracy ['loss', 'acc'] [11.772184015560292, 0.26721149086656992]
Validation accuracy ['loss', 'acc'] [11.733116155570542, 0.26745913818722139]
{'val_loss': [7.1503656643815061], 'loss': [1.5667824202566349], 'val_acc': [0.26597325444044367], 'acc': [0.44378405117114739]}
Training accuracy ['loss', 'acc'] [7.0615554528994506, 0.26250619121327617]
Validation accuracy ['loss', 'acc'] [7.1503659895943672, 0.26597325408618128]
{'val_loss': [4.2865109046890693], 'loss': [1.4087548087645783], 'val_acc': [0.13893016366866509], 'acc': [0.49232293093422957]}
Training accuracy ['loss', 'acc'] [4.1341019072350802, 0.14338781575775195]
Validation accuracy ['loss', 'acc'] [4.2865103747125541, 0.13893016344725112]

There is nothing to be surprised about: the training metrics in history are just the mean over all batches seen during training, and the weights change with every batch.
model.evaluate, on the other hand, keeps the model weights fixed and computes the loss/accuracy over the whole dataset you pass in. So if you want the true loss/accuracy on the training set at the end of an epoch, you have to call model.evaluate on the training set; the history object does not contain it.
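As a minimal sketch of that comparison (reusing model, X_train, y_train, X_val, y_val and batch_size from the question; the history key may be 'accuracy' rather than 'acc' depending on the Keras version):
# Metrics reported by fit() are averaged over batches while the weights change;
# evaluate() re-runs the finished model over the same data with fixed weights.
history = model.fit(X_train, y_train, epochs=1, batch_size=batch_size,
                    validation_data=(X_val, y_val), verbose=0)
print('fit (averaged over batches):', history.history['loss'][0], history.history['acc'][0])
train_loss, train_acc = model.evaluate(X_train, y_train, verbose=0)
print('evaluate (fixed weights):', train_loss, train_acc)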

Related

Keras tuner best model does not work better than a manually configured model and MSE is very high for train set with this best model

I am working on time-series data and I used Keras Tuner to find the best model. Keras Tuner returns a very good MSE for the best model. But when I use this best model to predict the train and test sets, it returns a high MSE for the training set and a lower MSE for the test set, while the RMSE is normal for both. Also, when I use the model that I configured manually, the results are better than the best model from Keras Tuner! I cannot understand why the results do not make sense; am I doing something wrong? Here is the code.
`
import os
from tensorflow import keras
from kerastuner.tuners import BayesianOptimization  # or keras_tuner, depending on the installed version

def build_model(hp):
    model = keras.Sequential()
    model.add(keras.layers.ConvLSTM2D(filters=hp.Int('units1',
                                                     min_value=25, max_value=512, step=32, default=128),
                                      kernel_size=(1, 1),
                                      activation=hp.Choice('activation1',
                                                           values=['relu', 'tanh', 'sigmoid'], default='relu'),
                                      input_shape=(n_past, 1, 1, 1)))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(units=hp.Int('units3',
                                              min_value=10, max_value=128, step=8, default=128),
                                 activation=hp.Choice('activation_2',
                                                      values=['relu', 'tanh', 'sigmoid'], default='relu')))
    model.add(keras.layers.Dense(1, activation=hp.Choice('activation_2',
                                                         values=['relu', 'tanh', 'sigmoid'], default='relu')))
    model.compile(loss='mae',
                  optimizer=keras.optimizers.Adam(hp.Float('learning_rate',
                                                           min_value=1e-4, max_value=1e-2,
                                                           sampling='LOG', default=1e-3)),
                  metrics=['mae'])
    return model

bayesian_opt_tuner = BayesianOptimization(build_model, objective='mae', max_trials=20,
                                          executions_per_trial=1,
                                          directory=os.path.normpath('C:/keras_tuning'),
                                          project_name='timeseries_temp_ts_test_from_TF_ex',
                                          overwrite=True)

EVALUATION_INTERVAL = 200
EPOCHS = 2

bayesian_opt_tuner.search(trainX, trainy,
                          epochs=EPOCHS,
                          validation_data=(testX, testy),
                          validation_steps=50,
                          steps_per_epoch=EVALUATION_INTERVAL)

model = bayesian_opt_tuner.get_best_models(1)[0]
model.summary()
`
The best MSE score is 0.365387, but when I predict the train and test sets the MSE is 28.58 for the train set and 6.36 for the test set, and the RMSE is 5.35 and 2.52. With my own model, which is below, the MSE of the train and test sets is 5.95 and 2.39 and the RMSE is 2.44 and 1.55.
`
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import ConvLSTM2D, Flatten, Dense

model = Sequential()
model.add(ConvLSTM2D(filters=64, kernel_size=(1, 1), activation='relu', input_shape=(n_past, 1, 1, 1)))
model.add(Flatten())
model.add(Dense(32))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
model.summary()
`

training loss during LSTM training is higher than validation loss

I am training an LSTM to predict a time series. I have tried an encoder-decoder, without any dropout. I divided my data into 70% training and 30% validation. The total points in the training set and validation set are around 107 and 47 respectively. However, the training loss is always greater than the validation loss. Below is the code.
import numpy
import tensorflow
import matplotlib.pyplot as plt
from numpy.random import seed
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, RepeatVector, TimeDistributed, Dense
from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.keras.callbacks import EarlyStopping
from tqdm.keras import TqdmCallback

seed(12346)
tensorflow.random.set_seed(12346)

Lrn_Rate = 0.0005
Momentum = 0.8
sgd = SGD(lr=Lrn_Rate, decay=1e-6, momentum=Momentum, nesterov=True)
adam = Adam(lr=Lrn_Rate, beta_1=0.9, beta_2=0.999, amsgrad=False)
optimizernme = sgd
optimizernmestr = 'sgd'
callbacks = EarlyStopping(monitor='loss', patience=50, restore_best_weights=True)

train_X1 = numpy.reshape(train_X1, (train_X1.shape[0], train_X1.shape[1], 1))
test_X1 = numpy.reshape(test_X1, (test_X1.shape[0], test_X1.shape[1], 1))
train_Y1 = train_Y1.reshape((train_Y1.shape[0], train_Y1.shape[1], 1))
test_Y1 = test_Y1.reshape((test_Y1.shape[0], test_Y1.shape[1], 1))

model = Sequential()
Hiddenunits = 240
DenseUnits = 100
n_features = 1
n_timesteps = look_back
model.add(Bidirectional(LSTM(Hiddenunits, activation='relu', return_sequences=True,
                             input_shape=(n_timesteps, n_features))))  # 90, 120 worked for us uk
model.add(Bidirectional(LSTM(Hiddenunits, activation='relu', return_sequences=False)))
model.add(RepeatVector(1))
model.add(Bidirectional(LSTM(Hiddenunits, activation='relu', return_sequences=True)))
model.add(Bidirectional(LSTM(Hiddenunits, activation='relu', return_sequences=True)))
model.add(TimeDistributed(Dense(DenseUnits, activation='relu')))
model.add(TimeDistributed(Dense(1)))
model.compile(loss='mean_squared_error', optimizer=optimizernme)

history = model.fit(train_X1, train_Y1, validation_data=(test_X1, test_Y1),
                    batch_size=batchsize, epochs=250,
                    callbacks=[callbacks, TqdmCallback(verbose=0)], shuffle=True, verbose=0)

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss' + modelcaption)
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
The training loss comes out greater than the validation loss: the training loss is about 0.02 and the validation loss is approximately 0.004 (please see the attached picture). I tried many things, including dropout and adding more hidden units, but it did not solve the problem. Any comments or suggestions are appreciated.
Better performance on the validation set than on the training set is unusual; it is the opposite of the classic overfitting pattern, where training outperforms validation. You have already tried the architecture with and without a dropout layer (dropout by itself can inflate the reported training loss, because it is only active during training), yet the results are similar.
One possibility is data leakage, which is when information from outside the training set is used while preparing the training data. For example, if you normalized or standardized all of the data at once, instead of fitting the transformation on the training set and only applying it to the validation set, then the validation data implicitly influences training and the validation loss ends up misleadingly low.
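As a sketch of how to avoid that particular leak (assuming scikit-learn's StandardScaler and the single-feature train_X1/test_X1 arrays from the question):
from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training split only, then reuse it for the
# validation split; never fit a scaler on the full dataset.
scaler = StandardScaler()
train_X1 = scaler.fit_transform(train_X1.reshape(-1, 1)).reshape(train_X1.shape)
test_X1 = scaler.transform(test_X1.reshape(-1, 1)).reshape(test_X1.shape)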

custom loss - keras

The following two models/compilations behave differently:
def custom_loss(y_true, y_pred):
    return keras.losses.binary_crossentropy(y_true, y_pred)
optimizer = Adam(lr=5e-3)
model.compile(loss=custom_loss, optimizer=optimizer, metrics=['accuracy'])
And:
optimizer = Adam(lr=5e-3)
model.compile(loss=keras.losses.binary_crossentropy, optimizer=optimizer, metrics=['accuracy'])
What can be the reason?
If you implement a custom binary cross-entropy loss, you should also specify the right accuracy metric explicitly. When you use Keras' built-in binary cross-entropy, Keras automatically picks the matching accuracy metric (binary vs. categorical accuracy).
That inference does not happen with a custom loss, and Keras then defaults to categorical accuracy, which is wrong here and produces misleading accuracy values. For example:
model.compile(loss=custom_loss, optimizer=optimizer, metrics=['binary_accuracy'])
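A sketch of the full idea (reusing the custom_loss and optimizer defined above): naming the metric explicitly makes both compilations report the same, correct accuracy.
# 'accuracy' would already resolve to binary accuracy with the built-in loss,
# but being explicit keeps the two setups comparable.
model.compile(loss=keras.losses.binary_crossentropy, optimizer=optimizer,
              metrics=['binary_accuracy'])
# With the custom loss, spell the metric out instead of relying on 'accuracy'.
model.compile(loss=custom_loss, optimizer=optimizer,
              metrics=['binary_accuracy'])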

Keras for learning a random multidimensional function(regression)

I am using Keras to learn the surface of a random function. Basically, I am sampling a bunch of points to use as training data. I am using the following code to generate the network.
from keras.models import Sequential
from keras.layers import Dense, Dropout

def create_model(optimizer='adam'):
    model = Sequential()
    units = 100
    dim = 6
    dropout = 1
    # build the model graph; stacking layers is done by .add():
    model.add(Dense(units=units, input_dim=dim, activation='sigmoid'))
    model.add(Dropout(dropout))
    model.add(Dense(units=units, activation='sigmoid'))
    model.add(Dropout(dropout))
    model.add(Dense(units=units, activation='sigmoid'))
    model.add(Dropout(dropout))
    model.add(Dense(units=1, activation='linear'))
    # optimizer = keras.optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=1e-08, decay=0.0)
    # optimizer = keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
    # configure the model's learning process: loss, optimizer and metrics
    model.compile(loss='mse',
                  optimizer=optimizer, metrics=['accuracy'])
    return model
I am getting the following logs during training,
451/667 [===================>..........] - ETA: 0s - loss: nan - acc: 0.0000e+00
I think I am doing something wrong in creating the network or in the choice of parameters. Any help is appreciated.
Thanks,
The argument to Dropout represents the fraction of the input units to drop (see the Keras documentation for the Dropout layer). Thus, by doing
model.add(Dropout(dropout))
with dropout=1, you basically throw away all the units. You need to choose dropout strictly smaller than 1.
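A minimal sketch of the fix, using a hypothetical rate of 0.2 (any value strictly between 0 and 1 keeps some units; 0 disables dropout entirely):
from keras.models import Sequential
from keras.layers import Dense, Dropout

dropout = 0.2  # drop 20% of the units after each hidden layer, not all of them

model = Sequential()
model.add(Dense(units=100, input_dim=6, activation='sigmoid'))
model.add(Dropout(dropout))
model.add(Dense(units=1, activation='linear'))
model.compile(loss='mse', optimizer='adam')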

From SKLearn to Keras - What is the difference?

I'm trying to go from SKLearn to Keras in order to make specific improvements to my models.
However, I can't get the same performance I had with my SKLearn model:
from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(
    solver='adam', activation='relu',
    beta_1=0.9, beta_2=0.999, learning_rate='constant',
    alpha=0, hidden_layer_sizes=(238,),
    max_iter=300
)
dev_score(mlp)
This gives a score of ~0.65 every time.
Here is my corresponding Keras code:
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.regularizers import l2
from keras.wrappers.scikit_learn import KerasClassifier

def build_model(alpha):
    level_moreargs = {'kernel_regularizer': l2(alpha), 'kernel_initializer': 'glorot_uniform'}
    model = Sequential()
    model.add(Dense(units=238, input_dim=X.shape[1], **level_moreargs))
    model.add(Activation('relu'))
    model.add(Dense(units=class_names.shape[0], **level_moreargs))  # output layer
    model.add(Activation('softmax'))
    model.compile(loss=keras.losses.categorical_crossentropy,  # like sklearn
                  optimizer=keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0),
                  metrics=['accuracy'])
    return model

k_dnn = KerasClassifier(build_fn=build_model, epochs=300, batch_size=200, validation_data=None,
                        shuffle=True, alpha=0.5, verbose=0)
dev_score(k_dnn)
From looking at the documentation (and digging into the SKLearn code), this should correspond to exactly the same thing.
However, I get ~0.5 accuracy when I run this model, which is very bad.
And if I set alpha to 0, SKLearn's score barely changes (0.63), while Keras's varies randomly between 0.2 and 0.4.
What is the difference between these models? Why is Keras, which is supposed to be better than SKLearn, outperformed by so much here? What is my mistake?
Thanks,
