How to interpret Keras model.fit output? - keras

I've just started using Keras. The sample I'm working on has a model, and the following snippet is used to run it:
from sklearn.preprocessing import LabelBinarizer
label_binarizer = LabelBinarizer()
y_one_hot = label_binarizer.fit_transform(y_train)
model.compile('adam', 'categorical_crossentropy', ['accuracy'])
history = model.fit(X_normalized, y_one_hot, nb_epoch=3, validation_split=0.2)
I get the following response:
Using TensorFlow backend.
Train on 80 samples, validate on 20 samples
Epoch 1/3
32/80 [===========>..................] - ETA: 0s - loss: 1.5831 - acc: 0.4062
80/80 [==============================] - 0s - loss: 1.3927 - acc: 0.4500 - val_loss: 0.7802 - val_acc: 0.8500
Epoch 2/3
32/80 [===========>..................] - ETA: 0s - loss: 0.9300 - acc: 0.7500
80/80 [==============================] - 0s - loss: 0.8490 - acc: 0.8000 - val_loss: 0.5772 - val_acc: 0.8500
Epoch 3/3
32/80 [===========>..................] - ETA: 0s - loss: 0.6397 - acc: 0.8750
64/80 [=======================>......] - ETA: 0s - loss: 0.6867 - acc: 0.7969
80/80 [==============================] - 0s - loss: 0.6638 - acc: 0.8000 - val_loss: 0.4294 - val_acc: 0.8500
The documentation says that fit returns
A History instance. Its history attribute contains all information
collected during training.
Does anyone know how to interpret the history instance?
For example, what does 32/80 mean? I assume 80 is the number of samples, but what is 32? And what does ETA: 0s mean?

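ETA = Estimated Time of Arrival, i.e. the estimated time remaining until the current epoch finishes.
80 is the size of your training set. 32/80 and 64/80 show how many samples have been processed so far in the current epoch: with a batch size of 32, that is after the first and the second batch respectively.
loss and acc are the loss and accuracy on the training data, averaged over the batches processed so far in the current epoch.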
At the end of each epoch your trained NN is evaluated against your validation set. This is what val_loss and val_acc refer to.
The history object returned by model.fit() is a simple class with some fields, e.g. a reference to the model, a params dict and, most importantly, a history dict. It stores the values of loss and acc (or any other used metric) at the end of each epoch. For 2 epochs it will look like this:
{
'val_loss': [16.11809539794922, 14.12947562917035],
'val_acc': [0.0, 0.0],
'loss': [14.890108108520508, 12.088571548461914],
'acc': [0.0, 0.25]
}
This comes in very handy if you want to visualize your training progress.
Note: if your validation loss starts increasing (or your validation accuracy starts dropping) while your training loss is still decreasing, this is an indicator of overfitting.
Note 2: at the very end you should test your NN against a test set that is different from your training set and validation set and thus has never been touched during the training process.
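As a small illustration of reading the history dict described above, the sketch below finds the epoch with the lowest validation loss and flags the simple overfitting pattern from the first note. It works on a plain dict shaped like history.history. The numbers are made up, and the helper names best_epoch and looks_overfit are invented for this example; they are not part of Keras.

```python
def best_epoch(history, key="val_loss"):
    """Return the 0-based index of the epoch with the lowest value for `key`."""
    values = history[key]
    return min(range(len(values)), key=values.__getitem__)


def looks_overfit(history):
    """Crude check: in the last epoch, training loss fell while validation loss rose."""
    loss, val_loss = history["loss"], history["val_loss"]
    if len(loss) < 2:
        return False
    return loss[-1] < loss[-2] and val_loss[-1] > val_loss[-2]


# Hypothetical values, shaped like history.history after 4 epochs.
example_history = {
    "loss": [1.40, 0.85, 0.60, 0.45],
    "val_loss": [0.80, 0.58, 0.55, 0.62],
}

print(best_epoch(example_history))     # 2: the third epoch has the lowest val_loss
print(looks_overfit(example_history))  # True: loss fell, val_loss rose in the last epoch
```

With a real run you would pass history.history, where history is the object that model.fit returned. A single diverging epoch is only a hint, so in practice you would look at the trend over several epochs.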

32 is your batch size. 32 is the default value that you can change in your fit function if you wish to do so.
After the first batch is trained Keras estimates the training duration (ETA: estimated time of arrival) of one epoch which is equivalent to one round of training with all your samples.
In addition to that you get the losses (the difference between prediction and true labels) and your metric (in your case the accuracy) for both the training and the validation samples.
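To make the 32/80 arithmetic concrete, here is a small sketch in plain Python (no Keras involved; progress_counts is a name invented here) that lists the cumulative sample counts the progress bar steps through for a given training set size and batch size. The last batch is simply smaller when the set size is not a multiple of the batch size.

```python
import math


def progress_counts(n_samples, batch_size=32):
    """Cumulative sample counts reached after each batch of one epoch."""
    n_batches = math.ceil(n_samples / batch_size)
    return [min((i + 1) * batch_size, n_samples) for i in range(n_batches)]


print(progress_counts(80))      # [32, 64, 80]
print(progress_counts(80, 16))  # [16, 32, 48, 64, 80]
```

For the run in the question (80 training samples after the validation split, default batch size 32) this gives the 32/80, 64/80 and 80/80 steps seen in the log.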

Related

tensorflow earlystopping does not work properly

I'm dealing with a bunch of image datasets.
Training takes a lot of time, so I used EarlyStopping in TensorFlow.
These are my callback and fit options.
(I know monitoring accuracy is not a good option, but I just wanted to see how EarlyStopping works.)
callbacks = [
    tf.keras.callbacks.EarlyStopping(
        monitor='accuracy',
        patience=3,
        #mode='max',
        verbose=2,
        baseline=0.98)
]
model.fit(x, y, batch_size=16, epochs=10, verbose=2, validation_split=0.2, callbacks=callbacks)
however, this is the result
101/101 - 42s - loss: 6.9557 - accuracy: 6.2461e-04 - val_loss: 6.9565 - val_accuracy: 0.0000e+00
Epoch 2/10
101/101 - 39s - loss: 6.9549 - accuracy: 0.0019 - val_loss: 6.9558 - val_accuracy: 0.0000e+00
Epoch 3/10
101/101 - 37s - loss: 6.9537 - accuracy: 0.0037 - val_loss: 6.9569 - val_accuracy: 0.0000e+00
Epoch 00003: early stopping
Since the monitored value 'accuracy' kept increasing, I expected training not to stop.
Also, I want EarlyStopping to monitor accuracy like this:
acc=0, acc=0.1....acc=0.5, acc=0.4, acc=0.5, acc=0.6 #don't stop if it increases again within the patience epochs
acc=0, acc=0.1....acc=0.5, acc=0.3, acc=0.4, acc=0.35 #stop if acc does not increase again within the patience epochs
How should I do that?
The issue is with the use of baseline.
As per the documentation, it is defined as:
Baseline value for the monitored quantity. Training will stop if the model doesn't show improvement over the baseline.
By setting baseline to 0.98 you are stating that the model's accuracy has to improve on 98%. Since it does not exceed that baseline within 3 epochs (the patience), training stops.
Instead, do the following for your use case:
tf.keras.callbacks.EarlyStopping(
    monitor='accuracy',
    min_delta=0.001,
    patience=3,
    mode='auto',
    verbose=2,
    baseline=None
)
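To see why the baseline stopped the run after three epochs, the following sketch simulates a simplified patience rule in plain Python. It is an assumption-laden model of the idea (the "best so far" starts at the baseline, and a counter runs while nothing beats it), not the actual EarlyStopping implementation, whose details differ between Keras versions. The function name first_stop_epoch is invented for this example.

```python
def first_stop_epoch(values, patience, baseline=None, min_delta=0.0):
    """Simplified 'max'-mode patience rule.

    Returns the 0-based epoch at which training would stop, or None.
    This mimics the idea behind EarlyStopping, not its exact code.
    """
    best = baseline if baseline is not None else float("-inf")
    wait = 0
    for epoch, current in enumerate(values):
        if current - min_delta > best:
            best = current  # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1       # no improvement over the best value (or baseline)
            if wait >= patience:
                return epoch
    return None


# Accuracies from the question: none exceeds 0.98, so every epoch counts as "no improvement".
print(first_stop_epoch([6.2461e-04, 0.0019, 0.0037], patience=3, baseline=0.98))  # 2 (the third epoch)

# The two sequences from the question, without a baseline.
print(first_stop_epoch([0, 0.1, 0.5, 0.4, 0.5, 0.6], patience=3))   # None: improves again in time
print(first_stop_epoch([0, 0.1, 0.5, 0.3, 0.4, 0.35], patience=3))  # 5: three epochs without improvement
```

Under this simplified rule, dropping the baseline gives the behaviour asked for in the question: training continues as long as accuracy beats its previous best within the patience window.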

keras, how to interpret the output(loss, acc calculated on what data) of each batch in each epoch during training

I understand there is a similar question, "How to interpret Keras model.fit output?", but my question is more specific: I am wondering how the loss and acc output of each batch inside one epoch are calculated.
Is it calculated on the validation set?
Or on the samples trained so far in each epoch? (I think it's this one.)
Or something else?
Below is a sample output during my training:
Epoch x/20:
...
54320/55200 [============================>.] - ETA: 0s - loss: 1.2083 - acc: 0.9554
54440/55200 [============================>.] - ETA: 0s - loss: 1.2083 - acc: 0.9554
54560/55200 [============================>.] - ETA: 0s - loss: 1.2083 - acc: 0.9555
...
my configuration:
model.fit(x_train, y_train,
          batch_size=10,
          epochs=20,
          verbose=1,
          validation_split=0.08)
Thank you!
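The values are:
loss - training data, updated after every batch (a running average over the batches seen so far in the epoch)
acc - training data, updated after every batch (the same kind of running average)
val_loss - validation data (the part held out by validation_split), calculated at the end of every epoch
val_acc - validation data, calculated at the end of every epoch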
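The loss and acc shown in the progress bar are running averages over the batches processed so far in the epoch, which is why they barely move late in an epoch (as in the 54320/55200 lines of the question). A minimal sketch of that averaging, assuming equally sized batches and using made-up per-batch losses; running_means is a helper invented for this example:

```python
def running_means(batch_values):
    """Mean of the values seen so far, after each batch (equal batch sizes assumed)."""
    means, total = [], 0.0
    for i, value in enumerate(batch_values, start=1):
        total += value
        means.append(total / i)
    return means


# Hypothetical per-batch losses within one epoch.
print(running_means([2.0, 1.0, 0.0]))  # [2.0, 1.5, 1.0]
```

After thousands of batches a single new batch changes the mean only slightly, so consecutive progress lines can show identical values.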

How can I get training accuracy output in Keras?

I use fit_generator(data_generator, steps_per_epoch=total/batch_size, epochs=epochs, verbose=2,callbacks=mylist) in Keras during training, but I don't know how to make it print the training accuracy while training.
It seems like it's doing the training without printing any info...
From the docs for fit (same case for fit_generator):
verbose: 0 for no logging to stdout, 1 for progress bar logging, 2 for one log line per epoch.
Here is the output for the MNIST CNN example:
with verbose=2 (your case):
Train on 60000 samples, validate on 10000 samples
Epoch 1/2
- 298s - loss: 0.3323 - acc: 0.8983 - val_loss: 0.0735 - val_acc: 0.9763
Epoch 2/2
- 305s - loss: 0.1111 - acc: 0.9672 - val_loss: 0.0502 - val_acc: 0.9838
where training loss and acc are indeed available, but only after the end of each epoch.
with verbose=1 (snapshot):
Train on 60000 samples, validate on 10000 samples
Epoch 1/2
25088/60000 [===========>..................] - ETA: 2:51 - loss: 0.5471 - acc: 0.8305
where training loss and acc are available during the epoch, along with a progress bar.
Since it seems you are looking for the second case, change to verbose=1.

Keras model output information/log level

I am using Keras to build a neural network model:
model_keras = Sequential()
model_keras.add(Dense(4, input_dim=input_num, activation='relu',kernel_regularizer=regularizers.l2(0.01)))
model_keras.add(Dense(1, activation='linear',kernel_regularizer=regularizers.l2(0.01)))
sgd = optimizers.SGD(lr=0.01, clipnorm=0.5)
model_keras.compile(loss='mean_squared_error', optimizer=sgd)
model_keras.fit(X_norm_train, y_norm_train, batch_size=20, epochs=100)
The output looks like the following. I am wondering if it is possible to output the loss, say, every 10 epochs instead of every epoch? Thanks!
Epoch 1/200
20/20 [==============================] - 0s - loss: 0.2661
Epoch 2/200
20/20 [==============================] - 0s - loss: 0.2625
Epoch 3/200
20/20 [==============================] - 0s - loss: 0.2590
Epoch 4/200
20/20 [==============================] - 0s - loss: 0.2556
Epoch 5/200
20/20 [==============================] - 0s - loss: 0.2523
Epoch 6/200
20/20 [==============================] - 0s - loss: 0.2490
Epoch 7/200
20/20 [==============================] - 0s - loss: 0.2458
Epoch 8/200
20/20 [==============================] - 0s - loss: 0.2427
Epoch 9/200
20/20 [==============================] - 0s - loss: 0.2397
Epoch 10/200
20/20 [==============================] - 0s - loss: 0.2367
Epoch 11/200
20/20 [==============================] - 0s - loss: 0.2338
Epoch 12/200
20/20 [==============================] - 0s - loss: 0.2309
Epoch 13/200
20/20 [==============================] - 0s - loss: 0.2281
Epoch 14/200
20/20 [==============================] - 0s - loss: 0.2254
Epoch 15/200
20/20 [==============================] - 0s - loss: 0.2228
:
There is no fit() argument that reduces the frequency of logging to stdout; however, passing verbose=0 to fit() turns logging off completely.
Since the loop over epochs is not exposed in the Keras' sequential model, one way to collect scalar variable summaries with a custom frequency would be using Keras callbacks. In particular, you could use TensorBoard (assuming you are running with tensorflow backend) or CSVLogger (any backend) callbacks to collect any scalar variable summaries (training loss, in your case):
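from keras.models import Sequential
from keras.layers import Dense
from keras import regularizers, optimizers
from keras.callbacks import TensorBoard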
model_keras = Sequential()
model_keras.add(Dense(4, input_dim=input_num, activation='relu',kernel_regularizer=regularizers.l2(0.01)))
model_keras.add(Dense(1, activation='linear',kernel_regularizer=regularizers.l2(0.01)))
sgd = optimizers.SGD(lr=0.01, clipnorm=0.5)
model_keras.compile(loss='mean_squared_error', optimizer=sgd)
TB = TensorBoard(histogram_freq=10, batch_size=20)
model_keras.fit(X_norm_train, y_norm_train, batch_size=20, epochs=100, callbacks=[TB])
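Note that histogram_freq=10 sets how often (every 10 epochs) the weight and activation histograms are computed; the scalar loss is written to the TensorBoard log at the end of every epoch, so it can be inspected there instead of on stdout.
EDIT: passing validation_data=(...) to the fit method also makes the validation metrics available.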
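Another option that needs no callback, sketched here under the assumption that you train with verbose=0 and keep the History object that fit returns, is to slice its per-epoch loss list after training. The list below is made-up data standing in for history.history["loss"], and every_nth is a helper invented for this example. Unlike a callback, this only reports once training has finished.

```python
def every_nth(values, n):
    """Return (epoch_number, value) pairs for epochs n, 2n, 3n, ... (1-based)."""
    return [(i + 1, v) for i, v in enumerate(values) if (i + 1) % n == 0]


# Hypothetical per-epoch losses, standing in for history.history["loss"].
losses = [round(0.30 - 0.01 * i, 2) for i in range(25)]

for epoch, loss in every_nth(losses, 10):
    print(f"Epoch {epoch}: loss {loss:.4f}")
```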
Create a Keras callback to reduce the number of log lines. By default, Keras prints a log line for every epoch. The following code prints only about 10 log lines, regardless of the number of epochs.
import tensorflow as tf
class callback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, Epoch, Logs):
        L = Logs["loss"]
        if Epoch % Lafte == Lafte - 1:  # Log after a number of epochs
            print(f"Average batch loss: {L:.9f}")
        if Epoch == Epochs - 1:
            print(f"Fin-avg batch loss: {L:.9f}")  # Final average
Model = model()
Model.compile(...)
Dsize = ...   # Number of samples in training data
Bsize = ...   # Number of samples to process in 1 batch
Steps = 1000  # Number of batches to use to train
Epochs = round(Steps / (Dsize / Bsize))
Lafte = round(Epochs / 10)  # Log about 10 times, regardless of num of Epochs
if Lafte == 0:
    Lafte = 1  # Avoid modulus by zero in on_epoch_end
Model.fit(Data, epochs=Epochs, steps_per_epoch=round(Dsize / Bsize),
          callbacks=[callback()], verbose=0)
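To check how many lines the rule Epoch % Lafte == Lafte - 1 actually produces, the sketch below replays it in plain Python for a few epoch counts. The function logged_epochs is invented for this example and only mirrors the first print in the callback, not the extra final-epoch line.

```python
def logged_epochs(epochs, target_lines=10):
    """0-based epoch indices at which the callback's first print fires."""
    log_after = round(epochs / target_lines)
    if log_after == 0:
        log_after = 1  # same guard as in the answer
    return [e for e in range(epochs) if e % log_after == log_after - 1]


print(logged_epochs(100))      # [9, 19, 29, 39, 49, 59, 69, 79, 89, 99]
print(len(logged_epochs(25)))  # 12, because round(2.5) == 2 in Python 3
print(len(logged_epochs(4)))   # 4, fewer epochs than target lines
```

So the count is exactly 10 when the number of epochs is a multiple of 10 and only approximately 10 otherwise, because of the rounding.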

Training Keras model with HDF5Matrix results in very slow learning

I am using HDF5Matrix to load a dataset and train my model with it. In the first epoch I obtain about 10% accuracy.
At the moment, my dataset is not very large, so I can copy the contents of the HDF5Matrix to a numpy array and train with it. I reinitialise the model, and this time, in the first epoch I obtain about 40% accuracy.
For more information about the HDF5Matrix, see this example.
I understand that in the fit method, the shuffle parameter must be either False or 'batch' when using HDF5Matrix. I get the same behaviour either way.
Does anybody have the same problem? Could you tell me if there is something I am doing wrong?
This is a snippet of the code:
Using HDF5Matrix
from keras.utils.io_utils import HDF5Matrix
x_train = HDF5Matrix('../data/default_data.h5', 'data')
y_train = HDF5Matrix('../data/default_data.h5', 'labels')
# create the model
...
# train the model
model.fit(x_train, y_train, epochs=200, batch_size=2048, shuffle='batch')
# which outputs:
Epoch 1/200
1758510/1758510 [==============================] - 42s - loss: 2.5574 - categorical_accuracy: 0.1032
Epoch 2/200
1758510/1758510 [==============================] - 41s - loss: 2.3145 - categorical_accuracy: 0.1553
Epoch 3/200
1758510/1758510 [==============================] - 41s - loss: 2.1931 - categorical_accuracy: 0.2067
Epoch 4/200
694272/1758510 [==========>...................] - ETA: 24s - loss: 2.1055 - categorical_accuracy: 0.2328
Using numpy array
# create the model again
...
# copy the HDF5Matrix to a numpy array
X_training = x_train[0:1758510]
Y_training = y_train[0:1758510]
# check X_training is equal to x_train
...
# train the model again
model.fit(X_training,
          Y_training,
          epochs=200,
          batch_size=256,
          shuffle=True)
# which outputs
Epoch 1/200
1758510/1758510 [==============================] - 27s - loss: 1.5019 - categorical_accuracy: 0.4710
Epoch 2/200
89600/1758510 [>.............................] - ETA: 26s - loss: 1.2786 - categorical_accuracy: 0.5523
Thank you very much
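One difference between the two runs that is visible in the snippets themselves is the batch size: 2048 for the HDF5Matrix run and 256 for the numpy run. A quick calculation, sketched below in plain Python with an invented helper updates_per_epoch, shows how many weight updates each setting gives per epoch. This does not by itself prove the cause of the accuracy gap; a like-for-like comparison would use the same batch_size and shuffle setting in both runs.

```python
import math


def updates_per_epoch(n_samples, batch_size):
    """Number of batches, and hence weight updates, in one epoch."""
    return math.ceil(n_samples / batch_size)


n_samples = 1758510  # sample count taken from the progress bar in the question
print(updates_per_epoch(n_samples, 2048))  # 859  (HDF5Matrix run)
print(updates_per_epoch(n_samples, 256))   # 6870 (numpy run)
```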

Resources