tensorflow earlystopping does not work properly


I'm working with a large image dataset.
Training takes a long time, so I used EarlyStopping in TensorFlow.
These are my callback and fit options
(I know monitoring accuracy is not a good choice, but I just wanted to see how EarlyStopping works):
callbacks = [
    tf.keras.callbacks.EarlyStopping(
        monitor='accuracy',
        patience=3,
        #mode='max',
        verbose=2,
        baseline=0.98)
]
model.fit(x, y, batch_size=16, epochs=10, verbose=2, validation_split=0.2, callbacks=callbacks)
However, this is the result:
101/101 - 42s - loss: 6.9557 - accuracy: 6.2461e-04 - val_loss: 6.9565 - val_accuracy: 0.0000e+00
Epoch 2/10
101/101 - 39s - loss: 6.9549 - accuracy: 0.0019 - val_loss: 6.9558 - val_accuracy: 0.0000e+00
Epoch 3/10
101/101 - 37s - loss: 6.9537 - accuracy: 0.0037 - val_loss: 6.9569 - val_accuracy: 0.0000e+00
Epoch 00003: early stopping
Since the monitored value 'accuracy' kept increasing, I expected training not to stop.
Also, I want EarlyStopping to monitor accuracy like this:
acc=0, acc=0.1....acc=0.5, acc=0.4, acc=0.5, acc=0.6 # don't stop if it increases again within the patience epochs
acc=0, acc=0.1....acc=0.5, acc=0.3, acc=0.4, acc=0.35 # stop if it does not increase again within the patience epochs
How should I do that?

The issue is the use of baseline.
The documentation defines it as:
Baseline value for the monitored quantity. Training will stop if the model doesn't show improvement over the baseline.
By setting baseline to 0.98 you are telling the callback that only an accuracy above 98% counts as an improvement, so training stops if the accuracy does not exceed that baseline within 3 epochs (the patience). In your run the accuracy was rising but stayed far below 0.98, so training stopped after epoch 3.
Instead, do the following for your use case:
tf.keras.callbacks.EarlyStopping(
    monitor='accuracy',
    min_delta=0.001,
    patience=3,
    mode='auto',
    verbose=2,
    baseline=None
)
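To see why the original settings stopped at epoch 3, here is a minimal pure-Python sketch of the patience and baseline bookkeeping. It is a simplified model written for illustration, not the Keras implementation: the function name stop_epoch is hypothetical, it assumes mode='max', and the accuracy sequences are the explicit values from the question (the elided middle values are left out).

```python
def stop_epoch(values, patience, baseline=None, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    Simplified model of patience-based early stopping for a metric that
    should increase (mode='max'). This is not the actual Keras code.
    """
    # With a baseline, only values above it count as an improvement.
    best = baseline if baseline is not None else float("-inf")
    wait = 0
    for epoch, value in enumerate(values, start=1):
        if value - min_delta > best:
            best = value
            wait = 0          # improvement: reset the patience counter
        else:
            wait += 1         # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


# Accuracy rises but never beats baseline=0.98, so patience runs out.
with_baseline = stop_epoch([0.0006, 0.0019, 0.0037, 0.01], patience=3, baseline=0.98)

# The two sequences from the question, without a baseline.
recovers = stop_epoch([0.0, 0.1, 0.5, 0.4, 0.5, 0.6], patience=3)
stalls = stop_epoch([0.0, 0.1, 0.5, 0.3, 0.4, 0.35], patience=3)

print(with_baseline, recovers, stalls)  # → 3 None 6
```

Under these assumptions, removing the baseline gives the behavior asked for: the run that recovers within the patience window continues, and the run that stalls stops at epoch 6.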

Related

LSTM model is producing really bad results for multiclass text classification for imbalanced small dataset

I am training an LSTM model on my current dataset to predict multiclass categories. There are 18 mutually exclusive categories and the dataset has only ~500 rows (a really small dataset). I am handling the class imbalance using the following:
import numpy as np
from sklearn.utils import class_weight
class_weights = list(class_weight.compute_class_weight('balanced',
                                                       classes=np.unique(df['categories']),
                                                       y=df['categories']))
weights = {}
for index, weight in enumerate(class_weights):
    weights[index] = weight
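For reference, the 'balanced' heuristic weights each class by n_samples / (n_classes * class_count). A minimal pure-Python sketch of that formula follows. The helper name balanced_class_weights and the toy labels are hypothetical, and the sketch assumes the labels are already integer-encoded as 0..n_classes-1, so that the dict keys line up with the class indices the model is trained on.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in sorted(counts.items())}


# Toy integer-encoded labels: class 0 is common, class 2 is rare.
toy_labels = [0, 0, 0, 0, 1, 1, 2]
weights = balanced_class_weights(toy_labels)
print(weights)
```

Rare classes receive larger weights than common ones, which is the effect the class_weight argument of model.fit relies on.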
After this, I build my LSTM model and evaluate it using the PRC metric from tf.metrics, since this is an imbalanced classification problem:
METRICS = [tf.metrics.AUC(name='prc', curve='PR')]  # precision-recall curve
model = Sequential()
model.add(Embedding(MAX_NB_WORDS, EMBEDDING_DIM, input_length=X.shape[1]))
model.add(SpatialDropout1D(0.2))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(18, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=METRICS)
print(model.summary())
and finally:
history = model.fit(X_train,
                    y_train,
                    batch_size=10,
                    epochs=10,
                    verbose=1,
                    class_weight=weights,
                    validation_data=(X_test, y_test))
Now when I look at the results, the training prc is coming out to be really high whereas my val_prc is really low. An example with 10 epochs:
Epoch 1/10
30/30 [==============================] - 5s 174ms/step - loss: 2.9951 - prc: 0.0682 - val_loss: 2.8865 - val_prc: 0.0639
Epoch 2/10
30/30 [==============================] - 5s 169ms/step - loss: 2.9556 - prc: 0.0993 - val_loss: 2.8901 - val_prc: 0.0523
.....
Epoch 8/10
30/30 [==============================] - 6s 189ms/step - loss: 1.2494 - prc: 0.6415 - val_loss: 3.0662 - val_prc: 0.0728
Epoch 9/10
30/30 [==============================] - 6s 210ms/step - loss: 0.9237 - prc: 0.8302 - val_loss: 3.0624 - val_prc: 0.1006
Epoch 10/10
30/30 [==============================] - 6s 184ms/step - loss: 0.7452 - prc: 0.9017 - val_loss: 3.5035 - val_prc: 0.0821
My questions are:
Is the evaluation metric I am using correct, considering this is an imbalanced-class problem?
Am I treating the imbalance correctly with the code above and, most importantly, am I passing it correctly to model.fit()?
How can I resolve this? Is there an alternative approach you can suggest?

Zero validation loss and validation accuracy at classification problem

I'm running a multiclass classification problem using the below resnet model:
resnet = tf.keras.applications.ResNet50(
    include_top=False,
    weights='imagenet',
    input_shape=(96, 96, 3),
    pooling="avg"
)
for layer in resnet.layers:
    layer.trainable = True
model_resnet = tf.keras.Sequential()
model_resnet.add(resnet)
model_resnet.add(tf.keras.layers.Flatten())
model_resnet.add(tf.keras.layers.Dense(8, activation='softmax', name='output'))
model_resnet.compile(loss="sparse_categorical_crossentropy", optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), metrics=['accuracy'])
I also used a train and a test generator as below:
train_generator=img_gen.flow_from_dataframe(dataframe=train_dataset,x_col="file_loc",y_col='expr',target_size=(96, 96),batch_size=91,class_mode="raw")
test_generator=img_gen.flow_from_dataframe(dataframe=test_dataset,x_col="file_loc",target_size=(96, 96),batch_size=93,y_col=None,shuffle=False,class_mode=None)
When I run the code below I get the expected results and everything works fine:
model_resnet.fit_generator(train_generator,
                           steps_per_epoch=STEP_SIZE_TRAIN_resnet,
                           epochs=20
                           )
I wanted to compute the validation accuracy of every epoch so I wrote something like this
model_path = f"/content/weights" + "{val_accuracy:.4f}.hdf5"
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    model_path,
    monitor='val_accuracy',
    save_best_only=True,
    mode='max',
    verbose=1
)
history = model_resnet.fit_generator(
    train_generator,
    epochs=5,
    steps_per_epoch=STEP_SIZE_TRAIN_resnet,
    validation_data=test_generator,
    validation_steps=STEP_SIZE_TEST_resnet,
    max_queue_size=1,
    shuffle=True,
    callbacks=[checkpoint],
    verbose=1
)
The problem is that for every epoch the validation loss and validation accuracy remain zero even though the training loss and accuracy change. I ran this code for over 20 epochs and they don't change at all. I can't find what I am doing wrong, since without the validation arguments it works perfectly. Does anyone have any idea?
Epoch 1: val_accuracy improved from -inf to 0.00000, saving model to /content/weights0.0000.hdf5
500/500 [==============================] - 30s 60ms/step - loss: 1.0213 - accuracy: 0.6546 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 2/5
500/500 [==============================] - ETA: 0s - loss: 0.9644 - accuracy: 0.6672
Epoch 2: val_accuracy did not improve from 0.00000
500/500 [==============================] - 29s 58ms/step - loss: 0.9644 - accuracy: 0.6672 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Edit: I didn't specify the test labels of the test dataset because I used to compute the accuracy score as below:
y_pred = model_resnet.predict(test_generator)
y_pred_max = np.argmax(y_pred, axis=1)
y_true = test_dataset["expr"].to_numpy()
print("accuracy",accuracy_score(y_true, y_pred_max))
I changed the test_generator as below:
test_generator=img_gen.flow_from_dataframe(dataframe=test_dataset,x_col="file_loc",target_size=(96, 96),batch_size=93,y_col='expr',shuffle=False,class_mode=None)
but nothing has changed, it still results in zero

As @Dr.Snoopy said, the problems were that I didn't specify the test labels in the test generator (which are required to compute accuracy) and that the two generators used different class modes; the correct one was "raw" in both.
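Putting that fix into code, a validation generator consistent with the training one might look like the sketch below. The wrapper name make_eval_generator is hypothetical; img_gen and test_dataset are the objects from the question, and the column names, image size and batch size are copied from it.

```python
def make_eval_generator(img_gen, dataframe):
    """Build a labelled generator that Keras can use as validation_data.

    Mirrors the training generator from the question: labels come from the
    'expr' column and class_mode is "raw" in both generators.
    """
    return img_gen.flow_from_dataframe(
        dataframe=dataframe,
        x_col="file_loc",
        y_col="expr",        # labels are needed to compute val_loss / val_accuracy
        target_size=(96, 96),
        batch_size=93,
        shuffle=False,
        class_mode="raw",    # same class mode as train_generator
    )
```

It would be used as test_generator = make_eval_generator(img_gen, test_dataset) before calling fit_generator with validation_data=test_generator.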

Validation and Test accuracy at random performance, whereas Train accuracy very high

I am trying to build a classifier in TensorFlow 2.1 for CIFAR10 using ResNet50 pre-trained on ImageNet from keras.applications and then stacking a small FNN on top of it:
# Load ResNet50 pre-trained on imagenet
resn = applications.resnet50.ResNet50(weights='imagenet', input_shape=(IMG_SIZE, IMG_SIZE, 3), pooling='avg', include_top=False)
# Load CIFAR10
(c10_train, c10_test), info = tfds.load(name='cifar10', split=['train', 'test'], with_info=True, as_supervised=True)
# Make sure all the layers are not trainable
for layer in resn.layers:
    layer.trainable = False
# Transfer learning for CIFAR10: fine-tune the network by stacking a trainable FNN on top of ResNet
from tensorflow.keras import models, layers
def build_model():
    model = models.Sequential()
    # Feature extractor
    model.add(resn)
    # Small FNN
    model.add(layers.Dense(256, activation='relu'))
    model.add(layers.Dropout(0.4))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(loss='categorical_crossentropy',
                  optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
                  metrics=['accuracy'])
    return model
# Build the resulting net
resn50_c10 = build_model()
I am facing the following issue when validating or testing the accuracy:
history = resn50_c10.fit_generator(c10_train.shuffle(1000).batch(BATCH_SIZE), validation_data=c10_test.batch(BATCH_SIZE), epochs=20)
Epoch 1/20
25/25 [==============================] - 113s 5s/step - loss: 0.9659 - accuracy: 0.6634 - val_loss: 2.8157 - val_accuracy: 0.1000
Epoch 2/20
25/25 [==============================] - 109s 4s/step - loss: 0.8908 - accuracy: 0.6920 - val_loss: 2.8165 - val_accuracy: 0.1094
Epoch 3/20
25/25 [==============================] - 116s 5s/step - loss: 0.8743 - accuracy: 0.7038 - val_loss: 2.7555 - val_accuracy: 0.1016
Epoch 4/20
25/25 [==============================] - 132s 5s/step - loss: 0.8319 - accuracy: 0.7166 - val_loss: 2.8398 - val_accuracy: 0.1013
Epoch 5/20
25/25 [==============================] - 132s 5s/step - loss: 0.7903 - accuracy: 0.7253 - val_loss: 2.8624 - val_accuracy: 0.1000
Epoch 6/20
25/25 [==============================] - 132s 5s/step - loss: 0.7697 - accuracy: 0.7325 - val_loss: 2.8409 - val_accuracy: 0.1000
Epoch 7/20
25/25 [==============================] - 132s 5s/step - loss: 0.7515 - accuracy: 0.7406 - val_loss: 2.7697 - val_accuracy: 0.1000
#... (same for the remaining epochs)
Although the model seems to learn adequately from the training split, neither the accuracy nor the loss on the validation set improves at all. What is causing this behavior?
I am ruling out overfitting, since I am applying Dropout and since the model never really improves on the test set.
What I have done so far:
Checked that the one-hot labelling is consistent across train and test
Tried different FNN configurations
Tried the method fit_generator instead of fit
Preprocessed the images and resized them with different input_shapes
and I always experienced the same problem.
Any hint would be greatly appreciated.

The problem is likely caused by loading the data with tfds and then passing it to Keras .fit.
Try loading your data with
from keras.datasets import cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
and then
resn50_c10.fit(x=x_train, y=y_train, batch_size=BATCH_SIZE, epochs=20, verbose=1, callbacks=None, validation_split=0.2, validation_data=None, shuffle=True)

Apparently the problem was caused specifically by the use of ResNet50.
As a workaround, I downloaded and used other pre-trained deep networks such as keras.applications.vgg16.VGG16 and keras.applications.densenet.DenseNet121, and the accuracy on the test set increased as expected.
UPDATE
The part of this answer above is only a workaround. To understand what is really happening, and to use transfer learning properly with ResNet50, keep reading.
The root cause appears to lie in how Keras handles the Batch Normalization layer:
During fine-tuning, if a Batch Normalization layer is frozen it uses the mini-batch statistics. I believe this is incorrect and it can lead to reduced accuracy especially when we use Transfer learning. A better approach in this case would be to use the values of the moving mean and variance.
As explained more in-depth here: https://github.com/keras-team/keras/pull/9965
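The effect described in the quote can be illustrated with a small pure-Python sketch. It is not Keras code: the batch_norm helper, the stored moving statistics, and the activation values are all made-up numbers chosen for illustration. The same activations come out very differently depending on whether they are normalized with mini-batch statistics (what the quote says a frozen layer uses during fine-tuning) or with the stored moving statistics (what is used at inference), which would be consistent with a high training accuracy next to a chance-level validation accuracy.

```python
from statistics import mean, pvariance


def batch_norm(xs, mu, var, eps=1e-3):
    """Normalize xs with the given mean and variance (no scale or shift)."""
    return [(x - mu) / (var + eps) ** 0.5 for x in xs]


# Hypothetical moving statistics stored in the pre-trained layer.
moving_mean, moving_var = 0.0, 1.0

# Hypothetical activations from the new dataset, distributed differently.
batch = [4.0, 5.0, 6.0, 7.0]

# Training-style pass: normalize with the statistics of the batch itself.
train_out = batch_norm(batch, mean(batch), pvariance(batch))

# Inference-style pass: normalize with the stored moving statistics.
infer_out = batch_norm(batch, moving_mean, moving_var)

print(mean(train_out), mean(infer_out))
```

The layers stacked on top are trained on outputs like train_out (centered near zero) but evaluated on outputs like infer_out (centered well away from zero), so what they learned does not transfer to validation.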
Even though the correct approach has been implemented in TensorFlow 2, when we use tf.keras.applications we reference the TensorFlow 1.0 behavior for Batch Normalization. That's why we need to explicitly inject the reference to TensorFlow 2 by adding the argument layers=tf.keras.layers when loading modules. So in my case, the loading of ResNet50 becomes
resn = applications.resnet50.ResNet50(weights='imagenet', input_shape=(IMG_SIZE, IMG_SIZE, 3), pooling='avg', include_top=False, layers=tf.keras.layers)
and that will do the trick.
Credits for the solution to @rpeloff: https://github.com/keras-team/keras/pull/9965#issuecomment-549126009

How can I get training accuracy output in Keras?

I use fit_generator(data_generator, steps_per_epoch=total/batch_size, epochs=epochs, verbose=2,callbacks=mylist) in Keras during training, but I don't know how to make it print the training accuracy while training.
It seems to be training without printing any info...

From the docs for fit (same case for fit_generator):
verbose: 0 for no logging to stdout, 1 for progress bar logging, 2 for one log line per epoch.
Here is the output for the MNIST CNN example:
with verbose=2 (your case):
Train on 60000 samples, validate on 10000 samples
Epoch 1/2
- 298s - loss: 0.3323 - acc: 0.8983 - val_loss: 0.0735 - val_acc: 0.9763
Epoch 2/2
- 305s - loss: 0.1111 - acc: 0.9672 - val_loss: 0.0502 - val_acc: 0.9838
where training loss and acc are indeed available, but only after the end of each epoch.
with verbose=1 (snapshot):
Train on 60000 samples, validate on 10000 samples
Epoch 1/2
25088/60000 [===========>..................] - ETA: 2:51 - loss: 0.5471 - acc: 0.8305
where training loss and acc are available during the epoch, along with a progress bar.
Since it seems you are looking for the second case, change to verbose=1.

How to interpret Keras model.fit output?

I've just started using Keras. The sample I'm working on has a model and the following snippet is used to run the model
from sklearn.preprocessing import LabelBinarizer
label_binarizer = LabelBinarizer()
y_one_hot = label_binarizer.fit_transform(y_train)
model.compile('adam', 'categorical_crossentropy', ['accuracy'])
history = model.fit(X_normalized, y_one_hot, nb_epoch=3, validation_split=0.2)
I get the following response:
Using TensorFlow backend.
Train on 80 samples, validate on 20 samples
Epoch 1/3
32/80 [===========>..................] - ETA: 0s - loss: 1.5831 - acc: 0.4062
80/80 [==============================] - 0s - loss: 1.3927 - acc: 0.4500 - val_loss: 0.7802 - val_acc: 0.8500
Epoch 2/3
32/80 [===========>..................] - ETA: 0s - loss: 0.9300 - acc: 0.7500
80/80 [==============================] - 0s - loss: 0.8490 - acc: 0.8000 - val_loss: 0.5772 - val_acc: 0.8500
Epoch 3/3
32/80 [===========>..................] - ETA: 0s - loss: 0.6397 - acc: 0.8750
64/80 [=======================>......] - ETA: 0s - loss: 0.6867 - acc: 0.7969
80/80 [==============================] - 0s - loss: 0.6638 - acc: 0.8000 - val_loss: 0.4294 - val_acc: 0.8500
The documentation says that fit returns
A History instance. Its history attribute contains all information
collected during training.
Does anyone know how to interpret the history instance?
For example, what does 32/80 mean? I assume 80 is the number of samples but what is 32? ETA: 0s ??

ETA = Estimated Time of Arrival.
80 is the size of your training set. 32/80 and 64/80 mean that your batch size is 32 and that the progress line was printed at the first batch (or the second batch, respectively).
loss and acc refer to the current loss and accuracy of the training set.
At the end of each epoch your trained NN is evaluated against your validation set. This is what val_loss and val_acc refer to.
The history object returned by model.fit() is a simple class with some fields, e.g. a reference to the model, a params dict and, most importantly, a history dict. It stores the values of loss and acc (or any other used metric) at the end of each epoch. For 2 epochs it will look like this:
{
'val_loss': [16.11809539794922, 14.12947562917035],
'val_acc': [0.0, 0.0],
'loss': [14.890108108520508, 12.088571548461914],
'acc': [0.0, 0.25]
}
This comes in very handy if you want to visualize your training progress.
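As a small illustration of working with that dict, the sketch below picks the epoch with the lowest validation loss. The values are the ones from the example above; in practice they would come from history.history. The helper name best_epoch is hypothetical.

```python
def best_epoch(history_dict, key="val_loss"):
    """Return the 1-based epoch with the smallest value stored under `key`."""
    values = history_dict[key]
    return min(range(len(values)), key=values.__getitem__) + 1


history_dict = {
    'val_loss': [16.11809539794922, 14.12947562917035],
    'val_acc': [0.0, 0.0],
    'loss': [14.890108108520508, 12.088571548461914],
    'acc': [0.0, 0.25],
}

print(best_epoch(history_dict))  # → 2
```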
Note: if your validation loss/accuracy starts increasing while your training loss/accuracy is still decreasing, this is an indicator of overfitting.
Note 2: at the very end you should test your NN against a test set that is different from your training set and validation set and thus has never been touched during the training process.

32 is your batch size. 32 is the default value that you can change in your fit function if you wish to do so.
After the first batch is trained Keras estimates the training duration (ETA: estimated time of arrival) of one epoch which is equivalent to one round of training with all your samples.
In addition to that you get the losses (the difference between prediction and true labels) and your metric (in your case the accuracy) for both the training and the validation samples.
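The 32/80, 64/80 and 80/80 counters in the log follow directly from the batch size. A minimal sketch of that arithmetic (the helper progress_counts is hypothetical, not a Keras function):

```python
import math


def progress_counts(n_samples, batch_size=32):
    """Number of samples processed after each batch of one epoch."""
    n_batches = math.ceil(n_samples / batch_size)
    return [min((i + 1) * batch_size, n_samples) for i in range(n_batches)]


print(progress_counts(80))  # → [32, 64, 80]
```

The last batch holds only the remaining 16 samples, which is why the counter jumps from 64 to 80 rather than to 96.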
