ImageDataGenerator performs worse - python-3.x

I built a neural network with and without ImageDataGenerator (IDG). Without it, the model works fine; with the IDG, both the accuracy and validation-accuracy scores are really bad, so I think I am doing something wrong.
I wanted to use the IDG to see what augmentation could do for my neural network, but even when I remove all the augmentation it still performs badly.
Here is my code for the IDG:
image_size = 224
train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
train_generator = train_datagen.flow_from_directory('images',
                                                    target_size=(image_size, image_size),
                                                    batch_size=10,
                                                    class_mode='categorical',
                                                    subset='training')
validation_generator = train_datagen.flow_from_directory('images',
                                                         target_size=(image_size, image_size),
                                                         batch_size=10,
                                                         class_mode='categorical',
                                                         subset='training')
When I fit it I use this code:
chat = model.fit_generator(train_generator,
                           steps_per_epoch=train_generator.samples // 10,
                           validation_data=validation_generator,
                           validation_steps=validation_generator.samples // 10,
                           epochs=10)
Am I doing something wrong? Does the IDG perform some operation on the images that I don't see, but that changes them in a way that hurts training?
When I plot my images, I don't see anything strange.
Hope someone can give me some tips!

When you say that performance is worse with data augmentation, are you comparing both models on the same dataset?
A common mistake is to compare the accuracy of the model trained with data augmentation, evaluated on the augmented dataset, against the accuracy of the model trained without augmentation, evaluated on the regular dataset.
Keep in mind that augmented data can be harder for the model to deal with. So even if the accuracy isn't as high as before, it may actually be higher when both models are evaluated on the same regular dataset.
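As a hedged sketch of what a fair comparison could look like (model_plain and model_aug are placeholder names for the two trained models; the directory and sizes are taken from your code): evaluate both on the same un-augmented, rescaled data.
# Hypothetical sketch: compare both models on the SAME un-augmented data.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

eval_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
eval_generator = eval_datagen.flow_from_directory('images',
                                                  target_size=(224, 224),
                                                  batch_size=10,
                                                  class_mode='categorical',
                                                  subset='validation',
                                                  shuffle=False)

for name, m in [('without augmentation', model_plain), ('with augmentation', model_aug)]:
    loss, acc = m.evaluate(eval_generator, verbose=0)   # same data for both models
    print(f'{name}: accuracy = {acc:.3f}')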

Related

InceptionV3 transfer learning with Keras overfitting too soon

I'm using a pre-trained InceptionV3 in Keras to retrain the model for binary image classification (data labeled with 0's and 1's).
I'm reaching about 65% accuracy in my k-fold validation on never-seen data, but the problem is that the model overfits too soon. I need to improve this average accuracy, and I guess it is related to the overfitting problem.
Here are the loss values per epoch (the loss plot is not reproduced here):
Here is the code. The dataset and label variables are Numpy Arrays.
dataset = joblib.load(path_to_dataset)
labels = joblib.load(path_to_labels)
le = LabelEncoder()
labels = le.fit_transform(labels)
labels = to_categorical(labels, 2)
X_train, X_test, y_train, y_test = sk.train_test_split(dataset, labels, test_size=0.2)
X_train, X_val, y_train, y_val = sk.train_test_split(X_train, y_train, test_size=0.25) # 0.25 x 0.8 = 0.2
X_train = np.array(X_train)
y_train = np.array(y_train)
X_val = np.array(X_val)
y_val = np.array(y_val)
X_test = np.array(X_test)
y_test = np.array(y_test)
aug = ImageDataGenerator(rotation_range=20,
                         zoom_range=0.15,
                         horizontal_flip=True,
                         fill_mode="nearest")
pre_trained_model = InceptionV3(input_shape=(299, 299, 3),
                                include_top=False,
                                weights='imagenet')
for layer in pre_trained_model.layers:
    layer.trainable = False
x = layers.Flatten()(pre_trained_model.output)
x = layers.Dense(1024, activation = 'relu')(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(2, activation = 'softmax')(x) #already tried with sigmoid activation, same behavior
model = Model(pre_trained_model.input, x)
model.compile(optimizer=RMSprop(lr=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])  # already tried with the Adam optimizer, same behavior
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=100)
mc = ModelCheckpoint('best_model_inception_rmsprop.h5', monitor='val_accuracy', mode='max', verbose=1, save_best_only=True)
history = model.fit(x=aug.flow(X_train, y_train, batch_size=32),
                    validation_data=(X_val, y_val),
                    epochs=100,
                    callbacks=[es, mc])
The training dataset has 2181 images and validation has 727 images.
Something is wrong, but I can't tell what...
Any thoughts of what can be done to improve it?
One way to avoid overfitting is to use a lot of data. Overfitting mainly happens when you try to learn from a small dataset: the algorithm can fit that small dataset almost exactly, satisfying every data point. With a large number of data points, the algorithm is forced to generalize and come up with a model that suits most of the points.
Suggestions:
Use a lot of data.
Use a less deep network if you only have a small number of data samples.
If the second point applies, don't use a huge number of epochs: many epochs force the model to memorize the training data, so it learns it well but cannot generalize.
From your loss graph, I see that the model generalizes best at an early epoch (where the train and validation curves intersect), so try to use the model saved at that epoch rather than at the later epochs, which seem to overfit (see the sketch after these suggestions).
The second option you have is to use many more training samples.
If you only have a few training samples, use data augmentation.
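One way to act on the "use the model saved at that early epoch" advice (a small sketch; it only tightens the callbacks already present in your code, with a much smaller patience than the original 100 plus the restore_best_weights option):
# Stop shortly after val_loss stops improving and roll back to the best
# weights, instead of training on into the overfitting regime.
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

es = EarlyStopping(monitor='val_loss', mode='min', patience=10,
                   restore_best_weights=True, verbose=1)
mc = ModelCheckpoint('best_model_inception_rmsprop.h5', monitor='val_accuracy',
                     mode='max', save_best_only=True, verbose=1)

history = model.fit(x=aug.flow(X_train, y_train, batch_size=32),
                    validation_data=(X_val, y_val),
                    epochs=100,
                    callbacks=[es, mc])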
Have you tried the following? (a rough sketch combining them is below)
Using a higher dropout value
Using a lower learning rate (lr=0.00001 or lr=0.000001, ...)
Using more data augmentation
It seems to me that your amount of data is low. You could use a lower ratio for the test and validation splits (10%, 10%).
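As a rough, untested sketch (not your exact setup): stronger augmentation, a higher dropout rate, a lower learning rate, and Flatten swapped for GlobalAveragePooling2D to shrink the head. pre_trained_model is the same InceptionV3 base from your code; the exact values are guesses.
# Hypothetical sketch combining the suggestions above.
from tensorflow.keras import layers, Model
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator

aug = ImageDataGenerator(rotation_range=30,          # more augmentation than before
                         zoom_range=0.25,
                         width_shift_range=0.1,
                         height_shift_range=0.1,
                         horizontal_flip=True,
                         fill_mode="nearest")

x = layers.GlobalAveragePooling2D()(pre_trained_model.output)  # smaller head than Flatten
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)                                      # higher dropout
x = layers.Dense(2, activation='softmax')(x)
model = Model(pre_trained_model.input, x)

model.compile(optimizer=RMSprop(lr=1e-5),                       # lower learning rate
              loss='binary_crossentropy',
              metrics=['accuracy'])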

What is the point of data augmentation?

The code below is based on François Chollet. He uses it to show that when the training set is small (2000 images), data augmentation improves classification performance on the validation set (which is true!).
My questions are:
If the model.fit_generator method uses steps_per_epoch = 2000 // batch_size, are we using 2000 images per epoch?
If yes, what is the point of data augmentation if I use an augmented sample of the same size as the original one?
batch_size = 32

# Train data augmentation
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
train_generator = train_datagen.flow_from_directory(train_dir,
                                                    target_size=(150, 150),
                                                    batch_size=batch_size,
                                                    class_mode='binary')

# Validation data generation
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(validation_dir,
                                                              target_size=(150, 150),
                                                              batch_size=20,
                                                              class_mode='binary')

# Training and validation
history = model.fit_generator(train_generator,
                              steps_per_epoch=2000 // batch_size,
                              epochs=100,
                              validation_data=validation_generator,
                              validation_steps=500 // batch_size)
The code that you posted is quite dated, but it serves the purpose for this explanation.
Indeed, we have 2000 images. We use them all, but the number of steps performed in an epoch is 2000 // batch_size, since you update the weights of the network after each batch of batch_size images. Therefore you perform 2000 // batch_size steps.
At the same time, think of augmentation as enrichment at run time. When you use augmentation you do not create new examples that are physically stored on your drive; the images are transformed when the batch is loaded into memory. This means that, out of a batch of batch_size elements, some are modified (augmented) before being fed into your network. Each augmentation has an associated probability, i.e. there is an N% probability (which you can also set manually) that your image is subjected to that specific augmentation.
This means that as training progresses and the number of epochs increases, your network gets to see many more distinct images than the initial 2000.
In the snippet you provided, steps_per_epoch = 2000 // batch_size essentially means that the model will see all 2000 images during an epoch, but many of those 2000 images will be replaced with their augmented counterparts, based either on a specific probability you provide or by randomly choosing images.
For example, consider building a dogs-vs-cats classifier where the dataset contains only right-facing dogs and left-facing cats. In this case, if you don't apply augmentation (horizontal flipping), the model might learn that all left-facing animals are cats, which leads to incorrect results when it is given an image of a left-facing dog.
Augmentation here (specifically horizontal flipping) will randomly flip the images of cats and dogs, enabling the model to reach a better solution and hence making it more robust!
Augmentation happens in place; no new images are generated.
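You can see the in-place behaviour yourself with a small sketch (img is assumed to be a single image already loaded as a NumPy array of shape (150, 150, 3)): drawing repeatedly from flow() on the same source image yields a different random variant each time, which is exactly what the model sees across epochs.
# Small sketch: the same source image yields a different augmented copy
# every time it is drawn; nothing new is written to disk.
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator, array_to_img

datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2,
                             zoom_range=0.2, horizontal_flip=True)

batch = np.expand_dims(img, axis=0)      # flow() expects a batch dimension
flow = datagen.flow(batch, batch_size=1)

for i in range(4):
    variant = next(flow)[0]              # a freshly augmented copy, in memory only
    array_to_img(variant).save(f'augmented_{i}.png')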

Speed problem of model.fit() in TF2 when loading data using DataGenerator

I run a simple classification problem with a small dataset on tf2 with two different ways on how to load the data.
In the first way, I loaded the data by reading the images into (train_x, train_y) and (test_x, test_y) arrays.
The training was quite fast and fine.
Then, I wanted to try using a DataGenerator, like so:
training_datagen = ImageDataGenerator(rescale=1./255,
                                      rotation_range=15,
                                      fill_mode='nearest')
validation_datagen = ImageDataGenerator(rescale=1./255)

train_generator = training_datagen.flow_from_directory(TRAINING_DIR,
                                                       target_size=(224, 224),
                                                       class_mode='categorical')
validation_generator = validation_datagen.flow_from_directory(VALIDATION_DIR,
                                                              target_size=(224, 224),
                                                              class_mode='categorical')
and then I run the training with the command
H = model.fit(train_generator,
              batch_size=2,
              validation_data=validation_generator,
              verbose=1,
              epochs=EPOCHS)
Then the training becomes extremely slow: one epoch lasts several minutes, while in the previous case the whole training took less than 15 seconds.
I don't understand what the problem is. This issue seems to be shared among several developers, but it is not clear why training becomes so slow when using a data generator.
Thanks
The issue was also addressed here
https://github.com/keras-team/keras/issues/12683#issuecomment-614963118
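If the bottleneck is the Python generator re-reading and decoding every image from disk on each epoch, one option (a hedged sketch, not necessarily the fix from the linked issue; it assumes TF 2.3+ and reuses the TRAINING_DIR, VALIDATION_DIR, model and EPOCHS names from the question) is a cached tf.data pipeline. Note the random rotation augmentation is omitted here for brevity.
# Sketch of an alternative input pipeline with tf.data: decoded images are
# cached after the first epoch and batches are prefetched, so later epochs
# are not bottlenecked by reading files from disk.
import tensorflow as tf

AUTOTUNE = tf.data.experimental.AUTOTUNE

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    TRAINING_DIR, image_size=(224, 224), batch_size=32, label_mode='categorical')
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    VALIDATION_DIR, image_size=(224, 224), batch_size=32, label_mode='categorical')

train_ds = (train_ds.map(lambda x, y: (x / 255.0, y))   # same rescaling as before
                    .cache()
                    .prefetch(AUTOTUNE))
val_ds = (val_ds.map(lambda x, y: (x / 255.0, y))
                .cache()
                .prefetch(AUTOTUNE))

H = model.fit(train_ds, validation_data=val_ds, epochs=EPOCHS, verbose=1)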

CNN alternates between good performance and chance

I have a binary classification problem that I am trying to solve with a CNN written in Keras. The inputs are very sparse 200x125x2 tensors (they can be thought of as two images stacked together), and their nonzero elements are only ones (representing neuron spike trains). The input is generated using a data generator that I have built, so the model is trained using the fit_generator function.
I have tried various architectures, and some show decent performance (~88%), but the thing is that sometimes when I train new models, they don't seem to work at all, giving a chance-level (50%) result every epoch. The weird thing is that this sometimes happens to the same architectures that worked well before. I am running the code on Google Colab (GPU) with TensorFlow 2.0. I have checked multiple times that I haven't changed anything in the code. I know that random initialization of the weights and biases may cause slight changes in performance, but this looks like something else.
Any ideas will be very helpful. Thanks!
Here is the relevant code for one of the models that had this problem (I am using unusual kernels, I know):
# General settings
x_max = 10
x_size, t_size, N_features = parameters(x_max)
batch_size = 64
N_epochs = 10
N_final = 10*N_features
N_final = int(N_final - N_final%(batch_size))
N_val = 100*batch_size
N_test = N_final/5
# Setting up the architecture of the network and compiling
model = Sequential()
model.add(SeparableConv2D(50, (50,30), data_format='channels_first', input_shape=(2,x_size, t_size)))
model.add(MaxPooling2D(pool_size=2, data_format='channels_first'))
model.add(SeparableConv2D(100, (10,6), data_format='channels_first', input_shape=(2,x_size, t_size)))
model.add(MaxPooling2D(pool_size=2, data_format='channels_first'))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fitting the model on generated data
filepath="......hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
start = time.time()
fit_history = model_delta.fit_generator(generator=data_generator(batch_size, x_max, 'delta', '_', 100),
                                        steps_per_epoch=N_final//batch_size,
                                        validation_data=data_generator(batch_size, x_max, 'delta', '_', 100),
                                        validation_steps=N_val//batch_size,
                                        callbacks=[checkpoint],
                                        epochs=N_epochs)
end = time.time()
The most suspicious thing I see is the 'relu' near the end of the model. Depending on the initialization and on the learning rate, ReLUs can be unlucky and fall into an all-zeros state. When this happens, they completely stop gradients and don't train anymore.
Given how your problem looks (sometimes it works, sometimes it doesn't), it seems very plausible that it's the relu.
So the first suggestion (this always solves it) is to add a batch normalization before the activation:
model.add(Dense(100))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dense(1, activation='sigmoid'))
Hint: if you are going to use it on the 4D tensors before the Flatten, remember to use the channels dimension: BatchNormalization(1).
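Applied to the convolutional part of the model in the question, that hint could look like this (a sketch only; it shows just the first conv block, keeps the channels_first layout and shapes from the question's code, and adds an explicit ReLU that the original convolutions did not have):
# Sketch: BatchNormalization on the channels axis (axis=1) for
# channels_first tensors, placed before the activation.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (SeparableConv2D, BatchNormalization,
                                     Activation, MaxPooling2D, Flatten, Dense)

model = Sequential()
model.add(SeparableConv2D(50, (50, 30), data_format='channels_first',
                          input_shape=(2, x_size, t_size)))
model.add(BatchNormalization(axis=1))    # channels dimension of the 4D tensor
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=2, data_format='channels_first'))
model.add(Flatten())
model.add(Dense(100))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dense(1, activation='sigmoid'))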

Emotion detection on text

I am a newbie in ML and was experimenting with emotion detection on text.
I have the ISEAR dataset, which contains tweets labeled with their emotion.
My current accuracy is 63% and I want to increase it to at least 70%, or even more.
Here's the code:
inputs = Input(shape=(MAX_LENGTH, ))
embedding_layer = Embedding(vocab_size, 64, input_length=MAX_LENGTH)(inputs)
# x = Flatten()(embedding_layer)
x = LSTM(32, input_shape=(32, 32))(embedding_layer)
x = Dense(10, activation='relu')(x)
predictions = Dense(num_class, activation='softmax')(x)
model = Model(inputs=[inputs], outputs=predictions)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['acc'])
model.summary()

filepath = "weights-simple.hdf5"
checkpointer = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
history = model.fit([X_train], batch_size=64, y=to_categorical(y_train), verbose=1,
                    validation_split=0.1, shuffle=True, epochs=10, callbacks=[checkpointer])
That's a pretty general question; optimizing the performance of a neural network may require tuning many factors.
For instance (a rough sketch combining several of these appears below):
The choice of optimizer: in NLP tasks, rmsprop is also a popular optimizer
Tweaking the learning rate
Regularization, e.g. dropout, recurrent_dropout, batch norm; this may help the model generalize better
More units in the LSTM
More dimensions in the embedding
You can try a grid search, e.g. using different optimizers and evaluating on a validation set.
The data may also need some tweaking, such as:
Text normalization - a better representation of the tweets - removing unnecessary tokens (#, @)
Shuffling the data before the fit - Keras' validation_split creates the validation set from the last data records
There is no simple answer to your question.
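As a rough, untested sketch of several of the suggestions above (the values are guesses, not tuned; MAX_LENGTH, vocab_size and num_class are the same variables as in your code): a larger embedding and LSTM, dropout, and the rmsprop optimizer.
# Hypothetical sketch combining several of the suggestions above.
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, Dropout
from tensorflow.keras.models import Model

inputs = Input(shape=(MAX_LENGTH,))
x = Embedding(vocab_size, 128, input_length=MAX_LENGTH)(inputs)   # larger embedding
x = LSTM(64, dropout=0.2, recurrent_dropout=0.2)(x)               # more units + regularization
x = Dense(32, activation='relu')(x)
x = Dropout(0.3)(x)
predictions = Dense(num_class, activation='softmax')(x)

model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['acc'])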
