Image classification CNN model loss and accuracy flatline - conv-neural-network

I am working on a CNN model for multi-class image classification, while both loss and accuracy show flatline and values stay almost same.
Could you please help have a look if any mistakes made and much appreciate if any advice? thanks a lot in advance.
Loss and accuracy:
Input data
(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
(24296, 32, 32, 1) (6075, 32, 32, 1) (24296, 6) (6075, 6)
X_train:
y_train:
CNN code
model
model = Sequential()
model.add(Conv2D(16, (2,2), activation = 'relu', input_shape = (32,32,1)))
model.add(MaxPooling2D((2,2)))
model.add(Conv2D(32, (2,2), activation = 'relu'))
model.add(MaxPooling2D((2,2)))
model.add(Conv2D(64, (2,2), activation = 'relu'))
model.add(MaxPooling2D((2,2)))
model.add(Conv2D(128, (2,2), activation = 'relu'))
model.add(MaxPooling2D((2,2)))
model.add(Flatten())
model.add(Dense(100, activation = 'relu'))
model.add(Dense(6, activation = 'softmax'))
compile
model.compile(loss = 'categorical_crossentropy',
optimizer = optimizers.RMSprop(learning_rate=0.001),
metrics = ['accuracy'])
early stopping and fit
es = EarlyStopping(patience = 5, verbose=2)
history = model.fit(X_train, y_train,
validation_split = 0.2,
callbacks=[es],
epochs=100,
batch_size=64)
I checked the community, tried different optimizer (adam, sgd and RMSprop), parameters like learning rate and also different layers, but similar result.I expect the loss drop and accuracy increase, no flatline.

For what I can see the model is not learning, it's returning the same predictions for all inputs.
Is it 6 classes you're trying to predict? model.add(Dense(6, activation = 'softmax'))
Is your dataset big enough? Without enough parameters to represent
your data the model will have high loss.
Also, when is your model supposed to stop? are you monitoring
val_accuracy or val_loss and the training phase is terminating
when these metrics stops increasing or decreasing?
Did you One-Hot-Encode your targets so to use
categorical_crossentropy ? Otherwise try
sparse_categorical_crossentropy?

Related

Difference between keras.evaluate() accuracy e manually calculated accuracy by prediction: Multilabel classification

I'm using keras to perform multilabel classification. I'm using 'binary_crossentropy' as loss function, metrics=['accuracy'], 'sigmoid' as activation.
During the training I saw accuracy above the 90%, and also using evaluate on the test set I have something similar.
If I try to manually compute the accuracy using predict module the accuracy also on the training set dramatically leaves 45%.
This is the model:
model = keras.models.Sequential()
model.add(keras.layers.Conv2D(8, kernel_size=3, strides=2, activation='relu', input_shape=(N_qubits, N_qubits,2)))
model.add(keras.layers.Conv2D(16, kernel_size=2, activation='relu'))
model.add(keras.layers.Dense(64, activation='relu'))
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Conv2D(32, kernel_size=1, activation='relu'))
model.add(keras.layers.GlobalMaxPooling2D())
model.add(keras.layers.Dense(units=y_train.shape[1], activation='sigmoid'))
adam_optimizer = keras.optimizers.adam()
model.compile(loss='binary_crossentropy', optimizer=adam_optimizer, metrics=['accuracy'])
history = model.fit(X_train, y_train, batch_size=150, epochs=1000, verbose=1, validation_split=0.05)
Here where I use evaluate()
results = model.evaluate(X_test, y_test, batch_size=128)
Here when I use predict()
y_pred_train = model.predict(X_train)
y_pred_test = model.predict(X_test)

Multiclass Dataset Imbalance

from tensorflow.keras.preprocessing.image import ImageDataGenerator
import tensorflow as tf
train_path = 'Skin/Train'
test_path = 'Skin/Test'
train_gen = ImageDataGenerator(rescale=1./255)
train_generator = train_gen.flow_from_directory(train_path,target_size=
(300,300),batch_size=30,class_mode='categorical')
model = tf.keras.models.Sequential([
# Note the input shape is the desired size of the image 300x300 with 3 bytes color
# This is the first convolution
tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(600, 450, 3)),
tf.keras.layers.MaxPooling2D(2, 2),
# The second convolution
tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
# The third convolution
tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
# The fourth convolution
tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
# The fifth convolution
tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
tf.keras.layers.MaxPooling2D(2,2),
# Flatten the results to feed into a DNN
tf.keras.layers.Flatten(),
# 512 neuron hidden layer
tf.keras.layers.Dense(512, activation='relu'),
tf.keras.layers.Dense(9, activation='softmax')
])
from tensorflow.keras.optimizers import RMSprop
model.compile(loss='categorical_crossentropy',
optimizer=RMSprop(lr=0.001),
metrics=['acc'])
history = model.fit_generator(
train_generator,
steps_per_epoch=8,
epochs=15,
verbose=2, class_weight = ? )
I have issue in achieving accuracy, I am training a 9 classes dataset, in which class 1 , 4, and 5 have only 100, 96, 90 images while the remaining classes are having above 500 images. Due to which I am unable to achieve higher accuracy as the weights are skewed toward images which are higher in number. I want that during training all classes are considered equal i.e 500. Would be appreciated if I could upsample the classes via tensorflow or any keras function code. rather than manually upsampling or downsampling the images in folders.
You can, instead, use a class_weight argument in your fit method.
For upsampling, you need a lot of manual work, that's inevitable.
Assuming you have an output with shape (anything, 9), and you know the totals of each class:
totals = np.array([500,100,500,500,96,90,.......])
totalMean = totals.mean()
weights = {i: totalMean / count for i, count in enumerate(totals)}
model.fit(....., class_weight = weights)

LSTM Grid Search

I have a code below which implements an architecture (in grid search), to yield appropriate parameters for input, nodes, epochs, batch size and differenced time series input.
The challenge I have is to convert the neural network from just having one LSTM hidden layer, to multiple LSTM hidden layers.
At the moment, I could only run the code with Dense-type hidden layers, without having any errors thrown, otherwise I get dimension errors, tuple errors and so on.
The problem is only persistent in the neural network architecture section.
Original code that works:
def model_fit(train, config):
# unpack config
n_input, n_nodes, n_epochs, n_batch, n_diff = config
# Data
if n_diff > 0:
train = difference(train, n_diff)
# Time series to supervised format
data = series_to_supervised(train, n_in=n_input)
train_x, train_y = data[:, :-1], data[:, -1]
# Reshaping input data into [samples, timesteps, features]
n_features = 1
train_x = train_x.reshape((train_x.shape[0], train_x.shape[1], n_features))
# Define model for (Grid search architecture)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features)))
model.add(Dense(n_nodes, activation='relu'))
model.add(Dense(n_nodes, activation='relu'))
model.add(Dense(n_nodes, activation='relu'))
model.add(Dense(1))
# Compile model (Grid search architecture)
model.compile(loss='mse', optimizer='adam')
# fit model
model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch, verbose=0)
return model
Modified LSTM-hidden layer code, that fails to run:
# Define model for (Grid search architecture)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
Another variant that also threw an error - ValueError: Error when checking target: expected time_distributed_4 to have 3 dimensions, but got array with shape (34844, 1)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=False))
model.add(RepeatVector(n_input))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
Could anyone with any suggestion please help me ?
Try to set return_sequences=False at the last layer.

Loss & val_loss of keras CNN

I have a dataset of about 160k images of 160 classes and I'm trying to classify them using CNN. Training on 120k images for 20 epochs i start with loss ~ 4.9 and val_loss ~ 4.6 which improves to about 3.3 and 3.2 after 20 epochs. I Really tried to read the documentation of Keras and understand what does that mean, but i couldn't, so I'm asking if someone would explain to me in the context of my model what does that mean for my model. I mean what does the loss score represent? What does it say for the model?
num_classes = 154
batch_size = 64
input_shape = (50,50,3)
epochs = 20
X, y = load_data()
# input image dimensions
img_rows, img_cols = 50, 50
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Conv2D(64, kernel_size=(5, 5),
activation='relu',
padding = 'same',
input_shape=input_shape))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(3, 3)))
model.add(Dropout(0.20))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
I think it will help to watch a few tutorials online on CNN, to begin with. Basically, you want your loss to reduce with the training epochs which is what is observed in your case. Typically we look at how both losses are evolving over the entire training period. Observing how the train and validation loss is changing helps us have an understand whether the model is overfitting or not. You can check This link for a basic explanation to detect overfitting.
Ideally, you would want both your training as well as test loss to reduce with iterations. It is a measure of the error committed by the model in classification (as accuracy increase you expect the loss to reduce)

How can I train on video data using Keras? "transfer learning"

I want to train my model on video data for gesture recognition, proposed using LSTM's and TimeDistributed layers. Would this be ideal way to tackle my problem?
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
model.add(TimeDistributed(Dense(62), input_shape=(img_width, img_height,3)))
model.add(Conv2D(32, (3, 3)))
model.add(Dropout(0.25))
model.add(Conv2D(32, (3, 3)))
model.add(MaxPooling2D(pool_size=pool_size))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units = 1, activation = 'sigmoid'))
print('Train...')
model.summary()
# Run epochs of sampling data then training
For temporal sequence data LSTM networks are generally the right choice. If you want to analyze video then a combination with 2d convolutions sounds reasonable to me. However, you have to apply TimeDistributed on all layers which dont expect sequence data. In your example that means all laysers expect LSTM.
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
model.add(TimeDistributed(Dense(62), input_shape=(img_width, img_height,3)))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(Dropout(0.25))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(TimeDistributed(MaxPooling2D(pool_size=pool_size)))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units = 1, activation = 'sigmoid'))
print('Train...')
model.summary()
# run epochs of sampling data then training
The last Dense Layer can stay this way because the final lstm doesnt output a sequence.

Resources