Maximize accuracy of multilabel image classification - conv-neural-network

I am building a multilabel image classification network. The dataset contains 70k images and 12 classes. Across the entire dataset, each of the 12 classes appears in more than 10% of the images, and 3 of the 12 classes appear in more than 70%. I am using the VGG16 network without its associated classifier.
The best validation accuracy I get from training is 68%. I have tried changing the number of units per Dense layer (512, 256, 128, etc.), increasing the number of layers (5 or 6), adding and removing a Dropout layer (rate 0.5), and kernel regularization (L1=0.1, L2=0.1).
Since accuracy is not an appropriate metric for multilabel classification, I am trying to use HammingLoss as the metric instead. But it is not working; here is the issue that I opened on the GitHub repo of HammingLoss.
What can be done to improve the accuracy?
What am I missing when incorporating HammingLoss?
For classification, I am using the network as:
network.add(vggBase)
network.add(tf.keras.layers.Dense(256, activation='relu'))
network.add(tf.keras.layers.Dense(64, activation='relu'))
network.add(tf.keras.layers.Dense(12, activation='sigmoid'))
network.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss=tf.keras.losses.BinaryCrossentropy(), metrics=['accuracy'])

I recommend using Keras Tuner for hyperparameter tuning.
If HammingLoss is not working for you, you could use a different metric as a workaround, such as pr_auc. The choice of metric depends strongly on what you want to achieve with your model. Maybe towardsdatascience/evaluating-multi-label-classifiers can help you figure that out.
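While the metric integration is being sorted out, the Hamming loss itself is simple enough to compute by hand on thresholded sigmoid outputs. The following is a minimal sketch in plain Python, not the library's implementation; the function name `hamming_loss`, the 0.5 threshold, and the toy values are assumptions made for illustration.

```python
def hamming_loss(y_true, y_prob, threshold=0.5):
    """Fraction of label slots predicted wrongly, over all samples and labels."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same number of samples")
    wrong = 0
    total = 0
    for true_row, prob_row in zip(y_true, y_prob):
        for t, p in zip(true_row, prob_row):
            pred = 1 if p >= threshold else 0  # binarize the sigmoid output
            wrong += int(pred != t)
            total += 1
    return wrong / total


# Toy example (made-up values): 2 images, 4 labels each.
y_true = [[1, 0, 1, 0],
          [0, 1, 1, 0]]
y_prob = [[0.9, 0.2, 0.4, 0.1],   # third label is missed
          [0.3, 0.8, 0.7, 0.6]]   # fourth label is a false positive
print(hamming_loss(y_true, y_prob))  # 2 wrong slots out of 8
```

Running this on the validation predictions after each epoch gives a per-label error rate, which is usually easier to interpret for multilabel problems than the default accuracy.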

Related

What do units and layers do in a neural network?

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    # the hidden ReLU layers
    layers.Dense(units=4, activation='relu', input_shape=[2]),
    layers.Dense(units=3, activation='relu'),
    # the linear output layer
    layers.Dense(units=1),
])
The above is a Keras sequential model example from Kaggle. I'm having trouble understanding two things.
Are the units the number of nodes in a hidden layer? I see some people use 250 or whatever. What happens when the number is made higher or lower?
Why would another hidden layer need to be added? What does adding more and more layers actually do to the data?
Answers in brief
units is the number of neurons in a particular layer. A higher number gives the model more parameters to update during learning. The same goes for layers (more layers and more neurons take more time to train). How many neurons to choose depends on the use case, the dataset, and the model architecture.
With more hidden layers, you have more parameters to update. More parameters and layers mean the model is able to capture more complex relationships hidden in the data. For example, in (multiclass) image classification, you need deeper layers with more neurons to extract the features in the image, which are then used to classify in the final layer.
Play with TensorFlow Playground; it will give you a good idea of what happens when you change the layers and neurons.
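To make "more units means more parameters" concrete, here is a small sketch that counts the trainable parameters of a stack of Dense layers (weights plus biases). The helper name `dense_param_count` is made up for this example; the layer sizes 4, 3, 1 and the 2 inputs come from the model in the question, and 250 is the width mentioned there.

```python
def dense_param_count(n_inputs, layer_units):
    """Trainable parameters per Dense layer: inputs * units weights + units biases."""
    counts = []
    fan_in = n_inputs
    for units in layer_units:
        counts.append(fan_in * units + units)
        fan_in = units  # this layer's outputs feed the next layer
    return counts


small = dense_param_count(2, [4, 3, 1])    # the model from the question
wide = dense_param_count(2, [250, 3, 1])   # same depth, first layer widened to 250
print(small, sum(small))
print(wide, sum(wide))
```

Widening only the first hidden layer from 4 to 250 units takes the total from 31 to 1507 parameters, which is the extra capacity (and extra training cost) the answer refers to.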

Increasing the smoothness of the accuracy curve in image classification

I have developed a Convolutional Neural Network using the TILDA image dataset which gives over 90% accuracy with the following model. I trained the model with 4 batches and 100 epochs.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    layers.Input((30, 30, 1)),
    layers.Conv2D(8, 2, padding='same', activation='relu', kernel_regularizer=regularizers.l2(0.01)),
    layers.BatchNormalization(),
    layers.Conv2D(16, 2, padding='same', activation='relu', kernel_regularizer=regularizers.l2(0.01)),
    layers.BatchNormalization(),
    layers.Conv2D(32, 2, padding='same', activation='sigmoid', kernel_regularizer=regularizers.l2(0.01)),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(5, activation="softmax"),
])
Using the above model I could plot the following graphs for the training and validation accuracy.
Do you have any suggestions to increase the smoothness of these curves? What can be the possible reasons for getting such curves? I appreciate your recommendations to improve this model.
The following may help in getting a smoother curve:
NEVER use dropout before the final layer. MaxPool + Dropout in your model discards 87.5% of the data flowing into the final layer. Avoid pooling as well, unless you need global or adaptive pooling to get a fixed-shape output. If you must pool, you need a much larger number of kernels to compensate for the loss of information.
Use a lower learning rate. From what the training curve shows, the model is heading toward a minimum, but with several bumps along the way.
Are you using SGD without momentum? If yes, introduce momentum. Also consider adaptive optimizers with built-in momentum, like Adam.
Why the sigmoid in between? Sigmoid reduces the gradient magnitude and makes learning slower.
If you only care about the curve and are not restricted by the number of parameters, consider adding a few more layers and/or channels.
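The 87.5% figure above can be checked with a little arithmetic. This sketch assumes the Keras default 2x2 max pooling (one value kept per 2x2 window) followed by Dropout(0.5), which zeroes half of the remaining activations during training only; the helper name `retained_fraction` is made up for this example.

```python
def retained_fraction(pool_size=2, dropout_rate=0.5):
    """Fraction of pre-pooling activations that still reach the final layer (training time)."""
    kept_by_pool = 1 / (pool_size * pool_size)  # one value survives per pooling window
    kept_by_dropout = 1 - dropout_rate          # dropout zeroes this share of what is left
    return kept_by_pool * kept_by_dropout


kept = retained_fraction()
print(f"kept: {kept:.3f}, discarded: {1 - kept:.3f}")
```

With the defaults this gives 0.125 kept and 0.875 discarded. Setting `dropout_rate=0.0` shows that pooling alone already drops 75%.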

How should I change the model if accuracy is very low?

I am new to deep learning and I want to build an image classifier using a CNN (Keras). I have built a model with 2 convolution layers (filters = 32, kernel = 3x3) followed by a MaxPooling layer (2x2), with this block repeated 2 times, and finally 2 fully connected layers. I am getting an accuracy of 50%. My question is how to choose the model to begin with. For example, how do we decide whether there should be 2 convolution layers followed by a MaxPooling layer, or 1 convolution and 1 MaxPooling layer? Also, how do we choose the number of filters in each convolution layer and the kernel size?
If my model is not working, how do I decide what changes to make to it?
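from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential()
# Block 1: two 3x3 convolutions with 32 filters, then 2x2 max pooling
model.add(Conv2D(32, (3, 3), input_shape=(280, 280, 3), activation='relu'))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# model.add(Dropout(0.25))
# Block 2: two 3x3 convolutions with 64 filters, then 2x2 max pooling
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
# model.add(Dropout(0.25))
# Classifier head: one hidden Dense layer, then a 5-way softmax
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(5, activation='softmax'))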
I am getting an accuracy of 50% after 5 epochs. What changes should I make to my model?
Let us start with the more straightforward part: the input and output layers. Every network has a single input layer and a single output layer. The number of neurons in the input layer equals the number of input variables in the data being processed. The number of neurons in the output layer equals the number of outputs associated with each input.
The challenge is choosing the number of hidden layers and their neurons.
The answer is that you cannot analytically calculate the number of layers or the number of nodes to use per layer in an artificial neural network to address a specific real-world predictive modeling problem.
The number of layers and the number of nodes in each layer are model hyperparameters that you must specify and tune.
You must discover the answer using a robust test harness and controlled experiments. Regardless of the heuristics you might encounter, all answers come back to the need for careful experimentation to see what works best for your specific dataset.
Likewise, the filter size is a hyperparameter you specify before training your network.
For an image recognition problem, if you think that a large number of pixels is necessary for the network to recognize the object, use large filters (such as 11x11 or 9x9). If you think that what differentiates objects is small, local features, use small filters (3x3 or 5x5).
These are tips; there are no hard rules.
There are many tricks to increase the accuracy of your deep learning model. Kindly refer to this link: Improve deep learning model performance.
Hope this will help you.
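One cheap experiment before changing anything is to trace how the feature-map size evolves through the model in the question. The sketch below assumes 3x3 kernels with stride 1 and no padding, and 2x2 pooling with stride 2, as described in the question; the helper names `conv_out` and `pool_out` are made up for this example.

```python
def conv_out(size, kernel=3, stride=1):
    """Output width/height of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


def pool_out(size, pool=2):
    """Output width/height of non-overlapping max pooling."""
    return size // pool


size = 280      # input is 280x280x3
channels = 3
for block_filters in (32, 64):         # two blocks: conv, conv, pool
    size = conv_out(conv_out(size))    # two stacked 3x3 convolutions
    size = pool_out(size)              # 2x2 max pooling
    channels = block_filters

flat = size * size * channels          # length of the Flatten output
dense_params = flat * 256 + 256        # weights + biases of Dense(256)
print(size, flat, dense_params)
```

Under these assumptions the Flatten layer outputs 67 * 67 * 64 = 287296 values, so the first Dense layer alone holds about 73.5 million parameters. That may be worth knowing when deciding whether to add more pooling or convolution blocks before the classifier.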

Accuracy on middle layer of autoencoder implemented using Keras

I have implemented an autoencoder using Keras. I understand that I can add accuracy performance metric as follows:
autoencoder.compile(optimizer='adam',
loss='mean_squared_error',
metrics=['accuracy'])
My question is:
Is the accuracy metric applied to the last layer of the decoder by default? If so, how can I set it up so that it uses the representations from the middle (hidden) layer to compute accuracy? Do I need to define a custom metric? How would that work?
It seems that what you really want is a multiple-output network.
So on top of the middle layer that defines your embedding, add a layer (or more) to do your classification.
Then have a look at Multiple outputs in Keras to create your global cost.
You may also want to start by training the autoencoder only, then only the additional classifier layers, to see the performance. You can also balance the accuracy of the encoder against the accuracy of the classifier in a single loss, training "both" networks at the same time.
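The "global cost" for such a two-output network is typically a weighted sum of one loss per output. Below is a framework-free sketch of that idea, assuming mean squared error for the reconstruction and categorical cross-entropy for the classifier; the function names, the weights `w_rec` and `w_cls`, and the toy values are all assumptions for illustration. In Keras, the same weighting is what the `loss_weights` argument of `compile` expresses.

```python
import math


def mse(target, output):
    """Mean squared reconstruction error for one sample."""
    return sum((t - o) ** 2 for t, o in zip(target, output)) / len(target)


def categorical_crossentropy(true_index, probs, eps=1e-7):
    """Cross-entropy for one sample, given the index of the true class."""
    return -math.log(max(probs[true_index], eps))


def combined_loss(x, reconstruction, true_class, class_probs, w_rec=1.0, w_cls=1.0):
    """Weighted sum of the decoder loss and the classifier loss."""
    return (w_rec * mse(x, reconstruction)
            + w_cls * categorical_crossentropy(true_class, class_probs))


# Toy values (made up): a 2-feature input, its reconstruction, and 2-class probabilities.
x, x_hat = [0.0, 1.0], [0.1, 0.9]
print(combined_loss(x, x_hat, true_class=1, class_probs=[0.2, 0.8]))
print(combined_loss(x, x_hat, true_class=1, class_probs=[0.2, 0.8], w_rec=0.0))
```

Setting `w_rec=0.0` leaves only the classifier term, which mirrors the suggestion to look at each part separately before training both together.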

Overfitting after one epoch

I am training a model using Keras.
model = Sequential()
model.add(LSTM(units=300, input_shape=(timestep, 103), use_bias=True, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(units=536))
model.add(Activation("sigmoid"))
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
while True:
    history = model.fit_generator(
        generator=data_generator(x_[train_indices], y_[train_indices], batch=batch, timestep=timestep),
        steps_per_epoch=int(train_indices.shape[0] / batch),
        epochs=1,
        verbose=1,
        validation_steps=int(validation_indices.shape[0] / batch),
        validation_data=data_generator(
            x_[validation_indices], y_[validation_indices], batch=batch, timestep=timestep))
It is a multioutput classification according to the scikit-learn.org definition:
Multioutput regression assigns each sample a set of target values. This can be thought of as predicting several properties for each data-point, such as wind direction and magnitude at a certain location.
So it is a recurrent neural network. I tried out different timestep sizes, but the result/problem is mostly the same.
After one epoch, my training loss is around 0.0X and my validation loss is around 0.6X. These values stay stable for the next 10 epochs.
The dataset has around 680000 rows. Training data is 9/10 and validation data is 1/10.
I am asking for the intuition behind that.
Is my model already overfitted after just one epoch?
Is 0.6xx even a good value for a validation loss?
High level question:
Since it is a multioutput classification task (not multiclass), the only way I see is to use sigmoid and binary_crossentropy. Do you suggest another approach?
I've experienced this issue and found that the learning rate and batch size have a huge impact on the learning process. In my case, I did two things:
Reduce the learning rate (try 0.00005)
Reduce the batch size (8, 16, 32)
Moreover, you can try the basic steps for preventing overfitting:
Reduce the complexity of your model
Increase the training data and balance the number of samples per class
Add more regularization (Dropout, BatchNorm)
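As a reference point for the 0.6X validation loss in the question: binary cross-entropy for a model that outputs 0.5 for every label is ln 2, about 0.693, whatever the true labels are. The sketch below checks this in plain Python; the function is a hand-written stand-in rather than the Keras loss itself, and the labels are made up.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over all samples and labels."""
    total = 0.0
    count = 0
    for true_row, prob_row in zip(y_true, y_prob):
        for t, p in zip(true_row, prob_row):
            p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
            count += 1
    return total / count


# Toy labels (made up): 2 samples, 4 binary outputs each.
y_true = [[1, 0, 0, 1],
          [0, 0, 1, 0]]
uninformed = [[0.5] * 4 for _ in y_true]  # "no idea" prediction for every label
print(round(binary_crossentropy(y_true, uninformed), 4))  # ln 2, rounded
```

A validation loss that stays near this value suggests, though it does not prove, that the predictions on unseen data are not much better than uninformed guesses, which is consistent with the overfitting advice above.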
