Find top layers for a fine-tuned model - keras

I want to use a fine-tuned model, based on MobileNetV2 (pre-trained on Keras). But I need to add top layers in order to classify my images into 2 classes. I would like to know how to choose the "architecture" of layers that I need ?
In some examples, people use SVM Classifer or series of Dense layer with a specific number of neurons as top layers.
The following code (by default), it works :
self.base_model = base_model
x = self.base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(2, activation='softmax')(x)
Is there any methodology to find the best solution ?

I'll recommend either Dropout or BatchNormalization. Dense can be easily overfitted because it has too many parameters in a layer. Both layers can regularize the model well. GlobalAveragePooling2D is a good choice because it also acts like regularizer itself.
I'll also suggest that, for the binary classification problem, you can change the output layer to be Dense(1, activation='sigmoid') to predict only P(class1), where you can calculate P(class2) by 1-P(class1). The loss you should use in this case will be binary_crossentropy instead of categorical_crossentropy.

Related

neural network binary classification softmax logsofmax and loss function

I am building a binary classification where the class I want to predict is present only <2% of times. I am using pytorch
The last layer could be logosftmax or softmax.
self.softmax = nn.Softmax(dim=1) or self.softmax = nn.LogSoftmax(dim=1)
my questions
I should use softmax as it will provide outputs that sum up to 1 and I can check performance for various prob thresholds. is that understanding correct?
if I use softmax then can I use cross_entropy loss? This seems to suggest that it is okay to use
if i use logsoftmax then can I use cross_entropy loss? This seems to suggest that I shouldnt.
if I use softmax then is there any better option than cross_entropy loss?
` cross_entropy = nn.CrossEntropyLoss(weight=class_wts)`

stacked Bi-LSTM compare to 1 layer of Bi-LSTM

I trained my model on BI LSTM for multi class text classification, but my result is not different when I use 2 stacked BI LSTM like below compare to just 1 layer of BI LSTM, any idea about that ?
max_len = 409
max_words = 17666
emb_dim = 100
BB = Sequential()
BB.add(Embedding(max_words, emb_dim,weights=[embedding_matrix], input_length=max_len))
BB.add(Bidirectional(LSTM(64, return_sequences=True,dropout=0.4, recurrent_dropout=0.4)))
BB.add(Bidirectional(LSTM(34, return_sequences=False,dropout=0.4, recurrent_dropout=0.4)))
BB.add(Dropout(0.5))
BB.add(Dense(3, activation='softmax'))
BB.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
BB.summary()
One thing that could be the issue is that your embeddings dimension is quite low a(100), maybe you could try some other embedding type as text representation, like word2vec, which has an embedding dimension of 300, if I am not mistaken.
On the modeling side, would try experimenting with increasing LSTM units first (try 100 or more), then stacking LSTM layers on top of each other.
I also, noticed that your model is very heavily regularized through Dropout, recurrent and otherwise. Did you notice overfitting while training your model? If not, I would suggest removing those as they can affect training if unnecessarily added.

ResNet50 and VGG16 for data with 2 channels

Is there any way that I can try to modify ResNet50 and VGG16 where my data(spectograms) is of the shape (64,256,2)?
I understand that I can take some layers out and modify them(output, dense) but I am not really sure for input channels.
Can anyone suggest a way to accommodate 2 channels in the models? Help is much appreciated!
You can use a different number of channels in the input (and a different height and width), but in this case, you cannot use the pretrained imagenet weights. You have to train from scratch. You can create them as follows:
from tensorflow import keras # or just import keras
vggnet = keras.applications.vgg16.VGG16(input_shape=(64,256,2), include_top=False, weights=None)
Note the weights=None argument. It means initialize weights randomly. If you have number of channels set to 3, you could use weights='imagenet', but in your case, you have 2 channels, so it won't work and you have to set it to None. The include_top=False is there for you to add final classification layers with different categories yourself. You could also create vgg19.VGG19 in the same way. For ResNet, you could similarly create it as follows:
resnet = keras.applications.resnet50.ResNet50(input_shape=(64, 256, 2), weights=None, include_top=False)
For other models and versions of vgg and resnet, please check here.

How to use MC Dropout on a variational dropout LSTM layer on keras?

I'm currently trying to set up a (LSTM) recurrent neural network with Keras (tensorflow backend).
I would like to use variational dropout with MC Dropout on it.
I believe that variational dropout is already implemented with the option "recurrent_dropout" of the LSTM layer but I don't find any way to set a "training" flag to put on to true like a classical Dropout layer.
This is quite easy in Keras, first you need to define a function that takes both model input and the learning_phase:
import keras.backend as K
f = K.function([model.layers[0].input, K.learning_phase()],
[model.layers[-1].output])
For a Functional API model with multiple inputs/outputs you can use:
f = K.function([model.inputs, K.learning_phase()],
[model.outputs])
Then you can call this function like f([input, 1]) and this will tell Keras to enable the learning phase during this call, executing Dropout. Then you can call this function multiple times and combine the predictions to estimate uncertainty.
The source code for "Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning" (2015) is located at https://github.com/yaringal/DropoutUncertaintyExps/blob/master/net/net.py. They also use Keras and the code is quite easy to understand. The Dropout layers are used without the Sequential api in order to pass the training parameter. This is a different approach to the suggestion from Matias:
inter = Dropout(dropout_rate)(inter, training=True)

Keras Model - Functional API - adding layers to existing model

I am trying to learn to use the Keras Model API for modifying a trained model for the purpose of fine-tuning it on the go:
A very basic model:
inputs = Input((x_train.shape[1:]))
x = BatchNormalization(axis=1)(inputs)
x = Flatten()(x)
outputs = Dense(10, activation='softmax')(x)
model1 = Model(inputs, outputs)
model1.compile(optimizer=Adam(lr=1e-5), loss='categorical_crossentropy', metrics=['categorical_accuracy'])
The architecture of it is
InputLayer -> BatchNormalization -> Flatten -> Dense
After I do some training batches on it I want to add some extra Dense layer between the Flatten one and the outputs:
x = Dense(32,activation='relu')(model1.layers[-2].output)
outputs = model1.layers[-1](x)
However, when I run it, i get this:
ValueError: Input 0 is incompatible with layer dense_1: expected axis -1 of input shape to have value 784 but got shape (None, 32)
Could someone please explain what is going on and how/if can I add layers to an already trained model?
Thank you
A Dense layer is made strictly for a certain input dimension. That dimension cannot be changed after you define it (it would need a different number of weights).
So, if you really want to add layers before a dense layer that is already used, you need to make sure that the outputs of the last new layer is the same shape as the flatten's output. (It says you need 784, so your new last dense layer needs 784 units).
Another approach
Since you're adding intermediate layers, it's pointless to keep the last layer: it was trained specifically for a certain input, if you change the input, then you need to train it again.
Well... since you need to train it again anyway, why keep it? Just create a new one that will be suited to the shapes of your new previous layers.

Resources