I am trying to learn to use the Keras Model API for modifying a trained model for the purpose of fine-tuning it on the go:
A very basic model:
inputs = Input((x_train.shape[1:]))
x = BatchNormalization(axis=1)(inputs)
x = Flatten()(x)
outputs = Dense(10, activation='softmax')(x)
model1 = Model(inputs, outputs)
model1.compile(optimizer=Adam(learning_rate=1e-5), loss='categorical_crossentropy', metrics=['categorical_accuracy'])
The architecture of it is
InputLayer -> BatchNormalization -> Flatten -> Dense
After training it on a few batches, I want to add an extra Dense layer between the Flatten layer and the output layer:
x = Dense(32,activation='relu')(model1.layers[-2].output)
outputs = model1.layers[-1](x)
However, when I run it, I get this:
ValueError: Input 0 is incompatible with layer dense_1: expected axis -1 of input shape to have value 784 but got shape (None, 32)
Could someone please explain what is going on, and whether and how I can add layers to an already trained model?
A Dense layer is made strictly for a certain input dimension. That dimension cannot be changed after you define it (it would need a different number of weights).
So, if you really want to add layers before a Dense layer that is already in use, you need to make sure that the output of the last new layer has the same shape as the Flatten layer's output. (The error says the layer expects 784, so your new last Dense layer needs 784 units.)
Another approach
Since you're adding intermediate layers, it's pointless to keep the last layer: it was trained specifically for a certain input, if you change the input, then you need to train it again.
Well... since you need to train it again anyway, why keep it? Just create a new one that will be suited to the shapes of your new previous layers.
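A minimal sketch of this second approach, assuming tf.keras and 28x28 inputs (the input shape and the random batch are assumptions, not taken from the question). The new hidden layer is attached to the Flatten output and a fresh output layer replaces the old one:

```python
import numpy as np
from tensorflow.keras.layers import Input, BatchNormalization, Flatten, Dense
from tensorflow.keras.models import Model

# Stand-in for the original model (28x28 inputs are an assumption).
inputs = Input((28, 28))
x = BatchNormalization(axis=1)(inputs)
x = Flatten()(x)
outputs = Dense(10, activation='softmax')(x)
model1 = Model(inputs, outputs)

# Reuse everything up to Flatten, then add the new hidden layer
# and a brand-new output layer sized for its 32 inputs.
flat = model1.layers[-2].output
hidden = Dense(32, activation='relu')(flat)
new_outputs = Dense(10, activation='softmax')(hidden)
model2 = Model(model1.input, new_outputs)
model2.compile(optimizer='adam', loss='categorical_crossentropy',
               metrics=['categorical_accuracy'])

# Random batch, used only to confirm that the shapes line up.
probs = model2.predict(np.random.rand(4, 28, 28), verbose=0)
```

The BatchNormalization and Flatten layers are shared with model1, so whatever they learned is kept; only the two Dense layers start from scratch.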


Data augmentation in Keras model

I am trying to add data augmentation as a layer to a model but I am getting the following error.
TypeError: The added layer must be an instance of class Layer. Found: <tensorflow.python.keras.preprocessing.image.ImageDataGenerator object at 0x7f8c2dea0710>
data_augmentation = tf.keras.preprocessing.image.ImageDataGenerator(
rotation_range=30, horizontal_flip=True)
model = Sequential()
model.add(Dense(n_classes, activation= 'softmax', kernel_regularizer='l2'))
history =, y,
I have also tried this way:
data_augmentation = Sequential(
model = Sequential()
model.add(Dense(n_classes, activation= 'softmax', kernel_regularizer='l2'))
history =, y,
It gives an error:
ValueError: Input 0 of layer sequential_7 is incompatible with the layer: expected ndim=4, found ndim=2. Full shape received: [128, 14272]
Could you please advise how I can use augmentation in Keras?
In your first case, you are using ImageDataGenerator as a layer, which it is not: as the name says, it is just a generator that applies random transformations to images (image augmentation) before feeding them to the network. So, the images are augmented on the CPU and then fed to the neural network, which can run on the GPU if you have one.
Generators are also commonly used to avoid loading huge datasets into memory, since they let you load only the batches that are about to be used.
In the second case, you are using image augmentation as layers of your model properly. The difference here is that the augmentation is run as part of your model, so if you have a GPU available for instance, those operations will run in GPU.
The problem with your second case is in the model itself (in fact, the model is also wrong in the first approach; you only get a different error there because the bad usage of ImageDataGenerator fails before execution reaches the model).
Note that you are using images as inputs, so, the input should be of shape (height, width, channels), but then you are starting your model with a dense layer, which expects a single array of shape (n_features,).
If your model needs to start with a Dense layer (strange, but may be ok in some case) then you need first to use Flatten layer to convert images of shape (h,w,c) into vectors of shape (h*w*c,). This change will solve your second approach for sure.
That said, you don't need to specify the input shape on every single layer: doing it in your first layer should be enough.
Last, but not least: are you sure this model is being fed with images? According to your fit call, it looks like you are using previously extracted features that may be vectors (this makes sense with your current model architecture but makes no sense with the usage of image augmentation).
Please, provide more details with respect to your data to clarify this point.
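For reference, a sketch of the layer-based approach with a Flatten layer in front of the Dense layer. It assumes TensorFlow 2.6 or newer (where RandomFlip and RandomRotation live directly under tf.keras.layers), and the 32x32x3 image size and n_classes = 5 are made-up values:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

n_classes = 5  # assumed number of classes

# Augmentation layers are only active during training (training=True).
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(30 / 360),  # factor is a fraction of a full turn
])

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),  # images: (height, width, channels)
    data_augmentation,
    layers.Flatten(),                   # (32, 32, 3) -> (3072,)
    layers.Dense(n_classes, activation='softmax', kernel_regularizer='l2'),
])

# Random images, used only to check that the shapes are compatible.
images = np.random.rand(8, 32, 32, 3).astype('float32')
probs = model.predict(images, verbose=0)
```

If the data really consists of pre-extracted feature vectors rather than images, the augmentation layers have nothing meaningful to work on and should be left out.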

Find top layers for a fine-tuned model

I want to use a fine-tuned model based on MobileNetV2 (pre-trained, from Keras), but I need to add top layers in order to classify my images into 2 classes. How do I choose the "architecture" of the layers that I need?
In some examples, people use an SVM classifier or a series of Dense layers with a specific number of neurons as top layers.
The following code works by default:
self.base_model = base_model
x = self.base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(2, activation='softmax')(x)
Is there any methodology to find the best solution?
I recommend adding either Dropout or BatchNormalization. A Dense layer overfits easily because it has many parameters, and both of these layers regularize the model well. GlobalAveragePooling2D is a good choice because it also acts as a regularizer itself.
I'll also suggest that, for the binary classification problem, you can change the output layer to be Dense(1, activation='sigmoid') to predict only P(class1), where you can calculate P(class2) by 1-P(class1). The loss you should use in this case will be binary_crossentropy instead of categorical_crossentropy.
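A minimal sketch of such a head, assuming tf.keras. The 96x96x3 input size and the 0.5 dropout rate are arbitrary choices for illustration, and weights=None is used here only to avoid downloading the ImageNet weights:

```python
import tensorflow as tf
from tensorflow.keras.layers import GlobalAveragePooling2D, Dropout, Dense
from tensorflow.keras.models import Model

# weights=None avoids a download here; use weights='imagenet' in practice.
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights=None)
base_model.trainable = False  # train only the new head at first

x = GlobalAveragePooling2D()(base_model.output)
x = Dropout(0.5)(x)                              # regularizes the head
predictions = Dense(1, activation='sigmoid')(x)  # P(class1)

model = Model(base_model.input, predictions)
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
```

With this head the labels are plain 0/1 values rather than one-hot vectors. Which regularizer and which layer sizes work best still has to be found by validating on held-out data.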

Keras lstm and dense layer

How does a Dense layer change the output coming from an LSTM layer? How is it that from an output of size 50 from the previous layer I get an output of size 1 from the Dense layer used for prediction?
Let's say I have this basic model:
model = Sequential()
model.add(Dense(1, activation="softmax"))
Does the Dense layer take the values coming from the previous layer, assign a probability (using the softmax function) to each of the 50 inputs, and then return that as its output?
No, Dense layers do not work like that. The input has 50 dimensions, and the output has as many dimensions as the layer has neurons, one in this case. Each output is a weighted linear combination of the inputs plus a bias, passed through the activation function.
Note that the softmax activation makes no sense for a one-neuron layer: since softmax is normalized, the only possible output is a constant 1.0. That's probably not what you want.
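Both points can be checked with a few lines of NumPy. This is only an illustration with random stand-in values for the LSTM output and the Dense weights, not the Keras implementation itself:

```python
import numpy as np

def softmax(z):
    # Subtract the row max for numerical stability.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 50))        # 4 samples of a 50-dimensional LSTM output
W = 0.1 * rng.normal(size=(50, 1))  # Dense(1) kernel: one weight per input
b = np.zeros(1)                     # Dense(1) bias

z = h @ W + b                       # weighted combination, shape (4, 1)
softmax_out = softmax(z)            # normalizing over a single unit
sigmoid_out = 1.0 / (1.0 + np.exp(-z))  # a usual choice for one output unit
```

Here softmax_out is 1.0 for every sample regardless of the weights, while sigmoid_out varies between 0 and 1, which is why sigmoid (or no activation, for regression) is the usual choice for a single output neuron.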

How to add a new class to a saved Keras Sequential model

I have a 10-class dataset on which I got 85% accuracy, and I got the same accuracy with the saved model.
Now I want to add a new class. How can I add a new class to the saved model?
I tried deleting the last layer and training again, but the model overfits and at prediction time every image gets the same result (the newly added class).
This is what I did:
base_model_layers = model.output
pred = Dense(11, activation='softmax')(base_model_layers)
model = Model(inputs=model.input, outputs=pred)
# compile and fit step
I have a model trained on 10 classes. I want to load it, train it on data for class 11, and get predictions.
Using the model.pop() method and then the Keras Model() API will lead to an error: a model built with Model() does not have a .pop() method, so you will hit this error if you want to re-train your model more than once.
But the error only occurs if, after re-training, you save the model and use the newly saved model in the next re-training.
Another common but very wrong approach is to call model.layers.pop(). The problem this time is that pop() only removes the last layer from the copy of the layer list that model.layers returns. The model itself still has the layer; only that returned copy lacks it.
I recommend the following solution:
Assuming you have your already trained model stored in the model variable, something like:
from keras.models import Sequential
from keras.layers import Dense
model = load_my_trained_model_function()
# creating a new model
model_2 = Sequential()
# getting all the layers except the output one
for layer in model.layers[:-1]:  # just exclude last layer from copying
    model_2.add(layer)
# prevent the already trained layers from being trained again
# (you can use layers[:-n] to only freeze the model layers until the nth layer)
for layer in model_2.layers:
    layer.trainable = False
# adding the new output layer, the name parameter is important
# otherwise, you will add a Dense_1 named layer, that normally already exists, leading to an error
model_2.add(Dense(num_neurons_you_want, name='new_Dense', activation='softmax'))
Now you should specify the compile and fit methods to train your model and it's done:
# trains the model
model_history = model_2.fit(x_train, y_train)  # add epochs, batch_size, etc. as needed
Note that the new output layer does not carry over the weights and biases adjusted in the last training.
As a result, we lose pretty much everything the old output layer learned in the previous training.
To avoid that, we need to save the weights and biases of the output layer from the previous training and then load them into the new output layer.
We must also decide whether to let all the layers train, or only some of the layers in between.
To get the weights and biases from the output layer using Keras we can use the following method:
# weights_training[0] = layer weights
# weights_training[1] = layer biases
weights_training = model.layers[-1].get_weights()
Now you should specify the weights for the new output layer. You can use, for example, the mean of the weights for the weights of the new classes. It's up to you.
To set the weights and biases of the new output layer using Keras we can use the following method:
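One possible way to do this is sketched below, assuming tf.keras and a small stand-in for the trained 10-class model (the layer sizes are made up). It uses the mean of the existing class weights to initialize the new class, which is only one of several reasonable choices:

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Stand-in for the trained 10-class model (sizes are assumptions).
inp = Input((20,))
feat = Dense(16, activation='relu')(inp)
old_out = Dense(10, activation='softmax')(feat)
model = Model(inp, old_out)

old_kernel, old_bias = model.layers[-1].get_weights()  # (16, 10), (10,)

# Initialize the 11th class with the mean of the existing classes.
new_kernel = np.concatenate(
    [old_kernel, old_kernel.mean(axis=1, keepdims=True)], axis=1)
new_bias = np.concatenate([old_bias, [old_bias.mean()]])

# New 11-unit output layer on top of the old features.
new_layer = Dense(11, activation='softmax', name='new_Dense')
new_out = new_layer(model.layers[-2].output)
model_2 = Model(model.input, new_out)
new_layer.set_weights([new_kernel, new_bias])
```

The first 10 columns of the new kernel are exactly the old output weights, so the model starts out with what it already knew about the 10 original classes.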
base_model_layers = model.output
pred = Dense(11, activation='softmax')(base_model_layers)
model = Model(inputs=model.input, outputs=pred)
Freeze the first layers before training it:
for layer in model.layers[:-2]:
    layer.trainable = False
I am assuming that the problem is single-label multiclass classification, i.e. a sample will belong to only 1 of the 11 classes.
This answer is based entirely on carrying the way humans learn over to machines. Hence, it will not give you ready-made code, but it will tell you what to do, and you will be able to implement it easily in Keras.
How does a human child learn when you teach him new things? At first, we ask him to forget the old and learn the new. This does not actually mean that the old learning is useless, but it means that while he is learning the new, the old knowledge should not interfere as it will confuse the brain. So, the child will only learn the new for some time.
But the problem here is, things are related. Suppose, the child learned C programming language and then learned compilers. There is a relation between compilers and programming language. The child cannot master computer science if he learns these subjects separately, right? At this point we introduce the term 'intelligence'.
The kid who understands that there is a relation between the things he learned before and the things he learned now is 'intelligent'. And the kid who finds the actual relation between the two things is 'smart'. (Going deep into this is off-topic)
What I am trying to say is:
Make the model learn the new class separately.
And then, make the model find a relation between the previously learned classes and the new class.
To do this, you need to train two different models:
The model which learns to classify the new class: this model will be a binary classifier. It predicts 1 if the sample belongs to class 11 and 0 if it doesn't. You already have training data for samples belonging to class 11, but you might not have data for samples which don't belong to class 11. For this, you can randomly select samples which belong to classes 1 to 10. But note that the ratio of samples belonging to class 11 to those not belonging to class 11 must be 1:1 in order to train the model properly. That means 50% of the samples must belong to class 11.
Now, you have two separate models: the one which predicts class 1-10 and one which predicts class 11. Now, concatenate the outputs of (the 2nd last layers) these two models with a newly created Dense layer with 11 nodes and let the whole model retrain itself adjusting the weights of pretrained two models and learning new weights of the dense layer. Keep the learning rate low.
The final model is the third model which is a combination of two models (without last Dense layer) + a new Dense layer.
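The combination step could look roughly like this in tf.keras. The helper make_classifier and the layer sizes are hypothetical stand-ins for the two trained models, and both are assumed to take the same kind of input:

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, Concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

def make_classifier(n_out, activation):
    # Hypothetical stand-in for a trained model on 20 input features.
    inp = Input((20,))
    hidden = Dense(16, activation='relu')(inp)
    out = Dense(n_out, activation=activation)(hidden)
    return Model(inp, out)

model_10 = make_classifier(10, 'softmax')  # predicts classes 1-10
model_new = make_classifier(1, 'sigmoid')  # class 11 vs. the rest

# Cut off the last Dense layer of each model.
features_10 = Model(model_10.input, model_10.layers[-2].output)
features_new = Model(model_new.input, model_new.layers[-2].output)

# Feed one shared input to both and merge their features.
inp = Input((20,))
merged = Concatenate()([features_10(inp), features_new(inp)])
out = Dense(11, activation='softmax')(merged)
combined = Model(inp, out)
combined.compile(optimizer=Adam(learning_rate=1e-5),  # low learning rate
                 loss='categorical_crossentropy')

# Random batch, used only to check the output shape.
probs = combined.predict(np.random.rand(3, 20), verbose=0)
```

The combined model is then trained on data from all 11 classes, so the new Dense layer can learn how the two feature sets relate.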

Is it possible to save a trained layer to use later in Keras?

I haven't used Keras and I'm thinking whether to use it or not.
I want to save a trained layer to use later. For example:
I train a model.
Then, I gain a trained layer t_layer.
I have another model to train which consists of layer1, layer2, layer3 .
I want to use t_layer as layer2 and not update this layer (i.e. t_layer does not learn any more).
This may be an odd attempt, but I want to try this. Is this possible on Keras?
Yes, it is.
You will probably have to save the layer's weights and biases instead of saving the layer itself, but it's possible.
Keras also allows you to save entire models.
Suppose you have a model in the var model:
weightsAndBiases = model.layers[i].get_weights()
This is a list of numpy arrays, very probably two: weights and biases. You can simply use numpy.save() to save these two arrays, and later you can create a similar layer and give it the weights:
import numpy
from keras.layers import *
from keras.models import Model
inp = Input(....)
out1 = SomeKerasLayer(...)(inp)
out2 = AnotherKerasLayer(....)(out1)
model = Model(inp,out2)
#above is the usual process of creating a model
#supposing layer 2 is the layer you want (you can also use names)
weights = numpy.load(...path to your saved weights)
biases = numpy.load(... path to your saved biases)
model.layers[2].set_weights([weights, biases])
You can make layers untrainable (must be done before the model compilation):
model.layers[2].trainable = False
Then you compile the model as usual with model.compile(...).
And there you go: a model with one untrainable layer whose weights and biases were defined by you, taken from somewhere else.
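Put together, the whole round trip might look like the sketch below. It assumes tf.keras; build_model, the layer names, and the layer sizes are hypothetical, and a temporary directory stands in for wherever you keep the saved arrays:

```python
import os
import tempfile
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def build_model():
    # Hypothetical three-layer model; layer sizes are assumptions.
    inp = Input((8,))
    x = Dense(6, activation='relu', name='layer1')(inp)
    x = Dense(6, activation='relu', name='layer2')(x)
    out = Dense(2, activation='softmax', name='layer3')(x)
    return Model(inp, out)

trained = build_model()  # pretend this one has been trained
weights, biases = trained.get_layer('layer2').get_weights()

# Save the two arrays to disk.
tmp_dir = tempfile.mkdtemp()
np.save(os.path.join(tmp_dir, 'layer2_weights.npy'), weights)
np.save(os.path.join(tmp_dir, 'layer2_biases.npy'), biases)

# Later: build a similar model, load the arrays, and freeze the layer.
new_model = build_model()
t_layer = new_model.get_layer('layer2')
t_layer.set_weights([
    np.load(os.path.join(tmp_dir, 'layer2_weights.npy')),
    np.load(os.path.join(tmp_dir, 'layer2_biases.npy')),
])
t_layer.trainable = False  # set before compile()
new_model.compile(optimizer='adam', loss='categorical_crossentropy')
```

The receiving layer must have the same shapes as the saved arrays, otherwise set_weights raises an error.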
Yes, it is a common practice in transfer learning.
The piece_to_share below can be one or more layers.
piece_to_share = tf.keras.Model(...)
full_model = tf.keras.Sequential([piece_to_share, ...])
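A filled-in version of the two lines above, as a sketch only: the single Dense layer standing in for the shared piece and all sizes are assumptions, and in practice piece_to_share would already be trained.

```python
import numpy as np
import tensorflow as tf

# Hypothetical reusable piece; in practice it would already be trained.
inp = tf.keras.Input((8,))
hidden = tf.keras.layers.Dense(4, activation='relu')(inp)
piece_to_share = tf.keras.Model(inp, hidden)
piece_to_share.trainable = False  # keep its weights fixed

full_model = tf.keras.Sequential([
    piece_to_share,
    tf.keras.layers.Dense(2, activation='softmax'),
])
full_model.compile(optimizer='adam', loss='categorical_crossentropy')

# Random batch, used only to build the model and check the output shape.
probs = full_model.predict(np.random.rand(5, 8), verbose=0)
```

Because piece_to_share is frozen, only the new Dense layer is updated when full_model is trained.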
