When a Keras model accepts multiple inputs, its layers behave as if there were just one input. It might be a bug.
model = vgg19.VGG19(weights='imagenet', include_top=False, pooling='avg')
model(image1)
model(image2)
model.get_output_at(0)
model.get_output_at(1)
#no error here
outputs_0 = [layer.get_output_at(0) for layer in model.layers]
#no error here
outputs_1 = [layer.get_output_at(1) for layer in model.layers]
#error "Asked to get output at node 1, but the layer has only 1 inbound nodes."
I'm really not sure what outputs_0 is. Since the model has two inputs, image1 and image2, when a layer returns its output, what is its corresponding input?
In Keras, if you want the output of an intermediate layer of a model, the steps are:
print your model, so you know the layer name;
wrap a new model around that layer;
get the output.
from keras.models import Model
print(<your_model>.summary())
<new_model> = Model(inputs=<your_model>.input, outputs=<your_model>.get_layer('your layer_name').get_output_at(<index_number>))
<your_output> = <new_model>.predict(<your_input>)
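For instance, here is one hypothetical way to fill in that template with the VGG19 model from the question, grabbing the output of its block5_pool layer at node 0 (the input batch is just a random placeholder):
import numpy as np
from keras.applications import vgg19
from keras.models import Model

base = vgg19.VGG19(weights='imagenet', include_top=False, pooling='avg')
# wrap a new model that exposes the intermediate layer's output at node 0
feat_model = Model(inputs=base.input,
                   outputs=base.get_layer('block5_pool').get_output_at(0))
x = np.random.rand(1, 224, 224, 3)  # placeholder input batch
features = feat_model.predict(x)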
Regardless of the model's inputs and outputs, there is no rule about how the layers behave inside a model. A model may have many internal branches and reuse (or not) the same layer with different inputs, thus yielding different outputs. A layer will only have an "output at 1" (or more) if that layer was used more than once.
The only certain things are:
the input layers will match the model's input (see 1),
and the output layers will match the model's output (see 1).
But anything is possible in between (see 2).
(1) - But a model that has many inputs/outputs actually has many "input/output layers". Each output layer has a single output. If you check the "model" outputs, you have many, but if you check the "layers" outputs, then there are several output layers, each yielding a single output (output at 0 only). The same is valid for the model's inputs vs input layers.
(2) - Even so, the most common case is for layers to be used only once, and thus to have only an "output at 0", without additional outputs.
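A minimal sketch (hypothetical layer and input names) of when a layer does get an "output at 1": the same layer object is called on two different inputs, creating two nodes:
from keras.layers import Input, Dense
from keras.models import Model

inp_a = Input((10,))
inp_b = Input((10,))
shared = Dense(5)        # one layer object...
out_a = shared(inp_a)    # ...used a first time (node 0)
out_b = shared(inp_b)    # ...and a second time (node 1)

model = Model([inp_a, inp_b], [out_a, out_b])
print(shared.get_output_at(0))  # output tensor for inp_a
print(shared.get_output_at(1))  # output tensor for inp_b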
Related
I'm working on a project that involves signal classification. I'm trying different ANN models in Keras to see which one is better. For now I'm focusing on simple networks, but I'm struggling with the LSTM one, following this example: https://machinelearningmastery.com/how-to-develop-rnn-models-for-human-activity-recognition-time-series-classification/.
My inputs are 1D signals that I get from an electronic sensor and that will be divided into 3 different categories. Here is one signal from each of two different categories, so you can see they are quite different over time.
https://www.dropbox.com/s/9ctdegtuyjamp48/example_signals.png?dl=0
To start, we are trying the following simple model. Since the signals have different lengths, a masking process has been applied to them, padding each one to the length of the longest with the mask value -1000 (a value that cannot occur in our signal). The data is reshaped from 2D to 3D (as required by the LSTM layer) using the following command, since I only have one feature:
Inputs = Inputs.reshape((Inputs.shape[0],Inputs.shape[1],1))
Then the data is split into training and validation sets and fed into the following model:
model = Sequential()
model.add(Masking(mask_value=-1000, input_shape=(num_steps,1)))
model.add(LSTM(20, return_sequences=False))
model.add(Dense(15, activation='sigmoid'))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
However, for some reason, each time the network is trained it always predicts that ALL signals belong to the same category, possibly a different category each time it is trained, usually the one with the most cases in the input data. If I force the network to be trained with the same amount of data for each category, it keeps giving me the same result.
I don't think this behaviour is normal: bad accuracy could happen, but this must be due to some elementary error in the model that I'm not spotting, since the training data is fed in correctly; I've rechecked that multiple times with no errors. Does anyone have any idea why this is happening?
Let me know if any more info could be useful to add to this post.
Just for any curious reader: in the end I was able to resolve it by normalizing the data.
import numpy as np

def LSTM_input_normalize(inputs):
    new_inputs = []
    for in_ in inputs:
        if -1000 in in_:
            # index of the first "-1000" (mask value) in the sequence
            start_idx = np.where(in_ == -1000)[0][0]
        else:
            start_idx = in_.shape[0]
        # compute mean and std of the current sequence (excluding the mask values)
        curr_mean = np.mean(in_[:start_idx])
        curr_std = np.std(in_[:start_idx])
        # normalize the single sequence
        in_[:start_idx] = (in_[:start_idx] - curr_mean) / curr_std
        new_inputs.append(in_)
    return np.array(new_inputs)
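A possible usage sketch, reusing the variable names from the question (normalize each sequence first, then reshape to 3D as before):
Inputs = LSTM_input_normalize(Inputs)
Inputs = Inputs.reshape((Inputs.shape[0], Inputs.shape[1], 1))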
I have a 10-class dataset; with this I got 85% accuracy, and I got the same accuracy with the saved model.
Now I want to add a new class. How do I add a new class to the saved model?
I tried deleting the last layer and retraining, but the model overfits, and in prediction every image shows the same result (the newly added class).
This is what I did
model.pop()
base_model_layers = model.output
pred = Dense(11, activation='softmax')(base_model_layers)
model = Model(inputs=model.input, outputs=pred)
# compile and fit step
I have a model trained with 10 classes; I want to load it, train it with class-11 data, and make predictions.
Using the model.pop() method and then the Keras Model() API will lead you to an error. The Model() API does not have the .pop() method, so if you want to re-train your model more than once you will have this error.
But the error only occurs if you, after the re-training, save the model and use the new saved model in the next re-training.
Another wrong but commonly used approach is model.layers.pop(). This time the problem is that the function only removes the last layer from the copy it returns. So the model still has the layer; only the method's return value lacks it.
I recommend the following solution:
Assuming you have your already trained model saved in the model variable, something like:
model = load_my_trained_model_function()
# creating a new model
model_2 = Sequential()
# getting all the layers except the output one
for layer in model.layers[:-1]: # just exclude last layer from copying
    model_2.add(layer)
# prevent the already trained layers from being trained again
# (you can use layers[:-n] to only freeze the model layers until the nth layer)
for layer in model_2.layers:
    layer.trainable = False
# adding the new output layer, the name parameter is important
# otherwise, you will add a Dense_1 named layer, that normally already exists, leading to an error
model_2.add(Dense(num_neurons_you_want, name='new_Dense', activation='softmax'))
Now you should specify the compile and fit methods to train your model and it's done:
model_2.compile(loss='categorical_crossentropy',
                optimizer='adam',
                metrics=['accuracy'])
# model.fit trains the model
model_history = model_2.fit(x_train, y_train,
                            batch_size=batch_size,
                            epochs=epochs,
                            verbose=1,
                            validation_split=0.1)
EDIT:
Note that by adding a new output layer we do not keep the weights and biases that were adjusted for the output layer in the previous training.
Thereby we lose pretty much everything that output layer had learned.
We need to save the weights and biases of the output layer from the previous training, and then add them to the new output layer.
We also need to decide whether to let all the layers train or not, or even whether to allow training of only some intermediate layers.
To get the weights and biases from the output layer using Keras we can use the following method:
# weights_training[0] = layer weights
# weights_training[1] = layer biases
weights_training = model.layers[-1].get_weights()
Now you need to decide on the weights for the new output layer. You can use, for example, the mean of the existing class weights as the weights of the new class. It's up to you.
To set the weights and biases of the new output layer using Keras we can use the following method:
model_2.layers[-1].set_weights(weights_re_training)
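As a rough sketch of that idea (one option among many, not the only correct one): keep the old 10-class weights and initialize the 11th column with their mean, then load them into the new 11-unit output layer:
import numpy as np

old_w, old_b = model.layers[-1].get_weights()      # shapes: (n_features, 10) and (10,)
new_col = old_w.mean(axis=1, keepdims=True)        # mean of the old class weights
new_w = np.concatenate([old_w, new_col], axis=1)   # (n_features, 11)
new_b = np.append(old_b, old_b.mean())             # (11,)

weights_re_training = [new_w, new_b]
model_2.layers[-1].set_weights(weights_re_training)  # new output layer must have 11 units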
model.pop()
base_model_layers = model.output
pred = Dense(11, activation='softmax')(base_model_layers)
model = Model(inputs=model.input, outputs=pred)
Freeze the first layers before training it:
for layer in model.layers[:-2]:
    layer.trainable = False
I am assuming that the problem is single-label multi-class classification, i.e. a sample belongs to only 1 of the 11 classes.
This answer is based entirely on implementing the way humans learn into machines. Hence, it will not provide you with proper code for how to do that, but it will tell you what to do, and you will be able to easily implement it in Keras.
How does a human child learn when you teach him new things? At first, we ask him to forget the old and learn the new. This does not actually mean that the old learning is useless, but it means that while he is learning the new, the old knowledge should not interfere, as it would confuse the brain. So, the child will only learn the new for some time.
But the problem here is, things are related. Suppose, the child learned C programming language and then learned compilers. There is a relation between compilers and programming language. The child cannot master computer science if he learns these subjects separately, right? At this point we introduce the term 'intelligence'.
The kid who understands that there is a relation between the things he learned before and the things he learned now is 'intelligent'. And the kid who finds the actual relation between the two things is 'smart'. (Going deep into this is off-topic)
What I am trying to say is:
Make the model learn the new class separately.
And then, make the model find a relation between the previously learned classes and the new class.
To do this, you need to train two different models:
The model which learns to classify the new class: this model will be a binary classifier. It predicts 1 if the sample belongs to class 11 and 0 if it doesn't. Now, you already have training data for samples belonging to class 11, but you might not have data for samples which don't belong to class 11. For this, you can randomly select samples which belong to classes 1 to 10. But note that the ratio of samples belonging to class 11 to those not belonging to class 11 must be 1:1 in order to train the model properly. That means 50% of the samples must belong to class 11.
Now you have two separate models: the one which predicts classes 1-10 and the one which predicts class 11. Concatenate the outputs of the second-to-last layers of these two models, feed that into a newly created Dense layer with 11 nodes, and let the whole model retrain itself, adjusting the weights of the two pretrained models and learning the new weights of the dense layer. Keep the learning rate low.
The final model is the third model, which is a combination of the two models (without their last Dense layers) + a new Dense layer, as in the sketch below.
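A rough sketch of that combination, assuming model_10 is the already trained 10-class model and model_new is the binary class-11 model (these names and input_shape are placeholders, not from the question):
from keras.layers import Input, Dense, concatenate
from keras.models import Model
from keras.optimizers import Adam

# expose the second-to-last layer of each pretrained model
feat_10 = Model(model_10.input, model_10.layers[-2].output)
feat_new = Model(model_new.input, model_new.layers[-2].output)

inp = Input(shape=input_shape)                 # same input shape as both sub-models
merged = concatenate([feat_10(inp), feat_new(inp)])
out = Dense(11, activation='softmax')(merged)  # new 11-way output

combined = Model(inp, out)
combined.compile(optimizer=Adam(lr=1e-4),      # keep the learning rate low
                 loss='categorical_crossentropy',
                 metrics=['accuracy'])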
Thank you..
In Keras you can specify a dropout layer like this:
model.add(Dropout(0.5))
But with a GRU cell you can specify the dropout as a parameter in the constructor:
model.add(GRU(units=512,
              return_sequences=True,
              dropout=0.5,
              input_shape=(None, features_size,)))
What's the difference? Is one preferable to the other?
In the Keras documentation, dropout is added as a separate layer (see "Sequence classification with LSTM").
The recurrent layers perform the same repeated operation over and over.
In each timestep, it takes two inputs:
Your inputs (a step of your sequence)
Internal inputs (can be states and the output of the previous step, for instance)
Note that the dimensions of the input and output may not match, which means that "your input" dimensions will not match "the recurrent input (previous step/states)" dimensions.
Then in every recurrent timestep there are two operations with two different kernels:
One kernel is applied to "your inputs" to process and transform them into a compatible dimension
Another (called the recurrent kernel by Keras) is applied to the inputs of the previous step.
Because of this, Keras also uses two dropout operations in the recurrent layers. (Dropouts that will be applied to every step)
A dropout for the first conversion of your inputs
A dropout for the application of the recurrent kernel
So, in fact there are two dropout parameters in RNN layers:
dropout, applied to the first operation on the inputs
recurrent_dropout, applied to the other operation on the recurrent inputs (previous output and/or states)
You can see this description implemented in GRUCell and LSTMCell, for instance, in the source code.
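For example, both parameters can be set directly on the layer from the question (the rate values here are arbitrary):
model.add(GRU(units=512,
              return_sequences=True,
              dropout=0.3,            # applied to the input kernel at every step
              recurrent_dropout=0.3,  # applied to the recurrent kernel at every step
              input_shape=(None, features_size,)))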
What is correct?
This is open to creativity.
You can use a Dropout(...) layer; it's not "wrong", but it will possibly drop "timesteps" too! (Unless you set noise_shape properly or use SpatialDropout1D, which is currently not documented.)
Maybe you want that, maybe you don't. If you use the parameters in the recurrent layer, you will be applying dropout only to the other dimensions, without dropping a single step. This seems healthy for recurrent layers, unless you want your network to learn how to deal with sequences containing gaps (this last sentence is a supposition).
Also, with the dropout parameters, you will really be dropping parts of the kernel, as the operations are dropped "in every step", while using a separate layer will let your RNN perform non-dropped operations internally, since your dropout will affect only the final output.
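If you do prefer a separate layer but don't want to drop whole timesteps, a sketch of the SpatialDropout1D alternative mentioned above (it drops entire feature channels across all timesteps instead):
from keras.layers import GRU, SpatialDropout1D

model.add(GRU(units=512, return_sequences=True, input_shape=(None, features_size,)))
model.add(SpatialDropout1D(0.5))  # drops whole features, not individual timesteps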
I am trying to learn to use the Keras Model API for modifying a trained model for the purpose of fine-tuning it on the go:
A very basic model:
inputs = Input((x_train.shape[1:]))
x = BatchNormalization(axis=1)(inputs)
x = Flatten()(x)
outputs = Dense(10, activation='softmax')(x)
model1 = Model(inputs, outputs)
model1.compile(optimizer=Adam(lr=1e-5), loss='categorical_crossentropy', metrics=['categorical_accuracy'])
The architecture of it is
InputLayer -> BatchNormalization -> Flatten -> Dense
After I do some training batches on it, I want to add an extra Dense layer between the Flatten one and the output:
x = Dense(32,activation='relu')(model1.layers[-2].output)
outputs = model1.layers[-1](x)
However, when I run it, I get this:
ValueError: Input 0 is incompatible with layer dense_1: expected axis -1 of input shape to have value 784 but got shape (None, 32)
Could someone please explain what is going on and how/if can I add layers to an already trained model?
Thank you
A Dense layer is made strictly for a certain input dimension. That dimension cannot be changed after you define it (it would need a different number of weights).
So, if you really want to add layers before a Dense layer that is already used, you need to make sure that the output of the last new layer has the same shape as the Flatten layer's output. (The error says it needs 784, so your new last Dense layer needs 784 units.)
Another approach
Since you're adding intermediate layers, it's pointless to keep the last layer: it was trained specifically for a certain input, and if you change the input, you need to train it again.
Well... since you need to train it again anyway, why keep it? Just create a new one that will be suited to the shapes of your new previous layers.
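A possible sketch of that second approach, reusing model1 from the question (the 32-unit size and the fresh 10-unit output layer are assumptions, not the only choice):
from keras.layers import Dense
from keras.models import Model
from keras.optimizers import Adam

x = Dense(32, activation='relu')(model1.layers[-2].output)  # new layer after Flatten
outputs = Dense(10, activation='softmax')(x)                # fresh output layer, trained from scratch
model2 = Model(model1.input, outputs)
model2.compile(optimizer=Adam(lr=1e-5),
               loss='categorical_crossentropy',
               metrics=['categorical_accuracy'])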
I am looking for a way to implement the following network structure (currently using Keras, might be theano however):
Assume we're given some simple network, but it is not possible to compute the desired loss based on its output directly; instead, another operation is needed, and the loss will be defined based on the output of this operation. However, this operation needs not only the output of the network but the full network object (e.g. its gradient).
How can this be done? I think the operation could be performed either in a custom layer on top of the network or in a custom loss function, but in neither case do I see a way to access the full network. Any suggestions?
Say, you have the following model.
import keras.applications.vgg16 as vgg16
from keras.layers import Dense
from keras.models import Model

model = vgg16.VGG16(weights='imagenet')
model.summary()
For example, now you want to delete the last layer of this model, which actually predicts a category (a vector of length 1000, because ImageNet has 1000 categories) for the input image.
# Remove last Linear/Dense layer.
model.layers.pop()
model.outputs = [model.layers[-1].output]
model.layers[-1].outbound_nodes = []
model.summary()
Now, let's add a linear layer (with output size 10) on top of the modified model and use the output of the new model. Note that a functional Model has no .add() method, so we rebuild it with the functional API:
new_output = Dense(10, activation='softmax')(model.layers[-1].output)
model = Model(inputs=model.input, outputs=new_output)
model.summary()
You will get a vector (of length 10) as an output from this model.
You can compile and train the model using model.compile() and model.fit() functions. You can set what type of loss function you want to use to train the model.
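A minimal compile-and-fit sketch (x_train and y_train are placeholder arrays of images and one-hot labels, not data from the question):
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=32)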