Keras replacing input of network

I have a problem similar to the one in "Keras replacing input layer", but I also need to remove the layer that follows the input, which requires a different input shape.
Here is a simplification of what I'm trying to do:
from keras.layers import Input, Dense
from keras.models import Model
a = Input(shape=(64,))
b = Dense(32)(a)
c = Dense(16)(b)
d = Dense(8)(c)
model = Model(inputs=a, outputs=d)
print('input shape = ' + str(model.input_shape))
print('input shape = ' + str(model.input_shape))
new_input = Input(shape=(32,))
new_output = model(new_input)
new_model = Model(new_input, new_output)
But the input shape of the model remains the same:
Layer (type)          Output Shape   Param #
input_1 (InputLayer)  (None, 64)     0
dense_1 (Dense)       (None, 32)     2080
dense_2 (Dense)       (None, 16)     528
dense_3 (Dense)       (None, 8)      136
Total params: 2,744
Trainable params: 2,744
Non-trainable params: 0
input shape = (None, 64)
Layer (type)          Output Shape   Param #
dense_2 (Dense)       (None, 16)     528
dense_3 (Dense)       (None, 8)      136
Total params: 664
Trainable params: 664
Non-trainable params: 0
input shape = (None, 64)
And that prevents me from creating a new model, so the code above fails with:
ValueError: Dimensions must be equal, but are 32 and 64 for 'model_1/dense_1/MatMul' (op: 'MatMul') with input shapes: [?,32], [64,32].
Any ideas how to do that?

It might not be possible to do this in the way that you describe. The accepted answer on this post explains it a little.
Their solution was to rebuild the layer with the correct input shape and then load the pre-trained weights for that specific layer.
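To make the shape mismatch behind the ValueError concrete, here is a minimal sketch in plain Python (no Keras involved). It assumes only that each Dense kernel has shape (input_dim, units), which matches the [64,32] kernel in the error message. The helper name output_dim is made up for this illustration.

```python
# Plain-Python sketch (no Keras): each Dense kernel has shape (input_dim, units).
# The kernel shapes below are taken from the question's model summary.

def output_dim(input_dim, kernel_shapes):
    """Walk a chain of Dense kernels and return the final output size.

    Raises ValueError when a kernel's input_dim does not match the size of
    the data flowing into it, mirroring the MatMul error in the question.
    """
    dim = input_dim
    for rows, cols in kernel_shapes:
        if rows != dim:
            raise ValueError(
                "Dimensions must be equal, but are %d and %d" % (dim, rows)
            )
        dim = cols
    return dim


kernels = [(64, 32), (32, 16), (16, 8)]  # dense_1, dense_2, dense_3

# Original model: 64 features in, 8 out.
full_model_out = output_dim(64, kernels)

# Feeding 32 features into the whole model fails at dense_1 ...
try:
    output_dim(32, kernels)
    mismatch = None
except ValueError as err:
    mismatch = str(err)

# ... but works when the chain starts at dense_2, which is where the
# pre-trained weights of dense_2 and dense_3 would be reused.
rebuilt_out = output_dim(32, kernels[1:])

print(full_model_out, mismatch, rebuilt_out)
```

In Keras terms this would roughly correspond to creating a fresh Input(shape=(32,)) and applying only the layers after dense_1 to it (or new layers of the same sizes that receive the old weights), rather than calling the whole old model on the new input.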


Keras with Hierarchical LSTM

I have a problem with a hierarchical LSTM in Keras. It works well when the data is two-dimensional, but it stops working once I change the data to three dimensions. My data has shape (25,10,2).
I want to build a hierarchical LSTM: the first-level LSTM converts each item of shape (10,2) into a vector, and the resulting 25 vectors are fed into the second-level LSTM. The input to the first-level LSTM has shape (10,2). I use two embeddings and multiply them. I would appreciate any help.
def H_LSTM():
    single_input = Input(shape=(10,2),dtype='int32')
    in_sentence = Lambda(lambda x: single_input[:,:, 0:1], output_shape=(maxlen,))(single_input)
    in_sentence = Reshape((maxlen,), input_shape = (maxlen,1))(in_sentence)
    in_drug = Lambda(lambda x: single_input[:, :, 1:1], output_shape=(maxlen,))(single_input)
    in_drug = Reshape((maxlen,), input_shape = (maxlen,1))(in_drug)
    embedded_sentence = Embedding(len(word_index) + 1, embedding_dim, weights=[embedding_matrix],
                                  input_length=maxlen, trainable=True, mask_zero=False)(in_sentence)
    embedded_drug = Embedding(len(word_index) + 1, embedding_dim, weights=[embedding_matrix],
                              input_length=maxlen, trainable=True, mask_zero=False)(in_drug)
    embedded_sequences = Multiply()([embedded_sentence, embedded_drug])
    lstm_sentence = LSTM(100)(embedded_sequences)
    encoded_model = Model(inputs = single_input, outputs = lstm_sentence)
    sequence_input = Input(shape=(25,10,2),dtype='int32')
    seq_encoded = TimeDistributed(encoded_model)(sequence_input)
    seq_encoded = Dropout(0.2)(seq_encoded)
    # Encode entire sentence
    seq_encoded = LSTM(100)(seq_encoded)
    # Prediction
    prediction = Dense(2, activation='softmax')(seq_encoded)
    model = Model(inputs = sequence_input, outputs = prediction)
    return model
Model Summary:
Layer (type)             Output Shape      Param #   Connected to
input_3 (InputLayer)     (None, 10, 2)     0
lambda_3 (Lambda)        (None, 10)        0         input_3[0][0]
lambda_4 (Lambda)        (None, 10)        0         input_3[0][0]
reshape_3 (Reshape)      (None, 10)        0         lambda_3[0][0]
reshape_4 (Reshape)      (None, 10)        0         lambda_4[0][0]
embedding_3 (Embedding)  (None, 10, 128)   4895744   reshape_3[0][0]
embedding_4 (Embedding)  (None, 10, 128)   4895744   reshape_4[0][0]
multiply_2 (Multiply)    (None, 10, 128)   0         embedding_3[0][0]
lstm_3 (LSTM)            (None, 100)       91600     multiply_2[0][0]
Total params: 9,883,088
Trainable params: 9,883,088
Non-trainable params: 0
Model: "model_4"
Layer (type)                  Output Shape        Param #
input_4 (InputLayer)          (None, 25, 10, 2)   0
time_distributed_2 (TimeDist  (None, 25, 100)     9883088
dropout_2 (Dropout)           (None, 25, 100)     0
lstm_4 (LSTM)                 (None, 100)         80400
dense_2 (Dense)               (None, 2)           202
Total params: 9,963,690
Trainable params: 9,963,690
Non-trainable params: 0
Error Message:
InvalidArgumentError: You must feed a value for placeholder tensor 'input_3' with dtype int32 and shape [?,10,2]
[[node input_3 (defined at D:\Users\Jinhe.Shi\AppData\Local\Continuum\anaconda3\lib\site-packages\keras\backend\ ]] [Op:__inference_keras_scratch_graph_6214]
Function call stack:
Update: the framework was shown in an image (not reproduced here). The differences are that my model has no attention layer and that I added two embeddings in the lower-level LSTM.
Model fit:
The error happens during the model fitting.
model2 = H_LSTM()
print("model fitting - Hierarchical network")
model2.fit(X_train, Y_train, nb_epoch=3, batch_size=100, validation_data=(X_test, Y_test))
The input data was shown in an image (not reproduced here).
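Two details in the Lambda layers of H_LSTM may be worth checking. First, both lambdas ignore their argument x and index the outer tensor single_input instead, so once the encoder is wrapped in TimeDistributed the graph may still depend on the original input_3 placeholder, which would be consistent with the error message. Second, the slice 1:1 selects nothing, whereas 1:2 selects the second column. The following is a minimal plain-Python sketch of the slicing point only, with nested lists standing in for one (10, 2) sample; the ids and the helper take_columns are made up for illustration.

```python
# Stand-in for one sample of shape (10, 2): ten [word_id, drug_id] pairs.
# The id values are invented for this illustration.
sample = [[100 + t, 200 + t] for t in range(10)]


def take_columns(rows, start, stop):
    """Mimic tensor[:, start:stop] on a list of rows."""
    return [row[start:stop] for row in rows]


first_col = take_columns(sample, 0, 1)   # like x[:, :, 0:1] for one sample
empty_col = take_columns(sample, 1, 1)   # like x[:, :, 1:1]: start == stop, nothing selected
second_col = take_columns(sample, 1, 2)  # like x[:, :, 1:2]

print(len(first_col[0]), len(empty_col[0]), len(second_col[0]))
```

Under these assumptions, a lambda of the form lambda x: x[:, :, 1:2] would select the drug column while also using the tensor that the layer actually receives.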

KERAS: Pretrained a CNN+Dense model. How to freeze CNN weights and substitute Dense with LSTM?

I trained and loaded a CNN+Dense model:
# load model
cnn_model = load_model('my_cnn_model.h5')
The model summary is this (my images have dimensions 2 x 3600):
Layer (type)                  Output Shape           Param #
conv2d_1 (Conv2D)             (None, 2, 3600, 32)    128
conv2d_2 (Conv2D)             (None, 2, 1800, 32)    3104
max_pooling2d_1 (MaxPooling2  (None, 2, 600, 32)     0
conv2d_3 (Conv2D)             (None, 2, 600, 64)     6208
conv2d_4 (Conv2D)             (None, 2, 300, 64)     12352
max_pooling2d_2 (MaxPooling2  (None, 2, 100, 64)     0
conv2d_5 (Conv2D)             (None, 2, 100, 128)    24704
conv2d_6 (Conv2D)             (None, 2, 50, 128)     49280
max_pooling2d_3 (MaxPooling2  (None, 2, 16, 128)     0
flatten_1 (Flatten)           (None, 4096)           0
dense_1 (Dense)               (None, 1024)           4195328
dense_2 (Dense)               (None, 1024)           1049600
dense_3 (Dense)               (None, 3)              3075
Total params: 5,343,779
Trainable params: 5,343,779
Non-trainable params: 0
Now I want to keep the weights up to the Flatten layer and replace the dense layers with LSTM layers, so that only the added LSTM part is trained.
This is what I wrote:
# freeze model
base_model = cnn_model(input_shape=(2, 3600, 1))
#base_model = cnn_model
base_model.trainable = False
# Adding the first lstm layer
x = LSTM(1024,activation='relu',return_sequences='True')(base_model.output)
# Adding the second lstm layer
x = LSTM(1024, activation='relu',return_sequences='False')(x)
# Adding the output
output = Dense(3,activation='linear')(x)
# Final model creation
model = Model(inputs=[base_model.input], outputs=[output])
But I obtained:
base_model = cnn_model(input_shape=(2, 3600, 1))
TypeError: __call__() missing 1 required positional argument: 'inputs'
I know I probably have to add TimeDistributed, ideally around the Flatten layer, but I do not know how to do it.
Moreover, I'm not sure whether base_model.trainable = False does exactly what I want.
Can you please help me with this?
Thank you very much!
You can't feed the output of Flatten() directly into an LSTM: an LSTM needs 2-D features per sample, (time, filters). You have to reshape your tensors.
You can take the output of the layer before Flatten (the max-pooling layer). Say this layer has index i in the model: take its output, reshape it to suit your needs, and pass it to the LSTM.
before_flatten = base_model.layers[i].output # i is the index of the layer whose output you want to take
conv2lstm_reshape = Reshape((-1, 2))(before_flatten) # choose the target shape yourself: (temporal dim, filters)
# Adding the first lstm layer
x = LSTM(1024, activation='relu', return_sequences=True)(conv2lstm_reshape)
# Adding the second lstm layer
x = LSTM(1024, activation='relu', return_sequences=False)(x)
# Adding the output (fed from the LSTM stack, not from before_flatten)
output = Dense(3, activation='linear')(x)
# Final model creation
model = Model(inputs=[base_model.input], outputs=[output])
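To see which (time, filters) shape a given Reshape target produces for the max_pooling2d_3 output of (None, 2, 16, 128), here is a small plain-Python sketch. The helper resolve_reshape is hypothetical and only reproduces the usual rule that a single -1 is inferred from the total number of elements; it is not Keras code.

```python
def resolve_reshape(input_shape, target_shape):
    """Resolve a single -1 in target_shape from the element count.

    Both shapes exclude the batch axis.
    """
    if target_shape.count(-1) > 1:
        raise ValueError("at most one -1 is allowed")
    total = 1
    for d in input_shape:
        total *= d
    known = 1
    for d in target_shape:
        if d != -1:
            known *= d
    if total % known != 0:
        raise ValueError("cannot reshape %r into %r" % (input_shape, target_shape))
    return tuple(total // known if d == -1 else d for d in target_shape)


pooled = (2, 16, 128)  # max_pooling2d_3 output without the batch axis

as_in_answer = resolve_reshape(pooled, (-1, 2))   # 2048 time steps of 2 features
per_filter = resolve_reshape(pooled, (-1, 128))   # 32 time steps of 128 features

print(as_in_answer, per_filter)
```

Which target is appropriate depends on what should count as a time step. As one possible choice, (-1, 128) keeps the 128 filter values of each spatial position together as the feature vector of one step.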

Keras functional API slower than Sequential / Not improving

SOLVED! (I had to set trainable=True on the Embedding layer, as in the Sequential model.)
I am currently changing my Keras model from Sequential to the functional API. While the Sequential model improves to an accuracy of 1 after about 10 epochs, the functional API model does not even reach 0.7 and does not improve further. Apart from the Input layer, both nets should be the same.
model = Sequential()
model.add(Embedding(20000, 256,input_length = 30))
model.add(LSTM(256, dropout=0.3, recurrent_dropout=0.3))
model.compile(loss = 'binary_crossentropy', optimizer=Adam(lr=0.0001),metrics = ['accuracy'])
Output is:
Layer (type)                  Output Shape      Param #
embedding_6 (Embedding)       (None, 30, 256)   5120000
spatial_dropout1d_5 (Spatial  (None, 30, 256)   0
lstm_5 (LSTM)                 (None, 256)       525312
dense_6 (Dense)               (None, 1)         257
Total params: 5,645,569
Trainable params: 5,645,569
Non-trainable params: 0
For the functional API:
inputs = Input(shape=(31,))
embed = Embedding(20000, 256, trainable=False)(inputs)
drop = (SpatialDropout1D(0.4))(embed)
lstm = LSTM(256, dropout=0.3, recurrent_dropout=0.3)(drop)
acti = Dense(1,activation='sigmoid')(lstm)
model = Model(inputs=inputs, outputs=acti)
model.compile(loss = 'binary_crossentropy', optimizer=Adam(lr=0.0001),metrics = ['accuracy'])
Model: "model_5"
Layer (type)                  Output Shape      Param #
input_8 (InputLayer)          (None, 31)        0
embedding_7 (Embedding)       (None, 31, 256)   5120000
spatial_dropout1d_6 (Spatial  (None, 31, 256)   0
lstm_6 (LSTM)                 (None, 256)       525312
dense_7 (Dense)               (None, 1)         257
Total params: 5,645,569
Trainable params: 525,569
Non-trainable params: 5,120,000
Have I overlooked something, or can someone explain my results?
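The two summaries differ only in the trainable/non-trainable split, and the numbers can be reproduced by hand. The sketch below is plain Python using the standard parameter-count formulas for Embedding, LSTM and Dense layers; the function names are made up for this illustration.

```python
def embedding_params(vocab_size, dim):
    # One vector of length `dim` per vocabulary entry
    return vocab_size * dim


def lstm_params(input_dim, units):
    # 4 gates, each with an input kernel, a recurrent kernel and a bias
    return 4 * ((input_dim + units + 1) * units)


def dense_params(input_dim, units):
    # Kernel plus bias
    return input_dim * units + units


emb = embedding_params(20000, 256)
lstm = lstm_params(256, 256)
dense = dense_params(256, 1)

total = emb + lstm + dense
# With Embedding(..., trainable=False) the embedding matrix is excluded
trainable_if_embedding_frozen = total - emb

print(emb, lstm, dense, total, trainable_if_embedding_frozen)
```

The 5,120,000 non-trainable parameters in the functional model are exactly the embedding matrix, which matches the trainable=False argument in that model's Embedding layer.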

Keras Flatten not creating 1D output

I am trying to build a 1D CNN, but I can't get the right dimensions passed to my last dense layer.
The architecture of my model is
model_CNN.add(Conv1D(14, 29, activation='relu', input_shape=(X_train.shape[1], 1)))
model_CNN.add(Conv1D(30, 22, activation='relu'))
With a summary that looks like
Layer (type)          Output Shape       Param #
conv1d_71 (Conv1D)    (None, 3304, 14)   420
conv1d_72 (Conv1D)    (None, 3283, 30)   9270
flatten_18 (Flatten)  (None, 98490)      0
dense_102 (Dense)     (None, 176)        17334416
dense_103 (Dense)     (None, 5)          885
Total params: 17,344,991
Trainable params: 17,344,991
Non-trainable params: 0
When I try to fit my model, I confirm that my input shape is correct (240, 3332, 1), but then I get the following error:
ValueError: Error when checking target: expected dense_103 to have 2 dimensions, but got array with shape (240, 5, 1)
So it seems my Flatten layer is not creating a 1D array, yet somehow the check only fails on the second dense layer, not the first. What's going on?
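The shapes in the summary can be checked by hand, and they suggest that Flatten is producing a 1D vector per sample. Note also that the error says "checking target", which refers to the label array passed to fit, not to the model input. The sketch below is plain Python; it assumes the Conv1D defaults of 'valid' padding and stride 1, and uses a made-up nested-list target of shape (240, 5, 1) to stand in for the real labels.

```python
def conv1d_out_len(length, kernel_size):
    # 'valid' padding and stride 1 (assumed Conv1D defaults)
    return length - kernel_size + 1


steps = 3332
after_conv1 = conv1d_out_len(steps, 29)        # conv1d_71
after_conv2 = conv1d_out_len(after_conv1, 22)  # conv1d_72
flat = after_conv2 * 30                        # Flatten over (steps, filters)


def shape_of(nested):
    """Shape of a regular (non-ragged) nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


# Hypothetical targets with the shape reported in the error: (240, 5, 1)
y = [[[0] for _ in range(5)] for _ in range(240)]

# Dropping the trailing axis gives the 2-D shape that dense_103 expects
y_2d = [[cell[0] for cell in row] for row in y]

print(after_conv1, after_conv2, flat, shape_of(y), shape_of(y_2d))
```

If the labels are a NumPy array, something like y.reshape(240, 5) would presumably have the same effect; that detail is an assumption, since the question does not show how the targets are built.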

How the number of parameters associated with BatchNormalization layer is 2048?

I have the following code.
x = keras.layers.Input(batch_shape = (None, 4096))
hidden = keras.layers.Dense(512, activation = 'relu')(x)
hidden = keras.layers.BatchNormalization()(hidden)
hidden = keras.layers.Dropout(0.5)(hidden)
predictions = keras.layers.Dense(80, activation = 'sigmoid')(hidden)
mlp_model = keras.models.Model(input = [x], output = [predictions])
And this is the model summary:
Layer (type)                      Output Shape   Param #   Connected to
input_3 (InputLayer)              (None, 4096)   0
dense_1 (Dense)                   (None, 512)    2097664   input_3[0][0]
batchnormalization_1 (BatchNorma  (None, 512)    2048      dense_1[0][0]
dropout_1 (Dropout)               (None, 512)    0         batchnormalization_1[0][0]
dense_2 (Dense)                   (None, 80)     41040     dropout_1[0][0]
Total params: 2,140,752
Trainable params: 2,139,728
Non-trainable params: 1,024
The size of the input to the BatchNormalization (BN) layer is 512. According to the Keras documentation, the output shape of the BN layer is the same as its input shape, which is 512.
So how can the number of parameters associated with the BN layer be 2048?
These 2048 parameters are in fact [gamma weights, beta weights, moving_mean (non-trainable), moving_variance (non-trainable)], each having 512 elements (the size of the BN layer's input).
The batch normalization in Keras implements this paper.
As you can read there, to make batch normalization work during training, it needs to keep track of the distribution of each normalized dimension. To do so, since you are in mode=0 by default, it keeps 4 parameters per feature of the previous layer. Those parameters make sure that the information is properly propagated and backpropagated.
So 4*512 = 2048, which should answer your question.
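The arithmetic above, and the trainable/non-trainable split in the summary, can be reproduced with a few lines of plain Python. The dictionary below simply records which of the four BN variables are trainable, as stated in the answer; the variable names in the code are chosen for this illustration.

```python
features = 512  # size of the BN layer's input

# True = trainable, False = non-trainable (as listed in the answer)
bn_variables = {
    "gamma": True,
    "beta": True,
    "moving_mean": False,
    "moving_variance": False,
}

bn_total = features * len(bn_variables)
bn_trainable = features * sum(1 for t in bn_variables.values() if t)
bn_non_trainable = bn_total - bn_trainable

# Dense layers: kernel plus bias
dense_1 = 4096 * 512 + 512
dense_2 = 512 * 80 + 80

total = dense_1 + bn_total + dense_2
trainable = total - bn_non_trainable

print(bn_total, bn_non_trainable, total, trainable)
```

The 1,024 non-trainable parameters in the summary are therefore the moving mean and moving variance, which are updated from batch statistics rather than by gradient descent.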
