Ensemble of CNN and RNN models in Keras - NLP

I am trying to implement the model from the paper "Ensemble Application of Convolutional and Recurrent Neural Networks for Multi-label Text Categorization" in Keras.
The model looks like the following (taken from the paper).
My code so far:
document_input = Input(shape=(None,), dtype='int32')
embedding_layer = Embedding(vocab_size, WORD_EMB_SIZE, weights=[initial_embeddings],
                            input_length=DOC_SEQ_LEN, trainable=True)
convs = []
filter_sizes = [2, 3, 4, 5]
doc_embedding = embedding_layer(document_input)
for filter_size in filter_sizes:
    l_conv = Conv1D(filters=256, kernel_size=filter_size, padding='same', activation='relu')(doc_embedding)
    l_pool = MaxPooling1D(filter_size)(l_conv)
    convs.append(l_pool)
l_merge = Concatenate(axis=1)(convs)
l_flat = Flatten()(l_merge)
l_dense = Dense(100, activation='relu')(l_flat)
l_dense_3d = Reshape((1, int(l_dense.shape[1])))(l_dense)

gene_variation_input = Input(shape=(None,), dtype='int32')
gene_variation_embedding = embedding_layer(gene_variation_input)
rnn_layer = LSTM(100, return_sequences=False, stateful=True)(gene_variation_embedding, initial_state=[l_dense_3d])
l_flat = Flatten()(rnn_layer)
output_layer = Dense(9, activation='softmax')(l_flat)
model = Model(inputs=[document_input, gene_variation_input], outputs=[output_layer])
I don't know whether I am setting up the text feature vector from the above diagram correctly. When I try it, I get this error:
ValueError: Layer lstm_9 expects 3 inputs, but it received 2 input tensors. Input received: [<tf.Tensor 'embedding_10_1/Gather:0' shape=(?, ?, 200) dtype=float32>, <tf.Tensor 'reshape_9/Reshape:0' shape=(?, 1, 100) dtype=float32>]
I did follow the "Note on specifying the initial state of RNNs" section in the Keras documentation and code.
Any help appreciated.
Update:
After the suggestion below and some more reading of the code, the model now looks like this:
embedding_layer = Embedding(vocab_size, WORD_EMB_SIZE, weights=[initial_embeddings], trainable=True)
document_input = Input(shape=(DOC_SEQ_LEN,), batch_shape=(BATCH_SIZE, DOC_SEQ_LEN), dtype='int32')
doc_embedding = embedding_layer(document_input)
convs = []
filter_sizes = [2, 3, 4, 5]
for filter_size in filter_sizes:
    l_conv = Conv1D(filters=256, kernel_size=filter_size, padding='same', activation='relu')(doc_embedding)
    l_pool = MaxPooling1D(filter_size)(l_conv)
    convs.append(l_pool)
l_merge = Concatenate(axis=1)(convs)
l_flat = Flatten()(l_merge)
l_dense = Dense(100, activation='relu')(l_flat)

gene_variation_input = Input(shape=(GENE_VARIATION_SEQ_LEN,), batch_shape=(BATCH_SIZE, GENE_VARIATION_SEQ_LEN), dtype='int32')
gene_variation_embedding = embedding_layer(gene_variation_input)
rnn_layer = LSTM(100, return_sequences=False,
                 batch_input_shape=(BATCH_SIZE, GENE_VARIATION_SEQ_LEN, WORD_EMB_SIZE),
                 stateful=False)(gene_variation_embedding, initial_state=[l_dense, l_dense])
output_layer = Dense(9, activation='softmax')(rnn_layer)
model = Model(inputs=[document_input, gene_variation_input], outputs=[output_layer])
Model summary:
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
input_8 (InputLayer) (32, 9) 0
____________________________________________________________________________________________________
input_7 (InputLayer) (32, 4000) 0
____________________________________________________________________________________________________
embedding_6 (Embedding) multiple 73764400 input_7[0][0]
input_8[0][0]
____________________________________________________________________________________________________
conv1d_13 (Conv1D) (32, 4000, 256) 102656 embedding_6[0][0]
____________________________________________________________________________________________________
conv1d_14 (Conv1D) (32, 4000, 256) 153856 embedding_6[0][0]
____________________________________________________________________________________________________
conv1d_15 (Conv1D) (32, 4000, 256) 205056 embedding_6[0][0]
____________________________________________________________________________________________________
conv1d_16 (Conv1D) (32, 4000, 256) 256256 embedding_6[0][0]
____________________________________________________________________________________________________
max_pooling1d_13 (MaxPooling1D) (32, 2000, 256) 0 conv1d_13[0][0]
____________________________________________________________________________________________________
max_pooling1d_14 (MaxPooling1D) (32, 1333, 256) 0 conv1d_14[0][0]
____________________________________________________________________________________________________
max_pooling1d_15 (MaxPooling1D) (32, 1000, 256) 0 conv1d_15[0][0]
____________________________________________________________________________________________________
max_pooling1d_16 (MaxPooling1D) (32, 800, 256) 0 conv1d_16[0][0]
____________________________________________________________________________________________________
concatenate_4 (Concatenate) (32, 5133, 256) 0 max_pooling1d_13[0][0]
max_pooling1d_14[0][0]
max_pooling1d_15[0][0]
max_pooling1d_16[0][0]
____________________________________________________________________________________________________
flatten_4 (Flatten) (32, 1314048) 0 concatenate_4[0][0]
____________________________________________________________________________________________________
dense_6 (Dense) (32, 100) 131404900 flatten_4[0][0]
____________________________________________________________________________________________________
lstm_4 (LSTM) (32, 100) 120400 embedding_6[1][0]
dense_6[0][0]
dense_6[0][0]
____________________________________________________________________________________________________
dense_7 (Dense) (32, 9) 909 lstm_4[0][0]
====================================================================================================
Total params: 206,008,433
Trainable params: 206,008,433
Non-trainable params: 0
____________________________________________________________________________________________________

An LSTM has 2 hidden states, but you are providing only 1 initial state. You could do one of the following:
Replace the LSTM with an RNN that has only 1 hidden state, such as GRU:
rnn_layer = GRU(100, return_sequences=False, stateful=True)(gene_variation_embedding, initial_state=[l_dense_3d])
Or pass zeros as the initial state for the second hidden state of the LSTM:
zeros = Lambda(lambda x: K.zeros_like(x), output_shape=lambda s: s)(l_dense_3d)
rnn_layer = LSTM(100, return_sequences=False, stateful=True)(gene_variation_embedding, initial_state=[l_dense_3d, zeros])
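For concreteness, here is a minimal self-contained sketch of this state-seeding pattern, with both LSTM states driven from a Dense feature vector (toy dimensions and names assumed for illustration; not the poster's full model):
from keras.layers import Input, Dense, Embedding, LSTM
from keras.models import Model

features = Input(shape=(300,))                    # e.g. flattened CNN features
h0 = Dense(100, activation='tanh')(features)      # initial hidden state h
c0 = Dense(100, activation='tanh')(features)      # initial cell state c
tokens = Input(shape=(20,), dtype='int32')        # toy sequence length
emb = Embedding(1000, 64)(tokens)                 # toy vocab/embedding sizes
rnn_out = LSTM(100)(emb, initial_state=[h0, c0])  # an LSTM takes both h and c
output = Dense(9, activation='softmax')(rnn_out)
model = Model([features, tokens], output)
Note that the Dense outputs and the LSTM units must agree (100 here), since each state tensor has shape (batch, units).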

Related

Keras with Hierarchical LSTM

I have a problem with a hierarchical LSTM in Keras. It works well when the data has 2 dimensions, but when I changed it to three dimensions it stopped working. My data has shape (25, 10, 2).
I want to build a hierarchical LSTM: the first-layer LSTM converts each item of shape (10, 2) into a vector, and the 25 resulting vectors are fed into the second-layer LSTM. The input to the first-layer LSTM is (10, 2). I used two embeddings and multiplied them. I'd appreciate any help.
def H_LSTM():
    single_input = Input(shape=(10, 2), dtype='int32')
    in_sentence = Lambda(lambda x: single_input[:, :, 0:1], output_shape=(maxlen,))(single_input)
    in_sentence = Reshape((maxlen,), input_shape=(maxlen, 1))(in_sentence)
    in_drug = Lambda(lambda x: single_input[:, :, 1:2], output_shape=(maxlen,))(single_input)
    in_drug = Reshape((maxlen,), input_shape=(maxlen, 1))(in_drug)
    embedded_sentence = Embedding(len(word_index) + 1, embedding_dim, weights=[embedding_matrix],
                                  input_length=maxlen, trainable=True, mask_zero=False)(in_sentence)
    embedded_drug = Embedding(len(word_index) + 1, embedding_dim, weights=[embedding_matrix],
                              input_length=maxlen, trainable=True, mask_zero=False)(in_drug)
    embedded_sequences = Multiply()([embedded_sentence, embedded_drug])
    lstm_sentence = LSTM(100)(embedded_sequences)
    encoded_model = Model(inputs=single_input, outputs=lstm_sentence)

    sequence_input = Input(shape=(25, 10, 2), dtype='int32')
    seq_encoded = TimeDistributed(encoded_model)(sequence_input)
    seq_encoded = Dropout(0.2)(seq_encoded)
    # Encode entire sentence
    seq_encoded = LSTM(100)(seq_encoded)
    # Prediction
    prediction = Dense(2, activation='softmax')(seq_encoded)
    model = Model(inputs=sequence_input, outputs=prediction)
    model.compile(loss='categorical_crossentropy',
                  optimizer='rmsprop',
                  metrics=['acc'])
    return model
Model Summary:
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_3 (InputLayer) (None, 10, 2) 0
__________________________________________________________________________________________________
lambda_3 (Lambda) (None, 10) 0 input_3[0][0]
__________________________________________________________________________________________________
lambda_4 (Lambda) (None, 10) 0 input_3[0][0]
__________________________________________________________________________________________________
reshape_3 (Reshape) (None, 10) 0 lambda_3[0][0]
__________________________________________________________________________________________________
reshape_4 (Reshape) (None, 10) 0 lambda_4[0][0]
__________________________________________________________________________________________________
embedding_3 (Embedding) (None, 10, 128) 4895744 reshape_3[0][0]
__________________________________________________________________________________________________
embedding_4 (Embedding) (None, 10, 128) 4895744 reshape_4[0][0]
__________________________________________________________________________________________________
multiply_2 (Multiply) (None, 10, 128) 0 embedding_3[0][0]
embedding_4[0][0]
__________________________________________________________________________________________________
lstm_3 (LSTM) (None, 100) 91600 multiply_2[0][0]
==================================================================================================
Total params: 9,883,088
Trainable params: 9,883,088
Non-trainable params: 0
__________________________________________________________________________________________________
None
Model: "model_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_4 (InputLayer) (None, 25, 10, 2) 0
_________________________________________________________________
time_distributed_2 (TimeDist (None, 25, 100) 9883088
_________________________________________________________________
dropout_2 (Dropout) (None, 25, 100) 0
_________________________________________________________________
lstm_4 (LSTM) (None, 100) 80400
_________________________________________________________________
dense_2 (Dense) (None, 2) 202
=================================================================
Total params: 9,963,690
Trainable params: 9,963,690
Non-trainable params: 0
Error Message:
InvalidArgumentError: You must feed a value for placeholder tensor 'input_3' with dtype int32 and shape [?,10,2]
[[node input_3 (defined at D:\Users\Jinhe.Shi\AppData\Local\Continuum\anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py:3009) ]] [Op:__inference_keras_scratch_graph_6214]
Function call stack:
keras_scratch_graph
Update: the framework is shown in the attached figure; the difference is that there is no attention layer, and I added two embeddings in the lower-layer LSTM.
Model fit:
The error happens during the model fitting.
model2 = H_LSTM()
print("model fitting - Hierarchical network")
model2.fit(X_train, Y_train, epochs=3, batch_size=100, validation_data=(X_test, Y_test))
The input data looks like the attached screenshot.
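One thing worth checking here, as an observation from the code rather than a confirmed cause: each Lambda closes over single_input instead of its own argument x, so inside TimeDistributed the sliced tensors stay tied to the outer placeholder input_3, which is consistent with the "must feed a value for placeholder tensor 'input_3'" error. A minimal sketch of slicing through the lambda argument instead:
# hypothetical fix sketch: slice the tensor the Lambda actually receives (x),
# not the outer single_input captured by the closure
in_sentence = Lambda(lambda x: x[:, :, 0], output_shape=(10,))(single_input)
in_drug = Lambda(lambda x: x[:, :, 1], output_shape=(10,))(single_input)
# x[:, :, 0] already drops the last axis, so the Reshape layers become unnecessary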

How to train a siamese neural network for image matching?

I need to identify if two fingerprints (from id card and sensor) match or not. Below some examples from my database (3000 pairs of images):
Example of matching images
Example of non-matching images
I am trying to train a siamese network which receives a pair of images, and whose output is [1, 0] if they don't match and [0, 1] if they match. I created my model with Keras:
image_left = Input(shape=(200, 200, 1))
image_right = Input(shape=(200, 200, 1))
vector_left = conv_base(image_left)
vector_right = conv_base(image_right)
merged_features = concatenate([vector_left, vector_right], axis=-1)
fc1 = Dense(64, activation='relu')(merged_features)
fc1 = Dropout(0.2)(fc1)
# # fc2 = Dense(128, activation='relu')(fc1)
pred = Dense(2, activation='softmax')(fc1)
model = Model(inputs=[image_left, image_right], outputs=pred)
where conv_base is a convolutional architecture. I have tried ResNet, LeNet, MobileNetV2, and NASNet from keras.applications, but none of them work.
conv_base = NASNetMobile(weights=None,
                         include_top=True,
                         classes=256)
My model summary is similar as shown below (depending on corresponding network used):
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_2 (InputLayer) (None, 200, 200, 1) 0
__________________________________________________________________________________________________
input_3 (InputLayer) (None, 200, 200, 1) 0
__________________________________________________________________________________________________
NASNet (Model) (None, 256) 4539732 input_2[0][0]
input_3[0][0]
__________________________________________________________________________________________________
concatenate_5 (Concatenate) (None, 512) 0 NASNet[1][0]
NASNet[2][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 64) 32832 concatenate_5[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout) (None, 64) 0 dense_1[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 2) 130 dropout_1[0][0]
==================================================================================================
Total params: 4,572,694
Trainable params: 4,535,956
Non-trainable params: 36,738
In addition to changing the convolutional architecture, I have tried using pre-trained weights, setting all layers as trainable, setting only the last convolutional layers as trainable, data augmentation, the categorical_crossentropy and contrastive_loss loss functions, and changing the learning rate, but they all show the same behavior: training and validation accuracy always stay at 0.5.
Does anybody have an idea about what I am missing/doing wrong?
Thank you.
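For reference, the contrastive_loss mentioned above is usually paired with a distance output rather than a two-way softmax head. A minimal sketch in Keras (the standard Hadsell et al. formulation; vector_left/vector_right are the branch outputs from the model above, and the margin value is an assumption):
from keras import backend as K
from keras.layers import Lambda
from keras.models import Model

def euclidean_distance(tensors):
    a, b = tensors
    return K.sqrt(K.maximum(K.sum(K.square(a - b), axis=1, keepdims=True), K.epsilon()))

def contrastive_loss(y_true, y_pred, margin=1.0):
    # y_true is 1 for matching pairs and 0 for non-matching pairs
    return K.mean(y_true * K.square(y_pred) +
                  (1.0 - y_true) * K.square(K.maximum(margin - y_pred, 0.0)))

distance = Lambda(euclidean_distance)([vector_left, vector_right])
siamese = Model(inputs=[image_left, image_right], outputs=distance)
siamese.compile(optimizer='adam', loss=contrastive_loss)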

convolutional autoencoder to analyse long 1-D sequences

I have a dataset of 1-D vectors, each 3001 values long. I have used a simple convolutional network to perform binary classification on these sequences:
shape=train_X.shape[1:]
model = Sequential()
model.add(Conv1D(75,3,strides=1, input_shape=shape, activation='relu'))
model.add(MaxPooling1D(3))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
The network achieves ~60% accuracy.
I would now like to create an autoencoder to discover the regular pattern that distinguishes samples labeled '1' from those labeled '0', i.e. to generate an exemplary sequence that is representative of the '1'-labeled samples.
Based on previous blogs and posts I have tried to put together an autoencoder that can achieve this:
input_sig = Input(batch_shape=(None,3001,1))
x = Conv1D(64,3, activation='relu', padding='same')(input_sig)
x1 = MaxPooling1D(2)(x)
x2 = Conv1D(32,3, activation='relu', padding='same')(x1)
x3 = MaxPooling1D(2)(x2)
flat = Flatten()(x3)
encoded = Dense(1,activation = 'relu')(flat)
x2_ = Conv1D(32, 3, activation='relu', padding='same')(x3)
x1_ = UpSampling1D(2)(x2_)
x_ = Conv1D(64, 3, activation='relu', padding='same')(x1_)
upsamp = UpSampling1D(2)(x_)
decoded = Conv1D(1, 3, activation='sigmoid', padding='same')(upsamp)
autoencoder = Model(input_sig, decoded)
autoencoder.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
This looks as follows:
autoencoder.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_57 (InputLayer) (None, 3001, 1) 0
_________________________________________________________________
conv1d_233 (Conv1D) (None, 3001, 64) 256
_________________________________________________________________
max_pooling1d_115 (MaxPoolin (None, 1500, 64) 0
_________________________________________________________________
conv1d_234 (Conv1D) (None, 1500, 32) 6176
_________________________________________________________________
max_pooling1d_116 (MaxPoolin (None, 750, 32) 0
_________________________________________________________________
conv1d_235 (Conv1D) (None, 750, 32) 3104
_________________________________________________________________
up_sampling1d_106 (UpSamplin (None, 1500, 32) 0
_________________________________________________________________
conv1d_236 (Conv1D) (None, 1500, 64) 6208
_________________________________________________________________
up_sampling1d_107 (UpSamplin (None, 3000, 64) 0
_________________________________________________________________
conv1d_237 (Conv1D) (None, 3000, 64) 12352
=================================================================
Total params: 28,096
Trainable params: 28,096
Non-trainable params: 0
Everything seems to be going smoothly until I train the network:
autoencoder.fit(train_X,train_y,epochs=3,batch_size=100,validation_data=(test_X, test_y))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/bsxcto/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1630, in fit
batch_size=batch_size)
File "/home/bsxcto/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1480, in _standardize_user_data
exception_prefix='target')
File "/home/bsxcto/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 113, in _standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking target: expected conv1d_237 to have 3 dimensions, but got array with shape (32318, 1)
Hence I have tried adding a 'Reshape' layer before the last one.
upsamp = UpSampling1D(2)(x_)
flat = Flatten()(upsamp)
reshaped = Reshape((3000,64))(flat)
decoded = Conv1D(1, 3, activation='sigmoid', padding='same')(reshaped)
in which case the network looks as follows:
autoencoder.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_59 (InputLayer) (None, 3001, 1) 0
_________________________________________________________________
conv1d_243 (Conv1D) (None, 3001, 64) 256
_________________________________________________________________
max_pooling1d_119 (MaxPoolin (None, 1500, 64) 0
_________________________________________________________________
conv1d_244 (Conv1D) (None, 1500, 32) 6176
_________________________________________________________________
max_pooling1d_120 (MaxPoolin (None, 750, 32) 0
_________________________________________________________________
conv1d_245 (Conv1D) (None, 750, 32) 3104
_________________________________________________________________
up_sampling1d_110 (UpSamplin (None, 1500, 32) 0
_________________________________________________________________
conv1d_246 (Conv1D) (None, 1500, 64) 6208
_________________________________________________________________
up_sampling1d_111 (UpSamplin (None, 3000, 64) 0
_________________________________________________________________
flatten_111 (Flatten) (None, 192000) 0
_________________________________________________________________
reshape_45 (Reshape) (None, 3000, 64) 0
_________________________________________________________________
conv1d_247 (Conv1D) (None, 3000, 1) 193
=================================================================
Total params: 15,937
Trainable params: 15,937
Non-trainable params: 0
But the same error results:
Error when checking target: expected conv1d_247 to have 3 dimensions, but got array with shape (32318, 1)
My questions are:
1) Is this a feasible way of finding the pattern that distinguishes samples with label '1' from those with label '0'?
2) How can I make the final layer accept the output of the last upsampling layer?
One approach: train the classifier first, then freeze its trained convolutional layers under a small decoder and fit that autoencoder with the input as its own target:
original = Sequential()
original.add(Conv1D(75, repeat_length, strides=stride, input_shape=shape, activation='relu', padding='same'))
original.add(MaxPooling1D(repeat_length))
original.add(Flatten())
original.add(Dense(1, activation='sigmoid'))
original.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

calculate_roc(original......)

mod = Sequential()
mod.add(original.layers[0])
mod.add(original.layers[1])
mod.add(Conv1D(75, window, activation='relu', padding='same'))
mod.add(UpSampling1D(window))
mod.add(Conv1D(1, 1, activation='sigmoid', padding='same'))
# freeze the already-trained feature extractor before compiling, so the flags take effect
mod.layers[0].trainable = False
mod.layers[1].trainable = False
mod.compile(optimizer='adam', loss='mse', metrics=['accuracy'])

mod.fit(train_X, train_X, epochs=1, batch_size=100)
decoded_imgs = mod.predict(test_X)
x = decoded_imgs.mean(axis=0)
plt.plot(x)

Issue Merging Two Layers in an LSTM Seq2Seq Model for Q&A Use Case

I'm trying to build a Q&A model based on the bAbI Task 8 example, and I am having trouble merging two of my input layers into one layer. Here is my current model architecture:
story_input = Input(shape=(story_maxlen, vocab_size), name='story_input')
story_input_proc = Embedding(vocab_size, latent_dim, name='story_input_embed',
                             input_length=story_maxlen)(story_input)
story_input_proc = Reshape((latent_dim, story_maxlen), name='story_input_reshape')(story_input_proc)

query_input = Input(shape=(query_maxlen, vocab_size), name='query_input')
query_input_proc = Embedding(vocab_size, latent_dim, name='query_input_embed',
                             input_length=query_maxlen)(query_input)
query_input_proc = Reshape((latent_dim, query_maxlen), name='query_input_reshape')(query_input_proc)

story_query = dot([story_input_proc, query_input_proc], axes=(1, 1), name='story_query_merge')

encoder = LSTM(latent_dim, return_state=True, name='encoder')
encoder_output, state_h, state_c = encoder(story_query)
encoder_output = RepeatVector(3, name='encoder_3dim')(encoder_output)
encoder_states = [state_h, state_c]

decoder = LSTM(latent_dim, return_sequences=True, name='decoder')(encoder_output, initial_state=encoder_states)
answer_output = Dense(vocab_size, activation='softmax', name='answer_output')(decoder)

model = Model([story_input, query_input], answer_output)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
and here is the output of model.summary()
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
story_input (InputLayer) (None, 358, 38) 0
__________________________________________________________________________________________________
query_input (InputLayer) (None, 5, 38) 0
__________________________________________________________________________________________________
story_input_embed (Embedding) (None, 358, 64) 2432 story_input[0][0]
__________________________________________________________________________________________________
query_input_embed (Embedding) (None, 5, 64) 2432 query_input[0][0]
__________________________________________________________________________________________________
story_input_reshape (Reshape) (None, 64, 358) 0 story_input_embed[0][0]
__________________________________________________________________________________________________
query_input_reshape (Reshape) (None, 64, 5) 0 query_input_embed[0][0]
__________________________________________________________________________________________________
story_query_merge (Dot) (None, 358, 5) 0 story_input_reshape[0][0]
query_input_reshape[0][0]
__________________________________________________________________________________________________
encoder (LSTM) [(None, 64), (None, 17920 story_query_merge[0][0]
__________________________________________________________________________________________________
encoder_3dim (RepeatVector) (None, 3, 64) 0 encoder[0][0]
__________________________________________________________________________________________________
decoder (LSTM) (None, 3, 64) 33024 encoder_3dim[0][0]
encoder[0][1]
encoder[0][2]
__________________________________________________________________________________________________
answer_output (Dense) (None, 3, 38) 2470 decoder[0][0]
==================================================================================================
Total params: 58,278
Trainable params: 58,278
Non-trainable params: 0
__________________________________________________________________________________________________
where vocab_size = 38, story_maxlen = 358, query_maxlen = 5, latent_dim = 64, and the batch size = 64.
When I try to train this model I get the error:
Input to reshape is a tensor with 778240 values, but the requested shape has 20480
Here is the formula for those two values:
input_to_reshape = batch_size * latent_dim * query_maxlen * vocab_size
requested_shape = batch_size * latent_dim * query_maxlen
Where I'm At
I believe the error message is saying the shape of the tensor inputted into the query_input_reshape layer is (?, 5, 38, 64) but it is expecting a tensor of shape (?, 5, 64) (see formulas above), but I could be wrong on that.
When I change the target_shape input of Reshape to be 3D (i.e. Reshape((latent_dim,query_maxlen,vocab_size)) I get the error total size of new array must be unchanged, which doesn't make any sense to me because the input is 3D. You would think that Reshape((latent_dim,query_maxlen)) would give me that error because it'd be changing a 3D tensor into a 2D tensor, but it compiles fine, so I've no clue what's going on there.
The only reason I'm using Reshape is because I need to merge the two tensors as an input into the LSTM encoder. When I try to get rid of the Reshape layers I just get dimension mismatch errors when I try to compile the model. The model architecture above at least compiles but I can't train it.
Can someone please help me figure out how I can merge the story_input and query_input layers? Thanks!
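One observation from the shape arithmetic (an assumption, not confirmed by the poster): 778240 = 64 * 5 * 38 * 64, which is exactly what you get when Embedding is fed one-hot 3D inputs and returns a 4D tensor (None, 5, 38, 64). If the inputs are integer token ids instead, the embeddings come out 3D and the two branches can be merged with dot directly, with no Reshape needed. A minimal sketch:
story_input = Input(shape=(story_maxlen,), dtype='int32')      # integer token ids
story_emb = Embedding(vocab_size, latent_dim)(story_input)     # (None, 358, 64)
query_input = Input(shape=(query_maxlen,), dtype='int32')
query_emb = Embedding(vocab_size, latent_dim)(query_input)     # (None, 5, 64)
story_query = dot([story_emb, query_emb], axes=(2, 2))         # (None, 358, 5)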

Prepending Downsample layer to Resnet50 Pretrained Model

I am using Keras 1.1.1 on Windows 7 with the TensorFlow backend.
I am trying to prepend the stock ResNet50 pretrained model with an image downsampler. Below is my code.
from keras.applications.resnet50 import ResNet50
import keras.layers
# this could also be the output of a different Keras model or layer
input = keras.layers.Input(shape=(400, 400, 1)) # this assumes K.image_dim_ordering() == 'tf'
x1 = keras.layers.AveragePooling2D(pool_size=(2,2))(input)
x2 = keras.layers.Flatten()(x1)
x3 = keras.layers.RepeatVector(3)(x2)
x4 = keras.layers.Reshape((200, 200, 3))(x3)
x5 = keras.layers.ZeroPadding2D(padding=(12,12))(x4)
m = keras.models.Model(input, x5)
model = ResNet50(input_tensor=m.output, weights='imagenet', include_top=False)
but I get an error which I am unsure how to fix.
builtins.Exception: Graph disconnected: cannot obtain value for tensor
Output("input_2:0", shape=(?, 400, 400, 1), dtype=float32) at layer
"input_2". The following previous layers were accessed without issue:
[]
You can use either the functional API or the Sequential approach to solve this. See a working example of both below:
from keras.applications.resnet50 import ResNet50
from keras.models import Sequential, Model
from keras.layers import AveragePooling2D, Flatten, RepeatVector, Reshape, ZeroPadding2D, Input, Dense
pretrained = ResNet50(input_shape=(224, 224, 3), weights='imagenet', include_top=False)
# Sequential method
model_1 = Sequential()
model_1.add(AveragePooling2D(pool_size=(2,2),input_shape=(400, 400, 1)))
model_1.add(Flatten())
model_1.add(RepeatVector(3))
model_1.add(Reshape((200, 200, 3)))
model_1.add(ZeroPadding2D(padding=(12,12)))
model_1.add(pretrained)
model_1.add(Dense(1))
# functional API method
input = Input(shape=(400, 400, 1))
x = AveragePooling2D(pool_size=(2,2),input_shape=(400, 400, 1))(input)
x = Flatten()(x)
x = RepeatVector(3)(x)
x = Reshape((200, 200, 3))(x)
x = ZeroPadding2D(padding=(12,12))(x)
x = pretrained(x)
preds = Dense(1)(x)
model_2 = Model(input,preds)
model_1.summary()
model_2.summary()
The summaries (these were generated with Xception in place of ResNet50):
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
average_pooling2d_1 (Average (None, 200, 200, 1) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 40000) 0
_________________________________________________________________
repeat_vector_1 (RepeatVecto (None, 3, 40000) 0
_________________________________________________________________
reshape_1 (Reshape) (None, 200, 200, 3) 0
_________________________________________________________________
zero_padding2d_1 (ZeroPaddin (None, 224, 224, 3) 0
_________________________________________________________________
xception (Model) (None, 7, 7, 2048) 20861480
_________________________________________________________________
dense_1 (Dense) (None, 7, 7, 1) 2049
=================================================================
Total params: 20,863,529
Trainable params: 20,809,001
Non-trainable params: 54,528
_________________________________________________________________
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) (None, 400, 400, 1) 0
_________________________________________________________________
average_pooling2d_2 (Average (None, 200, 200, 1) 0
_________________________________________________________________
flatten_2 (Flatten) (None, 40000) 0
_________________________________________________________________
repeat_vector_2 (RepeatVecto (None, 3, 40000) 0
_________________________________________________________________
reshape_2 (Reshape) (None, 200, 200, 3) 0
_________________________________________________________________
zero_padding2d_2 (ZeroPaddin (None, 224, 224, 3) 0
_________________________________________________________________
xception (Model) (None, 7, 7, 2048) 20861480
_________________________________________________________________
dense_2 (Dense) (None, 7, 7, 1) 2049
=================================================================
Total params: 20,863,529
Trainable params: 20,809,001
Non-trainable params: 54,528
_________________________________________________________________
Both approaches work fine. If you plan on freezing the pretrained model and letting the pre/post layers learn, and afterward fine-tuning the model, the approach I found to work goes like so:
# given the same resnet model as before...
model = load_model('modelname.h5')
# pull out the nested pretrained model
nested_model = model.layers[5]  # assuming the pretrained model is the 5th layer
# loop over the nested model's layers to allow training
for l in nested_model.layers:
    l.trainable = True
# insert the trainable pretrained model back into the original
model.layers[5] = nested_model
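One caveat worth adding (standard Keras behavior, not spelled out above): trainable flags are only applied when a model is compiled, so recompile after flipping them and before fitting. A minimal sketch, with placeholder optimizer and loss:
# recompile so the updated trainable flags actually take effect
model.compile(optimizer='adam', loss='categorical_crossentropy')
# ...then fine-tune with model.fit as usual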
