Add LSTM layer after Conv2D layers and add some other inputs - keras

I'm working on a racing game that uses reinforcement learning, and I'm running into an issue while implementing the neural network used to train the model. I found some examples that use a CNN, but it seems that adding an LSTM layer can improve the model's performance. I found the following example:
https://team.inria.fr/rits/files/2018/02/ICRA18_EndToEndDriving_CameraReady.pdf
The network I need to implement
The problem is that I'm not sure how to implement the LSTM layer here. How can I feed the following inputs to the LSTM layer?
Processed image output
Current speed
Last action
Here is the code I'm currently using. I want to add the LSTM layer after the Conv2D layers.
from keras.initializers import random_normal
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dropout, Dense
from keras.models import Model
from keras.optimizers import Adam

self.__nb_actions = 28
self.__gamma = 0.99

# Define the model
activation = 'relu'
pic_input = Input(shape=(59, 255, 3))
img_stack = Conv2D(16, (3, 3), name='convolution0', padding='same', activation=activation, trainable=train_conv_layers)(pic_input)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Conv2D(32, (3, 3), activation=activation, padding='same', name='convolution1', trainable=train_conv_layers)(img_stack)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Conv2D(32, (3, 3), activation=activation, padding='same', name='convolution2', trainable=train_conv_layers)(img_stack)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Flatten()(img_stack)
img_stack = Dropout(0.2)(img_stack)
img_stack = Dense(128, name='rl_dense', kernel_initializer=random_normal(stddev=0.01))(img_stack)
img_stack = Dropout(0.2)(img_stack)
output = Dense(self.__nb_actions, name='rl_output', kernel_initializer=random_normal(stddev=0.01))(img_stack)

opt = Adam()
self.__action_model = Model(inputs=[pic_input], outputs=output)
self.__action_model.compile(optimizer=opt, loss='mean_squared_error')
self.__action_model.summary()
Thanks

There are various ways to do that. One is to reshape the conv output into a sequence and feed it to an LSTM layer. Here is an explained example covering several approaches: Shaping data for LSTM, and feeding output of dense layers to LSTM.
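For illustration, here is a minimal sketch (untested) of the CNN-LSTM pattern from the linked paper: a history of frames is processed by the same conv stack via TimeDistributed, the per-timestep speed and last action are concatenated onto the image features, and an LSTM summarises the sequence. The history length time_steps is an assumption you would have to choose; the layer sizes mirror the question's network.
from keras.layers import (Input, Conv2D, MaxPooling2D, Flatten, Dense,
                          Dropout, LSTM, TimeDistributed, Concatenate)
from keras.models import Model

time_steps = 4   # assumed history length (hypothetical)
nb_actions = 28

# Inputs: a sequence of frames plus per-timestep scalars
pic_input = Input(shape=(time_steps, 59, 255, 3))
speed_input = Input(shape=(time_steps, 1))
action_input = Input(shape=(time_steps, 1))

# Apply the same conv stack to every frame
x = TimeDistributed(Conv2D(16, (3, 3), padding='same', activation='relu'))(pic_input)
x = TimeDistributed(MaxPooling2D((2, 2)))(x)
x = TimeDistributed(Conv2D(32, (3, 3), padding='same', activation='relu'))(x)
x = TimeDistributed(MaxPooling2D((2, 2)))(x)
x = TimeDistributed(Flatten())(x)

# Concatenate image features with speed and last action at each timestep
x = Concatenate()([x, speed_input, action_input])
x = LSTM(64)(x)          # summarises the sequence into one vector
x = Dropout(0.2)(x)
output = Dense(nb_actions)(x)

model = Model(inputs=[pic_input, speed_input, action_input], outputs=output)
model.compile(optimizer='adam', loss='mean_squared_error')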

Related

Getting the output from a specific layer in a CNN

I'm trying to understand how to work with CNNs. I would like to be able to extract the "encoded" version of an image from this model; the encoded version should be the output of the Dense(4096) layer.
def get_siamese_model(input_shape):
    # Define the tensors for the two input images
    left_input = Input(input_shape)
    right_input = Input(input_shape)

    # Convolutional Neural Network
    model = Sequential()
    model.add(Conv2D(64, (10, 10), activation='relu', input_shape=input_shape,
                     kernel_initializer=initialize_weights, kernel_regularizer=l2(2e-4)))
    model.add(MaxPooling2D())
    model.add(Conv2D(128, (7, 7), activation='relu',
                     kernel_initializer=initialize_weights,
                     bias_initializer=initialize_bias, kernel_regularizer=l2(2e-4)))
    model.add(MaxPooling2D())
    model.add(Conv2D(128, (4, 4), activation='relu', kernel_initializer=initialize_weights,
                     bias_initializer=initialize_bias, kernel_regularizer=l2(2e-4)))
    model.add(MaxPooling2D())
    model.add(Conv2D(256, (4, 4), activation='relu', kernel_initializer=initialize_weights,
                     bias_initializer=initialize_bias, kernel_regularizer=l2(2e-4)))
    model.add(Flatten())
    model.add(Dense(4096, activation='sigmoid',
                    kernel_regularizer=l2(1e-3),
                    kernel_initializer=initialize_weights, bias_initializer=initialize_bias))

    # Generate the encodings (feature vectors) for the two images
    encoded_l = model(left_input)
    encoded_r = model(right_input)

    # Add a customized layer to compute the absolute difference between the encodings
    L1_layer = Lambda(lambda tensors: K.abs(tensors[0] - tensors[1]))
    L1_distance = L1_layer([encoded_l, encoded_r])

    # Add a dense layer with a sigmoid unit to generate the similarity score
    prediction = Dense(1, activation='sigmoid', bias_initializer=initialize_bias)(L1_distance)

    # Connect the inputs with the outputs
    siamese_net = Model(inputs=[left_input, right_input], outputs=prediction)

    # Return the model
    return siamese_net
Moreover, can I give as input only one image and get the encoding of that image?
Many thanks!
Edit: I have tried this:
layer_name = 'sequential_3'
model2 = Model(inputs=model.input, outputs=model.get_layer(layer_name).output)
but I get this error:
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 105, 105, 1), dtype=tf.float32, name='conv2d_12_input'), name='conv2d_12_input', description="created by layer 'conv2d_12_input'") at layer "conv2d_12". The following previous layers were accessed without issue: []
And I don't know how to fix it.
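A likely fix, sketched below (untested): the graph is disconnected because model2 pairs the outer siamese inputs (model.input) with a tensor that was created inside the nested Sequential. Since that nested Sequential is itself a model, you can retrieve it by name and call it directly on a single image:
import numpy as np

# The CNN built inside get_siamese_model is a nested Sequential, which is
# itself a model; retrieve it by name ('sequential_3' is assumed here;
# check siamese_net.summary() for the actual name).
encoder = siamese_net.get_layer('sequential_3')

# Encode a single image of shape (105, 105, 1) by adding a batch dimension
encoding = encoder.predict(img[np.newaxis, ...])   # shape: (1, 4096)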

Stacking fully connected layers on top of two autoencoders for classification

I'm training autoencoders on 2D images using convolutional layers and would like to put fully connected layers on top of the encoder part for classification. My autoencoder is defined as follows (just a simple one for illustration):
def encoder(input_img):
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
    conv1 = BatchNormalization()(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(pool1)
    conv2 = BatchNormalization()(conv2)
    return conv2

def decoder(conv2):
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv2)
    conv3 = BatchNormalization()(conv3)
    up1 = UpSampling2D((2, 2))(conv3)
    decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up1)
    return decoded

autoencoder = Model(input_img, decoder(encoder(input_img)))
My input images are of size (64,80,1). Now when stacking fully connected layers on top of the encoder I'm doing the following:
def fc(enco):
    flat = Flatten()(enco)
    den = Dense(128, activation='relu')(flat)
    out = Dense(num_classes, activation='softmax')(den)
    return out

encode = encoder(input_img)
full_model = Model(input_img, fc(encode))
for l1, l2 in zip(full_model.layers[:19], autoencoder.layers[0:19]):
    l1.set_weights(l2.get_weights())
For only one autoencoder this works, but the problem now is that I have two autoencoders, trained on two sets of images, all of size (64, 80, 1).
For every label I have as input two images of size (64, 80, 1) and one label (0 or 1). I need to feed image 1 into the first autoencoder and image 2 into the second autoencoder. But how can I combine both autoencoders in the full_model in the above code?
Another problem is the input to the fit() method. Until now, with only one autoencoder, the input consisted of numpy arrays of images (e.g. (1000, 64, 80, 1)), but with two autoencoders I have two sets of images as input. How can I feed these into fit() so that the first autoencoder consumes the first set of images and the second autoencoder the second set?
Q: How can I combine both autoencoders in full_model?
A: You could concatenate the bottleneck layers enco_1 and enco_2 of both autoencoders within fc:
def fc(enco_1, enco_2):
    flat_1 = Flatten()(enco_1)
    flat_2 = Flatten()(enco_2)
    flat = Concatenate()([flat_1, flat_2])  # concatenate the flattened features
    den = Dense(128, activation='relu')(flat)
    out = Dense(num_classes, activation='softmax')(den)
    return out

encode_1 = encoder_1(input_img_1)
encode_2 = encoder_2(input_img_2)
full_model = Model([input_img_1, input_img_2], fc(encode_1, encode_2))
Note that the last part, where you manually set the weights of the encoder, is unnecessary; see https://keras.io/getting-started/functional-api-guide/#shared-layers
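For illustration, a sketch of that shared-layers alternative (untested, using the encoder/decoder functions above): wrap each encoder in a Model once and reuse that same object, so its trained weights are shared automatically instead of being copied with set_weights:
# Build the encoder once as a Model and reuse it everywhere
inp_1 = Input(shape=(64, 80, 1))
encoder_model_1 = Model(inp_1, encoder(inp_1))

# Training this autoencoder trains encoder_model_1's weights...
autoencoder_1 = Model(inp_1, decoder(encoder_model_1(inp_1)))

# ...and reusing the same object in the classifier shares those weights
input_img_1 = Input(shape=(64, 80, 1))
enco_1 = encoder_model_1(input_img_1)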
Q: How can I feed this into the fit method so that the first autoencoder consumes the first set of images and the second autoencoder the second set?
A: In the code above, note that the two encoders are fed with different inputs (one for each image set). Now, provided that the model is defined in this way, you can call full_model.fit as follows:
full_model.fit(x=[images_set_1, images_set_2],
               y=label,
               ...)
NOTE: Not tested.

Tuning neural network hyperparameters when using Keras functional API

I have a neural network that contains two branches. One branch takes input to a convolutional neural network, and the other branch is a fully connected layer. I merge these two branches and then get an output using softmax. I cannot use a Sequential model because it's deprecated and I therefore had to use the functional API.
I want to tune the hyperparameters of the convolutional branch. For example, I want to figure out how many convolution layers I should use. With a Sequential model I would have used a for loop, but since I am using the functional API I can't really do that. I've attached my code. Could anyone tell me how I can optimise my neural network for the number of convolution layers in a smart way, instead of making a lot of different scripts with different numbers of convolution layers?
Suggestions would be appreciated.
i1 = Input(shape=(xtest.shape[1], xtest.shape[2]))
###Convolution branch
c1 = Conv1D(128*2, kernel_size=ksize,activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(i1)
c1 = Conv1D(128*2, kernel_size=ksize, activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(c1)
c1 = AveragePooling1D(pool_size=ksize)(c1)
c1 = Dropout(0.2)(c1)
c1 = Conv1D(128*2, kernel_size=ksize, activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(c1)
c1 = AveragePooling1D(pool_size=ksize)(c1)
c1 = Dropout(0.2)(c1)
c1 = Flatten()(c1)
###fully connected branch
i2 = Input(shape=(5000, ))
c2 = Dense(64, activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(i2)
c2 = Dropout(0.1)(c2)
###concatenating the two branches
c = concatenate([c1, c2])
x = Dense(256, activation='relu', kernel_initializer='normal',kernel_regularizer=keras.regularizers.l2(l2_lambda))(c)
x = Dropout(0.25)(x)
###Output branch
output = Dense(num_classes, activation='softmax')(x)
model = Model([i1, i2], [output])
model.summary()
With Sequential models I can use a for loop, so for example:
layers = [1, 2, 3, 4, 5]

b1 = Sequential()
b1.add(Conv1D(128*2, kernel_size=ksize,
              activation='relu',
              input_shape=(xtest.shape[1], xtest.shape[2]),
              kernel_regularizer=keras.regularizers.l2(l2_lambda)))
for layer in layers:
    count = layer
    while count > 0:
        b1.add(Conv1D(128*2, kernel_size=ksize, activation='relu', kernel_regularizer=keras.regularizers.l2(l2_lambda)))
        count -= 1
b1.add(MaxPooling1D(pool_size=ksize))
b1.add(Dropout(0.2))
b1.add(Flatten())

b2 = Sequential()
b2.add(Dense(64, input_shape=(5000,), activation='relu', kernel_regularizer=keras.regularizers.l2(l2_lambda)))
for layer in layers:
    count = layer
    while count > 0:
        b2.add(Dense(64, activation='relu', kernel_regularizer=keras.regularizers.l2(l2_lambda)))
        count -= 1

model = Sequential()
model.add(Merge([b1, b2], mode='concat'))
model.add(Dense(256, activation='relu', kernel_initializer='normal', kernel_regularizer=keras.regularizers.l2(l2_lambda)))
model.add(Dropout(0.25))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
This is a minimal example of a model with a variable number of layers using the Keras functional API:
from keras.layers import Input, Conv2D, Dense, Dropout, Flatten, MaxPool2D
from keras.models import Model

def build_model(num_layers, input_shape, num_classes):
    input = Input(shape=input_shape)
    x = Conv2D(32, (3, 3), activation='relu')(input)
    # Suppose you want to find out how many additional convolutional
    # layers to add here.
    for _ in range(num_layers):
        x = Conv2D(32, (3, 3), activation='relu')(x)
    x = MaxPool2D((2, 2))(x)
    x = Flatten()(x)
    x = Dense(64, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(num_classes, activation='softmax')(x)
    return Model(inputs=input, outputs=x)

# Conv2D expects a channels dimension in the input shape, e.g. (128, 128, 3)
model = build_model(num_layers=2, input_shape=(128, 128, 3), num_classes=3)
These are the steps I would follow to find out how many 'middle' convolutional layers to use:
Train several models with the num_layers parameter set to various values. The code to build all those models is exactly the same; only the value of the num_layers parameter changes across training runs.
Choose the one with the best values of the metrics you care about.
That's it!
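For illustration, the search loop might look like this (a sketch, not tested; x_train, y_train, x_val and y_val are hypothetical placeholders, and the history key is 'val_acc' in older Keras versions):
# Train one model per candidate value and keep the best validation accuracy
results = {}
for n in [1, 2, 3, 4]:
    candidate = build_model(num_layers=n, input_shape=(128, 128, 3), num_classes=3)
    candidate.compile(optimizer='adam', loss='categorical_crossentropy',
                      metrics=['accuracy'])
    history = candidate.fit(x_train, y_train,
                            validation_data=(x_val, y_val),
                            epochs=10, verbose=0)
    results[n] = max(history.history['val_accuracy'])

best_num_layers = max(results, key=results.get)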
Side note: as far as I know, the Keras Sequential model isn't deprecated.
You can dynamically set your model structure using the functional API as well. For the convolutional branch you could use something like:
layer_shapes = (64, 64, 32)

b1 = i1   # start from the input tensor
for n_filters in layer_shapes:
    b1 = Conv1D(n_filters, kernel_size=ksize, activation='relu', kernel_regularizer=keras.regularizers.l2(l2_lambda))(b1)
You just need to replace the Sequential.add calls with the corresponding variable assignments.

Converting keras.applications.resnet50 to a Sequential gives error

I want to convert the pretrained ResNet50 model from keras.applications to a Sequential model, but it gives an input_shape error:
Input 0 is incompatible with layer res2a_branch1: expected axis -1 of input shape to have value 64 but got shape (None, 25, 25, 256)
I read this https://github.com/keras-team/keras/issues/9721 and, as I understand it, the reason for the error is the skip connections.
Is there a way to convert it to a Sequential model, or how can I add my custom model to the end of this ResNet model?
This is the code I've tried.
from keras.applications import ResNet50

height = 100  # dimensions of image
width = 100
channel = 3   # RGB

# Create pre-trained ResNet50 without top layer
model = ResNet50(include_top=False, weights="imagenet", input_shape=(height, width, channel))
# Get the ResNet50 layers up to res5c_branch2c
model = Model(inputs=model.input, outputs=model.get_layer('res5c_branch2c').output)
model.trainable = False
for layer in model.layers:
    layer.trainable = False
model = Sequential(model.layers)
I want to add this to end of it. Where can I start?
model.add(Conv2D(32, (3,3), activation = 'relu', input_shape = inputShape))
model.add(MaxPooling2D(2,2))
model.add(BatchNormalization(axis = chanDim))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3,3), activation = 'relu'))
model.add(MaxPooling2D(2,2))
model.add(BatchNormalization(axis = chanDim))
model.add(Dropout(0.2))
model.add(Conv2D(64, (3,3), activation = 'relu'))
model.add(MaxPooling2D(2,2))
model.add(BatchNormalization(axis = chanDim))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(64, activation = 'relu'))
model.add(BatchNormalization(axis = chanDim))
model.add(Dropout(0.5))
model.add(Dense(classes, activation = 'softmax'))
Use the functional API of Keras.
First, take ResNet50:
from keras.models import Model
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Dropout, Flatten, Dense
from keras.applications import ResNet50
height = 100 #dimensions of image
width = 100
channel = 3 #RGB
# Create pre-trained ResNet50 without top layer
model_resnet = ResNet50(include_top=False, weights="imagenet", input_shape=(height, width, channel))
Then add the modules of your model as follows, using the output of the ResNet as the input of the next layer:
conv1 = Conv2D(32, (3,3), activation = 'relu')(model_resnet.output)
pool1 = MaxPooling2D(2,2)(conv1)
bn1 = BatchNormalization(axis=chanDim)(pool1)
drop1 = Dropout(0.2)(bn1)
Add all of your layers this way, and at the end, for example:
flatten1 = Flatten()(drop1)
fc2 = Dense(classes, activation='softmax')(flatten1)
And, use Model() to create the final model.
model = Model(inputs=model_resnet.input, outputs=fc2)
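For completeness, a possible way to finish up (untested sketch): freeze the pretrained ResNet weights and compile the combined model.
# Freeze the pretrained base so only the new layers are trained
for layer in model_resnet.layers:
    layer.trainable = False

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()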

Dimension errors in neural network in Keras

I am trying to implement a neural network where I merge/concatenate a fully connected neural network with a convolutional neural network. But when I fit the model, I get the following error:
ValueError: All input arrays (x) should have the same number of samples. Got array shapes: [(1, 100, 60, 4500), (100, 4500)]
I have two different inputs:
image (dimensions: 1, 100, 60, 4500), where 1 is the channel, 100 is the number of samples, and 60*4500 is the dimension of my image. This goes to my convolutional neural network.
positions (dimensions: 100, 4500), where 100 refers to samples.
The dimension of my output is (100, 2).
The code for my neural network is:
###Convolution neural network
b1 = Sequential()
b1.add(Conv2D(128*2, kernel_size=3, activation='relu', data_format='channels_first',
              input_shape=(100, 60, 4500)))
b1.add(Conv2D(128*2, kernel_size=3, activation='relu'))
b1.add(Dropout(0.2))
b1.add(Conv2D(128*2, kernel_size=4, activation='relu'))
b1.add(Dropout(0.2))
b1.add(Flatten())
b1.summary()
###Fully connected feed forward neural network
b2 = Sequential()
b2.add(Dense(64, input_shape = (4500,), activation='relu'))
b2.add(Dropout(0.1))
b2.summary()
model = Sequential()
###Concatenating the two networks
concat = concatenate([b1.output, b2.output], axis=-1)
x = Dense(256, activation='relu', kernel_initializer='normal')(concat)
x = Dropout(0.25)(x)
output = Dense(2, activation='softmax')(x)
model = Model([b1.input, b2.input], [output])
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
history = model.fit([image, positions], Ytest, batch_size=10,
                    epochs=1,
                    verbose=1)
Also, the reason my 'image' array is 4-dimensional is that in the beginning it was just (100, 60, 4500), but then I ran into the following error:
ValueError: Error when checking input: expected conv2d_10_input to have 4 dimensions, but got array with shape (100, 60, 4500)
Upon googling I found out that it expects the number of channels as an input too. After I added the number of channels this error went away, but then I ran into the other error I mentioned in the beginning.
So can someone tell me how to solve that first error? Help would be appreciated.
It is not good practice to mix the Sequential and functional APIs.
You can implement the model like this:
i1 = Input(shape=(1, 60, 4500))
c1 = Conv2D(128*2, kernel_size=3,activation='relu',data_format='channels_first')(i1)
c1 = Conv2D(128*2, kernel_size=3, activation='relu')(c1)
c1 = Dropout(0.2)(c1)
c1 = Conv2D(128*2, kernel_size=4, activation='relu')(c1)
c1 = Dropout(0.2)(c1)
c1 = Flatten()(c1)
i2 = Input(shape=(4500, ))
c2 = Dense(64, input_shape = (4500,), activation='relu')(i2)
c2 = Dropout(0.2)(c2)
c = concatenate([c1, c2])
x = Dense(256, activation='relu', kernel_initializer='normal')(c)
x = Dropout(0.25)(x)
output = Dense(2, activation='softmax')(x)
model = Model([i1, i2], [output])
model.summary()
Note that the shape of i1 is shape=(1, 60, 4500). Since you have set data_format='channels_first' in the Conv2D layers, you need the 1 (the channel) at the beginning.
Compile the model like this:
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
Placeholder data
import numpy as np
X_img = np.zeros((100, 1, 60, 4500))
X_pos = np.ones((100, 4500))
Y = np.zeros((100, 2))
Training
history = model.fit([X_img, X_pos], Y, batch_size=1,
                    epochs=1,
                    verbose=1)
Your number of samples (batch size) should always be the first dimension, so your data should have shape (100, 1, 60, 4500) for the image and (100, 4500) for the positions. The data_format='channels_first' argument of the Conv2D layer means that the channels come in the first non-batch dimension.
You also need to change the input shape to (1, 60, 4500) in the first Conv2D layer.
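If your image array is already laid out as (1, 100, 60, 4500), a small sketch of the fix is to move the channel axis behind the sample axis with numpy:
import numpy as np

# Move axis 0 (channel) to position 1, giving (samples, channels, h, w)
image = np.moveaxis(image, 0, 1)
print(image.shape)   # (100, 1, 60, 4500)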
