Tuning neural network hyperparameters when using Keras functional API - python-3.x

I have a neural network that contains two branches. One branch feeds its input into a convolutional neural network, and the other branch is a fully connected layer. I merge these two branches and then get an output using softmax. I cannot use a Sequential model because it's deprecated and therefore had to use the functional API.
I want to tune the hyperparameters of the convolutional branch, for example, how many convolution layers to use. With a Sequential model I would have used a for loop, but since I am using the functional API I can't really do that. I've attached my code. Could anyone tell me how I can optimise my neural network over the number of convolution layers in a smart way, instead of writing a lot of different scripts, each with a different number of convolution layers?
Suggestions would be appreciated.
import keras
from keras.layers import Input, Conv1D, AveragePooling1D, Dropout, Flatten, Dense, concatenate
from keras.models import Model

i1 = Input(shape=(xtest.shape[1], xtest.shape[2]))
###Convolution branch
c1 = Conv1D(128*2, kernel_size=ksize,activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(i1)
c1 = Conv1D(128*2, kernel_size=ksize, activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(c1)
c1 = AveragePooling1D(pool_size=ksize)(c1)
c1 = Dropout(0.2)(c1)
c1 = Conv1D(128*2, kernel_size=ksize, activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(c1)
c1 = AveragePooling1D(pool_size=ksize)(c1)
c1 = Dropout(0.2)(c1)
c1 = Flatten()(c1)
###fully connected branch
i2 = Input(shape=(5000, ))
c2 = Dense(64, activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(i2)
c2 = Dropout(0.1)(c2)
###concatenating the two branches
c = concatenate([c1, c2])
x = Dense(256, activation='relu', kernel_initializer='normal',kernel_regularizer=keras.regularizers.l2(l2_lambda))(c)
x = Dropout(0.25)(x)
###Output branch
output = Dense(num_classes, activation='softmax')(x)
model = Model([i1, i2], [output])
model.summary()
With Sequential models I can use a for loop, for example:
layers = [1, 2, 3, 4, 5]
b1 = Sequential()
b1.add(Conv1D(128*2, kernel_size=ksize,
              activation='relu',
              input_shape=(xtest.shape[1], xtest.shape[2]),
              kernel_regularizer=keras.regularizers.l2(l2_lambda)))
for layer in layers:
    count = layer
    while count > 0:
        b1.add(Conv1D(128*2, kernel_size=ksize, activation='relu',
                      kernel_regularizer=keras.regularizers.l2(l2_lambda)))
        count -= 1
b1.add(MaxPooling1D(pool_size=ksize))
b1.add(Dropout(0.2))
b1.add(Flatten())
b2 = Sequential()
b2.add(Dense(64, input_shape=(5000,), activation='relu',
             kernel_regularizer=keras.regularizers.l2(l2_lambda)))
for layer in layers:
    count = layer
    while count > 0:
        b2.add(Dense(64, activation='relu',
                     kernel_regularizer=keras.regularizers.l2(l2_lambda)))
        count -= 1
model = Sequential()
model.add(Merge([b1, b2], mode='concat'))
model.add(Dense(256, activation='relu', kernel_initializer='normal',
                kernel_regularizer=keras.regularizers.l2(l2_lambda)))
model.add(Dropout(0.25))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])

Here is a minimal example of a model with a variable number of layers using the Keras functional API:
from keras.layers import Input, Conv2D, Dense, Dropout, Flatten, MaxPool2D
from keras.models import Model

def build_model(num_layers, input_shape, num_classes):
    input = Input(shape=input_shape)
    x = Conv2D(32, (3, 3), activation='relu')(input)
    # Suppose you want to find out how many additional convolutional
    # layers to add here.
    for _ in range(num_layers):
        x = Conv2D(32, (3, 3), activation='relu')(x)
    x = MaxPool2D((2, 2))(x)
    x = Flatten()(x)
    x = Dense(64, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(num_classes, activation='softmax')(x)
    return Model(inputs=input, outputs=x)

model = build_model(num_layers=2, input_shape=(128, 128, 3), num_classes=3)
These are the steps I would follow to find out how many 'middle' convolutional layers to use:
Train several models with the num_layers parameter set to various values. The code that builds all those models is exactly the same; only the value of the num_layers parameter changes across training runs.
Choose the one that has the best values of the metrics you care about.
That's it!
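For example, a minimal search loop might look like this (a sketch; the data variables x_train, y_train, x_val, y_val, the number of epochs, and the metric name are assumptions, not code from the question):
results = {}
for num_layers in [1, 2, 3, 4, 5]:
    model = build_model(num_layers=num_layers, input_shape=(128, 128, 3), num_classes=3)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    history = model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=10, verbose=0)
    # the key is 'val_acc' in older Keras versions, 'val_accuracy' in newer ones
    results[num_layers] = max(history.history['val_acc'])
best_num_layers = max(results, key=results.get)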
Side note: as far as I know, the Keras Sequential model isn't deprecated.

You can dynamically set your model structure using the functional API as well. For the convolutional branch you could use something like:
layer_shapes = (64, 64, 32)
b1 = i1  # start from the input tensor
for n_filters in layer_shapes:
    b1 = Conv1D(n_filters, kernel_size=ksize, activation='relu',
                kernel_regularizer=keras.regularizers.l2(l2_lambda))(b1)
You just need to replace each Sequential.add call with the corresponding variable assignment.

Related

Is there a way to selectively connect layers with Keras TensorFlow?

I have an autoencoder-type tandem network consisting of a pre-trained forward DNN (weights frozen) that takes the output of an untrained inverse DNN. I wish to have a direct mapping between the models, such that the output layer of the first network represents the input tensor to the second. I am currently using the Keras Sequential API to add dense layers; however, these are fully connected. I've included a diagram here (please have a look).
Here is a snippet of my code:
#tandem architecture (with weights loaded from pre-trained model)
Tandem = keras.models.Sequential()
Tandem.add(Dense(2, name = 'CIE_input'))
Tandem.add(Dense(1000, activation='relu', name = 'IH1'))
Tandem.add(Dense(1000, activation='relu', name = 'IH2'))
Tandem.add(Dense(3, name = 'Iout')) #need to feed a 3 layer input to FDNN
#FDNN for prediction:
Tandem.add(Dense(3, name = 'input',trainable = False))
Tandem.add(Dense(1000, activation='relu', name = 'FH1', trainable = False))
Tandem.add(Dense(1000, activation='relu', name = 'FH2', trainable = False))
Tandem.add(Dense(1000, activation='relu', name = 'FH3', trainable = False))
Tandem.add(Dense(1000, activation='relu', name = 'FH4', trainable = False))
Tandem.add(Dense(1000, activation='relu', name = 'FH5', trainable = False))
Tandem.add(Dense(1000, activation='relu', name = 'FH6', trainable = False))
Tandem.add(Dense(1000, activation='relu', name = 'FH7', trainable = False))
Tandem.add(Dense(1000, activation='relu', name = 'FH8', trainable = False))
Tandem.add(Dense(1000, activation='relu', name = 'FH9', trainable = False))
Tandem.add(Dense(1000, activation='relu', name = 'FH10', trainable = False))
Tandem.add(Dense(2, name = 'output')) # output layer (predicted colour (CIE))
Tandem.compile(loss='mse', optimizer='adam',metrics=['mean_squared_error','accuracy'])
#train the model for one batch to initialize variables (needed before loading weights by name)
Tandem.train_on_batch(y_train[:1], y_train[:1])
#load weights from pre-trained model
Tandem.load_weights('/content/gdrive/My Drive/Colab Notebooks/Models/FDNN_Weights.h5', by_name=True)
In addition, I would like to make the connection between the two networks fixed and not allow rescaling. I am new to TensorFlow and Keras (as well as Stack Overflow), so I'd be very appreciative of any advice on how to do this simply.
I recommend using the functional API in tf.keras. It helps you create models with several inner connections and multiple inputs and outputs.
Here is the official TensorFlow documentation.
Also, I recommend these posts (1, 2, 3).
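A minimal sketch of that approach (an assumption of how the two networks fit together, with the frozen forward network already loaded elsewhere as forward_model, and the 2-value input / 3-value intermediate taken from the question's code):
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Inverse DNN (trainable): maps the 2-value CIE input to the 3-value intermediate
inp = Input(shape=(2,), name='CIE_input')
x = Dense(1000, activation='relu', name='IH1')(inp)
x = Dense(1000, activation='relu', name='IH2')(x)
inverse_out = Dense(3, name='Iout')(x)

# Pre-trained forward DNN, frozen, applied directly to the inverse output
forward_model.trainable = False
out = forward_model(inverse_out)

tandem = Model(inputs=inp, outputs=out)
tandem.compile(loss='mse', optimizer='adam')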
I found an error in my code: I had not defined the input shape of the Sequential model. I have now removed the layers named 'CIE_input' and 'input', defining the input dimensions of layers 'IH1' and 'FH1' to be 2 and 3 respectively. This proper model definition allows the models to be connected directly, forcing the output of the inverse model to converge to 3 values.
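A sketch of what that corrected definition could look like (only the start of the model; layer sizes follow the question's code, and in a Sequential model only the first layer needs an explicit input dimension):
Tandem = keras.models.Sequential()
# input_dim=2 replaces the removed 'CIE_input' layer
Tandem.add(Dense(1000, activation='relu', input_dim=2, name='IH1'))
Tandem.add(Dense(1000, activation='relu', name='IH2'))
Tandem.add(Dense(3, name='Iout'))
# the frozen forward DNN now consumes the 3-unit 'Iout' output directly,
# so the separate 'input' layer is no longer needed
Tandem.add(Dense(1000, activation='relu', name='FH1', trainable=False))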

How to apply triplet loss function in resnet50 for the purpose of deepranking

I am trying to create image embeddings for the purpose of deep ranking using a triplet loss function. The idea is that we can take a pretrained CNN (e.g. ResNet50 or VGG16), remove the FC layers, and add an L2 normalization function to retrieve unit vectors, which can then be compared via a distance metric (e.g. cosine similarity). As far as I understand, the vectors that come out of a pretrained CNN are not optimal, but they are a good start. By adding the triplet loss function we can re-train the network to keep similar pictures 'close' to each other and different pictures 'far' apart in the feature space. Inspired by this notebook, I tried to set up the following code, but I get the error ValueError: The name "conv1_pad" is used 3 times in the model. All layer names should be unique.
# Anchor, Positive and Negative are numpy arrays of size (200, 256, 256, 3), same for the test images
pic_size = 256

def shared_dnn(inp):
    base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(3, pic_size, pic_size),
                          input_tensor=inp)
    x = base_model.output
    x = Flatten()(x)
    x = Lambda(lambda x: K.l2_normalize(x, axis=1))(x)
    for layer in base_model.layers[15:]:
        layer.trainable = False
    return x

anchor_input = Input((3, pic_size, pic_size), name='anchor_input')
positive_input = Input((3, pic_size, pic_size), name='positive_input')
negative_input = Input((3, pic_size, pic_size), name='negative_input')

encoded_anchor = shared_dnn(anchor_input)
encoded_positive = shared_dnn(positive_input)
encoded_negative = shared_dnn(negative_input)

merged_vector = concatenate([encoded_anchor, encoded_positive, encoded_negative], axis=-1, name='merged_layer')
model = Model(inputs=[anchor_input, positive_input, negative_input], outputs=merged_vector)
# ValueError: The name "conv1_pad" is used 3 times in the model. All layer names should be unique.
model.compile(loss=triplet_loss, optimizer=adam_optim)
model.fit([Anchor, Positive, Negative],
          y=Y_dummy,
          validation_data=([Anchor_test, Positive_test, Negative_test], Y_dummy2), batch_size=512, epochs=500)
I am new to Keras and not quite sure how to solve this. The author in the link above creates his own CNN from scratch, but I would like to build upon ResNet (or VGG16). How can I configure ResNet50 to use a triplet loss function? (In the link above you will also find the source code for the triplet loss function.)
In your ResNet50 definition, you've written
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(3, pic_size, pic_size), input_tensor=inp)
Remove the input_tensor argument and pass an explicit input_shape instead. It is also better to instantiate the base network once and reuse it for all three inputs, so that the layer names stay unique and the three branches actually share weights.
Since you're using the TF backend and, as you mentioned, the input is (256, 256, 3), your input shape should be channels-last: (pic_size, pic_size, 3) rather than (3, pic_size, pic_size).
img_shape = (256, 256, 3)

# Build the base network once so all three branches share it
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=img_shape)
for layer in base_model.layers[15:]:
    layer.trainable = False

def shared_dnn(inp):
    x = base_model(inp)
    x = Flatten()(x)
    x = Lambda(lambda t: K.l2_normalize(t, axis=1))(x)
    return x

anchor_input = Input(img_shape, name='anchor_input')
positive_input = Input(img_shape, name='positive_input')
negative_input = Input(img_shape, name='negative_input')

encoded_anchor = shared_dnn(anchor_input)
encoded_positive = shared_dnn(positive_input)
encoded_negative = shared_dnn(negative_input)

merged_vector = concatenate([encoded_anchor, encoded_positive, encoded_negative], axis=-1, name='merged_layer')
model = Model(inputs=[anchor_input, positive_input, negative_input], outputs=merged_vector)
model.compile(loss=triplet_loss, optimizer=adam_optim)
model.fit([Anchor, Positive, Negative],
          y=Y_dummy,
          validation_data=([Anchor_test, Positive_test, Negative_test], Y_dummy2),
          batch_size=512, epochs=500)
The model plot is as follows: [model plot image]

Dimension errors in neural network in Keras

I am trying to implement a neural network where I merge/concatenate a fully connected neural network with a convolutional neural network. But when I fit the model, I get the following error:
ValueError: All input arrays (x) should have the same number of
samples. Got array shapes: [(1, 100, 60, 4500), (100, 4500)]
I have two different inputs:
image (dimensions: 1, 100, 60, 4500), where 1 is the channel, 100 is the number of samples, and 60×4500 is the size of each image. This goes into my convolutional neural network.
positions (dimensions: 100, 4500), where 100 refers to the samples.
The dimension of my output is (100, 2).
The code for my neural network is:
###Convolution neural network
b1 = Sequential()
b1.add(Conv2D(128*2, kernel_size=3, activation='relu', data_format='channels_first',
              input_shape=(100, 60, 4500)))
b1.add(Conv2D(128*2, kernel_size=3, activation='relu'))
b1.add(Dropout(0.2))
b1.add(Conv2D(128*2, kernel_size=4, activation='relu'))
b1.add(Dropout(0.2))
b1.add(Flatten())
b1.summary()

###Fully connected feed forward neural network
b2 = Sequential()
b2.add(Dense(64, input_shape=(4500,), activation='relu'))
b2.add(Dropout(0.1))
b2.summary()

model = Sequential()
###Concatenating the two networks
concat = concatenate([b1.output, b2.output], axis=-1)
x = Dense(256, activation='relu', kernel_initializer='normal')(concat)
x = Dropout(0.25)(x)
output = Dense(2, activation='softmax')(x)
model = Model([b1.input, b2.input], [output])
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
history = model.fit([image, positions], Ytest, batch_size=10,
                    epochs=1,
                    verbose=1)
Also, the reason why my 'image' array is 4-dimensional is that in the beginning it was just (100, 60, 4500), but then I ran into the following error:
ValueError: Error when checking input: expected conv2d_10_input to
have 4 dimensions, but got array with shape (100, 60, 4500)
Upon googling, I found out that it expects the number of channels as an input too. After I added the channel dimension, this error went away, but then I ran into the other error that I mentioned in the beginning.
So can someone tell me how to solve the error I specified in the beginning? Help would be appreciated.
It is not good practice to mix the Sequential and functional APIs.
You can implement the model like this:
i1 = Input(shape=(1, 60, 4500))
c1 = Conv2D(128*2, kernel_size=3,activation='relu',data_format='channels_first')(i1)
c1 = Conv2D(128*2, kernel_size=3, activation='relu')(c1)
c1 = Dropout(0.2)(c1)
c1 = Conv2D(128*2, kernel_size=4, activation='relu')(c1)
c1 = Dropout(0.2)(c1)
c1 = Flatten()(c1)
i2 = Input(shape=(4500, ))
c2 = Dense(64, activation='relu')(i2)
c2 = Dropout(0.2)(c2)
c = concatenate([c1, c2])
x = Dense(256, activation='relu', kernel_initializer='normal')(c)
x = Dropout(0.25)(x)
output = Dense(2, activation='softmax')(x)
model = Model([i1, i2], [output])
model.summary()
Note that the shape of i1 is (1, 60, 4500). You have set data_format='channels_first' in the Conv2D layers, hence you need the 1 (channels) at the beginning.
Compile the model like this:
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
Placeholder data
import numpy as np
X_img = np.zeros((100, 1, 60, 4500))
X_pos = np.ones((100, 4500))
Y = np.zeros((100, 2))
Training
history = model.fit([X_img, X_pos], Y, batch_size=1,
epochs=1,
verbose=1)
Your number of samples (batch size) should always be the first dimension. So your data should have shape (100, 1, 60, 4500) for the image and (100, 4500) for the positions. The data_format='channels_first' argument for the Conv2D layer means that the channels dimension is the first non-batch dimension.
You also need to change the input shape to (1, 60, 4500) in the first Conv2D layer.
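For example, if your image array currently has shape (1, 100, 60, 4500), one way to fix it is to swap the first two axes (a sketch, assuming the array is named image as in the question):
import numpy as np

# (1, 100, 60, 4500) -> (100, 1, 60, 4500): samples first, then channels
image = np.transpose(image, (1, 0, 2, 3))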

Add LSTM layer after Conv2D layers and add some other inputs

I'm working on a racing game that uses reinforcement learning. While training the model, I'm facing an issue implementing the neural network. I found some examples that use a CNN, but it seems like adding an extra LSTM layer will improve the model's performance. I found the following example:
https://team.inria.fr/rits/files/2018/02/ICRA18_EndToEndDriving_CameraReady.pdf
[Diagram: the network I need to implement]
The problem is I'm not sure how to implement the LSTM layer here. How can I give the following inputs to the LSTM layer?
the processed image output
the current speed
the last action
Here is the code I'm currently using. I want to add the LSTM layer after the Conv2D layers.
self.__nb_actions = 28
self.__gamma = 0.99
#Define the model
activation = 'relu'
pic_input = Input(shape=(59,255,3))
img_stack = Conv2D(16, (3, 3), name='convolution0', padding='same', activation=activation, trainable=train_conv_layers)(pic_input)
img_stack = MaxPooling2D(pool_size=(2,2))(img_stack)
img_stack = Conv2D(32, (3, 3), activation=activation, padding='same', name='convolution1', trainable=train_conv_layers)(img_stack)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Conv2D(32, (3, 3), activation=activation, padding='same', name='convolution2', trainable=train_conv_layers)(img_stack)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Flatten()(img_stack)
img_stack = Dropout(0.2)(img_stack)
img_stack = Dense(128, name='rl_dense', kernel_initializer=random_normal(stddev=0.01))(img_stack)
img_stack=Dropout(0.2)(img_stack)
output = Dense(self.__nb_actions, name='rl_output', kernel_initializer=random_normal(stddev=0.01))(img_stack)
opt = Adam()
self.__action_model = Model(inputs=[pic_input], outputs=output)
self.__action_model.compile(optimizer=opt, loss='mean_squared_error')
self.__action_model.summary()
Thanks
There are various methods to do that. First, reshape the output of the conv layers and feed it to the LSTM layer. Here is an explained example with various methods: Shaping data for LSTM, and feeding output of dense layers to LSTM.
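A minimal sketch of one such method, adapted to the question's input shape and action count (the two extra scalar inputs and the row-wise reshape are assumptions for illustration, not the linked post's exact code):
from keras.layers import (Input, Conv2D, MaxPooling2D, Reshape, LSTM,
                          Dense, concatenate)
from keras.models import Model

pic_input = Input(shape=(59, 255, 3))
speed_input = Input(shape=(1,))    # current speed (assumed to be a scalar)
action_input = Input(shape=(1,))   # last action (assumed to be a scalar index)

x = Conv2D(16, (3, 3), padding='same', activation='relu')(pic_input)
x = MaxPooling2D((2, 2))(x)                       # -> (29, 127, 16)
x = Conv2D(32, (3, 3), padding='same', activation='relu')(x)
x = MaxPooling2D((2, 2))(x)                       # -> (14, 63, 32)

# Treat each row of the final feature map as one LSTM timestep
x = Reshape((14, 63 * 32))(x)
x = LSTM(64)(x)

# Merge the LSTM features with the two scalar inputs
x = concatenate([x, speed_input, action_input])
x = Dense(128, activation='relu')(x)
output = Dense(28)(x)  # 28 actions, as in the question

model = Model(inputs=[pic_input, speed_input, action_input], outputs=output)
model.compile(optimizer='adam', loss='mean_squared_error')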

How to have 2 inputs in a Dense network with Keras?

Most tutorials I've followed show how to give a single input to the first layer of a Dense network, with something like this using Keras:
Inp = Input(shape=(1,))
x = Dense(100, activation='relu', name = "Dense_1")(Inp)
x = Dense(100, activation='relu', name = "Dense_2")(x)
output = Dense(50, activation='softmax', name = "outputL")(x)
However, if I want to provide 2 or more inputs to the first layer of a Dense network, how can I do so with Keras? The idea is simply to have two inputs, x1 and x2.
I've tried something like this, which I've modified from snippets found in the Keras documentation:
Inp1 = Input(shape=(1,))
Inp2 = Input(shape=(1,))
Inp = keras.layers.concatenate([Inp1, Inp2])
x = Dense(100, activation='relu', name="Dense_1")(Inp)
x = Dense(100, activation='relu', name="Dense_2")(x)
output = Dense(50, activation='softmax', name="outputL")(x)
# the model must be defined and compiled before fit
# (loss/optimizer here are placeholders; choose ones appropriate to your task)
model = Model(inputs=[Inp1, Inp2], outputs=output)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
res = model.fit([x1_train, x2_train], y_train,
                validation_data=([x1_test, x2_test], y_test))
But so far, the results I'm getting from the model training have ridiculously low accuracy. Is what I've done actually what I intended?
