How to have 2 inputs in a Dense network with Keras? - python-3.x

Most tutorials I've followed show how to feed a single input into the first layer of a Dense network in Keras, with something like this:
Inp = Input(shape=(1,))
x = Dense(100, activation='relu', name = "Dense_1")(Inp)
x = Dense(100, activation='relu', name = "Dense_2")(x)
output = Dense(50, activation='softmax', name = "outputL")(x)
However, if I want to provide 2 or more inputs to the first layer of a Dense network, how can I do so with Keras? The idea is simply to have two inputs, x1 and x2.
I've tried something like this, which I modified from snippets found on one of the pages in the Keras documentation:
Inp1 = Input(shape=(1,))
Inp2 = Input(shape=(1,))
Inp = keras.layers.concatenate([Inp1, Inp2])
x = Dense(100, activation='relu', name = "Dense_1")(Inp)
x = Dense(100, activation='relu', name = "Dense_2")(x)
output = Dense(50, activation='softmax', name = "outputL")(x)
model = Model(inputs=[Inp1, Inp2], outputs=output)
res = model.fit([x1_train, x2_train], y_train,
                validation_data=([x1_test, x2_test], y_test))
But so far, the results I'm getting from training the model appear to have ridiculously low accuracy. Does this code actually do what I intended?
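For reference, here is a minimal NumPy sketch of what the concatenate-then-Dense construction computes. It is not Keras code, and the batch size of 4, the random inputs and the random weights are assumptions made purely for illustration. Joining two (batch, 1) inputs along the last axis gives one (batch, 2) tensor, so every unit of the first Dense layer sees both x1 and x2.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two separate inputs, each of shape (batch, 1), like Inp1 and Inp2.
x1 = rng.normal(size=(4, 1))
x2 = rng.normal(size=(4, 1))

# Concatenating along the last axis yields one (batch, 2) feature matrix.
merged = np.concatenate([x1, x2], axis=-1)

# A Dense(100, activation='relu') layer then computes relu(merged @ W + b),
# with a kernel W of shape (2, 100) and a bias b of shape (100,).
W = rng.normal(size=(2, 100))
b = np.zeros(100)
hidden = np.maximum(merged @ W + b, 0.0)

print(merged.shape)  # (4, 2)
print(hidden.shape)  # (4, 100)
```

If the wiring matches this picture, the low accuracy may well come from somewhere else, for example a 50-unit softmax that does not match the number of classes in y_train, or labels whose encoding does not match the loss.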

Related

ValueError in multiple input model

I'm creating a multi-input model in which I concatenate a CNN model and an LSTM model. The LSTM model receives the last 5 events and the CNN receives a picture of the last event. Both are organized so that element k of each NumPy array holds the 5 events and the corresponding picture, and the same holds for the output labels, which are the 'next' event that the model should predict.
chanDim = -1
inputs = Input(shape=inputShape)
x = inputs
x = Dense(128)(x)
x = Activation("relu")(x)
x = BatchNormalization(axis=chanDim)(x)
x = Dropout(0.3)(x)
x = Flatten()(x)
x = Activation("relu")(x)
x = BatchNormalization(axis=chanDim)(x)
x = Dropout(0.1)(x)
x = Activation("relu")(x)
model_cnn = Model(inputs, x)
This creates the CNN model, and the following code represents the LSTM model:
hidden1 = LSTM(128)(visible)
hidden2 = Dense(64, activation='relu')(hidden1)
output = Dense(10, activation='relu')(hidden2)
model_lstm = Model(inputs=visible, outputs=output)
Now, when I combine these models and extend them with a simple dense layer to make the multiclass prediction over 14 classes, all the inputs match and I can concatenate the (None, 10) and (None, 10) outputs into a (None, 20) input for the MLP:
x = Dense(14, activation="softmax")(x)
model_mlp = Model(inputs=[model_lstm.input, model_cnn.input], outputs=x)
This all works fine until I try to compile the model, at which point it gives me an error concerning the input of the last dense layer of the MLP model:
ValueError: Error when checking target: expected dense_121 to have shape (14,) but got array with shape (1,)
Do you know how this is possible? If you need more information, I'm happy to provide it.
Your target must have shape (None, 14). With a 14-unit softmax output, you have to one-hot encode the labels.
Try this:
y = pd.get_dummies(np.concatenate([y_train, y_test])).values
y_train = y[:len(y_train)]
y_test = y[len(y_train):]
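The same encoding can be sketched with NumPy alone. This assumes the labels are integers from 0 to 13; the helper name one_hot and the sample labels are made up for illustration.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels of shape (n,) or (n, 1) into an (n, num_classes) one-hot array."""
    labels = np.asarray(labels, dtype=int).reshape(-1)
    return np.eye(num_classes, dtype="float32")[labels]

y_train = np.array([0, 3, 13, 7])  # hypothetical integer class labels
y_train_encoded = one_hot(y_train, 14)

print(y_train_encoded.shape)  # (4, 14)
```

Unlike pd.get_dummies, this always produces 14 columns, even when some class never occurs in the data. Keras also provides the sparse_categorical_crossentropy loss, which accepts integer labels directly.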

Keras - Proper way to extract weights from a nested model

I have a nested model which has an input layer, and has some final dense layers before the output. Here is the code for it:
image_input = Input(shape, name='image_input')
x = DenseNet121(input_shape=shape, include_top=False, weights=None,
                backend=keras.backend,
                layers=keras.layers,
                models=keras.models,
                utils=keras.utils)(image_input)
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dense(1024, activation='relu', name='dense_layer1_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Dense(512, activation='relu', name='dense_layer2_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
output = Dense(num_class, activation='softmax', name='image_output')(x)
classificationModel = Model(inputs=[image_input], outputs=[output])
Now, say I wanted to extract the DenseNet weights from this model and use them for transfer learning in another, larger model, which has the same nested DenseNet but also some additional layers after it, such as:
image_input = Input(shape, name='image_input')
x = DenseNet121(input_shape=shape, include_top=False, weights=None,
                backend=keras.backend,
                layers=keras.layers,
                models=keras.models,
                utils=keras.utils)(image_input)
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dense(1024, activation='relu', name='dense_layer1_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Dense(512, activation='relu', name='dense_layer2_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Dense(256, activation='relu', name='dense_layer3_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
output = Dense(num_class, activation='sigmoid', name='image_output')(x)
classificationModel = Model(inputs=[image_input], outputs=[output])
Would I just need to do modelB.load_weights(<weights.hdf5>, by_name=True)? Also, should I name the internal DenseNet, and if so, how?
Before using the nested model, you can store it in a variable.
That makes everything a lot easier:
densenet = DenseNet121(input_shape=shape, include_top=False,
                       weights=None, backend=keras.backend,
                       layers=keras.layers,
                       models=keras.models,
                       utils=keras.utils)
image_input = Input(shape, name='image_input')
x = densenet(image_input)
x = GlobalAveragePooling2D(name='avg_pool')(x)
......
Now it's super simple to:
weights = densenet.get_weights()
another_densenet.set_weights(weights)
The loaded file
You can also print model.summary() for your loaded model. The DenseNet will be the first or second layer (you must check this).
You can then get it with densenet = loaded_model.layers[i].
You can then transfer these weights to the new DenseNet, either with the method shown above or with new_model.layers[i].set_weights(densenet.get_weights()).
Perhaps the easiest way to go about this is to use the trained model itself, without trying to load the model weights. Say you have trained the initial model (copied and pasted from the provided source code with minimal edits to variable names):
image_input = Input(shape, name='image_input')
# ... intermediate layers elided
x = BatchNormalization()(x)
output = Dropout(0.5)(x)
model_output = Dense(num_class, activation='softmax', name='image_output')(output)
smaller_model = Model(inputs=[image_input], outputs=[model_output])
To use the trained weights of this model for a larger model, we can simply declare another model that uses the trained weights, then use that newly defined model as a component of the larger model.
new_model = Model(image_input, output) # Model that uses trained weights
main_input = Input(shape, name='main_input')
x = new_model(main_input)
x = Dense(256, activation='relu', name='dense_layer3_image')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
output = Dense(num_class, activation='sigmoid', name='image_output')(x)
final_model = Model(inputs=[main_input], outputs=[output])
If anything is unclear, I'd be more than happy to elaborate.

Using Tensorboard to monitor training real time and visualize the model architecture

I am learning to use TensorBoard with TensorFlow 2.0.
In particular, I would like to monitor the learning curves in real time and also to visually inspect and communicate the architecture of my model.
Below I provide code for a reproducible example.
I have three problems:
1. Although I get the learning curves once training is over, I don't know what I should do to monitor them in real time.
2. The learning curve I get from TensorBoard does not agree with the plot of history.history. In fact, it is bizarre and its reversals are difficult to interpret.
3. I cannot make sense of the graph. I have trained a sequential model with 5 dense layers and dropout layers in between. What TensorBoard shows me has many more elements in it.
My code is the following:
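from datetime import datetime
import keras
import matplotlib.pyplot as plt
from keras.datasets import boston_housing
from keras.layers import Input, Dense, Dropout
from keras.models import Model
(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()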
inputs = Input(shape = (train_data.shape[1], ))
x1 = Dense(100, kernel_initializer = 'he_normal', activation = 'elu')(inputs)
x1a = Dropout(0.5)(x1)
x2 = Dense(100, kernel_initializer = 'he_normal', activation = 'elu')(x1a)
x2a = Dropout(0.5)(x2)
x3 = Dense(100, kernel_initializer = 'he_normal', activation = 'elu')(x2a)
x3a = Dropout(0.5)(x3)
x4 = Dense(100, kernel_initializer = 'he_normal', activation = 'elu')(x3a)
x4a = Dropout(0.5)(x4)
x5 = Dense(100, kernel_initializer = 'he_normal', activation = 'elu')(x4a)
predictions = Dense(1)(x5)
model = Model(inputs = inputs, outputs = predictions)
model.compile(optimizer = 'Adam', loss = 'mse')
logdir="logs\\fit\\" + datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)
history = model.fit(train_data, train_targets,
                    batch_size=32,
                    epochs=20,
                    validation_data=(test_data, test_targets),
                    shuffle=True,
                    callbacks=[tensorboard_callback])
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
I think what you can do is launch TensorBoard before calling .fit() on your model. If you are using IPython (Jupyter or Colab) and have already installed TensorBoard, here's how you can modify your code:
from datetime import datetime
import keras
import matplotlib.pyplot as plt
from keras.datasets import boston_housing
from keras.layers import Input, Dense, Dropout
from keras.models import Model
(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()
inputs = Input(shape = (train_data.shape[1], ))
x1 = Dense(100, kernel_initializer = 'he_normal', activation = 'relu')(inputs)
x1a = Dropout(0.5)(x1)
x2 = Dense(100, kernel_initializer = 'he_normal', activation = 'relu')(x1a)
x2a = Dropout(0.5)(x2)
x3 = Dense(100, kernel_initializer = 'he_normal', activation = 'relu')(x2a)
x3a = Dropout(0.5)(x3)
x4 = Dense(100, kernel_initializer = 'he_normal', activation = 'relu')(x3a)
x4a = Dropout(0.5)(x4)
x5 = Dense(100, kernel_initializer = 'he_normal', activation = 'relu')(x4a)
predictions = Dense(1)(x5)
model = Model(inputs = inputs, outputs = predictions)
model.compile(optimizer = 'Adam', loss = 'mse')
logdir="logs\\fit\\" + datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)
In another cell, you can run:
# Magic func to use TensorBoard directly in IPython
%load_ext tensorboard
Launch TensorBoard by running this in another cell:
# Launch TensorBoard on the log directory held in the Python variable logdir.
# Writing --logdir=logdir would look for a folder literally named "logdir".
# This should launch TensorBoard, but you may not see any data until training has started.
%tensorboard --logdir {logdir}
And you can finally call .fit() on your model in another cell:
history = model.fit(train_data, train_targets,
                    batch_size=32,
                    epochs=20,
                    validation_data=(test_data, test_targets),
                    shuffle=True,
                    callbacks=[tensorboard_callback])
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
If you are not using IPython, you probably just have to launch TensorBoard during or before training your model to monitor it in real-time.
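Outside a notebook, a portable way to build the per-run log directory might look like the sketch below. The helper name make_logdir and its default folder names are assumptions for illustration; os.path.join is used instead of hard-coded backslashes so that the same code works on Windows, Linux and macOS.

```python
import os
from datetime import datetime

def make_logdir(root="logs", run_group="fit", now=None):
    """Return a per-run log directory such as logs/fit/20240131-120000."""
    now = now or datetime.now()
    return os.path.join(root, run_group, now.strftime("%Y%m%d-%H%M%S"))

logdir = make_logdir()
# Pass logdir to keras.callbacks.TensorBoard(log_dir=logdir) as before.
# From a terminal, point TensorBoard at the parent folder so every run shows up:
#   tensorboard --logdir logs/fit
print(logdir)
```

Starting the tensorboard command in a separate terminal before or during training lets the dashboard pick up new data as the run writes it.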

Tuning neural network hyperparameters when using Keras functional API

I have a neural network that contains two branches. One branch feeds its input to a convolutional neural network, and the other branch is a fully connected layer. I merge these two branches and then get an output using softmax. I cannot use a Sequential model because it's deprecated, and therefore I had to use the functional API.
I want to tune the hyperparameters of the convolutional branch. For example, I want to figure out how many convolution layers I should use. If it were a Sequential model I would have used a for loop, but since I am using the functional API I can't really do that. I've attached my code. Could anyone tell me how I can optimise the number of convolution layers in a smart way, instead of writing a lot of different scripts with different numbers of convolution layers?
Suggestions would be appreciated.
i1 = Input(shape=(xtest.shape[1], xtest.shape[2]))
###Convolution branch
c1 = Conv1D(128*2, kernel_size=ksize,activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(i1)
c1 = Conv1D(128*2, kernel_size=ksize, activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(c1)
c1 = AveragePooling1D(pool_size=ksize)(c1)
c1 = Dropout(0.2)(c1)
c1 = Conv1D(128*2, kernel_size=ksize, activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(c1)
c1 = AveragePooling1D(pool_size=ksize)(c1)
c1 = Dropout(0.2)(c1)
c1 = Flatten()(c1)
###fully connected branch
i2 = Input(shape=(5000, ))
c2 = Dense(64, activation='relu',kernel_regularizer=keras.regularizers.l2(l2_lambda))(i2)
c2 = Dropout(0.1)(c2)
###concatenating the two branches
c = concatenate([c1, c2])
x = Dense(256, activation='relu', kernel_initializer='normal',kernel_regularizer=keras.regularizers.l2(l2_lambda))(c)
x = Dropout(0.25)(x)
###Output branch
output = Dense(num_classes, activation='softmax')(x)
model = Model([i1, i2], [output])
model.summary()
With sequential models I can use a for loop so for example:
layers = [1,2,3,4,5]
b1 = Sequential()
b1.add(Conv1D(128*2, kernel_size=ksize,
              activation='relu',
              input_shape=(xtest.shape[1], xtest.shape[2]),
              kernel_regularizer=keras.regularizers.l2(l2_lambda)))
for layer in layers:
    count = layer
    while count > 0:
        b1.add(Conv1D(128*2, kernel_size=ksize, activation='relu', kernel_regularizer=keras.regularizers.l2(l2_lambda)))
        count -= 1
    b1.add(MaxPooling1D(pool_size=ksize))
    b1.add(Dropout(0.2))
b1.add(Flatten())
b2 = Sequential()
b2.add(Dense(64, input_shape=(5000,), activation='relu', kernel_regularizer=keras.regularizers.l2(l2_lambda)))
for layer in layers:
    count = layer
    while count > 0:
        b2.add(Dense(64, activation='relu', kernel_regularizer=keras.regularizers.l2(l2_lambda)))
        count -= 1
model = Sequential()
model.add(Merge([b1, b2], mode = 'concat'))
model.add(Dense(256, activation='relu', kernel_initializer='normal',kernel_regularizer=keras.regularizers.l2(l2_lambda)))
model.add(Dropout(0.25))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
This is a minimal example of a model with a variable number of layers using the Keras functional API:
from keras.layers import Input, Conv2D, Dense, Dropout, Flatten, MaxPool2D
from keras.models import Model
def build_model(num_layers, input_shape, num_classes):
    inputs = Input(shape=input_shape)
    x = Conv2D(32, (3, 3), activation='relu')(inputs)
    # Suppose you want to find out how many additional convolutional
    # layers to add here.
    for _ in range(num_layers):
        x = Conv2D(32, (3, 3), activation='relu')(x)
    x = MaxPool2D((2, 2))(x)
    x = Flatten()(x)
    x = Dense(64, activation='relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(num_classes, activation='softmax')(x)
    return Model(inputs=inputs, outputs=x)
# Conv2D expects (height, width, channels); 3 channels are assumed here.
model = build_model(num_layers=2, input_shape=(128, 128, 3), num_classes=3)
These are the steps I would follow to find out how many 'middle' convolutional layers to use:
1. Train several models with the num_layers parameter set to various values. The code to build all those models is exactly the same; only the value of num_layers changes across training runs.
2. Choose the one that has the best values of the metrics you care about.
That's it!
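The two steps above can be sketched as a plain loop. The helper names select_num_layers and train_and_score are hypothetical. In practice train_and_score would build a model with build_model, fit it and return a validation metric, whereas the stand-in below returns made-up numbers so that the sketch runs without training anything.

```python
def select_num_layers(candidates, train_and_score):
    """Return (best_num_layers, scores), assuming a higher score is better."""
    scores = {n: train_and_score(n) for n in candidates}
    best = max(scores, key=scores.get)
    return best, scores

def fake_train_and_score(num_layers):
    # Stand-in for: build the model, fit it, return the validation accuracy.
    made_up_accuracy = {1: 0.71, 2: 0.78, 3: 0.76, 4: 0.74}
    return made_up_accuracy[num_layers]

best, scores = select_num_layers([1, 2, 3, 4], fake_train_and_score)
print(best)  # 2
```

For a metric where lower is better, such as validation loss, min would take the place of max.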
Side note: as far as I know, Keras Sequential model isn't deprecated.
You can dynamically set your model structure using the functional API as well. For the convolutional branch you could use something like:
layer_shapes = (64, 64, 32)  # number of filters for each convolution layer
b1 = i1  # start from the input tensor of the convolutional branch
for filters in layer_shapes:
    b1 = Conv1D(filters, kernel_size=ksize, activation='relu', kernel_regularizer=keras.regularizers.l2(l2_lambda))(b1)
You just need to replace each Sequential.add call with the corresponding variable assignment.

Extracting Activation maps from trained neural network

I have a trained CNN model. I am trying to extract the output of each convolutional layer and plot the results to explore which regions of the image have high activations. Any ideas on how to do this?
Below is the network I have trained.
input_shape = (3,227,227)
x_input = Input(input_shape)
# Conv Layer 1
x = Convolution2D(96, 7, 7, subsample=(4, 4), activation='relu',
                  name='conv_1', init='he_normal')(x_input)
x = MaxPooling2D((3, 3), strides=(2,2), name='maxpool')(x)
x = BatchNormalization()(x)
x = ZeroPadding2D((2,2))(x)
# Conv Layer 2
x = Convolution2D(256, 5,5,activation='relu',name='conv_2', init='he_normal')(x)
x = MaxPooling2D((3, 3), strides=(2,2),name='maxpool2')(x)
x = BatchNormalization()(x)
x = ZeroPadding2D((2,2))(x)
# Conv Layer 3
x = Convolution2D(384, 3, 3, activation='relu',
                  name='conv_3', init='he_normal')(x)
x = MaxPooling2D((3, 3), strides=(2,2),name='maxpool3')(x)
x = Flatten()(x)
x = Dense(512, activation = "relu")(x)
x = Dropout(0.5)(x)
x = Dense(512, activation ="relu")(x)
x = Dropout(0.5)(x)
predictions = Dense(2, activation="softmax")(x)
model = Model(inputs = x_input, outputs = predictions)
Thanks!
Look at this GitHub issue and the FAQ entry "How can I obtain the output of an intermediate layer?". It seems the easiest way to do that is to define new models with the outputs that you want. For example:
input_shape = (3,227,227)
x_input = Input(input_shape)
# Conv Layer 1
# Save the layer output in a variable
conv1 = Convolution2D(96, 7, 7, subsample=(4, 4), activation='relu',
                      name='conv_1', init='he_normal')(x_input)
x = conv1
x = MaxPooling2D(...)(x)
# ...
conv2 = Convolution2D(...)(x)
x = conv2
# ...
conv3 = Convolution2D(...)(x)
x = conv3
# ...
predictions = Dense(2, activation="softmax")(x)
# Main model
model = Model(inputs=x_input, outputs=predictions)
# Intermediate evaluation model
conv_layers_model = Model(inputs=x_input, outputs=[conv1, conv2, conv3])
# After training is done, retrieve intermediate evaluations for data
conv1_val, conv2_val, conv3_val = conv_layers_model.predict(data)
Note that since you are using the same objects in both models the weights are automatically shared between them.
A more complete example of activation visualization can be found here. In that case they use the K.function approach.
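Once the activations are available as arrays, one simple way to see which regions respond strongly is to average over the filter axis and rescale each map to [0, 1]. The sketch below assumes channels-first activations of shape (batch, filters, height, width), matching the input_shape = (3, 227, 227) convention above. The helper name activation_heatmaps and the random stand-in data are illustrative only.

```python
import numpy as np

def activation_heatmaps(activations, eps=1e-8):
    """Average over filters and min-max scale each image's map to [0, 1]."""
    maps = activations.mean(axis=1)  # (batch, height, width)
    lo = maps.min(axis=(1, 2), keepdims=True)
    hi = maps.max(axis=(1, 2), keepdims=True)
    return (maps - lo) / (hi - lo + eps)

# Random stand-in for conv1_val; real values would come from predict().
conv1_val = np.random.default_rng(0).random((2, 96, 56, 56))
heatmaps = activation_heatmaps(conv1_val)
print(heatmaps.shape)  # (2, 56, 56)
```

Each heatmap can then be displayed with an image plotting function such as matplotlib's imshow, optionally resized to the input resolution for an overlay.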

Resources