Should CNN layers come before Bi-LSTM or after? - keras

I'm trying to build a univariate time series forecasting model. The current architecture looks like this:
model = Sequential()
model.add(Bidirectional(LSTM(20, return_sequences=True), input_shape=(n_steps_in, n_features)))
model.add(Bidirectional(LSTM(20, return_sequences=True)))
model.add(Conv1D(64, 3, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(n_steps_out))
Then I tried the following, which places all CNN layers before Bi-LSTM layers (but doesn't work):
model = Sequential()
model.add(Conv1D(64, 3, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Bidirectional(LSTM(20, input_shape=(n_steps_in, n_features), return_sequences=True)))
model.add(Bidirectional(LSTM(20, return_sequences=True)))
model.add(Dense(100, activation='relu'))
model.add(Dense(n_steps_out))
The latest implementation doesn't seem to work. Any suggestions for fixing this? Another question: is there a general rule for deciding whether the CNN should come before the Bi-LSTM or vice versa?

Could you clarify what exactly you mean by "doesn't seem to work"?
I would perform the convolutions before the LSTM, as you did in the second approach. But here are a few things to note.
First, return the sequences from an LSTM layer only when the next layer also needs a 3D sequence input, such as another LSTM:
model.add(Bidirectional(LSTM(20, input_shape=(n_steps_in, n_features), return_sequences=True)))
model.add(Bidirectional(LSTM(20)))
model.add(Dense(1))
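Second, you could try using GlobalAveragePooling1D instead of MaxPooling1D, as the former averages over all time steps and so takes every feature into account (an important factor to note compared to classifying images, for instance). It takes no pool_size argument, and its output is 2D (batch, channels), so it belongs after the recurrent layers or in place of Flatten, not directly before an LSTM:
model.add(GlobalAveragePooling1D())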
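One way the second attempt from the question could be repaired, as a minimal sketch rather than a definitive model: drop Flatten so the LSTM still receives a 3D tensor, keep return_sequences=True only on the first Bi-LSTM, and let the final Dense layer emit all n_steps_out values at once. The sizes (n_steps_in=30, n_steps_out=15) and the tensorflow.keras import path are assumptions for illustration.

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Bidirectional, LSTM, Dense

# Illustrative sizes (assumed, not taken from the question)
n_steps_in, n_steps_out, n_features = 30, 15, 1

model = Sequential([
    Input(shape=(n_steps_in, n_features)),
    Conv1D(64, 3, activation='relu'),
    Conv1D(64, 3, activation='relu'),
    MaxPooling1D(pool_size=2),   # output is still 3D: (batch, steps, channels)
    # No Flatten here: the LSTM layers need a 3D input.
    Bidirectional(LSTM(20, return_sequences=True)),
    Bidirectional(LSTM(20)),     # last LSTM returns one vector per sample
    Dense(100, activation='relu'),
    Dense(n_steps_out),
])
model.compile(optimizer='adam', loss='mse')
print(model.output_shape)
```

With this layout the targets would be 2D arrays of shape (n_samples, n_steps_out), one forecast vector per input window.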

Your network takes sequences as input and produces sequences as output, so you need to take care of dimensionality. To do this, adjust the padding in the convolutional layers and the pooling operation. You also need to set return_sequences=True in your last LSTM layer (you are predicting a sequence). In the example below I use your network with padding and remove the Flatten layer, which destroys the 3D dimensionality.
You can apply convolution before or after the LSTM. The best way to choose is to try both and evaluate the performance on a reliable validation set.
CNN + LSTM
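# imports assumed (tensorflow.keras); the original snippet omitted them
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Bidirectional, LSTM, Dense
n_sample = 100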
n_steps_in, n_steps_out, n_features = 30, 15, 1
X = np.random.uniform(0,1, (n_sample, n_steps_in, n_features))
y = np.random.uniform(0,1, (n_sample, n_steps_out, n_features))
model = Sequential()
model.add(Conv1D(64, 3, activation='relu', padding='same',
input_shape=(n_steps_in, n_features)))
model.add(Conv1D(64, 3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Bidirectional(LSTM(20, return_sequences=True)))
model.add(Bidirectional(LSTM(20, return_sequences=True)))
model.add(Dense(1))
model.compile('adam', 'mse')
model.summary()
model.fit(X,y, epochs=3)
LSTM + CNN
model = Sequential()
model.add(Bidirectional(LSTM(20, return_sequences=True),
input_shape=(n_steps_in, n_features)))
model.add(Bidirectional(LSTM(20, return_sequences=True)))
model.add(MaxPooling1D(pool_size=2))
model.add(Conv1D(64, 3, activation='relu', padding='same'))
model.add(Conv1D(64, 3, padding='same', activation='relu'))
model.add(Dense(1))
model.compile('adam', 'mse')
model.summary()
model.fit(X,y, epochs=3)
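The role of padding in these two models can be checked with simple length arithmetic. The sketch below uses hypothetical helper functions (not Keras APIs) that mirror the usual output-length formulas for a stride-1 Conv1D and a MaxPooling1D with default strides. It shows why padding='same' is what makes the 30 input steps end up as exactly 15 output steps.

```python
def conv1d_out_len(length, kernel_size, padding="valid"):
    """Output length of a stride-1 Conv1D layer."""
    if padding == "same":
        return length
    return length - kernel_size + 1


def pool1d_out_len(length, pool_size):
    """Output length of MaxPooling1D with default strides and 'valid' padding."""
    return (length - pool_size) // pool_size + 1


n_steps_in = 30


def stack_out_len(padding):
    """Two Conv1D(kernel_size=3) layers followed by MaxPooling1D(pool_size=2)."""
    steps = conv1d_out_len(n_steps_in, 3, padding)
    steps = conv1d_out_len(steps, 3, padding)
    return pool1d_out_len(steps, 2)


same_len = stack_out_len("same")
valid_len = stack_out_len("valid")
print(same_len, valid_len)
```

With 'same' padding the time axis has length 15, which matches n_steps_out. With the default 'valid' padding it has length 13, so the output would no longer line up with a target of shape (n_sample, 15, 1).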

Related

How to determine the number of layers your CNN has?

How do you determine the number of layers in a CNN? For example, in the code snippet given below, how can you determine the number of layers in the CNN?
CODE
# Construct model
model = Sequential()
model.add(Conv2D(filters=16, kernel_size=2, input_shape=(num_rows, num_columns, num_channels), activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Conv2D(filters=32, kernel_size=2, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Conv2D(filters=64, kernel_size=2, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Conv2D(filters=128, kernel_size=2, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(GlobalAveragePooling2D())
model.add(Dense(num_labels, activation='softmax'))
Do you mean how to count them by hand, or a TensorFlow function that returns the number of layers? If the latter, this should do it:
layer_amount = len(model.layers)
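As a small sketch of how that count can be broken down, the example below builds a shortened version of the model from the question (the input size and the 10-class output are assumptions) and counts both all layers and only the convolutional ones. Note that len(model.layers) counts pooling and dropout layers too, and that Sequential does not list the input layer.

```python
from collections import Counter

from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, Dropout,
                                     GlobalAveragePooling2D, Dense)

model = Sequential([
    Input(shape=(32, 32, 1)),   # hypothetical input size
    Conv2D(16, 2, activation='relu'),
    MaxPooling2D(2),
    Dropout(0.2),
    Conv2D(32, 2, activation='relu'),
    MaxPooling2D(2),
    Dropout(0.2),
    GlobalAveragePooling2D(),
    Dense(10, activation='softmax'),   # hypothetical number of labels
])

n_layers = len(model.layers)                                     # every added layer
n_conv = sum(isinstance(layer, Conv2D) for layer in model.layers)  # conv layers only
by_type = Counter(type(layer).__name__ for layer in model.layers)
print(n_layers, n_conv, dict(by_type))
```

Which number counts as "the number of layers" depends on the convention: papers often count only layers with trainable weights, whereas len(model.layers) counts every layer object.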

LSTM Grid Search

The code below implements an architecture inside a grid search, to find appropriate values for the input length, nodes, epochs, batch size and differencing order of the time series input.
The challenge I have is converting the neural network from a single LSTM hidden layer to multiple LSTM hidden layers.
At the moment, I can only run the code without errors when the hidden layers are Dense layers; otherwise I get dimension errors, tuple errors and so on.
The problem only occurs in the neural network architecture section.
Original code that works:
def model_fit(train, config):
    # unpack config
    n_input, n_nodes, n_epochs, n_batch, n_diff = config
    # Data
    if n_diff > 0:
        train = difference(train, n_diff)
    # Time series to supervised format
    data = series_to_supervised(train, n_in=n_input)
    train_x, train_y = data[:, :-1], data[:, -1]
    # Reshaping input data into [samples, timesteps, features]
    n_features = 1
    train_x = train_x.reshape((train_x.shape[0], train_x.shape[1], n_features))
    # Define model for (Grid search architecture)
    model = Sequential()
    model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features)))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(1))
    # Compile model (Grid search architecture)
    model.compile(loss='mse', optimizer='adam')
    # fit model
    model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch, verbose=0)
    return model
Modified LSTM-hidden layer code, that fails to run:
# Define model for (Grid search architecture)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
Another variant that also threw an error - ValueError: Error when checking target: expected time_distributed_4 to have 3 dimensions, but got array with shape (34844, 1)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=False))
model.add(RepeatVector(n_input))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
Could anyone with any suggestion please help me ?
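Try setting return_sequences=False on the last LSTM layer and ending with a plain Dense(1) instead of TimeDistributed(Dense(1)). Your target has shape (34844, 1), i.e. one value per sample, whereas a TimeDistributed layer over a sequence output produces a 3D tensor, which is what the error message is complaining about.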
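A minimal sketch of what the stacked version could look like under that suggestion. The values of n_input and n_nodes are hypothetical stand-ins for one grid-search configuration. The intermediate LSTM layers return sequences so the next LSTM gets 3D input, and the last one returns a single vector per sample.

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Hypothetical values standing in for one grid-search configuration
n_input, n_nodes, n_features = 12, 16, 1

model = Sequential([
    Input(shape=(n_input, n_features)),
    LSTM(n_nodes, activation='relu', return_sequences=True),
    LSTM(n_nodes, activation='relu', return_sequences=True),
    LSTM(n_nodes, activation='relu'),   # return_sequences defaults to False
    Dense(1),                           # one prediction per sample
])
model.compile(loss='mse', optimizer='adam')
print(model.output_shape)
```

The output is 2D with one value per sample, which is the shape that a train_y built as data[:, -1] can be fitted against.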

Can a Keras CNN predict multiple classes?

I made a keras CNN model to predict different hand poses, and the model was not predicting the correct output. I had 10 classes. But for some images it was showing results like [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]. My question is why is this happening.
My Architecture.
model = Sequential()
model.add(Conv2D(32, (5,5), input_shape=x.shape[1:]))
model.add(Conv2D(32, (5,5), input_shape=x.shape[1:]))
model.add(Conv2D(32, (5,5), input_shape=x.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(2,2))
model.add(Conv2D(64, (3,3), input_shape=x.shape[1:]))
model.add(Conv2D(64, (3,3), input_shape=x.shape[1:]))
model.add(Conv2D(64, (3,3), input_shape=x.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(2,2))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer = 'adam',
metrics = ['accuracy']
)
model.fit(x, y, epochs=10)
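You are using binary_crossentropy loss, which should be used for binary classification problems. For multiclass problems you should be using categorical_crossentropy. You should also change the activation on the last layer to softmax: with sigmoid, each of the 10 outputs is squashed independently, so several of them can be close to 1 at once, which is how you end up with results like [0, 1, 0, 0, 1, 0, 0, 0, 0, 0].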
This is the obvious engineering issue I can see; having said that, you probably will have to experiment with the number of layers, epochs, learning rates etc to get a working model.
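To illustrate the difference between the two activations, here is a small plain-Python sketch. The logits are made-up numbers chosen for illustration, not outputs of the asker's model. Sigmoid treats each output independently, so rounding can switch on several classes at once, whereas softmax produces one distribution over the classes with a single argmax.

```python
import math

# Made-up final-layer logits for 10 classes (illustrative only)
logits = [-2.0, 3.1, -1.0, -0.5, 2.9, -3.0, -1.5, -2.2, -0.7, -4.0]

# Sigmoid: each output is squashed on its own
sigmoid = [1.0 / (1.0 + math.exp(-z)) for z in logits]
rounded_sigmoid = [round(p) for p in sigmoid]

# Softmax: outputs compete and sum to 1 (shifted by the max for stability)
shift = max(logits)
exps = [math.exp(z - shift) for z in logits]
total = sum(exps)
softmax = [e / total for e in exps]
predicted_class = softmax.index(max(softmax))

print(rounded_sigmoid)
print(predicted_class)
```

Here the rounded sigmoid outputs mark both class 1 and class 4 as active, while the softmax argmax picks class 1 only.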

How to increase the accuracy of the keras model and prevent overfitting

I am trying to train
model.add(Conv2D(32, (3, 3), kernel_initializer='random_uniform', activation='relu', input_shape=(x1, x2, depth)))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(BatchNormalization())
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.4))
model.add(Conv2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(3, activation='softmax'))
Here's how I'm compiling it:
sgd = optimizers.SGD(lr=0.1, decay=0.0, momentum=0.05, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
I've tried various learning rates and different optimizers. But the accuracy doesn't seem to go beyond 50% as shown below:
My images are properly normalized around 0 with STD as 1.
Is there something I am missing? How can I improve the accuracy of the model?
EDIT:
Hey, when I use the following data generator:
train_datagen = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(featurewise_center=True,
featurewise_std_normalization=True)
train_generator = train_datagen.flow(np.array(X_train), batch_size=batchsize)
valid_generator = test_datagen.flow(np.array(X_test), batch_size = batchsize)
history = model.fit_generator(train_datagen.flow(np.array(X_train), y_train_cat, batch_size=batchsize),
steps_per_epoch=len(X_train) // batchsize, epochs=epochs,
validation_data= valid_generator,
validation_steps=len(X_test) // batchsize)
I get the following error:
TypeError: '>' not supported between instances of 'int' and 'str'
I used to solve this by either updating numpy or uninstalling it and installing it again, but this time, it's not working with either. Can you help me with it?
You have already covered things like tweaking learning rates, dropout, batch normalization, etc., which is a good starting point.
Have you tried regularization?
Check out
https://cambridgespark.com/content/tutorials/neural-networks-tuning-techniques/index.html
If that doesn't help, you might need to look at how the input is structured and see whether there are other representations that make it easier for the network to converge. This includes making sure that the training and validation sets have the same level of variance in the data. This is, however, more specific to the domain of the problem you are trying to solve.
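As one possible reading of "regularization" here, the sketch below adds an L2 weight penalty to the convolutional and dense layers and an early-stopping callback. The image size, the penalty strength 1e-4 and the patience value are assumed starting points, not tuned values, and y_test_cat in the commented-out call is a hypothetical name for one-hot validation labels.

```python
from tensorflow.keras import Input, regularizers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

l2 = regularizers.l2(1e-4)   # assumed penalty strength

model = Sequential([
    Input(shape=(64, 64, 3)),   # hypothetical image size
    Conv2D(32, (3, 3), activation='relu', kernel_regularizer=l2),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu', kernel_regularizer=l2),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu', kernel_regularizer=l2),
    Dropout(0.3),
    Dense(3, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# Stop when validation loss stops improving and keep the best weights
early_stop = EarlyStopping(monitor='val_loss', patience=5,
                           restore_best_weights=True)

# Hypothetical training call; y_test_cat is an assumed name:
# model.fit(X_train, y_train_cat, validation_data=(X_test, y_test_cat),
#           epochs=epochs, callbacks=[early_stop])
```

Whether this helps depends on the data. If training accuracy is also stuck near 50%, the model is underfitting and more regularization is unlikely to be the fix.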

How can I train on video data using Keras? "transfer learning"

I want to train my model on video data for gesture recognition, and it was proposed that I use LSTMs and TimeDistributed layers. Would this be an ideal way to tackle my problem?
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
model.add(TimeDistributed(Dense(62), input_shape=(img_width, img_height,3)))
model.add(Conv2D(32, (3, 3)))
model.add(Dropout(0.25))
model.add(Conv2D(32, (3, 3)))
model.add(MaxPooling2D(pool_size=pool_size))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units = 1, activation = 'sigmoid'))
print('Train...')
model.summary()
# Run epochs of sampling data then training
For temporal sequence data, LSTM networks are generally a good choice. If you want to analyze video, then a combination with 2D convolutions sounds reasonable to me. However, you have to apply TimeDistributed to all layers that don't expect sequence data. In your example that means all layers except the LSTMs.
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
model.add(TimeDistributed(Dense(62), input_shape=(img_width, img_height,3)))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(Dropout(0.25))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(TimeDistributed(MaxPooling2D(pool_size=pool_size)))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units = 1, activation = 'sigmoid'))
print('Train...')
model.summary()
# run epochs of sampling data then training
The last Dense layer can stay as it is because the final LSTM doesn't output a sequence.
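For reference, here is a self-contained sketch of the same idea. It assumes each sample is a clip of n_frames frames, so the input is 5D (batch, frames, height, width, channels); the frame count and image size are hypothetical. It uses the standard LSTM layer in place of CuDNNLSTM, which in TensorFlow 2.x is only available under the compat.v1 namespace.

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (TimeDistributed, Conv2D, MaxPooling2D,
                                     Flatten, Dropout, LSTM, Dense)

# Hypothetical clip size: 8 frames of 32x32 RGB images
n_frames, img_height, img_width = 8, 32, 32

model = Sequential([
    Input(shape=(n_frames, img_height, img_width, 3)),
    # Per-frame layers are wrapped so they run on every time step
    TimeDistributed(Conv2D(32, (3, 3), activation='relu')),
    TimeDistributed(MaxPooling2D(pool_size=4)),
    Dropout(0.25),
    TimeDistributed(Flatten()),          # -> (batch, frames, features)
    LSTM(64, return_sequences=True),
    LSTM(32),                            # final LSTM returns one vector per clip
    Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam')
print(model.output_shape)
```

The single sigmoid unit matches the binary output used in the snippets above. For several gesture classes, a softmax layer with one unit per class would be the usual replacement.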
