I want to train my model on video data for gesture recognition, and I propose using LSTMs and TimeDistributed layers. Would this be an ideal way to tackle my problem?
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
model.add(TimeDistributed(Dense(62), input_shape=(img_width, img_height,3)))
model.add(Conv2D(32, (3, 3)))
model.add(Dropout(0.25))
model.add(Conv2D(32, (3, 3)))
model.add(MaxPooling2D(pool_size=pool_size))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units = 1, activation = 'sigmoid'))
print('Train...')
model.summary()
# Run epochs of sampling data then training
For temporal sequence data, LSTM networks are generally the right choice. If you want to analyze video, then a combination with 2D convolutions sounds reasonable to me. However, you have to apply TimeDistributed to all layers that don't expect sequence data. In your example, that means all layers except the LSTM layers.
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D, TimeDistributed, CuDNNLSTM
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
# With TimeDistributed, the input shape must include the time (frame) dimension:
# (frames, img_width, img_height, 3), where frames is the number of frames per clip
model.add(TimeDistributed(Dense(62), input_shape=(frames, img_width, img_height, 3)))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(Dropout(0.25))  # Dropout works elementwise, so it needs no TimeDistributed wrapper
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(TimeDistributed(MaxPooling2D(pool_size=pool_size)))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units=1, activation='sigmoid'))
print('Train...')
model.summary()
# run epochs of sampling data then training
The last Dense layer can stay this way because the final LSTM doesn't output a sequence (return_sequences defaults to False), so it already produces a plain 2D output.
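For completeness, a minimal compile-and-fit sketch; the names x_train (a 5D tensor of shape (samples, frames, img_width, img_height, 3)) and y_train (binary gesture labels) are placeholders I'm assuming, not from the original post:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# x_train/y_train are assumed placeholders for your preprocessed video clips and labels
model.fit(x_train, y_train, batch_size=16, epochs=10, validation_split=0.2)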
I'm trying to construct an encoder to get the latent space in order to plot it. I don't really know if I can get it from the RepeatVector or if I have to add a Dense layer.
Here is my code:
model = Sequential()
model.add(LSTM(16, activation='relu', return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(LSTM(4, activation='relu', return_sequences=False))  # Encoder
model.add(RepeatVector(x_train.shape[1]))  # Latent
model.add(LSTM(4, activation='relu', return_sequences=True))  # Decoder
model.add(LSTM(16, activation='relu', return_sequences=True))  # Decoder (must return sequences so TimeDistributed can follow)
model.add(TimeDistributed(Dense(x_train.shape[2])))  # Decoder
You need to separate the model into two parts (an encoder and a decoder).
Then build the encoder as its own Model from the input and the output of the encoder part.
By the way, that output is the last layer before the RepeatVector.
encoder = Model(inputs, output_from_encoder)
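A Sequential model does not expose its intermediate tensors, so one way to do this is to rebuild the autoencoder with the functional API and wrap the encoder part in its own Model. A sketch with the same layer sizes as above; timesteps and n_features are placeholders for x_train.shape[1] and x_train.shape[2]:
from keras.models import Model
from keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense

inputs = Input(shape=(timesteps, n_features))
x = LSTM(16, activation='relu', return_sequences=True)(inputs)
latent = LSTM(4, activation='relu', return_sequences=False)(x)  # encoder output = latent code
x = RepeatVector(timesteps)(latent)
x = LSTM(4, activation='relu', return_sequences=True)(x)
x = LSTM(16, activation='relu', return_sequences=True)(x)
outputs = TimeDistributed(Dense(n_features))(x)

autoencoder = Model(inputs, outputs)  # train this end to end
encoder = Model(inputs, latent)       # encoder.predict(...) then yields the latent space to plot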
I am using the fashion_mnist image database (60,000 small square 28×28 pixel grayscale images), and I am trying to apply a CNN-LSTM cascade. This is the code I am using:
from tensorflow.keras.datasets import fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
print("Shape of x_train: {}".format(x_train.shape))
print("Shape of y_train: {}".format(y_train.shape))
print()
print("Shape of x_test: {}".format(x_test.shape))
print("Shape of y_test: {}".format(y_test.shape))
# define CNN model
model = Sequential()
model.add(TimeDistributed(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=(60000,28,28))))
model.add(TimeDistributed(Conv2D(64, (3, 3), activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed((Dropout(0.25))))
model.add(TimeDistributed(Flatten()))
## LSTM
model.add(LSTM(200, activation='relu', return_sequences=True))
model.add(Dense(128, activation='relu'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
metrics=['accuracy'])
##fitting model
model.fit(x_train,y_train,epochs=5)
test_loss, test_acc=model.evaluate(x_test,y_test)
print('Loss: {0} - Acc: {1}'.format(test_loss, test_acc))
I get an error after running the fitting line; can anyone help me solve it?
Define an explicit Input layer whose per-sample output has the four dimensions that TimeDistributed(Conv2D) expects (timesteps, height, width, channels), instead of defining the input shape on the Conv2D layer.
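A sketch of that fix under my assumptions: each image is treated as a one-step sequence, so the data is reshaped to (samples, 1, 28, 28, 1), the labels are one-hot encoded, and a 10-way softmax head is appended:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D, Dropout,
                                     Flatten, TimeDistributed, LSTM, Dense)
from tensorflow.keras.utils import to_categorical

# add explicit time and channel axes: (samples, timesteps, height, width, channels)
x_train = x_train.reshape(-1, 1, 28, 28, 1).astype('float32') / 255.0
x_test = x_test.reshape(-1, 1, 28, 28, 1).astype('float32') / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

model = Sequential()
model.add(Input(shape=(1, 28, 28, 1)))  # per-sample shape the TimeDistributed Conv2D expects
model.add(TimeDistributed(Conv2D(32, kernel_size=(3, 3), activation='relu')))
model.add(TimeDistributed(Conv2D(64, (3, 3), activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(Dropout(0.25)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(200, activation='relu'))  # last LSTM: no return_sequences, output is 2D
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))  # 10-way classification head
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)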
I have the code below, which implements an architecture (in a grid search) to find appropriate parameters for the input, nodes, epochs, batch size, and differenced time-series input.
The challenge I have is converting the neural network from having just one LSTM hidden layer to multiple LSTM hidden layers.
At the moment, I can only run the code with Dense-type hidden layers without any errors being thrown; otherwise I get dimension errors, tuple errors, and so on.
The problem is only in the neural network architecture section.
Original code that works:
def model_fit(train, config):
    # unpack config
    n_input, n_nodes, n_epochs, n_batch, n_diff = config
    # Data
    if n_diff > 0:
        train = difference(train, n_diff)
    # Time series to supervised format
    data = series_to_supervised(train, n_in=n_input)
    train_x, train_y = data[:, :-1], data[:, -1]
    # Reshaping input data into [samples, timesteps, features]
    n_features = 1
    train_x = train_x.reshape((train_x.shape[0], train_x.shape[1], n_features))
    # Define model (grid search architecture)
    model = Sequential()
    model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features)))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(1))
    # Compile model (grid search architecture)
    model.compile(loss='mse', optimizer='adam')
    # fit model
    model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch, verbose=0)
    return model
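For reference, a usage sketch with hypothetical config values (the tuple order follows the unpacking inside model_fit; train stands in for your training series):
config = (24, 50, 100, 32, 0)  # hypothetical (n_input, n_nodes, n_epochs, n_batch, n_diff)
model = model_fit(train, config)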
Modified code with LSTM hidden layers, which fails to run:
# Define model for (Grid search architecture)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
Another variant that also threw an error - ValueError: Error when checking target: expected time_distributed_4 to have 3 dimensions, but got array with shape (34844, 1)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=False))
model.add(RepeatVector(n_input))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
Could anyone with any suggestions please help me?
Try setting return_sequences=False on the last LSTM layer. Your target has shape (samples, 1), so the network's final output must be 2D rather than a sequence; with return_sequences=False you can also replace TimeDistributed(Dense(1)) with a plain Dense(1).
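A sketch of the first variant with that change applied (same names as in the question; swapping TimeDistributed(Dense(1)) for a plain Dense(1) is my addition, to match the 2D target of shape (samples, 1)):
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=False))  # last LSTM returns a 2D tensor
model.add(Dense(1))  # matches the target shape (samples, 1)
model.compile(loss='mse', optimizer='adam')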
I want to combine two models, DropBlock (4-dimensional input) and CuDNNGRU (3-dimensional input), for MNIST image classification. How can I combine the different input dimensionalities of the DropBlock and GRU models?
GRU is mostly used for document classification, but here I'm trying to classify the MNIST dataset. Both models separately score ~98%. I want to combine the DropBlock and GRU models into a single classification model.
Here is the code combining the DropBlock and GRU models for MNIST classification:
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train/255.0
x_test = x_test/255.0
print(x_train.shape)
print(x_train[0].shape)
#load the model
model = Sequential()
gru = Sequential()
gru.add(CuDNNGRU(128,input_shape=(x_train.shape[1:]),
return_sequences=True))
gru.add(BatchNormalization())
gru.add(CuDNNGRU(128))
gru.add(BatchNormalization())
print(gru.output_shape)
drop = Sequential()
drop.add(DropBlock2D(input_shape=(28, 28, 1), block_size=7,
keep_prob=0.8, name='Input-Dropout'))
drop.add(Conv2D(filters=64, kernel_size=3, activation='relu',
padding='same', name='Conv-1'))
drop.add(MaxPool2D(pool_size=2, name='Pool-1'))
drop.add(DropBlock2D(block_size=5, keep_prob=0.8, name='Dropout-1'))
drop.add(Conv2D(filters=32, kernel_size=3, activation='relu',
padding='same', name='Conv-2'))
drop.add(MaxPool2D(pool_size=2, name='Pool-2'))
print(drop.output_shape)
model.add(merge([drop, gru], mode='concat', concat_axis=2))
print(model.output_shape)
model.add(Flatten(name='Flatten'))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
score = model.evaluate(x_test, y_test, verbose=0)
The expected result is ~98% accuracy from this combined classification model.
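One common Keras 2 pattern, given only as a sketch and not a verified solution: merge(..., mode='concat') comes from the old Keras 1 API, so in Keras 2 the two branches can instead be built on a shared input with the functional API and their flattened feature vectors concatenated (this assumes x_train was given a trailing channel axis, and reuses the `drop` and `gru` Sequentials from above as layers):
from keras.models import Model
from keras.layers import Input, Reshape, Flatten, Dense, Dropout, concatenate

inputs = Input(shape=(28, 28, 1))

# CNN/DropBlock branch: call the `drop` Sequential on the input, then flatten
d = Flatten()(drop(inputs))

# GRU branch expects 3D input (timesteps, features), so drop the channel axis first
g = gru(Reshape((28, 28))(inputs))

merged = concatenate([d, g])  # concatenate the two feature vectors
x = Dense(128, activation='relu')(merged)
x = Dropout(0.5)(x)
outputs = Dense(10, activation='softmax')(x)

model = Model(inputs, outputs)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])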
I have a dataset of about 160k images in 160 classes, and I'm trying to classify them using a CNN. Training on 120k images for 20 epochs, I start with a loss of ~4.9 and val_loss of ~4.6, which improve to about 3.3 and 3.2 after 20 epochs. I really tried to read the Keras documentation and understand what this means, but I couldn't, so I'm asking if someone would explain it to me in the context of my model. What does the loss score represent? What does it say about the model?
num_classes = 154
batch_size = 64
input_shape = (50,50,3)
epochs = 20
X, y = load_data()
# train/test split omitted in the original post; x_train, x_test, y_train, y_test are assumed below
# input image dimensions
img_rows, img_cols = 50, 50
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Conv2D(64, kernel_size=(5, 5),
activation='relu',
padding = 'same',
input_shape=input_shape))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(3, 3)))
model.add(Dropout(0.20))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
I think it will help to watch a few tutorials on CNNs, to begin with. Basically, you want your loss to decrease over the training epochs, which is what is observed in your case. Typically we look at how both losses evolve over the entire training period. Observing how the training and validation loss change helps us understand whether the model is overfitting or not. You can check this link for a basic explanation of how to detect overfitting.
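One way to observe this in Keras (a sketch, assuming the model and data from the question, with the fit call given validation data):
import matplotlib.pyplot as plt

history = model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
                    validation_data=(x_test, y_test))

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('categorical cross-entropy')
plt.legend()
plt.show()
# a validation curve that flattens or rises while the training curve keeps
# falling is the usual sign of overfitting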
Ideally, you want both your training and validation loss to decrease with iterations. The loss is a measure of the error the model commits in classification; here it is the categorical cross-entropy between the predicted class probabilities and the true labels, so as accuracy increases you expect the loss to decrease.
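To make the number concrete, here is a small worked example of how categorical cross-entropy is computed for a single sample (the probabilities are illustrative values, not from the question's data):
import numpy as np

# one-hot true label and the model's predicted probabilities over 4 classes
y_true = np.array([0, 0, 1, 0])
y_pred = np.array([0.1, 0.2, 0.6, 0.1])

# categorical cross-entropy: -sum(y_true * log(y_pred))
loss = -np.sum(y_true * np.log(y_pred))
print(loss)  # ~0.51; a confident correct prediction drives this toward 0

# a near-uniform prediction over 154 classes gives loss ~ log(154) ~ 5.04,
# which is roughly where the training in the question starts (~4.9)
print(np.log(154))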