I built a Keras CNN model to predict different hand poses, but it does not predict the correct output. I have 10 classes, yet for some images the model returns results like [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]. Why is this happening?
My architecture:
model = Sequential()
model.add(Conv2D(32, (5,5), input_shape=x.shape[1:]))
model.add(Conv2D(32, (5,5), input_shape=x.shape[1:]))
model.add(Conv2D(32, (5,5), input_shape=x.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(2,2))
model.add(Conv2D(64, (3,3), input_shape=x.shape[1:]))
model.add(Conv2D(64, (3,3), input_shape=x.shape[1:]))
model.add(Conv2D(64, (3,3), input_shape=x.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(2,2))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer = 'adam',
metrics = ['accuracy']
)
model.fit(x, y, epochs=10)
You are using the binary_crossentropy loss, which is meant for binary classification problems. For multiclass problems you should use categorical_crossentropy. You might also want to change the activation on the last layer to softmax.
This is the obvious engineering issue I can see; having said that, you probably will have to experiment with the number of layers, epochs, learning rates etc to get a working model.
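To see why a sigmoid output layer can flag two classes at once, here is a minimal sketch in plain Python (no Keras involved). The logits are made-up numbers chosen for illustration, not values taken from the model above. Each sigmoid output is computed independently, so several can exceed 0.5, whereas softmax outputs share a total of 1.

```python
import math

def sigmoid(z):
    """Squash a single logit into (0, 1), independently of the others."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Turn a list of logits into probabilities that sum to 1."""
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for 10 classes (illustrative values only)
logits = [-3.0, 2.5, -1.0, -2.0, 2.0, -4.0, -3.5, -2.5, -1.5, -3.0]

sigmoid_out = [sigmoid(z) for z in logits]
softmax_out = softmax(logits)

# Thresholding independent sigmoids can switch on more than one class
sigmoid_labels = [1 if p > 0.5 else 0 for p in sigmoid_out]

# Softmax yields one distribution, so picking the largest gives one class
predicted_class = softmax_out.index(max(softmax_out))

print(sigmoid_labels)
print(predicted_class)
```

With these logits, two sigmoid outputs cross the 0.5 threshold, which reproduces the kind of vector described in the question, while the softmax distribution singles out one class.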
I'm trying to build a univariate time series forecasting model. The current architecture looks like this:
model = Sequential()
model.add(Bidirectional(LSTM(20, return_sequences=True), input_shape=(n_steps_in, n_features)))
model.add(Bidirectional(LSTM(20, return_sequences=True)))
model.add(Conv1D(64, 3, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(n_steps_out))
Then I tried the following, which places all CNN layers before Bi-LSTM layers (but doesn't work):
model = Sequential()
model.add(Conv1D(64, 3, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(Conv1D(64, 3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Bidirectional(LSTM(20, input_shape=(n_steps_in, n_features), return_sequences=True)))
model.add(Bidirectional(LSTM(20, return_sequences=True)))
model.add(Dense(100, activation='relu'))
model.add(Dense(n_steps_out))
The latest implementation doesn't seem to work. Any suggestions for fixing this? Another question: is there a single rule for deciding whether the CNN should come before the Bi-LSTM or vice versa?
May I know what exactly you mean by saying it doesn't seem to work?
I would recommend performing the convolutions before the LSTM, as you did in the second approach. But here are a few things to note.
First, return the sequences from an LSTM layer only when the next layer is also an LSTM:
model.add(Bidirectional(LSTM(20, input_shape=(n_steps_in, n_features), return_sequences=True)))
model.add(Bidirectional(LSTM(20)))
model.add(Dense(1))
Second, you could try using GlobalAveragePooling1D instead of MaxPooling1D, as the former takes all features into account (an important factor to note compared to classifying images, for instance). Note that it takes no pool_size argument:
model.add(GlobalAveragePooling1D())
Your network receives sequences as input and outputs sequences, so you need to take care of dimensionality. To do this, you have to adjust the padding in the convolutional layers and the pooling operation. You also need to set return_sequences=True in your last LSTM layer (you are predicting a sequence). In the examples below I use your network with padding and remove the Flatten layer, which destroys the 3D dimensionality.
You can apply the convolution before or after the LSTM. The best way to decide is to try both and evaluate the performance on a trustworthy validation set.
CNN + LSTM
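import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Bidirectional, LSTM, Dense
n_sample = 100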
n_steps_in, n_steps_out, n_features = 30, 15, 1
X = np.random.uniform(0,1, (n_sample, n_steps_in, n_features))
y = np.random.uniform(0,1, (n_sample, n_steps_out, n_features))
model = Sequential()
model.add(Conv1D(64, 3, activation='relu', padding='same',
input_shape=(n_steps_in, n_features)))
model.add(Conv1D(64, 3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Bidirectional(LSTM(20, return_sequences=True)))
model.add(Bidirectional(LSTM(20, return_sequences=True)))
model.add(Dense(1))
model.compile('adam', 'mse')
model.summary()
model.fit(X,y, epochs=3)
LSTM + CNN
model = Sequential()
model.add(Bidirectional(LSTM(20, return_sequences=True),
input_shape=(n_steps_in, n_features)))
model.add(Bidirectional(LSTM(20, return_sequences=True)))
model.add(MaxPooling1D(pool_size=2))
model.add(Conv1D(64, 3, activation='relu', padding='same'))
model.add(Conv1D(64, 3, padding='same', activation='relu'))
model.add(Dense(1))
model.compile('adam', 'mse')
model.summary()
model.fit(X,y, epochs=3)
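The role of padding='same' in the two examples above can be checked with the sequence-length arithmetic alone. This is a sketch in plain Python; the helper names conv1d_len and pool1d_len are made up for illustration and only mirror the usual output-length formulas, assuming the default pooling stride equal to pool_size.

```python
def conv1d_len(length, kernel, padding="valid", stride=1):
    """Output length of a 1D convolution along the time axis."""
    if padding == "same":
        return -(-length // stride)  # ceil(length / stride)
    return (length - kernel) // stride + 1

def pool1d_len(length, pool_size):
    """Output length of non-overlapping pooling (stride == pool_size)."""
    return (length - pool_size) // pool_size + 1

n_steps_in, n_steps_out = 30, 15

# Two Conv1D layers with padding='same' keep the length at 30,
# then pooling by 2 halves it to match n_steps_out.
length = conv1d_len(n_steps_in, 3, "same")
length = conv1d_len(length, 3, "same")
same_len = pool1d_len(length, 2)

# Without padding each kernel of size 3 trims 2 steps: 30 -> 28 -> 26,
# and pooling then gives 13, which no longer matches n_steps_out.
length = conv1d_len(n_steps_in, 3)
length = conv1d_len(length, 3)
valid_len = pool1d_len(length, 2)

print(same_len, valid_len)
```

Under these assumptions only the padded variant ends with as many time steps as the target y, which is why the final Dense(1) output lines up with a target of shape (n_steps_out, 1).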
I am trying to train the following model:
model.add(Conv2D(32, (3, 3), kernel_initializer='random_uniform', activation='relu', input_shape=(x1, x2, depth)))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(BatchNormalization())
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.4))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(3, activation='softmax'))
Here's how I'm compiling it:
sgd = optimizers.SGD(lr=0.1, decay=0.0, momentum=0.05, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
I've tried various learning rates and different optimizers, but the accuracy doesn't seem to go beyond 50%.
My images are normalized to zero mean and a standard deviation of 1.
Is there something I am missing? How can I improve the accuracy of the model?
EDIT:
Hey, when I use the following data generator:
train_datagen = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(featurewise_center=True,
featurewise_std_normalization=True)
train_generator = train_datagen.flow(np.array(X_train), batch_size=batchsize)
valid_generator = test_datagen.flow(np.array(X_test), batch_size = batchsize)
history = model.fit_generator(train_datagen.flow(np.array(X_train), y_train_cat, batch_size=batchsize),
steps_per_epoch=len(X_train) // batchsize, epochs=epochs,
validation_data= valid_generator,
validation_steps=len(X_test) // batchsize)
I get the following error:
TypeError: '>' not supported between instances of 'int' and 'str'
I used to solve this by either updating numpy or uninstalling it and installing it again, but this time, it's not working with either. Can you help me with it?
You have already covered things like tweaking learning rates, dropout, and batch normalization, which is a good starting point.
Have you tried regularization?
Check out
https://cambridgespark.com/content/tutorials/neural-networks-tuning-techniques/index.html
If that doesn't help, you might need to look at how the input is structured and see whether other representations would help the network converge. This includes making sure that the training and validation sets have the same level of variance in the data. This is, however, more specific to the domain of the problem you are trying to solve.
model.add(Conv2D(32, (5, 5),
padding='same',
data_format='channels_last',
input_shape=input_shape))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.25))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
This is what my current keras model looks like, which is completely borrowed from here.
My question has two parts:
1. How do I automatically determine whether to use model.add(Conv2D(32, (3, 3))), model.add(Conv2D(32, (5, 5))), or model.add(Conv2D(32, (4, 4)))?
2. Except for the first line of the model, if I change the rest of the Conv2D(64, (3, 3)) layers to (5, 5), I get `negative dimension obtained by subtracting 5 from 3`. Why is that?
I looked at these two questions: "Selecting number of strides and filters in CNN (Keras)" and "Conv2D layer output shape in keras".
According to them, experimenting is the only way to find out, but I was wondering if there is an automatic way to do it.
There are so many parameters, such as the dropout rate, the kernel size, and the number of units in Dense() (should it be 512, 356, or something else?).
PS:
Running different models with different parameters is becoming computationally expensive, and comparing all these results is becoming another painful process.
My laptop has a 2GB nvidia graphic card with 5.0 compute capability.
The kernel dimensions are hyperparameters that you can optimize automatically using a number of strategies.
The output height/width of a convolutional layer without padding follows the equation size = ((input_size - kernel_size) / stride) + 1. So you are using too many convolutional layers for an image that is too small. At some point the size becomes negative, and an output cannot have a negative shape.
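The formula can be traced layer by layer to see where the size turns negative. A minimal sketch in plain Python: the 18-pixel input and the layer list are hypothetical, picked only so that the last step subtracts 5 from 3 as in the reported error, and conv_output_size is an illustrative helper rather than a Keras function. It covers unpadded layers; with padding='same' and stride 1 the size would not shrink.

```python
def conv_output_size(input_size, kernel_size, stride=1):
    """size = ((input_size - kernel_size) / stride) + 1, rounded down."""
    return (input_size - kernel_size) // stride + 1

# Hypothetical stack of unpadded layers as (kernel_size, stride):
# 5x5 conv, 2x2 pooling, 5x5 conv, 5x5 conv
layers = [(5, 1), (2, 2), (5, 1), (5, 1)]

size = 18  # hypothetical input height/width
trace = [size]
for kernel_size, stride in layers:
    size = conv_output_size(size, kernel_size, stride)
    trace.append(size)

print(trace)  # [18, 14, 7, 3, -1]
```

The last convolution receives a 3-pixel feature map and tries to slide a 5-pixel kernel over it, which is the situation the error message describes. Fewer layers, smaller kernels, or padding='same' would keep the size positive.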
I want to train my model on video data for gesture recognition, and I plan to use LSTMs and TimeDistributed layers. Would this be an ideal way to tackle my problem?
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
model.add(TimeDistributed(Dense(62), input_shape=(img_width, img_height,3)))
model.add(Conv2D(32, (3, 3)))
model.add(Dropout(0.25))
model.add(Conv2D(32, (3, 3)))
model.add(MaxPooling2D(pool_size=pool_size))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units = 1, activation = 'sigmoid'))
print('Train...')
model.summary()
# Run epochs of sampling data then training
For temporal sequence data, LSTM networks are generally the right choice. If you want to analyze video, then a combination with 2D convolutions sounds reasonable to me. However, you have to apply TimeDistributed to all layers which don't expect sequence data. In your example that means all layers except the LSTM layers.
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
model.add(TimeDistributed(Dense(62), input_shape=(img_width, img_height,3)))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(Dropout(0.25))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(TimeDistributed(MaxPooling2D(pool_size=pool_size)))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units = 1, activation = 'sigmoid'))
print('Train...')
model.summary()
# run epochs of sampling data then training
The last Dense layer can stay this way because the final LSTM doesn't output a sequence.
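As a conceptual illustration of what wrapping a layer in TimeDistributed achieves, here is a sketch in plain Python. It is not how Keras implements the wrapper; time_distributed and flatten are hypothetical stand-ins, and the tiny 4-frame clip is made-up data. The idea is that the same per-frame operation is applied to every time step, so the time axis survives for the LSTM.

```python
def time_distributed(layer_fn, sequence):
    """Apply the same per-frame function to every time step of one sample."""
    return [layer_fn(frame) for frame in sequence]

def flatten(frame):
    """Stand-in for Flatten: turn one 2D frame into a flat feature list."""
    return [pixel for row in frame for pixel in row]

# Hypothetical clip: 4 frames, each 2 rows x 3 columns of pixel values
clip = [[[t * 10 + r * 3 + c for c in range(3)] for r in range(2)]
        for t in range(4)]

features = time_distributed(flatten, clip)

# 4 time steps remain, each now holding 6 features: the (time, features)
# layout a recurrent layer consumes.
print(len(features), len(features[0]))
```

Flattening the whole clip at once would instead merge the frames into a single vector and lose the time axis.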
I'm trying to get softmax probabilities from a net whose last layer is a softmax layer, but when I use model.predict() I get classes instead of probabilities. Could anyone tell me how to get the probabilities?
model = Sequential()
model.add(Convolution2D(32, 3, 3,input_shape=(32, 32, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.5))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.5))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dense(43))
model.add(Activation('softmax'))
Your model's outputs will be values between 0 and 1.
Your model should give a vector of size 43 and the sum of all outputs will add to one.
Depending on your training, these "probabilities" will often be almost one for the selected class if similar to the training examples, showing that the model was well trained.
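A small sketch in plain Python of how a confident probability vector can look like a class vector. The probabilities are invented for illustration, and 5 classes are used instead of 43 for brevity; argmax here is a local helper, not a library call.

```python
def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical softmax output for one image (illustrative values only)
probs = [0.01, 0.02, 0.95, 0.01, 0.01]

class_index = argmax(probs)
one_hot = [1 if i == class_index else 0 for i in range(len(probs))]

# Rounding a confident distribution gives the same vector as the one-hot
# class encoding, even though the underlying values are probabilities.
rounded = [round(p) for p in probs]

print(class_index, one_hot, rounded)
```

If the predictions appear to be exact zeros and ones, printing them with more decimal places is one way to check whether they are merely very confident probabilities.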