TimeDistributed Model Not learning - keras

I am trying to train a model to play Chrome Dino (the offline game).
The idea is to take the last 6 screenshots of the game, run a CNN on each one separately (to extract features), and then feed those features as timesteps into an LSTM.
My training data is X = [6 timestep game screenshots] -> y = [1,0] (keep running, jump)
Timestep example
I have even balanced the dataset so that it has 50% jump examples and 50% keep-running examples.
Sadly, I am stuck at 50% accuracy, and the loss is stuck too.
198/198 [==============================] - 0s - loss: 0.6944 - acc: 0.4596
Epoch 91/100
198/198 [==============================] - 0s - loss: 0.6932 - acc: 0.5000
Epoch 92/100
198/198 [==============================] - 0s - loss: 0.6932 - acc: 0.5000
Epoch 93/100
198/198 [==============================] - 0s - loss: 0.6932 - acc: 0.5000
Epoch 94/100
198/198 [==============================] - 0s - loss: 0.6933 - acc: 0.5000
Epoch 95/100
198/198 [==============================] - 0s - loss: 0.6942 - acc: 0.5000
Epoch 96/100
198/198 [==============================] - 0s - loss: 0.6939 - acc: 0.5000
Epoch 97/100
198/198 [==============================] - 0s - loss: 0.6935 - acc: 0.5000
I have tried many model hyperparams with different layers, but I always get the same result.
Current model
model = Sequential()
model.add(TimeDistributed(Convolution2D(64, 3, 3, activation='relu'), input_shape=(FRAMES_TO_PROCESS, FRAME_HEIGHT,FRAME_WIDTH, FRAME_FILTERS )))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
model.add(TimeDistributed(ZeroPadding2D((1,1))))
model.add(TimeDistributed(Convolution2D(64, 3, 3, activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=(2,2))))
model.add(TimeDistributed(ZeroPadding2D((1,1))))
model.add(TimeDistributed(Convolution2D(128, 3, 3, activation='relu')))
model.add(TimeDistributed(ZeroPadding2D((1,1))))
model.add(TimeDistributed(Convolution2D(128, 3, 3, activation='relu')))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2), strides=(2,2))))
model.add(Activation('relu'))
model.add(TimeDistributed(Flatten()))
model.add(Dropout(0.1))
model.add(LSTM(120, return_sequences=False))
model.add(Dense(2, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])
Any idea what went wrong?
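A side note on the numbers in the log: a loss of about 0.6932 is what categorical cross-entropy gives when a two-class softmax outputs 0.5 for both classes, i.e. the network has fallen back to a constant guess. The minimal sketch below only checks that arithmetic; the variable names are illustrative and it says nothing about why this particular model ends up there.

```python
import math

# A two-class softmax that is "undecided" outputs 0.5 for each class.
predicted = [0.5, 0.5]
# One-hot target, e.g. [1, 0] = keep running.
target = [1.0, 0.0]

# Categorical cross-entropy for a single sample: -sum(t * log(p)).
loss = -sum(t * math.log(p) for t, p in zip(target, predicted))

print(round(loss, 4))  # ln(2), close to the 0.6932 seen in the log
```

If the logged loss sits at this value, it is probably worth inspecting the raw predictions to see whether they are all close to [0.5, 0.5].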

Related

Why the accuracy and val_accuracy are stuck at 0 LSTM Keras classification

I'm trying to train a neural network for classification. The target labels are -1 or 1.
The neural network is as follows:
def build_nn(window,n_features,lr = 0.001):
    _input = Input(shape = (window,n_features),name = 'input')
    x1 = LSTM(100, input_shape = (window,n_features), return_sequences = True,activation = 'relu')(_input)
    x2 = Dropout(0.5)(x1)
    x3 = LSTM(50 , return_sequences = False, activation = 'relu')(x2)
    x4 = Dropout(0.5)(x3)
    x5 = Dense(25,kernel_initializer = 'uniform', activation = 'relu')(x4)
    x6 = Dense(10,kernel_initializer = 'uniform', activation = 'relu')(x5)
    nn = Model(inputs = [_input],outputs = [x6])
    nn.compile(loss='binary_crossentropy',optimizer=Adam(lr = lr),metrics=['accuracy'])
    return nn
The inputs are windowed (generally the current value plus 1-2 past values). When I try to fit the model, the accuracy and val_accuracy are both 0 for almost 100 epochs. Sometimes I see a little variation in the accuracy (sometimes even in the val_accuracy), but it goes back to 0.
I'm also using EarlyStopping and ModelCheckpoint (I know that setting patience to 500 makes no difference, because the number of epochs is also 500):
early_stopping = EarlyStopping(patience = 500,verbose = True,monitor='val_accuracy')
check_point = ModelCheckpoint('lstm_classifier.hdf5',verbose=1,save_best_only=True,monitor = 'val_accuracy')
This is the model summary:
Model: "model_18"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) [(None, 2, 17)] 0
_________________________________________________________________
lstm_38 (LSTM) (None, 2, 100) 47200
_________________________________________________________________
dropout_38 (Dropout) (None, 2, 100) 0
_________________________________________________________________
lstm_39 (LSTM) (None, 50) 30200
_________________________________________________________________
dropout_39 (Dropout) (None, 50) 0
_________________________________________________________________
dense_36 (Dense) (None, 25) 1275
_________________________________________________________________
dense_37 (Dense) (None, 10) 260
=================================================================
Total params: 78,935
Trainable params: 78,935
Non-trainable params: 0
_________________________________________________________________
And this is a sample of the output when trying to fit the model.
Epoch 1/500
148/148 [==============================] - 10s 32ms/step - loss: 0.5422 - accuracy: 0.0000e+00 - val_loss: 0.2783 - val_accuracy: 0.0000e+00
Epoch 00001: val_accuracy improved from -inf to 0.00000, saving model to lstm_classifier.hdf5
Epoch 2/500
148/148 [==============================] - 4s 29ms/step - loss: 0.4661 - accuracy: 0.0000e+00 - val_loss: 0.2778 - val_accuracy: 0.0000e+00
Epoch 00002: val_accuracy did not improve from 0.00000
Epoch 3/500
148/148 [==============================] - 4s 24ms/step - loss: 0.4656 - accuracy: 0.0000e+00 - val_loss: 0.2772 - val_accuracy: 0.0000e+00
Epoch 00003: val_accuracy did not improve from 0.00000
Epoch 4/500
148/148 [==============================] - 4s 26ms/step - loss: 0.4631 - accuracy: 0.0000e+00 - val_loss: 0.2766 - val_accuracy: 0.0000e+00
Epoch 00004: val_accuracy did not improve from 0.00000
Epoch 5/500
148/148 [==============================] - 4s 29ms/step - loss: 0.4628 - accuracy: 0.0000e+00 - val_loss: 0.2763 - val_accuracy: 0.0000e+00
Epoch 00005: val_accuracy did not improve from 0.00000
Epoch 6/500
148/148 [==============================] - 5s 31ms/step - loss: 0.4608 - accuracy: 0.0000e+00 - val_loss: 0.2759 - val_accuracy: 0.0000e+00
Epoch 00006: val_accuracy did not improve from 0.00000
Epoch 7/500
148/148 [==============================] - 4s 25ms/step - loss: 0.4622 - accuracy: 0.0000e+00 - val_loss: 0.2754 - val_accuracy: 0.0000e+00
Epoch 00007: val_accuracy did not improve from 0.00000
Epoch 8/500
148/148 [==============================] - 4s 30ms/step - loss: 0.4583 - accuracy: 0.0000e+00 - val_loss: 0.2749 - val_accuracy: 0.0000e+00
Epoch 00008: val_accuracy did not improve from 0.00000
Epoch 9/500
148/148 [==============================] - 4s 30ms/step - loss: 0.4596 - accuracy: 0.0000e+00 - val_loss: 0.2748 - val_accuracy: 0.0000e+00
Epoch 00009: val_accuracy did not improve from 0.00000
Epoch 10/500
148/148 [==============================] - 4s 26ms/step - loss: 0.4570 - accuracy: 0.0000e+00 - val_loss: 0.2740 - val_accuracy: 0.0000e+00
Epoch 00010: val_accuracy did not improve from 0.00000
Epoch 11/500
148/148 [==============================] - 4s 27ms/step - loss: 0.4569 - accuracy: 0.0000e+00 - val_loss: 0.2921 - val_accuracy: 0.0000e+00
As you can see, the accuracy and val_accuracy are both stuck at 0, despite the fact that the loss and val_loss are changing. Generally, the accuracy and val_accuracy stay both at 0. Sometimes the model finds a better val_accuracy, but then it goes back to 0.
Epoch 00274: val_accuracy did not improve from 0.46822
Epoch 275/500
148/148 [==============================] - 4s 27ms/step - loss: 0.5844 - accuracy: 0.0000e+00 - val_loss: 0.3316 - val_accuracy: 0.0000e+00
I also noticed that the loss and val_loss tend to grow.
My question is: why am I seeing no changes (or only very small, rare changes) in the accuracy and val_accuracy, and what can I do to correct or improve this? Is something wrong with the model? I'm also scaling the features using MinMaxScaler(feature_range=(0,1))
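One thing that might be worth checking (this is an assumption, the thread does not confirm the cause): binary_crossentropy expects targets in {0, 1}, while the labels here are -1 or 1, and the model's last layer is Dense(10, relu) rather than a single sigmoid unit. The sketch below uses plain NumPy with made-up labels to show the remapping and to illustrate that the cross-entropy formula stops behaving like a loss when a target is -1. The helper `bce` is hypothetical and written out by hand; it is not the Keras implementation.

```python
import numpy as np

# Hypothetical raw labels in {-1, 1}, as described in the question.
y_raw = np.array([-1, 1, 1, -1, 1])

# Remap to {0, 1}: -1 -> 0 and 1 -> 1.
y01 = (y_raw + 1) // 2


def bce(y, p):
    """Hand-written binary cross-entropy for one target y and probability p."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))


# With a valid target (0 or 1) the loss is non-negative.
print(bce(0.0, 0.1))
# With a target of -1 the same formula can go negative,
# so minimising it no longer means "predict the right class".
print(bce(-1.0, 0.1))
```

With targets in {0, 1}, the usual pairing would be a final Dense(1, activation='sigmoid') layer, so that the output shape matches one label per sample.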

InceptionV3+LSTM activity recognition, accuracy grows for 10 epochs and then drops down

I'm trying to build a model to do activity recognition.
I'm using InceptionV3 as the backbone and an LSTM for the detection, with pre-trained weights.
train_generator = datagen.flow_from_directory(
    'dataset/train',
    target_size=(1,224, 224),
    batch_size=batch_size,
    class_mode='categorical',  # the generator yields batches of images with one-hot encoded labels
    shuffle=True,
    classes=['PlayingPiano','HorseRiding','Skiing', 'Basketball','BaseballPitch'])
validation_generator = datagen.flow_from_directory(
    'dataset/validate',
    target_size=(1,224, 224),
    batch_size=batch_size,
    class_mode='categorical',  # the generator yields batches of images with one-hot encoded labels
    shuffle=True,
    classes=['PlayingPiano','HorseRiding','Skiing', 'Basketball','BaseballPitch'])
return train_generator,validation_generator
I train on 5 classes, so I split my data into folders for train and validate.
This is my CNN+LSTM architecture
image = Input(shape=(None,224,224,3),name='image_input')
cnn = applications.inception_v3.InceptionV3(
weights='imagenet',
include_top=False,
pooling='avg')
cnn.trainable = False
encoded_frame = TimeDistributed(Lambda(lambda x: cnn(x)))(image)
encoded_vid = LSTM(256)(encoded_frame)
layer1 = Dense(512, activation='relu')(encoded_vid)
dropout1 = Dropout(0.5)(layer1)
layer2 = Dense(256, activation='relu')(dropout1)
dropout2 = Dropout(0.5)(layer2)
layer3 = Dense(64, activation='relu')(dropout2)
dropout3 = Dropout(0.5)(layer3)
outputs = Dense(5, activation='softmax')(dropout3)
model = Model(inputs=[image],outputs=outputs)
sgd = SGD(lr=0.001, decay = 1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd,loss='categorical_crossentropy', metrics=['accuracy'])
model.fit_generator(train_generator,validation_data = validation_generator,steps_per_epoch=300, epochs=nb_epoch,callbacks=callbacks,shuffle=True,verbose=1)
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
image_input (InputLayer) (None, None, 224, 224, 3) 0
_________________________________________________________________
time_distributed_1 (TimeDist (None, None, 2048) 0
_________________________________________________________________
lstm_1 (LSTM) (None, 256) 2360320
_________________________________________________________________
dense_1 (Dense) (None, 512) 131584
_________________________________________________________________
dropout_1 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 256) 131328
_________________________________________________________________
dropout_2 (Dropout) (None, 256) 0
_________________________________________________________________
dense_3 (Dense) (None, 64) 16448
_________________________________________________________________
dropout_3 (Dropout) (None, 64) 0
_________________________________________________________________
dense_4 (Dense) (None, 5) 325
_________________________________________________________________
The model compiles without any problem.
The problem starts during training. It reaches val_acc=0.50, then drops back to val_acc=0.30, and the loss freezes at about 0.80 and mostly doesn't move.
Here are the logs from training. As you can see, the model improves for some time, then slowly degrades, and later just freezes.
Any idea what the reason could be?
Epoch 00002: val_loss improved from 1.56471 to 1.55652, saving model to ./weights_inception/Inception_V3.02-0.28.h5
Epoch 3/500
300/300 [==============================] - 66s 219ms/step - loss: 1.5436 - acc: 0.3281 - val_loss: 1.5476 - val_acc: 0.2981
Epoch 00003: val_loss improved from 1.55652 to 1.54757, saving model to ./weights_inception/Inception_V3.03-0.30.h5
Epoch 4/500
300/300 [==============================] - 66s 220ms/step - loss: 1.5109 - acc: 0.3593 - val_loss: 1.5284 - val_acc: 0.3588
Epoch 00004: val_loss improved from 1.54757 to 1.52841, saving model to ./weights_inception/Inception_V3.04-0.36.h5
Epoch 5/500
300/300 [==============================] - 66s 221ms/step - loss: 1.4167 - acc: 0.4167 - val_loss: 1.4945 - val_acc: 0.3553
Epoch 00005: val_loss improved from 1.52841 to 1.49446, saving model to ./weights_inception/Inception_V3.05-0.36.h5
Epoch 6/500
300/300 [==============================] - 66s 221ms/step - loss: 1.2941 - acc: 0.4683 - val_loss: 1.4735 - val_acc: 0.4443
Epoch 00006: val_loss improved from 1.49446 to 1.47345, saving model to ./weights_inception/Inception_V3.06-0.44.h5
Epoch 7/500
300/300 [==============================] - 66s 221ms/step - loss: 1.2096 - acc: 0.5116 - val_loss: 1.3738 - val_acc: 0.5186
Epoch 00007: val_loss improved from 1.47345 to 1.37381, saving model to ./weights_inception/Inception_V3.07-0.52.h5
Epoch 8/500
300/300 [==============================] - 66s 221ms/step - loss: 1.1477 - acc: 0.5487 - val_loss: 1.2337 - val_acc: 0.5788
Epoch 00008: val_loss improved from 1.37381 to 1.23367, saving model to ./weights_inception/Inception_V3.08-0.58.h5
Epoch 9/500
300/300 [==============================] - 66s 221ms/step - loss: 1.0809 - acc: 0.5831 - val_loss: 1.2247 - val_acc: 0.5658
Epoch 00009: val_loss improved from 1.23367 to 1.22473, saving model to ./weights_inception/Inception_V3.09-0.57.h5
Epoch 10/500
300/300 [==============================] - 66s 221ms/step - loss: 1.0362 - acc: 0.6089 - val_loss: 1.1704 - val_acc: 0.5774
Epoch 00010: val_loss improved from 1.22473 to 1.17035, saving model to ./weights_inception/Inception_V3.10-0.58.h5
Epoch 11/500
300/300 [==============================] - 66s 221ms/step - loss: 0.9811 - acc: 0.6317 - val_loss: 1.1612 - val_acc: 0.5616
Epoch 00011: val_loss improved from 1.17035 to 1.16121, saving model to ./weights_inception/Inception_V3.11-0.56.h5
Epoch 12/500
300/300 [==============================] - 66s 221ms/step - loss: 0.9444 - acc: 0.6471 - val_loss: 1.1533 - val_acc: 0.5613
Epoch 00012: val_loss improved from 1.16121 to 1.15330, saving model to ./weights_inception/Inception_V3.12-0.56.h5
Epoch 13/500
300/300 [==============================] - 66s 221ms/step - loss: 0.9072 - acc: 0.6650 - val_loss: 1.1843 - val_acc: 0.5361
Epoch 00013: val_loss did not improve from 1.15330
Epoch 14/500
300/300 [==============================] - 66s 221ms/step - loss: 0.8747 - acc: 0.6744 - val_loss: 1.2135 - val_acc: 0.5258
Epoch 00014: val_loss did not improve from 1.15330
Epoch 15/500
300/300 [==============================] - 67s 222ms/step - loss: 0.8666 - acc: 0.6829 - val_loss: 1.1585 - val_acc: 0.5443
Epoch 00015: val_loss did not improve from 1.15330
Epoch 16/500
300/300 [==============================] - 66s 222ms/step - loss: 0.8386 - acc: 0.6926 - val_loss: 1.1503 - val_acc: 0.5482
Epoch 00016: val_loss improved from 1.15330 to 1.15026, saving model to ./weights_inception/Inception_V3.16-0.55.h5
Epoch 17/500
300/300 [==============================] - 66s 221ms/step - loss: 0.8199 - acc: 0.7023 - val_loss: 1.2162 - val_acc: 0.5288
Epoch 00017: val_loss did not improve from 1.15026
Epoch 18/500
300/300 [==============================] - 66s 222ms/step - loss: 0.8018 - acc: 0.7150 - val_loss: 1.1995 - val_acc: 0.5179
Epoch 00018: val_loss did not improve from 1.15026
Epoch 19/500
300/300 [==============================] - 66s 221ms/step - loss: 0.7923 - acc: 0.7186 - val_loss: 1.2218 - val_acc: 0.5137
Epoch 00019: val_loss did not improve from 1.15026
Epoch 20/500
300/300 [==============================] - 67s 222ms/step - loss: 0.7748 - acc: 0.7268 - val_loss: 1.2880 - val_acc: 0.4574
Epoch 00020: val_loss did not improve from 1.15026
Epoch 21/500
300/300 [==============================] - 66s 221ms/step - loss: 0.7604 - acc: 0.7330 - val_loss: 1.2658 - val_acc: 0.4861
The model is starting to overfit. Ideally, as you increase the number of epochs, the training loss will decrease (how fast depends on the learning rate). If it is not able to decrease, your model may have high bias for the data. In that case you can use a bigger model (more parameters or a deeper model).
You can also try reducing the learning rate. If the loss still freezes, the model may be underfitting (high bias).
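To make the overfitting point concrete, here is a minimal sketch of patience-based early stopping applied to the val_loss values from the log above (epochs 3 to 21). The function `early_stop` is a hypothetical helper written for illustration; it is similar in spirit to the Keras EarlyStopping callback but is not its implementation, and the patience of 5 is an arbitrary choice.

```python
# val_loss values copied from the training log, epochs 3..21.
VAL_LOSSES = [
    1.5476, 1.5284, 1.4945, 1.4735, 1.3738, 1.2337, 1.2247, 1.1704,
    1.1612, 1.1533, 1.1843, 1.2135, 1.1585, 1.1503, 1.2162, 1.1995,
    1.2218, 1.2880, 1.2658,
]
FIRST_EPOCH = 3  # the first list entry corresponds to epoch 3


def early_stop(val_losses, patience):
    """Return (index of best value, index at which training would stop)."""
    best = float("inf")
    best_idx = -1
    wait = 0
    for i, value in enumerate(val_losses):
        if value < best:
            best, best_idx, wait = value, i, 0
        else:
            wait += 1
            if wait >= patience:
                return best_idx, i
    return best_idx, len(val_losses) - 1


best_idx, stop_idx = early_stop(VAL_LOSSES, patience=5)
print("best epoch:", best_idx + FIRST_EPOCH)   # epoch with the lowest val_loss
print("stop epoch:", stop_idx + FIRST_EPOCH)   # epoch where patience runs out
```

On these numbers the lowest val_loss is at epoch 16 and a patience of 5 would stop at epoch 21, while the training loss keeps falling, which is consistent with the overfitting diagnosis.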
Thank you for the help. Yes, the problem was overfitting, so I applied more aggressive dropout on the LSTM, and it helped. But the validation results (val_loss and val_acc) are still poor.
video = Input(shape=(None, 224,224,3))
cnn_base = VGG16(input_shape=(224,224,3),
weights="imagenet",
include_top=False)
cnn_out = GlobalAveragePooling2D()(cnn_base.output)
cnn = Model(inputs=cnn_base.input, outputs=cnn_out)
cnn.trainable = False
encoded_frames = TimeDistributed(cnn)(video)
encoded_sequence = LSTM(32, dropout=0.5, W_regularizer=l2(0.01), recurrent_dropout=0.5)(encoded_frames)
hidden_layer = Dense(units=64, activation="relu")(encoded_sequence)
dropout = Dropout(0.2)(hidden_layer)
outputs = Dense(5, activation="softmax")(dropout)
model = Model([video], outputs)
Here are the logs:
Epoch 00033: val_loss improved from 1.62041 to 1.57951, saving model to ./weights_inception/Inception_V3.33-0.76.h5
Epoch 34/500
100/100 [==============================] - 54s 537ms/step - loss: 0.6301 - acc: 0.9764 - val_loss: 1.6190 - val_acc: 0.7627
Epoch 00034: val_loss did not improve from 1.57951
Epoch 35/500
100/100 [==============================] - 54s 537ms/step - loss: 0.5907 - acc: 0.9840 - val_loss: 1.5927 - val_acc: 0.7608
Epoch 00035: val_loss did not improve from 1.57951
Epoch 36/500
100/100 [==============================] - 54s 537ms/step - loss: 0.5783 - acc: 0.9812 - val_loss: 1.3477 - val_acc: 0.7769
Epoch 00036: val_loss improved from 1.57951 to 1.34772, saving model to ./weights_inception/Inception_V3.36-0.78.h5
Epoch 37/500
100/100 [==============================] - 54s 537ms/step - loss: 0.5618 - acc: 0.9802 - val_loss: 1.6545 - val_acc: 0.7384
Epoch 00037: val_loss did not improve from 1.34772
Epoch 38/500
100/100 [==============================] - 54s 537ms/step - loss: 0.5382 - acc: 0.9818 - val_loss: 1.8298 - val_acc: 0.7421
Epoch 00038: val_loss did not improve from 1.34772
Epoch 39/500
100/100 [==============================] - 54s 536ms/step - loss: 0.5080 - acc: 0.9844 - val_loss: 1.7948 - val_acc: 0.7290
Epoch 00039: val_loss did not improve from 1.34772
Epoch 40/500
100/100 [==============================] - 54s 537ms/step - loss: 0.4800 - acc: 0.9892 - val_loss: 1.8036 - val_acc: 0.7522

Keras binary classification probabilities to labels

The Keras predicted output for binary classification is probabilities, not classes (i.e., 1 or 0).
For example, the following code generates probabilities.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
# Generate dummy data
x_train = np.random.random((100, 20))
y_train = np.random.randint(2, size=(100, 1))
x_test = np.random.random((10, 20))
y_test = np.random.randint(2, size=(10, 1))
model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['accuracy'])
model.fit(x_train, y_train, epochs=20, batch_size=128)
y_predicted = model.predict(x_test)
print(y_predicted)
and the output is:
Epoch 1/20
100/100 [==============================] - 1s 5ms/step - loss: 0.8134 - acc: 0.4300
Epoch 2/20
100/100 [==============================] - 0s 17us/step - loss: 0.7429 - acc: 0.4600
Epoch 3/20
100/100 [==============================] - 0s 20us/step - loss: 0.7511 - acc: 0.4300
Epoch 4/20
100/100 [==============================] - 0s 18us/step - loss: 0.7408 - acc: 0.5000
Epoch 5/20
100/100 [==============================] - 0s 21us/step - loss: 0.6922 - acc: 0.5700
Epoch 6/20
100/100 [==============================] - 0s 31us/step - loss: 0.6874 - acc: 0.5600
Epoch 7/20
100/100 [==============================] - 0s 29us/step - loss: 0.7005 - acc: 0.5600
Epoch 8/20
100/100 [==============================] - 0s 23us/step - loss: 0.6960 - acc: 0.5200
Epoch 9/20
100/100 [==============================] - 0s 24us/step - loss: 0.6988 - acc: 0.5200
Epoch 10/20
100/100 [==============================] - 0s 26us/step - loss: 0.7276 - acc: 0.4000
Epoch 11/20
100/100 [==============================] - 0s 20us/step - loss: 0.6967 - acc: 0.5000
Epoch 12/20
100/100 [==============================] - 0s 30us/step - loss: 0.7085 - acc: 0.5000
Epoch 13/20
100/100 [==============================] - 0s 24us/step - loss: 0.6993 - acc: 0.5500
Epoch 14/20
100/100 [==============================] - 0s 26us/step - loss: 0.7278 - acc: 0.4600
Epoch 15/20
100/100 [==============================] - 0s 27us/step - loss: 0.6665 - acc: 0.5500
Epoch 16/20
100/100 [==============================] - 0s 24us/step - loss: 0.6784 - acc: 0.5500
Epoch 17/20
100/100 [==============================] - 0s 24us/step - loss: 0.7259 - acc: 0.4800
Epoch 18/20
100/100 [==============================] - 0s 26us/step - loss: 0.7093 - acc: 0.5500
Epoch 19/20
100/100 [==============================] - 0s 28us/step - loss: 0.6911 - acc: 0.5700
Epoch 20/20
100/100 [==============================] - 0s 34us/step - loss: 0.6771 - acc: 0.5500
[[0.4875336 ]
[0.47847825]
[0.4808622 ]
[0.5032022 ]
[0.4556646 ]
[0.48644704]
[0.4600153 ]
[0.47782585]
[0.49664593]
[0.5001673 ]]
Now, how can I get the classes from those probabilities? I tried manually setting a threshold like this:
print([1 if x >0.4 else 0 for x in y_predicted])
Is there any other method to do that from the Keras API? I could not find one.
Yes, model.predict_classes.
model.predict_classes(x_test)
https://github.com/keras-team/keras/blob/f0eb8d538c82798944346b4b2df917a06bf5e9d4/keras/engine/sequential.py#L254 (predict_classes)
which uses a threshold of 0.5 in case of binary classification or argmax in case of multi-class.
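The same thresholding can be done directly with NumPy, which also helps on newer TensorFlow/Keras releases where, as far as I know, predict_classes was deprecated and later removed from Sequential (around TensorFlow 2.6). The sketch below reuses the probabilities printed in the question; the multi-class array `probs` is made up for illustration.

```python
import numpy as np

# Probabilities copied from the output of model.predict(x_test) above.
y_predicted = np.array([
    [0.4875336], [0.47847825], [0.4808622], [0.5032022], [0.4556646],
    [0.48644704], [0.4600153], [0.47782585], [0.49664593], [0.5001673],
])

# Binary case: threshold the sigmoid output at 0.5.
binary_labels = (y_predicted > 0.5).astype("int32")
print(binary_labels.ravel())

# Multi-class case (hypothetical softmax output): take the argmax per row.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.6, 0.3, 0.1]])
multi_labels = np.argmax(probs, axis=-1)
print(multi_labels)
```

Changing the 0.5 in the comparison gives a custom threshold, like the 0.4 tried in the question.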

Increase the efficiency of an object detection model in keras

Model I am using:
num_classes = 20
INIT_LR = 1e-3
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(3, 56, 56), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(1024, activation='relu', kernel_constraint=maxnorm(3)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu', kernel_constraint=maxnorm(3)))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))
epochs = 40
lrate = 0.01
decay = lrate/epochs
opt = Adam(lr=INIT_LR, decay=INIT_LR / epochs)
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=False)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
print(model.summary())
Accuracy that I got:
Train on 36124 samples, validate on 4014 samples
Epoch 1/40
36124/36124 [==============================] - 2161s 60ms/step - loss: 2.1642 - acc: 0.4387 - val_loss: 1.8971 - val_acc: 0.4584
Epoch 2/40
36124/36124 [==============================] - 2185s 60ms/step - loss: 1.8403 - acc: 0.4813 - val_loss: 1.6874 - val_acc: 0.4983
Epoch 3/40
36124/36124 [==============================] - 3774s 104ms/step - loss: 1.6476 - acc: 0.5231 - val_loss: 1.5375 - val_acc: 0.5451
Epoch 4/40
36124/36124 [==============================] - 2194s 61ms/step - loss: 1.5143 - acc: 0.5572 - val_loss: 1.4662 - val_acc: 0.5688
Epoch 5/40
36124/36124 [==============================] - 2079s 58ms/step - loss: 1.4169 - acc: 0.5792 - val_loss: 1.3685 - val_acc: 0.5952
Epoch 6/40
36124/36124 [==============================] - 2203s 61ms/step - loss: 1.3441 - acc: 0.6011 - val_loss: 1.4403 - val_acc: 0.5850
Epoch 7/40
36124/36124 [==============================] - 2212s 61ms/step - loss: 1.2922 - acc: 0.6140 - val_loss: 1.2964 - val_acc: 0.6168
Epoch 8/40
36124/36124 [==============================] - 2179s 60ms/step - loss: 1.2490 - acc: 0.6254 - val_loss: 1.2622 - val_acc: 0.6243
Epoch 9/40
36124/36124 [==============================] - 2169s 60ms/step - loss: 1.2033 - acc: 0.6377 - val_loss: 1.2622 - val_acc: 0.6206
Epoch 10/40
36124/36124 [==============================] - 2171s 60ms/step - loss: 1.1762 - acc: 0.6460 - val_loss: 1.3887 - val_acc: 0.6001
Epoch 11/40
36124/36124 [==============================] - 2168s 60ms/step - loss: 1.1313 - acc: 0.6577 - val_loss: 1.1599 - val_acc: 0.6452
Epoch 12/40
36124/36124 [==============================] - 2168s 60ms/step - loss: 1.1002 - acc: 0.6658 - val_loss: 1.2067 - val_acc: 0.6390
Epoch 13/40
36124/36124 [==============================] - 2170s 60ms/step - loss: 1.0932 - acc: 0.6676 - val_loss: 1.2386 - val_acc: 0.6335
Epoch 14/40
36124/36124 [==============================] - 2169s 60ms/step - loss: 1.0518 - acc: 0.6768 - val_loss: 1.1448 - val_acc: 0.6490
Epoch 15/40
36124/36124 [==============================] - 2168s 60ms/step - loss: 1.0342 - acc: 0.6832 - val_loss: 1.1420 - val_acc: 0.6522
Epoch 16/40
36124/36124 [==============================] - 2170s 60ms/step - loss: 1.0104 - acc: 0.6894 - val_loss: 1.2271 - val_acc: 0.6385
Epoch 17/40
36124/36124 [==============================] - 2168s 60ms/step - loss: 0.9855 - acc: 0.6964 - val_loss: 1.1793 - val_acc: 0.6517
Epoch 18/40
36124/36124 [==============================] - 2184s 60ms/step - loss: 0.9635 - acc: 0.7029 - val_loss: 1.1647 - val_acc: 0.6574
Epoch 19/40
36124/36124 [==============================] - 2074s 57ms/step - loss: 0.9517 - acc: 0.7071 - val_loss: 1.1118 - val_acc: 0.6639
Epoch 20/40
36124/36124 [==============================] - 2063s 57ms/step - loss: 0.9276 - acc: 0.7144 - val_loss: 1.1187 - val_acc: 0.6662
Epoch 21/40
36124/36124 [==============================] - 2104s 58ms/step - loss: 0.9111 - acc: 0.7202 - val_loss: 1.1444 - val_acc: 0.6637
Epoch 22/40
36124/36124 [==============================] - 2156s 60ms/step - loss: 0.8872 - acc: 0.7231 - val_loss: 1.1062 - val_acc: 0.6684
Epoch 23/40
36124/36124 [==============================] - 2181s 60ms/step - loss: 0.8716 - acc: 0.7279 - val_loss: 1.1912 - val_acc: 0.6540
Epoch 24/40
36124/36124 [==============================] - 2100s 58ms/step - loss: 0.8596 - acc: 0.7336 - val_loss: 1.1339 - val_acc: 0.6664
Epoch 25/40
36124/36124 [==============================] - 3357s 93ms/step - loss: 0.8412 - acc: 0.7380 - val_loss: 1.1295 - val_acc: 0.6627
Epoch 26/40
36124/36124 [==============================] - 2170s 60ms/step - loss: 0.8104 - acc: 0.7475 - val_loss: 1.1511 - val_acc: 0.6572
Epoch 27/40
36124/36124 [==============================] - 2131s 59ms/step - loss: 0.8091 - acc: 0.7468 - val_loss: 1.1501 - val_acc: 0.6679
Epoch 28/40
36124/36124 [==============================] - 2107s 58ms/step - loss: 0.7791 - acc: 0.7569 - val_loss: 1.1579 - val_acc: 0.6637
Epoch 29/40
36124/36124 [==============================] - 2247s 62ms/step - loss: 0.7665 - acc: 0.7598 - val_loss: 1.1310 - val_acc: 0.6724
Epoch 30/40
36124/36124 [==============================] - 2019s 56ms/step - loss: 0.7575 - acc: 0.7615 - val_loss: 1.1065 - val_acc: 0.6766
Epoch 31/40
36124/36124 [==============================] - 2098s 58ms/step - loss: 0.7344 - acc: 0.7705 - val_loss: 1.1025 - val_acc: 0.6751
Epoch 32/40
36124/36124 [==============================] - 2170s 60ms/step - loss: 0.7246 - acc: 0.7726 - val_loss: 1.1563 - val_acc: 0.6694
Epoch 33/40
36124/36124 [==============================] - 4057s 112ms/step - loss: 0.7133 - acc: 0.7777 - val_loss: 1.1328 - val_acc: 0.6714
Epoch 34/40
36124/36124 [==============================] - 2177s 60ms/step - loss: 0.6873 - acc: 0.7832 - val_loss: 1.1047 - val_acc: 0.6886
Epoch 35/40
36124/36124 [==============================] - 2175s 60ms/step - loss: 0.6816 - acc: 0.7860 - val_loss: 1.1477 - val_acc: 0.6662
Epoch 36/40
36124/36124 [==============================] - 2177s 60ms/step - loss: 0.6684 - acc: 0.7885 - val_loss: 1.1006 - val_acc: 0.6886
Epoch 37/40
36124/36124 [==============================] - 2179s 60ms/step - loss: 0.6622 - acc: 0.7951 - val_loss: 1.1352 - val_acc: 0.6814
Epoch 38/40
36124/36124 [==============================] - 2177s 60ms/step - loss: 0.6393 - acc: 0.7976 - val_loss: 1.1688 - val_acc: 0.6707
Epoch 39/40
36124/36124 [==============================] - 2137s 59ms/step - loss: 0.6263 - acc: 0.8018 - val_loss: 1.1279 - val_acc: 0.6896
Epoch 40/40
8160/36124 [=====>........................] - ETA: 26:35 - loss: 0.5668 - acc: 0.8205
Can anyone suggest a way to improve the model's efficiency (accuracy)? I tried increasing the number of layers and the number of epochs, but the validation accuracy I get is around 65 to 68 percent.
In Andrew Ng's course on Coursera, he says:
If you have HIGH BIAS, you should:
build a bigger NN, or
train longer, or
change your architecture
If you have HIGH VARIANCE, you should:
collect more data
regularize your NN
change your architecture
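As a rough illustration of that checklist, the sketch below compares training and validation accuracy against a target. The function `diagnose`, the target accuracy of 0.95 and the gap threshold of 0.05 are all assumptions chosen for the example, not values from the course or from the question; the two accuracies are taken from epoch 39 of the log above.

```python
def diagnose(train_acc, val_acc, target_acc=0.95, max_gap=0.05):
    """Very rough bias/variance check based on accuracy gaps.

    target_acc and max_gap are illustrative thresholds, not standard values.
    """
    findings = []
    # Training accuracy far below the target suggests underfitting.
    if target_acc - train_acc > max_gap:
        findings.append("high bias")
    # Validation accuracy far below training accuracy suggests overfitting.
    if train_acc - val_acc > max_gap:
        findings.append("high variance")
    return findings or ["ok"]


# Epoch 39 from the log: acc 0.8018, val_acc 0.6896.
print(diagnose(0.8018, 0.6896))
```

With these thresholds the run shows both symptoms, so under this reading a mix of remedies from the two lists (for example stronger regularization or more data, together with a change of architecture) may be needed.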

my simple regression model(by keras) doesn't work

I'm a beginner studying neural networks.
I'm doing a Kaggle project: Bike Sharing Demand.
I want to use a simple neural network in Keras, but the loss is not decreasing.
What should I do?
-----------code--------------
# dataset from pandas
feature_names = ["season", "holiday", "workingday", "weather",
"temp", "atemp", "humidity", "windspeed",
"datetime_year", "datetime_hour", "datetime_dayofweek"]
label_name = ["count"]
X_train = train[feature_names] #shape (10886, 11)
Y_train = train[label_name] #shape (10886, 1)
X_test = test[feature_names]
# layers
model = Sequential()
model.add(Dense(units = 50, kernel_initializer = 'uniform', activation = 'relu', input_dim=11))
model.add(Dropout(0.3))
model.add(Dense(units = 50, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dropout(0.3))
model.add(Dense(units = 5, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dropout(0.3))
model.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
model.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics = ['accuracy'])
# Train
model.fit(X_train, Y_train, batch_size = 100, epochs = 200)
---------result------------
Epoch 1/200
10886/10886 [==============================] - 4s 325us/step - loss: 69206.2478 - acc: 0.0094
Epoch 2/200
10886/10886 [==============================] - 1s 93us/step - loss: 69184.5435 - acc: 0.0096
Epoch 3/200
10886/10886 [==============================] - 1s 89us/step - loss: 69181.6330 - acc: 0.0096
Epoch 4/200
10886/10886 [==============================] - 1s 93us/step - loss: 69179.0222 - acc: 0.0096
Epoch 5/200
10886/10886 [==============================] - 1s 91us/step - loss: 69175.7442 - acc: 0.0096
Epoch 6/200
10886/10886 [==============================] - 1s 109us/step - loss: 69171.9052 - acc: 0.0096
Epoch 7/200
10886/10886 [==============================] - 1s 122us/step - loss: 69171.6164 - acc: 0.0096
Epoch 8/200
10886/10886 [==============================] - 1s 92us/step - loss: 69167.6923 - acc: 0.0096
Epoch 9/200
10886/10886 [==============================] - 1s 91us/step - loss: 69166.2911 - acc: 0.0096
Epoch 10/200
10886/10886 [==============================] - 1s 94us/step - loss: 69164.1145 - acc: 0.0096
...
Try setting the learning rate in the model.compile section. Start with 0.0001, then 0.001 and 0.01 and see what happens.
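To show what such a sweep looks like, here is a small self-contained sketch that runs plain gradient descent on a synthetic linear regression problem with the three suggested learning rates. Everything in it (the data, `final_mse`, the number of steps) is made up for illustration and it does not use Keras; in Keras the learning rate would normally be set by passing an optimizer instance such as Adam(learning_rate=...) to model.compile (older versions call the argument lr).

```python
import numpy as np


def final_mse(lr, steps=200, seed=0):
    """Run gradient descent on synthetic data and return the final MSE."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(256, 3))          # synthetic features
    true_w = np.array([2.0, -1.0, 0.5])    # weights used to generate targets
    y = X @ true_w
    w = np.zeros(3)
    for _ in range(steps):
        # Gradient of mean squared error with respect to w.
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return float(np.mean((X @ w - y) ** 2))


for lr in (0.0001, 0.001, 0.01):
    print(lr, final_mse(lr))
```

On this toy problem the larger rates converge much faster within the same number of steps. Separately, note that the model in the question ends in a sigmoid layer, which bounds predictions to the range (0, 1), so for count targets a linear output unit is probably needed regardless of the learning rate.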
