I am trying to find useful code for improving classification with an autoencoder.
I followed this example: keras autoencoder vs PCA.
Instead of MNIST, I tried to use it with CIFAR-10,
so I made some changes, but something does not seem to fit.
Could anyone please help me with this?
If you have another example that runs on a different dataset, that would help too.
The validation data in the fit call, which is (X_test, Y_test), is not learned, so .evaluate() gives the wrong accuracy.
It always gives
val_loss: 2.3026 - val_acc: 0.1000
This is the code and the output:
from keras.datasets import cifar10
from keras.models import Model
from keras.layers import Input, Dense
from keras.utils import np_utils
import numpy as np
num_train = 50000
num_test = 10000
height, width, depth = 32, 32, 3 # CIFAR-10 images are 32x32 with 3 colour channels
num_classes = 10 # there are 10 classes
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train = X_train.reshape(num_train,height * width * depth)
X_test = X_test.reshape(num_test,height * width*depth)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255 # Normalise data to [0, 1] range
X_test /= 255 # Normalise data to [0, 1] range
Y_train = np_utils.to_categorical(y_train, num_classes) # One-hot encode the labels
Y_test = np_utils.to_categorical(y_test, num_classes) # One-hot encode the labels
input_img = Input(shape=(height * width * depth,))
s=height * width * depth
x = Dense(s, activation='relu')(input_img)
encoded = Dense(s//2, activation='relu')(x)
encoded = Dense(s//8, activation='relu')(encoded)
y = Dense(s//256, activation='relu')(x)
decoded = Dense(s//8, activation='relu')(y)
decoded = Dense(s//2, activation='relu')(decoded)
z = Dense(s, activation='sigmoid')(decoded)
model = Model(input_img, z)
model.compile(optimizer='adadelta', loss='mse') # reporting the accuracy
model.fit(X_train, X_train,
          epochs=10,
          validation_data=(X_test, X_test))
mid = Model(input_img, y)
reduced_representation =mid.predict(X_test)
out = Dense(num_classes, activation='softmax')(y)
reduced = Model(input_img, out)
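reduced.compile(loss='categorical_crossentropy',
                optimizer='adam', # assumption: the optimizer is not shown in the post
                metrics=['accuracy'])
reduced.fit(X_train, Y_train,
            epochs=10,
            validation_data=(X_test, Y_test))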
scores = reduced.evaluate(X_test, Y_test, verbose=0)
print("Accuracy: ", scores[1])
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
50000/50000 [==============================] - 5s - loss: 0.0639 - val_loss: 0.0633
Epoch 2/10
50000/50000 [==============================] - 5s - loss: 0.0610 - val_loss: 0.0568
Epoch 3/10
50000/50000 [==============================] - 5s - loss: 0.0565 - val_loss: 0.0558
Epoch 4/10
50000/50000 [==============================] - 5s - loss: 0.0557 - val_loss: 0.0545
Epoch 5/10
50000/50000 [==============================] - 5s - loss: 0.0536 - val_loss: 0.0518
Epoch 6/10
50000/50000 [==============================] - 5s - loss: 0.0502 - val_loss: 0.0461
Epoch 7/10
50000/50000 [==============================] - 5s - loss: 0.0443 - val_loss: 0.0412
Epoch 8/10
50000/50000 [==============================] - 5s - loss: 0.0411 - val_loss: 0.0397
Epoch 9/10
50000/50000 [==============================] - 5s - loss: 0.0391 - val_loss: 0.0371
Epoch 10/10
50000/50000 [==============================] - 5s - loss: 0.0377 - val_loss: 0.0403
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
50000/50000 [==============================] - 3s - loss: 2.3605 - acc: 0.0977 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 2/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0952 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 3/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0978 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 4/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0980 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 5/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0974 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 6/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.1000 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 7/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0992 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 8/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0982 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 9/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0965 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 10/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0978 - val_loss: 2.3026 - val_acc: 0.1000
9856/10000 [============================>.] - ETA: 0s('Accuracy: ', 0.10000000000000001)

There are multiple issues with your code.
Your autoencoder is not fully trained. If you plot the training history, you will see that the model has not converged yet. By
history = model.fit(X_train, X_train,
                    validation_data=(X_test, X_test))
you will obtain the loss values during training. If you plot them, e.g. in matplotlib,
import matplotlib.pyplot as plt
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model train vs validation loss 1')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()
you will see that it needs more epochs to converge.
The autoencoder architecture is built incorrectly. There is a typo in the line y = Dense(s//256, activation='relu')(x): you probably wanted y = Dense(s//256, activation='linear')(encoded), so that it uses the previous layer (encoded) and not x. You also don't want to use the relu activation in the latent space, because it prevents latent variables from being subtracted from each other and thus makes the autoencoder much less efficient.
With those fixes, the model trains without problems.
I increased the number of epochs to 30 for both networks so that they train better.
At the end of training, the classification model reports loss: 1.2881 - acc: 0.5397 - val_loss: 1.3841 - val_acc: 0.5126, which is a lower loss and a much higher accuracy than you experienced.
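The stuck numbers in the question are informative in their own right. Assuming the classifier is trained with categorical cross-entropy, a softmax that gives all 10 classes the same probability has a loss of ln(10), and on the balanced CIFAR-10 test set a constant prediction is right one time in ten. A minimal sketch of that arithmetic in plain Python:

```python
import math

num_classes = 10  # CIFAR-10 has 10 classes

# A softmax that has learned nothing gives every class the same probability.
uniform_probability = 1.0 / num_classes

# Categorical cross-entropy for one sample is -log(probability of the true class).
uniform_loss = -math.log(uniform_probability)

# With equally sized classes, always predicting one class is right 1 time in 10.
chance_accuracy = 1.0 / num_classes

print(round(uniform_loss, 4), chance_accuracy)
```

So val_loss: 2.3026 together with val_acc: 0.1000 suggests that the classifier outputs the same prediction for every image, rather than .evaluate() miscounting.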


How to train and test a dataset of images downloaded from Kaggle

I want to load a dataset from Kaggle. The link for the dataset is
It has images in different folders. How do I label the dataset, split it, and train on it?
I did it the following way, but I got an error:
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    #color_mode="grayscale",
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    #color_mode="grayscale",
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
code is shown below
import os
import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Activation,Dropout,Conv2D, MaxPooling2D,BatchNormalization
from tensorflow.keras.optimizers import Adam, Adamax
from tensorflow.keras.metrics import categorical_crossentropy
from tensorflow.keras import regularizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Model, load_model, Sequential
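filepaths=[]
labels=[]
classlist=os.listdir(sdir) # sdir is the dataset directory, with one sub-folder per class
for klass in classlist:
    classpath=os.path.join(sdir, klass)
    if os.path.isdir(classpath):
        flist=os.listdir(classpath)
        for f in flist:
            fpath=os.path.join(classpath, f)
            if os.path.isfile(fpath):
                filepaths.append(fpath)
                labels.append(klass) # the folder name is the class label
fseries=pd.Series(filepaths, name='filepaths')
Lseries=pd.Series (labels, name='labels')
df=pd.concat([fseries, Lseries], axis=1)
balance=df['labels'].value_counts()
print (balance) # dataset is reasonably balanced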
train_df, dummy_df=train_test_split(df, train_size=train_split, shuffle=True, random_state = 123)
test_df, valid_df=train_test_split(dummy_df, train_size=dummy_split, shuffle=True, random_state=123)
def scalar(img):
    return img/127.5-1 # scale pixels between -1 and + 1
gen=ImageDataGenerator(preprocessing_function=scalar)
train_gen=gen.flow_from_dataframe(train_df, x_col= 'filepaths', y_col='labels', target_size=(128,128), class_mode='categorical',
color_mode='rgb', shuffle=False)
test_gen=gen.flow_from_dataframe(test_df, x_col= 'filepaths', y_col='labels', target_size=(128,128), class_mode='categorical',
color_mode='rgb', shuffle=False)
valid_gen=gen.flow_from_dataframe(valid_df, x_col= 'filepaths', y_col='labels', target_size=(128,128), class_mode='categorical',
color_mode='rgb', shuffle=False)
base_model=tf.keras.applications.MobileNetV2( include_top=False, input_shape=(128,128,3), pooling='max', weights='imagenet')
x=base_model.output
x=keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001 )(x)
x = Dense(1024, kernel_regularizer = regularizers.l2(l = 0.016),activity_regularizer=regularizers.l1(0.006),
bias_regularizer=regularizers.l1(0.006) ,activation='relu', kernel_initializer= tf.keras.initializers.GlorotUniform(seed=123))(x)
x=Dropout(rate=.3, seed=123)(x)
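classes=list(train_gen.class_indices.keys()) # assumption: class names taken from the training generator
output=Dense(len(classes), activation='softmax',kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123))(x)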
model=Model(inputs=base_model.input, outputs=output)
model.compile(Adamax(learning_rate=.001), loss='categorical_crossentropy', metrics=['accuracy'])
estop=tf.keras.callbacks.EarlyStopping( monitor="val_loss", patience=4, verbose=1,restore_best_weights=True)
rlronp=tf.keras.callbacks.ReduceLROnPlateau( monitor="val_loss",factor=0.5, patience=1, verbose=1)
history=model.fit(x=train_gen, epochs=10, verbose=1, callbacks=[estop, rlronp], validation_data=valid_gen,
                  validation_steps=None, shuffle=False, initial_epoch=0)
save_path=r'c:\mydir\mymodel.h5' # specify the path to where to save model
model.save(save_path)
The results of model.fit should be as shown below:
Epoch 1/20
254/254 [==============================] - 26s 84ms/step - loss: 14.9756 - accuracy: 0.8516 - val_loss: 5.0730 - val_accuracy: 0.6452
Epoch 2/20
254/254 [==============================] - 18s 73ms/step - loss: 2.7752 - accuracy: 0.9945 - val_loss: 1.7161 - val_accuracy: 0.7783
Epoch 3/20
254/254 [==============================] - 20s 78ms/step - loss: 0.7500 - accuracy: 0.9994 - val_loss: 0.9572 - val_accuracy: 0.8780
Epoch 4/20
254/254 [==============================] - 21s 84ms/step - loss: 0.3855 - accuracy: 0.9998 - val_loss: 0.6381 - val_accuracy: 0.9357
Epoch 5/20
254/254 [==============================] - 18s 71ms/step - loss: 0.2984 - accuracy: 1.0000 - val_loss: 0.4525 - val_accuracy: 0.9601
Epoch 6/20
254/254 [==============================] - 18s 73ms/step - loss: 0.2609 - accuracy: 1.0000 - val_loss: 0.3453 - val_accuracy: 0.9778
Epoch 7/20
254/254 [==============================] - 18s 70ms/step - loss: 0.2354 - accuracy: 0.9998 - val_loss: 0.2760 - val_accuracy: 0.9867
Epoch 8/20
254/254 [==============================] - 18s 69ms/step - loss: 0.2160 - accuracy: 1.0000 - val_loss: 0.2478 - val_accuracy: 0.9911
Epoch 9/20
254/254 [==============================] - 18s 70ms/step - loss: 0.2023 - accuracy: 1.0000 - val_loss: 0.2042 - val_accuracy: 0.9956
Epoch 10/20
254/254 [==============================] - 19s 74ms/step - loss: 0.1894 - accuracy: 1.0000 - val_loss: 0.1889 - val_accuracy: 0.9956
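The two train_test_split calls in the code above carve the data into three parts. The values of train_split and dummy_split are not shown in the post, so the sketch below assumes an 80/10/10 split and uses a hypothetical helper, split_fraction, written in plain Python, to show how the second fraction follows from the first:

```python
import random

def split_fraction(items, first_fraction, seed=123):
    """Shuffle a copy of items and cut it into two parts (hypothetical helper)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * first_fraction))
    return shuffled[:cut], shuffled[cut:]

samples = list(range(1000))   # stand-in for the rows of df
train_split = 0.8             # assumed: 80% of all rows for training
test_split = 0.1              # assumed: 10% of all rows for testing

# The second split only sees the remaining 20%, so the fraction must be rescaled.
dummy_split = test_split / (1 - train_split)

train, dummy = split_fraction(samples, train_split)
test, valid = split_fraction(dummy, dummy_split)

print(len(train), len(test), len(valid))
```

The point of the rescaling is that a train_size of 0.5 in the second call means half of the leftover rows, not half of the whole dataset.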

Video classification using vgg16 and LSTM

My project is about violence classification on a video dataset, so I converted every video into 7 images.
First I use VGG16 to extract features from the images, and then I train my LSTM on these features. But when I train the LSTM, I get bad accuracy and val_accuracy, and strangely both values remain constant for many epochs: the accuracy stays at 0.5000 for about 50 epochs, and the validation accuracy shows the same problem.
Here is my code, in case you can figure out where the problem is or why the accuracy stays constant and so low.
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Convolution2D, MaxPooling2D, Flatten, Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.layers import Flatten, BatchNormalization ,LSTM
import os, shutil
from keras.preprocessing.image import ImageDataGenerator
import keras
conv = tf.keras.applications.vgg16.VGG16()
model = Sequential()
Here I copy my VGG16 into a Sequential model:
for layer in conv.layers:
    model.add(layer)
Here I get rid of all the Dense and Flatten layers, so that my last layer is a max-pool layer with output shape (7, 7, 512).
import os, shutil
from keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator()
batch_size = 1
img_width, img_height = 224, 224 # Default input size for VGG16
This is the function I use to extract the features for every picture; its parameters are the directory path and the number of images in that directory.
def extract_features(directory, sample_count):
    features = np.zeros(shape=(sample_count, 7, 7, 512)) # Must be equal to the output of the convolutional base
    labels = np.zeros(shape=(sample_count,2))
    # Preprocess data
    generator = datagen.flow_from_directory(directory,
                                            target_size = (img_width, img_height),
                                            batch_size = batch_size,
                                            class_mode = 'categorical')
    # Pass data through convolutional base
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = model.predict(inputs_batch)
        features[i * batch_size: (i + 1) * batch_size] = features_batch
        labels[i * batch_size: (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            break # the generator loops forever, so stop once every image has been seen
    return features, labels
train_violence="/content/drive/My Drive/images/one/data/train"
train_non="/content/drive/My Drive/images/two/data/train"
valid_violence="/content/drive/My Drive/images/three/data/validation"
valid_non="/content/drive/My Drive/images/four/data/validation"
train_violence_features, train_violence_labels = extract_features(train_violence,119)
train_non_features , train_non_labels = extract_features(train_non,119)
valid_violence_features , valid_violence_labels = extract_features(valid_violence,77)
valid_non_features , valid_non_labels = extract_features(valid_non,77)
Now I have the violence and non-violence features for training and validation, so I need to concatenate the two training arrays into one array that holds all the violence features followed by the non-violence features. I found that flow_from_directory takes the images randomly from the two classes, but I need them as sequences: every 7 photos must stay together as one sequence, so they cannot be drawn randomly from the two classes. That is why I used 4 arrays, two for validation (one for violence and one for non-violence) and two for training, and then concatenated them while keeping the correct sequence for every video.
x= np.concatenate((train_violence_features, train_non_features))
y = np.concatenate((valid_violence_features, valid_non_features))
Now the shape of x is (238, 7, 7, 512), i.e. 238 photos, and y for validation is (154, 7, 7, 512).
Here I reshape the input for the LSTM to the shape (samples, time steps, features): 34 videos for training, each converted to 7 images, so 7 is the number of time steps and 7*7*512 = 25088 is the number of features.
lstm_train_sample = np.reshape(x,(34,7,25088))
lstm_validation_sample = np.reshape(y,(22,7,25088))
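The reshape above relies on the frames being stored video by video, in order. A small NumPy sketch with toy sizes (6 videos, 7 frames each, 4 features per frame instead of 25088; all of these sizes are assumptions chosen to keep the example small) shows that each group of 7 consecutive rows becomes one sequence:

```python
import numpy as np

num_videos, frames_per_video, num_features = 6, 7, 4  # toy sizes, not the real ones

# Tag every frame with the index of the video it came from, stored video by video.
video_ids = np.repeat(np.arange(num_videos), frames_per_video)  # shape (42,)
frame_features = np.tile(video_ids[:, None], (1, num_features)).astype('float32')  # (42, 4)

# Same operation as in the post: (frames, features) -> (videos, time steps, features).
sequences = frame_features.reshape(num_videos, frames_per_video, num_features)

# Every time step inside one sequence should carry the same video index.
for v in range(num_videos):
    assert np.all(sequences[v] == v)

print(sequences.shape)
```

If the frames were shuffled before this point (flow_from_directory shuffles by default), the groups would mix frames from different videos, so it is worth confirming the order of the extracted features.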
Here I make my labels, one label for every video:
t_labels = [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
v_labels = [1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0]
t_labels= keras.utils.to_categorical(t_labels, num_classes=2, dtype='float32')
v_labels= keras.utils.to_categorical(v_labels, num_classes=2, dtype='float32')
finally my LSTM :
lstm = Sequential()
lstm.add(LSTM(200, activation='relu', return_sequences=True, input_shape=(7, 25088)))
lstm.add(LSTM(25, activation='relu'))
lstm.add(Dense(20, activation='relu'))
lstm.add(Dense(10, activation='relu'))
lstm.compile(optimizer='adam', loss='mse')
lstm.compile(optimizer='adam', loss='mse',metrics=['accuracy'])
lstm.fit(lstm_train_sample , t_labels , epochs=100 , batch_size=2 , validation_data=
(lstm_validation_sample,v_labels) , validation_batch_size= 2 )
This is an example of the result, and I want to know why it looks like that:
Epoch 55/100
17/17 [==============================] - 8s 460ms/step - loss: 0.4490 - accuracy: 0.5000 - val_loss: 1.4238 - val_accuracy: 0.5000
Epoch 56/100
17/17 [==============================] - 8s 464ms/step - loss: 0.4476 - accuracy: 0.5000 - val_loss: 1.4218 - val_accuracy: 0.5000
Epoch 57/100
17/17 [==============================] - 8s 462ms/step - loss: 0.4461 - accuracy: 0.5000 - val_loss: 1.4198 - val_accuracy: 0.5000
Epoch 58/100
17/17 [==============================] - 8s 461ms/step - loss: 0.4447 - accuracy: 0.5000 - val_loss: 1.4176 - val_accuracy: 0.5000
Epoch 59/100
17/17 [==============================] - 8s 457ms/step - loss: 0.4432 - accuracy: 0.5000 - val_loss: 1.4156 - val_accuracy: 0.5000
Epoch 60/100
17/17 [==============================] - 8s 461ms/step - loss: 0.4418 - accuracy: 0.5000 - val_loss: 1.4135 - val_accuracy: 0.5000
Epoch 61/100
17/17 [==============================] - 8s 459ms/step - loss: 0.4403 - accuracy: 0.5000 - val_loss: 1.4114 - val_accuracy: 0.5000
Epoch 62/100
17/17 [==============================] - 8s 458ms/step - loss: 0.4388 - accuracy: 0.5000 - val_loss: 1.4094 - val_accuracy: 0.5000
Epoch 63/100
17/17 [==============================] - 8s 456ms/step - loss: 0.4373 - accuracy: 0.5000 - val_loss: 1.4072 - val_accuracy: 0.5000
Epoch 64/100
17/17 [==============================] - 8s 461ms/step - loss: 0.4358 - accuracy: 0.5000 - val_loss: 1.4051 - val_accuracy: 0.5000
Epoch 65/100
17/17 [==============================] - 8s 467ms/step - loss: 0.4343 - accuracy: 0.5000 - val_loss: 1.4029 - val_accuracy: 0.5000
Epoch 66/100
17/17 [==============================] - 8s 458ms/step - loss: 0.4328 - accuracy: 0.5000 - val_loss: 1.4008 - val_accuracy: 0.5000
Epoch 67/100
17/17 [==============================] - 8s 460ms/step - loss: 0.4313 - accuracy: 0.5000 - val_loss: 1.3987 - val_accuracy: 0.5000
Epoch 68/100
17/17 [==============================] - 8s 461ms/step - loss: 0.4298 - accuracy: 0.5000 - val_loss: 1.3964 - val_accuracy: 0.5000
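One way to read the constant 0.5000: it is exactly the accuracy of a model that predicts the same class for every video. A minimal check in plain Python, using the label counts from the post (the helper name constant_prediction_accuracy is hypothetical):

```python
def constant_prediction_accuracy(labels, predicted_class):
    """Accuracy obtained by predicting predicted_class for every sample."""
    hits = sum(1 for label in labels if label == predicted_class)
    return hits / len(labels)

# Per-video labels as in the post: 17 + 17 for training, 11 + 11 for validation.
t_labels = [1] * 17 + [0] * 17
v_labels = [1] * 11 + [0] * 11

print(constant_prediction_accuracy(t_labels, 1))
print(constant_prediction_accuracy(v_labels, 0))
```

Both values come out at one half, so the log is consistent with the network producing the same output for every input while the loss still drifts slowly.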

Keras - Why is the accuracy of my CNN model not being affected by the hyper-parameters?

As the title describes, the accuracy of my simple CNN model is not affected by the hyper-parameters, or even by the presence of layers such as Dropout and MaxPooling. I implemented the model using Keras. What could be the reason behind this odd situation? I added the relevant part of the code below:
input_dim = X_train.shape[1]
nb_classes = Y_train.shape[1]
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(input_dim, 1)))
model.add(Dense(40, activation='relu'))
model.add(Dense(nb_classes, activation='softmax'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
P.S. The input data (X_train and X_test) contains vectors that were produced by Word2Vec. The output is binary.
Edit: You may find a sample training log below:
Train on 3114 samples, validate on 347 samples
Epoch 1/10
- 1s - loss: 0.6917 - accuracy: 0.5363 - val_loss: 0.6901 - val_accuracy: 0.5476
Epoch 2/10
- 1s - loss: 0.6906 - accuracy: 0.5369 - val_loss: 0.6896 - val_accuracy: 0.5476
Epoch 3/10
- 1s - loss: 0.6908 - accuracy: 0.5369 - val_loss: 0.6895 - val_accuracy: 0.5476
Epoch 4/10
- 1s - loss: 0.6908 - accuracy: 0.5369 - val_loss: 0.6903 - val_accuracy: 0.5476
Epoch 5/10
- 1s - loss: 0.6908 - accuracy: 0.5369 - val_loss: 0.6899 - val_accuracy: 0.5476
Epoch 6/10
- 1s - loss: 0.6909 - accuracy: 0.5369 - val_loss: 0.6901 - val_accuracy: 0.5476
Epoch 7/10
- 1s - loss: 0.6905 - accuracy: 0.5369 - val_loss: 0.6896 - val_accuracy: 0.5476
Epoch 8/10
- 1s - loss: 0.6909 - accuracy: 0.5369 - val_loss: 0.6897 - val_accuracy: 0.5476
Epoch 9/10
- 1s - loss: 0.6905 - accuracy: 0.5369 - val_loss: 0.6892 - val_accuracy: 0.5476
Epoch 10/10
- 1s - loss: 0.6909 - accuracy: 0.5369 - val_loss: 0.6900 - val_accuracy: 0.5476
First you need to change the last layer to
model.add(Dense(1, activation='sigmoid'))
You also need to change the loss function to
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
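For reference, binary cross-entropy for one sample with label y and predicted probability p is -(y log p + (1 - y) log(1 - p)). The sketch below is plain Python rather than Keras (the function name binary_cross_entropy is hypothetical) and shows that an undecided output of p = 0.5 costs ln 2, while a confident wrong answer is penalised much more heavily:

```python
import math

def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    """Binary cross-entropy for a single sample; p_pred is clipped to avoid log(0)."""
    p = min(max(p_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

undecided = binary_cross_entropy(1, 0.5)        # model cannot tell the classes apart
confident_right = binary_cross_entropy(1, 0.9)
confident_wrong = binary_cross_entropy(1, 0.1)

print(round(undecided, 4), round(confident_right, 4), round(confident_wrong, 4))
```

Note that with a single sigmoid unit the targets would need to be one 0/1 column per sample rather than a one-hot matrix.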
I assume that you have multi-class classification, right?
Then your loss is not appropriate: you should use 'categorical_crossentropy' not 'mean_squared_error'.
Also, try adding several Conv+Drop+MaxPool (3 sets) in order to clearly verify the robustness of your network.

Classifying a person getting into and out of a car (behavior detection)

I'm working on the problem of classifying two activities: getting out of the car and getting into it.
I also need to classify whether upload or download activity is going on near the car.
I need advice on how to fix the overfitting of the model, which shows up on the testing dataset.
I'm using a CNN + LSTM architecture. In the attachment I've provided samples of the dataset.
I have around 15,000 images for each class.
Dataset example
go in image
go in image 2
go in image 3
go out image 1
go out image 2
Now let's go to the code.
First I get my dataset using Keras:
batch_size = 128
batch_size_train = 148
def bring_data_from_directory():
    datagen = ImageDataGenerator(rescale=1./255)
    train_generator = datagen.flow_from_directory(
        train_dir, # hypothetical name: the training directory is not shown in the post
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode='categorical') # the generator yields batches of images with one-hot labels
    validation_generator = datagen.flow_from_directory(
        validation_dir, # hypothetical name: the validation directory is not shown in the post
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode='categorical') # the generator yields batches of images with one-hot labels
    return train_generator,validation_generator
Use VGG16 network to extract features and store them into .npy format
def load_VGG16_model():
    base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224,224,3))
    print ("Model loaded..!")
    print (base_model.summary())
    return base_model
def extract_features_and_store(train_generator,validation_generator,base_model):
    x_generator = None
    y_lable = None
    batch = 0
    for x,y in train_generator:
        if batch == int(56021/batch_size):
            break
        print("Total needed:", int(56021/batch_size))
        print ("predict on batch:",batch)
        batch += 1
        if x_generator is None:
            x_generator = base_model.predict_on_batch(x)
            y_lable = y
            print (y)
        else:
            x_generator = np.append(x_generator,base_model.predict_on_batch(x),axis=0)
            y_lable = np.append(y_lable,y,axis=0)
            print (y)
    x_generator,y_lable = shuffle(x_generator,y_lable)
    np.save(open('video_x_VGG16.npy', 'wb'), x_generator)
    np.save(open('video_y_VGG16.npy','wb'),y_lable)
    batch = 0
    x_generator = None
    y_lable = None
    for x,y in validation_generator:
        if batch == int(3971/batch_size):
            break
        print("Total needed:", int(3971/batch_size))
        print ("predict on batch validate:",batch)
        batch += 1
        if x_generator is None:
            x_generator = base_model.predict_on_batch(x)
            y_lable = y
            print (y)
        else:
            x_generator = np.append(x_generator,base_model.predict_on_batch(x),axis=0)
            y_lable = np.append(y_lable,y,axis=0)
            print (y)
    x_generator,y_lable = shuffle(x_generator,y_lable)
    np.save(open('video_x_validate_VGG16.npy', 'wb'),x_generator)
    np.save(open('video_y_validate_VGG16.npy','wb'),y_lable)
    train_data = np.load(open('video_x_VGG16.npy', 'rb'))
    train_labels = np.load(open('video_y_VGG16.npy', 'rb'))
    train_data,train_labels = shuffle(train_data,train_labels)
    validation_data = np.load(open('video_x_validate_VGG16.npy', 'rb'))
    validation_labels = np.load(open('video_y_validate_VGG16.npy', 'rb'))
    validation_data,validation_labels = shuffle(validation_data,validation_labels)
    # merge the two spatial axes: (samples, 7, 7, 512) -> (samples, 49, 512)
    train_data = train_data.reshape(train_data.shape[0],
                                    train_data.shape[1] * train_data.shape[2],
                                    train_data.shape[3])
    validation_data = validation_data.reshape(validation_data.shape[0],
                                              validation_data.shape[1] * validation_data.shape[2],
                                              validation_data.shape[3])
    return train_data,train_labels,validation_data,validation_labels
def train_model(train_data,train_labels,validation_data,validation_labels):
    print("SHAPE OF DATA : {}".format(train_data.shape))
    model = Sequential()
    model.add(LSTM(2048, stateful=True, activation='relu', kernel_regularizer=l2(0.0000001), activity_regularizer=l2(0.0000001), kernel_initializer='glorot_uniform', return_sequences=True, bias_initializer='zeros', dropout=0.2 , batch_input_shape=( batch_size_train, train_data.shape[1], train_data.shape[2])))
    model.add(LSTM(1024, stateful=True, activation='relu', kernel_regularizer=l2(0.0000001), activity_regularizer=l2(0.0000001), kernel_initializer='glorot_uniform', return_sequences=True, bias_initializer='zeros', dropout=0.2))
    model.add(LSTM(512, stateful=True, activation='relu', kernel_regularizer=l2(0.0000001), activity_regularizer=l2(0.0000001), kernel_initializer='glorot_uniform', return_sequences=True, bias_initializer='zeros', dropout=0.2))
    model.add(LSTM(128, stateful=True, activation='relu', kernel_regularizer=l2(0.0000001), activity_regularizer=l2(0.0000001), kernel_initializer='glorot_uniform', bias_initializer='zeros', dropout=0.2))
    model.add(Dense(1024, kernel_regularizer=l2(0.01), activity_regularizer=l2(0.01), kernel_initializer='random_uniform', bias_initializer='zeros', activation='relu'))
    model.add(Dense(4, kernel_initializer='random_uniform', bias_initializer='zeros', activation='softmax'))
    adam = Adam(lr=0.00005, decay = 1e-6, clipnorm=1.0, clipvalue=0.5)
    model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
    callbacks = [ EarlyStopping(monitor='val_loss', patience=10, verbose=0), ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0), ModelCheckpoint('video_1_LSTM_1_1024.h5', monitor='val_loss', save_best_only=True, verbose=0) ]
    nb_epoch = 500
    model.fit(train_data,train_labels,validation_data=(validation_data,validation_labels),batch_size=batch_size_train,nb_epoch=nb_epoch,callbacks=callbacks,shuffle=True,verbose=1)
    return model
Train on 55796 samples, validate on 3552 samples
Epoch 1/500
55796/55796 [==============================] - 209s 4ms/step - loss: 2.0079 - acc: 0.4518 - val_loss: 1.6785 - val_acc: 0.6166
Epoch 2/500
55796/55796 [==============================] - 205s 4ms/step - loss: 1.3974 - acc: 0.8347 - val_loss: 1.3561 - val_acc: 0.6740
Epoch 3/500
55796/55796 [==============================] - 205s 4ms/step - loss: 1.1181 - acc: 0.8628 - val_loss: 1.1961 - val_acc: 0.7311
Epoch 4/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.9644 - acc: 0.8689 - val_loss: 1.1276 - val_acc: 0.7218
Epoch 5/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.8681 - acc: 0.8703 - val_loss: 1.0483 - val_acc: 0.7435
Epoch 6/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.7944 - acc: 0.8717 - val_loss: 0.9755 - val_acc: 0.7641
Epoch 7/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.7296 - acc: 0.9245 - val_loss: 0.9444 - val_acc: 0.8260
Epoch 8/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.6670 - acc: 0.9866 - val_loss: 0.8486 - val_acc: 0.8426
Epoch 9/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.6121 - acc: 0.9943 - val_loss: 0.8455 - val_acc: 0.8708
Epoch 10/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.5634 - acc: 0.9964 - val_loss: 0.8335 - val_acc: 0.8553
Epoch 11/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.5216 - acc: 0.9973 - val_loss: 0.9688 - val_acc: 0.7838
Epoch 12/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.4841 - acc: 0.9986 - val_loss: 0.8166 - val_acc: 0.8133
Epoch 13/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.4522 - acc: 0.9984 - val_loss: 0.8399 - val_acc: 0.8184
Epoch 14/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.4234 - acc: 0.9987 - val_loss: 0.7864 - val_acc: 0.8072
Epoch 15/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.3977 - acc: 0.9990 - val_loss: 0.7306 - val_acc: 0.8446
Epoch 16/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.3750 - acc: 0.9990 - val_loss: 0.7644 - val_acc: 0.8514
Epoch 17/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.3546 - acc: 0.9989 - val_loss: 0.7542 - val_acc: 0.7908
Epoch 18/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.3345 - acc: 0.9994 - val_loss: 0.7150 - val_acc: 0.8314
Epoch 19/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.3170 - acc: 0.9993 - val_loss: 0.8910 - val_acc: 0.7798
Epoch 20/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.3017 - acc: 0.9992 - val_loss: 0.6143 - val_acc: 0.8809
Epoch 21/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.2861 - acc: 0.9995 - val_loss: 0.7907 - val_acc: 0.8156
Epoch 22/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.2719 - acc: 0.9996 - val_loss: 0.7077 - val_acc: 0.8401
Epoch 23/500
55796/55796 [==============================] - 206s 4ms/step - loss: 0.2593 - acc: 0.9995 - val_loss: 0.6482 - val_acc: 0.8133
Epoch 24/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.2474 - acc: 0.9995 - val_loss: 0.7671 - val_acc: 0.7942
The problem appears to be that the model starts to overfit, and on the testing dataset it makes significant detection errors. As far as I can see, either the model can't tell the difference between these two actions, or there is a problem with the sequences.
As you can see, I've already tried regularization, clipping and so on, with no result.
I would appreciate any advice on how to fix this problem.
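One detail in the code above that may be worth checking is what the reshape hands to the LSTM. With include_top=False and 224x224 inputs, VGG16 produces a 7x7x512 feature map per image, so merging the two spatial axes yields 49 time steps of 512 features for every single image. A NumPy sketch with a small batch (the batch of 3 images is an arbitrary assumption):

```python
import numpy as np

# Stand-in for VGG16 features of 3 images: (images, height, width, channels).
features = np.zeros((3, 7, 7, 512), dtype='float32')

# Same kind of reshape as in the post: merge the two spatial axes.
sequences = features.reshape(features.shape[0],
                             features.shape[1] * features.shape[2],
                             features.shape[3])

num_images, time_steps, feature_size = sequences.shape
print(num_images, time_steps, feature_size)
```

Under that reading, each LSTM sequence walks over spatial positions inside one image, not over consecutive frames of a clip. If the goal is to model motion over time, the frames of one clip would presumably have to be grouped along the time axis instead.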

Very low accuracy on digit recognition dataset with images having 4 channels, using Convolutional Neural Networks

I am currently working on a digit recognition challenge by Analytics Vidhya, the link to which is .
The images in the dataset for this challenge have dimensions 28*28*4 (28 = height = width, 4 = number of channels). The code I have implemented is:
from keras.models import Sequential
from keras.layers import Dense,Dropout,Flatten,Activation
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
import numpy as np
# fix random seed for reproducibility
seed = 7
# define the larger model
def larger_model():
    # create model
    model = Sequential()
    model.add(Conv2D(32, (3, 3), input_shape=(4, 28, 28),activation='relu',padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(15, (3, 3), activation='relu',padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten()) # flatten the feature maps before the dense layers
    model.add(Dense(200, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
from os import listdir
from skimage import io
def loadImages(path):
    # return array of images
    imagesList = listdir(path)
    loadedImages = []
    for image in imagesList:
        img = io.imread(path + "/" + image, as_gray = False)
        loadedImages.append(img)
    return loadedImages
path = "C:/Users/Farz Jamal/Downloads/mnist/Train/Images/train" #path_to_train_dataset
imgs = loadImages(path)
import pandas as pd
df = pd.read_csv("C:/Users/Farz Jamal/Downloads/mnist/Train/train.csv") #path_to_class_labels
y = np.array(df['label'])
from sklearn.model_selection import train_test_split as ttt
x_train,x_val,y_train,y_val = ttt(imgs,y,test_size = 0.2)
Continued Code:
x_vall,x_test,y_vall,y_test = ttt(x_val,y_val,test_size = 0.4)
x_train,x_vall,x_test = np.array(x_train).astype('float32'),np.array(x_vall).astype('float32'),np.array(x_test).astype('float32')
# normalize inputs from 0-255 to 0-1
x_train = x_train / 255.0
x_vall = x_vall / 255.0
x_test = x_test / 255.0
y_train = np_utils.to_categorical(y_train)
y_vall = np_utils.to_categorical(y_vall)
y_test = np_utils.to_categorical(y_test)
num_classes = y_vall.shape[1] #10
model = larger_model()
# Fit the model
model.fit(x_train, y_train, validation_data=(x_vall, y_vall), epochs=50, batch_size=200)
# Final evaluation of the model
scores = model.evaluate(x_test, y_test, verbose=0)
The output is as follows (from the 16th epoch to the 37th epoch):
Epoch 16/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.3013 - acc: 0.1135 - val_loss: 2.3015 - val_acc: 0.1095
Epoch 17/50
39200/39200 [==============================] - 275s 7ms/step - loss: 2.3011 - acc: 0.1128 - val_loss: 2.3014 - val_acc: 0.1095
Epoch 18/50
39200/39200 [==============================] - 270s 7ms/step - loss: 2.3011 - acc: 0.1124 - val_loss: 2.3015 - val_acc: 0.1095
Epoch 19/50
39200/39200 [==============================] - 273s 7ms/step - loss: 2.3012 - acc: 0.1131 - val_loss: 2.3017 - val_acc: 0.1095
Epoch 20/50
39200/39200 [==============================] - 273s 7ms/step - loss: 2.3011 - acc: 0.1130 - val_loss: 2.3018 - val_acc: 0.1111
Epoch 21/50
39200/39200 [==============================] - 272s 7ms/step - loss: 2.3010 - acc: 0.1127 - val_loss: 2.3013 - val_acc: 0.1095
Epoch 22/50
39200/39200 [==============================] - 281s 7ms/step - loss: 2.3006 - acc: 0.1133 - val_loss: 2.3015 - val_acc: 0.1097
Epoch 23/50
39200/39200 [==============================] - 273s 7ms/step - loss: 2.3005 - acc: 0.1136 - val_loss: 2.3018 - val_acc: 0.1099
Epoch 24/50
39200/39200 [==============================] - 276s 7ms/step - loss: 2.3005 - acc: 0.1135 - val_loss: 2.3022 - val_acc: 0.1116
Epoch 25/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.2998 - acc: 0.1155 - val_loss: 2.3025 - val_acc: 0.1071
Epoch 26/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.2996 - acc: 0.1156 - val_loss: 2.3021 - val_acc: 0.1100
Epoch 27/50
39200/39200 [==============================] - 272s 7ms/step - loss: 2.2981 - acc: 0.1168 - val_loss: 2.3024 - val_acc: 0.1078
Epoch 28/50
39200/39200 [==============================] - 270s 7ms/step - loss: 2.2970 - acc: 0.1187 - val_loss: 2.3035 - val_acc: 0.1065
Epoch 29/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.2945 - acc: 0.1218 - val_loss: 2.3061 - val_acc: 0.1041
Epoch 30/50
39200/39200 [==============================] - 270s 7ms/step - loss: 2.2935 - acc: 0.1223 - val_loss: 2.3059 - val_acc: 0.1003
Epoch 31/50
39200/39200 [==============================] - 274s 7ms/step - loss: 2.2906 - acc: 0.1268 - val_loss: 2.3067 - val_acc: 0.1014
Epoch 32/50
39200/39200 [==============================] - 276s 7ms/step - loss: 2.2873 - acc: 0.1278 - val_loss: 2.3078 - val_acc: 0.1073
Epoch 33/50
39200/39200 [==============================] - 292s 7ms/step - loss: 2.2806 - acc: 0.1368 - val_loss: 2.3118 - val_acc: 0.1034
Epoch 34/50
39200/39200 [==============================] - 301s 8ms/step - loss: 2.2744 - acc: 0.1404 - val_loss: 2.3160 - val_acc: 0.1022
Epoch 35/50
39200/39200 [==============================] - 289s 7ms/step - loss: 2.2662 - acc: 0.1486 - val_loss: 2.3172 - val_acc: 0.1029
Epoch 36/50
39200/39200 [==============================] - 295s 8ms/step - loss: 2.2557 - acc: 0.1543 - val_loss: 2.3162 - val_acc: 0.1087
Epoch 37/50
39200/39200 [==============================] - 308s 8ms/step - loss: 2.2459 - acc: 0.1632 - val_loss: 2.3275 - val_acc: 0.1083
As can be seen, both the training and the validation accuracy are very low.
I have tried reducing Dropout (previously it was 0.5 for one of the layers), but it had no effect. I doubled the neurons in the last hidden layer (previously there were 100), still with no effect. It seems to have something to do with the preprocessing of the images or with the input parameters for the image.
What can be done?
Copied in from comments as the answer:
In fact your model isn't learning anything, which usually points to a bug. I don't see anything overtly wrong. A common error is inputting garbage to the network accidentally. Take the first few images that you're feeding to the network and display them in a debugger before your fit step and print out the labels and make sure they match. Do a sanity check on your inputs.
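A sketch of one such sanity check in plain Python. It assumes the label file has a filename column next to label (the post only shows label, so the filename column and the helper name align_labels are assumptions) and pairs every image with its label by name, instead of relying on the order in which the directory listing returns the files, which is arbitrary:

```python
def align_labels(image_names, label_rows):
    """Return labels ordered like image_names.

    label_rows is a list of (filename, label) pairs, e.g. read from a CSV file.
    Raises KeyError if an image has no label, which is itself a useful check.
    """
    label_by_name = dict(label_rows)
    return [label_by_name[name] for name in image_names]

# Directory listings come back in arbitrary order, so simulate a scrambled one.
image_names = ['10.png', '2.png', '0.png', '1.png']
label_rows = [('0.png', 4), ('1.png', 9), ('2.png', 1), ('10.png', 7)]

labels = align_labels(image_names, label_rows)
for name, label in zip(image_names, labels):
    print(name, label)  # inspect a few pairs by eye, next to the displayed image
```

If the labels were instead taken straight from the CSV in file order, the first image here would be paired with the label of 0.png, which is the kind of mismatch that leaves a network stuck near chance.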
