image_dim_ordering - what am I missing here? - keras

EDIT: Could not reproduce this issue using CUDA 8.0 and a Titan X (Pascal).
Using the tensorflow backend for keras, I have issues that are related to image_dim_ordering.
When I use image_dim_ordering='th' in the keras config file, everything works well. But when I use 'tf', training simply doesn't improve much beyond 0.5 accuracy.
The motivation is that currently my live augmentations are very costly, and I'd love to remove the unneeded reshape from the Theano dim-order convention to TensorFlow's.
I tried recreating the issue with simple code to allow reproduction by other people, which may help me understand what I am doing wrong here. I'm well aware of the different channel/height/width conventions, and at least I think that I handle them correctly.
While I didn't fully reproduce my problem in the compact example (maybe because it's a trivial task), the training results are repeatedly different, and much worse for the 'tf' case, even when I try different seed values.
Note: in this reproduction code, all that the network needs to do is tell apart full patches of -1.0 from full patches of 1.0.
This is my '~/.keras/keras.json'
{
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_dim_ordering": "th"
}
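As a side note, the ordering that is actually in effect at runtime can be verified with the same backend call the reproduction code below relies on; a quick check:

from keras import backend as K
print(K.image_dim_ordering())  # prints 'th' or 'tf', per ~/.keras/keras.json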
My tensorflow version is '0.11.0rc0' (it happened on 0.10 as well).
My keras is the latest git pull as of today.
Using 'th' for the image_dim_ordering, I get accuracy >= 0.99 at epoch 4 for three different seeds.
Using 'tf' for the dim order, I get accuracy >= 0.9 much later, as you can see below in the log, only at around epoch 24.
The following is standalone code that should reproduce the problem:
from keras import backend as K
import keras.optimizers
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense, Input
from keras.models import Model
import numpy as np

def make_model(input_dim_size):
    if K.image_dim_ordering() == 'tf':
        input_shape = (input_dim_size, input_dim_size, 1)
    else:
        input_shape = (1, input_dim_size, input_dim_size)
    img_input = Input(shape=input_shape)
    x = Convolution2D(64, 5, 5, border_mode='same')(img_input)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Convolution2D(64, 5, 5, border_mode='same')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Convolution2D(64, 5, 5, border_mode='same')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Convolution2D(128, 5, 5, border_mode='same')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Convolution2D(128, 5, 5, border_mode='same')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = Flatten()(x)
    x = Dense(1024 * 2)(x)
    x = Activation('relu')(x)
    x = Dropout(0.5)(x)
    x = Dense(1024 * 2)(x)
    x = Activation('relu')(x)
    x = Dropout(0.75)(x)
    x = Dense(200)(x)
    x = Activation('relu')(x)
    x = Dropout(0.75)(x)
    x = Dense(1, activation='sigmoid')(x)
    model = Model(img_input, x)
    learning_rate = 0.01
    sgd = keras.optimizers.SGD(lr=learning_rate, momentum=0.9, nesterov=True)
    model.summary()
    model.compile(loss='binary_crossentropy',
                  optimizer=sgd,
                  metrics=['accuracy'])
    return model

np.random.seed(456)

def dummy_generator(mini_batch_size=64, block_size=100):
    if K.image_dim_ordering() == 'tf':
        tensor_X_shape = (mini_batch_size, block_size, block_size, 1)
    else:
        tensor_X_shape = (mini_batch_size, 1, block_size, block_size)
    X = np.zeros(tensor_X_shape, dtype=np.float32)
    y = np.zeros((mini_batch_size, 1))
    while True:
        for b in range(mini_batch_size):
            X[b, :, :, :] = (float(b % 2) * 2.0) - 1.0
            y[b, :] = float(b % 2)
        yield X, y

with K.tf.device('/gpu:2'):
    K.set_session(K.tf.Session(config=K.tf.ConfigProto(allow_soft_placement=True, log_device_placement=False)))
    MINI_BATCH_SIZE = 64
    PATCH_SIZE = 100
    model = make_model(PATCH_SIZE)
    gen = dummy_generator(mini_batch_size=MINI_BATCH_SIZE, block_size=PATCH_SIZE)
    model.fit_generator(gen, MINI_BATCH_SIZE * 10,
                        100, verbose=1,
                        callbacks=[],
                        validation_data=None,
                        nb_val_samples=None,
                        max_q_size=1,
                        nb_worker=1, pickle_safe=False)
For the 'tf' case, this is the training log (it looks very similar with different seeds):
Epoch 1/100
640/640 [==============================] - 1s - loss: 0.6932 - acc: 0.4781
Epoch 2/100
640/640 [==============================] - 0s - loss: 0.6932 - acc: 0.4938
Epoch 3/100
640/640 [==============================] - 0s - loss: 0.6921 - acc: 0.5203
Epoch 4/100
640/640 [==============================] - 0s - loss: 0.6920 - acc: 0.5469
Epoch 5/100
640/640 [==============================] - 0s - loss: 0.6935 - acc: 0.4875
Epoch 6/100
640/640 [==============================] - 0s - loss: 0.6941 - acc: 0.4969
Epoch 7/100
640/640 [==============================] - 0s - loss: 0.6937 - acc: 0.5047
Epoch 8/100
640/640 [==============================] - 0s - loss: 0.6931 - acc: 0.5312
Epoch 9/100
640/640 [==============================] - 0s - loss: 0.6923 - acc: 0.5250
Epoch 10/100
640/640 [==============================] - 0s - loss: 0.6929 - acc: 0.5281
Epoch 11/100
640/640 [==============================] - 0s - loss: 0.6934 - acc: 0.4953
Epoch 12/100
640/640 [==============================] - 0s - loss: 0.6918 - acc: 0.5234
Epoch 13/100
640/640 [==============================] - 0s - loss: 0.6930 - acc: 0.5125
Epoch 14/100
640/640 [==============================] - 0s - loss: 0.6939 - acc: 0.4797
Epoch 15/100
640/640 [==============================] - 0s - loss: 0.6936 - acc: 0.5047
Epoch 16/100
640/640 [==============================] - 0s - loss: 0.6917 - acc: 0.4922
Epoch 17/100
640/640 [==============================] - 0s - loss: 0.6945 - acc: 0.4891
Epoch 18/100
640/640 [==============================] - 0s - loss: 0.6948 - acc: 0.5000
Epoch 19/100
640/640 [==============================] - 0s - loss: 0.6968 - acc: 0.4594
Epoch 20/100
640/640 [==============================] - 0s - loss: 0.6919 - acc: 0.5391
Epoch 21/100
640/640 [==============================] - 0s - loss: 0.6904 - acc: 0.5172
Epoch 22/100
640/640 [==============================] - 0s - loss: 0.6881 - acc: 0.5906
Epoch 23/100
640/640 [==============================] - 0s - loss: 0.6804 - acc: 0.6359
Epoch 24/100
640/640 [==============================] - 0s - loss: 0.6470 - acc: 0.8219
Epoch 25/100
640/640 [==============================] - 0s - loss: 0.4134 - acc: 0.9625
Epoch 26/100
640/640 [==============================] - 0s - loss: 0.2347 - acc: 0.9953
Epoch 27/100
640/640 [==============================] - 0s - loss: 0.1231 - acc: 1.0000
And for the 'th' case, the training log is (again, very similar with different seeds):
Epoch 1/100
640/640 [==============================] - 3s - loss: 0.6891 - acc: 0.5594
Epoch 2/100
640/640 [==============================] - 2s - loss: 0.6079 - acc: 0.7328
Epoch 3/100
640/640 [==============================] - 2s - loss: 0.3166 - acc: 0.9422
Epoch 4/100
640/640 [==============================] - 2s - loss: 0.1767 - acc: 0.9969
I find it suspicious that it's so fast in the tensorflow case (0s), but after adding debug printing to the generator, it does seem to get called.
I thought that maybe it's related to keras not needing to reshape anything, but 2-3 seconds sounds like too much time for this number of reshapes.
If anyone can try to reproduce the results that I see and help me understand what the heck I am missing here, I'd be grateful :)

This thread is a bit old, but I am still replying in case someone faces the same issue.
The error is caused by an inconsistent Keras backend configuration:
{
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_dim_ordering": "th"
}
The configuration uses tensorflow as the backend, but uses Theano's image dimension ordering instead of tensorflow's. Change image_dim_ordering to 'tf' and that should solve the issue:
"image_dim_ordering": "tf"

Related

Classify activity of a person going into and out of a car (behavior detection)

I'm working on the problem of classifying the activities of getting out of a car and getting in.
I also need to classify loading and unloading activity near the car.
I need advice on how to fix the problem of the model overfitting on the testing dataset.
I'm using a CNN + LSTM architecture. In the attachment I've provided samples of the dataset.
I have around 15,000 images for each class.
[Dataset example images: three 'go in' samples and two 'go out' samples]
Now let's get to the code.
First I load my dataset using keras:
from keras.preprocessing.image import ImageDataGenerator

batch_size = 128
batch_size_train = 148

def bring_data_from_directory():
    datagen = ImageDataGenerator(rescale=1./255)
    train_generator = datagen.flow_from_directory(
        'train',
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode='categorical',  # yields batches of (images, one-hot labels)
        shuffle=True,
        classes=['get_on', 'get_off', 'load', 'unload'])
    validation_generator = datagen.flow_from_directory(
        'validate',
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode='categorical',  # yields batches of (images, one-hot labels)
        shuffle=True,
        classes=['get_on', 'get_off', 'load', 'unload'])
    return train_generator, validation_generator
Then I use the VGG16 network to extract features and store them in .npy format:
from keras.applications.vgg16 import VGG16
from sklearn.utils import shuffle
import numpy as np

def load_VGG16_model():
    base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
    print("Model loaded..!")
    print(base_model.summary())
    return base_model

def extract_features_and_store(train_generator, validation_generator, base_model):
    x_generator = None
    y_lable = None
    batch = 0
    for x, y in train_generator:
        if batch == int(56021 / batch_size):
            break
        print("Total needed:", int(56021 / batch_size))
        print("predict on batch:", batch)
        batch += 1
        if x_generator is None:  # was: np.any(x_generator)==None, which is buggy
            x_generator = base_model.predict_on_batch(x)
            y_lable = y
            print(y)
        else:
            x_generator = np.append(x_generator, base_model.predict_on_batch(x), axis=0)
            y_lable = np.append(y_lable, y, axis=0)
            print(y)
    x_generator, y_lable = shuffle(x_generator, y_lable)
    np.save(open('video_x_VGG16.npy', 'wb'), x_generator)
    np.save(open('video_y_VGG16.npy', 'wb'), y_lable)
    batch = 0
    x_generator = None
    y_lable = None
    for x, y in validation_generator:
        if batch == int(3971 / batch_size):
            break
        print("Total needed:", int(3971 / batch_size))
        print("predict on batch validate:", batch)
        batch += 1
        if x_generator is None:
            x_generator = base_model.predict_on_batch(x)
            y_lable = y
            print(y)
        else:
            x_generator = np.append(x_generator, base_model.predict_on_batch(x), axis=0)
            y_lable = np.append(y_lable, y, axis=0)
            print(y)
    x_generator, y_lable = shuffle(x_generator, y_lable)
    np.save(open('video_x_validate_VGG16.npy', 'wb'), x_generator)
    np.save(open('video_y_validate_VGG16.npy', 'wb'), y_lable)
    train_data = np.load(open('video_x_VGG16.npy', 'rb'))
    train_labels = np.load(open('video_y_VGG16.npy', 'rb'))
    train_data, train_labels = shuffle(train_data, train_labels)
    print(train_data)
    validation_data = np.load(open('video_x_validate_VGG16.npy', 'rb'))
    validation_labels = np.load(open('video_y_validate_VGG16.npy', 'rb'))
    validation_data, validation_labels = shuffle(validation_data, validation_labels)
    train_data = train_data.reshape(train_data.shape[0],
                                    train_data.shape[1] * train_data.shape[2],
                                    train_data.shape[3])
    validation_data = validation_data.reshape(validation_data.shape[0],
                                              validation_data.shape[1] * validation_data.shape[2],
                                              validation_data.shape[3])
    return train_data, train_labels, validation_data, validation_labels
Model
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
from keras.regularizers import l2
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint

def train_model(train_data, train_labels, validation_data, validation_labels):
    print("SHAPE OF DATA : {}".format(train_data.shape))
    model = Sequential()
    model.add(LSTM(2048, stateful=True, activation='relu', kernel_regularizer=l2(0.0000001), activity_regularizer=l2(0.0000001),
                   kernel_initializer='glorot_uniform', return_sequences=True, bias_initializer='zeros', dropout=0.2,
                   batch_input_shape=(batch_size_train, train_data.shape[1], train_data.shape[2])))
    model.add(LSTM(1024, stateful=True, activation='relu', kernel_regularizer=l2(0.0000001), activity_regularizer=l2(0.0000001),
                   kernel_initializer='glorot_uniform', return_sequences=True, bias_initializer='zeros', dropout=0.2))
    model.add(LSTM(512, stateful=True, activation='relu', kernel_regularizer=l2(0.0000001), activity_regularizer=l2(0.0000001),
                   kernel_initializer='glorot_uniform', return_sequences=True, bias_initializer='zeros', dropout=0.2))
    model.add(LSTM(128, stateful=True, activation='relu', kernel_regularizer=l2(0.0000001), activity_regularizer=l2(0.0000001),
                   kernel_initializer='glorot_uniform', bias_initializer='zeros', dropout=0.2))
    model.add(Dense(1024, kernel_regularizer=l2(0.01), activity_regularizer=l2(0.01),
                    kernel_initializer='random_uniform', bias_initializer='zeros', activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(4, kernel_initializer='random_uniform', bias_initializer='zeros', activation='softmax'))
    adam = Adam(lr=0.00005, decay=1e-6, clipnorm=1.0, clipvalue=0.5)
    model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
    callbacks = [EarlyStopping(monitor='val_loss', patience=10, verbose=0),
                 ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0),
                 ModelCheckpoint('video_1_LSTM_1_1024.h5', monitor='val_loss', save_best_only=True, verbose=0)]
    nb_epoch = 500
    model.fit(train_data, train_labels, validation_data=(validation_data, validation_labels),
              batch_size=batch_size_train, nb_epoch=nb_epoch, callbacks=callbacks, shuffle=True, verbose=1)
    return model
LOGS
Train on 55796 samples, validate on 3552 samples
Epoch 1/500
55796/55796 [==============================] - 209s 4ms/step - loss: 2.0079 - acc: 0.4518 - val_loss: 1.6785 - val_acc: 0.6166
Epoch 2/500
55796/55796 [==============================] - 205s 4ms/step - loss: 1.3974 - acc: 0.8347 - val_loss: 1.3561 - val_acc: 0.6740
Epoch 3/500
55796/55796 [==============================] - 205s 4ms/step - loss: 1.1181 - acc: 0.8628 - val_loss: 1.1961 - val_acc: 0.7311
Epoch 4/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.9644 - acc: 0.8689 - val_loss: 1.1276 - val_acc: 0.7218
Epoch 5/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.8681 - acc: 0.8703 - val_loss: 1.0483 - val_acc: 0.7435
Epoch 6/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.7944 - acc: 0.8717 - val_loss: 0.9755 - val_acc: 0.7641
Epoch 7/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.7296 - acc: 0.9245 - val_loss: 0.9444 - val_acc: 0.8260
Epoch 8/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.6670 - acc: 0.9866 - val_loss: 0.8486 - val_acc: 0.8426
Epoch 9/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.6121 - acc: 0.9943 - val_loss: 0.8455 - val_acc: 0.8708
Epoch 10/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.5634 - acc: 0.9964 - val_loss: 0.8335 - val_acc: 0.8553
Epoch 11/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.5216 - acc: 0.9973 - val_loss: 0.9688 - val_acc: 0.7838
Epoch 12/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.4841 - acc: 0.9986 - val_loss: 0.8166 - val_acc: 0.8133
Epoch 13/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.4522 - acc: 0.9984 - val_loss: 0.8399 - val_acc: 0.8184
Epoch 14/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.4234 - acc: 0.9987 - val_loss: 0.7864 - val_acc: 0.8072
Epoch 15/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.3977 - acc: 0.9990 - val_loss: 0.7306 - val_acc: 0.8446
Epoch 16/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.3750 - acc: 0.9990 - val_loss: 0.7644 - val_acc: 0.8514
Epoch 17/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.3546 - acc: 0.9989 - val_loss: 0.7542 - val_acc: 0.7908
Epoch 18/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.3345 - acc: 0.9994 - val_loss: 0.7150 - val_acc: 0.8314
Epoch 19/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.3170 - acc: 0.9993 - val_loss: 0.8910 - val_acc: 0.7798
Epoch 20/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.3017 - acc: 0.9992 - val_loss: 0.6143 - val_acc: 0.8809
Epoch 21/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.2861 - acc: 0.9995 - val_loss: 0.7907 - val_acc: 0.8156
Epoch 22/500
55796/55796 [==============================] - 205s 4ms/step - loss: 0.2719 - acc: 0.9996 - val_loss: 0.7077 - val_acc: 0.8401
Epoch 23/500
55796/55796 [==============================] - 206s 4ms/step - loss: 0.2593 - acc: 0.9995 - val_loss: 0.6482 - val_acc: 0.8133
Epoch 24/500
55796/55796 [==============================] - 204s 4ms/step - loss: 0.2474 - acc: 0.9995 - val_loss: 0.7671 - val_acc: 0.7942
The problem is that the model appears to start overfitting, and on the testing dataset it makes significant detection errors. As far as I can see, the problem is that the model can't see the difference between these two actions, or maybe it's a sequence problem.
As you can see, I've already tried regularization, clipping, and so on. No result.
Any advice on how to fix this problem would be appreciated.
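No answer was posted in this thread, but one common first step against this kind of overfitting is augmenting the training images. A minimal sketch of extending the ImageDataGenerator from the question (the augmentation parameter values are illustrative assumptions, not tuned for this dataset):

from keras.preprocessing.image import ImageDataGenerator

# Augment only the training data; keep validation un-augmented.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=10,       # small random rotations
    width_shift_range=0.1,   # random horizontal shifts
    height_shift_range=0.1,  # random vertical shifts
    zoom_range=0.1,          # random zooms
    horizontal_flip=True)    # mirror images left-right
val_datagen = ImageDataGenerator(rescale=1./255)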

My convolutional network loss does not change and stays stagnant throughout training. How do I fix this?

I am trying to train a convolutional network, but the loss does not change no matter what I do. I want to know where I am going wrong and would also appreciate any friendly advice, as this is the first time I am dealing with such large data.
I have tried many combinations of optimizers (Adam, SGD, Adadelta, ...), loss functions (mean squared error, binary cross-entropy, ...) and activations (ReLU, ELU, SELU, ...), but the problem still persists.
Nature of my project: this is my attempt at training a simple self-driving car in simulation.
Training data: the training data is split across ~4000 .h5 files. Each file has exactly 200 images with respective data for each image, like speed, acceleration, etc.
Due to the nature of the data, I decided to train in mini-batches of 200 and cycle through all the files.
import h5py
import numpy as np
from keras.layers import Input, Conv2D, Dense, Flatten, concatenate
from keras.models import Model

# model (I am a beginner so forgive my sloppy code)
rgb_in = Input(batch_shape=(200, 88, 200, 3), name='rgb_in')
conv_1 = Conv2D(filters=10, kernel_size=5, activation="elu", data_format="channels_last", kernel_initializer="he_normal")(rgb_in)
conv_2 = Conv2D(filters=16, kernel_size=5, activation="elu", data_format="channels_last", kernel_initializer="he_normal")(conv_1)
conv_3 = Conv2D(filters=24, kernel_size=5, activation="elu", data_format="channels_last", kernel_initializer="he_normal")(conv_2)
conv_4 = Conv2D(filters=32, kernel_size=3, activation="elu", data_format="channels_last", kernel_initializer="he_normal")(conv_3)
conv_5 = Conv2D(filters=32, kernel_size=3, activation="elu", data_format="channels_last", kernel_initializer="he_normal")(conv_4)
flat = Flatten(data_format="channels_last")(conv_5)
t_in = Input(batch_shape=(200, 14), name='t_in')
x = concatenate([flat, t_in])
dense_1 = Dense(100, activation="elu", kernel_initializer="he_normal")(x)
dense_2 = Dense(50, activation="elu", kernel_initializer="he_normal")(dense_1)
dense_3 = Dense(25, activation="elu", kernel_initializer="he_normal")(dense_2)
out = Dense(5, activation="elu", kernel_initializer="he_normal")(dense_3)
model = Model(inputs=[rgb_in, t_in], outputs=[out])
model.compile(optimizer='Adadelta', loss='binary_crossentropy')

def one_hot(labels, num_classes):
    # assumed helper (not shown in the question): integer labels -> one-hot rows
    return np.eye(num_classes)[labels]

input_target = np.zeros((200, 14))  # assumed pre-allocation, not shown in the question
output = np.zeros((200, 5))         # assumed pre-allocation, not shown in the question

for i in range(3663, 6951):
    filename = 'data_0' + str(i) + '.h5'
    f = h5py.File(filename, 'r')
    rgb = f["rgb"][:, :, :, :]
    targets = f["targets"][:, :]
    rgb = (rgb - rgb.mean()) / rgb.std()
    input_target[:, 0] = targets[:, 10]
    input_target[:, 1] = targets[:, 11]
    input_target[:, 2] = targets[:, 12]
    input_target[:, 3] = targets[:, 13]
    input_target[:, 4] = targets[:, 16]
    input_target[:, 5] = targets[:, 17]
    input_target[:, 6] = targets[:, 18]
    input_target[:, 7] = targets[:, 21]
    input_target[:, 8] = targets[:, 22]
    input_target[:, 9] = targets[:, 23]
    a = one_hot(targets[:, 24].astype(int), 6)
    input_target[:, 10] = a[:, 2]
    input_target[:, 11] = a[:, 3]
    input_target[:, 12] = a[:, 4]
    input_target[:, 13] = a[:, 5]
    output[:, 0] = targets[:, 0]
    output[:, 1] = targets[:, 1]
    output[:, 2] = targets[:, 2]
    output[:, 3] = targets[:, 4]
    output[:, 4] = targets[:, 5]
    model.fit([rgb, input_target], output, epochs=10, batch_size=200)
The result:
Epoch 1/10
200/200 [==============================] - 7s 35ms/step - loss: 6.1657
Epoch 2/10
200/200 [==============================] - 0s 2ms/step - loss: 2.3812
Epoch 3/10
200/200 [==============================] - 0s 2ms/step - loss: 2.2955
Epoch 4/10
200/200 [==============================] - 0s 2ms/step - loss: 2.2778
Epoch 5/10
200/200 [==============================] - 0s 2ms/step - loss: 2.2778
Epoch 6/10
200/200 [==============================] - 0s 2ms/step - loss: 2.2778
Epoch 7/10
200/200 [==============================] - 0s 2ms/step - loss: 2.2778
Epoch 8/10
200/200 [==============================] - 0s 2ms/step - loss: 2.2778
Epoch 9/10
200/200 [==============================] - 0s 2ms/step - loss: 2.2778
Epoch 10/10
200/200 [==============================] - 0s 2ms/step - loss: 2.2778
Epoch 1/10
200/200 [==============================] - 0s 2ms/step - loss: 1.9241
Epoch 2/10
200/200 [==============================] - 0s 2ms/step - loss: 1.9241
Epoch 3/10
200/200 [==============================] - 0s 2ms/step - loss: 1.9241
Epoch 4/10
200/200 [==============================] - 0s 2ms/step - loss: 1.9241
Epoch 5/10
200/200 [==============================] - 0s 2ms/step - loss: 1.9241
Epoch 6/10
200/200 [==============================] - 0s 2ms/step - loss: 1.9241
Epoch 7/10
200/200 [==============================] - 0s 2ms/step - loss: 1.9241
Epoch 8/10
200/200 [==============================] - 0s 2ms/step - loss: 1.9241
Epoch 9/10
200/200 [==============================] - 0s 2ms/step - loss: 1.9241
Epoch 10/10
200/200 [==============================] - 0s 2ms/step - loss: 1.9241
And lastly, I would appreciate any advice regarding the project.
How about using a ReduceLROnPlateau callback?
from keras.callbacks import ReduceLROnPlateau

reduce_lr = ReduceLROnPlateau(monitor='loss', patience=6)
model.fit(X, y, epochs=666, callbacks=[reduce_lr])
I have used a cyclic learning rate and it has fixed the problem.
For whoever suffers a similar issue, here is a link to it:
https://github.com/bckenstler/CLR
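For context, that repository provides a Keras callback named CyclicLR; a minimal usage sketch based on its README (the base_lr, max_lr, and step_size values are illustrative, not tuned):

from clr_callback import CyclicLR  # clr_callback.py from the linked repository

# Cycle the learning rate between base_lr and max_lr over 2*step_size batches.
clr = CyclicLR(base_lr=0.001, max_lr=0.006, step_size=2000., mode='triangular')
model.fit(X, y, epochs=10, callbacks=[clr])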

resnet50 - increasing training speed? keras

How do I increase the speed of this? I mean, the loss is moving down by hairs. HAIRS.
Epoch 1/30
4998/4998 [==============================] - 307s 62ms/step - loss: 0.6861 - acc: 0.6347
Epoch 2/30
4998/4998 [==============================] - 316s 63ms/step - loss: 0.6751 - acc: 0.6387
Epoch 3/30
4998/4998 [==============================] - 357s 71ms/step - loss: 0.6676 - acc: 0.6387
Epoch 4/30
4998/4998 [==============================] - 376s 75ms/step - loss: 0.6625 - acc: 0.6387
Epoch 5/30
4998/4998 [==============================] - 354s 71ms/step - loss: 0.6592 - acc: 0.6387
Epoch 6/30
4998/4998 [==============================] - 345s 69ms/step - loss: 0.6571 - acc: 0.6387
Epoch 7/30
4998/4998 [==============================] - 349s 70ms/step - loss: 0.6559 - acc: 0.6387
Model Architecture:
resnet50 (CNN with skip connections)
Except instead of 1 FC layer I have two, and I changed the softmax output to sigmoid for binary classification.
num positive training data: 1806
num neg training data: 3192
My output is represented by a 1 or 0 for each example ( [0, 0, 1, 1, ...])
batch size = 40, num epochs = 30, but that doesn't matter because the loss has stalled.
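No answer was posted in this thread, but one commonly suggested way to get faster progress with ResNet50 on a small dataset is to start from ImageNet weights and first train only the new top layers. A hedged sketch under that assumption (head size and learning rate are illustrative, not from the question):

from keras.applications.resnet50 import ResNet50
from keras.layers import Dense
from keras.models import Model
from keras.optimizers import Adam

# Pretrained convolutional base with global average pooling instead of the FC top.
base = ResNet50(weights='imagenet', include_top=False, pooling='avg')
for layer in base.layers:
    layer.trainable = False  # freeze the base so only the new head is trained

x = Dense(256, activation='relu')(base.output)  # first FC layer
out = Dense(1, activation='sigmoid')(x)         # binary classification head
model = Model(base.input, out)
model.compile(optimizer=Adam(lr=1e-3), loss='binary_crossentropy', metrics=['accuracy'])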

Very low accuracy on digit recognition dataset with images having 4 channels, using Convolutional Neural Networks

I am currently working on a digit recognition challenge by Analytics Vidhya, the link to which is https://datahack.analyticsvidhya.com/contest/practice-problem-identify-the-digits/ .
The images in the dataset pertaining to this challenge are of dimensions 28*28*4 (28 = length = width, 4 = no. of channels). The code I have implemented is:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Activation
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
from keras import backend as K
K.set_image_dim_ordering('th')
import numpy as np
from os import listdir   # needed by loadImages below
from skimage import io   # needed by loadImages below

# fix random seed for reproducibility
seed = 7
np.random.seed(seed)

# define the larger model
def larger_model():
    # create model
    model = Sequential()
    model.add(Conv2D(32, (3, 3), input_shape=(4, 28, 28), activation='relu', padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Conv2D(15, (3, 3), activation='relu', padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(200, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

def loadImages(path):
    # return array of images
    imagesList = listdir(path)
    loadedImages = []
    for image in imagesList:
        img = io.imread(path + "/" + image, as_grey=False)
        loadedImages.append(np.array(img))
    return loadedImages

path = "C:/Users/Farz Jamal/Downloads/mnist/Train/Images/train"  # path_to_train_dataset
imgs = loadImages(path)  # assumed call; implied by the use of `imgs` below but missing in the question
import pandas as pd
df = pd.read_csv("C:/Users/Farz Jamal/Downloads/mnist/Train/train.csv")  # path_to_class_labels
y = np.array(df['label'])
from sklearn.cross_validation import train_test_split as ttt
x_train, x_val, y_train, y_val = ttt(imgs, y, test_size=0.2)
Continued Code:
x_vall, x_test, y_vall, y_test = ttt(x_val, y_val, test_size=0.4)
x_train, x_vall, x_test = np.array(x_train).astype('float32'), np.array(x_vall).astype('float32'), np.array(x_test).astype('float32')
# normalize inputs from 0-255 to 0-1
x_train = x_train / 255.0
x_vall = x_vall / 255.0
x_test = x_test / 255.0
y_train = np_utils.to_categorical(y_train)
y_vall = np_utils.to_categorical(y_vall)
y_test = np_utils.to_categorical(y_test)
num_classes = y_vall.shape[1]  # 10

# fitting and evaluating
model = larger_model()
# Fit the model
model.fit(x_train, y_train, validation_data=(x_vall, y_vall), epochs=50, batch_size=200)
# Final evaluation of the model
scores = model.evaluate(x_test, y_test, verbose=0)
The output comes out as follows (from the 16th epoch to the 37th epoch):
Epoch 16/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.3013 - acc: 0.1135 - val_loss: 2.3015 - val_acc: 0.1095
Epoch 17/50
39200/39200 [==============================] - 275s 7ms/step - loss: 2.3011 - acc: 0.1128 - val_loss: 2.3014 - val_acc: 0.1095
Epoch 18/50
39200/39200 [==============================] - 270s 7ms/step - loss: 2.3011 - acc: 0.1124 - val_loss: 2.3015 - val_acc: 0.1095
Epoch 19/50
39200/39200 [==============================] - 273s 7ms/step - loss: 2.3012 - acc: 0.1131 - val_loss: 2.3017 - val_acc: 0.1095
Epoch 20/50
39200/39200 [==============================] - 273s 7ms/step - loss: 2.3011 - acc: 0.1130 - val_loss: 2.3018 - val_acc: 0.1111
Epoch 21/50
39200/39200 [==============================] - 272s 7ms/step - loss: 2.3010 - acc: 0.1127 - val_loss: 2.3013 - val_acc: 0.1095
Epoch 22/50
39200/39200 [==============================] - 281s 7ms/step - loss: 2.3006 - acc: 0.1133 - val_loss: 2.3015 - val_acc: 0.1097
Epoch 23/50
39200/39200 [==============================] - 273s 7ms/step - loss: 2.3005 - acc: 0.1136 - val_loss: 2.3018 - val_acc: 0.1099
Epoch 24/50
39200/39200 [==============================] - 276s 7ms/step - loss: 2.3005 - acc: 0.1135 - val_loss: 2.3022 - val_acc: 0.1116
Epoch 25/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.2998 - acc: 0.1155 - val_loss: 2.3025 - val_acc: 0.1071
Epoch 26/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.2996 - acc: 0.1156 - val_loss: 2.3021 - val_acc: 0.1100
Epoch 27/50
39200/39200 [==============================] - 272s 7ms/step - loss: 2.2981 - acc: 0.1168 - val_loss: 2.3024 - val_acc: 0.1078
Epoch 28/50
39200/39200 [==============================] - 270s 7ms/step - loss: 2.2970 - acc: 0.1187 - val_loss: 2.3035 - val_acc: 0.1065
Epoch 29/50
39200/39200 [==============================] - 271s 7ms/step - loss: 2.2945 - acc: 0.1218 - val_loss: 2.3061 - val_acc: 0.1041
Epoch 30/50
39200/39200 [==============================] - 270s 7ms/step - loss: 2.2935 - acc: 0.1223 - val_loss: 2.3059 - val_acc: 0.1003
Epoch 31/50
39200/39200 [==============================] - 274s 7ms/step - loss: 2.2906 - acc: 0.1268 - val_loss: 2.3067 - val_acc: 0.1014
Epoch 32/50
39200/39200 [==============================] - 276s 7ms/step - loss: 2.2873 - acc: 0.1278 - val_loss: 2.3078 - val_acc: 0.1073
Epoch 33/50
39200/39200 [==============================] - 292s 7ms/step - loss: 2.2806 - acc: 0.1368 - val_loss: 2.3118 - val_acc: 0.1034
Epoch 34/50
39200/39200 [==============================] - 301s 8ms/step - loss: 2.2744 - acc: 0.1404 - val_loss: 2.3160 - val_acc: 0.1022
Epoch 35/50
39200/39200 [==============================] - 289s 7ms/step - loss: 2.2662 - acc: 0.1486 - val_loss: 2.3172 - val_acc: 0.1029
Epoch 36/50
39200/39200 [==============================] - 295s 8ms/step - loss: 2.2557 - acc: 0.1543 - val_loss: 2.3162 - val_acc: 0.1087
Epoch 37/50
39200/39200 [==============================] - 308s 8ms/step - loss: 2.2459 - acc: 0.1632 - val_loss: 2.3275 - val_acc: 0.1083
As can be seen, there is very low training as well as validation accuracy.
I have tried reducing the Dropout (previously it was 0.5 for one of the layers), but still no effect. I doubled the neurons in the last hidden layer (previously there were 100), still no effect. It seems like it has something to do with the preprocessing of the images or the input parameters for the images.
What can be done?
Copied in from comments as the answer:
In fact your model isn't learning anything, which usually points to a bug. I don't see anything overtly wrong, but a common error is accidentally feeding garbage inputs to the network. Take the first few images that you're feeding to the network, display them in a debugger before your fit step, print out the labels, and make sure they match. Do a sanity check on your inputs.
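A minimal sketch of that sanity check, displaying a few training images next to their decoded labels just before the fit step (it assumes matplotlib is available, x_train/y_train are as in the question with x_train already scaled to [0, 1], and images are height x width x channels as io.imread returns them):

import matplotlib.pyplot as plt

# Show the first 8 training images with their labels decoded from one-hot.
for i in range(8):
    plt.subplot(2, 4, i + 1)
    plt.imshow(x_train[i][:, :, :3])  # first 3 of the 4 channels, as RGB
    plt.title(int(y_train[i].argmax()))
    plt.axis('off')
plt.show()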

Keras autoencoder classification

I am trying to find useful code for improving classification using an autoencoder.
I followed this example: keras autoencoder vs PCA
But instead of MNIST data, I tried to use it with cifar-10,
so I made some changes, but it seems like something is not fitting.
Could anyone please help me with this?
If you have another example that can run on a different dataset, that would help.
The validation in reduced.fit, which is (X_test, Y_test), is not learned, so it gives the wrong accuracy in .evaluate(). It always gives
val_loss: 2.3026 - val_acc: 0.1000
This is the code, and the error:
from keras.datasets import cifar10
from keras.models import Model
from keras.layers import Input, Dense
from keras.utils import np_utils
import numpy as np
num_train = 50000
num_test = 10000
height, width, depth = 32, 32, 3 # CIFAR-10 images are 32x32 RGB
num_classes = 10 # there are 10 classes (1 per category)
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train = X_train.reshape(num_train,height * width * depth)
X_test = X_test.reshape(num_test,height * width*depth)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255 # Normalise data to [0, 1] range
X_test /= 255 # Normalise data to [0, 1] range
Y_train = np_utils.to_categorical(y_train, num_classes) # One-hot encode the labels
Y_test = np_utils.to_categorical(y_test, num_classes) # One-hot encode the labels
input_img = Input(shape=(height * width * depth,))
s=height * width * depth
x = Dense(s, activation='relu')(input_img)
encoded = Dense(s//2, activation='relu')(x)
encoded = Dense(s//8, activation='relu')(encoded)
y = Dense(s//256, activation='relu')(x)
decoded = Dense(s//8, activation='relu')(y)
decoded = Dense(s//2, activation='relu')(decoded)
z = Dense(s, activation='sigmoid')(decoded)
model = Model(input_img, z)
model.compile(optimizer='adadelta', loss='mse') # reporting the accuracy
model.fit(X_train, X_train,
          nb_epoch=10,
          batch_size=128,
          shuffle=True,
          validation_data=(X_test, X_test))
mid = Model(input_img, y)
reduced_representation = mid.predict(X_test)
out = Dense(num_classes, activation='softmax')(y)
reduced = Model(input_img, out)
reduced.compile(loss='categorical_crossentropy',
                optimizer='adam',
                metrics=['accuracy'])
reduced.fit(X_train, Y_train,
            nb_epoch=10,
            batch_size=128,
            shuffle=True,
            validation_data=(X_test, Y_test))
scores = reduced.evaluate(X_test, Y_test, verbose=0)
print("Accuracy: ", scores[1])
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
50000/50000 [==============================] - 5s - loss: 0.0639 - val_loss: 0.0633
Epoch 2/10
50000/50000 [==============================] - 5s - loss: 0.0610 - val_loss: 0.0568
Epoch 3/10
50000/50000 [==============================] - 5s - loss: 0.0565 - val_loss: 0.0558
Epoch 4/10
50000/50000 [==============================] - 5s - loss: 0.0557 - val_loss: 0.0545
Epoch 5/10
50000/50000 [==============================] - 5s - loss: 0.0536 - val_loss: 0.0518
Epoch 6/10
50000/50000 [==============================] - 5s - loss: 0.0502 - val_loss: 0.0461
Epoch 7/10
50000/50000 [==============================] - 5s - loss: 0.0443 - val_loss: 0.0412
Epoch 8/10
50000/50000 [==============================] - 5s - loss: 0.0411 - val_loss: 0.0397
Epoch 9/10
50000/50000 [==============================] - 5s - loss: 0.0391 - val_loss: 0.0371
Epoch 10/10
50000/50000 [==============================] - 5s - loss: 0.0377 - val_loss: 0.0403
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
50000/50000 [==============================] - 3s - loss: 2.3605 - acc: 0.0977 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 2/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0952 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 3/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0978 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 4/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0980 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 5/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0974 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 6/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.1000 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 7/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0992 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 8/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0982 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 9/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0965 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 10/10
50000/50000 [==============================] - 3s - loss: 2.3027 - acc: 0.0978 - val_loss: 2.3026 - val_acc: 0.1000
9856/10000 [============================>.] - ETA: 0s('Accuracy: ', 0.10000000000000001)
There are multiple issues with your code.
Your autoencoder is not fully trained; if you plot the training loss, you will see the model hasn't converged yet. By running
history = model.fit(X_train, X_train,
                    nb_epoch=10,
                    batch_size=128,
                    shuffle=True,
                    validation_data=(X_test, X_test))
you will obtain the loss values during training. If you plot them, e.g. in matplotlib,
import matplotlib.pyplot as plt
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model train vs validation loss 1')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()
you will see that it needs more epochs to converge.
The autoencoder architecture is wrongly built; there is a typo in the line y = Dense(s//256, activation='relu')(x). You probably wanted to use y = Dense(s//256, activation='linear')(encoded), so that it uses the previous layer and not the input. Also, you don't want to use the relu activation in the latent space, because it disallows subtracting latent variables from each other and thus makes the autoencoder much less efficient.
With those fixes, the model trains without problems.
I increased the number of epochs to 30 for training both networks so they train better.
At the end of training, the classification model reports loss: 1.2881 - acc: 0.5397 - val_loss: 1.3841 - val_acc: 0.5126, which is lower than what you experienced.
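For clarity, a minimal sketch of the corrected bottleneck described above (only the changed lines; the rest of the model is exactly as in the question):

# Bottleneck feeds from `encoded` (not the raw input) and uses a linear
# activation so latent codes are not clipped at zero by relu.
y = Dense(s//256, activation='linear')(encoded)
decoded = Dense(s//8, activation='relu')(y)
decoded = Dense(s//2, activation='relu')(decoded)
z = Dense(s, activation='sigmoid')(decoded)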
