I am trying to train the following CNN model:
model.add(Conv2D(32, (3, 3), kernel_initializer='random_uniform', activation='relu', input_shape=(x1, x2, depth)))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(BatchNormalization())
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.4))
model.add(Conv2D(64, (3, 3)))  # kernel size as a tuple; Conv2D(64, 3, 3) would set strides=3 in Keras 2
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(3, activation='softmax'))
Here's how I'm compiling it:
sgd = optimizers.SGD(lr=0.1, decay=0.0, momentum=0.05, nesterov=True)
model.compile(loss='categorical_crossentropy',
optimizer=sgd,
metrics=['accuracy'])
I've tried various learning rates and different optimizers, but the accuracy doesn't seem to go beyond 50%.
My images are properly normalized to zero mean and unit standard deviation.
Is there something I am missing? How can I improve the accuracy of the model?
EDIT:
Hey, when I use the following data generator:
train_datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(featurewise_center=True,
                                  featurewise_std_normalization=True)
# featurewise_center/featurewise_std_normalization need statistics fitted on the data first
train_datagen.fit(np.array(X_train))
test_datagen.fit(np.array(X_train))
# pass the labels to flow() so each generator yields (x, y) batches
# (y_test_cat is assumed to be the one-hot test labels, analogous to y_train_cat)
train_generator = train_datagen.flow(np.array(X_train), y_train_cat, batch_size=batchsize)
valid_generator = test_datagen.flow(np.array(X_test), y_test_cat, batch_size=batchsize)
history = model.fit_generator(train_generator,
                              steps_per_epoch=len(X_train) // batchsize, epochs=epochs,
                              validation_data=valid_generator,
                              validation_steps=len(X_test) // batchsize)
I get the following error:
TypeError: '>' not supported between instances of 'int' and 'str'
I used to solve this by updating numpy, or by uninstalling and reinstalling it, but this time neither works. Can you help me with it?
You have already covered things like tweaking learning rates, dropout, batch normalization, etc., which is a good starting point.
Have you tried regularization?
Check out
https://cambridgespark.com/content/tutorials/neural-networks-tuning-techniques/index.html
If that doesn't help, you might need to look at how the input is structured and see whether there are other representations that are more helpful for the network to converge. This includes making sure that the train and validation sets have the same level of variance in the data. This is, however, more specific to the domain of the problem you are trying to solve.
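On the regularization point, a minimal sketch of adding L2 weight decay to one of your Conv2D layers might look like this (the 0.01 factor is just a placeholder you would need to tune):
from keras import regularizers
# penalizing large kernel weights often reduces overfitting
model.add(Conv2D(64, (3, 3), activation='relu',
                 kernel_regularizer=regularizers.l2(0.01)))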
I have the code below, which implements an architecture (in a grid search) to find appropriate parameters for the input, nodes, epochs, batch size, and differenced time series input.
The challenge I have is to convert the neural network from having just one LSTM hidden layer to multiple LSTM hidden layers.
At the moment, I can only run the code with Dense-type hidden layers without any errors being thrown; otherwise I get dimension errors, tuple errors, and so on.
The problem is confined to the neural network architecture section.
Original code that works:
def model_fit(train, config):
    # unpack config
    n_input, n_nodes, n_epochs, n_batch, n_diff = config
    # data: difference the series if requested
    if n_diff > 0:
        train = difference(train, n_diff)
    # time series to supervised format
    data = series_to_supervised(train, n_in=n_input)
    train_x, train_y = data[:, :-1], data[:, -1]
    # reshape input data into [samples, timesteps, features]
    n_features = 1
    train_x = train_x.reshape((train_x.shape[0], train_x.shape[1], n_features))
    # define model (grid search architecture)
    model = Sequential()
    model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features)))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(n_nodes, activation='relu'))
    model.add(Dense(1))
    # compile model (grid search architecture)
    model.compile(loss='mse', optimizer='adam')
    # fit model
    model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch, verbose=0)
    return model
Modified LSTM hidden-layer code that fails to run:
# Define model for (Grid search architecture)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
Another variant that also threw an error - ValueError: Error when checking target: expected time_distributed_4 to have 3 dimensions, but got array with shape (34844, 1)
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=False))
model.add(RepeatVector(n_input))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
Could anyone with any suggestions please help me?
Try setting return_sequences=False on the last LSTM layer.
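A minimal sketch with that fix applied, reusing n_nodes, n_input, and n_features from the grid-search config: the final LSTM then returns a single vector per sample, so a plain Dense(1) output matches the (samples, 1) target.
model = Sequential()
model.add(LSTM(n_nodes, activation='relu', input_shape=(n_input, n_features), return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
model.add(LSTM(n_nodes, activation='relu', return_sequences=True))
# the last LSTM returns only its final output, collapsing the time dimension
model.add(LSTM(n_nodes, activation='relu', return_sequences=False))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')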
I made a Keras CNN model to predict different hand poses, but the model was not predicting the correct output. I have 10 classes, yet for some images it shows results like [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]. My question is: why is this happening?
My architecture:
model = Sequential()
model.add(Conv2D(32, (5,5), input_shape=x.shape[1:]))
model.add(Conv2D(32, (5,5), input_shape=x.shape[1:]))
model.add(Conv2D(32, (5,5), input_shape=x.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(2,2))
model.add(Conv2D(64, (3,3), input_shape=x.shape[1:]))
model.add(Conv2D(64, (3,3), input_shape=x.shape[1:]))
model.add(Conv2D(64, (3,3), input_shape=x.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(2,2))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(x, y, epochs=10)
You are using the binary_crossentropy loss, which should be used for binary classification problems. For multiclass problems you should use categorical_crossentropy. You might also want to change the activation on the last layer to softmax, as sketched below.
This is the obvious engineering issue I can see; that said, you will probably have to experiment with the number of layers, epochs, learning rates, etc. to get a working model.
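As a hedged sketch, the output layer and compile step would then look like this, assuming the labels y are converted to one-hot vectors first:
from keras.utils import to_categorical
model.add(Dense(10))
model.add(Activation('softmax'))  # one probability per class, summing to 1
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
y = to_categorical(y, num_classes=10)  # one-hot labels to match categorical_crossentropy
model.fit(x, y, epochs=10)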
I'm trying to build a model to predict whether a picture has text in it, using Keras with the TensorFlow backend.
This is my Model:
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(image_size, image_size, 3))) # 32?
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5)) # 0.5?
model.add(Conv2D(32, (3, 3))) # again, 32?
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5)) # again, 0.5?
model.add(Conv2D(64, (3, 3))) # again, 64?
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5)) # again, 0.5?
model.add(Flatten())
model.add(Dense(96))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1)) # binary
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
I've tried image sizes 128, 256, 384.
I train it with 9000 images, 4500 of cat 1 and 4500 of cat 2.
The training accuracy goes as high as 0.90.
But when I load the model and use it to predict the category of images from the two categories that weren't part of the training set, it always gives the score 0.
Any ideas why this is happening?
P.S. Cat 1 is images with text and cat 2 is images without text.
This is also my code for testing the model:
model = load_model(model_path)
test_data_generator = ImageDataGenerator(rescale=1. / 255)
test_generator = test_data_generator.flow_from_directory(
    test_data_dir,
    target_size=(image_size, image_size),
    batch_size=batch_size,
    class_mode=None,
    shuffle=False)
prediction = model.predict_generator(
    test_generator,
    use_multiprocessing=True,
    verbose=1)  # verbose=1 makes it show a progress bar
dst = []
for pred in prediction:
    # each pred is a length-1 array holding the sigmoid output
    if int(round(float(pred[0]))) == 0:
        dst.append(0)
    else:
        dst.append(1)
dst is all 0's.
Apparently the problem was that I used Keras' generator to feed the images I was testing, but used shutil to iterate over the files. So I was pairing the predictions with the files in the directory using two different mechanisms whose ordering wasn't the same: shutil listed the files in some arbitrary order, whereas the generator lists them in ascending order by name, so I wasn't putting each label on the file it was generated for.
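For reference, one way to avoid that mismatch is to take the file order from the generator itself instead of listing the directory separately; a small sketch using the test_generator defined above (shuffle=False keeps the order stable):
# the generator records filenames in the exact order it serves them
filenames = test_generator.filenames
for name, pred in zip(filenames, prediction):
    label = int(round(float(pred[0])))  # threshold the sigmoid output at 0.5
    print(name, label)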
I have a dataset of about 160k images in 160 classes, and I'm trying to classify them using a CNN. Training on 120k images for 20 epochs, I start with loss ~4.9 and val_loss ~4.6, which improve to about 3.3 and 3.2 after 20 epochs. I really tried to read the Keras documentation and understand what that means, but I couldn't, so I'm asking if someone could explain it in the context of my model. What does the loss score represent? What does it say about the model?
num_classes = 154
batch_size = 64
input_shape = (50,50,3)
epochs = 20
X, y = load_data()
# input image dimensions
img_rows, img_cols = 50, 50
# (a train/test split of X, y into x_train/x_test, y_train/y_test is assumed here)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Conv2D(64, kernel_size=(5, 5),
                 activation='relu',
                 padding='same',
                 input_shape=input_shape))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(3, 3)))
model.add(Dropout(0.20))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
I think it will help to watch a few online tutorials on CNNs, to begin with. Basically, you want your loss to decrease over the training epochs, which is what is observed in your case. Typically we look at how both losses evolve over the entire training period: observing how the training and validation loss change helps us understand whether the model is overfitting. You can check this link for a basic explanation of how to detect overfitting.
Ideally, you would want both your training and validation loss to decrease with iterations. The loss is a measure of the error the model commits in classification; as accuracy increases you expect the loss to decrease.
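As a rough sketch, you can compare the two curves from the History object that fit returns, assuming you pass the test set as validation data; diverging curves (training loss still falling while validation loss rises) are the usual sign of overfitting:
import matplotlib.pyplot as plt
history = model.fit(x_train, y_train, batch_size=batch_size,
                    epochs=epochs, validation_data=(x_test, y_test))
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.legend()
plt.show()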
I want to train my model on video data for gesture recognition, and I propose using LSTMs and TimeDistributed layers. Would this be an ideal way to tackle my problem?
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
model.add(TimeDistributed(Dense(62), input_shape=(img_width, img_height,3)))
model.add(Conv2D(32, (3, 3)))
model.add(Dropout(0.25))
model.add(Conv2D(32, (3, 3)))
model.add(MaxPooling2D(pool_size=pool_size))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units = 1, activation = 'sigmoid'))
print('Train...')
model.summary()
# Run epochs of sampling data then training
For temporal sequence data, LSTM networks are generally the right choice. If you want to analyze video, then a combination with 2D convolutions sounds reasonable to me. However, you have to apply TimeDistributed to all layers which don't expect sequence data. In your example, that means all layers except the LSTM layers.
# Convolution
pool_size = 4
# LSTM
lstm_output_size = 1
print('Build model...')
model = Sequential()
model.add(TimeDistributed(Dense(62), input_shape=(img_width, img_height,3)))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(Dropout(0.25))
model.add(TimeDistributed(Conv2D(32, (3, 3))))
model.add(TimeDistributed(MaxPooling2D(pool_size=pool_size)))
# model.add(Dense(1))
model.add(TimeDistributed(Flatten()))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(256, return_sequences=True))
model.add(CuDNNLSTM(lstm_output_size))
model.add(Dense(units = 1, activation = 'sigmoid'))
print('Train...')
model.summary()
# run epochs of sampling data then training
The last Dense layer can stay as it is, because the final LSTM doesn't output a sequence.
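One caveat to hedge: when TimeDistributed is the first layer, its input_shape has to include the time axis as well, so for a sequence of frames the first line would look something like this (n_frames is a hypothetical sequence-length parameter, not taken from the original code):
# n_frames is an assumed number of frames per clip
model.add(TimeDistributed(Dense(62), input_shape=(n_frames, img_width, img_height, 3)))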