Predicting New ImageNet Pictures - keras

I'm having problems predicting new images based on the ImageNet database, where objects are placed on different backgrounds. Predicting the classes works well on various models (ResNets and VGG), but it doesn't work at all for DenseNets and Inception. I'm thinking this is related to the labels, but I can't figure out what's wrong exactly.
The labels are ordered from 0 to 999 (divided into folders) and are in the same order as the original dataset.
I've also tried changing the labels to the synset ID format (e.g. n02676566). This didn't work for DenseNet either (ResNets still worked fine).
Code:
model = keras.applications.densenet.DenseNet201(weights='imagenet', classes=1000)
labels = [str(i) for i in range(0, 1000)]
generator_imagenet = ImageDataGenerator().flow_from_directory(path, target_size=(224, 224), batch_size=64, class_mode='categorical', shuffle=True, classes=labels)
model.compile(optimizer='SGD', loss='categorical_crossentropy', metrics=['acc'])
model.evaluate_generator(generator_imagenet)
Output ResNet 152:
Found 243 images belonging to 1000 classes.
[1.6771895689238245, 0.5679012397189199]
Output DenseNet 201:
Found 243 images belonging to 1000 classes.
[15.938421453468102, 0.0]

You will need to preprocess the images using the preprocess_input function provided with each keras_applications model.
Make the following changes:
from keras.applications.densenet import preprocess_input

generator_imagenet = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory(
    path,
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical',
    shuffle=True,
    classes=labels)
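Each keras.applications family ships its own preprocess_input. DenseNet's variant rescales pixels to [0, 1] and normalizes with the ImageNet channel statistics, while the ResNet/VGG variant only subtracts the channel means, which is likely why raw 0-255 inputs degrade ResNet's score gracefully but break DenseNet completely. As a rough single-image sanity check (img_path is an assumed path to one of your test images):

import numpy as np
from keras.preprocessing import image
from keras.applications.densenet import DenseNet201, preprocess_input, decode_predictions

model = DenseNet201(weights='imagenet')
img = image.load_img(img_path, target_size=(224, 224))  # img_path is assumed
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
print(decode_predictions(model.predict(x), top=3))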

Related

Do images need to be rescaled before predicting with a model trained with ImageDataGenerator(1./255)?

After training a model with ImageDataGenerator(rescale=1./255), do I need to rescale images before predicting?
I thought it was necessary, but my experiments said no.
I trained a ResNet50 model with 37 classes in the top layer.
The model was trained with ImageDataGenerator like this:
datagen = ImageDataGenerator(rescale=1./255)
generator = datagen.flow_from_directory(
    directory=os.path.join(os.getcwd(), data_folder),
    target_size=(224, 224),
    batch_size=256,
    classes=None,
    class_mode='categorical')
history = model.fit_generator(generator, steps_per_epoch=generator.n / 256, epochs=10)
Accuracy reached 98% after 10 epochs on my train dataset.
The problem is, when I tried to predict each image in the TRAIN dataset, the prediction was wrong (the result was 33 whatever the input image was):
import cv2
import numpy as np

img_p = './data/pets/shiba_inu/shiba_inu_27.jpg'
img = cv2.imread(img_p, cv2.IMREAD_COLOR)
img = cv2.resize(img, (224, 224))
img_arr = np.zeros((1, 224, 224, 3))
img_arr[0, :, :, :] = img / 255.
pred = model.predict(img_arr)
yhat = np.argmax(pred, axis=1)
yhat is 5, but y is 33
When I replace this line
img_arr[0, :, :, :] = img / 255.
by this
img_arr[0, :, :, :] = img
yhat is exactly 33.
Someone might suggest using predict_generator() instead of predict(), but I want to understand what I did wrong here.
I figured out what was wrong here.
I'm using an ImageNet pre-trained model, which does NOT rescale images by dividing by 255. I have to use resnet50.preprocess_input before training/testing.
The preprocess_input function can be found here:
https://github.com/keras-team/keras-applications/blob/master/keras_applications/imagenet_utils.py
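A minimal sketch of the corrected prediction path, assuming the model was also (re)trained on preprocess_input-processed batches; note that preprocess_input expects RGB input, while cv2 loads BGR:

import cv2
import numpy as np
from keras.applications.resnet50 import preprocess_input

img = cv2.imread(img_p, cv2.IMREAD_COLOR)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # cv2 loads BGR; preprocess_input expects RGB
img = cv2.resize(img, (224, 224))
x = preprocess_input(np.expand_dims(img.astype('float32'), axis=0))
yhat = np.argmax(model.predict(x), axis=1)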
You must apply every preprocessing step that you used on the training data to each sample you feed to your trained network. For example, when you rescale the training images and train a network, the network learns to take a matrix with entries between 0 and 1 and find the proper category. If, after the training phase, you feed an image without rescaling, you feed a matrix with entries between 0 and 255 to a network that never learned how to treat such a matrix.
If you are following exactly the same pre-processing as at training time, then look at the part of your code where you predict the class using yhat = np.argmax(pred, axis=1). My hunch is that there might be a class mismatch in the indexing. To check how your classes are indexed when you use flow_from_directory, use class_map = generator.class_indices; this returns a dictionary that shows how your classes are mapped to indices.
Note: I state this because I've faced a similar problem. Keras's flow_from_directory doesn't sort classes the way you might expect, so it's quite possible that your prediction class 1 lies at index 10 while np.argmax returns you class 1.
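A quick sketch of that check, assuming generator is the training generator used above (the example class names are made up):

class_map = generator.class_indices                  # e.g. {'abyssinian': 0, 'beagle': 1, ...}
idx_to_class = {v: k for k, v in class_map.items()}
print(idx_to_class[int(yhat[0])])                    # label text behind the predicted index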

Why is the validation accuracy constant at 20%?

I am trying to implement a 5-class animal classifier using Keras. I am building the CNN from scratch, and the weird thing is that the validation accuracy stays constant at 0.20 for all epochs. Any idea why this is happening? The dataset folder contains train, test and validation folders, and each of them contains 5 subfolders corresponding to the 5 classes. What am I doing wrong?
I have tried multiple optimizers but the problem persists. I have included the code sample below.
import warnings
warnings.filterwarnings("ignore")

from datetime import datetime
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, BatchNormalization
from keras.optimizers import RMSprop
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import History
#First convolution layer
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu',kernel_initializer='he_normal',input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
#Second convolution layer
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu',kernel_initializer='he_normal',input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
#Flatten the outputs of the convolution layer into a 1D contiguous array
model.add(Flatten())
#Add a fully connected layer containing 256 neurons
model.add(Dense(256, activation='relu',kernel_initializer='he_normal'))
model.add(BatchNormalization())
#Add another fully connected layer containing 256 neurons
model.add(Dense(256, activation='relu',kernel_initializer='he_normal'))
model.add(BatchNormalization())
#Add the output layer containing 5 neurons, because we have 5 categories
model.add(Dense(5, activation='softmax',kernel_initializer='glorot_uniform'))
optim=RMSprop(lr=1e-6)
model.compile(loss='categorical_crossentropy',optimizer=optim,metrics=['accuracy'])
model.summary()
#We will use the below code snippet for rescaling the images to 0-1 for all the train and test images
train_datagen = ImageDataGenerator(rescale=1./255)
#We won't augment the test data. We will just use ImageDataGenerator to rescale the images.
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(train_data_dir,
classes=['frog', 'giraffe', 'horse', 'tiger','dog'],
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='categorical',
shuffle=False)
validation_generator = test_datagen.flow_from_directory(validation_data_dir,
classes=['frog', 'giraffe', 'horse', 'tiger','dog'],
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='categorical',
shuffle=False)
hist=History()
model.fit_generator(train_generator,
steps_per_epoch=nb_train_samples // batch_size,
epochs=epochs,
validation_data=validation_generator,
validation_steps=nb_validation_samples // batch_size,
callbacks=[hist])
model.save('models/basic_cnn_from_scratch_model.h5') #Save the model. Load later with: from keras.models import load_model; model = load_model('models/basic_cnn_from_scratch_model.h5')
print("Time taken to train the baseline model from scratch: ",datetime.now()-global_start)
Check the following for your data:
Shuffle the training data well (I see shuffle=False everywhere).
Properly normalize all the data (you are doing rescale=1./255, which is probably okay).
Use a proper train/val split (you seem to be doing that too).
Suggestions for your model:
Use multiple Conv2D layers followed by a final Dense layer. That's what works best for image classification problems. You can also look at popular architectures that are tried and tested, e.g. AlexNet.
Change the optimizer to Adam and try different learning rates (see the sketch below).
Have a look at your training and validation loss graphs and see if they look as expected.
Also, I assume you corrected the shape of the 2nd Conv2D layer as mentioned in the comments.
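A minimal sketch of the shuffle and optimizer changes, reusing the variables from the question; the learning rate below is an assumed starting point, not a tested value:

from keras.optimizers import Adam

train_generator = train_datagen.flow_from_directory(train_data_dir,
    classes=['frog', 'giraffe', 'horse', 'tiger', 'dog'],
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True)  # shuffle the training data

model.compile(loss='categorical_crossentropy',
    optimizer=Adam(lr=1e-4),  # assumed starting point; tune as needed
    metrics=['accuracy'])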
It looks as if your output is always the same animal, hence the 20% accuracy. I highly recommend you check your test outputs to see if they are all the same.
Also, you said that you were building a CNN, but in the code snippet you posted I see only dense layers; it is going to be hard for a dense architecture to do this task, and it is very small. What is the size of your pictures?
Hope it helps!
The model seems to be working now. I removed the shuffle=False attribute, corrected the input shape for the 2nd convolution layer, and changed the optimizer to Adam. I have reached a validation accuracy of almost 94%. However, I have not yet tested the model on unseen data. There is a bit of overfitting in the model, so I will have to use some aggressive dropout to reduce it. Thanks!

vgg16 model not converging

I tried to train Keras's VGG16 on my dataset of orca calls with 8 classes: 7 classes for orca calls and 1 negative class. The data consists of spectrograms of orca calls and their corresponding labels.
Even after 50 epochs of training, the loss and accuracy are not changing at all. I have tried feature scaling and various learning rates, but the model is not learning anything. I have checked the data manually; it is fine.
code:
model = keras.applications.vgg16.VGG16(include_top=True, weights=None, input_tensor=None, input_shape=(320, 480, 3), pooling='max', classes=8)
opt = keras.optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-8, decay=1e-6, amsgrad=False)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])  # pass the configured optimizer, not the string 'adam', or the lr/decay settings above are ignored
model.fit(x=X_train, y=Y_train, batch_size=16, epochs=2000, shuffle=True, validation_split=0.2)
I have read some research papers and applied contrast adjustment and other preprocessing to the images according to those papers. The upper half of each image is turned white, as the frequencies I am interested in lie in the other half. I chose cmap='gray', as the frequencies stood out more in it. Should I try another cmap?
Some Images from my dataset:
Kaggle kernel: https://www.kaggle.com/sainimohit23/myorcanb

Modify ResNet50 output layer for regression

I am trying to create a ResNet50 model for a regression problem, with an output value ranging from -1 to 1.
I omitted the classes argument, and in my preprocessing step I resize my images to (224, 224, 3).
I try to create the model with
def create_resnet(load_pretrained=False):
    if load_pretrained:
        weights = 'imagenet'
    else:
        weights = None

    # Get base model
    base_model = ResNet50(weights=weights)

    optimizer = Adam(lr=1e-3)
    base_model.compile(loss='mse', optimizer=optimizer)

    return base_model
and then create the model, print the summary, and use fit_generator to train:
history = model.fit_generator(batch_generator(X_train, y_train, 100, 1),
                              steps_per_epoch=300,
                              epochs=10,
                              validation_data=batch_generator(X_valid, y_valid, 100, 0),
                              validation_steps=200,
                              verbose=1,
                              shuffle=1)
I get an error, though, that says:
ValueError: Error when checking target: expected fc1000 to have shape (1000,) but got array with shape (1,)
Looking at the model summary, this makes sense, since the final Dense layer has an output shape of (None, 1000):
fc1000 (Dense) (None, 1000) 2049000 avg_pool[0][0]
But I can't figure out how to modify the model. I've read through the Keras documentation and looked at several examples, but pretty much everything I see is for a classification model.
How can I modify the model so it is formatted properly for regression?
Your code is throwing the error because you're using the original fully-connected top layer, which was trained to classify images into one of 1000 classes. To make the network work, you need to replace this top layer with one of your own, whose shape is compatible with your dataset and task.
Here is a small snippet I was using to create an ImageNet pre-trained model for the regression task (face landmarks prediction) with Keras:
NUM_OF_LANDMARKS = 136

def create_model(input_shape, top='flatten'):
    if top not in ('flatten', 'avg', 'max'):
        raise ValueError('unexpected top layer type: %s' % top)

    # connects base model with new "head"
    BottleneckLayer = {
        'flatten': Flatten(),
        'avg': GlobalAveragePooling2D(),
        'max': GlobalMaxPooling2D()
    }[top]

    base = InceptionResNetV2(input_shape=input_shape,
                             include_top=False,
                             weights='imagenet')

    x = BottleneckLayer(base.output)
    x = Dense(NUM_OF_LANDMARKS, activation='linear')(x)

    model = Model(inputs=base.inputs, outputs=x)
    return model
In your case, I guess you only need to replace InceptionResNetV2 with ResNet50. Essentially, you are creating a pre-trained model without top layers:
base = ResNet50(input_shape=input_shape, include_top=False)
And then attaching your custom layer on top of it:
x = Flatten()(base.output)
x = Dense(NUM_OF_LANDMARKS, activation='sigmoid')(x)
model = Model(inputs=base.inputs, outputs=x)
That's it.
You can also check this link from the Keras repository, which shows how ResNet50 is constructed internally. I believe it will give you some insight into the functional API and layer replacement.
Also, I would say that both regression and classification tasks are not that different if we're talking about fine-tuning pre-trained ImageNet models. The type of task mostly depends on your loss function and the top layer's activation function. Otherwise, you still have a fully-connected layer with N outputs but they are interpreted in a different way.
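Applied to the question's setup (a single target ranging from -1 to 1), a hedged sketch following the same pattern might use tanh as the top activation; the details below are assumptions, not code from the asker:

from keras.applications.resnet50 import ResNet50
from keras.layers import Flatten, Dense
from keras.models import Model
from keras.optimizers import Adam

base = ResNet50(input_shape=(224, 224, 3), include_top=False, weights='imagenet')
x = Flatten()(base.output)
output = Dense(1, activation='tanh')(x)  # tanh keeps the single output in [-1, 1]
model = Model(inputs=base.inputs, outputs=output)
model.compile(loss='mse', optimizer=Adam(lr=1e-3))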

Keras flow_from_directory class index

I used to do this manually, but I am now using flow_from_directory to train my network on my own data. I just have one question: when I call model.predict(), how can I know that index 0 in the predictions is for label category dog and index 1 is for category cats?
The code I am using is the following.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_images_path,
    target_size=(64, 64),
    batch_size=batch_size)

validation_generator = test_datagen.flow_from_directory(
    validate_images_path,
    target_size=(64, 64),
    batch_size=batch_size)
early_stopping = keras.callbacks.EarlyStopping(monitor='val_acc', min_delta=0, patience=3, verbose=1, mode='auto')

history = model.fit_generator(
    train_generator,
    steps_per_epoch=1700,
    epochs=epochs,
    verbose=1,
    callbacks=[early_stopping],
    validation_data=validation_generator,
    validation_steps=196)
What I wanted to know is the pairing of images vs. ground-truth labels.
Thank you.
You can get the index of each class generated by the generator from the class_indices property.
print(validation_generator.class_indices)
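A minimal sketch of turning a prediction index back into a label, assuming train_generator from the code above; batch is an assumed variable holding a batch of preprocessed images:

idx_to_label = {v: k for k, v in train_generator.class_indices.items()}
preds = model.predict(batch)  # `batch` is an assumed variable
print([idx_to_label[i] for i in preds.argmax(axis=1)])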
Simple...
When you gather the data, you define that; there is no rule. But a simple way to check is:
see what your first training image is, and look at it yourself: is it a cat or a dog?
then see the training Y (result/class/desired output): is it [0,1] or [1,0]?
This will answer your question.
For getting one sample from a generator, you can see this question: How to get one value from a generator in Python?
As defined in the Keras documentation, the generator output is a tuple of (inputs, targets).
It's pretty simple. When you pre-process your data, just replace the class labels with specific integers (you can call them ids). Then, when you compute the loss or accuracy from the model's output, compare the prediction with the ground truth in terms of the integer labels (ids).
If you need the label text, you can recover it from the id (integer).
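A tiny sketch of that id <-> label idea (the label names are made up):

labels = ['cat', 'dog']                                     # assumed label names
label_to_id = {name: i for i, name in enumerate(labels)}    # 'cat' -> 0, 'dog' -> 1
id_to_label = {i: name for name, i in label_to_id.items()}  # 0 -> 'cat'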
