How can I test my own image on my CNN model?

I'm a beginner programmer trying out image classification using a CNN. I'm aiming to build a model that classifies whether an image is an aluminum can or not, and I want to test it with my own image.
I've resized the images with the code below:
import os
from PIL import Image
from tensorflow.keras.utils import image_dataset_from_directory

# Resizing to 128x128 (resize() returns a new image, so save it back)
files = os.listdir("../input/aluminum-can-image-data/Aluminum Cans")
for f in files:
    img = Image.open("../input/aluminum-can-image-data/Aluminum Cans/" + f)
    img = img.resize((128, 128))
    img.save("../input/aluminum-can-image-data/Aluminum Cans/" + f)

# Split one directory into training and validation subsets
ds_train_ = image_dataset_from_directory(
    '../input/aluminum-can-image-data',
    labels='inferred',
    image_size=[128, 128],
    interpolation='nearest',
    batch_size=64,
    validation_split=0.2,
    subset='training',
    seed=1,
)
ds_valid_ = image_dataset_from_directory(
    '../input/aluminum-can-image-data',
    labels='inferred',
    image_size=[128, 128],
    interpolation='nearest',
    batch_size=64,
    validation_split=0.2,
    subset='validation',
    seed=1,
)
I want to write code that, given a single image, shows the percentage likelihood that the image is an aluminum can. Any help with building this function would be highly appreciated!

Let us suppose the model you use is named "model" and it has 2 output labels: "Aluminium" and "Not Aluminium".
Since the model expects a batch but you only need the prediction for a single image, you have to use np.expand_dims(image, axis=0) to add a batch dimension so the input works with the model.
Code:
class_names = ["alum", "not_alum"]  # 'class' is a reserved word in Python, so use another name
prediction = model.predict(np.expand_dims(image, axis=0))
confidence = round(100 * np.max(prediction[0]), 2)
argclass = np.argmax(prediction, axis=1)
print(class_names[argclass[0]])
print(confidence)
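For completeness, here is a minimal end-to-end sketch of loading a single image from disk and running it through the trained model above; the file path my_can.jpg is a placeholder for your own image, and model is assumed to be the trained model from the question:
import numpy as np
from tensorflow.keras.preprocessing import image

class_names = ["alum", "not_alum"]

# Load the image at the size the model was trained on (128x128)
img = image.load_img("my_can.jpg", target_size=(128, 128))  # hypothetical path
img = image.img_to_array(img)        # float32 array of shape (128, 128, 3)

# image_dataset_from_directory fed raw 0-255 pixels during training,
# so no extra rescaling is applied here either
batch = np.expand_dims(img, axis=0)  # shape (1, 128, 128, 3)

prediction = model.predict(batch)
confidence = round(100 * np.max(prediction[0]), 2)
print(class_names[np.argmax(prediction[0])], confidence)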

Related

Predicting single image using Tensorflow not being accurate

I'm trying to build a CNN model to classify an image, but whenever training is done and I feed it a single image (even one from the training dataset), it always misclassifies it.
Please take a look at the code I wrote below.
Thank you in advance.
First, I declared an Image Data Generator for both my training and testing sets:
train_datagen = ImageDataGenerator(rescale=1./255, rotation_range=20,
                                   horizontal_flip=True, validation_split=0.3)
test_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.3)
Then, I used the flow_from_directory() function to load the images:
train_generator = train_datagen.flow_from_directory(
    data_dir,
    shuffle=False,
    subset='training',
    target_size=(224, 224),
    class_mode='categorical'
)
test_generator = test_datagen.flow_from_directory(
    data_dir,
    shuffle=False,
    subset='validation',
    target_size=(224, 224),
    class_mode='categorical'
)
I then loaded a pretrained model and added a few layers to build my model:
pretrained_model = VGG16(weights="imagenet", include_top=False,
                         input_tensor=input_shape)
pretrained_model.trainable = False

model = tf.keras.Sequential([
    pretrained_model,
    Flatten(name="flatten"),
    Dense(3, activation="softmax")
])
I then trained the model:
INIT_LR = 3e-4
EPOCHS = 15

opt = Adam(lr=INIT_LR)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
H = model.fit(
    train_generator,
    validation_data=test_generator,
    epochs=EPOCHS,
    verbose=1)
Then came the part to predict a single image:
I chose an image that was part of the training set, and I even overfitted the model to make sure the predictions would be correct, but it gave me wrong results for every image I fed to the model.
I tried the following ways:
image = image.load_img(url,target_size = (224, 224))
img = tf.keras.preprocessing.image.img_to_array(image)
img = np.array([img])
img = img.astype('float32') / 255.
img = tf.keras.applications.vgg16.preprocess_input(img)
This didn't work.
image = cv2.imread(url)
image = cv2.normalize(image, None,beta=255, dtype=cv2.CV_32F)
image = cv2.resize(image, (224, 224))
image = np.expand_dims(image, axis=0)
This also didn't work; I tried many other ways to predict a single image, but none worked.
Finally, the only approach that worked was creating an ImageDataGenerator with flow_from_directory for that single image, but I believe that's not how it should be done.
The code img = tf.keras.applications.vgg16.preprocess_input(img) expects raw pixel values in the range 0 to 255: it converts the images from RGB to BGR and zero-centers each channel with respect to the ImageNet dataset (it does not expect inputs that have already been scaled to 0-1). In the previous line of code
img = img.astype('float32') / 255.
you already rescaled the pixels, so remove that line of code. Now to predict a single image you need to expand the dimensions with
img = np.expand_dims(img, axis=0)
In your second code attempt, be aware that cv2 reads in images as BGR. If your model was trained on RGB images then your predictions will be wrong. Use the code below to convert the image to RGB.
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
As a side note, you can replace tf.keras.applications.vgg16.preprocess_input(img) with the function below, which scales images to the range -1 to +1 (provided the same scaling was used during training):
def scalar(img):
    return img / 127.5 - 1
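The overriding rule is that prediction-time preprocessing must match training-time preprocessing. Since the generators in the question used rescale=1./255 (and no preprocess_input), a minimal sketch of a matching single-image prediction might look like this; the file path is a placeholder and model is the trained model from above:
import numpy as np
from tensorflow.keras.preprocessing import image

# Load and resize exactly as the training generator did
img = image.load_img("single_image.jpg", target_size=(224, 224))  # hypothetical path
img = image.img_to_array(img)       # float32, RGB, shape (224, 224, 3)

img = img / 255.0                   # same rescale=1./255 as in training

img = np.expand_dims(img, axis=0)   # model expects a batch: (1, 224, 224, 3)

pred = model.predict(img)
print(np.argmax(pred, axis=1))      # predicted class index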
This answer could be one starting point:
Resnet50 produces different prediction when image loading and resizing is done with OpenCV
These are the possible differences (short gist):
- RGB vs BGR (OpenCV loads images as BGR).
- The interpolation method used (INTER_LINEAR vs INTER_NEAREST).
- img_to_array() returns float32 data, whereas loading with OpenCV yields uint8 by default.
- tf.keras.applications.vgg16.preprocess_input(img): this preprocessing function can differ from the manual preprocessing you wrote above. Note also that if you did not preprocess the training data in this particular way (with preprocess_input()), poor results on the test set are to be expected, since the two preprocessings differ.
Hope these observations shed some light.
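To illustrate the first three points, here is a hedged sketch of loading with OpenCV so that it matches what load_img/img_to_array produce; the file path is a placeholder, and the interpolation choice assumes the training pipeline used bilinear resizing:
import cv2
import numpy as np

img = cv2.imread("single_image.jpg")        # hypothetical path; OpenCV loads BGR, uint8
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # match the RGB channel order of PIL/Keras
img = cv2.resize(img, (224, 224),
                 interpolation=cv2.INTER_LINEAR)  # pick the interpolation used in training
img = img.astype("float32")                 # match img_to_array()'s float32 output
img = np.expand_dims(img, axis=0)           # add the batch dimension
# ...then apply the same preprocessing (rescaling or preprocess_input) as in training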

ValueError: Output of generator should be a tuple `(x, y, sample_weight)` or `(x, y)` while using Fit_generator

I've been trying to train a model using a set of training and validation images, and I've been getting the error in the title (I'll post the full one below).
I'm a little lost on how to proceed, and the previous questions asked about this topic didn't yield results.
My code snippet is :
train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=90,
    horizontal_flip=True,
    vertical_flip=True,
)
val_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=90,
    horizontal_flip=True,
    vertical_flip=True,
)
train_generator = train_datagen.flow_from_directory(VAL_DIR,
                                                    target_size=(HEIGHT, WIDTH),
                                                    batch_size=TRAIN_BATCH_SIZE,
                                                    class_mode=None,
                                                    shuffle=True)
val_generator = val_datagen.flow_from_directory(TRAIN_DIR,
                                                target_size=(HEIGHT, WIDTH),
                                                batch_size=VAL_BATCH_SIZE,
                                                class_mode=None,
                                                shuffle=True)
and then I try to train the model using:
history = finetune_model.fit_generator(train_generator, epochs=NUM_EPOCHS, workers=8,
                                       steps_per_epoch=num_train_images // TRAIN_BATCH_SIZE,
                                       validation_data=val_generator,
                                       validation_steps=num_val_images // VAL_BATCH_SIZE,
                                       shuffle=True, callbacks=callbacks_list)
The error I get is:
ValueError: Output of generator should be a tuple `(x, y, sample_weight)` or `(x, y)`. Found: [[[[-5.30867195e+01 -6.81702271e+01 2.66113968e+01]
[-5.04675522e+01 -6.62993927e+01 2.90434952e+01]
[-4.78483849e+01 -6.44285583e+01 3.14755783e+01]
...
I'd love some direction, as I'm a beginning ML student; I'd be happy to provide more info.
The images I use are in JPEG format.
What can I do? I can't seem to find the issue.
Fixing the Bug:
Per the documentation, specifying class_mode=None gives a generator that only yields batches of image data, with no targets (intended for use with model.predict_generator()).
fit_generator needs a generator that yields pairs of (inputs, targets). So you can't fit your model to the generators you're using right now, because those generators don't say what targets the model is supposed to fit. You need to figure out what your labels are, and then choose the appropriate class_mode so that those labels get included by the data generators.
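For example, if each class has its own subdirectory under the data directory, a hedged sketch of the corrected calls might look like this (note in passing that the original snippet points train_generator at VAL_DIR and val_generator at TRAIN_DIR, which looks accidentally swapped):
# class_mode='categorical' makes the generator yield (images, one_hot_labels)
# tuples, which is the (x, y) format fit_generator expects
train_generator = train_datagen.flow_from_directory(TRAIN_DIR,
                                                    target_size=(HEIGHT, WIDTH),
                                                    batch_size=TRAIN_BATCH_SIZE,
                                                    class_mode='categorical',
                                                    shuffle=True)
val_generator = val_datagen.flow_from_directory(VAL_DIR,
                                                target_size=(HEIGHT, WIDTH),
                                                batch_size=VAL_BATCH_SIZE,
                                                class_mode='categorical',
                                                shuffle=True)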
Making Sure The Bug is Fixed:
Once you've selected the correct class_mode, you can sanity-check the data generator by printing/visualizing a batch [1]:
In this example, I'm doing multiclass (bunny, cat, dog) classification, so the default class_mode=categorical will work fine.
After getting images and labels for a batch, the first image in that batch is a dog, the first label for that batch is [0, 0, 1] (a one in the 2nd position, counting from zero), and the class_indices dict says that dog images have label 2 (counting from zero).
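A minimal sketch of that check (the bunny/cat/dog setup is the example described above; train_generator is the generator being verified):
# Pull one batch from the generator and inspect it
images, labels = next(train_generator)
print(images.shape)                   # (batch_size, HEIGHT, WIDTH, 3)
print(labels[0])                      # e.g. [0. 0. 1.] -> class index 2
print(train_generator.class_indices)  # e.g. {'bunny': 0, 'cat': 1, 'dog': 2}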
[1] In general, it's a good idea to always carefully check the data here, even if the data generator seems to be working; see this post, under "visualize just before the net".

Do images need to be rescaled before predicting with a model trained with ImageDataGenerator(1./255)?

After training a model with ImageDataGenerator(rescale=1./255), do I need to rescale the image before predicting?
I thought it was necessary, but my experiment said no.
I trained a ResNet50 model with 37 classes on the top layer.
The model was trained with ImageDataGenerator like this:
datagen = ImageDataGenerator(rescale=1./255)
generator = datagen.flow_from_directory(
    directory=os.path.join(os.getcwd(), data_folder),
    target_size=(224, 224),
    batch_size=256,
    classes=None,
    class_mode='categorical')

history = model.fit_generator(generator, steps_per_epoch=generator.n / 256, epochs=10)
Accuracy reached 98% after 10 epochs on my training dataset.
The problem is, when I tried to predict each image in the TRAIN dataset, the prediction was wrong (the result was 33 whatever the input image was):
img_p = './data/pets/shiba_inu/shiba_inu_27.jpg'
img = cv2.imread(img_p, cv2.IMREAD_COLOR)
img = cv2.resize(img, (224,224))
img_arr = np.zeros((1,224,224,3))
img_arr[0, :, :, :] = img / 255.
pred = model.predict(img_arr)
yhat = np.argmax(pred, axis=1)
yhat is 5, but y is 33
When I replace this line
img_arr[0, :, :, :] = img / 255.
by this
img_arr[0, :, :, :] = img
yhat is exactly 33.
Someone might suggest using predict_generator() instead of predict(), but I want to understand what I did wrong here.
I found out what's wrong here.
I'm using an ImageNet-pretrained model, which does NOT rescale images by dividing by 255. I have to use resnet50.preprocess_input before training/testing.
The preprocess_input function can be found here:
https://github.com/keras-team/keras-applications/blob/master/keras_applications/imagenet_utils.py
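A hedged sketch of keeping that preprocessing consistent on both sides, reusing the paths and names from the question (this assumes pretrained ResNet50 weights, which expect preprocess_input rather than a plain /255):
import cv2
import numpy as np
from tensorflow.keras.applications.resnet50 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Training side: let the generator apply the function the pretrained weights expect
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

# Prediction side: apply the very same function to the single image (no manual /255)
img = cv2.imread('./data/pets/shiba_inu/shiba_inu_27.jpg', cv2.IMREAD_COLOR)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)   # preprocess_input expects RGB input
img = cv2.resize(img, (224, 224)).astype('float32')
img_arr = preprocess_input(np.expand_dims(img, axis=0))

pred = model.predict(img_arr)
yhat = np.argmax(pred, axis=1)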
You must apply every preprocessing step that you apply to the training data to any data you feed to your trained network. For example, when you rescale the training images and train the network, the network learns to map matrices with entries between 0 and 1 to the proper category. If, after training, you feed it an image without rescaling, you are feeding a matrix with entries between 0 and 255 to a network that never learned how to treat such a matrix.
If your preprocessing is exactly the same as at training time, then look at the part of your code where you predict the class using yhat = np.argmax(pred, axis=1). My hunch is that there may be a class mismatch in the indexing. To check how your classes are indexed when you use flow_from_directory, use class_map = generator.class_indices; this returns a dictionary showing how your classes are mapped to indices.
Note: The reason I suggest this is that I've faced a similar problem: Keras's flow_from_directory maps class folders to indices in alphanumeric order, which may not match the numeric order you expect, so it's quite possible that your class 1 lies at index 10 while np.argmax returns the index, not the class name.
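A short sketch of that check, with the mapping inverted so a predicted index can be turned back into a class name (generator, model, and img_arr are the objects from the code above):
# Inspect how folder names were mapped to label indices
class_map = generator.class_indices            # e.g. {'abyssinian': 0, ..., 'shiba_inu': 33}
index_to_class = {v: k for k, v in class_map.items()}

pred = model.predict(img_arr)
yhat = int(np.argmax(pred, axis=1)[0])
print(yhat, index_to_class[yhat])              # predicted index and its class name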

Inference code for prediction in multi-class image classification

I am trying to take a single input image and predict its label. The training images were converted to arrays and the labels to ints, then combined into a single dataset using DatasetMixin before being fed to the classifier. So I have converted the image into an array.
When I tried the code below, this is the error I got:
Expect: in_types[0].shape[1] == in_types[1].shape[1] * 1
Actual: 240 != 3
img = cv2.imread('C:/Users/Dell/Desktop/TEST IMAGES/MONOCYTE.jpeg')
plt.imshow(img)
plt.show()
img = np.array(img, dtype=np.float32)
img = img / 255.0
x = Variable(np.asarray([img]))
y = model(x)
prediction = y.data.argmax(axis=1)
The details of the model are necessary for an accurate answer, but I guess that the model requires an array of shape (batch, channel, height, width), while the shape of the array you fed to the model seems to be (height, width, channel).
This may be the reason for the error message.
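If that is the case, a minimal sketch of the fix, following the question's own loading code, is to move the channel axis to the front before adding the batch dimension (this assumes the network was trained on channel-first, 0-1-scaled arrays, as in the question):
import cv2
import numpy as np
from chainer import Variable

img = cv2.imread('C:/Users/Dell/Desktop/TEST IMAGES/MONOCYTE.jpeg')
img = np.array(img, dtype=np.float32) / 255.0

# (height, width, channel) -> (channel, height, width)
img = img.transpose(2, 0, 1)

x = Variable(np.asarray([img]))   # batch dimension: (1, channel, height, width)
y = model(x)
prediction = y.data.argmax(axis=1)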

How to calculate class scores when batch size changes

My question is at the bottom, but first I will explain what I am attempting to achieve.
I have an example I am trying to implement with my own model. I am creating an adversarial image; in essence, I want to graph how the image's class scores change as the epsilon value changes.
So let's say my model has already been trained, and in this example I am using the following model...
x = tf.placeholder(tf.float32, shape=[None, 784])
...
...
# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits) # Softmax
Next, let us assume I extract an array of images of the number 2 from the MNIST data set and save it in the following variable:
# convert into a numpy array of shape [100, 784]
labels_of_2 = np.concatenate(labels_of_2, axis=0)
So now, in the example that I have, the next step is to try different epsilon values on every image...
# random epsilon values from -1.0 to 1.0
epsilon_res = 101
eps = np.linspace(-1.0, 1.0, epsilon_res).reshape((epsilon_res, 1))

labels = [str(i) for i in range(10)]

num_colors = 10
cmap = plt.get_cmap('hsv')
colors = [cmap(i) for i in np.linspace(0, 1, num_colors)]

# Create an empty array for our scores
scores = np.zeros((len(eps), 10))

for j in range(len(labels_of_2)):
    # Pick the image for this iteration
    x00 = labels_of_2[j].reshape((1, 784))

    # Calculate the sign of the derivative,
    # at the image and at the desired class label
    sign = np.sign(im_derivative[j])

    # Calculate the new scores for each adversarial image
    for i in range(len(eps)):
        x_fool = x00 + eps[i] * sign
        scores[i, :] = logits.eval({x: x_fool, keep_prob: 1.0})
Now we can graph the images using the following...
# Create a figure
plt.figure(figsize=(10, 8))
plt.title("Image {}".format(j))

# Loop through the score functions for each class label
# and plot them as a function of epsilon
for k in range(len(scores.T)):
    plt.plot(eps, scores[:, k],
             color=colors[k],
             marker='.',
             label=labels[k])
plt.legend(prop={'size': 8})
plt.xlabel('Epsilon')
plt.ylabel('Class Score')
plt.grid('on')
For the first image the graph would look something like the following...
Now Here Is My Question
Let's say the model I trained used a batch_size of 100; in that case, the following line would not work:
scores[i, :] = logits.eval({x: x_fool,
keep_prob: 1.0})
In order for this to work, I would need to pass an array of 100 images to the model, but in this instance x_fool is just one image of size (1, 784).
I want to graph the effect of different epsilon values on the class scores for any one image, but how can I do so when I need to calculate the scores of 100 images at a time (since my model was trained with a batch_size of 100)?
You can avoid fixing a batch size by setting it to None; that way, any batch size can be used.
However, keep in mind that this non-choice can come with a moderate performance penalty.
This fixes it if you start again from scratch. If you start from an existing network trained with a batch size of 100, you can create a test network that is identical to your starting network except for the batch size, which you can set to 1 or, again, to None.
I realised the problem was not with the batch_size but with the format of the image I was attempting to pass to the model. As user1735003 pointed out, the batch_size does not matter.
The reason I could not pass the image to the model was because I was passing it as so...
x_fool = x00 + eps[i] * sign
scores[i, :] = logits.eval({x: x_fool})
The problem with this is that the shape of the image is simply (784,), whereas the placeholder needs to accept an array of images of shape [None, 784], so what needs to be done is to reshape the image.
x_fool = labels_of_2[0].reshape((1, 784)) + eps[i] * sign
scores[i, :] = logits.eval({x:x_fool})
Now my image is shape (1, 784) which can now be accepted by the placeholder.
