Why do we reshape a grayscale image to (x, y, 1)? - Keras

I noticed that when training a CNN with grayscale images, the images are reshaped to (x, y, 1). I thought this shouldn't be necessary, but when I try with shape (x, y) I get an error:
ValueError: Input 0 of layer conv2d is incompatible with the layer: : expected min_ndim=4, found ndim=3. Full shape received: [None, 28, 28]
As I understand it, the only reason we do this is because Keras is implemented this way. Or is there another reason for it?

The input shape of a Conv2D layer in Keras is batch_size + (rows, cols, channels). So the layer expects the number of channels as the final dimension of the input shape, which is 1 for a grayscale image. For RGB images it would be 3.
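For example, here is a minimal sketch (assuming 28 x 28 grayscale images, matching the shape in the error above) of adding the channel axis before feeding the data to a Conv2D layer:
import numpy as np
import tensorflow as tf

# Stand-in for a batch of grayscale images with shape (num_samples, 28, 28)
X_train = np.random.rand(100, 28, 28).astype("float32")

# Add the trailing channel axis: (num_samples, 28, 28) -> (num_samples, 28, 28, 1)
X_train = np.expand_dims(X_train, axis=-1)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
print(model.output_shape)  # (None, 10)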

Related

Input 0 is incompatible with layer res2a_branch1: expected axis -1 of input shape to have value 64 but got shape (None, None, None, 256)

I am trying to do transfer learning with Mask R-CNN. I want to use the weights of the Mask R-CNN model and add a binary classifier as the last layer. So I introduced a Sequential model, added the base model's layers to it, and then added two more layers. But I am getting this error. Is there any way to solve this issue?
model3 = Sequential()
model = mrcnn.model.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)
model.load_weights(COCO_MODEL_PATH, by_name=True)
# Copy every layer except the last one into the new model
for layer in model.keras_model.layers[:-1]:
    model3.add(layer)
model3.add(Dense(256, activation='relu'))
model3.add(Dense(2, activation='softmax'))

Unable to run model.predict() with image shape same as that which the model was trained on

I am trying to run inference on a ResNet model that I designed and trained on Google Colab; the link to the notebook can be found here. The dimensions of the images the model was trained on are (32, 32, 3). After training, I saved the model in the SavedModel format so that I could run inference on my machine. The code I used is:
import tensorflow as tf
import cv2 as cv
from resize import resize_to_fit
image = cv.imread('extracted_letter_images/001.png')
image_resized = resize_to_fit(image, 32, 32)
model = tf.keras.models.load_model('Model/CAPTCHA-Model')
model.predict(image_resized)
The resize_to_fit method resizes the image to 32x32 px. The shape of the returned image is also (32, 32, 3). When model.predict() is called, the following error message is shown:
ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=4, found ndim=3. Full shape received: (32, 32, 3)
I have tried uninstalling and reinstalling TensorFlow as well as tf-nightly several times, to no avail. I have even tried expanding the dimensions of the image with this:
image_resized = np.expand_dims(image_resized, axis=0)
This results in the image having dimensions (1, 32, 32, 3). When the above change is made the following error message is shown
2021-04-07 19:49:11.821261: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:180] None of the MLIR Optimization Passes are enabled (registered 2)
What I'm confused about is that the dimensions of the resized image and the dimensions of the images used to train the model are the same, yet model.predict() does not seem to work.
In your ImageDataGenerators you used the preprocessing function tf.image.rgb_to_grayscale, which converted the images to 32 x 32 x 1, so you must apply the same transformation to the images you wish to predict on. You also rescaled the training images to the range 0 to 1, so you must rescale the prediction images as well. The line image_resized = np.expand_dims(image_resized, axis=0) is correct. Not sure if this will be an issue, but be aware that cv2 reads images as BGR, not RGB, so before you apply tf.image.rgb_to_grayscale, first convert the image to RGB with image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).
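A rough sketch of the inference-side preprocessing described above (cv.resize here is only a stand-in for the resize_to_fit helper, and the 1/255 rescaling is an assumption based on the training setup mentioned):
import cv2 as cv
import tensorflow as tf

image = cv.imread('extracted_letter_images/001.png')

# cv2 loads BGR; convert to RGB before rgb_to_grayscale
image = cv.cvtColor(image, cv.COLOR_BGR2RGB)
image = cv.resize(image, (32, 32))          # stand-in for resize_to_fit
image = tf.image.rgb_to_grayscale(image)    # (32, 32, 3) -> (32, 32, 1)
image = tf.cast(image, tf.float32) / 255.0  # same 0-1 rescaling as in training
image = tf.expand_dims(image, axis=0)       # (1, 32, 32, 1)

model = tf.keras.models.load_model('Model/CAPTCHA-Model')
prediction = model.predict(image)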

PyTorch: Convolving a single channel image using torch.nn.Conv2d

I am trying to use a convolution layer to convolve a grayscale (single-channel) image stored as a numpy array. Here is the code:
conv1 = torch.nn.Conv2d(in_channels = 1, out_channels = 1, kernel_size = 33)
tensor1 = torch.from_numpy(img_gray)
out_2d_np = conv1(tensor1)
out_2d_np = np.asarray(out_2d_np)
I want my kernel to be 33x33, and the number of output channels should equal the number of input channels, which is 1 since the image's RGB channels were summed into a single channel. When out_2d_np = conv1(tensor1) is run, it yields the following runtime error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 1 1 33 33, but got 2-dimensional input of size [246, 248] instead
Any idea on how I can solve this? I specifically want to use the torch.nn.Conv2d() class/function.
Thanks in advance for any help!
PyTorch's Conv2d expects its 2D inputs to actually have 4 dimensions: a mini-batch dimension, a channel dimension, and the two spatial dimensions.
Your input tensor has only the two spatial dimensions; it lacks the mini-batch and channel dimensions. In your case these two dimensions are actually singleton dimensions (dimensions with size=1).
try:
conv1(tensor1[None, None, ...])
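For completeness, a small self-contained sketch of the same fix (the random array stands in for img_gray, and the float32 cast is an assumption, since Conv2d needs floating-point input):
import numpy as np
import torch

# Hypothetical grayscale image matching the shape in the error message
img_gray = np.random.rand(246, 248).astype(np.float32)

conv1 = torch.nn.Conv2d(in_channels=1, out_channels=1, kernel_size=33)

# Add batch and channel singleton dimensions: (246, 248) -> (1, 1, 246, 248)
tensor1 = torch.from_numpy(img_gray)[None, None, ...]

out = conv1(tensor1)                         # (1, 1, 214, 216) with a 33x33 kernel, no padding
out_2d_np = out.squeeze().detach().numpy()   # back to a 2D numpy array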

Conv2d wrong dimensions on Keras

I'm new to Keras and I'm trying to use convolutional autoencoders for image compression.
In particular, I'm compressing images which all have dimensions (365, 929). Since I'm working with 2D numpy arrays for the images, I add a dimension to make them tensors.
When feeding the network with the images with this code:
X,X_test=train_test_split(images,test_size=0.1)
# Adds 1D to each matrix, so to have a tensor.
X=np.array([np.expand_dims(i,axis=2) for i in X])
# X is (1036, 365, 929, 1) now
X_test=np.array([np.expand_dims(i,axis=2) for i in X_test])
inputs = Input(shape=(365, 929, 1))
h = Conv2D(4,(3,3),activation='relu',padding="same")(inputs)
encoded = MaxPooling2D(pool_size=2,padding="same")(h)
h = Conv2D(4,(3,3),activation='relu',padding="same")(encoded)
h = UpSampling2D((2,2))(h)
outputs = Conv2D(1,(3,3),activation='relu',padding="same")(h)
model = Model(inputs=inputs, output=outputs)
model.compile(optimizer='adam', loss='mse')
model.fit(X, X, batch_size=64, nb_epoch=5, validation_split=.33)
I get the following error:
ValueError: Error when checking target: expected conv2d_3 to have shape (366, 930, 1) but got array with shape (365, 929, 1)
How can I solve this issue? How can I modify the CNN to take images with uneven dimensions?
Your problem lies in the UpSampling2D. With pool_size=2 and padding="same", the (365, 929) input is downsampled to (183, 465), and UpSampling2D((2,2)) brings it back up to (366, 930), which no longer matches the (365, 929) target. You can pad the image with zeros asymmetrically and then crop it back to its original size, as explained here.
To help with debugging, you can call model.summary() to check the output dimensions of all layers.
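A minimal sketch of that approach (the padding and cropping amounts are my own choice to make the pooling/upsampling round trip line up; they are not part of the original code):
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D,
                                     UpSampling2D, ZeroPadding2D, Cropping2D)
from tensorflow.keras.models import Model

inputs = Input(shape=(365, 929, 1))
x = ZeroPadding2D(((0, 1), (0, 1)))(inputs)              # (366, 930, 1)
x = Conv2D(4, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D(pool_size=2, padding='same')(x)   # (183, 465, 4)
x = Conv2D(4, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)                              # (366, 930, 4)
x = Conv2D(1, (3, 3), activation='relu', padding='same')(x)
outputs = Cropping2D(((0, 1), (0, 1)))(x)                # back to (365, 929, 1)

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='mse')
model.summary()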

OpenCV Grayscale conversion giving unexpected Shape

I'm programming a neural network (CNN) where I'm providing images as input to the network.
I want to convert the images to grayscale to reduce the depth of each image from 3 to 1.
I used the OpenCV function for the conversion as follows:
X = []
# Load car images as grayscale
for name in cars:
    img = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    X.append(img)
# Load non-car images as grayscale
for name in non_cars:
    img = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    X.append(img)
I have created X, which holds my data for training.
Each image is 64 by 64 by 3, so after conversion to grayscale I should get 64 by 64 by 1.
Printing out the shape of my array X:
print(X_train.shape[0], 'train samples')
Output - X_train shape: (15984, 64, 64)
15984 is the number of images.
I'm expecting the output to be (15984, 64, 64, 1).
My Neural Network gives me this Error :
ValueError: Cannot feed value of shape (64, 64, 64) for Tensor 'image_input:0', which has shape '(?, ?, ?, 3)'
Please help me with this.
When you load an image as grayscale, as you are doing there, its shape will be (64, 64), and when you stack these images you end up with (15984, 64, 64). The (64, 64) representation can be viewed as a 64 by 64 pixel matrix with a single channel. If you need to add the missing channel axis you can use:
img = img[:,:,np.newaxis]
Then you will end up with a shape like (64, 64, 1).
Note: You can apply the same procedure to X_train as a whole. For more on that, check numpy.expand_dims.
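For example, a quick sketch of adding the channel axis to the whole training array in one call (the zeros array is just a placeholder with the shapes from the question; note the error message also implies the model's input layer would need to expect 1 channel rather than 3):
import numpy as np

# Placeholder for the stacked grayscale images from the question
X_train = np.zeros((15984, 64, 64), dtype=np.uint8)

# Add the trailing channel axis: (15984, 64, 64) -> (15984, 64, 64, 1)
X_train = np.expand_dims(X_train, axis=-1)
print(X_train.shape)  # (15984, 64, 64, 1)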
