Simple way to evaluate a PyTorch torchvision model on a single image

I have a pre-trained model on PyTorch v1.3 with torchvision v0.4.2, set up as follows:
import PIL, torch, torchvision
# Load and normalize the image
img_file = "./robot_image.jpg"
img = PIL.Image.open(img_file)
img = torchvision.transforms.ToTensor()(img)
img = 0.5 + 0.5 * (img - img.mean()) / img.std()
# Load a pre-trained network and compute its prediction
alexnet = torchvision.models.alexnet(pretrained=True)
I want to test this single image, but I get an error:
alexnet(img)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 64 3 11 11, but got 3-dimensional input of size [3, 741, 435] instead
What is the simplest and most idiomatic way of getting the model to evaluate a single data point?

AlexNet expects a 4-dimensional tensor of size (batch_size x channels x height x width), but you are providing a 3-dimensional tensor.
To change your tensor to size (1, 3, 741, 435), simply add the line:
img = img.unsqueeze(0)
You will also need to resize your image, since AlexNet expects inputs with height and width of 224x224.
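Putting both fixes together, here is a minimal sketch that reuses the loading and normalization from the question and only adds the resize, the batch dimension, and the usual eval()/no_grad() inference steps:
import PIL
import torch
import torchvision

img = PIL.Image.open("./robot_image.jpg")
img = torchvision.transforms.Resize((224, 224))(img)    # resize to AlexNet's expected input size
img = torchvision.transforms.ToTensor()(img)
img = 0.5 + 0.5 * (img - img.mean()) / img.std()         # same normalization as in the question
img = img.unsqueeze(0)                                    # (3, 224, 224) -> (1, 3, 224, 224)

alexnet = torchvision.models.alexnet(pretrained=True)
alexnet.eval()                                            # inference mode (disables dropout)
with torch.no_grad():
    logits = alexnet(img)
print(logits.argmax(dim=1))                               # predicted ImageNet class index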

Related

A question about applying a neural network on a specified dimension using PyTorch

I'm wondering how to do the following:
If I have a torch.Tensor x with shape (4, 5, 1), how can I apply a neural network to the last dimension using PyTorch?
Using the standard procedure, the model flattens the entire tensor into a new tensor of shape (20, 1), but this is not what I want.
Say we want 64 output features; then I would like to obtain a new tensor of shape (4, 5, 64).
import torch
import torch.nn as nn
x = torch.randn(4, 5, 1)
print(x.size())
# https://pytorch.org/docs/stable/generated/torch.nn.Linear.html
m = nn.Linear(1, 64)
y = m(x)
print(y.size())
result:
torch.Size([4, 5, 1])
torch.Size([4, 5, 64])
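This works because nn.Linear accepts inputs of shape (*, in_features) and applies the transformation to the last dimension, broadcasting over all leading dimensions. Continuing the snippet above, an optional sanity check:
# Applying the layer to a flattened view and reshaping back gives the same result,
# since nn.Linear only acts on the last dimension.
y_flat = m(x.reshape(-1, 1)).reshape(4, 5, 64)
print(torch.allclose(y, y_flat))  # True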

How to convert (1, 64, 224, 224) --> (1, 64) using adaptive average pooling (PyTorch)?

Is there any way in PyTorch to reduce the dimensions of a tensor inside a model?
Adaptive average pooling, or in fact any typical pooling in PyTorch, reduces the spatial size but not the number of dimensions of a tensor.
You can find all the pooling types PyTorch offers here:
https://pytorch.org/docs/master/nn.html#pooling-layers
I suggest using this template code to try out different poolings and their effect on dimensions:
m = nn.AdaptiveAvgPool2d((5,7))
input = torch.randn(1, 64, 8, 9)
output = m(input)
print(output.size())
In order to reduce the number of dimensions inside a PyTorch model, you can add a block that calls squeeze() on the tensor, or flatten the tensor with, for example, example_tensor.view(-1, x, y).
This code should work to compress (1,64,224,224) --> (1,64)
import torch
import torch.nn as nn
m = nn.AdaptiveAvgPool2d((1,1))
input = torch.randn(1, 64, 224, 224)
output = m(input).view(1,-1)
print(output.size()) #torch.Size([1, 64])
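If you want the same reduction packaged as layers inside a model, a sketch (assuming PyTorch >= 1.2 for nn.Flatten) is to pair the pooling with nn.Flatten instead of calling .view() manually:
import torch
import torch.nn as nn

pool_and_flatten = nn.Sequential(
    nn.AdaptiveAvgPool2d((1, 1)),   # (N, 64, 224, 224) -> (N, 64, 1, 1)
    nn.Flatten(),                   # (N, 64, 1, 1) -> (N, 64)
)
x = torch.randn(1, 64, 224, 224)
print(pool_and_flatten(x).size())   # torch.Size([1, 64])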

Unable to run model.predict() with image shape same as that which the model was trained on

I am trying to run inference on a ResNet model that I designed and trained on Google Colab; the link to the notebook can be found here. The dimensions of the images the model was trained on are (32, 32, 3). After training, I saved the model in the SavedModel format so that I could run inference on my machine. The code I used is:
import tensorflow as tf
import cv2 as cv
from resize import resize_to_fit
image = cv.imread('extracted_letter_images/001.png')
image_resized = resize_to_fit(image, 32, 32)
model = tf.keras.models.load_model('Model/CAPTCHA-Model')
model.predict(image_resized)
The resize_to_fit method resizes the image to 32x32 px. The shape of the returned image is also (32, 32, 3). When the model.predict() function is called, the following error message is shown:
ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=4, found ndim=3. Full shape received: (32, 32, 3)
I have tried uninstalling and reinstalling TensorFlow as well as tf-nightly several times, to no avail. I have even tried expanding the dimensions of the image with this:
image_resized = np.expand_dims(image_resized, axis=0)
This results in the image having dimensions (1, 32, 32, 3). When the above change is made, the following message is shown:
2021-04-07 19:49:11.821261: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:180] None of the MLIR Optimization Passes are enabled (registered 2)
What I'm confused about is that the dimensions of the resized image and the dimensions of the images used to train the model are the same, but model.predict() does not seem to work.
In your ImageDataGenerators you used the preprocessing function tf.image.rgb_to_grayscale, which converted the images to 32 x 32 x 1, so you must apply the same transformation to the images you wish to predict. You also rescaled the training images to the range 0 to 1, so you must rescale the prediction images the same way. The line image_resized = np.expand_dims(image_resized, axis=0) is correct. Also be aware that cv2 reads images as BGR, not RGB, so before applying tf.image.rgb_to_grayscale, first convert the image to RGB with image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).
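As a hedged sketch (the 1/255 rescaling factor is an assumption about the training-time ImageDataGenerator; the model path and resize_to_fit helper are taken from the question), the full preprocessing before predict could look like:
import cv2 as cv
import numpy as np
import tensorflow as tf
from resize import resize_to_fit

image = cv.imread('extracted_letter_images/001.png')      # cv2 loads BGR, shape (H, W, 3)
image = cv.cvtColor(image, cv.COLOR_BGR2RGB)               # convert to RGB first
image = resize_to_fit(image, 32, 32)                       # (32, 32, 3)
image = tf.image.rgb_to_grayscale(image).numpy()           # (32, 32, 1), matching the training pipeline
image = image.astype('float32') / 255.0                    # rescale to [0, 1], assumed to match training
image = np.expand_dims(image, axis=0)                      # add batch dim -> (1, 32, 32, 1)

model = tf.keras.models.load_model('Model/CAPTCHA-Model')
prediction = model.predict(image)
print(prediction.shape)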

PyTorch: Convolving a single channel image using torch.nn.Conv2d

I am trying to use a convolution layer to convolve a grayscale (single layer) image (stored as a numpy array). Here is the code:
conv1 = torch.nn.Conv2d(in_channels = 1, out_channels = 1, kernel_size = 33)
tensor1 = torch.from_numpy(img_gray)
out_2d_np = conv1(tensor1)
out_2d_np = np.asarray(out_2d_np)
I want my kernel to be 33x33, and the number of output channels should equal the number of input channels, which is 1 since the image's RGB channels have been summed. When out_2d_np = conv1(tensor1) is run, it yields the following runtime error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 1 1 33 33, but got 2-dimensional input of size [246, 248] instead
Any idea on how I can solve this? I specifically want to use the torch.nn.Conv2d() class/function.
Thanks in advance for any help!
PyTorch's Conv2d expects its 2D inputs to actually have 4 dimensions: a mini-batch dimension, a channel dimension, and the two spatial dimensions.
Your input tensor has only the two spatial dimensions; it lacks the mini-batch and channel dimensions. In your case these two dimensions are actually singleton dimensions (dimensions with size=1).
try:
conv1(tensor1[None, None, ...])
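For completeness, a minimal sketch of the whole call, assuming img_gray is a 2D numpy array (the random array below is just a stand-in for the real image); note the cast to float32, which Conv2d requires:
import numpy as np
import torch

img_gray = np.random.rand(246, 248)                      # stand-in for the real grayscale image
conv1 = torch.nn.Conv2d(in_channels=1, out_channels=1, kernel_size=33)
tensor1 = torch.from_numpy(img_gray).float()             # Conv2d needs a float tensor, shape (246, 248)
tensor4d = tensor1[None, None, ...]                      # add batch and channel dims -> (1, 1, 246, 248)
with torch.no_grad():
    out = conv1(tensor4d)
out_2d_np = out.squeeze(0).squeeze(0).numpy()            # back to 2D: (214, 216)
print(out_2d_np.shape)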

Conv2d wrong dimensions on Keras

I'm new to Keras and I'm trying to use convolutional autoencoders for image compression.
In particular, I'm compressing images that all have dimensions (365, 929). Since I'm working with 2D numpy arrays for the images, I add a dimension to make them tensors.
When I feed the images to the network with this code:
X,X_test=train_test_split(images,test_size=0.1)
# Add a channel dimension to each matrix, so each image becomes a 3D tensor
X=np.array([np.expand_dims(i,axis=2) for i in X])
# X is (1036, 365, 929, 1) now
X_test=np.array([np.expand_dims(i,axis=2) for i in X_test])
inputs = Input(shape=(365, 929, 1))
h = Conv2D(4,(3,3),activation='relu',padding="same")(inputs)
encoded = MaxPooling2D(pool_size=2,padding="same")(h)
h = Conv2D(4,(3,3),activation='relu',padding="same")(encoded)
h = UpSampling2D((2,2))(h)
outputs = Conv2D(1,(3,3),activation='relu',padding="same")(h)
model = Model(inputs=inputs, output=outputs)
model.compile(optimizer='adam', loss='mse')
model.fit(X, X, batch_size=64, nb_epoch=5, validation_split=.33)
I get the following error:
ValueError: Error when checking target: expected conv2d_3 to have shape (366, 930, 1) but got array with shape (365, 929, 1)
How can I solve this issue? How can I modify the CNN to take images with odd dimensions?
Your problem lies in the UpSampling2D: pooling an odd dimension with padding="same" rounds up (365 -> 183, 929 -> 465), so upsampling by 2 produces 366 x 930 instead of 365 x 929. You can pad the image with 0s asymmetrically and then crop the output back to its original size, as explained here.
To help with debugging, you can use model.summary() to check the output dimensions of all layers.
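A hedged sketch of the pad-then-crop approach (written against tf.keras; the exact padding and cropping amounts of one row and one column are assumptions derived from the shapes in the question):
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D, UpSampling2D,
                                     ZeroPadding2D, Cropping2D)
from tensorflow.keras.models import Model

inputs = Input(shape=(365, 929, 1))
h = ZeroPadding2D(padding=((0, 1), (0, 1)))(inputs)           # (366, 930, 1): both dims now even
h = Conv2D(4, (3, 3), activation='relu', padding='same')(h)
encoded = MaxPooling2D(pool_size=2, padding='same')(h)        # (183, 465, 4)
h = Conv2D(4, (3, 3), activation='relu', padding='same')(encoded)
h = UpSampling2D((2, 2))(h)                                   # (366, 930, 4)
h = Cropping2D(cropping=((0, 1), (0, 1)))(h)                  # back to (365, 929, 4)
outputs = Conv2D(1, (3, 3), activation='relu', padding='same')(h)

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='mse')
model.summary()                                               # check layer output shapes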
