how to know model's input size in onnx? - pytorch

The size of the input is not specified anywhere in the PyTorch model; you only size the kernels and the output follows from whatever you feed in. Yet WinML Dashboard shows the width and height of the image input. How is that possible?

Do you mean when you serialize the network from PyTorch to ONNX? When you export from PyTorch you need to define the size of the input via a dummy tensor, as per the documentation:
import torch
import torchvision

# The dummy input fixes the input shape that gets written into the ONNX graph
dummy_input = torch.randn(10, 3, 224, 224, device='cuda')
model = torchvision.models.alexnet(pretrained=True).cuda()

input_names = ["actual_input_1"]
output_names = ["output1"]

torch.onnx.export(model, dummy_input, "alexnet.onnx", verbose=True,
                  input_names=input_names, output_names=output_names)
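If the question is how WinML Dashboard knows the input size afterwards: the shape recorded at export time can be read back from the .onnx file itself. A minimal sketch using the onnx Python package, assuming the alexnet.onnx produced above:

import onnx

onnx_model = onnx.load("alexnet.onnx")

# Each graph input carries its declared tensor shape; a fixed-size dummy input
# shows up here as concrete dim_value entries
for inp in onnx_model.graph.input:
    dims = [d.dim_value if d.HasField("dim_value") else d.dim_param
            for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)   # e.g. actual_input_1 [10, 3, 224, 224]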

Related

RuntimeError: Unsupported: ONNX export of index_put in opset 9. Please try opset version 11

I was converting our custom PyTorch model to TensorRT to run it on a Jetson.
While converting the .pt model to ONNX I am getting an error like:
RuntimeError: Unsupported: ONNX export of index_put in opset 9. Please try opset version 11.
reference:
https://github.com/onnx/onnx/issues/3057#issuecomment-707857945
I got this error while trying to run this code:
import torch.onnx

# Standard ImageNet input - 3 channels, 224x224,
# values don't matter as we care about network structure.
# But they can also be real inputs.
dummy_input = torch.randn(1, 3, 224, 224)

# Invoke export
torch.onnx.export(model, dummy_input, "best.onnx")
I solved this error by adding the argument opset_version=11 to torch.onnx.export(), as follows:
import torch.onnx
# Standard ImageNet input - 3 channels, 224x224,
# values don't matter as we care about network structure.
# But they can also be real inputs.
dummy_input = torch.randn(1, 3, 224, 224)
# Invoke export
torch.onnx.export(model, dummy_input, "best.onnx", opset_version=11)
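As a quick sanity check after re-exporting, you can load the file and confirm which opset it was written with; a minimal sketch, assuming the onnx package is installed and best.onnx was produced by the call above:

import onnx

onnx_model = onnx.load("best.onnx")
onnx.checker.check_model(onnx_model)   # raises if the graph is malformed

# opset_import records the operator-set version(s) the graph relies on
for opset in onnx_model.opset_import:
    print(opset.domain or "ai.onnx", opset.version)   # expect 11 after the fix above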

PyTorch: one dimension in input data

I don't understand why this code works:
# Hyperparameters for our network
input_size = 784
hidden_sizes = [128, 64]
output_size = 10

# Build a feed-forward network
model = nn.Sequential(nn.Linear(input_size, hidden_sizes[0]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[0], hidden_sizes[1]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[1], output_size),
                      nn.Softmax(dim=1))
print(model)

# Forward pass through the network and display output
images, labels = next(iter(trainloader))
images.resize_(images.shape[0], 1, 784)
ps = model.forward(images[0,:])
The size of the images tensor is (images.shape[0], 1, 784), but our network has input_size = 784. How does the network handle the extra dimension of size 1 in the input? I tried to change images.resize_(images.shape[0], 1, 784) to images = images.view(images.shape[0], -1), but I got an error:
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
For reference, the data loader is created as follows:
# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,)),
                                ])
# Download and load the training data
trainset = datasets.MNIST('~/.pytorch/MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
PyTorch networks take inputs of shape [batch_size, input_dimensions].
In your case, images[0,:] has shape [1, 784], where the 1 is the batch size, and that is why your code works.
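For comparison, here is a small sketch of the usual flattening pattern that keeps the batch dimension intact, assuming the same MNIST trainloader and model defined above:

images, labels = next(iter(trainloader))     # images: [64, 1, 28, 28]

# Flatten everything except the batch dimension: [64, 1, 28, 28] -> [64, 784]
flat = images.view(images.shape[0], -1)

# The whole batch can be passed through the network in one call
ps = model(flat)                             # ps: [64, 10]

# If you only want one sample, slice it so a batch dimension of 1 survives;
# flat[0, :] would drop it and break Softmax(dim=1)
ps_single = model(flat[0:1, :])              # shape [1, 10]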

ValueError: Cannot feed value of shape (64, 80, 60, 3) for Tensor 'input/X:0', which has shape '(?, 80, 60, 1)'

I was trying to recreate a program that Sentdex has been building in his "Python plays GTA V" series, but when I try to train the AI it gives me this error: ValueError: Cannot feed value of shape (64, 80, 60, 3) for Tensor 'input/X:0', which has shape '(?, 80, 60, 1)'. I tried changing some parameters but it didn't work. Here is my code:
import numpy as np
from alexnet import alexnet
import time
width=80
height=60
lr=1e-3
epochs=30
model_name='minecraft-ai-{}-{}-{}'.format(lr,'ghostbot',epochs)
model=alexnet(width,height,lr)
train_data=np.load('training_data.npy',allow_pickle=True)
train=train_data[:-500]
test=train_data[-500:]
X = np.array([i[0] for i in train]).reshape(-1, width, height, 3)
Y=[i[1] for i in train]
test_x = np.array([i[0] for i in test]).reshape(-1,width,height,3)
test_y = [i[1] for i in test]
print(X.shape)
print(test_x.shape)
time.sleep(3)
model.fit({'input': X}, {'targets': Y}, n_epoch=epochs,
          validation_set=({'input': test_x}, {'targets': test_y}),
          snapshot_step=500, show_metric=True, run_id=model_name)
model.save(model_name)
I checked the source at this path: https://github.com/Sentdex/pygta5/blob/master/2.%20train_model.py#L91. It seems line #91 has been changed to:
test_x = np.array([i[0] for i in test]).reshape(-1,width,height,3)
so you need to make sure the last axis (number of channels) is 3, so that the last dimension of the test images matches that of the training ones. Make the same change to debug this. Hope this helps!
This is caused because the number of channels in your training images does not match the input layer of the network.
If you are using grayscale images, go to the alexnet model definition and change line 728 from
network = input_data(shape=[None, width, height, 3], name='input')
to
network = input_data(shape=[None, width, height, 1], name='input')
https://github.com/Sentdex/pygta5/blob/master/models.py#L728
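Alternatively, if you prefer to keep the 1-channel input layer, you can collapse the frames to grayscale before training. A minimal sketch, assuming the frames stored in training_data.npy are RGB:

import numpy as np

width, height = 80, 60

train_data = np.load('training_data.npy', allow_pickle=True)
train = train_data[:-500]

# Average the RGB channels into one grayscale channel so the array shape
# (-1, width, height, 1) matches input_data(shape=[None, width, height, 1])
X_rgb = np.array([i[0] for i in train]).reshape(-1, width, height, 3)
X = X_rgb.mean(axis=-1, keepdims=True)
print(X.shape)   # (n_samples, 80, 60, 1)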

Transferring pretrained pytorch model to onnx

I am trying to convert a PyTorch model to ONNX in order to use it later with TensorRT. I followed this tutorial, https://pytorch.org/tutorials/advanced/super_resolution_with_caffe2.html, but my kernel dies every time.
This is the code that I implemented.
# Some standard imports
import io
import numpy as np
from torch import nn
import torch.onnx
from deepformer.nets.quicknat import quickNAT
param = {
'num_channels': 64,
'num_filters': 64,
'kernel_h': 5,
'kernel_w': 5,
'kernel_c': 1,
'stride_conv': 1,
'pool': 2,
'stride_pool': 2,
'num_classes': 1,
'padding': 'reflection'
}
net = quickNAT(param)
checkpoint_path = 'checkpoint_epoch36_loss0.78.t7'
checkpoints=torch.load(checkpoint_path)
map_location = lambda storage, loc: storage
if torch.cuda.is_available():
map_location = None
net.load_state_dict(checkpoints['net'])
net.train(False)
# Input to the modelvcdfx
x = torch.rand(1, 64, 256, 1600, requires_grad=True)
# Export the model
torch_out = torch.onnx._export(net, # model being run
x, # model input (or a tuple for multiple inputs)
"quicknat.onnx", # where to save the model (can be a file or file-like object)
export_params=True) # store the trained parameter weights inside the model file
What output do you get? It seems SuperResolution is supported by the export operators in PyTorch, as mentioned in the documentation.
Are you sure the input to your model is:
x = torch.rand(1, 64, 256, 1600, requires_grad=True)
That looks like a tensor you used for training; for deployment you run the network on one or more images, so the dummy input used to export to ONNX is usually something like:
dummy_input = torch.randn(1, 3, 720, 1280, device='cuda')
With 1 being the batch size, 3 being the channels of the image (RGB), and then the size of the image, in this case 720x1280. Check that input; I guess you don't have a 64-channel image as input, right?
Also, it would be helpful if you posted the terminal output to see where it fails.
Good luck!
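If you are unsure how many channels the network actually expects, you can inspect the first convolution before building the dummy input. A minimal sketch, assuming quickNAT is built from standard nn.Conv2d layers (the 256x1600 spatial size is just carried over from the question):

# Find the first Conv2d and read its expected number of input channels
first_conv = next(m for m in net.modules() if isinstance(m, nn.Conv2d))
print(first_conv.in_channels)

dummy_input = torch.randn(1, first_conv.in_channels, 256, 1600)
torch.onnx.export(net, dummy_input, "quicknat.onnx",
                  export_params=True, verbose=True)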

Resnet with Custom Data

I am trying to modify Resnet50 with my custom data as follows:
X = [[1.85, 0.460,... -0.606] ... [0.229, 0.543,... 1.342]]
y = [2, 4, 0, ... 4, 2, 2]
X is a feature vector of length 2000 for 784 images. y is an array of size 784 containing the binary representation of labels.
Here is the code:
def __classifyRenet(self, X, y):
    image_input = Input(shape=(2000, 1))
    num_classes = 5
    model = ResNet50(weights='imagenet', include_top=False)
    model.summary()
    last_layer = model.output
    # add a global spatial average pooling layer
    x = GlobalAveragePooling2D()(last_layer)
    # add fully-connected & dropout layers
    x = Dense(512, activation='relu', name='fc-1')(x)
    x = Dropout(0.5)(x)
    x = Dense(256, activation='relu', name='fc-2')(x)
    x = Dropout(0.5)(x)
    # a softmax layer for 5 classes
    out = Dense(num_classes, activation='softmax', name='output_layer')(x)
    # this is the model we will train
    custom_resnet_model2 = Model(inputs=model.input, outputs=out)
    custom_resnet_model2.summary()
    for layer in custom_resnet_model2.layers[:-6]:
        layer.trainable = False
    custom_resnet_model2.layers[-1].trainable
    custom_resnet_model2.compile(loss='categorical_crossentropy',
                                 optimizer='adam', metrics=['accuracy'])
    clf = custom_resnet_model2.fit(X, y,
                                   batch_size=32, epochs=32, verbose=1,
                                   validation_data=(X, y))
    return clf
I am calling the function as:
clf = self.__classifyRenet(X_train, y_train)
It is giving an error:
ValueError: Error when checking input: expected input_24 to have 4 dimensions, but got array with shape (785, 2000)
Please help. Thank you!
1. First, understand the error.
Your input does not match what ResNet expects: for ResNet50 the input should be (n_samples, 224, 224, 3), but you are passing (785, 2000). From your question, you have 784 images, each an array of size 2000, which does not align with the original ResNet50 input shape of (224 x 224) no matter how you reshape it. That means you cannot use ResNet50 directly with your data. The only thing your code does is take the last layer of ResNet50 and add your own output layer to match your number of classes.
2. Then, what you can do.
If you insist on using the ResNet architecture, you will need to change the input layer rather than the output layer. You will also need to reshape your image data to make use of the convolution layers: it cannot stay a (2000,) array, but needs to be something like (height, width, channels), just like what ResNet and other architectures expect. Of course you will also need to change the output layer, as you already did, so that you are predicting your own classes. Try something like:
model = ResNet50(input_tensor=image_input_shape, include_top=True,weights='imagenet')
This way, you can specify a customized input image shape. You can check the GitHub code for more information (https://github.com/keras-team/keras/blob/master/keras/applications/resnet50.py). Here's part of the docstring:
input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 224)` (with `channels_first` data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 197.
E.g. `(200, 200, 3)` would be one valid value.
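Putting the pieces together, a minimal sketch of the custom-input-shape route described above; the (200, 200, 3) size is only an illustrative value, and you would still have to reshape your 2000-feature vectors into an image-like array of that shape yourself:

from keras.applications.resnet50 import ResNet50
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

num_classes = 5

# include_top=False lets you pass a custom input_shape; it still needs
# 3 channels and sides no smaller than 197, per the docstring above
base = ResNet50(weights='imagenet', include_top=False, input_shape=(200, 200, 3))

x = GlobalAveragePooling2D()(base.output)
out = Dense(num_classes, activation='softmax', name='output_layer')(x)
custom_model = Model(inputs=base.input, outputs=out)
custom_model.compile(loss='categorical_crossentropy', optimizer='adam',
                     metrics=['accuracy'])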
