I have an image tensor of shape:
N, C, H, W = 5, 512, 13, 13
I need to take the mean across the H and W dimensions so that the output has shape:
N, C, 1, 1
I am currently doing this with a for loop, but is there a better way, e.g. using reshape or a built-in reduction?
import torch
tz = torch.rand(5, 512, 13, 13)
tzm = tz.mean(dim=(2,3), keepdim=True)
tzm.shape
Output
torch.Size([5, 512, 1, 1])
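If you prefer a pooling-style API, adaptive average pooling gives the same result. A minimal sketch, reusing the tz tensor from above:

import torch
import torch.nn.functional as F

tz = torch.rand(5, 512, 13, 13)
tzm = F.adaptive_avg_pool2d(tz, output_size=1)  # average over H and W down to 1x1
print(tzm.shape)  # torch.Size([5, 512, 1, 1])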
Related
I have a PyTorch tensor of size (1, 4, 128, 128) (batch, channel, height, width), and I want to 'upsample' it to (1, 3, 256, 256)
I thought to use interpolate (a function in nn.functional)
However, reading the documentation and applying this function, I get an output of shape (1, 4, 256, 256), so maybe it is not the function I am looking for. The code that I used is the following:
import torch.nn as nn
#x.shape -> (1,4,128,128)
x_0 = nn.functional.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)
#x_0.shape -> (1,4,256,256)
How can I do that (from (1, 4, 128, 128) to (1, 3, 256, 256))?
Below is the network that I am trying to replicate, but I got stuck at the upsample layer.
What about PyTorch's nn.Upsample function:
upsample = nn.Upsample(scale_factor=2)
x = upsample(x)
Not sure if that's what you are looking for since you want the second dimension to change from 4 to 3.
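Interpolation (and nn.Upsample) only changes the spatial dimensions; to go from 4 channels to 3 you also need a channel projection, typically a 1x1 convolution. A minimal sketch (the to_rgb layer name is illustrative, not from the paper):

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.rand(1, 4, 128, 128)
# spatial upsampling: (1, 4, 128, 128) -> (1, 4, 256, 256)
x_up = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)
# channel projection: (1, 4, 256, 256) -> (1, 3, 256, 256)
to_rgb = nn.Conv2d(in_channels=4, out_channels=3, kernel_size=1)
out = to_rgb(x_up)
print(out.shape)  # torch.Size([1, 3, 256, 256])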
I was trying to implement a paper where the input dimensions are meant to be a tensor of size ([1, 3, 224, 224]). My current image size is (512, 512, 3).
How do I resize and convert in order to input to the model?
Assuming image is already converted to a torch.Tensor and has shape (512, 512, 3), one possible way is:
from torchvision.transforms import Resize
image = image.permute((2, 0, 1)) # convert to (C, H, W) format
image = image.unsqueeze(0) # add fake batch dimension
resize = Resize((224, 224))
new_image = resize(image)
Now new_image.shape equals torch.Size([1, 3, 224, 224]).
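If the image starts out as a NumPy array or PIL image rather than a tensor, torchvision's ToTensor already handles the (H, W, C) to (C, H, W) conversion and scaling. A sketch of the same pipeline under that assumption (the random array is just a stand-in for your image):

import numpy as np
from torchvision.transforms import Compose, ToTensor, Resize

img = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)  # stand-in HWC image
transform = Compose([
    ToTensor(),          # -> (3, 512, 512), values scaled to [0, 1]
    Resize((224, 224)),  # -> (3, 224, 224)
])
batch = transform(img).unsqueeze(0)  # add batch dimension -> (1, 3, 224, 224)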
Following the official tutorial on affine_grid, this line (inside function stn()):
grid = F.affine_grid(theta, x.size())
gives me an error:
RuntimeError: Expected size for first two dimensions of batch2 tensor to be: [10, 3] but got: [4000, 3].
using the input as follows:
from torchinfo import summary
model = Net()
summary(model, input_size=(10, 1, 256, 256))
theta is of size (4000, 2, 3) and x is of size (10, 1, 256, 256), how do I properly manipulate theta for correct dimension to use with batch tensor?
EDIT: I honestly don't see any mistake here, and the dimension is actually according to the affine_grid doc; has something changed in the function itself?
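The batch size 4000 suggests the problem is the flattening step in the localization network rather than affine_grid itself: the tutorial's xs.view(-1, 10 * 3 * 3) assumes 28x28 MNIST inputs, so with 256x256 inputs each sample gets split into many rows and theta ends up with 4000 rows instead of 10. A sketch of the fix, assuming the localization architecture from the official tutorial (the 10 * 60 * 60 feature size is what that architecture produces for 256x256 inputs; check it against your own model):

# inside stn(), assuming the tutorial's localization network and 256x256 inputs
xs = self.localization(x)             # -> (N, 10, 60, 60) for 256x256 inputs
xs = xs.view(-1, 10 * 60 * 60)        # flatten per sample, keeping the batch dimension
theta = self.fc_loc(xs)               # fc_loc's first Linear must take 10 * 60 * 60 features
theta = theta.view(-1, 2, 3)          # (N, 2, 3), matching x.size(0)
grid = F.affine_grid(theta, x.size())
x = F.grid_sample(x, grid)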
I have multiple torch tensors with the following shapes
x1 = torch.Size([1, 512, 177])
x2 = torch.Size([1, 512, 250])
x3 = torch.Size([1, 512, 313])
How can I pad all these tensors with 0 over the last dimension, so that they all have the same shape, e.g. ([1, 512, 350])?
What I tried to do is convert them into NumPy arrays and use these lines of code:
if len(x1) < 350:
    ff = np.pad(f, [(0, self.max_len - f.shape[0]), ], mode='constant')
    f = ff
But unfortunately, it does not affect the last dimension, and the shapes are still not equal.
Any help will be appreciated
Thanks
You can simply do:
import torch.nn.functional as F
x = F.pad(x, (0, self.max_len - x.size(2)), "constant", 0)
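For completeness, a sketch that pads all three tensors from the question to the same length and concatenates them (max_len = 350 is taken from the question):

import torch
import torch.nn.functional as F

x1 = torch.rand(1, 512, 177)
x2 = torch.rand(1, 512, 250)
x3 = torch.rand(1, 512, 313)

max_len = 350
# F.pad with a (left, right) tuple pads only the last dimension
padded = [F.pad(x, (0, max_len - x.size(2)), "constant", 0) for x in (x1, x2, x3)]
batch = torch.cat(padded, dim=0)  # -> torch.Size([3, 512, 350])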
I have these 2 variables:
result - a tensor of shape 1 x 251 x 20
kernel - a tensor of shape 1 x 10 x 10
when I run the command:
from torch.nn import functional as F
result = F.conv2d(result, kernel)
I get the error:
RuntimeError: expected stride to be a single integer value or a list of 1 values to match the convolution dimensions, but got stride=[1, 1]
I am not passing any stride, so what am I doing wrong?
import torch
import torch.nn.functional as F
image = torch.rand(16, 3, 32, 32)
filter = torch.rand(1, 3, 5, 5)
out_feat_F = F.conv2d(image, filter,stride=1, padding=0)
print(out_feat_F.shape)
Out:
torch.Size([16, 1, 28, 28])
Which is equivalent with:
import torch
import torch.nn
image = torch.rand(16, 3, 32, 32)
conv_filter = torch.nn.Conv2d(in_channels=3, out_channels=1, kernel_size=5, stride=1, padding=0)
output_feature = conv_filter(image)
print(output_feature.shape)
Out:
torch.Size([16, 1, 28, 28])
Padding is 0 by default and stride is 1 by default. The filter's last two dimensions in the first example correspond to the kernel size in the second example: kernel_size=5 is the same as kernel_size=(5, 5).
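F.conv2d expects a 4-D input (N, C, H, W) and a 4-D weight (out_channels, in_channels, kH, kW), which is why the 3-D tensors from the question raise the stride error. Applied to those shapes, a sketch of the fix, assuming the leading 1 in each shape is a channel dimension:

import torch
import torch.nn.functional as F

result = torch.rand(1, 251, 20)  # (C, H, W)
kernel = torch.rand(1, 10, 10)   # (in_channels, kH, kW)

out = F.conv2d(result.unsqueeze(0),   # -> (1, 1, 251, 20): (N, C, H, W)
               kernel.unsqueeze(0))   # -> (1, 1, 10, 10): (out_channels, in_channels, kH, kW)
print(out.shape)  # torch.Size([1, 1, 242, 11])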