Convert 3D Tensor to 4D Tensor in PyTorch

I had difficulty finding information on reshaping in PyTorch; in TensorFlow it is quite easy.
My tensor has shape torch.Size([3, 480, 480]).
I want to convert it to a 4D tensor with shape [1,3,480,480].
How do I do that?

You can use unsqueeze()
For example:
x = torch.zeros((4,4,4)) # Create 3D tensor
x = x.unsqueeze(0) # Add dimension as the first axis (1,4,4,4)
I've seen a few people use indexing with None to add a singular dimension as well. For example:
x = torch.zeros((4,4,4)) # Create 3D tensor
print(x[None].shape) # (1,4,4,4)
print(x[:,None,:,:].shape) # (4,1,4,4)
print(x[:,:,None,:].shape) # (4,4,1,4)
print(x[:,:,:,None].shape) # (4,4,4,1)
Personally, I prefer unsqueeze(), but it's good to be familiar with both.
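Applied to the tensor from the question, a quick sketch (both forms give the same result):
import torch
x = torch.zeros(3, 480, 480)  # shape from the question
print(x.unsqueeze(0).shape)   # torch.Size([1, 3, 480, 480])
print(x[None].shape)          # torch.Size([1, 3, 480, 480])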

Related

Size Mismatch for Functional Linear Layer

I apologize that this is probably a simple question that has been answered before, but I could not find the answer. I’m attempting to use a CNN to extract features and then feed them into an FC network that outputs 2 variables. I’m attempting to use the functional linear layer as a way to dynamically handle the flattened features. self.cnn is a Sequential container whose last layer is nn.Flatten(). When I print the size of x after the CNN I see it is 15x152064, so I’m unclear why the F.linear layer fails to run with the error below. Any help would be appreciated.
RuntimeError: size mismatch, get 15, 15x152064,2
x = self.cnn(x)
batch_size, channels = x.size()
x = F.linear(x, torch.Tensor([256,channels]))
y_hat = self.FC(x)
torch.Tensor([256, channels]) does not create a tensor of size (256, channels); it creates the 1D tensor containing the two values 256 and channels. I don't know how you want to initialize your weights, but here are a couple of options:
# All-ones weights (each output is the sum of the input features):
x = F.linear(x, torch.ones(256, channels))
# Random weights:
x = F.linear(x, torch.randn(256, channels))
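For the original goal of handling the flattened feature size dynamically, one option is nn.LazyLinear, which infers in_features on the first forward pass. A minimal sketch using the 15x152064 shape from the question (this is an alternative to F.linear, not the asker's code, and needs a reasonably recent PyTorch):
import torch
import torch.nn as nn

x = torch.randn(15, 152064)  # batch of flattened CNN features, sizes from the question
fc = nn.LazyLinear(256)      # in_features (152064) is inferred on the first call
out = fc(x)
print(out.shape)             # torch.Size([15, 256])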

Keras multi-class semantic segmentation label

For semantic segmentation, you generally end up with the last layer being something like
output = Conv2D(num_classes, (1, 1), activation='softmax')
My question is, how do I prepare the labels for this? For example, if I have 10 classes to identify, each with a different colour: for each label image, do I need to apply a mask for one particular colour and turn it into a grayscale image, so that I can compare it with one channel of the model output? Or is there a way to pass one full RGB picture in as the label?
The output of your network will be an image with 10 channels, where each pixel will consist of a vector of probabilities that sum to one (due to the softmax). Example: [0.1,0.1,0.1,0.05,0.05,0.1,0.1,0.1,0.1,0.2]. You want your label images to be in the same shape: an image with 10 channels, where each pixel is a binary vector with a 1 at the index of the class and 0 elsewhere. Your segmentation loss function is then the pixel-wise cross-entropy.
For implementation: the softmax in Keras has an axis parameter: https://keras.io/activations/#softmax
To one-hot encode the labels, use:
np_utils.to_categorical(labels, num_classes)
When labels has shape (row, col), the output shape will be (row, col, num_classes).
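A minimal sketch of this label preparation, assuming the label image has already been converted from colours to an integer class map (the import path assumes tf.keras; np_utils.to_categorical is the older standalone-Keras equivalent):
import numpy as np
from tensorflow.keras.utils import to_categorical

labels = np.random.randint(0, 10, size=(64, 64))  # integer class map, shape (row, col)
one_hot = to_categorical(labels, num_classes=10)  # shape (64, 64, 10)
print(one_hot.shape)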
A full worked example:
https://github.com/naomifridman/Unet_Brain_tumor_segmentation

Does 1D Convolutional layer support variable sequence lengths?

I have a series of processed audio files I am using as input into a CNN using Keras. Does the Keras 1D Convolutional layer support variable sequence lengths? The Keras documentation makes this unclear.
https://keras.io/layers/convolutional/
At the top of the documentation it mentions you can use (None, 128) for variable-length sequences of 128-dimensional vectors. Yet at the bottom it declares that the input shape must be a
3D tensor with shape: (batch_size, steps, input_dim)
Given the following example, how should I input sequences of variable length into the network?
Let's say I have two examples (a and b) containing X 1-dimensional vectors of length 100 that I want to feed into the Conv1D layer as input
a.shape = (100, 100)
b.shape = (200, 100)
Can I use an input shape of (2, None, 100)? Do I need to concatenate these tensors into c where
c.shape = (300, 100)
Then reshape it to be something like
c_reshape.shape = (3, 100, 100)
where 3 is the batch size, 100 is the number of steps, and the second 100 is the input size? The documentation on the input shape is not very clear.
Keras supports variable lengths by using None in the respective dimension when defining the model.
Notice that often input_shape refers to the shape without the batch size.
So, the 3D tensor with shape (batch_size, steps, input_dim) perfectly suits a model with input_shape=(steps, input_dim).
All you need to make this model accept variable lengths is use None in the steps dimension:
input_shape=(None, input_dim)
NumPy limitation
Now, there is a NumPy limitation regarding variable lengths: you cannot create a single NumPy array whose shape accommodates variable lengths.
A few solutions are available:
Pad your sequences with dummy values until they all reach the same size so you can put them into a NumPy array of shape (batch_size, length, input_dim). Use Masking layers to ignore the dummy values (see the padding sketch after this list).
Train with separate NumPy arrays of shape (1, length, input_dim), each array having its own length.
Group your sequences by size into smaller arrays.
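A sketch of the padding option, assuming a list of (length, input_dim) float arrays like a and b from the question:
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

a = np.random.rand(100, 100).astype('float32')
b = np.random.rand(200, 100).astype('float32')
batch = pad_sequences([a, b], padding='post', dtype='float32')
print(batch.shape)  # (2, 200, 100): both sequences padded to the longer length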
Be careful with layers that don't support variable sizes
In convolutional models using variable sizes you can't, for instance, use Flatten: if it were possible, the result of the flatten would have a variable size, and the following Dense layers could not have a constant number of weights. This is impossible.
So, instead of Flatten, you should start using GlobalMaxPooling1D or GlobalAveragePooling1D layers.
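Putting the pieces together, a minimal sketch of a variable-length Conv1D model (the layer sizes are illustrative assumptions, not from the question):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, GlobalMaxPooling1D, Dense

model = Sequential([
    Conv1D(64, 3, activation='relu', input_shape=(None, 100)),  # None = variable steps
    GlobalMaxPooling1D(),  # collapses the variable-length axis to a fixed size
    Dense(2, activation='softmax'),
])
model.summary()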

How to gather a tensor from Keras using its backend?

I am trying to compile a model in Keras whose input is a 2D NumPy array.
What I need is to take the vector at the nth position of this 2D array and use it as a 1D tensor for one of the layers.
How do I do it?
Using a Lambda layer should do it:
extracted_tensor = Lambda(lambda x: x[:,nth_index,:], output_shape=(1,dim_vector))(input)
extracted_tensor = Flatten()(extracted_tensor)
Note that inside the lambda function you take the batch dimension into account when indexing x, but you don't include it in the output_shape parameter.
I hope this helps.
Use tf.gather(input_tensor, indices, axis) to collect indices along the specified axis.
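For example (a sketch; the shapes are illustrative):
import tensorflow as tf

x = tf.random.normal((16, 10, 64))     # (batch, positions, dim_vector)
row = tf.gather(x, indices=2, axis=1)  # vector at position 2 -> shape (16, 64)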

Convolutional NN for text input in PyTorch

I am trying to implement a text classification model using a CNN. As far as I know, for text data we should use 1D convolutions. I saw a PyTorch example using Conv2d, but I want to know how I can apply Conv1d to text. Or is it actually not possible?
Here is my model scenario:
Number of in-channels: 1, Number of out-channels: 128
Kernel size : 3 (only want to consider trigrams)
Batch size : 16
So, I will provide tensors of shape <16, 1, 28, 300>, where 28 is the length of a sentence. I want to use Conv1d, which will give me 128 feature maps of length 26 (as I am considering trigrams).
I am not sure how to define nn.Conv1d() for this setting. I can use Conv2d, but I want to know whether it is possible to achieve the same using Conv1d.
An example that feeds Conv1d and Pool1d layers into an RNN resolved my issue.
So, I need to consider the embedding dimension as the number of in-channels while using nn.Conv1d as follows.
m = nn.Conv1d(200, 10, 2)        # in-channels = 200, out-channels = 10, kernel size = 2
input = torch.randn(10, 200, 5)  # batch = 10, 200 = embedding dim, 5 = seq length
feature_maps = m(input)
print(feature_maps.size())       # torch.Size([10, 10, 4])
Although I don't work with text data, the input tensor in its current form would only work with Conv2d. One possible way to use Conv1d would be to concatenate the embeddings into a tensor of shape e.g. <16, 1, 28*300>. You can reshape the input with view() in PyTorch.
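For the exact setting in the question, a sketch that keeps Conv1d by treating the 300-dim embedding as the channel axis (squeeze drops the singleton channel, permute moves the embedding dim into the channel position):
import torch
import torch.nn as nn

x = torch.randn(16, 1, 28, 300)    # batch, 1, sentence length, embedding dim (from the question)
x = x.squeeze(1).permute(0, 2, 1)  # -> (16, 300, 28): embedding dim as in-channels
conv = nn.Conv1d(in_channels=300, out_channels=128, kernel_size=3)
print(conv(x).shape)               # torch.Size([16, 128, 26]): 128 trigram feature maps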
