how to duplicate the input channel in a tensor? - pytorch

I have a tensor with the shape torch.Size([39, 1, 20, 256, 256]). How do I duplicate the channel to make the shape torch.Size([39, 3, 20, 256, 256])?

I am fairly certain that this is already a duplicate question, but I could not find a fitting answer myself, which is why I am going ahead and answering it here by referring to both the PyTorch documentation and the PyTorch forum.
Essentially, torch.Tensor.expand() is the function that you are looking for, and can be used as follows:
import torch

x = torch.rand([39, 1, 20, 256, 256])
y = x.expand(39, 3, 20, 256, 256)  # y.shape == torch.Size([39, 3, 20, 256, 256])
Note that this works only on singleton dimensions, which is the case in your example, but may not work for arbitrary dimensions prior to expansion. Also, this is basically just providing a different memory view, which means that, according to the documentation, you have to keep the following in mind:
More than one element of an expanded tensor may refer to a single memory location. As a result, in-place operations (especially ones that are vectorized) may result in incorrect behavior. If you need to write to the tensors, please clone them first.
For a version with newly allocated memory, see torch.Tensor.repeat, which is outlined in this (slightly related) answer. Note that repeat() takes per-dimension repetition counts rather than target sizes, so the call looks slightly different from expand().
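A quick sketch of both options for the shapes above (variable names are just for illustration):

import torch

x = torch.rand([39, 1, 20, 256, 256])

# expand: a view, no extra memory allocated; clone() if you need to write to it
y_view = x.expand(39, 3, 20, 256, 256)
y_writable = x.expand(39, 3, 20, 256, 256).clone()

# repeat: newly allocated memory; arguments are repetition counts per dimension
y_copy = x.repeat(1, 3, 1, 1, 1)  # also torch.Size([39, 3, 20, 256, 256])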

Related

Is it possible to add tensors of different sizes together in pytorch?

I have an image gradient of size (3, 224, 224) and a patch of (1, 768). Is it possible to add this gradient to the patch so that the result keeps the patch size of (1, 768)?
Forgive my inquisitiveness. I know PyTorch also uses broadcasting, but I am not sure whether I will be able to do so with two tensors of different sizes in a way similar to the line below:
torch.add(a, b)
For example:
The end product would be the same patch on the left with the gradient of an entire image on the right added to it. My understanding is that it’s not possible, but knowledge isn’t bounded.
No. Whether two tensors are broadcastable is defined by the following rules:
Each tensor has at least one dimension.
When iterating over the dimension sizes, starting at the trailing dimension, the dimension sizes must either be equal, one of them is 1, or one of them does not exist.
Because the second bullet doesn't hold in your example (i.e., 768 != 224, 1 not in {224, 768}), you can't broadcast the add. If you have some meaningful way to reshape your gradients, you might be able to.
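For instance, checking those rules against the shapes in question (the variable names are just for illustration):

import torch

grad = torch.rand(3, 224, 224)
patch = torch.rand(1, 768)

try:
    torch.add(grad, patch)
except RuntimeError as err:
    # trailing dimensions are 224 and 768: unequal and neither is 1
    print(err)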
I figured out how to do it myself. I divided the image gradient (right) into 16 x 16 patches and created a loop that adds each patch to the original image patch (left). This way, I was able to add a 224 x 224 image gradient to a 16 x 16 patch. I just wanted to see what would happen if I did that.
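A rough sketch of what that loop might look like, assuming the (1, 768) patch corresponds to a flattened 3 x 16 x 16 block:

import torch

grad = torch.rand(3, 224, 224)   # image gradient
patch = torch.rand(1, 768)       # flattened 3 x 16 x 16 patch

# split the gradient into non-overlapping 16 x 16 tiles -> (196, 768)
tiles = grad.unfold(1, 16, 16).unfold(2, 16, 16)       # (3, 14, 14, 16, 16)
tiles = tiles.permute(1, 2, 0, 3, 4).reshape(-1, 768)

out = patch.clone()
for tile in tiles:
    out = out + tile             # still torch.Size([1, 768])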

Conv3D size doesn’t make sense with NIFTI data?

So I am writing a custom dataset for medical images in .nii (NIFTI1 format), but there is some confusion.
My dataloader returns the shape torch.Size([1, 1, 256, 256, 51]). But NIFTI volumes use anatomical axes, a different coordinate system, so it doesn't seem to make sense to permute the axes, which I normally would do for a volume built from 2D images (each stored separately on the local drive, with the 51 slice images forming the depth), since Conv3D follows the convention (N, C, D, H, W).
So torch.Size([1, 1, 256, 256, 51]) (ordinarily 51 would be the depth) doesn't follow the convention (N, C, D, H, W), but should I not permute the axes, since the data uses an entirely different coordinate system?
In PyTorch's 3D convolution layer, the naming of the three dimensions you convolve over is not really important (e.g. the layer does not give depth any special treatment compared to height). All the difference comes from the kernel_size argument (and also padding, if you use it). If you permute the dimensions and correspondingly permute the kernel_size parameters, nothing really changes. So you can either permute your input's dimensions, e.g. with x.permute(0, 1, 4, 2, 3), or continue using your initial tensor with depth as the last dimension.
Just to clarify: if you wanted to use kernel_size=(2, 10, 10) on your DxHxW image, you can instead use kernel_size=(10, 10, 2) on your HxWxD image. If you want all your code to explicitly assume that the dimension order is always D, H, W, then you can create a tensor with permuted dimensions using x.permute(0, 1, 4, 2, 3).
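A minimal sketch of the two options (the channel and kernel sizes here are arbitrary; only the kernel_size swap matters):

import torch
import torch.nn as nn

x = torch.rand(1, 1, 256, 256, 51)             # (N, C, H, W, D) straight from the loader

# Option 1: keep depth last and order kernel_size as (H, W, D)
conv_hwd = nn.Conv3d(1, 8, kernel_size=(10, 10, 2))
out_hwd = conv_hwd(x)                          # (1, 8, 247, 247, 50)

# Option 2: permute to the usual (N, C, D, H, W) layout and use a (D, H, W) kernel
conv_dhw = nn.Conv3d(1, 8, kernel_size=(2, 10, 10))
out_dhw = conv_dhw(x.permute(0, 1, 4, 2, 3))   # (1, 8, 50, 247, 247)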
Let me know if I somehow misunderstand the problem you have.

Torch squeeze and the batch dimension

Does anyone here know if the torch.squeeze function respects the batch (i.e. first) dimension? From some inline code it seems it does not, but maybe someone else knows the inner workings better than I do.
Btw, the underlying problem is that I have a tensor of shape (n_batch, channel, x, y, 1). I want to remove the last dimension with a simple function, so that I end up with a shape of (n_batch, channel, x, y).
A reshape is of course possible, or even selecting the last axis. But I want to embed this functionality in a layer so that I can easily add it to a ModuleList or Sequential object.
EDIT: I just found out that in TensorFlow (2.5.0) the function tf.linalg.diag DOES respect the batch dimension. Just an FYI that it might differ per function you are using.
No! squeeze doesn't respect the batch dimension. It's a potential source of error if you use squeeze when the batch dimension may be 1. Rule of thumb is that only classes and functions in torch.nn respect batch dimensions by default.
This has caused me headaches in the past. I recommend using reshape or only using squeeze with the optional input dimension argument. In your case you could use .squeeze(4) to only remove the last dimension. That way nothing unexpected happens. Squeeze without the input dimension has led me to unexpected results, specifically when
the input shape to the model may vary
batch size may vary
nn.DataParallel is being used (in which case batch size for a particular instance may be reduced to 1)
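Since the goal is to drop this into a ModuleList or Sequential, here is a minimal sketch of wrapping the dimension-specific squeeze in a module (the class name is made up):

import torch
import torch.nn as nn

class SqueezeLast(nn.Module):
    # removes only the trailing singleton dimension, leaving the batch dimension alone
    def forward(self, x):
        return x.squeeze(-1)

x = torch.rand(4, 3, 32, 32, 1)      # (n_batch, channel, x, y, 1)
layer = SqueezeLast()                # can be added to a ModuleList or nn.Sequential like any layer
print(layer(x).shape)                # torch.Size([4, 3, 32, 32])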
The accepted answer is sufficient for the problem of squeezing the last dimension. However, I had a tensor of shape (batch, 1280, 1, 1) and wanted (batch, 1280). The squeeze function didn't allow for that: squeeze(tensor, 1).shape -> (batch, 1280, 1, 1) and squeeze(tensor, 2).shape -> (batch, 1280, 1). I could have used squeeze twice, but you know, aesthetics :).
What helped me was torch.flatten(tensor, start_dim=1) -> (batch, 1280). Trivial, but I forgot about it. A warning though: this function may create a copy instead of a view, so be careful.
https://pytorch.org/docs/stable/generated/torch.flatten.html
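A quick check of that approach, just to show the shapes:

import torch

x = torch.rand(8, 1280, 1, 1)
y = torch.flatten(x, start_dim=1)
print(y.shape)   # torch.Size([8, 1280])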

Padding in Conv2D gives wrong result?

I'm using the Conv2D method of Keras. In the documentation it is written that
padding: one of "valid" or "same" (case-insensitive). Note that "same" is slightly inconsistent across backends with strides != 1, as described here
As input I have images of size (64,80,1) and I'm using kernel of size 3x3. Does that mean that the padding is wrong when using Conv2D(32, 3, strides=2, padding='same')(input)?
How can I fix it using ZeroPadding2D?
Based on your comment and seeing that you defined a stride of 2, I believe what you want to achieve is an output size that's exactly half of the input size, i.e. output_shape == (32, 40, 32) (the last 32 is the number of filters).
In that case, just call model.summary() on the final model and you will see if that is the case or not.
If it is, there's nothing else to do.
If it's bigger than you want, you can add a Cropping2D layer to cut off pixels from the borders of the image.
If it's smaller than you want, you can add a ZeroPadding2D layer to add zero-pixels to the borders of the image.
The syntax to create these layers is
Cropping2D(cropping=((a, b), (c, d)))
ZeroPadding2D(padding=((a, b), (c, d)))
a: number of rows you want to add/cut off to/from the top
b: number of rows you want to add/cut off to/from the bottom
c: number of columns you want to add/cut off to/from the left
d: number of columns you want to add/cut off to/from the right
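For instance, a rough sketch of that check with the tf.keras API (the Conv2D call and input shape are taken from your example; the commented-out lines show where padding or cropping would go if the shape were off):

import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(64, 80, 1))
x = layers.Conv2D(32, 3, strides=2, padding='same')(inputs)   # -> (None, 32, 40, 32) here

# only if the spatial size were off by a pixel would you pad or crop, e.g.:
# x = layers.ZeroPadding2D(padding=((0, 1), (0, 0)))(x)
# x = layers.Cropping2D(cropping=((0, 1), (0, 0)))(x)

model = tf.keras.Model(inputs, x)
model.summary()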
Note, however, that there is no strict technical need to always perfectly halve the size with each convolution layer. Your model might work well without any padding or cropping. You will have to experiment with it in order to find out.

Is It Possible to the Take the Mode of a Tensor in Tensorflow?

I'm trying to construct a DAG in Tensorflow where I need to take the mode (most frequent value) of individual regions of my target. This is in order to construct a downsampled target.
Right now, I'm pre-processing the downsampled targets for every individual situation I might encounter, saving them, and then loading them. Obviously, this would all be much easier if it was integrated into my Tensorflow graph, so that I could downsample at runtime.
But I've looked everywhere, and I can find no evidence of a tf.reduce_mode that would function the same way as tf.reduce_mean. Is there any way to construct this functionality in a TensorFlow graph?
My idea is that we get the unique numbers and their counts. We then find the numbers that appear most frequently. Finally we fetch those numbers (could be more than one) out by using their indices in the number-count tensor.
import tensorflow as tf  # note: this is TF 1.x graph-mode (Session-based) code

samples = tf.constant([10, 32, 10, 5, 7, 9, 9, 9])
unique, _, count = tf.unique_with_counts(samples)
max_occurrences = tf.reduce_max(count)
max_cond = tf.equal(count, max_occurrences)
max_numbers = tf.squeeze(tf.gather(unique, tf.where(max_cond)))

with tf.Session() as sess:
    print('Most frequent Numbers\n', sess.run(max_numbers))

> Most frequent Numbers
9
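If you are on TensorFlow 2.x, a rough eager-mode equivalent (no Session needed) would be:

import tensorflow as tf

samples = tf.constant([10, 32, 10, 5, 7, 9, 9, 9])
unique, _, count = tf.unique_with_counts(samples)
mode = tf.gather(unique, tf.argmax(count))   # picks a single mode if there is a tie
print(mode.numpy())                          # 9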
