Division in batches of a 3D tensor (PyTorch)

I have a 3D tensor of size, say, 100x5x2, and the mean of the tensor across axis=1, which gives shape 100x2.
100 here is the batch size. Normally, without a batch, dividing a tensor of shape 5x2 by one of shape 2 works perfectly, but in the case of the 3D tensor with a batch, I'm receiving an error.
a = torch.rand(5,2)
b = torch.rand(2)
z=a/b
gives me expected answer.
a = torch.rand(100,5,2)
b = torch.rand(100,2)
z=a/b
Gives me the following error.
The size of tensor a (5) must match the size of tensor b (100) at non-singleton dimension 1.
How to divide these tensors such that my output is of shape 100x5x2 ? Something like bmm for division?

Simply do:
z = a / b.unsqueeze(1)
This adds an extra dimension in b and makes it of shape (100, 1, 2) which is compatible for broadcasting with a.
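As a quick sanity check (shapes taken from the question):

```python
import torch

a = torch.rand(100, 5, 2)
b = torch.rand(100, 2)

# (100, 2) -> (100, 1, 2): the singleton dim 1 broadcasts across a's 5 rows
z = a / b.unsqueeze(1)
print(z.shape)  # torch.Size([100, 5, 2])
```

Each slice z[i] equals a[i] / b[i], with b[i] broadcast across the 5 rows of a[i].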

Related

Torch tensor filter by index but keep the shape

I have an input tensor of shape: data (x,y,z).
I have a binary mask tensor of shape: mask (x,y).
When I do data[mask > 0] I obtain a new tensor of shape (q,z) where q is the number of ones in the mask tensor.
I would instead like to get a tensor of the original shape (x,y,z), with the values where mask is zero eliminated and the end of each row padded with 0 so that the original shape is kept (plain indexing doesn't do that, because the rows would have variable length along the second dimension).
Of course this can be easily done in python with basic matrix operations, but is there an efficient tensor-way to do it in pytorch?
Example (imagine a, b, c, ... are 1D tensors):
data
[[a,b,c],
[d,e,f]]
mask
[[0,1,0],
[1,0,1]]
Ideal output:
[[b, junk, junk],
[d, f, junk]]
Here, the missing stuff is padded with some "junk" to keep the original shape.
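This question has no answer in the thread, but one possible sketch (my own, not from the thread) uses a stable descending argsort on the mask to move the kept entries to the front of each row, then zeros out the tail as the "junk" padding. `stable=True` (available in recent PyTorch) preserves the original order of the kept entries:

```python
import torch

x, y, z = 2, 3, 4
data = torch.arange(x * y * z, dtype=torch.float).reshape(x, y, z)
mask = torch.tensor([[0, 1, 0],
                     [1, 0, 1]])

# stable descending sort: positions with mask==1 come first, order preserved
order = torch.argsort(mask, dim=1, descending=True, stable=True)
out = torch.gather(data, 1, order.unsqueeze(-1).expand(-1, -1, z))

# zero out the trailing "junk" slots
kept = mask.gather(1, order).unsqueeze(-1).bool()
out = out * kept
```

For the example above this yields [[b, 0, 0], [d, f, 0]] (junk = zero vectors), all in tensor ops with no Python loop.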

PyTorch high-dimensional tensor through linear layer

I have a tensor of size (32, 128, 50) in PyTorch. These are 50-dim word embeddings with a batch size of 32. That is, the three indices in my size correspond to number of batches, maximum sequence length (with 'pad' token), and the size of each embedding. Now, I want to pass this through a linear layer to get an output of size (32, 128, 1). That is, for every word embedding in every sequence, I want to make it one dimensional. I tried adding a linear layer to my network going from 50 to 1 dimension, and my output tensor is of the desired shape. So I think this works, but I would like to understand how PyTorch deals with this issue, since I did not explicitly tell it which dimension to apply the linear layer to. I played around with this and found that:
If I input a tensor of shape (32, 50, 50) -- thus creating ambiguity by having two dimensions along which the linear layer could be applied to (two 50s) -- it only applies it to the last dim and gives an output tensor of shape (32, 50, 1).
If I input a tensor of shape (32, 50, 128) it does NOT output a tensor of shape (32, 1, 128), but rather gives me an error.
This suggests that a linear layer in PyTorch applies the transformation to the last dimension of your tensor. Is that the case?
In the nn.Linear docs, it is specified that the input of this module can be any tensor of size (*, H_in) and the output will be a tensor of size (*, H_out), where:
* means any number of dimensions
H_in is the number of in_features
H_out is the number of out_features
To make this concrete: a tensor of size (n, m, 50) can be processed by a Linear module with in_features=50, while a tensor of size (n, 50, m) can only be processed by a Linear module with in_features=m (in your case 128).
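A minimal check of this behavior (shapes from the question):

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=50, out_features=1)

x = torch.rand(32, 128, 50)   # (batch, seq_len, embedding_dim)
y = layer(x)                  # weights are applied to the last dim only
print(y.shape)                # torch.Size([32, 128, 1])

# a (32, 50, 128) input fails: its last dim (128) != in_features (50)
```

All leading dimensions are treated as batch dimensions; only the trailing one must equal in_features.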

Sample a tensor of probability distributions in pytorch

I want to sample a tensor of probability distributions with shape (N, C, H, W), where dimension 1 (size C) contains normalized probability distributions with ‘C’ possibilities. Is there a pytorch function to efficiently sample all the distributions in the tensor in parallel? I just need to sample each distribution once, so the result could either be a one-hot tensor with the same shape or a tensor of indices with shape (N, 1, H, W).
There was no single function to sample that I saw, but I was able to sample the tensor in several steps by computing the cumulative probabilities, sampling each point independently, and then picking the first point that sampled a 1 in the distribution dimension:
# probabilities: (N, C, H, W), normalized along dim=1
reverse_cumulative = torch.flip(torch.cumsum(torch.flip(probabilities, [1]), dim=1), [1])
# P(pick class c | earlier classes not picked) = p_c / (p_c + p_{c+1} + ...)
cumulative = probabilities / reverse_cumulative
# sample each class independently; the last class has conditional probability 1
sampled = torch.rand(cumulative.shape, device=probabilities.device) <= cumulative
# class index at every position along dim=1; unsampled slots get the sentinel C
C = probabilities.shape[1]
class_idxs = torch.arange(C, device=probabilities.device).view(1, C, 1, 1).expand_as(sampled)
idxs = torch.where(sampled, class_idxs, torch.full_like(class_idxs, C))
# the first class that sampled a 1 along the distribution dimension
sampled_idxs = idxs.min(dim=1).values
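An alternative sketch (not from the answer above): flatten the spatial dimensions so that each row is one distribution, then let torch.multinomial draw one sample per row:

```python
import torch

N, C, H, W = 2, 4, 3, 3
probs = torch.softmax(torch.rand(N, C, H, W), dim=1)

# move the class dim last and flatten: each row is one distribution over C
flat = probs.permute(0, 2, 3, 1).reshape(-1, C)    # (N*H*W, C)
samples = torch.multinomial(flat, num_samples=1)   # one draw per row
idxs = samples.view(N, 1, H, W)                    # back to (N, 1, H, W)
```

This gives the tensor of indices with shape (N, 1, H, W) that the question asks for.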

Broadcasting element wise multiplication in pytorch

I have a tensor in pytorch with size torch.Size([1443747, 128]). Let's name it tensor A. In this tensor, 128 represents a batch size. I have another 1D tensor with size torch.Size([1443747]). Let's call it B. I want to do element wise multiplication of B with A, such that B is multiplied with all 128 columns of tensor A (obviously in an element wise manner). In other words, I want to broadcast the element wise multiplication along dimension=1.
How can I achieve this in pytorch?
If I didn't have a batch size involved in tensor A (batch size = 1), then the normal * operator would do the multiplication easily, and A*B would have generated a resultant tensor of size torch.Size([1443747]). However, I don't understand why PyTorch is not broadcasting the tensor multiplication along dimension 1. Is there any way to do this?
What I want is, B should be multiplied with all 128 columns of A in an element wise manner. So, the resultant tensors' size would be torch.Size([1443747, 128]).
The dimensions need to match for broadcasting; it works if you either transpose A or unsqueeze B:
C = A.transpose(1,0) * B # shape: [128, 1443747]
or
C = A * B.unsqueeze(dim=1) # shape: [1443747, 128]
Note that the shapes of the two solutions are different.
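To see why the plain A * B fails, recall that broadcasting aligns shapes from the trailing dimension: (1443747, 128) vs (1443747,) compares 128 with 1443747 and fails, while (1443747, 128) vs (1443747, 1) matches. A small-scale check (stand-in shapes, so it runs quickly):

```python
import torch

A = torch.rand(7, 128)   # stand-in for (1443747, 128)
B = torch.rand(7)

# A * B would raise: B's only dim (7) aligns against A's trailing dim (128)

C = A * B.unsqueeze(dim=1)   # (7, 1) broadcasts across the 128 columns
print(C.shape)               # torch.Size([7, 128])
```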

Why can tf.random.truncated_normal get a shape that is not a vector even though it says it only receives shape of a vector?

I am working with TensorFlow in Python.
I read through the documentation of
tf.random.truncated_normal
that the input 'shape' gets 1-D tensor or python array, i.e. a vector (according to https://www.tensorflow.org/guide/tensors).
However, in the example I'm using, 'shape' looks like a 4-D tensor. Or is it considered a vector? Perhaps I have a problem with the definitions of vectors and tensors?
def weight_variable(shape, name='noname'):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial, name=name)

W_conv1 = weight_variable([5, 5, 3, 32], 'W_conv1')
There is a small mistake in your understanding of a tensor. A tensor can have different "ranks". A single scalar such as 1 is a rank 0 tensor. A list/vector such as [1,2,3,4] is a rank 1 tensor. A 2-D matrix such as [[0,0],[0,0]] is a rank 2 tensor, a 3-D array is a rank 3 tensor, and so on. So the input you have here is a vector, i.e. a rank 1 tensor, not a 4-D tensor.
Here is a nice blog post about this.
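To illustrate: the shape argument [5, 5, 3, 32] is itself a rank-1 object with four entries, while the tensor it creates is rank 4 (sketch using the TF 2.x API name from the question):

```python
import tensorflow as tf

shape = [5, 5, 3, 32]   # a Python list: rank 1, length 4
w = tf.random.truncated_normal(shape, stddev=0.1)
print(w.shape)          # (5, 5, 3, 32)
print(tf.rank(w))       # rank 4
```

The length of the shape vector determines the rank of the resulting tensor.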
