Pytorch - add rows of a 2D tensor element-wise - add

I have the following tensor :
ts = torch.tensor([[1,2,3],[4,6,7],[8,9,10]])
> tensor([[ 1, 2, 3],
[ 4, 6, 7],
[ 8, 9, 10]])
I am looking for a pytorch generic operation that adds all rows element-wise like that:
ts2 = ts[0]+ts[1]+ts[2]
print(ts2)
> tensor([13, 17, 20])
In reality, the number of rows corresponds to the batch size that vary.

You can sum over an axis/dimension like so:
torch.sum(ts, dim=0)

Related

How to transpose and multiply tensor with itself in pytorch

What I have is the following tensor:
a = np.array([[1,1], [2,2], [3,3]])
t = torch.from_numpy(a)
I need an operation that gets me to the following matrix:
enter image description here
This matrix contains the element-wise dot product of the tensor if it gets multiplied by its transposed version, i.e., the first element on the diagonal is
1x1 + 1x1 = 2, the 2nd element on the diagonal is
2x2 + 2x2 = 8, and the 2nd element in the first column is 1x2 + 1x2 = 4,
and so on. How do I get this matrix in torch, starting from tensor t? Thank you!
I tried various combinations of torch.tensordot, torch.transpose, etc.
einsum it!
print(torch.einsum("ik,jk->ij", t, t))
print(torch.einsum("ki,kj->ij", t, t))
tensor([[ 2, 4, 6],
[ 4, 8, 12],
[ 6, 12, 18]], dtype=torch.int32)
tensor([[14, 14],
[14, 14]], dtype=torch.int32)

Is it possible to perform row-wise tensor operations if two PyTorch tensors do not have the same size without using list comprehensions?

This is a pretty specific usage case, but I'm hoping someone out there is more familiar with PyTorch tensors than I am and can help me speed this up.
I'm working on implementing a custom similarity metric for a neural network and have successfully gotten it to work, but it is incredibly slow to calculate. Each epoch takes about a minute to run, which simply isn't going to work with how I wanted to compare it with other metrics. So, I've been trying to utilize PyTorch tensors more effectively to speed things up, but haven't had much success.
Basically, I need to sum up the integers in the 'counts' tensor between the min and max indices specified in the 'min' and 'max' tensors for each sample and cluster combination.
As mentioned, my original implementation using loops took about a minute per epoch to run, but I did manage to reduce that to about 18-20 seconds using list comprehensions:
# counts has size (16, 100), max and min have size (2708, 7, 16)
data_mass = torch.sum(torch.tensor([[[torch.pow(torch.sum(counts[k][min[i][j][k]:max[i][j][k]+1]) / divisor, 2) for k in range(len(counts))] for j in range(len(min[i]))] for i in range(len(min))]), 2)
This feels super janky, and I've seen some clever things done with PyTorch functions, but I haven't been able to find anything yet that addresses quite what I want to do. Thanks in advance! I'm happy to clarify anything that may not be clear, I understand the use case is a bit convoluted.
EDIT: I'll try and break down the code snippet above and provide a minimal example. Examples of minimal inputs might look like the following:
'min' and 'max' are both 3-dimensional tensors of shape (num_samples, num_clusters, num_features), such as this one of size (2, 3, 4)
min = tensor([[[1, 2, 3, 1],
[2, 1, 1, 2],
[1, 2, 2, 1]],
[[2, 3, 2, 1],
[3, 3, 1, 2],
[1, 0, 2, 1]]])
max = tensor([[[3, 3, 4, 4],
[3, 2, 3, 4],
[2, 4, 3, 2]],
[[4, 4, 3, 3],
[4, 4, 2, 3],
[2, 1, 3, 2]]])
'counts' is a 2-dimensional tensor of size(num_features, num_bins),
so for this example we'll say size (4, 5)
counts = tensor([[1, 2, 3, 4, 5],
[2, 5, 3, 1, 1],
[1, 2, 3, 4, 5],
[2, 5, 3, 1, 1]])
The core part of the code snippet given above is the summation of the counts tensor between the values given by the min and max tensors for each pair of indices given at each index in max/min. For the first sample/cluster combo above:
mins = [1, 2, 3, 1]
maxes = [3, 3, 4, 4]
#Starting with feature #1 (leftmost element of min/max, top row of counts),
we sum the values in counts between the indices specified by min and max:
min_value = mins[0] = 1
max_value = maxes[0] = 3
counts[0] = [1, 2, 3, 4, 5]
subset = counts[0][mins[0]:maxes[0]+1] = [2, 3, 4]
torch.sum(subset) = 9
#Second feature
min_value = mins[1] = 2
max_value = maxes[1] = 3
counts[1] = [2, 5, 3, 1, 1]
subset = counts[0][mins[0]:maxes[0]+1] = [3, 1]
torch.sum(subset) = 4
In my code snippet, I perform a few additional operations, but if we ignore those and just sum all the index pairs, the output will have the form
pre_sum_output = tensor([[[9, 4, 9, 10],
[7, 8, 9, 5]
[5, 5, 7, 8]],
[[12, 2, 7, 9],
[9, 2, 5, 4],
[5, 7, 7, 8]]])
Finally, I sum the output one final time along the third dimension:
data_mass = torch.sum(pre_sum_output, 2) = torch.tensor([[32, 39, 25],
[30, 20, 27]])
I then need to repeat this for every pair of mins and maxes in 'min' and 'max' (each [i][j][k]), hence the list comprehension above iterating through i and j to get each sample and cluster respectively.
By noticing that torch.sum(counts[0][mins[0]:maxes[0]+1]) is equal to cumsum[maxes[0]] - cumsum[mins[0]-1] where cumsum = torch.cumsum(counts[0]), you can get rid of the loops like so:
# Dim of sample, clusters, etc.
S, C, F, B = range(4)
# Copy min and max over bins
min = min.unsqueeze(B)
max = max.unsqueeze(B)
# Copy counts over samples and clusters
counts = counts.reshape(
1, # S
1, # C
*counts.shape # F x B
)
# Number of samples, clusters, etc.
ns, nc, nf, nb = min.size(S), min.size(C), min.size(F), counts.size(B)
# Calculate cumulative sum and copy over samples and clusters
cum_counts = counts.cumsum(dim=B).expand(ns, nc, nf, nb)
# Prevent index error when min index is 0
is_zero = min == 0
lo = (min - 1).masked_fill(is_zero, 0)
# Compute the contiguous sum from min to max (inclusive)
lo_sum = cum_counts.gather(dim=B, index=lo)
hi_sum = cum_counts.gather(dim=B, index=max)
sum_counts = torch.where(is_zero, hi_sum, hi_sum - lo_sum)
pre_sum_output = sum_counts.squeeze(B)
You can then sum over the 2nd dim to get data_mass.

How can I calculate all cross-terms in pytorch?

I would like to calculate all cross-terms of each vector in a matrix.
For example, consider the following matrix:
X = tensor([[1, 2, 3],
[4, 5, 6]]),
and I would like to obtain all cross-terms of each vector in this matrix as:
Y = [[1*1, 1*2, 1*3, 2*2, 2*3, 3*3],
[4*4, 4*5, 4*6, 5*5, 5*6, 6*6]].
= [[1, 2, 3, 4, 6, 9],
[16, 20, 24, 25, 30, 36]].
That is, this is the all combination values of the vector elements
and I believe that this can be calculated using torch.combinations;
however, torch.combinations does not provide the batch implementation
and I couldn't produce the above result in pytorch.
How can I calculate all cross-terms in pytorch?
You can stack the product of combinations with replacement for each of the rows in that matrix
>>> torch.stack(tuple(torch.prod(torch.combinations(data[i],with_replacement=True),1) for i in range(data.shape[0])),0)
>>> tensor([[ 1, 2, 3, 4, 6, 9],
[16, 20, 24, 25, 30, 36]])

Split and extract values from a PyTorch tensor according to given dimensions

I have a tensor Aof sizetorch.Size([32, 32, 3, 3]) and I want to split it and extract a tensor B of size torch.Size([16, 16, 3, 3]) from it. The tensor can be 1d or 4d and split has to be according to the given new tensor dimensions. I have been able to generate the target dimensions but I'm unable to split and extract the values from the source tensor. I ave tried torch.narrow but it takes only 3 arguments and I need 4 in many cases. torch.split takes dim as an int due to which tensor is split along one dimension only. But I want to split it along multiple dimensions.
You have multiple options:
use .split multiple times
use .narrow multiple times
use slicing
e.g.:
t = torch.rand(32, 32, 3, 3)
t0, t1 = t.split((16, 16), 0)
print(t0.shape, t1.shape)
>>> torch.Size([16, 32, 3, 3]) torch.Size([16, 32, 3, 3])
t00, t01 = t0.split((16, 16), 1)
print(t00.shape, t01.shape)
>>> torch.Size([16, 16, 3, 3]) torch.Size([16, 16, 3, 3])
t00_alt, t01_alt = t[:16, :16, :, :], t[16:, 16:, :, :]
print(t00_alt.shape, t01_alt.shape)
>>> torch.Size([16, 16, 3, 3]) torch.Size([16, 16, 3, 3])

Selecting second dim of tensor using an index tensor

I have a 2D tensor and an index tensor. The 2D tensor has a batch dimension, and a dimension with 3 values. I have an index tensor that selects exactly 1 element of the 3 values. What is the "best" way to product a slice containing just the elements in the index tensor?
t = torch.tensor([[1,2,3], [4,5,6], [7,8,9]])
t = tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
i = torch.tensor([0,0,1], dtype=torch.int64)
tensor([0, 0, 1])
Expected output...
tensor([1, 4, 8])
An example of the answer is as follows.
import torch
t = torch.tensor([[1,2,3], [4,5,6], [7,8,9]])
col_i = [0, 0, 1]
row_i = range(3)
print(t[row_i, col_i])
# tensor([1, 4, 8])

Resources