Assume I have a tensor sequences of shape [8, 12, 2]. Now I would like to select, for each position along the first dimension, one entry along dimension 1, resulting in a tensor of shape [8, 2]. The selection over dimension 1 is specified by indices stored in a long tensor indices of shape [8].
I tried the following, but it selects every index in indices for every position along the first dimension of sequences instead of just one per position.
sequences[:, indices]
How can I make this query without a slow and ugly for loop?
Use a vector of batch indices together with indices; advanced indexing pairs the two element-wise:
sequences[torch.arange(sequences.size(0)), indices]
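A quick sanity check (a sketch with random data, shapes taken from the question):
import torch

sequences = torch.randn(8, 12, 2)
indices = torch.randint(12, (8,))
result = sequences[torch.arange(8), indices]
print(result.shape)  # torch.Size([8, 2])
# result[i] equals sequences[i, indices[i]] for every i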
torch.index_select solves your problem more easily than torch.gather, since you don't have to adapt the dimensions of the indices. The indices must be a tensor. For your case:
indices = [0, 2]
a = torch.rand(size=(3, 3, 3))
torch.index_select(a, dim=1, index=torch.tensor(indices, dtype=torch.long))
# result shape: (3, 2, 3); the same indices are selected for every entry of dim 0
This can be done with torch.gather, but you first need to adapt your index tensor:
unsqueeze it to match the number of dimensions of your input tensor,
repeat_interleave it to match the size of the last dimension.
Here is an example based on your description:
# original indices dimension [8]
# after first unsqueeze, dimension is [8, 1]
indices = torch.unsqueeze(indices, 1)
# after second unsqueeze, dimension is [8, 1, 1]
indices = torch.unsqueeze(indices, 2)
# after repeat, dimension is [8, 1, 2]
indices = torch.repeat_interleave(indices, 2, dim=2)
# now you have the right dimension for torch.gather
# don't forget to squeeze the now-redundant dimension 1
# result has dimension [8, 2]
result = torch.gather(sequences, 1, indices).squeeze(1)
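Starting again from the original [8]-shaped indices, the same result can be built in one expression; this sketch uses expand, which creates a view rather than copying data the way repeat_interleave does:
result = torch.gather(sequences, 1, indices[:, None, None].expand(-1, 1, sequences.size(2))).squeeze(1)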
It can be done using torch.Tensor.gather
sequences = torch.randn(8,12,2)
# define the indices as a 1D tensor of random integers,
# then reshape it for use with Tensor.gather
indices = torch.randint(sequences.size(1), (sequences.size(0),))
indices = indices.unsqueeze(-1).unsqueeze(-1).repeat(1, 1, sequences.size(-1))
# indices shape: (8, 1, 2)
output = sequences.gather(1,indices).squeeze(1)
# output shape: (8, 2)
Say I have a tensor A and an index tensor indexes: A = [1, 2, 3, 4], indexes = [1, 0, 3, 2].
I want to create a new tensor from these two with the following result: [2, 1, 4, 3].
Each element of the result is an element from A, and the order is defined by the indexes tensor.
Is there a way to do it with PyTorch tensor ops, without loops?
My goal is to do it for a 2D tensor, but I don't think there is a way to do it without loops, so I thought to project it to 1D, do the work, and project it back to 2D.
You can use scatter:
A = torch.tensor([1, 2, 3, 4])
indices = torch.tensor([1, 0, 3, 2])
result = torch.tensor([0, 0, 0, 0])
print(result.scatter_(0, indices, A))
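One caveat: scatter_ writes A[i] to position indices[i] (result[indices[i]] = A[i]), i.e. it applies the inverse permutation. It matches the expected output here only because [1, 0, 3, 2] happens to be its own inverse. If the intended semantics are result[i] = A[indices[i]], gather expresses that directly:
print(A.gather(0, indices))  # tensor([2, 1, 4, 3])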
In 1D you can simply perform A[indexes].
In 2D it is still doable in this way:
A = torch.arange(5, 10).repeat(3, 1) # shape: (3, 5)
indexes = torch.stack([torch.randperm(5) for _ in range(3)]) # shape (3, 5)
A_sort = A[torch.arange(3).unsqueeze(1), indexes]
print(A_sort)
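Equivalently, since each row of indexes permutes the corresponding row of A, the 2D case can also be written with gather:
A_sort = A.gather(1, indexes)  # A_sort[i][j] == A[i][indexes[i][j]]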
I once saw a code segment using torch.reshape as follows:
for name, value in samples.items():
    value_flat = torch.reshape(value, (-1,) + value.shape[2:])
What does (-1,) + value.shape[2:] mean here? Here value is of type torch.Tensor.
(-1,) is a one-element tuple containing -1.
value.shape[2:] selects the elements of value.shape (the shape of tensor value) from the third one to the last.
All in all, the tuple gets concatenated with the torch.Size object (which is a subclass of tuple) to make a new tuple. Let's take an example tensor:
>>> x = torch.rand(2, 3, 64, 64)
>>> x.shape[2:]
torch.Size([64, 64])
>>> (-1,) + x.shape[2:]
(-1, 64, 64)
When using -1 in torch.reshape, it tells PyTorch to infer that dimension from the remaining elements. Here it essentially flattens the first and second axes (batch and channel) together.
In our example, the shape will go from (2, 3, 64, 64) to (6, 64, 64), i.e. if the tensor has four dimensions, the operation is equivalent to
value.reshape(value.size(0)*value.size(1), value.size(2), value.size(3))
but it is certainly clumsier to write it that way.
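For this particular pattern (merging the leading two axes), torch.flatten is a tidier equivalent:
value_flat = value.flatten(0, 1)  # merge dims 0 and 1, keep the remaining dims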
Is there an equivalent of np.multiply.at in PyTorch? I have two 4D arrays and one 2D index array:
base = torch.ones((2, 3, 5, 5))
to_multiply = torch.arange(120).view(2, 3, 4, 5)
index = torch.tensor([[0, 2, 4, 2], [0, 3, 3, 2]])
As shown in this question I asked earlier (in NumPy), the row index of the index array corresponds to the 1st dimension of base and to_multiply, and the values of the index array correspond to the 3rd dimension of base. I want to take the slice of base selected by index and multiply it with to_multiply. In NumPy this can be achieved as follows:
np.multiply.at(base, (np.arange(2)[:,None,None], np.arange(3)[:,None], index[:,None,:]), to_multiply)
However, when I try to translate this to PyTorch, I cannot find an equivalent of np.multiply.at: there is an index_add_ method, but no index_multiply, and I want to avoid an explicit for loop.
So how can I achieve the above in PyTorch? Thanks!
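The question was left unanswered here; one possible sketch, assuming a PyTorch version where Tensor.scatter_reduce_ supports reduce='prod' (1.12 or newer), is to broadcast index to the shape of to_multiply and scatter-multiply along dimension 2:
import torch

base = torch.ones((2, 3, 5, 5))
to_multiply = torch.arange(120).view(2, 3, 4, 5).float()  # src dtype must match base
index = torch.tensor([[0, 2, 4, 2], [0, 3, 3, 2]])

# broadcast index to (2, 3, 4, 5): one destination along dim 2 per source element
idx = index[:, None, :, None].expand_as(to_multiply)

# reduce='prod' accumulates multiplicatively at duplicate indices,
# mirroring np.multiply.at's unbuffered behavior
base.scatter_reduce_(2, idx, to_multiply, reduce="prod")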
I'm trying to concatenate a tensor of numerical data with the output tensor of a ResNet-50 model. The output of that model is a tensor of shape torch.Size([10, 1000]), and the numerical data is a tensor of shape torch.Size([10, 110528, 8]), where 10 is the batch size, 110528 is the number of observations (rows, in a dataframe sense), and 8 is the number of columns. I need to reshape the numerical tensor to torch.Size([10, 8]) so it will concatenate properly.
How would I reshape the tensor?
Rather than collapsing b, you can pad a with zeros so that both tensors share the last dimension, then concatenate along dimension 1. Starting tensors:
a = torch.randn(10, 1000)
b = torch.randn(10, 110528, 8)
New tensor to allow concatenation:
c = torch.zeros(10,1000,7)
Check shapes.
a[:,:,None].shape, c.shape
(torch.Size([10, 1000, 1]), torch.Size([10, 1000, 7]))
Alter tensor a to allow concatenation:
a = torch.cat([a[:,:,None],c], dim=2)
Concatenate along dimension 1:
torch.cat([a,b], dim=1).shape
torch.Size([10, 111528, 8])
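If the goal really is a (10, 8) tensor, note that no reshape can turn (10, 110528, 8) into (10, 8) without dropping or aggregating the 110528 observations. A minimal sketch, assuming a mean over the observation dimension is acceptable for the data:
b_reduced = b.mean(dim=1)  # shape: (10, 8)
# concatenating this with the original (10, 1000) model output along dim 1 gives (10, 1008)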
I'm trying to test my scikit-learn machine learning algorithm with a simple R^2 score, but for some reason it always returns zero.
import numpy
from sklearn.metrics import r2_score
prediction = numpy.array([0.1567, 4.7528, 1.1260, 0.2294]).reshape(1, -1)
training = numpy.array([0, 3, 1, 0]).reshape(1, -1)
r2 = r2_score(training, prediction, multioutput="raw_values")
print(r2)
[ 0. 0. 0. 0.]
This is a single four-part value, not four separate values. How do I get proper R^2 scores?
If you are trying to calculate the R^2 value between two vectors, you should just pass two one-dimensional arrays; see the documentation.
In the example you provided, each of the four columns is treated as a separate output with a single sample: R^2 is computed for 0.1567 against 0, then for 4.7528 against 3, and so on. With only one sample per output the total sum of squares is zero, so scikit-learn returns 0 for each. It sounds like you want the R^2 for the two vectors, like the following:
prediction = numpy.array([0.1567, 4.7528, 1.1260, 0.2294])
training = numpy.array([0, 3, 1, 0])
print(r2_score(training, prediction))
0.472439485
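For reference, this matches the definition R^2 = 1 - SS_res/SS_tot: with the mean of training equal to 1, SS_tot = 1 + 4 + 0 + 1 = 6 and SS_res ≈ 3.1654, giving 1 - 3.1654/6 ≈ 0.4724.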
If you have multi-dimensional arrays you can use the multioutput flag to determine what the output should look like:
#modified from the scikit-learn example
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
print(r2_score(y_true, y_pred, multioutput='raw_values'))
[0.96543779 0.90816327]
Here the first item of each list in y_true is compared to the first item of each list in y_pred, the second item to the second, and so on, so each column gets its own R^2 score.
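For completeness, the default multioutput='uniform_average' would return the mean of the two per-column scores instead:
print(r2_score(y_true, y_pred))  # (0.96543779 + 0.90816327) / 2 ≈ 0.9368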