How to divide a list of arrays into subarrays? - python-3.x

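I have a list 'a' with the following element values. In my code, I create the list like this:
import numpy as np
from copy import deepcopy
a = []
b = np.zeros(3)
c = []
for i in range(0, 4):
    b[0] = i + 1
    b[1] = i + 2
    b[2] = i + 3
    c.append(deepcopy(b))  # copy, so later writes to b do not alter stored rows
a.append(c)
c = []
print(a)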
Output:
[[array([1., 2., 3.]), array([2., 3., 4.]), array([3., 4., 5.]), array([4., 5., 6.])]]
The list above is an example of what I get in my data.
I tried to convert it to an array:
b = np.array(a)
array([[[1., 2., 3.],
        [2., 3., 4.],
        [3., 4., 5.],
        [4., 5., 6.]]])
b.shape
(1, 4, 3)
But I want b to have shape (4, 1, 3), so that indexing along the first axis returns one row at a time:
b[0] gives [1,2,3]
b[1] gives [2,3,4]
b[2] gives [3,4,5]
b[3] gives [4,5,6]

There's a built-in function for this:
b = np.vstack(a)
EDITED
After np.vstack(a), b has shape (4, 3). To get the requested shape (4, 1, 3):
b = b.reshape(4, 1, 3)
This gives the required result:
b[0] -> [[1., 2., 3.]]

EDIT:
Use @orli's answer; it's easier to type!
Using basic Python 3:
import numpy as np
from copy import deepcopy
a = []
b = np.zeros(3)
c = []
for i in range(0, 4):
    b[0] = i + 1
    b[1] = i + 2
    b[2] = i + 3
    c.append(deepcopy(b))
a.append(c)
res = []
for r in a:            # each inner list of arrays
    for arr in r:      # each 1-D array
        rw = []
        for e in arr.tolist():
            rw.append(e)
        res.append(rw)
print(res)
Yields:
[[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0], [4.0, 5.0, 6.0]]

Maybe I'm missing something but you should be able to get the result as:
b = np.array(a[0])
print(b[0]) # [1. 2. 3.]
print(b[1]) # [2. 3. 4.]
print(b[2]) # [3. 4. 5.]
print(b[3]) # [4. 5. 6.]
To preserve a 3D array:
b = np.array([a[0]]).reshape(4,1,3)
print(b[0]) #=> [[1. 2. 3.]]
print(b[1]) #=> [[2. 3. 4.]]
print(b[2]) #=> [[3. 4. 5.]]
print(b[3]) #=> [[4. 5. 6.]]
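The approaches in the answers above can be compared side by side. The following is a minimal sketch, assuming the list a is built as in the question (the list comprehension is shorthand used here, not the asker's code). It checks that stacking the inner list gives shape (4, 3) and that adding an axis gives the requested (4, 1, 3).

```python
import numpy as np

# Rebuild the question's nested list: one outer list holding four 1-D arrays.
a = [[np.array([i + 1.0, i + 2.0, i + 3.0]) for i in range(4)]]

rows = np.array(a[0])        # shape (4, 3): rows[k] is a 1-D array
b = rows[:, np.newaxis, :]   # shape (4, 1, 3): b[k] is a 1x3 array

print(rows.shape, b.shape)
print(rows[0])
print(b[0])
```

If only b[k] as a flat row is needed, the (4, 3) array is enough; the extra axis matters only when downstream code expects three dimensions.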

Related

How can I replace part of a matrix with a new matrix in torch

Consider following matrices
>>> a = torch.Tensor([[1,2,3],[4,5,6], [7,8,9]])
>>> a
tensor([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]])
>>> b = torch.tensor([[1,1],[1,1]])
>>> b
tensor([[1, 1],
        [1, 1]])
I want to replace 4 elements in a with b, where their row and column indices are given by X = [0,2] and Y = [0,2].
The result should be:
>>> a
tensor([[1., 2., 1.],
        [4., 5., 6.],
        [1., 8., 1.]])
I am looking for an operation like scatter or index_put that updates the matrix in a few commands (no loops).
If X and Y are tensors holding the row and column indices, the following works:
a[X.reshape(-1,1), Y] = b
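A minimal runnable sketch of this indexed assignment follows. It assumes X and Y are integer tensors, and it creates b as a float tensor so that its dtype matches a (an assumption made here; the question's b is an integer tensor).

```python
import torch

a = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]])
b = torch.ones(2, 2)          # float, same dtype as a (assumption for this sketch)
X = torch.tensor([0, 2])      # row indices
Y = torch.tensor([0, 2])      # column indices

# X as a column (2, 1) broadcasts against Y (2,) to address the 2x2 grid
# of positions (0,0), (0,2), (2,0), (2,2).
a[X.reshape(-1, 1), Y] = b
print(a)
```

Writing a[X, Y] = ... without the reshape would instead pair the indices element-wise and touch only (0,0) and (2,2).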

Indexing using pyTorch tensors along one specific dimension with 3 dimensional tensor

I have 2 tensors:
A with shape (batch, sequence, vocab)
and B with shape (batch, sequence).
A = torch.tensor([[[ 1.,  2.,  3.],
                   [ 5.,  6.,  7.]],
                  [[ 9., 10., 11.],
                   [13., 14., 15.]]])
B = torch.tensor([[0, 2],
                  [1, 0]])
I want to get the following:
C = torch.zeros_like(B)
for i in range(B.shape[0]):
    for j in range(B.shape[1]):
        C[i,j] = A[i,j,B[i,j]]
But in a vectorized way. I tried torch.gather and other approaches, but I could not make it work.
Can anyone please help me?
>>> import torch
>>> A = torch.tensor([[[ 1.,  2.,  3.],
...                    [ 5.,  6.,  7.]],
...
...                   [[ 9., 10., 11.],
...                    [13., 14., 15.]]])
>>> B = torch.tensor([[0, 2],
...                   [1, 0]])
>>> A.shape
torch.Size([2, 2, 3])
>>> B.shape
torch.Size([2, 2])
>>> C = torch.zeros_like(B)
>>> for i in range(B.shape[0]):
...     for j in range(B.shape[1]):
...         C[i,j] = A[i,j,B[i,j]]
...
>>> C
tensor([[ 1,  7],
        [10, 13]])
>>> torch.gather(A, -1, B.unsqueeze(-1))
tensor([[[ 1.],
         [ 7.]],
        [[10.],
         [13.]]])
>>> torch.gather(A, -1, B.unsqueeze(-1)).shape
torch.Size([2, 2, 1])
>>> torch.gather(A, -1, B.unsqueeze(-1)).squeeze(-1)
tensor([[ 1.,  7.],
        [10., 13.]])
You can use torch.gather(A, -1, B.unsqueeze(-1)).squeeze(-1).
The first -1, between A and B.unsqueeze(-1), is the dimension along which elements are picked.
The second -1, in B.unsqueeze(-1), adds a trailing dimension to B so that both tensors have the same number of dimensions; otherwise you get RuntimeError: Index tensor must have the same number of dimensions as input tensor.
The last -1, in squeeze(-1), removes that trailing dimension, taking the result from torch.Size([2, 2, 1]) to torch.Size([2, 2]).
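The three steps described above can be wrapped in a small function and checked against the loop from the question. This is a sketch; the helper name pick_last_dim is hypothetical and chosen here for illustration only.

```python
import torch

def pick_last_dim(A, B):
    """Return C with C[i, j] = A[i, j, B[i, j]] (hypothetical helper name)."""
    # unsqueeze: (batch, seq) -> (batch, seq, 1), matching A's number of dims
    # gather:    picks one element along the last dim for each (i, j)
    # squeeze:   (batch, seq, 1) -> (batch, seq)
    return torch.gather(A, -1, B.unsqueeze(-1)).squeeze(-1)

A = torch.tensor([[[ 1.,  2.,  3.],
                   [ 5.,  6.,  7.]],
                  [[ 9., 10., 11.],
                   [13., 14., 15.]]])
B = torch.tensor([[0, 2],
                  [1, 0]])

C = pick_last_dim(A, B)

# Reference result computed with the explicit double loop from the question.
expected = torch.empty(B.shape, dtype=A.dtype)
for i in range(B.shape[0]):
    for j in range(B.shape[1]):
        expected[i, j] = A[i, j, B[i, j]]

print(C)
print(torch.equal(C, expected))
```

Unlike the question's loop, which writes into an integer tensor created by zeros_like(B), the gathered result keeps A's floating-point dtype.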

What is the logic behind this assignment: understanding in-place assignment operations in numpy

I have two fairly simple snippets that give different answers. I understand that this is due to shared memory, but I am not clear on what exactly happens in the second case.
import numpy as np
import torch
a = np.ones(5)
b = torch.from_numpy(a)
a=np.add(a, 1, out=a)
print(a)
print(b)
[out]:
[2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
a = np.ones(5)
b = torch.from_numpy(a)
a=a+1
print(a)
print(b)
[out]:
[2. 2. 2. 2. 2.]
tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
Why isn't b changed in the second case ?
In the first case, a and b share the same memory (torch.from_numpy does not copy, so b is a view of the buffer that a points to), and the out=a argument guarantees that np.add() writes its result into that same memory. In the second case, a + 1 allocates a new array and a = a + 1 rebinds the name a to it, while b still points to the old buffer.
Try the second case with:
a += 1
and observe that both a and b are indeed updated.
In [7]: a = np.ones(5)
...: b = torch.from_numpy(a)
...: a += 1
In [8]: a
Out[8]: array([2., 2., 2., 2., 2.])
In [9]: b
Out[9]: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
As @hpaulj aptly pointed out in his comment, when we do a = a+1, a new object is created and a now points to this new (array) object instead of the old one, which is still pointed to by b. This is why the (array) value of b is not updated.
To understand this behavior a bit better, you might want to read the excellent article by Ned Batchelder about how names are bound to values in Python.
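The difference between updating in place and rebinding a name can be observed with NumPy alone. In this sketch a NumPy view stands in for the tensor returned by torch.from_numpy (an assumption made to keep the example dependency-free; both share the original buffer without copying).

```python
import numpy as np

a = np.ones(5)
b = a.view()      # stand-in for torch.from_numpy(a): shares a's buffer
original = a      # second name for the same array object

a += 1            # in-place: writes into the existing buffer
in_place_shared = np.shares_memory(a, b)

a = a + 1         # allocates a new array and rebinds the name a to it
rebound_shared = np.shares_memory(a, b)

print(in_place_shared, rebound_shared)
print(a)          # values of the new array
print(b)          # still the old buffer, updated only by the in-place step
print(a is original)
```

np.add(a, 1, out=a) behaves like a += 1 here, because both write into the existing buffer rather than creating a new array.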

One-dimensional tuple to two-dimensional NumPy array

Convert a tuple into a NumPy matrix under the following conditions:
The shape of the array should be len(tuple) x len(tuple), i.e. a square matrix.
The element at the location (index of the element in the tuple, value of the element in the tuple) should be one; all other elements should be zero.
For example, I have a random tuple like the one below:
# index means row, value means col
(2,0,1)
I use two loops to turn this tuple into a NumPy array:
import numpy as np
def get_np_represent(result):
    two_D = []
    for row in range(len(result)):
        one_D = []
        for col in range(len(result)):
            # a 1 marks the column stored at this row's position in the tuple
            if result[row] == col:
                one_D.append(1)
            else:
                one_D.append(0)
        two_D.append(one_D)
    return np.array(two_D)
output:
array([[0, 0, 1],
       [1, 0, 0],
       [0, 1, 0]])
But I have 10,000,000 such tuples. Is there a faster way?
Something like this? Writing directly into a preallocated matrix should be considerably faster than building nested lists in a double loop.
import numpy as np
t = (2, 0, 1)
x = np.zeros([len(t),len(t)])
for i,v in enumerate(t):
    x[i, v] = 1
print(x)
outputs:
[[0. 0. 1.]
 [1. 0. 0.]
 [0. 1. 0.]]
The loop can also be replaced by a single indexed assignment (reusing the setup from Ke's answer):
t = (2, 0, 1)
x = np.zeros([len(t),len(t)])
x[np.arange(len(x)),t]=1
x
Out[145]:
array([[0., 0., 1.],
       [1., 0., 0.],
       [0., 1., 0.]])
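Since the question mentions millions of tuples, the same indexed assignment can be applied to a whole batch at once. The following is a sketch under the assumption that all tuples have the same length; the function name tuples_to_matrices and the int8 dtype are choices made here, not part of the answers.

```python
import numpy as np

def tuples_to_matrices(tuples):
    """Convert m tuples of length n into an (m, n, n) stack of 0/1 matrices.

    Hypothetical helper; assumes every tuple has the same length n and
    holds valid column indices in range(n).
    """
    idx = np.asarray(tuples)                  # shape (m, n)
    m, n = idx.shape
    out = np.zeros((m, n, n), dtype=np.int8)  # int8 keeps memory use low
    # For every tuple k and row r, set out[k, r, idx[k, r]] = 1.
    out[np.arange(m)[:, None], np.arange(n), idx] = 1
    return out

batch = tuples_to_matrices([(2, 0, 1), (0, 1, 2)])
print(batch[0])
print(batch[1])
```

For a single tuple this reduces to the x[np.arange(len(x)), t] = 1 assignment shown above.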

PyTorch: select values from the last tensor dimension with indices from another tensor with a smaller dimension

I have a tensor a with three dimensions. The first dimension corresponds to minibatch size, the second to the sequence length, and the third to the feature dimension. E.g.,
>>> a = torch.arange(1, 13, dtype=torch.float).view(2,2,3) # Consider the values of a to be random
>>> a
tensor([[[ 1.,  2.,  3.],
         [ 4.,  5.,  6.]],
        [[ 7.,  8.,  9.],
         [10., 11., 12.]]])
I have a second, two-dimensional tensor b. Its first dimension corresponds to the minibatch size and its second dimension to the sequence length. It contains indices into the third dimension of a. Since that dimension has size 3, b can contain the values 0, 1 or 2. E.g.,
>>> b = torch.LongTensor([[0, 2],[1,0]])
>>> b
tensor([[0, 2],
[1, 0]])
I want to obtain a tensor c that has the shape of b and contains the values of a that are referenced by b.
In the scenario above I would like to have:
c = torch.empty(2,2)
c[0,0] = a[0, 0, b[0,0]]
c[1,0] = a[1, 0, b[1,0]]
c[0,1] = a[0, 1, b[0,1]]
c[1,1] = a[1, 1, b[1,1]]
>>> c
tensor([[ 1.,  6.],
        [ 8., 10.]])
How can I create the tensor c quickly? I also want c to be differentiable (so that I can call .backward()). I am not too familiar with PyTorch, so I am not sure whether a differentiable version of this exists.
As an alternative, instead of c having the same shape as b, I could use a c with the same shape as a that is zero everywhere except for ones at the positions referenced by b. Then I could multiply a and c to obtain a differentiable tensor.
Like this:
c = torch.zeros(2,2,3, dtype=torch.float)
c[0,0,b[0,0]] = 1
c[1,0,b[1,0]] = 1
c[0,1,b[0,1]] = 1
c[1,1,b[1,1]] = 1
>>> a*c
tensor([[[ 1.,  0.,  0.],
         [ 0.,  0.,  6.]],
        [[ 0.,  8.,  0.],
         [10.,  0.,  0.]]])
Let's declare the necessary variables first (notice requires_grad in a's initialization; we will use it to ensure differentiability):
a = torch.arange(1,13,dtype=torch.float32,requires_grad=True).reshape(2,2,3)
b = torch.LongTensor([[0, 2],[1,0]])
Let's reshape a to merge the minibatch and sequence dimensions into one:
temp = a.reshape(-1,3)
so temp now looks like:
tensor([[ 1.,  2.,  3.],
        [ 4.,  5.,  6.],
        [ 7.,  8.,  9.],
        [10., 11., 12.]], grad_fn=<AsStridedBackward>)
Notice that each value of b can now be used as a column index into the corresponding row of temp to get the desired output. Now we do:
c = temp[range(len(temp )),b.view(-1)].view(b.size())
Notice how we index temp: range(len(temp )) selects each row, and the flattened b, i.e. b.view(-1), selects the corresponding column in each row. Lastly, .view(b.size()) brings the result to the same shape as b.
If we print c now:
tensor([[ 1.,  6.],
        [ 8., 10.]], grad_fn=<ViewBackward>)
The presence of grad_fn=... shows that c is part of the autograd graph, i.e. it is differentiable with respect to a.
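To check differentiability directly, one can backpropagate through c and inspect the gradient. The sketch below assumes a separate leaf tensor (named leaf here, a name not in the original answer) so that .grad is populated, and it also computes the same selection with torch.gather for comparison.

```python
import torch

# Keep the 1-D tensor as the leaf; reshaping it yields a non-leaf tensor.
leaf = torch.arange(1.0, 13.0, requires_grad=True)
a = leaf.reshape(2, 2, 3)
b = torch.tensor([[0, 2], [1, 0]])

# Row/column indexing on the flattened (batch * sequence, features) tensor.
temp = a.reshape(-1, 3)
c = temp[torch.arange(temp.shape[0]), b.reshape(-1)].reshape(b.shape)

# Equivalent selection with gather along the last dimension.
c_gather = torch.gather(a, -1, b.unsqueeze(-1)).squeeze(-1)

# Each selected element contributes gradient 1; everything else gets 0.
c.sum().backward()
grad = leaf.grad.reshape(2, 2, 3)

print(c)
print(grad)
```

The gradient has the same pattern as the 0/1 mask the question proposed multiplying with a, which suggests the mask-and-multiply workaround is not needed for differentiability.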
