Indices of sorted array - python-3.x

I have a 2-dimensional array with shape (nrows, ncols) containing real numbers. I would like to get the indices (row, col) corresponding to the array values in decreasing order. Checking the documentation of np.argsort(), it seems that it only returns the indices ordered along a specific axis. I'm sure that this should be simple, but I just can't figure it out.
For example, if I have:
[
[1 5 6]
[7 4 9]
[8 2 3]
]
the desired output would be:
[
(1,2),
(2,0),
(1,0),
(0,2),
(0,1),
(1,1),
(2,2),
(2,1),
(0,0),
]

Here's one way for descending order -
In [19]: a
Out[19]:
array([[1, 5, 6],
[7, 4, 9],
[8, 2, 3]])
In [20]: np.c_[np.unravel_index(a.ravel().argsort()[::-1],a.shape)]
Out[20]:
array([[1, 2],
[2, 0],
[1, 0],
[0, 2],
[0, 1],
[1, 1],
[2, 2],
[2, 1],
[0, 0]])
For ascending order, skip the flipping part : [::-1].
Or negate the values and sort in ascending order -
In [24]: np.c_[np.unravel_index((-a).ravel().argsort(),a.shape)]
Out[24]:
array([[1, 2],
[2, 0],
[1, 0],
[0, 2],
[0, 1],
[1, 1],
[2, 2],
[2, 1],
[0, 0]])
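The question asks for (row, col) tuples rather than a 2-column array. A minimal sketch of that variant, assuming the same sample array; the names flat_order and pairs are only illustrative. It relies on np.argsort accepting axis=None, which sorts the flattened array. Note that when values are tied, reversing with [::-1] and negating the values can order the tied entries differently.

```python
import numpy as np

a = np.array([[1, 5, 6],
              [7, 4, 9],
              [8, 2, 3]])

# axis=None sorts the flattened array; [::-1] turns ascending into descending
flat_order = np.argsort(a, axis=None)[::-1]

# convert flat positions back to (row, col) coordinates
rows, cols = np.unravel_index(flat_order, a.shape)

# plain Python tuples, as in the desired output of the question
pairs = list(zip(rows.tolist(), cols.tolist()))
print(pairs)
```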

Related

Element wise addition of two n dimensional lists

When I need to add two 2D lists element wise, the approach I am using is
l1 = [[1, 1, 1],
[2, 2, 2],
[3, 3, 3]]
l2 = [[1, 1, 1],
[2, 2, 2],
[3, 3, 3]]
new = list(map(lambda e: [sum(x) for x in zip(*e)], zip(l1, l2)))
print(new)
output : [[2, 2, 2],
[4, 4, 4],
[6, 6, 6]]
This code is already difficult to read.
So how would I add two n dimensional lists element wise? Is there a pythonic way to do it or should I use numpy?
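One possible pure-Python approach is a small recursive helper, sketched below. The name add_nested is hypothetical, and the sketch assumes both inputs are nested lists of numbers with identical shape. If NumPy is available and the data is rectangular, np.add(l1, l2) gives the same result as an array.

```python
def add_nested(a, b):
    """Element-wise sum of two equally shaped nested lists of numbers."""
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            raise ValueError("lists differ in length")
        # recurse one nesting level deeper for each pair of items
        return [add_nested(x, y) for x, y in zip(a, b)]
    if isinstance(a, list) or isinstance(b, list):
        raise ValueError("lists differ in nesting depth")
    # innermost level: plain numbers
    return a + b


l1 = [[1, 1, 1],
      [2, 2, 2],
      [3, 3, 3]]
l2 = [[1, 1, 1],
      [2, 2, 2],
      [3, 3, 3]]
new = add_nested(l1, l2)
print(new)
```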

repeating specific rows in array n times

I have a huge numpy array with 15413 rows and 70 columns. The first column represents the weight (so if the first element in a row is n, that row should be repeated n times).
I tried with numpy.repeat but I don’t think it’s giving me the correct answer, because np.sum(chain[:,0]) is not equal to len(ACT_chain).
ACT_chain = []
for i in range(len(chain[:,0])):
    chain_row = chain[i]
    ACT_chain.append(chain_row)
    if int(chain[:,0][i]) > 1:
        chain_row = np.repeat(chain[i], chain[:, 0][i], axis=0)
        ACT_chain.append(chain_row)
For example, running this code with this sample array
chain = np.array([[1, 5, 3], [2, 2, 1], [3, 0, 1]])
gives
[array([1, 5, 3]), array([2, 2, 1]), array([[2, 2, 1],
[2, 2, 1]]), array([3, 0, 1]), array([[3, 0, 1],
[3, 0, 1],
[3, 0, 1]])]
but the output I expect is
array([[1, 5, 3], [2, 2, 1], [2, 2, 1], [3, 0, 1], [3, 0, 1], [3, 0, 1]])
You can use repeat here, without the iteration.
np.repeat(chain, chain[:, 0], axis=0)
array([[1, 5, 3],
[2, 2, 1],
[2, 2, 1],
[3, 0, 1],
[3, 0, 1],
[3, 0, 1]])
I solved the TypeError by converting the whole array to int:
chain = chain.astype(int)
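If the TypeError came from the array being float (an assumption; the original dtype is not shown), casting only the weight column keeps any fractional values in the other columns. A minimal sketch with a made-up float version of the sample array:

```python
import numpy as np

# hypothetical float version of the sample array; first column holds the weights
chain = np.array([[1.0, 5.5, 3.0],
                  [2.0, 2.0, 1.0],
                  [3.0, 0.0, 1.0]])

# np.repeat needs integer repeat counts, so cast only the weights
weights = chain[:, 0].astype(int)

# repeat each row weights[i] times along axis 0
expanded = np.repeat(chain, weights, axis=0)

# the check from the question: number of rows equals the sum of the weights
print(expanded.shape[0] == weights.sum())
```

A row whose weight is 0 is dropped entirely by this call.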

Pytorch, retrieving values from a 3D tensor using several indices. Most computationally efficient solution

Related:
Pytorch, retrieving values from a tensor using several indices. Most computationally efficient solution
This is another question about retrieving values from a 3D tensor, using a list of indices.
In this case, I have a 3d tensor, for example
b = [[[4, 20], [1, -1]], [[1, 2], [8, -1]], [[92, 4], [23, -1]]]
tensor_b = torch.tensor(b)
tensor_b
tensor([[[ 4, 20],
[ 1, -1]],
[[ 1, 2],
[ 8, -1]],
[[92, 4],
[23, -1]]])
In this case, I have a list of 3D indices. So
indices = [
[[1, 0, 1], [2, 0, 1]],
[[1, 1, 1], [0, 0, 0]],
[[2, 1, 0], [0, 1, 0]]
]
Each triple is an index into tensor_b. The desired result is
[[2, 4], [-1, 4], [23, 1]]
Potential Approach
Like in the last question, the first solution that comes to mind is a nested for loop, but there is probably a more computationally efficient solution using a PyTorch function.
And like in the last question, perhaps reshape would be needed to get the desired shape for the last solution.
So a desired solution could be [2, 4, -1, 4, 23, 1], which can come from a flattened list of indices
[ [1, 0, 1], [2, 0, 1], [1, 1, 1], [0, 0, 0], [2, 1, 0], [0, 1, 0] ]
But I am not aware of any PyTorch functions so far which allow for a list of 3D indices. I have been looking at gather and index_select.
You can use advanced indexing, specifically integer array indexing:
tensor_b = torch.tensor([[[4, 20], [1, -1]], [[1, 2], [8, -1]], [[92, 4], [23, -1]]])
indices = torch.tensor([
[[1, 0, 1], [2, 0, 1]],
[[1, 1, 1], [0, 0, 0]],
[[2, 1, 0], [0, 1, 0]]
])
result = tensor_b[indices[:, :, 0], indices[:, :, 1], indices[:, :, 2]]
results in
tensor([[ 2, 4],
[-1, 4],
[23, 1]])
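The flattened variant mentioned in the question follows the same rule: one index array per dimension, all of equal shape. The sketch below uses NumPy arrays purely for illustration, since NumPy's integer array indexing behaves the same way here; with torch tensors the same reshape and bracket expression should apply. The names flat_idx, flat and nested are illustrative.

```python
import numpy as np

b = np.array([[[4, 20], [1, -1]], [[1, 2], [8, -1]], [[92, 4], [23, -1]]])
indices = np.array([
    [[1, 0, 1], [2, 0, 1]],
    [[1, 1, 1], [0, 0, 0]],
    [[2, 1, 0], [0, 1, 0]],
])

# collapse the leading dimensions so each row is one (i, j, k) triple
flat_idx = indices.reshape(-1, 3)

# one index array per dimension of b
flat = b[flat_idx[:, 0], flat_idx[:, 1], flat_idx[:, 2]]
print(flat.tolist())

# restore the layout of the original index list
nested = flat.reshape(indices.shape[:-1])
print(nested.tolist())
```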

idiom for getting contiguous copies

The help for numpy.broadcast_arrays introduces an idiom.
However, the idiom gives exactly the same output as the original command.
What is the meaning of "getting contiguous copies instead of non-contiguous views"?
https://docs.scipy.org/doc/numpy/reference/generated/numpy.broadcast_arrays.html
x = np.array([[1,2,3]])
y = np.array([[1],[2],[3]])
np.broadcast_arrays(x, y)
[array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
Here is a useful idiom for getting contiguous copies instead of non-contiguous views.
[np.array(a) for a in np.broadcast_arrays(x, y)]
[array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
To understand the difference, try writing into the new arrays.
Let's begin with the contiguous copies.
>>> import numpy as np
>>> x = np.array([[1,2,3]])
>>> y = np.array([[1],[2],[3]])
>>>
>>> xc, yc = [np.array(a) for a in np.broadcast_arrays(x, y)]
>>> xc
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
We can modify an element and nothing unexpected will happen.
>>> xc[0, 0] = 0
>>> xc
array([[0, 2, 3],
[1, 2, 3],
[1, 2, 3]])
>>> x
array([[1, 2, 3]])
Now, let's try the same with the broadcasted arrays:
>>> xb, yb = np.broadcast_arrays(x, y)
>>> xb
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
Although we only write to the top left element ...
>>> xb[0, 0] = 0
... the entire left column will change ...
>>> xb
array([[0, 2, 3],
[0, 2, 3],
[0, 2, 3]])
... and also the input array.
>>> x
array([[0, 2, 3]])
It means that the broadcast_arrays function doesn't create entirely new objects. It creates views of the original arrays, so the results share memory with those arrays and are generally not contiguous. Calling np.array(a) on each view makes an independent copy whose elements occupy one contiguous block of memory.
You can check this as follows:
arr = np.broadcast_arrays(x, y)
In [144]: arr
Out[144]:
[array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
In [145]: x
Out[145]: array([[1, 2, 3]])
In [146]: arr[0][0] = 0
In [147]: arr
Out[147]:
[array([[0, 0, 0],
[0, 0, 0],
[0, 0, 0]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
In [148]: x
Out[148]: array([[0, 0, 0]])
As you can see, changing arr's elements changes both the broadcast result and the original x array.
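The difference can also be inspected without writing to anything. The sketch below checks memory sharing, strides and contiguity flags; the helper is_contiguous is hypothetical and accepts either C or Fortran order, because np.array keeps the memory order of its input by default and the copy is not guaranteed to be C-ordered.

```python
import numpy as np

x = np.array([[1, 2, 3]])
y = np.array([[1], [2], [3]])

xb, yb = np.broadcast_arrays(x, y)                          # views
xc, yc = [np.array(a) for a in np.broadcast_arrays(x, y)]   # copies


def is_contiguous(arr):
    """True if the array occupies one contiguous block, in C or Fortran order."""
    return bool(arr.flags['C_CONTIGUOUS'] or arr.flags['F_CONTIGUOUS'])


# the view reuses the single row of x: stepping to the next row moves 0 bytes
print(xb.strides[0])

# views share memory with the inputs, copies do not
print(np.shares_memory(xb, x), np.shares_memory(xc, x))

# views are non-contiguous, copies are contiguous
print(is_contiguous(xb), is_contiguous(xc))
```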

NumPy doesn't recognize well array shape

I have the following code:
data = np.array([[[i, j], i * j] for i in range(10) for j in range(10)])
print(data)
x = np.array(data[:,0])
x1 = x[:,0]
x2 = x[:,1]
print(x)
data correctly outputs [[[0,0],0],[[0,1],0],[[0,2],0],...,[[9,9],81]], which is, by the way, the multiplication table and its results.
So the first column of data (which is x) must be separated into x1 and x2, which are its first and last columns respectively. I think I did this right, but it raises an error saying too many indices for array. What am I doing wrong?
data.dtype is object because the elements of [[i,j],k] are not homogeneous. A workaround:
data = np.array([(i, j, i * j) for i in range(10) for j in range(10)])
print(data)
x1 = data[:,:2]
x2 = data[:,2]
data.shape is now (100,3), data.dtype is int, and x1 and x2 are what you want.
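If the goal is to have the two factors as separate 1-D arrays, as the question describes, the flat layout can be split column by column. A minimal sketch; the names table and product are illustrative. As far as I know, recent NumPy releases (1.24 and later) also refuse to build the original ragged array unless dtype=object is passed explicitly, which is another reason to prefer the flat layout.

```python
import numpy as np

# one row per (i, j, i * j) triple: a regular 2-D integer array
table = np.array([(i, j, i * j) for i in range(10) for j in range(10)])

x1 = table[:, 0]       # first factor, shape (100,)
x2 = table[:, 1]       # second factor, shape (100,)
product = table[:, 2]  # i * j, shape (100,)

print(table.shape, x1.shape, x2.shape)
```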
Because of the mix of list lengths, this produces an object array:
In [97]: data = np.array([[[i, j], i * j] for i in range(3) for j in range(3)])
In [98]: data
Out[98]:
array([[[0, 0], 0],
[[0, 1], 0],
[[0, 2], 0],
[[1, 0], 0],
[[1, 1], 1],
[[1, 2], 2],
[[2, 0], 0],
[[2, 1], 2],
[[2, 2], 4]], dtype=object)
In [99]: data.shape
Out[99]: (9, 2)
One column contains numbers (but is still object dtype), the other contains lists. Both have shape (9,).
In [100]: data[:,1]
Out[100]: array([0, 0, 0, 0, 1, 2, 0, 2, 4], dtype=object)
In [101]: data[:,0]
Out[101]:
array([[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 0], [2, 1],
[2, 2]], dtype=object)
The easiest way of turning that column into a numeric array is via .tolist:
In [104]: np.array(data[:,0].tolist())
Out[104]:
array([[0, 0],
[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 2],
[2, 0],
[2, 1],
[2, 2]])
In [105]: _.shape
Out[105]: (9, 2)
The [i, j, i * j] elements as suggested in the other answer are easier to work with.
A structured array approach to generating such a 'table':
In [113]: dt='(2)int,int'
In [114]: data = np.array([([i, j], i * j) for i in range(3) for j in range(3)],
...: dtype=dt)
In [115]: data
Out[115]:
array([([0, 0], 0), ([0, 1], 0), ([0, 2], 0), ([1, 0], 0), ([1, 1], 1),
([1, 2], 2), ([2, 0], 0), ([2, 1], 2), ([2, 2], 4)],
dtype=[('f0', '<i4', (2,)), ('f1', '<i4')])
In [116]: data['f0']
Out[116]:
array([[0, 0],
[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 2],
[2, 0],
[2, 1],
[2, 2]])
In [117]: data['f1']
Out[117]: array([0, 0, 0, 0, 1, 2, 0, 2, 4])
