NumPy doesn't recognize well array shape


I have the following code:
data = np.array([[[i, j], i * j] for i in range(10) for j in range(10)])
print(data)
x = np.array(data[:,0])
x1 = x[:,0]
x2 = x[:,1]
print(x)
data correctly outputs [[[0,0],0],[[0,1],0],[[0,2],0],...,[[9,9],81]], which is the multiplication table and its results.
So the first column of data (which is x) should be split into x1 and x2, its first and last columns respectively. I think I did this correctly, but it raises an error saying "too many indices for array". What am I doing wrong?

data.dtype is object because the elements of [[i,j],k] are not homogeneous. A workaround:
data = np.array([(i, j, i * j) for i in range(10) for j in range(10)])
print(data)
x1 = data[:,:2]
x2 = data[:,2]
data.shape is now (100,3), data.dtype is int, and x1 (the [i, j] pairs) and x2 (the products) can be sliced out directly.
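If the goal is the two factors as separate 1-D arrays (x1 and x2 in the question), they can be sliced from the flat table as single columns. A minimal sketch, assuming the (i, j, i * j) layout from the answer above; the names factor_i, factor_j and product are illustrative only:

```python
import numpy as np

# Flat (i, j, i * j) rows give a homogeneous integer array of shape (100, 3).
data = np.array([(i, j, i * j) for i in range(10) for j in range(10)])

factor_i = data[:, 0]   # first factor, shape (100,)
factor_j = data[:, 1]   # second factor, shape (100,)
product = data[:, 2]    # i * j, shape (100,)

print(data.shape, data.dtype.kind)  # kind 'i' means a signed integer dtype
print(factor_i[:12], factor_j[:12], product[:12])
```

Each slice is a view into data, so no extra copy of the table is made.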

Because of the mix of list lengths, this produces an object array:
In [97]: data = np.array([[[i, j], i * j] for i in range(3) for j in range(3)])
In [98]: data
Out[98]:
array([[[0, 0], 0],
[[0, 1], 0],
[[0, 2], 0],
[[1, 0], 0],
[[1, 1], 1],
[[1, 2], 2],
[[2, 0], 0],
[[2, 1], 2],
[[2, 2], 4]], dtype=object)
In [99]: data.shape
Out[99]: (9, 2)
One column contains numbers (but is still object dtype), the other contains lists. Both have shape (9,):
In [100]: data[:,1]
Out[100]: array([0, 0, 0, 0, 1, 2, 0, 2, 4], dtype=object)
In [101]: data[:,0]
Out[101]:
array([[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 0], [2, 1],
[2, 2]], dtype=object)
The easiest way of turning that column into a numeric array is via .tolist:
In [104]: np.array(data[:,0].tolist())
Out[104]:
array([[0, 0],
[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 2],
[2, 0],
[2, 1],
[2, 2]])
In [105]: _.shape
Out[105]: (9, 2)
The [i, j, i * j] elements as suggested in the other answer are easier to work with.
A structured array approach to generating such a 'table':
In [113]: dt='(2)int,int'
In [114]: data = np.array([([i, j], i * j) for i in range(3) for j in range(3)],
...: dtype=dt)
In [115]: data
Out[115]:
array([([0, 0], 0), ([0, 1], 0), ([0, 2], 0), ([1, 0], 0), ([1, 1], 1),
([1, 2], 2), ([2, 0], 0), ([2, 1], 2), ([2, 2], 4)],
dtype=[('f0', '<i4', (2,)), ('f1', '<i4')])
In [116]: data['f0']
Out[116]:
array([[0, 0],
[0, 1],
[0, 2],
[1, 0],
[1, 1],
[1, 2],
[2, 0],
[2, 1],
[2, 2]])
In [117]: data['f1']
Out[117]: array([0, 0, 0, 0, 1, 2, 0, 2, 4])
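The .tolist route above can be sketched end to end. Newer NumPy releases may refuse to build a ragged array implicitly (as far as I know they raise a ValueError unless dtype=object is requested), so this sketch fills an object array explicitly. That construction is my assumption, not part of the original answer:

```python
import numpy as np

pairs = [(i, j) for i in range(3) for j in range(3)]

# Build the (9, 2) object array by hand: column 0 holds lists, column 1 ints.
data = np.empty((len(pairs), 2), dtype=object)
for row, (i, j) in enumerate(pairs):
    data[row, 0] = [i, j]
    data[row, 1] = i * j

# Column 0 is a (9,) array of Python lists; .tolist() plus np.array
# turns it into a regular numeric (9, 2) array.
x = np.array(data[:, 0].tolist())
x1 = x[:, 0]
x2 = x[:, 1]

print(data.shape, data.dtype)
print(x.shape, x1, x2)
```

Once x is a regular 2-D array, the x[:, 0] and x[:, 1] slices from the question work without the "too many indices" error.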

Related

repeating specific rows in array n times

I have a huge numpy array with 15413 rows and 70 columns. The first column represents the weight, so if the first element in a row is n, that row should be repeated n times.
I tried with numpy.repeat but I don’t think it’s giving me the correct answer because np.sum(chain[:,0]) is not equal to len(ACT_chain)
ACT_chain = []
for i in range(len(chain[:,0])):
    chain_row = chain[i]
    ACT_chain.append(chain_row)
    if int(chain[:,0][i]) > 1:
        chain_row = np.repeat(chain[i], chain[:, 0][i], axis=0)
        ACT_chain.append(chain_row)
For example, running this code with this sample array
chain = np.array([[1, 5, 3], [2, 2, 1], [3, 0, 1]])
gives
[array([1, 5, 3]), array([2, 2, 1]), array([[2, 2, 1],
[2, 2, 1]]), array([3, 0, 1]), array([[3, 0, 1],
[3, 0, 1],
[3, 0, 1]])]
but the output I expect is
array([[1, 5, 3], [2, 2, 1], [2, 2, 1], [3, 0, 1], [3, 0, 1], [3, 0, 1]])

You can use repeat here, without the iteration.
np.repeat(chain, chain[:, 0], axis=0)
array([[1, 5, 3],
[2, 2, 1],
[2, 2, 1],
[3, 0, 1],
[3, 0, 1],
[3, 0, 1]])

I solved the TypeError by converting the whole array to int:
chain = chain.astype(int)
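If the other columns hold non-integer values, casting the whole array would truncate them. A hedged alternative is to cast only the weight column and keep the data as floats; the sample values below are made up for illustration:

```python
import numpy as np

# Hypothetical float-valued chain: column 0 is the weight of each row.
chain = np.array([[1.0, 5.0, 3.0],
                  [2.0, 2.0, 1.0],
                  [3.0, 0.0, 1.0]])

# np.repeat needs integer counts, so cast only the weight column.
weights = chain[:, 0].astype(int)
expanded = np.repeat(chain, weights, axis=0)

print(expanded)
print(len(expanded) == weights.sum())
```

The final check is the one from the question: the number of rows after repeating should equal the sum of the weights.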

Pytorch, retrieving values from a 3D tensor using several indices. Most computationally efficient solution

Related:
Pytorch, retrieving values from a tensor using several indices. Most computationally efficient solution
This is another question about retrieving values from a 3D tensor, using a list of indices.
In this case, I have a 3d tensor, for example
b = [[[4, 20], [1, -1]], [[1, 2], [8, -1]], [[92, 4], [23, -1]]]
tensor_b = torch.tensor(b)
tensor_b
tensor([[[ 4, 20],
[ 1, -1]],
[[ 1, 2],
[ 8, -1]],
[[92, 4],
[23, -1]]])
In this case, I have a list of 3D indices. So
indices = [
[[1, 0, 1], [2, 0, 1]],
[[1, 1, 1], [0, 0, 0]],
[[2, 1, 0], [0, 1, 0]]
]
Each triple is an index for tensor_b. The desired result is
[[2, 4], [-1, 4], [23, 1]]
Potential Approach
Like in the last question, the first solution that comes to mind is a nested for loop, but there is probably a more computationally efficient solution using PyTorch functions.
And like in the last question, perhaps reshape would be needed to get the desired shape for the last solution.
So a desired solution could be [2, 4, -1, 4, 23, 1], which can come from a flattened list of indices
[ [1, 0, 1], [2, 0, 1], [1, 1, 1], [0, 0, 0], [2, 1, 0], [0, 1, 0] ]
But I am not aware of any pytorch functions so far which allow for a list of 3D indices. I have been looking at gather and index_select.

You can use advanced indexing, specifically integer array indexing:
tensor_b = torch.tensor([[[4, 20], [1, -1]], [[1, 2], [8, -1]], [[92, 4], [23, -1]]])
indices = torch.tensor([
[[1, 0, 1], [2, 0, 1]],
[[1, 1, 1], [0, 0, 0]],
[[2, 1, 0], [0, 1, 0]]
])
result = tensor_b[indices[:, :, 0], indices[:, :, 1], indices[:, :, 2]]
results in
tensor([[ 2, 4],
[-1, 4],
[23, 1]])
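The flattened variant from the question follows the same pattern: index with the flat list of triples, then reshape if needed. The sketch below uses NumPy, whose integer array indexing follows the same rules as the PyTorch indexing shown above, so that it runs without PyTorch installed. Treat the NumPy substitution as my assumption about what is convenient, not as the answer's method:

```python
import numpy as np

b = np.array([[[4, 20], [1, -1]], [[1, 2], [8, -1]], [[92, 4], [23, -1]]])
indices = np.array([
    [[1, 0, 1], [2, 0, 1]],
    [[1, 1, 1], [0, 0, 0]],
    [[2, 1, 0], [0, 1, 0]],
])

# Flatten to a (6, 3) list of triples; each column indexes one axis of b.
flat_idx = indices.reshape(-1, 3)
flat = b[flat_idx[:, 0], flat_idx[:, 1], flat_idx[:, 2]]

# Restore the leading shape of the index array.
result = flat.reshape(indices.shape[:-1])

print(flat)
print(result)
```

The result always has the shape of the index arrays, which is why indexing with the unflattened (3, 2) slices gives the (3, 2) output directly.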

3D matrix addition python

I am trying to add two 3D matrices, but the third loop does not seem to start from 0.
Here the shape of each matrix is (2,3,3).
Code:
for i in range(0,r):
    for j in range(0,c):
        for l in range(0,k):
            sum[i][j][k]=A1[i][j][k]+A2[i][j][k]
Output:
IndexError: index 3 is out of bounds for axis 0 with size 3

For element-wise addition of two matrices, you can simply use the + operator between two numpy arrays:
import numpy as np
# create two matrices of random integers
matrix1 = np.random.randint(10, size=(2,3,3))
matrix2 = np.random.randint(10, size=(2,3,3))
# add the two matrices element-wise
sum_matrix = matrix1 + matrix2
print(matrix1, matrix2, sum_matrix, sep='\n__________\n')

I don't get an IndexError. Could you post your whole code?
This is my code:
arr1 = [[[2, 4, 8], [7, 7, 1], [4, 9, 0]], [[5, 0, 0], [3, 8, 6], [0, 5, 8]]]
arr2 = [[[3, 8, 0], [1, 5, 2], [0, 3, 9]], [[9, 7, 7], [1, 2, 5], [1, 1, 3]]]
sumArr = [[[0, 0, 0], [0, 0, 0], [0, 0, 0]], [[0, 0, 0], [0, 0, 0],[0, 0, 0]]]
for i in range(2):  # can also use range(0, 2)
    for j in range(3):
        for k in range(3):
            sumArr[i][j][k] = arr1[i][j][k] + arr2[i][j][k]
print(sumArr)
By the way, is it necessary to use a for loop?
If not, you can use the NumPy library.
import numpy as np
Convert your nested lists to NumPy arrays, then add them.
arr1 = [[[2, 4, 8], [7, 7, 1], [4, 9, 0]], [[5, 0, 0], [3, 8, 6], [0, 5, 8]]]
arr2 = [[[3, 8, 0], [1, 5, 2], [0, 3, 9]], [[9, 7, 7], [1, 2, 5], [1, 1, 3]]]
m1 = np.array(arr1)
m2 = np.array(arr2)
print("M1: \n", m1)
print("M2: \n", m2)
print("Sum: \n", m1 + m2)

You iterate with l in the third loop, but you index the lists with k. Since k is the size of the last axis (3 here), every access uses index 3, which doesn't exist, so you get an IndexError.
Use this:
for i in range(0, r):
    for j in range(0, c):
        for l in range(0, k):
            sum[i][j][l] = A1[i][j][l] + A2[i][j][l]
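One way to make this kind of mix-up less likely is to take the loop bounds from the data and give them names that cannot be confused with the loop indices. A small sketch, assuming nested Python lists as inputs; the sample values and the names n_blocks, n_rows, n_cols and total are illustrative:

```python
import numpy as np

A1 = [[[2, 4, 8], [7, 7, 1], [4, 9, 0]], [[5, 0, 0], [3, 8, 6], [0, 5, 8]]]
A2 = [[[3, 8, 0], [1, 5, 2], [0, 3, 9]], [[9, 7, 7], [1, 2, 5], [1, 1, 3]]]

# Sizes come from the data, so a bound is never reused as an index.
n_blocks, n_rows, n_cols = len(A1), len(A1[0]), len(A1[0][0])
total = [[[0] * n_cols for _ in range(n_rows)] for _ in range(n_blocks)]

for i in range(n_blocks):
    for j in range(n_rows):
        for l in range(n_cols):
            total[i][j][l] = A1[i][j][l] + A2[i][j][l]

# Cross-check against NumPy's element-wise addition.
print(np.array_equal(np.array(total), np.array(A1) + np.array(A2)))
```

Using total rather than sum also avoids shadowing Python's built-in sum function.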

Indices of sorted array

I have a 2-dimensional array with shape (nrows, ncols) containing real numbers. I would like to get the indices (row, col) corresponding to the array values in decreasing order. Checking the documentation of np.argsort(), it seems that it only returns the indices ordered along a specific axis. I'm sure that this should be simple, but I just can't figure it out.
For example, if i have:
[
[1 5 6]
[7 4 9]
[8 2 3]
]
the desired output would be:
[
(1,2),
(2,0),
(1,0),
(0,2),
(0,1),
(1,1),
(2,2),
(2,1),
(0,0),
]

Here's one way for descending order -
In [19]: a
Out[19]:
array([[1, 5, 6],
[7, 4, 9],
[8, 2, 3]])
In [20]: np.c_[np.unravel_index(a.ravel().argsort()[::-1],a.shape)]
Out[20]:
array([[1, 2],
[2, 0],
[1, 0],
[0, 2],
[0, 1],
[1, 1],
[2, 2],
[2, 1],
[0, 0]])
For ascending order, skip the flipping part: [::-1].
Or negate the values -
In [24]: np.c_[np.unravel_index((-a).ravel().argsort(),a.shape)]
Out[24]:
array([[1, 2],
[2, 0],
[1, 0],
[0, 2],
[0, 1],
[1, 1],
[2, 2],
[2, 1],
[0, 0]])
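The same idea can be wrapped in a small helper that returns plain (row, col) tuples, as in the question's desired output. This is a sketch; the function name sorted_indices and its descending flag are my own, not a NumPy API:

```python
import numpy as np

def sorted_indices(a, descending=True):
    """Return (row, col) tuples ordered by the values of a 2-D array."""
    order = np.argsort(a, axis=None)      # sort the flattened array
    if descending:
        order = order[::-1]
    rows, cols = np.unravel_index(order, a.shape)
    return [(int(r), int(c)) for r, c in zip(rows, cols)]

a = np.array([[1, 5, 6],
              [7, 4, 9],
              [8, 2, 3]])
print(sorted_indices(a))
```

Passing axis=None makes argsort work on the flattened array, which is equivalent to calling a.ravel().argsort() as in the answer.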

idiom for getting contiguous copies

In the documentation of numpy.broadcast_arrays, an idiom is introduced.
However, the idiom gives exactly the same output as the original command.
What does "getting contiguous copies instead of non-contiguous views" mean?
https://docs.scipy.org/doc/numpy/reference/generated/numpy.broadcast_arrays.html
x = np.array([[1,2,3]])
y = np.array([[1],[2],[3]])
np.broadcast_arrays(x, y)
[array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
Here is a useful idiom for getting contiguous copies instead of non-contiguous views.
[np.array(a) for a in np.broadcast_arrays(x, y)]
[array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]

To understand the difference, try writing into the new arrays.
Let's begin with the contiguous copies.
>>> import numpy as np
>>> x = np.array([[1,2,3]])
>>> y = np.array([[1],[2],[3]])
>>>
>>> xc, yc = [np.array(a) for a in np.broadcast_arrays(x, y)]
>>> xc
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
We can modify an element and nothing unexpected will happen.
>>> xc[0, 0] = 0
>>> xc
array([[0, 2, 3],
[1, 2, 3],
[1, 2, 3]])
>>> x
array([[1, 2, 3]])
Now, let's try the same with the broadcasted arrays:
>>> xb, yb = np.broadcast_arrays(x, y)
>>> xb
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
Although we only write to the top left element ...
>>> xb[0, 0] = 0
... the entire left column will change ...
>>> xb
array([[0, 2, 3],
[0, 2, 3],
[0, 2, 3]])
... and also the input array.
>>> x
array([[0, 2, 3]])

This means that broadcast_arrays doesn't create entirely new objects. It creates views of the original arrays, so the elements of its results share memory with those arrays and are not necessarily contiguous. Wrapping each result in np.array makes a new copy that owns its data and is laid out contiguously in memory.
You can check this as follows:
arr = np.broadcast_arrays(x, y)
In [144]: arr
Out[144]:
[array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
In [145]: x
Out[145]: array([[1, 2, 3]])
In [146]: arr[0][0] = 0
In [147]: arr
Out[147]:
[array([[0, 0, 0],
[0, 0, 0],
[0, 0, 0]]), array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])]
In [148]: x
Out[148]: array([[0, 0, 0]])
As you can see, changing arr's elements changes both its elements and the original x array.
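The difference can also be checked without writing to anything. A small sketch, assuming only np.shares_memory and the strides attribute; the names xb and xc follow the earlier answer:

```python
import numpy as np

x = np.array([[1, 2, 3]])
y = np.array([[1], [2], [3]])

xb, yb = np.broadcast_arrays(x, y)                         # views
xc, yc = [np.array(a) for a in np.broadcast_arrays(x, y)]  # copies

# A broadcast view reuses the input's buffer; the repeated axis has stride 0.
print(np.shares_memory(xb, x), xb.strides)
# The copy has its own buffer, so writing to it cannot touch x.
print(np.shares_memory(xc, x), xc.strides)
```

A stride of 0 along an axis means every step along that axis lands on the same memory, which is why writing one element of xb appears to change a whole column.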
