I'm trying to join a number of 3D arrays that have the same size in the first two dimensions but different sizes in the third dimension. I'm using numpy.hstack().
import numpy as np
first = np.array([[[1,2], [3,4]],
[[5,6], [7,8]],
[[9,10],[11,12]]])
second = np.array([[[88],[88]],
[[88],[88]],
[[88],[88]]])
output = np.hstack((first,second))
print (output)
This results in an error:
Exception has occurred: ValueError
all the input array dimensions for the concatenation axis must match exactly, but along dimension 2, the array at index 0 has size 2 and the array at index 1 has size 1
Now, if I try this on two 2D arrays with a mismatched second dimension, np.hstack() has no trouble. For instance:
import numpy as np
first= np.array([[1,2],[3,4],[5,6]])
second= np.array([[88],[88],[88]])
output = np.hstack((first,second))
print (output)
outputs, as expected:
[[ 1 2 88]
[ 3 4 88]
[ 5 6 88]]
The result I'm going for with the 3D concatenation is:
[[[ 1 2 88],[ 3 4 88]]
[[ 5 6 88],[ 7 8 88]]
[[ 9 10 88],[ 11 12 88]]]
Am I going about it the right way? Is there an alternative? Thanks for your help.
np.concatenate is what you're looking for:
>>> import numpy as np
>>> first = np.arange(1, 13).reshape(3, 2, 2); first
array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7, 8]],
[[ 9, 10],
[11, 12]]])
>>> second = np.repeat(88, 6).reshape(3, 2, 1); second
array([[[88],
[88]],
[[88],
[88]],
[[88],
[88]]])
>>> np.concatenate((first, second), axis=2)
array([[[ 1, 2, 88],
[ 3, 4, 88]],
[[ 5, 6, 88],
[ 7, 8, 88]],
[[ 9, 10, 88],
[11, 12, 88]]])
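As a possible explanation for the original error, here is a minimal sketch: for inputs with two or more dimensions, np.hstack joins along axis 1, so every other axis (including the third) has to match. Joining along the last axis can be written with axis=-1 or with np.dstack; both should give the same result as the call above.

```python
import numpy as np

first = np.arange(1, 13).reshape(3, 2, 2)
second = np.full((3, 2, 1), 88)

# hstack joins along axis 1 for arrays with 2+ dimensions,
# so the mismatch on axis 2 (size 2 vs size 1) raises ValueError.
try:
    np.hstack((first, second))
    hstack_failed = False
except ValueError:
    hstack_failed = True

# Two equivalent ways to join along the last axis.
joined = np.concatenate((first, second), axis=-1)
stacked = np.dstack((first, second))

print(hstack_failed, joined.shape, np.array_equal(joined, stacked))
```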
DataReader = pd.read_csv('Quality.csv')
...
ip = [DataReader.x1, DataReader.x2, DataReader.x3, DataReader.x4,........., DataReader.x12,
DataReader.x13]
op = DataReader.y
ip = np.matrix(ip).transpose()
op = np.matrix(op).transpose()
Please help me solve the error below. I'm using Python 3.7 and NumPy 1.17.
Traceback (most recent call last):
File "Quality.py", line xx, in <module>
ip = np.matrix(ip).transpose()
File "\\defmatrix.py", line 147, in __new__
arr = N.array(data, dtype=dtype, copy=copy)
ValueError: cannot copy sequence with size 13 to array axis with dimension 200
You start with a dataframe with 200 rows and 14 columns. Its .values attribute (or to_numpy() method result) will then be a (200,14) shape array.
In:
ip = [DataReader.x1, DataReader.x2, ...DataReader.x13]
DataReader.x1 is a column, a pandas Series. Its .values is a 1d array, (200,) shape. I would expect np.array(ip) to be a (13,200) array.
If instead you'd done
ip = DataReader[['x1','x2',...,'x13']].values
the result would be a (200,13) array.
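The two shapes can be illustrated without pandas. This is only a sketch: the list of plain 1d arrays below is a made-up stand-in for the 13 columns of 200 values, not the asker's data.

```python
import numpy as np

# Hypothetical stand-in for 13 columns of 200 values each.
columns = [np.arange(200) + i for i in range(13)]

as_rows = np.array(columns)         # each column becomes a row
as_cols = np.column_stack(columns)  # each column stays a column

print(as_rows.shape, as_cols.shape)
```

Transposing the first result gives the same array as the second.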
With a simple dataframe, your code shouldn't produce an error:
In [61]: df = pd.DataFrame(np.arange(12).reshape(4,3), columns=['a','b','c'])
In [63]: df.values
Out[63]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
In [65]: ip = [df.a, df.b, df.c]
In [67]: np.array(ip)
Out[67]:
array([[ 0, 3, 6, 9],
[ 1, 4, 7, 10],
[ 2, 5, 8, 11]])
np.matrix(ip).transpose() works just as well (though there's no need to use np.matrix instead of np.array).
I can't reproduce your error. Making an array from certain mixes of shaped arrays produces an error like
ValueError: could not broadcast input array from shape (3) into shape (1)
or in other cases an object array (of Series).
====
For a single column, I'd expect a 1d array, which can be reshaped into a column if needed.
In [82]: df.a
Out[82]:
0 0
1 3
2 6
3 9
Name: a, dtype: int64
In [83]: df.a.values
Out[83]: array([0, 3, 6, 9])
In [84]: df.a.values[:,None]
Out[84]:
array([[0],
[3],
[6],
[9]])
I found two ways to determine how many elements are in a variable…
I always get the same values for len () and size (). Is there a difference? Could size () have come with an imported library (like math, numpy, pandas)?
asdf = range (10)
print ( 'len:', len (asdf), 'versus size:', size (asdf) )
asdf = list (range (10))
print ( 'len:', len (asdf), 'versus size:', size (asdf) )
asdf = np.array (range (10))
print ( 'len:', len (asdf), 'versus size:', size (asdf) )
asdf = tuple (range (10))
print ( 'len:', len (asdf), 'versus size:', size (asdf) )
size comes from numpy (on which pandas is based).
It gives you the total number of elements in the array. However, you can also query the sizes of specific axes with np.size (see below).
In contrast, len gives the length of the first dimension.
For example, let's create an array with 36 elements shaped into three dimensions.
In [1]: import numpy as np
In [2]: a = np.arange(36).reshape(2, 3, -1)
In [3]: a
Out[3]:
array([[[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]],
[[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]]])
In [4]: a.shape
Out[4]: (2, 3, 6)
size
size will give you the total number of elements.
In [5]: a.size
Out[5]: 36
len
len will give you the number of 'elements' of the first dimension.
In [6]: len(a)
Out[6]: 2
This is because, in this case, each 'element' stands for a 2-dimensional array.
In [14]: a[0]
Out[14]:
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]])
In [15]: a[1]
Out[15]:
array([[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])
These arrays, in turn, have their own shape and size.
In [16]: a[0].shape
Out[16]: (3, 6)
In [17]: len(a[0])
Out[17]: 3
np.size
You can use size more specifically with np.size.
For example you can reproduce len by specifying the first ('0') dimension.
In [11]: np.size(a, 0)
Out[11]: 2
And you can also query the sizes of the other dimensions.
In [10]: np.size(a, 1)
Out[10]: 3
In [12]: np.size(a, 2)
Out[12]: 6
Basically, you reproduce the values of shape.
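A short sketch that checks these relationships on the same array (the variable name per_axis is only for illustration):

```python
import numpy as np

a = np.arange(36).reshape(2, 3, -1)

# np.size(a, i) for every axis rebuilds the shape tuple.
per_axis = tuple(np.size(a, i) for i in range(a.ndim))

print(per_axis == a.shape)              # sizes per axis match shape
print(len(a) == a.shape[0])             # len is the first entry of shape
print(a.size == int(np.prod(a.shape)))  # size is the product of shape
```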
NumPy's ndarray has a size attribute:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.size.html
len, by contrast, is built into Python itself.
The main difference is that ndarray.size only reports the number of elements in an array, whilst Python's len can be used to get the length of objects in general.
Consider this example :
a = numpy.array([[1,2,3,4,5,6],[7,8,9,10,11,12]])
print(len(a))
#output is 2
print(numpy.size(a))
#output is 12
len() is a built-in function that returns the length of Python containers such as str, list, dict, etc., i.e. the number of elements. In the example above the array has length 2 because, as with a nested list, each row counts as one element.
numpy.size() returns the size of the array, which equals n_dim1 * n_dim2 * ... * n_dimn, i.e. the product of the array's dimensions. For example, an array of shape (5,5,2) has size 50, as it holds 50 elements. But len() will return 5, because the first dimension has 5 elements.
As for your question, len() and numpy.size() return the same value for 1-D arrays (and flat lists). However, the results differ for arrays with two or more dimensions. So to get the total number of elements, use numpy.size().
When you call numpy.size() on a list, tuple, or range, as in your example, the object is first converted to a NumPy array, and then the size of that array is returned.
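A small sketch of the difference on plain Python objects, with inputs chosen only for illustration:

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6]]
flat = range(10)

nested_len, nested_size = len(nested), np.size(nested)
flat_len, flat_size = len(flat), np.size(flat)

print(nested_len, nested_size)  # outer length vs total element count
print(flat_len, flat_size)      # identical for a flat sequence

# A scalar has no len(), but np.size treats it as a 0-d array.
scalar_size = np.size(5)
print(scalar_size)
```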
Suppose I have a 3D array arr. I want to iterate over arr in such a way that each iteration yields a vector along the z-axis. This can be done, but the solution is not general. If arr.shape and the axis along which the vectors are taken are not known in advance, or vary, there seems to be no straightforward way to do this. Can anyone provide a solution? What I have in mind is something like:
for line in np.nditer(arr, axis=2):
# Perform operation on line
arr = np.array(
[[[2, 2, 8, 8],
[6, 2, 1, 5],
[4, 5, 1, 4]],
[[7, 4, 7, 4],
[0, 0, 3, 3],
[7, 6, 8, 0]]]
)
Expected output:
[2 2 8 8]
[6 2 1 5]
[4 5 1 4]
[7 4 7 4]
[0 0 3 3]
[7 6 8 0]
In NumPy arrays, shape tells you the number of dimensions and the number of elements along each dimension. With your array we get:
print(arr.shape)
# (2,3,4)
# 3-D array
# along x-axis = 2 elements each
# along y-axis = 3 elements each
# along z-axis = 4 elements each
So, if I want to look at the elements along the z-axis for every x and y index, it looks like this:
for xid in range(arr.shape[0]):        # for each index along the x-axis
    for yid in range(arr.shape[1]):    # for each index along the y-axis
        print(arr[xid, yid, :])        # all elements along the z-axis
Writing the suggestions of @hpaulj into an answer.
moveaxis seems to be the right answer. However, apply_along_axis is intuitive and also very easy to use.
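The code for this answer is not preserved here, so the following is only a sketch of how the two functions might be used. The helper name iter_vectors is hypothetical, and the reshape may copy the data, so the yielded vectors should be treated as read-only.

```python
import numpy as np

def iter_vectors(arr, axis):
    """Yield the 1-D vectors of arr along `axis` (hypothetical helper)."""
    # Move the chosen axis to the end, then flatten all other axes.
    moved = np.moveaxis(arr, axis, -1)
    for vec in moved.reshape(-1, moved.shape[-1]):
        yield vec

arr = np.array([[[2, 2, 8, 8], [6, 2, 1, 5], [4, 5, 1, 4]],
                [[7, 4, 7, 4], [0, 0, 3, 3], [7, 6, 8, 0]]])

for line in iter_vectors(arr, axis=2):
    print(line)

# apply_along_axis calls a function on every 1-D slice along the axis.
sums = np.apply_along_axis(np.sum, 2, arr)
print(sums.shape)
```

The same helper should work for any number of dimensions, because only the position of the chosen axis matters.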
I have a NumPy array with two dimensions, for example:
a = np.array([[1,2,3,4,5],[4,6,5,8,9]])
I tried a = a[a[0]>2] but I got an error. I would like to obtain:
array([[3, 4, 5],
[5, 8, 9]])
Is it possible? Thanks!
Evaluate the options step by step:
In [75]: a = np.array([[1,2,3,4,5],[4,6,5,8,9]])
first row, a 1d array
In [76]: a[0]
Out[76]: array([1, 2, 3, 4, 5])
where that first row is >2, a 1d boolean array of same size
In [77]: a[0]>2
Out[77]: array([False, False, True, True, True])
Using that directly produces an error:
In [78]: a[a[0]>2]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-78-631a57b67cdb> in <module>()
----> 1 a[a[0]>2]
IndexError: boolean index did not match indexed array along dimension 0; dimension is 2 but corresponding boolean dimension is 5
The first dimension of a is 2, but the boolean index (mask) has size 5, which matches the 2nd dimension.
So we need to apply it to the 2nd dimension. 2d indexing syntax: x[i, j], x[:, j] to select all rows, but subset of columns:
In [79]: a[:,a[0]>2]
Out[79]:
array([[3, 4, 5],
[5, 8, 9]])
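As a sketch of the same idea, the mask can also be applied along an explicit axis with np.compress, which should give the same result as the indexing above:

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5], [4, 6, 5, 8, 9]])
mask = a[0] > 2  # shape (5,), so it matches axis 1 of a

by_index = a[:, mask]
by_compress = np.compress(mask, a, axis=1)

print(np.array_equal(by_index, by_compress))
```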
Hi I have 2 arrays of vectors:
A=np.array([[5,62,7],[5,62,7],[5,62,7]])
B=np.array([[1,2,3],[1,2,3],[1,2,3]])
and I would like to concatenate them like this:
C=[[[5,62,7], [1,2,3]],
[[5,62,7], [1,2,3]],
[[5,62,7], [1,2,3]]]
The newish stack makes this easy:
In [130]: A=np.array([[5,62,7],[5,62,7],[5,62,7]])
...: B=np.array([[1,2,3],[1,2,3],[1,2,3]])
...:
In [131]: np.stack((A,B), axis=1)
Out[131]:
array([[[ 5, 62, 7],
[ 1, 2, 3]],
[[ 5, 62, 7],
[ 1, 2, 3]],
[[ 5, 62, 7],
[ 1, 2, 3]]])
It adds an extra dimension to each of the arrays, and then concatenates. With axis=0 it behaves just like np.array.
np.array((A,B)).transpose(1,0,2)
joins them on a new 1st axis, and then moves it over.
hstack().reshape() to the rescue:
import numpy as np
A=np.array([[5,62,7],[5,62,7],[5,62,7]])
B=np.array([[1,2,3],[1,2,3],[1,2,3]])
c = np.hstack((A,B)).reshape(3,2,3)
print(c)
Output:
[[[ 5 62 7] [ 1 2 3]]
[[ 5 62 7] [ 1 2 3]]
[[ 5 62 7] [ 1 2 3]]]
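A quick sketch to check that the approaches in this thread agree. The reshape variant relies on each row of the hstack result holding a row of A followed by the matching row of B.

```python
import numpy as np

A = np.array([[5, 62, 7], [5, 62, 7], [5, 62, 7]])
B = np.array([[1, 2, 3], [1, 2, 3], [1, 2, 3]])

via_stack = np.stack((A, B), axis=1)
via_transpose = np.array((A, B)).transpose(1, 0, 2)
via_reshape = np.hstack((A, B)).reshape(3, 2, 3)

print(via_stack.shape)
print(np.array_equal(via_stack, via_transpose))
print(np.array_equal(via_stack, via_reshape))
```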