I found two ways to determine how many elements are in a variable…
I always get the same values for len() and size(). Is there a difference? Could size() have come with an imported library (like math, numpy, pandas)?
asdf = range (10)
print ( 'len:', len (asdf), 'versus size:', size (asdf) )
asdf = list (range (10))
print ( 'len:', len (asdf), 'versus size:', size (asdf) )
asdf = np.array (range (10))
print ( 'len:', len (asdf), 'versus size:', size (asdf) )
asdf = tuple (range (10))
print ( 'len:', len (asdf), 'versus size:', size (asdf) )
size comes from numpy (on which pandas is based).
It gives you the total number of elements in the array. However, you can also query the sizes of specific axes with np.size (see below).
In contrast, len gives the length of the first dimension.
For example, let's create an array with 36 elements shaped into three dimensions.
In [1]: import numpy as np
In [2]: a = np.arange(36).reshape(2, 3, -1)
In [3]: a
Out[3]:
array([[[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]],
[[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]]])
In [4]: a.shape
Out[4]: (2, 3, 6)
size
size will give you the total number of elements.
In [5]: a.size
Out[5]: 36
len
len will give you the number of 'elements' of the first dimension.
In [6]: len(a)
Out[6]: 2
This is because, in this case, each 'element' stands for a 2-dimensional array.
In [14]: a[0]
Out[14]:
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]])
In [15]: a[1]
Out[15]:
array([[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])
These arrays, in turn, have their own shape and size.
In [16]: a[0].shape
Out[16]: (3, 6)
In [17]: len(a[0])
Out[17]: 3
np.size
You can use size more specifically with np.size.
For example you can reproduce len by specifying the first ('0') dimension.
In [11]: np.size(a, 0)
Out[11]: 2
And you can also query the sizes of the other dimensions.
In [10]: np.size(a, 1)
Out[10]: 3
In [12]: np.size(a, 2)
Out[12]: 6
Basically, you reproduce the values of shape.
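The relationships above can be checked in a few lines. This is only a sketch that reuses the array a from the example; math.prod (Python 3.8+) is used here just to multiply the entries of shape.

```python
import math
import numpy as np

a = np.arange(36).reshape(2, 3, -1)

# size is the product of all entries of shape
assert a.size == math.prod(a.shape) == 36

# len only looks at the first axis
assert len(a) == a.shape[0] == 2

# np.size with an axis argument reproduces shape entry by entry
assert tuple(np.size(a, axis) for axis in range(a.ndim)) == a.shape
```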
NumPy's ndarray has a size attribute:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.size.html
len, on the other hand, comes from Python itself.
size comes from NumPy (ndarray.size).
The main difference is that ndarray.size only applies to arrays and counts all of their elements, whilst Python's len can be used to get the length of objects in general.
Consider this example :
a = numpy.array([[1,2,3,4,5,6],[7,8,9,10,11,12]])
print(len(a))
#output is 2
print(numpy.size(a))
#output is 12
len() is a built-in function that returns the length of Python containers such as str, list, dict, etc., i.e. the number of elements they hold. In the example above, the array has length 2 because it has two rows, and len counts each row as a single element, just as it would for a nested list.
numpy.size() returns the size of the array, which equals n_dim1 * n_dim2 * ... * n_dimn, i.e. the product of the array's dimensions. For example, an array of shape (5,5,2) has size 50, as it holds 50 elements. But len() will return 5, because the number of elements in the outermost list (the first dimension) is 5.
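As a quick illustration of the (5, 5, 2) case, here is a sketch using an array of zeros (the contents do not matter, only the shape):

```python
import numpy as np

b = np.zeros((5, 5, 2))

print(np.size(b))  # 50, i.e. 5 * 5 * 2
print(len(b))      # 5, the length of the first dimension only
```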
As for your question: len() and numpy.size() return the same value for 1-D arrays (and for flat lists, tuples and ranges), which is why all of your examples agree. The results differ for arrays with two or more dimensions, so if you want the total number of elements, use numpy.size().
When you call numpy.size() on a non-array sequence, as in your example, it is first converted to a numpy array, and then that array's size is returned.
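A small sketch of that conversion, using made-up inputs: np.size accepts plain Python sequences and even scalars, whereas len needs an object that defines a length.

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6]]

print(len(nested))          # 2: number of inner lists
print(np.size(nested))      # 6: the list is converted to a (2, 3) array first
print(np.size(range(10)))   # 10: same as len(range(10))
print(np.size(7))           # 1: a scalar becomes a 0-d array with one element

try:
    len(7)
except TypeError as exc:
    print("len(7) fails:", exc)
```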
I have two lists, A and B. List A contains values, and I want list B to hold the cumulative sums of the values in list A.
A = [3,5,7,8,9,12,13,20]
#Wanted result
#B = [3, 8, 15, 23,...77]
#so each new value is the sum of all the old values up to that position
# [x1, x2+x1, x3+x2+x1, ..., xn+...+x2+x1]
What methods could I use to get this result? Thank you.
The easiest way IMO would be to use numpy.cumsum, to get the cumulative sum of your list:
>>> import numpy as np
>>> np.cumsum(A)
array([ 3, 8, 15, 23, 32, 44, 57, 77])
But you also could do it in a list comprehension like this:
>>> [sum(A[0:x]) for x in range(1, len(A)+1)]
[3, 8, 15, 23, 32, 44, 57, 77]
Another fun way is to use itertools.accumulate, which gives accumulated sums by default:
>>> from itertools import accumulate
>>> list(accumulate(A))
[3, 8, 15, 23, 32, 44, 57, 77]
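For comparison, here is a sketch of the same running total written as a plain loop, with no imports. The function name running_total is made up for this example. Unlike the list comprehension above, which re-sums a slice for every position, it touches each element once.

```python
def running_total(values):
    """Return the cumulative sums of values as a list."""
    result = []
    total = 0
    for v in values:
        total += v          # add the current value to everything seen so far
        result.append(total)
    return result

A = [3, 5, 7, 8, 9, 12, 13, 20]
B = running_total(A)
print(B)  # [3, 8, 15, 23, 32, 44, 57, 77]
```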
I am using Python 3.6.3.
My code generates a random list of five values (type integer).
The values in this random list are compared to another list, which contains values to exclude from the generated list. That means, if a random number appears in the excluded_values list, a redraw must be performed to obtain a new value.
Below is simplified source code (you can copy and paste it; it runs as is):
import numpy as np
import random
excluded_values = [1,2,3,4,5,6,15,18,30] # values to exclude from the random list
print ("draw_numbers to exclude ",excluded_values,"\n")
loops = 1
for ln in range (0,loops):
    # Random list
    z=[]
    z=random.sample(range(1,30), 5) # no duplicates
    z.sort()
    print ("Draw numbers original list z ", z,"\n")
    # Standard deviation draw_numbers
    deviation = np.std(z)
    deviation=float(deviation)
    deviation= round (deviation,5)
    print ("draw_numbers loop number ",ln," ", z, " standard deviation ", deviation,"\n")
    control_list = []
    for i in range (0,5):
        draw_number = z[i]
        draw_number = str(draw_number)
        print ("Draw number z[",i,"] ",draw_number)
        print ("Original random list z ",z)
        if int(draw_number) in excluded_values:
            print ("Present in excluded_values")
            # trigger if draw_number must be excluded
            draw_number= random.randint(1,50)
            print ("------>>> REDRAW draw_number ", draw_number)
            draw_number = int (draw_number)
            control_list.append(draw_number)
            print ("Control list ", control_list," Original list z ",z,"\n")
        else:
            draw_number = int (draw_number)
            control_list.append(draw_number)
            print ("Control list ", control_list," z ",z,"\n")
        continue
    z = control_list
    continue
But one problem remains.
When a redraw occurs, the new draw number must be checked against the excluded_values list AND control_list, in order to get a final control_list that is completely clean, with no duplicates.
I did not find how to arrange this double check after trying many ways in my code; maybe I missed an argument or method, and I don't know how to manage this "circular" process.
Below is an example of running the code above:
draw_numbers to exclude [1, 2, 3, 4, 5, 6, 15, 18, 30]
Draw numbers original list z [4, 6, 9, 12, 18]
draw_numbers loop number 0 [4, 6, 9, 12, 18] standard deviation 4.91528
Draw number z[ 0 ] 4
Original random list z [4, 6, 9, 12, 18]
Present in excluded_values
------>>> REDRAW draw_number 45
Control list [45] Original list z [4, 6, 9, 12, 18]
Draw number z[ 1 ] 6
Original random list z [4, 6, 9, 12, 18]
Present in excluded_values
------>>> REDRAW draw_number 40
Control list [45, 40] Original list z [4, 6, 9, 12, 18]
Draw number z[ 2 ] 9
Original random list z [4, 6, 9, 12, 18]
Control list [45, 40, 9] z [4, 6, 9, 12, 18]
Draw number z[ 3 ] 12
Original random list z [4, 6, 9, 12, 18]
Control list [45, 40, 9, 12] z [4, 6, 9, 12, 18]
Draw number z[ 4 ] 18
Original random list z [4, 6, 9, 12, 18]
Present in excluded_values
------>>> REDRAW draw_number 14
Control list [45, 40, 9, 12, 14] Original list z [4, 6, 9, 12, 18]
Thank you for your help and your time.
Anyway, take care and stay safe at home.
Greetings from Paris, France :)
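Following the comments, here is a simpler solution:
from random import sample
k = 5
all_values = range(1, 30) # same range as the original draw in the question
excluded_values = [1,2,3,4,5,6,15,18,30]
# sample() needs a sequence: passing a set raises TypeError since Python 3.11
k_sampled = sample(sorted(set(all_values) - set(excluded_values)), k)
print(k_sampled) # e.g. [28, 10, 8, 11, 20]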
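If you would rather keep the redraw logic from the question, one possible sketch is to keep redrawing inside a while loop until the value is neither excluded nor already in control_list. The function name clean_draw and its low/high parameters are made up for this example; the 1 to 50 redraw range is taken from the question's randint(1, 50).

```python
import random

excluded_values = [1, 2, 3, 4, 5, 6, 15, 18, 30]

def clean_draw(z, excluded, low=1, high=50, rng=random):
    """Return a copy of z where excluded or duplicate values are redrawn."""
    excluded = set(excluded)
    control_list = []
    for value in z:
        # Keep redrawing until the value is neither excluded nor already kept
        while value in excluded or value in control_list:
            value = rng.randint(low, high)
        control_list.append(value)
    return control_list

z = sorted(random.sample(range(1, 30), 5))
control_list = clean_draw(z, excluded_values)
print(z, "->", control_list)
```

The loop always terminates here because far more than five allowed values exist between 1 and 50; with a much tighter range, sampling from the allowed values directly is the safer choice.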
DataReader = pd.read_csv('Quality.csv')
...
ip = [DataReader.x1, DataReader.x2, DataReader.x3, DataReader.x4,........., DataReader.x12,
DataReader.x13]
op = DataReader.y
ip = np.matrix(ip).transpose()
op = np.matrix(op).transpose()
Please help me solve the error below. I am using Python 3.7 and numpy 1.17.
Traceback (most recent call last):
File "Quality.py", line xx, in <module>
ip = np.matrix(ip).transpose()
File "\\defmatrix.py", line 147, in __new__
arr = N.array(data, dtype=dtype, copy=copy)
ValueError: cannot copy sequence with size 13 to array axis with dimension 200
You start with a dataframe with 200 rows and 14 columns. Its .values attribute (or to_numpy() method result) will then be a (200,14) shape array.
In:
ip = [DataReader.x1, DataReader.x2, ...DataReader.x13]
DataReader.x1 is a column, a pandas Series. Its .values is a 1d array, (200,) shape. I would expect np.array(ip) to be a (13,200) array.
If instead you'd done
ip = DataReader[['x1','x2',...,'x13']].values
the result would be a (200,13) array.
With a simple dataframe, your code shouldn't produce an error:
In [61]: df = pd.DataFrame(np.arange(12).reshape(4,3), columns=['a','b','c'])
In [63]: df.values
Out[63]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
In [65]: ip = [df.a, df.b, df.c]
In [67]: np.array(ip)
Out[67]:
array([[ 0, 3, 6, 9],
[ 1, 4, 7, 10],
[ 2, 5, 8, 11]])
np.matrix(ip).transpose() works just as well (though there's no need to use np.matrix instead of np.array).
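To make the two shapes concrete, here is a sketch with a made-up 200-row frame holding three columns named x1 to x3 (the real file and its 13 columns are not reproduced here):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(600).reshape(200, 3), columns=["x1", "x2", "x3"])

from_list = np.array([df.x1, df.x2, df.x3])     # one row per Series
from_frame = df[["x1", "x2", "x3"]].to_numpy()  # one column per Series

print(from_list.shape)   # (3, 200)
print(from_frame.shape)  # (200, 3)

# Transposing the list-based array gives the same values as the frame selection
assert (from_list.T == from_frame).all()
```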
I can't reproduce your error. Making an array from certain mixes of shaped arrays produces an error like
ValueError: could not broadcast input array from shape (3) into shape (1)
or in other cases an object array (of Series).
====
For a single column, I'd expect .values to give a 1d array, which you can reshape if needed.
In [82]: df.a
Out[82]:
0 0
1 3
2 6
3 9
Name: a, dtype: int64
In [83]: df.a.values
Out[83]: array([0, 3, 6, 9])
In [84]: df.a.values[:,None]
Out[84]:
array([[0],
[3],
[6],
[9]])
I have the following tensor :
ts = torch.tensor([[1,2,3],[4,6,7],[8,9,10]])
> tensor([[ 1, 2, 3],
[ 4, 6, 7],
[ 8, 9, 10]])
I am looking for a generic pytorch operation that adds all rows element-wise, like this:
ts2 = ts[0]+ts[1]+ts[2]
print(ts2)
> tensor([13, 17, 20])
In reality, the number of rows corresponds to the batch size, which varies.
You can sum over an axis/dimension like so:
torch.sum(ts, dim=0)
I have a 2D numpy array of lambda functions. Each function has 2 arguments and returns a float.
What's the best way to pass the same 2 arguments to all of these functions and get a numpy array of answers out?
I've tried something like:
np.reshape(np.fromiter((fn(1,2) for fn in np.nditer(J,order='K',flags=["refs_ok"])),dtype = float),J.shape)
to evaluate each function in J with arguments (1,2) ( J contains the functions).
But it seems very round the houses, and also doesn't quite work...
Is there a good way to do this?
A = J(1,2)
doesn't work!
You can use list comprehensions:
A = np.asarray([[f(1,2) for f in row] for row in J])
This should work for both numpy arrays and list of lists.
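If J can have more than two dimensions, a nested comprehension needs one more level per axis. A sketch that works for any shape, assuming J is a NumPy object array of callables (the four lambdas below are made up for the example), is to iterate over J.flat and restore the shape afterwards:

```python
import numpy as np

# Hypothetical 2x2 array of two-argument functions
J = np.array([[lambda x, y: x + y, lambda x, y: x * y],
              [lambda x, y: x - y, lambda x, y: x / y]])

# Evaluate every function with the same arguments, then restore J's shape
A = np.array([f(1, 2) for f in J.flat], dtype=float).reshape(J.shape)
print(A)  # values: [[3, 2], [-1, 0.5]]
```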
I don't think there is a really clean way, but this is reasonably clean and works:
import operator
import numpy as np
# create array of lambdas
a = np.array([[lambda x, y, i=i, j=j: x**i + y**j for i in range(4)] for j in range(4)])
# apply arguments 2 and 3 to all of them
np.vectorize(operator.methodcaller('__call__', 2, 3))(a)
# array([[ 2, 3, 5, 9],
# [ 4, 5, 7, 11],
# [10, 11, 13, 17],
# [28, 29, 31, 35]])
Alternatively, and slightly more flexible:
from types import FunctionType
np.vectorize(FunctionType.__call__)(a, 2, 3)
# array([[ 2, 3, 5, 9],
# [ 4, 5, 7, 11],
# [10, 11, 13, 17],
# [28, 29, 31, 35]])