Large rounding errors in Python plots

I am trying to plot the following simple sequence
$$a_n=\frac{3^n+1}{7^n+8}$$
which should tend to 0, but the plot shows a strange effect for values of $n$ near 20.
I use the code
import numpy as np
import matplotlib.pyplot as plt
def f(n):
    return (3**n + 1) / (7**n + 8)
n = np.arange(0, 25, 1)
plt.plot(n, f(n), 'bo-')
On the other hand, evaluating the sequence term by term in plain Python does not produce such large values:
for i in range(0, 25):
    print([i, f(i)])
[0, 0.2222222222222222]
[1, 0.26666666666666666]
[2, 0.17543859649122806]
[3, 0.07977207977207977]
[4, 0.034039020340390205]
[5, 0.014510853404698185]
[6, 0.0062044757218015075]
[7, 0.0026567874970706124]
[8, 0.0011382857610720493]
[9, 0.00048778777316480816]
[10, 0.00020904485804220367]
[11, 8.958964415487241e-05]
[12, 3.8395417418579486e-05]
[13, 1.6455158259653074e-05]
[14, 7.05220773432529e-06]
[15, 3.022374322043928e-06]
[16, 1.295303220696569e-06]
[17, 5.551299431298911e-07]
[18, 2.3791283154177113e-07]
[19, 1.0196264191387531e-07]
[20, 4.3698275080881505e-08]
[21, 1.872783217393992e-08]
[22, 8.026213788319863e-09]
[23, 3.439805909206865e-09]
[24, 1.4742025325067883e-09]
Why is this happening?

The issue is not with matplotlib but with the dtype of the array that arange produces. You do not specify a dtype, and the docs for arange state that it is then inferred from the inputs. Your inputs are integers, so you get an array of fixed-width integers (32-bit on my machine), and 3**n and 7**n silently overflow and wrap around once they no longer fit. Plain Python integers have arbitrary precision, which is why your loop over range prints the correct values. Checking the type confirms this:
print(type(n[0]))
<class 'numpy.int32'>
If I change the dtype to single precision floats, we get the behavior you expect:
n = np.arange(0,25,1, dtype=np.float32)
print(type(n[0]))
<class 'numpy.float32'>
plt.plot(n,f(n),'bo-')
Alternatively, you could write the step as 1. instead of 1, which makes arange return double-precision floats (even though the resulting array holds whole numbers: [0., 1., 2., ...]).
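To see the overflow directly, here is a minimal sketch that evaluates the same f on an integer array and on a float array and compares both with exact Python-integer arithmetic. The dtypes are pinned explicitly (np.int32 and np.float64) so the result does not depend on the platform default; the names exact, n_int, n_float, bad, good and first_bad are mine, not from the question.

```python
import numpy as np

def f(n):
    return (3**n + 1) / (7**n + 8)

# Reference values: Python ints have arbitrary precision, so nothing overflows.
exact = np.array([f(i) for i in range(25)])

# Fixed-width integers: 7**n no longer fits in 32 bits from n = 12 onwards
# and silently wraps around.
n_int = np.arange(0, 25, 1, dtype=np.int32)
bad = f(n_int)

# Floats: 7**24 is about 1.9e20, well within float64 range.
n_float = np.arange(0, 25, 1, dtype=np.float64)
good = f(n_float)

print(np.allclose(good, exact))  # do the float results match the exact ones?
print(np.allclose(bad, exact))   # do the wrapped-integer results match?
first_bad = int(np.argmax(~np.isclose(bad, exact)))
print(first_bad)                 # index of the first term that disagrees
```

With 64-bit integers the same wrap-around happens, only later, so an explicit float dtype is the safer choice whichever platform you are on.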

Related

Extract lower off diagonal elements from numpy array

I have the array below:
import numpy as np
a = np.array([[7412, 33, 2],
              [2, 7304, 83],
              [3, 101, 7237]])
I would like to extract only the lower off-diagonal elements (those strictly below the main diagonal) from this array and put them in a vector.
I tried np.extract(~a, a), but it extracts all the elements.
The desired output for the example above is [2, 3, 101].
Any insight would be helpful.
You can use np.tril_indices or np.tri:
import numpy as np
a = np.array([[7412, 33, 2],
              [2, 7304, 83],
              [3, 101, 7237]])
n, m = a.shape
# Option 1
out = a[np.tril_indices(n=n, k=-1, m=m)]
# Option 2 (should have equivalent output)
out = a[np.tri(N=n, M=m, k=-1, dtype=bool)]
out:
array([  2,   3, 101])
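As a quick check, the sketch below runs both options on the example array and keeps the results in separate variables so they can be compared; the names idx_out and mask_out are only for illustration.

```python
import numpy as np

a = np.array([[7412, 33, 2],
              [2, 7304, 83],
              [3, 101, 7237]])
n, m = a.shape

# Option 1: integer indices of the strictly lower triangle (k=-1 skips the diagonal).
idx_out = a[np.tril_indices(n=n, k=-1, m=m)]

# Option 2: boolean mask that is True strictly below the diagonal.
mask_out = a[np.tri(N=n, M=m, k=-1, dtype=bool)]

print(idx_out)
print(mask_out)
print(np.array_equal(idx_out, mask_out))
```

Both variants walk the array row by row, so the elements come out in the same order.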

Add a number to all the elements in a matrix except the diagonal elements

I want to add a constant to all the elements of a matrix except the diagonal elements.
E.g., matrix = np.array([[1, 2, 3],
                         [4, 5, 6],
                         [7, 8, 9]])
Desired output: adding 10 to all the elements except the diagonal elements gives
matrix = np.array([[1, 12, 13],
                   [14, 5, 16],
                   [17, 18, 9]])
How can I exclude the diagonal elements from this operation?
I would use an identity matrix multiplied by the number you add, and subtract it, like this:
import numpy as np
x = 10  # number to add
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
# add x everywhere, then take it back off the diagonal (dtype=int keeps the result integer)
matrix2 = matrix + x - np.identity(len(matrix), dtype=int) * x
print(matrix2)
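Another way to express the same idea, sketched here under the assumption that the matrix is square, is to build a boolean mask of the off-diagonal positions and add the constant only there. This works on a copy and keeps the integer dtype; the names off_diag and result are hypothetical.

```python
import numpy as np

x = 10  # number to add
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# True everywhere except on the main diagonal.
off_diag = ~np.eye(len(matrix), dtype=bool)

result = matrix.copy()
result[off_diag] += x  # only the masked (off-diagonal) entries change
print(result)
```

The mask can be reused for other element-wise updates that should leave the diagonal alone.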

What does the ordering/index of cluster_centers_ represent in KMeans clustering SKlearn

I have implemented the following code
k_mean = KMeans(n_clusters=5,init=centroids,n_init=1,random_state=SEED).fit(X_input)
k_mean.cluster_centers_.shape
>>
(5, 50)
I have 5 clusters in the data.
How are the clusters ordered? Do the indices of the cluster centres correspond to the labels?
In other words, does the cluster centre at index 0 correspond to label 0?
The docs have a similar example:
>>> from sklearn.cluster import KMeans
>>> import numpy as np
>>> X = np.array([[1, 2], [1, 4], [1, 0],
...               [10, 2], [10, 4], [10, 0]])
>>> kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
>>> kmeans.labels_
array([1, 1, 1, 0, 0, 0], dtype=int32)
>>> kmeans.predict([[0, 0], [12, 3]])
array([1, 0], dtype=int32)
>>> kmeans.cluster_centers_
array([[10.,  2.],
       [ 1.,  2.]])
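Yes, the indices are ordered: row i of cluster_centers_ is the centre of the cluster with label i, so the centre at index 0 belongs to label 0. By the way, k_mean.cluster_centers_.shape only returns the shape of the array, not the values. So in your case you have 5 clusters, and your features have dimension 50.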
To get the nearest point, you can have a look here.
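One way to convince yourself of the index/label correspondence, sketched on the toy data from the docs example: predict should assign each row of cluster_centers_ its own row index, and each row should equal the mean of the points carrying that label. n_init=10 is set explicitly only so the sketch does not depend on the library default, which, as far as I know, has changed between scikit-learn versions; centre_labels and members are names of my own.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]], dtype=float)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Each centre is closest to itself, so predict returns its row index.
centre_labels = kmeans.predict(kmeans.cluster_centers_)
print(centre_labels)

# Row k of cluster_centers_ is the mean of the points labelled k.
for k in range(kmeans.n_clusters):
    members = X[kmeans.labels_ == k]
    print(k, kmeans.cluster_centers_[k], members.mean(axis=0))
```

Which of the two groups ends up as label 0 depends on the initialisation, but the row index and the label always refer to the same cluster.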

Difference between len and size

I found two ways to determine how many elements are in a variable…
I always get the same values for len() and size(). Is there a difference? Could size() have come with an imported library (like math, numpy, pandas)?
asdf = range(10)
print('len:', len(asdf), 'versus size:', size(asdf))
asdf = list(range(10))
print('len:', len(asdf), 'versus size:', size(asdf))
asdf = np.array(range(10))
print('len:', len(asdf), 'versus size:', size(asdf))
asdf = tuple(range(10))
print('len:', len(asdf), 'versus size:', size(asdf))
size comes from numpy (on which pandas is based).
It gives you the total number of elements in the array. However, you can also query the sizes of specific axes with np.size (see below).
In contrast, len gives the length of the first dimension.
For example, let's create an array with 36 elements shaped into three dimensions.
In [1]: import numpy as np
In [2]: a = np.arange(36).reshape(2, 3, -1)
In [3]: a
Out[3]:
array([[[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11],
        [12, 13, 14, 15, 16, 17]],

       [[18, 19, 20, 21, 22, 23],
        [24, 25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34, 35]]])
In [4]: a.shape
Out[4]: (2, 3, 6)
size
size will give you the total number of elements.
In [5]: a.size
Out[5]: 36
len
len will give you the number of 'elements' of the first dimension.
In [6]: len(a)
Out[6]: 2
This is because, in this case, each 'element' stands for a 2-dimensional array.
In [14]: a[0]
Out[14]:
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17]])
In [15]: a[1]
Out[15]:
array([[18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])
These arrays, in turn, have their own shape and size.
In [16]: a[0].shape
Out[16]: (3, 6)
In [17]: len(a[0])
Out[17]: 3
np.size
You can use size more specifically with np.size.
For example you can reproduce len by specifying the first ('0') dimension.
In [11]: np.size(a, 0)
Out[11]: 2
And you can also query the sizes of the other dimensions.
In [10]: np.size(a, 1)
Out[10]: 3
In [12]: np.size(a, 2)
Out[12]: 6
Basically, you reproduce the values of shape.
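Putting the three together, a short sketch on the same array: a.size is the product of the entries of a.shape, len(a) is the first entry of a.shape, and np.size(a, axis) picks out a single entry. The use of math.prod is my own choice for the check and needs Python 3.8 or later.

```python
import math
import numpy as np

a = np.arange(36).reshape(2, 3, -1)

print(a.shape)
print(a.size == math.prod(a.shape))   # total number of elements
print(len(a) == a.shape[0])           # length of the first dimension
axis_sizes = [np.size(a, axis) for axis in range(a.ndim)]
print(axis_sizes)                     # one entry per axis, same as a.shape
```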
NumPy's ndarray has a size attribute:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.size.html
len, on the other hand, is part of Python itself.
The main difference is that ndarray.size only applies to arrays, whilst Python's len can be used to get the length of objects in general.
Consider this example:
import numpy
a = numpy.array([[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]])
print(len(a))
# output is 2
print(numpy.size(a))
# output is 12
len() is a built-in function that returns the length of Python containers such as str, list, dict, etc., i.e. the number of elements they hold. In the example above the array has length 2 because its first dimension has two entries, each of which is a row of six numbers, just as a nested list of two lists has length 2.
numpy.size() returns the size of the array, which is the product of its dimensions, n_dim1 * n_dim2 * ... * n_dimN. For example, an array of shape (5, 5, 2) has size 50, as it holds 50 elements, but len() returns 5, the length of the first dimension.
To answer your question: len() and numpy.size() return the same value for 1-D arrays (and for flat lists), but the results differ for arrays with two or more dimensions. So to get the total number of elements, use numpy.size().
When you call numpy.size() on a non-array container, as in your example, it is first converted to a numpy array, and the size of that array is returned.
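A small sketch of that behaviour on the containers from the question, plus a nested list to show where the two diverge (the variable nested is my own addition):

```python
import numpy as np

# For flat containers, len and np.size agree.
for obj in (range(10), list(range(10)), tuple(range(10)), np.array(range(10))):
    print(type(obj).__name__, len(obj), np.size(obj))

# For nested data they differ: len counts the outer items,
# np.size counts every element after conversion to an array.
nested = [[1, 2, 3], [4, 5, 6]]
print(len(nested), np.size(nested))
```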

Pass argument to array of functions

I have a 2D numpy array of lambda functions. Each function has 2 arguments and returns a float.
What's the best way to pass the same 2 arguments to all of these functions and get a numpy array of answers out?
I've tried something like:
np.reshape(np.fromiter((fn(1,2) for fn in np.nditer(J,order='K',flags=["refs_ok"])),dtype = float),J.shape)
to evaluate each function in J with arguments (1, 2) (J contains the functions).
But it seems very round the houses, and also doesn't quite work...
Is there a good way to do this? Simply calling
A = J(1,2)
doesn't work!
You can use list comprehensions:
A = np.asarray([[f(1,2) for f in row] for row in J])
This should work for both numpy arrays and lists of lists.
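For a self-contained illustration, here is that comprehension applied to a small hypothetical J (the four lambdas are made up for the example). The second form uses J.flat, which iterates over the stored function objects themselves, whereas np.nditer yields zero-dimensional arrays, which may be why the attempt in the question fails.

```python
import numpy as np

# Hypothetical 2x2 array of two-argument functions.
J = np.array([[lambda x, y: x + y, lambda x, y: x * y],
              [lambda x, y: x - y, lambda x, y: x / y]], dtype=object)

# Nested list comprehension, row by row.
A = np.asarray([[f(1, 2) for f in row] for row in J], dtype=float)

# Same result via a flat iterator, reshaped back to J's shape.
B = np.fromiter((f(1, 2) for f in J.flat), dtype=float, count=J.size).reshape(J.shape)

print(A)
print(np.array_equal(A, B))
```

Either way every function is still called once in Python, so neither form is faster than a plain loop; the gain is readability.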
I don't think there is a really clean way, but this is reasonably clean and works:
import operator
import numpy as np
# create array of lambdas
a = np.array([[lambda x, y, i=i, j=j: x**i + y**j for i in range(4)] for j in range(4)])
# apply arguments 2 and 3 to all of them
np.vectorize(operator.methodcaller('__call__', 2, 3))(a)
# array([[ 2,  3,  5,  9],
#        [ 4,  5,  7, 11],
#        [10, 11, 13, 17],
#        [28, 29, 31, 35]])
Alternatively, and slightly more flexible:
from types import FunctionType
np.vectorize(FunctionType.__call__)(a, 2, 3)
# array([[ 2,  3,  5,  9],
#        [ 4,  5,  7, 11],
#        [10, 11, 13, 17],
#        [28, 29, 31, 35]])
