Accuracy problems in estimating pi using Machin's method - python-3.x

I need to estimate pi to 100 decimal places using Machin's method, which is as follows: pi = 4(4·arctan(1/5) - arctan(1/239)).
This formula is known to converge to pi quickly, with sources citing accuracy of 72 places after the decimal point in 50 iterations/terms.
Using Machin's formula I can only achieve accuracy up to 15 places after the decimal point, and I cannot figure out why.
I have written a function for the Taylor series of arctan(x), and I use that function inside another function that applies the formula above. I have also tried setting higher precision using the decimal module.
##This is the function for the taylor series of arctan(x)
def arctan(first_term, terms):
    k = 0
    array = []
    if k < 1:
        x = (((-1)**k)*(first_term)**((2*k)+1))/((2*k)+1)
        k = 1
        array.append(x)
    if k > 0:
        while k < terms:
            x = x + (((-1)**k)*(first_term)**((2*k)+1))/((2*k)+1)
            k += 1
            array.append(x)
    return array[-1]

##Here is the function for Machin's formula
def machinpi(first_term, first_term2, terms):
    x = 4*(arctan(first_term, terms))-(arctan(first_term2, terms))
    return x*4
Machin is famous for estimating pi to 100 decimal places by hand. I am trying to figure out how many terms of the series are required to achieve this accuracy. However, I cannot find the answer if I cannot first converge to 100 decimal places of pi. Using Machin's formula I expect to converge to 72 places after the decimal point in 50 iterations.

Okay, I have figured this problem out. Calling the function I wrote as machinpi(Decimal(1/5), Decimal(1/239), terms) is not equal to machinpi(Decimal(1)/Decimal(5), Decimal(1)/Decimal(239), terms), which has the accuracy I expect. The reason is that in Decimal(1/5), the division 1/5 is performed first in 64-bit binary floating point, so the Decimal is constructed from an already-rounded value; Decimal(1)/Decimal(5) performs the division in decimal arithmetic at the context's precision.
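The difference between the two constructions can be seen directly (a minimal sketch; the precision setting is arbitrary):

```python
from decimal import Decimal, getcontext

getcontext().prec = 110  # 100 digits plus a few guard digits

# Decimal(1/5) wraps the 64-bit float 0.2, so it inherits binary rounding error
from_float = Decimal(1 / 5)
# Decimal(1)/Decimal(5) divides exactly in decimal arithmetic
exact = Decimal(1) / Decimal(5)

print(from_float)  # 0.200000000000000011102230246251565404236316680908203125
print(exact)       # 0.2
```

That tiny error in the 17th place of each input term is what caps the result at about 15 correct decimal places, no matter how many series terms are summed.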

Related

Given an exponential probability density function, how to generate random values using the random generator in Excel?

Based on a set of experiments, a probability density function (PDF) for an exponentially distributed variable was generated. Now the goal is to use this function in a Monte Carlo simulation. I am vaguely familiar with PDFs and random number generators, especially for normal and log-normal distributions. However, I am not quite able to figure this out. It would be great if someone could help.
Here's the function:
f = γ/(2R) · exp(-γl/(2R)) · (1 - exp(-γ))^(-1) · H(2R - l)
where:
f is the probability density function,
1/γ is the mean of the distribution,
R is a known fixed variable,
H is the Heaviside step function,
l is the variable that is exponentially distributed.
Well, I don't know how to do it in Excel, but using the inverse-transform method it is easy to get the answer (assuming there is a RANDOM() function which returns uniform numbers in the [0...1] range):
l = -(2R/γ)*LOG(1 - RANDOM()*(1-EXP(-γ)))
It is easy to check the boundary values:
if RANDOM()=0, then l = 0
if RANDOM()=1, then l = 2R
UPDATE
So there is a PDF
PDF(l|R,γ) = γ/(2R) · exp(-lγ/(2R))/(1 - exp(-γ)), for l in the range [0...2R]
First, check that it is normalized
∫ PDF(l|R,γ) dl from 0 to 2R = 1
Ok, it is normalized
Then compute CDF(l|R,γ)
CDF(l|R,γ) = ∫ PDF(l'|R,γ) dl' from 0 to l = (1 - exp(-lγ/(2R)))/(1 - exp(-γ))
Check again, CDF(l=2R|R,γ) = 1, good.
Now set CDF(l|R,γ) = RANDOM(), solve it with respect to l, and you get your sampling expression. Check it at RANDOM() returning 0 or 1; you should get the end points of the l interval.
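The same recipe translated to Python rather than Excel (a sketch; sample_l is an illustrative name and the R, gamma values are arbitrary):

```python
import math
import random

def sample_l(R, gamma, u=None):
    """Draw one sample of l on [0, 2R] by inverting the CDF above."""
    if u is None:
        u = random.random()  # uniform on [0, 1)
    return -(2 * R / gamma) * math.log(1 - u * (1 - math.exp(-gamma)))

R, gamma = 1.0, 0.5  # assumed illustrative values
print(sample_l(R, gamma, u=0.0))  # boundary check: 0
print(sample_l(R, gamma, u=1.0))  # boundary check: 2R, up to rounding
print(0.0 <= sample_l(R, gamma) <= 2 * R)  # a random draw stays in range: True
```

Passing u explicitly makes the two endpoint checks from the answer reproducible.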

Is there a faster way to calculate the distance between elements in the same matrix with a Gaussian function?

Starting from a matrix M of shape 7000 x 2, I calculate a Gaussian similarity between every pair of rows, normalized over each row. I do it in the following way (the variance sigma is arbitrary):
W = np.zeros((M.shape[0], M.shape[0]))
elements_sum_by_i = np.zeros((M.shape[0]))
for i in range(0, M.shape[0]):
    #normalization
    for k in range(0, M.shape[0]):
        elements_sum_by_i[k] = math.exp(-(np.linalg.norm(M[i,:] - M[k,:])**2)/(2*sigma**2))
    sum_by_i = sum(elements_sum_by_i)
    #calculation
    for j in range(0, M.shape[0]):
        W[i,j] = (math.exp(-(np.linalg.norm(M[i,:] - M[j,:]))**2/(2*sigma**2)))/(sum_by_i)
The problem is that it is really very slow (takes about 30 minutes). Is there a faster way to do this calculation?
Maybe you can extract some ideas from the following comments:
1) Calculate Log(W[i,j]) with the simplifications of the formula; the exponentials disappear and the processing should be quicker.
2) Take the exponential of it: Exp(Log(W[i,j])) == W[i,j].
3) Use variables for values that are constant inside the iterations, like two_sigma_sq = 2*sigma**2, which you can compute once before the loops.
Important: before any change, save the current result, so that your new implementation can be tested against the final matrix that you already know, I suppose, is correct.
Good luck.
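Beyond those micro-optimizations, the double loop can be vectorized entirely in NumPy using the identity ||a - b||² = ||a||² + ||b||² - 2·a·b. This is a sketch, not the asker's code; gaussian_affinity is an illustrative name:

```python
import numpy as np

def gaussian_affinity(M, sigma):
    # squared pairwise distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = np.sum(M ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (M @ M.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negatives from rounding
    K = np.exp(-sq_dists / (2.0 * sigma ** 2))
    return K / K.sum(axis=1, keepdims=True)  # row normalization, like sum_by_i

rng = np.random.default_rng(0)
M = rng.standard_normal((200, 2))
W = gaussian_affinity(M, sigma=1.0)
print(np.allclose(W.sum(axis=1), 1.0))  # True: rows are normalized
```

For a 7000 x 2 input this allocates only 7000 x 7000 float64 arrays (on the order of 400 MB each), and the whole computation should take seconds rather than minutes.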

Why does my if function not return the proper value?

I'm quite new at Python programming so forgive me if it seems like a stupid question. This is my code with the given results:
Code:
def Stopping_Voltage(Frequency, Phi):
    x = (4.14E-15) * Frequency ##The value of (h/e) multiplied by frequency
    y = Phi / (1.602E-19) ##The value of Phi/e
    Energy = x * (1.602E-19)
    print(Energy)
    print(Phi)
    print(x)
    print(y)
    String = 'No electron is emitted'
    if Energy > Phi:
        Voltage = x - y
        return(Voltage)
    else:
        return(String)

Stopping_Voltage(10, (6.63228E-33))
Result:
6.632280000000001e-33
6.63228e-33
4.1400000000000005e-14
4.14e-14
6.310887241768095e-30
What we're asked to do is: if the energy is less than or equal to phi, return the string. Testing it with the given variables should therefore return the string, but it still gives me a quantitative result. I initially tried using "else" rather than "elif", but it still gave me the same thing (if that matters). When I printed the values for Energy and Phi, the energy value has a lot of zeroes after the decimal point (with a 1 following all the zeroes). How do I fix this to give me the string?
Your code is fine! It does return the string if Energy is <= Phi. It's just that your Energy in this particular example really is bigger than your Phi, by a tiny floating-point rounding error :) The e is scientific notation, meaning 10^exponent, e.g. 2e-5 equals 2*10^-5, so 6.632280000000001e-33 is slightly larger than 6.63228e-33. You can check it by adding print(Energy > Phi) before the if-else block, which will print either True or False.
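Plugging in the two values printed by the question's own code makes this concrete (the literals below are copied from that output):

```python
Energy = 6.632280000000001e-33  # value printed for Energy
Phi = 6.63228e-33               # value printed for Phi

print(Energy > Phi)   # True: Energy is slightly larger due to float rounding
print(Energy <= Phi)  # False, so the else branch is never taken
```

Comparing floats for near-equality (e.g. with math.isclose) is the usual fix when two quantities are mathematically equal but computed along different paths.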

How to calculate the standard deviation from a histogram? (Python, Matplotlib)

Let's say I have a data set and used matplotlib to draw a histogram of said data set.
n, bins, patches = plt.hist(data, normed=1)
How do I calculate the standard deviation, using the n and bins values that hist() returns? I'm currently doing this to calculate the mean:
s = 0
for i in range(len(n)):
    s += n[i] * ((bins[i] + bins[i+1]) / 2)
mean = s / numpy.sum(n)
which seems to work fine as I get pretty accurate results. However, if I try to calculate the standard deviation like this:
t = 0
for i in range(len(n)):
    t += (bins[i] - mean)**2
std = np.sqrt(t / numpy.sum(n))
my results are way off from what numpy.std(data) returns. Replacing the left bin limits with the central point of each bin doesn't change this either. I have the feeling that the problem is that the n and bins values don't actually contain any information on how the individual data points are distributed within each bin, but the assignment I'm working on clearly demands that I use them to calculate the standard deviation.
You haven't weighted the contribution of each bin with n[i]. Change the increment of t to
t += n[i]*(bins[i] - mean)**2
By the way, you can simplify (and speed up) your calculation by using numpy.average with the weights argument.
Here's an example. First, generate some data to work with. We'll compute the sample mean, variance and standard deviation of the input before computing the histogram.
In [54]: x = np.random.normal(loc=10, scale=2, size=1000)
In [55]: x.mean()
Out[55]: 9.9760798903061847
In [56]: x.var()
Out[56]: 3.7673459904902025
In [57]: x.std()
Out[57]: 1.9409652213499866
I'll use numpy.histogram to compute the histogram:
In [58]: n, bins = np.histogram(x)
mids is the midpoints of the bins; it has the same length as n:
In [59]: mids = 0.5*(bins[1:] + bins[:-1])
The estimate of the mean is the weighted average of mids:
In [60]: mean = np.average(mids, weights=n)
In [61]: mean
Out[61]: 9.9763028267760312
In this case, it is pretty close to the mean of the original data.
The estimated variance is the weighted average of the squared difference from the mean:
In [62]: var = np.average((mids - mean)**2, weights=n)
In [63]: var
Out[63]: 3.8715035807387328
In [64]: np.sqrt(var)
Out[64]: 1.9676136767004677
That estimate is within 2% of the actual sample standard deviation.
The following answer is equivalent to Warren Weckesser's, but may be more familiar to those who prefer to think of the mean as an expected value over a probability distribution:
counts, bins = np.histogram(x)
mids = 0.5*(bins[1:] + bins[:-1])
probs = counts / np.sum(counts)
mean = np.sum(probs * mids)
sd = np.sqrt(np.sum(probs * (mids - mean)**2))
Do note that in certain contexts you may want the unbiased sample variance, where the sum of squared deviations is normalized by N-1 rather than N.
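As a sketch of that remark (variable names are illustrative), the two normalizations compare like this:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=10, scale=2, size=1000)
counts, bins = np.histogram(x)
mids = 0.5 * (bins[1:] + bins[:-1])

N = counts.sum()
mean = np.sum(counts * mids) / N
ss = np.sum(counts * (mids - mean) ** 2)  # weighted sum of squared deviations
var_biased = ss / N          # maximum-likelihood estimate, divides by N
var_unbiased = ss / (N - 1)  # Bessel's correction, divides by N - 1
print(var_biased < var_unbiased)  # True: the unbiased estimate is slightly larger
```

For N = 1000 the two differ by a factor of 1000/999, so in practice the choice rarely matters for large samples.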

Statistical Analysis Error? python 3 proof read please

The code below generates two random integers within range specified by argv, tests if the integers match and starts again. At the end it prints some stats about the process.
I've noticed though that increasing the value of argv reduces the percentage of tested possibilities exponentially.
This seems counter intuitive to me so my question is, is this an error in the code or are the numbers real and if so then what am I not thinking about?
#!/usr/bin/python3
import sys
import random

x = int(sys.argv[1])
a = random.randint(0,x)
b = random.randint(0,x)
steps = 1
combos = x**2
while a != b:
    a = random.randint(0,x)
    b = random.randint(0,x)
    steps += 1
percent = (steps / combos) * 100
print()
print()
print('[{} ! {}]'.format(a,b), end=' ')
print('equality!'.upper())
print('steps'.upper(), steps)
print('possble combinations = {}'.format(combos))
print('explored {}% possibilitys'.format(percent))
Thanks
EDIT
For example:
./runscrypt.py 100000
will return something like:
[65697 ! 65697] EQUALITY!
STEPS 115867
possble combinations = 10000000000
explored 0.00115867% possibilitys
"explored 0.00115867% possibilitys" <-- This number is too low?
This experiment really follows a geometric distribution.
I.e., let Y be the random variable counting the number of iterations before a match is seen. Then Y is geometrically distributed with parameter p = 1/x, the probability of generating two matching integers. (Strictly, random.randint(0, x) has x+1 possible outcomes, so p = 1/(x+1); 1/x is the large-x approximation used here.)
The expected value is E[Y] = 1/p (a standard property of the geometric distribution). So in your case the expected number of iterations is 1/(1/x) = x.
The number of combinations is x^2.
So the expected percentage of explored possibilities is really x/(x^2) = 1/x.
As x approaches infinity, this number approaches 0.
In the case of x=100000, the expected percentage of explored possibilities = 1/100000 = 0.001% which is very close to your numerical result.
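A quick simulation supports this (a sketch; steps_until_match is an illustrative helper, and the exact expectation is x + 1 because randint(0, x) has x + 1 outcomes):

```python
import random

def steps_until_match(x):
    """Count draws of a pair from randint(0, x) until the two values match."""
    steps = 1
    while random.randint(0, x) != random.randint(0, x):
        steps += 1
    return steps

random.seed(123)
x = 1000
trials = 2000
avg = sum(steps_until_match(x) for _ in range(trials)) / trials
# the average number of steps should be close to x + 1
print(0.8 < avg / (x + 1) < 1.2)  # True
```

Dividing the average step count by x**2 reproduces the vanishing percentages the question reports: roughly 1/x, not a bug.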
