I am ultimately trying to use Horner's rule (http://mathworld.wolfram.com/HornersRule.html) to evaluate polynomials, so I wrote a function that does the evaluation. My problem is with how I wrote the function: it works for simple polynomials like 3x^2 + 2x^1 + 5, but once the evaluation involves very small floating-point values (something like 1.8953343e-20) it loses its precision.
Because I am using this function to find roots of a polynomial with Newton's method (http://tutorial.math.lamar.edu/Classes/CalcI/NewtonsMethod.aspx), I need it to be precise, so the result isn't thrown off by small rounding errors.
I have already confirmed with two other people that the problem lies in the evaluatePoly() function and not in the functions that implement Newton's method. Also, when I originally evaluated the polynomial the straightforward way (raising x to each degree and multiplying by the coefficient), it produced the correct answer. However, the assignment requires Horner's rule for more efficient calculation.
This is my code:
def evaluatePoly(poly, x_):
    """Evaluates the polynomial at x = x_ and returns the result as a floating
    point number using Horner's rule"""
    # http://mathworld.wolfram.com/HornersRule.html
    total = 0.
    polyRev = poly[::-1]
    for nn in polyRev:
        total = total * x_
        total = total + nn
    return total
Note: I have already tried converting nn, x_, and (total * x_) to floats using float().
This is the output I am receiving:
Polynomial: 5040x^0 + 1602x^1 + 1127x^2 - 214x^3 - 75x^4 + 4x^5 + 1x^6
Derivative: 1602x^0 + 2254x^1 - 642x^2 - 300x^3 + 20x^4 + 6x^5
(6.9027369297630505, False)
Starting at 100.00, no root found, last estimate was 6.90, giving value f(6.90) = -6.366463e-12
(6.9027369297630505, False)
Starting at 10.00, no root found, last estimate was 6.90, giving value f(6.90) = -6.366463e-12
(-2.6575456505038764, False)
Starting at 0.00, no root found, last estimate was -2.66, giving value f(-2.66) = 8.839758e+03
(-8.106973924480215, False)
Starting at -10.00, no root found, last estimate was -8.11, giving value f(-8.11) = -1.364242e-11
(-8.106973924480215, False)
Starting at -100.00, no root found, last estimate was -8.11, giving value f(-8.11) = -1.364242e-11
This is the output I need:
Polynomial: 5040x^0 + 1602x^1 + 1127x^2 - 214x^3 - 75x^4 + 4x^5 + 1x^6
Derivative: 1602x^0 + 2254x^1 - 642x^2 - 300x^3 + 20x^4 + 6x^5
(6.9027369297630505, False)
Starting at 100.00, no root found, last estimate was 6.90,giving value f(6.90) = -2.91038e-11
(6.9027369297630505, False)
Starting at 10.00, no root found, last estimate was 6.90,giving value f(6.90) = -2.91038e-11
(-2.657545650503874, False)
Starting at 0.00, no root found, last estimate was -2.66,giving value f(-2.66) = 8.83976e+03
(-8.106973924480215, True)
Starting at -10.00, root found at x = -8.11, giving value f(-8.11)= 0.000000e+00
(-8.106973924480215, True)
Starting at -100.00, root found at x = -8.11, giving value f(-8.11)= 0.000000e+00
Note: please ignore the tuples in the output above; they are the result of my Newton's method function, where the first element is the root estimate and the second indicates whether it is actually a root.
Try this:
def evaluatePoly(poly, x_):
    '''Evaluate the polynomial poly at x = x_ and return the result as a
    floating-point number (plain power-sum evaluation, not Horner's rule)'''
    total = 0
    degree = 0
    for coef in poly:
        total += (x_**degree) * coef
        degree += 1
    return total
Evaluation of a polynomial close to a root requires, by construction of the problem, that large terms cancel to yield a small result. The size of the intermediate terms can be bounded in a worst-case sense by the evaluation of the polynomial that has all its coefficients set to the absolute values of the coefficients of the original polynomial. For the first root that gives
In [6]: x0=6.9027369297630505
In [7]: evaluatePoly(poly,x0)
Out[7]: -6.366462912410498e-12
In [8]: evaluatePoly([abs(c) for c in poly],abs(x0))
Out[8]: 481315.82997756737
This value is a first estimate of the magnification factor of the floating-point errors. Multiplied by the machine epsilon 2.22e-16, it gives a bound of about 1.07e-10 on the accumulated floating-point error of any evaluation method, and indeed both evaluation methods give values comfortably below this bound, indicating that the root was found to within the capabilities of the floating-point format.
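As a quick check of that bound (a sketch using the poly list and x0 from the session above, and NumPy's value of the machine epsilon):

    import numpy as np

    # Worst-case estimate of the accumulated rounding error: evaluate the
    # polynomial with absolute-value coefficients and scale by machine epsilon.
    bound = evaluatePoly([abs(c) for c in poly], abs(x0)) * np.finfo(float).eps
    print(bound)   # roughly 1.07e-10, so a residual of a few 1e-12 is effectively zero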
Looking at the graph of the evaluation around x0, one sees that the basic assumption of a smooth curve fails at that magnification; the x-axis is crossed in a jump, so no better value for x0 can be found.
I want my sigmoid to never print a solid 1 or 0, but to actually print the exact value.
I tried using
torch.set_printoptions(precision=20)
but it didn't work. Here's a sample output of the sigmoid function:
before sigmoid : tensor([[21.2955703735]])
after sigmoid : tensor([[1.]])
But I don't want it to print 1; I want it to print the exact number. How can I force this?
The difference between 1 and the exact value of sigmoid(21.2955703735) is on the order of 5e-10, which is significantly less than machine epsilon for float32 (which is about 1.19e-7). Therefore 1.0 is the best approximation that can be achieved with the default precision. You can cast your tensor to a float64 (AKA double precision) tensor to get a more precise estimate.
import torch

torch.set_printoptions(precision=20)
x = torch.tensor([21.2955703735])
result = torch.sigmoid(x.to(dtype=torch.float64))
print(result)
which results in
tensor([0.99999999943577644324], dtype=torch.float64)
Keep in mind that even with 64-bit floating-point computation this is only accurate to about 6 digits past the last 9 (and will be even less precise for larger sigmoid inputs). A better way to represent numbers very close to one is to directly compute the difference between 1 and the value, in this case 1 - sigmoid(x), which is equivalent to 1 / (1 + exp(x)), i.e. sigmoid(-x). For example,
x = torch.tensor([21.2955703735])
delta = torch.sigmoid(-x.to(dtype=torch.float64))
print(f'sigmoid({x.item()}) = 1 - {delta.item()}')
results in
sigmoid(21.295570373535156) = 1 - 5.642236648842976e-10
and is a more accurate representation of your desired result (though still not exact).
I have a custom (discrete) probability distribution defined roughly in the form f(x) / (sum of f(x') for x' in a given discrete set X), with 0 <= x <= 1.
I have been trying to implement it in Python 3.8.2, and the problem is that the numerator and denominator both come out to be really small, so Python's floating-point representation just turns them into 0.0.
After calculating these probabilities, I need to sample a random element from an array, whose each index may be selected with the corresponding probability in the distribution. So if my distribution is [p1,p2,p3,p4], and my array is [a1,a2,a3,a4], then probability of selecting a2 is p2 and so on.
So how can I implement this in an elegant and efficient way?
Is there any way I could use the np.random.beta() in this case? Since the difference between the beta distribution and my actual distribution is only that the normalization constant differs and the domain is restricted to a few points.
Note: The Probability Mass function defined above is actually in the form given by the Bayes theorem and f(x)=x^s*(1-x)^f, where s and f are fixed numbers for a given iteration. So the exact problem is that, when s or f become really large, this thing goes to 0.
You could well compute things by working with logs. The point is that while both the numerator and denominator might underflow to 0, their logs won't unless your numbers are really astonishingly small.
You say
f(x) = x^s * (1-x)^t
(writing t for the second exponent, to avoid a clash with the function name f), so
logf(x) = s*log(x) + t*log(1-x)
and you want to compute, say
p = f(x) / Sum{ y in X | f(y) }
so
p = exp( logf(x) - log Sum{ y in X | f(y) } )
  = exp( logf(x) - log Sum{ y in X | exp(logf(y)) } )
The only difficulty is in computing the second term, but this is a common problem, for example here
On the other hand, computing logsumexp is easy enough to do by hand.
We want
S = log( sum{ i | exp(l[i])})
if L is the maximum of the l[i] then
S = log( exp(L)*sum{ i | exp(l[i]-L)})
= L + log( sum{ i | exp( l[i]-L)})
The last sum can be computed as written, because each term is now between 0 and 1 so there is no danger of overflow, and one of the terms (the one for which l[i]==L) is 1, and so if other terms underflow, that is harmless.
This may however lose a little accuracy. A refinement would be to recognize the set A of indices where
l[i] >= L - eps   (eps a user-set parameter, e.g. 1)
And then compute
N = Sum{ i in A | exp(l[i]-L)}
B = log1p( Sum{ i not in A | exp(l[i]-L)}/N)
S = L + log( N) + B
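A minimal sketch of this approach (assuming the support X is a 1-D NumPy array of points strictly between 0 and 1, and s, t are the fixed exponents from the question; scipy.special.logsumexp could replace the hand-rolled shift):

    import numpy as np

    def sample_from_custom(X, s, t, size=1):
        # log f(x) = s*log(x) + t*log(1-x), computed entirely in log space
        logf = s * np.log(X) + t * np.log1p(-X)
        L = logf.max()                                  # shift by the maximum
        log_norm = L + np.log(np.exp(logf - L).sum())   # log of the normalizing sum
        p = np.exp(logf - log_norm)                     # normalized probabilities, no underflow
        return np.random.choice(X, size=size, p=p)

    # e.g. X = np.linspace(0.01, 0.99, 50) with s = t = 2000 underflows the naive
    # x**s * (1-x)**t, but samples fine here.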
For reference, I'm using this page. I understand the original pagerank equation
but I'm failing to understand why the sparse-matrix implementation is correct. Below is their code reproduced:
def compute_PageRank(G, beta=0.85, epsilon=10**-4):
    '''
    Efficient computation of the PageRank values using a sparse adjacency
    matrix and the iterative power method.

    Parameters
    ----------
    G : boolean adjacency matrix. np.bool8
        If the element j,i is True, means that there is a link from i to j.
    beta: 1-teleportation probability.
    epsilon: stop condition. Minimum allowed amount of change in the PageRanks
        between iterations.

    Returns
    -------
    output : tuple
        PageRank array normalized top one.
        Number of iterations.
    '''
    #Test adjacency matrix is OK
    n,_ = G.shape
    assert(G.shape==(n,n))
    #Constants Speed-UP
    deg_out_beta = G.sum(axis=0).T/beta #vector
    #Initialize
    ranks = np.ones((n,1))/n #vector
    time = 0
    flag = True
    while flag:
        time +=1
        with np.errstate(divide='ignore'): # Ignore division by 0 on ranks/deg_out_beta
            new_ranks = G.dot((ranks/deg_out_beta)) #vector
        #Leaked PageRank
        new_ranks += (1-new_ranks.sum())/n
        #Stop condition
        if np.linalg.norm(ranks-new_ranks,ord=1)<=epsilon:
            flag = False
        ranks = new_ranks
    return(ranks, time)
To start, I'm trying to trace the code and understand how it relates to the PageRank equation. For the line under the with statement (new_ranks = G.dot((ranks/deg_out_beta))), this looks like the first part of the equation (the beta times M) BUT it seems to be ignoring all divide by zeros. I'm confused by this because the PageRank algorithm requires us to replace zero columns with ones (except along the diagonal). I'm not sure how this is accounted for here.
The next line new_ranks += (1-new_ranks.sum())/n is what I presume to be the second part of the equation. I can understand what this does, but I can't see how this translates to the original equation. I would've thought we would do something like new_ranks += (1-beta)*ranks.sum()/n.
This works because the row sums satisfy
e.T * M * r = e.T * r
by the column-sum construction of M. The convex combination with coefficient beta then has the effect that the sum over the new r vector is again 1. What the algorithm does is take the first matrix-vector product b = beta*M*r and then find a constant c so that r_new = b + c*e has row sum one. In theory this is the same as what the formula says, but in floating-point practice this approach corrects for and prevents accumulation of floating-point error in the sum of r.
Computing it this way also allows the zero columns to be ignored, as the compensation for them is computed automatically.
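As a tiny, made-up check of that compensation step (a 3-node column-stochastic M, not taken from the question), a minimal sketch:

    import numpy as np

    beta = 0.85
    M = np.array([[0. , 0.5, 1. ],      # made-up column-stochastic link matrix
                  [0.5, 0. , 0. ],
                  [0.5, 0.5, 0. ]])
    r = np.ones((3, 1)) / 3
    b = beta * M.dot(r)                  # first part of the update: beta*M*r
    c = (1 - b.sum()) / 3                # leaked PageRank, here exactly (1-beta)/n
    r_new = b + c
    print(b.sum(), c, r_new.sum())       # approx. 0.85, 0.05, 1.0

The same constant c absorbs both the (1-beta) teleportation mass and any rank lost through zero columns, which is why the code above never has to modify those columns explicitly.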
I'm attempting to solve the differential equation:
m(t) = M(x)x'' + C(x, x') + B x'
where x and x' are vectors with 2 entries representing the angles and angular velocity in a dynamical system. M(x) is a 2x2 matrix that is a function of the components of theta, C is a 2x1 vector that is a function of theta and theta' and B is a 2x2 matrix of constants. m(t) is a 2*1001 array containing the torques applied to each of the two joints at the 1001 time steps and I would like to calculate the evolution of the angles as a function of those 1001 time steps.
I've transformed it to standard form such that :
x'' = M(x)^-1 (m(t) - C(x, x') - B x')
Then substituting y_1 = x and y_2 = x' gives the first order linear system of equations:
y_2 = y_1'
y_2' = M(y_1)^-1 (m(t) - C(y_1, y_2) - B y_2)
(I've used theta and phi in my code for x and y)
def joint_angles(theta_array, t, torques, B):
    phi_1 = np.array([theta_array[0], theta_array[1]])
    phi_2 = np.array([theta_array[2], theta_array[3]])

    def M_func(phi):
        M = np.array([[a_1+2.*a_2*np.cos(phi[1]), a_3+a_2*np.cos(phi[1])],
                      [a_3+a_2*np.cos(phi[1]), a_3]])
        return np.linalg.inv(M)

    def C_func(phi, phi_dot):
        return a_2 * np.sin(phi[1]) * np.array([-phi_dot[1] * (2. * phi_dot[0] + phi_dot[1]), phi_dot[0]**2])

    dphi_2dt = M_func(phi_1) @ (torques[:, t] - C_func(phi_1, phi_2) - B @ phi_2)
    return dphi_2dt, phi_2
t = np.linspace(0,1,1001)
initial = theta_init[0], theta_init[1], dtheta_init[0], dtheta_init[1]
x = odeint(joint_angles, initial, t, args = (torque_array, B))
I get the error that I cannot index into torques using the t array, which makes perfect sense; however, I am not sure how to make it use the current value of the torques at each time step.
I also tried putting the odeint call in a for loop, evaluating only one time step at a time and using the solution as the initial conditions for the next loop, but the function simply returned the initial conditions, so every loop was identical. This leads me to suspect I've made a mistake in my implementation of the standard form, but I can't work out what it is. It would be preferable, however, not to have to call the odeint solver in a for loop, and rather do it all in one go.
If helpful, my initial conditions and constant values are:
theta_init = np.array([10*np.pi/180, 143.54*np.pi/180])
dtheta_init = np.array([0, 0])
L_1 = 0.3
L_2 = 0.33
I_1 = 0.025
I_2 = 0.045
M_1 = 1.4
M_2 = 1.0
D_2 = 0.16
a_1 = I_1+I_2+M_2*(L_1**2)
a_2 = M_2*L_1*D_2
a_3 = I_2
Thanks for helping!
The solver uses an internal step size that is adapted to the problem. The given time list is only a list of points at which the internal solution gets interpolated to produce output samples. The internal and external time lists are not related in any way; the internal list depends only on the given tolerances.
There is no actual natural relation between array indices and sample times.
The translation of a given time into an index and construction of a sample value from the surrounding table entries is called interpolation (by a piecewise polynomial function).
Torque, as a physical phenomenon, is at least continuous, and a piecewise-linear interpolation is the easiest way to transform the given table of function values into an actual continuous function. Of course one also needs the corresponding time array.
So use numpy.interp (or the more advanced routines of scipy.interpolate, such as interp1d) to define a torque function that can be evaluated at the arbitrary times demanded by the solver and its integration method.
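A minimal sketch of this, assuming torque_array has shape (2, 1001) with samples taken at t = np.linspace(0, 1, 1001) as in the question:

    import numpy as np
    from scipy.interpolate import interp1d

    # Build a continuous (piecewise-linear) torque function from the sample table.
    t_samples = np.linspace(0, 1, 1001)
    torque_func = interp1d(t_samples, torque_array, axis=1, fill_value="extrapolate")

    # Inside the right-hand-side function the solver's current time t can then be
    # used directly, e.g.
    #     m_t = torque_func(t)   # shape (2,), interpolated torque at time t
    # instead of indexing with torques[:, t].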
So the question I'm having problems with is:
The mathematical constant π (pi) is an irrational number with value approximately 3.1415926... The precise value of π is equal to the following infinite sum: π = 4/1 - 4/3 + 4/5 - 4/7 + 4/9 - 4/11 + ... We can get a good approximation of π by computing the sum of the first few terms. Write a function approxPi(error) that takes as a parameter a floating-point value error and approximates the constant π within error by computing the above sum, term by term, until the absolute value of the difference between the current sum and the previous sum (with one fewer term) is no greater than error. Once the function finds that the difference is less than error, it should return the new sum.
Example of correct input: approxPi(0.01) should return 3.1465677471829556
My code:
def approxPi(error):
    i = 5 #counter for list appending
    lst = [4, (-4/3)] #list values will be appended to
    currentSum = sum(lst) #calcs sum of the list
    previousSum = sum(lst[:len(lst)-1]) #calcs sum of list with one less term
    while abs(currentSum - previousSum) > error: #while loop for error value
        if len(lst) % 2 == 0: #alternates appending positive or negative value
            lst.append((4/i)) #appends to list
        else:
            lst.append((-4/i))
        i += 2 #increments counter
    return currentSum #returns sum of the whole list once error value has been reached
When I run the code I get stuck in the loop until memory usage reaches 100% and my system locks up. Any tips on what I am doing wrong? This is a homework problem so please don't just post an answer.