I want my Python functions to accept numpy ndarrays - python-3.x

I'm writing a script to plot a 3D representation of some thermodynamic property, Z = f(pr, Tr). pr and Tr are created with [numpy.]arange() and then they are mapped with [numpy.]meshgrid() the following way:
Tr = arange(1.0, 2.6, 0.10)
pr = arange(0.5, 9.0, 0.25)
Xpr, YTr = meshgrid(pr, Tr)
Xpr and YTr are then passed to the function that calculates the aforementioned property:
Z = function(Xpr, YTr)
("function" is just a generic name that is later replaced by the actual function name).
The values stored in Z are finally plotted:
fig = plt.figure(1, figsize=(7, 6))
ax = fig.add_subplot(projection='3d')
surf = ax.plot_surface(Xpr, YTr, Z, cmap=cm.jet, linewidth=0, antialiased=False)
Everything works fine when "function" is something quite straightforward like:
def zshell(pr_, Tr_):
A = -0.101 - 0.36*Tr_ + 1.3868*sqrt(Tr_ - 0.919)
B = 0.021 + 0.04275/(Tr_ - 0.65)
E = 0.6222 - 0.224*Tr_
F = 0.0657/(Tr_ - 0.85) - 0.037
G = 0.32*exp(-19.53*(Tr_ - 1.0))
D = 0.122*exp(-11.3*(Tr_ - 1.0))
C = pr_*(E + F*pr_ + G*pr_**4)
z = A + B*pr_ + (1.0 - A)*exp(-C) - D*(pr_/10.0)**4
return z
But it fails when the function is something like this:
def zvdw(pr_, Tr_):
A = 0.421875*pr_/Tr_**2 # 0.421875 = 27.0/64.0
B = 0.125*pr_/Tr_ # 0.125 = 1.0/8.0
z = 9.5e-01
erro = 1.0
while erro >= 1.0e-06:
c2 = -(B + 1.0)
c1 = A
c0 = -A*B
f = z**3 + c2*z**2 + c1*z + c0
df = 3.0e0*z**2 + 2.0e0*c2*z + c1
zf = z - f/df
erro = abs((zf - z)/z)
z = zf
return z
I strongly suspect that the failure is caused by the iterative method inside function zvdw(pr_, Tr_) (zvdw has been previously tested and works perfectly well when float arguments are passed to it). That is the error message I get:
Traceback (most recent call last):
File "/home/fausto/.../TestMesh.py", line 81, in <module>
Z = zvdw(Xpr, YTr)
File "/home/fausto/.../TestMesh.py", line 63, in zvdw
while erro >= 1.0e-06:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Yet the error message doesn't seem (to me) to be directly related to the while statement.
Any ideas?

Tl;dr
There is a numpy functionality for that. You just replace
def zvdw(pr_, Tr_):
by
#np.vectorize
def zvdw(pr_, Tr_):
And it works.
Get it faster
Unfortunately the resulting picture looked ugly since your mesh is to sparse. I replaced your step size in TR and pr. Unfortunately here we run into the limitation of np.vectorize. From the numpy documentation
The vectorize function is provided primarily for convenience, not for
performance. The implementation is essentially a for loop.
I.e. it is painfully slow. Already at 0.0010 and 0.0025 it took 17.5 seconds. So it is not realistic to go a factor of 10 each smaller since that would take ~100 times longer. Fortunately your code is simple enough that I can use #numba.vectorize which on my machine was a factor of ~23 faster.
Notice that this only works for some python code. Numba compiles the python code to llvm code so it can run fast. But it is unable to do that for arbitrary python code. See https://numba.pydata.org/numba-doc/dev/user/vectorize.html
And here is your picture
Code seems not to work with np.vectorize/numba.vectorize
That is very odd. Here is a literal copy paste of code that works on my machine:
import numpy as np
import matplotlib.pyplot as plt
Tr = np.arange(1.0, 2.6, 0.10)
pr = np.arange(0.5, 9.0, 0.25)
Xpr, YTr = np.meshgrid(pr, Tr)
def zshell(pr_, Tr_):
A = -0.101 - 0.36*Tr_ + 1.3868*sqrt(Tr_ - 0.919)
B = 0.021 + 0.04275/(Tr_ - 0.65)
E = 0.6222 - 0.224*Tr_
F = 0.0657/(Tr_ - 0.85) - 0.037
G = 0.32*exp(-19.53*(Tr_ - 1.0))
D = 0.122*exp(-11.3*(Tr_ - 1.0))
C = pr_*(E + F*pr_ + G*pr_**4)
z = A + B*pr_ + (1.0 - A)*exp(-C) - D*(pr_/10.0)**4
return z
#np.vectorize
def zvdw(pr_, Tr_):
A = 0.421875*pr_/Tr_**2 # 0.421875 = 27.0/64.0
B = 0.125*pr_/Tr_ # 0.125 = 1.0/8.0
z = 9.5e-01
erro = 1.0
while erro >= 1.0e-06:
c2 = -(B + 1.0)
c1 = A
c0 = -A*B
f = z**3 + c2*z**2 + c1*z + c0
df = 3.0e0*z**2 + 2.0e0*c2*z + c1
zf = z - f/df
erro = abs((zf - z)/z)
z = zf
return z
Z = zvdw(Xpr, YTr)
fig = plt.figure(1, figsize=(7, 6))
ax = fig.add_subplot(projection='3d')
surf = ax.plot_surface(Xpr, YTr, Z, cmap=plt.cm.jet, linewidth=0, antialiased=False)

Related

Why Sympy does not solve my nonlinear system? Python interpreter remains in execution until I kill the process

I need to solve a nonlinear system of equations in Python using Sympy.
For this, I wrote the code below. However, when I run this code the Python remains busy without returning any error message and, additionally, does not return the solution.
For comparison, I did the same work in Matlab and within a few seconds, the program returns two solutions for this system.
How, using Sympy, I can solve the system?
Regards.
import sympy as sym
import numpy as np
# Variables of the system
S, V, E, A, I, R = sym.symbols('S, V, E, A, I, R')
# Parameters of the system
N = sym.Symbol("N", positive = True)
mi = sym.Symbol("mi", positive = True)
v = sym.Symbol("v", positive = True)
epsilon = sym.Symbol("epsilon", positive = True)
alpha = sym.Symbol("alpha", positive = True)
gamma_as = sym.Symbol("gamma_as", positive = True)
gamma_s = sym.Symbol("gamma_s", positive = True)
gamma_a = sym.Symbol("gamma_a", positive = True)
lamb = sym.Symbol("lamb", positive = True)
tau = sym.Symbol("tau", positive = True)
beta = sym.Symbol("beta", positive = True)
x = sym.Symbol("x")
# Declaration of the system equations
system = [mi*N - v*S + R - (beta*(A+I)/N)*S - mi*S,\
v*S - (1-epsilon)*(beta*(A+I)/N)*V - mi*V,\
(beta*(A+I)/N)*S + (1-epsilon)*(beta*(A+I)/N)*V - sym.exp(-mi*tau)*(beta*(A+I)/N)*S - mi*E,\
alpha*sym.exp(-mi*tau)*(beta*(A+I)/N)*S - (gamma_as + gamma_a + mi)*A,\
(1-alpha)*sym.exp(-mi*tau)*(beta*(A+I)/N)*S + gamma_as*A - (gamma_s + mi)*I,\
gamma_a*A + gamma_s*I - (1+mi)*R]
# Solution
solution_set = sym.nonlinsolve(system, [S, V, E, A, I, R])
pyS, pyV, pyE, pyA, pyI, pyR = solution_set[0]
````
SymPy generally solves a system of polynomial equations like this using Groebner bases. To compute the Groebner basis SymPy needs to identify each of the equations as a polynomial in the given unknowns with coefficients in a computable field (a "domain"). Your coefficients involve both mi and exp(-mi*tau) which SymPy's construct_domain doesn't like so it gives up constructing a computable domain and uses the "EX" domain instead which is very slow.
The solution then is to replace exp(mi*tau) with another symbol (I'll just use tau) and then compute the Groebner basis explicitly yourself:
In [103]: rep = {exp(-mi*tau):tau}
In [104]: system2 = [eq.subs(rep) for eq in system]
In [105]: for eq in system2: pprint(eq)
S⋅β⋅(A + I)
N⋅mi + R - S⋅mi - S⋅v - ───────────
N
V⋅β⋅(1 - ε)⋅(A + I)
S⋅v - V⋅mi - ───────────────────
N
S⋅β⋅τ⋅(A + I) S⋅β⋅(A + I) V⋅β⋅(1 - ε)⋅(A + I)
-E⋅mi - ───────────── + ─────────── + ───────────────────
N N N
S⋅α⋅β⋅τ⋅(A + I)
-A⋅(γₐ + γₐₛ + mi) + ───────────────
N
S⋅β⋅τ⋅(1 - α)⋅(A + I)
A⋅γₐₛ - I⋅(γₛ + mi) + ─────────────────────
N
A⋅γₐ + I⋅γₛ - R⋅(mi + 1)
Now we could use solve or nonlinsolve but it's faster to compute and solve the Groebner basis ourselves:
In [106]: %time gb = groebner(system2, [S, V, E, A, I, R])
CPU times: user 3min 1s, sys: 100 ms, total: 3min 1s
Wall time: 3min 1s
The Groebner basis puts the system of equations into an almost solved form known as a rational univariate representation (RUR). In this case it looks like
S - a*R
V - b*R
E - c*R
A - d*R
I - e*R
R**2 + f*R + g
where the coefficients a, b, c, d, e, f, g are complicated rational functions of the symbolic parameters in the equations (alpha, beta etc). From here we can follow these steps to solve the Groebner basis:
Solve the first 5 linear equations for S, V, E, A and I in terms of R.
Solve the final quadratic equation for R giving two solutions R1 and R2.
Substitute the the solutions for R1 and R2 into the solutions for S, V, E, A and I.
Put it all together as two solution tuples.
That is:
In [115]: syms = [S, V, E, A, I, R]
In [116]: [lsol] = linsolve(gb[:-1], syms[:-1])
In [117]: R1, R2 = roots(gb[-1], R)
In [118]: sol1 = lsol.subs(R, R1) + (R1,)
In [119]: sol2 = lsol.subs(R, R2) + (R2,)
Now we have the two solution tuples in the form that would have been returned by nonlinsolve. Unfortunately the solutions are quite complicated so I won't show them in full. You can get some idea of the complexity by seeing the length of their string representations:
In [122]: print(len(str(sol1)))
794100
In [123]: print(len(str(sol2)))
27850
Now at this point it's worth considering what you actually wanted these solutions for. Maybe it's just that you wanted to substitute some explicit numeric values in for the symbolic parameters. It's worth noting here that potentially it would have been more efficient in the first place to substitute those values into the equations before attempting to solve them symbolically. If you want to see how your solutions depend on some particular parameters say just mi then you can substitute values for everything else and obtain a simpler form of the solution involving only that parameter more quickly:
In [136]: rep = {alpha:1, beta:2, epsilon:3, gamma_as:4, gamma_s:5, gamma_a:6, exp(-mi*tau):7, N:8, v
...: :9}
In [137]: system2 = [eq.subs(rep) for eq in system]
In [138]: %time solve(system2, syms)
CPU times: user 3.92 s, sys: 4 ms, total: 3.92 s
Wall time: 3.92 s
Out[138]:
⎡ ⎛ ⎛ 2
⎢⎛ 8⋅mi 72 ⎞ ⎜4⋅(mi + 5)⋅(mi + 10) 36⋅(mi + 5)⋅(mi + 10)⋅(mi + 12)⋅⎝mi + 4⋅mi
⎢⎜──────, ──────, 0, 0, 0, 0⎟, ⎜────────────────────, ─────────────────────────────────────────────
⎢⎝mi + 9 mi + 9 ⎠ ⎜ 7⋅(mi + 9) ⎛ 4 3 2
⎣ ⎝ 7⋅(mi + 9)⋅⎝3⋅mi + 38⋅mi + 161⋅mi + 718⋅mi
⎞ ⎛ 2 ⎞ ⎛ 3 2 ⎞
- 25⎠ 24⋅(mi + 1)⋅(mi + 5)⋅(mi + 10)⋅⎝mi + mi + 50⎠⋅⎝3⋅mi + 41⋅mi + 209⋅mi + 787⎠ -4⋅(mi + 1
───────, ──────────────────────────────────────────────────────────────────────────────, ──────────
⎞ ⎛ 2 ⎞ ⎛ 4 3 2 ⎞
+ 900⎠ 7⋅(mi + 12)⋅⎝mi + 4⋅mi - 25⎠⋅⎝3⋅mi + 38⋅mi + 161⋅mi + 718⋅mi + 900⎠ (mi +
⎛ 2 ⎞ ⎛ 2 ⎞ ⎛ 2 ⎞ ⎞⎤
)⋅(mi + 5)⋅⎝mi + mi + 50⎠ -16⋅(mi + 1)⋅⎝mi + mi + 50⎠ -8⋅(3⋅mi + 25)⋅⎝mi + mi + 50⎠ ⎟⎥
───────────────────────────, ─────────────────────────────, ───────────────────────────────⎟⎥
⎛ 2 ⎞ ⎛ 2 ⎞ ⎛ 2 ⎞ ⎟⎥
12)⋅⎝mi + 4⋅mi - 25⎠ (mi + 12)⋅⎝mi + 4⋅mi - 25⎠ (mi + 12)⋅⎝mi + 4⋅mi - 25⎠ ⎠⎦
If you substitute values for all parameters then it's a lot faster:
In [139]: rep = {alpha:1, beta:2, epsilon:3, gamma_as:4, gamma_s:5, gamma_a:6, exp(-mi*tau):7, N:8, v
...: :9, mi:10}
In [140]: system2 = [eq.subs(rep) for eq in system]
In [141]: %time solve(system2, syms)
CPU times: user 112 ms, sys: 0 ns, total: 112 ms
Wall time: 111 ms
Out[141]:
⎡⎛1200 124200 5224320 -960 -256 -640 ⎞ ⎛80 72 ⎞⎤
⎢⎜────, ──────, ───────, ─────, ─────, ─────⎟, ⎜──, ──, 0, 0, 0, 0⎟⎥
⎣⎝133 55727 67459 23 23 23 ⎠ ⎝19 19 ⎠⎦
If you look at your system you will see that the 4th and 5th equations have two solutions since solving the 4th for A and substituting into the 5th gives an expression that factors as I*f(S) -- giving, for the value of A, I = 0 and S such that f(S) = 0. Judicious selection of which equation(s) to solve next and taking time to lump constants together so you don't bog down the solver gives both solutions in about 10 seconds with relatively small operation counts (relative to the results of nonlinsolve above -- 10 and 5192 operations). The process gives the same solutions for the representative values above:
def condense(eq, *x, reps=None):
"""collapse additive/multiplicative constants into single
variables, returning condensed expression and replacement
values.
Note: use of the replacement dictionary may require topological sorting
if values depend on the keys.
"""
from sympy.core.traversal import bottom_up
from sympy.simplify.radsimp import collect
from sympy.utilities.iterables import numbered_symbols
if reps is None:
reps = {}
else:
reps = {v:k for k,v in reps.items()}
con = numbered_symbols('c')
free = eq.free_symbols
def c():
while True:
rv = next(con)
if rv not in free:
return rv
def do(e):
if not e.args:
return e
e = e.func(*[do(i) for i in e.args])
isAdd=e.is_Add
if not (isAdd or e.is_Mul):
return e
if isAdd:
ee = collect(e, x, exact=None)
if ee != e:
e = do(ee)
co, id = e.as_coeff_Add() if isAdd else e.as_coeff_Mul()
i, d = id.as_independent(*x, as_Add=isAdd)
if not i.args:
return e
return e.func(co, reps.get(i, reps.setdefault(i, c())), d)
rv = do(bottom_up(eq, do))
return rv, {v: k for k, v in reps.items()}
def repsort(*replace):
"""Return sorted replacement tuples `(o, n)` such that `(o_i, n_i)`
will appear before `(o_j, n_j)` if `o_j` appears in `n_i`. An error
will be raised if `o_j` appears in `n_i` and `o_i` appears in `n_k`
if `k >= i`.
Examples
========
>>> from sympy.abc import x, y, z, a
>>> repsort((x, y + 1), (z, x + 2))
[(z, x + 2), (x, y + 1)]
>>> repsort((x, y + 1), (z, x**2))
[(z, x**2), (x, y + 1)]
>>> repsort(*Tuple((x,y+z),(y,a),(z,1/y)))
[(x, y + z), (z, 1/y), (y, a)]
Any two of the following 3 tuples will not raise an error,
but together they contain a cycle that raises an error:
>>> repsort((x, y), (y, z), (z, x))
Traceback (most recent call last):
...
raise ValueError("cycle detected")
"""
from itertools import permutations
from sympy import default_sort_key, topological_sort
free = {i for i,_ in replace}
defs, replace = sift(replace,
lambda x: x[1].is_number or not x[1].has_free(*free),
binary=True)
edges = [(i, j) for i, j in permutations(replace, 2) if
i[1].has(j[0]) and (not j[0].is_Symbol or j[0] in i[1].free_symbols)]
rv = topological_sort([replace, edges], default_sort_key)
rv.extend(ordered(defs))
return rv
def dupdate(d, s):
"""update values in d with values from s and return the combined dictionaries"""
rv = {k: v.xreplace(s) for k,v in d.items()}
rv.update(s)
return rv
# Variables of the system
syms=S, V, E, A, I, R = symbols('S, V, E, A, I, R')
# Parameters of the system
const = var('a:j k')
system = [
-A*S*c/a - I*S*c/a + R + S*(-h - j) + a*h,
A*(V*c*d/a - V*c/a) + I*(V*c*d/a - V*c/a) + S*j - V*h,
A*(-S*c*k/a + S*c/a - V*c*d/a + V*c/a) - E*h +
I*(-S*c*k/a + S*c/a - V*c*d/a + V*c/a),
A*(S*b*c*k/a - e - f - h) + I*S*b*c*k/a,
A*(-S*b*c*k/a + S*c*k/a + f) + I*(-S*b*c*k/a + S*c*k/a - g - h),
A*e + I*g + R*(-h - 1)
]
import sympy as sym
# Variables of the system
syms = S, V, E, A, I, R = sym.symbols('S, V, E, A, I, R')
# Parameters of the system
N = sym.Symbol("N", positive = True)
mi = sym.Symbol("mi", positive = True)
v = sym.Symbol("v", positive = True)
epsilon = sym.Symbol("epsilon", positive = True)
alpha = sym.Symbol("alpha", positive = True)
gamma_as = sym.Symbol("gamma_as", positive = True)
gamma_s = sym.Symbol("gamma_s", positive = True)
gamma_a = sym.Symbol("gamma_a", positive = True)
lamb = sym.Symbol("lamb", positive = True)
tau = sym.Symbol("tau", positive = True)
beta = sym.Symbol("beta", positive = True)
# Declaration of the system equations
system = [
mi*N - v*S + R - (beta*(A+I)/N)*S - mi*S,
v*S - (1-epsilon)*(beta*(A+I)/N)*V - mi*V,
(beta*(A+I)/N)*S + (1-epsilon)*(beta*(A+I)/N)*V -
sym.exp(-mi*tau)*(beta*(A+I)/N)*S - mi*E,
alpha*sym.exp(-mi*tau)*(beta*(A+I)/N)*S - (gamma_as + gamma_a + mi)*A,
(1-alpha)*sym.exp(-mi*tau)*(beta*(A+I)/N)*S + gamma_as*A - (gamma_s + mi)*I,
gamma_a*A + gamma_s*I - (1+mi)*R]
system, srep = condense(Tuple(*system), *syms)
asol = solve(system[3], A, dict=True)[0]
aeq=Tuple(*[i.xreplace(asol) for i in system])
si = solve(aeq[4], *syms, dict=True)
sol1 = dupdate(asol, si[0])
sol1 = dupdate(sol1, solve(Tuple(*system).xreplace(sol1),syms,dict=1)[0]); sol1
aeqs4 = Tuple(*[i.xreplace(si[1]) for i in aeq])
ceq, crep = condense(Tuple(*aeqs4),*syms,reps=srep)
ir = solve([ceq[0], ceq[-1]], I, R, dict=1)[0]
ve = solve([i.simplify() for i in Tuple(*ceq).xreplace(ir)], syms, dict=True)[0] # if we don't simplify to make first element 0 we get no solution -- bug?
sol2 = dupdate(asol, si[1])
sol2 = dupdate(sol2, ir)
sol2 = dupdate(sol2, ve)
crep = repsort(*crep.items())
sol1 = Dict({k:v.subs(crep) for k,v in sol1.items()}) # 10 ops
sol2 = Dict({k:v.subs(crep) for k,v in sol2.items()}) # 5192 ops
Test for specific values (as above):
>>> rep = {alpha: 1, beta: 2, epsilon: 3, gamma_as: 4, gamma_s: 5,
... gamma_a: 6, exp(-mi*tau): 7, N: 8, v: 9, mi: 10}
...
>>> sol1.xreplace(rep)
{A: 0, E: 0, I: 0, R: 0, S: 80/19, V: 72/19}
>>> sol2.xreplace(rep)
{A: -960/23, E: 89280/851, I: -256/23,
R: -640/23, S: 1200/133, V: -124200/4921}
Of course, it took time to find this path to the solution. But if the solver could make better selections of what to solve (rather than trying to get the Groebner basis of the whole system) the time for obtaining a solution from SymPy could be greatly reduced.

Optimizing asymmetrically reweighted penalized least squares smoothing (from matlab to python)

I'm trying to apply the method for baselinining vibrational spectra, which is announced as an improvement over asymmetric and iterative re-weighted least-squares algorithms in the 2015 paper (doi:10.1039/c4an01061b), where the following matlab code was provided:
function z = baseline(y, lambda, ratio)
% Estimate baseline with arPLS in Matlab
N = length(y);
D = diff(speye(N), 2);
H = lambda*D'*D;
w = ones(N, 1);
while true
W = spdiags(w, 0, N, N);
% Cholesky decomposition
C = chol(W + H);
z = C \ (C' \ (w.*y) );
d = y - z;
% make d-, and get w^t with m and s
dn = d(d<0);
m = mean(d);
s = std(d);
wt = 1./ (1 + exp( 2* (d-(2*s-m))/s ) );
% check exit condition and backup
if norm(w-wt)/norm(w) < ratio, break; end
end
that I rewrote into python:
def baseline_arPLS(y, lam, ratio):
# Estimate baseline with arPLS
N = len(y)
k = [numpy.ones(N), -2*numpy.ones(N-1), numpy.ones(N-2)]
offset = [0, 1, 2]
D = diags(k, offset).toarray()
H = lam * numpy.matmul(D.T, D)
w_ = numpy.ones(N)
while True:
W = spdiags(w_, 0, N, N, format='csr')
# Cholesky decomposition
C = cholesky(W + H)
z_ = spsolve(C.T, w_ * y)
z = spsolve(C, z_)
d = y - z
# make d- and get w^t with m and s
dn = d[d<0]
m = numpy.mean(dn)
s = numpy.std(dn)
wt = 1. / (1 + numpy.exp(2 * (d - (2*s-m)) / s))
# check exit condition and backup
norm_wt, norm_w = norm(w_-wt), norm(w_)
if (norm_wt / norm_w) < ratio:
break
w_ = wt
return(z)
Except for the input vector y the method requires parameters lam and ratio and it runs ok for values lam<1.e+07 and ratio>1.e-01, but outputs poor results. When values are changed outside this range, for example lam=1e+07, ratio=1e-02 the CPU starts heating up and job never finishes (I interrupted it after 1min). Also in both cases the following warning shows up:
/usr/local/lib/python3.9/site-packages/scipy/sparse/linalg/dsolve/linsolve.py: 144: SparseEfficencyWarning: spsolve requires A to be CSC or CSR matrix format warn('spsolve requires A to be CSC or CSR format',
although I added the recommended format='csr' option to the spdiags call.
And here's some synthetic data (similar to one in the paper) for testing purposes. The noise was added along with a 3rd degree polynomial baseline The method works well for parameters bl_1 and fails to converge for bl_2:
import numpy
from matplotlib import pyplot
from scipy.sparse import spdiags, diags, identity
from scipy.sparse.linalg import spsolve
from numpy.linalg import cholesky, norm
import sys
x = numpy.arange(0, 1000)
noise = numpy.random.uniform(low=0, high = 10, size=len(x))
poly_3rd_degree = numpy.poly1d([1.2e-06, -1.23e-03, .36, -4.e-04])
poly_baseline = poly_3rd_degree(x)
y = 100 * numpy.exp(-((x-300)/15)**2)+\
200 * numpy.exp(-((x-750)/30)**2)+ \
100 * numpy.exp(-((x-800)/15)**2) + noise + poly_baseline
bl_1 = baseline_arPLS(y, 1e+07, 1e-01)
bl_2 = baseline_arPLS(y, 1e+07, 1e-02)
pyplot.figure(1)
pyplot.plot(x, y, 'C0')
pyplot.plot(x, poly_baseline, 'C1')
pyplot.plot(x, bl_1, 'k')
pyplot.show()
sys.exit(0)
All this is telling me that I'm doing something very non-optimal in my python implementation. Since I'm not knowledgeable enough about the intricacies of scipy computations I'm kindly asking for suggestions on how to achieve convergence in this calculations.
(I encountered an issue in running the "straight" matlab version of the code because the line D = diff(speye(N), 2); truncates the last two rows of the matrix, creating dimension mismatch later in the function. Following the description of matrix D's appearance I substituted this line by directly creating a tridiagonal matrix using the diags function.)
Guided by the comment #hpaulj made, and suspecting that the loop exit wasn't coded properly, I re-visited the paper and found out that the authors actually implemented an exit condition that was not featured in their matlab script. Changing the while loop condition provides an exit for any set of parameters; my understanding is that algorithm is not guaranteed to converge in all cases, which is why this condition is necessary but was omitted by error. Here's the edited version of my python code:
def baseline_arPLS(y, lam, ratio):
# Estimate baseline with arPLS
N = len(y)
k = [numpy.ones(N), -2*numpy.ones(N-1), numpy.ones(N-2)]
offset = [0, 1, 2]
D = diags(k, offset).toarray()
H = lam * numpy.matmul(D.T, D)
w_ = numpy.ones(N)
i = 0
N_iterations = 100
while i < N_iterations:
W = spdiags(w_, 0, N, N, format='csr')
# Cholesky decomposition
C = cholesky(W + H)
z_ = spsolve(C.T, w_ * y)
z = spsolve(C, z_)
d = y - z
# make d- and get w^t with m and s
dn = d[d<0]
m = numpy.mean(dn)
s = numpy.std(dn)
wt = 1. / (1 + numpy.exp(2 * (d - (2*s-m)) / s))
# check exit condition and backup
norm_wt, norm_w = norm(w_-wt), norm(w_)
if (norm_wt / norm_w) < ratio:
break
w_ = wt
i += 1
return(z)

Simpson's rule 3/8 for n intervals in Python

im trying to write a program that gives the integral approximation of e(x^2) between 0 and 1 based on this integral formula:
Formula
i've done this code so far but it keeps giving the wrong answer (Other methods gives 1.46 as an answer, this one gives 1.006).
I think that maybe there is a problem with the two for cycles that does the Riemman sum, or that there is a problem in the way i've wrote the formula. I also tried to re-write the formula in other ways but i had no success
Any kind of help is appreciated.
import math
import numpy as np
def f(x):
y = np.exp(x**2)
return y
a = float(input("¿Cual es el limite inferior? \n"))
b = float(input("¿Cual es el limite superior? \n"))
n = int(input("¿Cual es el numero de intervalos? "))
x = np.zeros([n+1])
y = np.zeros([n])
z = np.zeros([n])
h = (b-a)/n
print (h)
x[0] = a
x[n] = b
suma1 = 0
suma2 = 0
for i in np.arange(1,n):
x[i] = x[i-1] + h
suma1 = suma1 + f(x[i])
alfa = (x[i]-x[i-1])/3
for i in np.arange(0,n):
y[i] = (x[i-1]+ alfa)
suma2 = suma2 + f(y[i])
z[i] = y[i] + alfa
int3 = ((b-a)/(8*n)) * (f(x[0])+f(x[n]) + (3*(suma2+f(z[i]))) + (2*(suma1)))
print (int3)
I'm not a math major but I remember helping a friend with this rule for something about waterplane area for ships.
Here's an implementation based on Wikipedia's description of the Simpson's 3/8 rule:
# The input parameters
a, b, n = 0, 1, 10
# Divide the interval into 3*n sub-intervals
# and hence 3*n+1 endpoints
x = np.linspace(a,b,3*n+1)
y = f(x)
# The weight for each points
w = [1,3,3,1]
result = 0
for i in range(0, 3*n, 3):
# Calculate the area, 4 points at a time
result += (x[i+3] - x[i]) / 8 * (y[i:i+4] * w).sum()
# result = 1.4626525814387632
You can do it using numpy.vectorize (Based on this wikipedia post):
a, b, n = 0, 1, 10**6
h = (b-a) / n
x = np.linspace(0,n,n+1)*h + a
fv = np.vectorize(f)
(
3*h/8 * (
f(x[0]) +
3 * fv(x[np.mod(np.arange(len(x)), 3) != 0]).sum() + #skip every 3rd index
2 * fv(x[::3]).sum() + #get every 3rd index
f(x[-1])
)
)
#Output: 1.462654874404461
If you use numpy's built-in functions (which I think is always possible), performance will improve considerably:
a, b, n = 0, 1, 10**6
x = np.exp(np.square(np.linspace(0,n,n+1)*h + a))
(
3*h/8 * (
x[0] +
3 * x[np.mod(np.arange(len(x)), 3) != 0].sum()+
2 * x[::3].sum() +
x[-1]
)
)
#Output: 1.462654874404461

Why can't I get this Runge-Kutta solver to converge as the time step decreases?

For reasons, I need to implement the Runge-Kutta4 method in PyTorch (so no, I'm not going to use scipy.odeint). I tried and I get weird results on the simplest test case, solving x'=x with x(0)=1 (analytical solution: x=exp(t)). Basically, as I reduce the time step, I cannot get the numerical error to go down. I'm able to do it with a simpler Euler method, but not with the Runge-Kutta 4 method, which makes me suspect some floating point issue here (maybe I'm missing some hidden conversion from double precision to single)?
import torch
import numpy as np
import matplotlib.pyplot as plt
def Euler(f, IC, time_grid):
y0 = torch.tensor([IC])
time_grid = time_grid.to(y0[0])
values = y0
for i in range(0, time_grid.shape[0] - 1):
t_i = time_grid[i]
t_next = time_grid[i+1]
y_i = values[i]
dt = t_next - t_i
dy = f(t_i, y_i) * dt
y_next = y_i + dy
y_next = y_next.unsqueeze(0)
values = torch.cat((values, y_next), dim=0)
return values
def RungeKutta4(f, IC, time_grid):
y0 = torch.tensor([IC])
time_grid = time_grid.to(y0[0])
values = y0
for i in range(0, time_grid.shape[0] - 1):
t_i = time_grid[i]
t_next = time_grid[i+1]
y_i = values[i]
dt = t_next - t_i
dtd2 = 0.5 * dt
f1 = f(t_i, y_i)
f2 = f(t_i + dtd2, y_i + dtd2 * f1)
f3 = f(t_i + dtd2, y_i + dtd2 * f2)
f4 = f(t_next, y_i + dt * f3)
dy = 1/6 * dt * (f1 + 2 * (f2 + f3) +f4)
y_next = y_i + dy
y_next = y_next.unsqueeze(0)
values = torch.cat((values, y_next), dim=0)
return values
# differential equation
def f(T, X):
return X
# initial condition
IC = 1.
# integration interval
def integration_interval(steps, ND=1):
return torch.linspace(0, ND, steps)
# analytical solution
def analytical_solution(t_range):
return np.exp(t_range)
# test a numerical method
def test_method(method, t_range, analytical_solution):
numerical_solution = method(f, IC, t_range)
L_inf_err = torch.dist(numerical_solution, analytical_solution, float('inf'))
return L_inf_err
if __name__ == '__main__':
Euler_error = np.array([0.,0.,0.])
RungeKutta4_error = np.array([0.,0.,0.])
indices = np.arange(1, Euler_error.shape[0]+1)
n_steps = np.power(10, indices)
for i, n in np.ndenumerate(n_steps):
t_range = integration_interval(steps=n)
solution = analytical_solution(t_range)
Euler_error[i] = test_method(Euler, t_range, solution).numpy()
RungeKutta4_error[i] = test_method(RungeKutta4, t_range, solution).numpy()
plots_path = "./plots"
a = plt.figure()
plt.xscale('log')
plt.yscale('log')
plt.plot(n_steps, Euler_error, label="Euler error", linestyle='-')
plt.plot(n_steps, RungeKutta4_error, label="RungeKutta 4 error", linestyle='-.')
plt.legend()
plt.savefig(plots_path + "/errors.png")
The result:
As you can see, the Euler method converges (slowly, as expected of a first order method). However, the Runge-Kutta4 method does not converge as the time step gets smaller and smaller. The error goes down initially, and then up again. What's the issue here?
The reason is indeed a floating point precision issue. torch defaults to single precision, so once the truncation error becomes small enough, the total error is basically determined by the roundoff error, and reducing the truncation error further by increasing the number of steps <=> decreasing the time step doesn't lead to any decrease in the total error.
To fix this, we need to enforce double precision 64bit floats for all floating point torch tensors and numpy arrays. Note that the right way to do this is to use respectively torch.float64 and np.float64 rather than, e.g., torch.double and np.double, because the former are fixed-sized float values, (always 64bit) while the latter depend on the machine and/or compiler. Here's the fixed code:
import torch
import numpy as np
import matplotlib.pyplot as plt
def Euler(f, IC, time_grid):
y0 = torch.tensor([IC], dtype=torch.float64)
time_grid = time_grid.to(y0[0])
values = y0
for i in range(0, time_grid.shape[0] - 1):
t_i = time_grid[i]
t_next = time_grid[i+1]
y_i = values[i]
dt = t_next - t_i
dy = f(t_i, y_i) * dt
y_next = y_i + dy
y_next = y_next.unsqueeze(0)
values = torch.cat((values, y_next), dim=0)
return values
def RungeKutta4(f, IC, time_grid):
y0 = torch.tensor([IC], dtype=torch.float64)
time_grid = time_grid.to(y0[0])
values = y0
for i in range(0, time_grid.shape[0] - 1):
t_i = time_grid[i]
t_next = time_grid[i+1]
y_i = values[i]
dt = t_next - t_i
dtd2 = 0.5 * dt
f1 = f(t_i, y_i)
f2 = f(t_i + dtd2, y_i + dtd2 * f1)
f3 = f(t_i + dtd2, y_i + dtd2 * f2)
f4 = f(t_next, y_i + dt * f3)
dy = 1/6 * dt * (f1 + 2 * (f2 + f3) +f4)
y_next = y_i + dy
y_next = y_next.unsqueeze(0)
values = torch.cat((values, y_next), dim=0)
return values
# differential equation
def f(T, X):
return X
# initial condition
IC = 1.
# integration interval
def integration_interval(steps, ND=1):
return torch.linspace(0, ND, steps, dtype=torch.float64)
# analytical solution
def analytical_solution(t_range):
return np.exp(t_range, dtype=np.float64)
# test a numerical method
def test_method(method, t_range, analytical_solution):
numerical_solution = method(f, IC, t_range)
L_inf_err = torch.dist(numerical_solution, analytical_solution, float('inf'))
return L_inf_err
if __name__ == '__main__':
Euler_error = np.array([0.,0.,0.], dtype=np.float64)
RungeKutta4_error = np.array([0.,0.,0.], dtype=np.float64)
indices = np.arange(1, Euler_error.shape[0]+1)
n_steps = np.power(10, indices)
for i, n in np.ndenumerate(n_steps):
t_range = integration_interval(steps=n)
solution = analytical_solution(t_range)
Euler_error[i] = test_method(Euler, t_range, solution).numpy()
RungeKutta4_error[i] = test_method(RungeKutta4, t_range, solution).numpy()
plots_path = "./plots"
a = plt.figure()
plt.xscale('log')
plt.yscale('log')
plt.plot(n_steps, Euler_error, label="Euler error", linestyle='-')
plt.plot(n_steps, RungeKutta4_error, label="RungeKutta 4 error", linestyle='-.')
plt.legend()
plt.savefig(plots_path + "/errors.png")
Result:
Now, as we decrease the time step, the error of the RungeKutta4 approximation decreases with the correct rate.

PyMC3- Custom theano Op to do numerical integration

I am using PyMC3 for parameter estimation using a particular likelihood function which has to be defined. I googled it and found out that I should use the densitydist method for implementing the user defined likelihood functions but it is not working. How to incorporate a user defined likelihood function in PyMC3 and to find out the maximum a posteriori (MAP) estimate for my model? My code is given below. Here L is the analytic form of my Likelihood function. I have some observational data for the radial velocity(vr) and postion (r) for some objects, which is imported from excel file.
data_ = np.array(pandas.read_excel('aaa.xlsx',header=None))
gamma=3.77;
G = 4.302*10**-6;
rmin = 3.0;
R = 95.7;
vr=data_[:,1];
r= data_[:,0];
h= np.pi;
class integrateOut(theano.Op):
def __init__(self,f,t,t0,tf,*args,**kwargs):
super(integrateOut,self).__init__()
self.f = f
self.t = t
self.t0 = t0
self.tf = tf
def make_node(self,*inputs):
self.fvars=list(inputs)
try:
self.gradF = tt.grad(self.f,self.fvars)
except:
self.gradF = None
return theano.Apply(self,self.fvars,[tt.dscalar().type()])
def perform(self,node, inputs, output_storage):
args = tuple(inputs)
f = theano.function([self.t]+self.fvars,self.f)
output_storage[0][0] = quad(f,self.t0,self.tf,args=args)[0]
def grad(self,inputs,grads):
return [integrateOut(g,self.t,self.t0,self.tf)(*inputs)*grads[0] \
for g in self.gradF]
basic_model = pm.Model()
with basic_model:
M=[]
beta=[]
interval=0.01*10**12
M=pm.Uniform('M',
lower=0.5*10**12,upper=3.50*10**12,transform='interval')
beta=pm.Uniform('beta',lower=2.001,upper=2.999,transform='interval')
gamma=3.77
logp=[]
arr=[]
vnew=[]
rnew=[]
theta = tt.scalar('theta')
beta = tt.scalar('beta')
z = tt.cos(theta)**(2*( (gamma/(beta - 2)) - 3/2) + 3)
intZ = integrateOut(z,theta,-(np.pi)/2,(np.pi)/2)(beta)
gradIntZ = tt.grad(intZ,[beta])
funcIntZ = theano.function([beta],intZ)
funcGradIntZ = theano.function([beta],gradIntZ)
for j in np.arange(0,59,1):
vnew.append(vr[j]+(0.05*vr[j]*float(dm.Decimal(rm.randrange(1,
20))/10)));
rnew.append(r[j]+(0.05*r[j]*float(dm.Decimal(rm.randrange(1,
20))/10)));
vn=np.array(vnew)
rn=np.array(rnew)
for beta in np.arange (2.01,2.99,0.01):
for M in np.arange (0.5,2.50,0.01):
i=np.arange(0,59,1)
q =( gamma/(beta - 2)) - 3/2
B = (G*M*10**12)/((beta -2 )*( R**(3 - beta)))
K = (gamma - 3)/((rmin**(3 - gamma))*funcIntZ(beta)*m.sqrt(2*B))
logp= -np.log(K*((1 -(( 1/(2*B) )*((vn[i]**2)*rn[i]**(beta -
2))))**(q+1))*(rn[i]**(1-gamma +(beta/2))))
arr.append(logp.sum())
def logp_func(rn,vn):
return min(np.array(arr))
logpvar = pm.DensityDist("logpvar", logp_func, observed={"rn": rn,"vn":vn})
start = pm.find_MAP(model=basic_model)
step = pm.Metropolis()
basicmodeltrace = pm.sample(10000, step=step,
start=start,random_seed=1,progressbar=True)
print(pm.summary(basicmodeltrace))
map_estimate = pm.find_MAP(model=basic_model)
print(map_estimate)
I am getting the following error message:
ValueError: Cannot compute test value: input 0 (theta) of Op
Elemwise{cos,no_inplace}(theta) missing default value.
Backtrace when that variable is created:
I am unable to get the output since the numerical integration is not working. I have used custom theano op for numerical integration code which i got from Custom Theano Op to do numerical integration . The integration works if I run it seperately inputting a particular value of beta, but not within the model.
I made a few changes to your code, this still does not work, but I hope it is closer to a solution. Please check this thread, as someone is trying so solve essentially the same problem.
class integrateOut(theano.Op):
def __init__(self, f, t, t0, tf,*args, **kwargs):
super(integrateOut,self).__init__()
self.f = f
self.t = t
self.t0 = t0
self.tf = tf
def make_node(self, *inputs):
self.fvars=list(inputs)
try:
self.gradF = tt.grad(self.f, self.fvars)
except:
self.gradF = None
return theano.Apply(self, self.fvars, [tt.dscalar().type()])
def perform(self,node, inputs, output_storage):
args = tuple(inputs)
f = theano.function([self.t] + self.fvars,self.f)
output_storage[0][0] = quad(f, self.t0, self.tf, args=args)[0]
def grad(self,inputs,grads):
return [integrateOut(g, self.t, self.t0, self.tf)(*inputs)*grads[0] \
for g in self.gradF]
gamma = 3.77
G = 4.302E-6
rmin = 3.0
R = 95.7
vr = data[:,1]
r = data[:,0]
h = np.pi
interval = 1E10
vnew = []
rnew = []
for j in np.arange(0,59,1):
vnew.append(vr[j]+(0.05*vr[j] * float(dm.Decimal(rm.randrange(1, 20))/10)))
rnew.append(r[j]+(0.05*r[j] * float(dm.Decimal(rm.randrange(1, 20))/10)))
vn = np.array(vnew)
rn = np.array(rnew)
def integ(gamma, beta, theta):
z = tt.cos(theta)**(2*((gamma/(beta - 2)) - 3/2) + 3)
return integrateOut(z, theta, -(np.pi)/2, (np.pi)/2)(beta)
with pm.Model() as basic_model:
M = pm.Uniform('M', lower=0.5*10**12, upper=3.50*10**12)
beta = pm.Uniform('beta', lower=2.001, upper=2.999)
theta = pm.Normal('theta', 0, 10**2)
def logp_func(rn,vn):
q = (gamma/(beta - 2)) - 3/2
B = (G*M*1E12) / ((beta -2 )*(R**(3 - beta)))
K = (gamma - 3) / ((rmin**(3 - gamma)) * integ(gamma, beta, theta) * (2*B)**0.5)
logp = - np.log(K*((1 -((1/(2*B))*((vn**2)*rn**(beta -
2))))**(q+1))*(rn**(1-gamma +(beta/2))))
return logp.sum()
logpvar = pm.DensityDist("logpvar", logp_func, observed={"rn": rn,"vn":vn})
start = pm.find_MAP()
#basicmodeltrace = pm.sample()
print(start)

Resources