I want to solve a system of nonlinear equations using scipy.optimize.root. For performance reasons, I want to provide the Jacobian of the system as a LinearOperator. However, I cannot get it to work. Here is a minimal example using the gradient of the Rosenbrock function, where I first define the Jacobian (i.e. the Hessian of the Rosenbrock function) as a LinearOperator.
import numpy as np
import scipy.optimize as opt
import scipy.sparse as sp
ndim = 10
def rosen_hess_LO(x):
    return sp.linalg.LinearOperator((ndim, ndim), matvec=(lambda dx, xl=x: opt.rosen_hess_prod(xl, dx)))

opt_result = opt.root(fun=opt.rosen_der, x0=np.zeros((ndim), float), jac=rosen_hess_LO)
Upon execution, I get the following error:
TypeError: fsolve: there is a mismatch between the input and output shape of the 'fprime' argument 'rosen_hess_LO'.Shape should be (10, 10) but it is (1,).
What am I missing here?
Partial answer :
I was able to feed my "exact" Jacobian into scipy.optimize.nonlin.nonlin_solve. This really felt hacky.
Long story short, I defined a class inheriting from scipy.optimize.nonlin.Jacobian, where I defined "update" and "solve" methods so that my exact Jacobian would be used by the solver; a rough sketch is below.
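For reference, a minimal sketch of what that hacky approach can look like (hedged: scipy.optimize.nonlin is a semi-private module whose location and interface may differ between SciPy versions, and the inner lgmres solve is just one arbitrary choice):

import numpy as np
import scipy.optimize as opt
import scipy.sparse.linalg as spla
from scipy.optimize.nonlin import Jacobian, nonlin_solve  # semi-private module

class RosenHessJacobian(Jacobian):
    # Exact Jacobian (here: the Rosenbrock Hessian) exposed only through
    # Hessian-vector products; the solver only calls update() and solve().
    def update(self, x, F):
        # Remember the current point; matvec products are evaluated there.
        self.x = x.copy()

    def solve(self, v, tol=0):
        # Solve J dx = v iteratively using only Hessian-vector products.
        op = spla.LinearOperator((self.x.size, self.x.size),
                                 matvec=lambda dx: opt.rosen_hess_prod(self.x, dx))
        dx, info = spla.lgmres(op, v)
        return dx

ndim = 10
sol = nonlin_solve(opt.rosen_der, np.zeros(ndim), jacobian=RosenHessJacobian())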
I expect performance results to vary greatly from problem to problem. Let me detail my experience for a ~10k-dimensional critical point solve of an "almost" coercive function (i.e. the problem would be coercive if I had taken the time to remove a 4-dimensional symmetry generator), with many, many local minima (and thus presumably many, many critical points).
Long story short, this gave terrible results far from the optimum, but local convergence was achieved in fewer optimization cycles. The cost of each of those optimization cycles was (for my particular problem) far greater than with the "standard" Krylov lgmres, so in the end, even close to the optimum, I cannot really say it was worth the trouble.
To be honest, I am very impressed with the Jacobian finite difference approximation of the 'krylov' method of scipy.optimize.root.
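For comparison, the built-in approach needs no explicit Jacobian at all; the 'krylov' method approximates Jacobian-vector products by finite differences of the residual:

opt_result = opt.root(fun=opt.rosen_der, x0=np.zeros(ndim), method='krylov')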
I am trying to maximize 10 very long linear expressions. All expressions are similar except for one variable (say Z).
I was thinking of putting a single equation inside a function and pass Z as a parameter.
Can we optimise the python functions?
I have looked into the pyomo, pulp, and cvxpy documentation and haven't found any code samples, which makes me think this isn't possible.
# This is what it currently looks like
Maximize
(X1*fun(1,Z)) + (X2*fun(1,Z)) + ...
(X1*fun(1,Z1)) + (X2*fun(1,Z1)) + ...
..
..
Solve for
X1 and X2
# This is an example of what I am trying to do
def optimise(Z):
    return (X1*fun(1,Z)) + (X2*fun(1,Z)) + ...

Maximize
optimise(13)
optimise(24)
optimise(34)
optimise(14)
optimise(12)
optimise(11)  # is optimizing with functions possible?
Solve for
X1 and X2
It depends on what your function returns. Pyomo is an algebraic modeling language and requires access to the full algebraic equations. If your Python function returns an expression involving Pyomo Var components then it should work. If the function simply returns a value depending on the current value of a Pyomo Var then it will not work. You need to provide more details on the function and the model you're trying to solve for us to say for sure if it's supported or not.
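For illustration, a minimal sketch of the case that does work: the Python function returns a Pyomo expression built from the model's Var components, so Pyomo sees the full algebra. The coefficient function fun, the Z values, and the budget constraint below are made-up placeholders.

import pyomo.environ as pyo

def fun(i, Z):
    # Hypothetical coefficient function returning a plain number.
    return i * Z

model = pyo.ConcreteModel()
model.X1 = pyo.Var(within=pyo.NonNegativeReals)
model.X2 = pyo.Var(within=pyo.NonNegativeReals)

def optimise(m, Z):
    # Returns a Pyomo expression (not a number), so it can go into an objective.
    return m.X1 * fun(1, Z) + m.X2 * fun(2, Z)

Z_values = [13, 24, 34, 14, 12, 11]
model.obj = pyo.Objective(expr=sum(optimise(model, Z) for Z in Z_values),
                          sense=pyo.maximize)
# Some constraint is needed for the maximization to be bounded, e.g.:
model.budget = pyo.Constraint(expr=model.X1 + model.X2 <= 10)
# pyo.SolverFactory('glpk').solve(model)  # requires an installed LP solver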
A few months ago I started working with Python, considering the great advantages it has. But recently, I used odeint from SciPy to solve a system of differential equations, and during the integration process the implemented function does not work as expected.
In this case, I want to solve a system of differential equations where one of the initial conditions (x[0]) varies (between 4 and 5) depending on the value that the variable reaches during the integration process (it is programmed inside the function by means of an if statement).
#Control of oxygen
SO2_lower=4
SO2_upper=5
if x[0]<=SO2_lower:
    x[0]=SO2_upper
When the function is used by odeint, some lines of code inside the function seem to be ignored, even when the function changes the value of x[0]. Here is all my code:
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
plt.ion()
# Stoichiometric parameters
YSB_OHO_Ox=0.67 #Yield for XOHO growth per SB (Aerobic)
YSB_Stor_Ox=0.85 #Yield for XOHO,Stor formation per SB (Aerobic)
YStor_OHO_Ox=0.63 #Yield for XOHO growth per XOHO,Stor (Aerobic)
fXU_Bio_lys=0.2 #Fraction of XU generated in biomass decay
iN_XU=0.02 #N content of XU
iN_XBio=0.07 #N content of XBio
iN_SB=0.03 #N content of SB
fSTO=0.67 #Stored fraction of SB
#Kinetic parameters
qSB_Stor=5 #Rate constant for XOHO,Stor storage of SB
uOHO_Max=2 #Maximum growth rate of XOHO
KSB_OHO=2 #Half-saturation coefficient for SB
KStor_OHO=1 #Half-saturation coefficient for XOHO,Stor/XOHO
mOHO_Ox=0.2 #Endogenous respiration rate of XOHO (Aerobic)
mStor_Ox=0.2 #Endogenous respiration rate of XOHO,Stor (Aerobic)
KO2_OHO=0.2 #Half-saturation coefficient for SO2
KNHx_OHO=0.01 #Half-saturation coefficient for SNHx
#Other parameters
DT=1/86400.0
def f(x,t):
    #Control of oxygen
    SO2_lower=4
    SO2_upper=5
    if x[0]<=SO2_lower:
        x[0]=SO2_upper
    M=np.matrix([[-(1.0-YSB_Stor_Ox),-1,iN_SB,0,0,YSB_Stor_Ox],
                 [-(1.0-YSB_OHO_Ox)/YSB_OHO_Ox,-1/YSB_OHO_Ox,iN_SB/YSB_OHO_Ox-iN_XBio,0,1,0],
                 [-(1.0-YStor_OHO_Ox)/YStor_OHO_Ox,0,-iN_XBio,0,1,-1/YStor_OHO_Ox],
                 [-(1.0-fXU_Bio_lys),0,iN_XBio-fXU_Bio_lys*iN_XU,fXU_Bio_lys,-1,0],
                 [-1,0,0,0,0,-1]])
    R=np.matrix([[DT*fSTO*qSB_Stor*(x[0]/(KO2_OHO+x[0]))*(x[1]/(KSB_OHO+x[1]))*x[4]],
                 [DT*(1-fSTO)*uOHO_Max*(x[0]/(KO2_OHO+x[0]))*(x[1]/(KSB_OHO+x[1]))*(x[2]/(KNHx_OHO+x[2]))*x[4]],
                 [DT*uOHO_Max*(x[0]/(KO2_OHO+x[0]))*(x[2]/(KNHx_OHO+x[2]))*((x[5]/x[4])/(KStor_OHO+(x[5]/x[4])))*(KSB_OHO/(KSB_OHO+x[1]))*x[4]],
                 [DT*mOHO_Ox*(x[0]/(KO2_OHO+x[0]))*x[4]],
                 [DT*mStor_Ox*(x[0]/(KO2_OHO+x[0]))*x[5]]])
    Mt=M.transpose()
    MxRm=Mt*R
    MxR=MxRm.tolist()
    return ([MxR[0][0],
             MxR[1][0],
             MxR[2][0],
             MxR[3][0],
             MxR[4][0],
             MxR[5][0]])
#ODE solution
t=np.linspace(0.0,3600,3600)
#Initial conditions
y0=np.array([5,176,5,30,100,5])
Var=odeint(f,y0,t,args=(),h0=1,hmin=1,hmax=1,atol=1e-5,rtol=1e-5)
Sol=Var.tolist()
plt.plot(t,Var[:,0])
Thanks very much in advance!!!!!
Short answer:
You should not modify the input state vector inside your ODE function. Instead, try the following and verify your results:
x0 = x[0]
if x0 <= SO2_lower:
    x0 = SO2_upper
# use x0 instead of x[0] in the rest of this function body
I suppose that this is your problem, but I am not sure, since you did not explain what exactly was wrong with the results. Moreover, you are not changing the "initial condition". The initial condition is
y0=np.array([5,176,5,30,100,5])
you are just changing the input state vector.
Detailed answer:
Your odeint integrator wraps LSODA, an adaptive multistep solver. Such an algorithm requires multiple ODE function evaluations to calculate a single integration step, so changing the input state vector inside the right-hand-side function may lead to undefined results. In C++ boost::odeint it is not even possible to do this, because the input variable is const. Python, however, is not as strict as C++, and I suppose it is possible to make this kind of bug unintentionally (I did not try it, though).
EDIT:
OK, I understand what you want to achieve.
Your variable x[0] is constrained (it is reset whenever it drops below a threshold), so the system cannot be expressed in the form
x' = f(x,t)
which is the standard form of an Ordinary Differential Equation that odeint is meant to solve. However, a few possible "hacks" can be used here to bypass this limitation.
One possibility is to use a fixed-step, low-order solver (because for higher-order solvers you would need to know which part of the algorithm you are actually in, see RK4 for example) and change your dx[0] equation (in your code it is the MxR[0][0] element) to:
# at the beginning of your system
if x[0] > SO2_lower:  # everything is normal here
    x0 = x[0]
    dx0 = ...  # normal equation for dx0
else:  # x[0] is too low, we must somehow force it to become SO2_upper again
    dx0 = (SO2_upper - x[0])/step_size  # assuming that x0_{n+1} = x0_{n} + dx0*step_size
    x0 = SO2_upper
# remember to use x0 in the rest of your code and also remember to return dx0
However, I do not recommend this technique, because it makes you strongly dependent on the algorithm and you must know the exact step size (although I might recommend it for saturation constraints). Another possibility is to perform a single integration step at a time and correct your x0 whenever necessary:
// 1 do_step(sys, in, dxdtin, t, out, dt);
// 2 do something with output
// 3 in = out
// 4 return to 1 or finish
Sorry for the C++ syntax; here is the exhaustive documentation (C++ odeint steppers), and here is its equivalent in Python (the Python ode class). The C++ odeint interface is better suited to your task, but you can achieve exactly the same thing in Python. Just look for:
integrate(t[, step, relax])
set_initial_value(y[, t])
in the docs.
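A minimal sketch of that step-by-step approach with scipy.integrate.ode, assuming f is the right-hand side from the question (note that ode expects the signature f(t, x), so the arguments are swapped); the state is corrected between steps instead of inside the ODE function:

from scipy.integrate import ode
import numpy as np

def f_ode(t, x):
    return f(x, t)  # reuse the right-hand side defined in the question

SO2_lower, SO2_upper = 4, 5
y0 = np.array([5, 176, 5, 30, 100, 5], dtype=float)
t0, t_end, dt = 0.0, 3600.0, 1.0

solver = ode(f_ode).set_integrator('lsoda', atol=1e-5, rtol=1e-5)
solver.set_initial_value(y0, t0)

ts, ys = [t0], [y0]
while solver.successful() and solver.t < t_end:
    y = solver.integrate(solver.t + dt)        # advance by one reporting step
    if y[0] <= SO2_lower:                      # correct the state between steps
        y[0] = SO2_upper
        solver.set_initial_value(y, solver.t)  # restart from the corrected state
    ts.append(solver.t)
    ys.append(y.copy())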
Yesterday, I posted a question about general concept of SVM Primal Form Implementation:
Support Vector Machine Primal Form Implementation
and "lejlot" helped me out to understand that what I am solving is a QP problem.
But I still don't understand how my objective function can be expressed as a QP problem
(http://en.wikipedia.org/wiki/Support_vector_machine#Primal_form)
Also, I don't understand how QP and the Quasi-Newton method are related.
All I know is that the Quasi-Newton method will SOLVE my QP problem, which is supposedly formulated from
my objective function (but I don't see the connection).
Can anyone walk me through this please?
For SVM's, the goal is to find a classifier. This problem can be expressed in terms of a function that you are trying to minimize.
Let's first consider the Newton iteration. The Newton iteration is a numerical method to find a solution to a problem of the form F(x) = 0.
Instead of solving it analytically we can solve it numerically by the following iteration:
x^{k+1} = x^k - DF(x^k)^{-1} * F(x^k)
Here x^{k+1} is the (k+1)-th iterate, DF(x^k)^{-1} is the inverse of the Jacobian of F evaluated at x^k, and x^k is the k-th iterate.
This update runs as long as we make progress in terms of step size (delta x), or until the function value is sufficiently close to 0. The termination criterion can be chosen accordingly.
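As a concrete illustration, here is a minimal multidimensional Newton iteration in Python (the system F and its Jacobian DF are made-up examples):

import numpy as np

def F(x):
    # Example system: x0^2 + x1^2 - 1 = 0 and x0 - x1 = 0.
    return np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])

def DF(x):
    # Jacobian of F.
    return np.array([[2*x[0], 2*x[1]],
                     [1.0, -1.0]])

x = np.array([1.0, 0.5])
for _ in range(50):
    step = np.linalg.solve(DF(x), F(x))  # solve DF(x^k) * step = F(x^k) instead of inverting
    x = x - step
    if np.linalg.norm(step) < 1e-12:     # stop when progress in x stalls
        break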
Now consider solving the minimization problem, i.e. F'(x) = 0. If we formulate the Newton iteration for that, we get
x^{k+1} = x^k - HF(x^k)^{-1} * DF(x^k)
where HF(x^k)^{-1} is the inverse of the Hessian matrix and DF(x^k) the gradient of the function F, both evaluated at x^k. Note that we are working with n-dimensional functions and cannot just take a quotient; we have to take the inverse of the matrix.
Now we are facing some problems: in each step, we have to calculate the Hessian matrix at the updated x, which is very expensive. We also have to solve a system of linear equations, namely y = HF(x)^{-1} * DF(x), or equivalently HF(x)*y = DF(x).
So instead of computing the Hessian in every iteration, we start off with an initial guess of the Hessian (for example, the identity matrix) and perform rank-one updates after each iterate. For the exact formulas, have a look here.
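In practice you rarely implement the rank-one updates yourself; for example, scipy's BFGS implementation maintains an approximate inverse Hessian internally (the objective below is just a made-up example):

import numpy as np
from scipy.optimize import minimize

def F(x):
    # Made-up smooth objective (Rosenbrock-like).
    return (x[0] - 1.0)**2 + 10.0*(x[1] - x[0]**2)**2

res = minimize(F, np.zeros(2), method='BFGS')  # quasi-Newton with rank updates
print(res.x)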
So how does this link to SVM's?
When you look at the function you are trying to minimize, you can formulate a primal problem, which you can then reformulate as a dual Lagrangian problem, which is convex and can be solved numerically. It is all well documented in the article, so I will not try to reproduce the formulas in lower quality here.
But the idea is the following: if you have a dual problem, you can solve it numerically. There are multiple solvers available. In the link you posted, they recommend coordinate descent, which solves the optimization problem for one coordinate at a time. Alternatively, you can use subgradient descent. Another method is L-BFGS; it is really well explained in this paper.
Another popular algorithm for solving problems like this is ADMM (alternating direction method of multipliers). In order to use ADMM you would have to reformulate the given problem into an equivalent problem that gives the same solution but has the right format for ADMM. For that, I suggest reading Boyd's notes on ADMM.
In general: first understand the function you are trying to minimize, and then choose the numerical method that is most suited to it. In this case, subgradient descent and coordinate descent are the best fits, as stated in the Wikipedia link.
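As a rough illustration of the primal route, you can hand the (non-smooth) hinge-loss objective directly to a quasi-Newton solver. This is only a sketch on made-up toy data, not a production SVM; dedicated solvers such as coordinate descent on the dual are usually preferable.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                   # toy features
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)  # toy labels in {-1, +1}
C = 1.0

def primal(params):
    # 0.5*||w||^2 + C * sum_i max(0, 1 - y_i*(w.x_i + b))
    w, b = params[:-1], params[-1]
    margins = 1.0 - y * (X @ w + b)
    return 0.5 * w @ w + C * np.sum(np.maximum(0.0, margins))

res = minimize(primal, np.zeros(X.shape[1] + 1), method='L-BFGS-B')
w_opt, b_opt = res.x[:-1], res.x[-1]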
I'm an engineering student. Pretty much all the math I have to do is in R2 or R3 and concerns differential geometry. Naturally, I really like sympy because it makes my calculations reusable and presentable.
What I found:
The thing in sympy that comes closest to what I know a function to be, namely a mapping of scalar or vector values to scalar or vector values, with a name and connected to an expression, seems to be something of the form
functionname=sympy.Lambda(Variables in tuple, Expression)
or as an example
f=sympy.Lambda((x),x+1)
I also found that sympy has the diffgeom module that defines Manifolds and Patches and can then perform some operations on functions without explicit expressions or points, like translating a point in one coordinate system to the same point in a different, linked coordinate system.
I haven't found a way to perform those operations and transformations on functions like those above, or to define something in the diffgeom context that behaves like the Lambda function.
Examples of what I'd like to do:
scalarfield f (x,y,z) = expression
grad (f) = ( d expression / dx , d expression / dy , d expression / dz)^T
vectorfield v (x,y,z) = ( expression 1 , expression 2 , expression 3 )^T
I'd then like to be able to integrate the vectorfield over bodies or curves.
Do these things exist and I haven't found them?
Are they doable with diffgeom and I didn't understand it?
Would I have to write this myself with the backbones that sympy already provides?
There is a differential geometry module within sympy:
http://docs.sympy.org/latest/modules/diffgeom.html
For more examples you can see http://blog.krastanov.org/pages/diff-geometry-in-python.html
To do what is suggested using the diffgeom module, just define your expression using the base coordinates of your manifold:
from sympy.diffgeom.rn import R2
scalar = (R2.r**2 - R2.x**2 - R2.y**2)  # you can mix coordinate systems
gradient = (R2.e_x + R2.e_y).rcall(scalar)
There are various functions for change of coordinates, etc. Probably many things are missing, but it would take usage and bug reports (and help) for all this to get implemented.
You can see some other examples in the test files:
tested examples from a text book https://github.com/sympy/sympy/blob/master/sympy/diffgeom/tests/test_function_diffgeom_book.py
more tests https://github.com/sympy/sympy/blob/master/sympy/diffgeom/tests/test_diffgeom.py
However, for doing what is suggested in your question, going through differential geometry (while possible) would be overkill. You can just use the matrices module:
from sympy import Matrix

def gradient(expr, vars):
    return Matrix([expr.diff(v) for v in vars])
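For example (the symbols and the expression below are just placeholders):

from sympy import symbols

x, y, z = symbols('x y z')
f = x**2 + y*z
grad_f = gradient(f, (x, y, z))  # Matrix([[2*x], [z], [y]])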
Fancier things like matrix Jacobians and more are also implemented.
A final remark: using expressions instead of functions and lambdas will probably result in more readable and idiomatic sympy code (it is often more natural to use subs to substitute symbols instead of some kind of closure, lambda, or function call).