Best modelling language for modelling LP/MILP? (NOT solver)

I have a Gurobi licence and I am after a good MILP/LP modelling language, which should be:
1) free/open source
2) intuitive, i.e. something that looks like this (taken from MiniZinc):
    var int: x;
    constraint x >= 0.5;
    solve minimize x;
3) fast: the time to build the model and send it to Gurobi should be of similar order to the best ones (AMPL, GAMS, etc.)
4) flexible/powerful (ability to deal with 3D+ arrays, activate/deactivate constraints easily, provide initial solutions to the solver, etc.)
Of course, and correct me if I'm wrong, AMPL and GAMS fail at 1), and Python and R fail at 2) (and perhaps at 3)?).
How about GLPK, MiniZinc, ZIMPL, etc.? They satisfy 1) and 2), but what about 3) and 4)? Are they as good as AMPL in this regard? If not, is there a modelling language satisfying 1-4?

I've used AMPL with Gurobi for mid-sized MIPs (~ 100k-1m variables?) and MiniZinc, mostly with Gecode, for smaller combinatorial problems. I've seen some Gurobi work done with R and Python, but haven't used it that way myself.
I'm less familiar with the other options. My understanding is that GAMS is quite similar to AMPL and much of what I have to say about AMPL may also be valid for GAMS, but I can't vouch for it.
"Of course, and correct me if I'm wrong, AMPL GAMS fail at 1),"
Yes, generally. There is an exception which probably isn't helpful for your specific requirements but might be useful to others: you can get free use of AMPL, Gurobi, and many other optimisation products, by using the NEOS web service. This is restricted to academic non-commercial purposes and you have to grant NEOS certain rights in relation to the problems you send them; definitely read those terms of service before using it. It also requires waiting for an available server, so if speed is a high priority this probably isn't the solution for you.
"Python and R fail at 2) (and perhaps at 3)?)."
In my limited experience, yes for (2). AMPL, GAMS, and MiniZinc are designed specifically for defining optimisation problems, so it's unsurprising that their syntax is more user-friendly for that purpose than languages like Python and R.
The flip-side to this is that if you want to do just about anything other than defining an optimisation problem with these languages, Python/R/etc. will probably be better for that purpose.
On speed: for the problems I usually work with, AMPL takes maybe a couple of seconds to build and presolve a MIP model which takes Gurobi a couple of minutes to solve. Obviously this is going to vary somewhat with hardware and details of the problem, but in general I would expect build time to be small compared to solve time for any of the solutions under discussion. Even with a good solver like Gurobi, big MIPs are hard. Many of the serious optimisation programmers I've met do use Python, so I presume the performance side is good enough.
However, that doesn't mean the choice of language/platform is irrelevant to speed. One of the nice features of AMPL (and also GAMS) is presolve, which attempts to reduce the problem size before sending it to the solver. My standard problems have a lot of redundant variables and constraints; AMPL identifies and eliminates many of these, reducing the problem size by about 80% and giving a noticeable improvement in solver time (as compared to runs where I switch off presolve, which I sometimes do for debugging-related reasons). This might be a consideration if you expect a lot of redundancy.
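To make the presolve idea concrete, here is a minimal sketch of two classic reductions: substituting out variables whose bounds fix them, and dropping constraints that the variable bounds already guarantee. It is an illustration only and not AMPL's actual presolve; the function name `presolve`, the data layout, and the toy model are all assumptions made for this example.

```python
def presolve(bounds, constraints):
    """Toy presolve for constraints of the form sum(a[v] * x[v]) <= rhs.

    bounds: {var: (lower, upper)}
    constraints: list of (coeffs, rhs), where coeffs is {var: coefficient}
    Returns (fixed, kept): variables fixed by their bounds, and the
    constraints that could not be proven redundant.
    """
    # A variable whose bounds coincide is a constant, not a decision.
    fixed = {v: lo for v, (lo, hi) in bounds.items() if lo == hi}
    kept = []
    for coeffs, rhs in constraints:
        # Move the fixed variables to the right-hand side.
        new_rhs = rhs - sum(a * fixed[v] for v, a in coeffs.items() if v in fixed)
        free = {v: a for v, a in coeffs.items() if v not in fixed}
        # Largest value the left-hand side can take within the bounds.
        max_activity = sum(
            a * (bounds[v][1] if a > 0 else bounds[v][0]) for v, a in free.items()
        )
        if max_activity <= new_rhs:
            continue  # always satisfied, so the constraint is dropped
        kept.append((free, new_rhs))
    return fixed, kept


# Hypothetical toy model: y is fixed at 3 by its bounds.
bounds = {"x": (0, 10), "y": (3, 3), "z": (0, 1)}
constraints = [
    ({"x": 1, "y": 1}, 20),  # x + y <= 20: redundant, since x <= 10 and y = 3
    ({"x": 2, "z": 5}, 12),  # 2x + 5z <= 12: can be violated, so it is kept
    ({"y": 1, "z": 1}, 4),   # y + z <= 4: reduces to z <= 1, implied by bounds
]
fixed, kept = presolve(bounds, constraints)
print(fixed)  # {'y': 3}
print(kept)   # [({'x': 2, 'z': 5}, 12)]
```

In this toy case one variable and two of the three constraints disappear before anything reaches the solver, which is the kind of reduction described above, just on a much smaller scale.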
"flexible/powerful (ability to deal with 3D+ arrays, activate/deactivate constraints easily, provide initial solutions to the solver, etc.)"
MiniZinc handles up to 6D arrays, which may or may not be enough depending on your applications.
It's more flexible than AMPL in some areas and less so in others. AMPL has a lot of set-based functionality that I find useful (e.g. I can define a variable whose index set is something like "pairs of non-identical cities separated by no more than 500 km") and MiniZinc doesn't have this. OTOH, MiniZinc seems to be better than AMPL for solver-hopping, e.g. if I write a MZ model with a combinatorial constraint like "alldifferent" but then try to run it on a solver that doesn't recognise such constraints, MZ will translate it into something the solver can deal with.
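For readers who have not seen this kind of set-based indexing, the following sketch builds the "pairs of non-identical cities separated by no more than 500 km" index set in plain Python. The city names and coordinates are made up for illustration, and the `flow` dictionary merely stands in for the solver variables that a modelling library would create over the same index set.

```python
from itertools import permutations
from math import hypot

# Hypothetical city coordinates in km on a flat map (illustrative data only).
cities = {"A": (0, 0), "B": (300, 300), "C": (700, 300), "D": (100, 0)}


def distance(a, b):
    """Straight-line distance between two cities."""
    (xa, ya), (xb, yb) = cities[a], cities[b]
    return hypot(xa - xb, ya - yb)


# Ordered pairs of non-identical cities at most 500 km apart.
links = [(a, b) for a, b in permutations(cities, 2) if distance(a, b) <= 500]

# One placeholder "variable" per element of the sparse index set.
flow = {pair: 0.0 for pair in links}

print(sorted(links))
```

Only the pairs that satisfy the condition get a variable, so the model never carries entries for city pairs that cannot be used.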
I haven't tried deactivating constraints in MZ other than by commenting them out, so I can't help there, and similarly on providing initial solutions.
Overall, MiniZinc is a good choice to consider. Some pluses and minuses relative to AMPL ("free" being a big plus!) but it fills a similar niche.

IMHO, there is no such system if you consider the Python interfaces/modeling environments to SCIP or Gurobi too complicated:
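from pyscipopt import Model  # PySCIPOpt, the Python interface to SCIP

model = Model()
x = model.addVar()                 # continuous variable
y = model.addVar(vtype="INTEGER")  # integer variable
model.setObjective(x + y)          # minimised by default
model.addCons(2*x - y*y >= 0)      # nonlinear constraint
model.optimize()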
To me this looks quite natural and straightforward. The immense benefit of using an actual programming language instead of a modeling language is that you can do anything in there, while there will always be boundaries in the latter.
If you are looking for a modeling GUI, you should check out LITIC. It can be used almost entirely with drag-and-drop operations: https://litic.com/showcase.html

I've used a lot of the options mentioned, and some not yet mentioned:
GAMS
GAMS' Python API
GAMS' MATLAB API
AMPL
FICO Xpress Mosel
FICO Xpress Model's Python API
IBM ILOG OPL
Gurobi's Python API
PuLP (Python)
Pyomo (Python)
Python-MIP
JuMP (Julia)
MATLAB Optimization Toolbox
Google OR-Tools
Based on your requirements, I'd suggest trying Python-MIP, PuLP or JuMP. They are free and have easy syntax with no limit on array dimensionality.

Take a look at Google OR-Tools. I'm not sure whether providing an initial solution to the solver is available in all of its interfaces, but if you use it from Python, it should probably satisfy all of 1-4.


Consistent terminology: Modeling, DAE, ODE

I am new to the subject "modeling of physical systems". I read some basic literature and did some tutorials in Modelica and Simulink/Simscape. I would like to check whether I understand the following correctly:
Symbolic manipulation is the process of transforming a differential-algebraic system of equations (the physical model, a DAE) into a system of ordinary differential equations (ODE) that can be solved by standard solvers (Runge-Kutta, BDF, ...).
There are also solvers that can solve DAEs directly, but Modelica tools (OpenModelica, Dymola) and Simscape transform the system into an ODE. Why is this approach better than using direct DAE solvers?
A "flat Modelica code" is the result (= ODE) of the transformation.
Thank you very much for your answers.
Symbolic processing for Modelica includes:
remove the object-oriented structure and obtain a hybrid DAE (flat Modelica)
perform matching, index reduction, and causalization to get an ODE
perform optimization (tearing, common subexpression elimination, etc)
generate code for a particular solver
OpenModelica can also solve the system in DAE mode without transforming it to ODE and I guess other Modelica tools can also do that.
A "flat Modelica code" is Modelica code where the object orientation is removed, connect equations are expanded to normal equations. The result is a hybrid DAE.
See Modelica Spec 3.3 for more info about all this (for example Appendix C):
https://modelica.org/documents/ModelicaSpec33Revision1.pdf
So I think your understanding of the terminology is very good too.
Due to the declarative (as opposed to imperative) style of programming in Modelica, we immediately get a very large number of algebraic equations. Solving these (partly) symbolically has, above all, these essential advantages:
Speed. Without symbolically eliminating algebraic loops, Modelica would not be practically usable for any real-world problem, and even with elimination it is only in simple cases that no algebraic equations remain. It would be too slow and would force you to do the transformations manually in Modelica too (as in imperative languages, e.g. C/C++, or in Simulink). Even today Modelica can still be slower than manually transformed and optimized solutions.
Moreover, Modelica applications often need to run simulations in real time.
Correctness. Symbolic transformations are based on proofs, and Modelica applications are often in the area of safety-critical or cyber-physical systems.
One additional consideration is that there are different forms of DAEs, and modeling often leads to high-index DAEs that are complicated to solve numerically (*). (Note: "high" means an index greater than 1, typically 2, but sometimes even higher.)
Symbolic transformations can reduce high-index DAEs to semi-explicit index 1 DAEs, and then by (numerically) solving the systems of equations they are transformed into ODEs.
Thus even if a tool solves DAEs directly it is normally the semi-explicit index 1 DAEs that are solved, not the original high index DAE.
(I know this answer is late. The hybrid part for the symbolic transformations is more complicated, still working on that.)
For more information see https://en.wikipedia.org/wiki/Differential-algebraic_system_of_equations
(*): There are some solvers for high index DAEs (in particular index 2), but typically they rely on a specific structure of the model and finding that structure requires similar techniques as reducing the index to 1.
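As a small illustration of the last step described above (turning a semi-explicit index-1 DAE into an ODE by numerically solving the algebraic part), here is a sketch in Python. The toy system x' = -x*z, 0 = exp(z) - x is an assumption chosen because it has a closed-form solution to compare against; the function names are hypothetical, and real Modelica tools combine this with symbolic steps such as tearing rather than running a plain Newton iteration as shown here.

```python
import math


def solve_algebraic(x, z_guess=0.0, tol=1e-12, max_iter=50):
    """Newton iteration for z in the algebraic constraint 0 = exp(z) - x."""
    z = z_guess
    for _ in range(max_iter):
        residual = math.exp(z) - x
        if abs(residual) < tol:
            break
        # dg/dz = exp(z) is never zero, which is what makes this index 1.
        z -= residual / math.exp(z)
    return z


def rhs(x):
    """ODE right-hand side obtained by eliminating z from x' = -x * z."""
    z = solve_algebraic(x)
    return -x * z


def rk4(f, x, t_end, steps):
    """Classical fixed-step Runge-Kutta integration of x' = f(x)."""
    h = t_end / steps
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x


# Start at x(0) = e; the exact solution of this toy problem is exp(exp(-t)).
x_end = rk4(rhs, x=math.e, t_end=1.0, steps=100)
exact = math.exp(math.exp(-1.0))
print(abs(x_end - exact))
```

The integrator only ever sees an ordinary function `rhs(x)`; the algebraic equation is hidden inside it, which is why a standard ODE solver is sufficient once the index is 1.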

Timetabling/Scheduling Library

I'm looking for a library to help me solve a constraint based logic problem where I need to schedule a number of different events of varying duration. The events have different attributes associated with them and my main issue is that I need to encode "preferences" based on these attributes. These preferences aren't hard constraints, but I would like to maximise how well they are satisfied in the solution. There are also different preferences of competing priorities.
I've taken a look at a few constraint solvers (Sat4j, clasp, Glucose, GlueMiniSat, etc.) but from what I've seen they all seem to only deal with fixed constraints, and setting up preferences would be non-trivial.
I don't care too much about what technology/language it's in - I'm happy to write a wrapper around it.
Absolutely, Choco Solver is a powerful Java constraint solver that is often used for scheduling and planning.
Let's take the following example:
"it would be nice if x = 10"
You can encode preferences in different ways.
1) through variables and constraints.
1.1) reify the constraint with a binary variable
ICF.arithm(x, "=", 10).reifyWith(b);
it basically means b = 1 <=> x = 10 (so the constraint may or may not be satisfied); then you can maximise b (possibly with a weight)
1.2) through gap variables
solver.post(ICF.arithm(x, "-", gap, "=", 10));
then you can minimise the absolute value of gap (possibly with a weight)
to the constraint.
2) through search: when solving the problem, ask the search strategy to try x = 10 before trying another value. There is no optimality proof, but it works quite well in practice.
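The two variable-based encodings above (a reified 0/1 variable and a gap variable, each with a weight) can be sketched without any solver by enumerating a tiny search space in Python. This is not Choco code: the hard constraint x + y = 20, the second preference "y close to 13", the weights, and the function name `best_schedule` are all assumptions invented for the illustration.

```python
from itertools import product


def best_schedule(weight_b, weight_gap):
    """Return (score, x, y) maximising the weighted preferences."""
    best = None
    for x, y in product(range(21), repeat=2):
        if x + y != 20:          # hard constraint: must always hold
            continue
        b = 1 if x == 10 else 0  # reified preference: b = 1 <=> x = 10
        gap = y - 13             # gap variable: y - gap = 13
        score = weight_b * b - weight_gap * abs(gap)
        if best is None or score > best[0]:
            best = (score, x, y)
    return best


print(best_schedule(weight_b=5, weight_gap=1))  # (2, 10, 10)
print(best_schedule(weight_b=2, weight_gap=1))  # (0, 7, 13)
```

With a high weight on b the solution honours "x = 10" and accepts a gap of 3 on y; with a lower weight the gap preference wins instead, which is how competing priorities are traded off through the weights.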
Hope this helps. Please feel free to contact us for more support on Choco Solver: www.cosling.com
best,
I think OptaPlanner is a tool that can help you to solve this problem, check this:
OptaPlanner is a constraint satisfaction solver. It optimizes business
resource planning. Every organization faces scheduling puzzles: assign
a limited set of constrained resources (employees, assets, time and
money) to provide products or services to customers. OptaPlanner
optimizes such planning problems to do more business with less
resources. Use cases include Vehicle Routing, Employee Rostering, Job
Scheduling, Bin Packing and many more.
OptaPlanner is a lightweight, embeddable planning engine. It enables
normal Java™ programmers to solve optimization problems efficiently.
Constraints apply on plain domain objects and can reuse existing code.
There’s no need to input difficult mathematical equations. Under the
hood, OptaPlanner combines sophisticated optimization heuristics and
metaheuristics (such as Tabu Search, Simulated Annealing and Late
Acceptance) with very efficient score calculation.
OptaPlanner is open source software, released under the Apache
Software License. It is written in 100% pure Java™, runs on any JVM
and is available in the Maven Central repository too.
Source:
http://www.optaplanner.org/
It's part of Drools, which has other interesting tools:
http://www.drools.org/
Another actively-maintained library is "choco-solver" (website, GitHub).
Another alternative is the Gecode Toolkit. It is an open-source and modern Constraint Programming Solver.

Opinions on and Experiences with Excel 2010's Evolutionary Solver method

Microsoft has augmented the existing Simplex (Linear) and Gradient (Non-linear) solver engines of the standard Solver Add-In by an Evolutionary solver engine aiming at non-smooth discontinuous problems where global optimal solutions are generally hard (or most of the time even impossible) to find with the other engines. In fact, it is one of the solvers that was previously only available through Frontline's Premium Solver product line, so I think it can be considered a generous addition to the standard solver that ships with Excel.
I haven't heard a lot about people using this new engine and guess that most solver users haven't noticed this recent addition by Microsoft. I became aware of it here: http://office.microsoft.com/en-us/excel-help/what-s-new-in-excel-2010-HA010369709.aspx
I would therefore like to hear about your opinions and experiences with it, also with respect to reasonable settings as it seems to take a lot more time to converge than the other methods.
I've used the Evolutionary solver engine while developing an MPS (master production schedule), trying to cover most aspects and get an optimal solution. I've found this:
Sometimes it gives the optimal solution, but sometimes you have to give it some hints, such as moving some variables at will and seeing what happens. Therefore, I wouldn't recommend it for final decisions, but give it a chance!

Machine learning in OCaml or Haskell?

I'm hoping to use either Haskell or OCaml on a new project because R is too slow. I need to be able to use support vector machines, ideally separating out each execution to run in parallel. I want to use a functional language and I have the feeling that these two are the best so far as performance and elegance are concerned (I like Clojure, but it wasn't as fast in a short test). I am leaning towards OCaml because there appears to be more support for integration with other languages, so it could be a better fit in the long run (e.g. OCaml-R).
Does anyone know of a good tutorial for this kind of analysis, or a code example, in either Haskell or OCaml?
Hal Daume wrote several major machine learning algorithms during his Ph.D. (he is now an assistant professor and a rising star in the machine learning community).
On his web page there are an SVM, a simple decision tree, and a logistic regression, all in OCaml. By reading this code, you can get a feeling for how machine learning models are implemented in OCaml.
Another good example of writing basic machine learning models is the Owl library for scientific and numerical computing in OCaml.
I'd also like to mention F#, a new .NET language similar to OCaml. Here's a factor graph model written in F# analyzing chess play data. This research also has a NIPS publication.
FP is suitable for implementing machine learning and data mining models, but what you gain most from it is NOT performance. It is true that FP supports parallel computing better than imperative languages like C# or Java, but implementing a parallel SVM or decision tree has very little to do with the language! Parallel is parallel. The numerical optimizations behind machine learning and data mining are usually imperative; writing them in a purely functional way is usually hard and less efficient. Making these sophisticated algorithms parallel is a very hard task at the algorithm level, not at the language level. If you want to run 100 SVMs in parallel, FP helps here. But I don't see the difficulty in running 100 libsvm instances in parallel in C++, not to mention that single-threaded libsvm is more efficient than a not-well-tested Haskell SVM package.
Then what do FP languages, like F#, OCaml, Haskell, give?
Easy to test your code. FP languages usually have a top-level interpreter, you can test your functions on the fly.
Few mutable states. This means that passing the same argument to a function always gives the same result, so debugging is easy in FP languages.
Code is succinct. Type inference, pattern matching, closures, etc. You focus more on the domain logic, and less on the language part. So when you write the code, your mind is mainly thinking about the programming logic itself.
Writing code in FPs is fun.
The only problem I can see is that OCaml doesn't really support multicore parallelism, while GHC has excellent support and performance. If you're looking to use multiple threads of execution, on multiple calls, GHC Haskell will be a lot easier.
Secondly, the Haskell FFI is more powerful (that is, it does more with less code) than OCaml's, and more libraries are available (via Hackage: http://hackage.haskell.org ), so I don't think foreign interfaces will be a deciding factor.
As far as multi-language integration goes, combining C and Haskell is remarkably easy, and I say this as someone who is (unlike dons) not really much of an expert on either. Any other language that integrates well with C shouldn't be much trickier; you can always fall back to a thin interface layer in C if nothing else. For better or worse, C is still the lingua franca of programming, so Haskell is more than acceptable for most cases.
...but. You say you're motivated by performance issues, and want to use "a functional language". From this I infer you're not previously familiar with the languages you ask about. Among Haskell's defining features are that it, by default, uses non-strict evaluation and immutable data structures--which are both incredibly useful in many ways, but it also means that optimizing Haskell for performance is often dramatically different from other languages, and well-honed instincts may lead you astray in baffling ways. You may want to browse performance-related topics on the Haskell wiki to get a feel for the issues.
Which isn't to say that you can't do what you want in Haskell--you certainly can. Both laziness and immutability can in fact be exploited for performance benefits (Chris Okasaki's thesis provides some nice examples). But be aware that there'll be a bit of a learning curve when it comes to dealing with performance.
Both Haskell and OCaml provide the lovely benefits of using an ML-family language, but for most programmers, OCaml is likely to offer a gentler learning curve and better immediate results.
It's hard to give a definitive answer on this. Haskell has the advantages that Don mentioned along with having a more powerful type system and cleaner syntax. OCaml will be easier to learn if you are coming from almost any other language (this is because Haskell is as functional as functional languages get), and working with mutable random access structures can be a little clunky in Haskell. You will also likely find the performance characteristics of your OCaml code more intuitive than Haskell's because of Haskell's lazy evaluation.
Really, I would recommend you evaluate both if you have the time. Here are some relevant Haskell resources:
http://hackage.haskell.org/package/hslibsvm
http://hackage.haskell.org/package/HSvm
Real World Haskell: this is a great freely available book for Haskell
Learn You a Haskell: this tutorial is just plain fun to read
Oh, if you look further into Haskell be sure to sign up for the Haskell Beginners and Haskell Cafe lists. The community is friendly and eager to help out newcomers (is my bias showing?).
If speed is your prime concern then go for C. Haskell is pretty good performance wise but you are never going to get as fast as C. To my knowledge the only functional language that has bettered C in a benchmark is Stalin Scheme but that is very old and nobody really knows how it works.
I've written genetic programming libraries where performance was key, and I wrote them in a functional style in C. The functional style allowed me to easily parallelise the code using OpenMP, and it scales linearly up to 8 cores within a single process. You certainly can't do that in OCaml, although Haskell is improving all the time with regard to concurrency and parallelism.
The downside of using C was that it took me months to finally find all the bugs and stop the core dumps which was extremely challenging because of the concurrency. Haskell would probably have caught 90% of those bugs on the first compilation.
So, speed at any cost? Looking back, I wish I'd used Haskell, as I could have lived with it being 2-3 times slower if it had saved me over a month of development time.
While dons is correct that multicore parallelism at the thread level is better supported in Haskell, it sounds like you could live with process level parallelism (from your phrase: ideally separating out each execution to run in parallel.) which is supported quite well in OCaml. Keith pointed out that Haskell has a more powerful type system, but it can also be said that OCaml has a more powerful module system than Haskell.
As others have pointed out, OCaml's learning curve will be lower than Haskell's; you'll likely be more productive more quickly in OCaml. That said, learning OCaml is a great stepping-stone towards learning Haskell because many of the underlying concepts are very similar, so you could always migrate to Haskell later and find a lot of things familiar there. And as you pointed out, there is an OCaml-R bridge.
As examples of Haskell and OCaml in machine learning, see the material on Hal Daume's and Lloyd Allison's homepages. IMO it is much more straightforward to achieve C++-like performance in OCaml than in Haskell. Though, as already said, Haskell has a much nicer community (packages, tools and support), syntax and features (e.g. FFI, probability monads via typeclasses), and parallel programming support.
Having revamped OCaml-R, I've got a few comments to make on integrating OCaml and R. It might be worthwhile to use OCaml to call R code; it works, but is not yet exactly straightforward. So using it to pilot R is worthwhile. Integrating R functionality much more thoroughly is still cumbersome as, for example, much remains to be done to export R's type system and data to OCaml in a seamless way (you will have work to do). Moreover, the interaction of R's GC and OCaml's GC is a delicate point: you free n values in O(n^2) time, which isn't nice (to solve this point, you either need a more flexible R API, as far as I understand it, or to implement a GC in the binding itself as a big R array for proper interaction between GCs).
In a nutshell, I'd go for the "pilot R from OCaml" approach.
Contributions on the GC interaction layer and on mapping R datatypes to OCaml are most welcome.
You may want to take a look at this: http://www.haskell.org/pipermail/haskell-cafe/2010-May/077243.html
Late answer, but a machine learning library in Haskell is available here: https://github.com/mikeizbicki/HLearn
This library implements various ML algorithms that are designed to have much faster cross-validation than the usual implementations. It is based on the paper "Algebraic classifiers: a generic approach to fast cross-validation, online training, and parallel training". The authors claim a 400x speed-up compared to the same task in Weka.
For Haskell, consider checking out hasktorch (which I managed to use for my AI thesis). For OCaml, there seem to be TensorFlow bindings.

Comparison of GAMS versus AMPL Algebraic Modelling Languages

I'd be interested in getting the opinions of users of GAMS and AMPL on what the strengths and weaknesses of each of these languages are.
In terms of functionality they are pretty much the same, allowing you to express most types of optimization problems. Personally, I prefer AMPL because it has an intuitive and expressive syntax and is very well documented in the book. Another important advantage of AMPL is that, despite the fact that it is commercial, you can avoid vendor lock-in because there is an open source alternative: GNU MathProg. GAMS, on the other hand, used to have a more advanced IDE than those that existed for AMPL, although that changed with the introduction of the new AMPL IDE.
You can find an example of the same transportation problem from George Dantzig formulated in AMPL and GAMS in their Wikipedia articles: AMPL and GAMS.
This blog has the following to say:
Both systems are very good in what they are doing and widely used, so you cannot
really go wrong with either choice. I would probably suggest to add extra points
for the modeling system that is used by your colleagues and collaborators. That
makes exchanging models and data easier and also is easier when discussing
problems, tricks, issues etc.
Bob Fourer (AMPL) answered:
It's hard to find someone who can give equally expert advice on two competing
systems, as once you become familiar with one of them you don't usually have
much incentive to keep learning about the other. But here are a few comments
from my hardly unbiased view.
AMPL was designed with the idea of being much closer to mathematical notation
and generally much more natural to use than GAMS, and it's superior on that
score. A GAMS model typically relies on more special conventions and
reformulations than its AMPL counterpart; a case in point is the often extensive
use of the GAMS $ operator to impose various conditions. Also, the IDE
notwithstanding, GAMS is fundamentally more of a batch system whereas AMPL
offers a more flexible option of interactively exploring models and results.
Finally while in certain areas GAMS is established through long use, still I see
modelers in these areas choosing AMPL, particularly when they are undertaking
new projects that do not depend on existing GAMS models.
In my opinion AMPL and GAMS are closer in practice than suggested here (e.g.
where you use $ in GAMS, one would use : in AMPL). I actually slightly prefer
the GAMS syntax when doing real work, as it is a little bit more compact and it
is obvious where a summation ends (in AMPL this is based on operator priority,
in GAMS a sum is visually bracketed by parentheses).
In my opinion, all syntax considerations are really a matter of taste; both AMPL and GAMS languages are easy to learn and offer arguably the same scope in terms of the types of models that can be considered.
At the moment of writing this post, GAMS offers a larger number of solvers. That being said, AMPL's list of solvers is not a subset of the list of GAMS solvers. For a specific application, I suggest benchmarking solvers before buying either AMPL or GAMS (for example, via the NEOS server for optimization).
Personally, I prefer the syntax of AMPL since it is closer to mathematical notation. However, I prefer GAMS for industrial applications mainly because of solver availability and because it is embedded/proven in many industries. This often simplifies dialogue with an industrial partner/client who already uses GAMS.
