How can I express scheduling problems in minisat? - constraint-programming

Minisat is a constraint programming/satisfaction tool, there is a version of Minisat which works here in the browser http://www.msoos.org/2013/09/minisat-in-your-browser/
How can I express a scheduling problem with Minisat? Is there a higher level language which compiles to Minisat which would let me express it?
I mean for solving problems like exam timetabling. http://docs.jboss.org/drools/release/6.1.0.Final/optaplanner-docs/html_single/#examination

Another high level modeling language is Picat (http://picat-lang.org/), which have an option to solve/2 to convert to CNF when using the sat module, e.g. "solve([dump], Vars)". The syntax when using the sat module - as well as for the cp and mip modules - is similar to standard CLP syntax.
For some Picat examples, see my Picat page: http://hakank.org/picat/ .

SAT solvers like Minisat or Cryptominisat typically read a clause set of logical OR expressions in Conjunctive Normal Form (CNF). It takes an encoding step to translate your problem into this CNF format.
Circuit SAT Solvers process a nested Boolean expression rather than a CNF. But it appears that this type of solvers is nowadays outperformed by the CNF SAT Solvers.
Constraint programming solvers like Minizinc use a high level language which is easier to write and to comprehend. Depending on the features being used, Minizinc can translate its input language into a CNF/DIMACS format suitable for a SAT solver.
Peter Stuckey's paper "There are no CNF Problems" explains the idea. His slides also contain some insights on scheduling.
Have a look at Minizinc examples for scheduling written by Hakan Kjellerstrand.
Emmanuel Hebrard's Scheduling and SAT is an extensive treatment of the topic.

I worked on this project few months ago.
It was really interesting to do.
To use miniSAT (or any other SAT solvers),
you will have to reduce the Scheduling Problem to a SAT problem.
I can recommand you this question that I asked in 3 parts.
Class Scheduling to Boolean satisfiability [Polynomial-time reduction]
Class Scheduling to Boolean satisfiability [Polynomial-time reduction] part 2
Class Scheduling to Boolean satisfiability [Polynomial-time reduction] Final Part
And you will basically see, step by step, how to transform the Scheduling Problem to a SAT problem that MiniSAT can read and solve :).
Thanks again to #amit who was a very big help in this project.
With this answer, you will be able to solve N rooms with T teachers, who are teaching S subjects to G different group of students :) which is I think, enough for 99% of Scheduling Problems.

Related

Best modelling language for modelling LP/MILP? (NOT solver)

I have a Gurobi licence and I am after a good MILP/LP modelling language, which should be
free/open source
intuitive, i.e. something that looks like (taken from MiniZinc)
var int: x;
constraint x >= 0.5;
solve minimize x;
fast: the time to build the model and send it to Gurobi should be of similar order to the best ones (AMPL GAMS etc.)
flexible/powerful (ability to deal with 3D+ arrays, activate/deactivate constraints easily, provide initial solutions to the solver, etc.)
Of course, and correct me if I'm wrong, AMPL GAMS fail at 1), Python and R fail at 2) (and perhaps at 3)?).
How about GLPK, Minizinc, ZIMPL etc.? They satisfy 1) and 2) but what about 3) and 4)? Are they as good as AMPL in this regard? If not, is there a modelling language satisfying 1-4?
I've used AMPL with Gurobi for mid-sized MIPs (~ 100k-1m variables?) and MiniZinc, mostly with Gecode, for smaller combinatorial problems. I've seen some Gurobi work done with R and Python, but haven't used it that way myself.
I'm less familiar with the other options. My understanding is that GAMS is quite similar to AMPL and much of what I have to say about AMPL may also be valid for GAMS, but I can't vouch for it.
Of course, and correct me if I'm wrong, AMPL GAMS fail at 1),
Yes, generally. There is an exception which probably isn't helpful for your specific requirements but might be useful to others: you can get free use of AMPL, Gurobi, and many other optimisation products, by using the NEOS web service. This is restricted to academic non-commercial purposes and you have to grant NEOS certain rights in relation to the problems you send them; definitely read those terms of service before using it. It also requires waiting for an available server, so if speed is a high priority this probably isn't the solution for you.
Python and R fail at 2) (and perhaps at 3)?).
In my limited experience, yes for (2). AMPL, GAMS, and MiniZinc are designed specifically for defining optimisation problems, so it's unsurprising that their syntax is more user-friendly for that purpose than languages like Python and R.
The flip-side to this is that if you want to do just about anything other than defining an optimisation problem with these languages, Python/R/etc. will probably be better for that purpose.
On speed: for the problems I usually work with, AMPL takes maybe a couple of seconds to build and presolve a MIP model which takes Gurobi a couple of minutes to solve. Obviously this is going to vary somewhat with hardware and details of the problem, but in general I would expect build time to be small compared to solve time for any of the solutions under discussion. Even with a good solver like Gurobi, big MIPs are hard. Many of the serious optimisation programmers I've met do use Python, so I presume the performance side is good enough.
However, that doesn't mean the choice of language/platform is irrelevant to speed. One of the nice features of AMPL (and also GAMS) is presolve, which attempts to reduce the problem size before sending it to the solver. My standard problems have a lot of redundant variables and constraints; AMPL identifies and eliminates many of these, reducing the problem size by about 80% and giving a noticeable improvement in solver time (as compared to runs where I switch off presolve, which I sometimes do for debugging-related reasons). This might be a consideration if you expect a lot of redundancy.
flexible/powerful (ability to deal with 3D+ arrays, activate/deactivate constraints easily, provide initial solutions to the solver, etc.)
MiniZinc handles up to 6D arrays, which may or may not be enough depending on your applications.
It's more flexible than AMPL in some areas and less so in others. AMPL has a lot of set-based functionality that I find useful (e.g. I can define a variable whose index set is something like "pairs of non-identical cities separated by no more than 500 km") and MiniZinc doesn't have this. OTOH, MiniZinc seems to be better than AMPL for solver-hopping, e.g. if I write a MZ model with a combinatorial constraint like "alldifferent" but then try to run it on a solver that doesn't recognise such constraints, MZ will translate it into something the solver can deal with.
I haven't tried deactivating constraints in MZ other than by commenting them out, so I can't help there, and similarly on providing initial solutions.
Overall, MiniZinc is a good choice to consider. Some pluses and minuses relative to AMPL ("free" being a big plus!) but it fills a similar niche.
IMHO, there is no such system if you consider the Python interfaces/modeling environments to SCIP or Gurobi too complicated:
x = model.addVar()
y = model.addVar(vtype="INTEGER")
model.setObjective(x + y)
model.addCons(2*x - y*y >= 0)
model.optimize()
To me this looks quite natural and straight forward. The immense benefit of using an actual programming language instead of modeling language is that you can do anything in there, while there will always be boundaries in the latter.
If you are a looking for a modeling GUI, you should check out LITIC. It can be used almost entirely with drag-and-drop operations: https://litic.com/showcase.html
I've used a lot of the options mentioned, and some not yet mentioned
GAMS
GAMS' Python API
GAMS' MATLAB API
AMPL
FICO Xpress Mosel
FICO Xpress Model's Python API
IBM ILOG OPL
Gurobi's Python API
PuLP (Python)
Pyomo (Python)
Python-MIP
JuMP (Julia)
MATLAB Optimization Toolbox
Google OR-Tools
Based on your requirements, I'd suggest trying Python-MIP, PuLP or JuMP. They are free and have easy syntax with no limit on array dimensionality.
Take a look at Google or-tools. I’m not sure if getting initial solution to the solver is available in all of its interfaces, but if you use it in python, it should probably satisfy all 1-4.

Is incomplete testing versus exhaustive analysis an apples-to-oranges comparison?

I often hear arguments like this: A disadvantage of traditional testing is that it is incomplete whereas Alloy analysis is exhaustive and complete (within a bound). But, the first is talking about software, the second is talking about models. Isn't it an apples-to-oranges comparison?
Update: I was wrong. The comparison is not this: testing code versus analyzing models. That is an apples-to-oranges comparison. Instead, the comparisons are these:
Testing models versus analysis of models.
Testing code versus analysis of code.
Those are apples-to-apples comparisons.
So, whether the artifact is a model or code, you can compare two kinds of analysis: testing, which corresponds to drawing a relatively small number of cases randomly, without a bound on the size, versus small scope analysis, which involves all cases within a small bound.
Thanks to Daniel Jackson for clearing up my misunderstanding.
First, when Alloy was invented, the only existing tools for analyzing models in data-rich languages such as Z and VDM that were not proof-based used scenarios to test the model. Each scenario was constructed by the user, so the approach suffered from the cost of creating the scenarios and the low coverage of their small number.
Second, Alloy has been used to find bugs in code: see the PhD theses by Mandana Vaziri, Mana Taghdiri, Greg Dennis, Juan Pablo Galeotti and others. In all of these, bugs were found that evaded conventional tests.
Third, it's worth noting that bounded-exhaustive forms of testing are becoming viable. Sarfraz Khurshid was a pioneer in this work with his thesis on generating test cases, initially in a tool called TestEra based on Alloy, and later (with Darko Marinov et al) in a tool called Korat that traded a more diected solving method for less declarative constraints.

Haskell: binding to fast and simple SAT solver

Today I wanted too look into options on SAT solving in haskell. First I tought about writing my own interface to the picosat solver.
Then I found out there is the SBV library.
It's interfaces to Z3, Yices, CVC4 and Boolector.
Also, I did a google search on github and it turs out there is even Picosat binding availiable.
Are there any other haskell bindings to SAT solvers that are worth looking at given the constraint of fast/high performance. Carification: that are as suitable for high performance SAT-solving (e.g., problems that run for days, as well as problems that need to finish as fast as possible as I check 2^20 or more SAT problems). For example, what I am particularly missing on hackage is a binding to a fast parrallel SAT solver like Plingeling. (Also, I found out about the current updated picosat binding on github more by accident and I very well might miss other options)
The default option of the SBV library is the Z3 SMT solver. Am I right in my educated guess that picosat is faster for plain SAT-solving than Z3?
Disclosure, I'm the author of the Haskell picosat bindings you mentioned.
SBV is really robust library that's been around for a while, it's good if you want an interface to external SMT or SAT solvers like Yices or Z3. Picosat is a much simpler library that I wrote simply because I wanted a library that could be installed simply without external dependencies.
Am I right in my educated guess that picosat is faster for plain SAT-solving than Z3?
I don't know what your performance constraints are, but as far as underlying solver libraries go you're not going to notice a significant difference between Z3 or Picosat until you hit really enormous problems ( billions of variables ). Both are very heavily optimized libraries and the bottleneck ( at least from the Haskell side ) is likely going to be marshalling data between the library and Haskell's runtime.
SBV is thread-safe.
Comparing Z3 and Lingeling for SAT performance is not an easy task. I'd hazard a guess that they would be more or less the same unless you take your time to figure out the exact parameters to fine tune their internal heuristics.
The good thing is that SBV provides a common interface, so you can change the solver by merely importing a different bridge:
import Data.SBV.Bridge.Z3
vs
import Data.SBV.Bridge.Boolector
and if you compile boolector to use lingeling, then you can test performance easily by merely changing one line of Haskell.

Constraint programming framework for scheduling problem

I'm going to work on a software application for project planning and I’m looking for a constraint programming library that supports interval arithmetic and constraints on real numbers.
The feature I have to implement is the scheduling of project.
Could you advise me a constraint programming framework for such problem?
Thanks in advance!
There is a nice book called "Constraint-Based Scheduling: Applying Constraint Programming to Scheduling Problems" co-authored by Philippe Baptiste, Claude Le Pape and Wim Nuijten.
IBM has a commercial tool called ILOG CPLEX specially designed for constraint solving as means of addressing scheduling questions http://www-01.ibm.com/software/integration/optimization/cplex-optimization-studio/ Free alternatives would be GeoCode http://www.gecode.org/ and Minion http://minion.sourceforge.net/ Both are C++-based.
You should however be aware that constraints solving over the reals is not always decidable and would depend on the kind of constraints you intend to use in your modeling. As long as your constraints are linear, you are safe ;-)

Machine learning in OCaml or Haskell?

I'm hoping to use either Haskell or OCaml on a new project because R is too slow. I need to be able to use support vectory machines, ideally separating out each execution to run in parallel. I want to use a functional language and I have the feeling that these two are the best so far as performance and elegance are concerned (I like Clojure, but it wasn't as fast in a short test). I am leaning towards OCaml because there appears to be more support for integration with other languages so it could be a better fit in the long run (e.g. OCaml-R).
Does anyone know of a good tutorial for this kind of analysis, or a code example, in either Haskell or OCaml?
Hal Daume has written several major machine learning algorithms during his Ph.D. (now he is an assistant professor and rising star in machine learning community)
On his web page, there are a SVM, a simple decision tree and a logistic regression all in OCaml. By reading these code, you can have a feeling how machine learning models are implemented in OCaml.
Another good example of writing basic machine learning models is Owl library for scientific and numeric computations in OCaml.
I'd also like to mention F#, a new .Net language similar to OCaml. Here's a factor graph model written in F# analyzing Chess play data. This research also has a NIPS publication.
While FP is suitable for implementing machine learning and data mining models. But what you can get here most is NOT performance. It is right that FP supports parallel computing better than imperative languages, like C# or Java. But implementing a parallel SVM, or decision tree, has very little relation to do with the language! Parallel is parallel. The numerical optimizations behind machine learning and data mining are usually imperative, writing them pure-functionally is usually hard and less efficient. Making these sophisticated algorithms parallel is very hard task in the algorithm level, not in the language level. If you want to run 100 SVM in parallel, FP helps here. But I don't see the difficulty running 100 libsvm parallel in C++, not to consider that the single thread libsvm is more efficient than a not-well-tested haskell svm package.
Then what do FP languages, like F#, OCaml, Haskell, give?
Easy to test your code. FP languages usually have a top-level interpreter, you can test your functions on the fly.
Few mutable states. This means that passing the same parameter to a function, this function always gives the same result, thus debugging is easy in FPs.
Code is succinct. Type inference, pattern matching, closures, etc. You focus more on the domain logic, and less on the language part. So when you write the code, your mind is mainly thinking about the programming logic itself.
Writing code in FPs is fun.
The only problem I can see is that OCaml doesn't really support multicore parallelism, while GHC has excellent support and performance. If you're looking to use multiple threads of execution, on multiple calls, GHC Haskell will be a lot easier.
Secondly, the Haskell FFI is more powerful (that is, it does more with less code) than OCaml's, and more libraries are avaliable (via Hackage: http://hackage.haskell.org ) so I don't think foreign interfaces will be a deciding factor.
As far as multi-language integration goes, combining C and Haskell is remarkably easy, and I say this as someone who is (unlike dons) not really much of an expert on either. Any other language that integrates well with C shouldn't be much trickier; you can always fall back to a thin interface layer in C if nothing else. For better or worse, C is still the lingua franca of programming, so Haskell is more than acceptable for most cases.
...but. You say you're motivated by performance issues, and want to use "a functional language". From this I infer you're not previously familiar with the languages you ask about. Among Haskell's defining features are that it, by default, uses non-strict evaluation and immutable data structures--which are both incredibly useful in many ways, but it also means that optimizing Haskell for performance is often dramatically different from other languages, and well-honed instincts may lead you astray in baffling ways. You may want to browse performance-related topics on the Haskell wiki to get a feel for the issues.
Which isn't to say that you can't do what you want in Haskell--you certainly can. Both laziness and immutability can in fact be exploited for performance benefits (Chris Okasaki's thesis provides some nice examples). But be aware that there'll be a bit of a learning curve when it comes to dealing with performance.
Both Haskell and OCaml provide the lovely benefits of using an ML-family language, but for most programmers, OCaml is likely to offer a gentler learning curve and better immediate results.
It's hard to give a definitive answer on this. Haskell has the advantages that Don mentioned along with having a more powerful type system and cleaner syntax. OCaml will be easier to learn if you coming from almost any other language (this is because Haskell is as function as functional languages get), and working with mutable random access structures can be a little clunky in Haskell. You will also likely find the performance characteristics of your OCaml code more intuitive than Haskell because of Haskell's lazy evaluation.
Really, I would recommend you evaluate both if you have the time. Here are some relevant Haskell resources:
http://hackage.haskell.org/package/hslibsvm
http://hackage.haskell.org/package/HSvm
Real World Haskell: this is a great freely available book for Haskell
Learn You a Haskell: this tutorial is just plain fun to read
Oh, if you look further into Haskell be sure to sign up for the Haskell Beginners and Haskell Cafe lists. The community is friendly and eager to help out newcomers (is my bias showing?).
If speed is your prime concern then go for C. Haskell is pretty good performance wise but you are never going to get as fast as C. To my knowledge the only functional language that has bettered C in a benchmark is Stalin Scheme but that is very old and nobody really knows how it works.
I've written genetic programming libraries where performance was key and I wrote it in a functional style in C. The functional style allowed me to easily parallelise it using OMP and it scales linearly upto 8 cores within a single process. You certainly can't do that in OCaml although Haskell is improving all the time with regards to concurrency and parallelism.
The downside of using C was that it took me months to finally find all the bugs and stop the core dumps which was extremely challenging because of the concurrency. Haskell would probably have caught 90% of those bugs on the first compilation.
So speed at any cost ? Looking back I'd wish I'd used Haskell as I could stand it to be 2 - 3 times slower if I'd saved over a month in development time.
While dons is correct that multicore parallelism at the thread level is better supported in Haskell, it sounds like you could live with process level parallelism (from your phrase: ideally separating out each execution to run in parallel.) which is supported quite well in OCaml. Keith pointed out that Haskell has a more powerful type system, but it can also be said that OCaml has a more powerful module system than Haskell.
As others have pointed out, OCaml's learning curve will be lower than Haskell's; you'll likely be more productive more quickly in OCaml. That said, learning OCaml is a great stepping-stone towards learning Haskell because many of the underlying concepts are very similar, so you could always migrate to Haskell later and find a lot of things familiar there. And as you pointed out, there is an OCaml-R bridge.
As an examples of Haskell and Ocaml in machine learning see stuff at Hal Daume and Lloyd Allison homepages. IMO it's is much more straightforward to achieve C++-like performance in Ocaml, than in Haskell. Through, as already said, Haskell has much nicer community (packages, tools and support), syntax&features (i.e. FFI, probability monads via typeclasses) and parallel programming support.
Having revamped OCaml-R, I've got a few comments to make on integrating OCaml and R. It might be worthwile to use OCaml to call R code, it works, but is not yet exactly straightforward. So using it to pilot R is worthwile. Integrating R functionality much more thoroughly is still cumbersome as, for example, much remains to be done to export R's type system and data to OCaml in a seamless way (you will have work to do). Moreover, the interaction of R's GC and OCaml's GC is a delicate point: you free n values in O(n^2) time, which isn't nice (to solve this point, you either need a more flexible R API, as far as I understand it, or to implement a GC in the binding itself as a big R array for proper interaction between GCs).
In a nutshell, I'd go for the "pilot R from OCaml" approach.
Contributions on the GC interaction layer and on mapping R datatypes to OCaml are most welcome.
You may want to take a look at this : http://www.haskell.org/pipermail/haskell-cafe/2010-May/077243.html
Late answer but a machine learning library in Haskell is available here : https://github.com/mikeizbicki/HLearn
This library implements various ML algorithms who are designed to have a much faster cross-validation than the usual implementations. It is based on the following paper Algebraic classifiers: a generic approach to fast cross-validation,
online training, and parallel training. The authors claims a 400x speed-up compared to the same task in Weka.
for haskell, consider checking hasktorch (which I managed to use for my AI thesis). for ocaml there seem to be tensorflow bindings.

Resources