How to use Picat to create CNF files from MiniZinc files?

I'm interested in counting the number of solutions a problem has (not enumerating the solutions). For that, I have tools that work on CNF files, so I want to convert MiniZinc files (mzn, or FlatZinc fzn format) to CNF.
I learned that Picat has the ability to "dump" a CNF file after loading the constraints. Moreover, Picat has a clever module that interprets basic FlatZinc files. I modified the module fzn_picat_sat.pi to "dump" the CNF file. So what I do is convert a MiniZinc file to FlatZinc using mzn2fzn, then try to convert that to CNF using my (slightly) modified version of fzn_picat_sat.pi.
What I want: I expect Picat to load the FlatZinc file and dump an appropriate CNF file. If the original problem has X solutions, I expect the corresponding CNF to have X solutions.
What happens: Most FlatZinc files I tried worked just fine, but some give unwanted results. For example, the following mzn translates to this FlatZinc file. That file has only 211 solutions, but the CNF file dumped by Picat has over 130k solutions. Many SAT solvers can show that the CNF file has more than 211 solutions (for example cryptominisat with the option --maxsol). Weirdly, when I ask Picat to solve the FlatZinc file without translating to CNF, Picat does find only 211 solutions, so the problem seems to be somewhere in the translation. Finally, even when the CNF file has the right number of solutions according to Picat, I still receive an error: % fzn_interpretation_error.
If anyone has tried something like this and succeeded, or if anyone knows how to translate from a CP (constraint programming) language to CNF, any help would be much appreciated. Thanks, everyone.

As mentioned in the comments by Axel Kemper, MiniZinc may add additional variables that should not be used to differentiate multiple solutions. As a simple example, consider the following (artificial) MiniZinc model
predicate separated(var int: x, var int: y) =
    let {
        var int: z
    } in
    x < z /\ z < y;
var 1..10: x;
var 1..10: y;
constraint separated(x, y);
solve satisfy;
Here the predicate separated adds another variable that is just used as a witness that there is some integer strictly between x and y (so the predicate is equivalent to y - x > 1).
The model has 36 solutions ((1,3), (1,4), ..., (8,10)). However, if you consider the solution (3,8), there are 4 different choices of z that make the predicate true. In general, a user is most likely only interested in solutions that differ in the output variables.
When translating the above MiniZinc to CNF, the predicate's internal z variable will not be distinguished from the "real" problem variables x and y. For counting solutions, you would need to identify the literals that correspond to the domains of the output variables in the model, and instruct the SAT solver to consider two solutions different only if those literals differ. This is known as projected model counting, and some model counters support it directly.
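If your counter supports projection, the usual convention is a sampling-set comment line in the DIMACS file. As a minimal sketch, assuming you can extract which DIMACS variable indices your modified fzn_picat_sat.pi assigns to the output variables (that mapping is the part you would have to recover yourself), projected counters such as ApproxMC read a "c ind ... 0" header:

def add_projection_header(cnf_path, output_var_indices, out_path):
    # Prepend a "c ind v1 v2 ... 0" line so a projected model counter
    # (e.g. ApproxMC) counts assignments only over those variables.
    with open(cnf_path) as f:
        cnf = f.read()
    header = "c ind " + " ".join(str(v) for v in output_var_indices) + " 0\n"
    with open(out_path, "w") as f:
        f.write(header + cnf)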

Related

What do major, minor, revision, transpose mean in loading weights in TensorFlow?

Loading weights of YOLOv3 Object-detection-with-yolov3-in-keras
Can somebody please tell me what is happening here in loading the weights, and what the transpose statement means?
import struct

with open(weight_file, 'rb') as w_f:
    major,    = struct.unpack('i', w_f.read(4))
    minor,    = struct.unpack('i', w_f.read(4))
    revision, = struct.unpack('i', w_f.read(4))
    if (major*10 + minor) >= 2 and major < 1000 and minor < 1000:
        w_f.read(8)  # skip the 64-bit 'seen' counter
    else:
        w_f.read(4)  # skip the 32-bit 'seen' counter
    transpose = (major > 1000) or (minor > 1000)
    binary = w_f.read()
It is parsing the weights file, which is saved in the format used by the Darknet framework (source code). As far as I can tell, this format is not formally specified anywhere outside the actual framework code. The relevant parts are in the file src/parser.c, in the functions save_weights_upto and load_weights_upto. As you can probably tell, that piece of Python code is a direct translation of the corresponding C code.
It would seem that the file format begins with three 32-bit integers (although the C code uses sizeof(int), which is not necessarily 32 bits, but whatever) corresponding to the major, minor and revision values of the file format version. Then comes a seen attribute for the network that, depending on the file version, can be an int or a size_t, usually meaning 32 or 64 bits. Finally, transpose is a boolean that is true when the major or the minor version is greater than 1000. The usage of the major, minor and revision fields seems to be, to say the least, unorthodox.
In the Python code, neither seen nor transpose is actually used: transpose is read into a variable, but that variable is never referenced again. In theory it should be, though. I imagine the code works for the file linked in the post, but if transpose happened to be true, the loading of the weights would have to be done differently, as you can see in load_connected_weights.
In any case, all of these are very specific concerns related to the Darknet file format used in that example. Unless you explicitly want to be compatible with several different weight files from Darknet models, you probably don't need to worry too much about that code, as long as it works for your case.
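For illustration, here is a minimal sketch of what honoring the flag would look like for a fully-connected layer, mirroring what Darknet's load_connected_weights does. The on-disk order (biases first, then the float32 weight matrix) follows my reading of parser.c; the sizes and names are otherwise assumptions.

import numpy as np

def read_connected_layer(w_f, n_inputs, n_outputs, transpose):
    # biases come first, then the weight matrix, all float32
    biases = np.frombuffer(w_f.read(4 * n_outputs), dtype=np.float32)
    raw = np.frombuffer(w_f.read(4 * n_inputs * n_outputs), dtype=np.float32)
    if transpose:
        # the matrix was written as (n_inputs, n_outputs); flip it back
        weights = raw.reshape(n_inputs, n_outputs).T
    else:
        weights = raw.reshape(n_outputs, n_inputs)
    return biases, weights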

Computing numeric derivatives with pyomo

I need to repeatedly compute derivatives of nonlinear Pyomo constraints at a given point. According to this post, there are basically two options: a symbolic approach (which uses sympy) and a numeric approach via the NL file generated by Pyomo.
I have tried the symbolic approach which looks like:
import numpy as np
from pyomo.environ import value
from pyomo.core.expr.calculus.derivatives import differentiate  # import path varies by Pyomo version

def gradient(constr, variables):
    # evaluate each symbolic partial derivative at the current point
    grad_num = np.array([value(partial) for partial in
                         differentiate(constr.body, wrt_list=variables)])
    if not is_leq_constr(constr):  # helper from my own code, not shown
        grad_num = -grad_num
    return grad_num
In principle, this approach works. However, when dealing with larger problems the function call takes very long, and I would expect the numeric approach to be much faster.
Is this true? If so, can anyone help me "translate" the above function to the gjh_asl_json approach? I am grateful for any hint.
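Not a full answer, but a sketch of the plumbing for the numeric route. Pyomo can write the model at the current variable values as an AMPL NL file, and gjh_asl_json then evaluates derivatives at that point. The command-line invocation and the output file name below are assumptions, so check the tool's README:

import json
import subprocess

def numeric_derivatives(model, stub="deriv_point"):
    # write the model, at the current variable values, in AMPL NL format
    model.write(stub + ".nl", io_options={"symbolic_solver_labels": True})
    # hypothetical invocation: gjh_asl_json reads the NL file and emits JSON
    subprocess.run(["gjh_asl_json", stub + ".nl"], check=True)
    with open(stub + ".json") as f:  # assumed output file name
        return json.load(f)  # objective gradient and constraint Jacobian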

Algorithm to detect if a file(or string) have been patched

This question is related to string algorithm, not version control tools or management tools.
I learnt the diff algorithm and tried to implement one. That is, given string A and string B, diff calculates a sequence of actions that can convert A into B.
I wonder if it is possible, given a string S and a sequence of actions that the diff algorithm produced, for an algorithm to tell whether S is (a) the original string A, (b) the patched string B, or (c) an unrelated string. And what if S is known to be one of A and B?
Actually, what I'm really doing is researching a method that can tell whether a patch has been applied (at the source code level or binary code level). I googled for some time but didn't find anything useful.
It's pretty complicated, but it can be done, on some level.
Essentially, you parse the source into tokens and then build an abstract syntax tree. Once that is done, you must build a diff tool that can do semantic differential analysis between the abstract syntax trees; SemanticMerge, for example, does that (a toy illustration follows below).
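As a toy illustration of the AST idea, using Python source and Python's own ast module as a stand-in for whatever language you actually need to compare: two files that differ only in formatting or comments produce identical AST dumps, so any difference between the dumps is a real change.

import ast

def semantically_equal(src_a: str, src_b: str) -> bool:
    # identical dumps => the edit was only whitespace/comments
    dump_a = ast.dump(ast.parse(src_a), annotate_fields=False)
    dump_b = ast.dump(ast.parse(src_b), annotate_fields=False)
    return dump_a == dump_b

print(semantically_equal("x = 1  # tweak", "x=1"))   # True
print(semantically_equal("if x: f()", "if x: g()"))  # False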
Once that is done, you have the semantic difference between the two source files, and then you need to define what exactly constitutes a patch.
Some of the rules can be:
1) Variable content was changed
2) An if check was added
The bottom line is, differentiating between a patch and new functionality is not an easy task. The most reliable way is probably to check the binary file version numbers and understand the versioning schema.
E.g., only the minor version is updated if patches are applied.
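Back at the string level, here is a deliberately naive sketch of the original question (is S the pre-patch A, the post-patch B, or unrelated?) that uses the changed regions of the diff as fingerprints. A real patch tool anchors every hunk with surrounding context lines, which is what you would want for reliability.

import difflib

def classify(a: str, b: str, s: str) -> str:
    changed_a, changed_b = [], []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
        if tag != 'equal':
            changed_a.append(a[i1:i2])
            changed_b.append(b[j1:j2])

    def all_present(segments, text):
        segments = [seg for seg in segments if seg]  # pure inserts/deletes leave one side empty
        return bool(segments) and all(seg in text for seg in segments)

    if s == a or all_present(changed_a, s):
        return "original (unpatched)"
    if s == b or all_present(changed_b, s):
        return "patched"
    return "unrelated"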

Manipulating/Clearing Variables via Lists: Mathematica

My problem (in Mathematica) is to refer to the variables given in a particular array and manipulate them in the following manner (as an example):
Inputs: vars={x,y,z}, system=some ODE like x^2+3*x*y+... etc.
(note that I haven't actually created the variables x, y and z)
Aim:
To assign values to the variables in the list vars, with the intention of feeding these values into the system of ODEs. Then, once I am done, clear the values of the variables in the array vars so that it is back in its original form {x,y,z} (and not something like {x,1,3} where y=1 and z=3). I want to do this by referring to the positional elements of vars (the program should not need to know that x, y and z are the actual variables).
The reason why: I am trying to write a program that can handle any number of variables and ODEs, as defined by the user. Since the number of variables and the actual letters used for them are unknown, it is necessary to perform the manipulations through the array itself.
Attempt:
A fixed number of variables is easy. For the arbitrary case, I have tried Modules and Blocks, but with no success. Consider the following code:
Clear[x,y,z,vars,svars]
vars={x,y,z}
svars=Map[ToString,vars]
Module[{vars=vars,svars=svars},
Symbol[svars[[1]]]//Evaluate=1
]
then vars={1,y,z} and not {x,y,z} after running this. I have done functional programming with lists, atoms, etc., so it makes sense to me that vars is changed afterwards, because I have changed x and not vars. However, I cannot get the "x" in the list of variables to remain local. Of course I could put in "x" itself, but that is particular to this specific case. I would prefer to put something like:
Clear[x,y,z,vars,svars]
vars={x,y,z}
svars=Map[ToString,vars]
Module[{vars=vars,svars=svars, vars[[1]]},
Symbol[svars[[1]]]//Evaluate=1
]
which of course doesn't work because vars[[1]] is not a symbol or an assignment to a symbol.
Other possibilities:
I found a function
assignToName[name_String, value_] :=
ToExpression[name, InputForm, Function[var, var = value, HoldAll]]
which looked promising. Basically, name_String is the name of the variable and value is its new value. I attempted to do:
vars={x,y,z}
svars=Map[ToString,vars]
vars[[1]]//Evaluate=1
assignToName[svars[[1]],svars[[1]]]
but then something like D[x^2, vars[[1]]] doesn't work (x is not a valid variable).
If I am missing something, or if perhaps I am going down the wrong path, I'm open to trying other things.
Thanks.
I can't say that I followed your train(s) of thought very well, so these are fragments which might help you answer your own questions, rather than a coherent and fully-formed answer. But to answer your final 'question': I think you may be going down some wrong path(s).
In passing, note that evaluating the expression
vars = {x,y,z}
does in fact define those three variables though it doesn't define any rewrite rules (such as values) for them.
Given a polynomial poly you can extract the variables in it with the function Variables[poly] so something like
Variables[x^2+3*x*y]
should return
{x,y}
Note that I write 'should' rather than 'does' because I don't have Mathematica on this machine, so my syntax may be a bit wonky. Note also that your example ODE is nothing of the sort, but it strikes me that you can probably write a wrapper to manipulate an ODE into a form from which Variables can extract the variables. Mathematica offers a lot of other functions for picking expressions apart and re-assembling them; follow the trails from Variables. It often allows functions defined on lists to be used on expressions with other heads too, so it's always worth experimenting a bit.
There are a couple of widely applicable ways to avoid setting values of variables in Mathematica. For instance, you could write
x^2+3*x*y/.{x->2,y->3}
which will evaluate to
22
but not set values for x and y. This is a very simple example of using (sets of) replacement rules for temporary assignment of values to variables.
The other way to avoid setting values for variables is to define functions using Modules or Blocks both of which define their own contexts. The documentation will tell you all about these two and the differences between them.
I can't help thinking that all your clever tricks using Symbol, ToExpression and ToString are rather beside the point. Spend some time familiarising yourself with Mathematica's built-in functionality before going further down that route; you may well find you don't need to.
Finally, writing, in any language, expressions such as
vars=vars,svars=svars
will lead to madness. It may be syntactically correct, you may even be able to decrypt the semantics when you first write code like that, but in a week's time you will curse your younger self for writing it.

Identifying frequent formulas in a codebase

My company maintains a domain-specific language that syntactically resembles the Excel formula language. We're considering adding new builtins to the language. One way to do this is to identify verbose commands that are repeatedly used in our codebase. For example, if we see people always write the same 100-character command to trim whitespace from the beginning and end of a string, that suggests we should add a trim function.
Seeing a list of frequent substrings in the codebase would be a good start (though the frequently used commands sometimes differ by a few characters because different variable names are used).
I know there are well-established algorithms for doing this, but first I want to see if I can avoid reinventing the wheel. For example, I know this concept is the basis of many compression algorithms, so is there a compression module that lets me retrieve the dictionary of frequent substrings? Any other ideas would be appreciated.
The string matching is just the low-hanging fruit, the obvious cases. The harder cases are where you're doing similar things, but in a different order. For example, suppose you have:
X+Y
Y+X
Your string matching approach won't realize that those are effectively the same. If you want to go a bit deeper, I think you need to parse the formulas into an AST and actually compare the ASTs. If you did that, you could see that the trees are actually the same, since the binary operator '+' is commutative.
You could also apply reduction rules to evaluate complex expressions into simpler ones, for example:
(X * A) + ( X * B)
X * ( A + B )
Those are also the same! String matching won't help you there.
1) Parse into an AST
2) Reduce and optimize the functions
3) Compare the resulting AST to other ASTs
4) If you find a match, replace them with a call to a shared function (a sketch of steps 1 and 3 follows below)
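Here is a minimal sketch of steps 1 and 3 in Python, using its ast module as a stand-in for your own formula parser (your DSL would of course need its own): commutative operators are canonicalized by sorting their operands, so X+Y and Y+X hash to the same key. The reduction step (2) would slot in as additional rewrite rules before counting.

import ast
from collections import Counter

def canonical(node):
    # sort operands of commutative operators so X+Y and Y+X collapse
    if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
        left, right = sorted([canonical(node.left), canonical(node.right)])
        return "(" + type(node.op).__name__ + " " + left + " " + right + ")"
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return repr(node.value)
    return ast.dump(node, annotate_fields=False)

formulas = ["X+Y", "Y+X", "X*A + X*B"]
counts = Counter(canonical(ast.parse(f, mode="eval").body) for f in formulas)
print(counts)  # X+Y and Y+X share a single key with count 2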
I would think you could use an existing full-text indexer like Lucene, and implement your own Analyzer and Tokenizer specific to your formula language.
You would then be able to run queries and see the most used formulas, which ones appear next to each other, etc.
Here's a quick article to get you started:
Lucene Analyzer, Tokenizer and TokenFilter
You might want to look into tag-cloud generators. I couldn't find any source code in the minute I spent looking, but here's an online one:
http://tagcloud.oclc.org/tagcloud/TagCloudDemo, which probably won't work since it uses spaces as delimiters.
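For the frequent-substring baseline the question starts from, a brute-force sketch in Python (note that overlapping windows of one long repeated command will all score high, so in practice you would merge overlaps and normalize identifiers to cope with differing variable names):

from collections import Counter

def frequent_ngrams(sources, n=30, min_count=5):
    # count every length-n character window across the codebase
    counts = Counter()
    for src in sources:
        for i in range(len(src) - n + 1):
            counts[src[i:i + n]] += 1
    return [(sub, c) for sub, c in counts.most_common() if c >= min_count]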
