Python - How to parse Boolean Sympy tree expressions to Boolean Z3Py expressions - python-3.x

I have some CNFs of Boolean expressions from the logic module in Sympy.
I get their Sympy expression trees with srepr() (see documentation).
Find below an example with two CNFs.
from sympy import Symbol
from sympy.logic.boolalg import And, Or, Not
# (a | ~c) & (a | ~e) & (c | e | ~a)
expr_1 = And(Or(Symbol('a'), Not(Symbol('c'))), Or(Symbol('a'), Not(Symbol('e'))), Or(Symbol('c'), Symbol('e'), Not(Symbol('a'))))
# (b | ~d) & (b | ~e) & (d | e | ~b)
expr_2 = And(Or(Symbol('b'), Not(Symbol('d'))), Or(Symbol('b'), Not(Symbol('e'))), Or(Symbol('d'), Symbol('e'), Not(Symbol('b'))))
I want to give those expression trees to a Z3Py solver as Boolean constraints.
For that, I think that I need to:
transform sympy.Symbol() to z3.Bool(), and
transform the sympy logic operators to Z3 logic operators (e.g., sympy.logic.boolalg.And() to z3.And())
Then, I would add the constraints to a Z3 solver to output a solution.
If we continue with the example, as I see it, we would have the two following constraints (I wrote explicitly that I use Z3 Boolean operators to avoid confusion with the Sympy ones):
import z3
from z3 import Bool
const_1 = z3.And(z3.Or(Bool('a'), z3.Not(Bool('c'))), z3.Or(Bool('a'), z3.Not(Bool('e'))), z3.Or(Bool('c'), Bool('e'), z3.Not(Bool('a'))))
const_2 = z3.And(z3.Or(Bool('b'), z3.Not(Bool('d'))), z3.Or(Bool('b'), z3.Not(Bool('e'))), z3.Or(Bool('d'), Bool('e'), z3.Not(Bool('b'))))
How could we parse Sympy Boolean expression trees for Z3Py in an automated fashion? Is there a better way to do it than what I presented as an example?

You're on the right track. Essentially, you need to "compile" SymPy to Z3. This can be achieved in a variety of ways, but it's not a cheap or easy thing to do in general, since you'd need to cover large swaths of SymPy. However, it looks like your expressions are "simple" enough that you can get away with a simple translator. Start by looking at how SymPy trees can be recursively processed: https://docs.sympy.org/latest/tutorials/intro-tutorial/manipulation.html#recursing-through-an-expression-tree
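As a rough illustration of the recursive traversal mentioned above, here is a minimal sketch that walks a SymPy Boolean expression through expr.func and expr.args and lists its nodes. The helper name dump_tree is made up for this example, and the order in which SymPy stores the arguments of Or is not assumed.

```python
from sympy import symbols
from sympy.logic.boolalg import Not, Or


def dump_tree(expr, depth=0):
    """Return one indented line per node: operator name for inner nodes, str() for leaves."""
    label = expr.func.__name__ if expr.args else str(expr)
    lines = ["  " * depth + label]
    for arg in expr.args:
        # Every argument is itself an expression, so recurse into it.
        lines.extend(dump_tree(arg, depth + 1))
    return lines


a, c = symbols("a c")
tree_lines = dump_tree(Or(a, Not(c)))
print("\n".join(tree_lines))
```

A translator to Z3 has the same shape: instead of collecting labels, each node returns the Z3 object built from the translated arguments.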
If you're in a hurry, you can use Axel's program, given in the other answer. Here's a version that is probably a bit more idiomatic, easier to extend, and more robust:
import sympy
import z3
# Sympy vs Z3. Add more correspondences as necessary!
table = { sympy.logic.boolalg.And : z3.And
, sympy.logic.boolalg.Or : z3.Or
, sympy.logic.boolalg.Not : z3.Not
, sympy.logic.boolalg.Implies: z3.Implies
}
# Sympy vs Z3 Constants
constants = { sympy.logic.boolalg.BooleanTrue : z3.BoolVal(True)
, sympy.logic.boolalg.BooleanFalse: z3.BoolVal(False)
}
def compile_to_z3(exp):
    """Compile sympy expression to z3"""
    pexp = sympy.parsing.sympy_parser.parse_expr(exp)
    pvs = {v: z3.Bool(str(v)) for v in pexp.atoms() if type(v) not in constants}
    def cvt(expr):
        if expr in pvs:
            return pvs[expr]
        texpr = type(expr)
        if texpr in constants:
            return constants[texpr]
        if texpr in table:
            return table[texpr](*map(cvt, expr.args))
        raise NameError("Unimplemented: " + str(expr))
    return cvt(pexp)
if __name__ == '__main__':
    z3.solve(compile_to_z3("false"))
    z3.solve(compile_to_z3("a & ~b | c"))
    z3.solve(compile_to_z3("false >> (a & ~b | c)"))
This prints:
no solution
[c = False, b = False, a = True]
[]
You can add new functions to table to extend it as you see fit.

I couldn't resist and implemented a basic converter.
from sympy import symbols
from sympy.parsing.sympy_parser import parse_expr
import z3
from z3 import Solver, sat
# extract all variables and define a SymPy expression
def create_sympy_expression(expr):
    declare_sympy_symbols(expr)
    return parse_expr(expr)
# assume all single-character operands as SymPy variables
dicz3sym = {}
def declare_sympy_symbols(expr):
    for c in expr:
        if 'a' <= c <= 'z':
            if not c in dicz3sym:
                dicz3sym[c] = z3.Bool(c)
def transform_sympy_to_z3(exp):
    params = [transform_sympy_to_z3(arg) for arg in exp.args]
    func = str(exp.func)
    if func == "And":
        return z3.And(params)
    elif func == "Not":
        return z3.Not(params[0])
    elif func == "Or":
        return z3.Or(params)
    # only symbols have a name; anything else falls through to the error below
    elif getattr(exp, "name", None) in dicz3sym:
        return dicz3sym[exp.name]
    else:
        raise NameError("unknown/unimplemented operator: " + func)
if __name__ == '__main__':
    exp = create_sympy_expression("a & ~b | c")
    z3exp = transform_sympy_to_z3(exp)
    s = Solver()
    s.add(z3exp)
    if s.check() == sat:
        m = s.model()
        print("Solution found:")
        for v in dicz3sym:
            print(f"{v} = {m[dicz3sym[v]]}")
    else:
        print("No solution. Sorry!")

Related

Implementing an efficient 2-SAT solving algorithm

I was reading about the 2-SAT problem on Wikipedia and I was wondering what the O(n) algorithm looks like in Python.
So far I've only found implementations that either are in other programming languages or that just determine whether an expression has solutions or not, without giving the solution itself.
How could the O(n) algorithm for finding the values of variables be written in Python?
Here is an OOP implementation in Python:
import re
class TwoSat:
    class Variable:
        def __init__(self, name, negated=None):
            self.name = name
            self.negated = negated or TwoSat.Variable("~" + name, self)
            self.implies = set()
            self.impliedby = set()
            self.component = -1
        def disjunction(self, b):
            self.negated.implication(b)
            b.negated.implication(self)
        def implication(self, b):
            self.implies.add(b)
            b.impliedby.add(self)
        def postorder(self, visited):
            if self not in visited:
                visited.add(self)
                for neighbor in self.implies:
                    yield from neighbor.postorder(visited)
                yield self
        def setcomponent(self, component):
            if self.component == -1:
                self.component = component
                for neighbor in self.impliedby:
                    neighbor.setcomponent(component)
        def value(self):
            diff = self.component - self.negated.component
            return diff > 0 if diff else None
    ### end of class Variable
    def __init__(self, s):
        self.variables = {}
        for a_neg, a_name, b_neg, b_name in re.findall(r"(~)?(\w+).*?(~)?(\w+)", s):
            self.getvariable(a_neg, a_name).disjunction(self.getvariable(b_neg, b_name))
    def getvariable(self, neg, name):
        if name not in self.variables:
            self.variables[name] = TwoSat.Variable(name)
            self.variables["~" + name] = self.variables[name].negated
        a = self.variables[name]
        return a.negated if neg else a
    def postorder(self):
        visited = set()
        for startvariable in self.variables.values():
            yield from startvariable.postorder(visited)
    def setcomponents(self):
        for i, variable in enumerate(reversed(list(self.postorder()))):
            variable.setcomponent(i)
    def issolved(self):
        return all(variable.value() is not None for variable in self.variables.values())
    def solve(self):
        self.setcomponents()
        return self.issolved()
    def truevariables(self):
        if self.issolved():
            return [variable.name for variable in self.variables.values() if variable.value()]
    def __repr__(self):
        return " ∧ ".join(
            f"({a.name} → {b.name})"
            for a in self.variables.values()
            for b in a.implies
        )
Here is an example of how this class can be used:
problem = TwoSat("(~a+~b)*(b+~c)*(c+g)*(d+a)*(~f+i)*(~i+~j)*(~h+d)*(~d+~b)*(~f+c)*(h+~i)*(i+~g)")
print(problem)
problem.solve()
print("solution: ", problem.truevariables())
The TwoSat constructor takes one argument, a string, which should provide the conjunction of disjunction pairs. The syntax rules for this string are:
literals must use alphanumeric characters (underscores allowed), representing a variable, optionally prefixed with a ~ to denote negation.
All other characters are just taken as separators and are not validated.
All literals are taken in pairs and each consecutive pair is assumed to form a disjunction clause.
If the number of literals is odd, then although that expression is not a valid 2SAT expression, the last literal is simply ignored.
So the above example could also have taken this string representing the same problem:
problem = TwoSat("~a ~b b ~c c g d a ~f i ~i ~j ~h d ~d ~b ~f c h ~i i ~g")
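The pairing rules above can be seen directly by running the constructor's regular expression on its own. This is a small sketch; the helper name literal_pairs is made up for illustration. It also shows one caveat: a trailing unpaired literal is only dropped when its name is a single character, because a longer name can be split in two by backtracking.

```python
import re

# Same pattern as in TwoSat.__init__: two literals, each optionally prefixed with "~".
CLAUSE_PATTERN = re.compile(r"(~)?(\w+).*?(~)?(\w+)")


def literal_pairs(text):
    """Return the (literal, literal) pairs the pattern extracts from text."""
    return [
        (a_neg + a_name, b_neg + b_name)
        for a_neg, a_name, b_neg, b_name in CLAUSE_PATTERN.findall(text)
    ]


pairs_symbolic = literal_pairs("(~a+~b)*(b+~c)")
pairs_plain = literal_pairs("~a ~b b ~c")
pairs_odd = literal_pairs("a b c")      # the lone "c" is ignored
pairs_split = literal_pairs("a b cd")   # the lone "cd" is split into "c" and "d"

print(pairs_symbolic)  # [('~a', '~b'), ('b', '~c')]
print(pairs_plain)     # [('~a', '~b'), ('b', '~c')]
print(pairs_odd)       # [('a', 'b')]
print(pairs_split)     # [('a', 'b'), ('c', 'd')]
```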
Alternatively, you can use the getvariable and disjunction methods to build the expression. Look at the __init__ method to see how the constructor uses those methods when parsing the string. For example:
# The constructor requires a string argument; an empty one defines no clauses
problem = TwoSat("")
for variable in "abcdefghij":
    problem.getvariable(False, variable)
# Define the disjunction ~a + ~b:
problem.variables["a"].negated.disjunction(problem.variables["b"].negated)
# ...etc
The algorithm is the one explained in the 2-satisfiability article on Wikipedia, identifying strongly connected components using Kosaraju's algorithm.
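One way to sanity-check the output of a solver like the one above on small inputs is an independent checker plus an exhaustive search. This sketch is not part of the answer's class: parse_clauses, satisfies, and brute_force are hypothetical helper names, and the exhaustive search is exponential, so it is only meant for testing.

```python
import re
from itertools import product


def parse_clauses(text):
    """Parse '(~a+b)*(c+d)' style text into [((name, negated), (name, negated)), ...]."""
    found = re.findall(r"(~)?(\w+).*?(~)?(\w+)", text)
    return [((a, bool(a_neg)), (b, bool(b_neg))) for a_neg, a, b_neg, b in found]


def satisfies(clauses, true_names):
    """Return True if every clause has at least one true literal."""
    def literal_value(name, negated):
        return (name in true_names) != negated
    return all(literal_value(*x) or literal_value(*y) for x, y in clauses)


def brute_force(clauses):
    """Try every assignment; return a set of true variable names, or None if unsatisfiable."""
    names = sorted({name for clause in clauses for name, _ in clause})
    for values in product([False, True], repeat=len(names)):
        true_names = {name for name, value in zip(names, values) if value}
        if satisfies(clauses, true_names):
            return true_names
    return None


clauses = parse_clauses(
    "(~a+~b)*(b+~c)*(c+g)*(d+a)*(~f+i)*(~i+~j)*(~h+d)*(~d+~b)*(~f+c)*(h+~i)*(i+~g)"
)
found = brute_force(clauses)
print(found is not None and satisfies(clauses, found))
print(brute_force(parse_clauses("(a+a)*(~a+~a)")))
```

To check a result from truevariables(), drop the names that start with "~" first, since that method also lists negated literals that evaluate to true.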

How can one represent distinct non-numeric symbols in sympy?

I am experimenting with the representation of a trivial statistics problem in Sympy:
For a sample space S, there are 6 possible distinct outcomes
a,b,c,d,e,f. We can define event A as having occurred if any of
a,b,c have, and event B as having occurred if any of b,c,d have.
Given a probability mass function pmf defined over S, what is the
probability of event A?
When attempting to implement this symbolically, I receive the following error:
~/project/.envs/dev/lib/python3.6/site-packages/sympy/stats/frv.py in _test(self, elem)
164 elif val.is_Equality:
165 return val.lhs == val.rhs
--> 166 raise ValueError("Undecidable if %s" % str(val))
167
168 def __contains__(self, other):
ValueError: Undecidable if Eq(d, a) | Eq(d, b) | Eq(d, c)
The problem is implemented as follows with comments on the failing lines of code:
from sympy import Eq, Function, symbols
from sympy.logic import Or
from sympy.sets import FiniteSet, Union
from sympy.stats import FiniteRV, P
# 1. Define a sample space S with outcomes: a,b,c,d,e,f; Define events A, B
A = FiniteSet(*symbols('a b c'))
B = FiniteSet(*symbols('b c d'))
S = Union(A, B, FiniteSet(*symbols('e f')))
display("Sample Space", S)
pmfFunc = Function("pmf")
pmfDict = {v: pmfFunc(v) for v in S}
X = FiniteRV('X', pmfDict)
a,b = symbols('a b')
# 2. P(X = a) = pmf(a)
display(P(Eq(X,a)))
# 3. A.as_relational(X) yields `(X=a) \lor (X=b) \lor (X=c)`
display(A.as_relational(X))
# 4. P(X = a \lor X = b) = pmf(a) + pmf(b)
# - Actual Output: ValueError: Undecidable if Eq(c, a) | Eq(c, b)
display(P(Or(Eq(X,a), Eq(X,b)))) # [FAILS]
# 5. P(A) = pmf(a) + pmf(b) + pmf(c)
# - Actual Output: ValueError: Undecidable if Eq(d, a) | Eq(d, b) | Eq(d, c)
display(P(A.as_relational(X))) # [FAILS]
I obtain expected output up to display(A.as_relational(X)):
Interpreting the failure message suggests that Sympy is unable to tell that the set members are distinct. Replacing the symbols with integers resolves the error and I get output similar to what I desire.
A = FiniteSet(1, 2, 3)
B = FiniteSet(2, 3, 4)
S = Union(A, B, FiniteSet(5, 6))
If I am not misunderstanding the error or the fundamental use of the library, is there a way to tell Sympy that a collection of symbols is entirely distinct? I have attempted to replace the symbols with Dummy instances without success, and I have also attempted to leverage the assumptions module without success:
facts = [Eq(a,b) if a is b else Not(Eq(a,b)) for a, b in itertools.product(S, S)]
with assuming(*facts):
I would like to avoid confusing mappings between integers and symbolic forms, as user error may not be apparent when the results are printed as latex. I am willing to adopt some burden in a workaround (e.g., as it would have been maintaining a collection of Dummy instances), but I have yet to find an acceptable workaround.
Interesting question. Maybe it can be done with a "with assuming(Ne(a,b), ...):" context, but I take a more pragmatic approach: replace the symbols with cos(non-zero integer), which SymPy can easily distinguish as equal or not:
>>> reps = dict(zip(var('a:f'),(cos(i) for i in range(1,7))))
>>> ireps = {v:k for k,v in reps.items()}
>>> a,b,c,d,e,f = [reps[i] for i in var('a:f')]
Then remove your a, b = symbols... line and replace display(x) with display(x.subs(ireps)) to get
('Sample Space', FiniteSet(a, b, c, d, e, f))
(pmf(a),)
(Eq(X, a) | Eq(X, b) | Eq(X, c),)
(pmf(a) + pmf(b),)
(I use cos(int) instead of int because I am not sure whether any computation would result in addition of two elements and I want to make sure they stay distinct.)
Another approach would be to define a constant class that derives from Symbol:
from sympy import Symbol
class con(Symbol):
    def __hash__(self):
        return id(self)
    def __eq__(a,b):
        if isinstance(b, con):
            return a.name == b.name
    _eval_Eq = __eq__
a,b,c,d,e,f=map(con,'abcdef')
display=lambda*x:print(x)
from sympy import Eq, Function, symbols
from sympy.logic import Or
from sympy.sets import FiniteSet, Union
from sympy.stats import FiniteRV, P
A = FiniteSet(a,b,c)
B = FiniteSet(b,c,d)
S = Union(A, B, FiniteSet(e,f))
pmfFunc = Function("pmf")
pmfDict = {v: pmfFunc(v) for v in S}
X = FiniteRV('X', pmfDict)
display("Sample Space", S)
display(P(Eq(X,a)))
display(A.as_relational(X))
display(P(Or(Eq(X,a), Eq(X,b))))
display(P(A.as_relational(X)))
gives
('Sample Space', FiniteSet(a, b, c, d, e, f))
(pmf(a),)
(Eq(X, a) | Eq(X, b) | Eq(X, c),)
(pmf(a) + pmf(b),)
(pmf(a) + pmf(b) + pmf(c),)

Formatting a string with elements in a list based on conditions

So I've created a series of grammars for use within a method in a class I've created. Each list can be n elements long, so placing each word via list index is pretty straightforward (wordlist[1:]); however, I need to use an | operator, and that can't be done with explicit string indexes (at least I think so). This is what I've written so far:
noun_types = ['port', 'harbor', 'harbour']
target_pronouns = ['rotterdam', 'moscow']
grammer1 = (
F"""
S -> Det N P NP
P -> P
NP -> '{target_pronouns[0]}' | '{target_pronouns[1]}'
Det -> 'the' | 'a'
P -> 'of'
N -> '{noun_types[0]}' | '{noun_types[1]}' | '{noun_types[2]}'
""")
Ideally, I'd like to be able to pass a list of n number of pronouns and nouns and have the strings be formatted with each element without explicit string indexes, so something like this:
noun_types = ['port', 'harbor', 'harbour']
target_pronouns = ['rotterdam', 'moscow']
grammer1 = (
F"""
S -> Det N P NP
P -> P
NP -> '{target_pronouns[range(0, len(target_pronouns))]}'
Det -> 'the' | 'a'
P -> 'of'
N -> '{noun_types[range(0, len(target_pronouns))]}'
""")
However, I'm not sure how to implement the | operator, much less any conditional formatting, when doing string formatting. The grammar format is based on nltk's grammar constructor, used in this context:
from nltk.parse.generate import generate
from nltk import CFG
grammar = CFG.fromstring(grammer1)
for sentence in generate(grammar, n = 10, depth = 5):
    words = ' '.join(sentence)
It's a bit of a confusing question, so I'm happy to try and clarify any confusion!
So I think there is a hacky way to do that: wrap your strings in ' characters beforehand, and then just plug them into your f-string using " | ".join().
Add a ' both before and after each string in your input lists:
noun_types = [f"'{noun}'" for noun in noun_types]
target_pronouns = [f"'{pronoun}'" for pronoun in target_pronouns]
Now you can just put them into the f-string using " | ".join(). This works regardless of the sizes of your input lists, with no need for indices.
print(f"""
NP -> {' | '.join(target_pronouns)}
N -> {' | '.join(noun_types)}
""")
Output:
NP -> 'rotterdam' | 'moscow'
N -> 'port' | 'harbor' | 'harbour'
Another solution, if things get more complicated, could be to go into Jinja templating although right now it seems sufficient to hack it and avoid an extra library.
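Putting the two steps together, the whole grammar string could be assembled by a small helper. This is only a sketch: the function names alternatives and build_grammar are invented for this example, and the self-referential rule P -> P from the question is left out.

```python
def alternatives(words):
    """Quote each word and join them with ' | ' to form a rule's right-hand side."""
    return " | ".join(f"'{word}'" for word in words)


def build_grammar(nouns, places):
    """Build a CFG description string for lists of any length."""
    rules = [
        "S -> Det N P NP",
        f"NP -> {alternatives(places)}",
        "Det -> 'the' | 'a'",
        "P -> 'of'",
        f"N -> {alternatives(nouns)}",
    ]
    return "\n".join(rules)


grammar_text = build_grammar(["port", "harbor", "harbour"], ["rotterdam", "moscow"])
print(grammar_text)
```

The resulting string has the same shape as the hand-written grammar, so it should be usable wherever grammer1 is passed to CFG.fromstring in the question.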

Manipulating sympy expression trees

Let's say we have a sympy function cos(x). Every function can be
presented by a tree, e.g. like the image here https://reference.wolfram.com/language/tutorial/ExpressionsAsTrees.html
I want to insert a parameter into every node of this expression tree, that means
cos(x) -> a*cos(b*x)
For more complicated expression, it should look like
(exp(x)+cos(x)*x)/(x) -> h*(b*exp(a*x)+f*(c*cos(d*x)*e*x))/(j*x)
where a,b,c,d,e,f,g,h,j are parameters, that I want to fit.
A helpful source could be https://docs.sympy.org/latest/tutorial/manipulation.html in the chapter "walking the tree". I tried to replace parts of the
expr.args
tuple, but it is not possible.
This is the expression:
from sympy import symbols, exp, cos
x, y = symbols('x y')
expr = (exp(x)+cos(x)*y)/(x)
This might get you started:
>>> s = numbered_symbols('C')
>>> cform = ((exp(x)+cos(x)*x)/(x)).replace(
... lambda x:not x.is_Number,
... lambda x:x*next(s))
>>> cform
C1*C8*C9*(C2*C4*C5*x*cos(C3*x) + C7*exp(C6*x))/(C0*x)
>>> from sympy.solvers.ode import constantsimp, constant_renumber
>>> constantsimp(cform, [i for i in cform.atoms(Symbol) if i.name.startswith('C')])
C0*(C2*x*cos(C3*x) + C7*exp(C6*x))/x
>>> constant_renumber(_)
C1*(C2*x*cos(C3*x) + C4*exp(C5*x))/x
>>> eq = _
>>> cons = ordered(i for i in eq.atoms(Symbol) if i.name.startswith('C'))
>>> eq.xreplace(dict(zip(cons, var('a:z'))))
a*(b*x*cos(c*x) + d*exp(e*x))/x
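Since expr.args cannot be modified in place, the usual workaround is to rebuild each node with expr.func(*new_args). The sketch below does this with an explicit recursion instead of replace(); the function name parametrize and the prefixes p and q are made up for illustration, and the redundant constants are not merged as constantsimp does above.

```python
from sympy import cos, exp, numbered_symbols, simplify, symbols


def parametrize(expr, params):
    """Return a copy of expr with a fresh parameter multiplied into every non-numeric node."""
    if expr.is_Number:
        return expr
    if expr.args:
        # args is an immutable tuple, so build a new node from the rewritten children.
        expr = expr.func(*(parametrize(arg, params) for arg in expr.args))
    return next(params) * expr


x = symbols("x")
small = parametrize(cos(x), numbered_symbols("p"))
print(small)

original = (exp(x) + cos(x) * x) / x
bigger = parametrize(original, numbered_symbols("q"))
print(bigger)
```

Setting every introduced parameter to 1 gives back the original expression, which is a quick way to check that nothing else was changed.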

python3: How to get logical complement (negation) of a binary number, eg. '010' => '101'?

Maybe I'm missing something but I can't find a straightforward way to accomplish this simple task. When I go to negate a binary number through the "~" operator it returns a negative number due to the two's complement:
>>> bin(~0b100010) # this won't return '0b011101'
'-0b100011'
What about if I just want to switch 0s into 1s and vice-versa, like in classic logical complement?
>>> bin(0b111111 ^ 0b100010)
'0b11101'
>>>
YOU's answer as a function:
def complement(n):
    size = len(format(n, 'b'))
    comp = n ^ ((1 << size) - 1)
    return '0b{0:0{1}b}'.format(comp, size)
>>> complement(0b100010)
'0b011101'
I made it preserve the bit length of the original. The int constructor doesn't care about the leading zeros:
>>> complement(0b1111111100000000)
'0b0000000011111111'
>>> int(complement(0b1111111100000000), 2)
255
Ultra nasty:
>>> '0b' + ''.join('10'[int(x)] for x in format(0b100010,'b')).lstrip('0')
'0b11101'
Here are a couple more functions I came up with that return the complement of a number.
A one-liner:
def complement(c):
    return c ^ int('1'*len(format(c, 'b')), 2)
A more mathematical way:
def complement(c):
    n=0
    for b in format(c, 'b'): n=n<<1|int(b)^1
    return n
Moreover, one-linerizing this last one with functools (such baroque):
import functools
def complement(c):
    return functools.reduce( lambda x,y: x<<1|y, [ int(b)^1 for b in format(c, 'b') ])
Finally, a uselessly nerdish variant of the first one that uses math.log to count the binary digits:
import math
def complement(c):
    return c ^ int('1' * math.floor(math.log((c|1)<<1, 2)), 2)
Another function, more of a hack, for complementing an integer. You can use the same logic for complementing a binary string. I wonder why I did not come across external Python libraries that do the same. The next version of Python should take care of this in its built-ins.
def complement(x):
    b = bin(x)[2:]
    c = []
    for num in b:
        if num == '1': c.append('0')
        elif num == '0': c.append('1')
    cat = ''.join(c)
    res = int(cat, 2)
    print(res)
    # return the value as well, so the result can be used by the caller
    return res
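The answers above all boil down to XOR with a mask of ones as wide as the number. A minimal sketch of that idea using int.bit_length() instead of string formatting follows; the optional width parameter and the helper complement_str are additions for this example, not part of any answer above.

```python
def complement(n, width=None):
    """Flip the lowest `width` bits of a non-negative integer.

    If width is omitted, the number's own bit length is used (at least 1, so 0 -> 1).
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if width is None:
        width = max(n.bit_length(), 1)
    if n.bit_length() > width:
        raise ValueError("n does not fit in the requested width")
    mask = (1 << width) - 1  # width ones, e.g. 0b111111 for width 6
    return n ^ mask


def complement_str(n, width=None):
    """Return the complement as a zero-padded binary string of the same width."""
    if width is None:
        width = max(n.bit_length(), 1)
    return format(complement(n, width), f"0{width}b")


print(complement(0b100010))         # 29
print(complement_str(0b100010))     # 011101
print(complement_str(0b100010, 8))  # 11011101
```

Passing an explicit width matters when leading zeros are significant, as in the '010' => '101' case from the title.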
