How to recursively simplify a mathematical expression with AST in python3? - python-3.x

I have this mathematical expression:
tree = ast.parse('1 + 2 + 3 + x')
which corresponds to this abstract syntax tree:
Module(body=[Expr(value=BinOp(left=BinOp(left=BinOp(left=Num(n=1), op=Add(), right=Num(n=2)), op=Add(), right=Num(n=3)), op=Add(), right=Name(id='x', ctx=Load())))])
and I would like to simplify it - that is, get this:
Module(body=[Expr(value=BinOp(left=Num(n=6), op=Add(), right=Name(id='x', ctx=Load())))])
According to the documentation, I should use the NodeTransformer class. A suggestion in the docs says the following:
Keep in mind that if the node you’re operating on has child nodes you
must either transform the child nodes yourself or call the
generic_visit() method for the node first.
I tried implementing my own transformer:
class Evaluator(ast.NodeTransformer):
    def visit_BinOp(self, node):
        print('Evaluating ', ast.dump(node))
        for child in ast.iter_child_nodes(node):
            self.visit(child)
        if type(node.left) == ast.Num and type(node.right) == ast.Num:
            print(ast.literal_eval(node))
            return ast.copy_location(ast.Subscript(value=ast.literal_eval(node)), node)
        else:
            return node
What it should do in this specific case is simplify 1+2 into 3, and then 3 +3 into 6.
It does simplify the binary operations I want to simplify, but it doesn't update the original Syntax Tree. I tried different approaches but I still don't get how I can recursively simplify all binary operations (in a depth-first manner). Could anyone point me in the right direction?
Thank you.

There are three possible return values for the visit_* methods:
None which means the node will be deleted,
node (the node itself) which means no change will be applied,
A new node, which will replace the old one.
So when you want to replace the BinOp with a Num you need to return a new Num node. The evaluation of the expression cannot be done via ast.literal_eval as this function only evaluates literals (not arbitrary expressions). Instead you can use eval for example.
So you could use the following node transformer class:
import ast
class Evaluator(ast.NodeTransformer):
    ops = {
        ast.Add: '+',
        ast.Sub: '-',
        ast.Mult: '*',
        ast.Div: '/',
        # define more here
    }
    def visit_BinOp(self, node):
        # Visit the children first so that nested BinOps are simplified depth-first.
        self.generic_visit(node)
        if isinstance(node.left, ast.Num) and isinstance(node.right, ast.Num):
            # On Python <= 3.6 you can use ast.literal_eval.
            # value = ast.literal_eval(node)
            value = eval(f'{node.left.n} {self.ops[type(node.op)]} {node.right.n}')
            return ast.Num(n=value)
        return node
tree = ast.parse('1 + 2 + 3 + x')
tree = ast.fix_missing_locations(Evaluator().visit(tree))
print(ast.dump(tree))
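On newer Python versions `ast.parse` produces `ast.Constant` nodes, and `ast.Num` has been deprecated since Python 3.8, so the same idea can also be sketched without `eval`. The following is a minimal sketch, assuming Python 3.9+ (for `ast.unparse`); the class name `ConstantFolder` and its `OPS` table are illustrative names, not part of the answer above.

```python
import ast
import operator


class ConstantFolder(ast.NodeTransformer):
    """Fold BinOps whose two operands are numeric constants (illustrative sketch)."""

    # Map AST operator classes to plain functions instead of building a string for eval.
    OPS = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
    }

    def visit_BinOp(self, node):
        # Simplify the children first (depth-first), then look at this node.
        self.generic_visit(node)
        func = self.OPS.get(type(node.op))
        operands = (node.left, node.right)
        if func is not None and all(
            isinstance(n, ast.Constant) and isinstance(n.value, (int, float))
            for n in operands
        ):
            try:
                value = func(node.left.value, node.right.value)
            except ZeroDivisionError:
                return node  # leave e.g. 1 / 0 untouched
            # Returning a new node replaces the BinOp in the tree.
            return ast.copy_location(ast.Constant(value=value), node)
        return node


tree = ast.parse('1 + 2 + 3 + x')
tree = ast.fix_missing_locations(ConstantFolder().visit(tree))
folded = ast.unparse(tree)
print(folded)
```

Because `+` is left-associative, `x + 1 + 2` parses as `(x + 1) + 2` and is left unchanged by this sketch; folding it would need extra reassociation logic.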

Related

Is there a way changing actual value of an int without creating a new instance? [duplicate]

How can I pass an integer by reference in Python?
I want to modify the value of a variable that I am passing to the function. I have read that everything in Python is pass by value, but there has to be an easy trick. For example, in Java you could pass the reference types of Integer, Long, etc.
How can I pass an integer into a function by reference?
What are the best practices?
It doesn't quite work that way in Python. Python passes references to objects. Inside your function you have an object -- You're free to mutate that object (if possible). However, integers are immutable. One workaround is to pass the integer in a container which can be mutated:
def change(x):
    x[0] = 3  # mutates the list object that the caller also holds
x = [1]
change(x)
print(x)  # [3]
This is ugly/clumsy at best, but you're not going to do any better in Python. The reason is because in Python, assignment (=) takes whatever object is the result of the right hand side and binds it to whatever is on the left hand side *(or passes it to the appropriate function).
Understanding this, we can see why there is no way to change the value of an immutable object inside a function -- you can't change any of its attributes because it's immutable, and you can't just assign the "variable" a new value because then you're actually creating a new object (which is distinct from the old one) and giving it the name that the old object had in the local namespace.
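The difference between rebinding a name and mutating an object can be checked directly. This is a small sketch; the function names `rebind` and `mutate` are made up for illustration.

```python
def rebind(n):
    # Assignment binds the local name n to a new int object;
    # the caller's variable still refers to the old one.
    n = n + 1
    return n


def mutate(box):
    # Item assignment calls box.__setitem__, which changes the list in place,
    # so the caller sees the change through its own reference.
    box[0] = box[0] + 1


x = 1
rebind(x)
box = [1]
mutate(box)
print(x, box)
```

After both calls `x` is still `1`, while `box` holds `2`.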
Usually the workaround is to simply return the object that you want:
def multiply_by_2(x):
    return 2*x
x = 1
x = multiply_by_2(x)
*In the first example case above, 3 actually gets passed to x.__setitem__.
Most cases where you would need to pass by reference are where you need to return more than one value back to the caller. A "best practice" is to use multiple return values, which is much easier to do in Python than in languages like Java.
Here's a simple example:
import math
def RectToPolar(x, y):
    r = (x ** 2 + y ** 2) ** 0.5
    theta = math.atan2(y, x)
    return r, theta # return 2 things at once
r, theta = RectToPolar(3, 4) # assign 2 things at once
Not exactly passing a value directly, but using it as if it was passed.
def outer():
    x = 7
    def my_method():
        nonlocal x  # rebinds x in the enclosing function's scope
        x += 1
    my_method()
    print(x) # 8
outer()
Caveats:
nonlocal was introduced in python 3
If the enclosing scope is the global one, use global instead of nonlocal.
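Maybe it's not the Pythonic way, but you can do this:
import ctypes
def incr(a):
    a.value += 1 # update the C int in place through its .value attribute
x = ctypes.c_int(1) # create c-var
incr(x) # the ctypes object is mutable, so it can be passed directly
print(x.value) # 2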
Really, the best practice is to step back and ask whether you really need to do this. Why do you want to modify the value of a variable that you're passing in to the function?
If you need to do it for a quick hack, the quickest way is to pass a list holding the integer, and stick a [0] around every use of it, as mgilson's answer demonstrates.
If you need to do it for something more significant, write a class that has an int as an attribute, so you can just set it. Of course this forces you to come up with a good name for the class, and for the attribute—if you can't think of anything, go back and read the sentence again a few times, and then use the list.
More generally, if you're trying to port some Java idiom directly to Python, you're doing it wrong. Even when there is something directly corresponding (as with static/@staticmethod), you still don't want to use it in most Python programs just because you'd use it in Java.
Maybe slightly more self-documenting than the list-of-length-1 trick is the old empty type trick:
def inc_i(v):
    v.i += 1
x = type('', (), {})()
x.i = 7
inc_i(x)
print(x.i)
A numpy single-element array is mutable and yet for most purposes, it can be evaluated as if it was a numerical python variable. Therefore, it's a more convenient by-reference number container than a single-element list.
import numpy as np
def triple_var_by_ref(x):
    x[0]=x[0]*3
a=np.array([2])
triple_var_by_ref(a)
print(a+1)
output:
[7]
The correct answer is to use a class and put the value inside the class; this lets you pass by reference exactly as you desire.
class Thing:
    def __init__(self,a):
        self.a = a
def dosomething(ref):
    ref.a += 1
t = Thing(3)
dosomething(t)
print("T is now",t.a)
In Python, every value is a reference (a pointer to an object), just like non-primitives in Java. Also, like Java, Python only has pass by value. So, semantically, they are pretty much the same.
Since you mention Java in your question, I would like to see how you achieve what you want in Java. If you can show it in Java, I can show you how to do it exactly equivalently in Python.
class PassByReference:
    def Change(self, var):
        self.a = var
        print(self.a)
s=PassByReference()
s.Change(5)
class Obj:
    def __init__(self,a):
        self.value = a
    def sum(self, a):
        self.value += a
a = Obj(1)
b = a
a.sum(1)
print(a.value, b.value) # 2 2
In Python, everything is passed by value, but if you want to modify some state, you can change the value of an integer inside a list or object that's passed to a method.
Integers are immutable in Python: once they are created, we cannot change their value. Using the assignment operator on a variable only makes it point to some other object, not the previous one.
In python a function can return multiple values we can make use of it:
def swap(a,b):
    return b,a
a,b=22,55
a,b=swap(a,b)
print(a,b)
To change the reference a variable is pointing to, we can wrap immutable data types (int, long, float, complex, str, bytes, tuple, frozenset) inside of mutable data types (bytearray, list, set, dict).
#var is an instance of dictionary type
def change(var,key,new_value):
    var[key]=new_value
var =dict()
var['a']=33
change(var,'a',2625)
print(var['a'])

Reversing a LinkedList with multiple assignment

I have this code right down here:
# Definition for singly-linked list.
# class ListNode:
#     def __init__(self, x):
#         self.val = x
#         self.next = None
class Solution:
    def reverseList(self, head: ListNode) -> ListNode:
        if head == None:
            return
        pre, node = None, head
        while node:
            pre, node.next, node = node, pre, node.next
        return pre
I am trying to visualize how this works. If it starts on a list, pre becomes the head, since node was assigned to head. Then node.next is assigned to pre, so it points to itself? Finally, node becomes node.next, which is itself? Am I missing something here?
Multiple assignment isn't the same as several assignments one after the other. The difference is that the values on the right hand side of the statement all get evaluated before anything gets rebound. The values on the right hand side are in fact packed up in a tuple, then unpacked into the names on the left hand side.
That's important in this situation as it means that node.next on the right hand side gets its value saved, so that when you rebind it to something else (pre), the old value is still available to become the new node value after the assignment.
You may want to play around with some simpler assignments, like the classic swap operation:
x = 1
y = 2
x, y = y, x # swap x and y's values
print(x, y) # prints "2 1"
_tup = y, x # this is how it works, first pack the RHS values into a tuple
x, y = _tup # then unpack the values into the names on the LHS
print(x, y) # prints "1 2" as we've swapped back
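Applied to the reversal loop from the question, the same pack-then-unpack reading can be sketched as follows. This is an illustration only: `ListNode` is a minimal stand-in for the commented-out definition in the question, and `build`, `to_list`, `reverse` and `reverse_explicit` are hypothetical helper names.

```python
class ListNode:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next


def build(values):
    """Build a singly linked list from a Python list and return its head."""
    head = None
    for v in reversed(values):
        head = ListNode(v, head)
    return head


def to_list(head):
    """Collect the node values into a Python list for easy inspection."""
    out = []
    while head:
        out.append(head.val)
        head = head.next
    return out


def reverse(head):
    pre, node = None, head
    while node:
        pre, node.next, node = node, pre, node.next
    return pre


def reverse_explicit(head):
    """Same loop, with the tuple packing and unpacking written out by hand."""
    pre, node = None, head
    while node:
        saved = (node, pre, node.next)  # right-hand side is evaluated first
        pre = saved[0]                  # then targets are bound left to right
        node.next = saved[1]            # node is still the old node here
        node = saved[2]                 # the old node.next was saved above
    return pre


print(to_list(reverse(build([1, 2, 3]))))
print(to_list(reverse_explicit(build([1, 2, 3]))))
```

Both functions return the reversed list, because the old value of `node.next` is saved in the tuple before `node.next` is rebound.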
The main idea is that the original head node becomes the last node of the new linked list, the original last node becomes the new head, and the direction of every link between nodes is reversed.
Suppose the original linked list consists of 2 nodes.
First, pre = None and node = head. In each iteration, node.next = pre points the current node back at the previous one, so the original head becomes the last node of the new list; pre = node and node = (the old) node.next then move both names one step forward.
The loop repeats this process until node is None, at which point pre is the original last node, which is the new head.
Here is a related question that links to docs on evaluation order: Tuple unpacking order changes values assigned
From https://docs.python.org/3/reference/expressions.html#evaluation-order, the example expr3, expr4 = expr1, expr2 shows evaluation order through the suffix number. It shows that the right side of assignment is evaluated first, from left to right, then the left side of assignment is evaluated, also from left to right.
For mutable objects like in this question, it gets more confusing without knowing the evaluation order.
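The evaluation order described above can be observed by logging each step. This is a small sketch; `rhs`, `Target` and `log` are names invented for the demonstration.

```python
log = []


def rhs(label, value):
    # Record when each right-hand-side expression is evaluated.
    log.append("eval " + label)
    return value


class Target:
    def __setattr__(self, name, value):
        # Record when each left-hand-side target is bound.
        log.append("bind " + name)
        object.__setattr__(self, name, value)


t = Target()
t.first, t.second, t.third = rhs("1st", 1), rhs("2nd", 2), rhs("3rd", 3)
for entry in log:
    print(entry)
```

The log lists the three `eval` entries first, in left-to-right order, followed by the three `bind` entries, also left to right.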
To prove that it is indeed left-to-right on the left-hand-side, you can imagine what happens when pre, node.next, node = node, pre, node.next is assigned from right-to-left, meaning:
node = node.next
node.next = pre
pre = node
This wouldn't be reversing the Linked List at all.
Other ways to write this reversal:
Sometimes you can see others express this pattern as
pre, pre.next, node = node, pre, node.next
(Notice the 2nd element on LHS changed from node.next to pre.next.)
This still works because after the first evaluation of pre = node, pre and node are referring to the same node. However, it introduces an extra dependency on the first evaluation of pre = node, which adds unnecessary cognitive load on the reader.
If we remained at pre, node.next, node = node, pre, node.next, then even swapping the first two variables (do it on both left and right of assignment) works:
node.next, pre, node = pre, node, node.next.
This is also my most preferred form since the right-hand-side naturally follows a previous,current,next order of the linked list.
Generally, we should place the dependent objects on the left of independent objects when ordering a tuple of variables on the left-hand-side. Any ordering with node = node.next before node.next = pre should break the implementation. (One example already shown in the thought experiment above on right-to-left evaluation order.)

Recursive strategies with additional parameters in Hypothesis

Using recursive, I can generate simple ASTs, e.g.
from hypothesis import *
from hypothesis.strategies import *
def trees():
    base = integers(min_value=1, max_value=10).map(lambda n: 'x' + str(n))
    @composite
    def extend(draw, children):
        op = draw(sampled_from(['+', '-', '*', '/']))
        return (op, draw(children), draw(children))
    # recursive() takes the base strategy and the function that extends it
    return recursive(base, extend)
Now I want to change it so I can generate boolean operations in addition to the arithmetical ones. My initial idea is to add a parameter to trees:
def trees(tpe):
    base = integers(min_value=1, max_value=10).map(lambda n: 'x' + str(n) + ': ' + tpe)
    @composite
    def extend(draw, children):
        if tpe == 'bool':
            op = draw(sampled_from(['&&', '||']))
            return (op, draw(children), draw(children))
        elif tpe == 'num':
            op = draw(sampled_from(['+', '-', '*', '/']))
            return (op, draw(children), draw(children))
    return recursive(base, extend)
Ok so far. But how do I mix them? That is, I also want comparison operators and the ternary operator, which would require "calling children with a different parameter", so to say.
The trees need to be well-typed: if the operation is '||' or '&&', both arguments need to be boolean, arguments to '+' or '<' need to be numbers, etc. If I only had two types, I could just use filter (given a type_of function):
if op in ('&&', '||'):
    bool_trees = children.filter(lambda x: type_of(x) == 'bool')
    return (op, draw(bool_trees), draw(bool_trees))
but in the real case it wouldn't be acceptable.
Does recursive support this? Or is there another way? Obviously, I can directly define trees recursively, but that runs into the standard problems.
You can simply describe trees where the comparison is drawn from either set of operations - in this case trivially by sampling from ['&&', '||', '+', '-', '*', '/'].
def trees():
    return recursive(
        integers(min_value=1, max_value=10).map('x{}'.format),
        lambda node: tuples(sampled_from('&& || + - * /'.split()), node, node)
    )
But of course that won't be well-typed (except perhaps by rare coincidence). I think the best option for well-typed ASTs is:
For each type, define a strategy for trees which evaluate to that type. The base case is simply (a strategy for) a value of that type.
The extension is to pre-calculate the possible combinations of types and operations that would generate a value of this type, using mutual recursion via st.deferred. That would look something like...
bool_strat = deferred(
    lambda: one_of(
        booleans(),
        tuples(sampled_from(["and", "or"]), bool_strat, bool_strat),
        tuples(sampled_from(["==", "!=", "<", ...]), integer_strat, integer_strat),
    )
)
integer_strat = deferred(
    lambda: one_of(
        integers(),
        tuples(sampled_from("+ - * /".split()), integer_strat, integer_strat),
    )
)
any_type_ast = bool_strat | integer_strat
And it will work as if by magic :D
(on the other hand, this is a fair bit more complex - if your workaround is working for you, don't feel obliged to do this instead!)
If you're seeing problematic blowups in size - which should be very rare, as the engine has had a lot of work since that article was written - there's honestly not much to do about it. Threading a depth limit through the whole thing and decrementing it each step does work as a last resort, but it's not nice to work with.
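To make the last-resort idea concrete, here is a sketch of threading a depth limit through mutually recursive, per-type generators. It deliberately uses only the standard `random` module rather than Hypothesis, so it illustrates the shape of the approach, not a Hypothesis strategy; all names (`gen_bool`, `gen_num`, `type_of`, `height`, the operator lists) are assumptions made for this example.

```python
import random

BOOL_OPS = ["and", "or"]
CMP_OPS = ["==", "!=", "<"]
NUM_OPS = ["+", "-", "*", "/"]


def gen_num(rng, depth):
    """Numeric tree with at most `depth` levels of operators."""
    if depth <= 0 or rng.random() < 0.3:
        return rng.randint(1, 10)  # base case: a numeric leaf
    # Each recursive call gets a smaller budget, which bounds the tree size.
    return (rng.choice(NUM_OPS), gen_num(rng, depth - 1), gen_num(rng, depth - 1))


def gen_bool(rng, depth):
    """Boolean tree: either a boolean operator or a comparison of numbers."""
    if depth <= 0 or rng.random() < 0.3:
        return rng.choice([True, False])  # base case: a boolean leaf
    if rng.random() < 0.5:
        return (rng.choice(BOOL_OPS), gen_bool(rng, depth - 1), gen_bool(rng, depth - 1))
    return (rng.choice(CMP_OPS), gen_num(rng, depth - 1), gen_num(rng, depth - 1))


def type_of(tree):
    """Return 'bool' or 'num', raising TypeError for an ill-typed tree."""
    if isinstance(tree, bool):  # test bool first: bool is a subclass of int
        return "bool"
    if isinstance(tree, int):
        return "num"
    op, left, right = tree
    arg_types = {type_of(left), type_of(right)}
    if op in BOOL_OPS and arg_types == {"bool"}:
        return "bool"
    if op in CMP_OPS and arg_types == {"num"}:
        return "bool"
    if op in NUM_OPS and arg_types == {"num"}:
        return "num"
    raise TypeError("ill-typed tree: %r" % (tree,))


def height(tree):
    """Number of operator levels in the tree (0 for a leaf)."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(height(tree[1]), height(tree[2]))


rng = random.Random(0)
samples = [gen_bool(rng, depth=4) for _ in range(100)]
print(samples[0])
```

Every generated tree is well-typed by construction and never deeper than the budget, at the cost of passing `depth` through every generator by hand.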
The solution I used for now is to adapt the generated trees so e.g. if a num tree is generated when the operation needs a bool, I also draw a comparison operator op and a constant const and return (op, tree, const):
def make_bool(tree, draw):
    if type_of(tree) == 'bool':
        return tree
    elif type_of(tree) == 'num':
        op = draw(sampled_from(comparison_ops))
        const = draw(integers())
        side = draw(booleans())
        return (op, tree, const) if side else (op, const, tree)
# in def extend:
if tpe == 'bool':
    op = draw(sampled_from(bool_ops + comparison_ops))
    if op in bool_ops:
        return (op, make_bool(draw(children), draw), make_bool(draw(children), draw))
    else:
        return (op, make_num(draw(children), draw), make_num(draw(children), draw))
Unfortunately, it's specific to ASTs and will mean specific kinds of trees are generated more often. So I'd still be happy to see better alternatives.

Basic first order logic inference fails for symmetric binary predicate

Super basic question. I am trying to express a symmetric relationship between two binary predicates (parent and child). But, with the following statement, my resolution prover allows me to prove anything. The converted CNF form makes sense to me as does the proof by resolution, but this should be an obvious case for false. What am I missing?
forall x,y (is-parent-of(x,y) <-> is-child-of(y,x))
I am using the nltk python library and the ResolutionProver prover. Here is the nltk code:
from nltk.sem import Expression as exp
from nltk.inference import ResolutionProver as prover
s = exp.fromstring('all x.(all y.(parentof(y, x) <-> childof(x, y)))')
q = exp.fromstring('foo(Bar)')
print(prover().prove(q, [s], verbose=True))
output:
[1] {-foo(Bar)} A
[2] {-parentof(z9,z10), childof(z10,z9)} A
[3] {parentof(z11,z12), -childof(z12,z11)} A
[4] {} (2, 3)
True
Here is a quick fix for the ResolutionProver.
The issue that causes the prover to be unsound is that it does not implement the resolution rule correctly when there is more than one complementary literal. E.g. given the clauses {A B C} and {-A -B D} binary resolution would produce the clauses {A -A C D} and {B -B C D}. Both would be discarded as tautologies. The current NLTK implementation instead would produce {C D}.
This was probably introduced because clauses are represented in NLTK as lists, therefore identical literals may occur more than once within a clause. This rule does correctly produce an empty clause when applied to the clauses {A A} and {-A -A}, but in general this rule is not correct.
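The difference between the two rules can be illustrated on propositional clauses. The following is a sketch independent of NLTK: clauses are modelled as frozensets of literal strings with a leading `-` for negation, and `negate`, `resolve` and `is_tautology` are hypothetical helpers written for this example.

```python
def negate(lit):
    """Return the complementary literal, e.g. 'A' <-> '-A'."""
    return lit[1:] if lit.startswith("-") else "-" + lit


def resolve(c1, c2):
    """All binary resolvents of two clauses: resolve on one literal pair at a time."""
    resolvents = []
    for lit in sorted(c1):
        if negate(lit) in c2:
            # Remove only this one complementary pair, keep everything else.
            resolvents.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return resolvents


def is_tautology(clause):
    """A clause containing both a literal and its negation is always true."""
    return any(negate(lit) in clause for lit in clause)


c1 = frozenset({"A", "B", "C"})
c2 = frozenset({"-A", "-B", "D"})
resolvents = resolve(c1, c2)
for r in resolvents:
    print(sorted(r), "tautology" if is_tautology(r) else "")

# The symmetric parent/child clauses from the question, with variables dropped.
p1 = frozenset({"-parentof", "childof"})
p2 = frozenset({"parentof", "-childof"})
parent_child = resolve(p1, p2)
```

With sets, each step removes a single complementary pair, so both resolvents of `c1` and `c2` are tautologies and `{C, D}` is never derived; likewise the parent/child clauses yield only tautologies and never the empty clause.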
It seems that if we keep clauses free from repetitions of identical literals we can regain soundness with a few changes.
First define a function that removes identical literals.
Here is a naive implementation of such a function
import nltk.inference.resolution as res
def _simplify(clause):
    """
    Remove duplicate literals from a clause
    """
    duplicates=[]
    for i,c in enumerate(clause):
        if i in duplicates:
            continue
        for j,d in enumerate(clause[i+1:],start=i+1):
            if j in duplicates:
                continue
            if c == d:
                duplicates.append(j)
    result=[]
    for i,c in enumerate(clause):
        if not i in duplicates:
            result.append(clause[i])
    return res.Clause(result)
Now we can plug this function into some of the functions of the nltk.inference.resolution module.
def _iterate_first_fix(first, second, bindings, used, skipped, finalize_method, debug):
    """
    This method facilitates movement through the terms of 'self'
    """
    debug.line('unify(%s,%s) %s'%(first, second, bindings))
    if not len(first) or not len(second): #if no more recursions can be performed
        return finalize_method(first, second, bindings, used, skipped, debug)
    else:
        #explore this 'self' atom
        result = res._iterate_second(first, second, bindings, used, skipped, finalize_method, debug+1)
        #skip this possible 'self' atom
        newskipped = (skipped[0]+[first[0]], skipped[1])
        result += res._iterate_first(first[1:], second, bindings, used, newskipped, finalize_method, debug+1)
        try:
            newbindings, newused, unused = res._unify_terms(first[0], second[0], bindings, used)
            #Unification found, so progress with this line of unification
            #put skipped and unused terms back into play for later unification.
            newfirst = first[1:] + skipped[0] + unused[0]
            newsecond = second[1:] + skipped[1] + unused[1]
            # We return immediately when `_unify_term()` is successful
            result += _simplify(finalize_method(newfirst,newsecond,newbindings,newused,([],[]),debug))
        except res.BindingException:
            pass
        return result
res._iterate_first=_iterate_first_fix
Similarly update res._iterate_second
def _iterate_second_fix(first, second, bindings, used, skipped, finalize_method, debug):
    """
    This method facilitates movement through the terms of 'other'
    """
    debug.line('unify(%s,%s) %s'%(first, second, bindings))
    if not len(first) or not len(second): #if no more recursions can be performed
        return finalize_method(first, second, bindings, used, skipped, debug)
    else:
        #skip this possible pairing and move to the next
        newskipped = (skipped[0], skipped[1]+[second[0]])
        result = res._iterate_second(first, second[1:], bindings, used, newskipped, finalize_method, debug+1)
        try:
            newbindings, newused, unused = res._unify_terms(first[0], second[0], bindings, used)
            #Unification found, so progress with this line of unification
            #put skipped and unused terms back into play for later unification.
            newfirst = first[1:] + skipped[0] + unused[0]
            newsecond = second[1:] + skipped[1] + unused[1]
            # We return immediately when `_unify_term()` is successful
            result += _simplify(finalize_method(newfirst,newsecond,newbindings,newused,([],[]),debug))
        except res.BindingException:
            #the atoms could not be unified,
            pass
        return result
res._iterate_second=_iterate_second_fix
Finally, plug our function into the clausify() to ensure the inputs are repetition-free.
def clausify_simplify(expression):
    """
    Skolemize, clausify, and standardize the variables apart.
    """
    clause_list = []
    for clause in res._clausify(res.skolemize(expression)):
        for free in clause.free():
            if res.is_indvar(free.name):
                newvar = res.VariableExpression(res.unique_variable())
                clause = clause.replace(free, newvar)
        clause_list.append(_simplify(clause))
    return clause_list
res.clausify=clausify_simplify
After applying these changes the prover should run the standard tests and also deal correctly with the parentof/childof relationships.
print(res.ResolutionProver().prove(q, [s], verbose=True))
output:
[1] {-foo(Bar)} A
[2] {-parentof(z144,z143), childof(z143,z144)} A
[3] {parentof(z146,z145), -childof(z145,z146)} A
[4] {childof(z145,z146), -childof(z145,z146)} (2, 3) Tautology
[5] {-parentof(z146,z145), parentof(z146,z145)} (2, 3) Tautology
[6] {childof(z145,z146), -childof(z145,z146)} (2, 3) Tautology
False
Update: Achieving correctness is not the end of the story. A more efficient solution would be to replace the container used to store literals in the Clause class with the one based on built-in Python hash-based sets, however that seems to require a more thorough rework of the prover implementation and introducing some performance testing infrastructure as well.

Use literal operators (eg "and", "or") in Groovy expressions?

My current work project allows user-provided expressions to be evaluated in specific contexts, as a way for them to extend and influence the workflow. These expressions are the usual logical ones. To make it a bit more palatable for non-programmers, I'd like to give them the option of using literal operators (e.g. and, or, not instead of &, |, !).
A simple search & replace is not sufficient, as the data might contain those words within quotes, and building a parser, while doable, may not be the most elegant and efficient solution.
To make the question clear: is there a way in Groovy to allow the users to write
x > 10 and y = 20 or not z
but have Groovy evaluate it as if it were:
x > 10 && y == 20 || !z
Thank you.
Recent versions of Groovy support Command chains, so it's indeed possible to write this:
compute x > 10 and y == 20 or not(z)
The word "compute" here is arbitrary, but it cannot be omitted, because it's the first "verb" in the command chain. Everything that follows alternates between verb and noun:
compute  x > 10  and  y == 20  or   not(z)
───┬───  ──┬───  ─┬─  ───┬───  ─┬─  ──┬───
 verb     noun   verb   noun   verb  noun
A command chain is compiled like this:
verb(noun).verb(noun).verb(noun)...
so the example above is compiled to:
compute(x > 10).and(y == 20).or(not(z))
There are many ways to implement this. Here is just a quick & dirty proof of concept, that doesn't implement operator precedence, among other things:
class Compute {
    private value
    Compute(boolean v) { value = v }
    def or (boolean w) { value = value || w; this }
    def and(boolean w) { value = value && w; this }
    String toString() { value }
}
def compute(v) { new Compute(v) }
def not(boolean v) { !v }
You can use command chains by themselves (as top-level statements) or to the right-hand side of an assignment operator (local variable or property assignment), but not inside other expressions.
If you can swap operators like > and = for the facelets-like gt and eq, respectively, I think your case may be doable, though it will require a lot of effort:
x gt 10 and y eq 20 or not z
resolves to:
x(gt).10(and).y(eq).20(or).not(z)
And this will be hell to parse.
The way @Brian Henry suggested is the easiest way, though not user-friendly, since it needs the parens and dots.
Well, considering we can swap the operators, you could try to intercept the Integer.call to start expressions. Having the missing properties in a script being resolved to operations can solve your new keywords problem. Then you can build expressions and save them to a list, executing them at the end of the script. It's not finished, but I came up with this:
// the operators that can be used in the script
enum Operation { eq, and, gt, not }
// every unresolved variable here will try to be resolved as an Operation
def propertyMissing(String property) { Operation.find { it.name() == property} }
// a class to contain what should be executed in the end of the script
@groovy.transform.ToString
class Instruction { def left; Operation operation; def right }
// a class to handle the next allowed tokens
class Expression {
    Closure handler; Instruction instruction
    def methodMissing(String method, args) {
        println "method=$method, args=$args"
        handler method, args
    }
}
// a list to contain the instructions that will need to be parsed
def instructions = []
// the start of the whole mess: an integer will get this called
Integer.metaClass {
    call = { Operation op ->
        instruction = new Instruction(operation: op, left: delegate)
        instructions << instruction
        new Expression(
            instruction: instruction,
            handler:{ String method, args ->
                instruction.right = method.toInteger()
                println instructions
                this
            })
    }
}
x = 12
y = 19
z = false
x gt 10 and y eq 20 or not z
Which will give an exception, due to the not() part not being implemented, but it can build two Instruction objects before failing:
[Instruction(12, gt, 10), Instruction(19, eq, 20)]
Not sure if it is worth it.
The GDK tacks on and() and or() methods to Boolean. If you supplied a method like
Boolean not(Boolean b) {return !b}
you could write something like
(x > 10).and(y == 20).or(not(4 == 1))
I'm not sure that's particularly easy to write, though.
