I'm trying to replace every break statement with exec('break') in some code. So far I've got this:
import ast

source = '''some_list = [2, 3, 4, 5]
for i in some_list:
    if i == 4:
        p = 0
        break
exec('d = 9')'''

tree = ast.parse(source)

class NodeTransformer(ast.NodeTransformer):
    def visit_Break(self, node: ast.Break):
        print(ast.dump(node))
        exec_break = ast.Call(func=ast.Name(id='exec', ctx=ast.Load()),
                              args=[ast.Constant(value='break')],
                              keywords=[])
        return ast.copy_location(exec_break, node)

NodeTransformer().visit(tree)
print(ast.unparse(tree))
However, at the end it outputs p = 0 and exec('break') on the same line:
some_list = [2, 3, 4, 5]
for i in some_list:
    if i == 4:
        p = 0exec('break')
exec('d = 9')
I created the ast.Call object for the exec function with 'break' as its first argument, but it doesn't seem to transform properly. What did I miss?
I've found the bug. The ast.Call node has to be wrapped in an ast.Expr node:
def visit_Break(self, node: ast.Break):
    exec_break = ast.Call(func=ast.Name(id='exec', ctx=ast.Load()),
                          args=[ast.Constant(value='break')],
                          keywords=[])
    new_node = ast.Expr(value=exec_break)
    ast.copy_location(new_node, node)
    ast.fix_missing_locations(new_node)
    return new_node
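With the Call wrapped in ast.Expr, re-running the transformer and ast.unparse should produce something like:

some_list = [2, 3, 4, 5]
for i in some_list:
    if i == 4:
        p = 0
        exec('break')
exec('d = 9')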
Reference: https://greentreesnakes.readthedocs.io/en/latest/examples.html#simple-test-framework
I have a class named CAN_MSG for which I want to define the __eq__ method so that I can check two instances for equality.
Input_CAN_Sorter.py:
class CAN_MSG:
    def __init__(self, first_sensor_id, timestamp, sensor_data):
        self.first_sensor_id = first_sensor_id
        self.sensor_data = sensor_data
        self.timestamp = timestamp

    def __eq__(self, other):
        result = self.first_sensor_id == other.first_sensor_id
        n = len(self.sensor_data)  # no Error
        i = len(other.senor_data)  # AttributeError: 'CAN_MSG' object has no attribute 'senor_data'.
        result = result and len(self.sensor_data) == len(other.senor_data)
        for i in range(len(self.sensor_data)):
            result = result and self.sensor_data[i] == other.senor_data[i]
        result = result and self.timestamp == other.timestamp
        return result
The class has a list of ints called sensor_data. When I compare using __eq__(self, other):
There is no problem with len(self.sensor_data), but with len(other.sensor_data) I get the following error:
AttributeError: 'CAN_MSG' object has no attribute 'senor_data'.
I don't understand why I can access self.sensor_data but not other.sensor_data.
test.py:
from Input_CAN_Sorter import CAN_MSG
list_temp = [1, 2, 3, 4, 5, 6, 7]
list_temp2 = [1, 2, 3, 4, 5, 6, 7]
CAN_MSG_1 = CAN_MSG(1, "TIME", list_temp)
CAN_MSG_2 = CAN_MSG(1, "TIME", list_temp2)
if CAN_MSG_1 == CAN_MSG_2:
    print("=")
In C++ I would have checked the class type beforehand and maybe done a cast afterwards so that the compiler knows for sure it is the same class, but in Python this is not possible/necessary, if I understand correctly.
Probably this is a completely stupid mistake but I'm not 100% familiar with Python and can't come up with a reasonable explanation.
You misspelled the attribute name.
AttributeError: 'CAN_MSG' object has no attribute 'senor_data'.
i = len(other.senor_data)
You meant to write:
i = len(other.sensor_data)
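Once the typo is fixed, a tidier way to write the comparison might look like this (just a sketch; the isinstance check plays the role of the C++-style type check mentioned in the question, and Python compares the lists element-wise for you):

class CAN_MSG:
    def __init__(self, first_sensor_id, timestamp, sensor_data):
        self.first_sensor_id = first_sensor_id
        self.sensor_data = sensor_data
        self.timestamp = timestamp

    def __eq__(self, other):
        # Reject comparisons with unrelated types instead of raising AttributeError.
        if not isinstance(other, CAN_MSG):
            return NotImplemented
        # Lists of ints compare element-wise, so no manual loop is needed.
        return (self.first_sensor_id == other.first_sensor_id
                and self.timestamp == other.timestamp
                and self.sensor_data == other.sensor_data)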
I've been practicing some linked-list LeetCode questions. While working on removing elements from a linked list, I'm having a hard time understanding what a "reference" is. The code is below:
# Definition for singly-linked list.
# class ListNode(object):
#     def __init__(self, val=0, next=None):
#         self.val = val
#         self.next = next

class Solution(object):
    def removeElements(self, head, val):
        """
        :type head: ListNode
        :type val: int
        :rtype: ListNode
        """
        current = head
        if head == None:
            return head
        if current != None:
            if current.val == val:
                head = current.next
        # print("current ", current)
        while (current.next is not None):
            if current.next.val == val:
                current.next = current.next.next
            else:
                current = current.next
        print("current", current)
        print("head", head)
        return head
First, the code above doesn't pass all the tests, but that's not my question here. Given an input linked list of [1,2,6,3,4,5,6] and 6 as the value to remove, I noticed the print(head) statement prints the correct solution:
('head', ListNode{val: 1, next: ListNode{val: 2, next: ListNode{val: 3, next: ListNode{val: 4, next: ListNode{val: 5, next: None}}}}})
But the print(current) statement results in just a single node:
('current', ListNode{val: 5, next: None})
I understand why current prints out a single node, but how does head print out the correct linked list with the 6s removed? As far as I am aware, for the given inputs, current is assigned to head, so if I modify current, only current should be modified and not head, right? Obviously I'm wrong, but where is the error in my logic?
One thing I did notice by playing in the python playground:
a = [1,2,3]
b = a
b.append(4)
# a = [1,2,3,4]
# b = [1,2,3,4]
c = 3
d = c
d = 4
# c = 3
# d = 4
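For what it's worth, checking object identity with the is operator in the same playground gives (an extra illustrative snippet, not from my original code):

a = [1, 2, 3]
b = a
print(b is a)   # True: both names refer to the same list object
b.append(4)     # mutates that one shared object
print(a)        # [1, 2, 3, 4]

c = 3
d = c
d = 4           # rebinds d to a different object; c is untouched
print(c is d)   # False
print(c, d)     # 3 4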
I'm guessing Python treats primitive types differently from other objects? Thank you.
In Python 3, I originally needed to extract the odd and even positions from a list into new lists, then clear the original list. I thought lists were affected by a function call through "pass by reference". Testing some scenarios, it only works sometimes. Could someone please explain how exactly Python 3 works here?
Case 1: empty list is populated with string as expected.
def func1(_in):
    _in.append('abc')

mylist = list()
print(f"Before:\nmylist = {mylist}")
func1(mylist)
print(f"After:\nmylist = {mylist}")
Output case 1:
Before:
mylist = []
After:
mylist = ['abc']
Case 2: middle list element is replaced with string as expected.
def func2(_in):
    _in[1] = 'abc'

mylist = list(range(3))
print(f"Before:\nmylist = {mylist}")
func2(mylist)
print(f"After:\nmylist = {mylist}")
Output case 2:
Before:
mylist = [0, 1, 2]
After:
mylist = [0, 'abc', 2]
Case 3: why is the list not empty after function call?
def func3(_in):
    _in = list()

mylist = list(range(3))
print(f"Before:\nmylist = {mylist}")
func3(mylist)
print(f"After:\nmylist = {mylist}")
Output case 3:
Before:
mylist = [0, 1, 2]
After:
mylist = [0, 1, 2]
Case 4: working exactly as expected, but note I have returned all three lists from function.
def func4_with_ret(_src, _dest1, _dest2):
    _dest1 = [val for val in _src[0:len(_src):2]]
    _dest2 = [val for val in _src[1:len(_src):2]]
    _src = list()
    return _src, _dest1, _dest2

source = list(range(6))
evens, odds = list(), list()
print(f"Before function call:\nsource = {source}\nevens = {evens}\nodds = {odds}")
source, evens, odds = func4_with_ret(source, evens, odds)
print(f"\nAfter function call:\nsource = {source}\nevens = {evens}\nodds = {odds}")
Output case 4:
Before function call:
source = [0, 1, 2, 3, 4, 5]
evens = []
odds = []
After function call:
source = []
evens = [0, 2, 4]
odds = [1, 3, 5]
Case 5: why no impact on the variables outside the function if I do not explicitly return from function call?
def func5_no_ret(_src, _dest1, _dest2):
    _dest1 = [val for val in _src[0:len(_src):2]]
    _dest2 = [val for val in _src[1:len(_src):2]]
    _src = list()

source = list(range(6))
evens, odds = list(), list()
print(f"Before function call:\nsource = {source}\nevens = {evens}\nodds = {odds}")
func5_no_ret(source, evens, odds)
print(f"\nAfter function call:\nsource = {source}\nevens = {evens}\nodds = {odds}")
Output case 5:
Before function call:
source = [0, 1, 2, 3, 4, 5]
evens = []
odds = []
After function call:
source = [0, 1, 2, 3, 4, 5]
evens = []
odds = []
Thank you.
Your ultimate problem is confusing (in-place) mutation with rebinding (also referred to somewhat less precisely as "reassignment").
In all the cases where the change isn't visible outside the function, you rebound the name inside the function. When you do:
name = val
it does not matter what used to be in name; it's rebound to val, and the reference to the old object is thrown away. When it's the last reference, this leads to the object being cleaned up; in your case, the argument used to alias an object also bound to a name in the caller, but after rebinding, that aliasing association is lost.
Aside for C/C++ folks: Rebinding is like assigning to a pointer variable, e.g. int *px = pfoo; (initial binding), followed later by px = pbar; (rebinding), where both pfoo and pbar are themselves pointers to int. When the px = pbar; assignment occurs, it doesn't matter that px used to point to the same thing as pfoo, it points to something new now, and following it up with *px = 1; (mutation, not rebinding) only affects whatever pbar points to, leaving the target of pfoo unchanged.
By contrast, mutation doesn't break aliasing associations, so:
name[1] = val
does rebind name[1] itself, but it doesn't rebind name; it continues to refer to the same object as before, it just mutates that object in place, leaving all aliasing intact (so all names aliasing the same object see the result of the change).
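A minimal illustration of the difference (the names here are arbitrary):

a = [0, 1, 2]
b = a           # b aliases the same list object as a

b[1] = 'abc'    # mutation: the shared object changes, and a sees it
print(a)        # [0, 'abc', 2]

b = list()      # rebinding: b now refers to a new, empty list
print(a)        # [0, 'abc', 2]  (unchanged)
print(b)        # []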
For your specific case, you could change the "broken" functions from rebinding to in-place mutation by using slice assignment/deletion or other mutating operations, e.g.:
def func3(_in):
    # _in = list()     # BAD, rebinds
    _in.clear()        # Good, method mutates in place
    del _in[:]         # Good, equivalent to clear
    _in[:] = list()    # Acceptable; needlessly creates empty list, but closest to original
                       # code, and has same effect

def func5_no_ret(_src, _dest1, _dest2):
    # BAD, all rebinding to new lists, not changing contents of original lists
    #_dest1 = [val for val in _src[0:len(_src):2]]
    #_dest2 = [val for val in _src[1:len(_src):2]]
    #_src = list()

    # Acceptable (you should just use multiple return values, not modify caller arguments;
    # this isn't C where multiple returns are a PITA)
    _dest1[:] = _src[::2]   # Removed slice components where defaults equivalent
    _dest2[:] = _src[1::2]  # and dropped pointless listcomp; if _src might not be a list,
                            # list(_src[::2]) is still better than a no-op listcomp
    _src.clear()

    # Best (though clearing _src is still weird)
    retval = _src[::2], _src[1::2]
    _src.clear()
    return retval

    # Perhaps overly clever to avoid a named temporary:
    try:
        return _src[::2], _src[1::2]
    finally:
        _src.clear()
I have written a simple piece of code to understand how the lack of communication between child processes leads to random results when using multiprocessing.Pool. I pass in a nested dictionary as a DictProxy object made by multiprocessing.Manager:
manager = Manager()
my_dict = manager.dict()
my_dict['nested'] = nested
into a pool of 16 worker processes. The nested dictionary is defined below. The function my_function simply squares each number stored in the elements of the nested dictionary.
As expected, given the shared memory in multithreading, I get the correct result when I use multiprocessing.dummy:
{0: 1, 1: 4, 2: 9, 3: 16}
{0: 4, 1: 9, 2: 16, 3: 25}
{0: 9, 1: 16, 2: 25, 3: 36}
{0: 16, 1: 25, 2: 36, 3: 49}
{0: 25, 1: 36, 2: 49, 3: 64}
but when I use multiprocessing, the result is incorrect and completely random in each run. One example of the incorrect result is:
{0: 1, 1: 2, 2: 3, 3: 4}
{0: 4, 1: 9, 2: 16, 3: 25}
{0: 3, 1: 4, 2: 5, 3: 6}
{0: 16, 1: 25, 2: 36, 3: 49}
{0: 25, 1: 36, 2: 49, 3: 64}
In this particular run, the 'data' in 'elements' 1 and 3 was not updated. I understand that this happens because of the lack of communication between the child processes, which prevents the "updated" nested dictionary in each child process from being properly sent to the others. However, can someone help me use Manager.Queue to organize this inter-child communication and get the correct results, ideally with minimal runtime?
Code (Python 3.5)
from multiprocessing import Pool, Manager
import numpy as np

def my_function(A):
    arg1 = A[0]
    my_dict = A[1]
    temporary_dict = my_dict['nested']
    for arg2 in np.arange(len(my_dict['nested']['elements'][arg1]['data'])):
        temporary_dict['elements'][arg1]['data'][arg2] = temporary_dict['elements'][arg1]['data'][arg2] ** 2
    my_dict['nested'] = temporary_dict

if __name__ == '__main__':

    # nested dictionary definition
    strs1 = {}
    strs2 = {}
    strs3 = {}
    strs4 = {}
    strs5 = {}

    strs1['data'] = {}
    strs2['data'] = {}
    strs3['data'] = {}
    strs4['data'] = {}
    strs5['data'] = {}

    for i in [0, 1, 2, 3]:
        strs1['data'][i] = i + 1
        strs2['data'][i] = i + 2
        strs3['data'][i] = i + 3
        strs4['data'][i] = i + 4
        strs5['data'][i] = i + 5

    nested = {}
    nested['elements'] = [strs1, strs2, strs3, strs4, strs5]
    nested['names'] = ['series1', 'series2', 'series3', 'series4', 'series5']

    # parallel processing
    pool = Pool(processes=16)
    manager = Manager()
    my_dict = manager.dict()
    my_dict['nested'] = nested

    sequence = np.arange(len(my_dict['nested']['elements']))
    pool.map(my_function, ([seq, my_dict] for seq in sequence))
    pool.close()
    pool.join()

    # printing the data in all elements of the nested dictionary
    print(my_dict['nested']['elements'][0]['data'])
    print(my_dict['nested']['elements'][1]['data'])
    print(my_dict['nested']['elements'][2]['data'])
    print(my_dict['nested']['elements'][3]['data'])
    print(my_dict['nested']['elements'][4]['data'])
One way to work around this and get correct results is to use multiprocessing.Lock, but that kills the speed:
from multiprocessing import Pool, Manager, Lock
import numpy as np

def init(l):
    global lock
    lock = l

def my_function(A):
    arg1 = A[0]
    my_dict = A[1]
    with lock:
        temporary_dict = my_dict['nested']
        for arg2 in np.arange(len(my_dict['nested']['elements'][arg1]['data'])):
            temporary_dict['elements'][arg1]['data'][arg2] = temporary_dict['elements'][arg1]['data'][arg2] ** 2
        my_dict['nested'] = temporary_dict

if __name__ == '__main__':

    # nested dictionary definition
    strs1 = {}
    strs2 = {}
    strs3 = {}
    strs4 = {}
    strs5 = {}

    strs1['data'] = {}
    strs2['data'] = {}
    strs3['data'] = {}
    strs4['data'] = {}
    strs5['data'] = {}

    for i in [0, 1, 2, 3]:
        strs1['data'][i] = i + 1
        strs2['data'][i] = i + 2
        strs3['data'][i] = i + 3
        strs4['data'][i] = i + 4
        strs5['data'][i] = i + 5

    nested = {}
    nested['elements'] = [strs1, strs2, strs3, strs4, strs5]
    nested['names'] = ['series1', 'series2', 'series3', 'series4', 'series5']

    # parallel processing
    manager = Manager()
    l = Lock()
    my_dict = manager.dict()
    my_dict['nested'] = nested

    pool = Pool(processes=16, initializer=init, initargs=(l,))
    sequence = np.arange(len(my_dict['nested']['elements']))
    pool.map(my_function, ([seq, my_dict] for seq in sequence))
    pool.close()
    pool.join()

    # printing the data in all elements of the nested dictionary
    print(my_dict['nested']['elements'][0]['data'])
    print(my_dict['nested']['elements'][1]['data'])
    print(my_dict['nested']['elements'][2]['data'])
    print(my_dict['nested']['elements'][3]['data'])
    print(my_dict['nested']['elements'][4]['data'])
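For comparison, a lock-free sketch that avoids the shared Manager dict entirely: each worker squares its own element's data and returns it, and the parent merges the results after pool.map. The names square_element and work are illustrative, not from the code above:

from multiprocessing import Pool

def square_element(args):
    index, data = args  # data is a plain dict {position: value}
    return index, {k: v ** 2 for k, v in data.items()}

if __name__ == '__main__':
    # same nested structure as above, built more compactly for the sketch
    nested = {'elements': [{'data': {i: i + offset for i in range(4)}}
                           for offset in range(1, 6)],
              'names': ['series1', 'series2', 'series3', 'series4', 'series5']}

    # hand each worker only the piece it needs, then merge in the parent
    work = [(idx, elem['data']) for idx, elem in enumerate(nested['elements'])]
    with Pool(processes=4) as pool:
        for idx, squared in pool.map(square_element, work):
            nested['elements'][idx]['data'] = squared

    for elem in nested['elements']:
        print(elem['data'])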
I started learning Python 3.x some time ago and wrote a very simple piece of code which adds numbers or concatenates lists, tuples and dicts:
X = 'sth'

def adder(*vargs):
    if (len(vargs) == 0):
        print('No args given. Stopping...')
    else:
        L = list(enumerate(vargs))
        for i in range(len(L) - 1):
            if (type(L[i][1]) != type(L[i + 1][1])):
                global X
                X = 'bad'
                break
        if (X == 'bad'):
            print('Args have different types. Stopping...')
        else:
            if type(L[0][1]) == int:      # num
                temp = 0
                for i in range(len(L)):
                    temp += L[i][1]
                print('Sum is equal to:', temp)
            elif type(L[0][1]) == list:   # list
                A = []
                for i in range(len(L)):
                    A += L[i][1]
                print('List made is:', A)
            elif type(L[0][1]) == tuple:  # tuple
                A = []
                for i in range(len(L)):
                    A += list(L[i][1])
                print('Tuple made is:', tuple(A))
            elif type(L[0][1]) == dict:   # dict
                A = L[0][1]
                for i in range(len(L)):
                    A.update(L[i][1])
                print('Dict made is:', A)

adder(0, 1, 2, 3, 4, 5, 6, 7)
adder([1,2,3,4], [2,3], [5,3,2,1])
adder((1,2,3), (2,3,4), (2,))
adder(dict(a = 2, b = 433), dict(c = 22, d = 2737))
My main issue with this is the way I get out of the function when the args have different types, using the global X. I thought about it for a while, but I can't see an easier way of doing this (I can't simply put the else under the for, because then the results would be printed several times; I'm probably messing something up with my continue and break usage).
I'm sure I'm missing an easy way to do this, but I can't get it.
Thank you for any replies. If you have any advice about any other piece of the code here, I would be very grateful for additional help. I probably have a lot of bad non-Pythonic habits from my earlier C++ coding.
Here are some changes I made that I think clean it up a bit and get rid of the need for the global variable.
def adder(*vargs):
    if len(vargs) == 0:
        return None  # could raise ValueError

    mytype = type(vargs[0])
    if not all(type(x) == mytype for x in vargs):
        raise ValueError('Args have different types.')

    if mytype is int:
        print('Sum is equal to:', sum(vargs))
    elif mytype is list or mytype is tuple:
        out = []
        for item in vargs:
            out += item
        if mytype is list:
            print('List made is:', out)
        else:
            print('Tuple made is:', tuple(out))
    elif mytype is dict:
        out = {}
        for i in vargs:
            out.update(i)
        print('Dict made is:', out)

adder(0, 1, 2, 3, 4, 5, 6, 7)
adder([1,2,3,4], [2,3], [5,3,2,1])
adder((1,2,3), (2,3,4), (2,))
adder(dict(a = 2, b = 433), dict(c = 22, d = 2737))
I also made some other improvements that I think are a bit more 'pythonic'. For instance
for item in list:
    print(item)

instead of

for i in range(len(list)):
    print(list[i])
In a function like this, if there are illegal arguments you would commonly short-circuit and just raise a ValueError.
if bad_condition:
    raise ValueError('Args have different types.')
Just for contrast, here is another version that feels more pythonic to me (reasonable people might disagree with me, which is OK by me).
The principal differences are that a) type clashes are left to the operator combining the arguments, b) no assumptions are made about the types of the arguments, and c) the result is returned instead of printed. This allows combining different types in the cases where that makes sense (e.g., combine({}, zip('abcde', range(5)))).
The only assumption is that the operator used to combine the arguments is either add or a member function of the first argument's type named update.
I prefer this solution because it does minimal type checking, and uses duck-typing to allow valid but unexpected use cases.
from functools import reduce
from operator import add

def combine(*args):
    if not args:
        return None
    out = type(args[0])()
    return reduce((getattr(out, 'update', None) and (lambda d, u: [d.update(u), d][1]))
                  or add, args, out)

print(combine(0, 1, 2, 3, 4, 5, 6, 7))
print(combine([1,2,3,4], [2,3], [5,3,2,1]))
print(combine((1,2,3), (2,3,4), (2,)))
print(combine(dict(a = 2, b = 433), dict(c = 22, d = 2737)))
print(combine({}, zip('abcde', range(5))))
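For reference, on Python 3.7+ (where dicts preserve insertion order) those calls should print something like:

28
[1, 2, 3, 4, 2, 3, 5, 3, 2, 1]
(1, 2, 3, 2, 3, 4, 2)
{'a': 2, 'b': 433, 'c': 22, 'd': 2737}
{'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4}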