AST: check if two nodes are connected - python-3.x

I am parsing the code below with ast.parse, which gives me a list of ast nodes. I want to know which nodes are linked to which other nodes. Is that possible?
In the example below, the first Assign (f = lambda ...) is not connected to any other statement, but the second Assign and the FunctionDef are connected. Can this be inferred from some node attribute?
Assume there are no other definitions or code outside this snippet.
import ast

code = '''
f = lambda x : x
factor = 2
def f(arg):
    if arg == 0:
        return 1
    return factor*arg*f(arg-1)
'''
parsed = ast.parse(code)
>>> parsed.body
[<_ast.Assign at 0x7f9c81f0e670>,
<_ast.Assign at 0x7f9c92042070>,
<_ast.FunctionDef at 0x7f9c920421f0>]
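As far as I know, ast nodes carry no attribute that records such links, so they have to be derived from the names each statement binds and reads. The following is a minimal sketch under a simplifying assumption: every name read anywhere inside a top-level statement is resolved to the last top-level statement that binds it. The helpers bound_names and loaded_names are hypothetical names introduced here, and only Assign, FunctionDef and ClassDef are treated as binding statements.

```python
import ast

code = '''
f = lambda x : x
factor = 2
def f(arg):
    if arg == 0:
        return 1
    return factor*arg*f(arg-1)
'''

def bound_names(node):
    """Names that a top-level statement binds (hypothetical helper)."""
    if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
        return {node.name}
    if isinstance(node, ast.Assign):
        return {n.id for target in node.targets
                for n in ast.walk(target) if isinstance(n, ast.Name)}
    return set()

def loaded_names(node):
    """Names that are read anywhere inside a statement (hypothetical helper)."""
    return {n.id for n in ast.walk(node)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)}

body = ast.parse(code).body

# Assumption: a name refers to the last top-level statement that binds it.
last_binding = {}
for index, node in enumerate(body):
    for name in bound_names(node):
        last_binding[name] = index

# (user, definer) pairs of indices into body.
links = set()
for index, node in enumerate(body):
    for name in loaded_names(node):
        definer = last_binding.get(name)
        if definer is not None and definer != index:
            links.add((index, definer))

print(links)
```

Here the FunctionDef (index 2) is linked to the second Assign (index 1) through factor, while the first Assign stays unlinked because the name f is rebound by the function definition. Function parameters such as arg are ast.arg nodes, not ast.Name nodes, so they do not produce links.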

Related

python pass multiple parameters to function that expects function

I am stuck on a problem and I would be grateful for help.
Consider the following example.
# can be changed
def inner(x):
    x = 4
    return x
# can not be changed
def outer(a, b):
    b = b(b)
    return a + b
# can be changed - I want to pass two parameters to inner
res = outer(
    a = 3,
    b = inner
)
print(res)
Note that I cannot change the function outer(), because it comes from a common code base / package.
How can I pass multiple parameters to inner()? inner() can be changed as needed, but outer() must still receive a function, not an int, because it calls its argument.
Thank you in advance!
I tried passing multiple parameters, but outer() expects a function. This example mirrors ExternalTaskSensor from the Python package airflow, whose execution_date_fn parameter expects a function.
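One possible approach, sketched here under the assumption that inner() may take extra keyword parameters, is to bind those parameters in advance with functools.partial. The result is still a callable that accepts the single argument outer() passes in. The parameter names offset and scale are invented for illustration.

```python
from functools import partial

# can be changed: x is whatever outer() passes in,
# offset and scale are the extra (hypothetical) parameters
def inner(x, offset, scale):
    return offset * scale

# can not be changed
def outer(a, b):
    b = b(b)
    return a + b

# partial() fixes offset and scale and returns a one-argument callable
res = outer(
    a=3,
    b=partial(inner, offset=2, scale=5)
)
print(res)
```

A lambda or a small factory function that returns a closure would work the same way. partial just keeps the extra arguments visible at the call site.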

Access to the local environment of a function from another function called by the previous one

Let's take a look at this piece of code:
def a(): # N = 0
    string = "pizza"
    # stuff
    res_b = b(string)
def b(string): # N = 1
    # stuff
    res_c = c(string)
    return res_c
def c(string): # N = 2
    # stuff
    return thing
I have a long file with basically the same shape as this. I would like to remove the parameter string from the definitions and let b and c read it directly (that is, without an external dictionary) from the N-1 function. So I wonder whether a function can read the local environment of the function that called it.
Is there anything like what I am looking for?
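In CPython this is possible through frame objects, although it is usually considered fragile compared with passing the value explicitly. The sketch below assumes CPython (inspect.currentframe() may return None on other implementations). caller_local is a hypothetical helper introduced for illustration.

```python
import inspect

def caller_local(name):
    """Return local variable `name` from the caller of the function that calls this helper."""
    # f_back once: the function that called caller_local
    # f_back twice: the function that called that function (the N-1 function)
    frame = inspect.currentframe().f_back.f_back
    return frame.f_locals[name]

def a():  # N = 0
    string = "pizza"
    return b()

def b():  # N = 1
    string = caller_local("string")  # read from a()
    return c()

def c():  # N = 2
    string = caller_local("string")  # read from b()
    return string.upper()

result = a()
print(result)
```

Each function only sees the locals of its direct caller, so b() has to bind string itself before c() can read it. A missing name raises KeyError.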

How do I write multiple function outputs to a single CSV file

I am scraping multiple websites and use one function per website script. Each function returns 4 values. I want to put them in a DataFrame and write them to a CSV file, but I am facing the problem below. I may be asking something too odd or basic, but please help.
Otherwise I would have to write the whole script in one block, which would be very nasty to handle, so I am looking for a way around that. This is just a sample of the problem I am facing:
def a1(x):
    z = x+1
    r = x+2
    print(z, r)
def a2(x):
    y = x+4
    t = x+3
    print(y, t)
x = 2
a1(x)
a2(x)
3 4
6 5
data = pd.DataFrame({'first' : [z],
                     'second' : [r],
                     'third' : [y],
                     'fourth' : [t]
                     })
data
*error 'z' is not defined*
You may find it convenient to write functions that return a list of dicts.
For example:
rows = [dict(a=1, b=2, c=3),
dict(a=4, b=5, c=6)]
df = pd.DataFrame(rows)
The variables are only defined in the local scope of your functions. You would either need to declare them as globals or, better, return them so that you can use them outside the function by assigning the return values to new variables:
import pandas as pd
def a1(x):
    z = x+1
    r = x+2
    return (z, r)
def a2(x):
    y = x+4
    t = x+3
    return (y, t)
x = 2
z, r = a1(x)
y, t = a2(x)
data = pd.DataFrame({'first' : [z],
                     'second' : [r],
                     'third' : [y],
                     'fourth' : [t]
                     })
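With pandas, the remaining step is data.to_csv('output.csv', index=False). As an alternative that needs no third-party package, the sketch below combines the returned values into one row and writes it with the standard-library csv module. Returning dicts instead of tuples and writing to an in-memory buffer are choices made for this illustration, not part of the original answers.

```python
import csv
import io

def a1(x):
    # return the values, keyed by the column they belong to
    return {"first": x + 1, "second": x + 2}

def a2(x):
    return {"third": x + 4, "fourth": x + 3}

x = 2
row = {**a1(x), **a2(x)}  # merge both results into a single CSV row

# io.StringIO stands in for open("output.csv", "w", newline="")
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)

csv_text = buffer.getvalue()
print(csv_text)
```

For several scraped sites, one row per site can be written by calling writer.writerow() in a loop, or by collecting the dicts in a list and passing it to writer.writerows().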

Problem with calling a variable from one function into another

I am trying to call a variable from one function into another by using the command return, without success. This is the example code I have:
def G():
    x = 2
    y = 3
    g = x*y
    return g
def H():
    r = 2*G(g)
    print(r)
    return r
H()
When I run the code I receive the following error: NameError: name 'g' is not defined
Thanks in advance!
Your function G() returns a value. When you call it, assign the returned value to a new variable. Do not pass g as an argument: G() takes no parameters, and g does not exist outside G().
You could therefore use the following code:
def H():
    g = G()
    r = 2*g
    print(r)
You don't need this statement:
return r
While you've accepted the answer above, I'd like to take the time to help you learn and clean up your code.
NameError: name 'g' is not defined
You're getting this error because g is a local variable of the function G()
Clean Version:
def multiply_two_numbers():
    """
    Multiplies two numbers
    Args:
        none
    Returns:
        product : the result of multiplying two numbers
    """
    x = 2
    y = 3
    product = x*y
    return product
def main():
    result = multiply_two_numbers()
    answer = 2 * result
    print(answer)
if __name__ == "__main__":
    # execute only if run as a script
    main()
Problems with your code:
Use clear variable and function names. g and G can be quite confusing to the reader.
You're not using the if __name__ == "__main__": guard.
The return in H() is unnecessary, as is the H() function itself.
Use docstrings to help make your code more readable.
Questions from the comments:
I have one question what if I had two or more variables in the first
function but I only want to call one of them
Your function can have as many variables as you want. If you want to return more than one of them, you can use a dictionary (key, value), a list, or a tuple. It all depends on your requirements.
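As a small illustration of returning several values and using only one of them, here is a sketch based on the G() function from the question. Returning x and y alongside g is an assumption made for the example.

```python
def G():
    x = 2
    y = 3
    g = x * y
    # Returning several names at once packs them into a tuple.
    return x, y, g

# Unpack everything, using _ for the values that are not needed.
_, _, g = G()
r = 2 * g
print(r)

# Or index the returned tuple to pick a single value.
g_again = G()[2]
```

A dictionary return value, such as {"x": x, "y": y, "g": g}, trades the positional unpacking for lookups by name, which tends to read better once there are more than two or three values.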
Is it necessary to give different names, a and b, to the new
variables or can I use the same x and g?
It is not strictly necessary, since each function has its own local scope, but distinct names are clearer. Within a single scope, assigning to x or g again overwrites the previous value, which can make the code hard to debug and frustrating for you and other readers.

Implement Kahn's topological sorting algorithm using Python

Kahn proposed an algorithm in 1962 to topologically sort any DAG (directed acyclic graph). Pseudocode copied from Wikipedia:
L ← Empty list that will contain the sorted elements
S ← Set of all nodes with no incoming edges
while S is non-empty do
    remove a node n from S
    add n to tail of L
    for each node m with an edge e from n to m do
        remove edge e from the graph # This is a DESTRUCTIVE step!
        if m has no other incoming edges then
            insert m into S
if graph has edges then
    return error (graph has at least one cycle)
else
    return L (a topologically sorted order)
I need to implement it using IPython3, with the following implementation of a DAG:
class Node(object):
    def __init__(self, name, parents):
        assert isinstance(name, str)
        assert all(isinstance(_, Node) for _ in parents)
        self.name, self.parents = name, parents
where name is the label for the node and parents stores all of its parent nodes. Then the DAG class is implemented as:
class DAG(object):
    def __init__(self, *nodes):
        assert all(isinstance(_, Node) for _ in nodes)
        self.nodes = nodes
(The DAG implementation is fixed and not to be improved.) Then I need to implement Kahn's algorithm as a function top_order which takes in a DAG instance and returns an ordering like (node_1, node_2, ..., node_n). The main trouble is, this algorithm is destructive because one of its steps is remove edge e from the graph (line 5) which will delete one member of m.parents. However, I have to leave the DAG instance intact.
One way I can think of so far is to create a deep copy of the DAG instance passed in and run the destructive algorithm on the copy. (A shallow copy won't do, because the algorithm would still destroy the original instance through shared references.) I then take the resulting ordering of node names from the copy (assuming there are no naming conflicts between nodes) and use it to recover the corresponding ordering of the nodes of the original instance. Roughly:
import copy

def top_order(network):
    '''takes in a DAG, prints and returns a topological ordering.'''
    assert type(network) == DAG
    temp = copy.deepcopy(network) # to leave the original instance intact
    ordering_name = []
    roots = [node for node in temp.nodes if not node.parents]
    while roots:
        n_node = roots[0]
        del roots[0]
        ordering_name.append(n_node.name)
        for m_node in temp.nodes:
            if n_node in m_node.parents:
                temp_list = list(m_node.parents)
                temp_list.remove(n_node)
                m_node.parents = tuple(temp_list)
                if not m_node.parents:
                    roots.append(m_node)
    print(ordering_name) # print ordering by name
    # gets ordering of nodes of the original instance
    ordering = []
    for name in ordering_name:
        for node in network.nodes:
            if node.name == name:
                ordering.append(node)
    return tuple(ordering)
Two problems: first, when the network is huge, the deep copy will be resource-consuming; second, I would like to improve the nested for loops that recover the ordering of the original instance. (For the second, something like sorted comes to mind.)
Any suggestions?
I'm going to suggest a less literal implementation of the algorithm: you don't need to manipulate the DAG at all, you just need to manipulate info about the DAG. The only "interesting" things the algorithm needs are a mapping from a node to its children (the opposite of what your DAG actually stores), and a count of the number of each node's parents.
These are easy to compute, and dicts can be used to associate this info with node names (assuming all names are distinct - if not, you can invent unique names with a bit more code).
Then this should work:
def topsort(dag):
    name2node = {node.name: node for node in dag.nodes}
    # map name to number of predecessors (parents)
    name2npreds = {}
    # map name to list of successors (children)
    name2succs = {name: [] for name in name2node}
    for node in dag.nodes:
        thisname = node.name
        name2npreds[thisname] = len(node.parents)
        for p in node.parents:
            name2succs[p.name].append(thisname)
    result = [n for n, npreds in name2npreds.items() if npreds == 0]
    for p in result:
        for c in name2succs[p]:
            npreds = name2npreds[c]
            assert npreds
            npreds -= 1
            name2npreds[c] = npreds
            if npreds == 0:
                result.append(c)
    if len(result) < len(name2node):
        raise ValueError("no topsort - cycle")
    return tuple(name2node[p] for p in result)
There's one subtle point here: the outer loop appends to result while it's iterating over result. That's intentional. The effect is that every element in result is processed exactly once by the outer loop, regardless of whether an element was in the initial result or added later.
Note that while the input DAG and Nodes are traversed, nothing in them is altered.
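To make the non-destructive idea concrete, here is a self-contained sketch of the same bookkeeping approach with an explicit queue (collections.deque) instead of appending to the list being iterated. The Node and DAG classes are simplified stand-ins for the ones in the question (assertions omitted), and the four-node graph is made up for the example.

```python
from collections import deque

class Node:
    def __init__(self, name, parents):
        self.name, self.parents = name, parents

class DAG:
    def __init__(self, *nodes):
        self.nodes = nodes

def top_order(dag):
    # Number of parents not yet emitted, keyed by node name.
    remaining = {node.name: len(node.parents) for node in dag.nodes}
    # Reverse mapping: node name -> list of child nodes.
    children = {node.name: [] for node in dag.nodes}
    for node in dag.nodes:
        for parent in node.parents:
            children[parent.name].append(node)

    queue = deque(node for node in dag.nodes if not node.parents)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children[node.name]:
            # "Remove the edge" by decrementing a counter, not by editing the DAG.
            remaining[child.name] -= 1
            if remaining[child.name] == 0:
                queue.append(child)

    if len(order) < len(dag.nodes):
        raise ValueError("graph has at least one cycle")
    return tuple(order)

# Hypothetical example graph: a -> b -> c -> d, plus a -> c.
a = Node("a", ())
b = Node("b", (a,))
c = Node("c", (a, b))
d = Node("d", (c,))
dag = DAG(d, c, b, a)

order = top_order(dag)
names = [node.name for node in order]
print(names)
```

Because the function returns the original Node objects and only mutates its own dictionaries, no deep copy and no name-to-node lookup loop is needed afterwards.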
