Wrong networkx predecessors when used inside a function - python-3.x

I am trying to find the parents of my nodes in the graph G but when I use the predecessor method inside a function my filtering method returns the wrong answer.
MWE:
import networkx as nx
G=nx.MultiDiGraph()
G.add_node("Z_1")
G.add_node("Z_0")
G.add_node("X_1")
G.add_edge('X_1','Z_1')
G.add_edge('Z_0','Z_1')
Simple function to find nodes at different time-indices:
def node_parents(node: str, temporal_index: int = None) -> tuple:
    # Returns the parents of this node with optional filtering on the time-index.
    if temporal_index:
        # return (*[v for v in G.predecessors(node) if v.split("_")[1] == str(temporal_index)],)
        return tuple(filter(lambda x: x.endswith(str(temporal_index)), G.predecessors(node)))
    else:
        return tuple(G.predecessors(node))
Now then, let's use the function:
node_parents("Z_1",0)
>>>('X_1', 'Z_0')
Ok. Let's use the predecessor method in a filter outside the function:
(*[v for v in G.predecessors('Z_1') if v.split("_")[1] == "0"],)
>>>('Z_0',)
All I want to do is to filter out, in this example, nodes which are zero-indexed (i.e. strings which have a zero at the end). But for some reason I am getting different answers. Why is this?

Thanks to @Paul Brodersen, the correct way to write this is to test against None explicitly, because a temporal_index of 0 is falsy and `if temporal_index:` therefore skips the filter:
def node_parents(node: str, temporal_index: int = None) -> tuple:
    # Returns the parents of this node with optional filtering on the time-index.
    if temporal_index is not None:
        # return (*[v for v in G.predecessors(node) if v.split("_")[1] == str(temporal_index)],)
        return tuple(filter(lambda x: x.endswith(str(temporal_index)), G.predecessors(node)))
    else:
        return tuple(G.predecessors(node))
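The difference between the two versions comes down to how Python treats 0 in a boolean context. Here is a minimal sketch that reproduces it without networkx. The helper names and the plain list standing in for G.predecessors('Z_1') are assumptions made for illustration only.

```python
def filter_by_index_buggy(names, temporal_index=None):
    # `if temporal_index:` is False for both None and 0, so index 0 is never filtered.
    if temporal_index:
        return tuple(n for n in names if n.endswith(str(temporal_index)))
    return tuple(names)


def filter_by_index_fixed(names, temporal_index=None):
    # Comparing against None lets 0 through as a real index.
    if temporal_index is not None:
        return tuple(n for n in names if n.endswith(str(temporal_index)))
    return tuple(names)


parents = ["X_1", "Z_0"]  # stand-in for G.predecessors("Z_1")
buggy = filter_by_index_buggy(parents, 0)
fixed = filter_by_index_fixed(parents, 0)
print(buggy)  # ('X_1', 'Z_0')
print(fixed)  # ('Z_0',)
```

Any other index, such as 1, behaves the same in both versions. Only 0 exposes the bug.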

Related

python pass multiple parameters to function that expects function

I am stuck on a problem and I would be grateful for help.
Consider the following example.
# can be changed
def inner(x):
    x = 4
    return x
# can not be changed
def outer(a, b):
    b = b(b)
    return a + b
# can be changed - I want to pass two parameters to inner
res = outer(
    a = 3,
    b = inner
)
print(res)
Now important to note is that I can not change the function outer(), because it comes from a common code base / package.
How can I pass multiple parameters to inner()? inner() can be changed as needed, but it needs to return the function, not the int since outer() expects a function.
Thank you in advance!
I tried passing multiple parameters, but outer() expects a function. This example mirrors ExternalTaskSensor from the Python package airflow, whose execution_date_fn parameter expects a function.
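One possible approach, sketched here under the assumption that the extra values are known before outer() is called, is to pre-bind them with functools.partial. The result is still a callable, so outer() can stay unchanged. The parameter names offset and scale are hypothetical.

```python
from functools import partial


def inner(x, offset, scale):
    # x is whatever outer() passes in; offset and scale were bound in advance.
    return offset * scale


def outer(a, b):  # unchanged, as in the question
    b = b(b)
    return a + b


# partial() returns a callable that already carries the two extra arguments.
res = outer(a=3, b=partial(inner, offset=2, scale=5))
print(res)  # 13
```

A lambda or a closure would work the same way. partial just keeps the bound arguments inspectable.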

Identifying and handling more than one dataframe in Python [duplicate]

Suppose I have code like:
x = 0
y = 1
z = 2
my_list = [x, y, z]
for item in my_list:
    print("handling object ", name(item)) # <--- what would go instead of `name`?
How can I get the name of each object in Python? That is to say: what could I write instead of name in this code, so that the loop will show handling object x and then handling object y and handling object z?
In my actual code, I have a dict of functions that I will call later after looking them up with user input:
def fun1():
    pass
def fun2():
    pass
def fun3():
    pass
fun_dict = {'fun1': fun1,
            'fun2': fun2,
            'fun3': fun3}
# suppose that we get the name 'fun3' from the user
fun_dict['fun3']()
How can I create fun_dict automatically, without writing the names of the functions twice? I would like to be able to write something like
fun_list = [fun1, fun2, fun3] # and I'll add more as the need arises
fun_dict = {}
for t in fun_list:
    fun_dict[name(t)] = t
to avoid duplicating the names.
Objects do not necessarily have names in Python, so you can't get the name.
When you create a variable, like the x, y, z above then those names just act as "pointers" or "references" to the objects. The object itself does not know what name(s) you are using for it, and you can not easily (if at all) get the names of all references to that object.
However, it's not unusual for objects to have a __name__ attribute. Functions do have a __name__ (unless they are lambdas), so we can build fun_dict by doing e.g.
fun_dict = {t.__name__: t for t in fun_list}
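Another common way to avoid writing each name twice, not taken from the answer above, is to let a decorator fill the dictionary when each function is defined. The `register` helper below is a hypothetical name used for this sketch.

```python
fun_dict = {}


def register(func):
    """Store func in fun_dict under its own __name__ and return it unchanged."""
    fun_dict[func.__name__] = func
    return func


@register
def fun1():
    return "one"


@register
def fun2():
    return "two"


@register
def fun3():
    return "three"


# suppose that we get the name 'fun3' from the user
print(fun_dict["fun3"]())  # three
```

With this layout, adding a new command only requires defining the function with the decorator. There is no separate list to keep in sync.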
That's not really possible, as there could be multiple variables that have the same value, or a value might have no variable, or a value might have the same value as a variable only by chance.
If you really want to do that, you can use
def variable_for_value(value):
    for n,v in globals().items():
        if v == value:
            return n
    return None
However, it would be better if you would iterate over names in the first place:
my_list = ["x", "y", "z"] # x, y, z have been previously defined
for name in my_list:
    print("handling variable ", name)
    bla = globals()[name]
    # do something to bla
This one-liner works, for all types of objects, as long as they are in globals() dict, which they should be:
def name_of_global_obj(xx):
    return [objname for objname, oid in globals().items()
            if id(oid)==id(xx)][0]
or, equivalently:
def name_of_global_obj(xx):
    for objname, oid in globals().items():
        if oid is xx:
            return objname
As others have mentioned, this is a really tricky question. Solutions to this are not "one size fits all", not even remotely. The difficulty (or ease) is really going to depend on your situation.
I have come to this problem on several occasions, but most recently while creating a debugging function. I wanted the function to take some unknown objects as arguments and print their declared names and contents. Getting the contents is easy of course, but the declared name is another story.
What follows is some of what I have come up with.
Return function name
Determining the name of a function is really easy as it has the __name__ attribute containing the function's declared name.
name_of_function = lambda x : x.__name__
def name_of_function(arg):
    try:
        return arg.__name__
    except AttributeError:
        pass
Just as an example, if you create the function def test_function(): pass, then copy_function = test_function, then name_of_function(copy_function), it will return test_function.
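The example just described can be run as follows. This is a small sketch; the lambda and the integer are extra cases added here to show what `__name__` does and does not give you.

```python
def name_of_function(arg):
    try:
        return arg.__name__
    except AttributeError:
        return None


def test_function():
    pass


copy_function = test_function   # a second name bound to the same function object
square = lambda x: x * x        # lambdas all report the same placeholder name

print(name_of_function(copy_function))  # test_function
print(name_of_function(square))         # <lambda>
print(name_of_function(42))             # None
```

Note that `__name__` records the name used in the `def` statement, not the variable the function is currently reached through.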
Return first matching object name
Check whether the object has a __name__ attribute and return it if so (declared functions only). Note that you may remove this test as the name will still be in globals().
Compare the value of arg with the values of items in globals() and return the name of the first match. Note that I am filtering out names starting with '_'.
The result will consist of the name of the first matching object otherwise None.
def name_of_object(arg):
    # check __name__ attribute (functions)
    try:
        return arg.__name__
    except AttributeError:
        pass
    for name, value in globals().items():
        if value is arg and not name.startswith('_'):
            return name
Return all matching object names
Compare the value of arg with the values of items in globals() and store names in a list. Note that I am filtering out names starting with '_'.
The result will consist of a list (for multiple matches), a string (for a single match), otherwise None. Of course you should adjust this behavior as needed.
def names_of_object(arg):
    results = [n for n, v in globals().items() if v is arg and not n.startswith('_')]
    return results[0] if len(results) == 1 else results if results else None
If you are looking to get the names of functions or lambdas or other function-like objects that are defined in the interpreter, you can use dill.source.getname from dill. It pretty much looks for the __name__ attribute, but in certain cases it knows other magic for how to find the name... or a name for the object. I don't want to get into an argument about finding the one true name for a python object, whatever that means.
>>> from dill.source import getname
>>>
>>> def add(x,y):
...     return x+y
...
>>> squared = lambda x:x**2
>>>
>>> getname(add)
'add'
>>> getname(squared)
'squared'
>>>
>>> class Foo(object):
...     def bar(self, x):
...         return x*x+x
...
>>> f = Foo()
>>>
>>> getname(f.bar)
'bar'
>>>
>>> woohoo = squared
>>> plus = add
>>> getname(woohoo)
'squared'
>>> getname(plus)
'add'
Use a reverse dict.
fun_dict = {'fun1': fun1,
'fun2': fun2,
'fun3': fun3}
r_dict = dict(zip(fun_dict.values(), fun_dict.keys()))
The reverse dict will map each function reference to the exact name you gave it in fun_dict, which may or may not be the name you used when you defined the function. And, this technique generalizes to other objects, including integers.
For extra fun and insanity, you can store the forward and reverse values in the same dict. I wouldn't do that if you were mapping strings to strings, but if you are doing something like function references and strings, it's not too crazy.
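A runnable sketch of the reverse dict follows. It relies on the values being hashable (functions are), and the names `alias` and `both` are introduced here for illustration only.

```python
def fun1():
    pass


def fun2():
    pass


def fun3():
    pass


fun_dict = {"fun1": fun1, "fun2": fun2, "fun3": fun3}

# Reverse mapping: function object -> the name it was registered under.
r_dict = {func: name for name, func in fun_dict.items()}

alias = fun2                 # another variable bound to the same function
print(r_dict[alias])         # fun2

# Forward and reverse entries in one dict; the keys cannot clash here
# because one side is strings and the other is function objects.
both = {**fun_dict, **r_dict}
print(both["fun1"] is fun1)  # True
print(both[fun1])            # fun1
```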
Note that while, as noted, objects in general do not and cannot know what variables are bound to them, functions defined with def do have names in the __name__ attribute (the name used in def). Also if the functions are defined in the same module (as in your example) then globals() will contain a superset of the dictionary you want.
def fun1():
    pass
def fun2():
    pass
def fun3():
    pass
fun_dict = {}
for f in [fun1, fun2, fun3]:
    fun_dict[f.__name__] = f
Here's another way to think about it. Suppose there were a name() function that returned the name of its argument. Given the following code:
def f(a):
    return a
b = "x"
c = b
d = f(c)
e = [f(b), f(c), f(d)]
What should name(e[2]) return, and why?
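Running the snippet with identity checks makes the difficulty concrete: every element of e is the very same string object that b, c and d refer to, so no single name is the right answer. This sketch adds only the checks at the end.

```python
def f(a):
    return a


b = "x"
c = b
d = f(c)
e = [f(b), f(c), f(d)]

# All three list elements are the one object created by the literal "x".
same_object = all(item is b for item in e)
print(same_object)               # True
print(e[2] is c, e[2] is d)      # True True
```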
And the reason I want to have the name of the function is because I want to create fun_dict without writing the names of the functions twice, since that seems like a good way to create bugs.
For this purpose you have a wonderful getattr function, that allows you to get an object by known name. So you could do for example:
funcs.py:
def func1(): pass
def func2(): pass
main.py:
import funcs
option = command_line_option()
getattr(funcs, option)()
I know this is a late answer.
To get a function's name, you can use func.__name__.
To get the name of any Python object that has no name or __name__ attribute, you can iterate over the members of its module.
Example:
# package.module1.py
obj = MyClass()
# package.module2.py
import importlib
def get_obj_name(obj):
    mod = obj.__module__ # This is necessary to
    module = importlib.import_module(mod)
    for name, o in module.__dict__.items():
        if o == obj:
            return name
Performance note: don't use it in large modules.
Variable names can be found in the globals() and locals() dicts. But they won't give you what you're looking for above. "bla" will contain the value of each item of my_list, not the variable.
Generally when you are wanting to do something like this, you create a class to hold all of these functions and name them with some clear prefix cmd_ or the like. You then take the string from the command, and try to get that attribute from the class with the cmd_ prefixed to it. Now you only need to add a new function/method to the class, and it's available to your callers. And you can use the doc strings for automatically creating the help text.
As described in other answers, you may be able to do the same approach with globals() and regular functions in your module to more closely match what you asked for.
Something like this:
class Tasks:
    def cmd_doit(self):
        # do it here
        pass

func_name = parse_commandline()
try:
    func = getattr(Tasks(), 'cmd_' + func_name)
except AttributeError:
    # bad command: exit or whatever
    raise SystemExit('unknown command: ' + func_name)
func()
I ran into this page while wondering the same question.
As others have noted, it's simple enough to just grab the __name__ attribute from a function in order to determine the name of the function. It's marginally trickier with objects that don't have a sane way to determine __name__, i.e. base/primitive objects like basestring instances, ints, longs, etc.
Long story short, you could probably use the inspect module to make an educated guess about which one it is, but you would have to probably know what frame you're working in/traverse down the stack to find the right one. But I'd hate to imagine how much fun this would be trying to deal with eval/exec'ed code.
% python2 whats_my_name_again.py
needle => ''b''
['a', 'b']
[]
needle => '<function foo at 0x289d08ec>'
['c']
['foo']
needle => '<function bar at 0x289d0bfc>'
['f', 'bar']
[]
needle => '<__main__.a_class instance at 0x289d3aac>'
['e', 'd']
[]
needle => '<function bar at 0x289d0bfc>'
['f', 'bar']
[]
%
whats_my_name_again.py:
#!/usr/bin/env python
import inspect
class a_class:
    def __init__(self):
        pass
def foo():
    def bar():
        pass
    a = 'b'
    b = 'b'
    c = foo
    d = a_class()
    e = d
    f = bar
    #print('globals', inspect.stack()[0][0].f_globals)
    #print('locals', inspect.stack()[0][0].f_locals)
    assert(inspect.stack()[0][0].f_globals == globals())
    assert(inspect.stack()[0][0].f_locals == locals())
    in_a_haystack = lambda: value == needle and key != 'needle'
    for needle in (a, foo, bar, d, f, ):
        print("needle => '%r'" % (needle, ))
        print([key for key, value in locals().iteritems() if in_a_haystack()])
        print([key for key, value in globals().iteritems() if in_a_haystack()])
foo()
You can define a class and add the special method __unicode__ to it, like this:
class example:
    def __init__(self, name):
        self.name = name
    def __unicode__(self):
        return self.name
Of course, you have to add an extra attribute, self.name, which holds the name of the object. (__unicode__ is a Python 2 hook; in Python 3 the equivalent is __str__.)
Here is my answer; I am also using globals().items():
def get_name_of_obj(obj, except_word = ""):
    for name, item in globals().items():
        if item == obj and name != except_word:
            return name
I added except_word because I want to filter out the name of the loop variable.
If you don't pass it, the loop variable can confuse this function: a name like "each_item" in the following case may show up in the function's result, depending on what your loop has done.
e.g.
for each_item in [objA, objB, objC]:
    get_name_of_obj(each_item, "each_item")
eg.
>>> objA = [1, 2, 3]
>>> objB = ('a', {'b':'thi is B'}, 'c')
>>> for each_item in [objA, objB]:
... get_name_of_obj(each_item)
...
'objA'
'objB'
>>>
>>>
>>> for each_item in [objA, objB]:
... get_name_of_obj(each_item)
...
'objA'
'objB'
>>>
>>>
>>> objC = [{'a1':'a2'}]
>>>
>>> for item in [objA, objB, objC]:
... get_name_of_obj(item)
...
'objA'
'item' <<<<<<<<<< --------- this is no good
'item'
>>> for item in [objA, objB]:
... get_name_of_obj(item)
...
'objA'
'item' <<<<<<<<--------this is no good
>>>
>>> for item in [objA, objB, objC]:
... get_name_of_obj(item, "item")
...
'objA'
'objB' <<<<<<<<<<--------- now it's ok
'objC'
>>>
Hope this can help.
Based on what it looks like you're trying to do you could use this approach.
In your case, your functions would all live in the module foo. Then you could:
import foo
func_name = parse_commandline()
method_to_call = getattr(foo, func_name)
result = method_to_call()
Or more succinctly:
import foo
result = getattr(foo, parse_commandline())()
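Since the name comes from user input, it may help to handle unknown names explicitly. The sketch below uses the three-argument form of getattr. The `dispatch` helper is hypothetical, and a SimpleNamespace stands in for the imported module foo so that the example is self-contained.

```python
from types import SimpleNamespace


def func1():
    return "ran func1"


def func2():
    return "ran func2"


# Stand-in for `import foo`; a real module object works the same way with getattr.
foo = SimpleNamespace(func1=func1, func2=func2)


def dispatch(module, func_name):
    # getattr's third argument is returned instead of raising AttributeError.
    handler = getattr(module, func_name, None)
    if not callable(handler):
        raise ValueError(f"unknown command: {func_name!r}")
    return handler()


print(dispatch(foo, "func2"))  # ran func2
```

With a real module, it is usually worth restricting the accepted names further, because getattr will happily return any attribute of the module, not only the intended commands.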
Python has names which are mapped to objects in a hashmap called a namespace. At any instant in time, a name always refers to exactly one object, but a single object can be referred to by any arbitrary number of names. Given a name, it is very efficient for the hashmap to look up the single object which that name refers to. However given an object, which as mentioned can be referred to by multiple names, there is no efficient way to look up the names which refer to it. What you have to do is iterate through all the names in the namespace and check each one individually and see if it maps to your given object. This can easily be done with a list comprehension:
[k for k,v in locals().items() if v is myobj]
This will evaluate to a list of strings containing the names of all local "variables" which are currently mapped to the object myobj.
>>> a = 1
>>> this_is_also_a = a
>>> this_is_a = a
>>> b = "ligma"
>>> c = [2,3, 534]
>>> [k for k,v in locals().items() if v is a]
['a', 'this_is_also_a', 'this_is_a']
Of course locals() can be substituted with any dict that you want to search for names that point to a given object. Obviously this search can be slow for very large namespaces because they must be traversed in their entirety.
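The same search can be wrapped in a small helper that takes the namespace as an argument. This is a sketch; `names_of` and the sample dict are made up for illustration. It also shows why `is` rather than `==` is the right test: an equal but distinct object is not reported.

```python
def names_of(obj, namespace):
    """Return every name in `namespace` that is bound to exactly this object."""
    return [name for name, value in namespace.items() if value is obj]


shared = [2, 3, 534]
namespace = {
    "c": shared,
    "also_c": shared,
    "other": [2, 3, 534],   # equal contents, but a different object
}

print(names_of(shared, namespace))     # ['c', 'also_c']
print(names_of(object(), namespace))   # []
```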
Hi, one way to get the name of the variable that stores an instance of a class is to use the locals() function. It returns a dictionary that maps each variable name, as a string, to its value.

Interviewbit - Merge k sorted linked lists: heappop returns max element instead of min

I'm solving the Interviewbit code challenge Merge K Sorted Lists:
Merge k sorted linked lists and return it as one sorted list.
Example :
1 -> 10 -> 20
4 -> 11 -> 13
3 -> 8 -> 9
will result in
1 -> 3 -> 4 -> 8 -> 9 -> 10 -> 11 -> 13 -> 20
The Python template code is:
# Definition for singly-linked list.
# class ListNode:
#     def __init__(self, x):
#         self.val = x
#         self.next = None
class Solution:
    # @param A : list of linked list
    # @return the head node in the linked list
    def mergeKLists(self, A):
        pass
Here's my python 3 solution for the same:
from heapq import heapify, heappop, heappush
class Solution:
    # @param A : list of linked list
    # @return the head node in the linked list
    def mergeKLists(self, A):
        minheap = [x for x in A]
        # print(minheap)
        # heapify(minheap)
        # print(minheap)
        head = tail = None
        # print(minheap)
        while minheap:
            # print(minheap)
            heapify(minheap)
            print([x.val for x in minheap])
            minimum = heappop(minheap)
            print(minimum.val)
            if head is None:
                head = minimum
                tail = minimum
            else:
                tail.next = minimum
                tail = minimum
            if minimum.next:
                heappush(minheap, minimum.next)
        return head
With the print commands that are uncommented, you'll notice that in the intermediate runs of the while loop, heappop returns the largest element, as if we were dealing with a max heap, which we're not!
That's the place where the answer is going wrong as far as I can see. Can anyone suggest the reason for why heappop is working like this? And how that can be corrected?
When I run your code locally with sample data, I get an error on:
heapify(minheap)
TypeError: '<' not supported between instances of 'ListNode' and 'ListNode'
This is expected. The template definition of ListNode shows no support for making comparisons, and a heapify function will need to compare the items in the given list.
As the class ListNode is already defined by the code-challenge framework, it is probably better not to try to make that class comparable.
I would propose to put tuples on the heap which have list node instances as members, but have their val attribute value come first, followed by the number of the list (in A) they originate from. As third tuple member you'd finally have the node itself. This way comparisons will work, since tuples are comparable when their members are. And since the second tuple member will be a tie-breaker when the first member value is the same, the third tuple member (the list node instance) will never be subject to a comparison.
Unrelated to your question, but you should only heapify once, not in each iteration of the loop. The actions on the heap (heappush, heappop) maintain the heap property, so there is no need for calling heapify a second time. If you do it in each iteration, you actually destroy the efficiency benefit you would get from using a heap.
Here is your code updated with that change:
from heapq import heapify, heappop, heappush
class Solution:
    def mergeKLists(self, A):
        # place comparable tuples in the heap
        minheap = [(node.val, i, node) for i, node in enumerate(A)]
        heapify(minheap) # call only once
        head = tail = None
        while minheap:
            # extract the tuple information we need
            _, i, minimum = heappop(minheap)
            if head is None:
                head = minimum
                tail = minimum
            else:
                tail.next = minimum
                tail = minimum
            minimum = minimum.next
            if minimum:
                # push a tuple, using same list index
                heappush(minheap, (minimum.val, i, minimum))
        return head
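The role of the list index as a tie-breaker can be seen in isolation. In this sketch, `Node` is a stand-in for ListNode and is an assumption of the example. Like the template class, it defines no ordering, so putting bare nodes on the heap fails while the tuples work even when two values are equal.

```python
from heapq import heapify, heappop


class Node:
    """Stand-in for ListNode: no __lt__, so comparing two nodes raises TypeError."""

    def __init__(self, val):
        self.val = val


# Bare nodes cannot be ordered by heapq.
try:
    heapify([Node(4), Node(1)])
    bare_nodes_ok = True
except TypeError:
    bare_nodes_ok = False

# Tuples compare element by element; the unique index i settles ties on val,
# so the comparison never reaches the Node in third position.
nodes = [Node(4), Node(1), Node(1)]
heap = [(node.val, i, node) for i, node in enumerate(nodes)]
heapify(heap)

order = []
while heap:
    val, i, _ = heappop(heap)
    order.append((val, i))

print(bare_nodes_ok)  # False
print(order)          # [(1, 1), (1, 2), (4, 0)]
```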

multiple assignment for functions with *args

I am trying to write a function that will be used on multiple dictionaries of dataframes. My hope is to perform multiple assignments and do it all in one line. for example:
x, y, z = function(x, y, z)
However, with the function, I can't return multiple values for the multiple assignments. This is what I currently have
def split_pre(*args):
    for arg in args:
        newdict = {}
        for key, sheet in arg.items():
            if isinstance(sheet, str):
                continue
            else:
                newdict[key] = sheet[sheet.Year < 2000]
        return newdict
My thinking is that for each arg it would return the dictionary I created, but I get:
ValueError: too many values to unpack (expected 2)
The inputs to this function would be a dictionary made up of dataframes, e.g.,
x = {1:df, 2:df, 3:df...}
and the desired output would be of the same structure, but with the altered dfs from the function
I'm still quite new to python and this isn't super important, but I was wondering if anyone knew of a succinct way to get at this.
Do you want to return a dictionary per arg?
As already stated by @DeepSpace, Python stops processing the function when the first return statement is executed. You can fix your problem in two ways: either create a list where you collect the dictionaries you want to return, or create a generator function:
# Solution with a list
def split_pre(*args):
    ans = []
    for arg in args:
        newdict = {}
        for key, sheet in arg.items():
            if isinstance(sheet, str):
                continue
            else:
                newdict[key] = sheet[sheet.Year < 2000]
        ans.append(newdict)
    return ans
or
# Solution with a generator
def split_pre(*args):
    for arg in args:
        newdict = {}
        for key, sheet in arg.items():
            if isinstance(sheet, str):
                continue
            else:
                newdict[key] = sheet[sheet.Year < 2000]
        yield newdict
In case you call a function in the way you do (a, b, c = func(x, y, z)) both samples are going to work in the same way. But they are not actually the same and I'd recommend using the solution with lists if you're not familiar with generators (you can read more about the yield keyword here)
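A small sketch of where the two variants agree and where they differ. To keep it self-contained it uses plain dicts of numbers instead of dataframes; the function names and the doubling step are placeholders for the real filtering.

```python
def double_all_list(*args):
    results = []
    for arg in args:
        results.append({key: value * 2 for key, value in arg.items()})
    return results


def double_all_gen(*args):
    for arg in args:
        yield {key: value * 2 for key, value in arg.items()}


x, y = {1: 10}, {2: 20}

# Tuple unpacking works identically with both variants.
a, b = double_all_list(x, y)
c, d = double_all_gen(x, y)
print(a == c, b == d)        # True True

# The difference: a generator can only be consumed once.
gen = double_all_gen(x, y)
first_pass = list(gen)
second_pass = list(gen)
print(len(first_pass), second_pass)  # 2 []
```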

Generate all valid binary search trees given a list of values

Hello I am trying to solve the following question on leetcode, [https://leetcode.com/problems/unique-binary-search-trees-ii/].
I know I have access to the solution but I tried solving the problem my way and I am stuck and I would like to know if it is solvable the way I am doing it.
Here is the code:
class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None
def generateTrees(myrange, n, res = None):
    if res == None:
        res = []
    if myrange == []:
        res.append(None)
        return
    for root in myrange:
        res.append(root)
        generateTrees([i for i in range(root) if i in set(myrange)], n, res) #leftchild
        generateTrees([i for i in range(root+1, n) if i in set(myrange)], n, res) #rightchild
    return res
Initially myrange is just the list containing the node values, and n is the length of myrange.
The way I am doing it is a sort of DFS where I loop over the nodes making each one of them the root once and then I do the same for the left and right subtrees to get all combinations. But the problem I am facing is I can't figure out how to manage res to remove elements from it as my recursion backtracks (and make it so res only contains valid bst's and then put those in some other list that will be my actual result).
I would like some pointers or even just comments on if you think my approach is valid or bad ..etc.
Issues:
As you mention, your code only creates one list to which it keeps appending.
Even if you would fix that, the lists would never come out in the BFS kind of order, which is what the question's example seems to suggest.
For a chosen root, you need to list all combinations of its possible left subtrees with its possible right subtrees -- a Cartesian product if you wish. This logic is missing in your code.
I would:
not pass res as argument to the recursive function. Just return it, and let the caller deal with it.
not use ranges, as that only seems to complicate things. The if i in set(myrange) seems like an inefficient way to get the overlap between two ranges. I would instead pass the two extremes of the range as separate arguments.
use the TreeNode class to actually create the trees, and deal with generating the required output format later.
For generating the output format you need a BFS walk through the tree, and this could be implemented as a method on TreeNode.
Here is what I think would work:
class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None
    def breadth_first(self):
        lst = []
        todo = [self]
        while any(todo):
            node = todo.pop(0)
            lst.append(node.val if node else None)
            if node:
                todo.append(node.left)
                todo.append(node.right)
        return lst
def generateTrees(n):
    def recur(start, end): # end is not included
        if start >= end:
            return [None]
        trees = []
        for root in range(start, end):
            lefts = recur(start, root)
            rights = recur(root+1, end)
            # Cartesian product:
            for left in lefts:
                for right in rights:
                    # Start with a new tree, and append to result
                    tree = TreeNode(root)
                    tree.left = left
                    tree.right = right
                    trees.append(tree)
        return trees
    return recur(1, n+1)
# Create the trees as a list of TreeNode instances:
trees = generateTrees(3)
# Convert to a list of lists
print([tree.breadth_first() for tree in trees])
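One way to sanity-check the Cartesian-product logic is to count the trees instead of building them: for each root, the number of trees is the number of left subtrees times the number of right subtrees. The `count_trees` helper below is a hypothetical companion to recur(), written as a sketch. Its totals should match the Catalan numbers.

```python
def count_trees(start, end):
    """Count the BSTs over the values start..end-1 (end is not included)."""
    if start >= end:
        return 1  # exactly one empty subtree
    total = 0
    for root in range(start, end):
        # Every left subtree pairs with every right subtree.
        total += count_trees(start, root) * count_trees(root + 1, end)
    return total


counts = [count_trees(1, n + 1) for n in range(1, 5)]
print(counts)  # [1, 2, 5, 14]
```

For n = 3 this gives 5, which is the number of trees the question's example expects.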
