Breadth First Tree Traversal using Generators in Python - python-3.x

I am studying how to use generators in Python from David Beazley's excellent Python Cookbook. The following recipe implements depth-first tree traversal using generators very elegantly:
# example.py
#
# Example of depth-first search using a generator

class Node:
    def __init__(self, value):
        self._value = value
        self._children = []

    def __repr__(self):
        return 'Node({!r})'.format(self._value)

    def add_child(self, node):
        self._children.append(node)

    def __iter__(self):
        return iter(self._children)

    def depth_first(self):
        yield self
        for c in self:
            yield from c.depth_first()

# Example
if __name__ == '__main__':
    root = Node(0)
    child1 = Node(1)
    child2 = Node(2)
    root.add_child(child1)
    root.add_child(child2)
    child1.add_child(Node(3))
    child1.add_child(Node(4))
    child2.add_child(Node(5))
    for ch in root.depth_first():
        print(ch)
    # Outputs: Node(0), Node(1), Node(3), Node(4), Node(2), Node(5)
I am trying to come up with an equally elegant method:
def breadth_first(self):
    pass
I am deliberately not posting the crazy stuff I have been trying, since everything I have tried requires maintaining 'state' within the method. I don't want to use the traditional queue-based solutions. The whole point of this academic exercise is to learn how generators behave in depth, so I want to create a parallel breadth_first method using generators for the tree above.
Any pointers/solutions are welcome.

You can't use recursion (a stack) for BFS without some serious hacks, but a queue works:
def breadth_first(self):
    q = [self]
    while q:
        n = q.pop(0)
        yield n
        for c in n._children:
            q.append(c)
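A note on the queue: list.pop(0) is O(n) per pop, so for larger trees a collections.deque with an O(1) popleft() might be preferable. A minimal sketch of the same generator using a deque (same Node class with its _children list assumed):
from collections import deque

def breadth_first(self):
    q = deque([self])          # O(1) pops from the left
    while q:
        n = q.popleft()
        yield n
        q.extend(n._children)  # enqueue the children for the next level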

I find it both useful and elegant to generate the whole breadth, one level at a time. Python 3 generator below:
from typing import Iterable, List

def get_level(t: Node) -> Iterable[List]:
    curr_level = [t] if t else []
    while len(curr_level) > 0:
        yield [node._value for node in curr_level]
        curr_level = [child for parent in curr_level
                      for child in parent._children
                      if child]
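For the example tree from the question (rooted at root), iterating this generator prints one list of values per level:
for level in get_level(root):
    print(level)
# [0]
# [1, 2]
# [3, 4, 5]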

It would be easy if itertools had:
# zip_chain('ABCD', 'xy') --> A x B y C D
It's almost an itertools.chain(itertools.zip_longest()), but not quite.
Anyway, zip_chain allows:
def bf(self):
    yield self
    yield from zip_chain(*map(Node.bf, self.children))
It doesn't create a whole row at a time either, I think.
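For what it's worth, here is a minimal sketch of such a zip_chain, built on itertools.zip_longest with a sentinel used to drop the padding (the name and the sentinel are assumptions, not an actual itertools function):
from itertools import zip_longest

_MISSING = object()

def zip_chain(*iterables):
    # Round-robin the iterables, skipping the ones that are already exhausted.
    for group in zip_longest(*iterables, fillvalue=_MISSING):
        for item in group:
            if item is not _MISSING:
                yield item

print(list(zip_chain('ABCD', 'xy')))  # ['A', 'x', 'B', 'y', 'C', 'D']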

The depth-first solution you implement is essentially "iterate, then recurse":
def depth_first(self):
    yield self
    for c in self.children:
        yield from c.depth_first()
Inspired by this blog post and the ActiveState recipe it references, you obtain a breadth-first search as "recurse, then iterate":
def breadth_first(self):
    yield self
    for c in self.breadth_first():
        if not c.children:
            return # stop the recursion as soon as we hit a leaf
        yield from c.children
Edit: it turns out the linked blog post notes that the termination check is absent and points to an ActiveState recipe instead; I've adapted that recipe to use yield from above.
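One caveat (not from the original answer): returning at the first leaf can cut the walk short when a leaf precedes a non-leaf sibling on the same level. Here is a sketch of a variant that instead stops once no new nodes have appeared, using the same children attribute as the snippets above (the question's class names it _children):
def breadth_first(self):
    yield self
    last = self  # the most recently yielded node
    for node in self.breadth_first():
        for child in node.children:
            yield child
            last = child
        if node is last:
            # nothing was yielded after this node, so the traversal is complete
            return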

Related

Using Recursion to add to a trie

I have been learning about the trie structure in Python. What is a little bit different about this trie compared to other tries is that we are trying to keep a counter in every node of the trie in order to do autocomplete (that is the final goal of the project). So far, I decided that a recursive function that puts each letter into a list of dictionaries would be a good idea.
Final Product (Trie):
Trie = {"value": "*start",
        "count": 1,
        "children": [{"value": "t",
                      "count": 1,
                      "children": [{"value": "e",
                                    "count": 1,
                                    "children": [...]}]}]}
I know that recursion would be very useful, since it is just adding letters to the list; however, I can't figure out how to construct the basic function, or how to refer to the last part of the dictionary without writing out
Trie["children"]["children"]["children"]["children"]
a bunch of times. Can you guys please give me some ideas on how to construct the function?
--Thanks
Perhaps you only want to store a counter in the end-of-word node (I always use a separate node for this). Given a prefix, you work your way down the trie to the prefix node, then traverse the subtree below it to find the end-of-word nodes, pushing them into a binary tree keyed by their counter value. Once you have all the end-of-word nodes, you perform a post-order traversal of that binary tree to collect the end-of-word nodes with the largest counters, however many you decide. From each collected end-of-word node you can obtain a candidate word likely to fit the prefix: by storing the parent node in each node, you walk back up to the root to re-assemble the word.
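A minimal sketch of that idea, using heapq.nlargest in place of the binary-tree-plus-post-order step described above (the TrieNode layout and the name top_completions are assumptions for illustration):
import heapq

class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode
        self.count = 0      # > 0 only on end-of-word nodes

def top_completions(root, prefix, k=3):
    # Walk down to the node that represents the prefix.
    node = root
    for char in prefix:
        if char not in node.children:
            return []
        node = node.children[char]
    # Collect (count, word) for every end-of-word node below the prefix.
    candidates = []
    stack = [(node, prefix)]
    while stack:
        current, word = stack.pop()
        if current.count:
            candidates.append((current.count, word))
        for char, child in current.children.items():
            stack.append((child, word + char))
    # Keep the words with the highest counters.
    return [word for _, word in heapq.nlargest(k, candidates)]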
The Trie itself, containing the root node and the insert/find functions:
class Trie:
    def __init__(self):
        ## Initialize this Trie (add a root node)
        self.root = TrieNode()

    def insert(self, word):
        ## Add a word to the Trie
        current_node = self.root
        for char in word:
            if char not in current_node.children:
                current_node.insert(char)
            current_node = current_node.children[char]
        current_node.is_word = True

    def find(self, prefix):
        ## Find the Trie node that represents this prefix
        current_node = self.root
        for char in prefix:
            if char not in current_node.children:
                return None
            current_node = current_node.children[char]
        return current_node
This is the node class, with a suffixes method that implements autocomplete for the trie:
class TrieNode:
    def __init__(self):
        ## Initialize this node in the Trie
        self.is_word = False
        self.children = {}
        self.results = []

    def insert(self, char):
        ## Add a child node in this Trie
        self.children[char] = TrieNode()

    def suffixes(self, suffix=''):
        ## Recursive function that collects the suffix for
        ## all complete words below this point
        for char, node in self.children.items():
            if node.is_word:
                self.results.append(suffix + char)
            if node.children:
                self.results += node.suffixes(suffix + char)
        results = self.results.copy()
        self.results = []
        return list(set(results))
Solutions can be obtained with the following calls:
MyTrie = Trie()
wordList = ["ant", "anthology", "antagonist", "antonym",
            "fun", "function", "factory", "trie", "trigger", "trigonometry", "tripod"]
for word in wordList:
    MyTrie.insert(word)

MyTrie.find('a').suffixes()
MyTrie.find('f').suffixes()
MyTrie.find('t').suffixes()
MyTrie.find('tri').suffixes()
MyTrie.find('anto').suffixes()

Python recursive function to create generic tree [duplicate]

This question already has answers here: Why does my recursive function return None? (4 answers)
Closed 7 months ago.
I'm trying to create a tree with n children for each node.
The problem is, when I try to implement this with a recursive function, I end up with multiple recursive calls, so I can't have a single return statement, and I end up with None as the final result.
Here is my piece of code:
def recursive_func(tree, n):
    if n == 0:
        return tree
    else:
        permutations = get_permutations()
        for permutation in permutations:
            child_tree = recursive_func(Tree(permutation), n-1)
            tree.addChild(child_tree)
get_permutations() gives a list of child trees to create. I also have a Tree class with a node value and a list of children.
Here is the Tree class:
class Tree:
    def __init__(self, node):
        self.node = node
        self.children = []

    def addChild(self, tree):
        self.children.append(tree)
This is probably a rookie error in the design of my problem, but I would be glad to get some help.
TLDR: Since you use the result of recursive_func, it should always return its tree.
The recursive_func has three important points for this problem:
def recursive_func(tree, n):
    if n == 0:
        return tree                                              # 1.
    else:
        permutations = get_permutations()
        for permutation in permutations:
            child_tree = recursive_func(Tree(permutation), n-1)  # 2.
            tree.addChild(child_tree)
        # 3.
Now, 1. defines that the function sometimes returns a Tree. This matches 2. which always expects the function to return a Tree. However, both conflict with 3., which implicitly returns None when the second branch is done.
Since the first branch is empty aside from the return, both 1. and 3. can be collapsed to one path that always returns a Tree.
def recursive_func(tree, n):
    if n > 0:
        permutations = get_permutations()
        for permutation in permutations:
            child_tree = recursive_func(Tree(permutation), n-1)
            tree.addChild(child_tree)
    return tree
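For illustration, here is a self-contained run of the fixed function with a stub get_permutations and a small node-counting helper (both the stub and the helper are assumptions, not part of the question):
def get_permutations():
    # Stub: every node simply gets two children labelled 'L' and 'R'.
    return ['L', 'R']

class Tree:
    def __init__(self, node):
        self.node = node
        self.children = []
    def addChild(self, tree):
        self.children.append(tree)

def recursive_func(tree, n):
    if n > 0:
        for permutation in get_permutations():
            tree.addChild(recursive_func(Tree(permutation), n - 1))
    return tree

def count_nodes(tree):
    return 1 + sum(count_nodes(child) for child in tree.children)

root = recursive_func(Tree('root'), 2)
print(count_nodes(root))  # 7: the root, 2 children and 4 grandchildren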

How passing a container(list/dictionary) to a function would be different in python?

I have a data structure/container (a Python list or dictionary) that I want to update based on some computation. There are a few ways I have in mind:
1.
new_list=list() # initialised globally!

def func():
    for i in range(5):
        new_list.append(i) # updating here!
    print('in function:', new_list)
    pass

def main():
    print('before:', new_list)
    func()
    print('after:',new_list)

if __name__ == '__main__':
    main()
2.
def func(container):
    for i in range(5):
        container.append(i)
    print('in function:', container)
    pass

def main():
    new_list=list()
    print('before:', new_list)
    func(new_list)
    print('after:',new_list)

if __name__ == '__main__':
    main()
3.
def func(container):
    for i in range(5):
        container.append(i)
    print('in function:', container)
    return container

def main():
    new_list=list()
    print('before:', new_list)
    new_list = func(new_list)
    print('after:',new_list)

if __name__ == '__main__':
    main()
Could anyone explain the difference between the 3 versions? Logically all 3 of them work and even the results are the same, but I am curious to know what distinguishes these approaches and which one is better.
1. Globals are evil. For this specific example it might work, but if you later decided to add a second list, you would have to rewrite your code or duplicate your functions. It would also be more complicated to write unit tests for the function.
2. I think there's nothing wrong with this approach in general.
3. This might be a matter of taste. Returning the object has no real purpose here, since the caller already has the object, and returning it might give the impression that the returned object is a different one. So personally, I wouldn't recommend this approach. This pattern is more often used in some other high-level object-oriented languages such as Java (or possibly C++), but I don't think it's very Pythonic.
PS: The pass statements do not have any effect. Why did you add these?
UPDATE: Extending a bit on your related question about how arguments are passed (by value or reference) and how that impacts the code:
In Python, every value is an object, and arguments are passed as references to objects. When you assign a new value to a local variable (e.g. the function argument), the local name is rebound to a new reference, but the caller still refers to the original object. However, when you modify the contents of the object, the caller sees those changes as well. Simply said, the difference is whether the statement rebinds the name with an assignment operator (=) or mutates the object in place.
With integers, you would always create a new integer object using an assignment (e.g. x = 3, or even x += 3). Also strings are immutable, so you cannot modify a string in a function, but only create a new string (e.g. word = word.lower()).
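A short sketch of that point for immutable types (the function names are illustrative, not from the answer):
def increment(x):
    x += 3               # rebinds the local name x to a new int object
    return x

def shout(word):
    word = word.upper()  # rebinds the local name; the caller's string is untouched
    return word

n = 1
s = "hello"
increment(n)
shout(s)
print(n, s)                     # 1 hello -- the caller's objects are unchanged
print(increment(n), shout(s))   # 4 HELLO -- the new objects must be returned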
If you modify a list using one of its class methods, such as list.append(), you update the original object. But if you create and assign a new list, the original list will not be changed. So, to clear a list in a function, you could use container.clear() but not container = []. I hope the following example clarifies this:
def add_numbers_to_container(container):
    for i in range(5):
        container.append(i)

def clear1(container):
    container = []
    # This creates a new list and assigns it to the local variable.
    # The original list is not modified!

def clear2(container):
    container.clear()
    # This clears the list that was passed as argument.

def main():
    new_list = []
    print(new_list)   # []
    add_numbers_to_container(new_list)
    print(new_list)   # [0, 1, 2, 3, 4]
    clear1(new_list)
    print(new_list)   # STILL [0, 1, 2, 3, 4] !
    clear2(new_list)
    print(new_list)   # []

if __name__ == '__main__':
    main()
ALSO NOTE: if you have a number of functions/methods that process the same data, it's good practice to create a class for it. This gives you both benefits: you don't have to pass the list to each function, and you don't have to create global variables either. You can also easily handle multiple lists with the same code. See the following example.
Method 4:
class MyContainer:
    def __init__(self):
        self.container = []
        # Here the container is initialized with an empty list.

    def add_numbers(self, start, stop):
        for i in range(start, stop):
            self.container.append(i)

    def clear(self):
        # Both of the following lines are correct (only one is necessary):
        self.container = []
        self.container.clear()

    def print(self):
        print(self.container)

def main():
    # You could even create multiple independent containers, and use the
    # same functions for each object:
    container1 = MyContainer()
    container2 = MyContainer()
    container1.print()   # []
    container2.print()   # []
    container1.add_numbers(0, 5)
    container2.add_numbers(5, 8)
    container1.print()   # [0, 1, 2, 3, 4]
    container2.print()   # [5, 6, 7]
    container1.clear()
    container1.print()   # []

if __name__ == '__main__':
    main()
When you pass the object as a parameter (example 2), it is effectively passed by reference: the variable is not copied, only a reference to it is passed, so it behaves much like using a global variable. The additional return container (example 3) is therefore redundant for lists, because they are passed as references. For integers and other immutable types there is a big difference: rebinding the parameter creates a new local object, so the change won't affect the caller's variable, whereas mutating a passed-in mutable object does affect it.

Generate all valid binary search trees given a list of values

Hello, I am trying to solve the following question on LeetCode: https://leetcode.com/problems/unique-binary-search-trees-ii/
I know I have access to the solution, but I tried solving the problem my own way and I am stuck. I would like to know if it is solvable the way I am doing it.
Here is the code:
class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

def generateTrees(myrange, n, res = None):
    if res == None:
        res = []
    if myrange == []:
        res.append(None)
        return
    for root in myrange:
        res.append(root)
        generateTrees([i for i in range(root) if i in set(myrange)], n, res)      #leftchild
        generateTrees([i for i in range(root+1, n) if i in set(myrange)], n, res) #rightchild
    return res
Initially myrange is just the list containing the node values, and n is the length of myrange.
The way I am doing it is a sort of DFS where I loop over the nodes, making each one of them the root once, and then I do the same for the left and right subtrees to get all combinations. The problem I am facing is that I can't figure out how to manage res so that elements are removed from it as my recursion backtracks (and so that res only contains valid BSTs, which I can then put into some other list that will be my actual result).
I would like some pointers, or even just comments on whether you think my approach is valid or bad, etc.
Issues:
As you mention, your code only creates one list to which it keeps appending.
Even if you fixed that, the lists would never come out in the BFS order that the question's example seems to expect.
For a chosen root, you need to list all combinations of its possible left subtrees with its possible right subtrees -- a Cartesian product if you wish. This logic is missing in your code.
I would:
not pass res as argument to the recursive function. Just return it, and let the caller deal with it.
not use ranges, as that only seems to complicate things. The if i in set(myrange) seems like an inefficient way to get the overlap between two ranges. I would instead pass the two extremes of the range as separate arguments.
use the TreeNode class to actually create the trees, and deal with generating the required output format later.
For generating the output format you need a BFS walk through the tree, and this could be implemented as a method on TreeNode.
Here is what I think would work:
class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

    def breadth_first(self):
        lst = []
        todo = [self]
        while any(todo):
            node = todo.pop(0)
            lst.append(node.val if node else None)
            if node:
                todo.append(node.left)
                todo.append(node.right)
        return lst

def generateTrees(n):
    def recur(start, end):  # end is not included
        if start >= end:
            return [None]
        trees = []
        for root in range(start, end):
            lefts = recur(start, root)
            rights = recur(root+1, end)
            # Cartesian product:
            for left in lefts:
                for right in rights:
                    # Start with a new tree, and append to result
                    tree = TreeNode(root)
                    tree.left = left
                    tree.right = right
                    trees.append(tree)
        return trees

    return recur(1, n+1)

# Create the trees as a list of TreeNode instances:
trees = generateTrees(3)
# Convert to a list of lists
print([tree.breadth_first() for tree in trees])
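As a quick sanity check (not part of the original answer), the number of generated trees for n = 3 should match the Catalan number C3 = 5:
print(len(trees))  # 5 distinct BSTs can be built from the values 1..3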

real-time decorator for functions and generators

I have a situation in which I need to hook certain functions so that I can inspect their return values and track them. This is useful, for example, for tracking running averages of the values returned by methods/functions. However, these methods/functions can also be generators.
However, if I'm not wrong, Python detects generator functions when parsing, so a function whose body contains a yield always returns a generator when called. Thus I can't simply do something like:
import types

def decorator(func):
    average = None # assume average can be accessed by other means
    def wrap(*args, **kwargs):
        nonlocal average
        ret_value = func(*args, **kwargs)
        # if False, wrap is still a generator
        if isinstance(ret_value, types.GeneratorType):
            for value in ret_value:
                # update average
                yield value
        else:
            # update average
            return ret_value # ret_value can't ever be fetched
    return wrap
And yielding in this decorator is necessary, since I need to track the values as the caller iterates the decorated generator (i.e. in "real time"). That means I can't simply replace the for/yield with values = list(ret_value) and return values; if func is a generator, it needs to remain a generator once decorated. But if func is a plain function/method, then even when the else branch is executed, wrap is still a generator, so ret_value can never be fetched by the caller.
A toy example of using such a generator would be:
#decorated
def some_gen(some_list):
    for _ in range(10):
        if some_list[0] % 2 == 0:
            yield 1
        else:
            yield 0

def caller():
    some_list = [0]
    for i in some_gen(some_list):
        print(i)
        some_list[0] += 1 # changes what some_gen yields
For the toy example, there may be simpler solutions, but it's just to prove a point.
Maybe I'm missing something obvious, but I did some research and didn't find anything. The closest thing I found was this. However, that still doesn't let the decorator inspect every value yielded by the wrapped generator (just the first). Does this have a solution, or are two types of decorators (one for functions and one for generators) necessary?
One solution I realized is:
def as_generator(gen, avg_update):
    for i in gen:
        avg_update(i)
        yield i

import types

def decorator(func):
    average = None # assume average can be accessed by other means
    def wrap(*args, **kwargs):
        def avg_update(ret_value):
            nonlocal average
            # update average
            pass
        ret_value = func(*args, **kwargs)
        if isinstance(ret_value, types.GeneratorType):
            return as_generator(ret_value, avg_update)
        else:
            avg_update(ret_value)
            return ret_value
    return wrap
I don't know if this is the only solution, or whether there is one that doesn't require a separate helper function for the generator case.
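For what it's worth, a small self-contained variation of that solution with the average bookkeeping filled in (the names tracking_average and update and the sample functions are assumptions for illustration, not from the question):
import types

def tracking_average(func):
    total = 0.0
    count = 0

    def update(value):
        nonlocal total, count
        total += value
        count += 1
        print(f"running average of {func.__name__}: {total / count:.2f}")

    def wrap(*args, **kwargs):
        result = func(*args, **kwargs)
        if isinstance(result, types.GeneratorType):
            def proxy():
                for value in result:
                    update(value)  # tracked as the caller iterates
                    yield value
            return proxy()
        update(result)
        return result

    return wrap

@tracking_average
def numbers():
    yield from (1, 2, 3)

@tracking_average
def answer():
    return 42

list(numbers())  # prints running averages 1.00, 1.50, 2.00
answer()         # prints running average 42.00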
