Implement Kahn's topological sorting algorithm using Python - python-3.x

Kahn proposed an algorithm in 1962 to topologically sort any DAG (directed acyclic graph); pseudocode copied from Wikipedia:
L ← Empty list that will contain the sorted elements
S ← Set of all nodes with no incoming edges
while S is non-empty do
    remove a node n from S
    add n to tail of L
    for each node m with an edge e from n to m do
        remove edge e from the graph    # This is a DESTRUCTIVE step!
        if m has no other incoming edges then
            insert m into S
if graph has edges then
    return error (graph has at least one cycle)
else
    return L (a topologically sorted order)
I need to implement it using IPython3, with the following implementation of a DAG:
class Node(object):
    def __init__(self, name, parents):
        assert isinstance(name, str)
        assert all(isinstance(_, RandomVariable) for _ in parents)
        self.name, self.parents = name, parents
where name is the label for the node and parents stores all of its parent nodes. Then the DAG class is implemented as:
class DAG(object):
    def __init__(self, *nodes):
        assert all(isinstance(_, Node) for _ in nodes)
        self.nodes = nodes
(The DAG implementation is fixed and not to be improved.) Then I need to implement Kahn's algorithm as a function top_order which takes in a DAG instance and returns an ordering like (node_1, node_2, ..., node_n). The main trouble is that the algorithm is destructive, because its step remove edge e from the graph will delete one member of m.parents. However, I have to leave the DAG instance intact.
One way I can think of so far is to create a deep copy of the DAG instance (even a shallow copy can't do the job, because the algorithm would still destroy the original instance through shared references), perform the destructive algorithm on this copy, get the correct ordering of node names from the copy (assuming there is no naming conflict between nodes), and then use this ordering of names to infer the correct ordering of the nodes of the original instance, which roughly goes like:
import copy

def top_order(network):
    '''takes in a DAG, prints and returns a topological ordering.'''
    assert type(network) == DAG
    temp = copy.deepcopy(network)  # to leave the original instance intact
    ordering_name = []
    roots = [node for node in temp.nodes if not node.parents]
    while roots:
        n_node = roots[0]
        del roots[0]
        ordering_name.append(n_node.name)
        for m_node in temp.nodes:
            if n_node in m_node.parents:
                temp_list = list(m_node.parents)
                temp_list.remove(n_node)
                m_node.parents = tuple(temp_list)
                if not m_node.parents:
                    roots.append(m_node)
    print(ordering_name)  # print ordering by name
    # get the ordering of the nodes of the original instance
    ordering = []
    for name in ordering_name:
        for node in network.nodes:
            if node.name == name:
                ordering.append(node)
    return tuple(ordering)
Two problems: first, when the network is huge, the deep copy will be resource-consuming; second, I would like an improvement to the nested for loops that recover the ordering of the original instance's nodes. (For the second, something like the sorted method comes to mind.)
Any suggestion?

I'm going to suggest a less literal implementation of the algorithm: you don't need to manipulate the DAG at all, you just need to manipulate info about the DAG. The only "interesting" things the algorithm needs are a mapping from a node to its children (the opposite of what your DAG actually stores), and a count of the number of each node's parents.
These are easy to compute, and dicts can be used to associate this info with node names (assuming all names are distinct - if not, you can invent unique names with a bit more code).
Then this should work:
def topsort(dag):
    name2node = {node.name: node for node in dag.nodes}
    # map name to number of predecessors (parents)
    name2npreds = {}
    # map name to list of successors (children)
    name2succs = {name: [] for name in name2node}
    for node in dag.nodes:
        thisname = node.name
        name2npreds[thisname] = len(node.parents)
        for p in node.parents:
            name2succs[p.name].append(thisname)
    result = [n for n, npreds in name2npreds.items() if npreds == 0]
    for p in result:
        for c in name2succs[p]:
            npreds = name2npreds[c]
            assert npreds
            npreds -= 1
            name2npreds[c] = npreds
            if npreds == 0:
                result.append(c)
    if len(result) < len(name2node):
        raise ValueError("no topsort - cycle")
    return tuple(name2node[p] for p in result)
There's one subtle point here: the outer loop appends to result while it's iterating over result. That's intentional. The effect is that every element in result is processed exactly once by the outer loop, regardless of whether an element was in the initial result or added later.
Note that while the input DAG and Nodes are traversed, nothing in them is altered.
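As a quick sanity check of topsort, here is a sketch that uses minimal stand-ins for Node and DAG with the same name/parents interface (the Node class above asserts a RandomVariable type that isn't defined in this post, so the real classes can't be instantiated verbatim here):

class FakeNode:
    def __init__(self, name, parents=()):
        self.name, self.parents = name, parents

class FakeDAG:
    def __init__(self, *nodes):
        self.nodes = nodes

a = FakeNode("a")
b = FakeNode("b", (a,))
c = FakeNode("c", (a, b))
# nodes are passed in an arbitrary order; topsort recovers a valid order
print([node.name for node in topsort(FakeDAG(c, b, a))])  # ['a', 'b', 'c']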


Interviewbit - Merge k sorted linked lists: heappop returns max element instead of min

I'm solving the Interviewbit code challenge Merge K Sorted Lists:
Merge k sorted linked lists and return it as one sorted list.
Example :
1 -> 10 -> 20
4 -> 11 -> 13
3 -> 8 -> 9
will result in
1 -> 3 -> 4 -> 8 -> 9 -> 10 -> 11 -> 13 -> 20
The Python template code is:
# Definition for singly-linked list.
# class ListNode:
# def __init__(self, x):
# self.val = x
# self.next = None
class Solution:
# #param A : list of linked list
# #return the head node in the linked list
def mergeKLists(self, A):
pass
Here's my python 3 solution for the same:
from heapq import heapify, heappop, heappush

class Solution:
    # @param A : list of linked list
    # @return the head node in the linked list
    def mergeKLists(self, A):
        minheap = [x for x in A]
        # print(minheap)
        # heapify(minheap)
        # print(minheap)
        head = tail = None
        # print(minheap)
        while minheap:
            # print(minheap)
            heapify(minheap)
            print([x.val for x in minheap])
            minimum = heappop(minheap)
            print(minimum.val)
            if head is None:
                head = minimum
                tail = minimum
            else:
                tail.next = minimum
                tail = minimum
            if minimum.next:
                heappush(minheap, minimum.next)
        return head
With the print commands that are uncommented, you'll notice that in the intermediate runs of the while loop, heappop returns the largest element, as if we were dealing with a max heap, which we're not!
That's where the answer goes wrong, as far as I can see. Can anyone explain why heappop behaves like this, and how it can be corrected?
When I run your code locally with sample data, I get an error on:
heapify(minheap)
TypeError: '<' not supported between instances of 'ListNode' and 'ListNode'
This is expected. The template definition of ListNode shows no support for making comparisons, and a heapify function will need to compare the items in the given list.
As the class ListNode is already defined by the code-challenge framework, it is probably better not to try to make that class comparable.
I would propose to put tuples on the heap which have list node instances as members, but have their val attribute value come first, followed by the number of the list (in A) they originate from. As third tuple member you'd finally have the node itself. This way comparisons will work, since tuples are comparable when their members are. And since the second tuple member will be a tie-breaker when the first member value is the same, the third tuple member (the list node instance) will never be subject to a comparison.
Unrelated to your question, but you should only heapify once, not in each iteration of the loop. The actions on the heap (heappush, heappop) maintain the heap property, so there is no need for calling heapify a second time. If you do it in each iteration, you actually destroy the efficiency benefit you would get from using a heap.
Here is your code updated with those changes:
from heapq import heapify, heappop, heappush

class Solution:
    def mergeKLists(self, A):
        # place comparable tuples in the heap
        minheap = [(node.val, i, node) for i, node in enumerate(A)]
        heapify(minheap)  # call only once
        head = tail = None
        while minheap:
            # extract the tuple information we need
            _, i, minimum = heappop(minheap)
            if head is None:
                head = minimum
                tail = minimum
            else:
                tail.next = minimum
                tail = minimum
            minimum = minimum.next
            if minimum:
                # push a tuple, using same list index
                heappush(minheap, (minimum.val, i, minimum))
        return head
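To try the fixed solution locally, here is a small sketch; the ListNode class and the build helper are assumptions of mine that match the commented template definition, not part of the challenge framework:

class ListNode:
    def __init__(self, x):
        self.val = x
        self.next = None

def build(values):
    # build a singly-linked list from a Python list
    head = None
    for v in reversed(values):
        node = ListNode(v)
        node.next = head
        head = node
    return head

lists = [build([1, 10, 20]), build([4, 11, 13]), build([3, 8, 9])]
node = Solution().mergeKLists(lists)
merged = []
while node:
    merged.append(node.val)
    node = node.next
print(merged)  # [1, 3, 4, 8, 9, 10, 11, 13, 20]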

How to check if tree is symmetric python

For practice, I solved Leetcode 101. Symmetric Tree question:
Given a binary tree, check whether it is a mirror of itself (ie, symmetric around its center).
My idea is to do an in-order traversal, record each node's value into a list, and then check that the first part of the list equals the reversed second part. But it failed on the test case [1,2,3,3,null,2,null].
Locally my list comes out as [3, 2, None, 1, 2, 3, None], but on LeetCode the traversal is [3,2,1,2,3]. Does anyone know why, and whether anything is wrong with my code?
def isSymmetric(root: 'TreeNode') -> 'bool':
    if not root:
        return True
    value = []
    def traversal(cur):
        if cur:
            traversal(cur.left)
            value.append(cur.val)
            traversal(cur.right)
    traversal(root)
    size = int(len(value) / 2)
    return value[:size] == value[size + 1:][-1::-1]
I'm afraid inorder traversal does not uniquely determine a tree. e.g. a tree with structure
1
 \
  2
   \
    3
has the same inorder traversal as
  2
 / \
1   3
Since you have the if cur condition, your inorder traversal won't include null nodes, which makes the traversal non-unique. You can include the null nodes like this:
def traverse(cur):
    if cur:
        traverse(cur.left)
    values.append(cur.val if cur else None)
    if cur:
        traverse(cur.right)
This would serialize the tree nodes uniquely.
What you can also do in this case is check that the structures of the left and right subtrees are the same (except that left and right are reversed). Here is my accepted solution:
class Solution:
    def isSymmetric(self, root: 'TreeNode') -> 'bool':
        if not root:
            return True
        return self.isSymmetricHelper(root.left, root.right)

    def isSymmetricHelper(self, node1, node2):
        if node1 is None and node2 is None:
            return True
        if node1 is None or node2 is None:
            return False
        if node1.val != node2.val:  # early stopping - two nodes have different values
            return False
        out = True
        out = out and self.isSymmetricHelper(node1.left, node2.right)
        if not out:  # early stopping
            return False
        out = out and self.isSymmetricHelper(node1.right, node2.left)
        return out
It recursively checks whether two trees are mirrors of each other (with some early stopping). The idea is that if two trees are mirrors, the left subtree of tree1 must be the mirror of the right subtree of tree2, and likewise for the right subtree of tree1 and the left subtree of tree2.
Although the runtime of both approaches is O(n), the recursive method takes O(log n) space on average (used by the call stack) and has early stopping built in, while your serialize-all-nodes method takes O(n) space and O(n) time.
Here is another way to check whether the tree is symmetric:
class Solution:
    def isSymmetric(self, root: Optional[TreeNode]) -> bool:
        def is_mirror(t1, t2):
            # If I reached all the way down, that means I always got True.
            if t1 is None and t2 is None:
                return True
            # if one of them is None but the other one is not, then False
            if t1 is None or t2 is None:
                return False
            # 2 == 2 and t1.left == t2.right and t1.right == t2.left
            return t1.val == t2.val and is_mirror(t1.left, t2.right) and is_mirror(t1.right, t2.left)
        return is_mirror(root.left, root.right)
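As a quick check on the failing test case [1,2,3,3,null,2,null], here is a sketch; the TreeNode class below is a minimal stand-in for the one LeetCode provides, and it assumes one of the Solution classes above is already defined:

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val, self.left, self.right = val, left, right

# level-order [1,2,3,3,null,2,null] built by hand
root = TreeNode(1, TreeNode(2, TreeNode(3)), TreeNode(3, TreeNode(2)))
print(Solution().isSymmetric(root))  # False: the two subtree roots (2 and 3) already differ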

Implementation of Depth-First-Search on a permutation tree in Python

I have a square matrix of size n, say A, with non-negative real entries a_ij.
Furthermore, I have a permutation tree; for n = 3 it is the tree whose branches spell out the permutations of (1, 2, 3) in lexicographic order (picture omitted).
Now I would like to do a depth-first search (I am not really sure whether "depth-first search" is the correct description for this, but let's use it for now) along the branches of the tree in the following way:
On the first partial tree on the very left, do the following, starting with an "empty" permutation (x,x,x):
If a_12 > a_21, set (1,2,x) and then check whether a_23 > a_32. If this is true as well, save (1,2,3) in a list, say P. Then go back to the first level and check whether a_13 > a_31, and so on.
If a_21 > a_12 or a_32 > a_23, do not save the permutation in P; go back to the first level and check whether a_13 > a_31. If this is true, set (1,3,x) and then check whether a_23 > a_32. If this is true, save (1,3,2) in P and continue with the next partial tree. If a_31 > a_13 or a_32 > a_23, do not save the permutation in P and continue with the same procedure for the next partial tree.
I would like to implement this procedure for an arbitrary natural number n > 0, taking just the matrix A and n as input, and producing as output all permutations of size n that fulfill these conditions. So far I have not been able to implement this in a general way.
Preferably in Python, but pseudocode would be fine as well. I also want to avoid functions like itertools.permutations, because in my use case I need to apply this for large n, for example n = 100, and then itertools.permutations is very slow.
If I understand correctly, this should get you what you want:
import numpy as np
from itertools import permutations

def fluboxing_permutations(a, n):
    return [p for p in permutations(range(n))
            if all(a[i, j] > a[j, i] for i, j in zip(p, p[1:]))]

n = 3
a = np.random.random([n, n])
fluboxing_permutations(a, n)
itertools.permutations will yield permutations in lexicographical order, which corresponds to your tree; then we check that for each consecutive pair of indices in the permutation, the element in the matrix is greater than the element at swapped indices. If so, we retain the permutation.
(No idea how to describe what the function does, so I made a new name. Hope you like it. If anyone knows a better way to describe it, please edit! :P )
EDIT: Here's a recursive function that should do the same, but with pruning:
def fluboxing_permutations_r(a, n):
    nset = set(range(n))

    def inner(p):
        l = len(p)
        if l > 1 and a[p[-2]][p[-1]] <= a[p[-1]][p[-2]]:
            return []
        if l == n:
            return [p]
        return [r for i in nset - set(p)
                for r in inner(p + (i,))]

    return inner(())
p starts as an empty tuple, but it grows through the recursion. Once there are at least two elements in the partial permutation, we can test the last two elements to see if they fail the test, and reject the partial permutation if they do (pruning its subtree out of the search space). If it is a full permutation that wasn't rejected, we return it. If it's not full yet, we extend it with every possible index that is not already in it, and recurse.
tinyEDIT: BTW, parameter n is kind of redundant, because n = len(a) at the top of the function should take care of it.
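As a small sanity check (my addition, assuming both functions above are defined in the same session), the plain and the pruned versions should agree on a small random matrix:

import numpy as np

n = 4
a = np.random.random([n, n])
# both return the same set of permutations, possibly in a different order
assert sorted(fluboxing_permutations(a, n)) == sorted(fluboxing_permutations_r(a, n))
print(sorted(fluboxing_permutations(a, n)))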

EA: Custom crossover for a list of lists in Python

Could anyone explain to me how I could perform a custom crossover on a list of lists? Let's say I have a candidate like this:
candidate = [[0, 1, 2, 3], [4, 5, 6, 7, 8], [9, 10, 11]]
I know how I could do a crossover for a single list. But how can I actually do this for a list of lists?
Here is the method for a crossover for a single list:
import copy

# crossover
def partially_matched_crossover(random, mom, dad, args):
    """Return the offspring of partially matched crossover on the candidates.

    This function performs partially matched crossover (PMX). This type of
    crossover assumes that candidates are composed of discrete values that
    are permutations of a given set (typically integers). It produces offspring
    that are themselves permutations of the set.

    .. Arguments:
       random -- the random number generator object
       mom -- the first parent candidate
       dad -- the second parent candidate
       args -- a dictionary of keyword arguments

    Optional keyword arguments in args:

    - *crossover_rate* -- the rate at which crossover is performed
      (default 1.0)

    """
    crossover_rate = args.setdefault('crossover_rate', 1.0)
    if random.random() < crossover_rate:
        size = len(mom)
        points = random.sample(range(size), 2)
        x, y = min(points), max(points)
        bro = copy.copy(dad)
        bro[x:y+1] = mom[x:y+1]
        sis = copy.copy(mom)
        sis[x:y+1] = dad[x:y+1]
        for parent, child in zip([dad, mom], [bro, sis]):
            for i in range(x, y+1):
                if parent[i] not in child[x:y+1]:
                    spot = i
                    while x <= spot <= y:
                        print(child[spot])
                        spot = parent.index(child[spot])
                    child[spot] = parent[i]
        return [bro, sis]
    else:
        return [mom, dad]
The above comes from the Python-based inspyred library. I am attempting to build an algorithm for the Vehicle Routing Problem, where the above list of lists is an example of a proposed solution.
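One way to reuse the single-list PMX for a list of lists, sketched here under my own assumptions rather than taken from the original post, is to flatten the routes into one permutation, cross over the flat lists, and re-split the children by the original route lengths (this works as long as every gene appears in exactly one sublist, so the flattened candidate is itself a permutation):

import random

def flatten(candidate):
    # concatenate the routes and remember their lengths
    lengths = [len(route) for route in candidate]
    flat = [gene for route in candidate for gene in route]
    return flat, lengths

def unflatten(flat, lengths):
    # split a flat permutation back into routes of the original lengths
    routes, start = [], 0
    for length in lengths:
        routes.append(flat[start:start + length])
        start += length
    return routes

mom = [[0, 1, 2, 3], [4, 5, 6, 7, 8], [9, 10, 11]]
dad = [[3, 2, 1, 0], [8, 7, 6, 5, 4], [11, 10, 9]]

flat_mom, lengths = flatten(mom)
flat_dad, _ = flatten(dad)
bro, sis = partially_matched_crossover(random, flat_mom, flat_dad, {})
print(unflatten(bro, lengths))
print(unflatten(sis, lengths))

Note that re-splitting by the parents' route lengths keeps the number of routes fixed; if route lengths should also evolve, a different split strategy would be needed.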

set from the union of elements contained in two lists

This is from a pre-interview questionnaire. I believe I have the answer; I just want to get confirmation that I'm right.
Part 1 - Tell me what this code does, and its big-O performance
Part 2 - Re-write it yourself and tell me the big-O performance of your solution
def foo(a, b):
    """ a and b are both lists """
    c = []
    for i in a:
        if is_bar(b, i):
            c.append(i)
    return unique(c)

def is_bar(a, b):
    for i in a:
        if i == b:
            return True
    return False

def unique(arr):
    b = {}
    for i in arr:
        b[i] = 1
    return b.keys()
ANSWERS:
It creates a set from the union of elements contained in two lists. Its big-O performance is O(n^2).
My solution, which I believe achieves O(n):
Set A = getSetA();
Set B = getSetB();
Set UnionAB = new Set(A);
UnionAB.addAll(B);
for (Object inA : a)
    if (B.contains(inA))
        UnionAB.remove(inA);
It seems like the original code is doing an intersection, not a union. It traverses all the elements in the first list (a) and checks whether each exists in the second list (b), in which case it adds it to list c. Then it returns the unique elements of c. A performance of O(n^2) seems right.
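Since the original is written in Python, here is one possible O(n)-average rewrite (my sketch, not part of the original answer): build a set from b once, then filter a against it.

def foo_fast(a, b):
    b_set = set(b)                        # O(len(b)) expected
    return {i for i in a if i in b_set}   # O(len(a)) expected membership tests

print(foo_fast([1, 2, 2, 3, 4], [2, 4, 5]))  # {2, 4}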
