How to generate a graph presenting all the possibilities using Python - python-3.x

I need to write Python code that generates a tree of possibilities that depend on each other. For example, given two vectors a=[0, 1] and b=[0, 1], we can construct 4 different possibilities:
(0, 0)
(0, 1)
(1, 0)
(1, 1)
If we take (0, 0) as the parent node, we can generate 3 edges from it to all the other possibilities: (0, 0) -> (0, 1), (1, 0), (1, 1).
Then for each possibility we can generate 3 edges to the other possibilities, e.g:
(0, 1) -> (0, 0), (1, 0), (1, 1)
(1, 0) -> (0, 0), (1, 1), (0, 1)
(1, 1) -> (0, 0), (1, 0), (0, 1)
I need to repeat that N times. The result should be a tree where every non-leaf node has 3 successors, one for each possibility other than its own.
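For reference, the set of possibilities itself is just the Cartesian product of the two vectors and can be generated with itertools.product:
from itertools import product

a = [0, 1]
b = [0, 1]
# All 4 combinations of one element from a and one from b
print(list(product(a, b)))  # [(0, 0), (0, 1), (1, 0), (1, 1)]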

The correct name for your graph is a complete graph. The go-to graph-processing library for Python, networkx, has a special function to generate this type of graph:
complete_graph
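A minimal sketch of that function (note that networkx labels the nodes with integers, not with your (x, y) tuples):
import networkx as nx

# Complete graph on the 4 possibilities: every node connects to every other node
K = nx.complete_graph(4)
print(K.number_of_edges())  # 6 = 4 * 3 / 2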
Edit 1: I constructed a workflow for you that solves your problem. You can copy-paste it into your Jupyter notebook, but note that you need:
networkx
graphviz
pydot
to be installed.
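For example, with pip (note that pydot additionally needs the Graphviz binaries available on your PATH):
pip install networkx pydot graphviz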
import networkx as nx
# Set main parameters
items = {(0, 0), (0, 1), (1, 0), (1, 1)}
root = (0, 1)
N = 4
# Calculate the number of nodes for our tree
node_count = sum((len(items)-1)**i for i in range(N))
# Construct a full r-ary tree
G = nx.full_rary_tree(len(items)-1, node_count, create_using=nx.DiGraph)
# Create a lexicographically-topologically sorted list of nodes
# NOTE THAT NODES' IDs AREN'T EQUAL TO YOUR ITEMS
lgts = list(nx.lexicographical_topological_sort(G))
# Get the first element to preset its label
first = lgts[0]
# Preset an empty label for all nodes
nx.set_node_attributes(G, '', 'label')
# Set the label for the root
G.nodes[first]['label'] = root
# For all nodes:
for node in lgts:
    # Get the labels available for this node's children
    s_labels = list(items - {G.nodes[node]['label']})
    # For all children:
    for s_node in G.successors(node):
        # Set the child's label
        G.nodes[s_node]['label'] = s_labels.pop()
# Create dict for drawing labels
labels = {n: G.nodes[n]['label'] for n in G.nodes}
# And draw the final graph
nx.draw(
    G,
    pos=nx.nx_pydot.graphviz_layout(G, prog='dot'),
    with_labels=True,
    labels=labels
)
Finally you will get this graph:

Related

python find the top N weighted edges regardless of weight

I am looking for a way to find the 5 biggest weighted edges incident to a node. Is there a way to request exactly the 5 biggest edges without a specific threshold value (i.e., something universal for any weighted graph)?
You could consider the edges sorted by weight and build a dictionary that maps each node to its edges, sorted by weight in non-increasing order.
>>> from collections import defaultdict
>>> res = defaultdict(list)
>>> for u, v in sorted(G.edges(), key=lambda x: G.get_edge_data(x[0], x[1])["weight"], reverse=True):
...     res[u].append((u, v))
...     res[v].append((u, v))
...
Then, given a node (e.g., 0), you could get the top N (e.g., 5) weighted edges as
>>> res[0][:5]
[(0, 7), (0, 2), (0, 6), (0, 1), (0, 3)]
If you only need to do it for a single node (e.g., 0), you can do it directly:
>>> sorted_edges_u = sorted(G.edges(0), key=lambda x: G.get_edge_data(x[0], x[1])["weight"], reverse=True)
>>> sorted_edges_u[:5]
[(0, 7), (0, 2), (0, 6), (0, 1), (0, 3)]
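The session above assumes a weighted graph G already exists. For reference, one can be built like this (a hypothetical example; the original post does not show how G was constructed):
import random
import networkx as nx

# Hypothetical weighted test graph
G = nx.gnp_random_graph(10, 0.5, seed=0)
random.seed(0)
for u, v in G.edges():
    G[u][v]["weight"] = random.randint(1, 100)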

Efficient way to loop through orthodiagonal indices in order

I want to find a better way to loop through orthodiagonal indices in order. I am currently using numpy, but I think I'm making an unnecessary number of function calls.
import numpy as np
len_x, len_y = 50, 50 #they don't have to equal
index_arr = np.add.outer(np.arange(len_x), np.arange(len_y))
Currently, I am looping through like this:
for i in range(np.max(index_arr) + 1):  # + 1 so the last antidiagonal is included
    orthodiag_indices = zip(*np.where(index_arr == i))
    for index in orthodiag_indices:
        pass  # DO FUNCTION OF index #
I apply an arbitrary function to the index tuple index, together with other parameters defined outside this loop. It feels like I don't need the second for loop and should be able to do the whole thing in one loop. On top of this, I'm making a lot of function calls through zip(*np.where(index_arr == i)) for every i. What's the most efficient way to do this?
Edit: should mention that it's important that the function applies to index_arr == i in order, i.e., it does 0 first, then 1, then 2 etc. (the order of the second loop doesn't matter).
Edit 2: I guess what I want is a way to get the indices [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), ...] efficiently. I don't think I can apply a vectorized function, because I am populating an np.zeros((len_x, len_y)) array and, going back to the first edit, the order matters.
You could use tril_indices/triu_indices. Since the order within the (former) inner loop doesn't matter, dimensions can be swapped as needed; I'll assume L >= S:
import numpy as np

L, S = 4, 3
a0, a1 = np.tril_indices(L, 0, S)
b0, b1 = np.triu_indices(S, 1)
C0 = np.concatenate([a0 - a1, b0 + L - b1])
C1 = np.concatenate([a1, b1])
print([(int(i), int(j)) for i, j in zip(C0, C1)])
# [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2), (3, 0), (2, 1), (1, 2), (3, 1), (2, 2), (3, 2)]
I think itertools.product() will be of use here
import itertools as it
x, y = 2, 3
a = list(it.product(range(x), range(y)))
which gives a as
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
If you need them in order, then:
import numpy as np

b = np.argsort(np.sum(a, 1))
np.array(a)[b]
which gives,
array([[0, 0],
       [0, 1],
       [1, 0],
       [0, 2],
       [1, 1],
       [1, 2]])
Hope that helps!
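If numpy isn't required at all, the same ordering can be produced with plain sorted, which is stable, so ties within an antidiagonal keep their generation order (a minimal sketch):
len_x, len_y = 2, 3
# Sort all indices by the antidiagonal they belong to (i + j)
ordered = sorted(
    ((i, j) for i in range(len_x) for j in range(len_y)),
    key=lambda ij: ij[0] + ij[1],
)
print(ordered)  # [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (1, 2)]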

python: create numpy array from dictionary, where key is coordinates

I have a dictionary of the following form:
{(2, 2): 387, (1, 2): 25, (0, 1): 15, (2, 1): 12, (2, 6): 5, (6, 2): 5, (4, 2): 4, (3, 4): 4, (5, 2): 2, (0, 2): 1}
where each key represents coordinates in the matrix, and each value is the actual value to be set at those coordinates.
At the moment I create and populate the matrix in the following way:
import numpy as np
def build_matrix(data, n):
    M = np.zeros(shape=(n, n), dtype=np.float64)
    for key, val in data.items():
        (row, col) = key
        M[row][col] = val
    return M
Is there a shorter way to do it using numpy's API? I looked at np.array() and np.asarray(), but none seem to fit my needs.
The shortest version given n and the input dictionary itself seems to be -
M = np.zeros(shape=(n, n), dtype=np.float64)
M[tuple(zip(*d.keys()))] = list(d.values())
That tuple(zip(*d.keys())) is basically transposing the nested items and then packing them into tuples, as needed for integer-array indexing into NumPy arrays.
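To make that transpose step concrete, a quick sketch with a shortened version of the question's dictionary:
d = {(2, 2): 387, (1, 2): 25, (0, 1): 15}
# First tuple holds the row indices, second the column indices
print(tuple(zip(*d.keys())))  # ((2, 1, 0), (2, 2, 1))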
Generic case
To handle the generic case, when n is not given and must be derived from the extents of the keys, along with the dtype from the dictionary values, it would be -
idx_ar = np.array(list(d.keys()))
out_shp = idx_ar.max(0)+1
data = np.array(list(d.values()))
M = np.zeros(shape=out_shp, dtype=data.dtype)
M[tuple(idx_ar.T)] = data
If you don't mind using scipy, what you've basically created is a sparse dok_matrix (Dictionary of Keys):
from scipy.sparse import dok_matrix
out = dok_matrix((n, n))
out.update(data)
out = out.todense()
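Depending on your SciPy version, the dict-style update() call above may be disallowed on dok_matrix; plain item assignment is a portable fallback (a minimal sketch, reusing data and n from the question):
from scipy.sparse import dok_matrix

out = dok_matrix((n, n))
# Assign each (row, col) -> value pair individually
for (row, col), val in data.items():
    out[row, col] = val
M = out.todense()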

how to get all coordinates in the rectangle between two coordinates?

Say I have a rectangle, and its top-left and bottom-right coordinates are A(0,0) and B(2,3) respectively. Is there a method/formula I can use to get all the coordinates inside this rectangle? I want my output to look like this if the input was these two coordinates:
input: [(0, 0), (2, 3)]
output: [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (0, 2), (1, 2), (2, 2), (0, 3), (1, 3), (2, 3)]
Also, a Python 3 implementation would be greatly appreciated, although not necessary.
Thanks.
EDIT: full story: I'm using Python, and at first I thought I could achieve what I want by getting all the values between x1 and x2, and between y1 and y2. So, for example, I have x = 0, x = 1, x = 2 and y = 0, y = 1, y = 2, y = 3, but I honestly don't know where to go from there, or whether this is correct in the first place. I thought I could get all the coordinates by somehow taking all the coordinates with y = 0 with different x values, then all the coordinates with y = 1... but I can't seem to wrap my head around a way of doing this. Any help is appreciated, thanks.
One thing you could do is make a list of all x coordinates inside the rectangle [x1..x2] and all y coordinates inside the rectangle [y1..y2] and then take the Cartesian product of the two lists using itertools:
import itertools
...
input = [(0, 0), (2, 3)]
x_coords = list(range(input[0][0], input[1][0] + 1))
y_coords = list(range(input[0][1], input[1][1] + 1))
output = list(itertools.product(x_coords, y_coords))
If you don't want to use itertools to compute the product, you could also easily use a for loop or a list comprehension to do it instead, which is roughly equivalent to what itertools is doing behind the scenes anyway:
output = [(x, y) for x in x_coords for y in y_coords]
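One caveat: itertools.product varies its last argument fastest, so the output above is ordered with y changing fastest, while the question's expected output has x changing fastest. To match that exact ordering, swap the loop order:
# y in the outer loop, x in the inner loop -> x varies fastest
output = [(x, y) for y in y_coords for x in x_coords]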

Graph reduction

I have been working on a piece of code to reduce a graph. The problem is that there are some branches I want to remove. Once I remove a branch, I can either merge the nodes or not, depending on the number of paths between the nodes the branch joined.
Maybe the following example illustrates what I want:
The code I have is the following:
from networkx import DiGraph, all_simple_paths, draw
from matplotlib import pyplot as plt
# data preparation
branches = [(2, 1), (3, 2), (4, 3), (4, 13), (7, 6), (6, 5), (5, 4),
            (8, 7), (9, 8), (9, 10), (10, 11), (11, 12), (12, 1), (13, 9)]
branches_to_remove_idx = [11, 10, 9, 8, 6, 5, 3, 2, 0]
ft_dict = dict()
graph = DiGraph()
for i, br in enumerate(branches):
    graph.add_edge(br[0], br[1])
    ft_dict[i] = (br[0], br[1])
# Processing -----------------------------------------------------
for idx in branches_to_remove_idx:
    # get the nodes that define the edge to remove
    f, t = ft_dict[idx]
    # get the number of paths from 'f' to 't'
    n_paths = len(list(all_simple_paths(graph, f, t)))
    if n_paths == 1:
        # remove the branch and merge the nodes 'f' and 't'
        #
        # This is what I have no clue how to do
        #
        pass
    else:
        # remove the branch and that's it
        graph.remove_edge(f, t)
        print('Simple removal of', f, t)
# -----------------------------------------------------------------
draw(graph, with_labels=True)
plt.show()
I feel that there should be a simpler direct way to obtain the last figure from the first, given the branch indices, but I have no clue.
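As an aside on the merge step the question leaves open: networkx has a built-in, contracted_nodes, that merges one node into another (a minimal sketch using the f and t from the loop above; note it returns a new graph by default):
from networkx import contracted_nodes

# Merge node 't' into node 'f', dropping the self-loop the contraction would create
graph = contracted_nodes(graph, f, t, self_loops=False)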
I think this is more or less what you want. I am merging all nodes that are in chains (connected nodes of degree 2) into one hypernode. I return the new graph and a dictionary mapping each hypernode to the nodes it contracted.
import networkx as nx
def contract(g):
    """
    Contract chains of neighbouring vertices with degree 2 into one hypernode.

    Arguments:
    ----------
    g -- networkx.Graph instance

    Returns:
    --------
    h -- networkx.Graph instance
        the contracted graph
    hypernode_to_nodes -- dict: int hypernode -> [v1, v2, ..., vn]
        dictionary mapping hypernodes to nodes
    """
    # create subgraph of all nodes with degree 2
    # (g.degree() here; networkx 1.x used g.degree_iter())
    is_chain = [node for node, degree in g.degree() if degree == 2]
    chains = g.subgraph(is_chain)

    # contract connected components (which should be chains of variable length)
    # into single nodes (connected_component_subgraphs was removed in networkx 2.4)
    components = [chains.subgraph(c) for c in nx.connected_components(chains)]

    hypernode = max(g.nodes()) + 1
    hypernodes = []
    hyperedges = []
    hypernode_to_nodes = dict()
    false_alarms = []
    for component in components:
        if component.number_of_nodes() > 1:
            hypernodes.append(hypernode)
            vs = list(component.nodes())
            hypernode_to_nodes[hypernode] = vs

            # create new edges from the neighbours of the chain ends to the hypernode
            component_edges = list(component.edges())
            for v, w in [e for e in g.edges(vs)
                         if not ((e in component_edges) or (e[::-1] in component_edges))]:
                if v in component:
                    hyperedges.append([hypernode, w])
                else:
                    hyperedges.append([v, hypernode])

            hypernode += 1
        else:  # nothing to collapse as there is only a single node in the component
            false_alarms.extend(component.nodes())

    # initialise new graph with all other nodes
    # (.copy() makes the subgraph view mutable in networkx 2.x)
    not_chain = [node for node in g.nodes() if node not in is_chain]
    h = g.subgraph(not_chain + false_alarms).copy()
    h.add_nodes_from(hypernodes)
    h.add_edges_from(hyperedges)

    return h, hypernode_to_nodes
edges = [(2, 1),
         (3, 2),
         (4, 3),
         (4, 13),
         (7, 6),
         (6, 5),
         (5, 4),
         (8, 7),
         (9, 8),
         (9, 10),
         (10, 11),
         (11, 12),
         (12, 1),
         (13, 9)]
g = nx.Graph(edges)
h, hypernode_to_nodes = contract(g)
print("Edges in contracted graph:")
print(h.edges())
print('')
print("Hypernodes:")
for hypernode, nodes in hypernode_to_nodes.items():
    print("{} : {}".format(hypernode, nodes))
This returns for your example:
Edges in contracted graph:
[(9, 13), (9, 14), (9, 15), (4, 13), (4, 14), (4, 15)]
Hypernodes:
14 : [1, 2, 3, 10, 11, 12]
15 : [8, 5, 6, 7]
I built this function, which scales much better and runs faster with larger graphs:
from collections import Counter
from functools import reduce
import networkx as nx
import pandas as pd

def add_dicts(vector):
    # key-wise sum of a sequence of edge-attribute dicts
    l = list(map(lambda x: Counter(x), vector))
    return reduce(lambda x, y: x + y, l)

def consolidate_dup_edges(g):
    # merge parallel edges between the same pair of nodes, summing their attributes
    edges = pd.DataFrame(g.edges(data=True), columns=['start', 'end', 'weight'])
    edges_consolidated = edges.groupby(['start', 'end']).agg({'weight': add_dicts}).reset_index()
    return nx.from_edgelist(list(edges_consolidated.itertuples(index=False, name=None)))

def graph_reduce(g):
    g = consolidate_dup_edges(g)
    # intermediate nodes of degree 2 and the pair of neighbours each one connects
    is_deg2 = [node for node, degree in g.degree() if degree == 2]
    is_deg2_descendents = list(map(lambda x: tuple(nx.descendants_at_distance(g, x, 1)), is_deg2))
    # combine the attribute dicts of the two edges incident to each degree-2 node
    edges_on_deg2 = list(map(lambda x: list(map(lambda e: e[2], g.edges(x, data=True))), is_deg2))
    edges_on_deg2 = list(map(lambda x: add_dicts(x), edges_on_deg2))
    new_edges = list(zip(is_deg2_descendents, edges_on_deg2))
    new_edges = [(a, b, c) for (a, b), c in new_edges]
    # replace each degree-2 node with a direct edge between its neighbours
    g.remove_nodes_from(is_deg2)
    g.add_edges_from(new_edges)
    g.remove_edges_from(nx.selfloop_edges(g))
    g.remove_nodes_from([node for node, degree in g.degree() if degree <= 1])
    return consolidate_dup_edges(g)
The graph_reduce function basically removes nodes of degree 1 and removes intermediate nodes of degree 2, connecting the pair of nodes each degree-2 node was attached to. The biggest impact comes from running this code iteratively until the number of nodes plateaus at a stable value. Note that it only works on undirected graphs.
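A minimal driver sketch of that iterate-until-plateau idea (the toy edge list here is an assumption for illustration; graph_reduce expects an undirected graph whose edges carry attribute dicts):
g = nx.Graph()
# hypothetical input: a 4-cycle with a pendant node, weights as edge attributes
g.add_weighted_edges_from([(1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 1, 1), (4, 5, 1)])
prev_count = None
while prev_count != g.number_of_nodes():
    prev_count = g.number_of_nodes()
    g = graph_reduce(g)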
