Graph reduction - python-3.x

I have been working on a piece of code to reduce a graph. There are some branches that I want to remove. Once I remove a branch, I may or may not merge its two end nodes, depending on the number of paths between the nodes the branch joined.
Maybe the following example illustrates what I want:
The code I have is the following:
from networkx import DiGraph, all_simple_paths, draw
from matplotlib import pyplot as plt
# data preparation
branches = [(2,1), (3,2), (4,3), (4,13), (7,6), (6,5), (5,4),
            (8,7), (9,8), (9,10), (10,11), (11,12), (12,1), (13,9)]
branches_to_remove_idx = [11, 10, 9, 8, 6, 5, 3, 2, 0]
ft_dict = dict()
graph = DiGraph()
for i, br in enumerate(branches):
    graph.add_edge(br[0], br[1])
    ft_dict[i] = (br[0], br[1])
# Processing -----------------------------------------------------
for idx in branches_to_remove_idx:
    # get the nodes that define the edge to remove
    f, t = ft_dict[idx]
    # get the number of paths from 'f' to 't'
    n_paths = len(list(all_simple_paths(graph, f, t)))
    if n_paths == 1:
        # remove branch and merge the nodes 'f' and 't'
        #
        # This is what I have no clue how to do
        #
        pass
    else:
        # remove the branch and that's it
        graph.remove_edge(f, t)
        print('Simple removal of', f, t)
# -----------------------------------------------------------------
draw(graph, with_labels=True)
plt.show()
I feel that there should be a simpler direct way to obtain the last figure from the first, given the branch indices, but I have no clue.

I think this is more or less what you want. I merge all nodes that are in chains (connected nodes of degree 2) into one hypernode. I return the new graph and a dictionary mapping each hypernode to the nodes it contracted.
import networkx as nx
def contract(g):
    """
    Contract chains of neighbouring vertices with degree 2 into one hypernode.
    Arguments:
    ----------
    g -- networkx.Graph instance
    Returns:
    --------
    h -- networkx.Graph instance
        the contracted graph
    hypernode_to_nodes -- dict: int hypernode -> [v1, v2, ..., vn]
        dictionary mapping hypernodes to nodes
    """
    # create subgraph of all nodes with degree 2
    # (g.degree() replaces degree_iter(), which networkx 2.0 removed)
    is_chain = [node for node, degree in g.degree() if degree == 2]
    chains = g.subgraph(is_chain)
    # contract connected components (which should be chains of variable length) into single node
    # (connected_component_subgraphs() was removed in networkx 2.4)
    components = [chains.subgraph(c) for c in nx.connected_components(chains)]
    hypernode = max(g.nodes()) + 1
    hypernodes = []
    hyperedges = []
    hypernode_to_nodes = dict()
    false_alarms = []
    for component in components:
        if component.number_of_nodes() > 1:
            hypernodes.append(hypernode)
            vs = [node for node in component.nodes()]
            hypernode_to_nodes[hypernode] = vs
            # create new edges from the neighbours of the chain ends to the hypernode
            component_edges = [e for e in component.edges()]
            for v, w in [e for e in g.edges(vs) if not ((e in component_edges) or (e[::-1] in component_edges))]:
                if v in component:
                    hyperedges.append([hypernode, w])
                else:
                    hyperedges.append([v, hypernode])
            hypernode += 1
        else: # nothing to collapse as there is only a single node in component:
            false_alarms.extend([node for node in component.nodes()])
    # initialise new graph with all other nodes
    # (copy, since subgraph() returns a read-only view in networkx >= 2.0)
    not_chain = [node for node in g.nodes() if node not in is_chain]
    h = g.subgraph(not_chain + false_alarms).copy()
    h.add_nodes_from(hypernodes)
    h.add_edges_from(hyperedges)
    return h, hypernode_to_nodes
edges = [(2, 1),
(3, 2),
(4, 3),
(4, 13),
(7, 6),
(6, 5),
(5, 4),
(8, 7),
(9, 8),
(9, 10),
(10, 11),
(11, 12),
(12, 1),
(13, 9)]
g = nx.Graph(edges)
h, hypernode_to_nodes = contract(g)
print("Edges in contracted graph:")
print(h.edges())
print('')
print("Hypernodes:")
for hypernode, nodes in hypernode_to_nodes.items():
    print("{} : {}".format(hypernode, nodes))
This returns for your example:
Edges in contracted graph:
[(9, 13), (9, 14), (9, 15), (4, 13), (4, 14), (4, 15)]
Hypernodes:
14 : [1, 2, 3, 10, 11, 12]
15 : [8, 5, 6, 7]

I built this function that scales much better and runs faster with larger graphs:
from collections import Counter
from functools import reduce
import networkx as nx
import pandas as pd
def add_dicts(vector):
    # sum a sequence of weight dicts key by key
    l = list(map(lambda x: Counter(x), vector))
    return reduce(lambda x, y: x + y, l)
def consolidate_dup_edges(g):
    # merge duplicate (start, end) edges by adding up their weight dicts
    edges = pd.DataFrame(g.edges(data=True), columns=['start', 'end', 'weight'])
    edges_consolidated = edges.groupby(['start', 'end']).agg({'weight': add_dicts}).reset_index()
    return nx.from_edgelist(list(edges_consolidated.itertuples(index=False, name=None)))
def graph_reduce(g):
    g = consolidate_dup_edges(g)
    is_deg2 = [node for node, degree in g.degree() if degree == 2]
    # the two neighbours of every degree-2 node
    is_deg2_descendents = list(map(lambda x: tuple(nx.descendants_at_distance(g, x, 1)), is_deg2))
    # combined weights of the two edges meeting at every degree-2 node
    edges_on_deg2 = list(map(lambda x: list(map(lambda x: x[2], g.edges(x, data=True))), is_deg2))
    edges_on_deg2 = list(map(lambda x: add_dicts(x), edges_on_deg2))
    new_edges = list(zip(is_deg2_descendents, edges_on_deg2))
    new_edges = [(a, b, c) for (a, b), c in new_edges]
    g.remove_nodes_from(is_deg2)
    g.add_edges_from(new_edges)
    g.remove_edges_from(nx.selfloop_edges(g))
    g.remove_nodes_from([node for node, degree in g.degree() if degree <= 1])
    return consolidate_dup_edges(g)
The graph_reduce function removes nodes of degree 1 or less, removes intermediate nodes of degree 2, and connects the two neighbours of each removed degree-2 node directly. It has the most effect when it is run repeatedly until the number of nodes stops changing. It only works on undirected graphs.
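The "run it until the node count plateaus" idea can be sketched without any libraries. This is only an illustration under simplifying assumptions, not the function above: the graph is a plain dict mapping each node to a set of neighbours, edge weights are ignored, and the names reduce_once and reduce_until_stable are made up for this example.

```python
def reduce_once(adj):
    """One reduction pass over an undirected graph {node: set_of_neighbours}."""
    adj = {n: set(nbrs) for n, nbrs in adj.items()}  # work on a copy
    for node in list(adj):
        nbrs = adj.get(node)
        if nbrs is None:  # already removed earlier in this pass
            continue
        if len(nbrs) <= 1:
            # dangling or isolated node: drop it
            for nb in nbrs:
                adj[nb].discard(node)
            del adj[node]
        elif len(nbrs) == 2:
            # intermediate node: connect its two neighbours directly
            a, b = nbrs
            adj[a].discard(node)
            adj[b].discard(node)
            adj[a].add(b)
            adj[b].add(a)
            del adj[node]
    return adj

def reduce_until_stable(adj):
    """Repeat reduce_once until the number of nodes stops changing."""
    while True:
        reduced = reduce_once(adj)
        if len(reduced) == len(adj):
            return reduced
        adj = reduced

# K4 on nodes 1-4, with edge 1-2 routed through node 5 and a tail 4-6-7
adj = {
    1: {3, 4, 5}, 2: {3, 4, 5}, 3: {1, 2, 4}, 4: {1, 2, 3, 6},
    5: {1, 2}, 6: {4, 7}, 7: {6},
}
core = reduce_until_stable(adj)
print(sorted(core))  # [1, 2, 3, 4]
```

Because parallel edges collapse into one in this representation, a graph whose hubs are joined only by chains can shrink to nothing; keeping weights or multi-edges, as the function above tries to do, avoids losing that information.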

Related

python find the top N weighted edges regardless of weight

I am looking for a way to find the 5 highest-weighted edges of a node. Is there a way to ask for exactly the top 5 edges without a specific threshold value (i.e., something that works for any weighted graph)?
You could consider the edges sorted by weight and build a dictionary that maps a node with its edges, sorted by weight in a non-increasing way.
>>> from collections import defaultdict
>>> res = defaultdict(list)
>>> for u,v in sorted(G.edges(), key=lambda x: G.get_edge_data(x[0], x[1])["weight"], reverse=True):
...     res[u].append((u,v))
...     res[v].append((u,v))
...
Then, given a node (e.g., 0), you could get the top N (e.g., 5) weighted edges as
>>> res[0][:5]
[(0, 7), (0, 2), (0, 6), (0, 1), (0, 3)]
If you only need to do it for a node (e.g., 0), you can directly do:
>>> sorted_edges_u = sorted(G.edges(0), key=lambda x: G.get_edge_data(x[0], x[1])["weight"], reverse=True)
>>> sorted_edges_u[:5]
[(0, 7), (0, 2), (0, 6), (0, 1), (0, 3)]
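If only the top N are needed, a full sort can be avoided with heapq.nlargest. The sketch below assumes the edges are available as plain (u, v, weight) triples (with networkx these could come from G.edges(node, data="weight")); the helper name top_n_edges and the sample weights are invented for illustration.

```python
import heapq

def top_n_edges(weighted_edges, node, n=5):
    """Return the n heaviest edges that touch `node`, heaviest first."""
    incident = [(u, v, w) for u, v, w in weighted_edges if node in (u, v)]
    return heapq.nlargest(n, incident, key=lambda edge: edge[2])

edges = [(0, 1, 0.4), (0, 2, 0.9), (0, 3, 0.1), (1, 2, 0.7), (0, 4, 0.5)]
top2 = top_n_edges(edges, 0, n=2)
print(top2)  # [(0, 2, 0.9), (0, 4, 0.5)]
```

No threshold is involved: if the node has fewer than n edges, all of them are returned.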

python3: normalize matrix of transition probabilities

I have a Python code partially borrowed from Generating Markov transition matrix in Python:
# xstates is a dictionary
# n - is the matrix size
def prob(xstates, n):
    # we want to do smoothing, so create matrix of all 1s
    M = [[1] * n for _ in range(n)]
    # populate matrix by (row, column)
    for key, val in xstates.items():
        (row, col) = key
        M[row][col] = val
    # and finally calculate probabilities
    for row in M:
        s = sum(row)
        if s > 0:
            row[:] = [f/s for f in row]
    return M
Here xstates is a dictionary, e.g.:
{(2, 2): 387, (1, 2): 25, (0, 1): 15, (2, 1): 12, (3, 2): 5, (2, 3): 5, (6, 2): 4, (5, 6): 4, (4, 2): 2, (0, 2): 1}
where (1, 2) means a transition from state 1 to state 2, and similarly for the other keys.
This function generates the matrix of transition probabilities, the sum of all elements in a row is 1. Now I need to normalize the values. How would I do that? Can I do that with numpy library?
import numpy as np
M = np.random.random([3,2])
print(M)
# make each row sum to 1
M = M / M.sum(axis=1)[:, np.newaxis]
print(M)
# make each column sum to 1
M = M / M.sum(axis=0)[np.newaxis,:]
print(M)
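Applied to the dictionary from the question, the same row normalization might look like the sketch below. It assumes the all-ones starting matrix from the question's prob function is wanted; the function name transition_matrix and the shortened xstates are made up for the example.

```python
import numpy as np

def transition_matrix(xstates, n):
    """Build an n x n count matrix from {(row, col): count} and normalize its rows."""
    M = np.ones((n, n), dtype=float)  # all-ones start, as in the question
    for (row, col), count in xstates.items():
        M[row, col] = count
    # keepdims=True keeps the row sums as a column so the division broadcasts per row
    return M / M.sum(axis=1, keepdims=True)

xstates = {(0, 1): 15, (1, 2): 25, (2, 2): 387, (2, 1): 12, (0, 2): 1}
P = transition_matrix(xstates, 3)
print(P.sum(axis=1))  # every row sums to 1 (up to floating-point rounding)
```

Since every entry starts at 1, no row sum can be zero here; with a zero-initialized matrix, empty rows would need separate handling before dividing.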

How to generate a graph presenting all the possibilities using Python

I need to write a python code that allows me to generate a tree of possibilities that depend on each other. In fact, if we have two vectors: a=[0, 1] and b=[0, 1], we can construct 4 different possibilities:
(0, 0)
(0, 1)
(1, 0)
(1, 1)
If we will take (0,0) as the parent node, we can generate 3 edges from (0, 0) to all other possibilities: (0, 0) -> (0, 1), (1, 0), (1, 1).
Then for each possibility we can generate 3 edges to the other possibilities, e.g:
(0, 1) -> (0, 0), (1, 0), (1, 1)
(1, 0) -> (0, 0), (1, 1), (0, 1)
(1, 1) -> (0, 0), (1, 0), (0, 1)
I need to repeat that N times. The result should be a tree, where every non-leaf node has 3 successors - for every possibility except the current.
The correct name for your graph is a complete graph. A good graph-processing library for Python, networkx, has a dedicated function for generating this type of graph:
complete_graph
Edit 1: I constructed the workflow for you that solves your problem. You can copy-paste it into your Jupyter notebook, but note that you need:
networkx
graphviz
pydot
to be installed.
import networkx as nx
# Set main parameters
items = {(0, 0), (0, 1), (1, 0), (1, 1)}
root = (0, 1)
N = 4
# Calculate the number of nodes for our tree
node_count = sum((len(items)-1)**i for i in range(N))
# Construct full r-ary tree
G = nx.full_rary_tree(len(items)-1, node_count, create_using=nx.DiGraph)
# Create LG-topologically sorted array of nodes
# NOTE THAT NODES' IDs AREN'T EQUAL TO YOUR ITEMS
lgts = list(nx.lexicographical_topological_sort(G))
# Get the first element to preset its label
first = lgts[0]
# Preset an empty label for all nodes
nx.set_node_attributes(G, '', 'label')
# Set the label for the root
G.nodes[first]['label'] = root
# For all nodes:
for node in lgts:
    # Get needed names
    s_labels = list(items - {G.nodes[node]['label']})
    # For all children:
    for s_node in G.successors(node):
        # Set the child's label
        G.nodes[s_node]['label'] = s_labels.pop()
# Create dict for drawing labels
labels = {n: G.nodes[n]['label'] for n in G.nodes}
# And draw the final graph
nx.draw(
    G,
    pos=nx.nx_pydot.graphviz_layout(G, prog='dot'),
    with_labels=True,
    labels=labels
)
Finally you will get this graph:
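Independently of the drawing, the structure of the tree can be checked with the standard library alone. The sketch below assumes the possibilities come from itertools.product over the two vectors and represents each tree node as a (label, children) pair; the names successors, expand and count_nodes are made up for this example.

```python
from itertools import product

a = [0, 1]
b = [0, 1]
items = list(product(a, b))  # (0, 0), (0, 1), (1, 0), (1, 1)
# every possibility points to all the other possibilities
successors = {item: [other for other in items if other != item] for item in items}

def expand(label, depth):
    """Return a (label, children) tree with `depth` levels below and including label."""
    if depth == 1:
        return (label, [])
    return (label, [expand(child, depth - 1) for child in successors[label]])

def count_nodes(tree):
    """Count the nodes of a (label, children) tree."""
    return 1 + sum(count_nodes(child) for child in tree[1])

tree = expand((0, 0), 4)
print(count_nodes(tree))  # 40, i.e. 1 + 3 + 9 + 27
```

The count agrees with node_count in the workflow above for N = 4, since each non-leaf node has len(items) - 1 = 3 children.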

Merge tuples in list with similar elements

I have to merge all the tuples that share at least one element with each other.
tups=[(1,2),(2,3),(8,9),(4,5),(15,12),(9,6),(7,8),(3,11),(1,15)]
The first tuple (1,2) should be merged with (2,3), (3,11), (1,15) and (15,12), since each of these tuples shares an item with a tuple already in the group. So the final output should be
lst1 = [1,2,3,11,12,15]
lst2 = [6,7,8,9], since (8,9), (9,6) and (7,8) have matching elements.
My code so far:
finlst=[]
for items in range(len(tups)):
    for resid in range(len(tups)):
        if(tups[items] != tups[resid] ):
            if(tups[items][0]==tups[resid][0] or tups[items][0]==tups[resid][1]):
                finlst.append(list(set(tups[items]+tups[resid])))
You could do it like this, using sets that are expanded with matching tuples:
tups = [(1, 2), (2, 3), (8, 9), (4, 5), (15, 12), (9, 6), (7, 8), (3, 11), (1, 15)]
groups = []
for t in tups:
    for group in groups:
        # find a group that has at least one element in common with the tuple
        if any(x in group for x in t):
            # extend the group with the items from the tuple
            group.update(t)
            # break from the group-loop as we don’t need to search any further
            break
    else:
        # otherwise (if the group-loop ended without being cancelled with `break`)
        # create a new group from the tuple
        groups.append(set(t))
# output
for group in groups:
    print(group)
{1, 2, 3, 11, 15}
{8, 9, 6, 7}
{4, 5}
{12, 15}
Since this solution iterates the original tuple list once and in order, it will not work for inputs where a later tuple connects two groups that already exist (as with (1, 15) above, which leaves {12, 15} as a separate group). For that, we could use the following solution instead, which uses fixed-point iteration to combine the groups for as long as that still works:
tups = [(1, 2), (3, 4), (1, 4)]
import itertools
groups = [set(t) for t in tups]
while True:
    for a, b in itertools.combinations(groups, 2):
        # if the groups can be merged
        if len(a & b):
            # construct new groups list
            groups = [g for g in groups if g != a and g != b]
            groups.append(a | b)
            # break the for loop and restart
            break
    else:
        # the for loop ended naturally, so no overlapping groups were found
        break
I found a solution; this is really a graph-theory problem about connectivity (see Connectivity-Graph Theory).
We can use NetworkX for this, which is pretty much guaranteed to be correct:
def uniqueGroup(groups):
    import networkx
    from networkx.algorithms.components.connected import connected_components
    def to_graph(groups):
        G = networkx.Graph()
        for part in groups:
            # each sublist is a bunch of nodes
            G.add_nodes_from(part)
            # it also implies a number of edges:
            G.add_edges_from(to_edges(part))
        return G
    def to_edges(group):
        """
        Treat `group` as a path graph and yield its edges:
        to_edges(['a','b','c','d']) -> [(a,b), (b,c), (c,d)]
        """
        it = iter(group)
        last = next(it)
        for current in it:
            yield last, current
            last = current
    G = to_graph(groups)
    # connected_components() returns a generator, so materialise it as a list
    return list(connected_components(G))
Output:
tups = [(1, 2),(3,4),(1,4)]
uniqueGroup(tups)
[{1, 2, 3, 4}]
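The same connected-components grouping can also be done without NetworkX using a union-find (disjoint-set) structure. This is a swapped-in technique rather than anything from the answers above, and the name merge_groups is made up for the sketch.

```python
def merge_groups(tups):
    """Group items that are linked, directly or indirectly, by sharing a tuple."""
    parent = {}

    def find(x):
        # return the representative of x's group, registering x if it is new
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[ry] = rx

    for t in tups:
        find(t[0])
        for other in t[1:]:
            union(t[0], other)

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    # sorted lists, ordered by smallest member, for a stable result
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])

tups = [(1, 2), (2, 3), (8, 9), (4, 5), (15, 12), (9, 6), (7, 8), (3, 11), (1, 15)]
merged = merge_groups(tups)
print(merged)  # [[1, 2, 3, 11, 12, 15], [4, 5], [6, 7, 8, 9]]
```

Unlike the single-pass set approach, the order of the tuples does not matter here, so (15, 12) ends up in the same group as (1, 15).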

Get degree of each nodes in a graph by Networkx in python

Suppose I have a data set like below that shows an undirected graph:
1 2
1 3
1 4
3 5
3 6
7 8
8 9
10 11
I have a Python script like this:
for s in ActorGraph.degree():
    print(s)
which prints key/value pairs where the keys are node names and the values are the node degrees:
('9', 1)
('5', 1)
('11', 1)
('8', 2)
('6', 1)
('4', 1)
('10', 1)
('7', 1)
('2', 1)
('3', 3)
('1', 3)
The networkx documentation suggests using values() to get the node degrees.
Now I would like just the node degrees, so I use the following, but it doesn't work and says the object has no attribute 'values':
for s in ActorGraph.degree():
    print(s.values())
How can I do it?
You are using version 2.0 of networkx, which changed G.degree() from returning a dict to returning a dict-like (but not dict) DegreeView. See this guide.
To have the degrees in a list you can use a list-comprehension:
degrees = [val for (node, val) in G.degree()]
I'd like to add the following: if you're initializing the undirected graph with nx.Graph() and adding the edges afterwards, beware that networkx does not return the nodes in sorted order (in the example below they come out in the order in which the edges first introduced them), and this also applies to degree(). This means that if you use the list-comprehension approach and then access a degree by list index, the index may not correspond to the node you expect. If you'd like them to correspond, you can instead do:
degrees = [val for (node, val) in sorted(G.degree(), key=lambda pair: pair[0])]
Here's a simple example to illustrate this:
>>> edges = [(0, 1), (0, 3), (0, 5), (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (2, 5)]
>>> g = nx.Graph()
>>> g.add_edges_from(edges)
>>> print(g.degree())
[(0, 3), (1, 4), (3, 3), (5, 2), (2, 4), (4, 2)]
>>> print([val for (node, val) in g.degree()])
[3, 4, 3, 2, 4, 2]
>>> print([val for (node, val) in sorted(g.degree(), key=lambda pair: pair[0])])
[3, 4, 4, 3, 2, 2]
You can also use a dict comprehension to get an actual dictionary:
degrees = {node:val for (node, val) in G.degree()}
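As a cross-check that does not depend on the networkx version, the degrees can also be counted straight from the edge list with collections.Counter. The sketch assumes the data set from the question is available as whitespace-separated text; the variable names are made up.

```python
from collections import Counter

edge_text = """1 2
1 3
1 4
3 5
3 6
7 8
8 9
10 11"""

degree = Counter()
for line in edge_text.splitlines():
    u, v = line.split()
    # each undirected edge adds one to the degree of both endpoints
    degree[u] += 1
    degree[v] += 1

degrees = [degree[node] for node in sorted(degree, key=int)]
print(degrees)  # [3, 1, 3, 1, 1, 1, 1, 2, 1, 1, 1]
```

The node names stay strings, as in the printed output in the question, so they are sorted with key=int to get numeric order.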

Resources