python find the top N weighted edges regardless of weight - python-3.x

I am looking for a way to find the biggest 5 weighted edges in a node. Is there a way to specify that I want exactly the biggest 5 edges without a specific threshold value(a.k.a universal for any weighted graph)?

You could consider the edges sorted by weight and build a dictionary that maps a node with its edges, sorted by weight in a non-increasing way.
>>> from collections import defaultdict
>>> res = defaultdict(list)
>>> for u,v in sorted(G.edges(), key=lambda x: G.get_edge_data(x[0], x[1])["weight"], reverse=True):
... res[u].append((u,v))
... res[v].append((u,v))
...
Then, given a node (e.g., 0), you could get the top N (e.g., 5) weighted edges as
>>> res[0][:5]
[(0, 7), (0, 2), (0, 6), (0, 1), (0, 3)]
If you only need to do it for a node (e.g., 0), you can directly do:
>>> sorted_edges_u = sorted(G.edges(0), key=lambda x: G.get_edge_data(x[0], x[1])["weight"], reverse=True)
>>> sorted_edges_u[:5]
[(0, 7), (0, 2), (0, 6), (0, 1), (0, 3)]

Related

How to pass list of tuples through a object method in python

Having this frustrating issue where i want to pass through the tuples in the following list
through a method on another list of instances of a class that i have created
list_1=[(0, 20), (10, 1), (0, 1), (0, 10), (5, 5), (10, 50)]
instances=[instance[0], instance[1],...instance[n]]
results=[]
pos_list=[]
for i in range(len(list_1)):
a,b=List_1[i]
result=sum(instance.method(a,b) for instance in instances)
results.append(result)
if result>=0:
pos_list.append((a,b))
print(results)
print(pos_list)
the issue is that all instances are taking the same tuple, where as i want the method on the first instance to take the first tuple and so on.
I ultimately want to see it append to the new list (pos_list) if the sum is >0.
Anyone know how i can iterate this properly?
EDIT
It will make it clearer if I print the result of the sum also.
Basically I want the sum to perform as follows:
result = instance[0].method(0,20), instance[1].method(10,1), instance[2].method(0,1), instance[3].method(0,10), instance[4].method(5,5), instance[5].method(10,50)
For info the method is just the +/- product of the two values depending on the attributes of the instance.
So results for above would be:
result = [0*20 - 10*1 - 0*1 + 0*10 - 5*5 + 10*50] = [465]
pos_list=[(0, 20), (10, 1), (0, 1), (0, 10), (5, 5), (10, 50)]
except what is actually doing is using the same tuple for all instances like this:
result = instance[0].method(0,20), instance[1].method(0,20), instance[2].method(0,20), instance[3].method(0,20), instance[4].method(0,20), instance[5].method(0,20)
result = [0*20 - 0*20 - 0*20 + 0*20 - 0*20 + 0*20] = [0]
pos_list=[]
and so on for (10,1) etc.
How do I make it work like the first example?
You can compute your sum using zip to generate all the pairs of correspondent instances and tuples.
result=sum(instance.payout(*t) for instance, t in zip(instances, List_1))
The zip will stop as soon as it reaches the end of the shortest of the two iterators. So if you have 10 instances and 100 tuples, zip will produce only 10 pairs, using the first 10 elements of both lists.
The problem I see in your code is that you are computing this sum for each element of List_1, so if payout produces always the same result with the same inputs (e.g., it has no memory or randomness), the value of result will be the same at each iteration. So, in the end, results will be composed by the same value repeated a number of times equal to the length of List_1, while pos_list will contain all (the sum is greater than 0) or none (the sum is less or equal to zero) of the input tuples.
Instead, it would make sense if items of List_1 were lists or tuples themselves:
List_1 = [
[(0, 1), (2, 3), (4, 5)],
[(6, 7), (8, 9), (10, 11)],
[(12, 13), (14, 15), (16, 17)],
]
So, in this case, supposing that your class for instances is something like this:
class Goofy:
def __init__(self, positive_sum=True):
self.positive_sum = positive_sum
def payout(self, *args):
if self.positive_sum:
return sum(args)
else:
return -1 * sum(args)
instances = [Goofy(i) for i in [True, True, False]]
you can rewrite your code in this way:
results=[]
pos_list=[]
for el in List_1:
result = sum(g.payout(*t) for g, t in zip(instances, el))
results.append(result)
if result >= 0:
pos_list.append(el)
Running the previous code, results will be:
[-3, 9, 21]
while pop_list:
[[(6, 7), (8, 9), (10, 11)], [(12, 13), (14, 15), (16, 17)]]
If you are interested only in pop_list, you can compact your code in only one line:
pop_list = list(filter(lambda el: sum(g.payout(*t) for g, t in zip(instances, el)) > 0, List_1))
many thanks for the above! I have it working now.
Wasn't able to use args given my method had a bit more to it but the use of zip is what made it click
import random
rand=random.choices(list_1, k=len(instances))
results=[]
pos_list=[]
for r in rand:
x,y=r
result=sum(instance.method(x,y) for instance,(x,y) in zip(instances, rand))
results.append(result)
if result>=0:
pos_list.append(rand)
print(results)
print(pos_list)
for list of e.g.
rand=[(20, 5), (0, 2), (0, 100), (2, 50), (5, 10), (50, 100)]
this returns the following
results=[147]
pos_list=[(20, 5), (0, 2), (0, 100), (2, 50), (5, 10), (50, 100)]
so exactly what I wanted. Thanks again!

Efficient way to loop through orthodiagonal indices in order

I wanted to find a better way to loop through orthodiagonal indices in order, I am currently using numpy but I think I'm making an unnecessary number of function calls.
import numpy as np
len_x, len_y = 50, 50 #they don't have to equal
index_arr = np.add.outer(np.arange(len_x), np.arange(len_y))
Currently, I am looping through like this:
for i in range(np.max(index_arr)):
orthodiag_indices = zip(*np.where(index_arr == i))
for index in orthodiag_indices:
# DO FUNCTION OF index #
I have an arbitrary function of the index tuple, index and other parameters outside of this loop. It feels like I don't need the second for loop, and I should be able to do the whole thing in one loop. On top of this, I'm making a lot of function calls from zip(*np.where(index_arr == i)) for every i. What's the most efficient way to do this?
Edit: should mention that it's important that the function applies to index_arr == i in order, i.e., it does 0 first, then 1, then 2 etc. (the order of the second loop doesn't matter).
Edit 2: I guess what I want is a way to get the indices [(0,0), (0,1), (1,0), (2,0), (1,1), (2,0), ...] efficiently. I don't think I can apply a vectorized function because I am populating an np.zeros((len_x, len_y)) array, and going back to the first edit, the order matters.
You could use tril/triu_indices. Since the order of the (former) inner loop doesn't matter dimensions can be swapped as needed, I'll assume L>=S:
L,S = 4,3
a0,a1 = np.tril_indices(L,0,S)
b0,b1 = np.triu_indices(S,1)
C0 = np.concatenate([a0-a1,b0+L-b1])
C1 = np.concatenate([a1,b1])
*zip(C0,C1),
# ((0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2), (3, 0), (2, 1), (1, 2), (3, 1), (2, 2), (3, 2))
I think itertools.product() will be of use here
import itertools as it
x,y = 2,3
a=list(it.product(range(x),range(y))
which gives a as
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
If you need them in order then,
b=np.argsort(np.sum(a,1))
np.array(a)[b]
which gives,
array([[0, 0],
[0, 1],
[1, 0],
[0, 2],
[1, 1],
[1, 2]])
Hope that helps!

How to generate a graph presenting all the possibilities using Python

I need to write a python code that allows me to generate a tree of possibilities that depend on each other. In fact, if we have two vectors: a=[0, 1] and b=[0, 1], we can construct 4 different possibilities:
(0, 0)
(0, 1)
(1, 0)
(1, 1)
If we will take (0,0) as the parent node, we can generate 3 edges from (0, 0) to all other possibilities: (0, 0) -> (0, 1), (1, 0), (1, 1).
Then for each possibility we can generate 3 edges to the other possibilities, e.g:
(0, 1) -> (0, 0), (1, 0), (1, 1)
(1, 0) -> (0, 0), (1, 1), (0, 1)
(1, 1) -> (0, 0), (1, 0), (0, 1)
I need to repeat that N times. The result should be a tree, where every non-leaf node has 3 successors - for every possibility except the current.
The correct naming of your graph is complete graph. The good graph processing libraries for Python - networkx - has a special function to generate this type of graphs:
complete_graph
Edit 1: I constructed the workflow for you that solves your problem. You can copy-paste it into your Jupyter notebook, but note that you need:
networkx
graphviz
pydot
to be installed.
import networkx as nx
# Set main parameters
items = {(0, 0), (0, 1), (1, 0), (1, 1)}
root = (0, 1)
N = 4
# Calculate the number of nodes for our tree
node_count = sum((len(items)-1)**i for i in range(N))
# Construct full r-rary tree
G = nx.full_rary_tree(len(items)-1, node_count, create_using=nx.DiGraph)
# Create LG-topologically sorted array of nodes
# NOTE THAT NODES' IDs AREN'T EQUAL TO YOUR ITEMS
lgts = list(nx.lexicographical_topological_sort(G))
# Get the first element to preset its label
first = lgts[0]
# Preset an empty label for all nodes
nx.set_node_attributes(G, '', 'label')
# Set the label for the root
G.nodes[first]['label'] = root
# For all nodes:
for node in lgts:
# Get needed names
s_labels = list(items - {G.nodes[node]['label']})
# For all childs:
for s_node in G.successors(node):
# Set the child's label
G.nodes[s_node]['label'] = s_labels.pop()
# Create dict for drawing labels
labels = {n: G.nodes[n]['label'] for n in G.nodes}
# And draw the final graph
nx.draw(
G,
pos=nx.nx_pydot.graphviz_layout(G, prog='dot'),
with_labels=True,
labels=labels
)
Finally you will get this graph:

Graph reduction

I have been working on an piece of code to reduce a graph. The problem is that there are some branches that I want to remove. Once I remove a branch I can merge the nodes or not, depending on the number of paths between the nodes the branch joined.
Maybe the following example illustrates what I want:
The code I have is the following:
from networkx import DiGraph, all_simple_paths, draw
from matplotlib import pyplot as plt
# data preparation
branches = [(2,1), (3,2), (4,3), (4,13), (7,6), (6,5), (5,4),
(8,7), (9,8), (9,10), (10,11), (11,12), (12,1), (13,9)]
branches_to_remove_idx = [11, 10, 9, 8, 6, 5, 3, 2, 0]
ft_dict = dict()
graph = DiGraph()
for i, br in enumerate(branches):
graph.add_edge(br[0], br[1])
ft_dict[i] = (br[0], br[1])
# Processing -----------------------------------------------------
for idx in branches_to_remove_idx:
# get the nodes that define the edge to remove
f, t = ft_dict[idx]
# get the number of paths from 'f' to 't'
n_paths = len(list(all_simple_paths(graph, f, t)))
if n_paths == 1:
# remove branch and merge the nodes 'f' and 't'
#
# This is what I have no clue how to do
#
pass
else:
# remove the branch and that's it
graph.remove_edge(f, t)
print('Simple removal of', f, t)
# -----------------------------------------------------------------
draw(graph, with_labels=True)
plt.show()
I feel that there should be a simpler direct way to obtain the last figure from the first, given the branch indices, but I have no clue.
I think this is more or less what you want. I am merging all nodes that are in chains (connected nodes of degree 2) into one hypernode. I return the the new graph and a dictionary mapping the hypernode to the contracted nodes.
import networkx as nx
def contract(g):
"""
Contract chains of neighbouring vertices with degree 2 into one hypernode.
Arguments:
----------
g -- networkx.Graph instance
Returns:
--------
h -- networkx.Graph instance
the contracted graph
hypernode_to_nodes -- dict: int hypernode -> [v1, v2, ..., vn]
dictionary mapping hypernodes to nodes
"""
# create subgraph of all nodes with degree 2
is_chain = [node for node, degree in g.degree_iter() if degree == 2]
chains = g.subgraph(is_chain)
# contract connected components (which should be chains of variable length) into single node
components = list(nx.components.connected_component_subgraphs(chains))
hypernode = max(g.nodes()) +1
hypernodes = []
hyperedges = []
hypernode_to_nodes = dict()
false_alarms = []
for component in components:
if component.number_of_nodes() > 1:
hypernodes.append(hypernode)
vs = [node for node in component.nodes()]
hypernode_to_nodes[hypernode] = vs
# create new edges from the neighbours of the chain ends to the hypernode
component_edges = [e for e in component.edges()]
for v, w in [e for e in g.edges(vs) if not ((e in component_edges) or (e[::-1] in component_edges))]:
if v in component:
hyperedges.append([hypernode, w])
else:
hyperedges.append([v, hypernode])
hypernode += 1
else: # nothing to collapse as there is only a single node in component:
false_alarms.extend([node for node in component.nodes()])
# initialise new graph with all other nodes
not_chain = [node for node in g.nodes() if not node in is_chain]
h = g.subgraph(not_chain + false_alarms)
h.add_nodes_from(hypernodes)
h.add_edges_from(hyperedges)
return h, hypernode_to_nodes
edges = [(2, 1),
(3, 2),
(4, 3),
(4, 13),
(7, 6),
(6, 5),
(5, 4),
(8, 7),
(9, 8),
(9, 10),
(10, 11),
(11, 12),
(12, 1),
(13, 9)]
g = nx.Graph(edges)
h, hypernode_to_nodes = contract(g)
print("Edges in contracted graph:")
print(h.edges())
print('')
print("Hypernodes:")
for hypernode, nodes in hypernode_to_nodes.items():
print("{} : {}".format(hypernode, nodes))
This returns for your example:
Edges in contracted graph:
[(9, 13), (9, 14), (9, 15), (4, 13), (4, 14), (4, 15)]
Hypernodes:
14 : [1, 2, 3, 10, 11, 12]
15 : [8, 5, 6, 7]
I built this function that scales much better and runs faster with larger graphs:
def add_dicts(vector):
l = list(map(lambda x: Counter(x),vector))
return reduce(lambda x,y:x+y,l)
def consolidate_dup_edges(g):
edges = pd.DataFrame(g.edges(data=True),columns=['start','end','weight'])
edges_consolidated = edges.groupby(['start','end']).agg({'weight':add_dicts}).reset_index()
return nx.from_edgelist(list(edges_consolidated.itertuples(index=False,name=None)))
def graph_reduce(g):
g = consolidate_dup_edges(g)
is_deg2 = [node for node, degree in g.degree() if degree == 2]
is_deg2_descendents =list(map(lambda x: tuple(nx.descendants_at_distance(g,x,1)),is_deg2))
edges_on_deg2= list(map(lambda x: list(map(lambda x:x[2],g.edges(x,data=True))),is_deg2))
edges_on_deg2= list(map(lambda x: add_dicts(x),edges_on_deg2))
new_edges = list(zip(is_deg2_descendents,edges_on_deg2))
new_edges = [(a,b,c) for (a,b),c in new_edges]
g.remove_nodes_from(is_deg2)
g.add_edges_from(new_edges)
g.remove_edges_from(nx.selfloop_edges(g))
g.remove_nodes_from([node for node, degree in g.degree() if degree <= 1])
return consolidate_dup_edges(g)
The graph_reduce function basically removes nodes with degree 1 and removes intermediate nodes with degree 2 and connects the nodes that the degree 2 node was connected to. We can see the best impact when we run this code iteratively until the number of nodes plateaus to a stable number. This only works on undirected graphs.

Get degree of each nodes in a graph by Networkx in python

Suppose I have a data set like below that shows an undirected graph:
1 2
1 3
1 4
3 5
3 6
7 8
8 9
10 11
I have a python script like it:
for s in ActorGraph.degree():
print(s)
that is a dictionary consist of key and value that keys are node names and values are degree of nodes:
('9', 1)
('5', 1)
('11', 1)
('8', 2)
('6', 1)
('4', 1)
('10', 1)
('7', 1)
('2', 1)
('3', 3)
('1', 3)
In networkx documentation suggest to use values() for having nodes degree.
now I like to have just keys that are degree of nodes and I use this part of script but it does't work and say object has no attribute 'values':
for s in ActorGraph.degree():
print(s.values())
how can I do it?
You are using version 2.0 of networkx. Which changed from using a dict for G.degree() to using a dict-like (but not dict) DegreeView. See this guide.
To have the degrees in a list you can use a list-comprehension:
degrees = [val for (node, val) in G.degree()]
I'd like to add the following: if you're initializing the undirected graph with nx.Graph() and adding the edges afterwards, just beware that networkx doesn't guarrantee the order of nodes will be preserved -- this also applies to degree(). This means that if you use the list comprehension approach then try to access the degree by list index the indexes may not correspond to the right nodes. If you'd like them to correspond, you can instead do:
degrees = [val for (node, val) in sorted(G.degree(), key=lambda pair: pair[0])]
Here's a simple example to illustrate this:
>>> edges = [(0, 1), (0, 3), (0, 5), (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (2, 5)]
>>> g = nx.Graph()
>>> g.add_edges_from(edges)
>>> print(g.degree())
[(0, 3), (1, 4), (3, 3), (5, 2), (2, 4), (4, 2)]
>>> print([val for (node, val) in g.degree()])
[3, 4, 3, 2, 4, 2]
>>> print([val for (node, val) in sorted(g.degree(), key=lambda pair: pair[0])])
[3, 4, 4, 3, 2, 2]
You can also use a dict comprehension to get an actual dictionary:
degrees = {node:val for (node, val) in G.degree()}

Resources