Converting Matplotlib and Networkx/Pygraphviz units to a common unit - python-3.x

I am attempting to autoscale a networkx graph based on the number of nodes and on whether any nodes overlap in the horizontal or vertical direction (i.e. if all nodes were on the same line, they would intersect). To do this, I look at the position of each node and check whether the position plus the node size falls between any other node's coordinates plus the node size. To get the node positions, I am using networkx.nx_agraph.graphviz_layout. The issue is that the positions and the node size are not in the same unit.
Networkx calls pyplot.scatter under the hood (source: https://networkx.github.io/documentation/networkx-1.10/_modules/networkx/drawing/nx_pylab.html#draw_networkx_nodes), which, as I understand it, takes the node size as an area in pixels (source: https://stackoverflow.com/questions/14827650/pyplot-scatter-plot-marker-size).
Networkx draws the graph with circle nodes, so based on this it would make sense that to convert the node size to inches, I would do node_size * 4 / (math.pi * DPI) where DPI is the DPI used by Matplotlib. In my code this does not seem to work. I have also tried taking the square root of the node size and dividing it by the DPI but that doesn't seem to work either.
Currently the issue is that I am not able to detect which nodes are overlapping.
In summary: If I am plotting a networkx graph with circle nodes, how can I convert the node size and/or the Pygraphviz positions to a common unit?
It could also be that my code is wrong, so here is a short reproducible example:
import math

import matplotlib as mpl
import networkx as nx

NODE_SIZE = 300
DPI = 100
mpl.rcParams['figure.dpi'] = DPI

G = nx.gnp_random_graph(50, 0)
pos = nx.nx_agraph.graphviz_layout(G)
# Sort by x value so you only have to check to the right of each node
tmp_pos = sorted(pos.items(), key=lambda x: x[1][0])
# This is my attempt at getting the height/width of the nodes
node_size_inches = NODE_SIZE * 4 / (math.pi * DPI)

for i, (key, xy) in enumerate(tmp_pos):
    x = xy[0]
    overlapping = []
    # Since the list is sorted by x, you only need to check nodes to the right
    for k, xyt in tmp_pos[i + 1:]:
        xt = xyt[0]
        # Check for overlapping nodes to the right
        if x <= xt <= x + node_size_inches:
            overlapping.append(k)
        # Since the list is sorted by x, if the previous condition fails, nothing after can be applicable
        else:
            break
    if overlapping:
        print("{}:{}".format(key, overlapping))

nx.draw(G, pos=pos, with_labels=True)
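One possible way to bring the node size into a physical unit, as a rough sketch rather than a confirmed fix: it assumes (per the Matplotlib documentation for scatter) that the size is an area in points squared, with 1 point = 1/72 inch, and that the default circular marker is roughly sqrt(size) points across. The helper names are hypothetical.

```python
import math

POINTS_PER_INCH = 72.0  # typographic points per inch


def node_diameter_inches(node_size):
    """Approximate marker diameter in inches for a size given in points**2."""
    return math.sqrt(node_size) / POINTS_PER_INCH


def node_diameter_pixels(node_size, dpi):
    """Approximate marker diameter in display pixels at the given figure DPI."""
    return node_diameter_inches(node_size) * dpi


# Values taken from the example above: NODE_SIZE = 300, DPI = 100
diameter_px = node_diameter_pixels(300, 100)
print(round(diameter_px, 2))
```

The layout positions are in data coordinates, which Matplotlib rescales to fit the axes, so they would still need converting to pixels before comparing, for instance with ax.transData.transform once the axes limits are final.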

Related

How to keep graph shape when read it by networkx

I have a file that lists the coordinates of different points (first 10 rows):
1 10.381090522139 55.39134945301
2 10.37928179195319 55.38858713256631
3 10.387152479898077 55.3923338690609
4 10.380048819655258 55.393938880906745
5 10.380679138517507 55.39459444742785
6 10.382474625286 55.392132993022
7 10.383736185130601 55.39454404088371
8 10.387334283235987 55.39433237195271
9 10.388468103023115 55.39536574771765
10 10.390814951258335 55.396308397998475
Now I want to calculate the MST (minimum spanning tree) of these points, so first I convert the coordinates to a weighted graph (distance -> weight):
n = 10
data = []
for i in range(0, n):
    for j in range(i + 1, n):
        temp = []
        temp.append(i)
        temp.append(j)
        x = np.array(rawdata[i, 1:3])
        y = np.array(rawdata[j, 1:3])
        temp.append(np.linalg.norm(x - y))
        data.append(temp)
Then I use networkx to load the weighted data:
G = nx.read_weighted_edgelist("data.txt")
T = nx.minimum_spanning_tree(G)
nx.draw(T)
plt.show()
But I cannot see the original shape in the result:
How can I solve this problem?
I'm just answering the question about the position of the nodes. I can't tell from what you've done whether the minimum spanning tree is what you're after or not.
When you plot a network, it will assign positions based on an algorithm that is in part stochastic. If you want the nodes to go at particular positions, you will have to include that information in the call in an optional argument. So define a dictionary (it's usually called pos) such that pos[node] is a tuple (x,y) where x is the x-coordinate of node and y is the y-coordinate of node.
Then the call is nx.draw(T, pos=pos).
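As a minimal sketch of building such a dictionary from the coordinate rows in the question: build_pos is a hypothetical helper, and the default key=str assumes the graph was read with nx.read_weighted_edgelist, which produces string node labels unless a node type is given.

```python
def build_pos(rows, key=str):
    """Map each node label to its (x, y) coordinate.

    rows: iterable of (node_id, x, y) records, as in the coordinate file.
    key:  converts node_id to the label type used by the graph
          (assumption: string labels, as produced when reading an edge list).
    """
    return {key(node_id): (float(x), float(y)) for node_id, x, y in rows}


# First three rows of the coordinate file from the question
rows = [
    (1, 10.381090522139, 55.39134945301),
    (2, 10.37928179195319, 55.38858713256631),
    (3, 10.387152479898077, 55.3923338690609),
]
pos = build_pos(rows)
print(pos["1"])
```

The keys have to match the node labels exactly. The edge list in the question uses 0-based indices while the file ids start at 1, so one of the two would need shifting before pos is passed to nx.draw.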

Is there any way to scale the markers of a pyplot.scatter graph with respect to its axes length?

I am trying to plot a bunch of particles in a box. I want the particle sizes to be scaled according to the axes length. If the particles have a radius of 1 and the box has a length of 100, how do I draw this using matplotlib.pyplot.scatter()?
Attempt:
I have initialised the positions of the particles such that none of them overlap.
When I try to plot these using pyplot.scatter(), I find that the particle sizes (radii) are not 0.01 times the box dimensions.
How do I do this?
I have attached a picture here.
If I change the box length from 100 to 1000, I expect the marker radius to decrease by a factor of 10.
I'm using:
from matplotlib import pyplot
pyplot.scatter(particles_np[:, 0], particles_np[:, 1])
pyplot.show()
particles_np is a numpy array which has the x and y positions of the particles.
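One way to think about the scaling, sketched under several assumptions: the scatter size argument s is an area in points squared (1 point = 1/72 inch), the axes have equal aspect with fixed limits, and the physical width of the axes is known. The function and parameter names are hypothetical.

```python
def scatter_size_for_radius(radius, axis_length, axes_width_inches):
    """Return an approximate scatter size (points**2) for a radius in data units.

    radius:            particle radius in data units
    axis_length:       extent of the x axis in data units (e.g. the box length)
    axes_width_inches: physical width of the axes area in inches (assumption)
    """
    points_per_unit = axes_width_inches * 72.0 / axis_length
    diameter_points = 2.0 * radius * points_per_unit
    return diameter_points ** 2


s_100 = scatter_size_for_radius(1.0, 100.0, 5.0)
s_1000 = scatter_size_for_radius(1.0, 1000.0, 5.0)
# The size is an area, so a 10x longer box gives a 100x smaller s (10x smaller radius).
print(s_100 / s_1000)
```

The result could then be passed as s= to pyplot.scatter. Marker edge width and later resizing of the figure are not accounted for, so this is only an approximation.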

Boundary enclosing a given set of points

I am having a bit of a problem with an algorithm that I am currently using. I wanted it to make a boundary.
Here is an example of the current behavior:
Here is an MSPaint example of wanted behavior:
Current code of Convex Hull in C#: https://hastebin.com/dudejesuja.cs
So here are my questions:
1) Is this even possible?
R: Yes
2) Is this even called Convex Hull? (I don't think so)
R: Nope it is called boundary, link: https://www.mathworks.com/help/matlab/ref/boundary.html
3) Will this be less performance friendly than a conventional convex hull?
R: As far as I have researched, the performance should be about the same
4) Example of this algorithm in pseudo code or something similar?
R: Not answered yet or I didn't find a solution yet
Here is some Python code that computes the alpha shape (concave hull) and keeps only the outer boundary. This is probably what MATLAB's boundary does internally.
from scipy.spatial import Delaunay
import numpy as np

def alpha_shape(points, alpha, only_outer=True):
    """
    Compute the alpha shape (concave hull) of a set of points.
    :param points: np.array of shape (n,2) points.
    :param alpha: alpha value.
    :param only_outer: boolean value to specify if we keep only the outer border
    or also inner edges.
    :return: set of (i,j) pairs representing edges of the alpha-shape. (i,j) are
    the indices in the points array.
    """
    assert points.shape[0] > 3, "Need at least four points"

    def add_edge(edges, i, j):
        """
        Add an edge between the i-th and j-th points,
        if not in the list already
        """
        if (i, j) in edges or (j, i) in edges:
            # already added
            assert (j, i) in edges, "Can't go twice over same directed edge right?"
            if only_outer:
                # if both neighboring triangles are in shape, it's not a boundary edge
                edges.remove((j, i))
            return
        edges.add((i, j))

    tri = Delaunay(points)
    edges = set()
    # Loop over triangles:
    # ia, ib, ic = indices of corner points of the triangle
    # (tri.simplices replaces the older tri.vertices attribute)
    for ia, ib, ic in tri.simplices:
        pa = points[ia]
        pb = points[ib]
        pc = points[ic]
        # Computing radius of triangle circumcircle
        # www.mathalino.com/reviewer/derivation-of-formulas/derivation-of-formula-for-radius-of-circumcircle
        a = np.sqrt((pa[0] - pb[0]) ** 2 + (pa[1] - pb[1]) ** 2)
        b = np.sqrt((pb[0] - pc[0]) ** 2 + (pb[1] - pc[1]) ** 2)
        c = np.sqrt((pc[0] - pa[0]) ** 2 + (pc[1] - pa[1]) ** 2)
        s = (a + b + c) / 2.0
        area = np.sqrt(s * (s - a) * (s - b) * (s - c))
        circum_r = a * b * c / (4.0 * area)
        if circum_r < alpha:
            add_edge(edges, ia, ib)
            add_edge(edges, ib, ic)
            add_edge(edges, ic, ia)
    return edges
If you run it with the following test code you will get this figure, which looks like what you need:
from matplotlib.pyplot import *

# Constructing the input point data
np.random.seed(0)
x = 3.0 * np.random.rand(2000)
y = 2.0 * np.random.rand(2000) - 1.0
inside = ((x ** 2 + y ** 2 > 1.0) & ((x - 3) ** 2 + y ** 2 > 1.0))
points = np.vstack([x[inside], y[inside]]).T

# Computing the alpha shape
edges = alpha_shape(points, alpha=0.25, only_outer=True)

# Plotting the output
figure()
axis('equal')
plot(points[:, 0], points[:, 1], '.')
for i, j in edges:
    plot(points[[i, j], 0], points[[i, j], 1])
show()
EDIT: Following a request in a comment, here is some code that "stitches" the output edge set into sequences of consecutive edges.
def find_edges_with(i, edge_set):
    i_first = [j for (x, j) in edge_set if x == i]
    i_second = [j for (j, x) in edge_set if x == i]
    return i_first, i_second

def stitch_boundaries(edges):
    edge_set = edges.copy()
    boundary_lst = []
    while len(edge_set) > 0:
        boundary = []
        edge0 = edge_set.pop()
        boundary.append(edge0)
        last_edge = edge0
        while len(edge_set) > 0:
            i, j = last_edge
            j_first, j_second = find_edges_with(j, edge_set)
            if j_first:
                edge_set.remove((j, j_first[0]))
                edge_with_j = (j, j_first[0])
                boundary.append(edge_with_j)
                last_edge = edge_with_j
            elif j_second:
                edge_set.remove((j_second[0], j))
                edge_with_j = (j, j_second[0])  # flip edge rep
                boundary.append(edge_with_j)
                last_edge = edge_with_j

            if edge0[0] == last_edge[1]:
                break

        boundary_lst.append(boundary)
    return boundary_lst
You can then go over the list of boundary lists and append the points corresponding to the first index in each edge to get a boundary polygon.
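A minimal sketch of that last step, assuming each boundary is a list of (i, j) index pairs in traversal order as returned by stitch_boundaries; boundary_to_polygon is a hypothetical helper name and the sample input is made up for illustration.

```python
def boundary_to_polygon(boundary):
    """Return the vertex indices of one stitched boundary, in traversal order."""
    return [i for i, _ in boundary]


# Hypothetical stitched output: one closed loop over points 0, 1, 2, 3
boundary_lst = [[(0, 1), (1, 2), (2, 3), (3, 0)]]
polygons = [boundary_to_polygon(b) for b in boundary_lst]
print(polygons)  # [[0, 1, 2, 3]]
```

Indexing the points array with one of these lists, e.g. points[polygons[0]], would then give the polygon's coordinates.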
I would use a different approach to solve this problem. Since we are working with a 2-D set of points, it is straightforward to compute the bounding rectangle of the points’ region. Then I would divide this rectangle into “cells” with horizontal and vertical lines, and for each cell simply count the number of points located within its bounds. Since each cell has at most 4 adjacent cells (adjacent by cell sides), the boundary cells are the occupied cells that have at least one empty adjacent cell or have a side lying on the bounding rectangle. The boundary is then constructed along the sides of the boundary cells. It will look like a “staircase”, but choosing a smaller cell size improves the result. In practice, the cell size should be determined experimentally; it cannot be too small, otherwise empty cells may appear inside the region. The average distance between the points can be used as a lower bound for the cell size.
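The cell-based idea can be sketched roughly as follows. This is only an illustration under assumptions: points is a list of (x, y) tuples, the grid starts at the lower-left corner of the bounding rectangle, and boundary_cells is a hypothetical name. It finds the boundary cells but does not trace the staircase outline itself.

```python
def boundary_cells(points, cell_size):
    """Return the occupied grid cells that touch an empty cell or the grid edge."""
    min_x = min(x for x, _ in points)
    min_y = min(y for _, y in points)
    # Cells that contain at least one point, as (column, row) indices
    occupied = {
        (int((x - min_x) // cell_size), int((y - min_y) // cell_size))
        for x, y in points
    }
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    # Cells outside the bounding rectangle are never in `occupied`, so cells on
    # the rectangle's edge are reported as boundary cells automatically.
    return {
        (cx, cy)
        for cx, cy in occupied
        if any((cx + dx, cy + dy) not in occupied for dx, dy in neighbours)
    }


# A filled 5x5 block of points: only the outer ring of cells is boundary.
points = [(x + 0.5, y + 0.5) for x in range(5) for y in range(5)]
cells = boundary_cells(points, cell_size=1.0)
print(len(cells))  # 16
```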
Consider using an Alpha Shape, sometimes called a Concave Hull. https://en.wikipedia.org/wiki/Alpha_shape
It can be built from the Delaunay triangulation, in time O(N log N).
As most of the previous answers point out, this is probably not a convex hull but a concave hull, in other words an alpha shape. Iddo provides clean Python code to compute this shape. However, you can also use existing packages directly, which may be faster and use less memory if you are working with large point clouds.
[1] Alpha Shape Toolbox: a toolbox for generating n-dimensional alpha shapes.
https://plotly.com/python/v3/alpha-shapes/
[2] Plotly: It can generate a Mesh3d object that, depending on a key value, can be the convex hull of the point set, its Delaunay triangulation, or an alpha shape.
https://plotly.com/python/v3/alpha-shapes/
Here is the JavaScript code that builds concave hull: https://github.com/AndriiHeonia/hull Probably you can port it to C#.
One idea is to create a mesh of triangles from the point cloud, perhaps through Delaunay triangulation,
fill those triangles with a color, and then run level set or active contour segmentation, which will find the outer boundary of the shape whose color is now different from the outside "background" color.
https://xphilipp.developpez.com/contribuez/SnakeAnimation.gif
The animation above did not go all the way but many such algorithms can be configured to do that.
Note: The triangulation algorithm has to be tuned so that it doesn't merely create a convex hull, for example by removing triangles with overly large angles and sides from the Delaunay result. A preliminary version could look like this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Delaunay
points = np.array([[13.43, 12.89], [14.44, 13.86], [13.67, 15.87], [13.39, 14.95],\
[12.66, 13.86], [10.93, 14.24], [11.69, 15.16], [13.06, 16.24], [11.29, 16.35],\
[10.28, 17.33], [10.12, 15.49], [9.03, 13.76], [10.12, 14.08], [9.07, 15.87], \
[9.6, 16.68], [7.18, 16.19], [7.62, 14.95], [8.39, 16.79], [8.59, 14.51], \
[8.1, 13.43], [6.57, 11.59], [7.66, 11.97], [6.94, 13.86], [6.53, 14.84], \
[5.48, 12.84], [6.57, 12.56], [5.6, 11.27], [6.29, 10.08], [7.46, 10.45], \
[7.78, 7.21], [7.34, 8.72], [6.53, 8.29], [5.85, 8.83], [5.56, 10.24], [5.32, 7.8], \
[5.08, 9.86], [6.01, 5.75], [6.41, 7.48], [8.19, 5.69], [8.23, 4.72], [6.85, 6.34], \
[7.02, 4.07], [9.4, 3.2], [9.31, 4.99], [7.86, 3.15], [10.73, 2.82], [10.32, 4.88], \
[9.72, 1.58], [11.85, 5.15], [12.46, 3.47], [12.18, 1.58], [11.49, 3.69], \
[13.1, 4.99], [13.63, 2.61]])
tri = Delaunay(points,furthest_site=False)
res = []
for t in tri.simplices:
    A, B, C = points[t[0]], points[t[1]], points[t[2]]
    # Angle at vertex A
    e1 = B - A; e2 = C - A
    num = np.dot(e1, e2)
    n1 = np.linalg.norm(e1); n2 = np.linalg.norm(e2)
    denom = n1 * n2
    d1 = np.rad2deg(np.arccos(num / denom))
    # Angle at vertex B
    e1 = C - B; e2 = A - B
    num = np.dot(e1, e2)
    denom = np.linalg.norm(e1) * np.linalg.norm(e2)
    d2 = np.rad2deg(np.arccos(num / denom))
    d3 = 180 - d1 - d2
    res.append([n1, n2, d1, d2, d3])
res = np.array(res)
m = res[:, [0, 1]].mean() * res[:, [0, 1]].std()
# Note: np.any without an axis argument returns a single boolean for the whole array
mask = np.any(res[:, [2, 3, 4]] > 110) & (res[:, 0] < m) & (res[:, 1] < m)
plt.triplot(points[:, 0], points[:, 1], tri.simplices[mask])
Then fill with color and segment.

igraph half circle layout in R

I am trying various visualizations for an igraph graph in R (version 3.3.1).
Currently my visualization is as shown below: two groups of nodes (blue and green) in a circular layout.
Circular Layout
visNetwork(data$nodes,data$edges) %>% visIgraphLayout(layout="layout_in_circle")
Now I want a semicircle structure instead of the full circle shown in the picture: all blue nodes form one semicircle and the green nodes form another, with the two semicircles separated by a small distance. How can I achieve this? I found that the grid package has an option for semicircles, but I couldn't make it work with igraph. Please provide some pointers.
The layout argument accepts an arbitrary matrix with two columns and N rows if your graph has N vertices; all you need to do is to create a list of coordinates that correspond to a semicircle. You can make use of the fact that a vertex at angle alpha around a circle with radius r centered at (0, 0) is to be found at (r * cos(alpha), r * sin(alpha)). Since you are using R, alpha should be specified in radians, spaced evenly between 0 and pi (which corresponds to 180 degrees).
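The coordinate computation can be sketched as follows. The sketch is in Python rather than R, purely to illustrate the arithmetic; the function name, the group sizes and the gap value are all hypothetical. Each (x, y) pair corresponds to one row of the two-column layout matrix.

```python
import math


def semicircle(n, radius, start_angle, centre=(0.0, 0.0)):
    """Return n (x, y) points evenly spaced over half a circle."""
    cx, cy = centre
    coords = []
    for k in range(n):
        # Spread angles over [start_angle, start_angle + pi];
        # a single vertex is placed in the middle of the arc.
        frac = k / (n - 1) if n > 1 else 0.5
        alpha = start_angle + math.pi * frac
        coords.append((cx + radius * math.cos(alpha), cy + radius * math.sin(alpha)))
    return coords


gap = 0.5  # hypothetical separation between the two halves
blue = semicircle(5, 1.0, 0.0, centre=(0.0, gap / 2))        # upper half
green = semicircle(4, 1.0, math.pi, centre=(0.0, -gap / 2))  # lower half
layout = blue + green  # one row per vertex, in vertex order
```

The row order has to match the vertex order of the graph, so the vertices would need to be grouped by colour (or the rows reordered) before the matrix is used as a layout.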

How to set data values on a vtkStructuredGrid

I'm trying to fill in a structured grid with an analytical field, but despite reading the vtk docs, I haven't found out how to actually set scalar values at the grid points or set the spacing/origin info of the grid. Starting from the code below, how do I
1) associate spatial information with the grid (i.e. cell 0,0,0 is at coordinates 0,0,0 and the spacing is dx in every direction)
2) associate scalar values with each grid point. To start, I just need one, but eventually I'd like to store 3 pieces of data at each point (not a vector, 3 distinct scalars).
grid = vtk.vtkStructuredGrid()
numPoints = int((maxGrid - minGrid)/dx)
grid.SetDimensions(numPoints, numPoints, numPoints)
In VTK there are 3 types of "structured" grids, vtkImageData (vtkUniformGrid derives from this), vtkRectilinearGrid, and vtkStructuredGrid. They are all structured in the sense that the topology is set. vtkImageData has constant spacing between points and is axis aligned, vtkRectilinearGrid is axis aligned but can vary the spacing in each axis direction, and vtkStructuredGrid has arbitrarily located points (cells may not be valid though).
For what you want to do you should do:
from vtk import *
dx = 2.0
grid = vtkImageData()
grid.SetOrigin(0, 0, 0) # default values
grid.SetSpacing(dx, dx, dx)
grid.SetDimensions(5, 8, 10) # number of points in each direction
# print grid.GetNumberOfPoints()
# print grid.GetNumberOfCells()
array = vtkDoubleArray()
array.SetNumberOfComponents(1) # this is 3 for a vector
array.SetNumberOfTuples(grid.GetNumberOfPoints())
for i in range(grid.GetNumberOfPoints()):
    array.SetValue(i, 1)
grid.GetPointData().AddArray(array)
# print grid.GetPointData().GetNumberOfArrays()
array.SetName("unit array")
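To fill the array with an analytical field instead of a constant, each point id has to be mapped to its coordinates. The sketch below does that arithmetic in plain Python, assuming the usual image-data ordering in which the x index varies fastest, then y, then z. The helper names and the sample field are hypothetical.

```python
def point_coordinates(point_id, dims, origin, spacing):
    """Map a flat point id to (x, y, z), assuming x varies fastest, then y, then z."""
    nx, ny, nz = dims
    i = point_id % nx
    j = (point_id // nx) % ny
    k = point_id // (nx * ny)
    return tuple(o + idx * s for o, idx, s in zip(origin, (i, j, k), spacing))


def analytic_field(x, y, z):
    """Hypothetical scalar field, used only for illustration."""
    return x + 10.0 * y + 100.0 * z


dims = (5, 8, 10)          # same dimensions as in the answer above
origin = (0.0, 0.0, 0.0)
spacing = (2.0, 2.0, 2.0)  # dx = 2.0 in every direction
values = [
    analytic_field(*point_coordinates(pid, dims, origin, spacing))
    for pid in range(dims[0] * dims[1] * dims[2])
]
```

Each entry of values could then be written with array.SetValue(pid, values[pid]) in place of the constant 1. Asking the grid itself for the coordinates via grid.GetPoint(pid) should give the same result and avoids relying on the ordering assumption.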
