Networkx (or Graphviz) layout with fixed y positions

Are there any layout algorithms in networkx (or that I can call in Graphviz) that allow me to fix the Y-position of nodes in a DAG to a potentially different floating point value for each node, but spread out the X positions in some reasonable way (ideally attempting to minimise edge lengths or crossovers, although I suspect this might not be possible)? I can only find layouts that require nodes to be on discrete layers.
Added: Below is an example of the sort of graph topology I have, plotted using nx.kamada_kawai_layout. The thing is that these nodes have a "time" value (not shown here), which I want to plot on the Y axis. The edges are directed in time, so that a parent node (e.g. 54 here) is always older than its children (here 52 and 53). So I want to lay this out with the Y position given by the node "time", and the X position such that crossings are minimised, in as much as that's possible (I know this is NP hard in general, but the layout below is actually doing a pretty good job).
p.s. usually all the leaf nodes, e.g. 2, 3, 7 here, are at time 0, so should be laid out at the bottom of the final layout.
p.p.s. Essentially what I would like to do is to imagine this as a spring diagram, "pick up" the root node (54) in the plot above and place it at the top of the page, with the topology dangling down, then adjust the Y-position of the children to their internal "time" values.
Edit 2. Thanks to @sroush below, I can get a decent layout with the dot graphviz engine:
A = nx.nx_agraph.to_agraph(G)
fig = plt.figure(1, figsize=(10, 10))
A.add_subgraph(ts.samples(), level="same", name="cluster")
A.layout(prog="dot")
pos = {n: [float(x) for x in A.get_node(n).attr["pos"].split(",")] for n in G.nodes()}
nx.draw_networkx(G, pos, with_labels=True)
But I then want to reposition the nodes slightly so instead of ranked times (the numbers) they use their actual, floating point times. Like this:
true_times = nx.get_node_attributes(G, 'time')
reposition = {node_id: np.array([pos[node_id][0], true_times[node_id]]) for node_id in true_times}
nx.draw_networkx(G, reposition, with_labels=True)
As you can see, that squashed the nodes together rather a lot. Is there any way to increase the horizontal positions of those nodes to make them not bump into one-another? I could perhaps cluster some on to the same layer and iterate, but that seems quite expensive.

The Graphviz dot engine can get you pretty close. This is usually described as a "timeline" issue. Here is a graph that is part of the Graphviz source that seems to do what you want: https://www.flickr.com/photos/kentbye/1155560169
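Neither the question nor the answer above shows how to widen the X positions once the real times are used for Y. One possible post-processing step is sketched below. It is not a Graphviz or networkx feature, and the names spread_x, min_dx and y_window are made up for this illustration. It takes any {node: (x, y)} dictionary (for example the "reposition" dictionary from the question), leaves every Y value untouched, and repeatedly pushes apart pairs of nodes that are close in Y and closer than min_dx in X. It ignores edges entirely, so it can introduce crossings.

```python
def spread_x(pos, min_dx=1.0, y_window=1.0, iterations=50):
    """Push nodes apart along x only; y values are never changed.

    pos: {node: (x, y)}. Two nodes are treated as colliding when their
    y values differ by at most y_window and their x values by less than
    min_dx. All parameter names are hypothetical, chosen for this sketch.
    """
    x = {n: float(p[0]) for n, p in pos.items()}
    y = {n: float(p[1]) for n, p in pos.items()}
    nodes = list(pos)
    for _ in range(iterations):
        moved = False
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                if abs(y[a] - y[b]) > y_window:
                    continue  # far apart vertically, cannot bump into each other
                gap = x[b] - x[a]
                if abs(gap) >= min_dx:
                    continue  # already far enough apart horizontally
                # Move both nodes away from each other by half the shortfall.
                shift = (min_dx - abs(gap)) / 2
                direction = 1.0 if gap >= 0 else -1.0
                x[a] -= direction * shift
                x[b] += direction * shift
                moved = True
        if not moved:
            break  # no collisions left
    return {n: (x[n], y[n]) for n in nodes}
```

The result can be passed to nx.draw_networkx in place of the original dictionary. This is a quadratic-time loop, which should be acceptable for a few hundred nodes but is only a starting point.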

Related

networkx: node spacing when plotting multipartite graph

I want to plot a multipartite graph using networkx. However, when adding more nodes, the plot becomes very crowded. Is there a way to have more space between nodes and partitions?
Looking at the documentation of multipartite_layout, I couldn't find parameters for this.
Of course, one could create complicated formulas for the positions, but since the spacing of multipartite_layout already looks so good for small graphs, I was wondering how to scale this to bigger graphs.
Has anyone an idea how to do this (efficiently)?
Sample code, generating a graph with three partitions:
import matplotlib.pyplot as plt
import networkx as nx
# build graph:
G = nx.Graph()
for i in range(0, 30):
    G.add_node(i, layer=0)
for i in range(30, 50):
    G.add_node(i, layer=1)
    for j in range(0, 30):
        G.add_edge(i, j)
G.add_node(100, layer=2)
G.add_edge(40, 100)
# plot graph
pos = nx.multipartite_layout(G, subset_key="layer",)
plt.figure(figsize=(20, 8))
nx.draw(G, pos,with_labels=False)
plt.axis("equal")
plt.show()
The current, crowded plot:
nx.multipartite_layout returns a dictionary with the following format: {node: array([x, y])}
I suggest you try pos = {p: array_op(pos[p]) for p in pos}, where array_op is a function acting on the position of each node, array([x, y]).
In your case, I think a simple scaling along the x-axis suffices, i.e.
array_op = lambda xy, sx=2.0: np.array([xy[0]*sx, xy[1]])
(the two coordinates have to be passed to np.array as a single list, and sx=2.0 is only an example value for the scale factor).
For visualization purposes I guess this should be equivalent to @JPM's comment. However, this approach gives you the advantage of having the actual transformed position data.
In the end, if such uniform transformation does not satisfy your need, you can always manipulate the position manually with the knowledge of the format of the dict (although it might be less efficient).
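A minimal sketch of that idea follows, written as a plain function so that both axes can be stretched independently. The name scale_layout and its parameters are made up for this example. It only relies on the {node: (x, y)} format described above and returns tuples, which the networkx drawing functions also accept as positions.

```python
def scale_layout(pos, sx=1.0, sy=1.0):
    """Return a copy of a layout dict with x multiplied by sx and y by sy.

    pos: {node: (x, y)}, where each position is any 2-element sequence,
    e.g. the arrays returned by nx.multipartite_layout.
    """
    return {node: (float(xy[0]) * sx, float(xy[1]) * sy)
            for node, xy in pos.items()}
```

With the graph from the question this would be used as pos = scale_layout(nx.multipartite_layout(G, subset_key="layer"), sx=3.0). Note that plt.axis("equal") keeps the aspect ratio of the data, so whether the extra room is actually visible also depends on the figure size.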

only writing visible points to disk of an overplotted scatterplot

I am creating matplotlib scatterplots of around 10000 points. At the point size I am using, this results in overplotting, i.e. some of the points will be hidden by the points that are plotted over them.
While I don't mind about the fact that I cannot see the hidden points, they are redundantly written out when I write the figure to disk as pdf (or other vector format), resulting in a large file.
Is there a way to create a vector image where only the visible points would be written to the file? This would be similar to the concept of "flattening" / merging layers in photo editing software. (I still like to retain the image as vector, as I would like to have the ability to zoom in).
Example plot:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
np.random.seed(15)  # seed NumPy's generator, which is the one used below
df = pd.DataFrame({'x': np.random.normal(10, 1.2, 10000),
'y': np.random.normal(10, 1.2, 10000),
'color' : np.random.normal(10, 1.2, 10000)})
df.plot(kind = "scatter", x = "x", y = "y", c = "color", s = 80, cmap = "RdBu_r")
plt.show()
tl;dr
I don't know of any simple solution such as
RemoveOccludedCircles(C)
The algorithm below requires some implementation, but it shouldn't be too bad.
Problem reformulation
While we could try to remove existing circles when adding new ones, I find it easier to think about the problem the other way round, processing all circles in reverse order and pretending to draw each new circle behind the existing ones.
The main problem then becomes: How can I efficiently determine whether one circle would be completely hidden by another set of circles?
Conditions
In the following, I will describe an algorithm for the case where the circles are sorted by size, such that larger circles are placed behind smaller circles. This includes the special case where all circles have same size. An extension to the general case would actually be significantly more complicated as one would have to maintain a triangulation of the intersection points. In addition, I will make the assumption that no two circles have the exact same properties (radius and position). These identical circles could easily be filtered.
Datastructures
C: A set of visible circles
P: A set of control points
Control points will be placed in such a way that no newly placed circle can become visible unless either its center lies outside the existing circles or at least one control point falls inside the new circle.
Problem visualisation
In order to better understand the role of control points, their maintenance and the algorithm, have a look at the following drawing:
Processing 6 circles
In the linked image, active control points are painted in red. Control points that are removed after each step are painted in green or blue, where blue points were created by computing intersections between circles.
In image g), the green area highlights the region in which the center of a circle of same size could be placed such that the corresponding circle would be occluded by the existing circles. This area was derived by placing circles on each control point and subtracting the resulting area from the area covered by all visible circles.
Control point maintenance
Whenever adding one circle to the canvas, we add four active points, which are placed on the border of the circle in an equidistant way. Why four? Because no circle of same or bigger size can be placed with its center inside the current circle without containing one of the four control points.
After placing one circle, the following assumption holds: A new circle is completely hidden by existing circles if
Its center falls into a visible circle.
No control point lies strictly inside the new circle.
In order to maintain this assumption while adding new circles, the set of control points needs to be updated after each addition of a visible circle:
Add 4 new control points for the new circle, as described before.
Add new control points at each intersection of the new circle with existing visible circles.
Remove all control points that lie strictly inside any visible circle.
This rule will maintain control points at the outer border of the visible circles in such a dense way that no new visible circle intersecting the existing circles can be placed without 'eating' at least one control point.
Pseudo-Code
AllCircles <- All circles, sorted from front to back
C <- {} // the set of visible circles
P <- {} // the set of control points
for X in AllCircles {
    if (Inside(center(X), C) AND Outside(P, X)) {
        // ignore circle, it is occluded!
    } else {
        C <- C + X
        P <- P + CreateFourControlPoints(X)
        P <- P + AllCuttingPoints(X, C)
        RemoveHiddenControlPoints(P, C)
    }
}
DrawCirclesInReverseOrder(C)
The functions 'Inside' and 'Outside' are a bit abstract here: 'Inside' returns true if a point is contained in one or more circles from a set of circles, and 'Outside' returns true if all points from a set of points lie outside of a circle. But none of the functions used should be hard to write out.
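To make the pseudo-code more concrete, here is a rough Python sketch for the special case of equal-sized circles. It is a brute-force illustration, not the efficient version described in this answer: it uses plain floats with a small tolerance EPS (an assumption of this sketch, where the answer recommends exact arithmetic), and it scans all circles and control points linearly instead of using a grid. All function names are made up for the example.

```python
import math

EPS = 1e-9  # float tolerance standing in for exact arithmetic (assumption)

def strictly_inside(point, center, r):
    """True if point lies inside the circle and clearly off its border."""
    return math.dist(point, center) < r - EPS

def four_control_points(center, r):
    """Four equidistant points on the border of the circle."""
    cx, cy = center
    return [(cx + r, cy), (cx, cy + r), (cx - r, cy), (cx, cy - r)]

def cutting_points(c1, c2, r):
    """Intersection points of two circles of equal radius r."""
    d = math.dist(c1, c2)
    if d == 0 or d > 2 * r:
        return []  # identical centers or too far apart to intersect
    mx, my = (c1[0] + c2[0]) / 2, (c1[1] + c2[1]) / 2
    h = math.sqrt(max(r * r - (d / 2) ** 2, 0.0))  # half chord length
    ux, uy = (c2[0] - c1[0]) / d, (c2[1] - c1[1]) / d
    return [(mx - uy * h, my + ux * h), (mx + uy * h, my - ux * h)]

def visible_circles(centers_front_to_back, r):
    """Return the centers of the circles that are not fully occluded."""
    visible, control = [], []
    for x in centers_front_to_back:
        center_covered = any(math.dist(x, c) <= r for c in visible)
        untouched = not any(strictly_inside(p, x, r) for p in control)
        if center_covered and untouched:
            continue  # occluded by circles drawn in front of it
        for c in visible:
            control.extend(cutting_points(x, c, r))
        visible.append(x)
        control.extend(four_control_points(x, r))
        # Drop control points that are now covered by some visible circle.
        control = [p for p in control
                   if not any(strictly_inside(p, c, r) for c in visible)]
    return visible
```

The returned centers are in front-to-back order, so they would be drawn in reverse. Because of the float tolerance, borderline cases may be classified differently than with exact arithmetic.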
Minor problems to be solved
How to determine in a numerically stable way whether a point is strictly inside a circle? -> This shouldn't be too bad to solve, as no point is ever more complicated than the solution of a quadratic equation. It is important, though, not to rely solely on floating point representations, as these will be numerically insufficient and some control points would probably get completely lost, effectively leaving holes in the final plot. So keep a symbolic and precise representation of the control point coordinates. I would try SymPy to tackle this problem as it seems to cover all the required math. The formula for intersecting circles can easily be found online, for example here.
How to efficiently determine whether a circle contains any control point or any visible circle contains the center of a new circle? -> In order to solve this, I would propose to keep all elements of P and C in grid-like structures, where the width and height of each grid element equals the radius of the circles. On average, the number of active points and visible circles per grid cell should be in O(1), although it is possible to construct artificial setups with arbitrary amounts of elements per grid cell, which would turn the overall algorithm from O(N) to O(N * N).
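The grid-like structure mentioned in the last point could look roughly like the sketch below. The class name PointGrid and its methods are hypothetical. The key observation is that with a cell size equal to the circle radius, every point within one radius of a query center must sit in the center's own cell or in one of its eight neighbours, so only nine cells have to be inspected per query.

```python
import math
from collections import defaultdict

class PointGrid:
    """Uniform grid (spatial hash) for points; cell_size = circle radius."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (ix, iy) -> list of points

    def _key(self, point):
        return (math.floor(point[0] / self.cell_size),
                math.floor(point[1] / self.cell_size))

    def add(self, point):
        self.cells[self._key(point)].append(point)

    def within(self, center, radius):
        """Points at distance <= radius; requires radius <= cell_size."""
        kx, ky = self._key(center)
        found = []
        for ix in (kx - 1, kx, kx + 1):
            for iy in (ky - 1, ky, ky + 1):
                for p in self.cells.get((ix, iy), ()):
                    if math.dist(p, center) <= radius:
                        found.append(p)
        return found
```

The same structure can hold circle centers for the "is the new center inside a visible circle" check. Removing control points would additionally need a delete method, which is omitted here.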
Runtime thoughts
As mentioned above, I would expect the runtime to scale linearly with the number of circles on average, because the number of visible circles in each grid cell will be in O(1) unless constructed in an evil way.
The data structures should be easily maintainable in memory if the circle radius isn't excessively small and computing intersections between circles should also be quite fast. I'm curious about final computational time, but I don't expect that it would be much worse than drawing all circles in a naive way a single time.
My best guess would be to use a hexbin. Note that with a scatter plot, the dots that are plotted latest will be the only ones visible. With a hexbin, all coinciding dots will be averaged.
If interested, the centers of the hexagons can be used to again create a scatter plot only showing the minimum.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
np.random.seed(15)
df = pd.DataFrame({'x': np.random.normal(10, 1.2, 10000),
'y': np.random.normal(10, 1.2, 10000),
'color': np.random.normal(10, 1.2, 10000)})
fig, ax = plt.subplots(ncols=4, gridspec_kw={'width_ratios': [10,10,10,1]})
norm = plt.Normalize(df.color.min(), df.color.max())
df.plot(kind="scatter", x="x", y="y", c="color", s=10, cmap="RdBu_r", norm=norm, colorbar=False, ax=ax[0])
hexb = ax[1].hexbin(df.x, df.y, df.color, cmap="RdBu_r", norm=norm, gridsize=80)
centers = hexb.get_offsets()
values = hexb.get_array()
ax[2].scatter(centers[:,0], centers[:,1], c=values, s=10, cmap="RdBu_r", norm=norm)
plt.colorbar(hexb, cax=ax[3])
plt.show()
Here is another comparison. The number of dots is reduced by a factor of 10, and the plot is more "honest" as overlapping dots are averaged.

Determining the direction of face normals consistently?

I'm a newbie to computer graphics so I apologize if some of my language is inexact or the question misses something basic.
Is it possible to calculate face normals correctly, given a list of vertices, and a list of faces like this:
v1: x_1, y_1, z_1
v2: x_2, y_2, z_2
...
v_n: x_n, y_n, z_n
f1: v1,v2,v3
f2: v4,v2,v5
...
f_m: v_j, v_k, v_l
Each x_i, y_i, z_i specifies the vertex's position in 3d space (but isn't necessarily a vector)
Each f_i contains the indices of the three vertices specifying it.
I understand that you can use the cross product of two sides of a face to get a normal, but the direction of that normal depends on the order and choice of sides (from what I understand).
Given this is the only data I have, is it possible to correctly determine the direction of the normals? Or is it possible to determine them consistently at least (even if all normals may be pointing in the wrong direction)?
In general there is no way to assign normals "consistently" all over a set of 3d faces... consider as an example the famous Möbius strip...
You will notice that if you start walking on it, after one loop you get to the same point but on the opposite side. In other words this strip doesn't have two faces, but only one. If you build such a shape with a strip of triangles, of course there's no way to assign normals in a consistent way and you'll necessarily end up having two adjacent triangles with normals pointing in opposite directions.
That said, if your collection of triangles is indeed orientable (i.e. there actually exists a consistent normal assignment), a solution is to start from one triangle and then propagate to neighbors like in a flood-fill algorithm. For example in Python it would look something like:
active = [triangles[0]]
oriented = set([triangles[0]])
while active:
    next_active = []
    for tri in active:
        for other in neighbors(tri):
            if other not in oriented:
                if not agree(tri, other):
                    flip(other)
                oriented.add(other)
                next_active.append(other)
    active = next_active
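The helpers neighbors, agree and flip are left undefined above. One way they might be filled in, assuming faces are given as triples of vertex indices as in the question, is sketched below. Two triangles sharing an edge agree when they traverse that edge in opposite directions, and flipping a triangle means swapping two of its indices. The function names are made up for this sketch, and it assumes an orientable mesh in which each edge is shared by at most two faces.

```python
from collections import defaultdict, deque

def directed_edges(tri):
    """The three edges of a triangle in winding order."""
    a, b, c = tri
    return [(a, b), (b, c), (c, a)]

def orient_consistently(faces):
    """Return the faces with windings made consistent per connected part."""
    faces = [tuple(f) for f in faces]
    # Map each undirected edge to the indices of the faces that use it.
    edge_to_faces = defaultdict(list)
    for idx, face in enumerate(faces):
        for u, v in directed_edges(face):
            edge_to_faces[frozenset((u, v))].append(idx)
    seen = set()
    for start in range(len(faces)):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])  # flood fill from an arbitrary seed face
        while queue:
            cur = queue.popleft()
            for u, v in directed_edges(faces[cur]):
                for other in edge_to_faces[frozenset((u, v))]:
                    if other in seen:
                        continue
                    # Same direction on the shared edge means disagreement.
                    if (u, v) in directed_edges(faces[other]):
                        a, b, c = faces[other]
                        faces[other] = (a, c, b)  # flip the winding
                    seen.add(other)
                    queue.append(other)
    return faces
```

The result is consistent relative to the seed face of each connected part. Whether the normals then all point outwards or all inwards still has to be decided separately.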
In CG it's done by the polygon winding rule. That means all the faces are defined so that the points are in CW (or CCW) order when looking at the face directly. Then using the cross product will lead to consistent normals.
However, many meshes out there do not comply with the winding rule (some faces are CW, others CCW) and for those it's a problem. There are two approaches I know of:
for simple shapes (not too concave)
the sign of dot product of your face_normal and face_center-cube_center will tell you if the normal points inside or outside of the object.
if ( dot( face_normal , face_center-cube_center ) >= 0.0 ) normal_points_out
You can even use any point of face instead of the face center too. Anyway for more complex concave shapes this will not work correctly.
test if point above face is inside or not
simply displace center of face by some small distance (not too big) in normal direction and then test if the point is inside polygonal mesh or not:
if ( !inside( face_center+0.001*face_normal ) ) normal_points_out
to check if point is inside or not you can use hit test.
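As an illustration of the first approach (the dot-product test for shapes that are not too concave), a small sketch without any external libraries could look like this. The function outward_normal and the parameter interior_point are names chosen for this example. interior_point plays the role of cube_center above and is assumed to be a point known to lie inside the object.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def outward_normal(v0, v1, v2, interior_point):
    """Unnormalized face normal, flipped so it points away from interior_point."""
    normal = cross(sub(v1, v0), sub(v2, v0))
    face_center = tuple((a + b + c) / 3 for a, b, c in zip(v0, v1, v2))
    # A negative dot product means the normal points towards the inside.
    if dot(normal, sub(face_center, interior_point)) < 0:
        normal = tuple(-x for x in normal)
    return normal
```

Both windings of the same triangle give the same result, which is the point of the test. As noted above, this is only reliable when the chosen interior point can "see" the face, which fails for more complex concave shapes.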
However, if the normal is used just for lighting computations, then it is usually used inside a dot product. So we can use the absolute value of that dot product instead, and that will solve all lighting problems regardless of which side the normal points to. For example:
output_color = face_color * abs(dot(face_normal,light_direction))
Some graphics APIs have implemented this already (look for double-sided materials or normals; turning them on usually uses the abs value). For example in OpenGL:
glLightModeli(GL_LIGHT_MODEL_TWO_SIDE, GL_TRUE);

Graphviz DOT arrange Nodes in circles, layout too "compact"

I'm halfway there please see the edit
OK here's my problem, I'm generating a graph of a python module, including all the files with their functions/methods/classes.
I want to arrange it so that nodes gather in circles around their parent nodes. Currently everything is on one gargantuan horizontal row, which makes the thing >50k pixels wide and also makes the svg converter fail (it only renders about half of the graph).
I went through the docs but couldn't find anything that seems to do the trick.
So the question is:
Is there a simple way to do this or do I have to layout the whole thing by myself? :/
EDIT:
Thanks to Andrew's comment I've got the right layout; the only problem now is that it's a bit too "compact"... so the question now is, how to fix this?
I've listed the most significant parameters that influence your current layout, with suggested values for each. I suspect you can get the layout that you want by applying just a couple of these suggestions.
reduce the edge weight, e.g., [weight=0.5]; this will make the edges longer, causing the tight clusters you currently see in your graph to 'fan out'.
get rid of the node borders, e.g., node_A [color=none; shape=plaintext]; especially for oval-shaped nodes, a substantial fraction of the total node space is 'unused' (i.e., not used to display the node label).
explicitly set the font size for the nodes (the node borders are enlarged so that they surround the node text, which means that the font size and amount of text for a given node have a significant effect on its size); [fontsize=11] should be large enough to be legible yet also reduce the 'cluttered' appearance (the default size is 14).
increase the minimum separation between nodes via 'nodesep', e.g., nodesep=2.0; this will directly address your objection regarding your graph being "too compact." ('nodesep' and 'ranksep' probably affect how dot draws a graph more than any other parameters for node, edge, or graph. 'ranksep' sets the minimum distance between nodes of different ranks; in your case it looks like you have only two ranks of nodes, with all of the nodes in the same rank except for a few top-level nodes in the centers.)
explicitly set the total graph size, e.g., size="7.75,10.25" (ensures that your graph fits on an 8.5 x 11 page and that it occupies the entire space)
And one purely aesthetic suggestion that at most will only help your graph appear less cluttered: the default fontcolor for both edges and nodes is black. The majority of the ink on your graph is from those two structures (particularly if you remove the node borders), so I would for instance set either the node (text) fontcolor or the edge fontcolor to "blue" to help the eye distinguish the two sets of graph structures.
If it is too compact, you will want to mess with the edge length. You have a couple options depending on the graph layout:
If your layout is sfdp or fdp, tweak the graph property K. Default is 0.3.
For neato (or fdp), tweak the edge property len. Default is 1.0 for neato and 0.3 for fdp.
For dot you can use the edge property minlen which is the minimum edge length. Default is 1.
You might also want to mess with the graph property model which determines clustering behavior. Specifically, try subset. I believe this handles len for you:
http://www.graphviz.org/doc/info/attrs.html#d:model
Also, you can remove overlaps all together with scaling techniques: http://www.graphviz.org/doc/info/attrs.html#d:overlap
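To show where these attributes go, here is a small sketch that writes DOT source for a star-shaped graph laid out with neato. The function spaced_star_dot and its arguments are invented for the example, and the value 3.0 is arbitrary. It sets the edge property len discussed above together with overlap=false, and produces plain text that can be saved to a .gv file and rendered with Graphviz.

```python
def spaced_star_dot(center, leaves, edge_len=3.0):
    """Build DOT text for a hub node connected to each leaf.

    edge_len is written as the neato edge attribute 'len' (preferred
    edge length); overlap=false asks the layout to remove node overlaps.
    """
    lines = [
        "graph G {",
        "  layout=neato;",
        "  overlap=false;",
        f"  edge [len={edge_len}];",
    ]
    for leaf in leaves:
        lines.append(f'  "{center}" -- "{leaf}";')
    lines.append("}")
    return "\n".join(lines)

# Example: a module with three members hanging off it.
dot_source = spaced_star_dot("module", ["func_a", "func_b", "ClassC"])
```

For fdp or sfdp the analogous change would be to set the graph property K instead of len, as described above.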
I have around 500 nodes and used doug's recommendation.
This is my sample code that works (in python):
from graphviz import Digraph
f = Digraph('companies',filename='companies.gv',
edge_attr={'weight':'1',
'fontsize':'11',
'fontcolor':'blue',
'len':'4'},
graph_attr={'fixedsize':'false',
'bgcolor':'transparent'},
node_attr={'fontsize':'11',
'shape':'plaintext',
'color':'none',
'fontcolor':'black'})
f.attr(layout="neato")
f.attr(nodesep='3')
f.attr(ranksep='3')
f.attr(size='5000,5000')

How to generate irregular ball shapes?

What kind of algorithms would generate random "goo balls" like those in World of Goo? I'm using Processing, but any generic algorithm would do.
I guess it boils down to how to "randomly" make balls that are kind of round, but not perfectly round, and still looking realistic?
Thanks in advance!
The thing that makes objects realistic in World of Goo is not their shape, but the fact that the behavior of objects is a (more or less) realistic simulation of 2D physics, especially
bending, stretching, compressing (elastic deformation)
breaking due to stress
and all of the above with proper simulation of dynamics, with no perceivable shortcuts
So, try to make the behavior of your objects realistic and that will make them look (feel) realistic.
Not sure if this is what you're looking for since I can't look at that site from work. :)
A circle is just a special case of an ellipse, where the major and minor axes are equal. A squished ball shape is an ellipse where one of the axes is longer than the other. You can generate different lengths for the axes and rotate the ellipse around to get these kinds of irregular shapes.
Maybe Metaballs (wiki) are something to start from.. but I'm not sure.
Otherwise I would suggest a particle approach in which a ball is composed by many particles that stick together, giving an irregularity (mind that this needs a minimal physical engine to handle the spring body that keeps all particles together).
As Unreason said, World of Goo is not so much about shape, but physics simulation.
But an easy way to create ball-like irregular shapes could be to start with n vertices (points) V_1, V_2 ... V_n on a circle and apply some random deformation to it. There are many ways to do that, going from simply moving around some single vertices to complex physical simulations.
Some ideas:
1) Choose a random vertex V_i, choose a random vector T, and apply that vector as a translation (movement) to V_i. Apply T to all other vertices V_j, too, but scaled down depending on the "distance" from V_i (where distance could be the absolute difference between j and i, or the actual geometric distance of V_j to V_i). For the scaling factor you could use any function f that is 1 for f(0) and decreasing for increasing distances (basically a radial basis function).
for each V_j
    V_j = scalingFactor(distance(V_i, V_j)) * translationVector + V_j
2) You move V_i as in 1, but now you simulate springlike connections between all neighbouring vertices and iteratively move all vertices based on the forces created by stretched springs.
3) For more round shapes you can do 1) or 2) on the control points of a B-spline curve.
Beware of self-intersections when you move vertices too much.
Just some rough ideas, not tested...
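Idea 1) can be sketched in a few lines of Python (the same logic ports directly to Processing). All names and default values here are assumptions made for the example: blob_outline, the Gaussian falloff used as the scaling function, and the index distance around the ring used as the "distance".

```python
import math
import random

def blob_outline(n=32, radius=1.0, pushes=5, strength=0.3,
                 falloff=3.0, seed=None):
    """Return n (x, y) points of a randomly deformed circle.

    Each push picks a random vertex and a random translation T, then
    moves every vertex by T scaled with a weight that is 1 at the picked
    vertex and decreases with index distance around the ring.
    """
    rng = random.Random(seed)
    pts = [[radius * math.cos(2 * math.pi * k / n),
            radius * math.sin(2 * math.pi * k / n)] for k in range(n)]
    for _ in range(pushes):
        i = rng.randrange(n)
        angle = rng.uniform(0, 2 * math.pi)
        length = rng.uniform(0, strength * radius)
        tx, ty = length * math.cos(angle), length * math.sin(angle)
        for j in range(n):
            ring = min(abs(j - i), n - abs(j - i))  # distance around the ring
            weight = math.exp(-(ring / falloff) ** 2)  # 1 at ring == 0
            pts[j][0] += weight * tx
            pts[j][1] += weight * ty
    return [tuple(p) for p in pts]
```

Drawing the points as a closed polygon, or using them as spline control points as in 3), gives the outline. Keeping strength small relative to the radius helps avoid the self-intersections mentioned above.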
