Creating networkx graph using Delaunay Triangulation - python-3.x

I have a Delaunay Triangulation (DT) (scipy) as follows:
from scipy.spatial import Delaunay

# Take the first 50 rows with 3 attributes to create a DT-
d = data.loc[:50, ['aid', 'x', 'y']].values
dt = Delaunay(points = d)
# List of triangles in the DT-
dt.simplices
'''
array([[1, 3, 4, 0],
       [1, 2, 3, 0],
       [1, 2, 3, 4]], dtype=int32)
'''
Now, I want to create a graph using the 'networkx' package and add the nodes and edges found in the DT above.
# Create an empty graph with no nodes and no edges.
G = nx.Graph()
The code I have come up with to add the unique nodes from DT simplices into 'G' is-
# Python3 list to contain nodes
nodes = []
for simplex in dt.simplices.tolist():
    for nde in simplex:
        if nde in nodes:
            continue
        else:
            nodes.append(nde)

nodes
# [1, 3, 4, 0, 2]
# Add nodes to graph-
G.add_nodes_from(nodes)
How do I add edges to 'G' using 'dt.simplices'? For example, the first triangle is [1, 3, 4, 0] and is between the nodes/vertices 1, 3, 4 and 0. How do I figure out which nodes are attached to each other and then add them as edges to 'G'?
Also, is there a better way to add nodes to 'G'?
I am using Python 3.8.
Thanks!

You could add the rows in the array as paths. A path just constitutes a sequence of edges, so the path 1,2,3 translates to the edge list (1,2),(2,3).
So iterate over the rows and use nx.add_path:
import numpy as np
import networkx as nx

simplices = np.array([[1, 3, 4, 0],
                      [1, 2, 3, 0],
                      [1, 2, 3, 4]])

G = nx.Graph()
for path in simplices:
    nx.add_path(G, path)

nx.draw(G, with_labels=True, node_size=500, node_color='lightgreen')
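Note that add_path only joins consecutive entries of each row, so edges like (1, 4) or (1, 0) in the first simplex are not created. If you want every pair of vertices within a simplex connected (the full Delaunay edge set), a minimal sketch using itertools.combinations:
import itertools

G = nx.Graph()
for simplex in simplices:
    # add_edges_from creates any missing nodes automatically,
    # so there is no need to add the nodes separately
    G.add_edges_from(itertools.combinations(simplex, 2))
This also answers the node question: add_edge and add_edges_from add unseen nodes on the fly, so the explicit node-collection loop is not needed.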

Related

For a given graph, construct shortest path tree- Python

Problem scale- I am working with the OSM road network of a city (6000 nodes and 50000 edges).
Input - The graph is read as a weighted networkx DiGraph.
For a given node r, I want to construct the shortest path tree. Is there a standard NetworkX function or library which can do so? If not, how can I do this efficiently (as opposed to running Dijkstra for all r-v pairs)?
Input in any form is highly valued!
This function returns the shortest path from a given source node to every reachable node.
Here's an example of its output:
For the following (very simple) graph:
G = nx.path_graph(5)
the single_source_shortest_path function:
path_1 = nx.single_source_shortest_path(G, 1)
returns:
{1: [1], 0: [1, 0], 2: [1, 2], 3: [1, 2, 3], 4: [1, 2, 3, 4]}
Where the shortest path from node 1 to any target T is returned by path_1[T].
There is also a NetworkX built-in solution that runs Dijkstra for every node pair (docs):
shortest_paths = dict(nx.all_pairs_dijkstra_path(G))
shortest_paths[1]
# returns {1: [1], 0: [1, 0], 2: [1, 2], 3: [1, 2, 3], 4: [1, 2, 3, 4]}
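Since the question's graph is weighted, note that single_source_shortest_path counts hops and ignores edge weights. For a weighted shortest path tree rooted at a node r, a small sketch using the standard Dijkstra variant (on a toy graph, assuming a 'weight' edge attribute):
import networkx as nx

G = nx.DiGraph()
G.add_weighted_edges_from([(0, 1, 2.0), (1, 2, 1.5), (0, 2, 5.0)])

r = 0  # root of the shortest path tree
lengths, paths = nx.single_source_dijkstra(G, r, weight='weight')
# paths[v] is the shortest weighted path from r to v;
# the union of these paths forms the shortest path tree rooted at r
print(paths[2])    # [0, 1, 2]
print(lengths[2])  # 3.5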

Using numba to randomly sample possible combinations of categories

I am trying to speed up a function that randomly samples combinations of categories over a number of records and ensures the sampled rows are unique (i.e. assume there are 3 records, each of which can be either 0 or 1, and I want 10 random samples of unique possible combinations of records).
If I did not use numba, I might do something like this:
import numpy as np

def myfunc(categories, NumberOfRecords, maxsamples):
    return np.unique(np.random.choice(np.arange(categories),
                                      size=(maxsamples*10, NumberOfRecords),
                                      replace=True), axis=0)[0:maxsamples]
Annoyingly, numba does not support axis in np.unique, so I can do something like this, but some of the records may turn out to be non-unique.
from numba import njit, int64
import numpy as np

@njit(int64[:,:](int64, int64, int64), cache=True)
def myfunc(categories, NumberOfRecords, maxsamples):
    return np.random.choice(np.arange(categories),
                            size=(maxsamples, NumberOfRecords), replace=True)

myfunc(categories=2, NumberOfRecords=3, maxsamples=10)
E.g. in one call (obviously there's some randomness here), I got the below (for which the indices 1 and 6, and 3 and 4, and 7 and 9 are identical rows):
array([[0, 1, 1],
       [1, 1, 0],
       [0, 1, 0],
       [1, 0, 1],
       [1, 0, 1],
       [1, 1, 1],
       [1, 1, 0],
       [1, 0, 0],
       [0, 0, 0],
       [1, 0, 0]])
My questions are:
Is this something where I would even expect a speed up from numba?
If so, how can I get unique rows (this seems rather difficult with numba, but presumably there's a way)?
Perhaps there's a way to get at this more efficiently (perhaps without creating more random samples than I need in the end)?
In the following, I don't use numba, but all the operations use vectorized numpy functions.
Each row of the result that you generate can be interpreted as an integer expressed in base N, where N is the number of categories. With that interpretation, what you want is to sample without replacement from the integers [0, 1, ... N**R-1], where R is the number of "records". You can use the choice function for that, with the argument replace=False. Once you have that, you need to convert the chosen integers to base N. For that, I use the function int2base, which is a pared down version of a function that I wrote in a different answer.
Here's the code:
import numpy as np
def int2base(x, base, ndigits):
    # x = np.asarray(x)  # Uncomment this line for general purpose use.
    powers = base ** np.arange(ndigits)
    digits = (x.reshape(x.shape + (1,)) // powers) % base
    return digits

def makesample(ncategories, nrecords, nsamples, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    n = ncategories ** nrecords
    choices = rng.choice(n, replace=False, size=nsamples)
    return int2base(choices, ncategories, nrecords)
In makesample, I included the optional argument rng. It allows you to specify the object that holds the choice function. If not provided, it uses np.random.default_rng().
Example:
In [118]: makesample(2, 3, 6)
Out[118]:
array([[0, 1, 1],
       [0, 0, 1],
       [1, 0, 1],
       [0, 0, 0],
       [1, 1, 0],
       [1, 1, 1]])
In [119]: makesample(5, 4, 12)
Out[119]:
array([[3, 4, 0, 1],
       [2, 0, 2, 0],
       [4, 2, 4, 3],
       [0, 1, 0, 4],
       [0, 2, 0, 1],
       [1, 2, 0, 1],
       [0, 3, 0, 4],
       [3, 3, 0, 3],
       [3, 4, 1, 4],
       [2, 4, 1, 1],
       [3, 4, 1, 0],
       [1, 1, 4, 4]])
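For reproducible output, a seeded generator can be passed in (a small usage sketch, not part of the original answer):
rng = np.random.default_rng(1234)   # any fixed seed gives repeatable samples
makesample(2, 3, 6, rng=rng)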
makesample will raise an exception if you ask for too many samples:
In [120]: makesample(2, 3, 10)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-120-80044e78a60a> in <module>
----> 1 makesample(2, 3, 10)
~/code_snippets/python/numpy/random_samples_for_so_question.py in makesample(ncategories, nrecords, nsamples, rng)
17 rng = np.random.default_rng()
18 n = ncategories ** nrecords
---> 19 choices = rng.choice(n, replace=False, size=nsamples)
20 return int2base(choices, ncategories, nrecords)
_generator.pyx in numpy.random._generator.Generator.choice()
ValueError: Cannot take a larger sample than population when 'replace=False'
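As a variation on the same idea (a sketch, not from the answer above), the base-N conversion can also be done with np.unravel_index; the digits come out most-significant first rather than least-significant first, but each row is still a unique, uniformly sampled combination:
import numpy as np

def makesample_unravel(ncategories, nrecords, nsamples, rng=None):
    # sample integers in [0, ncategories**nrecords) without replacement,
    # then expand each one into its base-N digits
    if rng is None:
        rng = np.random.default_rng()
    choices = rng.choice(ncategories ** nrecords, replace=False, size=nsamples)
    # unravel_index returns one array per digit, most significant first
    return np.stack(np.unravel_index(choices, (ncategories,) * nrecords), axis=-1)

makesample_unravel(2, 3, 6)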

Generate mesh and ray intersection for trimesh

I am using trimesh to generate ray intersections from points.
import numpy as np

ray_origins = np.array([[0, 0, 2],
                        [1, 1, 3],
                        [3, 2, 6]])
ray_directions = np.array([[0, 5, 8],
                           [0, 0, 1],
                           [0, 2, 2]])

locations, index_ray, index_tri = mesh.ray.intersects_location(
    ray_origins=ray_origins,
    ray_directions=ray_directions)
I want to know if there is any way I can generate many rays from each point and cast them at the mesh.
You could try "sample_surface_sphere", which will randomly pick unit vectors (points on a sphere).
nPoints = 100000
ray_directions = trimesh.sample.sample_surface_sphere(nPoints)
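To cast many rays from each origin point, one option (a sketch, assuming mesh is already a loaded trimesh.Trimesh) is to repeat the origins and tile a block of random directions so every origin gets the same number of rays:
import numpy as np
import trimesh

n_rays = 1000  # random directions per origin point

# one block of n_rays random unit vectors
directions = trimesh.sample.sample_surface_sphere(n_rays)

# repeat each origin n_rays times and tile the directions to match
origins = np.repeat(ray_origins, n_rays, axis=0)
directions = np.tile(directions, (len(ray_origins), 1))

locations, index_ray, index_tri = mesh.ray.intersects_location(
    ray_origins=origins, ray_directions=directions)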

python numpy stack matrices and add specific corner/column entries

Say we have two matrices A and B, each of size 2 by 2. Is there a command that can stack them horizontally and add A[:,1] to B[:,0], so that the resulting matrix C is 2 by 3 with C[:,0] = A[:,0], C[:,1] = A[:,1] + B[:,0], and C[:,2] = B[:,1]? One step further: stacking them on the diagonal so that C[0:2,0:2] contains A, C[1:3,1:3] contains B, and the overlapping entry is C[1,1] = A[1,1] + B[0,0]; C is 3 by 3 in this case. Hard-coding this routine is not hard, but I'm just curious, since MATLAB has a similar function if my memory serves me well.
A straightforward approach is to copy or add the two arrays into a target:
In [882]: A=np.arange(4).reshape(2,2)
In [883]: C=np.zeros((2,3),int)
In [884]: C[:,:-1]=A
In [885]: C[:,1:]+=A # or B
In [886]: C
Out[886]:
array([[0, 1, 1],
       [2, 5, 3]])
Another approach is to pad A at the end, pad B at the start, and sum; while there is a convenient pad function, it won't be any faster.
And for the diagonal:
In [887]: C=np.zeros((3,3),int)
In [888]: C[:-1,:-1]=A
In [889]: C[1:,1:]+=A
In [890]: C
Out[890]:
array([[0, 1, 0],
       [2, 3, 1],
       [0, 2, 3]])
Again, the two arrays could be padded and added.
I'm not aware of any specialized function to do this; even if there were, it probably would do the same thing. This isn't a common enough operation to justify a compiled version.
I have built up finite element sparse matrices by adding over lapping element matrices. The sparse formats for both MATLAB and scipy facilitate this (duplicate coordinates are summed).
============
In [896]: np.pad(A, [[0,0],[0,1]], mode='constant') + np.pad(A, [[0,0],[1,0]], mode='constant')
Out[896]:
array([[0, 1, 1],
       [2, 5, 3]])
In [897]: np.pad(A, [[0,1],[0,1]], mode='constant') + np.pad(A, [[1,0],[1,0]], mode='constant')
Out[897]:
array([[0, 1, 0],
       [2, 3, 1],
       [0, 2, 3]])
What's the special MATLAB code for doing this?
In Octave I found:
prepad(A,3,0,axis=2)+postpad(A,3,0,axis=2)
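As an illustration of the duplicate-summing behavior mentioned above (a sketch, not part of the original answer), the diagonal overlap can also be built with scipy's COO sparse format, where repeated (row, col) coordinates are summed on conversion:
import numpy as np
from scipy import sparse

A = np.arange(4).reshape(2, 2)
B = np.arange(4).reshape(2, 2)

# coordinates of A placed at offset (0, 0) and of B at offset (1, 1)
rows = np.concatenate([np.repeat([0, 1], 2), np.repeat([1, 2], 2)])
cols = np.concatenate([np.tile([0, 1], 2), np.tile([1, 2], 2)])
data = np.concatenate([A.ravel(), B.ravel()])

# the duplicate coordinate (1, 1) is summed when converting to a dense array
C = sparse.coo_matrix((data, (rows, cols)), shape=(3, 3)).toarray()
# array([[0, 1, 0],
#        [2, 3, 1],
#        [0, 2, 3]])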

scikit-learn: Get selected features for prediction data

I have a training set of data. The Python script for creating the model also calculates the attributes into a numpy array (it's a bit vector). I then want to use VarianceThreshold to eliminate all features that have zero variance (e.g. all 0 or all 1). I then run get_support(indices=True) to get the indices of the selected columns.
My issue now is how to get only the selected features for the data I want to predict. I first calculate all features and then use array indexing but it does not work:
x_predict_all = getAllFeatures(suppl_predict)
x_predict = x_predict_all[indices] #only selected features
indices is a numpy array.
The returned array x_predict has the correct length len(x_predict), but the wrong shape: x_predict.shape[1] is still the original length. My classifier then throws an error due to the wrong shape:
prediction = gbc.predict(x_predict)
File "C:\Python27\lib\site-packages\sklearn\ensemble\gradient_boosting.py", li
ne 1032, in _init_decision_function
self.n_features, X.shape[1]))
ValueError: X.shape[1] should be 1855, not 2090.
How can I solve this issue?
You can do it like this:
Test data
import numpy as np
from sklearn.feature_selection import VarianceThreshold

X = np.array([[0, 2, 0, 3],
              [0, 1, 4, 3],
              [0, 1, 1, 3]])
selector = VarianceThreshold()
Alternative 1
>>> selector.fit(X)
>>> idxs = selector.get_support(indices=True)
>>> X[:, idxs]
array([[2, 0],
       [1, 4],
       [1, 1]])
Alternative 2
>>> selector.fit_transform(X)
array([[2, 0],
       [1, 4],
       [1, 1]])
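Applied to the question's setup (a sketch using the question's x_predict_all, indices, and gbc names), the key point is that the selected indices refer to columns, so index along axis 1 rather than axis 0, or simply reuse the fitted selector:
x_predict = x_predict_all[:, indices]   # select columns, not rows
# or, equivalently, with the selector fitted on the training data:
x_predict = selector.transform(x_predict_all)
prediction = gbc.predict(x_predict)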
