I have a list of nodes with their coordinates of the form
MCoord = [[Node1,X,Y,Z],[Node2,X,Y,Z]...]
Coordinates:
MCoord = [
[1, 0, 0, 0],
[2, 0, 1000, 1300],
[3, 0, 2000, 2000],
[4, 0, 3000, 2500],
[5, 0, 4000, 3200],
[6, 0, 5000, 4200],
[7, 0, 6000, 6000],
[8, 1000, 0, 0],
[9, 1000, 1000, 1300],
[10, 1000, 2000, 2000],
[11, 1000, 3000, 2500],
[12, 1000, 4000, 3200],
[13, 1000, 5000, 4200],
[14, 1000, 6000, 6000],
[15, 2000, 0, 0],
[16, 2000, 1000, 1300],
// ...
[27, 3500, 5000, 4200],
[28, 3500, 6000, 6000]
]
I want to group all nodes (with their coordinates) by their X coordinate, storing each group under the keys S1, S2, S3, and so on, where each key corresponds to one X value (S1 holds all nodes with the first X value, S2 the second, etc.).
Script:
SectionLocation = {'S1':0 , 'S2':1000 , 'S3':2000 , 'S4':3500}
SectionComplete = {'S1':0 , 'S2':0 , 'S3':0 , 'S4':0}
k = 0
for i in range(len(MCoord)):
    print(i)
    if MCoord[i][1] == SectionLocation[k]:
        keydic = get_key(SectionLocation[k])
        SectionComplete[keydic].append(MCoord[i])
        print(SectionComplete)
    else:
        k = k + 1
print(SectionComplete)
I cannot seem to append new values to the dictionary. Any advice?
Desired output:
SectionComplete = {
'S1' : [
[1, 0, 0, 0],
[2, 0, 1000, 1300],
[3, 0, 2000, 2000],
[4, 0, 3000, 2500],
[5, 0, 4000, 3200],
[6, 0, 5000, 4200],
[7, 0, 6000, 6000]
],
'S2' : [
[8, 1000, 0, 0],
[9, 1000, 1000, 1300],
[10, 1000, 2000, 2000],
[11, 1000, 3000, 2500],
[12, 1000, 4000, 3200],
[13, 1000, 5000, 4200],
[14, 1000, 6000, 6000]
],
// ...
}
I believe this is what you're attempting to achieve, but correct me if I'm wrong.
# your list of nodes and coordinates
MCoord = [[1, 0, 0, 0],
[2, 0, 1000, 1300],
[3, 0, 2000, 2000],
[4, 0, 3000, 2500],
[5, 0, 4000, 3200],
[6, 0, 5000, 4200],
[7, 0, 6000, 6000],
[8, 1000, 0, 0],
[9, 1000, 1000, 1300],
[10, 1000, 2000, 2000],
[11, 1000, 3000, 2500],
[12, 1000, 4000, 3200],
[13, 1000, 5000, 4200],
[14, 1000, 6000, 6000],
[15, 2000, 0, 0],
[16, 2000, 1000, 1300]]
# a dictionary mapping of section to its ID
SectionLocation = {0: 'S1', 1000: 'S2', 2000: 'S3', 3500: 'S4'}
# A dictionary of the grouped sections
SectionComplete = {'S1': [], 'S2': [], 'S3': [], 'S4': []}
# we don't need the index since python for loops take care of that for us
for node in MCoord:
    # grab the relevant section from the mapping
    section = SectionLocation[node[1]]
    # append the node to that section's list
    SectionComplete[section].append(node)

print(SectionComplete)
In your example, SectionComplete = {'S1': 0, 'S2': 0, 'S3': 0, 'S4': 0} initializes the dict values to ints, and you then try to append to those ints. Essentially, it's like doing this:
my_int = 0
my_int.append(23)
This won't work because an int does not have an append method.
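If you'd rather not pre-declare every section key, collections.defaultdict creates the empty list on first access. A minimal sketch, reusing the question's variable names on a shortened node list:

```python
from collections import defaultdict

MCoord = [[1, 0, 0, 0], [2, 0, 1000, 1300], [8, 1000, 0, 0]]
SectionLocation = {0: 'S1', 1000: 'S2', 2000: 'S3', 3500: 'S4'}

SectionComplete = defaultdict(list)  # missing keys start as an empty list
for node in MCoord:
    SectionComplete[SectionLocation[node[1]]].append(node)

print(dict(SectionComplete))
# {'S1': [[1, 0, 0, 0], [2, 0, 1000, 1300]], 'S2': [[8, 1000, 0, 0]]}
```

This avoids the int-initialization problem entirely, since every value is a list from the moment it is first touched.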
MCoord = [[...]]
import numpy as np

array_of_coords = np.array(MCoord)
# the unique X values, in sorted order
uniq_X = np.unique(array_of_coords[:, 1])
# one sub-array of rows per unique X value
group_by_X = [array_of_coords[array_of_coords[:, 1] == x, :] for x in uniq_X]
# keys S1, S2, ... to match the desired output
list_of_keys = ["S" + str(i + 1) for i in range(len(uniq_X))]
dictionary = dict(zip(list_of_keys, group_by_X))
print(dictionary)
Output:
{'S1': array([[   1,    0,    0,    0],
       [   2,    0, 1000, 1300],
       [   3,    0, 2000, 2000],
       [   4,    0, 3000, 2500],
       [   5,    0, 4000, 3200],
       [   6,    0, 5000, 4200],
       [   7,    0, 6000, 6000]]),
 'S2': array([[   8, 1000,    0,    0],
       [   9, 1000, 1000, 1300],
       [  10, 1000, 2000, 2000],
       [  11, 1000, 3000, 2500],
       [  12, 1000, 4000, 3200],
       [  13, 1000, 5000, 4200],
       [  14, 1000, 6000, 6000]]),
 'S3': array([[  15, 2000,    0,    0],
       [  16, 2000, 1000, 1300]])}
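If you need plain Python lists (as in the desired output) rather than numpy arrays, a standard-library alternative is itertools.groupby after sorting by the X column. A sketch on a shortened node list:

```python
from itertools import groupby
from operator import itemgetter

MCoord = [[1, 0, 0, 0], [2, 0, 1000, 1300],
          [8, 1000, 0, 0], [9, 1000, 1000, 1300]]

# groupby only merges adjacent runs, so sort by the X column first
grouped = {
    "S" + str(n): list(rows)
    for n, (x, rows) in enumerate(
        groupby(sorted(MCoord, key=itemgetter(1)), key=itemgetter(1)),
        start=1)
}
print(grouped)
# {'S1': [[1, 0, 0, 0], [2, 0, 1000, 1300]],
#  'S2': [[8, 1000, 0, 0], [9, 1000, 1000, 1300]]}
```

Note that list(rows) must be consumed inside each iteration, because groupby invalidates each group iterator as soon as it advances to the next one.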
I'm trying to evaluate the model for multiclass classification using the classification_report function from the sklearn package.
Dimensions of y_pred: (1000,36)
Dimensions of y_test: (1000,36)
I tried calling the classification_report on the 2 arrays i.e y_test and y_pred
def display_results(y_test, y_pred, column_name=labels):
    print(classification_report(y_test, y_pred, target_names=labels))
With this code I get:
ValueError: Unknown label type: (array([[1, 0, 0, ..., 0, 0, 0],
[1, 0, 0, ..., 0, 0, 0],
[1, 0, 0, ..., 1, 1, 0],
...,
[1, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[1, 0, 0, ..., 0, 0, 0]]), array([[1, 0, 0, ..., 0, 0, 0],
[1, 0, 0, ..., 0, 0, 0],
[1, 0, 0, ..., 0, 0, 0],
...,
[1, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[1, 0, 0, ..., 0, 0, 0]]))
I was expecting to get the Precision, Recall, F1 and the total average metrics for all the columns based on the labels passed to the function.
For your error, you can flatten both arrays with np.hstack. This works where you have multiclass-multioutput targets from a classifier:
from sklearn.utils.multiclass import type_of_target

type_of_target(y_test)
type_of_target(y_pred)
>> 'multiclass-multioutput'
So your solution is:
import numpy as np

print(classification_report(np.hstack(y_test), np.hstack(y_pred)))
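As a quick sanity check on what np.hstack does here: applied to a 2-D array it concatenates the rows into a single 1-D array, so classification_report then sees one long binary vector. The shapes below are illustrative, not the question's actual data:

```python
import numpy as np

y = np.array([[1, 0, 0],
              [0, 1, 1]])  # shape (2, 3), multilabel-style indicators
flat = np.hstack(y)        # rows concatenated -> shape (6,)
print(flat)
# [1 0 0 0 1 1]
```

Be aware that this collapses all label columns into one binary problem, scoring element-wise agreement rather than per-label metrics; if you want a per-label breakdown, recent versions of scikit-learn's classification_report also accept multilabel indicator arrays directly.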
I'm using the networkx library to find the shortest path between two nodes with Dijkstra's algorithm, as follows:
import networkx as nx
A = [[0, 100, 0, 0 , 40, 0],
[100, 0, 20, 0, 0, 70],
[0, 20, 0, 80, 50, 0],
[0, 0, 80, 0, 0, 30],
[40, 0, 50, 0, 0, 60],
[0, 70, 0, 30, 60, 0]];
print(nx.dijkstra_path(A, 0, 4))
In the above code I'm using the matrix directly, but the library requires the graph to be created as follows:
G = nx.Graph()
G.add_node(<node>)
G.add_edge(<node 1>, <node 2>)
It is very time-consuming to build the graph with the above commands. Is there any way to pass the weighted matrix directly to the dijkstra_path function?
First you need to convert your adjacency matrix to a numpy array with np.array.
Then you can simply create your graph with from_numpy_matrix. (Note: from_numpy_matrix was removed in NetworkX 3.0; on newer versions use nx.from_numpy_array, which accepts the same input.)
import networkx as nx
import numpy as np
A = [[0, 100, 0, 0 , 40, 0],
[100, 0, 20, 0, 0, 70],
[0, 20, 0, 80, 50, 0],
[0, 0, 80, 0, 0, 30],
[40, 0, 50, 0, 0, 60],
[0, 70, 0, 30, 60, 0]]
a = np.array(A)
G = nx.from_numpy_matrix(a)
print(nx.dijkstra_path(G, 0, 4))
Output:
[0, 4]
Side note: you can check the graph edges with the following code.
for edge in G.edges(data=True):
    print(edge)
Output:
(0, 1, {'weight': 100})
(0, 4, {'weight': 40})
(1, 2, {'weight': 20})
(1, 5, {'weight': 70})
(2, 3, {'weight': 80})
(2, 4, {'weight': 50})
(3, 5, {'weight': 30})
(4, 5, {'weight': 60})
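If from_numpy_matrix isn't available in your NetworkX version, another portable option is to extract the weighted edge list from the matrix yourself and pass it to G.add_weighted_edges_from. The extraction step alone is sketched below (pure Python, assuming an undirected graph, so only the upper triangle is scanned):

```python
A = [[0, 100, 0, 0, 40, 0],
     [100, 0, 20, 0, 0, 70],
     [0, 20, 0, 80, 50, 0],
     [0, 0, 80, 0, 0, 30],
     [40, 0, 50, 0, 0, 60],
     [0, 70, 0, 30, 60, 0]]

# (u, v, weight) triples for every nonzero entry above the diagonal
edges = [(i, j, A[i][j])
         for i in range(len(A))
         for j in range(i + 1, len(A))
         if A[i][j] != 0]
print(edges)
# [(0, 1, 100), (0, 4, 40), (1, 2, 20), (1, 5, 70),
#  (2, 3, 80), (2, 4, 50), (3, 5, 30), (4, 5, 60)]
```

These triples can then be loaded with G.add_weighted_edges_from(edges) before calling nx.dijkstra_path.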
I have the following labels
>>> lab
array([3, 0, 3, 3, 1, 1, 2, 2, 3, 0, 1, 4])
I want to assign these labels to the rows of another numpy array, i.e.
>>> arr
array([[81, 1, 3, 87], # 3
[ 2, 0, 1, 0], # 0
[13, 6, 0, 0], # 3
[14, 0, 1, 30], # 3
[ 0, 0, 0, 0], # 1
[ 0, 0, 0, 0], # 1
[ 0, 0, 0, 0], # 2
[ 0, 0, 0, 0], # 2
[ 0, 0, 0, 0], # 3
[ 0, 0, 0, 0], # 0
[ 0, 0, 0, 0], # 1
[13, 2, 0, 11]]) # 4
and sum all rows that share the same label. The output should be:
([[108,   7,   4, 117],  # 3
  [  2,   0,   1,   0],  # 0
  [  0,   0,   0,   0],  # 1
  [  0,   0,   0,   0],  # 2
  [ 13,   2,   0,  11]]) # 4
You could use groupby from pandas:
import pandas as pd

parr = pd.DataFrame(arr, index=lab)
print(parr.groupby(parr.index).sum())
0 1 2 3
0 2 0 1 0
1 0 0 0 0
2 0 0 0 0
3 108 7 4 117
4 13 2 0 11
numpy doesn't have a groupby function like pandas, but it does have a reduceat method that performs fast array operations on groups of elements (rows). Its application in this case is a bit messy, though.
Start with our 2 arrays:
In [39]: arr
Out[39]:
array([[81, 1, 3, 87],
[ 2, 0, 1, 0],
[13, 6, 0, 0],
[14, 0, 1, 30],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[13, 2, 0, 11]])
In [40]: lbls
Out[40]: array([3, 0, 3, 3, 1, 1, 2, 2, 3, 0, 1, 4])
Find the indices that will sort lbls (and rows of arr) into contiguous blocks:
In [41]: I=np.argsort(lbls)
In [42]: I
Out[42]: array([ 1, 9, 4, 5, 10, 6, 7, 0, 2, 3, 8, 11], dtype=int32)
In [43]: s_lbls=lbls[I]
In [44]: s_lbls
Out[44]: array([0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 3, 4])
In [45]: s_arr=arr[I,:]
In [46]: s_arr
Out[46]:
array([[ 2, 0, 1, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[81, 1, 3, 87],
[13, 6, 0, 0],
[14, 0, 1, 30],
[ 0, 0, 0, 0],
[13, 2, 0, 11]])
Find the boundaries of these blocks, i.e. where s_lbls jumps:
In [47]: J=np.where(np.diff(s_lbls))
In [48]: J
Out[48]: (array([ 1, 4, 6, 10], dtype=int32),)
Add the index of the start of the first block (see the reduceat docs)
In [49]: J1=[0]+J[0].tolist()
In [50]: J1
Out[50]: [0, 1, 4, 6, 10]
Apply add.reduceat:
In [51]: np.add.reduceat(s_arr,J1,axis=0)
Out[51]:
array([[ 2, 0, 1, 0],
[ 0, 0, 0, 0],
[ 0, 0, 0, 0],
[108, 7, 4, 117],
[ 13, 2, 0, 11]], dtype=int32)
These are your numbers, sorted by lbls (for 0,1,2,3,4).
With reduceat you could take other actions like maximum, product etc.
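For example, swapping np.add for np.maximum gives the per-group row-wise maximum using the same block boundaries. A self-contained sketch that rebuilds the sorted array and offsets from the steps above:

```python
import numpy as np

arr = np.array([[81, 1, 3, 87], [2, 0, 1, 0], [13, 6, 0, 0],
                [14, 0, 1, 30], [0, 0, 0, 0], [0, 0, 0, 0],
                [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0],
                [0, 0, 0, 0], [0, 0, 0, 0], [13, 2, 0, 11]])
lbls = np.array([3, 0, 3, 3, 1, 1, 2, 2, 3, 0, 1, 4])

I = np.argsort(lbls, kind='stable')              # sort rows into label blocks
s_arr = arr[I]
s_lbls = lbls[I]
J1 = np.r_[0, 1 + np.where(np.diff(s_lbls))[0]]  # block start offsets

res = np.maximum.reduceat(s_arr, J1, axis=0)     # column-wise max per block
print(res)
```

The same pattern works with np.minimum, np.multiply, and the other binary ufuncs.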
I have the following labels
>>> lab
array([2, 2, 2, 2, 2, 3, 3, 0, 0, 0, 0, 1])
I want to assign these labels to the rows of another numpy array, i.e.
>>> arr
array([[81, 1, 3, 87], # 2
[ 2, 0, 1, 0], # 2
[13, 6, 0, 0], # 2
[14, 0, 1, 30], # 2
[ 0, 0, 0, 0], # 2
[ 0, 0, 0, 0], # 3
[ 0, 0, 0, 0], # 3
[ 0, 0, 0, 0], # 0
[ 0, 0, 0, 0], # 0
[ 0, 0, 0, 0], # 0
[ 0, 0, 0, 0], # 0
[13, 2, 0, 11]]) # 1
and sum the rows of the 0th group, 1st group, 2nd group, and 3rd group?
If the labels of equal values are contiguous, as in your example, then you may use np.add.reduceat:
>>> lab
array([2, 2, 2, 2, 2, 3, 3, 0, 0, 0, 0, 1])
>>> idx = np.r_[0, 1 + np.where(lab[1:] != lab[:-1])[0]]
>>> np.add.reduceat(arr, idx)
array([[110, 7, 5, 117], # 2
[ 0, 0, 0, 0], # 3
[ 0, 0, 0, 0], # 0
[ 13, 2, 0, 11]]) # 1
If they are not contiguous, use np.argsort to align the array and labels so that equal labels end up next to each other:
>>> i = np.argsort(lab)
>>> lab, arr = lab[i], arr[i, :] # aligns array and labels such that labels
>>> lab # are sorted and equal labels are contiguous
array([0, 0, 0, 0, 1, 2, 2, 2, 2, 2, 3, 3])
>>> idx = np.r_[0, 1 + np.where(lab[1:] != lab[:-1])[0]]
>>> np.add.reduceat(arr, idx)
array([[ 0, 0, 0, 0], # 0
[ 13, 2, 0, 11], # 1
[110, 7, 5, 117], # 2
[ 0, 0, 0, 0]]) # 3
Or alternatively use groupby from the pandas library:
>>> import pandas as pd
>>> pd.DataFrame(arr).groupby(lab).sum().values
array([[ 0, 0, 0, 0],
[ 13, 2, 0, 11],
[110, 7, 5, 117],
[ 0, 0, 0, 0]])
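A third option that needs neither sorting nor pandas is np.add.at, which accumulates rows into an output array indexed by label (this assumes the labels are small non-negative integers, as they are here):

```python
import numpy as np

arr = np.array([[81, 1, 3, 87], [2, 0, 1, 0], [13, 6, 0, 0],
                [14, 0, 1, 30], [0, 0, 0, 0], [0, 0, 0, 0],
                [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0],
                [0, 0, 0, 0], [0, 0, 0, 0], [13, 2, 0, 11]])
lab = np.array([2, 2, 2, 2, 2, 3, 3, 0, 0, 0, 0, 1])

# one output row per label; np.add.at is unbuffered, so rows with
# repeated labels accumulate correctly
out = np.zeros((lab.max() + 1, arr.shape[1]), dtype=arr.dtype)
np.add.at(out, lab, arr)
print(out)
```

This yields one summed row per label, already ordered by label value, with no argsort step.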