NetworkX problem with label and id when reading and writing GML - python-3.x

I have the following example, where I create a graph programmatically, write it to a GML file, and read the file back into a graph.
I want to be able to use the graph loaded from file in place of the programmatically created one:
import networkx as nx
g = nx.Graph()
g.add_edge(1,4)
nx.write_gml(g, "test.gml")
gg = nx.read_gml("test.gml", label="label")
print(gg.edges(data=True))
The contents of test.gml are as follows:
graph [
  node [
    id 0
    label "1"
  ]
  node [
    id 1
    label "4"
  ]
  edge [
    source 0
    target 1
  ]
]
Nodes 1 and 4 from the Python code are now represented by two nodes with IDs 0 and 1 and labels "1" and "4".
After reading the file, I now have to access node 4 as follows:
gg['4']
instead of
g[4]
for the original graph.
I could of course make sure to cast every node to string before looking up the node, but this is not practical for huge graphs.
An alternative would be to programmatically create (yet another) graph that is identical to g but with integer keys, but this is even more cumbersome.
What should I do?

Try:
nx.read_gml(fpath, destringizer=int)
Ref:
https://networkx.org/documentation/stable/reference/readwrite/generated/networkx.readwrite.gml.read_gml.html
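A quick round-trip sketch for the example above, assuming the same test.gml file as in the question:

```python
import networkx as nx

g = nx.Graph()
g.add_edge(1, 4)
nx.write_gml(g, "test.gml")

# destringizer=int converts numeric-looking labels such as "4" back to
# the integer 4, so the reloaded graph is indexed like the original.
gg = nx.read_gml("test.gml", destringizer=int)
print(sorted(gg.edges()))  # [(1, 4)]
```

With this, gg[4] works just like g[4], with no string casts needed.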

Related

What is the right way to use Nested states with pytransitions?

So I've been looking around on the pytransitions GitHub and SO, and it seems that after 0.8 the way you could use macro-states (or a super state with substates in it) has changed. I would like to know if it's still possible to create such a machine with pytransitions (the blue square is supposed to be a macro-state that has 2 states in it, one of them, the green one, being another macro):
Or do I have to follow the workflow suggested here: https://github.com/pytransitions/transitions/issues/332 ?
Thanks a lot for any info!
I would like to know if it's still possible to create such a machine with pytransition.
The way HSMs are created and managed has changed in 0.8, but you can of course still use (deeply) nested states. For a state to have substates, you need to pass the states (or children) parameter with the state definitions/objects you'd like to nest. Furthermore, you can pass transitions for that particular scope. I am using HierarchicalGraphMachine since this allows me to create a graph right away.
from transitions.extensions.factory import HierarchicalGraphMachine

states = [
    # create a state named A
    {"name": "A",
     # with the following children
     "states":
         # a state named '1' which will be accessible as 'A_1'
         ["1", {
             # and a state '2' with its own children ...
             "name": "2",
             # ... 'a' and 'b'
             "states": ["a", "b"],
             "transitions": [["go", "a", "b"], ["go", "b", "a"]],
             # when '2' is entered, 'a' should be entered automatically.
             "initial": "a"
         }],
     # we could also pass [["go", "A_1", "A_2"]] to the machine constructor
     "transitions": [["go", "1", "2"]],
     "initial": "1"
     }]

m = HierarchicalGraphMachine(states=states, initial="A")
m.go()
m.get_graph().draw("foo.png", prog="dot")  # [1]
Output of [1]:

How does sklearn.linear_model.LinearRegression work with insufficient data?

To solve a 5 parameter model, I need at least 5 data points to get a unique solution. For x and y data below:
import numpy as np
x = np.array([[-0.24155831, 0.37083184, -1.69002708, 1.4578805 , 0.91790011,
0.31648635, -0.15957368],
[-0.37541846, -0.14572825, -2.19695883, 1.01136142, 0.57288752,
0.32080956, -0.82986857],
[ 0.33815532, 3.1123936 , -0.29317028, 3.01493602, 1.64978158,
0.56301755, 1.3958912 ],
[ 0.84486735, 4.74567324, 0.7982888 , 3.56604097, 1.47633894,
1.38743513, 3.0679506 ],
[-0.2752026 , 2.9110031 , 0.19218081, 2.0691105 , 0.49240373,
1.63213241, 2.4235483 ],
[ 0.89942508, 5.09052174, 1.26048572, 3.73477373, 1.4302902 ,
1.91907482, 3.70126468]])
y = np.array([-0.81388378, -1.59719762, -0.08256274, 0.61297275, 0.99359647,
1.11315445])
I used only 6 data points to fit an 8-parameter model (7 slopes and 1 intercept).
from sklearn.linear_model import LinearRegression

lr = LinearRegression().fit(x, y)
print(lr.coef_)
array([-0.83916772, -0.57249998, 0.73025938, -0.02065629, 0.47637768,
-0.36962192, 0.99128474])
print(lr.intercept_)
0.2978781587718828
Clearly, it's using some kind of assumption to reduce the degrees of freedom. I tried to look into the source code but couldn't find anything about that. What method does it use to find the parameters of an underspecified model?
You don't need to reduce the degrees of freedom; it simply finds a solution to the least-squares problem min sum_i (dot(beta, x_i) + beta_0 - y_i)**2. In the non-sparse case it uses scipy.linalg.lstsq. The default solver for this optimization problem is the gelsd LAPACK driver. If
A = np.concatenate((ones_v, X), axis=1)
is the augmented array with ones as its first column, then your solution is given by
beta = np.linalg.pinv(A.T @ A) @ A.T @ y
where we use the pseudoinverse precisely because A.T @ A may not be of full rank (note the @ operator: on NumPy arrays, * is elementwise). Of course, the solver doesn't actually evaluate this formula; it uses a singular value decomposition of A instead.
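This can be checked numerically with a minimal sketch, assuming only NumPy (the data below is random stand-in data, not the question's):

```python
import numpy as np

# Underdetermined system: 6 samples, 7 features, plus an intercept
# column of ones -> 8 parameters, as in the question.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 7))
y = rng.normal(size=6)
A = np.concatenate((np.ones((6, 1)), X), axis=1)

# Minimum-norm least-squares solution via the pseudoinverse ...
beta_pinv = np.linalg.pinv(A) @ y

# ... agrees with np.linalg.lstsq, which calls the gelsd LAPACK driver.
beta_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

print(np.allclose(beta_pinv, beta_lstsq))  # True
```

Among the infinitely many exact fits, both return the solution with the smallest Euclidean norm, which is why the coefficients are uniquely determined even with fewer samples than parameters.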

Can't access [myself] of myself in nested ask

I want to calculate the mean of the weights of some of the links, called "fs", of a turtle (let's call it turtle A).
In particular, I need to select the turtles that have a "C"-breed link with another turtle, B, and calculate the mean of the weights of the fs links of turtle A with this agentset.
I am doing it with a reporter, mean-fs.
To select the links I want to take the mean of, I need to nest some asks and access [myself] of myself, which should be turtle A.
I've tried passing the who of turtle A to the reporter mean-fs, but I get the error "expected literal value".
I've tried [myself] of myself, but it returns self.
Here is the part of the code I'm talking about:
to go
  ask turtles [
    move
    set-conv
  ]
end

to set-conv
  ask other turtles in-cone 4 120 [
    ...
    if (C-neighbors != nobody) [ add-partecipant ]
  ]
end

to add-partecipant
  let turtle1 myself
  if ( mean-fs [ turtle1 ] > 0 )
  [ ... ]
end

to-report mean-fs [turtle1]
  let w-list []
  ask C-neighbors [
    set w-list lput [w] of f-with turtle1 w-list
  ]
  report mean w-list
end

How to initialize a list of lists of lists with unknown dimensions

Here d is a list of lists of lists with a structure like this:
[
  [
    [vocab[START], vocab["image1"], vocab["caption1"], vocab[END]],
    [vocab[START], vocab["image1"], vocab["caption2"], vocab[END]],
    ...
  ],
  ...
]
I don't know the dimensions in advance, so I have a problem initializing it. Assuming an upper limit, I could have used the xrange function like this:
d = [[[[] for k in xrange(50)] for j in xrange(10)] for i in xrange(8769)]
but I'm working in Python 3, where xrange no longer exists. The code goes like this:
for i in range(len(t)):
    for j in range(len(t[i])):
        d[i][j][0] = vocab[START]
        for k in range(len(t[i][j])):
            if t[i][j][k] not in list(vocab.keys()):
                d[i][j][k+1] = vocab[UNK]
            else:
                d[i][j][k+1] = vocab[t[i][j][k]]
Any help on this is appreciated.
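Since the dimensions are unknown, one alternative is to grow the nested lists with append instead of pre-sizing them. A minimal sketch; the names vocab, t, START, END, and UNK are placeholders mirroring the question:

```python
# Stand-in vocabulary and raw token sequences (assumptions, not the
# question's real data).
START, END, UNK = "<s>", "</s>", "<unk>"
vocab = {START: 0, END: 1, UNK: 2, "image1": 3, "caption1": 4}
t = [[["image1", "caption1", "missing_word"]]]

# Build the nested structure bottom-up; no upper limit on any
# dimension is needed.
d = []
for group in t:
    rows = []
    for seq in group:
        row = [vocab[START]]
        # dict.get with a default replaces the `not in vocab` check.
        row += [vocab.get(tok, vocab[UNK]) for tok in seq]
        row.append(vocab[END])
        rows.append(row)
    d.append(rows)

print(d)  # [[[0, 3, 4, 2, 1]]]
```

Each inner list is exactly as long as its sequence requires, so ragged data poses no problem.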

Find clusters by edge definition in arangodb

Does ArangoDB provide a utility to list clusters for a given edge definition?
E.g. Given the graph:
Tyrion ----sibling---> Cercei ---sibling---> Jamie
Bran ---sibling--> Arya ---sibling--> Jon
I'd want something like the following:
my_graph._getClusters({edge: "sibling"}) -> [ [Tyrion, Cercei, Jamie], [Bran, Arya, Jon] ]
Provided you have a graph named sibling, the following query will find all paths in the graph that are connected by edges of type sibling and that have a (path) length of 3. This should match the example data you provided:
LET options = {
  followEdges: [
    { type: 'sibling' }
  ]
}
FOR i IN GRAPH_TRAVERSAL('sibling', { }, "outbound", options)
  FILTER LENGTH(i) == 3
  RETURN i[*].vertex._key
Omitting or adjusting the FILTER will also find longer or shorter paths in the graph.
