SystemModeler Connector Weight - connect

I'm looking for "where to start" - I expect that this problem is a bit involved for this forum, but I need a start point, and my search has not been successful as of yet :( any input would be appreciated...
I need to create a Weighted Graph using the SystemModeler / OpenModelica interface. The first step of our process will skip the SystemModeler simulation and pass the model to Mathematica for processing
My question is about adding attributes to a connector in the System Modeler GUI:
I need to draw a model such that State A is connected to State B and State C, with a weight of 0.7 for the path to B and 0.3 for the path to C. I need to create an object to hold the weight and associate it with the connector. I also need to warn when the weights of the connectors leaving a given state do not add up to 1.
Any ideas on where to start ?

As connections in Modelica do not themselves hold any information, but rather pass along information from the blocks they connect, I believe you have two options:
Put a component between two nodes that specifies the weight of the connection.
Have a defined input and output from each node where the output from a node specifies the weight of the connection, and the inputs on a node are summed to check that they equal 1.
Here is an example of how you could do the latter:
model WeightedGraph
  model Node
    Modelica.Blocks.Interfaces.RealInput u[nin];
    Modelica.Blocks.Interfaces.RealOutput y[size(k, 1)];
    Real usum;
    parameter Real k[:] = {0};
    parameter Integer nin = 0;
  equation
    y = k;
    usum = sum(u);
  end Node;
  Node A(nin = 0, k = {0.7});
  Node B(nin = 1, k = {0.3});
  Node C(nin = 1);
equation
  connect(A.y[1], B.u[1]);
  connect(B.y[1], C.u[1]);
end WeightedGraph;
The number of inputs into each node needs to be specified using the nin parameter. The number of outputs will be equal to the length of k, which is a list where you specify the weight of each connection. You could, for example, use assert to check that the relevant sum adds to 1 (the example above computes usum, the sum of the inputs; the sum of a node's outgoing weights would be sum(k)), or do that check in Mathematica if you prefer.
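If the check is done outside Modelica, the idea is simply to group the weights by source node and compare each total against 1. Here is a minimal sketch in Python, assuming the graph has already been exported as (source, target, weight) tuples. The function name unbalanced_nodes and the edge-list format are hypothetical and are not part of SystemModeler or Mathematica:

```python
import math
from collections import defaultdict


def unbalanced_nodes(edges, tol=1e-9):
    """Return {node: total} for nodes whose outgoing weights do not sum to 1."""
    totals = defaultdict(float)
    for source, _target, weight in edges:
        totals[source] += weight
    return {
        node: total
        for node, total in totals.items()
        if not math.isclose(total, 1.0, abs_tol=tol)
    }


# A -> B (0.7) and A -> C (0.3) as in the question, plus an
# illustrative edge B -> C (0.4) that should trigger a warning.
edges = [("A", "B", 0.7), ("A", "C", 0.3), ("B", "C", 0.4)]
bad = unbalanced_nodes(edges)
for node, total in bad.items():
    print(f"warning: weights leaving {node} sum to {total}, not 1")
```

Nodes with no outgoing edges (C here) never appear in the totals, so they are not reported; whether that is the desired behaviour depends on your model.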


Can you use the built in derivative functions in compute shaders? (vulkan)

I want to use the built-in derivative functions:
vec3 dpdx = dFdx(p);
vec3 dpdy = dFdy(p);
Inside a compute shader. However I get the following error:
Message ID name: UNASSIGNED-CoreValidation-Shader-InconsistentSpirv
Message: Validation Error: [ UNASSIGNED-CoreValidation-Shader-InconsistentSpirv ] Object 0: handle = 0x5654380d4dd8, name = Logical device: GeForce GT 1030, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x6bbb14 | SPIR-V module not valid: OpEntryPoint Entry Point <id> '5[%main]'s callgraph contains function <id> 46[%BiplanarMapping_s21_vf3_vf3_f1_], which cannot be used with the current execution modes:
Derivative instructions require DerivativeGroupQuadsNV or DerivativeGroupLinearNV execution mode for GLCompute execution model: DPdx
Derivative instructions require DerivativeGroupQuadsNV or DerivativeGroupLinearNV execution mode for GLCompute execution model: DPdy
%BiplanarMapping_s21_vf3_vf3_f1_ = OpFunction %v4float None %41
Severity: VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT
I don't seem to find anything on the topic when I search online.
Derivative functions only work in a fragment shader. The derivatives are based on the rate-of-change of the value across the primitive being rendered. Obviously compute shaders don't render primitives, so there is nothing to compute.
Apparently, NVIDIA has an extension that provides some derivative computation capabilities for compute shaders. That's where the weird error comes from.
Derivatives in fragment shaders are computed by subtracting between the same value from adjacent invocations. As such, you can emulate this by using shared variables.
First, you have to make sure that the spatially adjacent invocations are in the same work group. So your work group size needs to be some multiple of 2x2 invocations. Then, you need a shared variable array, which you index by invocations within a work group. Each invocation should write its own value to its own index.
To compute the derivative, issue a barrier (with memoryBarrierShared) after writing the values to the shared variables. Take the difference between one's invocation and the adjacent one in the same 2x2 quad. You should make sure that all invocations in the same quad get the same value, by always subtracting between the lower index and the higher index within the quad. Something like this:
// Index of the lowest invocation in this invocation's 2x2 quad.
uvec2 quadIndex = (gl_LocalInvocationID.xy / 2u) * 2u;
/*type*/ derFdX = variable[quadIndex.x + 1][quadIndex.y + 0] - variable[quadIndex.x + 0][quadIndex.y + 0];
/*type*/ derFdY = variable[quadIndex.x + 0][quadIndex.y + 1] - variable[quadIndex.x + 0][quadIndex.y + 0];
The NVIDIA extension basically does this for you, though it's probably more efficient since it wouldn't need the shared variable.
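To make the quad logic concrete, here is a small CPU-side sketch in Python (not shader code) that mimics what each invocation would read from the shared array. It assumes the work group is a 2D grid with even dimensions stored as a list of rows; the function name quad_derivatives is made up for illustration:

```python
def quad_derivatives(values):
    """Emulate per-quad derivatives on a 2D grid (rows = y, columns = x).

    Every cell in a 2x2 quad receives the same pair of differences,
    taken from the quad's lowest-index corner.
    """
    height, width = len(values), len(values[0])
    if height % 2 or width % 2:
        raise ValueError("grid dimensions must be multiples of 2")
    ddx = [[0.0] * width for _ in range(height)]
    ddy = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            bx, by = (x // 2) * 2, (y // 2) * 2  # base corner of the quad
            ddx[y][x] = values[by][bx + 1] - values[by][bx]
            ddy[y][x] = values[by + 1][bx] - values[by][bx]
    return ddx, ddy


# A linear field 3*x + 5*y has constant differences 3 and 5.
grid = [[3.0 * x + 5.0 * y for x in range(4)] for y in range(4)]
ddx, ddy = quad_derivatives(grid)
```

For a linear field the differences are the same everywhere; for a nonlinear one they are constant within each quad but change from quad to quad.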

Discretizing PDE in space for use with modelica

I am currently doing a course called "Modeling of dynamic systems" and have been given the task of modeling a warm water tank in modelica with a distributed temperature description.
Most of the tasks have gone well, and my group is left with the task of introducing the heat flux due to buoyancy effects into the model. Here is where we get stuck.
the equation given is this:
Given PDE
But how do we discretize this into something we can use in modelica?
The discretized version we ended up with was this:
(Qd_pp_b[k+1] - Qd_pp_b[k]) / h_dz = -K_b *(T[k+1] - 2 * T[k] + T[k-1]) / h_dz^2
where Qd_pp_b is the left-hand side variable, ie the heat flux, k is the current slice of the tank and T is the temperature in the slices.
Are we on the right path? or completely wrong?
As written, this doesn't seem to be a differential equation, so it does not make sense without the surrounding problem. For the second derivative you should always create auxiliary variables, with a separate equation for each partial derivative. I added dummy values for the parameters and dummy equations for T[k]. This can be simulated; is this about what you expected?
model test
  constant Integer n = 10;
  Real[n] Qd_pp_b;
  Real[n] dT;
  Real[n] T;
  parameter Real K_b = 1;
equation
  for k in 1:n loop
    der(Qd_pp_b[k]) = -K_b *der(dT[k]);
    der(T[k]) = dT[k];
    T[k] = sin(time+k);
  end for;
end test;
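For reference, the right-hand side of the discretized equation in the question is a central second difference over the slices. The sketch below, in Python rather than Modelica, only illustrates that stencil on a uniform grid with spacing h_dz; it does not confirm that the discretization matches the PDE, which is not shown here. The function name second_difference is made up for illustration:

```python
def second_difference(T, h_dz):
    """Central approximation of d2T/dz2 at the interior slices of a uniform grid."""
    return [
        (T[k + 1] - 2.0 * T[k] + T[k - 1]) / h_dz**2
        for k in range(1, len(T) - 1)
    ]


# For T(z) = z^2 the second derivative is exactly 2 everywhere.
h_dz = 0.5
T = [(k * h_dz) ** 2 for k in range(5)]
d2T = second_difference(T, h_dz)
```

Note that the stencil is only defined for interior slices; the first and last slice need boundary conditions, which have to come from the physical problem.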

PageRank with custom initial scores

I am trying to implement a simple algorithm that will calculate PageRank on a directed network generated and handled with NetworkX. However, I'd like to add a simple change: rather than having the initial PageRank for each node be equal to 1/n, where n is the number of nodes in the graph, I want each node to have rank 1.
So far I have tried checking the official documentation on PageRank, but I found nothing that seems to help. Apparently the 'personalization' parameter is of no use either. I tried using nstart, but to no avail. The code currently looks like this:
import networkx as nx
D=nx.DiGraph()
D.add_weighted_edges_from([('1','2',0.5),('1','3',0.5)])
nst = {n: 1 for n in D.nodes}
print(nx.pagerank(D, alpha = 0.95, nstart=nst))
At the moment, the ranks given to each node at the end of the calculation still sum up to 1, while they should sum up to 3.
Is such a thing even feasible to begin with? Should I look elsewhere to implement such an algorithm? Could there be problems with convergence if such a change is applied? Thanks in advance.
PageRank in networkx has a parameter nstart:
nstart (dictionary, optional) – Starting value of PageRank iteration for each node.
Here is source code for this:
# Choose fixed starting vector if not given
if nstart is None:
    x = dict.fromkeys(W, 1.0 / N)
else:
    # Normalized nstart vector
    s = float(sum(nstart.values()))
    x = dict((k, v / s) for k, v in nstart.items())
You can just specify nstart in your code, like this:
nst = {n: 1 for n in G.nodes}
pr = nx.pagerank(G, nstart=nst)
Edit 1: The modern PageRank algorithm forcibly normalizes the start vector (you can see it in the code above). The whole algorithm is based on this, and if you force the nstart values to be 1 rather than 1/N, it will break, because the convergence check will never be satisfied (the error increases each iteration).
If you want to use 1 as the starting value, as in the original PageRank algorithm:
In the original form of PageRank, the sum of PageRank over all pages was the total number of pages on the web at that time, so each page in this example would have an initial value of 1.
You will have to implement the whole algorithm manually, because this original form is deprecated.
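As a starting point for a manual implementation, here is a minimal pure-Python sketch of a power iteration in which every node starts at rank 1 and the ranks keep summing to the number of nodes. It is not NetworkX code: the function name pagerank_sum_n, the uniform redistribution of rank from nodes without outgoing edges, and the convergence test scaled by the node count are all assumptions made for this illustration.

```python
def pagerank_sum_n(edges, damping=0.85, tol=1e-10, max_iter=1000):
    """PageRank variant whose ranks sum to N; edges are (source, target, weight)."""
    nodes = sorted({node for u, v, _w in edges for node in (u, v)})
    n = len(nodes)
    out = {u: {} for u in nodes}
    for u, v, w in edges:
        out[u][v] = out[u].get(v, 0.0) + w

    rank = dict.fromkeys(nodes, 1.0)  # every node starts at 1, not 1/N
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out[u])
        new = dict.fromkeys(nodes, (1.0 - damping) + damping * dangling / n)
        for u in nodes:
            total = sum(out[u].values())
            for v, w in out[u].items():
                new[v] += damping * rank[u] * w / total
        err = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if err < n * tol:
            return rank
    raise RuntimeError("power iteration did not converge")


ranks = pagerank_sum_n([("1", "2", 0.5), ("1", "3", 0.5)])
print(ranks, sum(ranks.values()))
```

Because each update is linear in the ranks, this variant differs from the normalized one only by a constant factor of N, so multiplying normalized scores by the number of nodes should give the same ranking.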

Find path following edges with greatest value in ArangoDB

Let's say that in my graph I've got edges that have a field called value. After selecting a start vertex I would like to find a path by always selecting the edge that has the highest value. Unfortunately I can't figure out how to write a proper query. Is this possible in ArangoDB?
Hi, I am unsure what you would like to achieve. There are two possible scenarios that I can imagine from your description:
First: Shortest Path
The use-case here is you know the starting vertex and the target vertex, and you want to find the shortest (or cheapest) path between those two.
The built in SHORTEST_PATH (https://docs.arangodb.com/3.1/AQL/Graphs/ShortestPath.html#shortest-path-in-aql) feature can serve it by defining the distance attribute in the options like this:
FOR v IN OUTBOUND SHORTEST_PATH @start TO @end @@edgeCollections OPTIONS {weightAttribute: "value", defaultWeight: 1}
  RETURN v
This will give you all vertices on the path from start to end that has the lowest sum of value attributes. If you need the "highest value" instead, you could store 1/value in a separate field and use that field as the weight attribute. The shortest path then favors few edges with high values, although it minimizes the sum of 1/value rather than strictly maximizing the sum of value.
Second: Sorting of edges
The use case is you only have the starting vertex and want to get the connected vertices, ordered by the value on the edges. There you can simply combine the traversal statement with a simple sort. (https://docs.arangodb.com/3.1/AQL/Graphs/Traversals.html#graph-traversals-in-aql):
FOR v, e IN OUTBOUND @start @@edgeCollection
  SORT e.value DESC
  LIMIT 1 /* Only pick the highest one */
  RETURN {v: v, e: e}
Third use-case: Iterating several depths using only the highest value
The AQL in use-case 2 can be chained up to an arbitrary depth, which has to be known a priori. So say you would like to iterate 3 steps, only using the edge with the highest value:
FOR v1, e1 IN OUTBOUND @start @@edgeCollection
  SORT e1.value DESC
  LIMIT 1 /* Only pick the highest one */
  /* Depth 1 done, now depth 2 */
  FOR v2, e2 IN OUTBOUND v1 @@edgeCollection
    SORT e2.value DESC
    LIMIT 1 /* Only pick the highest one */
    FOR v3, e3 IN OUTBOUND v2 @@edgeCollection
      SORT e3.value DESC
      LIMIT 1 /* Only pick the highest one */
      RETURN [v1,v2,v3]
Fourth use-case:
The depth is not known a priori. In this case pure AQL in the currently released version (3.1) cannot formulate this. It will be easier to use a Foxx service (https://docs.arangodb.com/3.1/Manual/Foxx/#foxx) using the traversal module (https://docs.arangodb.com/3.1/Manual/Graphs/Traversals/UsingTraversalObjects.html#getting-started), which is a bit more flexible but can only be implemented in JavaScript.
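Independent of where it runs, the logic for unknown depth is a simple greedy loop: from the current vertex, follow the unvisited outgoing edge with the highest value until none is left. The sketch below shows that loop in Python on an in-memory adjacency dictionary. It does not use any ArangoDB API; the function name greedy_path, the data layout, and the rule of skipping already visited vertices to avoid cycles are assumptions made for illustration:

```python
def greedy_path(graph, start):
    """Follow the highest-valued outgoing edge until no unvisited neighbor is left.

    graph maps a vertex to a list of (neighbor, value) pairs.
    """
    path = [start]
    visited = {start}
    current = start
    while True:
        candidates = [
            (value, neighbor)
            for neighbor, value in graph.get(current, [])
            if neighbor not in visited
        ]
        if not candidates:
            return path
        _value, current = max(candidates, key=lambda c: c[0])
        path.append(current)
        visited.add(current)


graph = {
    "a": [("b", 2), ("c", 5)],
    "c": [("d", 1), ("a", 9)],  # the edge back to "a" is skipped (already visited)
    "d": [],
}
path = greedy_path(graph, "a")
```

In a Foxx service the same loop would issue one edge lookup per step instead of reading from a dictionary.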

What does `sample_weight` do to the way a `DecisionTreeClassifier` works in sklearn?

I've read from the relevant documentation that :
Class balancing can be done by sampling an equal number of samples from each class, or preferably by normalizing the sum of the sample weights (sample_weight) for each class to the same value.
But, it is still unclear to me how this works. If I set sample_weight with an array of only two possible values, 1's and 2's, does this mean that the samples with 2's will get sampled twice as often as the samples with 1's when doing the bagging? I cannot think of a practical example for this.
Some quick preliminaries:
Let's say we have a classification problem with K classes. In a region of feature space represented by the node of a decision tree, recall that the "impurity" of the region is measured by quantifying the inhomogeneity, using the probability of the class in that region. Normally, we estimate:
Pr(Class=k) = #(examples of class k in region) / #(total examples in region)
The impurity measure takes as input, the array of class probabilities:
[Pr(Class=1), Pr(Class=2), ..., Pr(Class=K)]
and spits out a number, which tells you how "impure" or how inhomogeneous-by-class the region of feature space is. For example, the gini measure for a two class problem is 2*p*(1-p), where p = Pr(Class=1) and 1-p=Pr(Class=2).
Now, basically the short answer to your question is:
sample_weight augments the probability estimates in the probability array ... which augments the impurity measure ... which augments how nodes are split ... which augments how the tree is built ... which augments how feature space is diced up for classification.
I believe this is best illustrated through example.
First consider the following 2-class problem where the inputs are 1 dimensional:
from sklearn.tree import DecisionTreeClassifier as DTC
X = [[0],[1],[2]] # 3 simple training examples
Y = [ 1, 2, 1 ] # class labels
dtc = DTC(max_depth=1)
So, we'll look at trees with just a root node and two children. Note that the default impurity measure is the gini measure.
Case 1: no sample_weight
dtc.fit(X,Y)
print(dtc.tree_.threshold)
# [0.5, -2, -2]
print(dtc.tree_.impurity)
# [0.44444444, 0, 0.5]
The first value in the threshold array tells us that the 1st training example is sent to the left child node, and the 2nd and 3rd training examples are sent to the right child node. The last two values in threshold are placeholders and are to be ignored. The impurity array tells us the computed impurity values in the parent, left, and right nodes respectively.
In the parent node, p = Pr(Class=1) = 2. / 3., so that gini = 2*(2.0/3.0)*(1.0/3.0) = 0.444..... You can confirm the child node impurities as well.
Case 2: with sample_weight
Now, let's try:
dtc.fit(X,Y,sample_weight=[1,2,3])
print(dtc.tree_.threshold)
# [1.5, -2, -2]
print(dtc.tree_.impurity)
# [0.44444444, 0.44444444, 0.]
You can see the feature threshold is different. sample_weight also affects the impurity measure in each node. Specifically, in the probability estimates, the first training example is counted the same, the second is counted double, and the third is counted triple, due to the sample weights we've provided.
The impurity in the parent node region is the same. This is just a coincidence. We can compute it directly:
p = Pr(Class=1) = (1+3) / (1+2+3) = 2.0/3.0
The gini measure of 4/9 follows.
Now, you can see from the chosen threshold that the first and second training examples are sent to the left child node, while the third is sent to the right. We see that impurity is calculated to be 4/9 also in the left child node because:
p = Pr(Class=1) = 1 / (1+2) = 1/3.
The impurity of zero in the right child is due to only one training example lying in that region.
You can extend this to non-integer sample weights similarly. I recommend trying something like sample_weight = [1,2,2.5], and confirming the computed impurities.
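To confirm the impurities by hand, the weighted class probabilities can be computed directly. This is a small standalone sketch that does not call scikit-learn; the helper name weighted_gini is made up for illustration, and it uses the general form 1 - sum(p_k^2), which reduces to 2*p*(1-p) for two classes:

```python
from collections import defaultdict


def weighted_gini(labels, weights):
    """Gini impurity with each example counted according to its weight."""
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    total = sum(totals.values())
    return 1.0 - sum((w / total) ** 2 for w in totals.values())


Y = [1, 2, 1]
w = [1, 2, 3]
parent = weighted_gini(Y, w)        # all three examples
left = weighted_gini(Y[:2], w[:2])  # first and second example
right = weighted_gini(Y[2:], w[2:]) # third example only
print(parent, left, right)

# Parent node for the suggested non-integer weights.
print(weighted_gini(Y, [1, 2, 2.5]))
```

The three values for the weights [1,2,3] correspond to the parent, left, and right entries of the impurity array in Case 2.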
