How to receive a whole number instead of a decimal for a reported ratio in percentages in NetLogo?

I've set up firms (turtles) in an industry (world) which either produce at home (firms located at ycor > 0) or have offshored their production (firms located at ycor < 0). I have given them a firms-own variable called offshored?, which is set to either true or false.
I have a monitor on my interface which shows the share of firms which have offshored and the share which produce at home, in % of all firms in the world:
breed [ firms firm ]

firms-own [
  offshored? ;; true or false
]

to-report percentage-of-firms-at-home ;; monitors the % of firms which produce at home
  report ( ( count firms with [ ycor > 0 ] ) / count firms ) * 100
end

to-report percentage-of-offshored-firms ;; monitors the % of offshored firms
  report ( ( count firms with [ ycor < 0 ] ) / count firms ) * 100
end
I then plugged percentage-of-offshored-firms into a monitor on the interface. Now I would like a whole number to show up for my reported percentage. How can I change the decimal number I currently receive into a whole one?
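For concreteness, here is the same arithmetic sketched in Python rather than NetLogo (the counts 3 and 7 are made up), showing the decimal the monitor currently displays and the whole number being asked for:

# 3 of 7 firms produce at home
pct = (3 / 7) * 100
print(pct)         # 42.857142857142854, the decimal the monitor shows
print(round(pct))  # 43, the whole number asked for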

Related

Appropriate test for significance of difference

I need help conducting a hypothesis test to compare the coefficients of two of my explanatory variables in Stata. My null and alternative hypotheses are:
null: β1 = β2 vs. alt: β1 > β2.
So far I have used the test command to compare the two estimates. However, I don't know whether test can be modified to fit my one-sided alternative.
To do one-sided tests on coefficients, you can
1. perform the corresponding two-sided test on the coefficients (or sometimes just look at the regression output), and
2. use the results to get a p-value for the one-sided test. This step can be done in two ways: either by using the reverse cumulative Student's t distribution directly, or by doing some arithmetic on the p-value from the two-sided test.
If you are testing differences of coefficients (since a = b is equivalent to a - b = 0), the approach is the same as for single coefficients. You need to do a two-sided test of the difference:
sysuse auto, clear
regress price mpg weight
gen high_mpg = mpg>20
gen high_weight = weight>3000
reg price high_mpg high_weight foreign
/* Test H0: diff = 0 */
test high_weight - foreign = 0
display r(p)
display r(F)
/* The ttail approach works whether the estimated difference is positive or negative */
local sign_diff = sign(_b[high_weight] - _b[foreign] - 0)
display "p-value for Ha: diff > 0 = " ttail(r(df_r),`sign_diff'*sqrt(r(F)))
display "p-value for Ha: diff < 0 = " 1-ttail(r(df_r),`sign_diff'*sqrt(r(F)))
/* Can also do it by hand like this if the estimated diff is positive (like above) */
display "p-value for Ha: diff > 0 = " r(p)/2
display "p-value for Ha: diff < 0 = " 1-r(p)/2
/* if difference is negative, you can still do it by hand */
/* but need to flip the p-value division rules since we are on the other */
/* side of the distribution */
/* Test H0': diff2 = -400 */
test high_mpg - foreign = -400
local sign_diff2 = sign(_b[high_mpg] - _b[foreign] + 400)
display "p-value for Ha': diff2 < 0 = " ttail(r(df_r),`sign_diff2'*sqrt(r(F)))
display 1-r(p)/2
display "p-value for Ha': diff2 > 0 = " 1-ttail(r(df_r),`sign_diff2'*sqrt(r(F)))
display r(p)/2
If your test returns r(chi2) instead of r(F), the statistic is a z rather than a t, so you need to swap the ttail(r(df_r), ...) part for
1-normal(`sign_diff'*sqrt(r(chi2)))
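The same two-sided-to-one-sided arithmetic can be checked outside Stata. Here is a minimal Python sketch using numpy, scipy, and statsmodels (the simulated regression and the coefficient names b1 and b2 are illustrative, not from the question):

import numpy as np
from scipy import stats
import statsmodels.api as sm

# Simulated regression; columns 1 and 2 play the roles of the two coefficients.
rng = np.random.default_rng(0)
n = 200
X = sm.add_constant(rng.normal(size=(n, 2)))
y = X @ np.array([1.0, 2.0, 1.5]) + rng.normal(size=n)
res = sm.OLS(y, X).fit()

# t statistic for H0: b1 - b2 = 0
b, cov = res.params, res.cov_params()
diff = b[1] - b[2]
se = np.sqrt(cov[1, 1] + cov[2, 2] - 2 * cov[1, 2])
t_obs = diff / se
df = res.df_resid
p_two = 2 * stats.t.sf(abs(t_obs), df)  # two-sided p-value, like r(p)

# One-sided p-values straight from the t distribution ...
p_greater = stats.t.sf(t_obs, df)   # Ha: b1 - b2 > 0
p_less = stats.t.cdf(t_obs, df)     # Ha: b1 - b2 < 0

# ... or by arithmetic on the two-sided p-value, as in the Stata code above
p_greater_by_hand = p_two / 2 if t_obs > 0 else 1 - p_two / 2
print(p_greater, p_greater_by_hand, p_less)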

Misinterpretation of Sum Variance Law

I'm trying to understand how to combine the variances of batches of observations. My understanding is that you can simply sum them according to the sum variance law, but my experiments seem to contradict this.
Here is the Python code used:
import numpy as np

x = np.random.rand(100000)
expected = np.var(x)
print("expected:", expected)
for n in [2, 4, 5, 10, 20, 40, 50, 100, 1000]:
    s = np.split(x, n)
    sigma_sq = [np.var(v) for v in s]
    result = np.sum(sigma_sq)
    print("result", n, ":", result, "(", np.abs(result - expected), ")")
The printed result is:
expected: 0.0832224743666757
result 2 : 0.16644455708841321 ( 0.08322208272173752 )
result 4 : 0.3328814911392468 ( 0.24965901677257113 )
result 5 : 0.4161068624507617 ( 0.33288438808408605 )
result 10 : 0.832183555011673 ( 0.7489610806449972 )
result 20 : 1.664227484757454 ( 1.5810050103907785 )
result 40 : 3.3278497945218355 ( 3.2446273201551596 )
result 50 : 4.159353197179163 ( 4.076130722812487 )
result 100 : 8.314084653397305 ( 8.23086217903063 )
result 1000 : 82.397691161862 ( 82.31446868749532 )
As the number of splits grows, the difference between the expected value and the result grows.
However, if I divide the sums by n (i.e. average them), the error is acceptable (on the order of 1e-5).
I must be misinterpreting the sum variance law, but I'm not sure where my misunderstanding is.
There are a couple of reasons for this, I think:
1. If we have a small sample, the computed variance can be off (i.e. not really the true variance of the underlying distribution).
2. There is a chance that the samples are not totally independent.
More fundamentally, the sum variance law concerns the variance of a sum of independent random variables, Var(X + Y) = Var(X) + Var(Y); it does not say that the variances of batches of one sample add up to the variance of the whole sample. Each of your n batches is drawn from the same distribution, so each batch variance estimates the same quantity, and summing n of them gives roughly n times the expected value, which is exactly the pattern in your output (and why dividing by n brings the error down).
The best way to check the law itself is to use two very large samples. You can run the following code and see that the variance of the sum of the two lists is close to the sum of the two variances. The match gets noticeably worse when we replace 10000 with a smaller number, say, 10, 100, or 1000.
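A minimal sketch of such a check (assuming two independent standard-normal samples; the size 10000 is the one mentioned above):

import numpy as np

rng = np.random.default_rng(0)
n = 10000
a = rng.standard_normal(n)
b = rng.standard_normal(n)

# Variance of the sum of two independent samples ...
print(np.var(a + b))
# ... is close to the sum of the two variances (both near 2.0 here)
print(np.var(a) + np.var(b))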

ArangoDB: Find all shortest paths

I want to get all shortest paths between 2 vertices.
Example: "give me all shortest paths between node A and node B" should only return the 2 blue paths.
This is what I have so far:
LET source = (FOR x IN Entity FILTER x.objectID == "organization_1" RETURN x)[0]
LET destination = (FOR x IN Entity FILTER x.objectID == "organization_129" RETURN x)[0]
FOR node, edge, path IN 1..2 ANY source._id GRAPH "m"
    FILTER LAST(path.vertices)._id == destination._id
    LIMIT 100
    RETURN path
Problems:
1. It is very slow (took 18 seconds on a graph with about 70 million nodes).
2. It finds every path, but I want only the shortest paths.
UPDATE
I tried the 2-step query solution from the comments. The problem is that the second query is also very slow.
Query string:
FOR source IN Entity FILTER source.objectID == "organization_1"
LIMIT 1
FOR node, edge, path
IN 1..#depth ANY source._id
GRAPH "m"
OPTIONS {uniqueVertices: "path"}
FILTER node.objectID == "organization_129"
RETURN path
Execution plan:
Id NodeType Est. Comment
1 SingletonNode 1 * ROOT
11 IndexNode 1 - FOR source IN Entity /* hash index scan */
5 LimitNode 1 - LIMIT 0, 1
6 CalculationNode 1 - LET #6 = source.`_id` /* attribute expression */ /* collections used: source : Entity */
7 TraversalNode 346 - FOR node /* vertex */, path /* paths */ IN 1..2 /* min..maxPathDepth */ ANY #6 /* startnode */ GRAPH 'm'
8 CalculationNode 346 - LET #10 = (node.`objectID` == "organization_129") /* simple expression */
9 FilterNode 346 - FILTER #10
10 ReturnNode 346 - RETURN path
Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
11 hash Entity false false 100.00 % [ `objectID` ] (source.`objectID` == "organization_1")
7 edge ACTIVITYPARTY false false 100.00 % [ `_from`, `_to` ] base INBOUND
7 edge ACTIVITYPARTY false false 100.00 % [ `_from`, `_to` ] base OUTBOUND
7 edge ACTIVITY_LINK false false 100.00 % [ `_from`, `_to` ] base INBOUND
7 edge ACTIVITY_LINK false false 100.00 % [ `_from`, `_to` ] base OUTBOUND
7 edge ENTITY_LINK false false 70.38 % [ `_from`, `_to` ] base INBOUND
7 edge ENTITY_LINK false false 70.38 % [ `_from`, `_to` ] base OUTBOUND
7 edge RELATION false false 20.49 % [ `_from`, `_to` ] base INBOUND
7 edge RELATION false false 20.49 % [ `_from`, `_to` ] base OUTBOUND
7 edge SOFT_LINK false false 100.00 % [ `_from`, `_to` ] base INBOUND
7 edge SOFT_LINK false false 100.00 % [ `_from`, `_to` ] base OUTBOUND
Traversals on graphs:
Id Depth Vertex collections Edge collections Options Filter conditions
7 1..2 Activity, Entity, SOFT_LINK, Property ACTIVITYPARTY, ENTITY_LINK, SOFT_LINK, RELATION, ACTIVITY_LINK uniqueVertices: path, uniqueEdges: path
Optimization rules applied:
Id RuleName
1 move-calculations-up
2 move-filters-up
3 move-calculations-up-2
4 move-filters-up-2
5 use-indexes
6 remove-filter-covered-by-index
7 remove-unnecessary-calculations-2
8 optimize-traversals
9 move-calculations-down
First of all, you need a hash index on the field objectID in the collection Entity to avoid full collection scans, which heavily slow down your performance.
To get all shortest paths, I would first search for one shortest path with AQL's SHORTEST_PATH and return the number of visited vertices. There is also no need for subqueries (unlike in your query):
FOR source IN Entity FILTER source.objectID == "organization_1"
    LIMIT 1
    FOR destination IN Entity FILTER destination.objectID == "organization_129"
        LIMIT 1
        RETURN SUM(
            FOR v, e IN ANY SHORTEST_PATH source._id TO destination._id GRAPH "m"
            RETURN 1) - 1
After that, I would execute another query with the result of the first query as bind parameter #depth, which limits the depth of the traversal:
FOR source IN Entity FILTER source.objectID == "organization_1"
LIMIT 1
FOR node, edge, path
IN 1..#depth ANY source._id
GRAPH "m"
OPTIONS {uniqueVertices: "path"}
FILTER node.objectID == "organization_129"
RETURN path
Note: to filter on the last vertex of the path, you don't have to use LAST(path.vertices); you can simply use node, because it is already the last vertex (the same applies to edge).
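For completeness, here is a sketch of how the two steps could be chained from a client, using the python-arango driver. The connection details are placeholders; the collection Entity, the graph "m", and the objectID values come from the question, and the #depth placeholder becomes the bind parameter @depth:

from arango import ArangoClient

client = ArangoClient(hosts="http://localhost:8529")
db = client.db("_system", username="root", password="")

# Step 1: length (in edges) of one shortest path between the two entities.
length_query = """
FOR source IN Entity FILTER source.objectID == @start LIMIT 1
FOR destination IN Entity FILTER destination.objectID == @target LIMIT 1
RETURN SUM(
    FOR v, e IN ANY SHORTEST_PATH source._id TO destination._id GRAPH "m"
    RETURN 1) - 1
"""
bind = {"start": "organization_1", "target": "organization_129"}
depth = next(iter(db.aql.execute(length_query, bind_vars=bind)))

# Step 2: every path of that minimal depth that ends at the destination.
paths_query = """
FOR source IN Entity FILTER source.objectID == @start LIMIT 1
FOR node, edge, path IN 1..@depth ANY source._id GRAPH "m"
    OPTIONS { uniqueVertices: "path" }
    FILTER node.objectID == @target
    RETURN path
"""
paths = list(db.aql.execute(paths_query, bind_vars=dict(bind, depth=depth)))
print(len(paths))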

ArangoDB Not Using Index During Traversal

I have a simple graph traversal query:
FOR e in 0..3 ANY 'Node/5025926' Edge
FILTER
e.ModelType == "A.Model" &&
e.TargetType == "A.Target" &&
e.SourceType == "A.Source"
RETURN e
The 'Edge' edge collection has a hash index defined for attributes ModelType, TargetType, SourceType, in that order.
When checking the execution plan, the results are:
Query string:
FOR e in 0..3 ANY 'Node/5025926' Edge
FILTER
e.ModelType == "A.Model" &&
e.TargetType == "A.Target" &&
e.SourceType == "A.Source"
RETURN e
Execution plan:
Id NodeType Est. Comment
1 SingletonNode 1 * ROOT
2 TraversalNode 7 - FOR e /* vertex */ IN 0..3 /* min..maxPathDepth */ ANY 'Node/5025926' /* startnode */ Edge
3 CalculationNode 7 - LET #1 = (((e.`ModelType` == "A.Model") && (e.`TargetType` == "A.Target")) && (e.`SourceType` == "A.Source")) /* simple expression */
4 FilterNode 7 - FILTER #1
5 ReturnNode 7 - RETURN e
Indexes used:
none
Traversals on graphs:
Id Depth Vertex collections Edge collections Filter conditions
2 0..3 Edge
Optimization rules applied:
none
Notice that the execution plan indicates that no indices will be used to process the query.
Is there anything I need to do to make the engine use the index on the Edge collection to process the results?
Thanks
In ArangoDB 3.0 a traversal will always use the edge index to find connected vertices, regardless of which filter conditions are present in the query and regardless of which indexes exist.
In ArangoDB 3.1 the optimizer will try to find the best possible index for each level of the traversal. It will inspect the traversal's filter condition and for each level pick the index for which it estimates the lowest cost. If there are no user-defined indexes, it will still use the edge index to find connected vertices. Other indexes will be used if there are filter conditions on edge attributes which are also indexed and the index has a better estimated average selectivity than the edge index.
In 3.1.0 the explain output will always show "Indexes used: none" for traversals, even though a traversal will always use an index. The index display is just missing in the explain output. This has been fixed in ArangoDB 3.1.1, which will show the individual indexes selected by the optimizer for each level of the traversal.
For example, the following query shows the following explain output in 3.1.1:
Query string:
FOR v, e, p in 0..3 ANY 'v/test0' e
FILTER p.edges[0].type == 1 && p.edges[2].type == 2
RETURN p.vertices
Execution plan:
Id NodeType Est. Comment
1 SingletonNode 1 * ROOT
2 TraversalNode 8000 - FOR v /* vertex */, p /* paths */ IN 0..3 /* min..maxPathDepth */ ANY 'v/test0' /* startnode */ e
3 CalculationNode 8000 - LET #5 = ((p.`edges`[0].`type` == 1) && (p.`edges`[2].`type` == 2)) /* simple expression */
4 FilterNode 8000 - FILTER #5
5 CalculationNode 8000 - LET #7 = p.`vertices` /* attribute expression */
6 ReturnNode 8000 - RETURN #7
Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
2 edge e false false 10.00 % [ `_from`, `_to` ] base INBOUND
2 edge e false false 10.00 % [ `_from`, `_to` ] base OUTBOUND
2 hash e false false 63.60 % [ `_to`, `type` ] level 0 INBOUND
2 hash e false false 64.40 % [ `_from`, `type` ] level 0 OUTBOUND
2 hash e false false 63.60 % [ `_to`, `type` ] level 2 INBOUND
2 hash e false false 64.40 % [ `_from`, `type` ] level 2 OUTBOUND
Additional indexes are present on [ "_to", "type" ] and [ "_from", "type" ]. Those are used on levels 0 and 2 of the traversal because there are filter conditions for the edges on these levels that can use these indexes. For all other levels, the traversal will use the indexes labeled with "base" in the "Ranges" column.
The explain output fix will become available with 3.1.1, which will be released soon.
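If you want to try this yourself, the additional combined indexes can be created like any other hash index on the edge collection. A sketch with the python-arango driver (the connection details are placeholders; the collection name e comes from the example):

from arango import ArangoClient

client = ArangoClient(hosts="http://localhost:8529")
db = client.db("_system", username="root", password="")

edges = db.collection("e")
# Combined indexes that let the optimizer answer FILTERs on
# p.edges[n].type from an index at the matching traversal level.
edges.add_hash_index(fields=["_from", "type"])  # candidate for OUTBOUND levels
edges.add_hash_index(fields=["_to", "type"])    # candidate for INBOUND levels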

How do I configure a bandpass filter?

I'm trying to use the Web Audio API's bandpass filter functionality, but I believe my question is more general. I don't understand the "Q" value of the bandpass filter. I would like to be able to configure the filter to pass frequencies that are within Y hertz of a middle frequency of X hertz.
I'm very new to audio programming, so are there other variables I need to consider to compute Q?
Let's say you have a filter at 1000Hz, and you want it to start at 500Hz and end at 2000Hz.
First off, you'll notice it doesn't extend the same number of hertz in each direction. That's because filter bandwidth is based on octaves, not frequencies. So in this case, it extends one octave down and one octave up. Put another way, the frequency was divided by 2 on the low end and multiplied by 2 on the high end - which gives it a bandwidth of 2 octaves.
Anyway, here's how you can calculate it, assuming you know the frequencies:
Q = center_frequency / (top_frequency - bottom_frequency)
Which in this case would be 1000 / ( 2000 - 500 ), or 0.667.
You can also calculate it without knowing the top and bottom frequencies as long as you have a target bandwidth (in octaves) in mind:
function getQ( bandwidth ){
    return Math.sqrt( Math.pow(2, bandwidth) ) / ( Math.pow(2, bandwidth) - 1 );
}
Again, if you pass 2 as the bandwidth argument, you'll get the same result: Q = 0.667.
Hope that helps.
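A quick numeric check of the two formulas, sketched in Python (the 1000 Hz center and the 500-2000 Hz band come from the example above):

import math

low, high = 500.0, 2000.0
center = 1000.0  # the geometric mean sqrt(low * high), not the arithmetic mean

# Q from the band edges
q_edges = center / (high - low)

# Q from the bandwidth in octaves: Q = sqrt(2^N) / (2^N - 1)
octaves = math.log2(high / low)  # 2 octaves here
q_octaves = math.sqrt(2 ** octaves) / (2 ** octaves - 1)

print(q_edges, q_octaves)  # both ~0.667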
