How to get all vertices in ArangoDB graph traversal without max? - arangodb

I want to get all vertices from a graph such as the one shown below:
a -> b -> c -> d -> e -> f -> ...
b2 -> c2 -> cd -> ....
I want to use the traversal syntax:
[WITH vertexCollection1[, vertexCollection2[, ...vertexCollectionN]]]
FOR vertex[, edge[, path]]
IN [min[..max]]
OUTBOUND|INBOUND|ANY startVertex
GRAPH graphName
[PRUNE pruneCondition]
[OPTIONS options]
As you can see, I have to define the value of max first (IN [min[..max]]). How can I get all vertices without providing a value for max when the depth is unknown?

This isn't possible: a max value must either be provided, or it defaults to the min value (which in turn defaults to 1).
Nothing stops you from setting it to an arbitrarily large number, though (e.g. FOR v IN 1..99999999). You'll get all vertices as long as max is at least the maximum depth of your graph.
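The effect of an oversized max can be illustrated outside the database. The following is a minimal Python sketch, not ArangoDB's implementation: the `traverse` function, the adjacency dictionary and the visit-each-vertex-once rule are all assumptions made for illustration. It shows that a depth limit larger than the real depth simply never triggers.

```python
from collections import deque

def traverse(graph, start, min_depth=1, max_depth=1):
    """Breadth-first walk returning vertices whose depth lies in [min_depth, max_depth]."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        vertex, depth = queue.popleft()
        if depth >= min_depth:
            found.append(vertex)
        if depth == max_depth:
            continue  # depth limit reached: do not expand this vertex
        for neighbour in graph.get(vertex, []):
            if neighbour not in seen:  # assumption: each vertex is visited once
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return found

# Hypothetical graph shaped like the one in the question.
graph = {"a": ["b", "b2"], "b": ["c"], "c": ["d"], "b2": ["c2"]}

print(traverse(graph, "a", 1, 1))         # only the direct neighbours
print(traverse(graph, "a", 1, 99999999))  # every reachable vertex
```

The walk ends when there are no more vertices to expand, so the size of the limit has no cost of its own in this sketch.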

Related

is there a way to LIMIT per sub-iteration (not total)?

I have a graph where type "petal" vertices "connect" to type "flower" vertices with edges.
Now, for every "flower" I only want to pull one "petal". They are all in one collection.
How exactly can I do that? It seems that LIMIT statement works per transaction, not per iteration.
What I am trying is
FOR f in Botany
FILTER type=="flower"
FOR p in 1 INBOUND f GRAPH "BotanyGraph"
LIMIT 1
RETURN p
But all I am getting is 1 petal, total.
How can I achieve one petal off every flower?
Do you mean something like this? Putting the traversal in a subquery makes LIMIT apply to each flower separately:
FOR f IN Botany
    FILTER f.type == "flower"
    LET pp = (
        FOR p IN 1 INBOUND f GRAPH "BotanyGraph"
            LIMIT 1
            RETURN p
    )
    FOR p IN pp
        RETURN p
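The scoping idea behind the subquery can be sketched in Python. This is only an analogy under assumed data: the edge list, `petals_of` and `one_petal_per_flower` are hypothetical names, not part of ArangoDB. The limit is applied inside the per-flower loop, so each flower contributes at most one petal.

```python
from itertools import islice

def petals_of(flower, edges):
    """Lazily yield every petal that has an edge pointing at the given flower."""
    return (src for src, dst in edges if dst == flower)

def one_petal_per_flower(flowers, edges):
    result = []
    for flower in flowers:
        # islice(..., 1) plays the role of LIMIT 1 inside the subquery:
        # it restricts the inner iteration only, not the overall result.
        result.extend(islice(petals_of(flower, edges), 1))
    return result

# Hypothetical (petal, flower) edges.
edges = [("p1", "rose"), ("p2", "rose"), ("p3", "tulip")]
print(one_petal_per_flower(["rose", "tulip", "daisy"], edges))
```

A flower without petals (here "daisy") contributes nothing, which matches what the flattened subquery result would contain.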

Prolog ways to compare variables

I was trying to implement some graph algorithms in Prolog. I came up with an idea to use unification to build a tree from the graph structure:
The graph would be defined as follows:
A list of Vertex-Variable pairs where Vertex is a constant representing the vertex and Variable is a corresponding variable, that would be used as a "reference" to the vertex. e.g.:
[a-A, b-B, c-C, d-D]
A list of VertexVar-NeighboursList pairs, where the VertexVar and the individual neighbours in the NeighboursList are the "reference variables". e.g.:
[A-[B, C, D], B-[A, C], C-[A, B], D-[A]] meaning b, c, d are neighbours of a etc.
Then before some graph algorithm (like searching for components, or simple DFS/BFS etc.) that could use some kind of tree built from the original graph, one could use some predicate like unify_neighbours that unifies the VertexVar-NeighbourList pairs as VertexVar = NeighboursList. After that, the vertex variables may be interpreted as lists of its neighbours, where each neighbour is again a list of its neighbours.
So this would result in a good performance when traversing the graph, as there is no need in linear search for some vertex and its neighbours for every vertex in the graph.
But my problem is: How to compare those vertex variables? (To check if they're the same.) I tried to use A == B, but there are some conflicts. For the example above, (with the unify_neighbours predicate) Prolog interprets the graph internally as:
[a-[S_1, S_2, S_3], b-S_1, c-S_2, d-S_3]
where:
S_1 = [[S_1, S_2, S_3], S_2]
S_2 = [[S_1, S_2, S_3], S_1]
S_3 = [[S_1, S_2, S_3]]
The problem is with S_1 and S_2 (aka b and c) as X = [something, Y], Y = [something, X], X == Y is true. The same problem would be with vertices, that share the same neighbours. e.g. U-[A, B] and V-[A, B].
So my question is: Is there any other way to compare variables, that could help me with this? Something that compares "the variables themselves", not the content, like comparing addresses in procedural programming languages? Or would that be too procedural and break the declarative idea of Prolog?
Example
graph_component(Vertices, Neighbours, C) :-
% Vertices and Neighbours as explained above.
% C is some component found in the graph.
vertices_refs(Vertices, Refs),
% Refs are only the variables from the pairs.
unify_neighbours(Neighbours), % As explained above.
rec_(Vertices, Refs, [], C).
rec_(Vertices, Refs, Found, RFound) :-
% Vertices as before.
% Refs is a stack of the vertex variables to search.
% Found are the vertices found so far.
% RFound is the resulting component found.
[Ref|RRest] = Refs,
vertices_pair(Vertices, Vertex-Ref),
% Vertex is the corresponding Vertex for the Ref variable
not(member(Vertex, Found)),
% Go deep:
rec_(Vertices, Ref, [Vertex|Found], DFound),
list_revpush_result([Vertex|Found], DFound, Found1),
% Go wide:
rec_(Vertices, RRest, Found1, RFound).
rec_(Vertices, Refs, Found, []) :-
% End of recursion.
[Ref|_] = Refs,
vertices_pair(Vertices, Vertex-Ref),
member(Vertex, Found).
This example doesn't really work, but it's the idea. (Also, checking whether the vertices were found is done linearly, so the performance is still not good, but it's just for demonstration.) Now the predicate, that finds the corresponding vertex for the variable is implemented as:
vertices_pair([Vertex-Ref|_], Vertex-Ref).
vertices_pair([_-OtherRef|Rest], Vertex-Ref) :-
Ref \== OtherRef,
vertices_pair(Rest, Vertex-Ref).
where the \== operator is not really what I want and it creates those conflicts.
It is an intrinsic feature of Prolog that, once you have bound a variable to a term, it becomes indistinguishable from the term itself. In other words, if you bind two variables to the same term, you have two identical things, and there is no way to tell them apart.
Applied to your example: once you have unified every vertex-variable with the corresponding neighbours-list, all the variables are gone: you are left simply with a nested (and most likely circular) data structure, consisting of a list of lists of lists...
But as you suggest, the nested structure is an attractive idea because it gives you direct access to adjacent nodes. And although Prolog systems vary somewhat in how well they support circular data structures, this need not stop you from exploiting this idea.
The only problem with your design is that a node is identified purely by the (potentially deeply nested and circular) data structure that describes the sub-graph that is reachable from it. This has the consequence that
two nodes that have the same descendants are indistinguishable
it can be very expensive to check whether two "similar looking" sub-graphs are identical or not
A simple way around that is to include a unique node identifier (such as a name or number) in your data structure. To use your example (slightly modified to make it more interesting):
make_graph(Graph) :-
Graph = [A,B,C,D],
A = node(a, [C,D]),
B = node(b, [A,C]),
C = node(c, [A,B]),
D = node(d, [A]).
You can then use that identifier to check for matching nodes, e.g. in a depth-first traversal:
dfs_visit_nodes([], Seen, Seen).
dfs_visit_nodes([node(Id,Children)|Nodes], Seen1, Seen) :-
( member(Id, Seen1) ->
Seen2 = Seen1
;
writeln(visiting(Id)),
dfs_visit_nodes(Children, [Id|Seen1], Seen2)
),
dfs_visit_nodes(Nodes, Seen2, Seen).
Sample run:
?- make_graph(G), dfs_visit_nodes(G, [], Seen).
visiting(a)
visiting(c)
visiting(b)
visiting(d)
G = [...]
Seen = [d, b, c, a]
Yes (0.00s cpu)
Thanks, @jschimpf, for the answer. It clarified a lot of things for me. I just got back to some graph problems with Prolog, gave this recursive data structure another try, and came up with the following predicates to construct it from a list of edges.
The "manual" creation of the data structure, as proposed by @jschimpf:
my_graph(Nodes) :-
    Vars = [A, B, C, D, E, F],
    Nodes = [
        node(a, [edgeTo(1, B), edgeTo(5, D)]),
        node(b, [edgeTo(1, A), edgeTo(4, E), edgeTo(2, C)]),
        node(c, [edgeTo(2, B), edgeTo(6, F)]),
        node(d, [edgeTo(5, A), edgeTo(3, E)]),
        node(e, [edgeTo(3, D), edgeTo(4, B), edgeTo(1, F)]),
        node(f, [edgeTo(1, E), edgeTo(6, C)])
    ],
    Vars = Nodes.
Where edgeTo(Weight, VertexVar) represents an edge to some vertex with a weight associated with it. The weight is just to show that this can be customized for any additional information. node(Vertex, [edgeTo(Weight, VertexVar), ...]) represents a vertex with its neighbours.
A more "user-friendly" input format:
[edge(Weight, FromVertex, ToVertex), ...]
With optional list of vertices:
[Vertex, ...]
For the example above:
[edge(1, a, b), edge(5, a, d), edge(2, b, c), edge(4, b, e), edge(6, c, f), edge(3, d, e), edge(1, e, f)]
This list can be converted to the recursive data structure with the following predicates:
% make_directed_graph(+Edges, -Nodes)
make_directed_graph(Edges, Nodes) :-
vertices(Edges, Vertices),
vars(Vertices, Vars),
pairs(Vertices, Vars, Pairs),
nodes(Pairs, Edges, Nodes),
Vars = Nodes.
% make_graph(+Edges, -Nodes)
make_graph(Edges, Nodes) :-
vertices(Edges, Vertices),
vars(Vertices, Vars),
pairs(Vertices, Vars, Pairs),
directed(Edges, DiretedEdges),
nodes(Pairs, DiretedEdges, Nodes),
Vars = Nodes.
% make_graph(+Vertices, +Edges, -Nodes)
make_graph(Vertices, Edges, Nodes) :-
    vars(Vertices, Vars),
    pairs(Vertices, Vars, Pairs),
    directed(Edges, DirectedEdges),
    nodes(Pairs, DirectedEdges, Nodes),
    Vars = Nodes.
% make_directed_graph(+Vertices, +Edges, -Nodes)
make_directed_graph(Vertices, Edges, Nodes) :-
vars(Vertices, Vars),
pairs(Vertices, Vars, Pairs),
nodes(Pairs, Edges, Nodes),
Vars = Nodes.
The binary versions of these predicates assume that every vertex can be obtained from the list of edges alone, i.e. that there are no "edge-less" vertices in the graph. The ternary versions take an additional list of vertices for exactly these cases.
make_directed_graph assumes the input edges to be directed, make_graph assumes them to be undirected, so it creates additional directed edges in the opposite direction:
% directed(+UndirectedEdges, -DirectedEdges)
directed([], []).
directed([edge(W, A, B)|UndirectedRest], [edge(W, A, B), edge(W, B, A)|DirectedRest]) :-
directed(UndirectedRest, DirectedRest).
To get all the vertices from the list of edges:
% vertices(+Edges, -Vertices)
vertices([], []).
vertices([edge(_, A, B)|EdgesRest], [A, B|VerticesRest]) :-
vertices(EdgesRest, VerticesRest),
\+ member(A, VerticesRest),
\+ member(B, VerticesRest).
vertices([edge(_, A, B)|EdgesRest], [A|VerticesRest]) :-
vertices(EdgesRest, VerticesRest),
\+ member(A, VerticesRest),
member(B, VerticesRest).
vertices([edge(_, A, B)|EdgesRest], [B|VerticesRest]) :-
vertices(EdgesRest, VerticesRest),
member(A, VerticesRest),
\+ member(B, VerticesRest).
vertices([edge(_, A, B)|EdgesRest], VerticesRest) :-
vertices(EdgesRest, VerticesRest),
member(A, VerticesRest),
member(B, VerticesRest).
To construct uninitialized variables for every vertex:
% vars(+List, -Vars)
vars([], []).
vars([_|ListRest], [_|VarsRest]) :-
vars(ListRest, VarsRest).
To pair up vertices and vertex variables:
% pairs(+ListA, +ListB, -Pairs)
pairs([], [], []).
pairs([AFirst|ARest], [BFirst|BRest], [AFirst-BFirst|PairsRest]) :-
pairs(ARest, BRest, PairsRest).
To construct the recursive nodes:
% nodes(+Pairs, +Edges, -Nodes)
nodes(Pairs, [], Nodes) :-
init_nodes(Pairs, Nodes).
nodes(Pairs, [EdgesFirst|EdgesRest], Nodes) :-
nodes(Pairs, EdgesRest, Nodes0),
insert_edge(Pairs, EdgesFirst, Nodes0, Nodes).
First, a list of empty nodes for every vertex is initialized:
% init_nodes(+Pairs, -EmptyNodes)
init_nodes([], []).
init_nodes([Vertex-_|PairsRest], [node(Vertex, [])|NodesRest]) :-
init_nodes(PairsRest, NodesRest).
Then the edges are inserted one by one:
% insert_edge(+Pairs, +Edge, +Nodes, -ResultingNodes)
insert_edge(Pairs, edge(W, A, B), [], [node(A, [edgeTo(W, BVar)])]) :-
vertex_var(Pairs, B, BVar).
insert_edge(Pairs, edge(W, A, B), [node(A, EdgesTo)|NodesRest], [node(A, [edgeTo(W, BVar)|EdgesTo])|NodesRest]) :-
vertex_var(Pairs, B, BVar).
insert_edge(Pairs, edge(W, A, B), [node(X, EdgesTo)|NodesRest], [node(X, EdgesTo)|ResultingNodes]) :-
A \= X,
insert_edge(Pairs, edge(W, A, B), NodesRest, ResultingNodes).
To get a vertex variable for a given vertex: (This actually works in both directions.)
% vertex_var(+Pairs, +Vertex, -Var)
vertex_var(Pairs, Vertex, Var) :-
member(Vertex-Var, Pairs).
This, of course, brings additional time overhead, but you can do this once and then just copy this data structure every time you need to perform some graph algorithm on it and access neighbours in constant time.
You can also add additional information to the node predicate. For example:
node(Vertex, Neighbours, OrderingVar)
Where the uninitialized variable OrderingVar can be "assigned" (initialized) in constant time with information about the vertex' position in a partial ordering of the graph, for example. So this may be used as output. (As sometimes denoted by +- in Prolog comments - an uninitialized variable as a part of an input term, that is yet to be initialized by the used predicate and provides output.)
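For comparison, the edge-list conversion described above can be sketched in Python, where object references take the place of the shared Prolog variables. This is a rough analogue under stated assumptions, not a translation of the predicates: the `Node` class, `make_graph` and `reachable` are hypothetical names, and nodes are told apart by an explicit name, as the first answer suggests.

```python
class Node:
    def __init__(self, name):
        self.name = name   # unique identifier used to tell nodes apart
        self.edges = []    # list of (weight, Node) pairs: direct references

def make_graph(edges, directed=False):
    """Build a dict name -> Node from (weight, from, to) triples."""
    nodes = {}
    for weight, src, dst in edges:
        a = nodes.setdefault(src, Node(src))
        b = nodes.setdefault(dst, Node(dst))
        a.edges.append((weight, b))
        if not directed:
            b.edges.append((weight, a))  # add the opposite direction
    return nodes

def reachable(start):
    """Depth-first walk that compares node names, not structures."""
    seen, stack = [], [start]
    while stack:
        node = stack.pop()
        if node.name in seen:
            continue
        seen.append(node.name)
        stack.extend(neighbour for _, neighbour in node.edges)
    return seen

# The example edge list from the text.
edges = [(1, "a", "b"), (5, "a", "d"), (2, "b", "c"), (4, "b", "e"),
         (6, "c", "f"), (3, "d", "e"), (1, "e", "f")]
nodes = make_graph(edges)
print(sorted(reachable(nodes["a"])))
```

Once the structure is built, following an edge is a single reference lookup, which is the constant-time neighbour access the recursive Prolog structure aims for.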

Taxicab Numbers in Haskell

Taxicab number is defined as a positive integer that can be expressed as a sum of two cubes in at least two different ways.
1729=1^3+12^3=9^3+10^3
I wrote this code to produce a taxicab number which on running would give the nth smallest taxicab number:
taxicab :: Int -> Int
taxicab n = [ cube a + cube b
            | a <- [1..100],
              b <- [(a+1)..100],
              c <- [(a+1)..100],
              d <- [(c+1)..100],
              cube a + cube b == cube c + cube d ] !! (n-1)
cube x = x * x * x
But the output I get is not what I expected. For the numbers one to three the code produces correct output, but taxicab 4 produces 39312 instead of 20683. Another strange thing is that 39312 is actually the 6th smallest taxicab number, not the fourth!
So why is this happening? Where is the flaw in my code?
I think you mistakenly believe that your list contains the taxicab numbers in an increasing order. This is the actual content of your list:
[1729,4104,13832,39312,704977,46683,216027,32832,110656,314496,
216125,439101,110808,373464,593047,149389,262656,885248,40033,
195841,20683,513000,805688,65728,134379,886464,515375,64232,171288,
443889,320264,165464,920673,842751,525824,955016,994688,327763,
558441,513856,984067,402597,1016496,1009736,684019]
Recall that a list comprehension such as [(a,b) | a<-[1..100],b<-[1..100]] will generate its pairs as follows:
[(1,1),...,(1,100),(2,1),...,(2,100),...,...,(100,100)]
Note that when a gets to its next value, b is restarted from 1. In your code, suppose you just found a taxicab number of the form a^3+b^3, and then no larger b gives you a taxicab. In such case the next value of a is tried. We might find a taxicab of the form (a+1)^3+b'^3 but there is no guarantee that this number will be larger, since b' is any number in [a+2..100], and can be smaller than b. This can also happen with larger values of a: when a increases, there's no guarantee its related taxicabs are larger than what we found before.
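The ordering effect described here is easy to reproduce on a tiny range. The snippet below uses Python only for illustration (its nested comprehension clauses iterate in the same left-to-right order as the Haskell generators); the variable name `sums` is made up for this example.

```python
# Sums of two cubes a^3 + b^3 with a < b, generated in comprehension order.
sums = [a**3 + b**3 for a in range(1, 4) for b in range(a + 1, 5)]

print(sums)          # [9, 28, 65, 35, 72, 91]
print(sorted(sums))  # [9, 28, 35, 65, 72, 91]
```

The value 65 (from a = 1, b = 4) appears before 35 (from a = 2, b = 3), so indexing into the unsorted list does not give the n-th smallest value.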
Also note that, for the same reason, a hypothetical taxicab number of the form 101^3+b^3 could be smaller than some of the taxicab numbers in your list, yet it does not occur there.
Finally, note that your function is quite inefficient, since every call to taxicab n recomputes the first n taxicab values.
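One way to avoid both problems is to bound the sums rather than the bases, count how often each sum occurs, and sort at the end. The following Python sketch is one possible approach, not the only fix; `taxicab_numbers` and its `limit` parameter are names assumed for this example.

```python
from collections import defaultdict

def taxicab_numbers(limit):
    """Return all numbers <= limit that are a sum of two distinct
    positive cubes in at least two ways, in increasing order."""
    ways = defaultdict(int)
    a = 1
    while a**3 + (a + 1)**3 <= limit:
        b = a + 1
        while a**3 + b**3 <= limit:
            ways[a**3 + b**3] += 1  # one more representation of this sum
            b += 1
        a += 1
    return sorted(n for n, count in ways.items() if count >= 2)

numbers = taxicab_numbers(50000)
print(numbers[3])  # 20683, the 4th smallest
print(numbers[5])  # 39312, the 6th smallest
```

Because every pair whose sum stays below the limit is examined, no smaller taxicab number can be missed, and the list is computed once instead of once per call.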

Find the minimum gap between two numbers in an AVL tree

I have a data structures homework in which, in addition to the regular AVL tree functions, I have to add a function that returns the minimum gap between any two numbers in the AVL tree (the nodes in the AVL tree represent numbers).
Let's say we have the numbers (as nodes) 1 5 12 20 23 21 in the AVL tree; the function should return the minimum gap between any two numbers. In this situation it should return "1", which is |20-21| or |21-20|.
It should be done in O(1).
I have thought a lot about it and I know there is a trick, but I just couldn't find it; I have spent hours on this.
There was another task, which is to find the maximum gap; that one is easy, as it is the difference between the minimal and maximal number.
You need to extend the data structure; otherwise you cannot obtain the minimum gap between the numbers in the tree in O(1).
You have the additional constraint of not increasing the time complexity of the insert/delete/search functions, and I assume that you don't want to increase the space complexity either.
Consider a generic node r with a left subtree r.L and a right subtree r.R. We extend the information in node r with an additional number r.x, defined as the minimum of:
(only if r.L is not empty) the difference between r's value and the value of the rightmost node of r.L (its maximum)
(only if r.L has more than one node) the x value of the r.L root node
(only if r.R is not empty) the difference between the value of the leftmost node of r.R (its minimum) and r's value
(only if r.R has more than one node) the x value of the r.R root node
(or undefined if none of the previous conditions holds, as in the case of a leaf node)
Additionally, to keep insert/delete fast, each internal node needs references to the leftmost and rightmost nodes of its subtree. Note that these extreme nodes are not necessarily leaves: the rightmost node may still have a left child.
You can see that with these additions:
the space complexity increase by a constant factor only
the insert/delete functions need to update the x values and the leftmost and rightmost references at the root of every altered subtree, but this is straightforward to implement in no more than O(log(n))
the x value of the tree root is the value that the function needs to return, therefore you can implement it in O(1)
The minimum gap in the tree is the x value of the root node; more specifically, for each subtree, the minimum gap among the subtree's elements is the x value of the subtree's root.
This statement can be proved by induction:
Consider a tree rooted at node r, with a left subtree r.L and a right subtree r.R.
The inductive hypothesis is that the x values of the roots of r.L and r.R are the minimum gaps between the node values of the respective subtrees.
The minimum gap can be found by considering only pairs of nodes whose values are adjacent in the sorted list of values. The pairs formed by values stored in r.L have their minimum gap in the r.L root x value, and the same is true for the right subtree. Given that (any value in r.L) < value of r < (any value in r.R), the only pairs of adjacent values not yet considered are two:
the pair composed of the root node value and the highest r.L node value
the pair composed of the root node value and the lowest r.R node value
By the search tree ordering, the r.L node with the highest value is its rightmost node, and the r.R node with the lowest value is its leftmost node.
Assigning to r.x the minimum of the four values (r.L root x value, r.R root x value, the gap between r and the maximum of r.L, the gap between the minimum of r.R and r) therefore yields the smallest gap between consecutive node values in the whole tree, which equals the smallest gap between any pair of node values.
The cases where one or both subtrees are empty are trivial.
For the base cases of a tree with only one to three nodes, it is easy to see that the x value of the tree root is the minimum gap.
This function might be helpful for you:
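int getMinGap(Node N)
{
    // Integer.MAX_VALUE stands for "no gap available" (missing child).
    int a = Integer.MAX_VALUE, b = Integer.MAX_VALUE, c = Integer.MAX_VALUE, d = Integer.MAX_VALUE;
    if (N.left != null) {
        a = N.left.minGap;       // smallest gap inside the left subtree
        c = N.key - N.left.max;  // gap between N and its in-order predecessor
    }
    if (N.right != null) {
        b = N.right.minGap;      // smallest gap inside the right subtree
        d = N.right.min - N.key; // gap between N and its in-order successor
    }
    return Math.min(Math.min(a, b), Math.min(c, d));
}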
Here is the Node data structure:
class Node
{
    int key, height, num, sum, min, max, minGap;
    Node left, right;
    Node(int d)
    {
        key = d;
        height = 1;
        num = 1;
        sum = d;
        min = d;
        max = d;
        minGap = Integer.MAX_VALUE;
    }
}
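To see how the stored fields stay correct during insertion, here is a minimal Python sketch of the same augmentation. It is an illustration under stated assumptions, not a full AVL tree: rebalancing is left out for brevity (in a real AVL tree `update` would also be called on the nodes touched by each rotation), and the names `update` and `insert` are made up for this example.

```python
import math

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.min = key           # smallest key in this subtree
        self.max = key           # largest key in this subtree
        self.min_gap = math.inf  # smallest gap in this subtree

def update(node):
    """Recompute min, max and min_gap of node from its children in O(1)."""
    node.min = node.max = node.key
    node.min_gap = math.inf
    if node.left is not None:
        node.min = node.left.min
        node.min_gap = min(node.min_gap, node.left.min_gap,
                           node.key - node.left.max)
    if node.right is not None:
        node.max = node.right.max
        node.min_gap = min(node.min_gap, node.right.min_gap,
                           node.right.min - node.key)

def insert(node, key):
    """Plain BST insert (no rotations); refreshes the fields on the way up."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    update(node)
    return node

root = None
for key in [1, 5, 12, 20, 23, 21]:
    root = insert(root, key)
print(root.min_gap)  # 1, read directly from the root
```

Only the nodes on the insertion path are updated, so the extra work per insert is proportional to the tree height, which is O(log(n)) once the tree is kept balanced.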

Calculate straight line from scattered points

I have some scattered 3D points (a 2D solution is sufficient). I want to find the distinct straight lines that pass through at least three points lying near each other (say, within 10 units). A single point may belong to several lines.
To determine whether 3 points (a,b,c) are in a line, use cross-products (2D or 3D):
V = (Vx, Vy, Vz)
Vab = b - a
Vac = c - a
CrossProd(V, W) = (VyWz - VzWy, VzWx - VxWz, VxWy - VyWx)
If CrossProd(Vab, Vac) is zero, then the points (a, b, c) are collinear. The length of the cross product is twice the area of the triangle (a, b, c), so you can set a small non-zero tolerance if needed.
Re. tolerance.
The distance from b to the line Vac is given by:
d = length(CrossProd(Vab, Vac))/ length(Vac)
You can probably compare this with an absolute tolerance given your problem description. Alternatively you might use:
sin(theta) = length(CrossProd(Vab, Vac))/ length(Vac)/ length(Vab)
Then theta is the angle between the two vectors and can be compared with a fixed tolerance.
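The distance test can be sketched in Python as follows. This is a minimal illustration using plain tuples; the helper names (`sub`, `cross`, `length`, `distance_to_line`, `are_collinear`) and the default tolerance are assumptions chosen for this example. For 2D points, use a z coordinate of 0.

```python
import math

def sub(p, q):
    """Vector from q to p."""
    return tuple(pi - qi for pi, qi in zip(p, q))

def cross(v, w):
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def length(v):
    return math.sqrt(sum(c * c for c in v))

def distance_to_line(a, b, c):
    """Distance from point b to the line through a and c (requires a != c)."""
    return length(cross(sub(b, a), sub(c, a))) / length(sub(c, a))

def are_collinear(a, b, c, tolerance=1e-9):
    """True if b lies within `tolerance` of the line through a and c."""
    return distance_to_line(a, b, c) <= tolerance

print(are_collinear((0, 0, 0), (1, 1, 1), (2, 2, 2)))     # True
print(distance_to_line((0, 0, 0), (0, 3, 0), (5, 0, 0)))  # 3.0
```

With noisy data, the tolerance would be set to the largest deviation from the line that is still acceptable for the application.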
