Filter neighbour's INBOUND vertices with path labels in ArangoDB - arangodb

I have the following graph:
I'd like to write an AQL query that, starting from the vertex colored GREEN, returns all INBOUND vertices of its neighbors (the vertices colored RED).
I tried the following AQL to retrieve the red vertices from the green vertex.
WITH collection_A, collection_W
LET A_Neighbors = (
  FOR t IN collection_edges
    FILTER t._to == 'collection_W/W'
    RETURN t._from
)
LET all_w = []
FOR item IN A_Neighbors
  LET sub_w = (
    FOR v1 IN collection_edges
      FILTER v1._to == item
      RETURN v1
  )
  RETURN APPEND(all_w, sub_w)
Is there a better solution than this? I'm not sure it gives the correct values for the start vertex collection_W/W.
My collection_edges contains the following two kinds of documents.
{
_from: collection_W/w,
_to: collection_A/a,
label: INBOUND
}
and
{
_from: collection_A/a,
_to: collection_W/w,
label: OUTBOUND
}

Given the diagram, I would suggest using a graph traversal with a specific [min[..max]] depth value, like this (using an anonymous graph):
WITH collection_A, collection_W
FOR vertex IN 2 ANY 'collection_W/W' // green start node
collection_edges
RETURN vertex
The [min[..max]] value can be a range (1..3) or a single value (1).
0 will return the start node
1 will return adjacent nodes
2 will skip the adjacent nodes and return only nodes at the next level (if any)
2..999 will skip the adjacent nodes and return all nodes between 2 and 999 hops away from the start node
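To make the depth semantics concrete, here is a small in-memory sketch in Python. It is not ArangoDB code: the function traverse_any and the four-edge sample graph are made up for illustration, and it assumes edges are followed in either direction (like ANY) and that an edge is never reused within one path.

```python
def traverse_any(edges, start, min_depth, max_depth):
    """Return the end vertex of every path with min_depth..max_depth edges.

    Hypothetical helper: edges are (from, to) pairs, followed in either
    direction, and an edge is used at most once per path.
    """
    results = []

    def walk(vertex, depth, used):
        if depth >= min_depth:
            results.append(vertex)
        if depth == max_depth:
            return
        for idx, (src, dst) in enumerate(edges):
            if idx in used:
                continue
            if src == vertex:
                walk(dst, depth + 1, used | {idx})
            elif dst == vertex:
                walk(src, depth + 1, used | {idx})

    walk(start, 0, frozenset())
    return results


# Made-up sample data, not the graph from the question.
edges = [
    ("W/start", "A/a1"),
    ("W/start", "A/a2"),
    ("W/w2", "A/a1"),
    ("W/w3", "A/a2"),
]

print(traverse_any(edges, "W/start", 0, 0))  # start node only
print(traverse_any(edges, "W/start", 1, 1))  # adjacent nodes
print(traverse_any(edges, "W/start", 2, 2))  # nodes two hops away
print(traverse_any(edges, "W/start", 1, 2))  # both levels
```

With a depth of 2 the adjacent A nodes are skipped and only the W nodes behind them come back, which is the behaviour the traversal above relies on.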
Further, if you want to make sure that you're only returning nodes from a specific collection, add a filter for that:
WITH collection_A, collection_W
FOR vertex IN 2 ANY 'collection_W/W' // green start node
collection_edges
FILTER IS_SAME_COLLECTION('collection_W', vertex)
RETURN vertex
You can also filter on edges (if you've added a specific attribute/value to your edges):
WITH collection_A, collection_W
FOR vertex, edge IN 2 ANY 'collection_W/W' // green start node
collection_edges
FILTER edge.someProperty == 'someValue' // only return vertices reached via a matching edge
RETURN vertex
Or limit the traversal with PRUNE:
WITH collection_A, collection_W
FOR vertex, edge IN 2 ANY 'collection_W/W' // green start node
collection_edges
PRUNE edge.someProperty == 'someValue' // stop traversal when this is matched
RETURN vertex

Related

AQL's PRUNE: How to combine conditions?

I am running ArangoDB 3.4.5 and I've been playing around with PRUNE statements. I am having some difficulties combining conditions.
Assume some vertices v on my path p have an integer attribute ia and some have a boolean attribute ba. The vertices at even indexes along p, such as p.vertices[2], all have ba.
PRUNE HAS(v, "ia") AND v.ia != 5
works by itself.
PRUNE p.vertices[2].ba == false OR p.vertices[4].ba == false
also works by itself.
I observe that I cannot combine them in one query, neither with multiple PRUNE statements nor by putting them in one
PRUNE (condition_1) OR (condition_2). I also cannot put one in a PRUNE and the other in a FILTER statement.
Is anyone else experiencing this or is it just me?
UPDATE:
The FILTER and PRUNE statements did not return the desired results; the reason, however, was the missing OPTIONS {uniqueEdges: "none"}. Unlike for uniqueVertices, "none" is not the default for uniqueEdges.
I can't reproduce your issue in ArangoDB 3.4.5.
If you create collections edge and vertex and populate these with an example tree:
FOR n in 0..100000
INSERT {_key: TO_STRING(n), val: n, modulo: n%2} INTO vertex
FILTER n > 0
INSERT {_from: CONCAT("vertex/", FLOOR((n-1)/2)), _to: NEW._id} INTO edge
Now I run a traversal:
WITH vertex
FOR v,e,p IN 0..5 OUTBOUND "vertex/0" edge
RETURN TO_STRING(p.vertices[*].val)
Result:
[
"[0]",
"[0,1]",
"[0,1,3]",
"[0,1,3,7]",
"[0,1,3,7,15]",
"[0,1,3,7,15,31]",
"[0,1,3,7,15,32]",
"[0,1,3,7,16]",
"[0,1,3,7,16,33]",
"[0,1,3,7,16,34]",
"[0,1,3,8]",
"[0,1,3,8,17]",
"[0,1,3,8,17,35]",
"[0,1,3,8,17,36]",
"[0,1,3,8,18]",
"[0,1,3,8,18,37]",
"[0,1,3,8,18,38]",
"[0,1,4]",
...
Next, I add "stop": true and "hide": 1 to the vertex with _key: 7 and some other combinations to vertices 17 and 18. Now a PRUNE should stop traversing if the condition is met. Be careful: the vertex at which the condition matches is itself still included in the results.
WITH vertex
FOR v,e,p IN 0..5 OUTBOUND "vertex/0" edge
PRUNE v.hide == 1 AND v.stop == true
RETURN TO_STRING(p.vertices[*].val)
Result:
[
"[0]",
"[0,1]",
"[0,1,3]",
"[0,1,3,7]", <-- stop: true, hide: 1
"[0,1,3,8]",
"[0,1,3,8,17]", <-- stop: true, hide: 1
"[0,1,3,8,18]",
"[0,1,3,8,18,37]",
"[0,1,3,8,18,38]",
...
The PRUNE condition can use AND / OR, but only one PRUNE statement per traversal is supported (in contrast to FILTER, which can be used multiple times).
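The pruning behaviour above can be mimicked outside the database. The following Python sketch is only an illustration, not ArangoDB code: traverse_with_prune is a made-up helper, the children are derived from the same FLOOR((n-1)/2) parent rule used to build the tree, and the set of flagged vertices (7 and 17) is an assumption standing in for "stop == true AND hide == 1".

```python
def traverse_with_prune(start, max_depth, prune):
    """Depth-first path enumeration over the example tree with a PRUNE-like cut-off."""
    paths = []

    def children(n):
        # Inverse of the parent rule FLOOR((n-1)/2) used when building the tree.
        return [2 * n + 1, 2 * n + 2]

    def walk(path):
        paths.append(list(path))  # the current vertex is always emitted
        if len(path) - 1 == max_depth or prune(path[-1]):
            return  # nothing below a pruned vertex is visited
        for child in children(path[-1]):
            walk(path + [child])

    walk([start])
    return paths


# Assumption: these are the vertices for which the combined condition holds.
flagged = {7, 17}
paths = traverse_with_prune(0, 5, lambda v: v in flagged)

for p in paths[:9]:
    print(p)
```

The first paths printed match the result list above: vertices 7 and 17 appear as path ends, but no path continues below them.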

how to check list of vertex is connected each other in gremlin

I have a list of vertices and I want to check whether these vertices are connected to each other. If they are connected, return the edges; otherwise return an empty list.
LIST of Vertex = [v[10], v[11], v[12], v[13], v[14], v[15], v[16], v[17], v[18], v[19], v[20]]
if Yes, then return v[10]-created-v[11]
if No, Then return []

Find path following edges with greatest value in ArangoDB

Let's say that in my graph I've got edges that have a field called value. After selecting a start vertex, I would like to find a path by always selecting the edge that has the highest value. Unfortunately I can't figure out how to write the proper query. Is it possible in ArangoDB?
Hi, I am unsure what you would like to achieve; there are two possible scenarios that I can imagine from your description:
First: Shortest Path
The use-case here is you know the starting vertex and the target vertex, and you want to find the shortest (or cheapest) path between those two.
The built-in SHORTEST_PATH (https://docs.arangodb.com/3.1/AQL/Graphs/ShortestPath.html#shortest-path-in-aql) feature can serve it by defining the distance attribute in the options like this:
FOR v IN OUTBOUND SHORTEST_PATH @start TO @end @@edgeCollection OPTIONS {weightAttribute: "value", defaultWeight: 1}
  RETURN v
This will give you all vertices on the path from start to end that has the lowest sum of value attributes. If you need the "highest value" instead, you could store 1/value in a separate attribute and use that as the weightAttribute, so that the search favours paths with few edges that carry high values.
Second: Sorting of edges
The use case is that you only have the starting vertex and want to get the connected vertices, ordered by the value on the edges. There you can simply combine the traversal statement with a simple sort (https://docs.arangodb.com/3.1/AQL/Graphs/Traversals.html#graph-traversals-in-aql):
FOR v, e IN OUTBOUND @start @@edgeCollection
  SORT e.value DESC
  LIMIT 1 /* Only pick the highest one */
  RETURN {v: v, e: e}
Third use-case: Iterating over several depths using only the highest value
The AQL in use-case 2 can be chained up to an arbitrary depth, which has to be known a priori. So say you would like to iterate 3 steps, only using the edge with the highest value:
FOR v1, e1 IN OUTBOUND @start @@edgeCollection
  SORT e1.value DESC
  LIMIT 1 /* Only pick the highest one */
  /* Depth 1 done, now depth 2 */
  FOR v2, e2 IN OUTBOUND v1 @@edgeCollection
    SORT e2.value DESC
    LIMIT 1 /* Only pick the highest one */
    FOR v3, e3 IN OUTBOUND v2 @@edgeCollection
      SORT e3.value DESC
      LIMIT 1 /* Only pick the highest one */
      RETURN [v1, v2, v3]
Fourth use-case:
The depth is not known a priori. In this case pure AQL in the currently released version (3.1) cannot formulate this. It will be easier to use a Foxx service (https://docs.arangodb.com/3.1/Manual/Foxx/#foxx) with the traversal module (https://docs.arangodb.com/3.1/Manual/Graphs/Traversals/UsingTraversalObjects.html#getting-started), which is a bit more flexible but has to be implemented in JavaScript.
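To show what such an open-ended greedy walk amounts to, here is a plain in-memory sketch in Python. It does not use any ArangoDB or Foxx API; greedy_path and the sample out_edges mapping are made up for illustration. It also assumes that already visited vertices are skipped, so that the walk terminates on cyclic graphs, which the question does not specify.

```python
def greedy_path(out_edges, start):
    """Follow the highest-valued outgoing edge from each vertex until stuck.

    out_edges maps a vertex to a list of (target, value) pairs.
    Assumption: targets already on the path are skipped to avoid cycles.
    """
    path = [start]
    visited = {start}
    current = start
    while True:
        candidates = [
            (value, target)
            for target, value in out_edges.get(current, [])
            if target not in visited
        ]
        if not candidates:
            return path
        _, current = max(candidates)  # highest value wins
        visited.add(current)
        path.append(current)


# Made-up sample data.
out_edges = {
    "a": [("b", 3), ("c", 7)],
    "c": [("d", 1), ("a", 9)],
    "d": [],
}

print(greedy_path(out_edges, "a"))  # ['a', 'c', 'd']
```

The loop is the unbounded version of the chained FOR statements from the third use-case: each iteration does the SORT ... DESC LIMIT 1 step for one more level.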

Check if Graph is Connected Upon Removal of Vertices

I would appreciate advice/algorithms for the following problem:
Consider a graph with V vertices connected by E edges (V, E <= 10^5). When a vertex is removed, all the edges connected to that vertex are removed. The vertices are labeled 1, 2, ..., V.
Input is given on E lines, and on each line there are two space-separated vertex numbers representing an edge between those two vertices. The next V lines are a permutation of 1, 2, ..., V, representing the order in which the vertices are removed. Output V lines stating if the graph is connected (i.e. there is a sequence of paths between every pair of vertices) at each step. V and E are known and are given as space-separated integers on the first line of input.
For example, consider the following example input (edges are undirected):
5 5
1 2
3 1
2 3
2 4
5 4
3
4
1
2
5
The first line indicates that there are 5 vertices and 5 edges. The next 5 lines describe the edges (which are undirected, i.e. an edge from 1 to 2 can also be taken from 2 to 1). The 5 lines after that give the order in which the vertices are removed.
For this example, we would get the output as follows:
When vertex 3 is removed, the graph is connected, since we can go from any of 1, 2, 4, 5 to any other of 1, 2, 4, 5. When vertex 4 is removed, the graph is disconnected because there are no connections out of vertex 5. When vertex 1 is removed the same problem exists. When vertex 2 is removed only 5 is left, so the graph is connected. When all vertices are removed the graph is connected.
I tried a naive recursive approach as follows to check if it is possible to go from a start vertex to an end vertex:
bool dfs(int start, int curr, int end):
    if (curr == 0): // start condition, i.e. curr not yet initialized
        curr = start
    if (curr == end):
        return true
    for (int v : edges[curr]):
        if (dfs(start, v, end)):
            return true
    return false
Checking at each step if it is possible to travel from all vertices A to all other vertices B using the above algorithm is far too slow (O(V^2 * E^V), algorithm should be ideally O(V log V), or maybe O(V^2) to run in about one second).
For each step, store the number of partitions (connected components) the graph has. For the initial graph this is computed by a full traversal (DFS or BFS) from an initial vertex, then repeating the traversal from any vertex that is not yet covered. If the number of partitions is <= 1, then the graph is connected.
If the removed vertex has degree 0, then the number of partitions decreases by one.
If the removed vertex has degree 1, then the number of partitions stays the same.
If the removed vertex has degree larger than 1, then the number of partitions can increase by at most degree-1. That is checked by a similar traversal from the neighbours of the removed vertex: start from an initial neighbour and find all neighbours that are connected to it, then repeat the traversal from any neighbour not yet visited.
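A minimal Python sketch of this bookkeeping, assuming the edges and the removal order have already been parsed from the input (the function name and its parameters are my own choice for illustration). Each removal still costs one traversal in the worst case, so this is roughly O(V * (V + E)) overall rather than the O(V log V) the question hopes for.

```python
from collections import deque


def connectivity_after_removals(num_vertices, edges, removal_order):
    """Return, for each removal, whether the remaining graph is connected."""
    adj = {v: set() for v in range(1, num_vertices + 1)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    def count_components_touching(starts):
        # Number of distinct components that contain at least one start vertex.
        seen = set()
        count = 0
        for s in starts:
            if s in seen:
                continue
            count += 1
            seen.add(s)
            queue = deque([s])
            while queue:
                u = queue.popleft()
                for w in adj[u]:
                    if w not in seen:
                        seen.add(w)
                        queue.append(w)
        return count

    components = count_components_touching(list(adj))
    results = []
    for v in removal_order:
        neighbours = adj.pop(v)
        for n in neighbours:
            adj[n].discard(v)
        if not neighbours:
            components -= 1  # an isolated vertex disappears
        else:
            # The old component splits into as many parts as the neighbours fall into.
            components += count_components_touching(neighbours) - 1
        results.append(components <= 1)
    return results


edges = [(1, 2), (3, 1), (2, 3), (2, 4), (5, 4)]
print(connectivity_after_removals(5, edges, [3, 4, 1, 2, 5]))
# [True, False, False, True, True]
```

The printed list matches the walk-through in the question: connected after removing 3, disconnected after removing 4 and 1, and connected again once only vertex 5 or nothing is left.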

Find the cross node for number of nodes in ArangoDB?

I have a number of nodes connected through an intermediate node of another type, like in the picture. There can be multiple middle nodes. I need to find all the middle nodes for a given number of nodes and sort them by the number of links to my initial nodes. In my example, given A, B, C, D, it should return node E (4 links) followed by node F (3 links). Is this possible? If not, could it be done using multiple requests? I was thinking about using the SHORTEST_PATH function, but it seems it can only find a path between nodes from the same collection?
Very nice question, it challenged the AQL part of my brain ;)
Good news: it is totally possible with only one query, utilizing GRAPH_COMMON_NEIGHBORS and a bit of math.
Common neighbors will count for how many pairs of your selected vertices a cross is the connecting component (ordering is taken into account, so A-E-B is different from B-E-A). Using combinatorics we end up with a*(a-1)=c combinations, where c is what the query computes. We use the p/q formula to solve for a (the number of vertices from your set that are connected to the cross).
If the type of vertex is encoded in an attribute of the vertex object
the resulting AQL looks like this:
FOR x in (
(
let nodes = ["nodes/A","nodes/B","nodes/C","nodes/D"]
for n in GRAPH_COMMON_NEIGHBORS("myGraph",nodes , nodes)
for f in VALUES(n)
for s in VALUES(f)
for candidate in s
filter candidate.type == "cross"
collect crosses = candidate._key into counter
return {crosses: crosses, connections: 0.5 + SQRT(0.25 + LENGTH(counter))}
)
)
sort x.connections DESC
return x
If you put the crosses in a different collection and filter by collection name, the query becomes even more efficient, because we do not need to open any vertices that are not of type cross at all.
FOR x in (
(
let nodes = ["nodes/A","nodes/B","nodes/C","nodes/D"]
for n in GRAPH_COMMON_NEIGHBORS("myGraph",nodes, nodes,
{"vertexCollectionRestriction": "crosses"}, {"vertexCollectionRestriction": "crosses"})
for f in VALUES(n)
for s in VALUES(f)
for candidate in s
collect crosses = candidate._key into counter
return {crosses: crosses, connections: 0.5 + SQRT(0.25 + LENGTH(counter))}
)
)
sort x.connections DESC
return x
Both queries will yield the result on your dataset:
[
{
"crosses": "E",
"connections": 4
},
{
"crosses": "F",
"connections": 3
}
]
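The expression 0.5 + SQRT(0.25 + LENGTH(counter)) in both queries is just the positive root of a*(a-1) = c, i.e. of a^2 - a - c = 0. A quick Python check of that arithmetic (the function name connected_count is made up for illustration):

```python
import math
from itertools import permutations


def connected_count(ordered_pairs):
    """Solve a * (a - 1) = c for a, taking the positive root."""
    return 0.5 + math.sqrt(0.25 + ordered_pairs)


for a in range(2, 6):
    # Number of ordered pairs (x, y) with x != y among a vertices.
    c = len(list(permutations(range(a), 2)))
    print(a, c, connected_count(c))
```

For E, which connects all four start nodes, there are 4 * 3 = 12 ordered pairs and the formula gives 4; for F there are 3 * 2 = 6 pairs, which gives 3.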
