ArangoDB uniqueVertices global excluding the first and last vertices - arangodb

I'm attempting to find all unique paths from one friend to another.
When I use uniqueVertices: 'global', it returns only one path, because the end vertex is also subject to the global uniqueness check.
FOR v,e,p
IN 1..6
ANY "entities/foo"
GRAPH "friendGraph"
OPTIONS {
bfs: true,
uniqueVertices: 'path'
}
SORT e.weight ASC
FILTER v._id == "entities/bar"
RETURN p
Is there a way to have uniqueVertices: 'global' ignore the end vertices? I know there isn't a way to specifically do that. But is there a way to accomplish the same thing?
'path' resulted in way too many results.
Thank you.

To use globally unique vertices for everything except the last vertex, you could add the last step of the path manually, like so:
FOR v,e,p
IN 0..5
ANY "entities/foo"
GRAPH "friendGraph"
OPTIONS {
bfs: true,
uniqueVertices: 'global'
}
FILTER p.vertices[*]._id ALL != "entities/bar"
FOR w,f
IN 1..1
ANY v
GRAPH "friendGraph"
FILTER w._id == "entities/bar"
SORT f.weight ASC
RETURN { edges: APPEND(p.edges, [f]), vertices: APPEND(p.vertices, [w]) }
I'd like to note two things:
The SORT operation you added might not achieve what you want: it sorts the paths by the weight of each path's last edge only.
This does not find all unique paths between the two vertices. For that, the option uniqueVertices: 'path' would be correct, and there might well be a lot of them.
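To make the difference between the two uniqueness modes concrete, here is a small Python simulation (not ArangoDB code). The graph and the function names are made up for illustration, and the second function is a simplification of the query above: it never walks through the target at all, whereas the AQL version filters such paths out afterwards.

```python
from collections import deque

# Hypothetical undirected friendship graph (not taken from the question).
EDGES = [("foo", "a"), ("foo", "b"), ("a", "bar"), ("b", "bar"), ("a", "b")]


def neighbors(vertex):
    """Return every vertex that shares an edge with `vertex` (direction ANY)."""
    result = []
    for left, right in EDGES:
        if left == vertex:
            result.append(right)
        elif right == vertex:
            result.append(left)
    return result


def paths_unique_per_path(start, target, max_depth):
    """Roughly uniqueVertices: 'path': a vertex may not repeat within one path."""
    found, queue = [], deque([[start]])
    while queue:
        path = queue.popleft()
        if len(path) > 1 and path[-1] == target:
            found.append(path)
        if len(path) - 1 < max_depth:
            for nxt in neighbors(path[-1]):
                if nxt not in path:
                    queue.append(path + [nxt])
    return found


def paths_global_except_target(start, target, max_depth):
    """Globally unique intermediate vertices; the target is added as a manual last hop."""
    found, seen, queue = [], {start}, deque([[start]])
    while queue:
        path = queue.popleft()
        # Manual last step, like the inner 1..1 traversal in the answer.
        if target in neighbors(path[-1]):
            found.append(path + [target])
        # The outer traversal may use at most max_depth - 1 edges.
        if len(path) - 1 < max_depth - 1:
            for nxt in neighbors(path[-1]):
                if nxt != target and nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
    return found


all_simple_paths = paths_unique_per_path("foo", "bar", 6)
reduced_paths = paths_global_except_target("foo", "bar", 6)
print(len(all_simple_paths), len(reduced_paths))
```

On this toy graph the per-path variant also finds the detours through both a and b, while the global variant keeps only one path per intermediate vertex.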

Related

ArangoDB: Traversal condition on related document

Been stuck for days with this concern, trying to accomplish this:
See the provided picture.
The black is the start vertex. Trying to get:
1: All child parts OUTBOUND (from) the start vertex
2: Condition: The children MUST have the INBOUND edge "types", and at its other end a document of the type "type" with a variable set to "true".
3: When a document of type "part" fails to meet the requirement of having an INBOUND document of type "type" with an attribute set to "true", the expansion of that path stops then and there.
4: The documents that fail aren't included in the result either.
5: Should be compatible with any depths.
6: No subqueries (if possible without).
[Image: example of graph]
With the given information, the data model seems questionable. Why are there true and false vertices instead of a boolean edge attribute per partScrew? Is there a reason why it is modeled like this?
Using this data model, I don't see how this would be possible without subqueries. The traversal down a path can be stopped early with PRUNE, but PRUNE does not support subqueries. That leaves only FILTER for post-filtering as an option. Be careful, though: you need to check every vertex on the path, not just the emitted vertex, for an inbound false type.
Not sure if it works as expected in all cases, but here is what I came up with and the query result, which looks good to me:
LET startScrew = FIRST(FOR doc IN screw LIMIT 1 RETURN doc) // Screw A
FOR v,e,p IN 1..2 OUTBOUND startScrew partScrew
FILTER (
FOR v_id IN SHIFT(p.vertices[*]._id) // ignore start vertex
FOR v2 IN 1..1 INBOUND v_id types
RETURN v2.value
) NONE == false
RETURN {
path: CONCAT_SEPARATOR(" -- ", p.vertices[*].value)
}
path
Screw A -- Part D
Screw A -- Part E
Screw A -- Part E -- Part F
Dump with test data: https://gist.github.com/Simran-B/6bd9b154d1d1e2e74638caceff42c44f
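The post-filter idea can be sketched outside of AQL as well. The following Python snippet is only an illustration with invented data (Part G and Part H are hypothetical and not part of the linked dump): it emits every path first and then keeps only those where no vertex after the start has a false flag attached.

```python
# Hypothetical data, loosely modeled on the question. CHILDREN stands in for
# the partScrew edges, TYPE_FLAGS for the values of inbound "types" documents.
CHILDREN = {
    "Screw A": ["Part D", "Part E", "Part G"],
    "Part E": ["Part F"],
    "Part G": ["Part H"],
}
TYPE_FLAGS = {
    "Part D": [True],
    "Part E": [True],
    "Part F": [True],
    "Part G": [False],
    "Part H": [True],
}


def emitted_paths(start, max_depth):
    """Emit every outbound path of 1..max_depth edges, depth-first."""
    results = []

    def walk(path):
        if len(path) - 1 >= max_depth:
            return
        for child in CHILDREN.get(path[-1], []):
            candidate = path + [child]
            results.append(candidate)
            walk(candidate)

    walk([start])
    return results


def path_is_valid(path):
    """Mirror of `NONE == false`: no vertex after the start may carry a False flag."""
    return all(False not in TYPE_FLAGS.get(vertex, []) for vertex in path[1:])


valid = [" -- ".join(p) for p in emitted_paths("Screw A", 2) if path_is_valid(p)]
print(valid)
```

Because the check covers the whole path, Part H is dropped even though its own flag is true: its parent Part G fails.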

Count distinct nodes from traversal in AQL

I am able to get all distinct nodes from a query, but not the count:
FOR v in 2..2 OUTBOUND "starting_node" GRAPH "some_graph"
return DISTINCT v._key
I want to get only the count of the result. I tried to use LENGTH(DISTINCT v._key) as suggested in the docs, but it is not valid AQL syntax:
syntax error, unexpected DISTINCT modifier near 'DISTINCT v._key)'
The naive solution is to get all keys and count it on the client side, but I am curious how to do it on the server side?
What RETURN DISTINCT does is to remove duplicate values, but only after the traversal.
You can set traversal options to eliminate paths during the traversal, which can be more efficient especially if you have a highly interconnected graph and a high traversal depth:
RETURN LENGTH(
FOR v IN 2..2 OUTBOUND "starting_node" GRAPH "some_graph"
OPTIONS { uniqueVertices: "global", bfs: true }
RETURN v._key
)
The traversal option uniqueVertices can be set to "global" so that you don't get the same vertex returned twice from this traversal. The option for breadth-first search bfs needs to be enabled to use uniqueVertices: "global". The reason why depth-first search does not support this uniqueness option is that the result would not be deterministic, hence this combination was disabled.
Inspired by this blogpost http://jsteemann.github.io/blog/2014/12/12/aql-improvements-for-24/ I prepared the solution using LET:
LET result = (FOR v in 2..2 OUTBOUND "starting_node" GRAPH "some_graph"
return DISTINCT v._key)
RETURN LENGTH(result)
It might not be the optimal solution, but it works as I expected.
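The two approaches (removing duplicates after the traversal versus never visiting a vertex twice) can be compared in a small Python simulation. The graph below is invented for illustration, and the functions are only a rough model of the traversal, not ArangoDB code.

```python
# Hypothetical directed graph: OUT[v] lists the outbound neighbors of v.
OUT = {"start": ["a", "b"], "a": ["c", "d"], "b": ["c"]}


def depth_two_with_duplicates(start):
    """One entry per path of exactly two edges, so a vertex can appear repeatedly."""
    return [v2 for v1 in OUT.get(start, []) for v2 in OUT.get(v1, [])]


def depth_two_global_bfs(start):
    """Breadth-first search that never visits a vertex twice."""
    seen = {start}
    frontier = [start]
    for _ in range(2):
        next_frontier = []
        for vertex in frontier:
            for nxt in OUT.get(vertex, []):
                if nxt not in seen:
                    seen.add(nxt)
                    next_frontier.append(nxt)
        frontier = next_frontier
    return frontier


# Deduplicate afterwards (RETURN DISTINCT wrapped in LENGTH) ...
count_after = len(set(depth_two_with_duplicates("start")))
# ... or avoid duplicates during the traversal.
count_during = len(depth_two_global_bfs("start"))
print(count_after, count_during)
```

On this graph both counts agree. The second variant simply does less work, because vertex c is reached only once.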

Arangodb query for all relations between subset of nodes limited by depth

For example, I want to query exactly this graph, starting from Dave with a depth limit of 2.
Now, if I want to get the nodes connected to Dave within a depth of 2, I would use:
For v,c in 0..2
ANY "persons/dave" knows
OPTIONS {uniqueVertices: "global",bfs: true }
return v
This would return:
Dave-Bob-Charlie-Eve-Alice (everyone in the graph)
But I do not know how to write a query that returns the correct set of relations, meaning:
The Eve-to-Alice edge is not missing.
If the graph were bigger, an Alice-to-someone-else edge would not be in the result.
My current solution below does not return Eve-to-Alice:
For v,c in 1..2
ANY "persons/dave" knows
OPTIONS {uniqueEdges: "global",bfs: true }
return c
In this case, Eve-to-Alice is a third level of traversal if you start at Dave. You can write the query:
for v, e, p in 1..3 ANY "persons/dave" knows options {uniqueEdges: "path",bfs: true}
return {vertex: v, edge: e, path: p}
This will give you every edge, including the one between Eve and Alice. Does this answer your question?
If you need to limit this to paths only between second level nodes, you need to create a filter.
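One way to think about such a filter: first collect the vertices within the depth limit, then keep only edges whose two endpoints are both in that set. The Python sketch below illustrates the idea on an invented edge list (the exact graph from the question is not known, and frank is a hypothetical extra person); it is not ArangoDB code.

```python
# Hypothetical "knows" edges, treated as undirected (direction ANY).
KNOWS = [
    ("dave", "bob"),
    ("dave", "charlie"),
    ("bob", "alice"),
    ("charlie", "eve"),
    ("eve", "alice"),
    ("alice", "frank"),
]


def vertices_within(start, max_depth):
    """Collect every vertex reachable from `start` in at most `max_depth` hops."""
    seen = {start}
    frontier = {start}
    for _ in range(max_depth):
        next_frontier = set()
        for left, right in KNOWS:
            if left in frontier and right not in seen:
                next_frontier.add(right)
            if right in frontier and left not in seen:
                next_frontier.add(left)
        seen |= next_frontier
        frontier = next_frontier
    return seen


def edges_among(vertices):
    """Keep only edges whose endpoints are both inside the vertex set."""
    return [edge for edge in KNOWS if edge[0] in vertices and edge[1] in vertices]


close_friends = vertices_within("dave", 2)
print(edges_among(close_friends))
```

Here the edge between eve and alice is kept because both are within two hops of dave, while the edge from alice to frank is dropped.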

What's the best method to filter graph edges by type in AQL

I have the following super-simple graph :
What I am trying to do is:
Select all questions where there is a property on the question document called firstQuestion with a value of true.
Select any options that are connected to the question via an outbound edge of type with_options
The following query works, but it feels like there must be a better way to inspect the edge type without string operations. Specifically, I use a concatenation to recreate the edge's _id value by joining its key with the edge type I want. Is this the best way to inspect the type of an edge?
FOR question IN questions
FILTER question.firstQuestion == true
let options =
(FOR v, e IN 1..1 OUTBOUND question._id GRAPH 'mygraph'
FILTER CONCAT('with_options/', e._key) == e._id
RETURN v)
RETURN {question: question, options: options}
We're currently introducing IS_SAME_COLLECTION for that specific purpose with ArangoDB 2.8.1.
The DOCUMENT function is also worth mentioning in this context.
FOR question IN questions
FILTER question.firstQuestion == true
LET options = (FOR v, e IN 1..1 OUTBOUND question._id GRAPH 'mygraph'
FILTER IS_SAME_COLLECTION('with_options', e._id)
RETURN v)
RETURN {question: question, options: options}
However, the best solution in this special case is not to use the named graph interface, but to specify the list of edge collections the traversal should consider in the first place:
FOR question IN questions
FILTER question.firstQuestion == true
LET options = (FOR v, e IN 1..1 OUTBOUND question._id with_options RETURN v)
RETURN {question: question, options: options}
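For intuition, the collection check boils down to looking at the part of the _id before the slash, since an _id has the form collection/key. The helper names below are hypothetical and only mimic the idea in Python; the real AQL function may accept more input forms than this sketch does.

```python
def collection_of(document_id):
    """Return the collection part of an _id of the form "collection/key"."""
    collection, _, _key = document_id.partition("/")
    return collection


def is_same_collection(collection_name, document_id):
    """Simplified stand-in for the collection check used in the query above."""
    return collection_of(document_id) == collection_name


# Hypothetical edges as they might come back from a traversal.
edges = [
    {"_id": "with_options/123", "_to": "options/1"},
    {"_id": "next_question/456", "_to": "questions/2"},
]
option_targets = [e["_to"] for e in edges if is_same_collection("with_options", e["_id"])]
print(option_targets)
```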

ArangoDB : How to get all the possible paths between 2 vertices?

How to get all the possible paths between 2 vertices (eg. X and Y) with maxDepth = 2?
I tried with TRAVERSAL but it is taking around 10 seconds to execute. Here is the query :
FOR p IN TRAVERSAL(locations, connections, "X", "outbound", { minDepth: 1, maxDepth: 2, paths: true })
FILTER p.destination._key == "Y"
RETURN p.path.vertices[*].name
The locations (vertices) collection has 23753 documents, and the connections (edges) collection has 123414 documents.
You can speed up the query a lot if you put the filter for destination right into Traversal via the options filterVertices to give examples of vertices that should be touched by the traversal. With vertexFilterMethod you can define what should happen with all vertices that do not match the example.
So in your query you only want to match the target vertex "Y"; all other vertices should be passed through but not included in the result, which is what "exclude" does.
This makes the later FILTER obsolete.
Right now the internal optimizer is not able to do that automagically but this magic is on our roadmap.
This is a query containing the optimization:
FOR p IN TRAVERSAL(locations, connections, "X", "outbound", { minDepth: 1, maxDepth: 2, paths: true, filterVertices: [{_key: "Y"}], vertexFilterMethod: ["exclude"]})
RETURN p.path.vertices[*].name
