I am new to graph databases and to ArangoDB.
I am trying to query a graph with different edge definitions and could not find any example of this. I did find a query that gets a result for a single edge definition:
FOR p IN person
FOR vx, ex, px IN ANY p GRAPH "test" FILTER vx.brand == "BMW" RETURN DISTINCT p
For example:
I have vertices "person", "car" and "house" and edges "has_car" (person->car) and "lives_in" (person->house). To try out I created three graphs. One for each edge definition and one with both edge definitions.
My question: What is the right way to query:
Persons who have a "BMW" and live in a "Castle"
Persons who live in a "Skyscraper" and don't have a car
Persons who live in a "Skyscraper" and don't have a "BMW"
Thanks.
What if you started from one end of the path? That way you can check the whole path.
For example, the following query starts from a BMW car (for the first question):
FOR car IN Car
  FILTER car.brand == "BMW"
  FOR v, e, p IN 0..2 ANY car._id GRAPH 'test'
    FILTER p.edges[0] != null && IS_SAME_COLLECTION('has_car', p.edges[0])
      && p.vertices[1] != null && IS_SAME_COLLECTION('Person', p.vertices[1])
      && p.edges[1] != null && IS_SAME_COLLECTION('lives_in', p.edges[1])
      && p.vertices[2] != null && p.vertices[2].house == "Castle"
    RETURN DISTINCT p.vertices[1]
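The other two questions ask for persons who lack a certain edge, which a filter on a single path cannot express. One possible approach is to start from each person and test two subqueries. This is only a sketch: it assumes the collection names from the question (person, has_car, lives_in), that the kind of house is stored in an attribute named house as in the query above, and a single-server setup (a cluster would likely need a WITH declaration).

```aql
// Persons who live in a "Skyscraper" and do not have a "BMW".
// Assumption: the kind of house is stored in an attribute named `house`.
FOR p IN person
  // Does this person live in at least one skyscraper?
  LET skyscrapers = (
    FOR h IN 1..1 OUTBOUND p lives_in
      FILTER h.house == "Skyscraper"
      LIMIT 1
      RETURN 1
  )
  // Does this person own at least one BMW?
  // Remove the FILTER line to test for "no car at all" instead.
  LET bmws = (
    FOR c IN 1..1 OUTBOUND p has_car
      FILTER c.brand == "BMW"
      LIMIT 1
      RETURN 1
  )
  FILTER LENGTH(skyscrapers) > 0 AND LENGTH(bmws) == 0
  RETURN p
```

Because the traversals name the edge collections directly, this version does not depend on which of the three graphs exists.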
I'm interested in using traversals to quickly find all the documents linked to an initial document. For this I'd use:
let id = 'documents/18787898'
for d in documents
filter d._id == id
for i in 1..1 any d edges
return i
This generally provides me with all the documents related to the initial ones. However, say that in these edges I have more information than just the standard _from and _to. Say it also contains order, in which I indicate the order in which something is to be displayed. Is there a way to also grab that information at the same time as making the traversal? Or do I now have to make a completely separate query for that information?
You are very close, but your graph traversal is slightly incorrect.
The way I read the documentation, it shows that you can return vertex, edge, and path objects in a traversal:
FOR vertex[, edge[, path]]
IN [min[..max]]
OUTBOUND|INBOUND|ANY startVertex
edgeCollection1, ..., edgeCollectionN
I suggest adding the edge variable e to your FOR statement. You also do not need to find document/vertex matches first (given that id is a single string), so the FOR/FILTER pair can be eliminated:
LET id = 'documents/18787898'
FOR v, e IN 1 ANY id edges
RETURN e
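If you want the linked document and the edge's order attribute together, both variables can be used in the same traversal. A small sketch, assuming the edge attribute is literally named order as described in the question:

```aql
LET id = 'documents/18787898'
FOR v, e IN 1..1 ANY id edges
  // Sort by the display order stored on the edge.
  SORT e.order
  // Return the linked document together with the edge attribute.
  RETURN { document: v, order: e.order }
```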
Q: Can I limit the edge collections the system will try to use when traversing named graphs AQL?
Scenario:
If I have a named graph productGraph with two vertex collections and two edge collections:
Vertices: product, price
Edges:
prodParentOf (product A is parent of product B)
prodHasPrice (product A has a price of $X)
If I now want the child products of product A (and no prices), I would like to do something like this:
WITH product
FOR v, e, p IN OUTBOUND 'product/A'
GRAPH 'productGraph'
RETURN {vertice:v, edge:e, path: p}
However, if I look at the explain plan, I see that the system attempted to use the indexes for both prodParentOf and prodHasPrice (even if I explicitly put the product collection in the 'With' clause):
Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
2 edge prodHasPrice false false 75.00 % [ `_from`, `_to` ] base OUTBOUND
2 edge prodParentOf false false 65.37 % [ `_from`, `_to` ] base OUTBOUND
Can I limit the edge collections the system will try to use when querying named graphs? Or do I have to use edge collections in the query instead (which, in my mind, would mean that it is generally better to traverse edge collections than named graphs)?
Here is the same query using an edge collection
FOR v, e, p IN OUTBOUND 'product/A'
prodParentOf
RETURN {vertice:v, edge:e, path: p}
The WITH clause does not restrict which collections of your named graph will be used in a traversal. It is mainly for traversals in a cluster, to declare which collections will be involved. This helps to avoid deadlocks, which may occur if collections are lazily locked at query runtime.
If you use a single server instance, then the WITH clause is optional. It does not have an effect on the result. If you want to exclude collections from a traversal, you can either use collection sets instead of the named graph, or use FILTERs together with IS_SAME_COLLECTION(). Using collection sets is more efficient, because with fewer edge collections there are fewer edges to traverse, whereas filters are applied after the traversal in most cases.
FOR v, e, p IN 1..5 OUTBOUND 'verts/start' GRAPH 'named-graph'
FILTER (FOR id IN p.edges[*]._id RETURN IS_SAME_COLLECTION('edgesX', id)) ALL == true
RETURN p
If your traversal has a depth of 1 only, then a filter query is simpler:
FOR v, e, p IN OUTBOUND 'product/A' GRAPH 'productGraph'
FILTER IS_SAME_COLLECTION('prodParentOf', e)
RETURN {vertex: v, edge: e, path: p}
A way to prune paths may come in the future, which should also help with your named graph scenario.
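Later ArangoDB releases did add more direct ways to do this. The sketch below assumes ArangoDB 3.7 or newer, where the traversal OPTIONS accept an edgeCollections restriction for named graphs; check the documentation for your version before relying on it.

```aql
// Traverse the named graph, but only follow prodParentOf edges.
// Assumption: ArangoDB 3.7+ (edgeCollections traversal option).
FOR v, e, p IN 1..1 OUTBOUND 'product/A' GRAPH 'productGraph'
  OPTIONS { edgeCollections: ['prodParentOf'] }
  RETURN { vertex: v, edge: e, path: p }
```

With this restriction in place, the explain plan would be expected to list only the prodParentOf edge index.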
If there are vertices (e.g. Star, Movie) and edges (e.g. star_in, director, producer) in ArangoDB, and I want to get the movies that both star and are directed by Stephen Chow, how do I write the query?
In this case you can use an AQL graph traversal to fetch the neighbors of a vertex:
FOR n IN ANY @startId @@edgeCollection OPTIONS {bfs: true, uniqueVertices: 'global'}
RETURN n._id
ANY/INBOUND/OUTBOUND determines the direction of the edges, while @startId is your start vertex (in this case Stephen Chow) and @@edgeCollection is the edge collection to use.
When two conditions should be applied (starring and directed), an INTERSECTION of two such neighbor queries can be used.
The following AQL query is a draft for your use case:
FOR x IN INTERSECTION
((FOR y IN ANY 'star/StephenChow' star_in OPTIONS {bfs: true, uniqueVertices: 'global'} RETURN y._id),
(FOR y IN ANY 'star/StephenChow' director OPTIONS {bfs: true, uniqueVertices: 'global'} RETURN y._id))
RETURN x
A working Actor/Movie example can be found in the Cookbook section of the documentation.
I am having some trouble wrapping my head around how to traverse a certain graph to extract some data.
Given a collection of "users" and a collection of "places".
And a "likes" edge collection to denote that a user likes a certain place. The "likes" edge collection also has a "review" property to store a user's review about the place.
And a "follows" edge collection to denote that a user follows another user.
How can I traverse the graph to fetch all the places that I like with my review of the place and the reviews of the users I follow that also like the same place.
For example, in the above graph, I am user 6327 and I reviewed both places (7968 and 16213).
I also follow user 6344, who also happens to have reviewed place 7968.
How can I get all the places that I like, plus the reviews of the people I follow who also reviewed those same places?
an expected output would be something like the following:
[
  {
    name: "my name",
    place: "place 1",
    id: 1,
    review: "my review about place 1"
  },
  {
    name: "my name",
    place: "place 2",
    id: 2,
    review: "my review about place 2"
  },
  {
    name: "name of the user I follow",
    place: "place 2",
    id: 2,
    review: "review about place 2 from the user I follow"
  }
]
There are a number of ways to do this query, and it also depends on where you want to add parameters, but for the sake of simplicity I've built this quite verbose query below to help you understand one way of approaching the problem.
One way is to determine the _id of your user record, then find all the _id's of the friends you follow, and then to work out all related reviews in one query.
I take a different approach below, and that is to:
Determine the reviews you have written
Determine who you follow
Determine the reviews the people you follow have written
Merge together your reviews with those of the people you follow
It is possible to merge these queries together more optimally, but I thought it worth breaking them out like this (and showing the output of each stage as well as the final answer) to help you see what data is available.
A key thing to understand about AQL graph queries is how you have access to vertices, edges, and paths when you perform a query.
A path is an object in its own right, and it is worth investigating the contents of that object to better understand how to exploit it for path information.
This query assumes:
users document collection contains users
places document collection contains places
follows edge collection tracks users following other users
reviews edge collection tracks reviews people wrote
Note: When providing an id on each record I used the id of the review, because if you know that id you can fetch the edge document and get the id of both the user and the place as well as read all the data about the review.
LET my_reviews = (
FOR vertices, edges, paths IN 1..1 OUTBOUND "users/6327" reviews
RETURN {
name: FIRST(paths.vertices).name,
review_id: FIRST(paths.edges)._id,
review: FIRST(paths.edges).review,
place: LAST(paths.vertices).place
}
)
LET who_i_follow = (
FOR v IN 1..1 OUTBOUND "users/6327" follows
RETURN v
)
LET reviews_of_who_i_follow = (
FOR users IN who_i_follow
FOR vertices, edges, paths in 1..1 OUTBOUND users._id reviews
RETURN {
name: FIRST(paths.vertices).name,
review_id: FIRST(paths.edges)._id,
review: FIRST(paths.edges).review,
place: LAST(paths.vertices).place
}
)
RETURN {
my_reviews: my_reviews,
who_i_follow: who_i_follow,
reviews_of_who_i_follow: reviews_of_who_i_follow,
merged_reviews: UNION(my_reviews, reviews_of_who_i_follow)
}
The first vertex in paths.vertices is the starting vertex (users/6327)
The last vertex in paths.vertices is the end of the path, e.g. who you follow
The first edge in paths.edges is the review that the user made of the place
Here is another more compact version of the query that takes a param, the _id of the user that is 'you'.
LET target_users = APPEND(TO_ARRAY(@user), (
FOR v IN 1..1 OUTBOUND @user follows RETURN v._id
))
))
LET selected_reviews = (
FOR u IN target_users
FOR vertices, edges, paths in 1..1 OUTBOUND u reviews
LET user = FIRST(paths.vertices)
LET place = LAST(paths.vertices)
LET review = FIRST(paths.edges)
RETURN {
name: user.name,
review_id: review._id,
review: review.review,
place: place.place
}
)
RETURN selected_reviews
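The queries above return every review written by the people you follow. If only the places you reviewed yourself should appear, as the question asks, one option is to start from your own reviews and then look at who else reviewed each place. A sketch under the same assumptions as above (a reviews edge collection pointing from users to places, and a bind parameter @user holding your user _id); in a cluster a WITH declaration for users and places would likely be needed:

```aql
// Users I follow, as a list of _id strings.
LET followed = (
  FOR u IN 1..1 OUTBOUND @user follows
    RETURN u._id
)
// Every place I reviewed...
FOR place IN 1..1 OUTBOUND @user reviews
  // ...and every review of that place, walking the edges backwards.
  FOR author, review IN 1..1 INBOUND place reviews
    // Keep only my own reviews and those of users I follow.
    FILTER author._id == @user OR author._id IN followed
    RETURN {
      name: author.name,
      review_id: review._id,
      review: review.review,
      place: place.place
    }
```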
For example, I want to query out exactly this graph starting from dave with limit of depth 2
Now if I want to get the node connected to Dave with depth of 2 I would use
For v,c in 0..2
ANY "persons/dave" knows
OPTIONS {uniqueVertices: "global",bfs: true }
return v
This would return:
Dave-Bob-Charlie-Eve-Alice (everyone in the graph)
But I do not know how to write a query that returns the correct set of relations, meaning:
The Eve-to-Alice edge is not missing
If the graph were bigger, an Alice-to-someone-else edge would not be in the result
My current solution below does not return Eve-to-Alice:
For v,c in 1..2
ANY "persons/dave" knows
OPTIONS {uniqueEdges: "global",bfs: true }
return c
In this case, Eve-to-Alice is a third level of traversal if you start at Dave. You can write the query:
for v, e, p in 1..3 ANY "persons/dave" knows options {uniqueEdges: "path",bfs: true}
return {vertex: v, edge: e, path: p}
This will give you every edge, including the one between Eve and Alice. Does this answer your question?
If you need to limit this to paths only between second level nodes, you need to create a filter.
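One way to write such a filter is to first collect the vertices within depth 2 of Dave and then keep only the edges whose two ends are both in that set. This is a sketch using the persons and knows collections from the question:

```aql
// All vertices reachable from Dave within two hops (Dave included).
LET nodes = (
  FOR v IN 0..2 ANY "persons/dave" knows
    OPTIONS { uniqueVertices: "global", bfs: true }
    RETURN v._id
)
// Keep every edge that connects two of those vertices.
FOR e IN knows
  FILTER e._from IN nodes AND e._to IN nodes
  RETURN e
```

This would include Eve-to-Alice when both are within two hops of Dave, while an edge from Alice to a vertex outside that set would be left out.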