How can I determine root objects in an ArangoDB tree graph?

I have a document collection containing tree nodes and an edge collection containing "is child of" like this:
Folders=[
{_key:"1",name:"Root1"},
{_key:"2",name:"Root2"},
{_key:"3",name:"Root1.Node1"},
{_key:"4",name:"Root1.Node2"}]
FolderRelations=[
{_from:"Folders/3",_to:"Folders/1"},
{_from:"Folders/4",_to:"Folders/1"}
]
Now I would like to determine which Folder items are root objects in that tree (all objects that have no outbound relation).
Maybe I am a bit stuck thinking in SQL, but I would like to do something like:
SELECT *
FROM Folders
WHERE NOT EXISTS (SELECT * FROM FolderRelations WHERE FolderRelations.FromKey=Folders.Key)
I cannot use the traversal and path functionality because I have no vertex to start from.

Here is an AQL example that should solve your problem:
for f in Folders
filter LENGTH(EDGES(FolderRelations, f._id, "outbound")) == 0
return f
This returns a list of all vertices that have no folder above them in the hierarchy.
But be aware: saving {key:1} will not have the desired effect. You have to set:
{_key: "1"}
_key is the attribute used for the internal document key, and its value has to be a string.
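The same "no outbound edge" logic can be sketched outside the database. The Python snippet below is only an illustration over in-memory copies of the sample data from the question (it does not talk to ArangoDB, and find_roots is a made-up helper name): a folder counts as a root if its _id never appears as the _from of a relation.

```python
# In-memory copies of the sample documents from the question (illustration only).
folders = [
    {"_key": "1", "name": "Root1"},
    {"_key": "2", "name": "Root2"},
    {"_key": "3", "name": "Root1.Node1"},
    {"_key": "4", "name": "Root1.Node2"},
]
folder_relations = [
    {"_from": "Folders/3", "_to": "Folders/1"},
    {"_from": "Folders/4", "_to": "Folders/1"},
]


def find_roots(vertices, edges, collection="Folders"):
    """Return the vertices that have no outbound edge (no parent)."""
    # Every _from value belongs to a vertex that is a child of something.
    children = {edge["_from"] for edge in edges}
    # ArangoDB builds _id as "<collection>/<_key>".
    return [v for v in vertices if f"{collection}/{v['_key']}" not in children]


roots = find_roots(folders, folder_relations)
print([r["name"] for r in roots])
```

Note that the keys are strings here as well, matching the _key requirement mentioned above.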

Related

ArangoDB - Get edge information while using traversals

I'm interested in using traversals to quickly find all the documents linked to an initial document. For this I'd use:
let id = 'documents/18787898'
for d in documents
filter d._id == id
for i in 1..1 any d edges
return i
This generally provides me with all the documents related to the initial ones. However, say that in these edges I have more information than just the standard _from and _to. Say it also contains order, in which I indicate the order in which something is to be displayed. Is there a way to also grab that information at the same time as making the traversal? Or do I now have to make a completely separate query for that information?
You are very close, but your graph traversal is slightly incorrect.
The way I read the documentation, it shows that you can return vertex, edge, and path objects in a traversal:
FOR vertex[, edge[, path]]
IN [min[..max]]
OUTBOUND|INBOUND|ANY startVertex
edgeCollection1, ..., edgeCollectionN
I suggest adding the edge variable e to your FOR statement. You also do not need to find the matching document/vertex first (given that id is a single string), so the FOR/FILTER pair can be eliminated:
LET id = 'documents/18787898'
FOR v, e IN 1 ANY id edges
RETURN e
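To make it clearer what having both v and e gives you, here is a rough Python model of a depth-1 ANY traversal. It is only a sketch over made-up in-memory data (the _id values, titles, and order values are assumptions, and neighbours_any is a hypothetical helper); it shows that each neighbouring vertex arrives together with the edge that led to it, so edge attributes such as order are available in the same pass.

```python
# Hypothetical sample data: the _id values and the "order" attribute are made up.
documents = {
    "documents/1": {"_id": "documents/1", "title": "start"},
    "documents/2": {"_id": "documents/2", "title": "second"},
    "documents/3": {"_id": "documents/3", "title": "third"},
}
edges = [
    {"_from": "documents/1", "_to": "documents/2", "order": 2},
    {"_from": "documents/3", "_to": "documents/1", "order": 1},
    {"_from": "documents/2", "_to": "documents/3", "order": 3},
]


def neighbours_any(start_id, vertices, edge_list):
    """Mimic a depth-1 ANY traversal: yield (vertex, edge) for each edge touching start_id."""
    for edge in edge_list:
        if edge["_from"] == start_id:
            yield vertices[edge["_to"]], edge
        elif edge["_to"] == start_id:
            yield vertices[edge["_from"]], edge


# Sort the neighbours by "order", an attribute that only the edge carries.
pairs = sorted(
    neighbours_any("documents/1", documents, edges),
    key=lambda pair: pair[1]["order"],
)
print([(v["title"], e["order"]) for v, e in pairs])
```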

Limit edges used on named graph traversal

Q: Can I limit the edge collections the system will try to use when traversing named graphs AQL?
Scenario:
If I have a named graph productGraph with two vertex collections and two edge collections:
Vertices: product, price
Edges:
prodParentOf (product A is parent of product B)
prodHasPrice (product A has a price of $X)
If I now want the child products of product A (and no prices), I would like to do something like this:
WITH product
FOR v, e, p IN OUTBOUND 'product/A'
GRAPH 'productGraph'
RETURN {vertice:v, edge:e, path: p}
However, if I look at the explain plan, I see that the system attempted to use the indexes for both prodParentOf and prodHasPrice (even if I explicitly put the product collection in the 'With' clause):
Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
2 edge prodHasPrice false false 75.00 % [ `_from`, `_to` ] base OUTBOUND
2 edge prodParentOf false false 65.37 % [ `_from`, `_to` ] base OUTBOUND
Can I limit the edge collections the system will try to use when querying named graphs? Or do I have to use edge collections in the query instead (which, in my mind, would mean that it is generally better to traverse edge collections than named graphs)?
Here is the same query using an edge collection:
FOR v, e, p IN OUTBOUND 'product/A'
prodParentOf
RETURN {vertice:v, edge:e, path: p}
The WITH clause does not restrict which collections of your named graph are used in a traversal. It is mainly for traversals in a cluster, to declare which collections will be involved. This helps to avoid deadlocks, which may occur if collections are lazily locked at query runtime.
If you use a single server instance, then the WITH clause is optional. It does not have an effect on the result. If you want to exclude collections from a traversal, you can either use collection sets instead of the named graph, or use FILTERs together with IS_SAME_COLLECTION(). Using collection sets is more efficient, because with fewer edge collections there are fewer edges to traverse, whereas filters are applied after the traversal in most cases.
FOR v, e, p IN 1..5 OUTBOUND 'verts/start' GRAPH 'named-graph'
FILTER (FOR id IN p.edges[*]._id RETURN IS_SAME_COLLECTION('edgesX', id)) ALL == true
RETURN p
If your traversal has a depth of 1 only, then a filter query is simpler:
FOR v, e, p IN OUTBOUND 'product/A' GRAPH 'productGraph'
FILTER IS_SAME_COLLECTION('prodParentOf', e)
RETURN {vertex: v, edge: e, path: p}
A way to prune paths may come in the future, which should also help with your named graph scenario.
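The path filter above boils down to "keep a path only if every edge on it comes from the wanted edge collection". A minimal Python sketch of that check follows, assuming the usual "collection/key" shape of an _id. The helper names and the sample edge ids are made up for illustration, and this is not how ArangoDB evaluates the filter internally.

```python
def collection_of(doc_id):
    """Return the collection part of an ArangoDB-style _id ("collection/key")."""
    return doc_id.split("/", 1)[0]


def path_uses_only(path_edges, collection):
    """True if every edge on the path belongs to the given edge collection."""
    return all(collection_of(edge["_id"]) == collection for edge in path_edges)


# Hypothetical paths, each given as the list of edges it passes through.
paths = [
    [{"_id": "prodParentOf/1"}, {"_id": "prodParentOf/2"}],
    [{"_id": "prodParentOf/1"}, {"_id": "prodHasPrice/7"}],
]
kept = [p for p in paths if path_uses_only(p, "prodParentOf")]
print(len(kept))
```

Because the check runs on complete paths, the second path is fully built before it is discarded, which mirrors why restricting the edge collections up front tends to be cheaper.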

Get all documents within a folder or site that has a specific Aspect property value?

I have an aspect that I have associated with multiple documents. For example, let's call the aspect OrderAspect.
The query below works when I fetch all nodes that have the location property from OrderAspect set to 'WAREHOUSE-A':
SELECT * FROM oa:OrderAspect WHERE oa:Location ='WAREHOUSE-A'
How can I extend this query to get ONLY documents that have this aspect value as 'WAREHOUSE-A'.
Can I extend this query to search within a folder path or site? I would like to list all the documents within a folder (including subfolder) or a site that has OrderAspect with property location set to 'WAREHOUSE-A'.
Here is how you do a CMIS query that restricts results to a value defined in an aspect:
select D.cmis:name from cmis:document as D join sc:temp as T on D.cmis:objectId = T.cmis:objectId where T.sc:prop1 = 'value1'
Here is how you add an AND clause to require that the result be in a certain path, including sub-folders:
select D.cmis:name from cmis:document as D join sc:temp as T on D.cmis:objectId = T.cmis:objectId where T.sc:prop1 = 'value1' AND CONTAINS(D, 'PATH:\"/app:company_home/st:sites/cm:jtp-test-site-1/cm:documentLibrary//*\"')

python3 wx.TreeCtrl - how to iterate through several levels

I have a treectrl structure which is populated from an external search of an open data set hosted by our municipal government. The data pertains to business licenses and is requested using Pandas and Sodapy. The tree is populated as follows:
for index, row in results_df.iterrows():
    tradename = row['tradename']
    address = row['address']
    licTypes = row['licencetypes']
    comm = row['comdistnm']
    jobSts = row['jobstatusdesc']
    jobCrt = row['jobcreated']
    lng = row['longitude']
    lng = str(lng)
    lat = row['latitude']
    lat = str(lat)
    # Populate Tree Controls with DataFrame values
    trdName = self.thrTree.AppendItem(root, tradename)
    self.thrTree.AppendItem(trdName, address)
    self.thrTree.AppendItem(trdName, licTypes)
    self.thrTree.AppendItem(trdName, comm)
    self.thrTree.AppendItem(trdName, jobSts)
    self.thrTree.AppendItem(trdName, jobCrt)
    self.thrTree.AppendItem(trdName, lng)
    self.thrTree.AppendItem(trdName, lat)
This results in a final structure of a root, then a first-level node with the business name which, when expanded, contains all the information listed above. So I assume that is the root level, then a child node, then children of that child node? I am not even sure what the second-level indented nodes are called (I have heard the term leaf used for the third level before). But I digress: what I am interested in is grabbing the latitude and longitude of where the business is located, and then allowing the user to map the location if they choose. I bind wx.EVT_TREE_ITEM_ACTIVATED so that when the user double-clicks on a business name to get the details, I can grab the items displayed. This is how I am currently trying to iterate through the child nodes:
item = self.thrTree.GetSelection()
while self.thrTree.GetItemParent(item):
    piece = self.thrTree.GetItemText(item)
    tmpHldr.insert(0, piece)
    item = self.thrTree.GetItemParent(item)
Looking at item, it appears to be collecting all the business names under root, and ignoring the third level items of interest.
What do I need to do to go deeper within the tree to grab the details under the business clicked on, and not just the list of business names under the root item, which is called 'Search Results'?
Thanks!
@YYC_Code,
Did you look here?
It has a GetFirstChild()/GetNextChild() pair of functions that you can use to iterate. It also has an ItemHasChildren() function, which you can use to check whether the item has any children before using the pair mentioned above.
EDIT:
[quote]
For this enumeration function you must pass in a ‘cookie’ parameter which is opaque for the application but is necessary for the library to make these functions reentrant (i.e. allow more than one enumeration on one and the same object simultaneously). The cookie passed to GetFirstChild and GetNextChild should be the same variable.
[/quote]
You need to make sure that the same cookie variable is used throughout the iteration.
You should also check for the end condition:
[quote]
Returns an invalid tree item (i.e. wx.TreeItemId.IsOk returns False) if there are no further children.
[/quote]
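Putting the two quoted points together, the loop could look roughly like the sketch below. It assumes the wxPython behaviour where GetFirstChild(item) and GetNextChild(item, cookie) each return a (child, cookie) pair. FakeItem and FakeTree are made-up stand-ins so the sketch runs without a GUI, and the sample labels are invented; with a real wx.TreeCtrl you would pass the control and the activated item instead.

```python
def iter_children(tree, parent):
    """Yield each direct child of `parent` using GetFirstChild/GetNextChild."""
    child, cookie = tree.GetFirstChild(parent)
    while child.IsOk():  # an invalid item signals that there are no more children
        yield child
        # Pass the same cookie variable back in on every call.
        child, cookie = tree.GetNextChild(parent, cookie)


def child_texts(tree, parent):
    """Collect the label of every direct child of `parent`."""
    return [tree.GetItemText(child) for child in iter_children(tree, parent)]


class FakeItem:
    """Stand-in for wx.TreeItemId (hypothetical, for running without wx)."""

    def __init__(self, text=None, children=()):
        self.text = text
        self.children = list(children)

    def IsOk(self):
        return self.text is not None


class FakeTree:
    """Stand-in exposing only the wx.TreeCtrl methods used above."""

    def GetFirstChild(self, parent):
        return self.GetNextChild(parent, 0)

    def GetNextChild(self, parent, cookie):
        if cookie < len(parent.children):
            return parent.children[cookie], cookie + 1
        return FakeItem(), cookie  # invalid item: IsOk() returns False

    def GetItemText(self, item):
        return item.text


# Invented sample values; the real tree holds the DataFrame fields.
business = FakeItem("Some Business", [FakeItem("123 Main St"), FakeItem("-114.07"), FakeItem("51.04")])
details = child_texts(FakeTree(), business)
print(details)
```

In the EVT_TREE_ITEM_ACTIVATED handler, something like child_texts(self.thrTree, event.GetItem()) should return the details of the clicked business. Given the AppendItem order in the question, longitude and latitude would be the last two entries.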

Create edge from attribute with ArangoDB AQL

I have imported a list of documents (in a collection named "assemblies"). One of the attributes is "parent_id".
Based on this, I want to construct the graph that is implicitly described by this attribute.
"id","name","parent_id"
"30","Top level"
"30.1","30.1 Child 1","30"
"30.2","30.2 Child 2","30"
This is the query that I expected to give me the info for creating the edge collection (named "contains", so it goes from parent to child):
FOR assy IN assemblies
LET parent = (
FOR parent IN assemblies
FILTER parent.id == assy.parent_id
RETURN parent
)
RETURN {_from: parent._key, _to: assy._key}
What am I doing wrong? Could you give me the full query for inserting the edges?
The problem is that the result of your subquery in parent is an array and not a document. But there is actually no need for a subquery. You can perform a join instead, which should offer better performance and is easier to read.
You also have to use the value of _id instead of _key for the _from and _to fields of your edges.
The following query does exactly what you want.
FOR assy IN assemblies
FOR parent IN assemblies
FILTER parent.id == assy.parent_id
INSERT {_from: parent._id, _to: assy._id} IN contains
RETURN NEW
Note: the RETURN NEW is optional. You can use it to check whether the insert was successful. With larger amounts of data I would drop it.
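For comparison, the same self-join can be sketched in plain Python over the sample rows from the question. This is only an illustration of the matching logic, not a replacement for the AQL query: the _id values are hypothetical stand-ins for what ArangoDB would assign on import, and build_edges is a made-up helper name.

```python
# Rows from the question's CSV; the _id values are hypothetical.
assemblies = [
    {"_id": "assemblies/100", "id": "30", "name": "Top level", "parent_id": None},
    {"_id": "assemblies/101", "id": "30.1", "name": "30.1 Child 1", "parent_id": "30"},
    {"_id": "assemblies/102", "id": "30.2", "name": "30.2 Child 2", "parent_id": "30"},
]


def build_edges(docs):
    """Return one {_from, _to} edge per document whose parent_id matches another document's id."""
    by_id = {doc["id"]: doc for doc in docs}  # lookup table instead of a nested loop
    edges = []
    for doc in docs:
        parent = by_id.get(doc.get("parent_id"))
        if parent is not None:
            # Edges reference full document ids (_id), not bare keys.
            edges.append({"_from": parent["_id"], "_to": doc["_id"]})
    return edges


edges = build_edges(assemblies)
print(edges)
```

The top-level row has no parent and therefore produces no edge, which matches the behaviour of the FILTER in the join.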
