Storing gremlin count values and comparing later in traversal - groovy

I am new to Gremlin and have just started with it.I have a graph where every vertex has a property Referenceable.qualifiedName and __typeName.
Edge label between vertices with '__typeName' as 'avro_schema' and 'avro_field' is '__avro_record.fields' and there is 1 to many relationship between 'avro_schema'(1) and 'avro_field'(many).
Edge label between vertices with '__typeName' as 'avro_field' and 'DataClassification' is 'classifiedAs' and there is 1 to many relationship between 'avro_field'(1) and 'DataClassification'(many).
I want to find out all 'avro_schema' vertices property 'Referenceable.qualifiedName' in the graph where each of the 'avro_field' has a 'classifiedAs' relationship to 'DataClassification'
I have tried the gremlin query to find out given a particular avro_schema find out the number of avro_field which has a relationship to DataClassification via classifiedAs. That works.
But i am not able to keep a count of the edges between avro_schema to avro_field and then compare with number of avro_field which has relationship to DataClassification type.
This gives number of classified avro_field for a give avro_schema
g.V().has('__typeName','avro_schema').has('Referenceable.qualifiedName' , "com.example.avro.test2Complicated16").outE('__avro_record.fields').inV().out('classifiedAs').has('__typeName','tDataClassification').count()
Also I tried to this to aggregate across all avro_schema which satisfies the condition,but it doesn't work.
g.V().has('__typeName','avro_schema').where(identity().out('__avro_record.fields').store('sumi').out('classifiedAs').has('__typeName','DataClassification').count().is(eq('sumi'.size()))).values('Referenceable.qualifiedName')
I want to know all the avro_schema in which all the avro_field has any of the 'classifiedAs' edge relationship to DataClassification
On further trying I got down to the query but size of the collection 'xm' is always returned as 0.
g.V().has('__typeName','avro_schema').local(out('__avro_record.fields').store('xm').local(out('classifiedAs').has('__typeName', 'DataClassification').count().is(eq(1))).count().is(eq(select('xm').size())))

Not sure if I'm following the problem description correctly, but here's a wild guess for the traversal you might be looking for:
g.V().has('__typeName','avro_schema').not(
out('__avro_record.fields').
out('classifiedAs').has('__typeName',neq('DataClassification'))).
values('Referenceable.qualifiedName')
UPDATE
// at least one __avro_record.fields relation
g.V().has('__typeName','avro_schema').filter(
out('__avro_record.fields').
groupCount().
by(choose(out('classifiedAs').has('__typeName','DataClassification'),
constant('y'), constant('n'))).
and(select('y'), __.not(select('n')))).
values('Referenceable.qualifiedName')
// include avro_schema w/o __avro_record.fields relations
g.V().has('__typeName','avro_schema').not(
out('__avro_record.fields').
groupCount().
by(choose(out('classifiedAs').has('__typeName','DataClassification'),
constant('y'), constant('n'))).
select('n')).
values('Referenceable.qualifiedName')

Related

ArangoDB - Get edge information while using traversals

I'm interested in using traversals to quickly find all the documents linked to an initial document. For this I'd use:
let id = 'documents/18787898'
for d in documents
filter d._id == id
for i in 1..1 any d edges
return i
This generally provides me with all the documents related to the initial ones. However, say that in these edges I have more information than just the standard _from and _to. Say it also contains order, in which I indicate the order in which something is to be displayed. Is there a way to also grab that information at the same time as making the traversal? Or do I now have to make a completely separate query for that information?
You are very close, but your graph traversal is slightly incorrect.
The way I read the documentation, it shows that you can return vertex, edge, and path objects in a traversal:
FOR vertex[, edge[, path]]
IN [min[..max]]
OUTBOUND|INBOUND|ANY startVertex
edgeCollection1, ..., edgeCollectionN
I suggest adding the edge variable e to your FOR statement, and you do not need to find document/vertex matches first (given than id is a single string), so the FOR/FILTER pair can be eliminated:
LET id = 'documents/18787898'
FOR v, e IN 1 ANY id edges
RETURN e

friends of friend Query in ArangoDB 3.0

I want to writing 'friends of friend' traversal using AQL
I have a Collection with Name:User and a edge Collection with name Conatct.
my Conatct documents:
I also read this article that implement friends of friend in ArangoDb, but that's post Uses functions of lower version of ArangoDB that used GRAPH_NEIGHBORS() function.
in ArnagoDB 3.0(latest version), GRAPH_NEIGHBORS() function have been removed!
now, how can I implement fof using Aql in ArnagoDB 3.0 ?
thanks a lot
The graph functions have been removed, because there is the more powerful, flexible and performant native AQL traversal, which was introduced with 2.8, and extended and optimized for version 3.0.
To retrieve friends of friends, a traversal starting at the user in question with a traversal depth = 2 is needed:
LET user = DOCUMENT("User/#9302796301")
LET foaf = (
FOR v IN 2..2 ANY user Contact
RETURN v // you might wanna return the name only here
)
RETURN MERGE(user, { foaf } )
The document for the user with _key = #9302796301 is loaded and assigned to a variable user. It is used as start vertex for a traversal with min and max depth = 2, using the edges of the collection Contact and ignoring their direction (ANY; can also be INBOUND or OUTBOUND). The friends of friends documents are fully returned in this example (v) and merged with the user document, using the attribute key "foaf" and the value of the variable foaf.
This is just one simple example how to traverse graphs and how to construct result sets. There are many more options of course.

loopback relational database hasManyThrough pivot table

I seem to be stuck on a classic ORM issue and don't know really how to handle it, so at this point any help is welcome.
Is there a way to get the pivot table on a hasManyThrough query? Better yet, apply some filter or sort to it. A typical example
Table products
id,title
Table categories
id,title
table products_categories
productsId, categoriesId, orderBy, main
So, in the above scenario, say you want to get all categories of product X that are (main = true) or you want to sort the the product categories by orderBy.
What happens now is a first SELECT on products to get the product data, a second SELECT on products_categories to get the categoriesId and a final SELECT on categories to get the actual categories. Ideally, filters and sort should be applied to the 2nd SELECT like
SELECT `id`,`productsId`,`categoriesId`,`orderBy`,`main` FROM `products_categories` WHERE `productsId` IN (180) WHERE main = 1 ORDER BY `orderBy` DESC
Another typical example would be wanting to order the product images based on the order the user wants them to
so you would have a products_images table
id,image,productsID,orderBy
and you would want to
SELECT from products_images WHERE productsId In (180) ORDER BY orderBy ASC
Is that even possible?
EDIT : Here is the relationship needed for an intermediate table to get what I need based on my schema.
Products.hasMany(Images,
{
as: "Images",
"foreignKey": "productsId",
"through": ProductsImagesItems,
scope: function (inst, filter) {
return {active: 1};
}
});
Thing is the scope function is giving me access to the final result and not to the intermediate table.
I am not sure to fully understand your problem(s), but for sure you need to move away from the table concept and express your problem in terms of Models and Relations.
The way I see it, you have two models Product(properties: title) and Category (properties: main).
Then, you can have relations between the two, potentially
Product belongsTo Category
Category hasMany Product
This means a product will belong to a single category, while a category may contain many products. There are other relations available
Then, using the generated REST API, you can filter GET requests to get items in function of their properties (like main in your case), or use custom GET requests (automatically generated when you add relations) to get for instance all products belonging to a specific category.
Does this helps ?
Based on what you have here I'd probably recommend using the scope option when defining the relationship. The LoopBack docs show a very similar example of the "product - category" scenario:
Product.hasMany(Category, {
as: 'categories',
scope: function(instance, filter) {
return { type: instance.type };
}
});
In the example above, instance is a category that is being matched, and each product would have a new categories property that would contain the matching Category entities for that Product. Note that this does not follow your exact data scheme, so you may need to play around with it. Also, I think your API query would have to specify that you want the categories related data loaded (those are not included by default):
/api/Products/13?filter{"include":["categories"]}
I suggest you define a custom / remote method in Product.js that does the work for you.
Product.getCategories(_productId){
// if you are taking product title as param instead of _productId,
// you will first need to find product ID
// then execute a find query on products_categories with
// 1. where filter to get only main categoris and productId = _productId
// 2. include filter to include product and category objects
// 3. orderBy filter to sort items based on orderBy column
// now you will get an array of products_categories.
// Each item / object in the array will have nested objects of Product and Category.
}

Cloudant 1 to many function

I’ve just started to use Cloudant and I just can’t get my head around the map functions. I’ve been fiddling with the data below but it isn’t working out as I expected.
The relationship is, a user can have many vehicles. A vehicle belongs to 1 user. The vehicle ‘userId’ is the key of the user. There is a bit of redundancy as in user the _id and userId is the same, guess later is not required.
Anyhow, how can I find for a/every user, the vehicles which belong to it? The closest I’ve come through trial and error is a result which displays the owner of every vehicle, but I would like it the other way round, the user and the vehicles belonging to it. All the examples I’ve found use another document which ‘joins’ two or more documents, but I don’t need to do that?
Any point in the right direction appreciated - I really have no idea.
function (doc) {
if (doc.$doctype == "vehicle")
{
emit(doc.userId, {_id: doc.userId});
}
}
EDIT: Getting closer. I'm not sure exactly what I was expecting, but the result seems a bit 'messy'. Row[0] is the user document, row[n > 0] are the vehicle documents. I guess it's fine when a startkey/endkey is used, but without the results are a bit jumbled up.
function (doc) {
if (doc.$doctype == 'user') {
emit([doc._id, 0], doc);
} else if (doc.$doctype == 'vehicle') {
emit([doc.userId, 1, doc._id], doc);
}
}
A user is described as,
{
"_id": "user:10",
"firstname": “firstnamehere",
"secondname": “secondnamehere",
"userId": "user:10",
"$doctype": "user"
}
a vehicle is described as,
{
"_id": "vehicle:4002”,
“name”: “avehicle”,
"userId": "user:10",
"$doctype": "vehicle",
}
You're getting in the right direction! You already got that right with the global IDs. Having the type of the document as part of the ID in some form is a very good idea, so that you don't get confused later (all documents are in the same "pot").
Here are some minor problems with your current solution (before getting to your actual question):
Don't emit the doc as value in emit(key, value). You can always ask for the document that belongs to a view row by querying with include_docs=true. Having the doc as view value increases the view indexes a lot. When you don't need a specific value, use emit(key, null).
You also don't need the ID in the emit value. You'll get the ID of the document that belongs to a view row as part of the row anyway.
View Collation
Now to your problem of aggregating the vehicles with their user. You got the basic pattern right. This pattern is called view collation, you can read more about it in the CouchDB docs (ignore that it is in the "Couchapp" section).
The trick with view collation is that you return two or more types of documents, but make sure that they are sorted in a way that allows for direct grouping. Thus it is important to understand how CouchDB sorts the view result. See the collation specification for more information on that one. An important key to understanding view collation is that rows with array keys are sorted by key elements. So when two rows have the same key[0], they sort by key[1]. If that's equal as well, key[2] is considered, and so on.
Your map function frist groups users and vehicles by user ID (key[0]). Your map function then uses the fact that 0 sorts before 1 in the second element of the key, so your view will contain the following:
user 1
vehicle of user 1
vehicle of user 1
vehicle of user 1
user 2
user 3
vehicle of user 3
user 4
etc.
As you can see, the vehicles of a user immediately follow their user. Thus you can group this result into aggregates without performing expensive sort or lookup operations.
Note that users are sorted according to their ID, and vehicles within users also according to their ID. This is because you use the IDs in the key array.
Creating Queries
Now that view isn't worth much if you can't query according to your needs. A view as you have it supports the following queries:
Get all users with their vehicles
Get a range of users with their vehicles
Get a single user with its vehicles
Get a single user without vehicles (you could also use the _all_docs view for that though)
Example query for "all users between user 1 and user 3 (inclusive) with their vehicles"
We want to query for a range, so we use startkey and endkey in the query:
startkey=["user:1", 0]
endkey=["user:3", 1, {}]
Note the use of {} as sentinel value, which is required so that the end key is larger than any row that has a key of ["user:3", 1, (anyConceivableVehicleId)]

Neo4J: Match all nodes with relation that has specific attribute value?

I need to find all nodes connected with relation that has attribute fld = email. Neo4j Cypher complains the following as a query with bad syntax:
MATCH (n)-[r:rel*..]-(m) WHERE has(r.fld) and r.fld='email' RETURN n,r,m
What would be the right one?
Best bet:
MATCH (n)-[r:rel {fld: 'email'}]-(m) RETURN n, r, m;
This should match nodes that are connected with a "rel" relationship that has a property "fld" with the value "email".
HTH

Resources