How to create grouped vertices graphs with ArangoDB - arangodb

I'm discovering ArangoDB and I'm wondering whether it is a suitable environment for working on graphs that display relations between grouped nodes.
Here is a simple example:
Given a set of nodes:
[ {
"_key" : "a1",
"group" : "a"
}, {
"_key" : "a2",
"group" : "a"
}, {
"_key" : "b1",
"group" : "b"
}, {
"_key" : "b2",
"group" : "b"
}, {
"_key" : "c1",
"group" : "c"
}, {
"_key" : "c2",
"group" : "c"
} ]
And their relations:
[ {
"_from" : "nodes/a1",
"_to" : "nodes/b1",
"qty" : 40
}, {
"_from" : "nodes/a2",
"_to" : "nodes/b2",
"qty" : 50
}, {
"_from" : "nodes/a1",
"_to" : "nodes/c1",
"qty" : 10
}, {
"_from" : "nodes/a2",
"_to" : "nodes/c1",
"qty" : 50
}, {
"_from" : "nodes/a2",
"_to" : "nodes/c2",
"qty" : 50
} ]
What is the best way to generate a graph that, instead of displaying relations between discrete nodes with qty as the edge weight, displays relations between nodes aggregated by "group", with the weight being the sum of "qty" over the nodes inside each group?
So, for instance, the graph would display a relation between groups "a" and "c" with a weight of 110.
Does it make sense to use ArangoDB for such use cases? And how can I do it in a versatile way?
Thanks for your help
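To make the expected aggregation concrete, here is a minimal Python sketch that computes the group-to-group weights in memory from the sample data above. It does not use ArangoDB at all; the function name `aggregate_by_group` is hypothetical and only illustrates the summation being asked for.

```python
from collections import defaultdict

nodes = [
    {"_key": "a1", "group": "a"}, {"_key": "a2", "group": "a"},
    {"_key": "b1", "group": "b"}, {"_key": "b2", "group": "b"},
    {"_key": "c1", "group": "c"}, {"_key": "c2", "group": "c"},
]
edges = [
    {"_from": "nodes/a1", "_to": "nodes/b1", "qty": 40},
    {"_from": "nodes/a2", "_to": "nodes/b2", "qty": 50},
    {"_from": "nodes/a1", "_to": "nodes/c1", "qty": 10},
    {"_from": "nodes/a2", "_to": "nodes/c1", "qty": 50},
    {"_from": "nodes/a2", "_to": "nodes/c2", "qty": 50},
]


def aggregate_by_group(nodes, edges):
    """Sum edge qty per (source group, target group) pair."""
    # Map each document id ("nodes/<_key>") to its group.
    group_of = {"nodes/" + n["_key"]: n["group"] for n in nodes}
    weights = defaultdict(int)
    for e in edges:
        weights[(group_of[e["_from"]], group_of[e["_to"]])] += e["qty"]
    return dict(weights)


grouped = aggregate_by_group(nodes, edges)
for (src, dst), weight in sorted(grouped.items()):
    print(src, "->", dst, weight)
```

The pair ("a", "c") comes out at 110, matching the figure in the question.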

Related

How to stop endAt when it tries to search again from the end?

I am stuck when using endAt() in Firebase Realtime Database. Some incorrect values appear in my results. Here is my code:
resultObj = await messagesRef
.child("test")
.orderByChild("country")
.endAt("ja", "9")
.limitToLast(4)
.once("value");
Here is my example data:
|-test:
|---1:{
"country" : "us",
"id" : 1
},
|---2:{
"country" : "ja",
"id" : 2
},
|---3:{
"country" : "ca",
"id" : 3
},
|---4:{
"country" : "uk",
"id" : 4
},
|---5:{
"country" : "us",
"id" : 5
},
|---6:{
"country" : "ca",
"id" : 6
},
|---7:{
"country" : "uk",
"id" : 7
},
|---8:{
"country" : "ja",
"id" : 8
},
|---9:{
"country" : "us",
"id" : 9
}
I would expect only objects with "country" = "ja" to be returned. But I also see two values, 3 and 6 ("country" = "ca"). My thinking is that when it reaches id = 2, my query finds no more results, so it searches again from the end (id = 9). How can I prevent this?
Result:
{
"2": {
"country": "ja",
"id": 2
},
"3": {
"country": "ca",
"id": 3
},
"6": {
"country": "ca",
"id": 6
},
"8": {
"country": "ja",
"id": 8
}
}
To further limit the results to only those with country equal to ja, you must also specify startAt("ja"):
ref.orderByChild("country")
.startAt("ja")
.endAt("ja", "9")
.limitToLast(4)
Also see my repro here: https://jsbin.com/fajegol/edit?js,console
Update: it is easier to see why you're getting the ca results if you print the results in the right order (which I didn't do initially):
ref.orderByChild("country")
.endAt("ja", "8")
.limitToLast(4)
.once("value", function(snapshot) {
snapshot.forEach(function(child) {
console.log(child.val());
})
});
Gives this output:
{ "country" : "ca", "id" : 3 }
{ "country" : "ca", "id" : 6 }
{ "country" : "ja", "id" : 2 }
{ "country" : "ja", "id" : 8 }
With that it makes a lot more sense: the child nodes are ordered by country, and these are the last 4 nodes up to the one you told it to endAt().
The output in your question comes from logging snapshot.val(), which gives you the results without the order you requested (since keys in JSON are unordered by definition).
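The ordering argument can be checked with a small Python model. This is only a simplified simulation of "order by child value, then by key", not the Firebase SDK; the function `query` and its parameters are hypothetical.

```python
# key -> country, copied from the example data in the question
data = {
    "1": "us", "2": "ja", "3": "ca", "4": "uk", "5": "us",
    "6": "ca", "7": "uk", "8": "ja", "9": "us",
}


def query(data, end_at, limit_to_last, start_at=None):
    """Simplified model: order by (value, key), cut the range, keep the tail."""
    ordered = sorted((country, key) for key, country in data.items())
    in_range = [
        item for item in ordered
        if item <= end_at and (start_at is None or item[0] >= start_at)
    ]
    return in_range[-limit_to_last:]


# Without a lower bound, everything sorted before ("ja", "9") qualifies.
without_start = query(data, end_at=("ja", "9"), limit_to_last=4)
# With a lower bound of "ja", only the "ja" children remain.
with_start = query(data, end_at=("ja", "9"), limit_to_last=4, start_at="ja")

print(without_start)
print(with_start)
```

In this model the "ca" children appear only because nothing bounds the range from below, which is the role startAt("ja") plays in the answer.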

Arangodb AQL Joining, merging, embedding nested three collections or more

I have the following collections, based on the example in the ArangoDB docs here, but I have added a third collection called regions.
Users
{
"name" : {
"first" : "John",
"last" : "Doe"
},
"city" : "cities/2241300989",
"_id" : "users/2290649597",
"_rev" : "2290649597",
"_key" : "2290649597"
}
Cities
{
"population" : 1000,
"name" : "Metropolis",
"region" : "regions/2282300990",
"_id" : "cities/2241300989",
"_rev" : "2241300989",
"_key" : "2241300989"
}
Regions
{
"name" : "SomeRegion1",
"_id" : "regions/2282300990",
"_rev" : "2282300990",
"_key" : "2282300990"
}
I want to have a target result like this
[
{
"user" : {
"name" : {
"first" : "John",
"last" : "Doe"
},
"_id" : "users/2290649597",
"_rev" : "2290649597",
"_key" : "2290649597"
},
"city" : {
"population" : 1000,
"name" : "Metropolis",
"_id" : "cities/2241300989",
"_rev" : "2241300989",
"_key" : "2241300989",
"region" : {
"name" : "SomeRegion1",
"_id" : "regions/2282300990",
"_rev" : "2282300990",
"_key" : "2282300990"
}
}
}
]
The example in the ArangoDB docs here only has queries for two collections:
FOR u IN users
FOR c IN cities
FILTER u.city == c._id RETURN merge(u, {city: c})
# However I want to have more than two collections e.g.
FOR u IN users
FOR c IN cities
For r IN regions
FILTER u.city == c._id and c.region == r._id RETURN merge(????????)
How would you get the result with three collections joined as above? What happens if I want a fourth nested one?
When you store a document _id that references another collection, then you can leverage the DOCUMENT AQL command.
So your AQL query becomes a bit simpler, like this:
FOR u IN users
LET city = DOCUMENT(u.city)
LET city_with_region = MERGE(city, { region: DOCUMENT(city.region) })
RETURN MERGE(u, { city: city_with_region})
This query could be collapsed even more, but I left it like this so it's more self documenting.
What is cool about DOCUMENT is you can return only a single attribute of a document, such as LET region_name = DOCUMENT(city.region).name.
I've also found that in most cases it's more performant than doing a subquery to locate the document.
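The nesting idea generalizes to any depth. Below is a minimal Python sketch of it over an in-memory dictionary keyed by _id; it is an illustration only, not AQL, and the helper `embed` and the `db` dictionary are hypothetical.

```python
# In-memory stand-in for the three collections, keyed by document _id.
db = {
    "users/2290649597": {
        "_id": "users/2290649597",
        "name": {"first": "John", "last": "Doe"},
        "city": "cities/2241300989",
    },
    "cities/2241300989": {
        "_id": "cities/2241300989",
        "name": "Metropolis",
        "population": 1000,
        "region": "regions/2282300990",
    },
    "regions/2282300990": {
        "_id": "regions/2282300990",
        "name": "SomeRegion1",
    },
}


def embed(doc, path):
    """Return a copy of doc where each reference attribute named in
    path is replaced by the referenced document, nested in order."""
    if not path:
        return dict(doc)
    attr, rest = path[0], path[1:]
    target = db[doc[attr]]  # look the referenced document up by its _id
    return {**doc, attr: embed(target, rest)}


user = embed(db["users/2290649597"], ["city", "region"])
print(user["city"]["region"]["name"])
```

A fourth level would only need one more attribute name appended to the path, which mirrors adding one more nested MERGE and DOCUMENT call in the query.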
Probably something like this:
FOR u IN users
FOR c IN cities
For r IN regions
FILTER u.city == c._id AND c.region == r._id
RETURN { user: u, city: MERGE(c, { region: r }) }
Is there a particular reason why you store ids instead of keys to refer to cities and regions? The _id is just a virtual field that consists of the _key prefixed by the collection name (plus a slash). So this would work just as well (I intentionally omit the internal _id and _rev fields):
Users
{
"name" : {
"first" : "John",
"last" : "Doe"
},
"city" : "2241300989",
"_key" : "2290649597"
}
Cities
{
"population" : 1000,
"name" : "Metropolis",
"region" : "2282300990",
"_key" : "2241300989"
}
Regions
{
"name" : "SomeRegion1",
"_key" : "2282300990"
}
FOR u IN users
FOR c IN cities
For r IN regions
FILTER u.city == c._key AND c.region == r._key
RETURN { user: u, city: MERGE(c, { region: r }) }

Sort JSON document by values embedded in an array of objects

I have a document in the format below. The goal is to group the document by student name and sort it by rank in ascending order. Once that is done, iterate through the ranks (within a student) and, if each subsequent rank is greater than the previous one, the Version field needs to be incremented. As part of a pipeline, student_name will be passed to me, so matching by student name should be fine instead of grouping.
NOTE: Tried it with python and works to some extent. A python solution would also be great!
{
"_id" : ObjectId("5d389c7907bf860f5cd11220"),
"class" : "I",
"students" : [
{
"student_name" : "AAA",
"Version" : 2,
"scores" : [
{
"value" : "50",
"rank" : 2
},
{
"value" : "70",
"rank" : 1
}
]
},
{
"student_name" : "BBB",
"Version" : 5,
"scores" : [
{
"value" : 80,
"rank" : 2
},
{
"value" : 100,
"rank" : 1
},
{
"value" : 100,
"rank" : 1
}
]
}
]
}
I tried this piece of code to sort
def version(student_name):
db.column.aggregate(
[
{"$unwind": "$students"},
{"$unwind": "$students.scores"},
{"$sort" : {"students.scores.rank" : 1}},
{"$group" : {"students.student_name}
]
)
for i in range(0,(len(students.scores)-1)):
if students.scores[i].rank < students.scores[i+1].rank:
tag.update_many(
{"$inc" : {"students.Version":1}}
)
The expected output for student AAA should be
{
"_id" : ObjectId("5d389c7907bf860f5cd11220"),
"class" : "I",
"students" : [
{
"student_name" : "AAA",
"Version" : 3, #version incremented
"scores" : [
{
"value" : "70",
"rank" : 1
},
{
"value" : "50",
"rank" : 2
}
]
}
I was able to sort the document.
import pprint

pipeline = [
    {"$unwind": "$properties"},
    {"$unwind": "$properties.values"},
    # Sort keys are plain field paths, without a leading "$".
    {"$sort": {"properties.values.rank": -1}},
    {"$group": {"_id": "$properties.property_name", "values": {"$push": "$properties.values"}}},
]
pprint.pprint(list(db.column.aggregate(pipeline)))
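For the remaining step (incrementing Version), here is a minimal pure-Python sketch that works on the document after it has been fetched, without touching MongoDB. It assumes, following the loop in the question, that Version goes up by one for every score whose rank is strictly greater than the previous one after sorting; the function name `bump_version` is hypothetical.

```python
import copy

doc = {
    "class": "I",
    "students": [
        {
            "student_name": "AAA",
            "Version": 2,
            "scores": [{"value": "50", "rank": 2}, {"value": "70", "rank": 1}],
        },
        {
            "student_name": "BBB",
            "Version": 5,
            "scores": [
                {"value": 80, "rank": 2},
                {"value": 100, "rank": 1},
                {"value": 100, "rank": 1},
            ],
        },
    ],
}


def bump_version(document, student_name):
    """Sort one student's scores by rank and bump Version per rank increase."""
    result = copy.deepcopy(document)  # leave the input untouched
    for student in result["students"]:
        if student["student_name"] != student_name:
            continue
        student["scores"].sort(key=lambda s: s["rank"])
        ranks = [s["rank"] for s in student["scores"]]
        # Assumption: one increment per strictly increasing step.
        increases = sum(1 for prev, cur in zip(ranks, ranks[1:]) if cur > prev)
        student["Version"] += increases
    return result


updated = bump_version(doc, "AAA")
print(updated["students"][0])
```

For student AAA this yields Version 3 with the scores ordered by rank 1, then 2, as in the expected output. Other students are left unchanged here; writing the result back to the collection is left out of the sketch.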

MongoDB Manual and DBRef referencing using node js

I have two cases. First case: my key holds a reference to the same collection. Is this the correct way to reference? If yes, how do I dereference/link it in Node.js? If not, how do I reference from the same collection?
{
"_id" : ObjectId("53402597d852426020000002"),
"address" : {
"$ref" : "home_adress"
},
"contact" : "987654321",
"dob" : "01-01-1991",
"name" : "Tom Benzamin",
"Post_address" : {
"$ref" : "home_adress"
},
"home_adress":"Street 1, NY"}
Case two: a reference to a different collection (DBRefs). How do I detect the reference and send a second query to fetch the referenced value?
{
"_id" : ObjectId("53402597d852426020000002"),
"address" : {
"$ref" : "address_home",
"$id" : ObjectId("534009e4d852427820000002"),
"$db" : "ref"
},
"contact" : "987654321",
"dob" : "01-01-1991",
"name" : "Tom Benzamin"}
{
"_id" : ObjectId("534009e4d852427820000002"),
"building" : "22 A, Indiana Apt",
"pincode" : 123456,
"city" : "Los Angeles",
"state" : "California"}

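For the DBRef question above, the detect-then-fetch step can be sketched in Python over in-memory data. This is an illustration only: it does not use a MongoDB driver, ObjectId values are represented as plain strings, the "$db" field is ignored, and the names `collections`, `is_dbref`, and `resolve` are hypothetical. A DBRef is recognized by the presence of both "$ref" and "$id", so the first case in the question (a "$ref" alone) is not treated as one.

```python
# In-memory stand-in for the referenced collection, keyed by _id.
collections = {
    "address_home": {
        "534009e4d852427820000002": {
            "_id": "534009e4d852427820000002",
            "building": "22 A, Indiana Apt",
            "pincode": 123456,
            "city": "Los Angeles",
            "state": "California",
        }
    }
}

person = {
    "_id": "53402597d852426020000002",
    "address": {
        "$ref": "address_home",
        "$id": "534009e4d852427820000002",
        "$db": "ref",
    },
    "contact": "987654321",
    "name": "Tom Benzamin",
}


def is_dbref(value):
    """A DBRef carries at least a collection name ($ref) and an id ($id)."""
    return isinstance(value, dict) and "$ref" in value and "$id" in value


def resolve(document):
    """Replace every top-level DBRef with the document it points to."""
    resolved = {}
    for field, value in document.items():
        if is_dbref(value):
            # This lookup stands in for the second query.
            resolved[field] = collections[value["$ref"]].get(value["$id"])
        else:
            resolved[field] = value
    return resolved


full = resolve(person)
print(full["address"]["city"])
```
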
MongoDB remove the lowest score, node.js

I am trying to remove the lowest homework score.
I tried this,
var a = db.students.find({"scores.type":"homework"}, {"scores.$":1}).sort({"scores.score":1})
but how can I remove this set of data?
I have 200 pieces of similar data below.
{
"_id" : 148,
"name" : "Carli Belvins",
"scores" : [
{
"type" : "exam",
"score" : 84.4361816750119
},
{
"type" : "quiz",
"score" : 1.702113040528119
},
{
"type" : "homework",
"score" : 22.47397850465176
},
{
"type" : "homework",
"score" : 88.48032660881387
}
]
}
You are trying to remove an element, but the statement you provided only finds it.
Use db.students.remove(<query>) instead. Full documentation here
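To make the intended outcome concrete, here is a minimal Python sketch that drops the lowest homework score from one student document in memory. It is not a MongoDB call, and the function name `drop_lowest_homework` is hypothetical; it only shows which array element should disappear.

```python
student = {
    "_id": 148,
    "name": "Carli Belvins",
    "scores": [
        {"type": "exam", "score": 84.4361816750119},
        {"type": "quiz", "score": 1.702113040528119},
        {"type": "homework", "score": 22.47397850465176},
        {"type": "homework", "score": 88.48032660881387},
    ],
}


def drop_lowest_homework(doc):
    """Return a copy of doc without its lowest-scoring homework entry."""
    homework = [s for s in doc["scores"] if s["type"] == "homework"]
    if not homework:
        return doc  # nothing to remove
    lowest = min(homework, key=lambda s: s["score"])
    remaining = list(doc["scores"])
    remaining.remove(lowest)  # removes only the first matching entry
    return {**doc, "scores": remaining}


cleaned = drop_lowest_homework(student)
print(cleaned["scores"])
```

Applied to each of the 200 documents, this leaves the exam, the quiz, and the higher homework score in place.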
