How to perform a bidirectional upsert in ArangoDB

I'm working with a dataset similar to ArangoDB's official "friendship" example, except that I'm adding a "weight" attribute to the edge collection, like so:
People
[
{ "_id": "people/100", "_key": "100", "name": "John" },
{ "_id": "people/101", "_key": "101", "name": "Fred" },
{ "_id": "people/102", "_key": "102", "name": "Jacob" },
{ "_id": "people/103", "_key": "103", "name": "Ethan" }
]
Friendship
[
{ "_from": "people/100", "_to": "people/101", "weight": 27 },
{ "_from": "people/103", "_to": "people/102", "weight": 31 },
{ "_from": "people/102", "_to": "people/100", "weight": 12 },
{ "_from": "people/101", "_to": "people/103", "weight": 56 }
]
I want to write a function that, when someone interacts with someone else, UPSERTs the Friendship between the two (incrementing the weight by 1 if it existed before, or initializing with a weight of 1 if it's new).
The trouble is that, when executing that function, I have no clue in which direction the friendship was initialized, so I cannot really use an UPSERT. So, two questions:
Is there any way to do an upsert on an edge with a "bidirectional" filter?
Like so, but bidirectional:
UPSERT {
// HERE, I BASICALLY WANT TO IGNORE THE SIDE
_from: ${people1}, _to: ${people2}
}
INSERT {
_from: ${people1}, _to: ${people2}, weight: 1
}
UPDATE {
weight: OLD.weight + 1
}
IN ${friendshipCollection}
RETURN NEW
Instead of trying to "select the friendship, no matter the direction", should I rather duplicate the friendship in both directions (and constantly maintain and update it)?
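One possible way around the direction problem, sketched here under the assumption that the application is free to decide which way an edge is stored: put the two vertex ids into a canonical order before the UPSERT, so that a given pair always maps to the same _from/_to. The helper name build_friendship_upsert, the default collection name and the bind parameter names are hypothetical and not part of any ArangoDB API.

```python
def build_friendship_upsert(person_a, person_b, collection="friendship"):
    """Return (aql, bind_vars) for a direction-independent weight upsert.

    Hypothetical helper: the collection name and bind parameter names
    are illustrative assumptions, not taken from the question.
    """
    # Canonical order: the lexicographically smaller id is always _from,
    # so (a, b) and (b, a) hit the same edge.
    first, second = sorted((person_a, person_b))
    aql = (
        "UPSERT { _from: @first, _to: @second } "
        "INSERT { _from: @first, _to: @second, weight: 1 } "
        "UPDATE { weight: OLD.weight + 1 } "
        "IN @@collection "
        "RETURN NEW"
    )
    # A collection bind parameter is written @@name in the query and
    # passed under the key "@name".
    bind_vars = {"first": first, "second": second, "@collection": collection}
    return aql, bind_vars


aql, bind_vars = build_friendship_upsert("people/101", "people/100")
```

With this convention, edges that were already stored in the opposite direction (such as people/103 to people/102 in the sample data) would need a one-off normalization first, and queries that read friendships would have to ignore direction, for example by traversing with ANY.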

Related

How to update some collection with different query parameters and with different set values?

I have an array of documents in a MongoDB collection:
[
{
"_id": "630499244683ed43d56edd06",
"userId": "630499234683ed43d56edd05",
"isPaid": "true"
},
{
"_id": "6304c19bda84477b41b4bbfa",
"userId": "630499234683ed43d56edd05",
"isPaid": "true"
},
{
"_id": "6304c1b5da84477b41b4bbfb",
"userId": "630499234683ed43d56edd05",
"isPaid": "true"
},
{
"_id": "6304c1cbda84477b41b4bbfc",
"userId": "630499234683ed43d56edd05",
"isPaid": "true"
},
]
I just want to add an order property to all of the objects, but each specific _id needs a different order value.
Like this:
[
{
"_id": "630499244683ed43d56edd06",
"userId": "630499234683ed43d56edd05",
"isPaid": "true",
"order": 7
},
{
"_id": "6304c19bda84477b41b4bbfa",
"userId": "630499234683ed43d56edd05",
"isPaid": "true",
"order": 5
},
{
"_id": "6304c1b5da84477b41b4bbfb",
"userId": "630499234683ed43d56edd05",
"isPaid": "true",
"order": 0
},
{
"_id": "6304c1cbda84477b41b4bbfc",
"userId": "630499234683ed43d56edd05",
"isPaid": "true",
"order": 2
},
]
Please let me know how I can implement this in MongoDB.
I'm not using mongoose in my project.

You need to show the structure of the incoming data: how it is keyed by _id and which order value each _id gets. If it is an array of objects, you can loop over it and update each document in turn. I don't think a single update statement can set a different value for each document.

If you want specific ids to have specific order values you need to update them separately. You could use bulkWrite with the updateOne method.
db.collection.bulkWrite( [
{ updateOne: {
filter: { _id: ... },
update: { $set: { order: 1 } }
} },
...
] )
Refer to the MongoDB documentation and to your language driver in particular: https://www.mongodb.com/docs/v6.0/reference/method/db.collection.bulkWrite/#updateone-and-updatemany
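As a rough illustration of how the operation list above could be generated rather than typed by hand, the sketch below builds one updateOne entry per document from an _id-to-order mapping. The helper name build_order_updates is hypothetical, and the ids are treated as the plain strings shown in the question; if they are really ObjectIds they would need to be wrapped accordingly. The result is plain dictionaries in the shell's bulkWrite shape; with PyMongo the equivalent would be UpdateOne(filter, update) objects passed to collection.bulk_write(...).

```python
def build_order_updates(order_by_id):
    """Turn {_id: order} into a list of updateOne operations.

    Hypothetical helper: emits plain dicts in the shape used by
    db.collection.bulkWrite in the answer above.
    """
    return [
        {
            "updateOne": {
                "filter": {"_id": doc_id},
                "update": {"$set": {"order": order}},
            }
        }
        for doc_id, order in order_by_id.items()
    ]


# Order values taken from the desired result in the question.
orders = {
    "630499244683ed43d56edd06": 7,
    "6304c19bda84477b41b4bbfa": 5,
    "6304c1b5da84477b41b4bbfb": 0,
    "6304c1cbda84477b41b4bbfc": 2,
}
operations = build_order_updates(orders)
```

All four updates then travel to the server in a single bulk request, even though each document gets its own value.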

How to use CouchDB Mango query (/db/_find) with an index to select multiple _id keys

I am using CouchDB 3.1.1 to perform Mango queries against a database containing a large number of documents. A very common requirement in my application is to perform queries on a very specific and dynamic set of documents. From what I understand at this moment, these are the only choices I have on how to confront my problem:
Make multiple requests to /db/_find each with a distinct "_id"
Make a single call to /db/_find
Of the ways I can accomplish the second choice:
Use an "$or" array on all the "_id": value pairs
Use an "$or" array on all the values of the "_id" key
The second choice is what I would prefer to use since making multiple POST requests would incur overhead. Unfortunately using "$or" seems to get in the way of the query engine making use of the "_id" index.
Thus, choice #1 returns with a speedy 2 ms per transaction but the results are not sorted (requiring my application to do the sorting). Choice #2, given an array of 2 _ids, regardless of the $or syntax, takes over 3 seconds to render.
What is the most efficient way to use a CouchDB Mango query index against a specific set of documents?
Fast Example: Results using a single _id
{
"selector": {
"_id": "184094"
},
"fields": [
"_id"
]
}
documents examined: 26,312
results returned: 1
execution time: 2 ms
Slow Example: Results using $or of key / value pairs
{
"selector": {
"$or": [
{
"_id": "184094"
},
{
"_id": "157533"
}
]
},
"fields": [
"_id"
]
}
documents examined: 26,312
results returned: 2
execution time: 2,454 ms
Slow Example: Results using $or array of values
{
"selector": {
"_id": {
"$or": [
"184094",
"157533"
]
}
},
"fields": [
"_id"
]
}
documents examined: 26,312
results returned: 2
execution time: 2,522 ms
Slow Example: Results using $in (which is illegal but still returns results)
{
"selector": {
"_id": {
"$in": [
"184094",
"157533"
]
}
},
"fields": [
"_id"
]
}
documents examined: 26,312
results returned: 2
execution time: 2,618 ms
Index: The registered index for _id
{
"_id": "_design/508b5b51e6085c2f96444b82aced1e5dfec986b2",
"_rev": "1-f951eb482f9a521752adfdb6718a6a59",
"language": "query",
"views": {
"foo-index": {
"map": {
"fields": {
"_id": "asc"
},
"partial_filter_selector": {}
},
"reduce": "_count",
"options": {
"def": {
"fields": [
"_id"
]
}}}}}
Explain: An 'explain' summary done to one of the slow queries. Note that the registered index was used.
{
"dbname": "dnp_person_comment",
"index": {
"ddoc": "_design/508b5b51e6085c2f96444b82aced1e5dfec986b2",
"name": "foo-index",
"type": "json",
"partitioned": false,
"def": {
"fields": [
{
"_id": "asc"
}
]
}
},
"partitioned": false,
"selector": {
"$or": [
{
"_id": {
"$eq": "184094"
}
},
{
"_id": {
"$eq": "157533"
}
}
]
},
"opts": {
"use_index": [],
"bookmark": "nil",
"limit": 25,
"skip": 0,
"sort": {},
"fields": [
"_id"
],
"partition": "",
"r": [
49
],
"conflicts": false,
"stale": false,
"update": true,
"stable": false,
"execution_stats": false
},
"limit": 25,
"skip": 0,
"fields": [
"_id"
],
"mrargs": {
"include_docs": true,
"view_type": "map",
"reduce": false,
"partition": null,
"start_key": [],
"end_key": [
"<MAX>"
],
"direction": "fwd",
"stable": false,
"update": true,
"conflicts": "undefined"
}
}
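One alternative worth measuring, which is not from the original question and is only a sketch: when the set of documents is known purely by _id, CouchDB's primary index can be queried directly by sending a keys array to the _all_docs endpoint instead of going through /db/_find. The helper below only builds the request (method, path, JSON body) and sends nothing; its name build_all_docs_request is hypothetical. Because this bypasses Mango, selector features such as "fields" do not apply, so any trimming of the returned documents would be up to the application.

```python
import json


def build_all_docs_request(db, doc_ids, include_docs=True):
    """Build (method, path, body) for a multi-key _all_docs lookup.

    Hypothetical helper: it only assembles the request and does not
    perform any network I/O.
    """
    path = f"/{db}/_all_docs"
    if include_docs:
        # Ask CouchDB to return the full documents, not just id and rev.
        path += "?include_docs=true"
    body = json.dumps({"keys": list(doc_ids)})
    return "POST", path, body


method, path, body = build_all_docs_request(
    "dnp_person_comment", ["184094", "157533"]
)
```

The request can then be sent with any HTTP client, using a Content-Type of application/json.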

ArangoDb get edges with properties

I am using the newest version of ArangoDB and I have a problem.
I have two collections:
Country (a document collection) and Distance (an edge collection with keys such as _from, _to, distance).
How can I use AQL to get all information about countries where Country.Continent = 'Europe', together with the distances between them from the edge collection?
In SQL it would be something like this:
Select * from Country c, Distance d where c.Continent = 'Europe'
Thank you.

I have been working on a project recently and started using ArangoDB, so hopefully I can be of assistance to you.
I took some inspiration for my answer from the following pages of the ArangoDB and AQL documentation:
AQL Graph Traversal
Shortest Path in AQL
Please see my AQL query below and let me know whether it helps. You can replace the 'Europe' literal in the FILTER with a bind parameter such as @Continent, which lets you specify it dynamically if need be.
FOR country IN Country
FILTER country.Continent == 'Europe'
FOR vertex, edge, path
IN OUTBOUND country Distance
RETURN path
This yields the following result for me. I just created some test collections with 2 edges linking countries together. I have included the vertex, edge as well as the path of the query in the 'FOR' part, so you are welcome to play around with the 'RETURN' part at the end by substituting the vertex or edge and seeing what results that yields for you.
[
{
"edges": [
{
"_key": "67168",
"_id": "Distance/67168",
"_from": "Country/67057",
"_to": "Country/67094",
"_rev": "_aecXk7---_",
"Distance": 5
}
],
"vertices": [
{
"_key": "67057",
"_id": "Country/67057",
"_rev": "_aecWJ0q--_",
"countryName": "UK",
"Continent": "Europe"
},
{
"_key": "67094",
"_id": "Country/67094",
"_rev": "_aecWZhi--_",
"countryName": "Italy",
"Continent": "Europe"
}
]
},
{
"edges": [
{
"_key": "67222",
"_id": "Distance/67222",
"_from": "Country/67057",
"_to": "Country/67113",
"_rev": "_aecYB9---_",
"Distance": 10
}
],
"vertices": [
{
"_key": "67057",
"_id": "Country/67057",
"_rev": "_aecWJ0q--_",
"countryName": "UK",
"Continent": "Europe"
},
{
"_key": "67113",
"_id": "Country/67113",
"_rev": "_aecWmEy--_",
"countryName": "Spain",
"Continent": "Europe"
}
]
}
]
For example if you substitute the 'RETURN path' part with 'RETURN edge', you will just retrieve the edges if that is all you need, as per below:
[
{
"_key": "67168",
"_id": "Distance/67168",
"_from": "Country/67057",
"_to": "Country/67094",
"_rev": "_aecXk7---_",
"Distance": 5
},
{
"_key": "67222",
"_id": "Distance/67222",
"_from": "Country/67057",
"_to": "Country/67113",
"_rev": "_aecYB9---_",
"Distance": 10
}
]
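For the bind parameter variant mentioned above, a minimal sketch of how the query and its parameters might be assembled on the application side. The helper name build_country_distance_query and the result keys source, target and link are hypothetical. The second FILTER, which keeps only neighbours on the same continent, is an added assumption about what "distances between them" means and can be dropped.

```python
def build_country_distance_query(continent):
    """Return (aql, bind_vars) for countries of one continent and their edges.

    Hypothetical helper: the result keys (source, target, link) are
    illustrative names, not taken from the answer above.
    """
    aql = (
        "FOR country IN Country "
        "FILTER country.Continent == @continent "
        "FOR vertex, edge IN OUTBOUND country Distance "
        # Assumption: only keep edges whose far end is on the same continent.
        "FILTER vertex.Continent == @continent "
        "RETURN { source: country, target: vertex, link: edge }"
    )
    return aql, {"continent": continent}


aql, bind_vars = build_country_distance_query("Europe")
```

Passing the continent as a bind parameter rather than concatenating it into the query string keeps the query text constant and avoids quoting problems.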

Mongodb aggregation or projection

{
"items": [
{
"id": "5bb619e49593e5d3cbaa0b52",
"name": "Flowers",
"weight": "1.5"
},
{
"id": "5bb619e4ebdccb9218aa9dcb",
"name": "Chair",
"weight": "8.4"
},
{
"id": "5bb619e4911037797edae511",
"name": "TV",
"weight": "20.8"
},
{
"id": "5bb619e4504f248e1be543d3",
"name": "Skateboard",
"weight": "5.9"
},
{
"id": "5bb619e40fee29e3aaf09759",
"name": "Donald Trump statue",
"weight": "18.4"
},
{
"id": "5bb619e44251009d72e458b9",
"name": "Molkkÿ game",
"weight": "17.9"
},
{
"id": "5bb619e439d3e99e2e25848d",
"name": "Helmet",
"weight": "22.7"
}
]
}
I have this model structure and want to calculate the weight of each order.
Should I use aggregation, or does someone have another idea?
This is an example of an order:
{
"id": "5bb61dfd4d64747dd8d7d6cf",
"date": "Sat Aug 11 2018 02:01:25 GMT+0000 (UTC)",
"items": [
{
"item_id": "5bb619e44251009d72e458b9",
"quantity": 4
},
{
"item_id": "5bb619e4504f248e1be543d3",
"quantity": 2
},
{
"item_id": "5bb619e40fee29e3aaf09759",
"quantity": 3
}
]
}

You can use the aggregation below:
db.order.aggregate([
{ "$unwind": "$items" },
{ "$lookup": {
"from": "items",
"localField": "items.item_id",
"foreignField": "id",
"as": "item"
}},
{ "$unwind": "$item" },
{ "$addFields": { "items.weight": "$item.weight" }},
{ "$group": {
"_id": "$_id",
"items": { "$push": "$items" },
"date": { "$first": "$date" }
}}
])

You have two options here without changing your model structure:
pull all items used in a Parcel from the database into your application
perform all computations on the database side using aggregation (and $lookup)
Which one is better depends heavily on your actual data model and dataset size. The first option is very straightforward and can potentially perform better on big datasets, especially when sharding or replica sets are involved, but it requires more round trips to the database, which adds latency. On the other hand, aggregation can in certain cases be quite slow on lookups.
The only reliable way to decide is to test on your real data. If your current dataset is tiny (say, hundreds of MB), choose the way you are comfortable with; both will work well.
Update
Since you need to distribute Orders to Parcels I'd prefer to go with option #1, though using aggregation is still possible.
This is what I would do:
pull an Order from database
pull all related Items from database by ids found in Order.items
perform calculation of Order weight
create one Parcel if weight < 30 and save it to database
or if weight > 30 distribute somehow Items to Parcels and save them to database
Note, that you can pull multiple Items by their ids in one call with query like this:
{
_id: { $in: [<id1>, <id2>] }
}
There is one more thing to consider. MongoDB versions before 4.0 do not support multi-document transactions, so there is no multi-document atomicity there. Performing this type of operation (pulling something from the DB, performing calculations, and storing the result back) with the schema defined the way you show can therefore lead to duplicates being created.
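The weight calculation in option #1 can be sketched in plain Python once the order and its items have been pulled from the database. The sample values are copied from the question; the function name order_weight and the items_by_id lookup are hypothetical, and the weights are assumed to be stored as numeric strings as shown, so they are converted with float().

```python
def order_weight(order, items_by_id):
    """Sum weight * quantity over the lines of one order.

    Hypothetical helper: items_by_id maps an item's "id" to its document.
    """
    return sum(
        # Weights are stored as strings in the sample data, hence float().
        float(items_by_id[line["item_id"]]["weight"]) * line["quantity"]
        for line in order["items"]
    )


# Only the items referenced by the sample order are listed here.
items = [
    {"id": "5bb619e4504f248e1be543d3", "name": "Skateboard", "weight": "5.9"},
    {"id": "5bb619e40fee29e3aaf09759", "name": "Donald Trump statue", "weight": "18.4"},
    {"id": "5bb619e44251009d72e458b9", "name": "Molkkÿ game", "weight": "17.9"},
]
order = {
    "id": "5bb61dfd4d64747dd8d7d6cf",
    "items": [
        {"item_id": "5bb619e44251009d72e458b9", "quantity": 4},
        {"item_id": "5bb619e4504f248e1be543d3", "quantity": 2},
        {"item_id": "5bb619e40fee29e3aaf09759", "quantity": 3},
    ],
}
items_by_id = {item["id"]: item for item in items}
total = order_weight(order, items_by_id)  # 4*17.9 + 2*5.9 + 3*18.4
```

For the sample order this comes to roughly 138.6, well above the threshold of 30 mentioned above, so it would have to be split across several Parcels.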

How to get friend's leaderboard in MongoDB

This is my Friends Collection
[
{
"_id": "59e4fbcac23f38cdfa6963a8",
"friend_id": "59e48f0af8c277d7a8886ed7",
"user_id": "59e1d36ad17ad5ad3d0453f7",
"__v": 0,
"created_at": "2017-10-16T18:34:50.875Z"
},
{
"_id": "59e5065f705a90cfa218c9e5",
"friend_id": "59e48f0af8c277d7a8886edd",
"user_id": "59e1d36ad17ad5ad3d0453f7",
"__v": 0,
"created_at": "2017-10-16T19:19:59.483Z"
}
]
This is my Scores collection:
[
{
"_id": "59e48f0af8c277d7a8886ed8",
"score": 19,
"user_id": "59e48f0af8c277d7a8886ed7",
"created_at": "2017-10-13T09:02:10.010Z"
},
{
"_id": "59e48f0af8c277d7a8886ed9",
"score": 24,
"user_id": "59e48f0af8c277d7a8886ed7",
"created_at": "2017-10-11T00:56:10.010Z"
},
{
"_id": "59e48f0af8c277d7a8886eda",
"score": 52,
"user_id": "59e48f0af8c277d7a8886ed7",
"created_at": "2017-10-24T09:16:10.010Z"
},
]
This is my Users collection.
[
{
"_id": "59e48f0af8c277d7a8886ed7",
"name": "testuser_0",
"thumbnail": "path_0"
},
{
"_id": "59e48f0af8c277d7a8886edd",
"name": "testuser_1",
"thumbnail": "path_1"
},
{
"_id": "59e48f0af8c277d7a8886ee3",
"name": "testuser_2",
"thumbnail": "path_2"
},
{
"_id": "59e48f0af8c277d7a8886ee9",
"name": "testuser_3",
"thumbnail": "path_3"
},
]
Finally, I need a list of friends sorted by high score for a particular time period (say, the last 24 hours), something like this:
[
{
"friend_id": "59e48f0af8c277d7a8886ed7",
"friend_name": "test_user_2",
"thumbnail": "image_path",
"highscore": 15
},
{
"friend_id": "59e48f0af8c277d7a8886edd",
"friend_name": "test_user_3",
"thumbnail": "image_path",
"highscore": 10
}
]
What's the best way to achieve this? I have tried the aggregation pipeline, but I am getting quite confused working with 3 collections.

Following your answers, storing the friends as an array of up to 500 entries in a document may not be a bad idea, as each entry would only hold the friend's id and "created". It saves having a separate collection.
You should not have many performance issues if you project the data in your query, selecting only the fields you want.
https://docs.mongodb.com/v3.2/tutorial/project-fields-from-query-results/#return-specified-fields-only
As for the scores growing by 30 per day, it depends on what type of query you run.
It would take a while to reach the 16 MB per-document limit by adding 30 scores per day.
Regarding joining the different collections, there is a Stack Overflow question about it:
How do I perform the SQL Join equivalent in MongoDB?
or
https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
You will need to use MongoDB's aggregation framework for it, not just a find command.
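To make the aggregation suggestion more concrete, here is one possible pipeline, written as plain Python dictionaries and meant to be run on the friends collection. It is a sketch under several assumptions that the question does not confirm: the other collections are named scores and users, the id fields in the three collections have the same type, and created_at values are stored in a consistently comparable form (for example ISO 8601 strings as shown). The function name build_leaderboard_pipeline is hypothetical.

```python
def build_leaderboard_pipeline(user_id, since):
    """Build an aggregation pipeline for a user's friend leaderboard.

    Hypothetical helper: "scores" and "users" are assumed collection names.
    """
    return [
        # Friend links of the requesting user.
        {"$match": {"user_id": user_id}},
        # Attach every score recorded by each friend.
        {"$lookup": {
            "from": "scores",
            "localField": "friend_id",
            "foreignField": "user_id",
            "as": "scores",
        }},
        {"$unwind": "$scores"},
        # Keep only scores inside the requested time window.
        {"$match": {"scores.created_at": {"$gte": since}}},
        # Best score per friend.
        {"$group": {"_id": "$friend_id", "highscore": {"$max": "$scores.score"}}},
        # Attach the friend's profile.
        {"$lookup": {
            "from": "users",
            "localField": "_id",
            "foreignField": "_id",
            "as": "friend",
        }},
        {"$unwind": "$friend"},
        {"$project": {
            "_id": 0,
            "friend_id": "$_id",
            "friend_name": "$friend.name",
            "thumbnail": "$friend.thumbnail",
            "highscore": 1,
        }},
        {"$sort": {"highscore": -1}},
    ]


pipeline = build_leaderboard_pipeline(
    "59e1d36ad17ad5ad3d0453f7", "2017-10-23T00:00:00.000Z"
)
```

Note that friends with no score inside the window drop out at the $unwind and $match stages, so they would not appear in the leaderboard at all.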
