How can I check for duplicate documents in Mongoose? - node.js

Here is an example of a nested document that I have in my collection:
"person" : [
{
"title" : "front-end developer",
"skills" : [
{
"name" : "js",
"project" : "1",
},
{
"name" : "CSS",
"project" : "5",
}
]
},
{
"title" : "software engineer",
"skills" : [
{
"name" : "Java",
"project" : "1",
},
{
"name" : "c++",
"project" : "5",
}
]
}
]
Is there a simple way of determining whether other documents are identical to this object, e.g. have the same keys, values and array indexes? My current method of checking for duplicates is very long and requires multiple nested loops. Any help would be greatly appreciated. Thanks!

If you want to get a list of identical (except for the _id field, obviously) documents in your collection, here is how you can do that:
collection.aggregate([{
    $project: {
        "_id": 1, // keep the _id field where it is anyway
        "doc": "$$ROOT" // store the entire document in the "doc" field
    }
}, {
    $project: {
        "doc._id": 0 // remove the _id from the stored document because we do not want to compare it
    }
}, {
    $group: {
        "_id": "$doc", // group by the entire document's contents as in "compare the whole document"
        "ids": { $push: "$_id" }, // create an array of all IDs that form this group
        "count": { $sum: 1 } // count the number of documents in this group
    }
}, {
    $match: {
        "count": { $gt: 1 } // only show what's duplicated
    }
}]) // the stages are passed as a single array, which current drivers and Mongoose expect
As always with the aggregation framework, you can try to make sense of what exactly is going on in each step by commenting out all steps and then activating everything again stage by stage.
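To make the idea of "group by the whole document, ignoring _id" concrete, here is a minimal in-memory sketch in plain JavaScript. It is not the pipeline above, only an illustration of the same grouping logic on already-fetched plain objects. The helper names canonicalize and findDuplicates are made up for this example, and it assumes JSON-like values (strings, numbers, arrays, plain objects). Unlike the $group stage, which as far as I know treats field order inside embedded documents as significant, this sketch sorts keys first so that key order does not matter.

```javascript
// Recursively sort object keys so that key order does not affect the comparison.
// Array order is preserved, because array indexes are meant to matter here.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return value.map(canonicalize);
  }
  if (value !== null && typeof value === "object") {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonicalize(value[key]);
    }
    return sorted;
  }
  return value;
}

// Group documents by their content (ignoring _id) and return only the
// groups of _ids that contain more than one document.
function findDuplicates(docs) {
  const groups = new Map();
  for (const { _id, ...rest } of docs) {
    const key = JSON.stringify(canonicalize(rest));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(_id);
  }
  return [...groups.values()].filter((ids) => ids.length > 1);
}

// Hypothetical sample data: documents 1 and 2 differ only in key order.
const docs = [
  { _id: 1, person: [{ title: "front-end developer", skills: [{ name: "js", project: "1" }] }] },
  { _id: 2, person: [{ skills: [{ project: "1", name: "js" }], title: "front-end developer" }] },
  { _id: 3, person: [{ title: "software engineer", skills: [{ name: "Java", project: "1" }] }] },
];

console.log(findDuplicates(docs)); // [ [ 1, 2 ] ]
```

For a large collection the server-side aggregation is likely the better choice, since this sketch needs every document in memory.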

Related

Update all the key values of a dynamic object in MongoDB

I have an object with dynamic keys whose values are all integers, and I would like to update the value of every key in that object.
{
"_id" : ObjectId("6395fc7b1c5a0c4a5fc9bd8e"),
"users" : [
ObjectId("638da89d0066308efe081709"),
ObjectId("63844feadf507942caaf90e3"),
ObjectId("638455e5fa983e9cf84c0f3f")
],
"type" : "GROUP",
"unReadCount" : {
"638da89d0066308efe081709" : 0,
"63844feadf507942caaf90e3" : 0,
"638455e5fa983e9cf84c0f3f" : 0
},
"createdAt" : ISODate("2022-12-11T21:21:23.815+05:30"),
"updatedAt" : ISODate("2022-12-11T22:48:33.953+05:30"),
"__v" : 0
},
I want to increment every value in the unReadCount object. Please note that the keys of unReadCount are not static; they vary from document to document. I tried the normal $inc operator, but it threw an error stating that the document "has the field 'unReadCount' of non-numeric type object". The positional $ operator won't work either, as this is not an array.
Please note that I am trying to achieve this in MongoDB itself. I can do it in JS code by fetching the records and looping through them, but I would like to do it in MongoDB/Mongoose. Any clue or help is appreciated.
I think this is what you need (an update that takes an aggregation pipeline, which requires MongoDB 4.2 or later):
db.tests.updateMany({},[
{
$addFields: {
unReadCountArray: { $objectToArray: "$unReadCount" }
}
},
{
$addFields: {
unReadCountArray: {
$map: {
input: "$unReadCountArray",
as: "unReadCount",
in:
{
$mergeObjects: [ { k:"$$unReadCount.k", v: {$add: ["$$unReadCount.v", 1] }}, null ]
}
}
}
}
},
{
$addFields: {
unReadCount: {
$arrayToObject: '$unReadCountArray'
}
}
},{
$unset:'unReadCountArray'
},
{ $set: { modified: "$$NOW"} }])
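The pipeline above converts the object to an array of key/value pairs, adds 1 to each value, and converts the array back to an object. The same three steps can be sketched in plain JavaScript, which may help when reading the $objectToArray / $map / $arrayToObject stages. The function name incrementAll is made up for this illustration; it is not a MongoDB or Mongoose API.

```javascript
// In-memory equivalent of $objectToArray -> $map ($add 1) -> $arrayToObject.
function incrementAll(counts, by = 1) {
  return Object.fromEntries(
    Object.entries(counts).map(([key, value]) => [key, value + by])
  );
}

const unReadCount = {
  "638da89d0066308efe081709": 0,
  "63844feadf507942caaf90e3": 0,
  "638455e5fa983e9cf84c0f3f": 0,
};

console.log(incrementAll(unReadCount));
// {
//   '638da89d0066308efe081709': 1,
//   '63844feadf507942caaf90e3': 1,
//   '638455e5fa983e9cf84c0f3f': 1
// }
```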

Get documents that contains specific attributes in array

I have a document collection with recipes that looks like this:
{
"title" : "Pie",
"url" : "pie.png",
"people" : "4",
"ingredients" : [
{
"amount" : "150",
"measure" : "g",
"name" : "Butter"
},
{
"amount" : "200",
"measure" : "g",
"name" : "Flour"
}
],
"_id" : ObjectId("55acf33223ae282719bdc9b7")
}
I'm trying to create a query that retrieves all the documents that contain multiple ingredients, like "butter" and "flour".
I have managed to retrieve documents that contain one ingredient, with the query below:
db.recipe.find(
{"ingredients.name": "Butter"},
{"_id": 1, "ingredients": {"$elemMatch": {"name": "Butter"}}},
callback
);
I’ve tried using
{
$all: [
{ $elemMatch: { name: "Butter" }},
{ $elemMatch:{ name: "Flour"}}
]
}
but I can't get it to work. Any help appreciated!
How about this:
db.recipe.find({
"$and": [
{ "ingredients.name": "Butter" },
{ "ingredients.name": "Flour" }
]
})
EDIT (thanks to @BlakesSeven):
The shorter way to write the above is using the $all operator.
db.recipe.find({ "ingredients.name": { "$all": ["Butter", "Flour"] } })
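As a rough illustration of what the $all condition checks, here is an in-memory sketch in plain JavaScript: a recipe matches only if every requested name appears among its ingredient names. The function hasAllIngredients is hypothetical and written just for this example. Note that the comparison is case-sensitive, as it is in the query, so "butter" would not match "Butter".

```javascript
// Returns true when every name in `names` occurs in recipe.ingredients[].name.
function hasAllIngredients(recipe, names) {
  const present = new Set(recipe.ingredients.map((item) => item.name));
  return names.every((name) => present.has(name));
}

const pie = {
  title: "Pie",
  ingredients: [
    { amount: "150", measure: "g", name: "Butter" },
    { amount: "200", measure: "g", name: "Flour" },
  ],
};

console.log(hasAllIngredients(pie, ["Butter", "Flour"])); // true
console.log(hasAllIngredients(pie, ["Butter", "Sugar"])); // false
console.log(hasAllIngredients(pie, ["butter"]));          // false (case-sensitive)
```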

MongoDb - $match filter not working in subdocument

This is Collection Structure
[{
"_id" : "....",
"name" : "aaaa",
"level_max_leaves" : [
{
level : "ObjectIdString 1",
max_leaves : 4,
}
]
},
{
"_id" : "....",
"name" : "bbbb",
"level_max_leaves" : [
{
level : "ObjectIdString 2",
max_leaves : 2,
}
]
}]
I need to find the level_max_leaves subdocuments whose level value matches a given input value.
This is what I tried.
For example:
var empLevelId = 'ObjectIdString 1' ;
MyModel.aggregate(
{$unwind: "$level_max_leaves"},
{$match: {"$level_max_leaves.level": empLevelId } },
{$group: { "_id": "$level_max_leaves.level",
"total": { "$sum": "$level_max_leaves.max_leaves" }}},
function (err, res) {
console.log(res);
});
But here the $match filter is not working; I can't get the results for ObjectIdString 1.
If I filter on the name field, it works fine, like this:
{$match: {"$name": "aaaa" } },
But at the subdocument level it returns 0 results:
{$match: {"$level_max_leaves.level": "ObjectIdString 1"} },
My expected result was,
{
"_id" : "ObjectIdString 1",
"total" : 4,
}
You have typed the $match incorrectly. Names with a $ prefix are either operators or "variable" references to field content. In a $match you just type the field name:
MyModel.aggregate(
[
{ "$match": { "level_max_leaves.level": "ObjectIdString 1" } },
{ "$unwind": "$level_max_leaves" },
{ "$match": { "level_max_leaves.level": "ObjectIdString 1" } },
{ "$group": {
"_id": "$level_max_leaves.level",
"total": { "$sum": "$level_max_leaves.max_leaves" }
}}
],
function (err, res) {
console.log(res);
}
);
Which on the sample you provide produces:
{ "_id" : "ObjectIdString 1", "total" : 4 }
It is also good practice to $match first in your pipeline. That is in fact the only time an index can be used. It also matters for another reason: without the initial $match statement, your aggregation pipeline would perform an $unwind operation on every document in the collection, whether it met the conditions or not.
So generally what you want to do here is:
1. Match the documents that contain the required elements in the array
2. Unwind the array of the matching documents
3. Match the required array content, excluding all others
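The match / unwind / match / group sequence can also be wrapped in a small helper so that the level value is passed in rather than hard-coded. This is only a sketch: buildLevelTotalPipeline is a made-up name, and the field names are taken from the sample documents in the question.

```javascript
// Builds the pipeline for one level value.
// The same $match is used twice: once before $unwind to narrow the documents
// (this is where an index can help), and once after $unwind to drop the
// array elements that do not match.
function buildLevelTotalPipeline(levelId) {
  const byLevel = { $match: { "level_max_leaves.level": levelId } };
  return [
    byLevel,
    { $unwind: "$level_max_leaves" },
    byLevel,
    {
      $group: {
        _id: "$level_max_leaves.level",
        total: { $sum: "$level_max_leaves.max_leaves" },
      },
    },
  ];
}

const pipeline = buildLevelTotalPipeline("ObjectIdString 1");
console.log(JSON.stringify(pipeline, null, 2));
```

With Mongoose, the result could then be passed along the lines of MyModel.aggregate(buildLevelTotalPipeline(empLevelId)), where MyModel and empLevelId are the names used in the question.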

Getting the number of unique values of a query

I have some documents with the following structure:
{
"_id": "53ad76d70ddd13e015c0aed1",
"action": "login",
"actor": {
"name": "John",
"id": 21337037
}
}
How can I make a query in Node.js that returns the number of unique actors that have done a specific action? For example, if I have an activity stream log that shows all the actions done by the actors, and an actor can perform a specific action multiple times, how can I get the number of unique actors that have done the "login" action? The actors are identified by actor.id.
db.collection.distinct()
db.collection.distinct("actor.id", { action: "login"})
will return all unique occurrences, and then you can take the count of the result set.
PS
Do not forget about db.collection.ensureIndex({action: 1}) (createIndex in current MongoDB versions, where ensureIndex is deprecated).
You can use aggregation framework for this:
db.coll.aggregate([
/* Filter only actions you're looking for */
{ $match : { action : "login" }},
/* finally group the documents by actors to calculate the num. of actions */
{ $group : { _id : "$actor", numActions: { $sum : 1 }}}
]);
This query will group the documents by the entire actor sub-document and calculate the number of actions by using $sum. The $match operator will filter only documents with specific action.
However, that query will work only if your actor sub-documents are the same. You said that you're identifying your actors by id field. So if, for some reason, actor sub-documents are not exactly the same, you will have problems with your results.
Consider these three documents:
{
...
"actor": {
"name": "John",
"id": 21337037
}
},
{
...
"actor": {
"name": "john",
"id": 21337037
}
},
{
...
"actor": {
"surname" : "Nash",
"name": "John",
"id": 21337037
}
}
They will be grouped in three different groups, even though the id field is the same.
To overcome this problem, you will need to group by actor.id.
db.coll.aggregate([
/* Filter only actions you're looking for */
{ $match : { action : "login" }},
/* finally group the documents to calculate the num. of actions */
{ $group : { _id : "$actor.id", numActions: { $sum : 1 }}}
]);
This query will correctly group your documents by looking only at the actor.id field.
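The grouped output still has one row per actor, while the question asks for a single number. One way to get there, sketched below, is to group by actor.id and then count the groups. This assumes the $count stage is available (MongoDB 3.4 or later); the names uniqueActorsPipeline, countUniqueActors and uniqueActors are made up for this example. The in-memory function is only there to show which number to expect on sample data.

```javascript
// Pipeline sketch: one group per actor.id, then count the groups.
// Assumption: the $count stage exists on the server (MongoDB 3.4+).
function uniqueActorsPipeline(action) {
  return [
    { $match: { action: action } },
    { $group: { _id: "$actor.id" } },
    { $count: "uniqueActors" },
  ];
}

// In-memory equivalent, useful for checking the expected number.
function countUniqueActors(docs, action) {
  const ids = new Set();
  for (const doc of docs) {
    if (doc.action === action) ids.add(doc.actor.id);
  }
  return ids.size;
}

// Hypothetical sample: two distinct actors have logged in.
const sample = [
  { action: "login", actor: { name: "John", id: 21337037 } },
  { action: "login", actor: { name: "john", id: 21337037 } },
  { action: "login", actor: { name: "foo", id: 10000 } },
  { action: "logout", actor: { name: "bar", id: 42 } },
];

console.log(countUniqueActors(sample, "login")); // 2
```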
Edit
You didn't specify what driver you were using so I wrote the examples for MongoDB shell.
Aggregation with the Node.js driver is very similar, with one difference: Node.js is async, so the results of the aggregation are returned in the callback. You can check the Node.js aggregation documentation for more examples.
So the aggregation command in Node.js will look like this:
var MongoClient = require('mongodb').MongoClient;
MongoClient.connect('mongodb://127.0.0.1:27017/test', function(err, db) {
if(err) throw err;
var collection = db.collection('auditlogs');
collection.aggregate([
{ $match : { action : "login" }},
{ $group : { _id : "$actor.id", numActions: { $sum : 1 }}} ],
function(err, docs) {
if (err) console.error(err);
console.log(docs);
// do something with results
}
);
});
For these test documents:
{
"_id" : ObjectId("53b162ea698171cc1677fab8"),
"action" : "login",
"actor" : {
"name" : "John",
"id" : 21337037
}
},
{
"_id" : ObjectId("53b162ee698171cc1677fab9"),
"action" : "login",
"actor" : {
"name" : "john",
"id" : 21337037
}
},
{
"_id" : ObjectId("53b162f7698171cc1677faba"),
"action" : "login",
"actor" : {
"name" : "john",
"surname" : "nash",
"id" : 21337037
}
},
{
"_id" : ObjectId("53b16319698171cc1677fabb"),
"action" : "login",
"actor" : {
"name" : "foo",
"id" : 10000
}
}
It will return the following result:
[ { _id: 10000, numActions: 1 },
{ _id: 21337037, numActions: 3 } ]
The aggregation framework is your answer:
db.actors.aggregate([
// If you really need to filter
{ "$match": { "action": "login" } },
// Then group
{ "$group": {
"_id": {
"action": "$action",
"actor": "$actor"
},
"count": { "$sum": 1 }
}}
])
Your "actor" combination is "unique", so all you need to do is have the common "grouping keys" under the _id value for the $group pipeline stage and count those "distinct" combinations with $sum.

Get Sum of Values in MongoDB

I have an array, and each item in that array is an object that has a value key. What I would like to do is add all of the values to get a total. I want to do this on the back end rather than the front end. I have tried the aggregate method, but haven't had any luck, as it returns an empty array. Here is my array:
"budgets" : [
{
"name" : "check",
"value" : "1000",
"expense" : "false",
"uniqueId" : 0.9880268634296954
},
{
"name" : "car",
"value" : "500",
"expense" : "true",
"uniqueId" : 0.1904486275743693
},
{
"name" : "car2",
"value" : "500",
"expense" : "false",
"uniqueId" : 0.23043920518830419
},
{
"name" : "car23",
"value" : "500",
"expense" : "false",
"uniqueId" : 0.014449386158958077
},
{
"name" : "car235",
"value" : "500",
"expense" : "false",
"uniqueId" : 0.831609656335786
}
],
What I want to do is get the total of the values that have "expense" : "true" and then get a separate total for "expense" : "false". How would I do this? Like I said, I have tried the aggregate method, but I must be doing it wrong.
If you are new to using the aggregation framework commands you might not yet be aware of the $cond operator. You can use this in order to get your totals separated:
db.collection.aggregate([
// Unwind the array to de-normalize the items
{ "$unwind": "$budgets" },
// Group the totals with conditions
{ "$group": {
"_id": "$_id",
"expense": { "$sum": {
"$cond": [
"$budgets.expense",
"$budgets.value",
0
]
}},
"nonExpense": { "$sum": {
"$cond": [
"$budgets.expense",
0,
"$budgets.value"
]
}}
}}
])
So what that will do is evaluate the true/false condition of expense as the first argument. When the condition is true, the second argument of the $cond is chosen and passed on to the $sum; when it evaluates to false, the third argument is chosen.
But looking at your data you appear to have a problem:
{
"name" : "check",
"value" : "1000",
"expense" : "false",
"uniqueId" : 0.9880268634296954
},
Note that all of the fields, most notably the expense and value items, are strings. This is a problem: while we could get around evaluating the true/false values by doing string comparisons rather than a direct boolean test, you cannot pass a string to $sum as a number ($sum ignores non-numeric values; MongoDB 4.0 and later add conversion operators such as $toInt and $toDouble).
So firstly you need to fix your data, unless it is not actually in the form as you have represented. But when it meets a correct form, then you can do your aggregation.
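If fixing the stored data is not immediately possible, one option is to do the string comparison and the conversion inside the pipeline. The sketch below assumes MongoDB 4.0 or later for $toDouble, and assumes every value string is a valid number; budgetTotalsPipeline and budgetTotals are made-up names. The in-memory function mirrors the same logic so the expected totals for the question's data can be checked.

```javascript
// Pipeline sketch: compare "expense" as a string and convert "value" on the fly.
// Assumption: $toDouble is available (MongoDB 4.0+).
function budgetTotalsPipeline() {
  const isExpense = { $eq: ["$budgets.expense", "true"] };
  const amount = { $toDouble: "$budgets.value" };
  return [
    { $unwind: "$budgets" },
    {
      $group: {
        _id: "$_id",
        expense: { $sum: { $cond: [isExpense, amount, 0] } },
        nonExpense: { $sum: { $cond: [isExpense, 0, amount] } },
      },
    },
  ];
}

// In-memory equivalent of the same logic.
function budgetTotals(budgets) {
  const totals = { expense: 0, nonExpense: 0 };
  for (const item of budgets) {
    const amount = Number(item.value); // "500" -> 500
    if (item.expense === "true") {
      totals.expense += amount;
    } else {
      totals.nonExpense += amount;
    }
  }
  return totals;
}

// The budgets array from the question (uniqueId omitted for brevity).
const budgets = [
  { name: "check", value: "1000", expense: "false" },
  { name: "car", value: "500", expense: "true" },
  { name: "car2", value: "500", expense: "false" },
  { name: "car23", value: "500", expense: "false" },
  { name: "car235", value: "500", expense: "false" },
];

console.log(budgetTotals(budgets)); // { expense: 500, nonExpense: 2500 }
```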
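First of all, convert the data type of value from string to a summable numeric type such as integer, and then use
db.collectionName.aggregate([
    // one document per budgets element
    { $unwind: "$budgets" },
    // one group per distinct "expense" value
    { $group: { _id: "$budgets.expense",
        total: { $sum: "$budgets.value" } } }
])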
and then you should get result like this
{
"result" : [
{
"_id" : "true",
"total" : 500
},
{
"_id" : "false",
"total" : 2500
}
],
"ok" : 1
}
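Try something like this (again assuming value is stored as a number):
db.collection.aggregate([
    { "$unwind": "$budgets" },
    // after $unwind the array fields live under "budgets."
    { $match: { "budgets.expense": "true" } },
    { $group: { _id: "$_id",
        name: { $addToSet: "$budgets.name" },
        uniqueId: { $addToSet: "$budgets.uniqueId" },
        total: { $sum: "$budgets.value" } } }
])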
