Imagine there is a document like this:
{
_id: ObjectID('someIdblahbla'),
users: [
{
_id: 'id1',
name: 'name1',
},
{
_id: 'id2',
name: 'name2',
},
{
_id: 'id3',
name: 'name3'
}
]
}
I have an array like this:
const newData = [
{_id: 'id1', name: 'newName1'},
{_id: 'id2', 'name': 'newName2', family:'newFamily2'}
]
What I want is to update the array in the document, using the corresponding _id to add or update each element.
So my end result would be like:
{
_id: ObjectID('someIdblahbla'),
users: [
{
_id: 'id1',
name: 'newName1',
},
{
_id: 'id2',
name: 'newName2',
family:'newFamily2'
},
{
_id: 'id3',
name: 'name3'
}
]
}
My guess was to use the filtered positional operator, but I am not sure whether that is the right approach or how to do it.
Thank you in advance for any tips.
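There is no direct way to add or update elements of an array in a single step, but starting from MongoDB 4.2 you can use an update with an aggregation pipeline.
First, if the _id values inside users are stored as ObjectId, convert the incoming _id values from string to ObjectId. With the mongoose package you can use mongoose.Types.ObjectId, and with the mongodb package you can use ObjectId. If the ids are plain strings such as 'id1' in the sample document, skip the conversion (ObjectId would reject them) and only collect the ids.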
let newData = [
{ _id: 'id1', name: 'newName1' },
{ _id: 'id2', 'name': 'newName2', family:'newFamily2' }
];
let newIds = [];
newData = newData.map(n => {
n._id = ObjectId(n._id); // or mongoose.Types.ObjectId(n._id)
newIds.push(n._id); // for checking conditions
return n;
});
You can add your own query condition and then perform the operations below:
$map iterates over the users array; if the user's _id is in the input newIds, the element is updated, otherwise it is kept unchanged
update operation:
$filter iterates over the input newData and keeps the object whose _id matches the current user, so we can update it
$arrayElemAt takes the first object from that filtered array
$mergeObjects merges the current object with that input object
insert operation:
$filter iterates over the newData array and returns the objects whose _id is not yet present in users, that is, the new items
$concatArrays concatenates the updated array and the new items
db.collection.updateOne(
{ _id: ObjectId("someIdblahbla") },
[{
$set: {
users: {
$concatArrays: [
{
$map: {
input: "$users",
as: "u",
in: {
$cond: [
{ $in: ["$$u._id", newIds] },
{
$mergeObjects: [
"$$u",
{
$arrayElemAt: [
{
$filter: {
input: newData,
cond: { $eq: ["$$this._id", "$$u._id"] }
}
},
0
]
}
]
},
"$$u"
]
}
}
},
{
$filter: {
input: newData,
cond: { $not: { $in: ["$$this._id", "$users._id"] } }
}
}
]
}
}
}]
)
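To sanity-check what the pipeline above should produce, the same merge-or-append logic can be sketched in plain JavaScript. This is only an in-memory illustration and not part of the update itself; the helper name upsertById and the extra 'id4' element are hypothetical.

```javascript
// In-memory sketch of the merge-or-append logic (hypothetical helper,
// not a MongoDB API): matching elements are merged by _id, new ones appended.
function upsertById(users, newData) {
  const existingIds = new Set(users.map((u) => String(u._id)));

  // Counterpart of the $map + $mergeObjects branch: update matching elements.
  const updated = users.map((u) => {
    const match = newData.find((n) => String(n._id) === String(u._id));
    return match ? { ...u, ...match } : u;
  });

  // Counterpart of the second $filter: keep only elements not present yet.
  const added = newData.filter((n) => !existingIds.has(String(n._id)));

  // Counterpart of $concatArrays.
  return updated.concat(added);
}

const users = [
  { _id: "id1", name: "name1" },
  { _id: "id2", name: "name2" },
  { _id: "id3", name: "name3" },
];
const newData = [
  { _id: "id1", name: "newName1" },
  { _id: "id2", name: "newName2", family: "newFamily2" },
  { _id: "id4", name: "newName4" }, // hypothetical new element
];

const result = upsertById(users, newData);
console.log(JSON.stringify(result, null, 2));
```

Comparing this output with the document after running the update is a quick way to confirm the pipeline behaves as intended.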
Query1 (updates existing members by merging objects; doesn't add new members)
Replace
[{"_id": "id1","name": "newName1"},{"_id": "id2","name": "newName2","family": "newFamily2"}] with your array or the driver variable that holds the array
db.collection.update({
"_id": {
"$eq": "1"
}
},
[
{
"$addFields": {
"users": {
"$map": {
"input": "$users",
"as": "user",
"in": {
"$reduce": {
"input": [
{
"_id": "id1",
"name": "newName1"
},
{
"_id": "id2",
"name": "newName2",
"family": "newFamily2"
}
],
"initialValue": "$$user",
"in": {
"$let": {
"vars": {
"old_user": "$$value",
"new_user": "$$this"
},
"in": {
"$cond": [
{
"$eq": [
"$$old_user._id",
"$$new_user._id"
]
},
{
"$mergeObjects": [
"$$old_user",
"$$new_user"
]
},
"$$old_user"
]
}
}
}
}
}
}
}
}
}
])
Query2 (update (merge) if found, else push at the end)
It's like the above, but it also finds the members that don't exist yet and pushes them at the end. It's a bit slower and more complicated.
Replace
[{"_id": "id1","name": "newName1"},{"_id": "id2","name": "newName2","family": "newFamily2"},{"_id": "id4","name": "newName4"}]
with your array or the driver variable that holds the array
db.collection.update({
"_id": {
"$eq": "1"
}
},
[
{
"$addFields": {
"yourarray": [
{
"_id": "id1",
"name": "newName1"
},
{
"_id": "id2",
"name": "newName2",
"family": "newFamily2"
},
{
"_id": "id4",
"name": "newName4"
}
]
}
},
{
"$addFields": {
"new-ids": {
"$setDifference": [
{
"$map": {
"input": "$yourarray",
"as": "u",
"in": "$$u._id"
}
},
{
"$map": {
"input": "$users",
"as": "u",
"in": "$$u._id"
}
}
]
}
}
},
{
"$addFields": {
"users": {
"$concatArrays": [
{
"$map": {
"input": "$users",
"as": "user",
"in": {
"$reduce": {
"input": "$yourarray",
"initialValue": "$$user",
"in": {
"$let": {
"vars": {
"old_user": "$$value",
"new_user": "$$this"
},
"in": {
"$cond": [
{
"$eq": [
"$$old_user._id",
"$$new_user._id"
]
},
{
"$mergeObjects": [
"$$old_user",
"$$new_user"
]
},
"$$old_user"
]
}
}
}
}
}
}
},
{
"$filter": {
"input": "$yourarray",
"as": "u",
"cond": {
"$in": [
"$$u._id",
"$new-ids"
]
}
}
}
]
}
}
},
{
"$unset": [
"yourarray",
"new-ids"
]
}
])
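Instead of pasting the array into the pipeline by hand, the stages can be built from a driver variable. The sketch below is one possible way to do that, not the method used above: buildUpsertPipeline is a hypothetical helper name, wrapping the array in $literal is my assumption to keep the server from interpreting its values as expressions, and it assumes users already exists as an array. It only builds the pipeline object and does not talk to a database.

```javascript
// Hypothetical helper: builds an update pipeline (MongoDB 4.2+) that merges
// newUsers into the "users" array by _id and appends the ones not found.
function buildUpsertPipeline(newUsers) {
  // $literal keeps the server from parsing the array contents as expressions.
  const incoming = { $literal: newUsers };
  const incomingIds = { $literal: newUsers.map((n) => n._id) };

  // First incoming element whose _id equals the current user's _id.
  const matchForUser = {
    $arrayElemAt: [
      { $filter: { input: incoming, as: "n", cond: { $eq: ["$$n._id", "$$u._id"] } } },
      0,
    ],
  };

  return [
    {
      $set: {
        users: {
          $concatArrays: [
            {
              $map: {
                input: "$users",
                as: "u",
                in: {
                  $cond: [
                    { $in: ["$$u._id", incomingIds] },
                    { $mergeObjects: ["$$u", matchForUser] },
                    "$$u",
                  ],
                },
              },
            },
            {
              $filter: {
                input: incoming,
                as: "n",
                cond: { $not: [{ $in: ["$$n._id", "$users._id"] }] },
              },
            },
          ],
        },
      },
    },
  ];
}

const pipeline = buildUpsertPipeline([
  { _id: "id1", name: "newName1" },
  { _id: "id4", name: "newName4" },
]);
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting array would be passed as the second argument of an update call, in the same position as the hand-written pipelines above.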
During a code review I decided to improve the performance of an existing query by rewriting an aggregation that looked like this:
.aggregate([
//difference starts here
{
"$lookup": {
"from": "sessions",
"localField": "_id",
"foreignField": "_client",
"as": "sessions"
}
},
{
$unwind: "$sessions"
},
{
$match: {
"sessions.deleted_at": null
}
},
//difference ends here
{
$project: {
name: client_name_concater,
email: '$email',
phone: '$phone',
address: addressConcater,
updated_at: '$updated_at',
}
}
]);
to this:
.aggregate([
//difference starts here
{
$lookup: {
from: 'sessions',
let: {
id: "$_id"
},
pipeline: [
{
$match: {
$expr: {
$and:
[
{
$eq: ["$_client", "$$id"]
}, {
$eq: ["$deleted_at", null]
},
]
}
}
}
],
as: 'sessions'
}
},
{
$match: {
"sessions": {$ne: []}
}
},
//difference ends here
{
$project: {
name: client_name_concater,
email: '$email',
phone: '$phone',
address: addressConcater,
updated_at: '$updated_at',
}
}
]);
I expected the second version to be faster since it has one less stage, but the difference in performance is massive in the opposite direction: the first query runs in ~40ms on average, while the second takes between 3.5 and 5 seconds, about 100 times longer. The joined collection (sessions) has around 120 documents and this one about 152. Even if that were acceptable for the data size, why is there such a difference between the two? Aren't they basically the same thing, with the join condition simply moved into the pipeline alongside the other condition of the join? Am I missing something?
Some functions or variables included there are mostly static values or concatenations that shouldn't affect the $lookup part.
Thanks
EDIT:
Added query plans, for version 1:
{
"stages": [
{
"$cursor": {
"query": {
"$and": [
{
"deleted_at": null
},
{}
]
},
"fields": {
"email": 1,
"phone": 1,
"updated_at": 1,
"_id": 1
},
"queryPlanner": {
"plannerVersion": 1,
"namespace": "test.clients",
"indexFilterSet": false,
"parsedQuery": {
"deleted_at": {
"$eq": null
}
},
"winningPlan": {
"stage": "COLLSCAN",
"filter": {
"deleted_at": {
"$eq": null
}
},
"direction": "forward"
},
"rejectedPlans": []
}
}
},
{
"$lookup": {
"from": "sessions",
"as": "sessions",
"localField": "_id",
"foreignField": "_client",
"unwinding": {
"preserveNullAndEmptyArrays": false
}
}
},
{
"$project": {
"_id": true,
"email": "$email",
"phone": "$phone",
"updated_at": "$updated_at"
}
}
],
"ok": 1
}
For version 2:
{
"stages": [
{
"$cursor": {
"query": {
"deleted_at": null
},
"fields": {
"email": 1,
"phone": 1,
"sessions": 1,
"updated_at": 1,
"_id": 1
},
"queryPlanner": {
"plannerVersion": 1,
"namespace": "test.clients",
"indexFilterSet": false,
"parsedQuery": {
"deleted_at": {
"$eq": null
}
},
"winningPlan": {
"stage": "COLLSCAN",
"filter": {
"deleted_at": {
"$eq": null
}
},
"direction": "forward"
},
"rejectedPlans": []
}
}
},
{
"$lookup": {
"from": "sessions",
"as": "sessions",
"let": {
"id": "$_id"
},
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{
"$eq": [
"$_client",
"$$id"
]
},
{
"$eq": [
"$deleted_at",
null
]
}
]
}
}
}
]
}
},
{
"$match": {
"sessions": {
"$not": {
"$eq": []
}
}
}
},
{
"$project": {
"_id": true,
"email": "$email",
"phone": "$phone",
"updated_at": "$updated_at"
}
}
],
"ok": 1
}
One thing of note: the joined sessions collection has certain properties holding very large data (some imported data), so I suspect this data may somehow be affecting the query. But why is there a difference between the two $lookup versions?
The second version adds an aggregation pipeline execution for each document in the joined collection.
The documentation says:
Specifies the pipeline to run on the joined collection. The pipeline determines the resulting documents from the joined collection. To return all documents, specify an empty pipeline [].
The pipeline is executed for each document in the collection, not for each matched document.
Depending on how large the collection is (both # of documents and document size) this could come out to a decent amount of time.
after removing the limit, the pipeline version jumped to over 10 seconds
Makes sense - all of the additional documents due to the removal of limit also must have the aggregation pipeline executed for them.
It is possible that per-document execution of aggregation pipeline isn't as optimized as it could be. For example, if the pipeline is set up and torn down for each document, there could easily be more overhead in that than in the $match conditions.
Is there any case when using one or the other?
Executing an aggregation pipeline per joined document provides additional flexibility. If you need this flexibility, it may make sense to execute the pipeline, though performance needs to be considered regardless. If you don't, it is sensible to use a more performant approach.
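As an illustration of a "more performant approach" when the extra flexibility is not needed, one option is to keep the simple localField/foreignField join and filter on the joined array afterwards, without $unwind. The sketch below only builds the stages; the helper name buildClientStages, the projection passed in, and the use of $elemMatch for the deleted_at check are my assumptions rather than something taken from the question, and the result should be measured against real data.

```javascript
// Hypothetical helper: equality-based $lookup followed by a filter on the
// joined array, so no sub-pipeline is needed.
function buildClientStages(projection) {
  return [
    {
      $lookup: {
        from: "sessions",
        localField: "_id",
        foreignField: "_client",
        as: "sessions",
      },
    },
    // Keep clients with at least one joined session whose deleted_at is
    // null (in the query language this also matches a missing field).
    { $match: { sessions: { $elemMatch: { deleted_at: null } } } },
    { $project: projection },
  ];
}

const stages = buildClientStages({ email: 1, phone: 1, updated_at: 1 });
console.log(JSON.stringify(stages, null, 2));
```

Unlike the $unwind version, this returns one document per client rather than one per matching session, so it is closer in shape to the second query in the question.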
I want to do a $lookup from a document into a collection where the foreignField key is embedded in an array of objects. I have:
collection "shirts"
{
"_id" : ObjectId("5a797ef0768d8418866eb0f6"),
"name" : "Supermanshirt",
"price" : 9.99,
"flavours" : [
{
"flavId" : ObjectId("5a797f8c768d8418866ebad3"),
"size" : "M",
"color": "white",
},
{
"flavId" : ObjectId("3a797f8c768d8418866eb0f7"),
"size" : "XL",
"color": "red",
},
]
}
collection "basket"
{
"_id" : ObjectId("5a797ef0333d8418866ebabc"),
"basketName" : "Default",
"items" : [
{
"dateAdded" : 1526996879787.0,
"itemFlavId" : ObjectId("5a797f8c768d8418866ebad3")
}
],
}
My Query:
basketSchema.aggregate([
{
$match: { $and: [{ _id }, { basketName }]},
},
{
$unwind: '$items',
},
{
$lookup:
{
from: 'shirts',
localField: 'items.itemFlavId',
foreignField: 'flavours.flavId',
as: 'ordered_shirts',
},
},
]).toArray();
my expected result:
[{
"_id" : ObjectId("5a797ef0333d8418866ebabc"),
"basketName" : "Default",
"items" : [
{
"dateAdded" : 1526996879787.0,
"itemFlavId" : ObjectId("5a797f8c768d8418866ebad3")
}
],
"ordered_shirts" : [
{
"_id" : ObjectId("5a797ef0768d8418866eb0f6"),
"name" : "Supermanshirt",
"price" : 9.99,
"flavours" : [
{
"flavId" : ObjectId("5a797f8c768d8418866ebad3"),
"size" : "M",
"color": "white",
}
]
}
],
}]
But instead my ordered_shirts array is empty.
How can I use a foreignField if this foreignField is embedded in an array at the other collection?
I am using MongoDB 3.6.4
As commented, it would appear that there is simply something up in your code where you are pointing at the wrong collection. The general case for this is to simply look at the example listing provided below and see what the differences are, since with the data you provide and the correct collection names then your expected result is in fact returned.
Of course where you need to take such a query "after" that initial $lookup stage is not a simple matter. From a structural standpoint, what you have is generally not a great idea since referring "joins" into items within an array means you are always returning data which is not necessarily "related".
There are some ways to combat that, and mostly there is the form of "non-correlated" $lookup introduced with MongoDB 3.6 which can aid in ensuring you are not returning "unnecessary" data in the "join".
I'm working here in the form of "merging" the "sku" detail with the "items" in the basket, so a first form would be:
Optimal MongoDB 3.6
// Store some vars like you have
let _id = ObjectId("5a797ef0333d8418866ebabc"),
basketName = "Default";
// Run non-correlated $lookup
let optimal = await Basket.aggregate([
{ "$match": { _id, basketName } },
{ "$lookup": {
"from": Shirt.collection.name,
"as": "items",
"let": { "items": "$items" },
"pipeline": [
{ "$match": {
"$expr": {
"$setIsSubset": ["$$items.itemFlavId", "$flavours.flavId"]
}
}},
{ "$project": {
"_id": 0,
"items": {
"$map": {
"input": {
"$filter": {
"input": "$flavours",
"cond": { "$in": [ "$$this.flavId", "$$items.itemFlavId" ]}
}
},
"in": {
"$mergeObjects": [
{ "$arrayElemAt": [
"$$items",
{ "$indexOfArray": [
"$$items.itemFlavId", "$$this.flavId" ] }
]},
{ "name": "$name", "price": "$price" },
"$$this"
]
}
}
}
}},
{ "$unwind": "$items" },
{ "$replaceRoot": { "newRoot": "$items" } }
]
}}
])
Note that since you are using mongoose to hold details for the models we can use Shirt.collection.name here to read the property from that model with the actual collection name as needed for the $lookup. This helps avoid confusion within the code and also "hard-coding" something like the collection name when it's actually stored somewhere else. In this way should you change the code which registers the "model" in a way which altered the collection name, then this would always retrieve the correct name for use in the pipeline stage.
The main reason you use this form of $lookup with MongoDB 3.6 is because you want to use that "sub-pipeline" to manipulate the foreign collection results "before" they are returned and merged with the parent document. Since we are "merging" the results into the existing "items" array of the basket we use the same field name in argument to "as".
In this form of $lookup you typically still want "related" documents even though it gives you the control to do whatever you want. In this case we can compare the array content from "items" in the parent document which we set as a variable for the pipeline to use with the array under "flavours" in the foreign collection. A logical comparison for the two "sets" of values here where they "intersect" is using the $setIsSubset operator using the $expr so we can compare on a "logical operation".
The main work here is being done in the $project which is simply using $map on the array from the "flavours" array of the foreign document, processed with $filter in comparison to the "items" we passed into the pipeline and essentially re-written in order to "merge" the matched content.
The $filter reduces down the list for consideration to only those which match something present within the "items", and then we can use $indexOfArray and $arrayElemAt in order to extract the detail from the "items" and merge it with each remaining "flavours" entry which matches using the $mergeObjects operator. Noting here that we also take some "parent" detail from the "shirt" as the "name" and "price" fields which are common to the variations in size and color.
Since this is still an "array" within the matched document(s) to the join condition, in order to get a "flat list" of objects suitable for "merged" entries in the resulting "items" of the $lookup we simply apply $unwind, which within the context of matched items left only creates "little" overhead, and $replaceRoot in order to promote the content under that key to the top level.
The result is just the "merged" content listed in the "items" of the basket.
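For readers who find the sub-pipeline hard to follow, here is a plain JavaScript emulation of what it computes for the sample data. It is a sketch for illustration only: the function name mergeBasketItems is hypothetical, ids are shortened to plain strings instead of ObjectId values, and nothing here runs on the server.

```javascript
// Plain JavaScript emulation of the sub-pipeline (hypothetical helper).
function mergeBasketItems(basket, shirts) {
  const wanted = basket.items.map((item) => item.itemFlavId);

  return shirts
    // $match with $setIsSubset: every basket id must appear in the shirt.
    .filter((shirt) =>
      wanted.every((id) => shirt.flavours.some((f) => f.flavId === id))
    )
    .flatMap((shirt) =>
      shirt.flavours
        // $filter: keep only flavours that are in the basket.
        .filter((f) => wanted.includes(f.flavId))
        // $map + $mergeObjects: basket item, shirt detail, then flavour.
        .map((f) => ({
          ...basket.items.find((item) => item.itemFlavId === f.flavId),
          name: shirt.name,
          price: shirt.price,
          ...f,
        }))
    );
}

const basket = {
  basketName: "Default",
  items: [{ dateAdded: 1526996879787, itemFlavId: "flav-M-white" }],
};
const shirts = [
  {
    name: "Supermanshirt",
    price: 9.99,
    flavours: [
      { flavId: "flav-M-white", size: "M", color: "white" },
      { flavId: "flav-XL-red", size: "XL", color: "red" },
    ],
  },
];

const merged = mergeBasketItems(basket, shirts);
console.log(JSON.stringify(merged, null, 2));
```

The flatMap call plays the role of the $unwind and $replaceRoot pair: it yields a flat list of merged entries instead of one array per shirt.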
Sub-optimal MongoDB
The alternate approaches are really not that great since all involve returning other "flavours" which do not actually match the items in the basket. This basically involves "post-filtering" the results obtained from the $lookup as opposed to "pre-filtering" which the process above does.
So the next case here would be using methods to manipulate the returned array in order to remove the items which don't actually match:
// Using legacy $lookup
let alternate = await Basket.aggregate([
{ "$match": { _id, basketName } },
{ "$lookup": {
"from": Shirt.collection.name,
"localField": "items.itemFlavId",
"foreignField": "flavours.flavId",
"as": "ordered_items"
}},
{ "$addFields": {
"items": {
"$let": {
"vars": {
"ordered_items": {
"$reduce": {
"input": {
"$map": {
"input": "$ordered_items",
"as": "o",
"in": {
"$map": {
"input": {
"$filter": {
"input": "$$o.flavours",
"cond": {
"$in": ["$$this.flavId", "$items.itemFlavId"]
}
}
},
"as": "f",
"in": {
"$mergeObjects": [
{ "name": "$$o.name", "price": "$$o.price" },
"$$f"
]
}
}
}
}
},
"initialValue": [],
"in": { "$concatArrays": ["$$value", "$$this"] }
}
}
},
"in": {
"$map": {
"input": "$items",
"in": {
"$mergeObjects": [
"$$this",
{ "$arrayElemAt": [
"$$ordered_items",
{ "$indexOfArray": [
"$$ordered_items.flavId", "$$this.itemFlavId"
]}
]}
]
}
}
}
}
},
"ordered_items": "$$REMOVE"
}}
]);
Here I'm still using some MongoDB 3.6 features, but these are not a "requirement" of the logic involved. The main constraint in this approach is actually the $reduce which requires MongoDB 3.4 or greater.
Using the same "legacy" form of $lookup as you were attempting, we still get the desired results as you display but that of course contains information in the "flavours" that does not match the "items" in the basket. In much the same way as shown in the previous listing we can apply $filter here to remove the items which don't match. The same process here uses that $filter output as the input for $map, which again is doing the same "merge" process as before.
Where the $reduce comes in is because the resulting processing where there is an "array" target from $lookup with documents that themselves contain an "array" of "flavours" is that these arrays need to be "merged" into a single array for further processing. The $reduce simply uses the processed output and performs a $concatArrays on each of the "inner" arrays returned to make these results singular. We already "merged" the content, so this becomes the new "merged" "items".
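The flattening that $reduce with $concatArrays performs has a direct counterpart in plain JavaScript, which may make the idea easier to picture. The data below is made up for illustration (the second shirt is hypothetical).

```javascript
// One inner array of merged flavours per joined shirt, as produced by the
// nested $map; the sample values are hypothetical.
const perShirt = [
  [{ flavId: "a", name: "Supermanshirt" }],
  [
    { flavId: "b", name: "Batmanshirt" },
    { flavId: "c", name: "Batmanshirt" },
  ],
];

// Same shape as the $reduce: start from [] and concatenate each inner array.
const flat = perShirt.reduce((value, current) => value.concat(current), []);

console.log(flat.map((f) => f.flavId));
```

Here value and current correspond to "$$value" and "$$this" in the pipeline expression.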
Older Still $unwind
And of course the final way to present ( even though there are other combinations ) is using $unwind on the arrays and using $group to put it back together:
let old = await Basket.aggregate([
{ "$match": { _id, basketName } },
{ "$unwind": "$items" },
{ "$lookup": {
"from": Shirt.collection.name,
"localField": "items.itemFlavId",
"foreignField": "flavours.flavId",
"as": "ordered_items"
}},
{ "$unwind": "$ordered_items" },
{ "$unwind": "$ordered_items.flavours" },
{ "$redact": {
"$cond": {
"if": {
"$eq": [
"$items.itemFlavId",
"$ordered_items.flavours.flavId"
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}},
{ "$group": {
"_id": "$_id",
"basketName": { "$first": "$basketName" },
"items": {
"$push": {
"dateAdded": "$items.dateAdded",
"itemFlavId": "$items.itemFlavId",
"name": "$ordered_items.name",
"price": "$ordered_items.price",
"flavId": "$ordered_items.flavours.flavId",
"size": "$ordered_items.flavours.size",
"color": "$ordered_items.flavours.color"
}
}
}}
]);
Most of this should be pretty self explanatory as $unwind is simply a tool to "flatten" array content into singular document entries. In order to just get the results we want we can use $redact to compare the two fields. Using MongoDB 3.6 you "could" use $expr within a $match here:
{ "$match": {
"$expr": {
"$eq": [
"$items.itemFlavId",
"$ordered_items.flavours.flavId"
]
}
}}
But when it comes down to it, if you have MongoDB 3.6 with its other features then $unwind is the wrong thing to do here due to all the overhead it will actually add.
So all that really happens is you $lookup, then "flatten" the documents, and finally $group all related detail together using $push to recreate the "items" in the basket. It "looks simple" and is probably the easiest form to understand, however "simplicity" does not equal "performance" and this would be pretty brutal to use in a real world use case.
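To see where that overhead comes from, a minimal emulation of $unwind in plain JavaScript is sketched below. The helper unwind is hypothetical and only handles a top-level array field; the basket data is made up.

```javascript
// Minimal emulation of $unwind for a top-level array field (hypothetical
// helper): one output document per array element.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    (Array.isArray(doc[field]) ? doc[field] : []).map((value) => ({
      ...doc,
      [field]: value,
    }))
  );
}

const baskets = [
  {
    _id: "b1",
    basketName: "Default",
    items: [{ itemFlavId: "f1" }, { itemFlavId: "f2" }],
  },
];

const unwound = unwind(baskets, "items");
console.log(unwound.length); // 2
```

Every further $unwind multiplies the document count again, and each copy carries the full parent document, which is why chaining three of them before the $group gets expensive.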
Summary
That should cover the explanation of the things you need to do when working with "joins" that are going to compare items within arrays. This probably should lead you on the path of realizing this is not really a great idea and it would be far better to keep your "skus" listed "separately" rather than listing them all related under a single "item".
It also should in part be a lesson that "joins" in general are not a great idea with MongoDB. You really only should define such relations where they are "absolutely necessary". In such a case of "details for items in a basket", then contrary to traditional RDBMS patterns it would actually be far better in terms of performance to simply "embed" that detail from the start. In that way you don't need complicated join conditions just to get a result, which might have saved "a few bytes" in storage but is taking a lot more time than what should have been a simple request for the basket with all the detail already "embedded". That really should be the primary reason why you are using something like MongoDB in the first place.
So if you have to do it, then really you should be sticking with the first form since where you have the available features to use then use them best to their advantage. Whilst other approaches may seem easier, it won't help the application performance, and of course best performance would be embedding to begin with.
A full listing follows for demonstration of the above discussed methods and for basic comparison to prove that the provided data does in fact "join" as long as the other parts of the application set-up are working as they should be. So a model on "how it should be done" in addition to demonstrating the full concepts.
const { Schema, Types: { ObjectId } } = mongoose = require('mongoose');
const uri = 'mongodb://localhost/basket';
mongoose.Promise = global.Promise;
mongoose.set('debug', true);
const basketItemSchema = new Schema({
dateAdded: { type: Number, default: Date.now },
itemFlavId: { type: Schema.Types.ObjectId }
},{ _id: false });
const basketSchema = new Schema({
basketName: String,
items: [basketItemSchema]
});
const flavourSchema = new Schema({
flavId: { type: Schema.Types.ObjectId },
size: String,
color: String
},{ _id: false });
const shirtSchema = new Schema({
name: String,
price: Number,
flavours: [flavourSchema]
});
const Basket = mongoose.model('Basket', basketSchema);
const Shirt = mongoose.model('Shirt', shirtSchema);
const log = data => console.log(JSON.stringify(data, undefined, 2));
(async function() {
try {
const conn = await mongoose.connect(uri);
// clean data
await Promise.all(Object.entries(conn.models).map(([k,m]) => m.remove()));
// set up data for test
await Basket.create({
_id: ObjectId("5a797ef0333d8418866ebabc"),
basketName: "Default",
items: [
{
dateAdded: 1526996879787.0,
itemFlavId: ObjectId("5a797f8c768d8418866ebad3")
}
]
});
await Shirt.create({
_id: ObjectId("5a797ef0768d8418866eb0f6"),
name: "Supermanshirt",
price: 9.99,
flavours: [
{
flavId: ObjectId("5a797f8c768d8418866ebad3"),
size: "M",
color: "white"
},
{
flavId: ObjectId("3a797f8c768d8418866eb0f7"),
size: "XL",
color: "red"
}
]
});
// Store some vars like you have
let _id = ObjectId("5a797ef0333d8418866ebabc"),
basketName = "Default";
// Run non-correlated $lookup
let optimal = await Basket.aggregate([
{ "$match": { _id, basketName } },
{ "$lookup": {
"from": Shirt.collection.name,
"as": "items",
"let": { "items": "$items" },
"pipeline": [
{ "$match": {
"$expr": {
"$setIsSubset": ["$$items.itemFlavId", "$flavours.flavId"]
}
}},
{ "$project": {
"_id": 0,
"items": {
"$map": {
"input": {
"$filter": {
"input": "$flavours",
"cond": { "$in": [ "$$this.flavId", "$$items.itemFlavId" ]}
}
},
"in": {
"$mergeObjects": [
{ "$arrayElemAt": [
"$$items",
{ "$indexOfArray": [
"$$items.itemFlavId", "$$this.flavId" ] }
]},
{ "name": "$name", "price": "$price" },
"$$this"
]
}
}
}
}},
{ "$unwind": "$items" },
{ "$replaceRoot": { "newRoot": "$items" } }
]
}}
])
log(optimal);
// Using legacy $lookup
let alternate = await Basket.aggregate([
{ "$match": { _id, basketName } },
{ "$lookup": {
"from": Shirt.collection.name,
"localField": "items.itemFlavId",
"foreignField": "flavours.flavId",
"as": "ordered_items"
}},
{ "$addFields": {
"items": {
"$let": {
"vars": {
"ordered_items": {
"$reduce": {
"input": {
"$map": {
"input": "$ordered_items",
"as": "o",
"in": {
"$map": {
"input": {
"$filter": {
"input": "$$o.flavours",
"cond": {
"$in": ["$$this.flavId", "$items.itemFlavId"]
}
}
},
"as": "f",
"in": {
"$mergeObjects": [
{ "name": "$$o.name", "price": "$$o.price" },
"$$f"
]
}
}
}
}
},
"initialValue": [],
"in": { "$concatArrays": ["$$value", "$$this"] }
}
}
},
"in": {
"$map": {
"input": "$items",
"in": {
"$mergeObjects": [
"$$this",
{ "$arrayElemAt": [
"$$ordered_items",
{ "$indexOfArray": [
"$$ordered_items.flavId", "$$this.itemFlavId"
]}
]}
]
}
}
}
}
},
"ordered_items": "$$REMOVE"
}}
]);
log(alternate);
// Or really old style
let old = await Basket.aggregate([
{ "$match": { _id, basketName } },
{ "$unwind": "$items" },
{ "$lookup": {
"from": Shirt.collection.name,
"localField": "items.itemFlavId",
"foreignField": "flavours.flavId",
"as": "ordered_items"
}},
{ "$unwind": "$ordered_items" },
{ "$unwind": "$ordered_items.flavours" },
{ "$redact": {
"$cond": {
"if": {
"$eq": [
"$items.itemFlavId",
"$ordered_items.flavours.flavId"
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}},
{ "$group": {
"_id": "$_id",
"basketName": { "$first": "$basketName" },
"items": {
"$push": {
"dateAdded": "$items.dateAdded",
"itemFlavId": "$items.itemFlavId",
"name": "$ordered_items.name",
"price": "$ordered_items.price",
"flavId": "$ordered_items.flavours.flavId",
"size": "$ordered_items.flavours.size",
"color": "$ordered_items.flavours.color"
}
}
}}
]);
log(old);
} catch(e) {
console.error(e)
} finally {
process.exit()
}
})()
And sample output as:
Mongoose: baskets.remove({}, {})
Mongoose: shirts.remove({}, {})
Mongoose: baskets.insertOne({ _id: ObjectId("5a797ef0333d8418866ebabc"), basketName: 'Default', items: [ { dateAdded: 1526996879787, itemFlavId: ObjectId("5a797f8c768d8418866ebad3") } ], __v: 0 })
Mongoose: shirts.insertOne({ _id: ObjectId("5a797ef0768d8418866eb0f6"), name: 'Supermanshirt', price: 9.99, flavours: [ { flavId: ObjectId("5a797f8c768d8418866ebad3"), size: 'M', color: 'white' }, { flavId: ObjectId("3a797f8c768d8418866eb0f7"), size: 'XL', color: 'red' } ], __v: 0 })
Mongoose: baskets.aggregate([ { '$match': { _id: 5a797ef0333d8418866ebabc, basketName: 'Default' } }, { '$lookup': { from: 'shirts', as: 'items', let: { items: '$items' }, pipeline: [ { '$match': { '$expr': { '$setIsSubset': [ '$$items.itemflavId', '$flavours.flavId' ] } } }, { '$project': { _id: 0, items: { '$map': { input: { '$filter': { input: '$flavours', cond: { '$in': [Array] } } }, in: { '$mergeObjects': [ { '$arrayElemAt': [Array] }, { name: '$name', price: '$price' }, '$$this' ] } } } } }, { '$unwind': '$items' }, { '$replaceRoot': { newRoot: '$items' } } ] } } ], {})
[
{
"_id": "5a797ef0333d8418866ebabc",
"basketName": "Default",
"items": [
{
"dateAdded": 1526996879787,
"itemFlavId": "5a797f8c768d8418866ebad3",
"name": "Supermanshirt",
"price": 9.99,
"flavId": "5a797f8c768d8418866ebad3",
"size": "M",
"color": "white"
}
],
"__v": 0
}
]
Mongoose: baskets.aggregate([ { '$match': { _id: 5a797ef0333d8418866ebabc, basketName: 'Default' } }, { '$lookup': { from: 'shirts', localField: 'items.itemFlavId', foreignField: 'flavours.flavId', as: 'ordered_items' } }, { '$addFields': { items: { '$let': { vars: { ordered_items: { '$reduce': { input: { '$map': { input: '$ordered_items', as: 'o', in: { '$map': [Object] } } }, initialValue: [], in: { '$concatArrays': [ '$$value', '$$this' ] } } } }, in: { '$map': { input: '$items', in: { '$mergeObjects': [ '$$this', { '$arrayElemAt': [ '$$ordered_items', [Object] ] } ] } } } } }, ordered_items: '$$REMOVE' } } ], {})
[
{
"_id": "5a797ef0333d8418866ebabc",
"basketName": "Default",
"items": [
{
"dateAdded": 1526996879787,
"itemFlavId": "5a797f8c768d8418866ebad3",
"name": "Supermanshirt",
"price": 9.99,
"flavId": "5a797f8c768d8418866ebad3",
"size": "M",
"color": "white"
}
],
"__v": 0
}
]
Mongoose: baskets.aggregate([ { '$match': { _id: 5a797ef0333d8418866ebabc, basketName: 'Default' } }, { '$unwind': '$items' }, { '$lookup': { from: 'shirts', localField: 'items.itemFlavId', foreignField: 'flavours.flavId', as: 'ordered_items' } }, { '$unwind': '$ordered_items' }, { '$unwind': '$ordered_items.flavours' }, { '$redact': { '$cond': { if: { '$eq': [ '$items.itemFlavId', '$ordered_items.flavours.flavId' ] }, then: '$$KEEP', else: '$$PRUNE' } } }, { '$group': { _id: '$_id', basketName: { '$first': '$basketName' }, items: { '$push': { dateAdded: '$items.dateAdded', itemFlavId: '$items.itemFlavId', name: '$ordered_items.name', price: '$ordered_items.price', flavId: '$ordered_items.flavours.flavId', size: '$ordered_items.flavours.size', color: '$ordered_items.flavours.color' } } } } ], {})
[
{
"_id": "5a797ef0333d8418866ebabc",
"basketName": "Default",
"items": [
{
"dateAdded": 1526996879787,
"itemFlavId": "5a797f8c768d8418866ebad3",
"name": "Supermanshirt",
"price": 9.99,
"flavId": "5a797f8c768d8418866ebad3",
"size": "M",
"color": "white"
}
]
}
]
How can I use aggregate and find together in Mongoose?
For example, I have the following schema:
const schema = new Mongoose.Schema({
created: { type: Date, default: Date.now },
name: { type: String, default: 'development' },
followers: [{ type: Mongoose.Schema.ObjectId, ref: 'Users'}]
...
})
export default Mongoose.model('Locations', schema)
How can I query for only the fields name and followers_count, where followers_count is the length of followers?
I know we can use select to get only the field name.
How can we get the count of followers?
For MongoDB 3.6 and greater, use the $expr operator which allows the use of aggregation expressions within the query language:
var followers_count = 30;
db.locations.find({
"$expr": {
"$and": [
{ "$eq": ["$name", "development"] },
{ "$gte": [{ "$size": "$followers" }, followers_count ]}
]
}
});
For versions that do not support $expr, you can use the $match and $redact pipeline stages to query your collection. For example, if you want to query the locations collection where the name is 'development' and the number of followers is at least 30, run the following aggregate operation:
const followers_count = 30;
Locations.aggregate([
{ "$match": { "name": "development" } },
{
"$redact": {
"$cond": [
{ "$gte": [ { "$size": "$followers" }, followers_count ] },
"$$KEEP",
"$$PRUNE"
]
}
}
]).exec((err, locations) => {
if (err) throw err;
console.log(locations);
})
or within a single pipeline stage as
Locations.aggregate([
{
"$redact": {
"$cond": [
{
"$and": [
{ "$eq": ["$name", "development"] },
{ "$gte": [ { "$size": "$followers" }, followers_count ] }
]
},
"$$KEEP",
"$$PRUNE"
]
}
}
]).exec((err, locations) => {
if (err) throw err;
console.log(locations);
})
The above will return the locations with just the _id references to the users. To return the user documents as a means to "populate" the followers array, you can then append a $lookup stage.
If the underlying MongoDB server version is 3.4 or newer, you can run the pipeline as
let followers_count = 30;
Locations.aggregate([
{ "$match": { "name": "development" } },
{
"$redact": {
"$cond": [
{ "$gte": [ { "$size": "$followers" }, followers_count ] },
"$$KEEP",
"$$PRUNE"
]
}
},
{
"$lookup": {
"from": "users",
"localField": "followers",
"foreignField": "_id",
"as": "followers"
}
}
]).exec((err, locations) => {
if (err) throw err;
console.log(locations);
})
Otherwise you would need to $unwind the followers array before applying $lookup and then regroup with a $group stage after that:
let followers_count = 30;
Locations.aggregate([
{ "$match": { "name": "development" } },
{
"$redact": {
"$cond": [
{ "$gte": [ { "$size": "$followers" }, followers_count ] },
"$$KEEP",
"$$PRUNE"
]
}
},
{ "$unwind": "$followers" },
{
"$lookup": {
"from": "users",
"localField": "followers",
"foreignField": "_id",
"as": "follower"
}
},
{ "$unwind": "$follower" },
{
"$group": {
"_id": "$_id",
"created": { "$first": "$created" },
"name": { "$first": "$name" },
"followers": { "$push": "$follower" }
}
}
]).exec((err, locations) => {
if (err) throw err;
console.log(locations);
})
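The question also asked for a result containing only name and followers_count. One way to get that shape is a $project stage using $size, sketched below as a plain pipeline array. The $ifNull guard is my addition so that documents without a followers array do not raise an error, and running it through the Locations model from the question is an assumption.

```javascript
// Pipeline that returns only name and the number of followers.
const pipeline = [
  { $match: { name: "development" } },
  {
    $project: {
      name: 1,
      // $size fails on a missing field, so fall back to an empty array.
      followers_count: { $size: { $ifNull: ["$followers", []] } },
    },
  },
];

// With Mongoose this would presumably be run as Locations.aggregate(pipeline).
console.log(JSON.stringify(pipeline, null, 2));
```

The _id field is still returned by default; add _id: 0 to the $project if it is not wanted.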
You can use it as follows:
db.locations.aggregate([
{ $match: { /* your find query */ } },
{ $project: { /* your desired fields */ } }
])
In the $match stage you can do things like:
{ $match: { name: "whatever" } }
In the $project stage, you select the fields you want using 0 or 1, like:
{ $project: { _id: 1, name: 1 } }
Here 0 means exclude the field and 1 means include it. Apart from _id, inclusions and exclusions cannot be mixed in the same $project.
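One caveat when choosing the 0 and 1 values: apart from _id, a single $project cannot mix included and excluded fields, and the server rejects such a specification. A small client-side check, sketched below, can catch this before the query is sent. The helper name is hypothetical and it only understands plain 0 and 1 values, not nested or computed fields.

```javascript
// Hypothetical helper: reports whether a flat projection mixes 0 and 1
// for fields other than _id.
function mixesInclusionAndExclusion(projection) {
  const flags = Object.entries(projection)
    .filter(([key]) => key !== "_id")
    .map(([, value]) => value);
  return flags.includes(0) && flags.includes(1);
}

console.log(mixesInclusionAndExclusion({ _id: 1, created: 0, name: 1 })); // true
console.log(mixesInclusionAndExclusion({ _id: 0, name: 1 })); // false
```

In practice it is usually enough to list only the fields to include and let everything else be dropped.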
Here is my MongoDB collection schema:
company: String
model: String
cons: [String] // array of tags that were marked as "cons"
pros: [String] // array of tags that were marked as "pros"
I need to aggregate it so I get the following output:
[{
"_id": {
"company": "Lenovo",
"model": "T400"
},
"tags": {
tag: "SomeTag"
pros: 124 // number of times, "SomeTag" tag was found in "pros" array in `Lenovo T400`
cons: 345 // number of times, "SomeTag" tag was found in "cons" array in `Lenovo T400`
}
}...]
I tried to do the following:
var aggParams = [];
aggParams.push({ $unwind: '$cons' });
aggParams.push({ $unwind: '$pros' });
aggParams.push({$group: {
_id: {
company: '$company',
model: '$model',
consTag: '$cons'
},
consTagCount: { $sum: 1 }
}});
aggParams.push({$group: {
_id: {
company: '$_id.company',
model: '$_id.model',
prosTag: '$pros'
},
prosTagCount: { $sum: 1 }
}});
aggParams.push({$group: {
_id: {
company:'$_id.company',
model: '$_id.model'
},
tags: { $push: { tag: { $or: ['$_id.consTag', '$_id.prosTag'] }, cons: '$consTagCount', pros: '$prosTagCount'} }
}});
But I got the following result:
{
"_id": {
"company": "Lenovo",
"model": "T400"
},
"tags": [
{
"tag": false,
"pros": 7
}
]
}
What is the right way to do this with aggregation?
Yes, this is a bit harder considering that there are multiple arrays: if you try to $unwind both at the same time you end up with a "cartesian product", where one array multiplies the contents of the other.
Therefore, just combine the array content at the beginning, which probably indicates how you should be storing the data in the first place:
Model.aggregate(
[
{ "$project": {
"company": 1,
"model": 1,
"data": {
"$setUnion": [
{ "$map": {
"input": "$pros",
"as": "pro",
"in": {
"type": { "$literal": "pro" },
"value": "$$pro"
}
}},
{ "$map": {
"input": "$cons",
"as": "con",
"in": {
"type": { "$literal": "con" },
"value": "$$con"
}
}}
]
}
}},
{ "$unwind": "$data" },
{ "$group": {
"_id": {
"company": "$company",
"model": "$model",
"tag": "$data.value"
},
"pros": {
"$sum": {
"$cond": [
{ "$eq": [ "$data.type", "pro" ] },
1,
0
]
}
},
"cons": {
"$sum": {
"$cond": [
{ "$eq": [ "$data.type", "con" ] },
1,
0
]
}
}
}}
],
function(err,result) {
}
)
So via the first $project stage, the $map operators add the "type" value to each item of each array. The $setUnion operator then combines the two arrays into a single array; it treats its inputs as sets, which does not really matter here as all items should be "unique" anyway.
As mentioned earlier, you probably should be storing in this way in the first place.
Then process $unwind followed by $group, wherein each of "pros" and "cons" is evaluated via $cond for its matching "type", returning 1 or 0 to the $sum accumulator depending on whether the match is true or false.
This gives you a "logical match" to count each respective "type" within the aggregation operation as per the grouping keys specified.
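The tag-then-count idea can be tried out in plain JavaScript before running it on the server. The sketch below is an in-memory illustration only: countTags is a hypothetical helper, the sample tags are made up, and it concatenates the two arrays rather than applying set semantics.

```javascript
// In-memory sketch: tag each entry with its type, then count per
// company/model/tag (hypothetical helper, not a MongoDB API).
function countTags(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const tagged = [
      ...doc.pros.map((value) => ({ type: "pro", value })),
      ...doc.cons.map((value) => ({ type: "con", value })),
    ];
    for (const { type, value } of tagged) {
      const key = JSON.stringify([doc.company, doc.model, value]);
      const entry = counts.get(key) ?? {
        _id: { company: doc.company, model: doc.model, tag: value },
        pros: 0,
        cons: 0,
      };
      // Same role as the $cond inside each $sum accumulator.
      if (type === "pro") entry.pros += 1;
      else entry.cons += 1;
      counts.set(key, entry);
    }
  }
  return [...counts.values()];
}

// Hypothetical sample documents.
const docs = [
  { company: "Lenovo", model: "T400", pros: ["battery", "keyboard"], cons: ["weight"] },
  { company: "Lenovo", model: "T400", pros: ["keyboard"], cons: ["battery"] },
];

const tagCounts = countTags(docs);
console.log(JSON.stringify(tagCounts, null, 2));
```

Grouping these rows once more by company and model, pushing each tag with its counts into an array, would give the nested "tags" shape shown in the question.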