I have an array like this:
{
"_id" : ObjectId("581b7d650949a5204e0a6e9b"),
"types" : [
{
"type" : ObjectId("581b7c645057c4602f48627f"),
"quantity" : 4,
"_id" : ObjectId("581b7d650949a5204e0a6e9e")
},
{
"type" : ObjectId("581ca0e75b1e3058521a6d8c"),
"quantity" : 4,
"_id" : ObjectId("581b7d650949a5204e0a6e9e")
}
],
"__v" : 0
},
{
"_id" : ObjectId("581b7d650949a5204e0a6e9c"),
"types" : [
{
"type" : ObjectId("581b7c645057c4602f48627f"),
"quantity" : 4,
"_id" : ObjectId("581b7d650949a5204e0a6e9e")
}
],
"__v" : 0
}
And I want to create a query that returns the elements where ALL entries of the types array match a $in array.
For example:
query([ObjectId("581b7c645057c4602f48627f"), ObjectId("581ca0e75b1e3058521a6d8c")])
should return elements 1 and 2
query([ObjectId("581b7c645057c4602f48627f")])
should return element 2
query([ObjectId("581ca0e75b1e3058521a6d8c")])
should return nothing
I tried
db.getCollection('elements').find({'types.type': { $in: [ObjectId("581ca0e75b1e3058521a6d8c")]}})
But it returns an element if even one of its types matches.
You may have to use aggregation, because $in and $elemMatch match a document as soon as a single array element matches. The $project stage uses $setIsSubset to create an all-match flag, and the final $match stage keeps only the documents where that flag is true.
aggregate([ {
$project: {
_id: 0,
isAllMatch: {$setIsSubset: ["$types.type", [ObjectId("581b7c645057c4602f48627f")]]},
data: "$$ROOT"
}
}, {
$match: {
isAllMatch: true
}
}])
Sample Output
{
"isAllMatch": true,
"data": {
"_id": ObjectId("581b7d650949a5204e0a6e9c"),
"types": [{
"type": ObjectId("581b7c645057c4602f48627f"),
"quantity": 4,
"_id": ObjectId("581b7d650949a5204e0a6e9e")
}],
"__v": 0
}
}
Alternative version:
This version combines the $project and $match stages into a single $redact stage, using the $cond operator to decide whether to keep or prune each element.
aggregate([{
"$redact": {
"$cond": [{
$setIsSubset: ["$types.type", [ObjectId("581b7c645057c4602f48627f")]]
},
"$$KEEP",
"$$PRUNE"
]
}
}])
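To sanity-check the $setIsSubset condition outside the database, its logic can be modelled in plain JavaScript. This is only a sketch: isAllMatch, query and the docs array are hypothetical names, and the ObjectIds are shortened to plain strings ("A" and "B") for readability.

```javascript
// Plain-JavaScript model of the $setIsSubset check; this is not MongoDB code.
// ObjectIds are replaced by short strings for illustration only.
const docs = [
  { _id: "e1", types: [{ type: "A", quantity: 4 }, { type: "B", quantity: 4 }] },
  { _id: "e2", types: [{ type: "A", quantity: 4 }] },
];

// True when every types.type value appears in the allowed list,
// i.e. the document's set of types is a subset of `allowed`.
function isAllMatch(doc, allowed) {
  const allowedSet = new Set(allowed);
  return doc.types.every((t) => allowedSet.has(t.type));
}

// Returns the _id of every document that passes the all-match check.
function query(allowed) {
  return docs.filter((d) => isAllMatch(d, allowed)).map((d) => d._id);
}

console.log(query(["A", "B"])); // [ 'e1', 'e2' ]
console.log(query(["A"]));      // [ 'e2' ]
console.log(query(["B"]));      // []
```

Note that a document with an empty types array also passes this check, since an empty set is a subset of any set; add an extra condition if such documents should be excluded.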
I'm practicing how to use MongoDB aggregation, but my aggregations seem to take a really long time to run.
The problem seems to happen whenever I use $group. All other queries run just fine.
I have some 1.3 million dummy documents that need to perform two basic operations: get a count of the IP addresses and unique IP addresses.
My schema looks something like this:
{
"_id":"5da51af103eb566faee6b8b4",
"ip_address":"...",
"country":"CL",
"browser":{
"user_agent":"..."
}
}
Running a basic $group query takes about 12s on average, which is much too slow.
I did a little research, and someone suggested creating an index on ip_address. That seems to have slowed it down, because queries now take 13-15s.
I use MongoDB and the query I'm running looks like this:
visitorsModel.aggregate([
{
'$group': {
'_id': '$ip_address',
'count': {
'$sum': 1
}
}
}
]).allowDiskUse(true)
.exec(function (err, docs) {
if (err) throw err;
return res.send({
uniqueCount: docs.length
})
})
Any help is appreciated.
Edit: I forgot to mention, someone suggested it might be a hardware issue? I'm running the query on a core i5, 8GB RAM laptop if it helps.
Edit 2: The query plan:
{
"stages" : [
{
"$cursor" : {
"query" : {
},
"fields" : {
"ip_address" : 1,
"_id" : 0
},
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "metrics.visitors",
"indexFilterSet" : false,
"parsedQuery" : {
},
"winningPlan" : {
"stage" : "COLLSCAN",
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 1387324,
"executionTimeMillis" : 7671,
"totalKeysExamined" : 0,
"totalDocsExamined" : 1387324,
"executionStages" : {
"stage" : "COLLSCAN",
"nReturned" : 1387324,
"executionTimeMillisEstimate" : 9,
"works" : 1387326,
"advanced" : 1387324,
"needTime" : 1,
"needYield" : 0,
"saveState" : 10930,
"restoreState" : 10930,
"isEOF" : 1,
"invalidates" : 0,
"direction" : "forward",
"docsExamined" : 1387324
}
}
}
},
{
"$group" : {
"_id" : "$ip_address",
"count" : {
"$sum" : {
"$const" : 1
}
}
}
}
],
"ok" : 1
}
Here is some information about the $group aggregation stage: whether it uses indexes, its limitations, and what you can try to work around them.
1. The $group Stage Doesn't Use Index:
Mongodb Aggregation: Does $group use index?
2. $group Operator and Memory:
The $group stage has a limit of 100 megabytes of RAM. By default, if
the stage exceeds this limit, $group returns an error. To allow for
the handling of large datasets, set the allowDiskUse option to true.
This flag enables $group operations to write to temporary files.
See MongoDb docs on $group Operator and Memory
3. An Example Using $group and Count:
A collection called cities:
{ "_id" : 1, "city" : "Bangalore", "country" : "India" }
{ "_id" : 2, "city" : "New York", "country" : "United States" }
{ "_id" : 3, "city" : "Canberra", "country" : "Australia" }
{ "_id" : 4, "city" : "Hyderabad", "country" : "India" }
{ "_id" : 5, "city" : "Chicago", "country" : "United States" }
{ "_id" : 6, "city" : "Amritsar", "country" : "India" }
{ "_id" : 7, "city" : "Ankara", "country" : "Turkey" }
{ "_id" : 8, "city" : "Sydney", "country" : "Australia" }
{ "_id" : 9, "city" : "Srinagar", "country" : "India" }
{ "_id" : 10, "city" : "San Francisco", "country" : "United States" }
Query the collection to count the cities by each country:
db.cities.aggregate( [
{ $group: { _id: "$country", cityCount: { $sum: 1 } } },
{ $project: { country: "$_id", _id: 0, cityCount: 1 } }
] )
The Result:
{ "cityCount" : 3, "country" : "United States" }
{ "cityCount" : 1, "country" : "Turkey" }
{ "cityCount" : 2, "country" : "Australia" }
{ "cityCount" : 4, "country" : "India" }
4. Using allowDiskUse Option:
db.cities.aggregate( [
{ $group: { _id: "$country", cityCount: { $sum: 1 } } },
{ $project: { country: "$_id", _id: 0, cityCount: 1 } }
], { allowDiskUse : true } )
Note, in this case it makes no difference in query performance or output. This is to show the usage only.
5. Some Options to Try (suggestions):
You can try a few things to get some result (for trial purposes only):
Use $limit stage and restrict the number of documents processed and
see what is the result. For example, you can try { $limit: 1000 }.
Note this stage needs to come before the $group stage.
You can also use the $match, $project stages before the $group
stage to control the shape and size of the input. This may
return a result (instead of an error).
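The suggestions above can be sketched as a small helper that places the optional $match and $limit stages before $group. The function name buildCountPipeline and its options are hypothetical; only the stage names come from MongoDB.

```javascript
// Hypothetical helper: builds a counting $group pipeline, optionally
// narrowing the input first so that $group has less data to process.
function buildCountPipeline(field, { match = null, limit = null } = {}) {
  const pipeline = [];
  if (match) pipeline.push({ $match: match });          // filter early
  if (limit !== null) pipeline.push({ $limit: limit }); // cap the input for trial runs
  pipeline.push({ $group: { _id: "$" + field, count: { $sum: 1 } } });
  return pipeline;
}

const pipeline = buildCountPipeline("country", {
  match: { country: "India" },
  limit: 1000,
});
console.log(JSON.stringify(pipeline));
```

The resulting array could then be passed to something like db.cities.aggregate(pipeline, { allowDiskUse: true }), as in the allowDiskUse example above.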
[EDIT ADD]
Notes on Distinct and Count:
Using the same cities collection - to get unique countries and a count of them you can try using the aggregate stage $count along with $group as in the following two queries.
Distinct:
db.cities.aggregate( [
{ $match: { country: { $exists: true } } },
{ $group: { _id: "$country" } },
{ $project: { country: "$_id", _id: 0 } }
] )
The Result:
{ "country" : "United States" }
{ "country" : "Turkey" }
{ "country" : "India" }
{ "country" : "Australia" }
To get the above result as a single document with an array of unique values, use the $addToSet operator:
db.cities.aggregate( [
{ $match: { country: { $exists: true } } },
{ $group: { _id: null, uniqueCountries: { $addToSet: "$country" } } },
{ $project: { _id: 0 } },
] )
The Result: { "uniqueCountries" : [ "United States", "Turkey", "India", "Australia" ] }
Count:
db.cities.aggregate( [
{ $match: { country: { $exists: true } } },
{ $group: { _id: "$country" } },
{ $project: { country: "$_id", _id: 0 } },
{ $count: "uniqueCountryCount" }
] )
The Result: { "uniqueCountryCount" : 4 }
In the above queries the $match stage filters out documents that have no country field (note that { $exists: true } still matches a field explicitly set to null). The $project stage reshapes the result document(s).
MongoDB Query Language:
Note that similar results can be obtained with the MongoDB query language commands db.cities.distinct("country") and db.cities.distinct("country").length (distinct returns an array).
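Applied to the ip_address question, the same $group plus $count idea might look like the sketch below, so that the server returns a single count instead of one document per address. The pipeline is an untested assumption based on the cities example; countDistinct is a hypothetical plain-JavaScript model of what the two stages compute, and the sample addresses are made up.

```javascript
// Sketch: let the server count the groups instead of returning one document per IP.
const uniqueIpPipeline = [
  { $group: { _id: "$ip_address" } }, // one group per distinct address
  { $count: "uniqueCount" },          // collapse the groups into a single count
];

// Plain-JavaScript model of what the two stages compute (not MongoDB code).
function countDistinct(docs, field) {
  return new Set(docs.map((d) => d[field])).size;
}

// Made-up sample data for illustration.
const sample = [
  { ip_address: "10.0.0.1" },
  { ip_address: "10.0.0.2" },
  { ip_address: "10.0.0.1" },
];
console.log(countDistinct(sample, "ip_address")); // 2
```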
You can create an index:
db.collectionname.createIndex( { ip_address: "text" } )
Try this; it should be faster.
I think it will help you.
How to get data in Mongoose based on the last element in an array?
I have data that looks like this:
[
{
"_id" : ObjectId("5b56eb3deb869312d85a8e69"),
"transactionStatus" : [
{
"status" : "pending",
"createdAt" : ISODate("2018-07-24T09:02:53.347Z")
},
{
"status" : "process",
"createdAt" : ISODate("2018-07-24T09:02:53.347Z")
}
]
},
{
"_id" : ObjectId("5b56eb3deb869312d8589765"),
"transactionStatus" : [
{
"status" : "pending",
"createdAt" : ISODate("2018-07-24T09:02:53.347Z")
},
{
"status" : "process",
"createdAt" : ISODate("2018-07-24T09:03:30.347Z")
},
{
"status" : "done",
"createdAt" : ISODate("2018-07-24T09:04:22.347Z")
}
]
}
]
I want to get the documents above where the last transactionStatus object has status = process, so the result should be:
{
"_id" : ObjectId("5b56eb3deb869312d85a8e69"),
"transactionStatus" : [
{
"status" : "pending",
"createdAt" : ISODate("2018-07-24T09:02:53.347Z")
},
{
"status" : "process",
"createdAt" : ISODate("2018-07-24T09:02:53.347Z")
}
]
}
How can I do that with Mongoose?
You can use $expr (MongoDB 3.6+) inside $match. Using $let and $arrayElemAt with -1 as the second argument, you can get the last element as a temporary variable and then compare its value:
db.col.aggregate([
{
$match: {
$expr: {
$let: {
vars: { last: { $arrayElemAt: [ "$transactionStatus", -1 ] } },
in: { $eq: [ "$$last.status", "process" ] }
}
}
}
}
])
The same result can be achieved on older versions of MongoDB using $addFields and $match. You can then add $project to remove the temporary field:
db.col.aggregate([
{
$addFields: {
last: { $arrayElemAt: [ "$transactionStatus", -1 ] }
}
},
{
$match: { "last.status": "process" }
},
{
$project: { last: 0 }
}
])
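The last-element comparison that both pipelines perform can be modelled in plain JavaScript. This is a sketch for checking the logic only: lastStatusIs is a hypothetical helper, and the documents are reduced to short ids and status strings.

```javascript
// Plain-JavaScript model of { $arrayElemAt: [ "$transactionStatus", -1 ] }
// followed by a comparison on status; this is not MongoDB code.
function lastStatusIs(doc, status) {
  const statuses = doc.transactionStatus || [];
  const last = statuses[statuses.length - 1]; // undefined for an empty array
  return last !== undefined && last.status === status;
}

// Reduced version of the sample data, with short ids for illustration.
const transactions = [
  { _id: "a", transactionStatus: [{ status: "pending" }, { status: "process" }] },
  { _id: "b", transactionStatus: [{ status: "pending" }, { status: "process" }, { status: "done" }] },
];

const inProcess = transactions.filter((d) => lastStatusIs(d, "process"));
console.log(inProcess.map((d) => d._id)); // [ 'a' ]
```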
// Always insert the new status at position 0 using the $position modifier
db.col.update({
"_id": ObjectId("5b56eb3deb869312d85a8e69")
},
{
"$push": {
"transactionStatus": {
"$each": [
{
"status": "process",
"createdAt": ISODate("2018-07-24T09:02:53.347Z")
}
],
"$position": 0
}
}
}
)
// Query that checks whether the first element's status is "process"
db.col.find(
{
"transactionStatus.0.status": "process"
}
)
See the documentation for $position and $each.
I am a newbie, but I am trying to learn the most logical way to write queries.
Assume I have a collection like this:
{
"id" : NumberInt(1),
"school" : [
{
"name" : "george",
"code" : "01"
},
{
"name" : "michelangelo",
"code" : "01"
}
],
"enrolledStudents" : [
{
"userName" : "elisabeth",
"code" : NumberInt(21)
}
]
}
{
"id" : NumberInt(2),
"school" : [
{
"name" : "leonarda da vinci",
"code" : "01"
}
],
"enrolledStudents" : [
{
"userName" : "michelangelo",
"code" : NumberInt(25)
}
]
}
I want to list the occurrences of a key with their corresponding code values.
As an example, key: michelangelo
To find the occurrences of the key, I wrote two different aggregation queries:
db.test.aggregate([
{$unwind: "$school"},
{$match : {"school.name" : "michelangelo"}},
{$project: {_id: "$id", "key" : "$school.name", "code" : "$school.code"}}
])
and
db.test.aggregate([
{$unwind: "$enrolledStudents"},
{$match : {"enrolledStudents.userName" : "michelangelo"}},
{$project: {_id: "$id", "key" : "$enrolledStudents.userName", "code" : "$enrolledStudents.code"}}
])
Together, these two queries return what I want:
{ "_id" : 1, "key" : "michelangelo", "code" : "01" }
{ "_id" : 2, "key" : "michelangelo", "code" : 25 }
One of them searches in enrolledStudents; the other searches in the school field.
Can these two queries be reduced to a single, more logical query? Or is this the only way to do it?
PS: I am aware that the database structure is not logical, but I tried to simulate it.
Edit
I tried to write a query with find:
db.test.find({$or: [{"enrolledStudents.userName" : "michelangelo"} , {"school.name" : "michelangelo"}]}).pretty()
but this returns the whole documents:
{
"id" : 1,
"school" : [
{
"name" : "george",
"code" : "01"
},
{
"name" : "michelangelo",
"code" : "01"
}
],
"enrolledStudents" : [
{
"userName" : "elisabeth",
"code" : 21
}
]
}
{
"id" : 2,
"school" : [
{
"name" : "leonarda da vinci",
"code" : "01"
}
],
"enrolledStudents" : [
{
"userName" : "michelangelo",
"code" : 25
}
]
}
Mongo 3.4
$match - This stage keeps the documents whose school or enrolledStudents array has at least one embedded document matching either query condition.
$group - This stage combines the school and enrolledStudents arrays into a 2D array for each id.
$project - This stage uses $filter to keep the elements of the merged array that match the query condition, and $map to relabel them into a values array.
$unwind - This stage flattens the values array.
$addFields & $replaceRoot - These stages add the id field and promote the values documents to the top level.
db.collection.aggregate([
{$match : {$or: [{"enrolledStudents.userName" : "michelangelo"} , {"school.name" : "michelangelo"}]}},
{$group: {_id: "$id", merge : {$push:{$setUnion:["$school", "$enrolledStudents"]}}}},
{$project: {
values: {
$map:
{
input: {
$filter: {
input: {"$arrayElemAt":["$merge",0]},
as: "onef",
cond: {
$or: [{
$eq: ["$$onef.userName", "michelangelo"]
}, {
$eq: ["$$onef.name", "michelangelo"]
}]
}
}
},
as: "onem",
in: {
key : { $ifNull: [ "$$onem.userName", "$$onem.name" ] },
code : "$$onem.code"}
}
}
}
},
{$unwind: "$values"},
{$addFields:{"values.id":"$_id"}},
{$replaceRoot: { newRoot:"$values"}}
])
Sample Response
{ "_id" : 2, "key" : "michelangelo", "code" : 25 }
{ "_id" : 1, "key" : "michelangelo", "code" : "01" }
Mongo <= 3.2
Replace last two stages of above aggregation with $project to format the response.
{$project: {"_id": 0 , id:"$_id", key:"$values.key", code:"$values.code"}}
Sample Response
{ "_id" : 2, "key" : "michelangelo", "code" : 25 }
{ "_id" : 1, "key" : "michelangelo", "code" : "01" }
You can use $redact instead of $group and $match, and add $project with $map to format the response.
$redact goes through the document one level at a time and applies $$DESCEND or $$PRUNE based on the matching criteria.
The only thing to note is the use of $ifNull on id at the top document level, so that you can $$DESCEND into the embedded documents for further processing.
db.collection.aggregate([
{
$redact: {
$cond: [{
$or: [{
$eq: ["$userName", "michelangelo"]
}, {
$eq: ["$name", "michelangelo"]
}, {
$ifNull: ["$id", false]
}]
}, "$$DESCEND", "$$PRUNE"]
}
},
{
$project: {
id:1,
values: {
$map:
{
input: {$setUnion:["$school", "$enrolledStudents"]},
as: "onem",
in: {
key : { $ifNull: [ "$$onem.userName", "$$onem.name" ] },
code : "$$onem.code"}
}
}
}
},
{$unwind: "$values"},
{$project: {_id:0,id:"$id", key:"$values.key", code:"$values.code"}}
])
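To check what either pipeline is expected to return, the "search both arrays and relabel" logic can be modelled in plain JavaScript. This is a sketch under assumptions: findKey is a hypothetical helper, and the output uses an id field rather than claiming to reproduce the exact field names or order of the aggregation output.

```javascript
// Plain-JavaScript model of searching both arrays for a key; not MongoDB code.
function findKey(docs, key) {
  const rows = [];
  for (const doc of docs) {
    // Matches in the school array are identified by `name`.
    for (const s of doc.school || []) {
      if (s.name === key) rows.push({ id: doc.id, key: s.name, code: s.code });
    }
    // Matches in the enrolledStudents array are identified by `userName`.
    for (const e of doc.enrolledStudents || []) {
      if (e.userName === key) rows.push({ id: doc.id, key: e.userName, code: e.code });
    }
  }
  return rows;
}

// The two sample documents from the question.
const testDocs = [
  {
    id: 1,
    school: [{ name: "george", code: "01" }, { name: "michelangelo", code: "01" }],
    enrolledStudents: [{ userName: "elisabeth", code: 21 }],
  },
  {
    id: 2,
    school: [{ name: "leonarda da vinci", code: "01" }],
    enrolledStudents: [{ userName: "michelangelo", code: 25 }],
  },
];

const rows = findKey(testDocs, "michelangelo");
console.log(rows);
```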
I have multiple documents like this:
{
"_id" : ObjectId("57189fcd72b6e0480ed7a0a9"),
"venueId" : ObjectId("56ce9ead08daba400d14edc9"),
"companyId" : ObjectId("56e7d62ecc0b8fc812b2aac5"),
"cardTypeId" : ObjectId("56cea8acd82cd11004ee67a9"),
"matchData" : [
{
"matchId" : ObjectId("57175c25561d87001e666d12"),
"matchDate" : ISODate("2016-04-08T18:30:00.000Z"),
"matchTime" : "20:00:00",
"_id" : ObjectId("57189fcd72b6e0480ed7a0ab"),
"active" : 3,
"cancelled" : 0,
"produced" : 3
},
{
"matchId" : ObjectId("57175c25561d87001e666d13"),
"matchDate" : ISODate("2016-04-09T18:30:00.000Z"),
"matchTime" : "20:00:00",
"_id" : ObjectId("57189fcd72b6e0480ed7a0aa"),
"active" : null,
"cancelled" : null,
"produced" : null
}
],
"__v" : 0
}
I am grouping by companyId and it works fine, but I want to search in matchData based on matchTime and matchId. For that purpose I $unwind matchData, and after the unwind I use my search query like this:
db.getCollection('matchWiseData').aggregate([
{"$match":{
"matchData.matchId":{"$in":[ObjectId("57175c25561d87001e666d12")]}
}},
{"$unwind":"$matchData"},
{"$match":{
"matchData.matchId":{"$in":[ObjectId("57175c25561d87001e666d12")]}}
}])
It gives me the proper result, but after applying $unwind, is there any way to undo it? I am using $unwind just to search inside the subdocument; or is there another way to search inside a subdocument?
Well you can of course just use $push and $first in a $group to get the document back to what it was:
db.getCollection('matchWiseData').aggregate([
{ "$match":{
"matchData.matchId":{"$in":[ObjectId("57175c25561d87001e666d12")]}
}},
{ "$unwind":"$matchData"},
{ "$match":{
"matchData.matchId":{"$in":[ObjectId("57175c25561d87001e666d12")]}
}},
{ "$group": {
"_id": "$_id",
"venueId": { "$first": "$venueId" },
"companyId": { "$first": "$companyId" },
"cardTypeId": { "$first": "$cardTypeId" },
"matchData": { "$push": "$matchData" }
}}
])
But you probably should have just used $filter with MongoDB 3.2 in the first place:
db.getCollection('matchWiseData').aggregate([
{ "$match":{
"matchData.matchId":{"$in":[ObjectId("57175c25561d87001e666d12")]}
}},
{ "$project": {
"venueId": 1,
"companyId": 1,
"cardTypeId": 1,
"matchData": {
"$filter": {
"input": "$matchData",
"as": "match",
"cond": {
"$or": [
{ "$eq": [ "$$match.matchId", ObjectId("57175c25561d87001e666d12") ] }
]
}
}
}
}}
])
And if you had at least MongoDB 2.6, you still could have used $map and $setDifference instead:
db.getCollection('matchWiseData').aggregate([
{ "$match":{
"matchData.matchId":{"$in":[ObjectId("57175c25561d87001e666d12")]}
}},
{ "$project": {
"venueId": 1,
"companyId": 1,
"cardTypeId": 1,
"matchData": {
"$setDifference": [
{ "$map": {
"input": "$matchData",
"as": "match",
"in": {
"$cond": [
{ "$or": [
{ "$eq": [ "$$match.matchId", ObjectId("57175c25561d87001e666d12") ] }
]},
"$$match",
false
]
}
}},
[false]
]
}
}}
])
That's perfectly fine when every array element already has a "unique" identifier, so the "set" operation just removes the false values from $map.
Both of those are ways to "filter" content from an array without actually using $unwind.
N.B: Not sure if you really grasp that $in is used to match a "list of conditions" rather than being required to match on arrays. So generally the condition can just be:
"matchData.matchId": ObjectId("57175c25561d87001e666d12")
Where you only actually have a single value to match on. You use $in and $or when you have a "list" of conditions. Arrays themselves make no difference to the operator required.
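The effect of filtering the array in place, rather than unwinding and regrouping, can be modelled in plain JavaScript as follows. This is only a sketch: filterMatches is a hypothetical helper, and the ObjectIds are shortened to strings such as "m1".

```javascript
// Plain-JavaScript model of $filter on matchData; this is not MongoDB code.
// The document keeps its other fields, and only the array is narrowed.
function filterMatches(doc, matchIds) {
  const wanted = new Set(matchIds);
  return {
    ...doc,
    matchData: doc.matchData.filter((m) => wanted.has(m.matchId)),
  };
}

// Reduced sample document with shortened ids for illustration.
const matchDoc = {
  _id: "d1",
  venueId: "v1",
  matchData: [
    { matchId: "m1", active: 3 },
    { matchId: "m2", active: null },
  ],
};

const filtered = filterMatches(matchDoc, ["m1"]);
console.log(filtered.matchData.length); // 1
```

Because the document is never split apart, there is nothing to undo afterwards, which is the advantage over the $unwind and $group round trip.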
This is the collection structure:
[{
"_id" : "....",
"name" : "aaaa",
"level_max_leaves" : [
{
level : "ObjectIdString 1",
max_leaves : 4,
}
]
},
{
"_id" : "....",
"name" : "bbbb",
"level_max_leaves" : [
{
level : "ObjectIdString 2",
max_leaves : 2,
}
]
}]
I need to find the level_max_leaves subdocuments whose level matches a given input value.
This is how I tried it.
For example,
var empLevelId = 'ObjectIdString 1' ;
MyModel.aggregate(
{$unwind: "$level_max_leaves"},
{$match: {"$level_max_leaves.level": empLevelId } },
{$group: { "_id": "$level_max_leaves.level",
"total": { "$sum": "$level_max_leaves.max_leaves" }}},
function (err, res) {
console.log(res);
});
But here the $match filter is not working. I can't get the exact results for ObjectIdString 1.
If I filter on the name field, it works fine, like this:
{$match: {"$name": "aaaa" } },
But at the subdocument level it returns 0.
{$match: {"$level_max_leaves.level": "ObjectIdString 1"} },
My expected result was:
{
"_id" : "ObjectIdString 1",
"total" : 4,
}
You have typed the $match incorrectly. Fields with $ prefixes are either for the implemented operators or for "variable" references to field content. So you just type the field name:
MyModel.aggregate(
[
{ "$match": { "level_max_leaves.level": "ObjectIdString 1" } },
{ "$unwind": "$level_max_leaves" },
{ "$match": { "level_max_leaves.level": "ObjectIdString 1" } },
{ "$group": {
"_id": "$level_max_leaves.level",
"total": { "$sum": "$level_max_leaves.max_leaves" }
}}
],
function (err, res) {
console.log(res);
}
);
Which on the sample you provide produces:
{ "_id" : "ObjectIdString 1", "total" : 4 }
It is also good practice to $match first in your pipeline. That is in fact the only time an index can be used. But not only for that, as without the initial $match statement, your aggregation pipeline would perform an $unwind operation on every document in the collection, whether it met the conditions or not.
So generally what you want to do here is:
1. Match the documents that contain the required elements in the array
2. Unwind the array of the matching documents
3. Match the required array content, excluding all others
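These steps, followed by the final $group, can be modelled in plain JavaScript to see what each one contributes. This is a sketch only: totalLeaves is a hypothetical helper operating on an in-memory copy of the sample data, not on a collection.

```javascript
// Plain-JavaScript model of $match -> $unwind -> $match -> $group; not MongoDB code.
function totalLeaves(docs, level) {
  // 1. Keep only documents whose array contains the required level.
  const matched = docs.filter((d) =>
    d.level_max_leaves.some((l) => l.level === level)
  );
  // 2. Unwind: emit one copy of the document per array element.
  const unwound = matched.flatMap((d) =>
    d.level_max_leaves.map((l) => ({ ...d, level_max_leaves: l }))
  );
  // 3. Keep only the unwound elements that actually match.
  const kept = unwound.filter((d) => d.level_max_leaves.level === level);
  // Group: sum max_leaves over the remaining elements.
  const total = kept.reduce((sum, d) => sum + d.level_max_leaves.max_leaves, 0);
  return kept.length ? { _id: level, total } : null;
}

// The sample collection from the question.
const levelDocs = [
  { name: "aaaa", level_max_leaves: [{ level: "ObjectIdString 1", max_leaves: 4 }] },
  { name: "bbbb", level_max_leaves: [{ level: "ObjectIdString 2", max_leaves: 2 }] },
];

const leaves = totalLeaves(levelDocs, "ObjectIdString 1");
console.log(leaves); // { _id: 'ObjectIdString 1', total: 4 }
```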