I've got a MongoDB / Node.js aggregation that looks a little like this (there are other values in there, but this is the basic idea).
[
{
'$unwind': {
'path': '$Vehicles'
}
},
{
'$match': {
'Vehicles.Manufacturer': 'FORD'
}
},
{
'$facet': {
'makes': [
{
'$group': {
'_id': '$Vehicles.Manufacturer',
'count': {
'$sum': 1
}
}
}
]
}
},
{
'$project': {
'makes': {
'$sortArray': {
'input': '$makes',
'sortBy': 1
}
}
}
}
]
This works fine. But I would also like to pass an unfiltered list through, i.e. one array of vehicles whose Manufacturer = FORD and another list covering all Manufacturers.
Can't get it to work. Any ideas please?
Thanks in advance.
Edit:-
The current output looks like this:
[{
"makes": [
{
"_id": "FORD",
"count": 285
}
]
}]
and ideally it would look something like this:
[{
"makes": [
{
"_id": "FORD",
"count": 285
}
],
"unfiltered_makes": [
{
"_id": "ABARTH",
"count": 1
},
{
"_id": "AUDI",
"count": 7
},
{
"_id": "BMW",
"count": 2
},
{
"_id": "CITROEN",
"count": 4
},
{
"_id": "DS",
"count": 1
},
{
"_id": "FIAT",
"count": 1
}.... etc
]
}]
The data looks a bit like this:
"Vehicles": [
{
"Id": 1404908,
"Manufacturer": "MG",
"Model": "3",
"Price": 11995 .... etc
},{
"Id": 1404909,
"Manufacturer": "FORD",
"ManufacturerId": 34,
"Model": "Focus",
"Price": 12000 .... etc
} ... etc
]
In this case you can do something like:
db.collection.aggregate([
{$unwind: "$Vehicles"},
{$group: {
_id: "$Vehicles.Manufacturer",
count: {$sum: 1}}
},
{$facet: {
makes: [{$match: {_id: "FORD"}}],
unfiltered_makes: [{$group: {_id: 0, data: {$push: "$$ROOT"}}}]
}
},
{$project: {makes: 1, unfiltered_makes: "$unfiltered_makes.data"}}
])
See how it works on the playground example
Another option is:
db.collection.aggregate([
{$unwind: "$Vehicles"},
{$group: {
_id: "$Vehicles.Manufacturer",
count: {$sum: 1}}
},
{$group: {
_id: 0,
unfiltered_makes: {$push: "$$ROOT"},
makes: {$push: {$cond: [{$eq: ["$_id", "FORD"]}, "$$ROOT", "$$REMOVE"]}}
}
}
])
See how it works on the playground example
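For readers who want to check the logic without a database, here is a minimal in-memory sketch in plain JavaScript of what both pipelines compute: count every manufacturer once, then derive the filtered list from those counts. The sample documents and the helper name countMakes are made up for illustration; this is not MongoDB code.

```javascript
// Hypothetical sample documents shaped like the question's data.
const sampleDocs = [
  { Vehicles: [{ Manufacturer: "MG" }, { Manufacturer: "FORD" }] },
  { Vehicles: [{ Manufacturer: "FORD" }, { Manufacturer: "AUDI" }] },
];

// countMakes is an illustrative helper, not a MongoDB API.
function countMakes(docs, wanted) {
  const counts = new Map();
  // Equivalent of $unwind + $group: one counter per manufacturer.
  for (const doc of docs) {
    for (const vehicle of doc.Vehicles) {
      counts.set(vehicle.Manufacturer, (counts.get(vehicle.Manufacturer) || 0) + 1);
    }
  }
  const unfiltered_makes = [...counts].map(([_id, count]) => ({ _id, count }));
  // Equivalent of the $match / $cond step: keep only the requested make.
  const makes = unfiltered_makes.filter((m) => m._id === wanted);
  return { makes, unfiltered_makes };
}

const makeCounts = countMakes(sampleDocs, "FORD");
console.log(JSON.stringify(makeCounts));
```

The point of grouping first is that the filtered list is just a subset of the full counts, so the collection only needs to be counted once.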
Here's another way to do it, using "$function" to generate a histogram of "Manufacturer" and format the returned array. The JavaScript function only traverses the "Vehicles" array once, so this may be fairly efficient, although I did not do timing comparisons on a large collection.
N.B.: I'm a JavaScript noob and there may be a better way to do this.
db.collection.aggregate([
{
"$set": {
"unfiltered_makes": {
"$function": {
// generate histogram of manufacturers and format output
"body": "function(makes) {const m = new Object();makes.forEach((elem) => {m[elem.Manufacturer] = m[elem.Manufacturer] + 1 || 1});return Object.entries(m).map(([make, count]) => {return {'_id':make, 'count':count}})}",
"args": ["$Vehicles"],
"lang": "js"
}
}
}
},
{
"$project": {
"_id": 0,
"unfiltered_makes": 1,
"makes": {
"$filter": {
"input": "$unfiltered_makes",
"as": "make",
"cond": {
"$eq": [
"$$make._id",
// your search "Manufacturer" goes here
"FORD"
]
}
}
}
}
}
])
Try it on mongoplayground.net.
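Because the "body" string above is packed onto one line, it may help to see the same logic written out as an ordinary function that can be run in Node.js. This is only a readable sketch: the name manufacturerHistogram and the sample input are illustrative, and in the pipeline the function is applied to each document's own "Vehicles" array.

```javascript
// Same logic as the "body" string, written out as a normal function.
function manufacturerHistogram(vehicles) {
  const counts = {};
  vehicles.forEach((vehicle) => {
    // Start at 0 for a manufacturer we have not seen yet, then add 1.
    counts[vehicle.Manufacturer] = (counts[vehicle.Manufacturer] || 0) + 1;
  });
  // Turn { FORD: 2, ... } into [{ _id: "FORD", count: 2 }, ...].
  return Object.entries(counts).map(([make, count]) => ({ _id: make, count: count }));
}

// Hypothetical input, shaped like one document's "Vehicles" array.
const histogram = manufacturerHistogram([
  { Manufacturer: "MG" },
  { Manufacturer: "FORD" },
  { Manufacturer: "FORD" },
]);
console.log(histogram);
```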
I have a collection in MongoDB that looks something like this:
[
{
"machine": 1,
"status": true,
"comments": [
{
"machine": 1,
"status": false,
"emp": "158",
"comment": "testing the latest update"
},
{
"machine": 1,
"status": false,
"emp": "007",
"comment": "2nd comment"
},
]
},
{
"machine": 2,
"status": true,
"comments": [
{
"machine": 2,
"status": true,
"emp": "158",
"comment": "checking dcm 2"
}
]
}
]
I would like to return ALL of the top level documents (machines 1 & 2), but only comments by emp "007". I also only want to return the latest comment, not all. I got that part working with this line:
await db.collection('status').find().project( { "comments": { "$slice": -1 } }).toArray()
But I cannot for the life of me get it to then filter by 'emp' in the nested array.
You can use a simple aggregation with $filter to get a clean output:
db.collection.aggregate([
{
$project: {
machine: 1,
status: 1,
comments: {
$slice: [
{
$filter: {
input: "$comments",
as: "item",
cond: {$eq: ["$$item.emp", "007"]}}
}, -1
]
},
_id: 0
}
},
{$set: {comments: {$arrayElemAt: ["$comments", 0]}}},
])
Playground example
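To see what the $filter, $slice and $arrayElemAt steps do, here is a small in-memory equivalent in plain JavaScript using the documents from the question. It is a sketch of the logic only, not a replacement for the pipeline; the variable names are illustrative.

```javascript
// Documents copied from the question.
const machines = [
  {
    machine: 1,
    status: true,
    comments: [
      { machine: 1, status: false, emp: "158", comment: "testing the latest update" },
      { machine: 1, status: false, emp: "007", comment: "2nd comment" },
    ],
  },
  {
    machine: 2,
    status: true,
    comments: [{ machine: 2, status: true, emp: "158", comment: "checking dcm 2" }],
  },
];

const latestByEmp = machines.map(({ machine, status, comments }) => {
  // $filter: keep only the comments written by emp "007".
  const matching = comments.filter((c) => c.emp === "007");
  // $slice -1 plus $arrayElemAt 0: take the last match, if there is one.
  return { machine, status, comments: matching[matching.length - 1] };
});

console.log(JSON.stringify(latestByEmp));
```

Machine 2 is still returned, just without a comments value, because every top-level document is kept and only the nested array is narrowed.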
db.collection.aggregate([
{
$match: {
"comments.machine": {
$in: [
1,
2
]
},
"comments.emp": "007"
}
},
{
"$addFields": {//addFields with same name overwrites the array
"comments": {
"$filter": { //Filter the array elements based on condition
"input": "$comments",
"as": "comment",
"cond": {
$eq: [
"$$comment.emp",
"007"
]
}
}
}
}
}
])
Playground
If you have a date, sort by that and get the top one.
I'm using MongoDB with Node.js, and I'm having trouble with aggregation. This is an example of the data in the collection:
{
"name": "abcdef",
"address": "ghijk",
"reli": "A",
"prov": "a"
},
{
"name": "xyz",
"address": "vwz",
"reli": "B",
"prov": "b"
},
{
"name": "qwe",
"address": "rty",
"reli": "C",
"prov": "c"
},
{
"name": "abcdef",
"address": "ghijk",
"reli": "A",
"prov": "a"
},
{
"name": "hat",
"address": "ate",
"reli": "C",
"prov": "c"
}
This is my query to count:
const count = await db.aggregate([
{
$facet: {
"reli": [
{ $group: { _id: '$reli', count: { $sum: 1 } } }
],
"prov": [
{ $group: { _id: '$prov', count: { $sum: 1 } } }
]
}
}
]);
This is my result of the query:
[
{
"reli": [
{
"_id": "A",
"count": 2
},
{
"_id": "B",
"count": 1
},
{
"_id": "C",
"count": 2
}
],
"prov": [
{
"_id": "a",
"count": 2
},
{
"_id": "b",
"count": 1
},
{
"_id": "c",
"count": 2
}
]
}]
I want to aggregate this data so that the result contains only each reli and prov value together with its count.
Expected output:
[{
"reli": {
"A": 2, // the collection has 2 documents with "A", so its count is 2
"C": 2 // there are 2 "C" in the collection
},
"prov": {
"c": 2, // there are 2 "c" in the collection
"b": 1 // there is 1 "b" in the collection
}
}]
You can use $arrayToObject to get what you want here, creating keys from your values. For example:
db.collection.aggregate([
{
$facet: {
reli: [
{$group: { _id: "$reli", v: {$sum: 1}}},
{$project: {v: 1, k: "$_id", _id: 0}}
],
prov: [
{$group: {_id: "$prov", v: {$sum: 1}}},
{$project: {v: 1, k: "$_id", _id: 0}}
]
}
},
{
$project: {
prov: {$arrayToObject: "$prov"},
reli: {$arrayToObject: "$reli"}
}
}
])
Playground example
If there is a correlation between the reli and prov group, as in your example, you can avoid the $facet, group once, count once and only project twice.
You could try using $group like this. The output isn't exactly what you expected, but it may help:
const reli = await db.collection.aggregate([
{
'$group': {
'_id': '$reli',
'count': {
'$sum': 1
}
}
}, {
'$project': {
'_id': 0,
'reli': '$_id',
'count': 1
}
}
])
const prov = await db.collection.aggregate([
{
'$group': {
'_id': '$prov',
'count': {
'$sum': 1
}
}
}, {
'$project': {
'_id': 0,
'prov': '$_id',
'count': 1
}
}
])
const res = {
reli,
prov
};
output :
{
"prov": [
{ "count": 2, "prov": "a" },
{ "count": 1, "prov": "b" },
{ "count": 2, "prov": "c" }
],
"reli": [
{ "count": 2, "reli": "A" },
{ "count": 2, "reli": "C" },
{ "count": 1, "reli": "B" }
]
}
From this data you could, if needed, rewrite the arrays as key: value pairs to get the exact data model you asked for.
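As a rough sketch of that last rewriting step, the arrays can be turned into key: value objects in plain JavaScript. The helper name toCountMap is made up for this example, and the input is the output shown above.

```javascript
// Output of the two aggregations above, copied from the answer.
const res = {
  prov: [
    { count: 2, prov: "a" },
    { count: 1, prov: "b" },
    { count: 2, prov: "c" },
  ],
  reli: [
    { count: 2, reli: "A" },
    { count: 2, reli: "C" },
    { count: 1, reli: "B" },
  ],
};

// toCountMap is an illustrative helper: [{ count, <keyField> }] -> { key: count }.
function toCountMap(rows, keyField) {
  return Object.fromEntries(rows.map((row) => [row[keyField], row.count]));
}

const reshaped = {
  reli: toCountMap(res.reli, "reli"),
  prov: toCountMap(res.prov, "prov"),
};
console.log(reshaped);
```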
I have a problem with the aggregation framework in MongoDB (Mongoose). I have the following database schema, and I want to count the number of people who have access through Mobile only, Card only, or both, regardless of the order of the array elements:
{
'_id': ObjectId,
'user_access_type': ['Mobile' , 'Card']
}
{
'_id': ObjectId,
'user_access_type': ['Card' , 'Mobile']
}
{
'_id': ObjectId,
'user_access_type': ['Mobile']
}
{
'_id': ObjectId,
'user_access_type': ['Card']
}
Right now I am using the following, but it treats arrays with the same elements in a different order as different groups:
[ { "$group": { "_id": { "User": "$user_access_type" }, "count": { "$sum": 1 } } } ]
This is the output:
{
"_id": {
"User": [
"Card",
"Mobile"
]
},
"count": 1
},
{
"_id": {
"_id": "5f7dce2359aaf004985f98eb",
"User": [
"Mobile",
"Card"
]
},
"count": 1
},
{
"_id": {
"User": [
"Mobile"
]
},
"count": 1
},
{
"_id": {
"User": [
"Card"
]
},
"count": 1
},
vs what i want:
{
"_id": {
"User": [
"Card",
"Mobile" // we can say both
]
},
"count": 2 // does not depend on order
},
{
"_id": {
"User": [
"Mobile"
]
},
"count": 1
},
{
"_id": {
"User": [
"Card"
]
},
"count": 1
},
Another option is $function (available from MongoDB 4.4), which lets you run JavaScript inside the pipeline; here sort() puts each array into a consistent order before grouping:
db.collection.aggregate([
{
$addFields: {
user_access_type: {
$function: {
body: function(user_access_type){
return user_access_type.sort();
},
args: ["$user_access_type"],
lang: "js"
}
}
}
},
{
$group: {
_id: "$user_access_type",
count: { $sum: 1 }
}
}
])
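Second option:
If the user_access_type array always contains unique elements, you can apply the $setUnion operator to the array on its own. In practice this returns the elements in a consistent order, but note that the documentation describes the order of $setUnion output as unspecified, so this relies on observed behaviour: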
db.collection.aggregate([
{
$addFields: {
user_access_type: {
$setUnion: "$user_access_type"
}
}
},
{
$group: {
_id: "$user_access_type",
count: { $sum: 1 }
}
}
])
Playground
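Both options rest on the same idea: normalise each array to a canonical order before using it as the group key. A minimal in-memory sketch of that idea in plain JavaScript, using the four documents from the question (without their _id values), might look like this; the variable names are illustrative.

```javascript
// Documents from the question, minus _id.
const users = [
  { user_access_type: ["Mobile", "Card"] },
  { user_access_type: ["Card", "Mobile"] },
  { user_access_type: ["Mobile"] },
  { user_access_type: ["Card"] },
];

const accessCounts = {};
for (const user of users) {
  // Copy before sorting so the input is not mutated,
  // then use the sorted list as an order-independent group key.
  const key = [...user.user_access_type].sort().join(",");
  accessCounts[key] = (accessCounts[key] || 0) + 1;
}

console.log(accessCounts);
```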
I have an appointment collection in which each document has a status such as upcoming, cancelled or completed. I want to write an API that returns the count of each status using Mongoose or MongoDB methods.
The output should look like this:
[{
group: "grp1",
appointments_completed: 4,
appointments_upcoming: 5,
appointments_cancelled: 7
}]
Thanks in advance.
I hope this helps:
db.getCollection('codelist').aggregate([
{
$group:{
_id:{status:"$status"},
count:{$sum:1}
}
}
])
The result will be
[{
"_id" : {
"status" : "cancelled"
},
"count" : 13.0
},
{
"_id" : {
"status" : "completed"
},
"count" : 20.0
}
]
I think you can then reshape this result in Node.js.
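A minimal sketch of that Node.js post-processing step, assuming the result rows have the shape shown above (the counts are copied from that output and the variable names are illustrative):

```javascript
// Rows in the shape returned by the $group stage above.
const statusRows = [
  { _id: { status: "cancelled" }, count: 13 },
  { _id: { status: "completed" }, count: 20 },
];

// Build one object with an appointments_<status> key per row.
const summary = {};
for (const row of statusRows) {
  summary["appointments_" + row._id.status] = row.count;
}

console.log(summary);
```

A status that never occurs in the collection produces no row, so its key is simply absent here unless you add a default of 0 yourself.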
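Using the aggregation pipeline's $group stage we can get these totals. Note that this assumes each document already stores numeric appointments_completed, appointments_upcoming and appointments_cancelled fields, which it sums: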
db.collection_name.aggregate([
{ $group: {
_id:null,
appointments_completed: {$sum : "$appointments_completed" },
appointments_upcoming:{$sum :"$appointments_upcoming"},
appointments_cancelled:{$sum: "$appointments_cancelled"}
}
}
]);
With MongoDB 3.6 and newer, you can use the $arrayToObject operator and a $replaceRoot stage to get the desired result. You would need to run the following aggregation pipeline:
db.appointments.aggregate([
{ "$group": {
"_id": {
"group": <group_by_field>,
"status": { "$concat": ["appointments_", { "$toLower": "$status" }] }
},
"count": { "$sum": 1 }
} },
{ "$group": {
"_id": "$_id.group",
"counts": {
"$push": {
"k": "$_id.status",
"v": "$count"
}
}
} },
{ "$addFields": {
"counts": {
"$setUnion": [
"$counts", [
{
"k": "group",
"v": "$_id"
}
]
]
}
} },
{ "$replaceRoot": {
"newRoot": { "$arrayToObject": "$counts" }
} }
])
For older versions, a more generic approach though with a different output format would be to group twice and get the counts as an array of key value objects as in the following:
db.appointments.aggregate([
{ "$group": {
"_id": {
"group": <group_by_field>,
"status": { "$toLower": "$status" }
},
"count": { "$sum": 1 }
} },
{ "$group": {
"_id": "$_id.group",
"counts": {
"$push": {
"status": "$_id.status",
"count": "$count"
}
}
} }
])
which spits out:
{
"_id": "grp1",
"counts":[
{ "status": "completed", "count": 4 },
{ "status": "upcoming", "count": 5 },
{ "status": "cancelled", "count": 7 }
]
}
If the status codes are fixed then the $cond operator in the $group pipeline step can be used effectively to evaluate the counts based on the status field value. Your overall aggregation pipeline can be constructed as follows to produce the result in the desired format:
db.appointments.aggregate([
{ "$group": {
"_id": <group_by_field>,
"appointments_completed": {
"$sum": {
"$cond": [ { "$eq": [ "$status", "completed" ] }, 1, 0 ]
}
},
"appointments_upcoming": {
"$sum": {
"$cond": [ { "$eq": [ "$status", "upcoming" ] }, 1, 0 ]
}
},
"appointments_cancelled": {
"$sum": {
"$cond": [ { "$eq": [ "$status", "cancelled" ] }, 1, 0 ]
}
}
} }
])
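The $sum over $cond pattern simply adds 1 for a matching status and 0 otherwise. The same logic can be tried in memory with plain JavaScript, as in the sketch below; the sample appointments are invented and the field name group is an assumption standing in for <group_by_field>.

```javascript
// Hypothetical appointments; "group" stands in for <group_by_field>.
const appointments = [
  { group: "grp1", status: "completed" },
  { group: "grp1", status: "upcoming" },
  { group: "grp1", status: "completed" },
  { group: "grp2", status: "cancelled" },
];

const perGroup = {};
for (const { group, status } of appointments) {
  // Create the row for this group on first sight, with all counters at 0.
  const row =
    perGroup[group] ||
    (perGroup[group] = {
      group,
      appointments_completed: 0,
      appointments_upcoming: 0,
      appointments_cancelled: 0,
    });
  // Each line mirrors one $sum/$cond pair: add 1 on a match, else 0.
  row.appointments_completed += status === "completed" ? 1 : 0;
  row.appointments_upcoming += status === "upcoming" ? 1 : 0;
  row.appointments_cancelled += status === "cancelled" ? 1 : 0;
}

const groupedCounts = Object.values(perGroup);
console.log(groupedCounts);
```

Unlike the dynamic variants, a fixed list of statuses always yields all three keys, with 0 where nothing matched.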
Here is my MongoDB collection schema:
company: String
model: String
cons: [String] // array of tags that were marked as "cons"
pros: [String] // array of tags that were marked as "pros"
I need to aggregate it so I get the following output:
[{
"_id": {
"company": "Lenovo",
"model": "T400"
},
"tags": {
"tag": "SomeTag",
"pros": 124, // number of times, "SomeTag" tag was found in "pros" array in `Lenovo T400`
"cons": 345 // number of times, "SomeTag" tag was found in "cons" array in `Lenovo T400`
}
}...]
I tried to do the following:
var aggParams = []; // must be an array, since stages are added with push()
aggParams.push({ $unwind: '$cons' });
aggParams.push({ $unwind: '$pros' });
aggParams.push({$group: {
_id: {
company: '$company',
model: '$model',
consTag: '$cons'
},
consTagCount: { $sum: 1 }
}});
aggParams.push({$group: {
_id: {
company: '$_id.company',
model: '$_id.model',
prosTag: '$pros'
},
prosTagCount: { $sum: 1 }
}});
aggParams.push({$group: {
_id: {
company:'$_id.company',
model: '$_id.model'
},
tags: { $push: { tag: { $or: ['$_id.consTag', '$_id.prosTag'] }, cons: '$consTagCount', pros: '$prosTagCount'} }
}});
But I got the following result:
{
"_id": {
"company": "Lenovo",
"model": "T400"
},
"tags": [
{
"tag": false,
"pros": 7
}
]
}
What is the right way to do this with aggregation?
Yes, this is a bit harder because there are multiple arrays; if you unwind both at the same time you end up with a Cartesian product, where one array multiplies the contents of the other.
Therefore, just combine the array content at the beginning, which probably indicates how you should be storing the data in the first place:
Model.aggregate(
[
{ "$project": {
"company": 1,
"model": 1,
"data": {
"$setUnion": [
{ "$map": {
"input": "$pros",
"as": "pro",
"in": {
"type": { "$literal": "pro" },
"value": "$$pro"
}
}},
{ "$map": {
"input": "$cons",
"as": "con",
"in": {
"type": { "$literal": "con" },
"value": "$$con"
}
}}
]
}
}},
{ "$unwind": "$data" },
{ "$group": {
"_id": {
"company": "$company",
"model": "$model",
"tag": "$data.value"
},
"pros": {
"$sum": {
"$cond": [
{ "$eq": [ "$data.type", "pro" ] },
1,
0
]
}
},
"cons": {
"$sum": {
"$cond": [
{ "$eq": [ "$data.type", "con" ] },
1,
0
]
}
}
}}
],
function(err,result) {
// handle err, or use result here
}
)
So in the first $project stage, the $map operators add the "type" value to each item of each array. The $setUnion operator then combines the two arrays into a single array; its de-duplication does not really matter here, since every item should be unique anyway.
As mentioned earlier, you probably should be storing the data this way in the first place.
Then $unwind is followed by $group, in which "pros" and "cons" are each evaluated via $cond against the matching "type", returning 1 when it matches and 0 when it does not to the $sum accumulator.
This gives you a "logical match" to count each respective "type" within the aggregation operation as per the grouping keys specified.
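To make the tag-then-count idea concrete, here is a small in-memory version in plain JavaScript. The two sample reviews and the tag names are invented for illustration; the steps follow the pipeline above (tag each entry with its type, flatten, then count per company, model and tag).

```javascript
// Hypothetical documents following the question's schema.
const reviews = [
  { company: "Lenovo", model: "T400", pros: ["screen", "keyboard"], cons: ["battery"] },
  { company: "Lenovo", model: "T400", pros: ["keyboard"], cons: ["screen"] },
];

const tagCounts = new Map();
for (const { company, model, pros, cons } of reviews) {
  // Equivalent of the $map + $setUnion step: label every entry with its type.
  const data = [
    ...pros.map((value) => ({ type: "pro", value })),
    ...cons.map((value) => ({ type: "con", value })),
  ];
  // Equivalent of $unwind + $group with $cond: bump the matching counter.
  for (const { type, value } of data) {
    const key = JSON.stringify([company, model, value]);
    const entry = tagCounts.get(key) || { company, model, tag: value, pros: 0, cons: 0 };
    if (type === "pro") entry.pros += 1;
    else entry.cons += 1;
    tagCounts.set(key, entry);
  }
}

const tagSummary = [...tagCounts.values()];
console.log(tagSummary);
```

Because each entry carries its own type, the two arrays never multiply each other the way a double $unwind does.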