I am using aggregate in Node.js as follows:
collection.aggregate(
{
$group : {
_id : "$id_page",
"count" : {$sum : 1}
}
},
{$sort : {"count" : -1}},
{$limit : 1}
).limit(1).toArray(function (err, r) { ................. })
This runs correctly, but I am getting this result:
{ id: '346593403645', _id: 57a868497e07fcf75f27009c, __v: 0 }
Because of the _id key, I cannot use the object as it is.
Is it possible to use aggregate in such a way that it does not return the _id key?
Use a $project stage to choose which fields are returned; setting _id to 0 excludes it:
collection.aggregate(
{
$group : {
_id : "$id_page",
"count" : {$sum : 1}
}
},
{$sort : {"count" : -1}},
{$limit : 1} ,
{$project:{count:1,_id:0}}
)
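To see what each stage contributes, the pipeline can be imitated on a plain array. This is only an in-memory sketch and makes no MongoDB call: the sample `docs` are made up, and `topPageCount` is a hypothetical helper name.

```javascript
// In-memory imitation of $group -> $sort -> $limit -> $project.
// `docs` is made-up sample data; `topPageCount` is a hypothetical helper.
const docs = [
  { id_page: "a" },
  { id_page: "b" },
  { id_page: "a" },
];

function topPageCount(input) {
  // $group: { _id: "$id_page", count: { $sum: 1 } }
  const counts = new Map();
  for (const doc of input) {
    counts.set(doc.id_page, (counts.get(doc.id_page) || 0) + 1);
  }
  const grouped = [...counts].map(([_id, count]) => ({ _id, count }));

  // $sort: { count: -1 } followed by $limit: 1
  const top = grouped.sort((x, y) => y.count - x.count).slice(0, 1);

  // $project: { count: 1, _id: 0 } keeps only `count`
  return top.map(({ count }) => ({ count }));
}

console.log(topPageCount(docs)); // [ { count: 2 } ]
```

Note that with `_id: 0` the page id itself is gone from the output; if it is still needed, it would have to be projected under another name.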
I'm practicing MongoDB aggregation, but my aggregations seem to take a really long time to run.
The problem seems to happen whenever I use $group. All other queries run just fine.
I have some 1.3 million dummy documents on which I need to perform two basic operations: get a count of the IP addresses and a count of the unique IP addresses.
My schema looks something like this:
{
"_id":"5da51af103eb566faee6b8b4",
"ip_address":"...",
"country":"CL",
"browser":{
"user_agent":"...",
}
}
Running a basic $group query takes about 12s on average, which is much too slow.
I did a little research, and someone suggested creating an index on ip_address. That seems to have slowed it down: queries now take 13-15s.
I use MongoDB and the query I'm running looks like this:
visitorsModel.aggregate([
{
'$group': {
'_id': '$ip_address',
'count': {
'$sum': 1
}
}
}
]).allowDiskUse(true)
.exec(function (err, docs) {
if (err) throw err;
return res.send({
uniqueCount: docs.length
})
})
Any help is appreciated.
Edit: I forgot to mention, someone suggested it might be a hardware issue? I'm running the query on a core i5, 8GB RAM laptop if it helps.
Edit 2: The query plan:
{
"stages" : [
{
"$cursor" : {
"query" : {
},
"fields" : {
"ip_address" : 1,
"_id" : 0
},
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "metrics.visitors",
"indexFilterSet" : false,
"parsedQuery" : {
},
"winningPlan" : {
"stage" : "COLLSCAN",
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 1387324,
"executionTimeMillis" : 7671,
"totalKeysExamined" : 0,
"totalDocsExamined" : 1387324,
"executionStages" : {
"stage" : "COLLSCAN",
"nReturned" : 1387324,
"executionTimeMillisEstimate" : 9,
"works" : 1387326,
"advanced" : 1387324,
"needTime" : 1,
"needYield" : 0,
"saveState" : 10930,
"restoreState" : 10930,
"isEOF" : 1,
"invalidates" : 0,
"direction" : "forward",
"docsExamined" : 1387324
}
}
}
},
{
"$group" : {
"_id" : "$ip_address",
"count" : {
"$sum" : {
"$const" : 1
}
}
}
}
],
"ok" : 1
}
Here is some information about the $group aggregation stage: whether it uses indexes, what its limitations are, and what can be tried to work around them.
1. The $group Stage Doesn't Use an Index:
Mongodb Aggregation: Does $group use index?
2. $group Operator and Memory:
The $group stage has a limit of 100 megabytes of RAM. By default, if
the stage exceeds this limit, $group returns an error. To allow for
the handling of large datasets, set the allowDiskUse option to true.
This flag enables $group operations to write to temporary files.
See the MongoDB docs on $group Operator and Memory
3. An Example Using $group and Count:
A collection called cities:
{ "_id" : 1, "city" : "Bangalore", "country" : "India" }
{ "_id" : 2, "city" : "New York", "country" : "United States" }
{ "_id" : 3, "city" : "Canberra", "country" : "Australia" }
{ "_id" : 4, "city" : "Hyderabad", "country" : "India" }
{ "_id" : 5, "city" : "Chicago", "country" : "United States" }
{ "_id" : 6, "city" : "Amritsar", "country" : "India" }
{ "_id" : 7, "city" : "Ankara", "country" : "Turkey" }
{ "_id" : 8, "city" : "Sydney", "country" : "Australia" }
{ "_id" : 9, "city" : "Srinagar", "country" : "India" }
{ "_id" : 10, "city" : "San Francisco", "country" : "United States" }
Query the collection to count the cities by each country:
db.cities.aggregate( [
{ $group: { _id: "$country", cityCount: { $sum: 1 } } },
{ $project: { country: "$_id", _id: 0, cityCount: 1 } }
] )
The Result:
{ "cityCount" : 3, "country" : "United States" }
{ "cityCount" : 1, "country" : "Turkey" }
{ "cityCount" : 2, "country" : "Australia" }
{ "cityCount" : 4, "country" : "India" }
4. Using allowDiskUse Option:
db.cities.aggregate( [
{ $group: { _id: "$country", cityCount: { $sum: 1 } } },
{ $project: { country: "$_id", _id: 0, cityCount: 1 } }
], { allowDiskUse : true } )
Note that in this case it makes no difference to query performance or output; it only shows the usage.
5. Some Options to Try (suggestions):
You can try a few things to get some result (for trial purposes only):
Use a $limit stage to restrict the number of documents processed and
see what the result is. For example, you can try { $limit: 1000 }.
Note that this stage needs to come before the $group stage.
You can also use $match and $project stages before the $group
stage to control the shape and size of the input. This may
return a result (instead of an error).
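As a sketch of that stage ordering, the helper below only builds the pipeline array; it does not talk to a database. `buildTrialPipeline` and its parameters are hypothetical names, while the field names (`country`, `ip_address`) come from the schema in the question. The resulting array is what you would pass to aggregate().

```javascript
// Hypothetical helper: builds a trial pipeline that shrinks the input
// before it reaches $group. It only constructs the array of stages.
function buildTrialPipeline(country, sampleSize) {
  return [
    // $match first, so fewer documents flow into the later stages
    { $match: { country: country } },
    // keep only the field that $group needs
    { $project: { _id: 0, ip_address: 1 } },
    // cap the number of documents for a trial run
    { $limit: sampleSize },
    { $group: { _id: "$ip_address", count: { $sum: 1 } } },
  ];
}

const pipeline = buildTrialPipeline("CL", 1000);
console.log(pipeline.map((stage) => Object.keys(stage)[0]).join(" -> "));
// $match -> $project -> $limit -> $group
```

Keep in mind that a pipeline limited this way counts only a sample, so it is useful for checking that the query runs, not for the final numbers.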
[EDIT ADD]
Notes on Distinct and Count:
Using the same cities collection: to get the unique countries and a count of them, you can combine the $group stage with the $count stage, as in the following queries.
Distinct:
db.cities.aggregate( [
{ $match: { country: { $exists: true } } },
{ $group: { _id: "$country" } },
{ $project: { country: "$_id", _id: 0 } }
] )
The Result:
{ "country" : "United States" }
{ "country" : "Turkey" }
{ "country" : "India" }
{ "country" : "Australia" }
To get the above result as a single document with an array of unique values, use the $addToSet operator:
db.cities.aggregate( [
{ $match: { country: { $exists: true } } },
{ $group: { _id: null, uniqueCountries: { $addToSet: "$country" } } },
{ $project: { _id: 0 } },
] )
The Result: { "uniqueCountries" : [ "United States", "Turkey", "India", "Australia" ] }
Count:
db.cities.aggregate( [
{ $match: { country: { $exists: true } } },
{ $group: { _id: "$country" } },
{ $project: { country: "$_id", _id: 0 } },
{ $count: "uniqueCountryCount" }
] )
The Result: { "uniqueCountryCount" : 4 }
In the above queries the $match stage filters out documents that have no country field (note that { $exists: true } still matches documents where country is explicitly null). The $project stage reshapes the result document(s).
MongoDB Query Language:
Note that you get similar results with the MongoDB query language commands db.cities.distinct("country") and db.cities.distinct("country").length (distinct returns an array).
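The expected count can be checked on the sample data without a database. The sketch below imitates the distinct lookup with a `Set`; the `cities` array repeats the documents listed earlier, and `distinctCountries` is a hypothetical helper name, not a MongoDB API.

```javascript
// Same ten documents as the cities collection above.
const cities = [
  { _id: 1, city: "Bangalore", country: "India" },
  { _id: 2, city: "New York", country: "United States" },
  { _id: 3, city: "Canberra", country: "Australia" },
  { _id: 4, city: "Hyderabad", country: "India" },
  { _id: 5, city: "Chicago", country: "United States" },
  { _id: 6, city: "Amritsar", country: "India" },
  { _id: 7, city: "Ankara", country: "Turkey" },
  { _id: 8, city: "Sydney", country: "Australia" },
  { _id: 9, city: "Srinagar", country: "India" },
  { _id: 10, city: "San Francisco", country: "United States" },
];

// Hypothetical helper: unique country values, skipping documents
// that have no `country` field at all.
function distinctCountries(docs) {
  const present = docs.filter((doc) => "country" in doc);
  return [...new Set(present.map((doc) => doc.country))];
}

console.log(distinctCountries(cities).length); // 4
```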
You can create an index:
db.collectionname.createIndex( { ip_address: "text" } )
Try this; it should be faster.
I think it will help you.
I have a simple schema like this:
{
"productName": "pppppp",
"sku" : {
"carted" : [
{
"_id" : ObjectId("56c6d606c0987668109a21f7"),
"timestamp" : ISODate("2016-02-19T08:44:54.043+0000"),
"cartId" : "56c6c1fd60c4491c157e433d",
"qty" : NumberInt(2)
},
{
"_id" : ObjectId("56c6d653172fb54817ec2356"),
"timestamp" : ISODate("2016-02-19T08:46:11.902+0000"),
"cartId" : "56c6c1fd60c4491c157e433d",
"qty" : NumberInt(2)
},
{
"_id" : ObjectId("56c6d6a7172fb54817ec2358"),
"timestamp" : ISODate("2016-02-19T08:47:35.652+0000"),
"cartId" : "56c6c1fd60c4491c157e433d",
"qty" : NumberInt(2)
}
],
"qty" : NumberInt(14)
}
}
How can I view the product "pppppp" with a quantity of 20, that is, sku.qty plus the sum of all the sku.carted.qty values?
I want it to look like this:
{
"productName": "pppppp",
"qty" : 20
}
Please try this one with $unwind, $group, $sum and $add:
> db.collection.aggregate([
{$unwind: '$sku.carted'},
// sum the `qty` in the carted array, put this result to `qt`
{$group: {
_id: {productName: '$productName', q: '$sku.qty'},
qt: {$sum: '$sku.carted.qty'}
}},
// add the `qt` and `sku.qty`
// and reshape the output result.
{$project: {
_id: 0,
productName: '$_id.productName',
qty: {$add: ['$_id.q', '$qt']}
}}
]);
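The arithmetic behind that pipeline can be checked in plain JavaScript. This is an in-memory sketch only: the document is the one from the question with the ObjectId and ISODate wrappers dropped, and `totalQty` is a hypothetical helper name.

```javascript
// Document from the question, reduced to the fields the sum needs.
const product = {
  productName: "pppppp",
  sku: {
    carted: [{ qty: 2 }, { qty: 2 }, { qty: 2 }],
    qty: 14,
  },
};

function totalQty(doc) {
  // $unwind + $group with $sum: add up every carted qty
  const cartedQty = doc.sku.carted.reduce((sum, item) => sum + item.qty, 0);
  // $project with $add: stock qty plus carted qty
  return { productName: doc.productName, qty: doc.sku.qty + cartedQty };
}

console.log(totalQty(product)); // { productName: 'pppppp', qty: 20 }
```

One difference worth knowing: by default the real $unwind drops documents whose sku.carted array is empty or missing, whereas this helper would simply return the stock quantity unchanged.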
I am new to Mongoose, and I am facing a problem while trying to fetch some data using an aggregate query.
One part of my auction schema is:
"_id" : ObjectId("56c58be1faaa402c0d4ae66f"),
"auction_name" : "Auction2",
"auction_start_datetime" : ISODate("2016-02-18T09:30:00.000Z"),
"auction_end_datetime" : ISODate("2016-02-22T09:00:00.000Z"),
"auction_status" : "active",
"auction_series" : "GA-06-C",
"auction_reserve_price" : 1000,
"auction_increment_amount" : 200,
"fancy_numbers" : [
{
"number_end_datetime" : ISODate("2016-02-22T09:00:00.000Z"),
"number_start_datetime" : ISODate("2016-02-18T09:30:00.000Z"),
"increment_amount" : 200,
"reserve_price" : 1000,
"number" : 5000,
"_id" : ObjectId("56c58da3faaa402c0d4ae739"),
"bid_users" : [
{
"user_id" : "56c416a599ad7c9c1611b90b",
"bid_amount" : 7200,
"bid_time" : ISODate("2016-02-18T11:58:53.025Z"),
"user_name" : "amit#mailinator.com",
"_id" : ObjectId("56c5aec4acebf3b4061a645e")
},
{
"user_id" : "56c172dc302a2c90179c7fd1",
"bid_amount" : 15400,
"bid_time" : ISODate("2016-02-19T10:38:43.506Z"),
"user_name" : "rbidder#mailinator.com",
"_id" : ObjectId("56c5afe0d2baef7020ede1b6")
},
{
"user_id" : "56c477afb27a7ed824c54427",
"bid_amount" : 2800,
"bid_time" : ISODate("2016-02-18T11:56:58.830Z"),
"user_name" : "bidder2#mailinator.com",
"_id" : ObjectId("56c5b18a78c3fb340a8c6d75")
},
{
"user_id" : "56c5b17378c3fb340a8c6d73",
"bid_amount" : 5600,
"bid_time" : ISODate("2016-02-18T11:58:34.616Z"),
"user_name" : "bidder3#mailinator.com",
"_id" : ObjectId("56c5b1d778c3fb340a8c6d78")
}
]
}
]
Here, fancy_numbers is an array in the auction collection and bid_users is an array under each fancy_numbers element.
I have the user_id, and I want to query and get only the bid_users records in which that user is the highest bidder.
For example:
Suppose 3 users bid 200, 300 and 400 respectively. I want to get the
record (i.e. number and amount) only if this particular user's bid is 400
(the highest). I will be passing in the user_id.
The aggregate query which I wrote is:
var ObjectID = require('mongodb').ObjectID;
tempId = new ObjectID(req.body.aId);
auctionModel.aggregate({$match: {'_id': tempId}},
{$unwind: '$fancy_numbers'},
{$unwind:"$fancy_numbers.bid_users"},
{$group: {'_id':"$fancy_numbers.number" , maxBid: { $max: '$fancy_numbers.bid_users.bid_amount'}}},
function(err, bidData){
if(err){
console.log('Error :' + err);
}
else if(bidData) {
console.log(bidData);
}
});
Somehow this query is not working: it only gives the max bid and the number. I want records only if the user is the highest bidder.
If I understand you correctly, please try $sort and $limit to retrieve the highest bidder, as below:
auctionModel.aggregate([
{$match: {'_id': '123'}},
{$unwind: '$fancy_numbers'},
{$unwind: '$fancy_numbers.bid_users'},
// sort descending on the nested bid amount so the highest bid comes first
{$sort: {'fancy_numbers.bid_users.bid_amount': -1}},
{$limit: 1}]);
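Sorting and limiting returns the single highest bid, while the question asks for the numbers on which one given user holds the top bid. The plain JavaScript sketch below shows that logic on the question's sample bids; it is an in-memory illustration rather than an aggregation pipeline, and `numbersWonBy` is a hypothetical helper name.

```javascript
// Bids copied from the question's sample document (only the fields used here).
const auction = {
  fancy_numbers: [
    {
      number: 5000,
      bid_users: [
        { user_id: "56c416a599ad7c9c1611b90b", bid_amount: 7200 },
        { user_id: "56c172dc302a2c90179c7fd1", bid_amount: 15400 },
        { user_id: "56c477afb27a7ed824c54427", bid_amount: 2800 },
        { user_id: "56c5b17378c3fb340a8c6d73", bid_amount: 5600 },
      ],
    },
  ],
};

// Hypothetical helper: numbers on which `userId` currently holds the top bid.
function numbersWonBy(doc, userId) {
  return doc.fancy_numbers
    .map((entry) => {
      // sort a copy descending by amount; the first element is the top bid
      const top = [...entry.bid_users].sort(
        (a, b) => b.bid_amount - a.bid_amount
      )[0];
      return { number: entry.number, top };
    })
    // keep a number only when its top bid belongs to the given user
    .filter(({ top }) => top !== undefined && top.user_id === userId)
    .map(({ number, top }) => ({ number, bid_amount: top.bid_amount }));
}

console.log(numbersWonBy(auction, "56c172dc302a2c90179c7fd1"));
// [ { number: 5000, bid_amount: 15400 } ]
console.log(numbersWonBy(auction, "56c416a599ad7c9c1611b90b"));
// []
```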
I have a collection named votes:
{
"_id" : ObjectId("54a3cb59b2b8ded51693074d"),
"Pseudo" : "Cacaboy",
"Type" : "down",
"postvote" : ObjectId("54a2f05bedbe1109145b06b6"),
"CreatedDate" : ISODate("2014-12-31T10:02:34.209Z"),
"__v" : 0
}
{
"_id" : ObjectId("54a3d776ecbf63c61a91d396"),
"Pseudo" : "CosmicJB",
"Type" : "up",
"postvote" : ObjectId("54a2f05bedbe1109145b06b6"),
"CreatedDate" : ISODate("2014-12-31T11:01:10.715Z"),
"__v" : 0
}
{
"_id" : ObjectId("54a3dca5b2b8ded51693074e"),
"Pseudo" : "hateman",
"Type" : "down",
"postvote" : ObjectId("54a2f05bedbe1109145b06b6"),
"CreatedDate" : ISODate("2014-12-31T10:02:34.209Z"),
"__v" : 0
}
Implemented Aggregation pipeline:
Vote.aggregate({$match: {postvote: pvote}},
{$group: {_id: '$Type',
n: { $sum: 1 }
}},
function(err, cb){
console.log(cb);
});
Obtained output:
[ { _id: 'up', n: 1 }, { _id: 'down', n: 2 } ]
Desired result for a postvote:
If both up and down votes are present, then result: up - down.
If only down votes are present, then result: -down.
If only up votes are present, then result: up.
Is this possible using aggregation?
You need to modify your aggregation pipeline to perform the following operations:
Match the records with the desired postvote id(s).
For each record, project an extra field named weight: for records of Type up, the
weight is 1; for all others, -1.
Group on the postvote field and sum the weight field to get the
result for each postvote.
Code:
Vote.aggregate(
{$match:{"postvote":pvote}},
{$project:{"postvote":1,"weight":{$cond:[{$eq:["$Type","up"]},1,-1]}}},
{$group:{"_id":"$postvote","result":{$sum:"$weight"}}},
function(err,data){
// handle response.
}
)
Sample output:
{ "_id" : ObjectId("54a2f05bedbe1109145b06b6"), "result" : -1 }
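The weighting can be verified by hand on the three sample votes from the question. The sketch below is plain JavaScript with no database involved, and `voteResult` is a hypothetical helper name.

```javascript
// The three sample votes from the question (only the fields used here).
const votes = [
  { Pseudo: "Cacaboy", Type: "down" },
  { Pseudo: "CosmicJB", Type: "up" },
  { Pseudo: "hateman", Type: "down" },
];

function voteResult(docs) {
  // $project with $cond: weight is 1 for "up", -1 otherwise
  // $group with $sum: add the weights together
  return docs.reduce((sum, vote) => sum + (vote.Type === "up" ? 1 : -1), 0);
}

console.log(voteResult(votes)); // -1
```

One up vote and two down votes give 1 - 2 = -1, which matches the sample output.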
I want to retrieve the account array if the query finds the role "Elite".
I tried:
db.users.aggregate(
{ $match : { "account.role" : "Elite" } }
);
But I get the whole object back:
{
"_id" : ObjectId("7623902143981943"),
"account" : [
{
"role" : "Elite",
"action" : [
"create",
"read",
"update",
"delete"
],
"extra" : {
account:[1,2,3,4]
}
},
{
"role" : "User",
"action" : [
"create",
"read",
"update",
"delete"
],
"extra" : {
account:[10]
}
}
],
}
Can I retrieve only the extra array ( account:[1,2,3,4] ) when the query matches, or do I have to parse the received object?
(The schema is very simplified, but I have many roles.)
You must use $unwind, $match and $project:
// The order of $unwind and $match matters
db.users.aggregate(
{$unwind: "$account"},
{$match : { "account.role" : "Elite" }},
{$project : { "account.extra.account" : 1}}
);
Explanation
$unwind splits the array into separate documents, one per element. See the effect of:
db.users.aggregate({$unwind: "$account"})
Then you match the elements with {"account.role": "Elite"}. See the effect of:
db.users.aggregate(
{$unwind: "$account"},
{$match : { "account.role" : "Elite" }}
);
Finally, you project just the desired fields. After $unwind the nested array lives at account.extra.account:
db.users.aggregate(
{$unwind: "$account"},
{$match : { "account.role" : "Elite" }},
{$project : { "account.extra.account" : 1}}
);
// You can also remove the _id field (included by default) with:
db.users.aggregate(
{$unwind: "$account"},
{$match : { "account.role" : "Elite" }},
{$project : { _id: 0, "account.extra.account" : 1}}
);
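To make the shape of the documents at each step concrete, here is an in-memory imitation of the three stages in plain JavaScript. It is a sketch only: the sample `user` is a trimmed copy of the document from the question, and `eliteExtras` is a hypothetical helper name. Note that after the unwind step the matched element sits under the `account` key.

```javascript
// Trimmed copy of the question's document (the `action` arrays are omitted).
const user = {
  _id: "7623902143981943",
  account: [
    { role: "Elite", extra: { account: [1, 2, 3, 4] } },
    { role: "User", extra: { account: [10] } },
  ],
};

function eliteExtras(docs) {
  return docs
    // $unwind: one output document per element of `account`
    .flatMap((doc) => doc.account.map((account) => ({ _id: doc._id, account })))
    // $match: keep only the "Elite" element
    .filter((doc) => doc.account.role === "Elite")
    // $project with _id: 0: keep only the nested array
    .map((doc) => ({ account: { extra: { account: doc.account.extra.account } } }));
}

console.log(JSON.stringify(eliteExtras([user])));
// [{"account":{"extra":{"account":[1,2,3,4]}}}]
```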
OLD ANSWER
You must use projection:
db.users.aggregate(
{$match : { "account.role" : "Elite" }},
{$project : { "account.extra.account" : 1}}
);
Besides, if you are just matching documents, there's no need to use the aggregation framework and you can just use:
// No projection here
db.users.find({"account.role" : "Elite"})
or
// Only returns the _id field + the "account.extra.account" field if it exists. By default the _id field is included
db.users.find({"account.role" : "Elite"}, { "account.extra.account" : 1})
// Only returns the "account.extra.account" field if it exists
db.users.find({"account.role" : "Elite"}, { _id: 0, "account.extra.account" : 1})
Mongodb documentation can be found here and here