I don't really know how to frame the question, but what I have is the following schema in Mongoose:
new Schema({
  gatewayId: { type: String, index: true },
  timestamp: { type: Date, index: true },
  curr_property: Number,
  curr_property_cost: Number,
  day_property: Number,
  day_property_cost: Number,
  curr_solar_generating: Number,
  curr_solar_export: Number,
  day_solar_generated: Number,
  day_solar_export: Number,
  curr_chan1: Number,
  curr_chan2: Number,
  curr_chan3: Number,
  day_chan1: Number,
  day_chan2: Number,
  day_chan3: Number
}, {
  collection: 'owlelecmonitor'
});
and I want to be able to query all the documents in the collection, but with the data arranged inside an array in the following format:
[ [{
gatewayId: 1,
timestamp: time
....
},
{
gatewayId: 1,
timestamp: time2
....
}],
[{
gatewayId: 2,
timestamp: time
....
},
{
gatewayId: 2,
timestamp: time2
....
}],
[{
gatewayId: 3,
timestamp: time
....
},
{
gatewayId: 3,
timestamp: time2
....
}]
];
Is there a way I can do this in Mongoose instead of retrieving the documents and processing them again?
Yes, it's possible. Consider the following aggregation pipeline in the mongo shell. It uses a single pipeline consisting of just the $group stage, grouping all the documents by gatewayId and creating another array field that holds all the grouped documents. This extra field uses the accumulator operator $push on the system variable $$ROOT, which returns the root document, i.e. the top-level document currently being processed in the aggregation pipeline stage.
With the cursor returned by the aggregate() method, you can then use its map() method to create the desired final array. The following mongo shell demonstration illustrates the concept:
var result = db.owlelecmonitor.aggregate([
{
"$group": {
"_id": "$gatewayId",
"doc": {
"$push": "$$ROOT"
}
}
}
]).map(function (res){ return res.doc; });
printjson(result);
This will output the desired result to the shell.
To implement this in Mongoose, use the following aggregation pipeline builder:
OwlelecMonitorModel
.aggregate()
.group({
"_id": "$gatewayId",
"doc": {
"$push": "$$ROOT"
}
})
.exec(function (err, result) {
var res = result.map(function (r){return r.doc;});
console.log(res);
});
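Note that $push accumulates documents in the order they reach the $group stage, so if the documents inside each inner array should be in timestamp order, sort before grouping. A minimal sketch of that variant, assuming the same OwlelecMonitorModel as above:
OwlelecMonitorModel
  .aggregate()
  .sort({ gatewayId: 1, timestamp: 1 }) // order documents before they are pushed
  .group({
    "_id": "$gatewayId",
    "doc": { "$push": "$$ROOT" }
  })
  .exec(function (err, result) {
    var res = result.map(function (r) { return r.doc; });
    console.log(res);
  });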
Related
I have an endpoint that does an operation such as this:
const pipeline = [
{
$match: {
$and: [
{
$or: [...],
},
],
},
},
{
$group: {
_id : '$someProp',
anotherProp: { $push: '$$ROOT' },
},
},
{ $sort: { date: -1 } },
{ $limit: 10 },
]
const groupedDocs = await MyModel.aggregate(pipeline);
The idea here is that the returned documents look like this:
[
{
_id: 'some value',
anotherProp: [ /* ... array of documents where "someProp" === "some value" */ ],
},
{
_id: 'another value',
anotherProp: [ /* ... array of documents where "someProp" === "another value" */ ],
},
...
]
After getting these results, the endpoint responds with an array containing all the members of anotherProp, like this:
const response = groupedDocs.reduce((docs, group) => docs.concat(group.anotherProp), []);
res.status(200).json(response);
My problem is that the final documents in the response contain the _id field, but I want to rename that field to id. This question addresses this issue, and specifically this answer is what should work, but for some reason the transform function doesn't get invoked. To put it differently, I've tried doing this:
schema.set('toJSON', {
virtuals: true,
transform: function (doc, ret) {
console.log(`transforming toJSON for document ${doc._id}`);
delete ret._id;
},
});
schema.set('toObject', {
virtuals: true,
transform: function (doc, ret) {
console.log(`transforming toObject for document ${doc._id}`);
delete ret._id;
},
});
But the console.log statements are not executed, meaning that the transform function is not getting invoked. So I still get the _id in the response instead of id.
So my question is how can I get id instead of _id in this scenario?
It's worth mentioning that toJSON and toObject are invoked (the console.logs appear) in other places where I read properties from the documents. For example, if I do:
const doc = await MyModel.findById('someId');
const name = doc.name;
res.status(200).json(doc);
The response contains id instead of _id. It's almost like the transform function is invoked once I do anything with the documents, but if I pass the documents directly as they arrive from the database, neither toJSON nor toObject is invoked.
Thanks in advance for your insights. :)
The toJSON and toObject methods won't work here because they don't apply to documents produced by an aggregation pipeline. Mongoose doesn't convert aggregation results into Mongoose documents; it returns the raw objects produced by the pipeline operation. I ultimately achieved this by adding pipeline stages to first add an id field with the same value as the _id field, and then a second stage to remove the _id field. So essentially my pipeline became:
const pipeline = [
{
$match: {
$and: [
{
$or: [...],
},
],
},
},
// change the "_id" to "id"
{ $addFields: { id: '$_id' } },
{ $unset: ['_id'] },
{
$group: {
_id : '$someProp',
anotherProp: { $push: '$$ROOT' },
},
},
{ $sort: { date: -1 } },
{ $limit: 10 },
]
const groupedDocs = await MyModel.aggregate(pipeline);
It is possible to recast the raw objects into Mongoose documents after getting them back from aggregate(); you just need to hydrate them one by one. They will then trigger toJSON when returned.
const document = Model.hydrate(rawObject);
Answer found here:
Cast plain object to mongoose document
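Putting that together with the grouped results above, a minimal sketch (assuming the original pipeline without the $addFields/$unset stages, so the raw objects still carry _id and match the schema):
const groupedDocs = await MyModel.aggregate(pipeline);

// Hydrate each raw object so the schema's toJSON transform runs on serialization.
const response = groupedDocs.reduce(
  (docs, group) => docs.concat(group.anotherProp.map((raw) => MyModel.hydrate(raw))),
  []
);

res.status(200).json(response); // each document now serializes with "id" instead of "_id"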
I am using aggregates to query my schema for counts over date ranges. My problem is that I am not getting any response from the server (it times out every time); other Mongoose queries work fine (find, save, etc.). With aggregation it depends on the pipeline: when I only use $match I get a response, but when I add $unwind I don't get any.
Connection Code:
var promise = mongoose.connect('mongodb://<username>:<password>@<db>.mlab.com:<port>/<db-name>', {
  useMongoClient: true,
  replset: {
    ha: true, // Make sure the high availability checks are on
    haInterval: 5000 // Run every 5 seconds
  }
});
promise.then(function (db) {
  console.log('DB Connected');
}).catch(function (e) {
  console.log('DB Not Connected');
  console.error(e.message);
  process.exit(1);
});
Schema:
var ProspectSchema = new Schema({
contact_name: {
type: String,
required: true
},
company_name: {
type: String,
required: true
},
contact_info: {
type: Array,
required: true
},
description: {
type: String,
required: true
},
product: {
type: Schema.Types.ObjectId, ref: 'Product'
},
progression: {
type: String
},
creator: {
type: String
},
sales: {
type: Schema.Types.ObjectId,
ref: 'User'
},
technical_sales: {
type: Schema.Types.ObjectId,
ref: 'User'
},
actions: [{
type: {type: String},
description: {type: String},
date: {type: Date}
}],
sales_connect_id: {
type: String
},
date_created: {
type: Date,
default: Date.now
}
});
Aggregation code:
exports.getActionsIn = function(start_date, end_date) {
var start = new Date(start_date);
var end = new Date(end_date);
return Prospect.aggregate([
{
$match: {
// "actions": {
// $elemMatch: {
// "type": {
// "$exists": true
// }
// }
// }
"actions.date": {
$gte: start,
$lte: end
}
}
}
,{
$project: {
_id: 0,
actions: 1
}
}
,{
$unwind: "actions"
}
,{
$group: {
_id: "actions.date",
count: {
$sum: 1
}
}
}
// ,{
// $project: {
// _id: 0,
// date: {
// $dateToString: {
// format: "%d/%m/%Y",
// date: "actions.date"
// }
// }
// // ,
// // count : "$count"
// }
// }
]).exec();
}
Calling the Aggregation:
router.get('/test',function(req, res, next){
var start_date = req.query.start_date;
var end_date = req.query.end_date;
ProspectCont.getActionsIn(start_date,end_date).then(function(value, err){
if(err)console.log(err);
res.json(value);
});
})
My main problem is that I get no response at all. I could work with an error message, but the issue is that I am not getting any, so I don't know what is wrong.
Mongoose Version: 4.11.8
P.S. I have tried multiple variations of the aggregation pipeline, so this isn't my first attempt; I have an aggregation working on the main prospects schema, but not on the actions sub-document.
You have several problems here, mostly from missing concepts. Lazy readers can skip to the bottom for the full pipeline example; the main body here explains why things are done the way they are.
You are trying to select on a date range. The very first thing to check with any long-running operation is that you have a valid index. You might have one, or you might not, but you should issue (from the shell):
db.prospects.createIndex({ "actions.date": 1 })
Just to be sure. You really should also add this to the schema definition so you know the index will be deployed. So add to your defined schema:
ProspectSchema.index({ "actions.date": 1 })
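To confirm the index is actually used, ask the server for the query plan. A quick check from the shell (explain is a standard aggregate option, though the exact output shape varies by server version):
db.prospects.aggregate(
  [ { "$match": { "actions.date": { "$gte": start, "$lte": end } } } ],
  { "explain": true }
)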
When querying with a "range" on elements of an array, you need to understand that those are "multiple conditions" which you are expecting to match elements "between". Whilst you can generally get away with querying a "single property" of an array using "Dot Notation", applying $gte and $lte this way is like specifying the property several times with an explicit $and.
Whenever you have such "multiple conditions" you always mean to use $elemMatch. Without it, you are simply testing every value in the array to see if it is greater than or less than ( being some may be greater and some may be lesser ). The $elemMatch operator makes sure that "both" are applied to the same "element", and not just all array values as "Dot notation" exposes them:
{ "$match": {
"actions": {
"$elemMatch": { "date": { "$gte": start, "$lte: end } }
}
}}
That will now only match documents where the "array elements" fall between the specified dates. Without it, you are selecting and processing a lot more data that is irrelevant to the selection.
Array Filtering: Marked in bold because its prominence cannot be ignored. Any initial $match works just like any "query", in that its "job" is to "select documents" valid to the expression. This however does not have any effect on the contents of the array in the documents returned.
Whenever you have such a condition for document selection, you nearly always intend to "filter" such content from the array itself. This is a separate process, and really should be performed before any other operations that work with the content, especially $unwind.
So you really should add a $filter in either an $addFields or $project, as appropriate to your intent, "immediately" following any document selection:
{ "$project": {
"_id": 0,
"actions": {
"$filter": {
"input": "$actions",
"as": "a",
"in": {
"$and": [
{ "$gte": [ "$$a.date", start ] },
{ "$lte": [ "$$a.date", end ] }
]
}
}
}
}}
Now the array content, which you already know "must" contain at least one valid item due to the initial query conditions, is "reduced" down to only those entries that actually match the date range you want. This removes a lot of overhead from later processing.
Note the different "logical variants" of $gte and $lte in use within the $filter condition. These are aggregation expressions that evaluate to a boolean, which is what the $filter condition requires.
Grouping: It's probably just an attempt at getting a result, but the code you have does not really do anything with the dates in question. Since typical date values are recorded with millisecond precision, you generally want to reduce them.
The commented-out code suggests using $dateToString within a $project. It is strongly recommended that you do not do that; if you intend such a reduction, then supply that expression directly as the grouping key within $group instead:
{ "$group": {
"_id": {
"$dateToString": {
"format": "%Y-%m-%d",
"date": "$actions.date"
}
},
"count": { "$sum": 1 }
}}
I personally don't like returning a "string" when a natural Date object serializes properly for me already. So I like to use the "math" approach to "round" dates instead:
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$actions.date", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$actions.date", new Date(0) ] },
1000 * 60 * 60 * 24
]}
],
new Date(0)
]
},
"count": { "$sum": 1 }
}}
That returns a valid Date object "rounded" down to the start of the current day. Mileage may vary on preferred approaches, but it's the one I like, and it takes the fewest bytes to transfer.
The usage of new Date(0) represents the "epoch date". When you $subtract one BSON Date from another, you end up with the millisecond difference between the two as an integer. When you $add an integer to a BSON Date, you get a new BSON Date representing the sum of that date and the integer in milliseconds. This is the basis of converting to numeric, rounding down to the nearest start of day, and then converting the numeric value back to a Date.
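The same arithmetic is easy to sanity-check in plain JavaScript (an illustration only; the sample date is made up):
var msPerDay = 1000 * 60 * 60 * 24;
var ts = new Date("2017-08-15T13:45:00Z").getTime(); // milliseconds since epoch
var rounded = new Date(ts - (ts % msPerDay));        // strip the part-day remainder
console.log(rounded.toISOString());                  // "2017-08-15T00:00:00.000Z"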
By making that statement directly within the $group rather than in a separate $project, you avoid what effectively gets interpreted as "go through all the data and return this calculated value, then go through it all again and accumulate". Much the same as working through a pile of objects, marking them with a pen first and then actually counting them as a separate step.
As a single pipeline stage it saves considerable resources, since you do the accumulation at the same time as calculating the value to accumulate on. When you think it through, much like the provided analogy, it just makes a lot of sense.
As a full pipeline example you would put the above together as:
Prospect.aggregate([
{ "$match": {
"actions": {
"$elemMatch": { "date": { "$gte": start, "$lte: end } }
}
}},
{ "$project": {
"_id": 0,
"actions": {
"$filter": {
"input": "$actions",
"as": "a",
"in": {
"$and": [
{ "$gte": [ "$$a.date", start ] },
{ "$lte": [ "$$a.date", end ] }
]
}
}
}
}},
{ "$unwind": "$actions" },
{ "$group": {
"_id": {
"$dateToString": {
"format": "%Y-%m-%d",
"date": "$actions.date"
}
},
"count": { "$sum": 1 }
}}
])
And honestly, if after making sure an index is in place and following that pipeline you still have timeout problems, then reduce the date selection until you get a reasonable response time.
If it's still taking too long ( or the date reduction is not reasonable ) then your hardware simply is not up to the task. If you really have a lot of data then you have to be reasonable with expectations. So scale up or scale out, but those things are outside the scope of any question here.
As it stands, those improvements should make a significant difference over any attempt shown so far, mostly due to a few fundamental concepts that were being missed.
I'm looking for a way to use Mongoose's aggregation pipeline to sum data. I have a schema that looks like this:
var Object = new Schema({
  Object2: { type: ObjectId, ref: 'Object2' },
  Object3: { type: ObjectId, ref: 'Object3' },
  value: {},
  unit: String
})
The 'value' field will typically be a number, but I allow the user to include text in the field that transforms the number. Thus, value is stored as a string, and I convert it into a number. I can't just store it as a number because there would be no way to convert it back for the user's display.
I've been trying to perform a Mongoose $sum pipeline to sum these values, while still allowing me to convert the number. Here is what I've tried:
data.aggregate([
{ $match:
{
$and: [{Object2: mongoose.Types.ObjectId(id)}, {Object3: {$in: sessions.map(function(session){ return new mongoose.Types.ObjectId(session._id); })}}]
}
},
{ $project:
{
'value': calculateNumber('value')
}
},
{ $group:
{
_id: null,
value: { $sum: "$value"},
unit: { $first: "$unit"}
}
}
], function(err, result) {
sendCallback(err, result, callback);
return;
})
However, I get a value of 0. Leaving out the $project stage also gives a value of 0, and I can't figure out whether there is any way to use the $project stage to apply my function and convert the fields to numbers so that I can take advantage of $sum.
Any help, without just declaring value as a Number in my schema, would be greatly appreciated!
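For what it's worth, an aggregation pipeline cannot invoke a JavaScript helper such as calculateNumber(); stages only evaluate aggregation expressions on the server. On MongoDB 4.0+ a purely numeric string can be coerced server-side with $convert (a hedged sketch, not a substitute for custom parsing; strings carrying extra text would fall through to onError):
{ $project: {
    unit: 1,
    value: { $convert: { input: "$value", to: "double", onError: 0 } }
}},
{ $group: {
    _id: null,
    value: { $sum: "$value" },
    unit: { $first: "$unit" }
}}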
I'm using mongoose to deal with my database.
I have the following model:
var DeviceSchema = mongoose.Schema({
type: Number,
pushId: String
});
The type attribute can be either 0 or 1.
I want to execute a query that grabs all documents and returns the result in the following format:
{
fstType: [
{
_id: "545d533e2c21b900000ad234",
type: 0,
pushId: "123"
},
{
_id: "545d533e2c21b900000ad235",
type: 0,
pushId: "124"
},
],
sndType: [
{
_id: "545d533e2c21b900000ad236",
type: 1,
pushId: "125"
},
{
_id: "545d533e2c21b900000ad237",
type: 1,
pushId: "126"
},
]
}
Is that possible? I want to do that in one single query.
Thanks.
Is that possible? I want to do that in one single query.
Yes, it is possible. You can achieve the desired result through the following aggregation pipeline operations:
1. Sort by the type parameter in ascending order.
2. Group records having the same type together, constructing an array of documents for each group. After this stage only two records will be present, each with an attribute called items holding the array of records for its group. Since the records are sorted by type, the first group will contain the records with type 0 and the second those with type 1.
3. Finally, merge the groups and give each one a name based on its type.
var model = mongoose.model('collection',DeviceSchema);
model.aggregate([
{$sort:{"type":-1}},
{$group:{"_id":"$type",
"items":{$push:"$$ROOT"},
"type":{$first:"$type"}}},
{$project:{"items":{$cond:[{$eq:["$type",0]},
{"firstType":"$items"},
{"secondType":"$items"}]}}},
{$group:{"_id":null,
"firstType":{$first:"$items.firstType"},
"secondType":{$last:"$items.secondType"}}},
{$project:{"_id":0,"firstType":1,"secondType":1}}
], function (err, result) {
if (err) {
console.log(err);
return;
}
console.log(result);
});
Output:
{ firstType:
[ { _id: '545d533e2c21b900000ad234', type: 0, pushId: '123' },
{ _id: '545d533e2c21b900000ad235', type: 0, pushId: '124' } ],
secondType:
[ { _id: '545d533e2c21b900000ad236', type: 1, pushId: '125' },
{ _id: '545d533e2c21b900000ad237', type: 1, pushId: '126' } ] }
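On MongoDB 3.4+, the same split can be expressed more directly with a single $facet stage (an alternative sketch, not the approach above; each facet runs its own sub-pipeline over the input documents):
model.aggregate([
  {$facet: {
    "firstType": [ {$match: {"type": 0}} ],
    "secondType": [ {$match: {"type": 1}} ]
  }}
], function (err, result) {
  if (err) { console.log(err); return; }
  console.log(result[0]); // { firstType: [...], secondType: [...] }
});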
I'm having a lot of difficulty solving this MongoDB (Mongoose) problem.
There is a schema 'Recommend' (username, roomId, ll and date) whose collection contains users' recommendations.
I need to get a list of the most recommended rooms (by roomId). Below are the schema and my attempted solution with a Mongoose query.
var recommendSchema = mongoose.Schema({
username: String,
roomId: String,
ll: { type: { type: String }, coordinates: [ ] },
date: Date
})
recommendSchema.index({ ll: '2dsphere' });
var Recommend = mongoose.model('Recommend', recommendSchema);
Recommend.aggregate(
{
$group:
{
_id: '$roomId',
recommendCount: { $sum: 1 }
}
},
function (err, res) {
if (err) return handleError(err);
var resultSet = res.sort({'recommendCount': 'desc'});
}
);
The results returned from the aggregation pipeline are just plain objects. So you do the sorting as a pipeline stage, not as a separate operation:
Recommend.aggregate(
[
// Grouping pipeline
{ "$group": {
"_id": '$roomId',
"recommendCount": { "$sum": 1 }
}},
// Sorting pipeline
{ "$sort": { "recommendCount": -1 } },
// Optionally limit results
{ "$limit": 5 }
],
function(err,result) {
// Result is an array of documents
}
);
So there are various pipeline operators that can be used to $group, $sort, $limit, and do other things as well. These can appear in any order, and as many times as required. The key is understanding that each "pipeline" stage flows its results into the next stage to act on.
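The same pipeline can also be written with Mongoose's chainable aggregate builder, which some find more readable (a minimal sketch using the same Recommend model):
Recommend.aggregate()
  .group({ "_id": "$roomId", "recommendCount": { "$sum": 1 } })
  .sort({ "recommendCount": -1 })
  .limit(5)
  .exec(function (err, result) {
    if (err) return handleError(err);
    // result is already sorted, most recommended room first
    console.log(result);
  });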