How to aggregate and group in mongoose - node.js

I have a lot of accounts, each with an employee assigned. I want to find the number of accounts for each employee. How do I do this using the aggregate function of Mongoose (MongoDB)? I am familiar with the other Mongoose functions and can achieve it with the following code:
exports.accountsOfEachEmployee = function(req, res) {
  Account.find({active: true}).exec(function(err, accounts) {
    if (err || !accounts) {
      // return so we do not continue and send a second response
      return res.status(400).send({
        message: 'could not retrieve accounts from database'
      });
    }
    var accountsOfEachEmployee = {};
    for (var i = 0; i < accounts.length; i++) {
      var employee = accounts[i].employee;
      // increment the running count, or start at 1 for a new employee
      if (accountsOfEachEmployee[employee]) {
        accountsOfEachEmployee[employee]++;
      } else {
        accountsOfEachEmployee[employee] = 1;
      }
    }
    res.json(accountsOfEachEmployee);
  });
};
Is using aggregate faster? How do grouping and aggregation work in Mongoose (MongoDB)? Here is my account schema:
var AccountSchema = new Schema({
active: {
type : Boolean,
default: false
},
employee: {
type: Schema.ObjectId,
ref: 'Employee'
},
});

Aggregation is generally faster than map-reduce for simple queries in MongoDB. I was able to get the result by matching first and then grouping and counting inside MongoDB. This is the query I ended up using:
Account.aggregate([
  {$match: {active: true}},
  {$group: {_id: '$employee', numberOfAccounts: {$sum: 1}}}
], function(err, accounts) {
  if (err) return res.status(400).send({message: 'could not retrieve accounts from database'});
  res.json(accounts);
});
The query runs in two stages. The first stage keeps only the documents that are active. The second stage groups them by the value of employee and adds a new field, numberOfAccounts, which is the number of documents in each group.
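To make the two stages concrete, here is a small in-memory sketch of what a $match followed by a $group with {$sum: 1} computes. It works on a plain array rather than on MongoDB; the function name countActiveByEmployee, the generic count field and the sample data are made up for illustration.

```javascript
// Illustration only: mimics {$match: {active: true}} followed by
// {$group: {_id: '$employee', count: {$sum: 1}}} on a plain array.
function countActiveByEmployee(accounts) {
  var counts = new Map();
  accounts
    .filter(function (account) { return account.active === true; }) // $match stage
    .forEach(function (account) {                                   // $group stage
      counts.set(account.employee, (counts.get(account.employee) || 0) + 1);
    });
  // Shape the output like aggregate() results: one document per group.
  return Array.from(counts, function (entry) {
    return { _id: entry[0], count: entry[1] };
  });
}

// Hypothetical sample data; employee ids are shortened to strings here.
var sample = [
  { employee: 'e1', active: true },
  { employee: 'e2', active: true },
  { employee: 'e1', active: true },
  { employee: 'e1', active: false }
];
var grouped = countActiveByEmployee(sample);
console.log(grouped);
```

The practical difference from the find() version is where this work happens: with aggregate, the filtering and counting run inside the database, and only one small document per employee travels to Node.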

Related

Nodejs: Create search query with reference collection field

My User collection model schema:
var userModel = new Schema({
userAddress: { type: Object, ref: 'useraddress' },
name: String,
});
My User addresses collection model schema:
var addressModel = new Schema({
macAddress: String,
repeat: Number,
});
Get data method is:
module.exports.get = function (req, res) {
var _repeatTime = 2;
var _searchQRY = [];
_searchQRY.push(
{
"useraddress.repeat": { $gte: _repeatTime}
});
userModel.find({ $and: _searchQRY }).populate('useraddress').exec(function (err, results) {
res.json({ record: results})
});
};
This is my code. I want to filter by the address repeat number, but I am not getting the correct result with this query.
Mongoose first runs the search on the users collection with the {"useraddress.repeat": {$gte: val}} query, and only after that call does population start.
So you get 0 results, because the address is not populated yet.
There are two ways of solving this. The first is to pass the condition to populate() and then filter out the users whose address did not match:
//Any conditions that apply to not populated user collection documents
var userQuery = {};
userModel.find(userQuery)
  //Populate only if the condition is fulfilled. The match runs against
  //the address documents themselves, so the field is just "repeat"
  .populate({path: 'useraddress', match: {repeat: {$gte: _repeatTime}}})
  .exec(function (err, results) {
    if (err) { /* handle the error */ return; }
    results = results.filter(function (doc) {
      //If not populated it will be null, so we filter them out
      return !!doc.useraddress;
    });
    //Do needed stuff here.
  });
The second way is to use aggregation with $lookup (you'll need MongoDB 3.2+). Basically, this moves the population and filtering to the database level.
userModel
.aggregate()
//Anything applying to users collection before population
.match(userQuery)
.lookup({
from: 'address', //Please check collection name here
localField: 'useraddress',
foreignField: '_id',
as: 'useraddress'
})
//Lookup pushes the matches to an array; in our case it's 1:1, so we can unwind
.unwind('useraddress')
//Filter them as you want
.match({'useraddress.repeat': { $gte: _repeatTime}})
.exec(function (err, results) {
//The first callback argument is the error; get the results here.
});
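As a rough mental model of the pipeline above, the following sketch performs the same join, unwind and filter steps on plain arrays. It is an illustration under assumed data, not how MongoDB implements $lookup; the function name usersWithRepeatAtLeast and the sample records are hypothetical.

```javascript
// Illustration only: in-memory equivalent of $lookup + $unwind + $match
// for a 1:1 reference. All names and data below are hypothetical.
function usersWithRepeatAtLeast(users, addresses, minRepeat) {
  var addressById = new Map(addresses.map(function (a) { return [a._id, a]; }));
  return users
    // $lookup + $unwind: replace the reference with the joined document.
    // Users without a matching address are dropped, as $unwind does by default.
    .map(function (user) {
      var address = addressById.get(user.useraddress);
      return address ? Object.assign({}, user, { useraddress: address }) : null;
    })
    .filter(function (user) { return user !== null; })
    // Final $match on the joined field.
    .filter(function (user) { return user.useraddress.repeat >= minRepeat; });
}

var addresses = [
  { _id: 'a1', macAddress: '00:11', repeat: 1 },
  { _id: 'a2', macAddress: '00:22', repeat: 3 }
];
var users = [
  { name: 'Ann', useraddress: 'a1' },
  { name: 'Bob', useraddress: 'a2' },
  { name: 'Cid', useraddress: 'missing' }
];
var matched = usersWithRepeatAtLeast(users, addresses, 2);
console.log(matched.map(function (u) { return u.name; }));
```

The point of the sketch is the ordering: the filter on repeat can only run after the address has been joined in, which is exactly why the original find() query returned nothing.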

express/mongoose update query

I'm having trouble wrapping my head around updating multiple values in my MongoDB using Mongoose and Express.
Let's say I submit an array of 2 or more objects from my frontend to an Express route, where I read them from req.body. My req.body looks like this:
[
  { article: {
      _id: '564209c66c23d5d20c37bd84',
      quantity: 25
  } },
  { article: {
      _id: '564209c66c23d5d20c37bd83',
      quantity: 51
  } }
]
I then need to loop (?) to find each specific article in the db, and when that article is found I want to update it with the "quantity" value sent from the frontend.
var id = [];
var body = {};
for (var i = req.body.length - 1; i >= 0; i--) {
id.push(req.body[i].article._id);
body[i] = req.body[i].article.quantity;
};
Articles.update(
{ _id: {$in: id} },
{ $set: {quantity: body[0].article.quantity} },
{multi: true},
function(err, response){
if(err)
console.log(err);
console.log(response);
});
The problem with this code is that it puts the first quantity value into all articles, and I want each article to get its own value from the frontend. It feels like I'm on the right path, but I'm pretty new to MongoDB and Express, so if there is a better solution, or any solution at all, let me know.
Grahlie,
If you are having issues with queries, it's sometimes useful to test them from the mongodb shell itself to work out the logic.
If your article documents are structured as such:
{
_id: ObjectId("564209c66c23d5d20c37bd84"),
quantity: 25
}
{
_id: ObjectId("564209c66c23d5d20c37bd83"),
quantity: 51
}
If you want to update the quantity of a single document based on its _id, then you could do so with this query:
db.articles.update(
  {"_id": ObjectId("564209c66c23d5d20c37bd84")},
  {$set : { "quantity" : 25}}
)
If you wanted to update multiple documents with the same quantity you could use $in, but that's not what you want to do. You want to loop through your req.body array and update the quantity of each article.
So your code would be as such:
var articles = req.body;
var updateArticle = function (article) {
  Articles.update(
    { _id: article._id },
    { $set: { quantity: article.quantity } },
    function (err, result) {
      // ...
    }
  );
};
for (var i = 0, n = articles.length; i < n; i++) {
  updateArticle(articles[i].article);
}
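The loop above sends one update per article. If the list gets long, a possible alternative is to collect the updates and send them in one batch. The sketch below only builds the list of operations; it assumes a Mongoose version that provides Model.bulkWrite() (4.9 or later, as far as I know), and the helper name buildQuantityUpdates is made up for this example.

```javascript
// Hypothetical helper: turns the request body into a list of updateOne
// operations in the shape that bulkWrite() expects.
function buildQuantityUpdates(body) {
  return body.map(function (item) {
    return {
      updateOne: {
        filter: { _id: item.article._id },
        update: { $set: { quantity: item.article.quantity } }
      }
    };
  });
}

// Sample payload in the same shape as the req.body from the question.
var payload = [
  { article: { _id: '564209c66c23d5d20c37bd84', quantity: 25 } },
  { article: { _id: '564209c66c23d5d20c37bd83', quantity: 51 } }
];
var ops = buildQuantityUpdates(payload);
console.log(JSON.stringify(ops, null, 2));
// Assumed usage inside the route handler:
//   Articles.bulkWrite(ops, function (err, result) { /* respond here */ });
```

Each article still gets its own quantity, but there is a single callback to respond from, which also avoids having to track when the last of many separate updates has finished.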

Mongoose many to many population with uni-directional references

I have the following schemas which are used to represent a many-to-many relationship :
var CategorySchema = new Schema({
title: {type: String},
});
mongoose.model('Category', CategorySchema);
var ProductSchema = new Schema({
title: {type: String},
categories: [
{
type: Schema.ObjectId,
ref: 'Category'
}
]
});
mongoose.model('Product', ProductSchema );
When I query the Categories or the Products I want to be able to get in the result all the linked documents.
Populating the categories when querying the Product is straightforward:
Product.find().populate('categories').exec(...)
But how to do this from the Category side? I know I can add an array of ObjectId ref to the Product documents in the CategorySchema. But I'd like to avoid bi-directional referencing (I don't want to maintain it, and have a risk of inconsistency).
EDIT: here is the solution I implemented
/**
* List all Categories
*/
exports.all = function (req, res) {
//Function needed in order to send the http response only once all
//the categories' product has been retrieved and added to the returned JSON document.
function sendResponse(categories) {
res.json(categories);
}
AppCategory.list(function (err, categories) {
if (err) {
errors.serverError();
} else {
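//Count outstanding queries; callbacks may finish in any order, so checking
//the index of the last category is not enough to know that all are done.
var pending = categories.length;
if (pending === 0) {
return sendResponse(categories);
}
_.forEach(categories, function (category) {
Product.byCategory(category._id, function (err, products) {
category.products = products || [];
pending--;
if (pending === 0) {
sendResponse(categories);
}
});
});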
}
});
};
ProductSchema.statics = {
byCategory: function (categoryId, callback) {
this.find({'categories': categoryId})
.sort('-title')
.exec(callback);
}
};
You probably don't want to do that. :-) I would guess a product can be in some reasonably small number of categories, but a category might have many thousands of products. In that case, trying to do Category.populate('products') is not going to work from an efficiency standpoint. You'll use lots of memory, not be able to do pagination in a straightforward way, load duplicate product data into memory when a product belongs to several categories, etc. It is better to load the products in a category by querying directly against the products collection. Because categories is an array of ObjectIds, you can filter by category easily enough, a la Product.find({categories: {$in: arrayOfCategoryIds}}).
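If you still need a category list with products attached, for example for one small page of categories, one option is to run a single $in query for that page and group the results in application code. The sketch below shows only the grouping step on plain objects; the helper name attachProducts and the sample data are hypothetical.

```javascript
// Hypothetical helper: attaches products to categories after ONE products
// query. Ids are plain strings here; real ObjectIds would need String(id)
// before being used as object keys.
function attachProducts(categories, products) {
  var byCategory = {};
  products.forEach(function (product) {
    product.categories.forEach(function (categoryId) {
      (byCategory[categoryId] = byCategory[categoryId] || []).push(product);
    });
  });
  return categories.map(function (category) {
    return {
      _id: category._id,
      title: category.title,
      products: byCategory[category._id] || []
    };
  });
}

var categoryPage = [
  { _id: 'c1', title: 'Books' },
  { _id: 'c2', title: 'Games' }
];
var productsFound = [
  { title: 'Chess manual', categories: ['c1', 'c2'] },
  { title: 'Atlas', categories: ['c1'] }
];
var listing = attachProducts(categoryPage, productsFound);
console.log(JSON.stringify(listing, null, 2));
```

This keeps the references uni-directional and replaces the one-query-per-category loop with one query per page, at the cost of holding that page's products in memory.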

How can I speed up a mongoDB (mongoose) batch insert with nodejs?

I have a bunch of documents in a collection I need to copy and insert into the collection, changing only the parent_id on all of them. This is taking a very very long time and maxing out my CPU. This is the current implementation I have. I only need to change the parent_id on all the documents.
// find all the documents that need to be copied
models.States.find({parent_id: id, id: { $in: progress} }).exec(function (err, states) {
if (err) {
console.log(err);
throw err;
}
var insert_arr = [];
// copy every document into an array
for (var i = 0; i < states.length; i++) {
// copy with the new id
insert_arr.push({
parent_id: new_parent_id,
id: states[i].id,
// data is a pretty big object
data: states[i].data,
})
}
// batch insert
models.States.create(insert_arr, function (err) {
if (err) {
console.log(err);
throw err;
}
});
});
Here is the schema I am using
var states_schema = new Schema({
id : { type: Number, required: true },
parent_id : { type: Number, required: true },
data : { type: Schema.Types.Mixed, required: true }
});
There must be a better way to do this that I just cannot seem to come up with. Any suggestions are more than welcome! Thanks.
In a case like this there is no point in doing the work in the application layer. Just do it in the database.
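db.States.find({parent_id: id, id: { $in: progress} }).forEach(function(doc){
  // drop the old _id so that the insert creates a new document
  delete doc._id;
  // the schema field is parent_id; newParentId stands for the new parent's id
  doc.parent_id = newParentId;
  db.States.insert(doc);
})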
If you really need to do this in Mongoose, I see the following problem:
you fetch all the documents that match your criteria, then you iterate through them and copy them into another array (modifying them), and then you iterate through the modified elements and insert them back. So this takes at least 3 times longer than what I am doing.
P.S. If you need to save to different collection, you should change db.States.insert(doc) to db.anotherColl.insert(doc)
P.S.2 If you can not do this from the shell, I hope you can find a way to insert my query into mongoose.
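If the copy has to stay in Node, the per-document overhead can probably be reduced by building plain objects and inserting them in batches. The sketch below covers only the pure data-shaping part; it assumes you would read with .lean() and write each batch with Model.insertMany() (available since Mongoose 4.4, as far as I know). The helper names copyWithNewParent and chunk are made up for this example.

```javascript
// Hypothetical helpers for doing the copy in the application layer.
function copyWithNewParent(states, newParentId) {
  // Build plain objects without _id so the database assigns fresh ones.
  return states.map(function (state) {
    return { parent_id: newParentId, id: state.id, data: state.data };
  });
}

// Splits a list into batches so that no single insert becomes too large.
function chunk(items, size) {
  var chunks = [];
  for (var i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Sample documents in the shape of the schema from the question.
var found = [
  { _id: 'x1', parent_id: 1, id: 10, data: { a: 1 } },
  { _id: 'x2', parent_id: 1, id: 11, data: { b: 2 } },
  { _id: 'x3', parent_id: 1, id: 12, data: { c: 3 } }
];
var batches = chunk(copyWithNewParent(found, 2), 2);
console.log(JSON.stringify(batches));
// Assumed usage: for each batch, models.States.insertMany(batch, callback);
```

Unlike create(), which saves every document individually, insertMany() sends each batch to the server in one operation, so the batch size (2 here, purely for the demo) is the main knob to tune.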

Hide embedded document in mongoose/node REST server

I'm trying to hide certain fields in the GET output of my REST server. I have two schemas, and each has a field that embeds related data from the other into the GET, so getting /people returns a list of the locations they work at, and getting /locations returns who works there. Doing that, however, adds a person.locations.employees field that lists out the employees again, which I obviously don't want. So how do I remove that field from the output before displaying it? Thanks all, let me know if you need any more information.
/********************
/ GET :endpoint
********************/
app.get('/:endpoint', function (req, res) {
var endpoint = req.params.endpoint;
var model; // declared locally so it does not leak as a global
// Select model based on endpoint, otherwise respond with 404
if( endpoint == 'people' ){
model = PeopleModel.find().populate('locations');
} else if( endpoint == 'locations' ){
model = LocationsModel.find().populate('employees');
} else {
return res.send(404, { error: "That resource doesn't exist" });
}
// Display the results
return model.exec(function (err, obj) {
if (!err) {
return res.send(obj);
} else {
return res.send(err);
}
});
});
Here is my GET logic. I've been trying to use the Mongoose query functions after the populate call to filter out those references. Here are my two schemas.
peopleSchema.js
return new Schema({
first_name: String,
last_name: String,
address: {},
image: String,
job_title: String,
created_at: { type: Date, default: Date.now },
active_until: { type: Date, default: null },
hourly_wage: Number,
locations: [{ type: Schema.ObjectId, ref: 'Locations' }],
employee_number: Number
}, { collection: 'people' });
locationsSchema.js
return new Schema({
title: String,
address: {},
current_manager: String, // Inherit person details
alternate_contact: String, // Inherit person details
hours: {},
employees: [{ type: Schema.ObjectId, ref: 'People' }], // mixin employees that work at this location
created_at: { type: Date, default: Date.now },
active_until: { type: Date, default: null }
}, { collection: 'locations' });
You should specify the fields you want to fetch by using the select() method. You can do so by doing something like:
if( endpoint == 'people' ){
model = PeopleModel.find().select('locations').populate('locations');
} else if( endpoint == 'locations' ){
model = LocationsModel.find().select('employees').populate('employees');
} // ...
You can select more fields by separating them with spaces, for example:
PeopleModel.find().select('first_name last_name locations') ...
select() is the right answer, but it may also help to specify it in your schema so that your API stays consistent. I've found it saves me from having to remember to do it everywhere I query the object.
You can set certain fields in your schema to never return by using the select: true|false attribute on the schema field.
More details can be found here: http://mongoosejs.com/docs/api.html#schematype_SchemaType-select
SOLUTION!
Because this was so hard for me to find, I'm going to leave this here for anybody else. In order to "deselect" a field of a populated item, just prefix the field with "-" in the populate select. Example:
PeopleModel.find().populate({path: 'locations', select: '-employees'});
And now locations.employees will be hidden.
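To show what the '-employees' exclusion above means for the response shape, here is a tiny illustration on plain objects. It does not use Mongoose at all; the function name omitFromPopulated and the sample person are hypothetical.

```javascript
// Illustration only: what excluding a field from populated documents
// does to the response shape. Names and data are hypothetical.
function omitFromPopulated(doc, path, excludedField) {
  var copy = Object.assign({}, doc);
  copy[path] = doc[path].map(function (child) {
    var trimmed = Object.assign({}, child);
    delete trimmed[excludedField];
    return trimmed;
  });
  return copy;
}

var person = {
  first_name: 'Ann',
  locations: [{ title: 'HQ', employees: ['p1', 'p2'] }]
};
var output = omitFromPopulated(person, 'locations', 'employees');
console.log(JSON.stringify(output));
```

With populate's select option the field is left out of the query for the populated documents, so it never has to be stripped after the fact as this sketch does.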
If you remember from your SQL days, the SELECT list performs a projection on the table(s) being queried. Projection is one of the primitive operations of the relational model and continues to be a useful feature as the relational model has evolved. blah blah blah.
In mongoose, the Query.select() method allows you to perform this operation with some extra features. Particularly, not only can you specify what attributes (columns) to return, but you can also specify what attributes you want to exclude.
So here's the example:
function getPeople(req,res, next) {
var query = PeopleModel.find().populate({path: 'locations', select: '-employees'});
query.exec(function(err, people) {
// error handling stuff
// process and return response stuff
});
}
function getLocations(req,res, next) {
var query = LocationsModel.find().populate({path: 'employees', select: '-locations'});
query.exec(function(err, locations) {
// error handling stuff
// processing and returning response stuff
});
}
app.get('/people', getPeople);
app.get('/locations', getLocations);
Directly from the Mongoose Docs:
Go to http://mongoosejs.com/docs/populate.html and search for "Query conditions and other options"
Query conditions and other options
What if we wanted to populate our fans array based on their age,
select just their names, and return at most, any 5 of them?
Story
.find(...)
.populate({
path: 'fans',
match: { age: { $gte: 21 }},
select: 'name -_id',
options: { limit: 5 }
})
.exec()
I just wanted to remark that, given the simplicity of the endpoints, you may be able to get away with defining them this way. In general, however, this kind of dispatcher pattern is not necessary and may cause problems later on when developing with Express.

Resources