Mongoose: Insert multiple documents and then disconnect database? - node.js

When using Mongoose to insert multiple documents into a collection, since each .save() method has its own callback, how do you know when they are all complete so that you can call mongoose.disconnect()?
Let's say I have 3 documents I need to insert into the database:
var database = mongoose.connect('mongodb://localhost/somedb');

document1.create({...}, function(err){
  if (err) { ... }
  // It's saved!
});
document2.create({...}, function(err){
  if (err) { ... }
  // It's saved!
});
document3.create({...}, function(err){
  if (err) { ... }
  // It's saved!
});

database.disconnect();
The disconnect is most likely going to happen before the documents get saved to the database, especially if the Mongodb server is remote or slow or something.
What is the best way to handle this? Are Promises the only way? What was the solution to this a few years ago before Promises were so prevalent?

One option instead of using promises is to use the async library, which has been around for some years now. There are nice helpers such as async.series and async.parallel that provide a final function call when all provided methods are complete; the difference between them is whether you need the operations to complete in series (one after the other) or simply all execute together in parallel:
async.parallel(
  [
    function(callback) {
      document1.create({...}, callback);
    },
    function(callback) {
      document2.create({...}, callback);
    },
    function(callback) {
      document3.create({...}, callback);
    }
  ],
  // Called when all completed or on an error
  function(err, results) {
    // err is any error
    // results contains an array ( in this case ) of any results
    mongoose.disconnect();
  }
)
Since the first argument in either case is either an array or an object containing the function calls, you can just build up said array/object and pass it in, as long as the second argument is the handler that receives any error or otherwise executes when all are complete.
So you can for example "build" like this:
var list = [
  { "model": "Model1", "data": {...} },
  { "model": "Model2", "data": {...} },
  { "model": "Model3", "data": {...} }
];

list = list.map(function(item) {
  return function(callback) {
    mongoose.model(item.model).create(item.data, callback);
  }
});

async.parallel(list, function(err, results) {
  // err is any error
  // results contains an array ( in this case ) of any results
  mongoose.disconnect();
});
And that all works the same as if hardcoded, but in a simple and programmatic way.
Of course the other natural approach is to simply nest each call within the previous one's callback, but libraries such as the one shown are meant to make things prettier and easier to manipulate when creating such a list.
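For completeness, the same build-a-list idea can be sketched with native promises instead of the async library. This is a hedged sketch: it assumes a Mongoose version where .create() returns a promise when called without a callback, and the helper name and the injected mongoose-like object are illustrative, not part of any API:

```javascript
// Illustrative helper (not a Mongoose API): create every item, and only
// disconnect once all creations have settled successfully.
// `mongooseLike` is injected so the flow can be exercised without a
// live database; with real Mongoose you would pass `mongoose` itself.
function createAllThenDisconnect(mongooseLike, items) {
  // items: [{ model: 'Model1', data: {...} }, ...]
  var creations = items.map(function (item) {
    return mongooseLike.model(item.model).create(item.data);
  });
  return Promise.all(creations).then(function (results) {
    // all documents are saved; safe to disconnect now
    return mongooseLike.disconnect().then(function () {
      return results;
    });
  });
}
```

The key point is the same as with async.parallel: the disconnect lives in the continuation that only runs after every create has completed.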

Related

Mongoose subdocument update: dot notation assignment OR .findByIdAndUpdate() [duplicate]

I am using .pull to remove a record from an array in MongoDB and it works fine, but a comment I read somewhere on Stack Overflow (I can't find it again to post the link) is bothering me: it said that it was bad to use .save instead of .findByIdAndUpdate or .updateOne.
I just wanted to find out whether this is accurate or subjective.
This is how I am doing it currently. I check whether the product with that id actually exists, and if so I pull that record from the array.
exports.deleteImg = (req, res, next) => {
  const productId = req.params.productId;
  const imgId = req.params.imgId;
  Product.findById(productId)
    .then(product => {
      if (!product) {
        return res.status(500).json({ message: "Product not found" });
      } else {
        product.images.pull(imgId);
        product.save()
          .then(response => {
            return res.status(200).json({ message: 'Image deleted' });
          });
      }
    })
    .catch(err => {
      console.log(err);
    });
};
I think what they were saying though was it should rather be done something like this (an example I found after a google)
users.findByIdAndUpdate(userID,
  { $pull: { friends: friend } },
  { safe: true, upsert: true },
  function(err, doc) {
    if (err) {
      console.log(err);
    } else {
      // do stuff
    }
  }
);
The main difference is that when you use findById and save, you first get the object from MongoDB and then update whatever you want to and then save. This is ok when you don't need to worry about parallelism or multiple queries to the same object.
findByIdAndUpdate is atomic. When you execute this multiple times, MongoDB will take care of the parallelism for you. Following your example, if two requests are made at the same time on the same object, each passing { $pull: { friends: friendId } }, the result will be as expected: only one friend will be pulled from the array.
But let's say you have a counter on the object, like friendsTotal, with a starting value of 0. And you hit the endpoint that must increase the counter by one twice, for the same object.
If you use findById, then increase and then save, you'd have some problems because you are setting the whole value. So, you first get the object, increase to 1, and update. But the other request did the same. You'll end up with friendsTotal = 1.
With findByIdAndUpdate you could use { $inc: { friendsTotal: 1 } }. So, even if you execute this query twice, at the same time, on the same object, you would end up with friendsTotal = 2, because MongoDB uses these update operators to better handle parallelism, data locking and more.
See more about $inc here: https://docs.mongodb.com/manual/reference/operator/update/inc/
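The race described above can be illustrated with a tiny plain-JavaScript simulation. No Mongoose is involved here; the two functions just mimic the interleavings described in the text (both requests reading before either writes, versus the server applying each $inc as its own operation):

```javascript
// Two interleaved read-modify-write updates: both requests read the
// counter before either writes back, so one increment is lost.
function simulateFindThenSave(doc) {
  var read1 = doc.friendsTotal; // request 1 reads
  var read2 = doc.friendsTotal; // request 2 reads the same stale value
  doc.friendsTotal = read1 + 1; // request 1 saves
  doc.friendsTotal = read2 + 1; // request 2 saves, clobbering request 1
  return doc.friendsTotal;
}

// Two $inc-style updates: the server applies each increment in turn,
// so no update is lost.
function simulateAtomicInc(doc) {
  doc.friendsTotal += 1; // first { $inc: { friendsTotal: 1 } }
  doc.friendsTotal += 1; // second { $inc: { friendsTotal: 1 } }
  return doc.friendsTotal;
}
```

Starting from friendsTotal = 0, the first simulation ends at 1 and the second at 2, which is exactly the difference between findById-then-save and findByIdAndUpdate with $inc.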

Query with Mongoose multiple times without nesting

I'm trying to generate a document with node.js that needs to run multiple unrelated database queries from a mongo database.
Here is my current code:
Data.find({}, function(err, results) {
  if (err) return next(err);
  // finished getting data
  res.render('page');
});
The problem is if I try to run another query, I seem to have to nest it within the first one so that it waits for the first one to finish before starting, and then I have to put res.render() within the innermost nested query (if I don't, res.render() will be called before the database is finished grabbing data, and it won't be rendered with the page).
What I have to do:
Data.find({}, function(err, results) {
  if (err) return next(err);
  // finished getting data
  Data2.find({}, function(err, results2) {
    if (err) return next(err);
    // finished getting data 2
    res.render('page');
  });
});
I am going to have more than 2 queries, so if I keep nesting them it's going to get really messy really fast. Is there a cleaner way to do this, such as a way to make the code wait until all the data is returned and the function is run before continuing with the script?
For mongoose you can probably just do a Promise.all() and use .concat() on the resulting arrays of each query.
As a full demo:
var async = require('async'),
    mongoose = require('mongoose'),
    Schema = mongoose.Schema;

var d1Schema = new Schema({ "name": String });
var Data1 = mongoose.model("Data1", d1Schema);

var d2Schema = new Schema({ "title": String });
var Data2 = mongoose.model("Data2", d2Schema);

mongoose.set('debug', true);
mongoose.connect('mongodb://localhost/test');

async.series(
  [
    // Clean
    function(callback) {
      async.each([Data1, Data2], function(model, callback) {
        model.remove({}, callback)
      }, callback);
    },
    // Setup some data
    function(callback) {
      async.each([
        { "name": "Bill", "model": "Data1" },
        { "title": "Something", "model": "Data2" }
      ], function(data, callback) {
        var model = data.model;
        delete data.model;
        mongoose.model(model).create(data, callback);
      }, callback);
    },
    // Actual Promise.all demo
    function(callback) {
      Promise.all([
        Data1.find().exec(),
        Data2.find().exec()
      ]).then(function(result) {
        console.log([].concat.apply([], result));
        callback()
      }).catch(callback);
    }
  ],
  function(err) {
    if (err) throw err;
    mongoose.disconnect();
  }
)
I'm just mixing in async there for brevity of example, but the meat of it is in:
Promise.all([
  Data1.find().exec(),
  Data2.find().exec()
]).then(function(result) {
  console.log([].concat.apply([], result));
})
Where the Promise.all() basically waits for and combines the two results, which would be an "array of arrays" here but the .concat() takes care of that. The result will be:
[
  { _id: 59420fd33d48fa0a490247c8, name: 'Bill', __v: 0 },
  { _id: 59420fd43d48fa0a490247c9, title: 'Something', __v: 0 }
]
Showing the objects from each collection, joined together in one array.
You could also use the async.concat method as an alternative, but unless you are already using the library, it's probably best to stick with promises.

How to avoid multiple finds of same document in mongoose?

I'm new to mongoose so please forgive me if this sounds stupid.
For every edit, I want to store the historic values and also modify the existing value of my collection.
So here's the code
collection.findById(type_id).select({ "history": 0 })
  .then(function(data) {
    var changes = { data: data, by: user_id };
    return collection.findOneAndUpdateAsync(
      { _id: type_id },
      { $push: { "history": changes } });
  })
  .then(function(data) {
    return collection.findOneAndUpdateAsync({ _id: type_id }, info);
  })
  .then(function(res) {
    resolve(res);
  })
This piece of code works just fine, but I don't want to do multiple finds on the same collection.
Would be great if you can suggest something better and efficient.
Thanks in advance. :)
I don't think all that's possible within an atomic update. The best you can do is reduce the number of calls to the server to two: the first call is the update, using findByIdAndUpdate() with the new option set to false (the default), which gives you access to the data as it was before the update. You can then use the second call to push that data into history. For example:
collection.findByIdAndUpdateAsync(type_id, info /*, { "new": false } */)
  .then(function(data) {
    var changes = {
      "data": data,
      "by": user_id
    };
    return collection.findByIdAndUpdateAsync(
      type_id,
      { "$push": { "history": changes } },
      { "new": true }
    );
  })
  .then(function(res) { resolve(res); });

Mongoose/Express: how to retrieve only the values of the array

I am new to Node.js and want to do the following:
Write a query that fetches the annotation (array values) key from MongoDB and passes those array values [only] as an argument to the second query.
Here is my code
// create the carousel based on the associated stills using keyword annotations
function findAssociatedArchivalStills(carousels, taskCb){
  async.waterfall([
    // 1st query
    function findAssociatedAnnotations(archiveId, taskCb) {
      Archive.findAnnotations(archiveId, function onResult(err, annotations){
        console.log(annotations);
        taskCb(err, annotations);
      });
    },
    // 2nd query
    function findAssociatedStills(annotations, taskCb) {
      Still.findAssociatedStills(annotations, function onResult(err, stills){
        taskCb(err, stills);
      });
    },
    function buildCarousel(stills, taskCb) {
      return taskCb(null, new Carousel({
        title: 'Related Stills',
        items: stills,
      }));
    },
  ], function onFinish(err) {
    taskCb(null, carousels);
  });
},
// done building the associated Episodes carousel
], function onFinish(err, carousels) {
  handleResponse(err, res, carousels);
});
The methods are defined as follows
1st query definition in model
schema.statics.findAnnotations = function findAnnotations(archiveId, cb) {
  this.findOne()
    .where('_id', types.ObjectId(archiveId))
    .select({ 'keywords': 1, '_id': 0 })
    .exec(cb);
};
2nd query definition in model
schema.statics.findAssociatedStills = function findAssociatedStills(Annotations, cb) {
  this.find()
    .where({ $text: { $search: Annotations } }, { score: { $meta: "textScore" } })
    .sort({ score: { $meta: "textScore" } })
    .limit(2)
    .exec(cb);
};
THE PROBLEM IS
When I ran the 1st query , it is returning following
{ keywords:
[ 'IRELAND',
'ELECTIONS',
'PARTY_POLITICAL_BROADCASTS',
'FINE_GAEL' ] }
But the input to the next query should be only the values, such as
'IRELAND', 'ELECTIONS', 'PARTY_POLITICAL_BROADCASTS', 'FINE_GAEL'
How can I filter out only the values of the array, without the key?
I know what will be the query in MongoDb
that is as follows
db.archives.episodes.find({_id:ObjectId("577cd9558786332020aff74c")}, {keywords:1, _id:0}).forEach( function(x) { print(x.keywords); } );
Is it better to filter it in the query, or is it right to filter the returned result in the script?
Please advise. Thanks for your time.
You're using series, not waterfall. And your archiveId cannot be set in the first function; you need to set it up before async.waterfall.
Here's the right syntax (with waterfall) :
function findAssociatedArchivalStills(carousels, masterCallback) {
  var archiveId = 'yourArchiveId';
  async.waterfall([
    // 1st query
    function findAssociatedAnnotations(taskCallback) {
      Archive.findAnnotations(archiveId, taskCallback);
    },
    // 2nd query
    function findAssociatedStills(annotations, taskCallback) {
      Still.findAssociatedStills(annotations, taskCallback);
    },
    function buildCarousel(stills, taskCallback) {
      return taskCallback(null, new Carousel({
        title: 'Related Stills',
        items: stills
      }));
    }
  ], function onFinish(err) {
    if (err) {
      return masterCallback(err);
    }
    masterCallback(null, carousels);
  });
}
Documentation : http://caolan.github.io/async/docs.html#.waterfall
PS : Always use different names for your function's callback and your async functions callbacks.
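On the original extraction question: since findAnnotations resolves with a document shaped like { keywords: [ ... ] }, the values can be unwrapped before the second query runs. A minimal sketch (extractKeywords is an illustrative helper, not part of Mongoose):

```javascript
// Illustrative helper: pull just the array values off the result of
// findAnnotations, guarding against a missing document or field.
function extractKeywords(annotationsDoc) {
  return (annotationsDoc && annotationsDoc.keywords) || [];
}

// Inside the waterfall, the first step could then pass only the values on:
// Archive.findAnnotations(archiveId, function (err, doc) {
//   if (err) return taskCallback(err);
//   taskCallback(null, extractKeywords(doc)); // ['IRELAND', 'ELECTIONS', ...]
// });
```

Doing the unwrap in the script keeps the model's findAnnotations reusable; alternatively the projection could stay as-is and the second static could accept the whole document, but passing a plain array of strings to $search is the simpler contract.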

How can I save multiple documents concurrently in Mongoose/Node.js?

At the moment I use save to add a single document. Suppose I have an array of documents that I wish to store as single objects. Is there a way of adding them all with a single function call and then getting a single callback when it is done? I could add all the documents individually but managing the callbacks to work out when everything is done would be problematic.
Mongoose does now support passing multiple document structures to Model.create. To quote their API example, it supports being passed either an array or a varargs list of objects with a callback at the end:
Candy.create({ type: 'jelly bean' }, { type: 'snickers' }, function (err, jellybean, snickers) {
  if (err) // ...
});
Or
var array = [{ type: 'jelly bean' }, { type: 'snickers' }];
Candy.create(array, function (err, jellybean, snickers) {
  if (err) // ...
});
Edit: As many have noted, this does not perform a true bulk insert - it simply hides the complexity of calling save multiple times yourself. There are answers and comments below explaining how to use the actual Mongo driver to achieve a bulk insert in the interest of performance.
Mongoose 4.4 added a method called insertMany
Shortcut for validating an array of documents and inserting them into
MongoDB if they're all valid. This function is faster than .create()
because it only sends one operation to the server, rather than one for each
document.
Quoting vkarpov15 from issue #723:
The tradeoffs are that insertMany() doesn't trigger pre-save hooks, but it should have better performance because it only makes 1 round-trip to the database rather than 1 for each document.
The method's signature is identical to create:
Model.insertMany([ ... ], (err, docs) => {
  ...
})
Or, with promises:
Model.insertMany([ ... ]).then((docs) => {
  ...
}).catch((err) => {
  ...
})
Mongoose doesn't have bulk inserts implemented yet (see issue #723).
Since you know the number of documents you're saving, you could write something like this:
var total = docArray.length,
    result = [];

function saveAll() {
  var doc = docArray.pop();
  doc.save(function (err, saved) {
    if (err) throw err; // handle error
    result.push(saved);
    if (--total) saveAll();
    else { /* all saved here */ }
  });
}
saveAll();
This, of course, is a stop-gap solution and I would recommend using some kind of flow-control library (I use q and it's awesome).
Bulk inserts in Mongoose can be done with .insert() unless you need to access middleware.
Model.collection.insert(docs, options, callback)
https://github.com/christkv/node-mongodb-native/blob/master/lib/mongodb/collection.js#L71-91
Use async.parallel and your code will look like this:
async.parallel([
  obj1.save.bind(obj1),
  obj2.save.bind(obj2),
  obj3.save.bind(obj3)
], callback);
Since the convention is the same in Mongoose as in async (err, callback), you don't need to wrap them in your own callbacks; just add your save calls (bound to their documents so this is preserved) to an array and you will get a callback when all are finished.
If you use mapLimit you can control how many documents you want to save in parallel. In this example we save 10 documents in parallel until all items are successfully saved.
async.mapLimit(myArray, 10, function(document, next){
  document.save(next);
}, done);
I know this is an old question, but it worries me that there are no properly correct answers here. Most answers just talk about iterating through all the documents and saving each of them individually, which is a BAD idea if you have more than a few documents, and worse when the process is repeated for many requests.
MongoDB specifically has a batchInsert() call for inserting multiple documents, and this should be used via the native mongodb driver. Mongoose is built on this driver, and it doesn't have support for batch inserts. That probably makes sense, as it is meant to be an object document modelling tool for MongoDB.
Solution: Mongoose comes with the native MongoDB driver. You can use that driver by requiring it with require('mongoose/node_modules/mongodb') (not too sure about this, but you can always install the mongodb npm package again if it doesn't work, though I think it should) and then do a proper batchInsert.
Newer versions of MongoDB support bulk operations:
var col = db.collection('people');
var batch = col.initializeUnorderedBulkOp();
batch.insert({ name: "John" });
batch.insert({ name: "Jane" });
batch.insert({ name: "Jason" });
batch.insert({ name: "Joanne" });
batch.execute(function(err, result) {
  if (err) console.error(err);
  console.log('Inserted ' + result.nInserted + ' row(s).');
});
Use the insertMany function to insert many documents. This sends only one operation to the server, and Mongoose validates all the documents before hitting the mongo server. By default Mongoose inserts items in the order they appear in the array. If you are OK with not maintaining any order, then set ordered: false.
Important - Error handling:
When ordered: true, validation and error handling happen as a group: if one document fails, the whole operation fails.
When ordered: false, validation and error handling happen per document and the operation continues; errors are reported back in an array of errors.
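A hedged sketch of that ordered: false flow. The wrapper below is illustrative, not a Mongoose API (insertMany itself exists since Mongoose 4.4); the model is injected so the shape of the flow can be exercised without a live database, and the exact error fields (writeErrors, insertedDocs) depend on the driver version:

```javascript
// Illustrative wrapper: normalize the ordered:false outcome into
// { inserted, errors }. `model` is anything exposing
// insertMany(docs, options) returning a promise, e.g. a Mongoose model.
function insertUnordered(model, docs) {
  return model.insertMany(docs, { ordered: false }).then(
    function (inserted) {
      // every document was valid and inserted
      return { inserted: inserted, errors: [] };
    },
    function (err) {
      // with ordered:false the server keeps going past failures;
      // err.writeErrors (when present) lists each individual failure,
      // and err.insertedDocs (when the driver provides it) the successes
      return {
        inserted: err.insertedDocs || [],
        errors: err.writeErrors || [err]
      };
    }
  );
}
```

The point of the normalization is that with ordered: false a rejection does not mean nothing was inserted, so the caller should inspect the per-document errors rather than treat the whole batch as failed.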
Here is another way without using additional libraries (no error checking included)
function saveAll(callback) {
  var count = 0;
  docs.forEach(function(doc) {
    doc.save(function(err) {
      count++;
      if (count == docs.length) {
        callback();
      }
    });
  });
}
You can use the promise returned by Mongoose's save. Promise in Mongoose does not have all, but you can add that feature with this module.
Create a module that enhances the Mongoose promise with all.
var Promise = require("mongoose").Promise;

Promise.all = function(promises) {
  var mainPromise = new Promise();
  if (promises.length == 0) {
    mainPromise.resolve(null, promises);
  }

  var pending = 0;
  promises.forEach(function(p, i) {
    pending++;
    p.then(function(val) {
      promises[i] = val;
      if (--pending === 0) {
        mainPromise.resolve(null, promises);
      }
    }, function(err) {
      mainPromise.reject(err);
    });
  });

  return mainPromise;
}

module.exports = Promise;
Then use it with mongoose:
var Promise = require('./promise');
...
var tasks = [];
for (var i = 0; i < docs.length; i++) {
  tasks.push(docs[i].save());
}

Promise.all(tasks)
  .then(function(results) {
    console.log(results);
  }, function(err) {
    console.log(err);
  });
Add a file called mongoHelper.js
var MongoClient = require('mongodb').MongoClient;

MongoClient.saveAny = function(data, collection, callback) {
  if (data instanceof Array) {
    saveRecords(data, collection, callback);
  } else {
    saveRecord(data, collection, callback);
  }
}

function saveRecord(data, collection, callback) {
  collection.save(
    data,
    { w: 1 },
    function(err, result) {
      if (err)
        throw new Error(err);
      callback(result);
    }
  );
}

function saveRecords(data, collection, callback) {
  save(data, collection, callback);
}

function save(data, collection, callback) {
  collection.save(
    data.pop(),
    { w: 1 },
    function(err, result) {
      if (err) {
        throw new Error(err);
      }
      if (data.length > 0)
        save(data, collection, callback);
      else
        callback(result);
    }
  );
}

module.exports = MongoClient;
Then in your code, change your require to:
var MongoClient = require("./mongoHelper.js");
Then when it is time to save call (after you have connected and retrieved the collection)
MongoClient.saveAny(data, collection, function(){db.close();});
You can change the error handling to suit your needs, pass back the error in the callback etc.
This is an old question, but it came up first for me in google results when searching "mongoose insert array of documents".
There are two options model.create() [mongoose] and model.collection.insert() [mongodb] which you can use. View a more thorough discussion here of the pros/cons of each option:
Mongoose (mongodb) batch insert?
Here is an example of using MongoDB's Model.collection.insert() directly in Mongoose. Please note that if you don't have so many documents, say less than 100 documents, you don't need to use MongoDB's bulk operation (see this).
MongoDB also supports bulk insert through passing an array of
documents to the db.collection.insert() method.
var mongoose = require('mongoose');
var Q = require('q');

var userSchema = mongoose.Schema({
  email: { type: String, index: { unique: true } },
  name: String
});

var User = mongoose.model('User', userSchema);

function saveUsers(users) {
  // Here I use KrisKowal's Q (https://github.com/kriskowal/q) to return a promise,
  // so that the caller of this function can act upon its success or failure
  var deferred = Q.defer();
  User.collection.insert(users, function callback(error, insertedDocs) {
    if (!error)
      deferred.resolve(insertedDocs);
    else
      deferred.reject({ error: error });
  });
  return deferred.promise;
}

var users = [{ email: 'foo@bar.com', name: 'foo' }, { email: 'baz@bar.com', name: 'baz' }];
saveUsers(users).then(function() {
  // handle success case here
})
.fail(function(error) {
  // handle error case here
});
