I'm trying to update every document in an expanding Mongo database.
My plan is to start with the youngest, most recently created document and work back from there, one-by-one querying the next oldest document.
The problem is that my Mongoose query is skipping documents that were created in the same second. I thought greater than/less than operators would work on _ids generated in the same second. But though there are 150 documents in the database right now, this function gets from the youngest to the oldest document in only 8 loops.
Here's my Mongoose query within the recursive node loop:
function loopThroughDatabase(i, doc, sizeOfDatabase){
  if (i < sizeOfDatabase) {
    (function(){
      myMongooseCollection.model(false)
        .find()
        .where("_id")
        .lt(doc._id)
        .sort("id")
        .limit(1)
        .exec(function(err, docs) {
          if (err) {
            console.log(err);
          }
          else {
            updateDocAndSaveToDatabase(docs[0]);
            loopThroughDatabase(i + 1, docs[0], sizeOfDatabase); //recursion here
          }
        });
    })();
  }
}
loopThroughDatabase(1, youngestDoc, sizeOfDatabase);
Error found.
In the Mongoose query, I was sorting by "id" rather than "_id".
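A minimal in-memory sketch of the lookup, assuming the intent is to step to the next-oldest document on every call. Under that assumption the sort also has to be descending on _id (in Mongoose, .sort("-_id")); an ascending sort with limit(1) would return the oldest document straight away. The sample ids and the helper names nextOlder and walkNewestToOldest are hypothetical.

```javascript
// In-memory stand-in for:
//   find({ _id: { $lt: currentId } }).sort({ _id: -1 }).limit(1)
// Ids are 24-character lowercase hex strings, so plain string comparison
// gives the same order as comparing the underlying ObjectId bytes.
function nextOlder(docs, currentId) {
  const older = docs
    .filter((doc) => doc._id < currentId)
    .sort((a, b) => (a._id < b._id ? 1 : a._id > b._id ? -1 : 0)); // descending
  return older.length > 0 ? older[0] : null;
}

// Hypothetical sample data: the first three ids share the same timestamp prefix.
const sampleDocs = [
  { _id: "5f1d7f3a0000000000000001" },
  { _id: "5f1d7f3a0000000000000002" },
  { _id: "5f1d7f3a0000000000000003" },
  { _id: "5f1d7f3b0000000000000004" },
];

// Start at the youngest document and walk back one document at a time.
function walkNewestToOldest(docs) {
  const visited = [];
  let current = docs.reduce((a, b) => (a._id > b._id ? a : b));
  while (current !== null) {
    visited.push(current._id);
    current = nextOlder(docs, current._id);
  }
  return visited;
}

const visitedIds = walkNewestToOldest(sampleDocs);
console.log(visitedIds.length + " documents visited");
```

Stopping when the lookup returns nothing also avoids having to know the size of the collection up front.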
If you read the MongoDB documentation (http://docs.mongodb.org/manual/reference/glossary/#term-objectid), you will see that an ObjectId also depends on the process in which the item was created. Therefore, to guarantee the ordering you need, add a date stamp to the records and sort on that instead of the _id.
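To illustrate why ids created in the same second are not ordered by creation time alone: only the first 4 bytes of an ObjectId encode the timestamp, in whole seconds. The sketch below decodes that part from a hex string; the function name objectIdToDate and the sample ids are hypothetical.

```javascript
// The first 4 bytes (8 hex characters) of an ObjectId are the creation time
// in seconds since the Unix epoch. The remaining bytes do not encode time.
function objectIdToDate(hexId) {
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Two hypothetical ids created in the same second by different processes:
// the timestamps are equal, so their relative order depends only on the
// trailing bytes, not on which document was actually written first.
const idFromProcessA = "5f1d7f3aaaaaaaaaaa000001";
const idFromProcessB = "5f1d7f3a1111111111000009";

const sameSecond =
  objectIdToDate(idFromProcessA).getTime() ===
  objectIdToDate(idFromProcessB).getTime();
console.log("same second:", sameSecond);
```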
I'm getting users from this API.
https://redmine-mock-api.herokuapp.com/api/v1/users
There are 100 in total (100 different ids). I added them to my MongoDB database using Mongoose, but the problem is that it always inserts them again: I already have 5000 users (the same 100 users repeated 50 times).
I want to add if the id does not exist or update if it exists.
What am I doing wrong? users is the array of users from the API
db.collection("users").insertMany(users, function (error, response) {
  if (error) throw error;
  console.log("Number of users inserted: " + response.insertedCount);
  db.close();
});
Try the following format:
db.collection.update(
  {'_id': Id}, // find with the unique identifier
  { /* put whatever you want to insert */ },
  {upsert: true}
)
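Applied to a whole array of users, the same idea can be sketched as one upsert operation per user, keyed on the user's id. This assumes each object from the API carries an id field; the helper name buildUpsertOps and the sample users are hypothetical. The resulting array has the shape the Node.js driver's bulkWrite() accepts.

```javascript
// Build one "update if it exists, otherwise insert" operation per user.
function buildUpsertOps(users) {
  return users.map((user) => ({
    updateOne: {
      filter: { id: user.id }, // match on the API's id (assumed field name)
      update: { $set: user },  // overwrite the stored fields with the new ones
      upsert: true,            // insert when no document matches the filter
    },
  }));
}

// Hypothetical sample of what the API might return.
const apiUsers = [
  { id: 1, name: "Ana" },
  { id: 2, name: "Ben" },
];

const upsertOps = buildUpsertOps(apiUsers);
console.log(JSON.stringify(upsertOps[0]));

// With an open connection, the operations would be sent in one call, e.g.:
//   db.collection("users").bulkWrite(upsertOps)
```

Running this twice with the same users should leave 100 documents rather than 200, because the second run matches the existing ids.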
Looking at the mongo db docs it looks like insertMany is not what you want to use here. You probably want to use updateMany as it supports upserts where insertMany doesn't.
You should be able to use something like this:
db.collection("users").updateMany(
  {},
  users,
  {
    upsert: true
  }
)
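Note that an empty filter matches every document in the collection, so this only inserts when the collection is empty and otherwise applies the same update to every existing user. To insert a user if it doesn't exist or update it if it does, filter on that user's id instead.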
db.collection.updateMany(
  <filter>,
  <update>,
  {
    upsert: <boolean>
  }
)
The updateMany() method takes the following parameters:
Filter: The selection criteria for the update. The same query selectors as in the find() method are available. Specify an empty document { } to update all documents in the collection.
Update: The modifications to apply. Use update operators such as $set, $unset, or $rename.
Upsert: Optional. When true, updateMany() either creates a new document if no documents match the filter (for more details see upsert behavior), or updates the documents that match the filter. To avoid multiple upserts, ensure that the filter fields are uniquely indexed. Defaults to false.
db.users.updateMany(
  {},
  {$set: users},
  {
    upsert: true
  }
)
I am having an issue with this piece of code.
(Using Node, Express, and Mongoose to save documents to MongoDB.)
// We have an array that contains objects (365) of type 'Day' to be stored in mongodb.
// In my code this array contains data :-)
let arrDays = [];
// Empty array to hold reference to the _id of the created days
let arrDaysID = [];
// Store all days in the 'arrDays' array to mongodb using mongoose
Day.create(arrDays, function(err, days){
  if(err){
    console.log("Something went wrong " + err)
  } else {
    // Iterate over every 'day' object in mongodb and add a reference to
    // the _id of the 'day' object inside the 'year' object
    days.forEach(function(day){
      year.days.push(day._id);
      year.save();
    });
  }
});
The problem is that instead of adding every _id reference to the 'year' object once, the _id reference is added multiple times. When the code finishes running, about 140,000 references are inside the year.days array, while only 365 days are defined...
Any hints or tips are welcome.
Dave
As per the documentation of the Model.create API, it is a shortcut for saving one or more documents to the database: MyModel.create(docs) does new MyModel(doc).save() for every doc in docs.
This means you are executing the forEach loop for every day added in the create method, which makes it an n*n operation.
365 * 365 = 133,225, which is close to the roughly 140,000 references you are seeing.
I think this explains the problem.
EDIT - Better Alternative
As per the Mongoose insertMany API, Model.insertMany() is faster than Model.create() because it only sends one operation to the server, rather than one for each document.
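Independently of which insert method is used, one way to keep each reference from being added more than once is to push all the ids first and call save() a single time. The sketch below uses a stand-in object instead of a real Mongoose document so it can run on its own; makeFakeYear, attachDays and the sample ids are hypothetical.

```javascript
// Stand-in for a Mongoose 'year' document: save() only counts how often it runs.
function makeFakeYear() {
  return {
    days: [],
    saveCalls: 0,
    save() {
      this.saveCalls += 1;
      return Promise.resolve(this);
    },
  };
}

// Push every day's _id, then save the year once instead of once per day.
function attachDays(year, days) {
  days.forEach((day) => year.days.push(day._id));
  return year.save();
}

// Hypothetical result of creating 365 'Day' documents.
const fakeDays = Array.from({ length: 365 }, (_, i) => ({ _id: "day-" + (i + 1) }));
const fakeYear = makeFakeYear();

attachDays(fakeYear, fakeDays);
console.log(fakeYear.days.length + " references, " + fakeYear.saveCalls + " save call(s)");
```

With a real document, attachDays would be called from the create or insertMany callback with the returned days.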
I need to get data from MongoDB that is first narrowed by one initial category, say '{clothing : pants}' and then a subsequent search for pants of a specific size, using an array like sizes = ['s','lg','6', '12'].
I need to return all of the results where 'pants' matches those 'sizes'.
I've started a search with:
Product.find({$and:[{categories:req.body.category, size:{$in:req.body.sizes}}]},
  function(err, products) {
    if (err) { console.log(err); }
    return res.send(products)
  });
I really don't know where to go from there. I've been all over the Mongoose docs.
Some direction would be very helpful.
Mongoose queries accept the same filter objects that MongoDB does, so you can pass the search conditions as properties of a single object, separated by commas:
Product.find({categories:req.body.category, size:{$in:['s','lg','6', '12']}})
For more information on $in, see the MongoDB documentation for that operator.
For more information on the $and operator, see its documentation as well (note that $and can be omitted in some cases, which is what I did here).
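As a rough sketch of how the request body maps to that filter, the function below builds the object from a category and a list of sizes. The in-memory matcher is only an approximation of how the filter selects documents, included so the example can run without a database; the names buildProductFilter, matchesFilter and the sample products are hypothetical.

```javascript
// Build the filter: both conditions must hold (an implicit AND).
function buildProductFilter(category, sizes) {
  return { categories: category, size: { $in: sizes } };
}

// Approximate in-memory version of the match, for illustration only.
// 'categories' may be a single value or an array of values.
function matchesFilter(product, filter) {
  const categories = Array.isArray(product.categories)
    ? product.categories
    : [product.categories];
  return (
    categories.includes(filter.categories) &&
    filter.size.$in.includes(product.size)
  );
}

// Hypothetical sample products.
const sampleProducts = [
  { name: "chinos", categories: ["pants"], size: "lg" },
  { name: "jeans", categories: ["pants"], size: "xl" },
  { name: "tee", categories: ["shirts"], size: "s" },
];

const productFilter = buildProductFilter("pants", ["s", "lg", "6", "12"]);
const matchedNames = sampleProducts
  .filter((product) => matchesFilter(product, productFilter))
  .map((product) => product.name);
console.log(matchedNames);

// In the route handler the same object would be passed to the model, e.g.:
//   Product.find(buildProductFilter(req.body.category, req.body.sizes), callback)
```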
I am working through a MEAN stack tutorial. It contains the following code as a route in index.js. The name of my Mongo collection is brandcollection.
/* GET Brand Complaints page. */
router.get('/brands', function(req, res) {
  var db = req.db;
  var collection = db.get('brandcollection');
  collection.find({},{},function(e,docs){
    res.render('brands', {
      "brands" : docs
    });
  });
});
I would like to modify this code but I don't fully understand how the .find method is being invoked. Specifically, I have the following questions:
What objects are being passed to function(e, docs) as its arguments?
Is function(e, docs) part of the MongoDB syntax? I have looked at the docs on Mongo CRUD operations and couldn't find a reference to it. And it seems like the standard syntax for a Mongo .find operation is collection.find({},{}).someCursorLimit(). I have not seen a reference to a third parameter in the .find operation, so why is one allowed here?
If function(e, docs) is not a MongoDB operation, is it part of the Monk API?
It is clear from the tutorial that this block of code returns all of the documents in the collection and places them in an object as an attribute called "brands." However, what role specifically does function(e, docs) play in that process?
Any clarification would be much appreciated!
The first parameter is the query.
The second parameter (which is optional) is the projection, i.e. it restricts which fields of the matched documents are returned.
collection.find( { qty: { $gt: 25 } }, { item: 1, qty: 1 },function(e,docs){})
would mean to get only the item and qty fields in the matched documents.
The third parameter is the callback function, which is called after the query is complete. function(e, docs) follows the callback convention of the MongoDB driver for Node.js. The first parameter, e, is the error; the second, docs, is the array of matched documents (the names can be anything you want). If an error occurs, it is passed in e. If the query is successful, the matched documents are passed in docs.
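The error-first callback pattern described here can be sketched without a database. The function below is a stand-in, not part of Monk or the MongoDB driver: it filters a small in-memory array and then reports the outcome through callback(error, docs). The name findQtyGreaterThan and the sample data are hypothetical.

```javascript
// Hypothetical in-memory collection.
const inventory = [
  { item: "journal", qty: 25, status: "A" },
  { item: "notebook", qty: 50, status: "A" },
  { item: "paper", qty: 100, status: "D" },
];

// Stand-in for find(query, projection, callback) with a fixed query shape.
function findQtyGreaterThan(minQty, fields, callback) {
  let docs;
  try {
    if (typeof minQty !== "number") {
      throw new TypeError("minQty must be a number");
    }
    docs = inventory
      .filter((doc) => doc.qty > minQty)
      .map((doc) => {
        // Keep only the requested fields, like a projection would.
        const projected = {};
        fields.forEach((field) => {
          projected[field] = doc[field];
        });
        return projected;
      });
  } catch (err) {
    callback(err); // failure: first argument carries the error
    return;
  }
  callback(null, docs); // success: error is null, docs holds the results
}

findQtyGreaterThan(25, ["item", "qty"], function (e, docs) {
  if (e) {
    console.log("query failed: " + e.message);
    return;
  }
  console.log(docs.length + " matching documents");
});
```

In the tutorial's route, res.render is called inside the callback for the same reason: docs only exists once the query has finished.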
The cursor has various methods which can be used to manipulate the matched documents before MongoDB returns them.
collection.find( { qty: { $gt: 25 } }, { item: 1, qty: 1 })
returns a cursor, and you can perform various operations on it.
collection.find( { qty: { $gt: 25 } }, { item: 1, qty: 1 }).skip(10).limit(5).toArray(function(e,docs){
  ...
})
This means you skip the first 10 matched documents and then return a maximum of 5 documents.
All this stuff is given in the docs. I think it's better to use mongoose instead of the native driver because of the features and the popularity.
I have looked a long time and not found an answer. The Node.JS MongoDB driver docs say you can do bulk inserts using insert(docs) which is good and works well.
I now have a collection with over 4,000,000 items, and I need to add a new field to all of them. Usually mongodb can only write 1 transaction per 100ms, which means I would be waiting for days to update all those items. How can I do a "bulk save/update" to update them all at once? update() and save() seem to only work on a single object.
pseudo-code:
var stuffToSave = [];
db.collection('blah').find({}).toArray(function(err, stuff) {
  stuff.forEach(function(item) {
    item.newField = someComplexCalculationInvolvingALookup();
    stuffToSave.push(item);
  });
  // hypothetical method: this is the bulk save I'm looking for
  db.saveButNotSuperSlow(stuffToSave);
});
Sure, I'll need to put some limit on it, doing something like 10,000 at a time rather than all 4 million at once, but I think you get the point.
MongoDB allows you to update many documents that match a specific query using a single db.collection.update(query, update, options) call, see the documentation. For example,
db.blah.update(
  { },
  {
    $set: { newField: someComplexValue }
  },
  {
    multi: true
  }
)
The multi option allows the command to update all documents that match the query criteria. Note that the exact same thing applies when using the Node.JS driver, see that documentation.
If you're performing many different updates on a collection, you can wrap them all in a Bulk() builder to avoid some of the overhead of sending multiple updates to the database.
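When the new value differs per document, as in the question, a single multi update cannot express it, so batching individual updates is one option. The sketch below only prepares the work: it builds one update operation per item and splits them into batches. The helper names buildUpdateOps and chunk, the sample items, and the batch size are assumptions; each batch has the shape the Node.js driver's bulkWrite() accepts.

```javascript
// One update per document, each setting its own computed value.
function buildUpdateOps(items, computeNewField) {
  return items.map((item) => ({
    updateOne: {
      filter: { _id: item._id },
      update: { $set: { newField: computeNewField(item) } },
    },
  }));
}

// Split an array into consecutive batches of at most 'size' elements.
function chunk(array, size) {
  const batches = [];
  for (let i = 0; i < array.length; i += size) {
    batches.push(array.slice(i, i + size));
  }
  return batches;
}

// Hypothetical sample items and a stand-in for the "complex calculation".
const sampleItems = Array.from({ length: 25 }, (_, i) => ({ _id: i + 1, price: i * 2 }));
const updateOps = buildUpdateOps(sampleItems, (item) => item.price + 1);
const updateBatches = chunk(updateOps, 10);
console.log(updateBatches.map((batch) => batch.length));

// With an open connection, each batch would be sent in a single round trip, e.g.:
//   db.collection('blah').bulkWrite(batch)
```

For 4 million documents the items would be read through a cursor and batched as they arrive, rather than loaded into one array first.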