Mongoose bulk insert or update documents - node.js

I am working on a node.js app, and I've been looking for an alternative to the Model.save() function, because I want to save many documents at the same time and it would be a waste of network and processing to do it one by one.
I found a way to bulk insert. However, my model has two properties that make each document unique, an ID and a HASH (I am getting this data from an API, so I believe I need both pieces of information to identify a document). If I get an object that already exists, I want it to be updated instead of inserted into the collection.
Is there any way to do that? I was reading about making concurrent calls to save the objects using Q, but I still think this would generate an unwanted load on the Mongo server, wouldn't it? Does Mongo or Mongoose have a method to bulk insert or update, like it does for insert?
Thanks in advance

I think you are looking for the Bulk.find(<query>).upsert().update(<update>) function.
You can use it this way:
var bulk = db.yourCollection.initializeUnorderedBulkOp();
for (<your for statement>) {
    bulk.find({ID: <your id>, HASH: <your hash>}).upsert().update({<your update fields>});
}
bulk.execute(<your callback>);
For each document, it will look for a document matching the {ID: <your id>, HASH: <your hash>} criteria. Then:
If it finds one, it will update that document using {<your update fields>}
Otherwise, it will create a new document
As you wanted, this does not make a round trip to the Mongo server on each iteration of the for loop. Instead, a single call is made on the bulk.execute() line.
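With Mongoose itself, the same one-upsert-per-document batch can also be expressed through Model.bulkWrite(). The following is only a minimal sketch, not the method from the answer above: buildUpsertOps and upsertAll are hypothetical helper names, and it assumes each incoming object carries the ID and HASH fields described in the question.

```javascript
// Build one upsert operation per document, keyed on ID + HASH.
// buildUpsertOps and upsertAll are hypothetical helper names.
function buildUpsertOps(docs) {
  return docs.map(function (doc) {
    return {
      updateOne: {
        filter: { ID: doc.ID, HASH: doc.HASH }, // the two fields that identify a document
        update: { $set: doc },                  // overwrite the stored fields with the new values
        upsert: true                            // insert when no document matches the filter
      }
    };
  });
}

// Hand the whole batch to the model in one bulkWrite() call.
// `Model` is assumed to be a compiled Mongoose model.
function upsertAll(Model, docs) {
  return Model.bulkWrite(buildUpsertOps(docs), { ordered: false });
}
```

Passing ordered: false mirrors the unordered bulk operation above: one failing write does not stop the remaining ones.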

Related

Node.JS/Express - how to avoid multiple database queries

I have a basic Express app and I'm getting started with db queries. I want to know how to avoid multiple db queries, because I don't think the way I do it is efficient:
app.get('/:word', function(req, res) {
    var word = req.params.word; // the word comes from the URL parameter
    db.create({'name': word});
    console.log('the word is ' + word);
});
What i want to do is :
get the word from the url
check if it exists in the database (or was previously requested, because if it was then it was probably added already through this basic code)
if it doesn't exist then add it and then proceed to console.log
I want to add each word to my database once only and not run the db query again and again.
Here's what im thinking :
Not so efficient way
query to check if it exists before inserting one
Good way, but I don't know how to start here
Cache the word being queried and maintain cache to prevent db queries
More info edit
I'm using mongodb via mongoose
the 'word' key is already unique so i know its not creating duplicate values
I don't want to run ANY db queries if that value or that URL has already been hit once
The only way to check if the word already exists is to query the database before inserting. There are libraries (and also databases) that implement a findOrCreate method, but this is always just an abstraction. Behind the scenes, the database will search for an existing value before writing.
If your database is huge and querying it is not suitable, you could use a caching system (like Redis). But this definitely depends on your logic and your data size.
You can probably optimize the process simply by adding an index to the field you want to be unique (I guess it's name?).
You could also define the name field as unique. When inserting, the database will throw an error if the document already exists. But keep in mind, again, that behind the scenes the database is querying for an existing identical value before inserting. The advantage of a "unique" field is that the index for it is created automatically, and from your app logic (node js) you can just call the insert method and add a little error handling logic.
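The "insert and handle the error" approach could look roughly like this. It is a sketch under assumptions: insertOnce is a hypothetical helper name, and `Word` stands for a Mongoose model whose name field has a unique index. MongoDB reports a unique index violation with error code 11000.

```javascript
// Try to insert and treat a duplicate-key error as "already stored".
// `Word` is assumed to be a Mongoose model with a unique index on `name`;
// insertOnce is a hypothetical helper name.
async function insertOnce(Word, name) {
  try {
    await Word.create({ name: name });
    return true; // the word was inserted
  } catch (err) {
    if (err && err.code === 11000) {
      return false; // duplicate key: the word was already stored
    }
    throw err; // anything else is a real failure
  }
}
```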
MongoDB will create any collections you use in your app if they do not already exist.
Insert Unique Value :
Create a unique index on your key so that each value can be added only once. If you try to add it again, the database will throw an error.
To create Unique Index,
db.collection.createIndex( { "name": 1 }, { unique: true } )
Caching :
For caching, store your data in a cache system (such as memory-cache or Redis) the first time it is queried from MongoDB, and read it from the cache on subsequent requests.
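As a rough illustration of the caching idea, the sketch below keeps the words already handled in an in-process Set, which stands in for memory-cache or Redis. The names seenWords and storeOnce are hypothetical, and `store` is assumed to be whatever function performs the database write.

```javascript
// In-process cache of words that have already been written.
// A plain Set stands in for memory-cache or Redis; it is emptied on restart.
const seenWords = new Set();

// Calls `store(word)` only if the word is not in the cache yet.
// `store` is a hypothetical function that performs the database write.
async function storeOnce(word, store) {
  if (seenWords.has(word)) {
    return false; // already handled, no database call
  }
  await store(word);
  seenWords.add(word); // remember only words whose write succeeded
  return true;
}
```

Two simultaneous requests for the same new word can still both reach the database, so a unique index remains useful as the final guard.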
In MongoDB you can use findOneAndUpdate with the optional flag upsert: true (see the documentation).
To ensure that every word appears only once, you should also set a unique index on that field. However, remember that a unique index is case sensitive, so Cat and cat are different words.
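A possible shape for the findOneAndUpdate approach is sketched below. normalizeWord and findOrCreateWord are hypothetical helper names, `Word` is assumed to be a Mongoose model with a unique index on name, and lowercasing before the query is one way (an assumption, not part of the answer) to make Cat and cat land on the same document.

```javascript
// Normalise the word so that "Cat" and "cat" map to the same document.
function normalizeWord(word) {
  return String(word).trim().toLowerCase();
}

// Find the word, or create it if it is missing, in a single call.
// `Word` is assumed to be a Mongoose model with a unique index on `name`.
function findOrCreateWord(Word, word) {
  const name = normalizeWord(word);
  return Word.findOneAndUpdate(
    { name: name },
    { $setOnInsert: { name: name } }, // only written when the document is created
    { upsert: true, new: true }       // create if missing, return the resulting document
  );
}
```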

Couch db bulk operations

So I've been trying to move data from one database to another. I've already moved the documents, but I need to clear the ones I've moved from the old database. I've been using ektorp's executeBulk to perform bulk operations. But for some reason I keep getting a document update conflict when I try to delete in bulk by inserting _deleted.
I might be doing it wrong, here is what I did.
Fetch by bulk with include docs. (For some reason, this doesn't work with just id and rev.)
Then include the _deleted field to each document.
Post using executebulk.
It works for some documents but keeps getting document update conflict for some documents.
Any solution/suggestions please..
This is the preferred way of deleting docs in bulk:
List<Object> bulkDocs = ...
MyClass toBeDeleted = ...
bulkDocs.add(BulkDeleteDocument.of(toBeDeleted));
db.executeBulk(bulkDocs);
If you only need a way to delete/update docs in bulk and you don't need to necessarily implement it in your own software, you can use the great couchapp at:
https://github.com/harthur/costco
You need to upload it to your own server with a couchapp deployment tool, and use a function like
function(doc) {
if(doc.istodelete) // replace this or remove to delete all docs
return null;
}
Read instructions and examples

Batch update with Mongoose

I'm pulling data from a RETS(XML) feed and saving it in a local MongoDB using node and mongoose.
Periodically I need to update the documents and delete the inactive ones as well as add new ones. Rather than making multiple queries to Mongo or the RETS server, I was pulling both and looping through the data.
This works fine but is there a way to save the Mongoose results back to the database with updates and inserts? Or do I need to find each document and update it individually?
On MongoDB, to update multiple documents (not just one) using Mongoose you can use Model.updateMany (older versions used Model.update with the multi option):
Model.updateMany({
size: 'lage'
}, {
$set: { size: 'large' }
});
See the Mongoose documentation on updating documents for more.
For completeness: if anyone has multiple query conditions and wants to add new fields to every document matching each condition, they can use
var bulk = Person.collection.initializeUnorderedBulkOp();
bulk.find(query1).update(update1);
bulk.find(query2).update(update2);
bulk.execute(callback);
The following documentation says that db.collection.initializeUnorderedBulkOp():
Initializes and returns a new Bulk() operations builder for a
collection. The builder constructs an unordered list of write
operations that MongoDB executes in bulk. MongoDB executes in
parallel the write operations in the list.
https://docs.mongodb.org/v3.0/reference/method/db.collection.initializeUnorderedBulkOp/

How to efficiently bulk insert and update mongodb document values from an array?

I have a Tags collection which contains documents of the following structure:
{
word:"movie", //tag word
count:1 //count of times tag word has been used
}
I am given an array of new tags that need to be added/updated in the Tags collection:
["music","movie","book"]
I can update the counts of all tags that currently exist in the Tags collection by using the following query:
db.Tags.update({word:{$in:["music","movies","books"]}}, {$inc:{count:1}}, true, true);
While this is an effective strategy to update, I am unable to see which tag values were not found in the collection, and setting the upsert flag to true did not create new documents for the unfound tags.
This is where I am stuck, how should I handle the bulk insert of "new" values into the Tags collection?
Is there any other way I could better utilize the update so that it does upsert the new tag values?
(Note: I am using Node.js with mongoose, solutions using mongoose/node-mongo-native would be nice but not necessary)
Thanks ahead
The concept of using upsert and the $in operator simultaneously is incongruous. This simply will not work, as there is no way to differentiate between upsert if *any* in and upsert if *none* in.
In this case, MongoDB is doing the version you don't want it to do. But you can't make it change behaviour.
I would suggest simply issuing three consecutive writes by looping through the array of tags. I know it's annoying and it has a bad code smell, but that's just how MongoDB works.
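The per-tag writes could be sketched as one upsert per tag, as below. This is a variation on the loop suggested above rather than the answer's exact method: buildTagOps is a hypothetical helper name, and the resulting array is meant to be passed to bulkWrite() (available on Mongoose models and on the native driver's collections) so the individual writes travel together.

```javascript
// Build one upsert per tag: increment `count` if the tag exists,
// otherwise create { word: tag, count: 1 }.
// buildTagOps is a hypothetical helper name.
function buildTagOps(tags) {
  return tags.map(function (tag) {
    return {
      updateOne: {
        filter: { word: tag },          // exactly one tag per operation, no $in
        update: { $inc: { count: 1 } }, // on insert, $inc starts the counter at 1
        upsert: true
      }
    };
  });
}
```

Assuming a model named Tags, usage would be along the lines of Tags.bulkWrite(buildTagOps(["music", "movie", "book"])).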

How to get Post with Comments Count in single query with CouchDB?

I can use map-reduce to build a standalone view [{key: post_id, value: comments_count}], but then I have to hit the DB twice: one query to get the post, another to get comments_count.
There's also another way (Rails does this): count comments manually on the application server and save the result in a comment_count attribute of the post. But then we need to update the whole post document every time a comment is added or deleted.
It seems to me that CouchDB is not tuned for this approach: unlike an RDBMS, where we can update only the comment_count attribute, in CouchDB we are forced to update the whole post document.
Maybe there's another way to do it?
Thanks.
The view's return json includes the document count as 'total_rows', so you don't need to compute anything yourself, just emit all the documents you want counted.
{"total_rows":3,"offset":0,"rows":[
{"id":...,"key":...,"value":doc1},
{"id":...,"key":...,"value":doc2},
{"id":...,"key":...,"value":doc3}]
}
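For reference, the standalone counting view mentioned in the question could be defined roughly as follows. This is a sketch under assumptions: the design document name, the view name, and the type and post_id fields are all hypothetical, since the question does not show the document layout. The built-in _count reduce counts the emitted rows per key.

```javascript
// Design document with a view that counts comments per post.
// The names `comments`, `count_by_post`, `type` and `post_id` are assumptions.
const designDoc = {
  _id: '_design/comments',
  views: {
    count_by_post: {
      // CouchDB stores view functions as source text.
      map: "function (doc) { if (doc.type === 'comment') { emit(doc.post_id, null); } }",
      // Built-in reduce that counts the emitted rows.
      reduce: '_count'
    }
  }
};
```

Querying this view with group=true returns one row per post_id whose value is the number of comments. Fetching the post itself is still a separate request.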
