Chaining nested asynchronous finds with Node.js monk and MongoDB - node.js

Using Node.js monk and MongoDB, I want to mimic a table join:
Do a find on collection A
For each result X, do a find in collection B, and update X
Return updated list of results
The asynchronous nature of database commands in monk is giving me trouble.
This is my initial code. It doesn't work because the second call to find returns a promise immediately,
and the results in xs are sent in the response before they can be updated.
var db = require('monk')('localhost/mydb');
var coll_b = db.get('collection_b');
db.get('collection').find({}, function(e, xs) {
    xs.forEach(function(x) {
        // This inner find is asynchronous, so it has not completed
        // by the time the response below is sent.
        coll_b.find({a_id: x._id}, function(e, bs) {
            x['bs'] = bs;
        });
    });
    res.json({'results': xs});
});
I feel like I should use promise chaining here, but I cannot figure out how to do it.
Any help would be greatly appreciated.

I think I solved it in this way, inspired by this answer:
var db = require('monk')('localhost/mydb');
var coll_b = db.get('collection_b');
// Initial find
db.get('collection').find({}, function(e, xs) {
    // Get inner finds as a list of functions which return promises
    var tasks = xs.map(function(x) {
        return function() {
            return coll_b.find({a_id: x._id}, function(e, bs) {
                x['bs'] = bs;
            });
        };
    });
    // Chain tasks together
    var p = tasks[0](); // start the first one
    for (var i = 1; i < tasks.length; i++) p = p.then(tasks[i]);
    // After all tasks are done, output results
    p.then(function(_x) {
        res.json({'results': xs});
    });
});
I still feel like this code could be minimised by using chain(), but at least this works as expected.
Note: I realise that performing a second find for each result is not necessarily efficient, but that's not my concern here.
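One possible minimisation of the chaining above, as a sketch only: assuming a version of monk whose find() returns a promise when called without a callback, the task list and for-loop inside the initial find callback can be folded into a single chain with reduce (which also avoids calling tasks[0]() on an empty result set):
var p = xs.reduce(function(prev, x) {
    return prev.then(function() {
        return coll_b.find({a_id: x._id}).then(function(bs) {
            x['bs'] = bs;
        });
    });
}, Promise.resolve());
// After all finds are done, output results.
p.then(function() {
    res.json({'results': xs});
});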

Related

Is this happening because of the asynchronous nature of Node.js, or is there an alternate way to achieve this?

This is what I am trying to do.
I have an empty array:
var send_data = [];
I am using the "sync-each" npm library; before that I was doing the iterations with a map callback, but got stuck in the same situation.
Here's my code.
var each = require('sync-each');
client.execute(someQuery, [value], (err, data) => {
    var items = data.rows;
    each(items, (item, next) => {
        // Here I perform some if-else checks and some Cassandra
        // database queries, then push the value into send_data.
        if (item.type == true) {
            send_data.push({value: item.message, flag: true});
        } else {
            send_data.push({value: item.message, flag: false});
        }
        next(); // signal that this item is done
    }, (err, transformedItems) => {
        if (err) {
            console.log(err);
        }
    });
});
My program runs without any errors, but when I log the final output I get an unsorted list of array values like
[{value:1},{value:3},{value:2},{value:4}]
Is there a way to correct this?
You can use the map function, which makes more sense for your case:
var items = [1,2,3,4];
var send_data = items.map((item)=>({value:item}));
console.log(send_data);
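If each row actually needs an asynchronous Cassandra query before being pushed, one way to keep the output in input order is to map every row to a promise and collect them with Promise.all, which preserves the order of the input array no matter which query finishes first. A sketch, where queryForRow is a hypothetical wrapper around your per-row database call:
// Hypothetical per-row async lookup; replace with your Cassandra query.
function queryForRow(item) {
    return Promise.resolve({value: item.message, flag: item.type == true});
}

client.execute(someQuery, [value], (err, data) => {
    if (err) return console.log(err);
    // Results arrive in the same order as data.rows, regardless of
    // which individual query completes first.
    Promise.all(data.rows.map(queryForRow)).then((send_data) => {
        console.log(send_data);
    });
});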

How to control serial and parallel control flow with mapped functions?

I've drawn a simple flow chart, which basically crawls some data from the internet and loads it into the database. So far I had thought I was at peace with promises, but now I have an issue I've been working on for at least three days without a single step forward.
Here is the flow chart:
Consider there is a static string array like so: const courseCodes = ["ATA", "AKM", "BLG", ... ].
I have a fetch function; it basically does an HTTP request followed by parsing, and afterwards it returns an object array.
fetch works perfectly, invoking its callback with the expected object array; it even worked with promises, which was much tidier.
fetch should be invoked with every element in the courseCodes array as its parameter. These tasks should be performed in parallel, since the separate fetch calls do not affect each other.
As a result, there should be a results array in the callback (or the promise's resolve parameter), which is an array of arrays of objects. With those results, I should invoke loadCourse with each object array in results as its parameter. Those calls should be performed serially, because each one queries the database for a similar object and adds it only if it is not there.
How can I perform these kinds of tasks in Node.js? I could not maintain the asynchronous flow in a scenario like this. I've failed with the caolan/async library and with the bluebird & q promise libraries.
Try something like this:
const courseCodes = ["ATA", "AKM", "BLG", ... ];

// Stores the tasks to be performed.
var parallelTasks = [];
var serialTasks = [];

// Keeps track of courses fetched & results.
var courseFetchCount = 0;
var results = {};

// Your fetch function.
function fetch(course_code) {
    // Your code to fetch & parse goes here.
    // Store the result for each course in the results object.
    results[course_code] = 'whatever result comes from your fetch & parse code...';
    // Call this from the completion callback of your fetch & parse
    // code, so the count only advances when the work is really done.
    CheckIfAllCoursesFetched();
}

// Your load function.
function loadCourse(results) {
    for (var index in results) {
        var result = results[index]; // result for a single course
        var task = (function(result) {
            return function() {
                saveToDB(result);
                // Hand control to the next serial task.
                nextInSerial(null, serialTasks.shift());
            };
        })(result);
        serialTasks.push(task);
    }
    // Execute the serial tasks for saving results to the database.
    nextInSerial(null, serialTasks.shift());
}

// Pseudo function to save a result to the database.
function saveToDB(result) {
    // Your code to store in db here.
}

// Checks if fetch() is complete for all course codes in your array
// and then starts the serial tasks for saving results to the database.
function CheckIfAllCoursesFetched() {
    courseFetchCount++;
    if (courseFetchCount == courseCodes.length) {
        // Now process courses serially.
        loadCourse(results);
    }
}

// Helper function that executes tasks in serial fashion,
// stopping when the queue runs out.
function nextInSerial(err, task) {
    if (err) throw Error(err.message);
    if (task) task();
}

// Build the parallel tasks for fetching...
for (var index in courseCodes) {
    var course_code = courseCodes[index];
    var task = (function(course_code) {
        return function() {
            fetch(course_code);
        };
    })(course_code);
    parallelTasks.push(task);
}
// ...then start them all at once.
for (var task_index in parallelTasks) {
    parallelTasks[task_index]();
}
Or you may refer to the nimble npm module.
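For comparison, here is a sketch of the same parallel-fetch / serial-load flow using plain promises; fetchCourse and loadCourse below are hypothetical stand-ins for your own fetch and database code:
// Placeholder: your HTTP request + parse, returning a promise of an object array.
function fetchCourse(code) {
    return Promise.resolve([{code: code}]);
}
// Placeholder: your query-then-insert-if-missing logic, returning a promise.
function loadCourse(objects) {
    return Promise.resolve();
}

// Fetch all courses in parallel; Promise.all keeps results in courseCodes order.
Promise.all(courseCodes.map(fetchCourse))
    .then(function(results) {
        // Load each result serially by folding over a promise chain.
        return results.reduce(function(prev, objects) {
            return prev.then(function() {
                return loadCourse(objects);
            });
        }, Promise.resolve());
    })
    .then(function() {
        console.log('all courses fetched and loaded');
    })
    .catch(console.error);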

Sequelize migration - run an array of queries in order

I have a series of queries that need to run in a specific order. I've been trying this:
var queries = []
queries.push('update blah set foo="bar"')
queries.push('update baz set bar="foo"')
for (var i = 0; i < queries.length; i++) {
    Promise.all([
        migration.sequelize.query(queries[i]).then(function(result) {
            console.log(result)
        })
    ])
}
done();
This does not work as anticipated. Any suggestions?
UPDATE: using recursion in the callback seems to work
var queries = []
queries.push('update blah set foo="bar"')
queries.push('update baz set bar="foo"')
var index = 0
var execute = function(queries) {
    if (typeof queries[index] == 'undefined') {
        return done()
    }
    console.log(queries[index])
    migration.sequelize.query(queries[index]).then(function(result) {
        console.log(result)
        index += 1
        return execute(queries)
    })
}
execute(queries)
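For reference, the same ordering can be achieved without manual recursion by folding the queries into a single promise chain with reduce; this is a sketch against the same migration object and done callback:
queries.reduce(function(prev, query) {
    // Each query starts only after the previous one resolves.
    return prev.then(function() {
        return migration.sequelize.query(query);
    });
}, Promise.resolve()).then(function() {
    done();
});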
If you're on io.js and can write generators, or are transpiling with Babel or TypeScript and can write async functions, this becomes very easy.
async function runSerialQueries(queries) {
    var results = [];
    for (var i = 0; i < queries.length; i++) {
        var query = queries[i];
        var result = await migration.sequelize.query(query);
        results.push(result);
    }
    return results;
}
runSerialQueries([
    'update blah set foo="bar"',
    'update baz set bar="foo"'
]).then(function(results) {
    // ...
})
NOTE: The async keyword used above is a native JavaScript feature (standardized in ECMAScript 2017), not to be confused with the "async" npm module, which is a completely different thing.
Anyhow, this solution ensures that one query doesn't start until the previous one ends, and will go in the order of the original array. It basically behaves just like it reads. If you can write generators but not async functions, something nearly identical can be done with generators, but requires a library like co() or Bluebird.coroutine().
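For anyone on that path, here is a sketch of the generator-based equivalent using co(); yield plays the role that await plays above, and migration is the same object as before:
var co = require('co');

// co() runs the generator, resuming it each time a yielded promise
// resolves, which gives the same one-at-a-time behavior as async/await.
function runSerialQueries(queries) {
    return co(function* () {
        var results = [];
        for (var i = 0; i < queries.length; i++) {
            results.push(yield migration.sequelize.query(queries[i]));
        }
        return results;
    });
}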
Check out this excellent article for background on these techniques. They really are the future of JavaScript:
https://blog.risingstack.com/asynchronous-javascript/ <-- highly recommended!
I don't exactly know what the problem is with your code (if there is any).
You should try the eachSeries function of the async lib, which runs the iterator for one item at a time, in order.
For example:
async.eachSeries(queries, function(query, done) {
    // do the query here
    migration.sequelize.query(query).then(function() {
        done(); // this query is done, so run the next one
    }, done);
});

How to synchronize MongoDB async query in NodeJS

I have a for-loop statement and an async MongoDB query inside the loop body. What I want to do is make a find query against my MongoDB database and push the result into an array.
Here is the code:
function arrResult() {
    var arr = [];
    for (...) {
        collection.find({ foo: i }, function(err, cursor) {
            arr.push(cursor);
        });
    }
    return arr;
}
But it's obvious that the return value of the function would be an empty Array.
I want to tackle this problem using the Q module. Are there any solutions?
Yes, promises are a very easy abstraction to deal with this. You can execute the queries in parallel, and collect their results with Q.all.
In particular, with Q it would look like this:
function arrResult(…) {
    var promises = [];
    for (…)
        promises.push( Q.ninvoke(collection, "find", {foo: i}) );
    return Q.all(promises);
}
arrResult(…).then(function(arr) {
    …
}, function(err) {
    // first error, if any occurred
});
You need a sync mechanism that acts like a process gate.
Each returning query has to arrive at the gate, e.g. decrement some counter and deposit its result.
When all have arrived at the gate, a final callback returns the collected results.
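A minimal sketch of such a counter gate, reusing the collection and loop shape from the question (note the results arrive in completion order, not loop order):
function arrResult(n, callback) {
    var arr = [];
    var remaining = n; // the gate's counter
    for (var i = 0; i < n; i++) {
        collection.find({ foo: i }, function(err, cursor) {
            arr.push(cursor); // deposit this query's result
            remaining--;      // one more query has arrived at the gate
            if (remaining === 0) {
                callback(arr); // all arrived: hand back the results
            }
        });
    }
}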

Synchronize node.js object

I am using a variable that is used by many functions at a time. I need to synchronize it. How do I do it?
var x = 0;
var a = function() {
    x = x + 1;
}
var b = function() {
    x = x + 2;
}
var c = function() {
    var t = x;
    return t;
}
This is the simplified logic of my code. To give more insight, x is effectively my mongoDB object, which needs to be used by only one function at a time. Also, the 3 functions are like REST API calls, so there is a probability they will be called at the same time.
I need to write getX function which should manage locking and unlocking.
Any suggestions?
Node is single threaded, so there is no chance of the 3 functions being executed at the same time. Synchronization and race conditions only apply in multithreaded environments. There is a case, though, if the first function blocks for I/O.
You are asking about keeping a single object synchronized as several asynchronous operations modify that object. This is a bit vague (do you need to execute them in order? do they change the same properties?), so it's hard to give a catch-all solution. I suggest that you determine what order, if any, the operations must take place in, and use the async library to handle the control flow.
The async.waterfall method (example below) is useful if you want to pass results down a chain of functions that execute in order. There are many other useful functions in the library, like async.eachSeries (execute a function once per array item, in order) and async.parallel (execute an array of functions simultaneously). All docs are available at https://github.com/caolan/async
var async = require('async');

function calculateX(callback) {
    async.waterfall(
        [
            function(done) {
                var x = 0;
                asyncCall1(x, function(x1) { // adds: x1 = x + 1
                    done(null, x1);
                });
            },
            function(x1, done) {
                asyncCall2(x1, function(x2) { // adds: x2 = x1 + 2
                    done(null, x2);
                });
            }
        ],
        function(err, x2) {
            var t = x2;
            callback(t);
        });
}

calculateX(function(x2) {
    mongo.save(x2, function(err) { // or something, idk mongo
        if (err) { console.log(err); }
    });
});
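Since async.parallel is mentioned above without an example, here is a minimal sketch of its shape; the setTimeout bodies are placeholders for real async work:
// Each task receives a callback; async.parallel collects the results
// (in task order) once every callback has fired.
async.parallel([
    function(done) {
        setTimeout(function() { done(null, 'first result'); }, 100);
    },
    function(done) {
        setTimeout(function() { done(null, 'second result'); }, 50);
    }
], function(err, results) {
    if (err) return console.log(err);
    console.log(results); // ['first result', 'second result']
});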
