Execute when both(!) events fire .on('end') - node.js

I have a node app that reads two files as streams. I use event.on('end') to then work with the results. The problem is I don't really know how I can wait for BOTH events to trigger 'end'.
What I have now is:
reader1.on('end', function(){
    reader2.on('end', function(){
        doSomething();
    });
});
With small files this works, but if one of the files is very large the app aborts.

Your execution logic is flawed: by nesting the handlers, reader2's 'end' listener is only attached after reader1 has already ended, so if reader2 finishes first its 'end' event fires with nobody listening and doSomething() never runs. Track the two events independently instead:
var checklist = [];
// checklist acts as a counter of how many readers have finished
function reader_end(){
    if (checklist.length == 2)
        // doSomething only once both have been added to the checklist
        doSomething();
}
reader1.on('end', function() {
    checklist.push('reader1'); // mark reader1 as done
    reader_end();
});
reader2.on('end', function() {
    checklist.push('reader2');
    reader_end();
});
That said, there are libraries that handle this sort of thing more cleanly, like Async and promise libraries such as Bluebird.
With Async you'll need to use compose:
var r12_done = async.compose(reader1.on, reader2.on);
r12_done('end', function(){
    doSomething();
});
Edit: I just noticed that since reader1.on is presumably a stream's 'end' subscription, whose listener doesn't have the standard (err, results) callback signature, this probably won't work. In that case you should just go with the Promise approach.
With Bluebird's Promise you'll need to first promisify and then join:
var reader1Promise = Promise.promisify(reader1.on.bind(reader1))('end');
var reader2Promise = Promise.promisify(reader2.on.bind(reader2))('end');
var reader12Promise = Promise.join(reader1Promise, reader2Promise);
reader12Promise.then(function(){
doSomething();
});
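On recent Node versions you can also do this without any library at all; a minimal sketch with native Promises (assuming reader1 and reader2 are ordinary readable streams, with 'error' wired up so a failure doesn't hang the join):
function streamEnd(stream) {
    return new Promise(function (resolve, reject) {
        stream.on('end', resolve);
        stream.on('error', reject);
    });
}

Promise.all([streamEnd(reader1), streamEnd(reader2)]).then(function () {
    doSomething();
});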

Related

How to force sequential statement execution in Node?

I am new to Node, so please forgive me if my question is too simple. I fully appreciate the async paradigm and why it is useful in a single thread. But some logical operations are synchronous by nature.
I have found many posts about the async/sync issue and have been reading for two whole days about callbacks, promises, async/await, etc.
But I still cannot figure out what should be a straightforward and simple thing to do. Am I missing something?
Basically for the code below:
const fs = require('fs');
var readline = require('readline');
function getfile (aprodfile) {
    var prodlines = [];
    var prodfile = readline.createInterface({input: fs.createReadStream(aprodfile)});
    prodfile.on('line', (line) => { prodlines.push(line); });
    prodfile.on('close', () => console.log('Loaded ' + prodlines.length));
    // the above block is repeated several times to read several other
    // files, but those are omitted here for simplicity
    console.log(prodlines.length);
    // then 200+ lines that assume prodlines is already filled
}
the output I get is:
0
Loaded 11167
whereas the output I expect is:
Loaded 11167
11167
This is because the console.log statement executes before the prodfile.on events are completed.
Is there a nice clean way to tell Node to execute commands sequentially, even if blocking? or better still to tell console.log (and the 200+ lines of code following it) to wait until prodlines is fully populated?
Here's the execution order of what you wrote:
prodfile.on('line', line => {                // (1) subscribe to the 'line' event
    prodlines.push(line)                     // (4) runs whenever a 'line' event is emitted
});
prodfile.on('close', () => {                 // (2) subscribe to the 'close' event
    console.log('Loaded '+prodlines.length)  // (5) runs when the 'close' event is emitted
});
console.log(prodlines.length);               // (3) --> so it logs 0, nothing has happened yet
What you can do is this:
function getfile (aprodfile) {
    var prodlines = [];
    var prodfile = readline.createInterface({input: fs.createReadStream(aprodfile)});
    prodfile.on('line', line => { prodlines.push(line); });
    prodfile.on('close', () => {
        console.log('Loaded ' + prodlines.length);
        finishedFetching(prodlines);
    });
}

function finishedFetching (prodlines) {
    console.log(prodlines.length); // 11167!
}
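If you'd rather not thread a callback through, the same idea can be wrapped in a Promise that resolves on 'close'. A minimal sketch, assuming the same fs/readline setup as above ('prodfile.txt' is just a placeholder file name):
function getfile (aprodfile) {
    return new Promise(function (resolve, reject) {
        var prodlines = [];
        var input = fs.createReadStream(aprodfile);
        input.on('error', reject); // surface file errors instead of hanging
        var prodfile = readline.createInterface({input: input});
        prodfile.on('line', (line) => { prodlines.push(line); });
        prodfile.on('close', () => { resolve(prodlines); });
    });
}

getfile('prodfile.txt').then(function (prodlines) {
    console.log(prodlines.length); // runs only once the whole file is read
});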

Better way to write a simple Node redis loop (using ioredis)?

So, I'm still learning the JS/Node way, coming from a long time in other languages.
I have a tiny micro-service that reads from a redis channel, temp stores it in a working channel, does the work, removes it, and moves on. If there is more in the channel it re-runs immediately. If not, it sets a timeout and checks again in 1 second.
It works fine...but timeout polling doesn't seem to be the "correct" way to approach this. And I haven't found much about using BRPOPLPUSH to try to block (vs. RPOPLPUSH) and wait in Node....or other options like that. (Pub/Sub isn't an option here...this is the only listener, and it may not always be listening.)
Here's the short essence of what I'm doing:
var Redis = require('ioredis');
var redis = new Redis();
var redisLoop = function () {
    redis.rpoplpush('channel', 'channel-working').then(function (result) {
        if (result) {
            processJob(result); // do stuff
            // delete the item from the working channel, and check for another item
            redis.lrem('channel-working', 1, result).then(function (result) { });
            redisLoop();
        } else {
            // no items, wait 1 second and try again
            setTimeout(redisLoop, 1000);
        }
    });
};
redisLoop();
I feel like I'm missing something really obvious. Thanks!
BRPOPLPUSH doesn't block Node's event loop; the blocking happens on the Redis connection while the client waits for a reply. In this instance I think it's exactly what you need to get rid of the polling.
var Redis = require('ioredis');
var redis = new Redis();
var redisLoop = function () {
    redis.brpoplpush('channel', 'channel-working', 0).then(function (result) {
        // because we are using BRPOPLPUSH, the client promise will not resolve
        // until a 'result' becomes available
        processJob(result);
        // delete the item from the working channel, and check for another item
        redis.lrem('channel-working', 1, result).then(redisLoop);
    });
};
redisLoop();
Note that redis.lrem is asynchronous, so you should use lrem(...).then(redisLoop) to ensure that your next tick executes only after the item is successfully removed from channel-working.
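One caveat worth sketching (not something your code needs verbatim): if processJob throws or the connection drops, a bare then chain stops the loop silently. Chaining a rejection handler keeps it alive; the one-second back-off here is an arbitrary choice:
var redisLoop = function () {
    redis.brpoplpush('channel', 'channel-working', 0).then(function (result) {
        processJob(result);
        // remove the item, then go around again
        return redis.lrem('channel-working', 1, result);
    }).then(redisLoop, function (err) {
        console.error(err);
        setTimeout(redisLoop, 1000); // back off briefly before resuming
    });
};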

Synchronize node.js object

I am using a variable that is used by many functions at a time, and I need to synchronize it. How do I do it?
var x = 0;
var a = function(){
    x = x + 1;
};
var b = function(){
    x = x + 2;
};
var c = function(){
    var t = x;
    return t;
};
This is the simplified logic of my code. To give more insight, x is effectively my MongoDB object, which needs to be used by only one function at a time. Also, the three functions are REST API calls, so there is a probability they will be called at the same time.
I need to write getX function which should manage locking and unlocking.
Any suggestions?
Node is single threaded, so there is no chance of the three functions being executed at the same time. Synchronization and race conditions only apply in multithreaded environments. There is a case, though, where it matters: if a function performs asynchronous I/O, other callbacks can run while it waits.
You are asking about keeping a single object synchronized while several asynchronous operations modify it. This is a bit vague (do you need to execute them in order? do they change the same properties?), so it's hard to give a catch-all solution. I suggest that you determine what order, if any, the operations must take place in, and use the async library to handle the control flow.
The async.waterfall method (example below) is useful if you want to pass results down a chain of functions that execute in order. There are many other useful functions included in the library, like async.eachSeries (execute a function once per array item, in order) and async.parallel (execute an array of functions simultaneously). All docs are available at https://github.com/caolan/async
var async = require('async');

function calculateX(callback){
    async.waterfall([
        function(done){
            var x = 0;
            asyncCall1(x, function(x1){ // computes x1 = x + 1
                done(null, x1);
            });
        },
        function(x1, done){
            asyncCall2(x1, function(x2){ // computes x2 = x1 + 2
                done(null, x2);
            });
        }
    ],
    function(err, x2){
        var t = x2;
        callback(t);
    });
}
calculateX(function(x2){
    mongo.save(x2, function(err){ // or something, idk mongo
        if(err){ console.log(err); }
    });
});
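If what you actually need is mutual exclusion (only one function touching x at a time) rather than a fixed order, another common trick is to serialize access through a promise chain. A minimal sketch; incrementX and readX are hypothetical async operations standing in for your three functions:
var queue = Promise.resolve();

function withLock(task) {
    // chain each task onto the previous one, so tasks run strictly one at a time
    var result = queue.then(function () { return task(); });
    queue = result.catch(function () {}); // keep the chain alive after a failure
    return result;
}

withLock(function () { return incrementX(); }); // runs first
withLock(function () { return readX(); });      // runs only after the first finishes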

Is there a better way to execute a node.js callback at regular intervals than setInterval?

I have a node.js script that writes a stream to an array like this:
var tempCrossSection = [];
stream.on('data', function(data) {
    tempCrossSection.push(data);
});
and another callback that empties the array and does some processing on the data like this:
var crossSection = [];
setInterval(function() {
    crossSection = tempCrossSection;
    tempCrossSection = [];
    someOtherFunction(crossSection, function(data) {
        console.log(data);
    });
}, 30000);
For the most part this works, but sometimes the setInterval callback will execute more than once in a 30000ms interval (and it is not a queued call sitting on the event loop). I have also done this as a cronJob with same results. I am wondering if there is a way to ensure that setInterval executes only once per 30000ms. Perhaps there is a better solution altogether. Thanks.
When you have something asynchronous, you should use setTimeout instead; otherwise, if the asynchronous function takes too long, the next interval can fire before the previous run has finished.
var crossSection = [];
setTimeout(function someFunction () {
    crossSection = tempCrossSection;
    tempCrossSection = [];
    someOtherFunction(crossSection, function(data) {
        console.log(data);
        setTimeout(someFunction, 30000); // schedule the next run only after the work completes
    });
}, 30000);
Timers in javascript are not as reliable as many think. It sounds like you found that out already! The key is to measure the time elapsed since the last invocation of the timer's callback to decide if you should run in this cycle or not.
See http://www.sitepoint.com/creating-accurate-timers-in-javascript/
The ultimate goal there is to build a timer that fires, say, every second (at higher precision than your timeout value) and then decides whether it is going to fire your function on this cycle.
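The gist of that technique, as a rough sketch adapted to the code above: record when each tick should have fired, measure how late it actually was, and subtract the drift from the next delay:
var interval = 30000;
var expected = Date.now() + interval;

setTimeout(function tick() {
    var drift = Date.now() - expected; // how late this tick fired
    crossSection = tempCrossSection;
    tempCrossSection = [];
    someOtherFunction(crossSection, function (data) {
        console.log(data);
    });
    expected += interval;
    setTimeout(tick, Math.max(0, interval - drift));
}, interval);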

How to wait for all async calls to finish

I'm using Mongoose with Node.js and have the following code that will call the callback after all the save() calls have finished. However, I feel that this is a very dirty way of doing it and would like to see the proper way to get it done.
function setup(callback) {
    // Clear the DB and load fixtures
    Account.remove({}, addFixtureData);

    function addFixtureData() {
        // Load the fixtures
        fs.readFile('./fixtures/account.json', 'utf8', function(err, data) {
            if (err) { throw err; }
            var jsonData = JSON.parse(data);
            var count = 0;
            jsonData.forEach(function(json) {
                count++;
                var account = new Account(json);
                account.save(function(err) {
                    if (err) { throw err; }
                    if (--count == 0 && callback) callback();
                });
            });
        });
    }
}
You can clean up the code a bit by using a library like async or Step.
Also, I've written a small module that handles loading fixtures for you, so you just do:
var fixtures = require('./mongoose-fixtures');
fixtures.load('./fixtures/account.json', function(err) {
    // Fixtures loaded, you're ready to go
});
Github:
https://github.com/powmedia/mongoose-fixtures
It will also load a directory of fixture files, or objects.
I did a talk about common asynchronous patterns (serial and parallel) and ways to solve them:
https://github.com/masylum/i-love-async
I hope it's useful.
I've recently created a simpler abstraction called wait.for to call async functions in sync mode (based on Fibers). It's at an early stage but it works. It is at:
https://github.com/luciotato/waitfor
Using wait.for, you can call any standard nodejs async function as if it were a sync function, without blocking node's event loop. You can code sequentially when you need it.
Using wait.for, your code will be:
// in a fiber
function setup(callback) {
    // Clear the DB and load fixtures
    wait.for(Account.remove, {});
    // Load the fixtures
    var data = wait.for(fs.readFile, './fixtures/account.json', 'utf8');
    var jsonData = JSON.parse(data);
    jsonData.forEach(function(json) {
        var account = new Account(json);
        wait.forMethod(account, 'save');
    });
    callback();
}
That's actually the proper way of doing it, more or less. What you're doing there is a parallel loop. You can abstract it into its own "async parallel foreach" function if you want (and many do), but that's really the only way of doing a parallel loop.
Depending on what you intended, one thing that could be done differently is the error handling. Because you're throwing, if there's a single error, that callback will never get executed (count won't be decremented). So it might be better to do:
account.save(function(err) {
    if (err) return callback(err);
    if (!--count) callback();
});
And handle the error in the callback. It's better node-convention-wise.
I would also change another thing to save you the trouble of incrementing count on every iteration:
var jsonData = JSON.parse(data)
  , count = jsonData.length;

jsonData.forEach(function(json) {
    var account = new Account(json);
    account.save(function(err) {
        if (err) return callback(err);
        if (!--count) callback();
    });
});
If you are already using underscore.js anywhere in your project, you can leverage its after method. You need to know in advance how many async calls will be made, but aside from that it's a pretty elegant solution.
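For reference, a minimal sketch of how that looks, reusing jsonData and callback from the code above (_.after(n, fn) returns a function that invokes fn on its nth call):
var done = _.after(jsonData.length, callback);

jsonData.forEach(function(json) {
    new Account(json).save(function(err) {
        if (err) return callback(err);
        done();
    });
});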
