In this code, why using a closure? - node.js

I don't get why a closure is being used in the code below:
function writeData(socket, data){
var success = !socket.write(data);
if(!success){
(function(socket, data){
socket.once('drain', function(){
writeData(socket, data);
});
})(socket, data)
}
}
and why using var success=!socket.write(data); instead directly input.
May be socket.write is not a boolean?

The IIFE is unnecessary, you can rewrite the code to this:
function writeData(socket, data){
var success = ! socket.write(data);
if (! success) {
socket.once('drain', function() {
writeData(socket, data);
});
}
}
Or even this:
function writeData(socket, data){
var success = ! socket.write(data);
if (! success) {
socket.once('drain', writeData.bind(this, socket, data));
}
}

According to the documentation for socket.write(), the method
Sends data on the socket. The second parameter specifies the encoding
in the case of a string--it defaults to UTF8 encoding.
Returns true if the entire data was flushed successfully to the kernel
buffer. Returns false if all or part of the data was queued in user
memory. 'drain' will be emitted when the buffer is again free.
The optional callback parameter will be executed when the data is
finally written out - this may not be immediately.
In the code, if the first socket.write() is not able to flush all the data in one go, the closure waits for the socket drain event, in which case it will call writeData method again. This is a very ingenious way of creating an asynchronous recursive function, which will get called until success returns true.

Related

Can an event based read function ever run out of order?

Given a situation where I use the nodejs readline library to iterate over each line in the STDIN stream, do some processing on it and write it back out to STDOUT as in the following example:
var rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
terminal: false
});
function my_function(line) {
var output = ...(line);
process.stdout.write(output);
}
rl.on('line', my_function);
I'm concerned that the processing I'm doing will take very different amounts of time depending on the line content so some lines will return very quickly while others takes some time to sort out. Is it possible that my_function() will ever run out of order and hence cause the output stream to be scrambled? Should I be looking into using a synchronous loop of some kind instead of this asynchronous event handler?
The JavaScript execution itself is single-threaded, so as long as you're only performing synchronous operations inside the event handler, there is no problem.
If you are performing asynchronous operations inside the event handler, then it is possible that another 'line' event could be emitted before your asynchronous operation(s) are complete. In that case, you would need to rl.pause() first and then rl.resume() once you are finished with your asynchronous operations. However, this isn't foolproof since 'line' events could still be emitted after a rl.pause() if the current chunk of data read from the input stream had multiple line breaks.
So if you are performing asynchronous operations inside the event handler, you are probably better off just reading from the stream yourself so that you have more control over the parsing behavior. This is actually pretty easy to do, for example:
function parseStream(stream, callback) {
// Assuming all stream data is text and not binary ...
var buffer = '';
var RE_EOL = /\r?\n/g;
stream.on('data', function(data) {
buffer += data;
processBuffer();
});
stream.on('end', callback);
stream.on('error', callback);
function processBuffer() {
var idx = RE_EOL.exec(buffer);
if (~idx) {
// Found a line ending
var line = buffer.slice(0, RE_EOL.index);
buffer = buffer.slice(RE_EOL.index + RE_EOL[0].length);
stream.pause();
callback(null, line, processBuffer);
} else {
stream.resume();
}
}
}
// ...
processStream(process.stdin, function(err, line, done) {
if (err) throw err;
if (line === undefined) {
// No more data will be available (stream ended)
console.log('(Stream ended!)');
return;
}
// Do something with `line`
console.log(line);
// Call `done()` whenever your async operation(s) are all finished
done();
});

is my code blocking node thread

I have some confusions over nodejs and would like some help. I have a table called camps, contacts and camp_contact. I have to show the contact list based on the camp the user belongs to. I used async to loop through the camps, which I save in the user session, and then grab the data from mysql.
var array_myData = [];
async.each(req.session.user.camps, function(camps, callback) {
database.getConnection(function(err, connection) {
// Use the connection
connection.query('SELECT contacts.*, contact_camp.* '+
' FROM contacts JOIN contact_camp '+
' ON contact_camp.contact_id = contacts.id '+
' WHERE contact_camp.camp_id = ?',
[camps.camp_id], function(err,data){
if(err) {
//this will call the err function
callback('error');
}
else {
array_myData = array_myData.concat(data);
callback();
}
connection.release();
});
});
//final function call.
}, function(err){
// if any error happened, this function fires.
if( err ) {
// All processing will now stop.
} else {
res.render('contacts',
{
page
});
}
});
The code works fine. Now the thing I'm wondering about is, does using array.concat block the thread? if so, how can I change that? I read around and according to what I understood, I/O operation that are not asynchronous blocks the thread like reading from file or database. Does having array like this var array = ['a','b', 'c'] and looping through it would block the thread?
Lastly, is there a way to know if a code that I have written has blocked the thread or not? Because I get worried every time I write a function of my own.
I also get confused when a create a function with a callback like:
function test(param, fn) { do something; fn(); }
I'm not sure if this kind of function without a timer would block the thread or not.
Ok, if it helps you somehow:
I read around and according to what I understood, I/O operation that are not asynchronous blocks the thread like reading from file or database.
In NodeJS the operations on file descriptors are asynchronous => non-blocking
The file descriptors would be:
database operations,
open/close files
network operations
ex.:
fs.readFile('<pathToTheFile>', (err, data) => {
if (err) throw err;
console.log(data);
});
// if your file is very big, until it is read, maybe other requests will be finished
Does having array like this var array = ['a','b', 'c'] and looping through it would block the thread?
Yes, it will block the thread, but operations like this are not resource intensive. NodeJs is an event-driven language (single-threaded), but also, in a multi threaded or multi process language, the kernel thread would still be blocked by such operations => the same thing ... this is maybe off topic.
// if you do something like this
while(true){
function test(param, fn) { do something; fn(); }
}
// you will see that you just blocked the thread
I'm not sure if this kind of function without a timer would block the thread or not
normally you would put that function in a error handler, and it shouldn't block the thread if you're not doing an infinite while.
try{}catch(err){
// do something with the error
}

nodejs and mongoskin, callback after all items have been saved

I have the following piece of code where I iterate through a collection and do another db query and construct an object within its callback. Finally I save that object to another collection.
I wish to call another function after all items have been saved, but can't figure out how. I tried using the async library, specifically async whilst when item is not null, but that just throws me in an infinite loop.
Is there a way to identify when all items have been saved?
Thanks!
var cursor = db.collection('user_apps').find({}, {timeout:false});
cursor.each(function (err, item) {
if (err) {
throw err;
}
if (item) {
var appList = item.appList;
var uuid= item.uuid;
db.collection('app_categories').find({schema_name:{$in: appList}}).toArray(function (err, result) {
if (err) throw err;
var catCount = _.countBy(result, function (obj) {
return obj.category;
})
catObj['_id'] = uuid;
catObj['total_app_num'] = result.length;
catObj['app_breakdown'] = catCount;
db.collection('audiences').insert(catObj, function (err) {
if (err) console.log(err);
});
});
}
else {
// do something here after all items have been saved
}
});
The key here is to use something that is going to respect the callback signal when performing the "loop" operation. The .each() as implemented here will not do that, so you need an "async" loop control that will signify that each loop has iterated and completed, with it's own callback within the callback.
Provided your underlying MongoDB driver is at least version 2, then there is a .forEach() which has a callback which is called when the loop is complete. This is better than .each(), but it does not solve the problem of knowing when the inner "async" .insert() operations have been completed.
So a better approach is to use the stream interface returned by .find(), where this is more flow control allowed. There is a .stream() method for backwards compatibility, but modern drivers will just return the interface by default:
var stream = db.collection('user_apps').find({});
stream.on("err",function(err){
throw(err);
});
stream.on("data",function(item) {
stream.pause(); // pause processing of stream
var appList = item.appList;
var uuid= item.uuid;
db.collection('app_categories').find({schema_name:{ "$in": appList}}).toArray(function (err, result) {
if (err) throw err;
var catCount = _.countBy(result, function (obj) {
return obj.category;
})
var catObj = {}; // always re-init
catObj['_id'] = uuid;
catObj['total_app_num'] = result.length;
catObj['app_breakdown'] = catCount;
db.collection('audiences').insert(catObj, function (err) {
if (err) console.log(err);
stream.resume(); // resume stream processing
});
});
});
stream.on("end",function(){
// stream complete and processing done
});
The .pause() method on the stream stops further events being emitted so that each object result is processed one at a time. When the callback from the .insert() is called, then the .resume() method is called, signifying that processing is complete for that item and a new call can be made to process the next item.
When the stream is complete, then everything is done so the "end" event hook is called to continue your code.
That way, both each loop is signified with an end to move to the next iteration as well as there being a defined "end" event for the complete end of processing. As the control is "inside" the .insert() callback, then those operations are respected for completion as well.
As a side note, you might consider including your "category" information in the source collection, as it seems likely your results can be more efficiently returned using .aggregate() if all required data were in a single collection.

EventEmitter in the middle of a chain of Promises

I am doing something that involves running a sequence of child_process.spawn() in order (to do some setup, then run the actual meaty command that the caller is interested in, then do some cleanup).
Something like:
doAllTheThings()
.then(function(exitStatus){
// all the things were done
// and we've returned the exitStatus of
// a command in the middle of a chain
});
Where doAllTheThings() is something like:
function doAllTheThings() {
runSetupCommand()
.then(function(){
return runInterestingCommand();
})
.then(function(exitStatus){
return runTearDownCommand(exitStatus); // pass exitStatus along to return to caller
});
}
Internally I'm using child_process.spawn(), which returns an EventEmitter and I'm effectively returning the result of the close event from runInterestingCommand() back to the caller.
Now I need to also send data events from stdout and stderr to the caller, which are also from EventEmitters. Is there a way to make this work with (Bluebird) Promises, or are they just getting in the way of EventEmitters that emit more than one event?
Ideally I'd like to be able to write:
doAllTheThings()
.on('stdout', function(data){
// process a chunk of received stdout data
})
.on('stderr', function(data){
// process a chunk of received stderr data
})
.then(function(exitStatus){
// all the things were done
// and we've returned the exitStatus of
// a command in the middle of a chain
});
The only way I can think to make my program work is to rewrite it to remove the promise chain and just use a raw EventEmitter inside something that wraps the setup/teardown, something like:
withTemporaryState(function(done){
var cmd = runInterestingCommand();
cmd.on('stdout', function(data){
// process a chunk of received stdout data
});
cmd.on('stderr', function(data){
// process a chunk of received stderr data
});
cmd.on('close', function(exitStatus){
// process the exitStatus
done();
});
});
But then since EventEmitters are so common throughout Node.js, I can't help but think I should be able to make them work in Promise chains. Any clues?
Actually, one of the reasons I want to keep using Bluebird, is because I want to use the Cancellation features to allow the running command to be cancelled from the outside.
There are two approaches, one provides the syntax you originally asked for, the other takes delegates.
function doAllTheThings(){
var com = runInterestingCommand();
var p = new Promise(function(resolve, reject){
com.on("close", resolve);
com.on("error", reject);
});
p.on = function(){ com.on.apply(com, arguments); return p; };
return p;
}
Which would let you use your desired syntax:
doAllTheThings()
.on('stdout', function(data){
// process a chunk of received stdout data
})
.on('stderr', function(data){
// process a chunk of received stderr data
})
.then(function(exitStatus){
// all the things were done
// and we've returned the exitStatus of
// a command in the middle of a chain
});
However, IMO this is somewhat misleading and it might be desirable to pass the delegates in:
function doAllTheThings(onData, onErr){
var com = runInterestingCommand();
var p = new Promise(function(resolve, reject){
com.on("close", resolve);
com.on("error", reject);
});
com.on("stdout", onData).on("strerr", onErr);
return p;
}
Which would let you do:
doAllTheThings(function(data){
// process a chunk of received stdout data
}, function(data){
// process a chunk of received stderr data
})
.then(function(exitStatus){
// all the things were done
// and we've returned the exitStatus of
// a command in the middle of a chain
});

How to get back from inside function syncronized in node.js?

function squre(val) {
main.add(val,function(result){
console.log("squre = " + result); //returns 100 (2nd line of output)
return result;
});
}
console.log(squre(10)); // returns null (1st line of output)
I need 100 as output in both of the lines.
It rather depends on the nature of main.add(). But, by its use of a callback, it's most likely asynchronous. If that's the case, then return simply won't work well as "asynchronous" means that the code doesn't wait for result to be available.
You should give a read through "How to return the response from an AJAX call?." Though it uses Ajax as an example, it includes a very thorough explanation of asynchronous programming and available options of control flow.
You'll need to define squre() to accept a callback of its own and adjust the calling code to supply one:
function squre(val, callback) {
main.add(val, function (result) {
console.log("squre = " + result);
callback(result);
});
});
squre(10, function (result) {
console.log(result);
});
Though, on the chance that main.add() is actually synchronous, you'll want to move the return statement. They can only apply to the function they're directly within, which would be the anonymous function rather than spure().
function squre(val) {
var answer;
main.add(val, function (result) {
console.log("squre = " + result);
answer = result;
});
return answer;
}
You can't, as you have the asynchronous function main.add() that will be executed outside of the current tick of the event loop (see this article for more information). The value of the squre(10) function call is undefined, since this function doesn't return anything synchronously. See this snippet of code:
function squre(val) {
main.add(val,function(result){
console.log("squre = " + result);
return result;
});
return true; // Here is the value really returned by 'squre'
}
console.log(source(10)) // will output 'true'
and The Art of Node for more information on callbacks.
To get back data from an asynchronous function, you'll need to give it a callback:
function squre(val, callback) {
main.add(val, function(res) {
// do something
callback(null, res) // send back data to the original callback, reporting no error
}
source(10, function (res) { // define the callback here
console.log(res);
});

Resources