Does node.js preserve asynchronous execution order?

I am wondering if node.js makes any guarantee about the order in which async calls start/complete.
I do not think it does, but I have read a number of code samples on the Internet that I thought would be buggy because the async calls may not complete in the expected order, yet the examples are often presented in the context of how great node is because of its single-threaded asynchronous model. However, I cannot find a direct answer to this general question.
Is it a situation where different node modules make different guarantees? For example, at https://stackoverflow.com/a/8018371/1072626 the answer clearly states that the asynchronous calls involving Redis preserve order.
The crux of the problem boils down to this: is the following execution (or similar) strictly safe in node?
var fs = require("fs");
fs.unlink("/tmp/test.png");
fs.rename("/tmp/image1.png", "/tmp/test.png");
According to the author, the call to unlink is needed because rename will fail on Windows if there is a pre-existing file. However, both calls are asynchronous, so my initial thought was that the call to rename should be in the callback of unlink, to ensure the asynchronous I/O completes before the asynchronous rename operation starts; otherwise rename may execute first, causing an error.

Async operations do not have a determined time to execute.
When you call unlink, it asks the OS to remove the file, but it is not defined when the OS will actually remove it; it might be a millisecond or a year later.
The whole point of async operations is that they don't depend on each other unless you explicitly state so.
In order for the rename to occur after the unlink, you have to modify your code like this:
fs.unlink("/tmp/test.png", function (err) {
if (err) {
console.log("An error occured");
} else {
fs.rename("/tmp/image1.png", "/tmp/test.png", function (err) {
if (err) {
console.log("An error occured");
} else {
console.log("Done renaming");
}
});
}
});
or, alternatively, to use the synchronous versions of the fs functions (note that these will block the executing thread):
fs.unlinkSync("/tmp/test.png");
fs.renameSync("/tmp/image1.png", "/tmp/test.png");
There are also libraries such as async that make async code look better:
async.waterfall([
    fs.unlink.bind(null, "/tmp/test.png"),
    fs.rename.bind(null, "/tmp/image1.png", "/tmp/test.png")
], function (err) {
    if (err) {
        console.log("An error occurred");
    } else {
        console.log("Done renaming");
    }
});
Note that in all of these examples the error handling is extremely simplified, just to illustrate the idea.
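As an aside, a minimal sketch of the same ordering using the promise-based fs API, assuming a Node version that ships fs.promises (Node 10+):

const fs = require("fs").promises;

async function replaceImage() {
    await fs.unlink("/tmp/test.png");                    // finishes before the next line runs
    await fs.rename("/tmp/image1.png", "/tmp/test.png"); // starts only after unlink succeeds
}

replaceImage().catch(function (err) { console.error(err); });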

If you look at the documentation of Node.js you'll find that the function fs.unlink takes a callback as an argument:
fs.unlink(path, [callback]);
Any action that you intend to take once the operation completes should be passed to the function as the callback argument. So typically, in your case, the code will take the following form:
var fs = require("fs");
fs.unlink("/tmp/test.png", function(){
fs.rename("/tmp/image1.png", "/tmp/test.png");
});
In the specific case of unlink and rename there are also synchronous functions in Node.js, which can be used as fs.unlinkSync(path) and fs.renameSync(oldPath, newPath). These ensure the code runs synchronously (blocking the thread while it does).
Moreover, if you wish to keep an asynchronous implementation but retain better readability, you could consider a library like async. It also has options for different modes of execution like parallel, series, and waterfall; see the sketch below.
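For instance, a minimal async.series sketch of the same unlink-then-rename sequence as above (assuming the async package is installed):

var async = require("async");
var fs = require("fs");

async.series([
    function (cb) { fs.unlink("/tmp/test.png", cb); },                    // step 1: delete the old file
    function (cb) { fs.rename("/tmp/image1.png", "/tmp/test.png", cb); } // step 2: runs only after step 1's callback fires
], function (err) {
    if (err) {
        console.log("An error occurred");
    } else {
        console.log("Done renaming");
    }
});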
Hope this helps.

Related

How to check file is writable (resource is not busy nor locked)

excel4node's write-to-file function catches errors and does not propagate them to the caller. Therefore, my app cannot determine whether the write to file was successful or not.
My current workaround looks like this:
let fs = require('fs')
try {
    let filePath = 'blahblah'
    fs.writeFileSync(filePath, '') // Try-catch is for this statement
    excel4nodeWorkbook.write(filePath)
} catch (e) {
    console.log('File save is not successful')
}
It works, but I think it's a sort of hack and not a semantically correct way of doing it. I also tested fs.access and fs.accessSync, but they only check permissions, not the state (busy/locked) of the resource.
Is there any way to make this look and behave nicer without modifying the excel4node source code?
I think you are asking the wrong question. If you check at time T, then write at time T + 1ms, what would guarantee that the file is still writeable?
If the file is not writeable for whatever reason, the write will fail, period. There is nothing else to do. Your code is fine, but you can probably do without the fs.writeFileSync(), which just erases whatever else was in the file before.
You can also write to a randomly generated file path to make reasonably sure that two processes are not writing to the same file at the same time, but that will not prevent all possible write errors. What you really, really want is good error handling.
In order to handle errors properly you have to provide a callback!
Something along the lines of:
excel4nodeWorkbook.write(filePath, (err) => {
    if (err) console.error(err);
});
Beware, this is asynchronous code, so you need to handle that as well!
You already marked a line in the library's source code. If you look a few lines above, you can see it uses the handler argument to pass any errors to. In fact, peeking at the documentation comment above the function, it says:
If callback is given, callback called with (err, fs.Stats) passed
Hence you can simply pass a function as your second argument and check for err, like you've probably already seen elsewhere in the node environment:
excel4nodeWorkbook.write(filepath, (err) => {
    if (err) {
        console.error(err);
    }
});

NodeJS synchronously change directory

I have the following code in NodeJS:
var targetDir = tmpDir + date;
try {
    fs.statSync(targetDir);
}
catch (e) {
    mkdirp.sync(targetDir, {mode: 755});
}
process.chdir(targetDir);
doStuffThatDependsOnBeingInTargetDir();
My understanding is that in NodeJS, functions such as process.chdir are executed asynchronously. So if I need to execute some code afterwards, how do I guarantee that I'm in the target directory before my subsequent function runs?
If process.chdir took a callback then I would do it in the callback. But it doesn't. This asynchronous paradigm is definitely confusing for a newcomer, so I figured I would ask. This isn't the most pressing practical concern, since the code seems to work anyway, but I feel like I'm constantly running into this and don't know how to handle these situations.
The process.chdir() function is a synchronous function. As you noted yourself, it does not take a callback to tell you whether it succeeded or not. It does, however, throw an exception if something goes wrong, so you would want to invoke it inside a try/catch block.
You can check whether the process successfully changed directory with the process.cwd() function.
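A minimal sketch of that pattern, reusing the names from the question:

try {
    process.chdir(targetDir);                 // synchronous: throws on failure
    console.log("Now in: " + process.cwd()); // confirms the working directory actually changed
    doStuffThatDependsOnBeingInTargetDir();  // safe to run: chdir has already completed
} catch (err) {
    console.error("chdir failed: " + err);
}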

Node's del command - callback not firing

I'm working through a Pluralsight course on gulp. John Papa is demonstrating how to inject a function that deletes existing css files into the routine that compiles the new ones.
The callback passed to the del function is not firing. The del function is running, files are deleted, and I see no error messages. If I call the callback manually it executes, so it looks like the function is intact. So I am wondering what would cause del not to execute the callback.
delete routine:
function clean(path, done) {
    log('cleaning ' + path);
    del(path, done); // problem call
}
The 'done' function is not firing, but it does if I change the code to this:
function clean(path, done) {
    log('cleaning ' + path);
    del(path);
    done();
}
Which, of course, defeats the intended purpose of waiting until del is done before continuing on.
Any ideas at to what's going on would be appreciated.
For reference (in case it's relevant):
compile css function:
gulp.task('styles', ['clean-styles'], function () {
    log('compiling less');
    return gulp
        .src(config.less)
        .pipe($.less())
        .pipe($.autoprefixer({browsers: ['last 2 versions', '> 5%']}))
        .pipe(gulp.dest(config.temp));
});
injected clean function:
gulp.task('clean-styles', function (done) {
    var files = config.temp + '/**/*.css';
    clean(files, done);
});
UPDATE
If anyone else runs into this: I re-watched the training video and it was using v1.1 of del. I checked and I was using 2.x. After installing v1.1, everything works.
del isn't a Node command; it's probably this npm package. If that's the case, it doesn't take a callback as its second parameter; instead it returns a promise, and you should call .then(done) to have done called after del finishes, along the lines of the sketch below.
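A minimal sketch of that fix, assuming del 2.x (done is called with no arguments so gulp doesn't mistake del's resolved value for an error):

function clean(path, done) {
    log('cleaning ' + path);
    del(path)
        .then(function () { done(); }) // success: signal gulp without passing del's result along
        .catch(done);                  // failure: forward the error to gulp
}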
Update
A better solution is to embrace Gulp's promise support:
Change your clean function to:
function clean(path) {
    return del(path); // returns a promise
}
And your clean-styles task to:
gulp.task('clean-styles', function () {
    var files = config.temp + '/**/*.css';
    return clean(files);
});
As of version 2.0, del's API changed to use promises.
Thus, to specify a callback, you should use .then():
del('unicorn.png').then(callback);
In case you need to call it from a gulp task, just return a promise from the task:
gulp.task('clean', function () {
    return del('unicorn.png');
});
Checking the docs for the del package, it looks like you're getting mixed up between node's standard callback mechanism and del's, which uses a promise.
You'll want to use the promise API, with .then(done), in order to execute the callback parameter.
Node, and JavaScript in general, is currently in a bit of a state of flux regarding design patterns for handling async code, with most of the browser community and standards folks leaning towards promises, whereas the Node community tends towards the callback style and libraries such as async.
With ES6 standardizing promises, I suspect we're going to see more of these kinds of incompatibilities in node as the folks who are passionate about that API incorporate it into node code more and more.

To async, or not to async in node.js?

I'm still learning the node.js ropes and am just trying to get my head around what I should be deferring, and what I should just be executing.
I know there are other questions relating to this subject generally, but I'm afraid without a more relatable example I'm struggling to 'get it'.
My general understanding is that if the code being executed is non-trivial, then it's probably a good idea to make it async, so as to avoid it holding up someone else's session. There's clearly more to it than that, callbacks get mentioned a lot, and I'm not 100% sure why you wouldn't just make everything synchronous. I've got some way to go.
So here's some basic code I've put together in an express.js app:
app.get('/directory', function (req, res) {
    process.nextTick(function () {
        Item.
            find().
            sort('date-modified').
            exec(function (err, items) {
                if (err) {
                    return next(err);
                }
                res.render('directory.ejs', {
                    items: items
                });
            });
    });
});
Am I right to be using process.nextTick() here? My reasoning is that since it's a database call, some actual work has to be done, and it's the kind of thing that could slow down active sessions. Or is that wrong?
Secondly, I have a feeling that if I'm deferring the database query then it should be in a callback, and the actual page rendering should happen synchronously, conditional on receiving the callback response. I'm only assuming this because it seems like the more common format in the examples I've seen; if that's a correct assumption, can anyone explain why it's the case?
Thanks!
You are using it wrong in this case, because .exec() is already asynchronous (you can tell by the fact that it accepts a callback as a parameter).
To be fair, most of what needs to be asynchronous in nodejs already is.
As for page rendering, if you require the results from the database to render the page, and those arrive asynchronously, you can't really render the page synchronously.
Generally speaking, it's best practice to make everything you can asynchronous rather than relying on synchronous functions; in most cases that means something like readFile vs. readFileSync. In your example, you're not doing anything synchronous with i/o. The only synchronous code you have is the logic of your program (which requires CPU and thus has to be synchronous in node), but these are tiny little things by comparison.
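To make the readFile vs. readFileSync contrast concrete, here is a minimal sketch (the file path is hypothetical):

var fs = require("fs");

// synchronous: blocks the event loop, and every pending request, until the read finishes
var data = fs.readFileSync("/tmp/example.txt", "utf8");

// asynchronous: the event loop stays free; the callback runs once the read completes
fs.readFile("/tmp/example.txt", "utf8", function (err, contents) {
    if (err) return console.error(err);
    console.log(contents);
});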
I'm not sure what Item is, but if I had to guess, what .find().sort() does is build a query internally. It does not actually run the query (talk to the DB) until .exec is called. .exec takes a callback, so it will communicate with the DB asynchronously. When that communication is done, the callback is called.
Using process.nextTick does nothing useful in this case. It would just delay the calling of its code until the next pass of the event loop, which there is no need to do. It has no effect on whether anything is synchronous or not.
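A minimal sketch of what process.nextTick actually does (it defers, nothing more):

console.log("first");
process.nextTick(function () {
    console.log("third"); // runs after the current operation completes, before I/O events
});
console.log("second");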
I don't really understand your second question, but if the rendering of the page depends on the result of the query, you have to defer rendering until the query completes; you are doing this by rendering in the callback. The rendering itself (res.render) may not be entirely synchronous either; it depends on the internal mechanics of the library that defines the render function.
In your example, next is not defined. Instead your code should probably look like:
app.get('/directory', function (req, res) {
    Item.
        find().
        sort('date-modified').
        exec(function (err, items) {
            if (err) {
                console.error(err);
                res.status(500).end("Database error");
            }
            else {
                res.render('directory.ejs', {
                    items: items
                });
            }
        });
});

Should I avoid calling require when responding to a request?

Is requiring a module going to block every single request? According to the docs, the module is cached after the first require, but I wanted to see whether it's an anti-pattern to do a dynamic require when responding to a request.
Nope, it won't block on every request (as long as you're requiring the same module each time), and it's not an anti-pattern.
If you're loading the same module on each request, any call to require will return instantly (because the module will have already been loaded, compiled, and cached). If, however, many different modules may be required so that you don't get the benefit of caching, it may be better to do an asynchronous require.
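A quick sketch of that caching behaviour:

// both calls resolve to the same cached module object;
// only the first pays the load-and-compile cost
var a = require("fs");
var b = require("fs");
console.log(a === b); // prints true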
But something like this?
function handler(req, res) { require('fs').readFile(…); }
No big deal. It's just a matter of style.
I am often told that blocking of any kind is a no-no in node.js, and that asynchronicity is one of its main imperatives. You could try the following.
Quoting the answer from non-blocking require in node.js:
This is how require is implemented:
> console.log(require.extensions['.js'].toString())
function (module, filename) {
    var content = NativeModule.require('fs').readFileSync(filename, 'utf8');
    module._compile(stripBOM(content), filename);
}
You can do the same thing in your app. I guess something like this would work:
var fs = require('fs')
require.async = function (filename, callback) {
    fs.readFile(filename, 'utf8', function (err, content) {
        if (err) return callback(err)
        module._compile(content, filename)
        // this require call won't block anything because of caching
        callback(null, require(filename))
    })
}
require.async('./test.js', function (err, module) {
    console.log(module)
})
It is not about slow or fast. require is a synchronous operation. This means that it will block the whole server while executing: if you have 100,000 connections, all of them will wait on those 100,000 requires.
Never use require inside a loop; it is bad practice.
So the answer to your original question is YES.
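For what it's worth, a minimal sketch of the pattern this implies: hoist the require to module load time so no request ever pays the synchronous loading cost (the file path is hypothetical):

// loaded once, when this module is first evaluated
var fs = require("fs");

function handler(req, res) {
    // only the (asynchronous) file read happens per request
    fs.readFile("/tmp/data.txt", "utf8", function (err, data) {
        if (err) return res.end("error");
        res.end(data);
    });
}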
