Is requiring a module going to block every single request? According to the docs, the module is cached after the first require, but I wanted to see if it's an anti-pattern to do a dynamic require when responding to a request.
Nope, it won't block on every request (as long as you're requiring the same module each time), and it's not an anti-pattern.
If you're loading the same module on each request, any call to require will return instantly (because the module will have already been loaded, compiled, and cached). If, however, many different modules may be required so that you don't get the benefit of caching, it may be better to do an asynchronous require.
But something like this?
function handler(req, res) { require('fs').readFile(…); }
No big deal. It's just a matter of style.
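A minimal sketch of the caching behaviour described above. The throwaway module file, the `handler` function, and the `__demoLoadCount` global are made up for illustration; only `require`, `require.resolve`, and `require.cache` are Node's own.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Write a throwaway module that counts how many times its body is evaluated.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'require-cache-'));
var file = path.join(dir, 'counter.js');
fs.writeFileSync(
  file,
  'global.__demoLoadCount = (global.__demoLoadCount || 0) + 1;\n' +
  'module.exports = { name: "counter" };\n'
);

// Hypothetical request handler that does a "dynamic" require on every call.
function handler() {
  return require(file);
}

var a = handler(); // first call: reads, compiles, and caches the module
var b = handler(); // later calls: served from the cache

console.log(a === b);                                 // true
console.log(global.__demoLoadCount);                  // 1
console.log(require.resolve(file) in require.cache);  // true
```

Only the first call pays for the synchronous file read and compile; every later call is a cache lookup.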
I am often told that blocking of any kind is a no-no in node.js, and that asynchronicity is one of its main imperatives. You could try the following.
Quoting the answer from non-blocking require in node.js
This is how require is implemented:
> console.log(require.extensions['.js'].toString())
function (module, filename) {
var content = NativeModule.require('fs').readFileSync(filename, 'utf8');
module._compile(stripBOM(content), filename);
}
You can do the same thing in your app. I guess something like this would work:
var fs = require('fs')
require.async = function(filename, callback) {
fs.readFile(filename, 'utf8', function(err, content) {
if (err) return callback(err)
module._compile(content, filename)
// this require call won't block anything because of caching
callback(null, require(filename))
})
}
require.async('./test.js', function(err, module) {
console.log(module)
})
It is not about slow or fast. require is a synchronous operation, which means it blocks the whole server while it executes. If you have 100000 connections, all of them will wait for 100000 requires.
Never use require inside a loop; it is bad practice.
So the answer to your original question is YES.
Related
I'm still learning the node.js ropes and am just trying to get my head around what I should be deferring, and what I should just be executing.
I know there are other questions relating to this subject generally, but I'm afraid without a more relatable example I'm struggling to 'get it'.
My general understanding is that if the code being executed is non-trivial, then it's probably a good idea to make it asynchronous, so as to avoid it holding up someone else's session. There's clearly more to it than that, and callbacks get mentioned a lot, and I'm not 100% sure why you wouldn't just make everything synchronous. I've got some way to go.
So here's some basic code I've put together in an express.js app:
app.get('/directory', function(req, res) {
process.nextTick(function() {
Item.
find().
sort( 'date-modified' ).
exec( function ( err, items ){
if ( err ) {
return next( err );
}
res.render('directory.ejs', {
items : items
});
});
});
});
Am I right to be using process.nextTick() here? My reasoning is that as it's a database call then some actual work is having to be done, and it's the kind of thing that could slow down active sessions. Or is that wrong?
Secondly, I have a feeling that if I'm deferring the database query then it should be in a callback, and I should have the actual page rendering happening synchronously, on condition of receiving the callback response. I'm only assuming this because it seems like a more common format from some of the examples I've seen - if it's a correct assumption can anyone explain why that's the case?
Thanks!
You are using it wrong in this case, because .exec() is already asynchronous (you can tell by the fact that it accepts a callback as a parameter).
To be fair, most of what needs to be asynchronous in nodejs already is.
As for page rendering, if you require the results from the database to render the page, and those arrive asynchronously, you can't really render the page synchronously.
Generally speaking it's best practice to make everything you can asynchronous rather than relying on synchronous functions ... in most cases that would be something like readFile vs. readFileSync. In your example, you're not doing anything synchronously with i/o. The only synchronous code you have is the logic of your program (which requires CPU and thus has to be synchronous in node) but these are tiny little things by comparison.
I'm not sure what Item is, but if I had to guess what .find().sort() does is build a query string internally to the system. It does not actually run the query (talk to the DB) until .exec is called. .exec takes a callback, so it will communicate with the DB asynchronously. When that communication is done, the callback is called.
Using process.nextTick does nothing in this case. It would just delay the calling of its code until the current synchronous code has finished, which there is no need to do. It has no effect on whether the work is synchronous or not.
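A small sketch of what process.nextTick does and does not do. The `handler` function and the `order` array are made up for illustration: the deferred callback simply runs once the current synchronous code has finished, and nothing about it makes the work inside any less blocking.

```javascript
var order = [];

// Hypothetical handler that defers its body the way the question's code does.
function handler(done) {
  process.nextTick(function () {
    order.push('deferred');
    done();
  });
  order.push('handler returned');
}

handler(function () {
  console.log(order); // [ 'handler returned', 'deferred' ]
});
```

If the deferred function did heavy CPU work, it would still hold up every other request while it ran; it would only start slightly later.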
I don't really understand your second question, but if the rendering of the page depends on the result of the query, you have to defer rendering of the page until the query completes -- you are doing this by rendering in the callback. The rendering itself res.render may not be entirely synchronous either. It depends on the internal mechanism of the library that defines the render function.
In your example, next is not defined. Instead your code should probably look like:
app.get('/directory', function(req, res) {
  Item.
    find().
    sort( 'date-modified' ).
    exec(function (err, items) {
      if (err) {
        console.error(err);
        res.status(500).end("Database error");
      }
      else {
        res.render('directory.ejs', {
          items : items
        });
      }
    });
});
Using node.js, what is the best way to process a million items in an HTTP post request without blocking the server? My only guess is some sort of message queue, but I really have no idea.
You would want to use a lib like async.js to create non-blocking loops.
https://github.com/caolan/async
var async = require("async");
async.each(yourArrayOfThings, function(oneItem, callback) {
// do something
// ...
return callback(null);
}, function(err) {
// if any of the callbacks returned an error, err would equal that error
});
Give some more information on what your processing needs are, if this is not an applicable solution for you.
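If the per-item work is plain synchronous CPU work, one option that needs no library is to process the array in slices and yield to the event loop between slices with setImmediate. This is only a sketch under that assumption; `processInBatches`, the batch size, and the summing worker are hypothetical names chosen for the example.

```javascript
// Process `items` in slices of `batchSize`, yielding to the event loop
// between slices so pending I/O callbacks (other requests) can run.
function processInBatches(items, batchSize, worker, done) {
  var index = 0;

  function runBatch() {
    var end = Math.min(index + batchSize, items.length);
    try {
      for (; index < end; index++) {
        worker(items[index], index);
      }
    } catch (err) {
      return done(err);
    }
    if (index < items.length) {
      setImmediate(runBatch); // schedule the next slice, then yield
    } else {
      done(null);
    }
  }

  setImmediate(runBatch); // always asynchronous, even for an empty array
}

// Usage: sum a million numbers without one long blocking loop.
var items = [];
for (var i = 0; i < 1000000; i++) items.push(i);

var sum = 0;
processInBatches(items, 10000, function (n) { sum += n; }, function (err) {
  if (err) throw err;
  console.log('sum:', sum); // sum: 499999500000
});
```

A smaller batch size gives other requests more chances to run at the cost of more scheduling overhead. For heavier work, handing the job to a separate process or a message queue, as the question guesses, is the usual next step.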
I managed to create a module to handle all the database call. It uses this lib: https://github.com/developmentseed/node-sqlite3
My issues are the following.
Every time I make a call, I need to make sure the database exists, and if not, create it.
Plus, as all the calls are asynchronous, I end up having loads of functions in functions in callbacks ... etc.
It pretty much looks like this:
getUsers : function (callback){
var _aUsers = [];
var that = this;
this._setupDb(function(){
var db = that.db;
db.all("SELECT * FROM t_client", function(err, rows) {
rows.forEach(function (row) {
_aUsers.push({"cli_id":row.id,"cli_name":row.cli_name,"cli_path":row.cli_path});
});
callback(_aUsers);
});
});
},
So, is there any way I can export my module only when the database is ready and fully created if it does not exist yet?
Does anyone see a way around the "asynchronous" issue?
You could also try using promises or fibers ...
I don't think so. If you make it synchronous, you are taking away the advantage. JavaScript functions are meant to be that way. Such a situation is referred to as callback hell. If you are facing problems managing callbacks, then you can use these libraries:
underscore
async
See these guides to understand basics of asynchronous programming
Node.js: Style and structure
Using underscore.js managing-callback-spaghetti-in-nodejs
Using async.js node-js-async-programming
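As a rough illustration of the promise suggestion above: the setup step can be started once and its promise reused by every call, so each query waits for the same "database is ready" signal instead of re-running the setup. `openDatabase` below is a hypothetical stand-in that fakes the setup with a timer and returns fake rows; it is not the node-sqlite3 API.

```javascript
// Hypothetical stand-in for the real setup (open the file, create tables).
function openDatabase() {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve({ rows: [{ id: 1, cli_name: 'demo', cli_path: '/tmp/demo' }] });
    }, 10);
  });
}

var ready = null;   // the shared "database is ready" promise
var setupCalls = 0; // only here to show that setup runs once

function getDb() {
  if (!ready) {
    setupCalls++;
    ready = openDatabase();
  }
  return ready;
}

function getUsers() {
  return getDb().then(function (db) {
    return db.rows.map(function (row) {
      return { cli_id: row.id, cli_name: row.cli_name, cli_path: row.cli_path };
    });
  });
}

module.exports = { getUsers: getUsers };

// Usage: both calls share one setup.
Promise.all([getUsers(), getUsers()]).then(function (results) {
  console.log(results[0].length, setupCalls); // 1 1
});
```

The module can be exported immediately; callers never see a half-initialized database because every method goes through `getDb()` first.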
I've run into an issue with NodeJS where, due to some middleware, I need to directly return a value which requires knowing the last modified time of a file. Obviously the correct way would be to do
getFilename: function(filename, next) {
fs.stat(filename, function(err, stats) {
// Do error checking, etc...
next('', filename + '?' + new Date(stats.mtime).getTime());
});
}
however, due to the middleware I am using, getFilename must return a value, so I am doing:
getFilename: function(filename) {
  var stats = fs.statSync(filename);
  return filename + '?' + new Date(stats.mtime).getTime();
}
I don't completely understand the nature of the NodeJS event loop. What I was wondering is whether statSync has any special sauce in it that somehow pumps the event loop (or whatever it is called in node, the stack of instructions waiting to be performed) while the file information is loading. Or is it really blocking, so that this code is going to cause performance nightmares down the road and I should rewrite the middleware I am using to use a callback? If it does have special sauce that allows the event loop to continue while it waits on the disk, is that available anywhere else (through some promise library or something)?
Nope, there is no magic here. If you block in the middle of the function, everything is blocked.
If performance becomes an issue, I think your only option is to rewrite that part of the middleware, or get creative with how it is used.
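One possible way of "getting creative", sketched under the assumption that a slightly stale timestamp is acceptable: keep getFilename synchronous but cache the mtime, so statSync only blocks on a cache miss or after the entry expires. The cache object and the `MAX_AGE_MS` window are hypothetical choices for the example.

```javascript
var fs = require('fs');

var mtimeCache = {};   // filename -> { mtime: number, checkedAt: number }
var MAX_AGE_MS = 5000; // hypothetical freshness window

function getFilename(filename) {
  var now = Date.now();
  var entry = mtimeCache[filename];
  if (!entry || now - entry.checkedAt > MAX_AGE_MS) {
    // Still blocking, but only on a miss or after the entry expires.
    entry = { mtime: fs.statSync(filename).mtime.getTime(), checkedAt: now };
    mtimeCache[filename] = entry;
  }
  return filename + '?' + entry.mtime;
}

// Usage: the node binary is used here only because it is a file that is
// certain to exist.
console.log(getFilename(process.execPath));
```

The trade-off is that a changed file can keep its old query string for up to `MAX_AGE_MS` milliseconds.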
I am wondering if node.js makes any guarantee on the order async calls start/complete.
I do not think it does, but I have read a number of code samples on the Internet that I thought would be buggy because the async calls may not complete in the order expected, yet the examples are often presented as showing how great node is because of its single-threaded asynchronous model. However, I cannot find a direct answer to this general question.
Is it a situation that different node modules make different guarantees? For example at https://stackoverflow.com/a/8018371/1072626 the answer clearly states the asynchronous calls involving Redis preserves order.
The crux of this problem can be boiled down to this: is the following execution (or similar) strictly safe in node?
var fs = require("fs");
fs.unlink("/tmp/test.png");
fs.rename("/tmp/image1.png", "/tmp/test.png");
According to the author the call to unlink is needed because rename will fail on Windows if there is a pre-existing file. However, both calls are asynchronous, so my initial thoughts were that the call to rename should be in the callback of unlink to ensure the asynchronous I/O completes before the asynchronous rename operation starts otherwise rename may execute first, causing an error.
Async operations do not take any predetermined time to execute.
When you call unlink, it asks the OS to remove the file, but it is not defined when the OS will actually remove it; it might be a millisecond or a year later.
The whole point of async operations is that they don't depend on each other unless you explicitly say so.
For rename to occur after unlink, you have to modify your code like this:
fs.unlink("/tmp/test.png", function (err) {
  if (err) {
    console.log("An error occurred");
  } else {
    fs.rename("/tmp/image1.png", "/tmp/test.png", function (err) {
      if (err) {
        console.log("An error occurred");
      } else {
        console.log("Done renaming");
      }
    });
  }
});
or, alternatively, use the synchronous versions of the fs functions (note that these will block the executing thread):
fs.unlinkSync("/tmp/test.png");
fs.renameSync("/tmp/image1.png", "/tmp/test.png");
There are also libraries such as async that make async code look better:
async.waterfall([
  fs.unlink.bind(null, "/tmp/test.png"),
  fs.rename.bind(null, "/tmp/image1.png", "/tmp/test.png")
], function (err) {
  if (err) {
    console.log("An error occurred");
  } else {
    console.log("done renaming");
  }
});
Note that in all examples error handling is extremely simplified to represent the idea.
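On Node versions that ship `fs.promises`, the same ordering can be written with async/await, which was not part of the original answer. This is only a sketch: `replaceFile` is a hypothetical helper, ignoring ENOENT is an assumption about what the caller wants, and the temporary files exist only to make the example runnable.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Remove `target` if it exists, then move `source` into its place.
async function replaceFile(source, target) {
  try {
    await fs.promises.unlink(target);
  } catch (err) {
    if (err.code !== 'ENOENT') throw err; // a missing target is fine
  }
  // Runs only after the unlink above has completed.
  await fs.promises.rename(source, target);
}

// Usage with throwaway files in a temporary directory.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'rename-demo-'));
var source = path.join(dir, 'image1.png');
var target = path.join(dir, 'test.png');
fs.writeFileSync(source, 'new');
fs.writeFileSync(target, 'old');

replaceFile(source, target).then(function () {
  console.log(fs.readFileSync(target, 'utf8')); // new
}).catch(function (err) {
  console.error('Failed to replace file:', err);
});
```

Each `await` plays the role of the nested callback: the rename cannot start until the unlink has finished or been found unnecessary.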
If you look at the documentation of Node.js you'll find that the function fs.unlink takes a callback as an argument as:
fs.unlink(path, [callback]);
An action that you intend to take once the operation has completed should be passed to the function as the callback argument. So typically in your case the code will take the following form:
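var fs = require("fs");
fs.unlink("/tmp/test.png", function (err) {
  // err is ignored here, e.g. when there was no old file to remove
  fs.rename("/tmp/image1.png", "/tmp/test.png", function (err) {
    if (err) console.error(err);
  });
});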
In the specific case of unlink and rename, Node.js also provides synchronous functions, which can be used as fs.unlinkSync(path) and fs.renameSync(oldPath, newPath). This will ensure that the code runs synchronously.
Moreover if you wish to use asynchronous implementation but retain better readability you could consider a library like async. It also has options for different modes of implementation like parallel, series, waterfall etc.
Hope this helps.