Organizing routes without pre loading the modules in nodejs express - node.js

Currently the routing in Express framework requires the module to be loaded before. But that would not be efficient in real life scenarios where there are hundreds of module. I would like to load only the module that is needed. Is there a way where I can define a route to a target module without pre loading the module.
Something like this
app.get('user/showall', 'user.list');
So I would like user module to be loaded only when that particular request needs it to be loaded.

I would rather have a slow startup with fast request handling than fast startup with slow request handling because modules have to be loaded at runtime.
But if you really want, you could create a middleware to implement such behaviour (totally untested):
var lazyload = function(route) {
var s = route.split('.');
var mod = s[0];
var exp = s[1];
return function(req, res, next) {
require(mod)[exp](req, res);
};
};
...
app.get('user/showall', lazyload('user.list'));
(this assumes that routes are always named MODULENAME.EXPORTEDNAME).

I second what #robertklep said "I would rather have a slow startup with fast request handling than fast startup with slow request handling".
But I strongly recommend against doing a require to handle a request because the first call is synchronous and will block the server, which not only affects the current request, but prevents any other request from being processed. With enough such requests, and your server will stop answering requests.
Basically: preload all the code you will need, but lazy load data (in an asynchronous fashion).
(This is not what you are asking for, but this would be considered bad practice).

Related

Access current req object everywhere in Node.js Express

I wonder how to access req object if there's no 'req' parameter in callback.
This is the scenario:
In ExpressJs, I have a common function, it uses to handle something with 'req' object, but not pass req into it.
module.exports = {
get: function(){
var req = global.currentRequest;
//do something...
}
}
My current solution is that I write a middleware for all request, I put the 'req' in global variable, then I can access the 'req' everywhere with 'global.currentRequest'.
// in app.js
app.use(function (req, res, next) {
global.currentRequest= req;
next();
});
But I don't know if it's good? Can anyone have suggestions?
Thanks a lot!
The only proper way is to pass the req object through as an argument to all functions that need it.
Stashing it in a global simply will not work because multiple requests can be in process at the same time if any requests use async calls as part of their processing and those multiple requests will stomp on each other making a hard to track down bug. There are no shortcuts here. Pass the current request as an argument to any code that needs it.
You cannot put request-specific data into a global in node.js, ever. Doing so will create an opportunity for two requests that are in-flight at the same time to stomp on each other and for data to get confused between requests. Remember, this is a server that is potentially handling requests for many clients. You cannot use synchronous, one-at-a-time thinking for a server. A node.js server may potentially have many requests all in flight at the same time and thus plain globals cannot be used for request-specific data.
There is no shortcut here. You will just have to pass the req object through to the function that needs it. If that means you have to change the function signature of several intervening functions, then so-be-it. That's what you have to do. That is the only correct way to solve this type of problem.
There are some circumstances where you may be able to use a closure to "capture" the desired req object and then use it in inner functions without passing it to those inner functions, but it does not sound like that is your function structure. We'd have to see a lot more of your real/actual code to be able to know whether that's a possibility or not.
Actually, this is possible with something like global-request-context
This is using zone.js which let you persist variables across async tasks.

Is there any risk to read/write the same file content from different 'sessions' in Node JS?

I'm new in Node JS and i wonder if under mentioned snippets of code has multisession problem.
Consider I have Node JS server (express) and I listen on some POST request:
app.post('/sync/:method', onPostRequest);
var onPostRequest = function(req,res){
// parse request and fetch email list
var emails = [....]; // pseudocode
doJob(emails);
res.status(200).end('OK');
}
function doJob(_emails){
try {
emailsFromFile = fs.readFileSync(FILE_PATH, "utf8") || {};
if(_.isString(oldEmails)){
emailsFromFile = JSON.parse(emailsFromFile);
}
_emails.forEach(function(_email){
if( !emailsFromFile[_email] ){
emailsFromFile[_email] = 0;
}
else{
emailsFromFile[_email] += 1;
}
});
// write object back
fs.writeFileSync(FILE_PATH, JSON.stringify(emailsFromFile));
} catch (e) {
console.error(e);
};
}
So doJob method receives _emails list and I update (counter +1) these emails from object emailsFromFile loaded from file.
Consider I got 2 requests at the same time and it triggers doJob twice. I afraid that when one request loaded emailsFromFile from file, the second request might change file content.
Can anybody spread the light on this issue?
Because the code in the doJob() function is all synchronous, there is no risk of multiple requests causing a concurrency problem.
If you were using async IO in that function, then there would be possible concurrency issues.
To explain, Javascript in node.js is single threaded. So, there is only one thread of Javascript execution running at a time and that thread of execution runs until it returns back to the event loop. So, any sequence of entirely synchronous code like you have in doJob() will run to completion without interruption.
If, on the other hand, you use any asynchronous operations such as fs.readFile() instead of fs.readFileSync(), then that thread of execution will return back to the event loop at the point you call fs.readFileSync() and another request can be run while it is reading the file. If that were the case, then you could end up with two requests conflicting over the same file. In that case, you would have to implement some form of concurrency protection (some sort of flag or queue). This is the type of thing that databases offer lots of features for.
I have a node.js app running on a Raspberry Pi that uses lots of async file I/O and I can have conflicts with that code from multiple requests. I solved it by setting a flag anytime I'm writing to a specific file and any other requests that want to write to that file first check that flag and if it is set, those requests going into my own queue are then served when the prior request finishes its write operation. There are many other ways to solve that too. If this happens in a lot of places, then it's probably worth just getting a database that offers features for this type of write contention.

Best way to reuse a large translation file within Node / Express

I'm new to Node but I figured I'd jump right in and start converting a PHP app into Node/Express. It's a bilingual app that uses gettext with PO/MO files. I found a Node module called node-gettext. I'd rather not convert the PO files into another format right now, so it seems this library is my only option.
So my concern is that right now, before every page render, I'm doing something like this:
exports.home_index = function(req, res)
{
var gettext = require('node-gettext'),
gt = new gettext();
var fs = require('fs');
gt.textdomain('de');
var fileContents = fs.readFileSync('./locale/de.mo');
gt.addTextdomain('de', fileContents);
res.render(
'home/index.ejs',
{ gt: gt }
);
};
I'll also be using the translations in classes, so with how it's set up now I'd have to load the entire translation file again every time I want to translate something in another place.
The translation file is about 50k and I really don't like having to do file operations like this on every page load. In Node/Express, what would be the most efficient way to handle this (aside from a database)? Usually a user won't even be changing their language after the first time (if they're changing it from English).
EDIT:
Ok, I have no idea if this is a good approach, but it at least lets me reuse the translation file in other parts of the app without reloading it everywhere I need to get translated text.
In app.js:
var express = require('express'),
app = express(),
...
gettext = require('node-gettext'),
gt = new gettext();
Then, also in app.js, I create the variable app.locals.gt to contain the gettext/translation object, and I include my middleware function:
app.locals.gt = gt;
app.use(locale());
In my middleware file I have this:
mod
module.exports = function locale()
{
return function(req, res, next)
{
// do stuff here to populate lang variable
var fs = require('fs');
req.app.locals.gt.textdomain(lang);
var fileContents = fs.readFileSync('./locales/' + lang + '.mo');
req.app.locals.gt.addTextdomain(lang, fileContents);
next();
};
};
It doesn't seem like a good idea to assign the loaded translation file to app, since depending on the current request that file will be one of two languages. If I assigned the loaded translation file to app instead of a request variable, can that mix up users' languages?
Anyway, I know there's got to be a better way of doing this.
The simplest option would be to do the following:
Add this in app.js:
var languageDomains = {};
Then modify your Middleware:
module.exports = function locale()
{
return function(req, res, next)
{
// do stuff here to populate lang variable
if ( !req.app.locals.languageDomains[lang] ) {
var fs = require('fs');
var fileContents = fs.readFileSync('./locales/' + lang + '.mo');
req.app.locals.languageDomains[lang] = true;
req.app.locals.gt.addTextdomain(lang, fileContents);
}
req.textdomain = req.app.locals.gt.textdomain(lang);
next();
};
};
By checking if the file has already been loaded you are preventing the action from happening multiple times, and the domain data will stay resident in the server's memory. The downside to the simplicity of this solution is that if you ever change the contents of your .mo files whilst the server is running, the changes wont be taken into account. However, this code could be extended to keep an eye on the mtime of the files, and reload accordingly, or make use of fs.watchFile — if required:
if ( !req.app.locals.languageDomains[lang] ) {
var fs = require('fs'), filename = './locales/' + lang + '.mo';
var fileContents = fs.readFileSync(filename);
fs.watchFile(filename, function (curr, prev) {
req.app.locals.gt.addTextdomain(lang, fs.readFileSync(filename));
});
req.app.locals.languageDomains[lang] = true;
req.app.locals.gt.addTextdomain(lang, fileContents);
}
Warning: It should also be noted that using sync versions of functions outside of server initialisation is not a good idea because it can freeze the thread. You'd be better off changing your sync loading to the async equivalent.
After the above changes, rather than passing gt to your template, you should be able to use req.textdomain instead. It seems that the gettext library supports a number of requests directly on each domain object, which means you hopefully don't need to refer to the global gt object on a per request basis (which will be changing it's default domain on each request):
Each domain supports:
getTranslation
getComment
setComment
setTranslation
deleteTranslation
compilePO
compileMO
Taken from here:
https://github.com/andris9/node-gettext/blob/e193c67fdee439ab9710441ffd9dd96d027317b9/lib/domain.js
update
A little bit of further clarity.
Once the server has loaded the file into memory the first time, it should remain there for all subsequent connections it receives (for any visitor/request) because it is stored globally and wont be garbage collected — unless you remove all references to the data, which would mean gettext would need to have some kind of unload/forget domain method.
Node is different to PHP in that its environment is shared and wraps its own HTTP server (if you are using something like Express), which means it is very easy to remember data globally as it has a constant environment that all the code is executed within. PHP is always executed after the HTTP server has received and dealt with the request (e.g. Apache). Each PHP response is then executed in its own separate run-time, which means you have to rely on databases, sessions and cache stores to share even simple information and most resources.
further optimisations
Obviously with the above you are constantly running translations on each page load. Which means the gettext library will still be using the translation data resident in memory, which will take up processing time. To get around this, it would be best to make sure your URLs have something that makes them unique for each different language i.e. my-page/en/ or my.page.fr or even jp.domain.co.uk/my-page and then enable some kind of full page caching using something like memcached or express-view-cache. However, once you start caching pages you need to make certain there aren't any regions that are user specific, if so, you need to start implement more complicated systems that are sensitive to these areas.
Remember: The golden rule of optimisation, don't do so before you need to... basically meaning I wouldn't worry about page caching until you know it's going to be an issue, but it is always worth bearing in mind what your options are, as it should shape your code design.
update 2
Just to illustrate a bit further on the behaviour of a server running in JavaScript, and how the global behaviour is not just a property of req.app, but in fact any object that is further up the scope chain.
So, as an example, instead of adding var languageDomains = {}; to your app.js, you could instantiate it further up the scope of wherever your middleware is placed. It's best to keep your global entities in one place however, so app.js is the better place, but this is just for illustration.
var languageDomains = {};
module.exports = function locale()
{
/// you can still access languageDomains here, and it will behave
/// globally for the entire server.
languageDomains[lang]
}
So basically, where-as with PHP, the entire code-base is re-executed on each request — so the languageDomains would be instantiated a-new each time — in Node the only part of the code to be re-executed is the code within locale() (because it is triggered as part of a new request). This function will still have a reference to the already existing and defined languageDomains via the scope chain. Because languageDomains is never reset (on a per request basis) it will behave globally.
Concurrent users
Node.js is single threaded. This means that in order for it to be concurrent i.e. handle multiple requests at the "same" time, you have to code your app in such a way that each little part can be executed very quickly and then slip into a waiting state, whilst another part of another request is dealt with.
This is the reason for the asynchronous and callback nature of Node, and the reason to avoid Sync calls whilst your app is running. Any one Sync request could halt or freeze execution of the thread and delay handling for all other requests. The reason why I state this is to give you a better idea of how multiple users might interact with your code (and global objects).
Basically once a request is being dealt with by your server, it is it's only focus, until that particular execution cycle ends i.e. your request handler stops calling other code that needs to run synchronously. Once that happens the next queued item is dealt with (a callback or something), this could be part of another request, or it could be the next part in the current request.

sails.js Use session param in model

This is an extension of this question.
In my models, every one requires a companyId to be set on creation and every one requires models to be filtered by the same session held companyid.
With sails.js, I have read and understand that session is not available in the model unless I inject it using the controller, however this would require me to code all my controller/actions with something very, very repetitive. Unfortunate.
I like sails.js and want to make the switch, but can anyone describe to me a better way? I'm hoping I have just missed something.
So, if I understand you correctly, you want to avoid lots of code like this in your controllers:
SomeModel.create({companyId: req.session.companyId, ...})
SomeModel.find({companyId: req.session.companyId, ...})
Fair enough. Maybe you're concerned that companyId will be renamed in the future, or need to be further processed. The simplest solution if you're using custom controller actions would be to make class methods for your models that accept the request as an argument:
SomeModel.doCreate(req, ...);
SomeModel.doFind(req, ...);
On the other hand, if you're on v0.10.x and you can use blueprints for some CRUD actions, you will benefit from the ability to override the blueprints with your own code, so that all of your creates and finds automatically use the companyId from the session.
If you're coming from a non-Node background, this might all induce some head-scratching. "Why can't you just make the session available everywhere?" you might ask. "LIKE THEY DO IN PHP!"
The reason is that PHP is stateless--every request that comes in gets essentially a fresh copy of the app, with nothing in memory being shared between requests. This means that any global variables will be valid for the life of a single request only. That wonderful $_SESSION hash is yours and yours alone, and once the request is processed, it disappears.
Contrast this with Node apps, which essentially run in a single process. Any global variables you set would be shared between every request that comes in, and since requests are handled asynchronously, there's no guarantee that one request will finish before another starts. So a scenario like this could easily occur:
Request A comes in.
Sails acquires the session for Request A and stores it in the global $_SESSION object.
Request A calls SomeModel.find(), which calls out to a database asynchronously
While the database does its magic, Request A surrenders its control of the Node thread
Request B comes in.
Sails acquires the session for Request B and stores it in the global $_SESSION object.
Request B surrenders its control of the thread to do some other asynchronous call.
Request A comes back with the result of its database call, and reads something from the $_SESSION object.
You can see the issue here--Request A now has the wrong session data. This is the reason why the session object lives inside the request object, and why it needs to be passed around to any code that wants to use it. Trying too hard to circumvent this will inevitably lead to trouble.
Best option I can think of is to take advantage of JS, and make some globally accessible functions.
But its gonna have a code smell :(
I prefer to make a policy that add the companyId inside the body.param like this:
// Needs to be Logged
module.exports = function(req, res, next) {
sails.log.verbose('[Policy.insertCompanyId() called] ' + __filename);
if (req.session) {
req.body.user = req.session.companyId;
//or something like AuthService.getCompanyId(req.session);
return next();
}
var err = 'Missing companyId';
//log ...
return res.redirect(307, '/');
};

Expressjs middleware example

Hellow I have code in my app.js, looking like that:
app.use('/someurl', require('./middleware/somemodule'));
-app.use instead app.all
and module looks like:
if(process.env.BLALAL === undefined){
throw "Error: process.env.BLALAL === undefined";
}
module.exports = function(req, res, next){
...
}
is it a bad practice ?
As said on the express api reference:
app.VERB(path, [callback...], callback)
The app.VERB() methods provide the routing functionality in Express,
where VERB is one of the HTTP verbs, such as app.post().
app.use([path], function)
Use the given middleware function, with optional mount path,
defaulting to "/".
The "mount" path is stripped and is not visible to the middleware
function. The main effect of this feature is that mounted middleware
may operate without code changes regardless of its "prefix" pathname.
IMO
The functionality may be nearly the same, but there is an underlying semantic meaning. The routes itself should be set through the app.VERB api, while any middleware should be set through the app.use api.
Normally middlewares modify the request or response objects, or inject functionality from other module that may answer the request, or not.
connect.static is a good example. It could be really an app or an HttpServer by itself, but is injected as middleware on other app object.
I personally don't like require inside other commands then var bla = require('bla');, it makes code much worse readable in my opinion and you did not get anything in return.
I am not sure what was your intention, but if your code depends on environment variable, it is better to throw immediately than later when your route is called. So app.use is better then app.all. But I don't understand why aren't you simply test your condition inside app.js and why you hide it in somemodule.

Resources