I'm in a situation I've never run into before, and I can't find anything on how it should be handled, so I thought I'd ask here (and perhaps start a good discussion on how it should be handled).
I'm writing a Node.js/Express application that does a series of database calls in various route handlers, using the node-sqlite3 module. In this case the user is uploading a file, so I take it from the form POST data, save it to the filesystem (it will be moved to blob storage later), and generate some XML files for other reasons. I name these files by their id in the database, which facilitates routes like '/file/:id' to GET the file later on and leverages the database to avoid name-collision issues.
I'm doing this in CoffeeScript and wrapping this model in a class, so it uses this (or @) to access these helper methods. Before, I was doing a last_insert_rowid() call to get the id of the thing I just inserted, but this opens me up to potential race conditions. As it turns out, when you do an INSERT with the node-sqlite3 module, the callback it calls when it's done saves the last inserted row id on the this object (so (err) -> fileid = @lastID). Now for some code (names changed and simplified).
uploadFile: (fileData, cb) ->
  @openConnectionIfClosed()
  @db.serialize =>
    # preserve the @ object to allow calls to class methods like @saveFile
    @db.run 'insert into thing (val=$val)', $val: 'some_val', (err) ->
      if err
        cb err
        return
      # @ here refers to the this object provided by @db.run
      fileid = @lastID
      async.waterfall [
        (async_cb) =>
          # uh oh, the @ object is the one from the db library, not my class,
          # earning me a 'method undefined' error
          @saveFile fileid, async_cb
        # etc... you get the idea
        ...
      ], (err) -> cb err
  @closeConnection()
So I need to retain the class's this object to access instance methods, but I also need to get the lastID value out of the this object the db callback is invoked with. Obviously if I change the db callback to a fat arrow I can access my class methods, but then @lastID returns undefined.
What's the "proper" way to achieve this? (In CoffeeScript or JavaScript, doesn't particularly matter.) I think I could solve this by assigning a context variable like ctxt = @ at the top of the method, but obviously that isn't favorable. Any ideas?
EDIT: Oh, and I forgot to mention, I can't just name the file something else, because I also need that id to update the database. The call to async.waterfall does: save file -> generate an XML metadata file and save it -> add the path of the XML file to the database entry retroactively. Though I could potentially use a statement to do that without completely isolated database calls; I haven't investigated that yet.
In JavaScript, if you need to preserve a particular context, it's standard practice to use the variable self, as such:
var self = this;
In CoffeeScript, the fat arrow is standard practice for cases like this, though using 'self' is not wrong either.
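Applied to the original problem, the two combine naturally: capture the class instance in self, and keep a plain (thin-arrow) function for the db.run callback so node-sqlite3 can still hand you this.lastID. Here's a minimal JavaScript sketch; the method and table names are taken from the question, the rest of the wiring is illustrative:
MyModel.prototype.uploadFile = function (fileData, cb) {
  var self = this; // capture the class instance before entering the callback
  self.openConnectionIfClosed();
  self.db.run('insert into thing (val) values ($val)', { $val: 'some_val' }, function (err) {
    if (err) return cb(err);
    var fileid = this.lastID; // plain function: `this` is node-sqlite3's statement context
    self.saveFile(fileid, cb); // `self` is still the class instance
  });
};
In CoffeeScript the same trick works: assign self = @ before the thin-arrow callback.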
Related
In Fastify.js you have at least two ways to register hooks: globally (via fastify.addHook()) or as a property inside the route declaration. In the example below I'm trying to use fastify-multer to handle file uploading, but the maximum number of files must be limited by a setting associated with a "room". As the app has many rooms, most of the requests contain a reference to a room, and every request is augmented with room settings by the preHandler hook.
import fastify from 'fastify'
import multer from 'fastify-multer'
const server = fastify()
server.register(multer.contentParser)
// For all requests containing the room ID, fetch the room options from the database
server.addHook('preHandler', async (request, reply) => {
if (request.body.roomID) {
const roomOptions = await getRoomOptions(request.body.roomID)
if (roomOptions) {
request.body.room = roomOptions
}
else {
// handle an error if the room doesn't exist
}
}
})
server.post('/post', {
// Limit the maximum amount of files to be uploaded based on room options
preHandler: upload.array('files', request.body.room.maxFiles)
})
In order for this setup to work, the global hook must always be executed before the file upload hook. How can I guarantee that?
Summary: As @Manuel Spigolon said:
How can I guarantee that? The framework does it
Now we can take Manuel's word for it (SPOILER ALERT: they are absolutely correct), or we can prove how this works by looking in the source code on GitHub.
The first thing to keep in mind is that arrays in JavaScript remain ordered according to the way elements are pushed into them, but don't take my word for it. That is all explained here if you want to dive a little deeper into the evidence. If that were not true, everything below wouldn't matter and you could just stop reading now.
How addHook works
Now that we have established that arrays maintain their order, let's look at how the addHook code is executed. We can start by looking at the default export of Fastify in the fastify.js file located in the root directory. In this object, if you scroll down, you'll see the addHook property defined. When we look into the addHook function implementation, we can see that it calls this[kHooks].add.
When we go back to see what the kHooks property is, we see that it is a new Hooks(). When we look at the add method on the Hooks object, we can see that it just validates the hook that is being added and then pushes it to the array property on the Hooks object with the matching hook name. This shows that hooks will always be in the order in which add was called for them.
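As a rough sketch (not Fastify's literal source), the idea looks like this: each hook name maps to an array, and add simply validates the handler and pushes it in call order:
class Hooks {
  constructor() {
    // one ordered array per supported lifecycle hook
    this.onRequest = [];
    this.preHandler = [];
    // ...and so on for the other hook names
  }
  add(name, fn) {
    if (typeof fn !== 'function') throw new TypeError('hook must be a function');
    this[name].push(fn); // push preserves insertion order
  }
}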
How fastify.route adds hooks
I hope you're following up to this point, because that only proves the order of the addHook calls in the respective array on the Hooks object. The next question is how these interact with the calls to the fastify.(get | post | route | ...) functions. We can walk through the fastify.get function, but they are all pretty much the same (you can do the same exercise with any of them). Looking at the get function, we see that the implementation just calls the router.prepareRoute function. When you look into the prepareRoute implementation, you see that this function returns a call to the route function. In the route function there is a section where the hooks are set up. It looks like this:
for (const hook of lifecycleHooks) {
const toSet = this[kHooks][hook]
.concat(opts[hook] || [])
.map(h => h.bind(this))
context[hook] = toSet.length ? toSet : null
}
What this does is go through every lifecycle hook, concatenate the hooks registered on the Fastify instance (this[kHooks][hook]) with the hooks in the route options (opts[hook]), and bind them all to the Fastify instance (this). Because concat appends the options' hooks to the end of the instance's hooks, the hooks in the route options are always added after the addHook handlers.
How Fastify executes hooks
This is not everything we need though. Now we know the order in which the hooks are stored, but how exactly are they executed? For that we can look at the hookRunner function in the hooks.js file. We see this function acts as a sort of recursive loop that continues running as long as the handlers do not error. It first creates a variable i to keep track of which handler it is currently on, then tries to execute the current handler and increments the counter (i).
If the handler fails (handleReject), it runs a callback function and does not call the next function to continue. If the handler succeeds (handleResolve), it just runs the next function to try the same process on the following handler (functions[i++]) in the functions set.
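To make that concrete, here's a stripped-down illustration of the pattern (this is not Fastify's actual code; the real runner also handles encapsulation and more result types):
function hookRunner(functions, request, reply, onDone) {
  let i = 0;
  function next(err) {
    // stop on error or when every handler has run
    if (err || i === functions.length) return onDone(err, request, reply);
    const result = functions[i++](request, reply, next);
    // if the handler returned a promise, wait for it before continuing
    if (result && typeof result.then === 'function') {
      result.then(() => next(), next);
    }
  }
  next();
}
Because next only ever advances to functions[i++], the handlers run strictly in the order they were stored.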
Why does this matter
This proves that the hook handlers are called in the order that they were pushed into the ordered collection. In other words:
How can I guarantee that? The framework does it
I am a bit new to JavaScript web dev, and so am still getting my head around the flow of asynchronous functions, which can be a bit unexpected to the uninitiated. In my particular use case, I want to execute a routine on the list of available databases before moving into the main code. Specifically, in order to ensure that a test environment is always properly initialized, I am dropping a database if it already exists, and then building it from configuration files.
The basic flow I have looks like this:
let dbAdmin = client.db("admin").admin();
dbAdmin.listDatabases(function(err, dbs){/*Loop through DBs and drop relevant one if present.*/});
return await buildRelevantDB();
By peppering some console.log() items throughout, I have determined that the listDatabases() call basically puts the callback into a queue of sorts: I actually enter buildRelevantDB() before entering the callback passed to listDatabases. In this particular example it seems to work anyway (I think because the call that reads the configuration file is also asynchronous and so puts its items into the same queue, but later), but I find this brittle and sloppy. There must be some way to ensure that the listDatabases portion resolves before moving forward.
The closest solution I found is here, but I still don't know how to get the callback I pass to listDatabases to be like a then as in that solution.
Mixing callbacks and promises is a bit more advanced technique, so if you are new to javascript try to avoid it. In fact, try to avoid it even if you already learned everything and became a js ninja.
Documentation for listDatabases says it is async, so you can just await it without messing around with callbacks:
const dbs = await dbAdmin.listDatabases();
/*Loop through DBs and drop relevant one if present.*/
The next thing: there is no need to await before return. If you can await within a function, that function is async and returns a promise anyway, so just return the promise from buildRelevantDB:
return buildRelevantDB();
Finally, you can drop database directly. No need to iterate over all databases to pick one you want to drop:
await client.db(<db name to drop>).dropDatabase();
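Putting the pieces together, the whole initialisation step becomes one awaited sequence. A sketch, assuming the official MongoDB Node.js driver; resetTestEnvironment is a made-up name and buildRelevantDB is the asker's function:
async function resetTestEnvironment(client, dbName) {
  // dropDatabase() returns a promise; awaiting it guarantees the drop
  // has finished before we move on
  await client.db(dbName).dropDatabase();
  // only now rebuild from the configuration files
  return buildRelevantDB();
}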
I was wondering what's the best practice and if I should create:
a directory in which declare statically all the errors my application uses, like api/errors/custom1Error
declare them directly inside the files
or put the files directly inside the dir that needs that error, like api/controller/error/formInvalidError
other options!?
A neat way of going about this would be to simply add the errors as custom responses under api/responses. This way even the invocation becomes pretty neat. Although the doc says you should add them directly in the responses directory, I'm sure there must be a way to nest them under, say, responses/errors. I'll try that out and post an update in a bit.
Alright, off a quick search, I couldn't find any way to nest the responses, but you can use a small workaround that's not quite as neat:
Create the responses/errors directory with all the custom error response handlers. Then create a custom response named something like custom.js and specify the error name when calling res.custom().
I'm adding a short snippet just for illustration:
api/responses/custom.js:
var customErrors = {
customError1: require('./errors/customError1'),
customError2: require('./errors/customError2')
};
module.exports = function custom (errorName, data) {
var req = this.req;
var res = this.res;
if (customErrors[errorName]) return customErrors[errorName](req, res, data);
else return res.negotiate();
}
From the controller:
res.custom('authError', data);
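For completeness, one of the handler files might look like this; it's a hypothetical sketch, and the status code and payload shape are entirely up to you:
// api/responses/errors/customError1.js
module.exports = function customError1(req, res, data) {
  res.status(400); // whatever status suits this error
  return res.json({ error: 'customError1', details: data });
};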
If you don't need logical processing for different errors, you can do away with the whole errors/ directory and directly invoke the respective views from custom.js:
module.exports = function custom (viewName, data) {
var req = this.req;
var res = this.res;
return res.view('errors/' + viewName, data); // assuming you have error views in views/errors
}
(You should first check if the view exists. Find out how on the linked page.)
Although I'm using something like this for certain purposes (dividing routes and so on), there definitely should be a way to include response handlers defined in different directories. (Perhaps by reconfiguring some grunt task?) I'll try to find that out and update if I find any success.
Good luck!
Update
Okay, so I found that the responses hook adds all files to res without checking whether they are directories, so adding a directory under responses results in a TypeError from lodash. I may be reading this wrong, but it seems reasonable to conclude that it's currently not possible to add a directory there, so you'll have to stick to one of the above solutions.
I just started developing with Node.js and I'm confused by the async model. I believe there is a way to turn most SYNC use cases into ASYNC ones. For example: with SYNC, we load some data, wait until it returns, then show it to the user; with ASYNC, we load the data and return immediately, telling the user the data will be presented later. I can understand why ASYNC is used in this scenario.
But here I have a use case. I'm building a web app that allows a user to place an order (buying something). Before saving the order data into the db, I want to put some user data together with the order data (I'm using a document NoSQL db, by the way). So I think with SYNC, after I get the order data, I would make a SYNC call to the database and wait for the returned user data. After I get the returned data, I would integrate the two and insert them into the db.
I think there might be an issue if I make an ASYNC call to the db to query the user data, because the user data may be returned after I've already saved the order to the db. And that's not what I want.
So in this case, how can I do this thing ASYNCHRONOUSLY?
Couple of things here. First, if your application already has the user data (the user is already logged in), then this information should be stored in the session so you don't have to access the DB. If you are allowing the user to register at the time of purchase, you would simply want to pass a callback function that handles saving the order into your call that saves the user data. Without knowing specifically what your code looks like, something like this is what you would be looking for:
function saveOrder(userData, orderData, callback) {
// save the user data to the DB
db.save(userData, function(rec) {
// if you need to add the user ID or something to the order...
orderData.userId = rec.id; // this would be dependent on your DB of choice
// save the order data to the DB
db.save(orderData, callback);
});
}
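Calling it from your route handler might then look something like this (the response shape is hypothetical):
saveOrder(userData, orderData, function (orderRec) {
  // both saves have completed by the time this runs
  res.json({ orderId: orderRec.id });
});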
Sync code goes something like this: step by step, one after the other. There can be ifs and loops (for) etc.; we all get it.
fetchUserDataFromDB();
integrateOrderDataAndUserData();
updateOrderData();
Think of async programming with Node.js as event driven, like UI programming: code (a function) is executed when an event occurs. E.g. on a click event, the framework calls back the registered clickHandler.
Node.js async programming can also be thought of along these lines. When the (async) db query finishes executing, your callback is called. When the order data is updated, your callback is called. The above code goes something like this:
function nodejsOrderHandler(req,res)
{
var orderData;
db.queryAsync(..., onqueryasync);
function onqueryasync(userdata)
{
// integrate user data with order data
db.update(updateParams, onorderupdate);
}
function onorderupdate(e, r)
{
// handle error
// write response
}
}
JavaScript closures provide the way to keep state in variables across functions.
There is certainly much more to async programming, and there are helper modules that help with basic constructs like chain, parallel, join etc. as you write more involved async code, but this probably gives you a quick idea.
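For example, the async module's waterfall helper expresses the same fetch -> integrate -> update chain without the nesting. A sketch; integrate, userQuery and updateParams are hypothetical placeholders:
var async = require('async');

async.waterfall([
  function (next) {
    db.queryAsync(userQuery, next); // calls next(err, userdata)
  },
  function (userdata, next) {
    var updateParams = integrate(orderData, userdata); // merge the two records
    db.update(updateParams, next);
  }
], function (err, result) {
  // handle error, write response
});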
I'm new to Node but I figured I'd jump right in and start converting a PHP app into Node/Express. It's a bilingual app that uses gettext with PO/MO files. I found a Node module called node-gettext. I'd rather not convert the PO files into another format right now, so it seems this library is my only option.
So my concern is that right now, before every page render, I'm doing something like this:
exports.home_index = function(req, res)
{
var gettext = require('node-gettext'),
gt = new gettext();
var fs = require('fs');
gt.textdomain('de');
var fileContents = fs.readFileSync('./locale/de.mo');
gt.addTextdomain('de', fileContents);
res.render(
'home/index.ejs',
{ gt: gt }
);
};
I'll also be using the translations in classes, so with how it's set up now I'd have to load the entire translation file again every time I want to translate something in another place.
The translation file is about 50k and I really don't like having to do file operations like this on every page load. In Node/Express, what would be the most efficient way to handle this (aside from a database)? Usually a user won't even be changing their language after the first time (if they're changing it from English).
EDIT:
Ok, I have no idea if this is a good approach, but it at least lets me reuse the translation file in other parts of the app without reloading it everywhere I need to get translated text.
In app.js:
var express = require('express'),
app = express(),
...
gettext = require('node-gettext'),
gt = new gettext();
Then, also in app.js, I create the variable app.locals.gt to contain the gettext/translation object, and I include my middleware function:
app.locals.gt = gt;
app.use(locale());
In my middleware file I have this:
module.exports = function locale()
{
return function(req, res, next)
{
// do stuff here to populate lang variable
var fs = require('fs');
req.app.locals.gt.textdomain(lang);
var fileContents = fs.readFileSync('./locales/' + lang + '.mo');
req.app.locals.gt.addTextdomain(lang, fileContents);
next();
};
};
It doesn't seem like a good idea to assign the loaded translation file to app, since depending on the current request that file will be one of two languages. If I assigned the loaded translation file to app instead of a request variable, could that mix up users' languages?
Anyway, I know there's got to be a better way of doing this.
The simplest option would be to do the following:
Add this in app.js:
app.locals.languageDomains = {};
Then modify your Middleware:
module.exports = function locale()
{
return function(req, res, next)
{
// do stuff here to populate lang variable
if ( !req.app.locals.languageDomains[lang] ) {
var fs = require('fs');
var fileContents = fs.readFileSync('./locales/' + lang + '.mo');
req.app.locals.languageDomains[lang] = true;
req.app.locals.gt.addTextdomain(lang, fileContents);
}
req.textdomain = req.app.locals.gt.textdomain(lang);
next();
};
};
By checking whether the file has already been loaded, you are preventing the action from happening multiple times, and the domain data will stay resident in the server's memory. The downside to the simplicity of this solution is that if you ever change the contents of your .mo files while the server is running, the changes won't be taken into account. However, this code could be extended to keep an eye on the mtime of the files and reload accordingly, or to make use of fs.watchFile, if required:
if ( !req.app.locals.languageDomains[lang] ) {
var fs = require('fs'), filename = './locales/' + lang + '.mo';
var fileContents = fs.readFileSync(filename);
fs.watchFile(filename, function (curr, prev) {
req.app.locals.gt.addTextdomain(lang, fs.readFileSync(filename));
});
req.app.locals.languageDomains[lang] = true;
req.app.locals.gt.addTextdomain(lang, fileContents);
}
Warning: It should also be noted that using sync versions of functions outside of server initialisation is not a good idea because it can freeze the thread. You'd be better off changing your sync loading to the async equivalent.
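For example, the readFileSync call in the middleware could be swapped for the asynchronous equivalent. A sketch of just the load step, using the same names as the middleware above:
var fs = require('fs');

fs.readFile('./locales/' + lang + '.mo', function (err, fileContents) {
  if (err) return next(err); // fail the request rather than freezing the thread
  req.app.locals.languageDomains[lang] = true;
  req.app.locals.gt.addTextdomain(lang, fileContents);
  next();
});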
After the above changes, rather than passing gt to your template, you should be able to use req.textdomain instead. It seems that the gettext library supports a number of requests directly on each domain object, which means you hopefully don't need to refer to the global gt object on a per-request basis (which will be changing its default domain on each request):
Each domain supports:
getTranslation
getComment
setComment
setTranslation
deleteTranslation
compilePO
compileMO
Taken from here:
https://github.com/andris9/node-gettext/blob/e193c67fdee439ab9710441ffd9dd96d027317b9/lib/domain.js
update
A little bit of further clarity.
Once the server has loaded the file into memory the first time, it should remain there for all subsequent connections it receives (for any visitor/request), because it is stored globally and won't be garbage collected, unless you remove all references to the data, which would mean gettext would need to have some kind of unload/forget-domain method.
Node is different to PHP in that its environment is shared and wraps its own HTTP server (if you are using something like Express), which means it is very easy to remember data globally as it has a constant environment that all the code is executed within. PHP is always executed after the HTTP server has received and dealt with the request (e.g. Apache). Each PHP response is then executed in its own separate run-time, which means you have to rely on databases, sessions and cache stores to share even simple information and most resources.
further optimisations
Obviously with the above you are still running translations on each page load, which means the gettext library will still be working through the translation data resident in memory, and that takes up processing time. To get around this, it would be best to make sure your URLs have something that makes them unique for each language, i.e. my-page/en/ or my.page.fr or even jp.domain.co.uk/my-page, and then enable some kind of full-page caching using something like memcached or express-view-cache. However, once you start caching pages you need to make certain there aren't any regions that are user-specific; if there are, you need to start implementing more complicated systems that are sensitive to those areas.
Remember: the golden rule of optimisation is don't do it before you need to. Basically, I wouldn't worry about page caching until you know it's going to be an issue, but it is always worth bearing in mind what your options are, as they should shape your code design.
update 2
Just to illustrate a bit further the behaviour of a server running in JavaScript: the global behaviour is not just a property of req.app, but in fact of any object further up the scope chain.
So, as an example, instead of defining languageDomains in your app.js, you could instantiate it further up the scope of wherever your middleware is placed. It's best to keep your global entities in one place however, so app.js is the better place, but this is just for illustration.
var languageDomains = {};
module.exports = function locale()
{
  // you can still access languageDomains here, and it will behave
  // globally for the entire server:
  languageDomains[lang];
}
So basically, whereas with PHP the entire code-base is re-executed on each request (so languageDomains would be instantiated anew each time), in Node the only part of the code to be re-executed is the code within locale() (because it is triggered as part of a new request). This function will still have a reference to the already existing and defined languageDomains via the scope chain. Because languageDomains is never reset (on a per-request basis) it will behave globally.
Concurrent users
Node.js is single threaded. This means that in order for it to be concurrent, i.e. handle multiple requests at the "same" time, you have to code your app in such a way that each little part can be executed very quickly and then slip into a waiting state while another part of another request is dealt with.
This is the reason for the asynchronous and callback nature of Node, and the reason to avoid Sync calls whilst your app is running. Any one Sync request could halt or freeze execution of the thread and delay handling for all other requests. The reason why I state this is to give you a better idea of how multiple users might interact with your code (and global objects).
Basically, once a request is being dealt with by your server, it is its only focus until that particular execution cycle ends, i.e. your request handler stops calling other code that needs to run synchronously. Once that happens, the next queued item is dealt with (a callback or something); this could be part of another request, or it could be the next part of the current request.