node.js with redis: synchronous or asynchronous? - node.js

In my app (node / express / redis), I use some code to update several items in DB at the same time:
app.put('myaction', function(req, res){
// delete stuff
db.del("key1");
db.srem("set1", "test");
// Add stuff
db.sadd("set2", "test2");
db.sadd("set3", "test3");
db.hmset("hash1", "k11", "v11", "k21", "v21");
db.hmset("hash2", "k12", "v12", "k22", "v22");
// ...
// Send response back
res.writeHead(200, {'content-type': 'application/json'});
res.write(JSON.stringify({ "status" : "ok" }));
res.end();
});
Can I be sure ALL those actions will be performed before the method returns ? My concern is the asynchronous processing. As I do not use callback function in the db actions, will this be alright ?

While all of the commands are sent and responses parsed asynchronously, it's useful to note that the callbacks are all invoked in order. So you can use the callback of the last Redis command to send the response to the client, and then you'll know that all of the Redis commands have been executed before responding.

Use the MULTI/EXEC command to create a queue of your commands and execute them in a row. Then use a callback to send a coherent response back (success/failure).
Note that you must use Redis' AOF to avoid that - in case of crash - the db state is not coherent with your logic because only a part of the commands in the queue were executed: i.e. MULTI/EXEC is not transactional upon execution. This is a useful reference.

I haven't worked with redis, but if this works(if you it doesn't call undefined function) and it should be asynchronous, then you can use it. But if there is an error in updating, then you can't handle it, this way.

No, you can't be sure if all those actions complete successfully, because your redis server might crash.. To speed things up, you can group all your update commands into one with pipelining (does your redis driver support that?), then get the success or failure of the whole operation via a callback and proceed..

Related

Is there any risk to read/write the same file content from different 'sessions' in Node JS?

I'm new in Node JS and i wonder if under mentioned snippets of code has multisession problem.
Consider I have Node JS server (express) and I listen on some POST request:
app.post('/sync/:method', onPostRequest);
var onPostRequest = function(req,res){
// parse request and fetch email list
var emails = [....]; // pseudocode
doJob(emails);
res.status(200).end('OK');
}
function doJob(_emails){
try {
emailsFromFile = fs.readFileSync(FILE_PATH, "utf8") || {};
if(_.isString(oldEmails)){
emailsFromFile = JSON.parse(emailsFromFile);
}
_emails.forEach(function(_email){
if( !emailsFromFile[_email] ){
emailsFromFile[_email] = 0;
}
else{
emailsFromFile[_email] += 1;
}
});
// write object back
fs.writeFileSync(FILE_PATH, JSON.stringify(emailsFromFile));
} catch (e) {
console.error(e);
};
}
So doJob method receives _emails list and I update (counter +1) these emails from object emailsFromFile loaded from file.
Consider I got 2 requests at the same time and it triggers doJob twice. I afraid that when one request loaded emailsFromFile from file, the second request might change file content.
Can anybody spread the light on this issue?
Because the code in the doJob() function is all synchronous, there is no risk of multiple requests causing a concurrency problem.
If you were using async IO in that function, then there would be possible concurrency issues.
To explain, Javascript in node.js is single threaded. So, there is only one thread of Javascript execution running at a time and that thread of execution runs until it returns back to the event loop. So, any sequence of entirely synchronous code like you have in doJob() will run to completion without interruption.
If, on the other hand, you use any asynchronous operations such as fs.readFile() instead of fs.readFileSync(), then that thread of execution will return back to the event loop at the point you call fs.readFileSync() and another request can be run while it is reading the file. If that were the case, then you could end up with two requests conflicting over the same file. In that case, you would have to implement some form of concurrency protection (some sort of flag or queue). This is the type of thing that databases offer lots of features for.
I have a node.js app running on a Raspberry Pi that uses lots of async file I/O and I can have conflicts with that code from multiple requests. I solved it by setting a flag anytime I'm writing to a specific file and any other requests that want to write to that file first check that flag and if it is set, those requests going into my own queue are then served when the prior request finishes its write operation. There are many other ways to solve that too. If this happens in a lot of places, then it's probably worth just getting a database that offers features for this type of write contention.

node-vertica throws exception for multi-queries with a callback

I have created a simple web interface for vertica.
I expose simple operation above a vertica cluster.
one of the functionality I expose is querying vertica.
when my user enters a multi-query the node modul throws an exception and my process exits with exit 1.
Is there any way to catch this exception?
Is there any way overcome the problem in a different way?
Right now there's no way to overcome this when using a callback for the query result.
Preventing this from happening would involve making sure there's only one query in the user's input. This is hard because it involves parsing SQL.
The callback API isn't built to deal with multi-queries. I simply haven't bothered implementing proper handling of this case, because this has never been an issue for me.
Instead of a callback, you could use the event listener API, which will send you lower level messages, and handle this yourself.
q = conn.query("SELECT...; SELECT...");
q.on("fields", function(fields) { ... }); // 1 time per query
q.on("row", function(row) { ... }); // 0...* time per query
q.on("end", function(status) { ... }); // 1 time per query

Node.js: Will node always wait for setTimeout() to complete before exiting?

Consider:
node -e "setTimeout(function() {console.log('abc'); }, 2000);"
This will actually wait for the timeout to fire before the program exits.
I am basically wondering if this means that node is intended to wait for all timeouts to complete before quitting.
Here is my situation. My client has a node.js server he's gonna run from Windows with a Shortcut icon. If the node app encounters an exceptional condition, it will typically instantly exit, not leaving enough time to see in the console what the error was, and this is bad.
My approach is to wrap the entire program with a try catch, so now it looks like this: try { (function () { ... })(); } catch (e) { console.log("EXCEPTION CAUGHT:", e); }, but of course this will also cause the program to immediately exit.
So at this point I want to leave about 10 seconds for the user to take a peek or screenshot of the exception before it quits.
I figure I should just use blocking sleep() through the npm module, but I discovered in testing that setting a timeout also seems to work. (i.e. why bother with a module if something builtin works?) I guess the significance of this isn't big, but I'm just curious about whether it is specified somewhere that node will actually wait for all timeouts to complete before quitting, so that I can feel safe doing this.
In general, node will wait for all timeouts to fire before quitting normally. Calling process.exit() will exit before the timeouts.
The details are part of libuv, but the documentation makes a vague comment about it:
http://nodejs.org/api/all.html#all_ref
you can call ref() to explicitly request the timer hold the program open
Putting all of the facts together, setTimeout by default is designed to hold the event loop open (so if that's the only thing pending, the program will wait). You can programmatically disable or re-enable the behavior.
Late answer, but a definite yes - Nodejs will wait around for setTimeout to finish - see this documentation. Coincidentally, there is also a way to not wait around for setTimeout, and that is by calling unref on the object returned from setTimeout or setInterval.
To summarize: if you want Nodejs to wait until the timeout has been called, there's nothing you need to do. If you want Nodejs to not wait for a particular timeout, call unref on it.
If node didn't wait for all setTimeout or setInterval calls to complete, you wouldn't be able to use them in simple scripts.
Once you tell node to listen for an event, as with the setTimeout or some async I/O call, the event loop will loop until it is told to exit.
Rather than wrap everything in a try/catch you can bind an event listener to process just as the example in the docs:
process.on('uncaughtException', function(err) {
console.log('Caught exception: ' + err);
});
setTimeout(function() {
console.log('This will still run.');
}, 500);
// Intentionally cause an exception, but don't catch it.
nonexistentFunc();
console.log('This will not run.');
In the uncaughtException event, you can then add a setTimeout to exit after 10 seconds:
process.on('uncaughtException', function(err) {
console.log('Caught exception: ' + err);
setTimeout(function(){ process.exit(1); }, 10000);
});
If this exception is something you can recover from, you may want to look at domains: http://nodejs.org/api/domain.html
edit:
There may actually be another issue at hand: your client application doesn't do enough (or any?) logging. You can use log4js-node to write to a temp file or some application-specific location.
Easy way Solution:
Make a batch (.bat) file that starts nodejs
make a shortcut out of it
Why this is best. This way you client would run nodejs in command line. And even if nodejs program returns nothing would happen to command line.
Making bat file:
Make a text file
put START cmd.exe /k "node abc.js"
Save it
Rename It to abc.bat
make a shortcut or whatever.
Opening it will Open CommandLine and run nodejs file.
using settimeout for this is a bad idea.
The odd ones out are when you call process.exit() or there's an uncaught exception, as pointed out by Jim Schubert. Other than that, node will wait for the timeout to complete.
Node does remember timers, but only if it can keep track of them. At least that is my experience.
If you use setTimeout in an arrow / anonymous function I would recommend to keep track of your timers in an array, like:
=> {
timers.push(setTimeout(doThisLater, 2000));
}
and make sure let timers = []; isn't set in a method that will vanish, so i.e. globally.

How to avoid the need to delay event emission to the next tick of the event loop?

I'm writing a Node.js application using a global event emitter. In other words, my application is built entirely around events. I find this kind of architecture working extremely well for me, with the exception of one side case which I will describe here.
Note that I do not think knowledge of Node.js is required to answer this question. Therefore I will try to keep it abstract.
Imagine the following situation:
A global event emitter (called mediator) allows individual modules to listen for application-wide events.
A HTTP Server is created, accepting incoming requests.
For each incoming request, an event emitter is created to deal with events specific to this request
An example (purely to illustrate this question) of an incoming request:
mediator.on('http.request', request, response, emitter) {
//deal with the new request here, e.g.:
response.send("Hello World.");
});
So far, so good. One can now extend this application by identifying the requested URL and emitting appropriate events:
mediator.on('http.request', request, response, emitter) {
//identify the requested URL
if (request.url === '/') {
emitter.emit('root');
}
else {
emitter.emit('404');
}
});
Following this one can write a module that will deal with a root request.
mediator.on('http.request', function(request, response, emitter) {
//when root is requested
emitter.once('root', function() {
response.send('Welcome to the frontpage.');
});
});
Seems fine, right? Actually, it is potentially broken code. The reason is that the line emitter.emit('root') may be executed before the line emitter.once('root', ...). The result is that the listener never gets executed.
One could deal with this specific situation by delaying the emission of the root event to the end of the event loop:
mediator.on('http.request', request, response, emitter) {
//identify the requested URL
if (request.url === '/') {
process.nextTick(function() {
emitter.emit('root');
});
}
else {
process.nextTick(function() {
emitter.emit('404');
});
}
});
The reason this works is because the emission is now delayed until the current event loop has finished, and therefore all listeners have been registered.
However, there are many issues with this approach:
one of the advantages of such event based architecture is that emitting modules do not need to know who is listening to their events. Therefore it should not be necessary to decide whether the event emission needs to be delayed, because one cannot know what is going to listen for the event and if it needs it to be delayed or not.
it significantly clutters and complexifies code (compare the two examples)
it probably worsens performance
As a consequence, my question is: how does one avoid the need to delay event emission to the next tick of the event loop, such as in the described situation?
Update 19-01-2013
An example illustrating why this behavior is useful: to allow a http request to be handled in parallel.
mediator.on('http.request', function(req, res) {
req.onceall('json.parsed', 'validated', 'methodoverridden', 'authenticated', function() {
//the request has now been validated, parsed as JSON, the kind of HTTP method has been overridden when requested to and it has been authenticated
});
});
If each event like json.parsed would emit the original request, then the above is not possible because each event is related to another request and you cannot listen for a combination of actions executed in parallel for a specific request.
Having both a mediator which listens for events and an emitter which also listens and triggers events seems overly complicated. I'm sure there is a legit reason but my suggestion is to simplify. We use a global eventBus in our nodejs service that does something similar. For this situation, I would emit a new event.
bus.on('http:request', function(req, res) {
if (req.url === '/')
bus.emit('ns:root', req, res);
else
bus.emit('404');
});
// note the use of namespace here to target specific subsystem
bus.once('ns:root', function(req, res) {
res.send('Welcome to the frontpage.');
});
It sounds like you're starting to run into some of the disadvantages of the observer pattern (as mentioned in many books/articles that describe this pattern). My solution is not ideal – assuming an ideal one exists – but:
If you can make a simplifying assumption that the event is emitted only 1 time per emitter (i.e. emitter.emit('root'); is called only once for any emitter instance), then perhaps you can write something that works like jQuery's $.ready() event.
In that case, subscribing to emitter.once('root', function() { ... }) will check whether 'root' was emitted already, and if so, will invoke the handler anyway. And if 'root' was not emitted yet, it'll defer to the normal, existing functionality.
That's all I got.
I think this architecture is in trouble, as you're doing sequential work (I/O) that requires definite order of actions but still plan to build app on components that naturally allow non-deterministic order of execution.
What you can do
Include context selector in mediator.on function e.g. in this way
mediator.on('http.request > root', function( .. ) { } )
Or define it as submediator
var submediator = mediator.yield('http.request > root');
submediator.on(function( ... ) {
emitter.once('root', ... )
});
This would trigger the callback only if root was emitted from http.request handler.
Another trickier way is to make background ordering, but it's not feasible with your current one mediator rules them all interface. Implement code so, that each .emit call does not actually send the event, but puts the produced event in list. Each .once puts consume event record in the same list. When all mediator.on callbacks have been executed, walk through the list, sort it by dependency order (e.g. if list has first consume 'root' and then produce 'root' swap them). Then execute consume handlers in order. If you run out of events, stop executing.
Oi, this seems like a very broken architecture for a few reasons:
How do you pass around request and response? It looks like you've got global references to them.
If I answer your question, you will turn your server into a pure synchronous function and you'd lose the power of async node.js. (Requests would be queued effectively, and could only start executing once the last request is 100% finished.)
To fix this:
Pass request & response to the emit() call as parameters. Now you don't need to force everything to run synchronously anymore, because when the next component handles the event, it will have a reference to the right request & response objects.
Learn about other common solutions that don't need a global mediator. Look at the pattern that Connect was based on many Internet-years ago: http://howtonode.org/connect-it <- describes middleware/onion routing

Meteor blocking clarification

According to the meteor docs, inserts block:
On the server, if you don't provide a callback, then insert blocks
until the database acknowledges the write, or throws an exception if
something went wrong. If you do provide a callback, insert still
returns the ID immediately.
So this would be wrong:
Meteor.methods({
post: function (options) {
return Stories.insert(options)
}
});
I need to do this:
Meteor.methods({
post: function (options) {
return Stories.insert(options, function(){})
}
});
Can somebody confirm that this is the case? The former will block the ENTIRE SERVER until the db returns?
Yeah, it will block, but not the entire server.
In Meteor, your server code runs in a single thread per request, not in the asynchronous callback style typical of Node. We find the linear execution model a better fit for the typical server code in a Meteor application.
So, if you are worried about that it will block the entire server as it will do in typical Node, don't be.

Resources