Synchronous NodeJs (or other serverside JS) call - node.js

We are developing with Node; 95% of the code is async and works fine.
One small module (about 5%) is synchronous in nature [and depends on other third-party software],
and for it we are looking for:
1. Code that blocks until the callback has finished.
2. Only one instance of function1 plus its callback executing at a time.
PS 1: I completely agree that Node is meant for async work and that blocking should be avoided, but this is a separate, non-realtime process.
PS 2: If not Node, is there any other server-side JS framework? The last option is to use another language such as Python, but if anything is possible in JS, we are ready to give it a shot!

Seq should solve your problem.
For an overview of sync modules, see http://nodejsrocks.blogspot.de/2012/05/how-to-avoid-nodejs-spaghetti-code-with.html
Seq()
    .seq(function () {
        var next = this; // Seq's continuation; `this` is different inside the query callback
        mysql.query("select * from foo", [], function (err, rows, fields) {
            next(err, rows);
        });
    })
    .seq(function (mysqlResult) {
        console.log("mysql callback returns: " + mysqlResult);
    });

There are lots and lots of options: look at node-async, Kaffeine's async support, IcedCoffeeScript, etc.
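Requirement 2 in the question (only one instance of function1 plus its callback running at a time) can also be met without a control-flow library. The following is a minimal sketch, assuming a runtime with native promises. The names `serialize` and `function1Serial`, and the timer-based body of `function1`, are made up for illustration and are not part of Seq or any library mentioned above. Nothing here blocks the event loop; callers wait on the returned promise instead.

```javascript
// Wrap a callback-style function so that calls run strictly one after another.
function serialize(fn) {
  let tail = Promise.resolve(); // end of the queue of pending calls
  return function (...args) {
    const run = () => new Promise((resolve, reject) => {
      fn(...args, (err, value) => (err ? reject(err) : resolve(value)));
    });
    const result = tail.then(run);
    tail = result.catch(() => {}); // a failed call must not block the queue
    return result;
  };
}

// Hypothetical stand-in for the third-party call from the question.
let active = 0;
let maxActive = 0;
function function1(name, callback) {
  active++;
  maxActive = Math.max(maxActive, active);
  setTimeout(() => {
    active--;
    callback(null, name + ' done');
  }, 20);
}

const function1Serial = serialize(function1);
const results = Promise.all(['a', 'b', 'c'].map((n) => function1Serial(n)));
results.then((values) => console.log(values, 'max concurrent:', maxActive));
```

Each call is chained onto the previous one, so the second call does not start until the first callback has fired.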

I want to make a plug for IcedCoffeeScript since I'm its maintainer. You can get by with solutions like Seq, but in general you'll wind up encoding control flow with function calls. I find that approach difficult to write and maintain. IcedCoffeeScript makes simple sequential operations a breeze:
console.log "hello, just wait a sec"
await setTimeout defer(), 100
console.log "ok, what did you want"
But more important, it handles any combination of async code and standard control flow:
console.log "Let me check..."
if isRunningLate()
  console.log "Can't stop now, sorry!"
else
  await setTimeout defer(), 1000
  console.log "happy to wait, now what did you want?"
resumeWhatIWasDoingBefore()
Loops also work well. Here is serial dispatch:
for i in [0...10]
  await launchRpc defer res[i]
done()
And here is parallel dispatch:
await
  for i in [0...10]
    launchRpc defer res[i]
done()
Not only does ICS make sequential chains of async code smoother, it also encourages you to do as much as possible in parallel. If you need to change your code or your concurrency requirements, the changes are minimal, not a complete rewrite (as it would be in standard JS/CS or with some concurrency libraries).
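For comparison, the same two dispatch patterns can be written in plain JavaScript with async/await. This is a sketch only, not ICS output: `launchRpc` here is a hypothetical promise-returning stand-in for the callback-style function used above.

```javascript
// Hypothetical RPC: resolves with i * 2 after a short delay.
function launchRpc(i) {
  return new Promise((resolve) => setTimeout(() => resolve(i * 2), 10));
}

// Serial dispatch: each call starts only after the previous one finished.
async function serial() {
  const res = [];
  for (let i = 0; i < 10; i++) {
    res[i] = await launchRpc(i);
  }
  return res;
}

// Parallel dispatch: start all calls first, then wait for all of them.
async function parallel() {
  const pending = [];
  for (let i = 0; i < 10; i++) {
    pending.push(launchRpc(i));
  }
  return Promise.all(pending);
}

serial().then((res) => console.log('serial:', res));
parallel().then((res) => console.log('parallel:', res));
```

Both functions produce the same array; only the amount of overlap between the calls differs.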

Related

When should I split some task into asynchronous tinier tasks?

I'm writing a personal project in Node and I'm trying to figure out when a task should be split into asynchronous pieces. Let's say I have this four-step task. The steps are not very expensive (the most expensive one iterates over an array of objects and tries to match a URL against a RegExp, and the array probably won't have more than 20 or 30 objects).
part1().then(y => {
    return doTheSecondPart(y);
}).then(z => {
    return doTheThirdPart(z);
}).then(c => {
    return doTheFourthPart(c);
});
The other way would be to just execute them one after another, but nothing else would progress until the task is done. With the approach above, other tasks can progress at least a little between parts.
Are there any criteria for when this approach should be preferred over a classic synchronous one?
Sorry for my bad English; it is not my native language.
All you've described is synchronous code that isn't very long to run. First off, there's no reason to even use promises for that type of code. Secondly, there's no reason to break it up into chunks. All you would be doing with either of those choices is making the code more complicated to write, more complicated to test and more complicated to understand and it would also run slower. All of those are undesirable.
If you force even synchronous code into a promise, then a .then() handler will give some other code a chance to run between .then() handlers, but only certain types of events can be run there because processing a resolved promise is one of the highest priority things to do in the event queue system. It won't, for example, allow another incoming http request arriving on your server to start to run.
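The priority of promise callbacks over other queued work can be observed directly. A small sketch (the `order` array is just an illustration device): the `.then()` handlers run as soon as the current synchronous code finishes, ahead of a timer that was scheduled earlier.

```javascript
const order = [];

// Queued first, but as a timer it waits until the promise callbacks have run.
setTimeout(() => order.push('timer'), 0);

Promise.resolve()
  .then(() => order.push('then 1'))  // promise callback (microtask)
  .then(() => order.push('then 2')); // queued once 'then 1' has run

order.push('sync'); // the current synchronous code always finishes first

setTimeout(() => console.log(order.join(' -> ')), 10);
// prints: sync -> then 1 -> then 2 -> timer
```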
If you truly wanted to allow other requests to run and so on, you would be better off just putting the code (without promises) into a worker thread (Node's worker_threads module) and letting it run there and then communicate back the result via messaging. If you wanted to keep it in the main thread, but let any other code run, you'd probably have to use a short setTimeout() delay to truly let all possible other types of tasks run in between.
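A minimal sketch of the worker approach, assuming Node 12 or later where `worker_threads` is available. The function name `matchInWorker` and the sample URLs are hypothetical; the worker code is passed inline with `eval: true` only to keep the example in one file.

```javascript
const { Worker } = require('worker_threads');

// Run the URL matching in a separate thread and deliver the result via messaging.
function matchInWorker(urls, pattern) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(`
      const { parentPort, workerData } = require('worker_threads');
      const re = new RegExp(workerData.pattern);
      parentPort.postMessage(workerData.urls.filter((u) => re.test(u)));
    `, { eval: true, workerData: { urls, pattern } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

const matched = matchInWorker(
  ['https://example.com/a', 'https://example.org/b'],
  'example\\.com'
);
matched.then((urls) => console.log('matched:', urls));
```

For an array of 20 to 30 items, the cost of starting a thread is likely to exceed the cost of the work itself, which is consistent with the advice above to leave such code synchronous.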
So, if this code doesn't take much time to run, there's really no reason to complicate it. Just let it run in the fastest and simplest way.
If you want more concrete advice, then please show some actual code and provide some timing information about how long it takes to run. Iterating through an array of 20-30 objects is nothing in the general scheme of things and is not a reason to rewrite it into timesliced pieces.
As for code that iterates over an array/list of items doing matching against some string, this is exactly what the Express web server framework does on every incoming URL to find the matching routes. That is not a slow thing to do in Javascript.
Asynchronous programming is a better fit for code that must respond to events – for example, any kind of graphical UI. An example of a situation where programmers use async but shouldn't is any code that can focus entirely on data processing and can accept a “stop-the-world” block while waiting for data to download.
I use it extensively with a REST API server, as we have no idea how long a request can take to get a response from the server. So that we don't "block the app" while waiting for the server response, async requests are most useful.
part1().then(y => {
    return doTheSecondPart(y);
}).then(z => {
    return doTheThirdPart(z);
}).then(c => {
    return doTheFourthPart(c);
});
What you have described in your sample is much more of a synchronous, procedural process, and it would not necessarily allow your interface to keep working while your algorithm is busy.
In the case of a server call, if you are still waiting for the server to respond, the algorithm using then is still using up resources and won't free your app to handle other user interface events while it waits to reach the next then statement.
You should use async/await in cases where you are waiting for a user event or a server response but do not want your app to hang while waiting for the data:
async function wait() {
    await new Promise(resolve => setTimeout(resolve, 2000));
    console.log("awaiting for server once !!");
    return 10;
}
async function wait2() {
    await new Promise(resolve => setTimeout(resolve, 3000));
    console.log("awaiting for server twice !!");
    return 10;
}
async function f() {
    let promise = new Promise((resolve, reject) => {
        setTimeout(() => resolve("done!"), 1000);
    });
    let result = await promise; // wait until the promise resolves
    console.log(result); // "done!"
    let promise6 = await wait();
    let promise7 = await wait2();
}
f();
This sample should help you gain a basic understanding of how async/await works. Here are a few resources for further reading:
Promises and Async
Mozilla References

Sleep main thread but do not block callbacks

This code works because system-sleep blocks execution of the main thread but does not block callbacks. However, I am concerned that system-sleep is not 100% portable because it relies on the deasync npm module which relies on C++.
Are there any alternatives to system-sleep?
var sleep = require('system-sleep')
var done = false
setTimeout(function() {
    done = true
}, 1000)
while (!done) {
    sleep(100) // without this line the while loop causes problems because it is a spin wait
    console.log('sleeping')
}
console.log('If this is displayed then it works!')
PS Ideally, I want a solution that works on Node 4+ but anything is better than nothing.
PPS I know that sleeping is not best practice but I don't care. I'm tired of arguments against sleeping.
Collecting my comments into an answer per your request:
Well, deasync (which sleep() depends on) uses quite a hack. It is a native code node.js add-on that manually runs the event loop from C++ code in order to do what it is doing. Only someone who really knows the internals of node.js (now and in the future) could imagine what the issues are with doing that. What you are asking for is not possible in regular Javascript code without hacking the node.js native code because it's simply counter to the way Javascript was designed to run in node.js.
Understood and thanks. I am trying to write a more reliable deasync module (deasync fails on some platforms) that doesn't use a hack. Obviously the approach I've given is not the answer. I want it to support Node 4. I'm now thinking of using yield / async combined with Babel, but I'm not sure that's what I'm after either. I need something that will wait until the callback is resolved and then return the value from the async callback.
All Babel does with async/await is write regular promise.then() code for you. async/await are syntax conveniences. They don't really do anything that you can't write yourself using promises, .then(), .catch() and in some cases Promise.all(). So, yes, if you want to write async/await style code for node 4, then you can use Babel to transpile your code to something that will run on node 4. You can look at the transpiled Babel code when using async/await and you will just find regular promise.then() code.
There is no deasync solution that isn't a hack of the engine because the engine was not designed to do what deasync does.
Javascript in node.js was designed to run one Javascript event at a time and that code runs until it returns control back to the system where the system will then pull the next event from the event queue and run its callback. Your Javascript is single threaded with no pre-emptive interruptions by design.
Without some sort of hack of the JS engine, you can't suspend or sleep one piece of Javascript and then run other events. It simply wasn't designed to do that.
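If restructuring the caller is acceptable, the polling loop from the question can be written without native add-ons. This is a minimal sketch, assuming a Node version with native async/await (7.6 or later; for Node 4 it would have to be transpiled, as discussed above). The names `sleep` and `waitUntilDone` are illustrative. Unlike system-sleep, this does not block the calling code synchronously: everything that must run after the wait has to be placed after the `await`.

```javascript
// Promise-based sleep: pauses only the async function that awaits it.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

let done = false;
setTimeout(() => {
  done = true;
}, 300);

async function waitUntilDone() {
  let polls = 0;
  while (!done) {
    await sleep(100); // other callbacks, including the timer above, run meanwhile
    polls++;
  }
  return polls;
}

const waiting = waitUntilDone();
waiting.then((polls) => console.log('done after', polls, 'polls'));
```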
var one = 0;
function delay(){
    return new Promise((resolve, reject) => {
        setTimeout(function(){
            resolve('resolved')
        }, 2000);
    })
}
while (one == 0) {
    one = 1;
    async function f1(){
        var x = await delay();
        if(x == 'resolved'){
            x = '';
            one = 0;
            console.log('resolved');
            //all other handlers go here...
            //all of the program that you want to be affected by sleep()
            f1();
        }
    }
    f1();
}

Node.js Control Flow and Callbacks

I've been confused about this for a month and have searched everywhere but could not find an answer.
I want to control what runs first in Node.js. I know that Node handles code in a non-blocking way. I have the following example:
setTimeout(function(){console.log("one second passed");}, 1000);
console.log("HelloWorld");
Here I want the first statement to run and output "one second passed" before the second one runs. How can I do that? I know setTimeout is one way to solve this problem, but that's not the answer I am looking for. I've tried using a callback, but it did not work. I am not sure I understand callbacks correctly. To me, callbacks are just function parameters, and I don't think that will help me solve this problem.
One possible way to solve this problem is to define a function that takes an "error first callback", like the following example:
function print_helloworld_atend(helloworld) {
    setTimeout(function(){console.log("one second passed");}, 1000);
    helloworld(); // currently runs immediately, before the timeout fires
}
print_helloworld_atend(function helloworld() {
    console.log("HelloWorld");
});
Can I define a function with a callback that knows when the previous tasks are done? In the above function, how do I make the callback helloworld run after the setTimeout expression?
If there is a structure that can solve my problem, that's my first choice. I am tired of using setTimeouts.
I would really appreciate it if anyone can help. Thanks for reading.
You should be using promises. Bluebird is a great promise library: it is faster than native promises and comes with great features. With promises you can chain functions together and know that one will not be called until the previous one resolves. There is no need to set timeouts or delays, although you can if you'd like. Below is an example of a delay: function b won't run until 6 seconds after a finishes. If you remove .delay(ms), b will run immediately after a finishes.
var Promise = require("bluebird");
console.time('tracked');
console.time('first');
function a() {
    console.log('hello');
    console.timeEnd('first');
    return Promise.resolve();
}
function b() {
    console.log('world');
    console.timeEnd('tracked');
}
a().delay(6000)
    .then(b)
    .catch(Promise.TimeoutError, function(e) {
        console.log('Something messed up yo', e);
    });
This outputs:
→ node test.js
hello
first: 1.278ms
world
tracked: 6009.422ms
Edit: Promises are, in my opinion, the most fun way to handle control flow in Node/JavaScript. To my knowledge, native JavaScript promises have no .delay() or .timeout(). Native promises do exist, however, and you can find their documentation on Mozilla's site. I would still recommend that you use Bluebird instead.
Use Bluebird instead of native promises because:
It's faster. Petka Antonov, the creator of Bluebird, has a great understanding of the V8 engine's two compile steps and has optimized the library around its many quirks. Native has little to no optimization, and it shows when you compare their performance. More information here and here.
It has more features: nice things like .reflect(), .spread(), .delay(), .timeout(); the list goes on.
You lose nothing by switching: all features in Bluebird that also exist in native promises work in exactly the same way. If you find yourself in a situation where only native promises are available, you won't have to relearn what you already know.
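Where only native promises are available, a delay step can be approximated with a small helper. This is a sketch under that assumption: `delay` is a hand-written function, not a built-in or a Bluebird API, and the 100 ms value is arbitrary.

```javascript
// Hand-written helper: a promise that resolves with `value` after `ms` milliseconds.
function delay(ms, value) {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms));
}

const started = Date.now();

function a() {
  console.log('hello');
  return Promise.resolve();
}

function b() {
  console.log('world after', Date.now() - started, 'ms');
}

// b runs only after a has resolved and the delay has elapsed.
const finished = a().then(() => delay(100)).then(b);
```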
Just execute everything that you want to execute after you log "one second passed", after you log "one second passed":
setTimeout(function(){
    console.log("one second passed");
    console.log("HelloWorld");
}, 1000);
You can use the async module to handle the callbacks.
To understand callbacks, here is a high-level walkthrough:
function: I want to do some I/O work.
nodejs: OK, but you shouldn't block my process, as I'm single-threaded.
nodejs: Pass a callback function, and I will use it to let you know when the I/O work is done.
function: passes the callback function.
nodejs: the I/O work is done; calls the callback function.
function: thanks for the notification; continues processing other work.
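The exchange above maps onto code roughly as follows. A minimal sketch using `fs.readFile`; the function name `countBytes`, the temporary file, and the `events` array are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// "function: I want to do some I/O work" and passes a callback.
function countBytes(file, callback) {
  fs.readFile(file, (err, data) => {
    // "nodejs: the I/O work is done; calls the callback function."
    if (err) return callback(err); // error-first convention
    callback(null, data.length);
  });
}

// Demo file so the example is self-contained.
const file = path.join(os.tmpdir(), 'callback-demo-' + process.pid + '.txt');
fs.writeFileSync(file, 'HelloWorld');

const events = [];
countBytes(file, (err, size) => {
  if (err) throw err;
  events.push('callback: ' + size + ' bytes');
  console.log(events);
});
// This line runs before the callback, because readFile does not block.
events.push('continuing with other work');
```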

Why does Node.js have both async & sync version of fs methods?

In Node.js, I can do almost any async operation one of two ways:
var file = fs.readFileSync('file.html')
or...
var file
fs.readFile('file.html', function (err, data) {
    if (err) throw err
    console.log(data)
})
Is the only benefit of the async one custom error handling? Or is there really a reason to have the file read operation non-blocking?
These exist mostly because node itself needs them to load your program's modules from disk when your program starts. More broadly, it is typical to do a bunch of synchronous setup IO when a service is initially started but prior to accepting network connections. Once the program is ready to go (has its TLS cert loaded, config file has been read, etc), then a network socket is bound and at that point everything is async from then on.
Asynchronous calls allow for the branching of execution chains and the passing of results through that execution chain. This has many advantages.
For one, the program can execute two or more calls at the same time, and do work on the results as they complete, not necessarily in the order they were first called.
For example if you have a program waiting on two events:
var file1;
var file2;
//Let's say this takes 2 seconds
fs.readFile('bigfile1.jpg', function (err, data) {
    if (err) throw err;
    file1 = data;
    console.log("FILE1 Done");
});
//let's say this takes 1 second.
fs.readFile('bigfile2.jpg', function (err, data) {
    if (err) throw err;
    file2 = data;
    console.log("FILE2 Done");
});
console.log("DO SOMETHING ELSE");
In the case above, bigfile2.jpg will return first and something will be logged after only 1 second. So your output timeline might be something like:
#0:00: DO SOMETHING ELSE
#1:00: FILE2 Done
#2:00: FILE1 Done
Notice above that the log to "DO SOMETHING ELSE" was logged right away. And File2 executed first after only 1 second... and at 2 seconds File1 is done. Everything was done within a total of 2 seconds though the callBack order was unpredictable.
Whereas doing it synchronously it would look like:
file1 = fs.readFileSync('bigfile1.jpg');
console.log("FILE1 Done");
file2 = fs.readFileSync('bigfile2.jpg');
console.log("FILE2 Done");
console.log("DO SOMETHING ELSE");
And the output might look like:
#2:00: FILE1 Done
#3:00: FILE2 Done
#3:00: DO SOMETHING ELSE
Notice it takes a total of 3 seconds to execute, but the order is how you called it.
Doing it synchronously typically takes longer for everything to finish (especially for external processes like filesystem reads, writes or database requests) because you are waiting for one thing to complete before moving onto the next. Sometimes you want this, but usually you don't. It can be easier to program synchronously sometimes though, since you can do things reliably in a particular order (usually).
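Overlapping reads and a predictable result order are not mutually exclusive. A sketch using `fs.promises` and `Promise.all`, assuming Node 10 or later; `readBoth` and the temporary demo files are hypothetical names for illustration. Both reads are in flight at once, yet the results come back in the order the reads were listed, regardless of which finishes first.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

async function readBoth(pathA, pathB) {
  // Both reads are started before either is awaited, so they overlap.
  const [a, b] = await Promise.all([
    fs.promises.readFile(pathA),
    fs.promises.readFile(pathB),
  ]);
  return { a, b };
}

// Demo files so the example is self-contained.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'readboth-'));
const file1 = path.join(dir, 'file1.bin');
const file2 = path.join(dir, 'file2.bin');
fs.writeFileSync(file1, Buffer.alloc(2048));
fs.writeFileSync(file2, Buffer.alloc(1024));

const both = readBoth(file1, file2);
both.then(({ a, b }) => console.log('sizes:', a.length, b.length));
console.log('DO SOMETHING ELSE'); // logged before either read completes
```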
When filesystem methods execute asynchronously, however, your application can continue executing other non-filesystem tasks without waiting for the filesystem operations to complete. So in general you can continue doing other work while the system waits for asynchronous operations to complete. This is why database queries, filesystem access and communication requests are usually handled with asynchronous methods: they allow other work to be done while waiting for I/O and off-system operations to complete.
When you get into more advanced asynchronous method chaining you can do some really powerful things like creating scopes (using closures and the like) with a little amount of code and also create responders to certain event loops.
Sorry for the long answer. There are many reasons why you have the option to do things synchronously or not, but hopefully this will help you decide whether either method is best for you.
The benefit of the asynchronous version is that you can do other stuff while you wait for the IO to complete.
fs.readFile('file.html', function (err, data) {
    if (err) throw err;
    console.log(data);
});
// Do a bunch more stuff here.
// All this code will execute before the callback to readFile,
// but the IO will be happening concurrently. :)
You want to use the async version when you are writing event-driven code where responding to requests quickly is paramount. The canonical example for Node is writing a web server. Let's say you have a user making a request which is such that the server has to perform a bunch of IO. If this IO is performed synchronously, the server will block. It will not answer any other requests until it has finished serving this request. From the perspective of the users, performance will seem terrible. So in a case like this, you want to use the asynchronous versions of the calls so that Node can continue processing requests.
The sync version of the IO calls is there because Node is not used only for writing event-driven code. For instance, if you are writing a utility which reads a file, modifies it, and writes it back to disk as part of a batch operation, like a command line tool, using the synchronous version of the IO operations can make your code easier to follow.
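A batch-style utility of the kind described might look like the following. This is a sketch only: the function name `trimTrailingWhitespaceSync` and the demo file are invented for illustration. Each step finishes before the next begins, so the code reads top to bottom.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Read a file, modify it, and write it back, one blocking step at a time.
function trimTrailingWhitespaceSync(file) {
  const text = fs.readFileSync(file, 'utf8');
  const cleaned = text
    .split('\n')
    .map((line) => line.replace(/\s+$/, ''))
    .join('\n');
  fs.writeFileSync(file, cleaned);
  return cleaned;
}

// Demo file so the example is self-contained.
const demoFile = path.join(os.tmpdir(), 'sync-demo-' + process.pid + '.txt');
fs.writeFileSync(demoFile, 'first line   \nsecond line\t\n');
console.log(JSON.stringify(trimTrailingWhitespaceSync(demoFile)));
```

In a command-line tool nothing else is waiting on the event loop, so blocking costs nothing here.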

Nodejs convert expensive synchronous tasks to async series

In nodejs I have an expensive function such as:
function expensiveCode(){
    a.doExensiveOperation(1);
    b.doAnotherExensiveOperation(2);
    c.doADiffererentExensiveOperation(3);
    d.doExensiveOperation(4);
}
Such that each sub-function call has different parameters and therefore can't be done in a loop. I'd like to throttle this expensive function call so that each sub-call is done on nextTick such as:
function expensiveCode(){
    process.nextTick(function(){
        a.doExensiveOperation(1);
        process.nextTick(function(){
            b.doAnotherExensiveOperation(2);
            process.nextTick(function(){
                c.doADiffererentExensiveOperation(3);
                process.nextTick(function(){
                    d.doExensiveOperation(4);
                });
            });
        });
    });
}
That's obviously ugly, and if there are 20 lines of different operations it will be too hideous to even consider.
I've reviewed a number of libraries like "async.js" but they all appear to be expecting the called functions to be async - to have a callback function on completion. I need a simple way to do it without converting all my code to the 'callback when done' method which seems like overkill.
Sorry to burst your bubble, but async.waterfall or perhaps async.series combined with async.apply is what you want, and yes you'll need to make those operations async. I find it hard to believe you have found 20 different computationally-intense operations, none of which do any IO whatsoever. Check out the bcrypt library for an example of how to offer both synchronous and asynchronous versions of a CPU-intensive call. Converting your code to callback on completion isn't overkill, it's node. That's the rule in node. Either your function does no IO and completes quickly or you make it async with a callback. End of options.
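For illustration only, and as a different technique from the async.series approach recommended above: the nesting in the question can be flattened with a small helper that takes an array of synchronous steps. This sketch uses `setImmediate` rather than `process.nextTick`, because `nextTick` callbacks run before the event loop proceeds to pending I/O, while `setImmediate` lets I/O callbacks run between steps. `runSeries` and the logging steps are hypothetical names. Each individual step still blocks while it runs.

```javascript
// Run synchronous steps one per event-loop turn, then report completion.
function runSeries(steps, done) {
  let index = 0;
  function next() {
    if (index === steps.length) return done(null);
    try {
      steps[index++]();
    } catch (err) {
      return done(err);
    }
    setImmediate(next); // yield to the event loop before the next step
  }
  setImmediate(next);
}

// Stand-ins for the expensive operations from the question.
const log = [];
runSeries([
  () => log.push('a(1)'),
  () => log.push('b(2)'),
  () => log.push('c(3)'),
  () => log.push('d(4)'),
], (err) => {
  if (err) throw err;
  console.log(log.join(', '));
});
log.push('caller continues'); // runs before any step
```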
