How to wait for a time interval? - node.js

I am busy with a node.js project communicating with an API, which involves heavy use of a node library specific to that API. I have read (I think) all the existing questions about the kinds of issues involved with pausing and their various solutions, but I am still not sure how to apply a correct solution to my problem.
Simply put, I have a function I call multiple times from the API library and need to ensure they have all completed before continuing. Up to now I have managed to use the excellent caolan/async library to handle my sync/async needs but hit a block with this specific function from the API library.
The function is hideously complicated as it involves https and SOAP calling/parsing so I am trying to avoid re-writing it to behave with caolan/async, in fact I am not even sure at this stage why it is not well behaved.
It is an async function that I need to call multiple times and then wait until all the calls have completed. I have tried numerous ways of using callbacks and even promises (q library) but just cannot get it to work as expected, as I have successfully done with the other async API functions.
Out of desperation I am hoping for a kludgy solution where I can just wait for say 5 seconds at a point in my program while all existing async functions complete but no further progress is made until 5 seconds have passed. So I want a non-blocking pause of 5 seconds if that is even possible.
I could probably do this using fibres but really hoping for another solution before I go down that route.

One simple solution to your problem would be to increment a counter every time you call your function. Then, at the end of the callback, have it emit an event. Listen to that event and, each time it's triggered, increment a separate counter. When the two counters are equal, you can move on.
This would look something like this:
var function_call_counter = 0;
var function_complete_counter = 0;
var self = this;

for (var i = 0; i < time_to_call; i++) {
  function_call_counter++;
  api_call(function() {
    self.emit('api_called');
  });
}

this.on('api_called', function() {
  function_complete_counter++;
});

var id = setInterval(function() {
  if (function_call_counter == function_complete_counter) {
    move_on();
    clearInterval(id); // This stops the checking
  }
}, 5000); // every 5 sec, check to see if you can move on
Promises should also work; they just might take some finessing. You mentioned q, but you may want to check out Promises/A+.
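To sketch the promise approach: if the library function follows the standard Node callback convention, each call can be wrapped in a Promise and Promise.all can wait for all of them. Here `api_call` is a stand-in for the real API function (assumed to take an `(err, result)` callback), since the actual library isn't shown in the question:

```javascript
// Sketch: wrap a callback-style function in a Promise, then wait on all calls.
// `api_call` is a placeholder for the real API function.
function api_call(cb) {
  setTimeout(() => cb(null, 'done'), 50); // stand-in for the real async work
}

function api_call_promise() {
  return new Promise((resolve, reject) => {
    api_call((err, result) => err ? reject(err) : resolve(result));
  });
}

const calls = [];
for (let i = 0; i < 5; i++) {
  calls.push(api_call_promise());
}

// Resolves only after every wrapped call has completed.
Promise.all(calls).then(results => {
  console.log('all ' + results.length + ' calls finished'); // move_on() goes here
});
```

Unlike the counter-and-interval approach, this moves on as soon as the last call finishes rather than polling every 5 seconds.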


Sleep main thread but do not block callbacks

This code works because system-sleep blocks execution of the main thread but does not block callbacks. However, I am concerned that system-sleep is not 100% portable because it relies on the deasync npm module which relies on C++.
Are there any alternatives to system-sleep?
var sleep = require('system-sleep')

var done = false
setTimeout(function() {
  done = true
}, 1000)

while (!done) {
  sleep(100) // without this line the while loop causes problems because it is a spin wait
  console.log('sleeping')
}

console.log('If this is displayed then it works!')
PS Ideally, I want a solution that works on Node 4+ but anything is better than nothing.
PPS I know that sleeping is not best practice but I don't care. I'm tired of arguments against sleeping.
Collecting my comments into an answer per your request:
Well, deasync (which sleep() depends on) uses quite a hack. It is a native-code node.js add-on that manually runs the event loop from C++ in order to do what it does. Only someone who really knows the internals of node.js (now and in the future) could imagine what the issues are with doing that. What you are asking for is not possible in regular Javascript code without hacking the node.js native code, because it's simply counter to the way Javascript was designed to run in node.js.
Understood, and thanks. I am trying to write a more reliable deasync (which fails on some platforms) module that doesn't use a hack. Obviously this approach I've given is not the answer. I want it to support Node 4. I'm thinking of using yield / async combined with babel now, but I'm not sure that's what I'm after either. I need something that will wait until the callback is resolved and then return the value from the async callback.
All Babel does with async/await is write regular promise.then() code for you. async/await are syntax conveniences. They don't really do anything that you can't write yourself using promises, .then(), .catch() and in some cases Promise.all(). So, yes, if you want to write async/await style code for node 4, then you can use Babel to transpile your code to something that will run on node 4. You can look at the transpiled Babel code when using async/await and you will just find regular promise.then() code.
There is no deasync solution that isn't a hack of the engine because the engine was not designed to do what deasync does.
Javascript in node.js was designed to run one Javascript event at a time and that code runs until it returns control back to the system where the system will then pull the next event from the event queue and run its callback. Your Javascript is single threaded with no pre-emptive interruptions by design.
Without some sort of hack of the JS engine, you can't suspend or sleep one piece of Javascript and then run other events. It simply wasn't designed to do that.
var one = 0;

function delay() {
  return new Promise((resolve, reject) => {
    setTimeout(function() {
      resolve('resolved')
    }, 2000);
  })
}

while (one == 0) {
  one = 1;
  async function f1() {
    var x = await delay();
    if (x == 'resolved') {
      x = '';
      one = 0;
      console.log('resolved');
      // all other handlers go here...
      // all of the program that you want to be affected by sleep()
      f1();
    }
  }
  f1();
}
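For comparison, the same non-blocking pause idea can be expressed more directly as a promise-returning `sleep` that is awaited inside an async function (a minimal sketch, assuming a Node version with async/await support; on Node 4 this would need transpiling, as discussed above):

```javascript
// Sketch: a non-blocking "sleep" built on setTimeout and a Promise.
// Only the awaiting function is suspended; timers and I/O keep running.
function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

async function main() {
  console.log('before');
  await sleep(1000); // suspends main() for ~1s without blocking the event loop
  console.log('after 1 second');
}

main();
```

This does not make the call site synchronous (the rest of the program continues while `main` waits), but it keeps the "pause here" reading inside the async function.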

Rendering page asynchronously issue

I'm trying to render a page via something similar to this:
var content = '';

db.query(imageQuery, function(images) {
  content += images;
});

db.query(userQuery, function(users) {
  content += users;
});

response.end('<div id="page">'+content+'</div>');
Unfortunately, content is empty. I already know that these asynchronous queries cause the problem, but I can't find a way to fix it.
Can somebody please help me out of this?
The problem with your code is that you're saying "go do these two things for a while and then send my response." In other words, you've told node to go into the other room to get the next pages of a book, and told it what to do once it was done, but then, while it was out of the room, you just continued trying to read the book without the new pages.
What you need to do is instead send your response only when the two database queries are done.
There are several ways you can do that, how you do it is up to you.
You can chain the queries. This is inefficient, since you're doing one query, waiting for it to return, doing the second, waiting for it to return, and only then sending your response, but it's the most basic way to do it.
var content = '';
db.query(imageQuery, function(images) {
  content += images;
  db.query(userQuery, function(users) {
    content += users;
    response.end('<div id="page">'+content+'</div>');
  });
});
See how the response.end is now inside the second db.query's callback, which is inside the first db.query's callback? This does guarantee order of operations, however: your first query will ALWAYS complete first.
You could also write some sort of primitive latching system to run the queries in parallel. This is a little more efficient (the queries don't necessarily happen simultaneously, but it'll be faster than chaining them). However, with this method you can't guarantee order of operations.
var _latch = 0;
var resp = '';

var complete = function(content) {
  resp += content;
  ++_latch;
  if (_latch === 2) {
    response.end('<div id="page">'+resp+'</div>');
  }
};

db.query(imageQuery, complete);
db.query(userQuery, complete);
So what you're doing there is saying: run these queries, then call the same function. That function aggregates the responses and counts the number of times it's been called. When it has been called as many times as there are queries, it returns the results to the user.
These are the two basic ways of handling multiple asynchronous methods. However, there are a lot of utilities to help you do this so you don't have to handle it manually.
async is a great library that will help you run async functions in series, parallel, waterfall, etc. Takes a TON of pain out of async management.
runnel is a similar library, but with a much smaller focus than async
q and bluebird are promise libraries implementing Promises/A+. These provide a different concept of flow control (if you're familiar with jQuery's deferred object, this is the idea it was trying to implement).
You can read more about promises here, but a quick google will also help explain the concept.
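With promises, the manual latch above collapses into a single Promise.all. This is a sketch under the assumption that db.query follows the success-callback signature shown in the question; `fakeQuery` is a made-up stand-in so the example is self-contained:

```javascript
// Sketch: Promise.all replaces the manual latch.
// `fakeQuery` mimics db.query(query, successCallback) from the question.
function fakeQuery(result, cb) {
  setTimeout(() => cb(result), 10); // stand-in for the real async query
}

function queryAsPromise(query) {
  return new Promise(resolve => fakeQuery(query, resolve));
}

Promise.all([queryAsPromise('images'), queryAsPromise('users')])
  .then(function(results) {
    // Unlike the latch, Promise.all preserves order: results[0] is always
    // the first query's result, regardless of which finished first.
    const content = results[0] + results[1];
    console.log('<div id="page">' + content + '</div>');
  });
```

Note that this also fixes the ordering caveat of the latch approach for free.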

How to have heavy processing operations done in node.js

I have a heavy data-processing operation that I need to perform for 10-12 simultaneous requests. I have read that Node.js is a good platform for a high level of concurrency, and that it achieves this with a non-blocking event loop.
What I know is that for having things like querying a database, I can spawn off an event to a separate process (like mongod, mysqld) and then have a callback which will handle the result from that process. Fair enough.
But what if I want a heavy piece of computation to be done within a callback? Won't it block other requests until the code in that callback has executed completely? For example, I want to process a high-resolution image, and the code I have is in Javascript itself (no separate process to do the image processing).
The way I think of implementing it is like:
get_image_from_db(image_id, callback(imageBitMap) {
  heavy_operation(imageBitMap); // Can take 5 seconds.
});
Will that heavy_operation stop node from taking in any requests for those 5 seconds? Or am I thinking about such a task the wrong way? Please guide me, I am a JS newbie.
UPDATE
Or could I process a partial image, let the event loop go back to take in other callbacks, and then return to processing that partial image (something like prioritising events)?
Yes, it will block it, as callback functions are executed in the main loop. Only the asynchronously called functions do not block the loop. My understanding is that if you want the image processing to execute asynchronously, you will have to use a separate process to do it.
Note that you can write your own asynchronous process to handle it. To start you could read the answers to How to write asynchronous functions for Node.js.
UPDATE
how do i create a non-blocking asynchronous function in node.js? may also be worth reading. This question is actually referenced in the one I linked, but I thought I'd include it here for simplicity.
Unfortunately, I don't yet have enough reputation points to comment on Nick's answer, but have you looked into Node's cluster API? It's currently still experimental, but it would allow you to spawn multiple worker processes.
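The question's update about processing a partial image and then letting the event loop take other callbacks can also be sketched without extra processes, by splitting the work into chunks and scheduling the next chunk with setImmediate. The item list, chunk size, and per-item "work" here are made up for illustration:

```javascript
// Sketch: split a long computation into chunks, yielding to the event loop
// between chunks via setImmediate so other callbacks can run in between.
// `items` and the per-item work are placeholders for real image processing.
function processInChunks(items, chunkSize, done) {
  let index = 0;
  let total = 0;

  function nextChunk() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      total += items[index]; // stand-in for the real per-item work
    }
    if (index < items.length) {
      setImmediate(nextChunk); // let pending callbacks run, then continue
    } else {
      done(total);
    }
  }

  nextChunk();
}

processInChunks([1, 2, 3, 4, 5], 2, sum => console.log('sum:', sum)); // prints "sum: 15"
```

Each chunk still blocks while it runs, so this trades total throughput for responsiveness; for truly heavy work, a separate process is still the safer option.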
When a heavy piece of computation is done in the callback, the event loop would be blocked until the computation is done. That means the callback will block the event loop for the 5 seconds.
My solution
It's possible to use a generator function to yield control back to the event loop. I will use a while loop that runs for 3 seconds to act as a long-running callback.
Without a Generator function
let start = Date.now();
setInterval(() => console.log('resumed'), 500);

function loop() {
  while ((Date.now() - start) < 3000) { // while the difference between Date.now() and start is less than 3 seconds
    console.log('blocked')
  }
}

loop();
The output would be:
// blocked
// blocked
//
// ... would not return to the event loop while the loop is running
//
// blocked
//...when the loop is over then the setInterval kicks in
// resumed
// resumed
With a Generator function
let gen;
let start = Date.now();
setInterval(() => console.log('resumed'), 500);

function *loop() {
  while ((Date.now() - start) < 3000) { // while the difference between Date.now() and start is less than 3 seconds
    console.log(yield output())
  }
}

function output() {
  setTimeout(() => gen.next('blocked'), 500)
}

gen = loop();
gen.next();
The output is:
// resumed
// blocked
//...returns control back to the event loop even though the loop is still running
// resumed
// blocked
//...end of the loop
// resumed
// resumed
// resumed
Using javascript generators can help run heavy computational functions that yield control back to the event loop while they are still computing.
To know more about the event loop visit
https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Statements/function*
https://davidwalsh.name/es6-generators

How is setTimeout implemented in node.js

I was wondering if anybody knows how setTimeout is implemented in node.js. I believe I have read somewhere that this is not part of V8. I quickly tried to find the implementation, but could not locate it in the (big) source. I did, for example, find this timers.js file, which in turn links to timer_wrap.cc, but these files do not completely answer all of my questions.
Does V8 have a setTimeout implementation? From the source, I guess the answer is no.
How is setTimeout implemented: javascript, native, or a combination of both? From timers.js I assume something along the lines of both:
var Timer = process.binding('timer_wrap').Timer;
When adding multiple timers (setTimeout), how does node.js know which to execute first? Does it add all the timers to a sorted collection? If it is sorted, then finding the timeout which needs to be executed next is O(1) and insertion is O(log n)? But then again, in timers.js I see them use a linked list?
And yet adding a lot of timers does not seem to be a problem at all.
When executing this script:
var x = new Array(1000),
    len = x.length;

/**
 * Returns a random integer between min and max
 * Using Math.round() will give you a non-uniform distribution!
 */
function getRandomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

var y = 0;
for (var i = 0; i < len; i++) {
  var randomTimeout = getRandomInt(1000, 10000);
  console.log(i + ', ' + randomTimeout + ', ' + ++y);
  setTimeout(function () {
    console.log(arguments);
  }, randomTimeout, randomTimeout, y);
}
you get a little bit of CPU usage, but not that much.
I am wondering whether I would get better performance if I implemented all these callbacks one by one in a sorted list?
You've done most of the work already. V8 doesn't provide an implementation for setTimeout because it's not part of ECMAScript. The function you use is implemented in timers.js, which creates an instance of a Timeout object that wraps a C++ class.
There is a comment in the source describing how they are managing the timers.
// Because often many sockets will have the same idle timeout we will not
// use one timeout watcher per item. It is too much overhead. Instead
// we'll use a single watcher for all sockets with the same timeout value
// and a linked list. This technique is described in the libev manual:
// http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#Be_smart_about_timeouts
This indicates it's using a doubly linked list, which is #4 in the linked article.
If there is not one request, but many thousands (millions...), all
employing some kind of timeout with the same timeout value, then one
can do even better:
When starting the timeout, calculate the timeout value and put the
timeout at the end of the list.
Then use an ev_timer to fire when the timeout at the beginning of the
list is expected to fire (for example, using the technique #3).
When there is some activity, remove the timer from the list,
recalculate the timeout, append it to the end of the list again, and
make sure to update the ev_timer if it was taken from the beginning of
the list.
This way, one can manage an unlimited number of timeouts in O(1) time
for starting, stopping and updating the timers, at the expense of a
major complication, and having to use a constant timeout. The constant
timeout ensures that the list stays sorted.
Node.js is designed around async operations and setTimeout is an important part of that. I wouldn't try to get tricky, just use what they provide. Trust that it's fast enough until you've proven that in your specific case it's a bottleneck. Don't get stuck on premature optimization.
UPDATE
What happens is you've got essentially a dictionary of timeouts at the top level, so all 100ms timeouts are grouped together. Whenever a new timeout is added, or the oldest timeout triggers, it is appended to the list. This means that the oldest timeout, the one which will trigger the soonest, is at the beginning of the list. There is a single timer for this list, and it's set based on the time until the first item in the list is set to expire.
If you call setTimeout 1000 times each with the same timeout value, they will be appended to the list in the order you called setTimeout and no sorting is necessary. It's a very efficient setup.
No problem with many timers!
When the uv loop calls poll, it passes a timeout argument computed from the closest timer among all timers.
[closest timer of all timers]
https://github.com/joyent/node/blob/master/deps/uv/src/unix/timer.c #120
RB_MIN(uv__timers, &loop->timer_handles)
[pass timeout argument to poll api]
https://github.com/joyent/node/blob/master/deps/uv/src/unix/core.c #276
timeout = 0;
if ((mode & UV_RUN_NOWAIT) == 0)
  timeout = uv_backend_timeout(loop);

uv__io_poll(loop, timeout);
Note: on Windows the logic is almost the same.

How can I handle a callback synchronously in Node.js?

I'm using Redis to generate IDs for my in-memory stored models. The Redis client requires a callback for the INCR command, which means the code looks like:
client.incr('foo', function(err, id) {
  // ... continue on here
});
The problem is that I have already written the other part of the app, which expects the incr call to be synchronous and just return the ID, so that I can use it like:
var id = client.incr('foo');
The reason why I got to this problem is that up until now, I was generating the IDs just in memory with a simple closure counter function, like
var counter = (function() {
  var count = 0;
  return function() {
    return ++count;
  }
})();
to simplify the testing and just general setup.
Does this mean that my app is flawed by design and I need to rewrite it to expect callback on generating IDs? Or is there any simple way to just synchronize the call?
Node.js is in essence an async I/O library (with plugins). So, by definition, there's no synchronous I/O there, and you should rewrite your app.
It is a bit of a pain, but what you have to do is wrap the logic that you had after the counter was generated into a function, and call that from the Redis callback. If you had something like this:
var id = get_synchronous_id();
processIdSomehow(id);
you'll need to do something like this.
var runIdLogic = function(id) {
  processIdSomehow(id);
}

client.incr('foo', function(err, id) {
  runIdLogic(id);
});
You'll need the appropriate error checking, but something like that should work for you.
There are a couple of sequential programming layers for Node (such as TameJS) that might help with what you want, but those generally involve recompilation or similar tricks; you'll have to decide how comfortable you are with that if you want to use them.
@Sergio said this briefly in his answer, but I wanted to write a slightly more expanded answer. node.js has an asynchronous design. It runs in a single thread, which means that in order to remain fast and handle many concurrent operations, all blocking calls must take a callback for their return value so they can run asynchronously.
That does not mean that synchronous calls are not possible. They are, and that's a concern for how much you trust 3rd-party plugins. If someone writes a call in their plugin that blocks, you are at the mercy of that call; it might even be something internal that isn't exposed in their API. Thus, it can block your entire app. Consider what might happen if Redis took a significant amount of time to return, and then multiply that by the number of clients that could potentially be accessing that same routine. The entire logic would be serialized, and they would all wait.
In answer to your last question, you should not work towards accommodating a blocking approach. It may seem like a simple solution now, but it's counter to the benefits of node.js in the first place. If you are only comfortable in a synchronous design workflow, you may want to consider another framework that is designed that way (with threads). If you want to stick with node.js, rewrite your existing logic to conform to a callback style. From the code examples I have seen, it tends to look like a nested set of functions, as callback uses callback, etc., until it can return from that stack.
The application state in node.js is normally passed around as an object. What I would do is closer to:
var state = {}

client.incr('foo', function(err, id) {
  state.id = id;
  doSomethingWithId(state.id);
});

function doSomethingWithId(id) {
  // reuse state if necessary
}
It's just a different way of doing things.
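Another way to keep the call sites compact without blocking is to wrap the callback API in a Promise. This is a sketch, not the Redis client's own API: `fakeIncr` is a made-up stand-in that mimics client.incr's `(err, id)` callback signature so the example is self-contained:

```javascript
// Sketch: wrap a callback-style incr in a Promise.
// `fakeIncr` stands in for client.incr('foo', cb) from the question.
let count = 0;
function fakeIncr(key, cb) {
  setTimeout(() => cb(null, ++count), 10); // mimics the async Redis round trip
}

function incrAsync(key) {
  return new Promise((resolve, reject) => {
    fakeIncr(key, (err, id) => err ? reject(err) : resolve(id));
  });
}

incrAsync('foo').then(id => {
  console.log('new id:', id); // continue with the id here
});
```

The call is still asynchronous, but the consuming code reads almost linearly, and on newer Node versions the same wrapper composes directly with async/await.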
