Why does Node seemingly wait for all Promises to resolve? - node.js

I'm new to JavaScript and NodeJS so forgive me if this question is rudimentary.
Let's say I just have a simple file hello.js and I run it with $ node hello.js and all this file contains is
setTimeout(() => {console.log('hello');}, 5000);
Why doesn't this program finish immediately? Why instead does it wait for the underlying Promise to resolve?
After all, isn't the Promise associated with setTimeout created and run asynchronously? So wouldn't the main 'thread' of execution "fall off" when it encounters no more code to run?

The Node event loop keeps running until all outstanding tasks are completed or cancelled.
setTimeout creates a pending event, so the loop will keep running until that executes.
Outstanding Promises, setInterval and other mechanisms can all prevent the event loop from halting.
It's worth noting that setTimeout does not use a promise at all. That's just a regular callback function. The setTimeout() API long predates Promises.

I think there's more to the story/explanation here so I'll add an answer that contains additional info.
Nodejs keeps a reference counter for all unfinished asynchronous operations and nodejs itself will not exit automatically until all the asynchronous operations are complete (until the reference count gets to zero). If you want nodejs to exit before that, you can call process.exit() whenever you want.
Since setTimeout() is an asynchronous operation, it contributes to the reference count of unfinished asynchronous operations and thus it keeps nodejs from automatically exiting until the timer fires.
Note that setTimeout() does not use a promise - it's just a plain callback. It is not promises that nodejs waits for. It is the underlying asynchronous operations that promises often are attached to that it actually waits for.
So, if you did just this:
const myPromise = new Promise((resolve, reject) => {
console.log("did nothing here");
});
Then, nodejs would not wait for that promise all by itself, even though that promise never resolves or rejects. This is because nodejs does not actually wait for promises. It waits for the underlying asynchronous operations that are usually behind promises. Since there is no such underlying asynchronous operation behind this promise, nodejs does not wait for it.
It's also worth mentioning that you can actually tell nodejs to NOT wait for some asynchronous operations by using the .unref() method. In the example you show, if you do this:
const timer = setTimeout(() => {console.log('hello');}, 5000);
timer.unref();
Then, nodejs will NOT wait for that timer before exiting and if this is your entire program, nodejs will actually exit immediately without waiting for the timer to fire.
As an example, I have a nodejs program that carries out some nightly maintenance using a recurring timer. I don't want that maintenance timer to keep the nodejs program running if the other things that it's doing are done. So, after I set the timer, I call the .unref() method on it.

Related

JS: Why Promise then() method executes synchronously?

I need to make part of my method's code asynchronous, so it will execute in a non-blocking manner. For this purpose I've tried to create a "dummy" Promise and put the specified code in then block. I have something like this:
public static foo(arg1, arg2) {
const prom = new Promise((resolve, reject) => {
if (notValid(arg1, arg2)) {
reject();
}
resolve(true);
});
prom.then(() => {
...my code using arg1 and arg2...
});
}
However, then block always executes synchronously and blocks whole app, even though each and every JS documentation tells that then always runs asynchronously. I've also tried to replace the Promise with this:
Promise.resolve().then(() => {
...my code using arg1 and arg2...
});
but got same result.
The only way I've managed to make then block work asynchronously is by using setTimeout:
const pro = new Promise(resolve => {
setTimeout(resolve, 1);
});
pro.then(() => {
...my code using arg1 and arg2...
})
What can be the reason behind then block working synchronously? I don't want to proceed with using setTimeout, because it is kind of a "dirty" solution, and I think that there should be a way to make then run asynchronously.
Application is NodeJS with Express, written using Typescript.
I need to make part of my method's code asynchronous, so it will execute in a non-blocking manner. For this purpose I've tried to create a "dummy" Promise and put the specified code in then block.
Promises don't really make things asynchronous in and of themselves. What they do is wrap around something that's already asynchronous, and give a convenient way to tell when that thing is done. If you wrap a promise around something synchronous, your code is mostly still synchronous, just with a few details about when the .then callback executes.
even though each and every JS documentation tells that then always runs asynchronously.
By that, they mean that it waits for the current call stack to finish, and then runs the code in the .then callback. The technical term for what it's doing is a "microtask". This delay is done so that the order of operations of your code is the same whether the promise is already in a resolved state, or if some time needs to pass before it resolves.
But if your promise is already resolved (eg, because it's wrapped around synchronous code), the only thing your .then callback will be waiting for is the currently executing call stack. Once the current call stack synchronously finishes, your microtask runs synchronously to completion. The event loop will not be able to progress until you're done.
The only way I've managed to make then block work asynchronously is by using setTimeout
setTimeout will make things asynchronous*, yes. The code will be delayed until the timer goes off. If you want to wrap this in a promise you can, but the promise is not the part that makes it asynchronous, the setTimeout is.
I don't want to proceed with using setTimeout, because it is kind of a "dirty" solution
Ok, but it's the right tool for the job.
* when i say setTimeout makes it asynchronous, i just mean it delays the execution. This is good enough for many cases, but when your code eventually executes, it will tie up the thread until it's done. So if it takes a really long time you may need to write the code so it just does a portion of the work, and then sets another timeout to resume later

Does using promises in node.js make node synchronous?

How does promises still retain async while giving sync effect?
I.e Promises can make sequential code execution, doesn’t that remove async?
Then, whats the point of node.js?
No, promises do not make Node.js code run synchronous - promises allow you to write flows that appear synchronous.
When you do this:
console.log(1);
setTimeout(() => console.log(2));
console.log(3);
Node.js runs the code sequantially, first it logs 1, then it registers a function to happen after a timeout, then it logs 3. After a big when the timer fires - it executes its function and logs 2. No threads are involved.
With promises and async await, when you await you're explicitly telling Node.js "pause executing this function and continue when the promise I'm waiting for resolved". pause executing isn't the same as a thread blocking - no context switch happens at an operating system level - it just keeps track of the function and when the promise resolves it calls it.
Node is run with "our code - platform code" cycles, your code runs (until it ends, for example by hitting an await), then the platform code runs (for example, checking timers and I/O) and then that code can call your code (for example your suspended async function) again and run it.

Why node js doesn't exit after promise call

In the following code:
function sleep(ms) {
return new Promise((resolve) => {
setTimeout(resolve, ms);
});
}
sleep(10000);
Why NodeJS doesn't exit immediately? What causes the Promise to be waited for?
I thought I needed await sleep(10000) - but this gives an error.
Nodejs waits for timers that are still active before exiting. You can tell it to NOT wait for a timer by calling .unref() on the timer object itself.
So, it's not actually the promise it is waiting for, but rather the timer it is waiting for.
Internally, node.js keeps a reference count of the number of open timers that have not been .unref() and will not exit until that count (among other things) gets to zero.
Here's a couple excerpts from the node.js doc for timers:
Class: Timeout
By default, when a timer is scheduled using either setTimeout() or setInterval(), the Node.js event loop will continue running as long as the timer is active. Each of the Timeout objects returned by these functions export both timeout.ref() and timeout.unref() functions that can be used to control this default behavior.
timeout.unref()
When called, the active Timeout object will not require the Node.js event loop to remain active. If there is no other activity keeping the event loop running, the process may exit before the Timeout object's callback is invoked. Calling timeout.unref() multiple times will have no effect.
Take a look at the unref() function for timers in node - https://nodejs.org/api/timers.html#timers_timeout_unref
When called, the active Timeout object will not require the Node.js event loop to remain active. If there is no other activity keeping the event loop running, the process may exit before the Timeout object's callback is invoked.
You can create a timeout and call the unref() function on it - this will prevent node from staying alive if the only thing it is waiting for is the timeout.
function sleep(ms) {
return new Promise((resolve) => {
setTimeout(resolve, ms).unref();
});
}
As a side note, the same unref function can also be used for setTimeout calls.
As correctly noted by jfriend00, it is not the promise that is keeping node alive, it is the timeout.

Single thread synchronous and asynchronous confusion

Assume makeBurger() will take 10 seconds
In synchronous program,
function serveBurger() {
makeBurger();
makeBurger();
console.log("READY") // Assume takes 5 seconds to log.
}
This will take a total of 25 seconds to execute.
So for NodeJs lets say we make an async version of makeBurgerAsync() which also takes 10 seconds.
function serveBurger() {
makeBurgerAsync(function(count) {
});
makeBurgerAsync(function(count) {
});
console.log("READY") // Assume takes 5 seconds to log.
}
Since it is a single thread. I have troubling imagine what is really going on behind the scene.
So for sure when the function run, both async functions will enter event loops and console.log("READY") will get executed straight away.
But while console.log("READY") is executing, no work is really done for both async function right? Since single thread is hogging console.log for 5 seconds.
After console.log is done. CPU will have time to switch between both async so that it can run a bit of each function each time.
So according to this, the function doesn't necessarily result in faster execution, async is probably slower due to switching between event loop? I imagine that, at the end of the day, everything will be spread on a single thread which will be the same thing as synchronous version?
I am probably missing some very big concept so please let me know. Thanks.
EDIT
It makes sense if the asynchronous operations are like query DB etc. Basically nodejs will just say "Hey DB handle this for me while I'll do something else". However, the case I am not understanding is the self-defined callback function within nodejs itself.
EDIT2
function makeBurger() {
var count = 0;
count++; // 1 time
...
count++; // 999999 times
return count;
}
function makeBurgerAsync(callback) {
var count = 0;
count++; // 1 time
...
count++; // 999999 times
callback(count);
}
In node.js, all asynchronous operations accomplish their tasks outside of the node.js Javascript single thread. They either use a native code thread (such as disk I/O in node.js) or they don't use a thread at all (such as event driven networking or timers).
You can't take a synchronous operation written entirely in node.js Javascript and magically make it asynchronous. An asynchronous operation is asynchronous because it calls some function that is implemented in native code and written in a way to actually be asynchronous. So, to make something asynchronous, it has to be specifically written to use lower level operations that are themselves asynchronous with an asynchronous native code implementation.
These out-of-band operations, then communicate with the main node.js Javascript thread via the event queue. When one of these asynchronous operations completes, it adds an event to the Javascript event queue and then when the single node.js thread finishes what it is currently doing, it grabs the next event from the event queue and calls the callback associated with that event.
Thus, you can have multiple asynchronous operations running in parallel. And running 3 operations in parallel will usually have a shorter end-to-end running time than running those same 3 operations in sequence.
Let's examine a real-world async situation rather than your pseudo-code:
function doSomething() {
fs.readFile(fname, function(err, data) {
console.log("file read");
});
setTimeout(function() {
console.log("timer fired");
}, 100);
http.get(someUrl, function(err, response, body) {
console.log("http get finished");
});
console.log("READY");
}
doSomething();
console.log("AFTER");
Here's what happens step-by-step:
fs.readFile() is initiated. Since node.js implements file I/O using a thread pool, this operation is passed off to a thread in node.js and it will run there in a separate thread.
Without waiting for fs.readFile() to finish, setTimeout() is called. This uses a timer sub-system in libuv (the cross platform library that node.js is built on). This is also non-blocking so the timer is registered and then execution continues.
http.get() is called. This will send the desired http request and then immediately return to further execution.
console.log("READY") will run.
The three asynchronous operations will complete in an indeterminate order (whichever one completes it's operation first will be done first). For purposes of this discussion, let's say the setTimeout() finishes first. When it finishes, some internals in node.js will insert an event in the event queue with the timer event and the registered callback. When the node.js main JS thread is done executing any other JS, it will grab the next event from the event queue and call the callback associated with it.
For purposes of this description, let's say that while that timer callback is executing, the fs.readFile() operation finishes. Using it's own thread, it will insert an event in the node.js event queue.
Now the setTimeout() callback finishes. At that point, the JS interpreter checks to see if there are any other events in the event queue. The fs.readfile() event is in the queue so it grabs that and calls the callback associated with that. That callback executes and finishes.
Some time later, the http.get() operation finishes. Internal to node.js, an event is added to the event queue. Since there is nothing else in the event queue and the JS interpreter is not currently executing, that event can immediately be serviced and the callback for the http.get() can get called.
Per the above sequence of events, you would see this in the console:
READY
AFTER
timer fired
file read
http get finished
Keep in mind that the order of the last three lines here is indeterminate (it's just based on unpredictable execution speed) so that precise order here is just an example. If you needed those to be executed in a specific order or needed to know when all three were done, then you would have to add additional code in order to track that.
Since it appears you are trying to make code run faster by making something asynchronous that isn't currently asynchronous, let me repeat. You can't take a synchronous operation written entirely in Javascript and "make it asynchronous". You'd have to rewrite it from scratch to use fundamentally different asynchronous lower level operations or you'd have to pass it off to some other process to execute and then get notified when it was done (using worker processes or external processes or native code plugins or something like that).

How to know when the Promise is actually resolved in Node.js?

When we are using Promise in nodejs, given a Promise p, we can't know when the Promise p is actually resolved by logging the currentTime in the "then" callback.
To prove that, I wrote the test code below (using CoffeeScript):
# the procedure we want to profile
getData = (arg) ->
new Promise (resolve) ->
setTimeout ->
resolve(arg + 1)
, 100
# the main procedure
main = () ->
beginTime = new Date()
console.log beginTime.toISOString()
getData(1).then (r) ->
resolveTime = new Date()
console.log resolveTime.toISOString()
console.log resolveTime - beginTime
cnt = 10**9
--cnt while cnt > 0
return cnt
main()
When you run the above code, you will notice that the resolveTime (the time your code run into the callback function) is much later than 100ms from the beginTime.
So If we want to know when the Promise is actually resolved, HOW?
I want to know the exact time because I'm doing some profiling via logging. And I'm not able to modify the Promise p 's implementation when I'm doing some profiling outside of the black box.
So, Is there some function like promise.onUnderlyingConditionFulfilled(callback) , or any other way to make this possible?
This is because you have a busy loop that apparently takes longer than your timer:
cnt = 10**9
--cnt while cnt > 0
Javascript in node.js is single threaded and event driven. It can only do one thing at a time and it will finish the current thing it's doing before it can service the event posted by setTimeout(). So, if your busy loop (or any other long running piece of Javascript code) takes longer than you've set your timer for, the timer will not be able to run until this other Javascript code is done. "single threaded" means Javascript in node.js only does one thing at a time and it waits until one thing returns control back to the system before it can service the next event waiting to run.
So, here's the sequence of events in your code:
It calls the setTimeout() to schedule the timer callback for 100ms from now.
Then you go into your busy loop.
While it's in the busy loop, the setTimeout() timer fires inside of the JS implementation and it inserts an event into the Javascript event queue. That event can't run at the moment because the JS interpreter is still running the busy loop.
Then eventually it finishes the busy loop and returns control back to the system.
When that is done, the JS interpreter then checks the event queue to see if any other events need servicing. It finds the timer event and so it processes that and the setTimeout() callback is called.
That callback resolves the promise which triggers the .then() handler to get called.
Note: Because of Javascript's single threaded-ness and event-driven nature, timers in Javascript are not guaranteed to be called exactly when you schedule them. They will execute as close to that as possible, but if other code is running at the time they fire or if their are lots of items in the event queue ahead of you, that code has to finish before the timer callback actually gets to execute.
So If we want to know when the Promise is actually resolved, HOW?
The promise is resolved when your busy loop is done. It's not resolved at exactly the 100ms point (because your busy loop apparently takes longer than 100ms to run). If you wanted to know exactly when the promise was resolved, you would just log inside the setTimeout() callback right where you call resolve(). That would tell you exactly when the promise was resolved though it will be pretty much the same as where you're logging now. The cause of your delay is the busy loop.
Per your comments, it appears that you want to somehow measure exactly when resolve() is actually called in the Promise, but you say that you cannot modify the code in getData(). You can't really do that directly. As you already know, you can measure when the .then() handler gets called which will probably be no more than a couple ms after resolve() gets called.
You could replace the promise infrastructure with your own implementation and then you could instrument the resolve() callback directly, but replacing or hooking the promise implementation probably influences the timing of things even more than just measuring from the .then() handler.
So, it appears to me that you've just over-constrained the problem. You want to measure when something inside of some code happens, but you don't allow any instrumentation inside that code. That just leaves you with two choices:
Replace the promise implementation so you can instrument resolve() directly.
Measure when .then() is triggered.
The first option probably has a heisenberg uncertainty issue in that you've probably influenced the timing by more than you should if you replace or hook the promise implementation.
The second option measures an event that happens just slightly after the actual .resolve(). Pick which one sounds closest to what you actually want.

Resources