JS: Why does the Promise then() method execute synchronously?

I need to make part of my method's code asynchronous, so it will execute in a non-blocking manner. For this purpose I've tried to create a "dummy" Promise and put the specified code in then block. I have something like this:
public static foo(arg1, arg2) {
  const prom = new Promise((resolve, reject) => {
    if (notValid(arg1, arg2)) {
      reject();
    }
    resolve(true);
  });
  prom.then(() => {
    ...my code using arg1 and arg2...
  });
}
However, the then block always executes synchronously and blocks the whole app, even though each and every JS documentation tells that then always runs asynchronously. I've also tried replacing the Promise with this:
Promise.resolve().then(() => {
  ...my code using arg1 and arg2...
});
but got the same result.
The only way I've managed to make then block work asynchronously is by using setTimeout:
const pro = new Promise(resolve => {
  setTimeout(resolve, 1);
});
pro.then(() => {
  ...my code using arg1 and arg2...
});
What can be the reason behind then block working synchronously? I don't want to proceed with using setTimeout, because it is kind of a "dirty" solution, and I think that there should be a way to make then run asynchronously.
Application is NodeJS with Express, written using Typescript.

I need to make part of my method's code asynchronous, so it will execute in a non-blocking manner. For this purpose I've tried to create a "dummy" Promise and put the specified code in then block.
Promises don't really make things asynchronous in and of themselves. What they do is wrap around something that's already asynchronous, and give a convenient way to tell when that thing is done. If you wrap a promise around something synchronous, your code is mostly still synchronous, just with a few details about when the .then callback executes.
even though each and every JS documentation tells that then always runs asynchronously.
By that, they mean that it waits for the current call stack to finish, and then runs the code in the .then callback. The technical term for what it's doing is a "microtask". This delay is done so that the order of operations of your code is the same whether the promise is already in a resolved state, or if some time needs to pass before it resolves.
But if your promise is already resolved (eg, because it's wrapped around synchronous code), the only thing your .then callback will be waiting for is the currently executing call stack. Once the current call stack synchronously finishes, your microtask runs synchronously to completion. The event loop will not be able to progress until you're done.
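To make the ordering concrete, here is a minimal sketch (the busy loop is only a stand-in for "my code using arg1 and arg2", and the names order and finished are made up for the example). It records when each piece runs: the then callback waits for the synchronous code to finish, but once it starts, the 0 ms timer cannot fire until it is done.

```javascript
// Records the order in which each piece of code runs.
const order = [];

// A timer callback needs a turn of the event loop before it can run.
setTimeout(() => order.push('timer'), 0);

Promise.resolve().then(() => {
  order.push('then start');
  // Simulated CPU-heavy work: a busy loop of roughly 50 ms.
  const end = Date.now() + 50;
  while (Date.now() < end) { /* blocks the thread */ }
  order.push('then end');
});

order.push('sync end');

// Inspect the log once the 0 ms timer has had a chance to fire.
const finished = new Promise(resolve => setTimeout(() => resolve(order), 100));
finished.then(result => console.log(result.join(' -> ')));
// sync end -> then start -> then end -> timer
```

So the callback is "asynchronous" only in the sense that it runs after the current call stack, not in the sense that it runs in the background.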
The only way I've managed to make then block work asynchronously is by using setTimeout
setTimeout will make things asynchronous*, yes. The code will be delayed until the timer goes off. If you want to wrap this in a promise you can, but the promise is not the part that makes it asynchronous, the setTimeout is.
I don't want to proceed with using setTimeout, because it is kind of a "dirty" solution
Ok, but it's the right tool for the job.
* When I say setTimeout makes it asynchronous, I just mean it delays the execution. This is good enough for many cases, but when your code eventually executes, it will tie up the thread until it's done. So if it takes a really long time, you may need to write the code so it just does a portion of the work and then sets another timeout to resume later.
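A rough sketch of that "do a portion, then set another timeout" idea follows. The helper processInChunks and its parameters are hypothetical names made up for this example, and the right chunk size depends entirely on how expensive each item is.

```javascript
// Hypothetical helper: handles `items` in slices of `chunkSize`, yielding to
// the event loop between slices with setTimeout.
function processInChunks(items, handleItem, chunkSize = 1000) {
  return new Promise((resolve, reject) => {
    let index = 0;

    function runChunk() {
      try {
        const end = Math.min(index + chunkSize, items.length);
        for (; index < end; index++) handleItem(items[index]);
      } catch (err) {
        reject(err);
        return;
      }
      if (index < items.length) {
        setTimeout(runChunk, 0); // yield, then resume with the next slice
      } else {
        resolve();
      }
    }

    setTimeout(runChunk, 0);     // even the first slice is deferred
  });
}

let sum = 0;
let finished = false;
let timerRanBeforeDone = false;

const chunked = processInChunks(
  Array.from({ length: 10000 }, (_, i) => i + 1),
  n => { sum += n; },
  1000
).then(() => { finished = true; return sum; });

// An unrelated timer, scheduled after the work was queued, still gets its
// turn between slices instead of waiting for all ten of them.
setTimeout(() => { timerRanBeforeDone = !finished; }, 0);

chunked.then(total => console.log(total, timerRanBeforeDone)); // 50005000 true
```

Each slice still blocks while it runs, so this only keeps the app responsive if a single slice is short.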

Related

What happens in Node.JS if a Promise runs forever?

Let's say we have this buggy function:
async function buggy() {
  while (true) {
    await new Promise(resolve => setTimeout(resolve, 1000));
  }
}
If you call it somewhere in NodeJS, would it permanently impact the server's performance?
If so, would it be better to always put a fail-safe mechanism like so for all untrusted promises:
new Promise((resolve, reject) => {
  buggy().then(resolve);
  setTimeout(reject, 10000);
});
No, there is nothing wrong with your buggy function. To say that a promise runs is misleading: a promise doesn't run. It is an object that has provided a callback. As long as that callback resolve isn't called, and its then method isn't called, there is nothing happening with that promise object.
The following happens when buggy is run:
1. new Promise creates a promise object.
2. setTimeout is run and completes immediately. It registers the resolve callback.
3. await is executed, which actually calls then on the above promise to add a listener, and makes buggy return. If this is the first time, then it returns a pending promise to the caller.
4. One second later (during which there is no activity related to buggy), the setTimeout API will put the resolve callback on the relevant job queue.
5. When the event loop checks this queue, it consumes and executes this job. The promise API (that provided the resolve callback) puts the promise in a resolved state and puts a notification for its then-listeners (including the one created by await) as a job on the promise job queue.
6. When the event loop checks this queue, it consumes and executes the listener(s). This restores the execution context of buggy, which then continues with its loop.
7. Repeat these steps from the top.
The impact on the engine or memory is comparable with a setInterval call that you never clear with clearInterval. There is just a little bit more overhead due to the extra promise related jobs (in addition to the regular timer job) that kick in after each second, and the saved execution state of buggy, which is comparable with what you would have with an infinite generator (using yield).
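A small sketch of this behaviour, using a bounded stand-in for buggy so the script can exit (loopWithAwait, sleep and the 20/30 ms delays are assumptions made for the example): while the loop is suspended at await, an unrelated timer gets to run.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Bounded stand-in for buggy(): same shape, but it stops after `rounds`
// iterations so that this script can finish.
async function loopWithAwait(rounds, log) {
  for (let i = 1; i <= rounds; i++) {
    await sleep(20);            // suspended here; the thread is free meanwhile
    log.push(`iteration ${i}`);
  }
}

const log = [];

// Unrelated work that becomes due while the loop is still going.
setTimeout(() => log.push('unrelated timer'), 30);

const loopDone = loopWithAwait(3, log).then(() => log);
loopDone.then(entries => console.log(entries));
```

The 'unrelated timer' entry shows up in the log before the loop's last iteration, because between iterations nothing related to the loop is executing.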
In the case that the promise resolve state depends on an external resource / timer that never ends (Infinity as the duration) the suggested fail-safe would work and the server performance wouldn’t be affected so much.
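// This won't get stuck, and the fail-safe will work
// (ignore the timer not being cleaned up).
// Caveat: Node sets delays larger than 2147483647 ms (including Infinity)
// to 1 ms, so read Infinity here as "a timer that never fires".
function buggy() {
  return new Promise(resolve => setTimeout(resolve, Infinity));
}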
Otherwise (if the promise's work is purely synchronous, i.e. resolution depends on an infinite loop that never ends), the reject function would never get called, because the event loop is stuck, and that in turn hangs the server.
// The fail-safe wouldn't work and the server is stuck
async function buggy() {
  while (true);
}
Edit:
As pointed out by @VLAZ, in case the buggy function is similar to this:
async function buggy() {
  while (true) {
    await new Promise(resolve => setTimeout(resolve, 1000));
  }
}
The fail-safe would work and the server won't get stuck.
Edit 2:
In both cases you won't need a fail-safe: if the fail-safe is able to fire, the code is not blocking the event loop in the first place, and calling reject on the wrapper promise doesn't abort the buggy one.
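If you still want to stop waiting on an untrusted promise, one common shape is Promise.race against a timer. The sketch below uses a hypothetical helper named withTimeout and a promise that never settles as the "untrusted" operation. As noted above, the timeout only ends the wait; whatever work sits behind the original promise keeps going.

```javascript
// Hypothetical helper: rejects if `promise` has not settled within `ms`.
// It only stops *waiting*; it cannot cancel the work behind `promise`.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it does not linger after the race is over.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// A promise that never settles, standing in for an untrusted operation.
const never = new Promise(() => {});

const outcome = withTimeout(never, 50).then(
  () => 'resolved',
  err => err.message
);
outcome.then(message => console.log(message)); // timed out after 50 ms
```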

Considering node.js asynchronous architecture, why doesn't push throw error when executed after array init function calls?

I'm fairly new to Node.js and learning about the history of callbacks, Promises, and async/await. So far I have written some code that I expected to throw an error, but it works fine. I suppose I should be happy with that, but I'd like to fully understand when async/await needs to be used.
Here is my code:
const rooms = {};
function initRoom(roomId) {
  if (!rooms[roomId]) {
    rooms[roomId] = {};
    rooms[roomId].users = [];
    rooms[roomId].messages = [];
  }
}
initRoom('A');
initRoom('B');
rooms.A.users.push(1);
rooms.A.users.push(2);
rooms.B.users.push(3);
console.log(rooms);
I expected the first push() to throw an error, assuming that it would execute in the stack before the initRoom() function calls completed. Why doesn't it throw an error?
There is nothing in initRoom() that is asynchronous. It's all synchronous code.
The "asynchronous architecture" you refer to has to do with specific library operations in nodejs that have an asynchronous design. These would be things such as disk operations or network operations. Those all have an underlying native code implementation that allows them to be asynchronous.
Nothing in your initRoom() function is asynchronous or calls anything asynchronous. Therefore, it is entirely synchronous and each line of code executes sequentially.
but I'd like to fully understand when async/await needs to be used.
You use promises and optionally async/await when you have operations that are actually asynchronous (they execute in the background and notify of completion later). You would not typically use them with purely synchronous code because they would only complicate the implementation and do not add any value if all the code is already synchronous.
I expected the first push() to throw an error, assuming that it would execute in the stack before the initRoom() function calls completed.
initRoom() is not asynchronous and contains no asynchronous code. When you call it, it runs each line in it sequentially before it returns and allows the next line of code after calling initRoom() to execute.
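For contrast, here is a sketch of what the error you expected would look like if the initialisation really were asynchronous. initRoomAsync is a hypothetical variant made up for this example, with a 10 ms timer standing in for a database or network call.

```javascript
const rooms = {};

// Hypothetical asynchronous variant: the room only exists once the timer
// fires, which stands in for a database or network call.
function initRoomAsync(roomId) {
  return new Promise(resolve => {
    setTimeout(() => {
      rooms[roomId] = { users: [], messages: [] };
      resolve();
    }, 10);
  });
}

async function main() {
  const pending = initRoomAsync('A');  // started, but not finished yet
  let earlyError = null;
  try {
    rooms.A.users.push(1);             // rooms.A is still undefined here
  } catch (err) {
    earlyError = err;
  }
  await pending;                       // now the room exists
  rooms.A.users.push(1);
  return { earlyError, users: rooms.A.users };
}

const result = main();
result.then(({ earlyError, users }) => {
  console.log(earlyError instanceof TypeError, users); // true [ 1 ]
});
```

This is the situation where await (or then) is needed: the caller has to wait for the operation to complete before using its result.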

Why does Node seemingly wait for all Promises to resolve?

I'm new to JavaScript and NodeJS so forgive me if this question is rudimentary.
Let's say I just have a simple file hello.js and I run it with $ node hello.js and all this file contains is
setTimeout(() => {console.log('hello');}, 5000);
Why doesn't this program finish immediately? Why instead does it wait for the underlying Promise to resolve?
After all, isn't the Promise associated with setTimeout created and run asynchronously? So wouldn't the main 'thread' of execution "fall off" when it encounters no more code to run?
The Node event loop keeps running until all outstanding tasks are completed or cancelled.
setTimeout creates a pending event, so the loop will keep running until that executes.
Pending timers (setTimeout, setInterval), open sockets and other active handles can all prevent the event loop from halting. A pending Promise on its own does not.
It's worth noting that setTimeout does not use a promise at all. That's just a regular callback function. The setTimeout() API long predates Promises.
I think there's more to the story/explanation here so I'll add an answer that contains additional info.
Nodejs keeps a reference counter for all unfinished asynchronous operations and nodejs itself will not exit automatically until all the asynchronous operations are complete (until the reference count gets to zero). If you want nodejs to exit before that, you can call process.exit() whenever you want.
Since setTimeout() is an asynchronous operation, it contributes to the reference count of unfinished asynchronous operations and thus it keeps nodejs from automatically exiting until the timer fires.
Note that setTimeout() does not use a promise - it's just a plain callback. It is not promises that nodejs waits for. It is the underlying asynchronous operations that promises often are attached to that it actually waits for.
So, if you did just this:
const myPromise = new Promise((resolve, reject) => {
  console.log("did nothing here");
});
Then, nodejs would not wait for that promise all by itself, even though that promise never resolves or rejects. This is because nodejs does not actually wait for promises. It waits for the underlying asynchronous operations that are usually behind promises. Since there is no such underlying asynchronous operation behind this promise, nodejs does not wait for it.
It's also worth mentioning that you can actually tell nodejs to NOT wait for some asynchronous operations by using the .unref() method. In the example you show, if you do this:
const timer = setTimeout(() => {console.log('hello');}, 5000);
timer.unref();
Then, nodejs will NOT wait for that timer before exiting and if this is your entire program, nodejs will actually exit immediately without waiting for the timer to fire.
As an example, I have a nodejs program that carries out some nightly maintenance using a recurring timer. I don't want that maintenance timer to keep the nodejs program running if the other things that it's doing are done. So, after I set the timer, I call the .unref() method on it.
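A quick way to see both points in one script is sketched below. It relies on the timer object's hasRef() method, which reports whether the timer is currently keeping the process alive; the variable names are made up for the example.

```javascript
// A promise that never settles: there is no timer, socket or other handle
// behind it, so it does not keep the process alive on its own.
const neverSettles = new Promise(() => {});

// A referenced timer keeps the process alive until it fires...
const timer = setTimeout(() => console.log('hello'), 5000);
const refBefore = timer.hasRef();
console.log(refBefore);        // true

// ...unless it is unref'ed, after which Node may exit without running it.
timer.unref();
console.log(timer.hasRef());   // false
```

Run as a whole program, this exits right away without printing 'hello', despite the pending promise and the 5 second timer.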

Using caolan's async library, when should I use process.nextTick and when should I just invoke the callback?

I've been reading up on the differences, and it's hard to think about it when utilizing a library that helps with async methods.
I'm using https://github.com/caolan/async to write specific async calls. I'm having a hard time understanding the use of process.nextTick within some of these async methods, particularly async series methods, where async methods are basically performed synchronously.
So for example:
async.waterfall([
  next => someAsyncMethod(next),
  (res, next) => {
    if (res === someCondition) {
      return anotherAsyncMethod(next);
    }
    return process.nextTick(next); // vs just calling next()
  },
], cb);
I've seen this done in code before. But I have no idea why? Just invoking next instead of process.nextTick gives me the same results?
Is there any reason for using process.nextTick in these scenarios, where there is an async method being controlled in a synchronous manner?
Also, what about in an async like the following?
async.map(someArray, (item, next) => {
  if (item === someCondition) {
    return anotherAsyncMethod(next);
  }
  return process.nextTick(next); // vs just calling next()
}, cb);
Thanks!
The code is happening in a SEQUENTIAL manner, not a synchronous manner. It's an important difference.
In your code, the async methods are called one after another, in sequence. HOWEVER, while that code is executing, node.js can still respond to other incoming requests because your code yields control.
Node is single-threaded. So if your code is doing something synchronously, node cannot accept new requests or perform actions until that code is finished. For instance, if you did a synchronous web request, node would stop doing ANYTHING ELSE until that request was finished.
What's really happening is this:
1. Start async action in background (yield control).
2. Node is available to handle other stuff.
3. Async action 1 completes. Node starts async action 2 and yields control.
4. Node can accept other requests/handle other stuff.
5. Async action 2 completes...
And so on. process.nextTick() says to node 'Stop dealing with this for a while, and come back once you've handled the other stuff that's waiting on you'. Node goes off and handles whatever that is, then gets back to handling your scheduled request.
In your case, there is nothing else waiting so node just continues where it left off. However, if there WERE other things going on like other incoming HTTP requests, this would not be the case.
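One nuance worth checking against the Node.js docs: callbacks queued with process.nextTick run as soon as the current operation finishes, before the event loop moves on to timers or I/O, whereas setImmediate waits for a later phase of the loop. A minimal sketch of the ordering (the names order and settled are made up for the example):

```javascript
const order = [];

setImmediate(() => order.push('setImmediate'));  // runs in a later loop phase
process.nextTick(() => order.push('nextTick'));  // runs right after the current operation
order.push('sync');

// A second setImmediate runs after the first one, so the log is complete here.
const settled = new Promise(resolve => setImmediate(() => resolve(order)));
settled.then(result => console.log(result.join(' -> ')));
// sync -> nextTick -> setImmediate
```

So return process.nextTick(next) mainly guarantees that next is never invoked inline, which keeps the callback consistently asynchronous whichever branch is taken. If the goal is to let pending I/O run first, setImmediate is the closer fit.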
Feel free to ask questions.

Mongoose eachAsync with async function = memory leak?

I am not sure if it's a bug in mongoose or if I am doing something wrong. Once I start using async functions when iterating on a cursor with eachAsync I experience memory leaks (quickly goes up to 4gb and then crashes). After trying some things I noticed that this wouldn't happen if I don't use an async function as callback.
No Memory leak:
const playerCursor: QueryCursor<IPlayerProfileModel> = PlayerProfile.find({}, projection).lean().cursor();
await playerCursor.eachAsync(
  (profile: IPlayerProfileModel) => {
    return;
  },
  { parallel: 50 }
);
Memory leak:
const playerCursor: QueryCursor<IPlayerProfileModel> = PlayerProfile.find({}, projection).lean().cursor();
await playerCursor.eachAsync(
  async (profile: IPlayerProfileModel) => {
    return;
  },
  { parallel: 50 }
);
Obviously above code doesn't make any sense but I need to perform an asynchronous operation within the function.
Question:
What is causing the memory leak / how can I avoid it?
It has to do with how async functions work.
Quoting the documentation:
When the async function returns a value, the Promise will be resolved with the returned value.
Meaning, values returned by async functions will be automatically wrapped into a Promise.
In your first code sample, your code returns undefined whereas in the second code sample your code returns Promise.resolve(undefined).
What is causing the memory leak?
I didn't take a look at mongoose code but the documentation states:
If fn returns a promise, will wait for the promise to resolve before iterating on to the next one.
Since your first example does not return a Promise, I am thinking your callback is executed on each result all at once rather than sequentially.
How can I avoid it?
I'd recommend using async/await as you used it in your second code sample.
After taking a look at the code (looking for an answer myself): if provided with a callback that doesn't return a promise, eachAsync will run the callback as many times and as fast as it can.
This line is where your callback is executed. The next line checks whether it's a Promise and if it's not then execute right away callback which effectively calls your eachAsync callback on the next result. If your callback has any sort of async operation but returns right away then you end up with thousands and thousands of async operations running all at once.
On top of that, you set the option parallel to 50, so it executes the eachAsync callback fifty times in parallel.
This isn't a bug in mongoose, because there are cases where this behavior is wanted, and it does provide sequential processing when the callback returns a Promise. The documentation should mention the caveat of using a callback that doesn't return a Promise.
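To illustrate why the returned promise matters, here is a generic sketch. It is not mongoose's implementation: eachWithLimit, sleep and the counters are hypothetical names made up for the example. Because the iterator awaits whatever the callback returns, the number of operations in flight stays capped at the parallel value.

```javascript
// Hypothetical stand-in for a cursor iterator (not mongoose's code): runs `fn`
// on every item, keeping at most `parallel` calls in flight by awaiting the
// promise each call returns.
async function eachWithLimit(items, fn, parallel) {
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const item = items[next++];
      await fn(item);          // do not pick up more work until this one is done
    }
  }
  const workers = Array.from({ length: Math.min(parallel, items.length) }, worker);
  await Promise.all(workers);
}

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

let inFlight = 0;
let maxInFlight = 0;

const run = eachWithLimit(
  Array.from({ length: 20 }, (_, i) => i),
  async () => {
    inFlight++;
    maxInFlight = Math.max(maxInFlight, inFlight);
    await sleep(5);            // simulated asynchronous work per document
    inFlight--;
  },
  4
).then(() => maxInFlight);

run.then(max => console.log(max)); // 4
```

If the callback started its asynchronous work but returned nothing, an iterator like this would have nothing to wait on and would race through all items, which is the pile-up described above.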
To go a little further, express uses next on middleware callbacks for the purpose of sequencing them.
