Node; Q Promise delay - node.js

Here are some simple questions based on behaviour I noticed in the following example running in node:
Q('THING 1').then(console.log.bind(console));
console.log('THING 2');
The output for this is:
> "THING 2"
> "THING 1"
Questions:
1) Why is Q implemented to wait before running the callback on a value that is immediately known? Why isn't Q smart enough to allow the first line to synchronously issue its output before the 2nd line runs?
2) What is the time lapse between "THING 2" and "THING 1" being output? Is it a single process tick?
3) Could there be performance concerns with values that are deeply wrapped in promises? For example, does Q(Q(Q("THING 1"))) asynchronously wait 3 times as long to complete, even though it can be efficiently synchronously resolved?

This is actually done on purpose. It is to make it consistent whether or not the value is known or not. That way there is only one order of evaluation and you can depend on the fact that no matter if the promise has already settled or not, that order will be the same.
Also, doing it otherwise would make it possible to write a code to test if the promise has settled or not and by design it should not be known and acted upon.
This is pretty much the as doing callback-style code like this:
function fun(args, callback) {
if (!args) {
process.nextTick(callback, 'error');
}
// ...
}
so that anyone who calls it with:
fun(x, function (err) {
// A
});
// B
can be sure that A will never run before B.
The spec
See the Promises/A+ Specification, The then Method section, point 4:
onFulfilled or onRejected must not be called until the execution context stack contains only platform code.
See also the the note 1:
Here "platform code" means engine, environment, and promise implementation code. In practice, this requirement ensures that onFulfilled and onRejected execute asynchronously, after the event loop turn in which then is called, and with a fresh stack. This can be implemented with either a "macro-task" mechanism such as setTimeout or setImmediate, or with a "micro-task" mechanism such as MutationObserver or process.nextTick. Since the promise implementation is considered platform code, it may itself contain a task-scheduling queue or "trampoline" in which the handlers are called.
So this is actually mandated by the spec.
It was discussed extensively to make sure that this requirement is clear - see:
https://github.com/promises-aplus/promises-spec/pull/70
https://github.com/promises-aplus/promises-spec/pull/104
https://github.com/promises-aplus/promises-spec/issues/100
https://github.com/promises-aplus/promises-spec/issues/139
https://github.com/promises-aplus/promises-spec/issues/229

Related

Is it safe to skip calling callback if no action needed in nodejs

scenario 1
function a(callback){
console.log("not calling callback");
}
a(function(callback_res){
console.log("callback_res", callback_res);
});
scenario 2
function a(callback){
console.log("calling callback");
callback(true);
}
a(function(callback_res){
console.log("callback_res", callback_res);
});
will function a be waiting for callback and will not terminate in scenario 1? However program gets terminated in both scenario.
The problem is not safety but intention. If a function accepts a callback, it's expected that it will be called at some point. If it ignores the argument it accepts, the signature is misleading.
This is a bad practice because function signature gives false impression about how a function works.
It also may cause parameter is unused warning in linters.
will function a be waiting for callback and will not terminate in scenario 1?
The function doesn't contain asynchronous code and won't wait for anything. The fact that callbacks are commonly used in asynchronous control flow doesn't mean that they are asynchronous per se.
will function a be waiting for callback and will not terminate in scenario 1?
No. There is nothing in the code you show that waits for a callback to be called.
Passing a callback to a function is just like passing an integer to a function. The function is free to use it or not and it doesn't mean anything more than that to the interpreter. the JS interpreter has no special logic to "wait for a passed callback to get called". That has no effect one way or the other on when the program terminates. It's just a function argument that the called function can decide whether to use or ignore.
As another example, it used to be common to pass two callbacks to a function, one was called upon success and one was called upon error:
function someFunc(successFn, errorFn) {
// do some operation and then call either successFn or errorFn
}
In this case, it was pretty clear that one of these was going to get called and the other was not. There's no need (from the JS interpreter's point of view) to call a passed callback. That's purely the prerogative of the logic of your code.
Now, it would not be a good practice to design a function that shows a callback in the calling signature and then never, ever call that callback. That's just plain wasteful and a misleading design. There are many cases of callbacks that are sometimes called and sometimes not depending upon circumstances. Array.prototype.forEach is one such example. If you call array.forEach(fn) on an empty array, the callback is never called. But, of course, if you call it on a non-empty array, it is called.
If your function carries out asynchronous operations and the point of the callback is to communicate when the asynchronous operation is done and whether it concluded with an error or a value, then it would generally be bad form to have code paths that would never call the callback because it would be natural for a caller to assume the callback is doing to get called eventually. I can imagine there might be some exceptions to this, but they better be documented really well with the doc/comments for the function.
For asynchronous operations, your question reminds me somewhat of this: Do never resolved promises cause memory leak? which might be useful to read.

NodeJS test continuing async flow with Jasmine 2.4

My route calls a function startJob() that immediately returns a response "Job started." It also calls a function runJob(), which continues with the job flow (async).
Obviously, I can't just do const jobResults = await startJob(), since that function just returns "Job started." The actual results of the job are not available until all the subsequent functions complete.
So I need a way to check the results of the test after all the functions complete, meaning I need to know when all the functions complete.
The last function in the flow logResults() is not going to work to spy on, because I need it to call through (including an async call). Jasmine doesn't allow me to call it through and then call another function that checks the results.
My idea was to add a function at the end of logResults() called finishJob and spy on that. finishJob() wouldn't do anything, so I wouldn't need to call it through.
Pseudo-code:
spyOn(finishJob).and.callFake( [function that checks the results] )
However, it seems like bad practice to add a function to the code purely for testing reasons.
Other possible solutions would be using some kind of node watchers, a state change machine, or just doing a timeout (also bad practice? but quick implementation...)
I found a solution I'm happy with. I decided to start the testing at a different point in the flow.
Instead calling await startJob() to start the test, I called await runJob(), which kicks off the asynchronous process and gets the results.
That way, I could know when the process finishes (using await keyword obvs) and check the results at that time.

How to know when the Promise is actually resolved in Node.js?

When we are using Promise in nodejs, given a Promise p, we can't know when the Promise p is actually resolved by logging the currentTime in the "then" callback.
To prove that, I wrote the test code below (using CoffeeScript):
# the procedure we want to profile
getData = (arg) ->
new Promise (resolve) ->
setTimeout ->
resolve(arg + 1)
, 100
# the main procedure
main = () ->
beginTime = new Date()
console.log beginTime.toISOString()
getData(1).then (r) ->
resolveTime = new Date()
console.log resolveTime.toISOString()
console.log resolveTime - beginTime
cnt = 10**9
--cnt while cnt > 0
return cnt
main()
When you run the above code, you will notice that the resolveTime (the time your code run into the callback function) is much later than 100ms from the beginTime.
So If we want to know when the Promise is actually resolved, HOW?
I want to know the exact time because I'm doing some profiling via logging. And I'm not able to modify the Promise p 's implementation when I'm doing some profiling outside of the black box.
So, Is there some function like promise.onUnderlyingConditionFulfilled(callback) , or any other way to make this possible?
This is because you have a busy loop that apparently takes longer than your timer:
cnt = 10**9
--cnt while cnt > 0
Javascript in node.js is single threaded and event driven. It can only do one thing at a time and it will finish the current thing it's doing before it can service the event posted by setTimeout(). So, if your busy loop (or any other long running piece of Javascript code) takes longer than you've set your timer for, the timer will not be able to run until this other Javascript code is done. "single threaded" means Javascript in node.js only does one thing at a time and it waits until one thing returns control back to the system before it can service the next event waiting to run.
So, here's the sequence of events in your code:
It calls the setTimeout() to schedule the timer callback for 100ms from now.
Then you go into your busy loop.
While it's in the busy loop, the setTimeout() timer fires inside of the JS implementation and it inserts an event into the Javascript event queue. That event can't run at the moment because the JS interpreter is still running the busy loop.
Then eventually it finishes the busy loop and returns control back to the system.
When that is done, the JS interpreter then checks the event queue to see if any other events need servicing. It finds the timer event and so it processes that and the setTimeout() callback is called.
That callback resolves the promise which triggers the .then() handler to get called.
Note: Because of Javascript's single threaded-ness and event-driven nature, timers in Javascript are not guaranteed to be called exactly when you schedule them. They will execute as close to that as possible, but if other code is running at the time they fire or if their are lots of items in the event queue ahead of you, that code has to finish before the timer callback actually gets to execute.
So If we want to know when the Promise is actually resolved, HOW?
The promise is resolved when your busy loop is done. It's not resolved at exactly the 100ms point (because your busy loop apparently takes longer than 100ms to run). If you wanted to know exactly when the promise was resolved, you would just log inside the setTimeout() callback right where you call resolve(). That would tell you exactly when the promise was resolved though it will be pretty much the same as where you're logging now. The cause of your delay is the busy loop.
Per your comments, it appears that you want to somehow measure exactly when resolve() is actually called in the Promise, but you say that you cannot modify the code in getData(). You can't really do that directly. As you already know, you can measure when the .then() handler gets called which will probably be no more than a couple ms after resolve() gets called.
You could replace the promise infrastructure with your own implementation and then you could instrument the resolve() callback directly, but replacing or hooking the promise implementation probably influences the timing of things even more than just measuring from the .then() handler.
So, it appears to me that you've just over-constrained the problem. You want to measure when something inside of some code happens, but you don't allow any instrumentation inside that code. That just leaves you with two choices:
Replace the promise implementation so you can instrument resolve() directly.
Measure when .then() is triggered.
The first option probably has a heisenberg uncertainty issue in that you've probably influenced the timing by more than you should if you replace or hook the promise implementation.
The second option measures an event that happens just slightly after the actual .resolve(). Pick which one sounds closest to what you actually want.

NodeJS -- cost of promise chains in recurssion

I am trying to implement a couple of state handler funcitons in my javascript code, in order to perform 2 different distinct actions in each state. This is similar to a state design pattern of Java (https://sourcemaking.com/design_patterns/state).
Conceptually, my program need to remain connected to an elasticsearch instance (or any other server for that matter), and then parse and POST some incoming data to el. If there is no connection available to elasticsearch, my program would keep tring to connect to el endlessly with some retry period.
In a nutshell,
When not connected, keep trying to connect
When connected, start POSTing the data
The main run loop is calling itself recurssively,
function run(ctx) {
logger.info("run: running...");
// initially starts with disconnected state...
return ctx.curState.run(ctx)
.then(function(result) {
if (result) ctx.curState = connectedSt;
// else it remains in old state.
return run(ctx);
});
}
This is not a truly recursive fn in the sense that each invocation is calling itself in a tight loop. But I suspect it ends up with many promises in the chain, and in the long run it will consume more n more memory and hence eventually hang.
Is my assumption / understanding right? Or is it OK to write this kinda code?
If not, should I consider calling setImmediate / process.nextTick etc?
Or should I consider using TCO (Tail Cost Optimization), ofcourse I am yet to fully understand this concept.
Yes, by returning a new promise (the result of the recursive call to run()), you effectively chain in another promise.
Neither setImmediate() nor process.nextTick() are going to solve this directly.
When you call run() again, simply don't return it and you should be fine.

What is executed (explicitly and implicitly) in parallell and what sequentially in node.js?

This example confuse my understanding of how the node.js works:
// 1.
numbers.forEach(function(number) {
queue.push(Q.call(slowFunction, this, number));
});
// 2.
// Q.all: execute an array of 'promises' and 'then' call either a resolve
// callback (fulfilled promises) or reject callback (rejected promises)
Q.all(queue).then(function(ful) {
// All the results from Q.all are on the argument as an array
console.log('fulfilled', ful);
}, function(rej) {
// The first rejected (error thrown) will be here only
console.log('rejected', rej);
}).fail(function(err) {
// If something went wrong, then we catch it here, usually when there is no
// rejected callback.
console.log('fail', err);
}).fin(function() {
// Finally statement; executed no matter of the above results
console.log('finally');
});
Why it is assumed here, that 1. and 2. parts of code will be executed sequentially?
So, where is the guarantee that Q.all(queue) works on all queue elements pushed in 1.? Could it be so, that the numbers from 1. is so big, that it works than parallel to 2.?
These ideas coming from the understanding that node.js will handling 1. and 2. first of all with the node.js event-loop and then give it to workers, which are actually analogue to the normal threads.
So the question - will be 1. and 2. executed parallel to each other, started from node.js event-loop sequentially or will they be executed sequentially (the 1. push all elements in the queue and only after that the 2. starts to handling each element in the queue) ?
Please provide arguments with some direct links to documentation for this topic.
At the topmost level, 1. and 2's Q.all(queue).then(...).fail(...).fin(...); method chain, will unambiguously be executed sequentially.
The precise timing of execution of functions defined/called within 1 and 2 depends very much on the nature of slowFunction.
If slowFunction is performed wholly by synchronous javascript (eg. some extensive Math), then 1 will have completed in its entirety before 2 starts. In this case the callbacks specified in 2 will execute very shortly after 2's method chain has finished executing because any promises returned by slowfunction will be (or at least should be) already resolved.
If however, slowFunction involves one or more asynchronous node I/O process (eg. file handling or resource fetching), then each call to it will (at least in part) be undertaken by a non-blocking worker thread (not javascript); in this case, providing slowFunction is correctly written, queue will accumulate a set of promises, each of which will be resolved or rejected later. The callbacks specified in 2 will execute after 2's method chain has finished executing AND when all the promises have been either resolved or rejected.
For me, a rephrased version of 2's introductory text would go a long way toward explaining the order of execution :
Q.all: Wait for every 'promise' in the queue array of to be either
resolved or rejected 'then' call the corresponding callback function.
Ref: http://rickgaribay.net/archive/2012/01/28/node-is-not-single-threaded.aspx

Resources