NodeJS -- cost of promise chains in recursion - node.js

I am trying to implement a couple of state handler functions in my JavaScript code, in order to perform two distinct actions in each state. This is similar to the State design pattern as known from Java (https://sourcemaking.com/design_patterns/state).
Conceptually, my program needs to remain connected to an Elasticsearch instance (or any other server, for that matter), and then parse and POST some incoming data to it. If there is no connection available to Elasticsearch, my program would keep trying to connect to it endlessly with some retry period.
In a nutshell,
When not connected, keep trying to connect
When connected, start POSTing the data
The main run loop calls itself recursively:
function run(ctx) {
  logger.info("run: running...");
  // initially starts with disconnected state...
  return ctx.curState.run(ctx)
    .then(function(result) {
      if (result) ctx.curState = connectedSt;
      // else it remains in old state.
      return run(ctx);
    });
}
This is not a truly recursive function, in the sense that an invocation does not call itself in a tight loop. But I suspect it ends up with many promises in the chain, and in the long run it will consume more and more memory and eventually hang.
Is my assumption / understanding right? Or is it OK to write this kind of code?
If not, should I consider calling setImmediate / process.nextTick etc.?
Or should I consider using TCO (Tail Call Optimization)? Of course, I am yet to fully understand this concept.

Yes, by returning a new promise (the result of the recursive call to run()), you effectively chain in another promise.
Neither setImmediate() nor process.nextTick() is going to solve this directly.
When you call run() again, simply don't return it and you should be fine.
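A minimal sketch of that suggestion follows. The two state objects, the counters, and the remaining/onDone stop condition are hypothetical and exist only so the example terminates. Because the recursive call is not returned, nothing upstream sees a rejection, so the sketch handles errors inside run() itself.

```javascript
// Hypothetical states for illustration; each run(ctx) returns a promise.
const connectedSt = {
  run(ctx) { ctx.posted += 1; return Promise.resolve(true); },
};
const disconnectedSt = {
  // Pretends to connect successfully on the third attempt.
  run(ctx) { ctx.attempts += 1; return Promise.resolve(ctx.attempts >= 3); },
};

function run(ctx) {
  // Demo-only stop condition so the sketch does not loop forever.
  if (ctx.remaining-- <= 0) {
    ctx.onDone();
    return;
  }
  ctx.curState.run(ctx)
    .then(function (result) {
      if (result) ctx.curState = connectedSt;
      // Not returned: this promise settles without waiting for the next iteration.
      run(ctx);
    })
    .catch(function (err) {
      // Nothing is returned to a caller, so errors have to be handled here.
      console.error('run failed:', err);
    });
}

run({
  curState: disconnectedSt,
  attempts: 0,
  posted: 0,
  remaining: 5,
  onDone: function () { console.log('stopped'); },
});
```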

lambda trigger callback vs context.done

I was following the guide here for setting up a presignup trigger.
However, when I used callback(null, event) my lambda function would never actually return and I would end up getting an error
{ code: 'UnexpectedLambdaException',
name: 'UnexpectedLambdaException',
message: 'arn:aws:lambda:us-east-2:642684845958:function:proj-dev-confirm-1OP5DB3KK5WTA failed with error Socket timeout while invoking Lambda function.' }
I found a similar link here that says to use context.done().
After switching it works perfectly fine.
What's the difference?
exports.confirm = (event, context, callback) => {
  event.response.autoConfirmUser = true;
  context.done(null, event);
  //callback(null, event); does not work
}
Back in the original Lambda runtime environment for Node.js 0.10, Lambda provided helper functions in the context object: context.done(err, res), context.succeed(res), and context.fail(err).
This was formerly documented, but has been removed.
Using the Earlier Node.js Runtime v0.10.42 is an archived copy of a page that no longer exists in the Lambda documentation, that explains how these methods were used.
When the Node.js 4.3 runtime for Lambda was launched, these remained for backwards compatibility (and remain available but undocumented), and callback(err, res) was introduced.
Here's the nature of your problem, and why the two solutions you found actually seem to solve it.
Context.succeed, context.done, and context.fail however, are more than just bookkeeping – they cause the request to return after the current task completes and freeze the process immediately, even if other tasks remain in the Node.js event loop. Generally that’s not what you want if those tasks represent incomplete callbacks.
https://aws.amazon.com/blogs/compute/node-js-4-3-2-runtime-now-available-on-lambda/
So with callback, Lambda functions now behave in a more paradigmatically correct way, but this is a problem if you intend for certain objects to remain on the event loop during the freeze that occurs between invocations -- unlike the old (deprecated) done, fail, and succeed methods, using the callback doesn't suspend things immediately. Instead, it waits for the event loop to be empty.
context.callbackWaitsForEmptyEventLoop -- default true -- was introduced so that you can set it to false for those cases where you want the Lambda function to return immediately after you call the callback, regardless of what's happening in the event loop. The default is true because false can mask bugs in your function and can cause very erratic/unexpected behavior if you fail to consider the implications of container reuse -- so you shouldn't set this to false unless and until you understand why it is needed.
A common reason false is needed would be a database connection made by your function. If you create a database connection object in a global variable, it will have an open socket, and potentially other things like timers, sitting on the event loop. This prevents the callback from causing Lambda to return a response, until these operations are also finished or the invocation timeout timer fires.
Identify why you need to set this to false, and if it's a valid reason, then it is correct to use it.
Otherwise, your code may have a bug that you need to understand and fix, such as leaving requests in flight or other work unfinished, when calling the callback.
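As a rough sketch of the database-connection case described above, the handler below sets context.callbackWaitsForEmptyEventLoop to false before calling the callback. The getConnection helper and the connection object are hypothetical placeholders, not a real database client; only the context property and the handler signature come from Lambda.

```javascript
// Hypothetical long-lived resource kept outside the handler so it can be
// reused across invocations (stands in for a real database connection).
let connection = null;

function getConnection() {
  if (!connection) {
    connection = { createdAt: Date.now() }; // placeholder, not a real client
  }
  return connection;
}

const confirm = (event, context, callback) => {
  // Let the callback return the response even if sockets or timers
  // are still sitting on the event loop.
  context.callbackWaitsForEmptyEventLoop = false;
  getConnection();
  event.response.autoConfirmUser = true;
  callback(null, event);
};

exports.confirm = confirm;
```

With the default of true, the same callback call would not produce a response until the event loop drained or the invocation timed out.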
So, how do we parse the Cognito error? At first, it seemed pretty unusual, but now it's clear that it is not.
When a function runs too long, Lambda throws an error saying that the task timed out after the configured number of seconds. You should find this to be what happens when you test your function in the Lambda console.
Unfortunately, Cognito appears to have taken an internal design shortcut when invoking a Lambda function, and instead of waiting for Lambda to time out the invocation (which could tie up resources inside Cognito) or imposing its own explicit timer on the maximum duration Cognito will wait for a Lambda response, it's relying on a lower layer socket timer to constrain this wait... thus an "unexpected" error is thrown while invoking the function.
Further complicating interpreting the error message, there are missing quotes in the error, where the lower layer exception is interpolated.
To me, the problem would be much more clear if the error read like this:
'arn:aws:lambda:...' failed with error 'Socket timeout' while invoking Lambda function
This format would more clearly indicate that while Cognito was invoking the function, it threw an internal Socket timeout error (as opposed to Lambda encountering an unexpected internal error, which was my original -- and incorrect -- assumption).
It's quite reasonable for Cognito to impose some kind of response time limit on the Lambda function, but I don't see this documented. I suspect a short timeout on your Lambda function itself (making it fail more promptly) would cause Cognito to throw a somewhat more useful error, but in my mind, Cognito should have been designed to include logic to make this an expected, defined error, rather than categorizing it as "unexpected."
As an update, the Node.js 10.x runtime handler supports an async function that uses return and throw statements to return success or error responses, respectively. Additionally, if your function performs asynchronous tasks, you can return a Promise and use resolve or reject to return a success or error, respectively. Either approach simplifies things by not requiring context or callback to signal completion to the invoker, so your Lambda function could look something like this:
exports.handler = async (event) => {
  // perform tasking... (doStuffWith is a placeholder for your own logic)
  const data = doStuffWith(event)
  // if you encounter an error situation, throw to signal failure
  if (!data) {
    throw new Error('tell invoker you encountered an error')
  }
  // finished tasking with no errors
  return { data }
}
Of course, you can still use context, but it's not required to signal completion.

Node; Q Promise delay

Here are some simple questions based on behaviour I noticed in the following example running in node:
Q('THING 1').then(console.log.bind(console));
console.log('THING 2');
The output for this is:
> "THING 2"
> "THING 1"
Questions:
1) Why is Q implemented to wait before running the callback on a value that is immediately known? Why isn't Q smart enough to allow the first line to synchronously issue its output before the 2nd line runs?
2) What is the time lapse between "THING 2" and "THING 1" being output? Is it a single process tick?
3) Could there be performance concerns with values that are deeply wrapped in promises? For example, does Q(Q(Q("THING 1"))) asynchronously wait 3 times as long to complete, even though it can be efficiently synchronously resolved?
This is actually done on purpose. It makes the behaviour consistent whether or not the value is already known. That way there is only one order of evaluation, and you can depend on the fact that, whether or not the promise has already settled, that order will be the same.
Also, doing it otherwise would make it possible to write code that tests whether the promise has settled, and by design that should not be known or acted upon.
This is pretty much the same as writing callback-style code like this:
function fun(args, callback) {
  if (!args) {
    // defer the callback so it never runs before the caller's next statement
    process.nextTick(callback, 'error');
    return;
  }
  // ...
}
so that anyone who calls it with:
fun(x, function (err) {
  // A
});
// B
can be sure that A will never run before B.
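The same ordering can be observed with a native Promise, sketched here as an assumption-free stand-in for Q (Q itself is not used, and the order array is just a helper for recording what ran when). The promise is already fulfilled when then is called, yet its handler still runs after the synchronous code.

```javascript
const order = [];

// The promise is already fulfilled, yet the handler is still deferred.
const done = Promise.resolve('THING 1').then(function (value) {
  order.push(value);
});

// Runs first: the synchronous code finishes before any then-handler is called.
order.push('THING 2');

done.then(function () {
  console.log(order.join(', ')); // prints "THING 2, THING 1"
});
```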
The spec
See the Promises/A+ Specification, The then Method section, point 4:
onFulfilled or onRejected must not be called until the execution context stack contains only platform code.
See also note 1:
Here "platform code" means engine, environment, and promise implementation code. In practice, this requirement ensures that onFulfilled and onRejected execute asynchronously, after the event loop turn in which then is called, and with a fresh stack. This can be implemented with either a "macro-task" mechanism such as setTimeout or setImmediate, or with a "micro-task" mechanism such as MutationObserver or process.nextTick. Since the promise implementation is considered platform code, it may itself contain a task-scheduling queue or "trampoline" in which the handlers are called.
So this is actually mandated by the spec.
It was discussed extensively to make sure that this requirement is clear - see:
https://github.com/promises-aplus/promises-spec/pull/70
https://github.com/promises-aplus/promises-spec/pull/104
https://github.com/promises-aplus/promises-spec/issues/100
https://github.com/promises-aplus/promises-spec/issues/139
https://github.com/promises-aplus/promises-spec/issues/229

Concurrency between Meteor.setTimeout and Meteor.methods

In my Meteor application, which implements a turn-based multiplayer game server, the clients receive the game state via publish/subscribe, and can call a Meteor method sendTurn to send turn data to the server (they cannot update the game state collection directly).
var endRound = function(gameRound) {
  // check if gameRound has already ended /
  // if round results have already been determined
  // --> yes:
  //     do nothing
  // --> no:
  //     determine round results
  //     update collection
  //     create next gameRound
};
Meteor.methods({
  sendTurn: function(turnParams) {
    // find gameRound data
    // validate turnParams against gameRound
    // store turn (update "gameRound" collection object)
    // have all clients sent in turns for this round?
    //   yes --> call "endRound"
    //   no --> wait for other clients to send turns
  }
});
To implement a time limit, I want to wait for a certain time period (to give clients time to call sendTurn), and then determine the round result - but only if the round result has not already been determined in sendTurn.
How should I implement this time limit on the server?
My naive approach to implement this would be to call Meteor.setTimeout(endRound, <roundTimeLimit>).
Questions:
What about concurrency? I assume I should update collections synchronously (without callbacks) in sendTurn and endRound (?), but would this be enough to eliminate race conditions? (Reading the 4th comment on the accepted answer to this SO question about synchronous database operations also yielding, I doubt that)
In that regard, what does "per request" mean in the Meteor docs in my context (the function endRound called by a client method call and/or in server setTimeout)?
"In Meteor, your server code runs in a single thread per request, not in the asynchronous callback style typical of Node."
In a multi-server / clustered environment, (how) would this work?
Great question, and it's trickier than it looks. First off I'd like to point out that I've implemented a solution to this exact problem in the following repos:
https://github.com/ldworkin/meteor-prisoners-dilemma
https://github.com/HarvardEconCS/turkserver-meteor
To summarize, the problem basically has the following properties:
Each client sends in some action on each round (you call this sendTurn)
When all clients have sent in their actions, run endRound
Each round has a timer that, if it expires, automatically runs endRound anyway
endRound must execute exactly once per round regardless of what clients do
Now, consider the properties of Meteor that we have to deal with:
Each client can have exactly one outstanding method to the server at a time (unless this.unblock() is called inside a method). Following methods wait for the first.
All timeout and database operations on the server can yield to other fibers
This means that whenever a method call goes through a yielding operation, values in Node or the database can change. This can lead to the following potential race conditions (these are just the ones I've fixed, but there may be others):
In a 2-player game, for example, two clients call sendTurn at exactly same time. Both call a yielding operation to store the turn data. Both methods then check whether 2 players have sent in their turns, finding the affirmative, and then endRound gets run twice.
A player calls sendTurn right as the round times out. In that case, endRound is called by both the timeout and the player's method, resulting in it running twice again.
Incorrect fixes to the above problems can result in starvation where endRound never gets called.
You can approach this problem in several ways, either synchronizing in Node or in the database.
Since only one Fiber can actually change values in Node at a time, if you don't call a yielding operation you are guaranteed to avoid possible race conditions. So you can cache things like the turn states in memory instead of in the database. However, this requires that the caching is done correctly and doesn't carry over to clustered environments.
Move the endRound code outside of the method call itself, using something else to trigger it. This is the approach I've taken which ensures that only the timer or the final player triggers the end of the round, not both (see here for an implementation using observeChanges).
In a clustered environment you will have to synchronize using only the database, probably with conditional update operations and atomic operators. Something like the following:
var currentVal;
while(true) {
  currentVal = Foo.findOne(id).val; // yields
  if( Foo.update({_id: id, val: currentVal}, {$inc: {val: 1}}) > 0 ) {
    // Operation went as expected
    // (your code here, e.g. endRound)
    break;
  }
  else {
    // Race condition detected, try again
  }
}
The above approach is primitive and probably results in bad database performance under high loads; it also doesn't handle timers, but I'm sure with some thinking you can figure out how to extend it to work better.
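The "exactly once" requirement can be sketched with the same conditional-update idea: only the caller whose update actually changes the document goes on to end the round. The version below is a plain JavaScript simulation, assuming an in-memory Map in place of a Meteor collection; claimRoundEnd and the ended field are hypothetical names, and in a real app the claim would be a single conditional update whose return value says whether a document was modified.

```javascript
// In-memory stand-in for a collection of rounds (not a Meteor collection).
const rounds = new Map([['round1', { ended: false }]]);

// Hypothetical helper: flips `ended` only if it is still false and reports
// whether this caller was the one that flipped it.
function claimRoundEnd(roundId) {
  const round = rounds.get(roundId);
  if (!round || round.ended) return false;
  round.ended = true;
  return true;
}

let endRoundCalls = 0;

function endRound(roundId) {
  if (!claimRoundEnd(roundId)) return; // someone else already ended this round
  endRoundCalls += 1;
  // determine round results, create the next round, ...
}

// Both the timer and the last player's turn may try to end the round.
endRound('round1');
endRound('round1');
console.log(endRoundCalls); // prints 1
```

The simulation is race-free only because nothing in claimRoundEnd yields; against a database, the check and the write have to be one atomic operation.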
You may also want to see this timers code for some other ideas. I'm going to extend it to the full setting that you described once I have some time.

Too many callbacks issue

I know that writing async functions is recommended in Node.js. However, I feel it's not really necessary to write non-IO operations asynchronously, and doing so can make my code less convenient. For example:
//sync
function now(){
  return new Date().getTime();
}
console.log(now());
//async
function now(callback){
  callback(new Date().getTime());
}
now(function(time){
  console.log(time);
});
Does the sync method block the CPU in this case? Is the effect significant enough that I should use async instead?
Async style is necessary if the method being called can block for a long time waiting for IO. As the node.js event loop is single-threaded you want to yield to the event loop during an IO. If you didn't do this there could be only one IO outstanding at each point in time. That would lead to total non-scalability.
Using callbacks for CPU work accomplishes nothing. It does not unblock the event loop. In fact, for CPU work it is not possible to unblock the event loop. The CPU must be occupied for a certain amount of time and that is unavoidable. (Disregarding things like web workers here).
Callbacks are not good in themselves. You use them when you have to. They are a necessary consequence of the node.js event loop IO model.
That said, if you later plan on introducing IO into now you might eagerly use a callback style even if not strictly necessary. Changing from synchronous calls to callback-based calls later can be time-consuming because the callback style is viral.
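To illustrate the point that a callback alone does not make CPU work asynchronous, the sketch below records the order in which things run. The order array is a helper added for the demonstration; now is the callback version from the question.

```javascript
const order = [];

function now(callback) {
  // The callback is invoked immediately, on the same call stack.
  callback(new Date().getTime());
}

order.push('before');
now(function () {
  order.push('callback');
});
order.push('after');

console.log(order.join(' -> ')); // prints "before -> callback -> after"
```

The callback runs before 'after' is recorded, so nothing was deferred and the event loop was never yielded to.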
By adding a callback to a function's signature, the code communicates that something asynchronous might happen in this function and the function will call the callback with an error and/or result object.
If a function does nothing asynchronous and does not involve conditions where a (non-programming) error may occur, don't use a callback function signature; simply return the computation result.
Functions with callbacks are not very convenient to handle by the caller so avoid callbacks until you really need them.

"Spawn a thread"-like behaviour in node.js

I want to add some admin utilities to a little Web app, such as "Backup Database". The user will click on a button and the HTTP response will return immediately, although the potentially long-running process has been started in the background.
In Java this would probably be implemented by spawning an independent thread, in Scala by using an Actor. But what's an appropriate idiom in node.js? (code snippet appreciated)
I'm now re-reading the docs. This really does seem a node 101 question, but that's pretty much where I am on this... anyhow, to clarify, this is the basic scenario:
function onRequest(request, response) {
  doSomething();
  response.writeHead(202, headers);
  response.end("doing something");
}
function doSomething(){
  // long-running operation
}
I want the response to return immediately, leaving doSomething() running in the background.
Ok, given the single-thread model of node that doesn't seem possible without spawning another OS-level ChildProcess. My misunderstanding.
In my code what I need for backup is mostly I/O based, so node should handle that in a nice async fashion. What I think I'll do is shift the doSomething to after the response.end, see how that behaves.
As supertopi said, you could have a look at Child Process. But I think it will hurt the performance of your server if this happens a lot in sequence. In that case I think you should queue the jobs instead: have a look at asynchronous message queues to process your jobs offline (distributed). Two examples of message queues, just to name a few, are beanstalkd and gearman.
I don't see the problem. All you need to do is have doSomething() start an asynchronous operation. It'll return immediately, your onRequest will write the response back, and the client will get their "OK, I started" message.
function doSomething() {
  openDatabaseConnection(connectionString, function(conn) {
    // This is called some time later, once the connection is established.
    // Now you can tell the database to back itself up.
  });
}
doSomething won't just sit there until the database connection is established, or wait while you tell it to back up. It'll return right away, having registered a callback that will run later. Behind the scenes, your database library is probably creating some threads for you, to make the async work the way it should, but your code doesn't need to worry about it; you just return right away, send the response to the client right away, and the asynchronous code keeps running asynchronously.
(It's actually more work to make this run synchronously -- you would have to pass your response object into doSomething, and have doSomething do the response.end call inside the innermost callback, after the backup is done. Of course, that's not what you want to do here; you want to return immediately, which is exactly what your code will do.)
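Putting the pieces together, a minimal sketch of the request handler might look like this. Here doSomething is a hypothetical stand-in that uses a timer to simulate the asynchronous backup, and the onFinished callback and the header values are assumptions for the example.

```javascript
// Hypothetical stand-in for an asynchronous backup; a timer simulates the I/O.
function doSomething(onFinished) {
  setTimeout(function () {
    onFinished(null, 'backup complete');
  }, 10);
}

function onRequest(request, response) {
  // Starts the work and returns immediately; the callback runs later.
  doSomething(function (err, result) {
    if (err) console.error('backup failed:', err);
    else console.log(result);
  });
  // Sent right away, before the simulated backup has finished.
  response.writeHead(202, { 'Content-Type': 'text/plain' });
  response.end('doing something');
}
```

To try it against a real server, the handler could be passed to require('http').createServer(onRequest) and the server started with listen on a port of your choice.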
