Node - Run code after return statement - node.js

Is there any way to run a block of code after the return of a function in node js?
Something like this:
function f() {
// do stuff
// return result
// do more stuff
}

No, there is no way to do that in the way that you show. return exits from the containing function and statements immediately after the return statement do not execute (in fact they are dead code).
(Per your comments) If what you're really trying to do is to execute something "out of band" that the rest of the function (including the return value) does not depend upon, you could schedule that code to run later. For example, you could use setTimeout(), process.nextTick() or setImmediate().
function f() {
// do stuff
setTimeout(function() {
// do some stuff here that will execute out of band
// after this function returns
}, 0);
return someVal;
}
There are legit uses for things like this where you want to execute something soon, but you don't want it to get in the way of the current operation. So, you'd essentially like to queue it to execute when the current activity is done.
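As a minimal sketch of the scheduling idea above (the `log` array and the logged strings are made up for illustration), the following shows that the deferred callback only runs after `f` has returned and the caller's synchronous code has finished:

```javascript
// Hypothetical example: `log` just records the order in which things happen.
var log = [];

function f() {
  log.push("do stuff");
  // Queue the follow-up work; it cannot run until the current
  // synchronous code (including the caller) has finished.
  setImmediate(function () {
    log.push("out-of-band work");
    console.log(log.join(" -> "));
  });
  return "result";
}

var value = f();
log.push("caller got " + value);
```

The caller receives the return value first; the queued function is picked up by the event loop afterwards.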

The answer is no. After you return, the function stops executing. Consider using better flow control, such as Promises or async/await, to run the code.
The return statement stops execution of a function and returns the value of its expression, according to the following doc:
https://learn.microsoft.com/en-us/scripting/javascript/reference/return-statement-javascript

Related

Understanding try and catch in node.js

I'm new to coding and trying to understand why try...catch isn't supposed to work in node.js. I've created an example but, contrary to expectations, try...catch seems to be working. Where am I going wrong in my understanding? Please help.
function callback(error) { console.log(error); }
function A() {
var errorForCallback;
var y = parseInt("hardnut");
if (!y) {
throw new Error("boycott parsley");
errorForCallback = "boycott parsley for callback";
}
setTimeout(callback(errorForCallback),1000);
}
try {
A();
}
catch (e) {
console.log(e.message);
}
// Output: boycott parsley
// Synchronous behaviour, try...catch works
-----------Example re-framed to reflect my understanding after reading answer below----------
function callback(error) { console.log(error); }
function A() {
var errorForCallback;
setTimeout(function(){
var y = parseInt("hardnut");
if (!y) {
// throw new Error("boycott parsley");
errorForCallback = "boycott parsley for callback";
}
callback(errorForCallback);
}, 1000);
}
try {
A();
}
catch (e) {
console.log(e.message);
}
// Output: boycott parsley for callback
// Asynchronous behaviour
// And if "throw new Error" is uncommented,
// then node.js stops
The try-catch approach is something that works perfectly with synchronous code. Not all the programming that you do in Node.js is asynchronous and so in those pieces of synchronous code that you write you can perfectly use a try-catch approach. Asynchronous code, on the other hand, does not work that way.
For instance, if you had two function executions like this
var x = fooSync();
var y = barSync();
You would expect three things: first, that barSync() is executed only after fooSync() has finished; second, that x contains whatever value fooSync() returned before barSync() is executed; and third, that if fooSync() throws an exception, barSync() is never executed.
If you put a try-catch around fooSync(), you can guarantee that if fooSync() fails you will catch that exception.
Now, the conditions completely change if you have code like this:
var x = fooAsync();
var y = barSync();
Now imagine that when fooAsync() is invoked in this scenario, its work is not actually executed. It is just scheduled for execution later on. It is as if Node had a to-do list: at this moment it is busy running your current module, so when it finds this function invocation, instead of running the work, it simply adds it to the end of its to-do list.
So now you can no longer assume that fooAsync()'s work has completed before barSync() runs; as a matter of fact, it probably hasn't. You also don't control the context in which fooAsync()'s work is executed.
So, after scheduling fooAsync()'s work, Node immediately moves on to executing barSync(). So what can fooAsync() return? At this point nothing useful, because its work has not run yet, so x above is probably undefined. Putting try-catch around this piece of code would be pointless, because the work will not be executed in the context of this code. It will be executed later, when Node.js checks for pending tasks in its to-do list. It will be executed in the context of another routine that constantly checks this to-do list; this single thread of execution is called the event loop.
If fooAsync() fails, it will fail in the context of the thread running the event loop, and therefore the failure will not be caught by your try-catch statement. By that point, the module above has probably finished executing.
So that is why in asynchronous programming you can neither get a return value nor expect try-catch to work: your code is evaluated somewhere else, in a context different from the one where you think you invoked it. It is as if you had done something like this instead:
scheduleForExecutionLaterWhenYouHaveTime(foo);
var y = barSync();
And that's the reason why asynchronous programming requires other techniques to determine what happened to your code when it finally runs. Typically you are notified through a callback: you define a callback function which is called back with the details of what failed (if anything) or what your function produced, and then you can react to that.
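The callback technique described above can be sketched as follows. This is only an illustration: `fooAsync` here is a hypothetical function that defers its work with `setImmediate` and reports the outcome through an error-first callback, and the `events` array exists only to record the order of events.

```javascript
// Hypothetical async function: reports its outcome through an
// error-first callback instead of returning a value or throwing.
function fooAsync(callback) {
  setImmediate(function () {
    var y = parseInt("hardnut", 10);
    if (Number.isNaN(y)) {
      callback(new Error("boycott parsley"));
      return;
    }
    callback(null, y);
  });
}

var events = [];

try {
  fooAsync(function (err, value) {
    // Runs later, from the event loop, after the surrounding try block has exited.
    events.push(err ? "callback error: " + err.message : "callback value: " + value);
    console.log(events);
  });
  events.push("fooAsync returned");
} catch (e) {
  // Never reached: nothing throws while the try block is on the stack.
  events.push("caught: " + e.message);
}
```

The failure reaches the caller as the first argument of the callback, which is the only place it can be handled.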

Execute a function in parallel while completing execution of the rest of the code

I have a code snippet in nodejs like this (foo() will be called every 2 seconds):
function foo()
{
while (count < 10)
{
doSomething();
count++;
}
}
function doSomething()
{
...
}
The limitation is that foo() has no callback.
How can I make the while loop run and foo() complete without waiting for doSomething() to complete (call doSomething() and proceed), with doSomething() executing in parallel?
I think what you want is:
function foo()
{
while (count < 10)
{
process.nextTick(doSomething);
count++;
}
}
process.nextTick schedules doSomething to run as soon as the current operation completes, before the event loop continues. So instead of switching immediately to doSomething, this code just schedules the execution and completes foo first.
You may also try setTimeout(doSomething, 0) and setImmediate(doSomething). They allow I/O callbacks to run before doSomething is executed.
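A minimal sketch of this behaviour, assuming a trivial stand-in for doSomething that only counts its invocations (the `started` counter is made up for illustration):

```javascript
var count = 0;
var started = 0; // how many times doSomething has actually run

function doSomething() {
  started++;
}

function foo() {
  while (count < 10) {
    process.nextTick(doSomething); // only queues the call
    count++;
  }
}

foo();
console.log("after foo: count=" + count + ", started=" + started);

setImmediate(function () {
  console.log("later: started=" + started);
});
```

Note that none of this is parallel: the ten queued calls still run one after another on the same thread, just after foo has finished.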
Passing arguments to doSomething
If you want to pass some parameters to doSomething, it's best to ensure they are captured and won't change before doSomething is executed:
setTimeout(doSomething.bind(null, foo, bar), 0);
In this case doSomething will be called with the correct arguments even if foo and bar are later reassigned. But this won't protect you if foo is an object and you change one of its properties, because bind captures the reference, not a copy.
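The difference between reassigning a variable and mutating an object can be sketched like this. The bound functions are called directly rather than through setTimeout to keep the example short (the capture semantics are the same), and the `seen` array and the shallow copy via `Object.assign` are additions for illustration:

```javascript
var seen = [];

function doSomething(a, b) {
  seen.push([a, b.name]);
}

var foo = 1;
var bar = { name: "original" };

// bind captures the current value of foo and the current reference held in bar.
var bound = doSomething.bind(null, foo, bar);
// Taking a shallow copy at scheduling time also protects against later mutation.
var boundWithCopy = doSomething.bind(null, foo, Object.assign({}, bar));

foo = 2;              // reassignment: not visible to the bound calls
bar.name = "mutated"; // mutation of the shared object: visible through `bound`

bound();         // records [1, "mutated"]
boundWithCopy(); // records [1, "original"]
console.log(seen);
```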
What are the alternatives?
If you want doSomething to be executed in parallel (not just asynchronously, but actually in parallel), then you may be interested in a job-processing solution. I recommend looking at kickq:
var kickq = require('kickq');
kickq.process('some_job', function (jobItem, data, cb) {
doSomething(data);
cb();
});
// ...
function foo()
{
while (count < 10)
{
kickq.create('some_job', data);
count ++;
}
}
kickq.process will create a separate process for processing your jobs. So, kickq.create will just register the job to be processed.
kickq uses redis to queue jobs and it won't work without it.
Using node.js built-in modules
Another alternative is building your own job-processor using Child Process. The resulting code may look something like this:
var fork = require('child_process').fork,
child = fork(__dirname + '/do-something.js');
// ...
function foo()
{
while (count < 10)
{
child.send(data);
count ++;
}
}
do-something.js here is a separate .js file with doSomething logic:
process.on('message', doSomething);
The actual code may be more complicated.
Things you should be aware of
Node.js is single-threaded, so it executes only one function at a time. A single Node.js process also can't utilize more than one CPU for running your JavaScript.
Node.js is asynchronous, so it's capable of handling multiple operations at once by switching between them. It's really efficient when dealing with functions with lots of I/O calls, because it never blocks. So, when one function waits for the response from a DB, another function is executed. But node.js is not a good choice for blocking tasks with heavy CPU utilization.
It's possible to do real parallel calculations in node.js using modules like child_process and cluster. child_process allows you to start a new node.js process. It also creates a communication channel between parent and child processes. cluster allows you to run a cluster of identical processes. It's really handy when you're dealing with http requests, because cluster can distribute them between workers. So, it's possible to create a cluster of workers processing your data in parallel, though generally node.js is single-threaded.

Run NodeJS event loop / wait for child process to finish

I first tried a general description of the problem, then some more detail why the usual approaches don't work. If you would like to read these abstracted explanations go on. In the end I explain the greater problem and the specific application, so if you would rather read that, jump to "Actual application".
I am using a node.js child process to do some computationally intensive work. The parent process does its work, but at some point in the execution it reaches a point where it must have the information from the child process before continuing. Therefore, I am looking for a way to wait for the child process to finish.
My current setup looks somewhat like this:
importantDataCalculator = fork("./runtime");
importantDataCalculator.on("message", function (msg) {
if (msg.type === "result") {
importantData = msg.data;
} else if (msg.type === "error") {
importantData = null;
} else {
throw new Error("Unknown message from dataGenerator!");
}
});
and somewhere else
function getImportantData() {
while (importantData === undefined) {
// wait for the importantDataGenerator to finish
}
if (importantData === null) {
throw new Error("Data could not be generated.");
} else {
// we should have a proper data now
return importantData;
}
}
So when the parent process starts, it executes the first bit of code, spawning a child process to calculate the data, and goes on doing its own bit of work. When the time comes that it needs the result from the child process to continue, it calls getImportantData(). So the idea is that getImportantData() blocks until the data is calculated.
However, the way I used doesn't work. I think this is because the while-loop prevents the event loop from executing. And since the event loop does not execute, no message from the child process can be received, and thus the condition of the while-loop cannot change, making it an infinite loop.
Of course, I don't really want to use this kind of while-loop. What I would rather do is tell node.js "execute one iteration of the event loop, then get back to me". I would do this repeatedly, until the data I need was received, and then continue the execution where I left off by returning from the getter.
I realize that this poses the danger of reentering the same function several times, but the module I want to use this in does almost nothing on the event loop except for waiting for this message from the child process and sending out other messages reporting its progress, so that shouldn't be a problem.
Is there a way to execute just one iteration of the event loop in Node.js? Or is there another way to achieve something similar? Or is there a completely different approach to achieve what I'm trying to do here?
The only solution I could think of so far is to change the calculation in such a way that I introduce yet another process. In this scenario, there would be the process calculating the important data, a process calculating the bits of data for which the important data is not needed and a parent process for these two, which just waits for data from the two child-processes and combines the pieces when they arrive. Since it does not have to do any computationally intensive work itself, it can just wait for events from the event loop (=messages) and react to them, forwarding the combined data as necessary and storing pieces of data that cannot be combined yet.
However this introduces yet another process and even more inter-process communication, which introduces more overhead, which I would like to avoid.
Edit
I see that more detail is needed.
The parent process (let's call it process 1) is itself a process spawned by another process (process 0) to do some computationally intensive work. Actually, it just executes some code over which I don't have control, so I cannot make it work asynchronously. What I can do (and have done) is make the executed code regularly call a function to report its progress and provide partial results. This progress report is then sent back to the original process via IPC.
But in rare cases the partial results are not correct, so they have to be modified. To do so I need some data I can calculate independently from the normal calculation. However, this calculation could take several seconds; thus, I start another process (process 2) to do this calculation and provide the result to process 1 via an IPC message. Now process 1 and 2 are happily calculating their stuff, and hopefully the corrective data calculated by process 2 is finished before process 1 needs it. But sometimes one of the early results of process 1 needs to be corrected, and in that case I have to wait for process 2 to finish its calculation. Blocking the event loop of process 1 is theoretically not a problem, since the main process (process 0) would not be affected by it. The only problem is that by preventing the further execution of code in process 1 I am also blocking the event loop, which prevents it from ever receiving the result from process 2.
So I need to somehow pause the further execution of code in process 1 without blocking the event loop. I was hoping that there was a call like process.runEventLoopIteration that executes an iteration of the event loop and then returns.
I would then change the code like this:
function getImportantData() {
while (importantData === undefined) {
process.runEventLoopIteration();
}
if (importantData === null) {
throw new Error("Data could not be generated.");
} else {
// we should have a proper data now
return importantData;
}
}
thus executing the event loop until I have received the necessary data but NOT continuing the execution of the code that called getImportantData().
Basically what I'm doing in process 1 is this:
function callback(partialDataMessage) {
if (partialDataMessage.needsCorrection) {
getImportantData();
// use data to correct message
process.send(correctedMessage); // send corrected result to main process
} else {
process.send(partialDataMessage); // send unmodified result to main process
}
}
function executeCode(code) {
run(code, callback); // the callback will be called from time to time when the code produces new data
// this call is synchronous, run is blocking until the calculation is finished
// so if we reach this point we are done
// the only way to pause the execution of the code is to NOT return from the callback
}
Actual application/implementation/problem
I need this behaviour for the following application. If you have a better approach to achieve this feel free to propose it.
I want to execute arbitrary code and be notified about what variables it changes, what functions are called, what exceptions occur etc. I also need the location of these events in the code to be able to display the gathered information in the UI next to the original code.
To achieve this, I instrument the code and insert callbacks into it. I then execute the code, wrapping the execution in a try-catch block. Whenever the callback is called with some data about the execution (e.g. a variable change) I send a message to the main process telling it about the change. This way, the user is notified about the execution of the code, while it is running. The location information for the events generated by these callbacks is added to the callback call during the instrumentation, so that is not a problem.
The problem appears when an exception occurs. I also want to notify the user about exceptions in the tested code. Therefore, I wrapped the execution of the code in a try-catch, and any exceptions that get out of the execution are caught and sent to the user interface. But the location of the errors is not correct. An Error object created by node.js has a complete call stack, so it knows where it occurred. But this location is relative to the instrumented code, so I cannot use this location information as is to display the error next to the original code. I need to transform this location in the instrumented code into a location in the original code. To do so, after instrumenting the code, I calculate a source map to map locations in the instrumented code to locations in the original code. However, this calculation might take several seconds. So, I figured, I would start a child process to calculate the source map while the execution of the instrumented code is already started. Then, when an exception occurs, I check whether the source map has already been calculated, and if it hasn't, I wait for the calculation to finish to be able to correct the location.
Since the code to be executed and watched can be completely arbitrary I cannot trivially rewrite it to be asynchronous. I only know that it calls the provided callback, because I instrumented the code to do so. I also cannot just store the message and return to continue the execution of the code, checking back during the next call whether the source map has been finished, because continuing the execution of the code would also block the event-loop, preventing the calculated source map from ever being received in the execution process. Or if it is received, then only after the code to execute has completely finished, which could be quite late or never (if the code to execute contains an infinite loop). But before I receive the sourceMap I cannot send further updates about the execution state. Combined, this means I would only be able to send the corrected progress messages after the code to execute has finished (which might be never) which completely defeats the purpose of the program (to enable the programmer to watch what the code does, while it executes).
Temporarily surrendering control to the event loop would solve this problem. However, that does not seem to be possible. The other idea I have is to introduce a third process which controls both the execution process and the sourceMapGeneration process. It receives progress messages from the execution process and if any of the messages needs correction it waits for the sourceMapGeneration process. Since the processes are independent, the controlling process can store the received messages and wait for the sourceMapGeneration process while the execution process continues executing, and as soon as it receives the source map, it corrects the messages and sends all of them off.
However, this would not only require yet another process (overhead), it also means I have to transfer the code once more between processes, and since the code can have thousands of lines, that in itself can take some time, so I would like to move it around as little as possible.
I hope this explains, why I cannot and didn't use the usual "asynchronous callback" approach.
Adding a third ( :) ) solution to your problem: now that you have clarified what behavior you seek, I suggest using Fibers.
Fibers let you do co-routines in nodejs. Coroutines are functions that allow multiple entry/exit points. This means you will be able to yield control and resume it as you please.
Here is a sleep function from the official documentation that does exactly that, sleep for a given amount of time and perform actions.
var Fiber = require('fibers');
function sleep(ms) {
var fiber = Fiber.current;
setTimeout(function() {
fiber.run();
}, ms);
Fiber.yield();
}
Fiber(function() {
console.log('wait... ' + new Date);
sleep(1000);
console.log('ok... ' + new Date);
}).run();
console.log('back in main');
You can place the code that does the waiting for the resource in a function, causing it to yield and then run again when the task is done.
For example, adapting your example from the question:
var pausedExecution, importantData;
function getImportantData() {
while (importantData === undefined) {
pausedExecution = Fiber.current;
Fiber.yield();
pausedExecution = undefined;
}
if (importantData === null) {
throw new Error("Data could not be generated.");
} else {
// we should have proper data now
return importantData;
}
}
function callback(partialDataMessage) {
if (partialDataMessage.needsCorrection) {
var theData = getImportantData();
// use data to correct message
process.send(correctedMessage); // send corrected result to main process
} else {
process.send(partialDataMessage); // send unmodified result to main process
}
}
function executeCode(code) {
// setup child process to calculate the data
importantDataCalculator = fork("./runtime");
importantDataCalculator.on("message", function (msg) {
if (msg.type === "result") {
importantData = msg.data;
} else if (msg.type === "error") {
importantData = null;
} else {
throw new Error("Unknown message from dataGenerator!");
}
if (pausedExecution) {
// execution is waiting for the data
pausedExecution.run();
}
});
// wrap the execution of the code in a Fiber, so it can be paused
Fiber(function () {
runCodeWithCallback(code, callback); // the callback will be called from time to time when the code produces new data
// this callback is synchronous and blocking,
// but it will yield control to the event loop if it has to wait for the child-process to finish
}).run();
}
Good luck! I always say it is better to solve one problem in 3 ways than to solve 3 problems the same way. I'm glad we were able to work out something that worked for you. Admittedly, this was a pretty interesting question.
The rule of asynchronous programming is, once you've entered asynchronous code, you must continue to use asynchronous code. While you can continue to call the function over and over via setImmediate or something of the sort, you still have the issue that you're trying to return from an asynchronous process.
Without knowing more about your program, I can't tell you exactly how you should structure it, but by and large the way to "return" data from a process that involves asynchronous code is to pass in a callback; perhaps this will put you on the right track:
function getImportantData(callback) {
importantDataCalculator = fork("./runtime");
importantDataCalculator.on("message", function (msg) {
if (msg.type === "result") {
callback(null, msg.data);
} else if (msg.type === "error") {
callback(new Error("Data could not be generated."));
} else {
callback(new Error("Unknown message from sourceMapGenerator!"));
}
});
}
You would then use this function like this:
getImportantData(function(error, data) {
if (error) {
// handle the error somehow
} else {
// `data` is the data from the forked process
}
});
I talk about this in a bit more detail in one of my screencasts, Thinking Asynchronously.
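On newer Node.js versions the same callback-style function can also be consumed with promises and async/await. The following is only a sketch under assumptions: `getImportantData` is replaced by a hypothetical stand-in that delivers made-up data from a timer instead of a forked process, and `main` and `received` are names invented for the example. `util.promisify` is the standard Node.js helper for wrapping error-first-callback functions.

```javascript
var util = require("util");

// Hypothetical stand-in for the callback-style getImportantData above:
// it delivers its result later through an error-first callback.
function getImportantData(callback) {
  setTimeout(function () {
    callback(null, { answer: 42 });
  }, 10);
}

// Wrap the error-first-callback function in one that returns a Promise.
var getImportantDataAsync = util.promisify(getImportantData);

var received; // filled in once the data has arrived

async function main() {
  // Looks sequential, but `await` yields to the event loop instead of blocking it.
  received = await getImportantDataAsync();
  console.log("got data:", received);
}

main().catch(function (err) {
  console.error("failed:", err.message);
});
```

The code after `await` still runs asynchronously; the caller of `main` does not block, so the rule about staying asynchronous continues to apply.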
What you are running into is a very common scenario that skilled programmers who are starting with nodejs often struggle with.
You're correct. You can't do this the way you are attempting (loop).
The main process in node.js is single threaded and you are blocking the event loop.
The simplest way to resolve this is something like:
function getImportantData() {
if(importantData === undefined){ // not set yet
setImmediate(getImportantData); // try again on the next event loop cycle
return; //stop this attempt
}
if (importantData === null) {
throw new Error("Data could not be generated.");
} else {
// we should have a proper data now
return importantData;
}
}
What we are doing here is having the function retry on the next iteration of the event loop, using setImmediate.
This introduces a new problem, though: your function returns a value. Since the data will not be ready yet, the value you return is undefined. So you have to code reactively: you need to tell your code what to do when the data arrives.
This is typically done in node with a callback:
function getImportantData(err,whenDone) {
if(importantData === undefined){ // not set yet
setImmediate(getImportantData.bind(null, err, whenDone)); // try again on the next event loop cycle, passing both callbacks along
return; //stop this attempt
}
if (importantData === null) {
err("Data could not be generated.");
} else {
// we should have a proper data now
whenDone(importantData);
}
}
This can be used in the following way
getImportantData(function(err){
throw new Error(err); // error handling function callback
}, function(data){ //this is whenDone in our case
//perform actions on the important data
})
Your question (updated) is very interesting, it appears to be closely related to a problem I had with asynchronously catching exceptions. (Also Brandon and Ihad an interesting discussion with me about it! It's a small world)
See this question on how to catch exceptions asynchronously. The key concept is that you can use (assuming nodejs 0.8+) nodejs domains to constrain the scope of an exception.
This will allow you to easily get the location of the exception since you can surround asynchronous blocks with atry/catch. I think this should solve the bigger issue here.
You can find the relevant code in the linked question. The usage is something like:
atry(function() {
setTimeout(function(){
throw "something";
},1000);
}).catch(function(err){
console.log("caught "+err);
});
Since you have access to the scope of atry you can get the stack trace there which would let you skip the more complicated source-map usage.
Good luck!

Convert asynchronous/callback method to blocking/synchronous method

Is it possible to convert an asynchronous/callback-based method in node to a blocking/synchronous method?
I'm curious, more from a theoretical POV, than a "I have a problem to solve" POV.
I see how callback methods can be converted to values, via Q and the like, but calling Q.done() doesn't block execution.
The node-sync module can help you do that. But please be careful, this is not the node.js way.
To turn an asynchronous function into a synchronous one in a 'multi-threaded environment', we need to set up a loop that checks for the result, which causes blocking.
Here’s the example code in JS:
function somethingSync(args){
var ret; //the result-holding variable
//doing something async here...
somethingAsync(args,function(result){
ret = result;
});
while(ret === undefined){} //wait for the result until it's available,cause the blocking
return ret;
}
OR
synchronize.js also helps.
While I would not recommend it, this can easily be done using some sort of busy wait. For instance:
var flag = false;
asyncFunction( function () { //This is a callback
flag = true;
})
while (!flag) {}
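The while loop spins until the flag is set, thus blocking execution. Note that this only terminates if the callback can run while the loop is spinning; in single-threaded Node.js, a callback that is delivered through the event loop never gets that chance, so the loop spins forever.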
As you can imagine this would make your code very messy, so if you are going to do this (which I wouldn't recommend) you should make some sort of helper function to wrap your async function; similar to Underscore.js's Function functions, such as throttle. You can see exactly how these work by looking at the annotated source.
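To see why a busy wait and the Node.js event loop do not mix, here is a small sketch. It assumes a hypothetical `asyncFunction` whose callback is delivered by a 0 ms timer, and it adds a 50 ms deadline (not part of the snippets above) so that the loop terminates:

```javascript
var flag = false;

// Stand-in for an asynchronous function whose callback is delivered
// through the event loop (here via a 0 ms timer).
function asyncFunction(callback) {
  setTimeout(callback, 0);
}

asyncFunction(function () {
  flag = true;
});

// Busy-wait, but give up after 50 ms so the sketch terminates.
var deadline = Date.now() + 50;
while (!flag && Date.now() < deadline) {}

var flagAfterLoop = flag;
console.log("flag after busy wait:", flagAfterLoop);

setTimeout(function () {
  console.log("flag once the event loop ran:", flag);
}, 10);
```

The timer was due long before the loop gave up, yet its callback could not run until the synchronous code had finished. Without the deadline the loop would never exit.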

Sequence of code execution in Node.js app

I have always wondered about this and have never found a convincing answer.
Please consider the following case:
var toAddress = '';
if(j==1)
{
toAddress="abc#mydomain.com";
}
else
{
toAddress="xyz#mydomain.com";
}
sendAlertEmail(toAddress);
Can I be certain that by the time my sendAlertEmail() function is called, I will have 'toAddress' populated?
For code like the sample you provided:
var toAddress = '';
if(j==1)
{
toAddress="abc#mydomain.com";
}
else
{
toAddress="xyz#mydomain.com";
}
sendAlertEmail(toAddress);
You can definitely be certain that it is strictly sequential. That is to say that the value of toAddress is either "abc#mydomain.com" or "xyz#mydomain.com".
But, for code like the following:
var toAddress = '';
doSomething(function(){
if(j==1)
{
toAddress="abc#mydomain.com";
}
else
{
toAddress="xyz#mydomain.com";
}
});
sendAlertEmail(toAddress);
Then it depends on whether the function doSomething is asynchronous or not. The best place to find out is the documentation. The second best is looking at the implementation.
If doSomething is not asynchronous then the code execution is basically sequential and you can definitely be certain that toAddress is properly populated.
However, if doSomething is asynchronous then you can generally be certain that the code execution is NOT sequential, since that is one of the basic behaviors of asynchronous functions: they return immediately and execute the functions passed to them at a later time.
Not all functions that operate on functions are asynchronous. An example of a synchronous function is the forEach method of arrays. But callback-style asynchronous functions always accept functions as arguments, because that is the only way for them to have some piece of code executed at the end of the asynchronous operation. So whenever you see a function taking functions as arguments, you should check whether it's asynchronous or not.
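A short sketch of the distinction, using forEach as the synchronous case and setImmediate as an example of an asynchronous one (the `order` array is only there to record what ran when):

```javascript
var order = [];

// Synchronous: forEach runs the function for every element before it returns.
[1, 2].forEach(function (n) {
  order.push("forEach " + n);
});
order.push("after forEach");

// Asynchronous: setImmediate returns at once and runs the function later.
setImmediate(function () {
  order.push("setImmediate callback");
  console.log(order);
});
order.push("after setImmediate");
```

Both calls take a function as an argument, but only the second one defers it, which is why the documentation (or the implementation) is the place to check.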
Node.js is single threaded (or at least the JS execution is) so since all the above code is synchronous and lined up to all occur during the same tick it will run in order and thus toAddress must be populated.
Things get complicated once you introduce an asynchronous function. In the asynchronous case it is possible for a variable to change between lines, since ticks occur between them.
To clarify: during each tick the code is simply evaluated from the top of the execution to the bottom. During the first tick the scope of execution is the whole file, but after that it's callbacks and handlers.
The code that you wrote is too simple to show the asynchronous behavior. Take a look at this code:
var toAddress = 'abc#mydomain.com';
if(j==1)
{ func1(toAddress); }
else
{ func2(toAddress); }
sendAlertEmail(toAddress);
If func1 or func2 starts asynchronous work, there is no guarantee that this work has finished by the time sendAlertEmail executes. The calls themselves are made in order, but an asynchronous function in node returns immediately and execution moves on to the next statement. If you want to make sure the steps complete sequentially, use callbacks or a library like async.

Resources