NodeJS asynchronous/non-blocking io basic


I'm trying to get my head around how Node.js handles I/O asynchronously, and the simple snippet below leaves me with questions.
// Just to simulate an io (webservice call).
var performRiskCheckViaWebservice = function(personPassportNumber, callback) {
console.log("Risk check called for personPassportNumber: "+personPassportNumber);
setTimeout(callback(personPassportNumber, "OK"), 5000);
}
function assessRisk(passportNumber) {
performRiskCheckViaWebservice(passportNumber, function(passportNumber, status){
console.log("Risk status of "+passportNumber+" is: "+status);
})
}
assessRisk("1");
assessRisk("2");
assessRisk("3");
In the above simple code snippet, my expectation is to see:
Risk check called for personPassportNumber: 1
Risk check called for personPassportNumber: 2
Risk check called for personPassportNumber: 3
And 5 seconds later:
Risk status of 1 is: OK
Risk status of 2 is: OK
Risk status of 3 is: OK
But the actual output is:
Risk check called for personPassportNumber: 1
Risk status of 1 is: OK
Risk check called for personPassportNumber: 2
Risk status of 2 is: OK
Risk check called for personPassportNumber: 3
Risk status of 3 is: OK
5 seconds later, the program halts.
What's wrong in my understanding?

As it turns out, I have to wrap the callback with an anonymous function inside the setTimeout(...). Here is the working version:
// Just to simulate an io (webservice call).
var performRiskCheckViaWebservice = function(personPassportNumber, callback) {
console.log("Risk check called for personPassportNumber: "+personPassportNumber);
setTimeout(function() {
callback(personPassportNumber, "OK")
}, 5000);
}
function assessRisk(passportNumber) {
performRiskCheckViaWebservice(passportNumber, function(personPassportNumber, status){
console.log("Risk status of "+personPassportNumber+" is: "+status);
})
}
assessRisk("1");
assessRisk("2");
assessRisk("3");

Here is the synchronous sequence of calls your code makes:
sync assessRisk
sync performRiskCheckViaWebservice
sync console.log Risk check
sync callback <== problem here
sync console.log Risk status
sync setTimeout 5000
So what you observe is normal: callback(personPassportNumber, "OK") is invoked immediately, and its return value (undefined), not the function itself, is what gets passed to setTimeout.
You can replace
setTimeout(callback(personPassportNumber, "OK"), 5000);
with
setTimeout(callback.bind(null, personPassportNumber, "OK"), 5000);
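The difference between calling a function and passing it to setTimeout can be sketched as follows. This is a minimal illustration, not the asker's code: demo and the event labels are hypothetical names. It shows three equivalent ways to defer a call with arguments (extra setTimeout arguments, bind, and a wrapper function) next to a call that runs immediately.

```javascript
// Records the order in which things happen so the difference is visible.
function demo(delayMs) {
  const events = [];
  const callback = (label) => { events.push(label); };

  // Writing callback(...) runs the function right now. In the question, the
  // return value of that call (undefined) was what setTimeout received.
  callback('called immediately');

  // Three ways to hand setTimeout a function to call later:
  setTimeout(callback, delayMs, 'extra args');      // extra arguments are forwarded
  setTimeout(callback.bind(null, 'bind'), delayMs); // bind pre-fills the argument
  setTimeout(() => callback('wrapper'), delayMs);   // anonymous wrapper function

  events.push('end of synchronous code');

  // Resolve with the recorded order once all timers above have fired.
  return new Promise((resolve) => setTimeout(() => resolve(events), delayMs + 20));
}

demo(50).then((events) => console.log(events.join('\n')));
```

The deferred calls are recorded only after the synchronous code has finished, which is the ordering the question expected.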

Related

NodeJS SetInterval with Async/Await

I'm using the following code with a websocket to scrape data from a DB and return it to users. It returns chat messages, user status and other things. The code works as expected (in testing) but I have a couple of questions.
setInterval(async () => {
if (connectedUserIDs.length > 0) {
logger.info("short loop...")
await eventsHelper.extractEvents(db, connectedUserIDs, connectedSockets, ws)
}
}, 5000)
Question 1. Will setInterval wait for the "await", or will it just fire every 5 seconds? I assume it will fire every 5 seconds regardless.
If that is the case:
Question 2. Is there a way to repeat a task like the one above but ensure it only re-runs once the previous run has completed, with a minimum gap of 5 seconds? I want to avoid a situation where I get multiple queries running at the same time. Maybe the setInterval could be cancelled and restarted each time...
Thank you
Adam

Question 1. Will setInterval wait for the "await", or will it just fire every 5 seconds? I assume it will fire every 5 seconds regardless.
Yes, it fires every 5 seconds regardless of whether the previous run has finished.
Question 2. Is there a way to repeat a task like the one above but ensure it only re-runs once the previous run has completed, with a minimum gap of 5 seconds?
The easiest way to prevent collisions is to schedule the next run after the await:
let timeoutId;
const extractEvents = async () => {
if (connectedUserIDs.length > 0) {
logger.info("short loop...");
await eventsHelper.extractEvents(db, connectedUserIDs, connectedSockets, ws);
}
timeoutId = setTimeout(extractEvents, 5000);
};
timeoutId = setTimeout(extractEvents, 5000);
The timeoutId is stored so that you can cancel the loop later with clearTimeout(timeoutId).
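The same self-scheduling idea can be wrapped in a small reusable helper. This is a sketch under assumptions: runRepeatedly and stop are hypothetical names, and the error handling (log and keep looping) is one possible choice rather than something the original answer prescribes.

```javascript
// Runs `task`, waits for it to settle, then schedules the next run `delayMs` later.
// Because the next timer is only set after the await, runs can never overlap.
// Returns a function that stops the loop.
function runRepeatedly(task, delayMs) {
  let timeoutId = null;
  let stopped = false;

  const tick = async () => {
    try {
      await task();
    } catch (err) {
      console.error('task failed:', err); // keep looping even if one run fails
    }
    if (!stopped) {
      timeoutId = setTimeout(tick, delayMs);
    }
  };

  timeoutId = setTimeout(tick, delayMs);

  return function stop() {
    stopped = true;          // prevents a run in progress from rescheduling
    clearTimeout(timeoutId); // cancels a run that is waiting to start
  };
}
```

Usage would look like const stop = runRepeatedly(() => eventsHelper.extractEvents(db, connectedUserIDs, connectedSockets, ws), 5000); and later stop(); when the socket closes.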

Understanding The NodeJS Internal execution

I'm trying to understand what happens under the hood
when I execute this Node.js code:
http.createServer(function (request, response) {
response.writeHead(200, {'Content-Type': 'text/plain'});
response.end('Hello World\n');
}).listen(8081);
I have 2 cases based on the above code:
1. Modify the code to do some blocking work on the last line of
the http.createServer callback function:
http.createServer(function (request, response) {
response.writeHead(200, {'Content-Type': 'text/plain'});
response.end('Hello World\n');
sleep(2000); //sleep 2 seconds after handling the first request
}).listen(8081);
//found this code on the web, to simulate php like sleep function
function sleep(milliseconds)
{
var start = new Date().getTime();
for (var i = 0; i < 1e7; i++)
{
if ((new Date().getTime() - start) > milliseconds)
{
break;
}
}
}
I use this simple bash loop to make two requests to the Node.js server:
for i in {1..2}; do curl http://localhost:8081; done
Result on the client console:
Hello world #first iteration
After two seconds the next hello world is printed on the client console:
Hello world #second iteration
On the first iteration, the server responds to the request immediately.
On the second iteration, the server is blocked and returns the response after two seconds. This is because the sleep
function blocks the server after it has handled the first request.
2. Modify the code to use setTimeout instead of sleep on the last line of the http.createServer callback function:
http.createServer(function (request, response) {
response.writeHead(200, {'Content-Type': 'text/plain'});
response.end('Hello World\n');
setTimeout(function(){console.log("Done");}, 2000);
}).listen(8081);
Again I'm using this simple bash loop to make the requests:
for i in {1..2}; do curl http://localhost:8081; done
The result is that the response is returned to both requests immediately,
and the Hello world message is also printed immediately on the console.
This is because I'm using the setTimeout function, which is itself asynchronous.
I have questions about what happens here:
1. Am I right in saying that it is the programmer's responsibility to make asynchronous calls in Node.js code, so that the Node.js internals can continue to execute other code or requests without blocking?
2. Node.js internally uses the Google V8 engine to execute the JavaScript code and libuv for the asynchronous work.
The event loop is responsible for checking whether any event with an associated callback is waiting in the event queue and whether any code remains on the call stack. If the event queue is not empty and the call stack is empty, the callback from the event queue is pushed onto the stack, which causes it to be executed.
The questions are:
A. When doing async work in Node.js, is the execution of the callback function separated (by using the libuv thread pool) from the execution of the code in the Node.js main thread?
B. How does the event loop handle connections if multiple connections arrive at the server at the same time?
I would highly appreciate any answers and will try to learn from them.

Regarding a few of your questions:
It is the responsibility for the programmer to make asynchronous call
in NodeJS code so that the NodeJS internal can continue to execute
other code or request without blocking.
Correct! Note that it is possible (if required) to execute synchronous, blocking code. As an example, see all the 'Sync' functions of the fs module, like fs.accessSync.
When doing Async thing in NodeJS, Is that execution of callback
function is separated (by using libuv thread pool) from the execution
of the code in NodeJS main thread
No. Your JavaScript, including every callback, runs on a single thread (the main thread). libuv may use its thread pool to perform some operations, such as file system work, in the background, but the callback itself is always executed on the main thread by the event loop. When a callback is triggered, it is the only JavaScript code being executed. The asynchronous design of Node.js is accomplished by the event loop, as you mentioned.
How The Event Loop handle the connections if there is multiple
connection arrive at the same time to the server?
There is no 'same time' really. One comes first, and the rest are queued. Assuming you have no blocking code, they should be handled quickly (you can and should load test your server to see exactly how quickly).
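The effect of blocking code on queued callbacks can be observed with a small experiment. This is a sketch, not part of the original answer: blockFor and measureTimerDelay are hypothetical helper names. A timer is scheduled for a short delay and then the thread is kept busy, so the timer callback can only run once the busy work returns.

```javascript
// Busy-wait for roughly `ms` milliseconds; this blocks the event loop on purpose.
function blockFor(ms) {
  const start = Date.now();
  while (Date.now() - start < ms) {
    // spin: nothing else can run on this thread meanwhile
  }
}

// Schedule a timer for `timerMs`, then block the thread for `blockMs`.
// Resolves with the number of milliseconds after which the timer actually fired.
function measureTimerDelay(timerMs, blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), timerMs);
    blockFor(blockMs); // the timer callback cannot run until this returns
  });
}

measureTimerDelay(10, 200).then((actualMs) => {
  console.log('timer set for 10 ms fired after ' + actualMs + ' ms');
});
```

The reported delay is at least as long as the blocking time, because the callback has to wait for the single JavaScript thread to become free.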

First of all, I don't know what sleep does.
Basically, the event loop keeps checking which queued callbacks are ready to run. When you call setTimeout, it executes console.log("Done") after 2 seconds. Did you ask it to stop the overall execution of the function? No. You only asked that particular request to do something after sending the response. You did not ask it to stop the function or to block events. You can read more about threads here. The program is asynchronous by itself.
Now, if you want to delay the response itself, you can move all of the actions inside setTimeout:
setTimeout(function() {
response.writeHead(200, {'Content-Type': 'text/plain'});
response.end('Hello World\n');
console.log("Done");
}, 2000);
Does this stop other requests from being handled? No. If you fire 2 requests simultaneously, you will get 2 responses simultaneously after 2 seconds.
Let us go deeper and control the requests more. Let there be two global variables, counter = 0 and current_counter = 0. They reside outside http.create.... Once a request comes in, we assign it a number. We handle it when its turn comes, wait 2 seconds, then increment current_counter so that the next request can be handled.
var counter = 0;
var current_counter = 0;
http.createServer(function (request, response) {
var my_count = counter; // my_count is specific to each request, not common, not global
counter += 1;
// Poll with setTimeout rather than a busy while loop: a while loop here would
// never return to the event loop, so no timer callback could ever fire.
(function waitForTurn() {
if (current_counter !== my_count) {
return setTimeout(waitForTurn, 50); // not our turn yet, check again shortly
}
setTimeout(function() {
response.writeHead(200, {'Content-Type': 'text/plain'});
response.end('Hello World\n');
console.log("Done");
current_counter += 1; // lets the next request proceed
}, 2000);
})();
}).listen(8081);
Try to understand what I did. I made my own waiting loop: each request polls until current_counter equals its own my_count. The waiting has to go through setTimeout, because a plain while loop would spin forever without returning control to the event loop, and the timer that increments current_counter could then never fire. Imagine 3 requests come in within 2 seconds. Remember, we increment current_counter only after 2 seconds.
Requests which came in within 2 seconds:
A - current_counter = 0, my_count = 0 -> in execution, will send its response after 2 seconds.
B - current_counter = 0 (increments after 2 seconds only), my_count = 1 -> keeps polling, waiting for current_counter to equal 1.
C - current_counter = 0, my_count = 2 -> keeps polling like the previous request.
After 2 seconds, request A is answered by its setTimeout. current_counter becomes 1, request B's local my_count now equals it, and B starts its own setTimeout. B's response is sent and current_counter is incremented after another 2 seconds, which lets request C proceed, and so on.
You can queue as many requests as you like, but each one is answered only 2 seconds after the previous one, because the polling loop checks a condition that in turn depends on a setTimeout that fires only after 2 seconds.
Thanks !

node delay execution - What's right/wrong with it?

First of all, I'm a newbie without experience in Node.js and would like to learn more. I wrote a delay function and I'm interested in what you, as JavaScript professionals, think about it. What is good or bad about it, and why?
I'm trying to write a bot. It has 2 functions. Function 1 starts function 2, but function 2 should not start immediately afterwards. It has to start after a delay.
Of course I researched the topic and found things like these:
How Can I Wait In Node.js (Javascript), l need to pause for a period of time
How to create a sleep/delay in nodejs that is Blocking?
Unfortunately I'm not able to understand and use them, so I made my own attempt. It works on my computer, but should I deploy it to a server?
//function 1 (example)
function start(){
...;
delay(2500, 'That could be an answer');
}
//Delay
function delay(ms, msg){
var started = new Date();
var now;
var diff = 0;
while(diff < ms){
now = new Date();
diff = now - started;
console.log('Diff time: '+diff);
}
console.log('Delay started at: '+started);
console.log('Now time: '+now);
console.log('ms time: '+ms);
console.log('While loop is done.');
answer(msg);
}
//function 2 (example)
function answer(msg){
...
}
Thanks!

This is blocking: your event loop will be blocked while this code executes. No other work will be done during the 2500 ms interval except for busy waiting inside the loop.
I'm not sure why you would want to do this. If you want to start function 2 at some point after function 1, use setTimeout. That way, function 2 is started no sooner than the delay you pass to setTimeout, while other code is still allowed to execute and the Node event loop is not blocked.
setTimeout(function(){
answer(msg);
}, 2500);
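If the flow reads better as sequential steps, the same setTimeout can be wrapped in a Promise and awaited. This is a minimal sketch under assumptions: delay is a hypothetical helper, and start here takes answer and the delay as parameters purely for illustration, standing in for the question's function 1 and function 2.

```javascript
// Resolves after `ms` milliseconds without blocking the event loop.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Function 1 (example): does its work, waits, then hands over to function 2.
async function start(answer, ms = 2500) {
  // ... function 1 work would go here ...
  await delay(ms); // other callbacks keep running while we wait
  return answer('That could be an answer');
}

// Function 2 (example)
function answer(msg) {
  console.log(msg);
  return msg;
}

start(answer, 100);
```

Unlike the busy-wait loop, the await only pauses this one async function, so the process stays responsive during the delay.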

It does not work nevertheless. My delay time is more than an hour, but function 2 is executed after a couple of seconds.
setTimeout(function(){
answer(msg);
}, Math.floor(Math.random()*1000*87));
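A quick check of the arithmetic in the snippet above may explain the early execution. Math.random() returns a value in [0, 1), so Math.floor(Math.random()*1000*87) is at most 86999 ms, which is under 87 seconds rather than more than an hour. The sketch below spells this out; delayOfAtLeastOneHour is a hypothetical helper showing one way to get a randomized delay that is never shorter than an hour.

```javascript
const MS_PER_SECOND = 1000;
const MS_PER_HOUR = 60 * 60 * MS_PER_SECOND; // 3600000

// Largest value Math.floor(Math.random() * 1000 * 87) can produce:
// Math.random() is always below 1, so the result is at most 86999 ms.
const maxDelayFromSnippet = 1000 * 87 - 1;

// Hypothetical helper: one hour plus a random extra of up to `jitterMs`.
function delayOfAtLeastOneHour(jitterMs) {
  return MS_PER_HOUR + Math.floor(Math.random() * jitterMs);
}

console.log('snippet maximum in seconds:', maxDelayFromSnippet / MS_PER_SECOND);
console.log('one hour in ms:', MS_PER_HOUR);
console.log('example delay in ms:', delayOfAtLeastOneHour(60 * MS_PER_SECOND));
```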

You can use Bluebird promises with .delay to keep your code cleaner.
http://bluebirdjs.com/docs/api/promise.delay.html
Make your start function return a Bluebird promise, then:
start().delay(2500).then(function (result) {
// result = the value returned by the start function
});

How to forcibly keep a Node.js process from terminating?

TL;DR
What is the best way to forcibly keep a Node.js process running, i.e., keep its event loop from running empty and hence keeping the process from terminating? The best solution I could come up with was this:
const SOME_HUGE_INTERVAL = 1 << 30;
setInterval(() => {}, SOME_HUGE_INTERVAL);
Which will keep an interval running without causing too much disturbance if you keep the interval period long enough.
Is there a better way to do it?
Long version of the question
I have a Node.js script using Edge.js to register a callback function so that it can be called from inside a DLL in .NET. This function will be called 1 time per second, sending a simple sequence number that should be printed to the console.
The Edge.js part is fine, everything is working. My only problem is that my Node.js process executes its script and after that it runs out of events to process. With its event loop empty, it just terminates, ignoring the fact that it should've kept running to be able to receive callbacks from the DLL.
My Node.js script:
var edge = require('edge');
var foo = edge.func({
assemblyFile: 'cs.dll',
typeName: 'cs.MyClass',
methodName: 'Foo'
});
// The callback function that will be called from C# code:
function callback(sequence) {
console.info('Sequence:', sequence);
}
// Register for a callback:
foo({ callback: callback }, true);
// My hack to keep the process alive:
setInterval(function() {}, 60000);
My C# code (the DLL):
public class MyClass
{
Func<object, Task<object>> Callback;
void Bar()
{
int sequence = 1;
while (true)
{
Callback(sequence++);
Thread.Sleep(1000);
}
}
public async Task<object> Foo(dynamic input)
{
// Receives the callback function that will be used:
Callback = (Func<object, Task<object>>)input.callback;
// Starts a new thread that will call back periodically:
(new Thread(Bar)).Start();
return new object { };
}
}
The only solution I could come up with was to register a timer with a long interval to call an empty function just to keep the scheduler busy and avoid getting the event loop empty so that the process keeps running forever.
Is there any way to do this better than I did? I.e., keep the process running without having to use this kind of "hack"?

The simplest, least intrusive solution
I honestly think my approach is the least intrusive one:
setInterval(() => {}, 1 << 30);
This will set a harmless interval that will fire approximately once every 12 days, effectively doing nothing, but keeping the process running.
Originally, my solution used Number.POSITIVE_INFINITY as the period, so the timer would actually never fire, but this behavior was recently changed by the API and now it doesn't accept anything greater than 2147483647 (i.e., 2 ** 31 - 1). See docs here and here.
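If the process should eventually be allowed to exit, the interval can be wrapped so that it is easy to release again. This is only a sketch: startKeepAlive and the returned stop function are hypothetical names, not part of any library.

```javascript
// 1 << 30 ms is 1073741824 ms, roughly 12.4 days between (empty) ticks.
const KEEP_ALIVE_PERIOD_MS = 1 << 30;

// Starts a do-nothing interval that keeps the event loop non-empty.
// Returns a function that removes it, letting the process exit normally
// once nothing else is pending.
function startKeepAlive() {
  const timer = setInterval(() => {}, KEEP_ALIVE_PERIOD_MS);
  return function stopKeepAlive() {
    clearInterval(timer);
  };
}
```

A caller would keep the returned function around, for example const stopKeepAlive = startKeepAlive();, and invoke it when the callbacks from the DLL are no longer needed.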
Comments on other solutions
For reference, here are the other two answers given so far:
Joe's (deleted since then, but perfectly valid):
require('net').createServer().listen();
Will create a "bogus listener", as he called it. A minor downside is that we'd allocate a port just for that.
Jacob's:
process.stdin.resume();
Or the equivalent:
process.stdin.on("data", () => {});
Puts stdin into "old" mode, a deprecated feature that is still present in Node.js for compatibility with scripts written prior to Node.js v0.10 (reference).
I'd advise against it. Not only is it deprecated, it also unnecessarily messes with stdin.

Use "old" Streams mode to listen for a standard input that will never come:
// Start reading from stdin so we don't exit.
process.stdin.resume();

Here is an IIFE based on the accepted answer:
(function keepProcessRunning() {
setTimeout(keepProcessRunning, 1 << 30);
})();
and here is conditional exit:
let flag = true;
(function keepProcessRunning() {
setTimeout(() => flag && keepProcessRunning(), 1000);
})();

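You could use a setTimeout(function() {}, 2147483647); command to keep your script alive without overload. Note that Node.js resets any delay larger than 2147483647 ms (2 ** 31 - 1) to 1 ms, so a bigger number would make the timer fire almost immediately, and that a single setTimeout only keeps the process alive until it fires (about 24.8 days here).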

Spin up a REPL, which keeps the process alive until you exit it (node does the same when it is started without a script):
import("repl").then(repl=>
repl.start({prompt:"\x1b[31m"+process.versions.node+": \x1b[0m"}));

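I'll throw another hack into the mix. Here's how to do it with a Promise. Note that a pending Promise on its own does not keep the Node.js event loop alive, so the executor has to create a handle such as a timer:
new Promise(() => setInterval(() => {}, 1 << 30));
Put that at the bottom of your .js file and the process should keep running.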

Handling multiple parallel HTTP requests in Node.js

I know that Node is non-blocking, but I just realized that the default behaviour of http.listen(8000) means that all HTTP requests are handled one-at-a-time. I know I shouldn't have been surprised at this (it's how ports work), but it does make me seriously wonder how to write my code so that I can handle multiple, parallel HTTP requests.
So what's the best way to write a server so that it doesn't hog port 80 and long-running responses don't result in long request queues?
To illustrate the problem, try running the code below and loading it up in two browser tabs at the same time.
var http = require('http');
http.createServer(function (req, res) {
res.setHeader('Content-Type', 'text/html; charset=utf-8');
res.write("<p>" + new Date().toString() + ": starting response");
setTimeout(function () {
res.write("<p>" + new Date().toString() + ": completing response and closing connection</p>");
res.end();
}, 4000);
}).listen(8080);

You are misunderstanding how node works. The above code can accept TCP connections from hundreds or thousands of clients, read the HTTP requests, and then wait the 4000 ms timeout you have baked in there, and then send the responses. Each client will get a response in about 4000 + a small number of milliseconds. During that setTimeout (and during any I/O operation) node can continue processing. This includes accepting additional TCP connections. I tested your code and the browsers each get a response in 4s. The second one does NOT take 8s, if that is how you think it works.
I ran curl -s localhost:8080 in 4 tabs as quickly as I can via the keyboard and the seconds in the timestamps are:
54 to 58
54 to 58
55 to 59
56 to 00
There's no issue here, although I can understand how you might think there is one. Node would be totally broken if it worked as your post suggested.
Here's another way to verify:
for i in 1 2 3 4 5 6 7 8 9 10; do curl -s localhost:8080 & done; wait
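The same check can be scripted in Node.js itself. The sketch below is an assumption-laden illustration, not the answerer's test: startDelayedServer, get and measureConcurrent are hypothetical helpers, the server listens on an ephemeral port, and the delay is a parameter so the experiment finishes quickly. It fires several requests at once and measures how long the whole batch takes.

```javascript
const http = require('http');

// Same shape as the server in the question, with the delay as a parameter.
function startDelayedServer(delayMs) {
  const server = http.createServer((req, res) => {
    setTimeout(() => res.end('done\n'), delayMs);
  });
  return new Promise((resolve) => {
    server.listen(0, '127.0.0.1', () => resolve(server)); // port 0: pick a free port
  });
}

// Issue one GET request and resolve with the body once the response ends.
function get(port) {
  return new Promise((resolve, reject) => {
    http.get({ host: '127.0.0.1', port: port, agent: false }, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });
}

// Fire `count` requests at once and report how long the whole batch took.
async function measureConcurrent(count, delayMs) {
  const server = await startDelayedServer(delayMs);
  const port = server.address().port;
  const started = Date.now();
  const bodies = await Promise.all(Array.from({ length: count }, () => get(port)));
  const elapsedMs = Date.now() - started;
  await new Promise((resolve) => server.close(resolve));
  return { bodies, elapsedMs };
}

measureConcurrent(5, 400).then(({ bodies, elapsedMs }) => {
  console.log(bodies.length + ' responses in ' + elapsedMs + ' ms');
});
```

If requests were handled one at a time, the batch would take about count times the delay; with the timers pending concurrently it takes roughly one delay.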

Your code can accept multiple connections because the work is done in the callback of the setTimeout call.
But if you do a heavy, synchronous job instead of setTimeout, then it is true that node.js will not handle other connections until that job finishes! setTimeout returns immediately, which frees the process to accept other work; your callback runs later on the same thread, not in another "thread".
I don't know the correct way to implement this for heavy jobs, but this is how it works.

The browser may hold back identical requests to the same URL. If you call it from different browsers, the requests are handled in parallel.

I used the following code to test request handling:
app.get('/', function(req, res) {
console.log('time', MOMENT());
setTimeout( function() {
console.log(data, ' ', MOMENT());
res.send(data);
data = 'changing';
}, 50000);
var data = 'change first';
console.log(data);
});
This request needs hardly any processing time apart from the 50-second setTimeout, so all the timeouts were pending together, as usual.
Output for 3 requests sent together:
time moment("2017-05-22T16:47:28.893")
change first
time moment("2017-05-22T16:47:30.981")
change first
time moment("2017-05-22T16:47:33.463")
change first
change first moment("2017-05-22T16:48:18.923")
change first moment("2017-05-22T16:48:20.988")
change first moment("2017-05-22T16:48:23.466")
After this I moved on to the second phase: what if my request takes a long time because of synchronous work, such as processing a file synchronously?
app.get('/second', function(req, res) {
console.log(data);
if(req.headers.data === '9') {
res.status(200);
res.send('response from api');
} else {
console.log(MOMENT());
for(i = 0; i<9999999999; i++){}
console.log('Second MOMENT', MOMENT());
res.status(400);
res.send('wrong data');
}
var data = 'second test';
});
Because my first request was still being processed, Node did not handle my second one until the first had finished. This is the output for 2 requests:
undefined
moment("2017-05-22T17:43:59.159")
Second MOMENT moment("2017-05-22T17:44:40.609")
undefined
moment("2017-05-22T17:44:40.614")
Second MOMENT moment("2017-05-22T17:45:24.643")
So asynchronous operations (fs, mysql, or calling an API) are handed off, and Node keeps accepting other requests while that work is pending. However, your JavaScript itself runs on a single thread, so a synchronous, CPU-bound block prevents Node from processing any other request until it has finished.
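One way to keep a long computation from monopolizing the thread is to split it into chunks and yield to the event loop between them. This is a sketch of that general technique, not something the answer above uses: sumInChunks is a hypothetical function, and summing integers stands in for the empty for loop in the /second handler.

```javascript
// Adds up the integers 0 .. n-1 in chunks of `chunkSize` iterations,
// yielding to the event loop with setImmediate between chunks so that
// pending I/O callbacks (such as other requests) can run in between.
function sumInChunks(n, chunkSize = 1e6) {
  return new Promise((resolve) => {
    let i = 0;
    let total = 0;

    function runChunk() {
      const end = Math.min(i + chunkSize, n);
      for (; i < end; i++) {
        total += i;
      }
      if (i < n) {
        setImmediate(runChunk); // give other callbacks a turn, then continue
      } else {
        resolve(total);
      }
    }

    runChunk();
  });
}

sumInChunks(5e6).then((total) => console.log('total:', total));
```

Each chunk still blocks while it runs, so the chunk size is a trade-off between throughput and responsiveness. For truly heavy work, moving it off the main thread is the usual alternative.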
