Limit on maximum event polling in libuv - node.js

I am reading the Node.js internals and exploring the libuv library. There is a method uv__io_poll in which it polls for I/O events. The code is a bit hard to understand, so please forgive me if the question is naive.
In the code (kqueue.c), the first line is
struct kevent events[1024];
I wanted to know whether there is an upper limit on the number of events we can listen for at a given time.
Suppose I have Node code that calls a DB and fires 3-4k read promises in parallel. Will libuv listen to only 1024 events in a cycle, or can it listen to all of them?

Related

How multiple simultaneous requests are handled in Node.js when response is async?

I can imagine a situation where 100 requests come to a single Node.js server. Each of them requires some DB interaction, implemented as natively async code using a task queue or at least a microtask queue (e.g. the DB driver interface is promisified).
How does Node.js return the response once the request handler is no longer synchronous? What happens to the connections from the API/web clients where these 100 requests originated?
This feature is available at the OS level and is called (funnily enough) asynchronous I/O or non-blocking I/O (Windows also calls/called it overlapped I/O).
At the lowest level, in C, the operating system provides an API to keep track of requests and responses. There are various APIs available depending on the OS you're on, and Node.js uses libuv to automatically select the best available API at compile time. But for the sake of understanding how asynchronous APIs work, let's look at the one that is available on all platforms: the select() system call.
The select() function looks something like this:
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
The fd_set data structure is a set/list of file descriptors that you are interested in watching for I/O activity. And remember, in POSIX sockets are also file descriptors. The way you use this API is as follows:
// Pseudocode:
// Say you just sent a request to a mysql database and also sent a http
// request to google maps. You are waiting for data to come from both.
// Instead of calling `read()` which would block the thread you add
// the sockets to the read set:
add mysql_socket to readfds
add maps_socket to readfds
// Now you have nothing else to do so you are free to wait for network
// I/O. Great, call select:
select(max_fd + 1, &readfds, NULL, NULL, NULL); // nfds is the highest fd + 1, not a count
// Select is a blocking call. Yes, non-blocking I/O involves calling a
// blocking function. Yes it sounds ironic but the main difference is
// that we are not blocking waiting for each individual I/O activity,
// we are waiting for ALL of them
// At some point select returns. This is where we check which request
// matches the response:
if mysql_socket is set in readfds {
    call mysql_handler_callback()
}
if maps_socket is set in readfds {
    call maps_handler_callback()
}
go to beginning of loop
So basically the answer to your question is: we check a data structure to see which socket/file just triggered an I/O activity, and execute the appropriate code.
You can no doubt easily spot how to generalize this code pattern: instead of manually setting and checking the file descriptors, you can keep all pending async requests and callbacks in a list or array and loop through it before and after the select(). This is in fact what Node.js (and JavaScript in general) does. And it is this list of callbacks/file descriptors that is sometimes called the event queue - it is not a queue per se, just a collection of things you are waiting to execute.
The select() function also has a timeout parameter at the end which can be used to implement setTimeout() and setInterval() and in browsers process GUI events so that we can run code while waiting for I/O. Because remember, select is blocking - we can only run other code if select returns. With careful management of timers we can calculate the appropriate value to pass as the timeout to select.
The fd_set data structure is not actually a linked list. In older implementations it is a bitfield. More modern implementations can improve on the bitfield as long as they comply with the API. But this partly explains why there are so many competing async APIs like poll, epoll, kqueue etc. They were created to overcome the limitations of select. Different APIs keep track of the file descriptors differently: some use linked lists, some hash tables, some cater for scalability (being able to listen to tens of thousands of sockets) and some for speed, and most try to do both better than the others. Whatever they use, in the end what stores the request is just a data structure that keeps track of file descriptors.
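The generalized "list of pending callbacks" pattern described above can be sketched in JavaScript. This is purely illustrative: the `watchers` list and `readyAt` ticks are made-up stand-ins for file descriptors and a blocking select() call.

```javascript
// Illustrative sketch of the event-queue pattern described above.
// `readyAt` stands in for "the fd became readable"; a real event loop
// would block in select()/epoll/kqueue instead of counting ticks.
const handled = [];
let tick = 0;
const watchers = [
  { readyAt: 3, callback: () => handled.push('mysql') },
  { readyAt: 5, callback: () => handled.push('maps') },
];

while (watchers.length > 0) {
  tick++; // stand-in for select() returning when something is ready
  for (let i = watchers.length - 1; i >= 0; i--) {
    if (tick >= watchers[i].readyAt) {
      watchers[i].callback(); // dispatch to the matching handler
      watchers.splice(i, 1);  // no longer pending
    }
  }
}

console.log(handled); // the mysql reply was handled before the maps reply
```

The key point: one loop, one thread, and whichever "socket" becomes ready first gets its callback run first.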

Correct way to run synchronous code in node.js without blocking

I have a websocket server in node.js which allows users to solve a given puzzle.
I also have code that generates a random puzzle, which takes about 20 seconds. In the meantime I still want to handle new connections/disconnects, but this synchronous code blocks the event loop.
Here's the simplified code:
io.on('connection', socket => {
//
});
io.listen(port);
setInterval(function() {
  if (game.paused)
    game.loadRound();
}, 1000);
loadRound() runs for about 20 seconds, which blocks all connections and the setInterval itself.
What would be the correct way to run this code without blocking the event loop?
You have three basic choices:
Redesign loadRound() so that it doesn't block the event loop. Since you've shared none of its code, we can't advise on the feasibility of that, but if it's doing any I/O, then it does not need to block the event loop. Even if it's all just CPU work, it could be designed to do its job in small chunks to give the event loop some cycles, but often that's more work to redesign than options 2 and 3 below.
Move loadRound() to a worker thread (the worker_threads module) and communicate the result back via messaging.
Move loadRound() to a separate node.js process using the child_process module and communicate the result back via any number of means (stdio, messaging, etc...).

Where is the NodeJS idle loop?

Using ExpressJS and Socket.IO I have an HTML scene where multiple users can connect to NodeJS. I am about to do some animation that has to sync to all clients.
In the client, I know animation can be achieved with setInterval() (not time-ideal) and then socket.emit() to Node.js. But is there an idle loop in Node.js that can be used for master-controlling animations and io.sockets.emit() to update everyone about everyone?
EDIT: I want to do general "animation" of values in node.js e.g. pseudocode:
process.idle(function() {
  // ...
  itempos.x += (itempos.dest - itempos.x) / 20; // easing
  itempos.y += (itempos.dest - itempos.y) / 20; // easing
  io.sockets.broadcast('update', itempos);
  // ...
});
Being a server-side framework, Node.js will rarely idle (CPU- or I/O-wise). Besides, an idle loop is more suited to DOM requirements. But in Node.js you have the following functions:
process.nextTick: executes the callback after the current operation finishes, i.e. before the event loop continues. It does not allow I/O to proceed until up to maxTickDepth queued nextTick callbacks have been executed, so if used too much it can prevent I/O from occurring.
setImmediate: executes the callback after the I/O callbacks in the current event loop turn are finished. It allows I/O to happen between multiple setImmediate calls.
Given what you want, setImmediate is better suited to your needs.
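The ordering difference between the two can be seen in a short deterministic demo (the variable names are my own):

```javascript
// Demonstrates the ordering described above: nextTick callbacks run
// before the event loop continues, while setImmediate callbacks run
// on a later turn of the loop.
const order = [];

setImmediate(() => order.push('setImmediate'));
process.nextTick(() => order.push('nextTick'));
order.push('sync');

// Queued after the first setImmediate, so it observes the final order.
setImmediate(() => console.log(order.join(' -> '))); // prints "sync -> nextTick -> setImmediate"
```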
Check out the Timers docs: http://nodejs.org/api/timers.html
All of the timer functions are globals.
setInterval(callback, delay, [arg], [...])
Schedules repeated execution of callback every delay milliseconds. Returns an intervalId for possible use with clearInterval(). Optionally you can also pass arguments to the callback.
For synchronized client animation it may make sense to send sequences in chunks at a slower rate rather than trying to squeeze as many websocket emissions as possible into the animation duration. Human eyes are much slower than websockets, in my experience.
There are tons of client frameworks that will do the easing for you; it's not a server concern.
(All of this oblivious to your use case, of course!)
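As a rough sketch under those caveats, the easing pseudocode from the question could be driven by a fixed-rate timer rather than a (nonexistent) process.idle() hook. itempos and the easing formula come from the question; the console.log stands in for io.sockets.emit('update', itempos).

```javascript
// Fixed-rate (~30 fps) server-side animation tick, replacing the
// hypothetical process.idle() hook from the question.
const itempos = { x: 0, y: 0, dest: 100 };

const timer = setInterval(() => {
  itempos.x += (itempos.dest - itempos.x) / 20; // easing, as in the question
  itempos.y += (itempos.dest - itempos.y) / 20;
  // In a real server: io.sockets.emit('update', itempos);
  console.log('update', itempos.x.toFixed(1), itempos.y.toFixed(1));
  if (itempos.dest - itempos.x < 0.5) clearInterval(timer); // settled
}, 33);
```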

How game servers with Boost:Asio work asynchronously?

I am trying to create a game server, and currently I am making it with threads. Every object (a player, a monster) has its own thread with a while(1) loop, in which particular functions are performed.
And the server basically works like this:
main() {
    // some initialization
    while (1)
    {
        // read the client's packet
        // direct packet info to a particular object
        // object performs some functions
        // then the server returns a result packet to the client
        Sleep(1);
    }
}
I have heard that it is not efficient to build the server with threads like that,
and that I should consider using Boost::Asio and making the functions work asynchronously.
But I don't know how then the server would work. I would be grateful if someone would explain how basically such servers work.
Every object( a player , monster ), has its own thread.
I have heard that it is not efficient to make the server using threads
like that
You are correct, this is not a scalable design. Consider a large game where you may have 10,000 objects or even a million. Such a design quickly falls apart when you require a thread per object. This is known as the C10K problem.
I should consider using Boost::Asio and making the functions work
asynchronously. But I don't know how the server would work then.
I would be grateful if someone would explain how such servers
basically work.
You should start by following the Boost::Asio tutorials, paying specific attention to the Asynchronous TCP daytime server example. The concept of asynchronous programming compared to synchronous programming is not difficult once you understand that the flow of your program is inverted. From a high level, your game server will have an event loop that is driven by a boost::asio::io_service. Overly simplified, it will look like this:
int main()
{
    boost::asio::io_service io_service;
    // add some work to the io_service
    io_service.run(); // start the event loop
    // should never get here
}
The callback handlers that are invoked from the event loop will chain operations together. That is, once your callback for reading data from a client is invoked, the handler will initiate another asynchronous operation.
The beauty of this design is that it decouples threading from concurrency. Consider a long-running operation in your game server, such as reading data from a client. Using asynchronous methods, your game server does not need to wait for the operation to complete; the kernel will notify it when the operation has completed.

Should I use for loop async way when I use node.js?

I'm testing with node.js with express.
Theoretically, if I run some very heavy calculation in a for loop without any callbacks,
it blocks, and other requests should be ignored.
But In my case, regular "for loop"
for (var i = 0; i < 300000; i++) {
  console.log(i);
}
does not block any requests; it just causes high CPU load.
It accepts other requests as well.
So why should I use other methods to make this non-blocking, such as
process.nextTick()?
Or does Node.js take care of basic loop constructs (for, while) by wrapping them with process.nextTick() by default?
Node runs in a single thread with an event loop, so as you said, when your for loop is executing, no other processing will happen. The underlying operating system TCP socket may very well accept incoming connections, but if node is busy doing your looping logic then the request itself won't be processed until afterward.
If you absolutely must run some long-running process in Node, then you should use separate worker processes to do the calculation and leave the main event loop free for request handling.
Node doesn't wrap loops with process.nextTick().
It may be that your program continues to accept new connections because console.log, being an I/O operation, yields control back to the main event loop.
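If the loop really were blocking long enough to matter, one option (besides worker processes) is to split it into chunks with setImmediate so pending requests can be serviced between chunks. A sketch; countChunked and the chunk size are my own made-up names, not a Node API:

```javascript
// Splits a long CPU-bound loop into chunks, yielding to the event loop
// between chunks via setImmediate so pending I/O can be serviced.
function countChunked(total, chunkSize, done) {
  let i = 0;
  function runChunk() {
    const end = Math.min(i + chunkSize, total);
    for (; i < end; i++) {
      // heavy per-iteration work would go here
    }
    if (i < total) {
      setImmediate(runChunk); // let the event loop breathe
    } else {
      done(i);
    }
  }
  runChunk();
}

countChunked(300000, 10000, n => console.log('finished after', n, 'iterations'));
```

The trade-off is total throughput: the work takes slightly longer overall, but the server stays responsive throughout.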
