What is the I/O strategy of the following web servers? - node.js

There are five basic I/O models:
1. blocking IO
2. nonblocking IO
3. IO multiplexing
4. signal driven IO
5. asynchronous IO
I'm wondering which one is used in nodejs and tornado (maybe the 3rd or 4th)?
And is there a web server that uses real async IO (the 5th, using the aio_xxx library)?

The short answer: Node.js uses I/O multiplexing for network I/O and blocking I/O with a thread pool for disk I/O.
Here goes the long answer:
Node.js uses a library called libuv for all I/O. As its design overview shows (http://docs.libuv.org/en/v1.x/design.html), libuv internally uses epoll (on Linux), kqueue (on FreeBSD), event ports (on Solaris) and IOCP (on Windows).
These mechanisms are basically I/O multiplexing for network I/O, not disk I/O. The key idea here is:
The application thread registers the file descriptors it is interested in with the kernel
The kernel maintains this data in its own internal data structures. It also maintains, for each file descriptor, a list of the application threads to wake up. This allows the kernel to wake up the threads efficiently when the file descriptor (socket) becomes ready for read (the socket's receive buffer holds data) or write (the send buffer has room for more data)
The kernel also does other optimizations such as coalescing multiple events of a single file descriptor
This idea mainly originated in a paper by Banga et al., which inspired the development of kqueue and epoll.
Even before these system calls were available, I/O multiplexing existed in the form of the system calls select and poll, which did not scale well. select and poll require the application thread to submit the list of file descriptors it is interested in with every call. The kernel is stateless for these system calls. This results in multiple scans of the list by both the kernel and the application, causing scalability issues.
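To make the difference concrete, here is a minimal sketch in Python (my choice of language, not something the answer uses). It assumes a Unix-like system where socket.socketpair() is available. The first half passes the descriptor list on every select call; the second half registers the socket once with selectors.DefaultSelector (which picks epoll or kqueue where available) and then waits repeatedly.

```python
import select
import selectors
import socket

# Two connected sockets stand in for a client and a server connection.
left, right = socket.socketpair()
left.setblocking(False)
right.setblocking(False)

# select-style: the full list of descriptors is handed over on every call,
# and the caller scans the returned list to see what is ready.
right.send(b"ping")
readable, _, _ = select.select([left], [], [], 1.0)
select_result = [s.recv(16) for s in readable]

# epoll/kqueue-style: register interest once, then wait as often as needed
# without resubmitting the interest list.
sel = selectors.DefaultSelector()
sel.register(left, selectors.EVENT_READ, data="left")

received = []
for payload in (b"one", b"two"):
    right.send(payload)
    for key, _events in sel.select(timeout=1.0):
        received.append((key.data, key.fileobj.recv(16)))

sel.unregister(left)
sel.close()
left.close()
right.close()
```

With a single socket the two halves behave the same; the registered form only pays off as the number of descriptors grows.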
Now as for asynchronous I/O, I think it mainly refers to the POSIX AIO specification. Given the way network I/O is handled by I/O multiplexing, the POSIX specification could be useful for disk I/O only. However, libuv doesn't use it, and it is probably not in use by any web server, mainly because of poor implementations, the fact that not all disk operations can be made asynchronous, and so on. The detailed list of reasons it is not used by libuv is mentioned here.
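The "blocking I/O with a thread pool" approach to disk I/O can be sketched roughly as follows. This is not libuv's code, only an illustration in Python; read_file, read_all and the pool size of 4 are hypothetical choices made for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path):
    # An ordinary blocking read; it runs on a worker thread.
    with open(path, "rb") as f:
        return f.read()


def read_all(paths, workers=4):
    # The calling thread only submits work; the blocking reads happen
    # on the pool's threads, and results are collected afterwards.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(read_file, p) for p in paths]
        return [f.result() for f in futures]


# Small demo with three temporary files.
with tempfile.TemporaryDirectory() as tmpdir:
    paths = []
    for i in range(3):
        path = os.path.join(tmpdir, f"file{i}.txt")
        with open(path, "wb") as f:
            f.write(f"contents {i}".encode())
        paths.append(path)
    results = read_all(paths)
```

In an event loop, the completion of each future would be reported back to the loop thread as an event instead of being waited on directly.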

Node.js uses nonblocking I/O, and Tornado likewise uses an asynchronous, nonblocking model, so that more than one operation can be in progress at the same time.
NGINX also uses an asynchronous model.

Related

Solution to Blocking system call by worker thread in Node.js

I recently learnt about user-level threads and kernel-level threads in the Operating Systems book by Tanenbaum. Since user-level threads are handled by library packages, and since I had worked with node.js a bit, I concluded that node.js uses libuv for handling worker threads and hence uses user-level threading.
But I wanted to know how node.js deals with the case when some worker thread makes a system call that is blocking and then the kernel will block the entire process even if some threads are capable of running.
This isn't what happens in a modern OS. Just because one thread in a process is reading/writing from the disk, the OS does NOT block the entire process from doing anything with its other threads.
Modern hardware uses DMA (Direct Memory Access) for reading/writing to disks precisely so that the CPU does not have to be blocked while a block of data is read from or written to a disk.
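A small experiment illustrates the point, assuming kernel-level threads (which is what Python's threading module gives you on mainstream systems). One thread blocks in a read() on a pipe while the main thread keeps doing work; the variable names are made up for this sketch.

```python
import os
import threading
import time

read_fd, write_fd = os.pipe()
result = {}


def blocked_reader():
    # os.read blocks inside the kernel until data arrives on the pipe.
    result["data"] = os.read(read_fd, 16)


worker = threading.Thread(target=blocked_reader)
worker.start()

# While the worker is stuck in read(), this thread keeps running.
progress = 0
deadline = time.monotonic() + 0.2
while time.monotonic() < deadline:
    progress += 1

still_blocked = worker.is_alive()  # nothing has been written yet

# Unblock the worker by giving it something to read.
os.write(write_fd, b"done")
worker.join(timeout=2.0)
os.close(read_fd)
os.close(write_fd)
```

Only the thread that made the blocking call is suspended; the rest of the process carries on.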

What is the implementation detail of libevent? An encapsulation of poll mechanism?

libevent lets programmers write asynchronous programs through its support for event notifications and callback functions. I don't know whether the Linux kernel provides system calls that notify user-space applications when a particular event occurs. When people use non-blocking operations, they have to poll all the file descriptors to check whether some of them are ready. However, I think libevent may use some more advanced means to accomplish this, e.g. system calls I'm not aware of.
So, 1) how does libevent check the status of different events? By polling or some other mechanism? 2) Does libevent fork subprocesses or threads when it runs?
The libevent home page reads (in part),
Currently, libevent supports /dev/poll, kqueue(2), event ports, POSIX select(2), Windows select(), poll(2), and epoll(4).
For modern Linux the answer is thus epoll.

Is there a way to get notified when a packet arrives over a socket rather than keep polling with recv()?

I have an application which keeps waiting for a packet over UDP. I do this using recv() call (NON-BLOCKING).
The application is multi-threaded, the purpose of other threads is to do some processing when the particular packet is received.
Since one thread keeps polling for packets even when idle, the CPU usage of one core is near 100%.
Therefore, to remove this intensive polling (and in general, for information) is there a way such that I can get notified when the packet is received? i.e. something similar to registering a parse callback which can be called when any packet is received on that socket.
P.S. I cannot have a delay of more than 5 ms between successive recv() calls.
OS Info : Debian 8u2, Kernel 3.16
Platform : Intel i3, x86_64
There are several ways to be notified about received data.
select()
As mentioned in the comments above, select is an old and highly portable mechanism for waking up a thread when a socket is ready for reading or writing. select performs poorly when the number of sockets is high, because the sets of sockets cannot be reused between calls and the whole set has to be iterated to find which sockets are readable or writable. Sockets added to a select set should not be read or written from another thread, so it is difficult to use in a multithreaded application. An example of how to use it is in man select.
poll()
It is a newer mechanism than select. It eliminates some of select's performance drawbacks, but some are still present, such as iterating through the set of sockets to find which socket is readable or writable. poll is portable across Unixes, and Windows has supported it since Vista.
epoll()
epoll is a modern, Linux-specific polling method. It is relatively new (added to the kernel in 2002). It eliminates almost all of the performance problems of poll and select. The only drawback is that it is not portable outside the Linux ecosystem. Some OSes have their own polling mechanisms as well. For example, FreeBSD has kqueue.
library based polling
Low-level access to select, poll and epoll can be encapsulated, and a library may provide a unified API for all of these methods. A well-known library that does this is http://libevent.org/
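As a rough sketch of the idea for the UDP case in the question, the snippet below uses Python's selectors module as a stand-in for such a unified API (on Linux it is backed by epoll). The receiving thread sleeps in the kernel until a datagram arrives or a timeout expires, instead of spinning on recv(). wait_for_packet, the loopback address and the 2048-byte buffer are assumptions made for the example.

```python
import selectors
import socket


def wait_for_packet(sock, sel, timeout):
    # Sleeps in the kernel until the socket is readable or the timeout
    # expires; no CPU is burned while waiting.
    if sel.select(timeout=timeout):
        return sock.recvfrom(2048)
    return None


receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))  # any free port on loopback
receiver.setblocking(False)

sel = selectors.DefaultSelector()
sel.register(receiver, selectors.EVENT_READ)

# Nothing has been sent yet, so a 5 ms wait simply times out.
idle = wait_for_packet(receiver, sel, timeout=0.005)

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", receiver.getsockname())
packet = wait_for_packet(receiver, sel, timeout=1.0)

sel.close()
sender.close()
receiver.close()
```

The timeout is only an upper bound on the wait; the call returns as soon as the socket becomes readable.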

Designing multithread application (looking for design patterns)

I'm preparing to write a multithread network application. At the moment I'm wondering what's the best thread pattern for my program. Whole application will handle up to 1000 descriptors (local files, network connections on various protocols and additional descriptors for timers and signals handling). Application will be optimized for Linux. Program will run on regular personal computers, so I assume, that they will have at least Pentium 4.
Here's my current idea:
One thread will handle network I/O using epoll.
Second thread will handle local-like I/O (disk I/O, timers, signal handling) using epoll
Third thread will handle UI (CLI, GTK+ or Qt)
Handling each network connection in separate thread will kill CPU because of too many context switches.
Maybe there's better way to do this?
Do you know any documents/books about designing multithreaded applications? I'm looking for answers to questions like: what's a reasonable number of threads? etc.
You're on the right track. You want to use a thread pool pattern to handle the networking rather than one thread per network connection.
This website may also be helpful to you and lists the most common design patterns and in what situations they can be used.
http://sourcemaking.com/design_patterns/
To handle the disk I/O you might like to consider using mmap under linux. It's very fast and efficient. That way, you will let the kernel do the work and you probably won't need a separate thread for that.
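For illustration, a memory-mapped file can be read and modified like a byte array. This is a minimal sketch in Python using a temporary file; the file contents are made up for the example. Keep in mind that touching a page that is not yet in memory can still stall the calling thread while the kernel loads it.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"hello mmap world")
    f.flush()

    # Map the whole file (length 0 means "entire file") read/write.
    with mmap.mmap(f.fileno(), 0) as mapped:
        first_word = bytes(mapped[:5])  # read through the mapping
        mapped[0:5] = b"HELLO"          # write through the mapping
        mapped.flush()                  # push changes back to the file

    # Read the file through normal I/O to see the change.
    f.seek(0)
    on_disk = f.read()
```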
I'm currently playing with Boost::asio which seems to be quite good. It uses epoll on linux. As it appears you are using a cross platform gui toolkit like Qt, then boost asio will also provide cross platform support so you will be able to use it on windows or linux. I think there might be a cross platform mmap too.

Linux and I/O completion ports?

Using winsock, you can configure sockets or separate I/O operations to "overlap". This means that calls to perform I/O are returned immediately, while the actual operations are completed asynchronously by separate worker threads.
Winsock also provides "completion ports". From what I understand, a completion port acts as a multiplexer of handles (sockets). A handle can be demultiplexed if it isn't in the middle of an I/O operation, i.e. if all its I/O operations are completed.
So, on to my question... does linux support completion ports or even asynchronous I/O for sockets?
If you're looking for something exactly like IOCP, you won't find it, because it doesn't exist.
Windows uses a notify on completion model (hence I/O Completion Ports). You start some operation asynchronously, and receive a notification when that operation has completed.
Linux applications (and most other Unix-alikes) generally use a notify on ready model. You receive a notification that the socket can be read from or written to without blocking. Then, you do the I/O operation, which will not block.
With this model, you don't need asynchronous I/O. The data is immediately copied into / out of the socket buffer.
The programming model for this is kind of tricky, which is why there are abstraction libraries like libevent. It provides a simpler programming model, and abstracts away the implementation differences between the supported operating systems.
There is a notify on ready model in Windows as well (select or WSAWaitForMultipleEvents), which you may have looked at before. It can't scale to large numbers of sockets, so it's not suitable for high-performance network applications.
Don't let that put you off - Windows and Linux are completely different operating systems. Something that doesn't scale well on one system may work very well on the other. This approach actually works very well on Linux, with performance comparable to IOCP on Windows.
IOCP is pronounced "asynchronous I/O" on various UNIX platforms:
POSIX AIO is the standard
Kernel AIO, epoll and io_uring seem to be Linux-specific implementations
Kqueue is the *BSD and Mac OSX implementation
Message Passing Interface (MPI) is an option for high-performance computing
obligatory Boost reference - Boost.Asio
Use boost::asio. Hands down. It has a mild learning curve, but it's cross-platform, and automatically uses the best available method for the system you're compiling on. There's simply no reason not to.
I know that this isn't quite an answer to your question, but it's the best advice I could give.
So, on to my question... does linux support completion ports or even asynchronous I/O for sockets?
With regard to sockets, in 5.3 and later kernels, Linux has something analogous to completion ports in the shape of io_uring (for files/block devices io_uring support appeared in the 5.1 kernel).
Read the blog entry from Google on libevent: you can implement IOCP semantics on Unix using asynchronous IO, but cannot directly implement asynchronous IO semantics using IOCP.
http://google-opensource.blogspot.com/2010/01/libevent-20x-like-libevent-14x-only.html
For an example of cross-platform asynchronous IO with a BSD socket API, look at ZeroMQ, as recently published on LWN.net,
http://www.zeromq.org/
LWN article,
http://lwn.net/Articles/370307/
Boost ASIO implements Windows style IOCP (Proactor design pattern) on Linux using epoll (Reactor pattern). See http://think-async.com/Asio/asio-1.5.3/doc/asio/overview/core/async.html
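The general idea of layering a completion-style (Proactor) interface on top of a readiness-style (Reactor) mechanism can be sketched in a few lines. This is not Boost.Asio's implementation, just a toy in Python; TinyProactor, async_read and run_once are hypothetical names invented for the example, and a Unix-like system with socket.socketpair() is assumed.

```python
import selectors
import socket


class TinyProactor:
    """Completion-style reads built on a readiness-based selector."""

    def __init__(self):
        self._sel = selectors.DefaultSelector()

    def async_read(self, sock, max_bytes, on_complete):
        # "Start" the operation: remember what to do once sock is readable.
        self._sel.register(sock, selectors.EVENT_READ, (max_bytes, on_complete))

    def run_once(self, timeout=None):
        # Reactor step: wait for readiness, then perform the I/O ourselves
        # and report the finished result, which is what a completion port
        # would have delivered directly.
        for key, _events in self._sel.select(timeout):
            max_bytes, on_complete = key.data
            self._sel.unregister(key.fileobj)
            data = key.fileobj.recv(max_bytes)  # socket is ready
            on_complete(data)

    def close(self):
        self._sel.close()


a, b = socket.socketpair()
a.setblocking(False)

completed = []
proactor = TinyProactor()
proactor.async_read(a, 64, completed.append)  # returns immediately

b.send(b"payload")
proactor.run_once(timeout=1.0)

proactor.close()
a.close()
b.close()
```

The caller only ever sees "the read finished, here is the data", even though underneath the loop was told "the socket is readable" and did the recv itself.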

Resources