Blocking system call by worker thread in Node.js

I recently learnt about user-level threads and kernel-level threads in the Operating Systems book by Tanenbaum. Since user-level threads are handled by library packages, and since I had worked with Node.js a bit, I concluded that Node.js uses libuv for handling worker threads and hence uses user-level threading.
But I wanted to know how Node.js deals with the case where a worker thread makes a blocking system call: won't the kernel then block the entire process, even though some of its threads could still run?

This isn't what happens in a modern OS. Just because one thread in a process is reading from or writing to the disk, the OS does NOT block the entire process from doing anything with its other threads.
Modern hardware uses DMA (Direct Memory Access) for reading/writing to disks precisely so that the CPU does not have to be blocked while a block of data is read from or written to a disk.
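To see this concretely, here is a minimal C sketch (illustrative only, not Node.js or libuv code): one thread sits blocked in read() on an empty pipe while the main thread carries on unimpeded.

```c
/* Demo: a thread blocked in a system call does not stop the rest
 * of the process. One thread blocks in read() on an empty pipe
 * while the main thread keeps doing work. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static int fds[2]; /* fds[0] = read end, fds[1] = write end */

static void *blocker(void *arg) {
    char c;
    read(fds[0], &c, 1);           /* blocks until something is written */
    printf("blocked thread woke up\n");
    return NULL;
}

int main(void) {
    pthread_t t;
    pipe(fds);
    pthread_create(&t, NULL, blocker, NULL);
    for (int i = 0; i < 3; i++) {  /* main thread keeps running freely */
        printf("main thread still running (%d)\n", i);
        sleep(1);
    }
    write(fds[1], "x", 1);         /* unblock the reader, then exit cleanly */
    pthread_join(t, NULL);
    return 0;
}
```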

Related

What is the I/O strategy of the following web servers?

There are five basic I/O models:
blocking IO
nonblocking IO
IO multiplexing
signal driven IO
asynchronous IO
I'm wondering which one is used in Node.js and Tornado (maybe the 3rd or 4th?).
And is there a web server that uses real asynchronous I/O (the 5th, using the aio_* calls)?
The short answer is: Node.js uses I/O multiplexing for network I/O, and blocking I/O with a thread pool for disk I/O.
Here goes the long answer:
Node.js uses a library called libuv for all I/O. As described in the libuv design overview (http://docs.libuv.org/en/v1.x/design.html), libuv internally uses the system calls epoll (on Linux), kqueue (on FreeBSD), event ports (on Solaris) and IOCP (on Windows).
These system calls provide I/O multiplexing (for network I/O, not disk I/O). The key idea is:
The application thread registers the file descriptors it is interested in with the kernel
The kernel keeps this data in its own internal data structures. It also maintains, for each file descriptor, a list of application threads to wake up. This lets the kernel wake threads efficiently when a file descriptor (socket) becomes ready for reading (the socket buffer contains data) or writing (the buffer has room for outgoing data)
The kernel also does other optimizations such as coalescing multiple events of a single file descriptor
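As a concrete illustration of this registration model, here is a minimal epoll sketch in C (Linux-only, error handling omitted): the descriptor is registered once, and the kernel reports readiness later.

```c
/* Register once, then let the kernel tell us when the descriptor
 * is ready. Linux-only; error handling omitted for brevity. */
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

int main(void) {
    int epfd = epoll_create1(0);

    struct epoll_event ev = {0};
    ev.events = EPOLLIN;                /* interested in readability */
    ev.data.fd = STDIN_FILENO;
    epoll_ctl(epfd, EPOLL_CTL_ADD, STDIN_FILENO, &ev);  /* register once */

    struct epoll_event ready[8];
    int n = epoll_wait(epfd, ready, 8, -1);  /* kernel wakes us when ready */
    for (int i = 0; i < n; i++)
        printf("fd %d is ready for reading\n", ready[i].data.fd);

    close(epfd);
    return 0;
}
```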
This idea mainly originated in a paper by Banga et al., which inspired the development of kqueue and epoll.
Even before these system calls were available, I/O multiplexing existed in the form of the system calls select and poll, which did not scale well. select and poll require the application thread to submit, on every call, the full list of file descriptors it is interested in; the kernel keeps no state between calls. The result is repeated scans of the list by both the kernel and the application, which causes the scalability issues.
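For contrast, a minimal select() sketch makes that per-call cost visible: the kernel rewrites the fd set in place, so the application must rebuild and resubmit it on every iteration.

```c
/* select() keeps no state in the kernel: the fd set is rewritten
 * by each call, so it must be rebuilt and resubmitted every time. */
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

int main(void) {
    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);                  /* rebuilt on every iteration */
        FD_SET(STDIN_FILENO, &readfds);
        if (select(STDIN_FILENO + 1, &readfds, NULL, NULL, NULL) <= 0)
            break;
        if (FD_ISSET(STDIN_FILENO, &readfds)) {
            char buf[256];
            ssize_t len = read(STDIN_FILENO, buf, sizeof buf);
            if (len <= 0) break;            /* EOF or error */
            printf("read %zd bytes\n", len);
        }
    }
    return 0;
}
```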
Now, as for asynchronous I/O, I think it mainly refers to the POSIX AIO specification. Given the way network I/O is handled by I/O multiplexing, the POSIX specification could be useful for disk I/O only. However, libuv doesn't use it, and it is probably not in use by any web server, mainly because of poor implementations, because not all disk operations can be asynchronous, and so on. The detailed list of reasons it is not used by libuv is mentioned here.
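For the disk side, here is a simplified sketch in the spirit of libuv's thread-pool approach (not its actual implementation; the request queue and event-loop wakeup are omitted): the blocking pread() runs on a worker thread while the main thread stays free.

```c
/* Disk I/O in the spirit of libuv's thread pool: the blocking
 * pread() happens on a worker thread, keeping the main thread
 * free for the event loop. Simplified sketch only. */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

struct disk_req { int fd; char buf[4096]; ssize_t result; };

static void *worker(void *arg) {
    struct disk_req *req = arg;
    req->result = pread(req->fd, req->buf, sizeof req->buf, 0); /* blocks here */
    return NULL;
}

int main(void) {
    struct disk_req req = { .fd = open("/etc/hostname", O_RDONLY) };
    pthread_t t;
    pthread_create(&t, NULL, worker, &req);
    /* ... main thread remains free to service network events ... */
    pthread_join(t, NULL);
    printf("worker read %zd bytes\n", req.result);
    close(req.fd);
    return 0;
}
```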
Nonblocking I/O is used in Node.js; Tornado uses an asynchronous, nonblocking model, so more than one operation can be active at the same time.
The NGINX server also uses an asynchronous model.

How does multithreaded kernel work?

I have read that the Linux kernel is multi-threaded and that multiple kernel threads can be running concurrently across the cores. In an SMP (symmetric multiprocessing) environment, where a single OS manages all the processors/cores, how is multithreading implemented?
Are kernel threads spawned, each dedicated to managing one core? If so, when are these kernel threads created? Is it during boot-up, in kern_init(), after bootstrapping is complete and immediately after the application processors are enabled by the bootstrap processor?
So does each core have its own scheduler (implemented by that core's kernel thread) that manages tasks from a common pool shared by all kernel threads?
How does (direct) messaging between kernel threads on different cores happen when one needs to notify another of events it might be interested in?
I also wondered whether one particular core, with one kernel scheduler, acquires a big kernel lock on every system timer interrupt and decides/schedules what to run on each core.
I would appreciate any clarity on the implementation details. Thanks in advance for your help.
Early in kernel startup, a thread is started for each core. It is set to the lowest possible priority and generally does nothing but put the core into a low-power state and wait for an interrupt. When actual work needs to get done, it's done either by threads other than these idle threads or by hardware interrupts, which interrupt either this thread or some other thread.
The scheduler is typically invoked either by a timer interrupt or by a thread transitioning from running to a state in which it's no longer ready to run. Kernel calls that transition a thread to a state in which it's no longer ready to run typically invoke the scheduler to let the core perform some other task.
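A user-space glimpse of that second case (a minimal Linux sketch): both calls below hand the core back to the scheduler, sched_yield() while staying runnable, nanosleep() by becoming unrunnable for a while.

```c
/* Minimal Linux sketch: two ways a thread hands the core back to
 * the scheduler, which then picks another task to run. */
#include <sched.h>
#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };

    sched_yield();        /* still runnable, but invite a reschedule */
    nanosleep(&ts, NULL); /* unrunnable for 100 ms; scheduler runs others */

    printf("back on the CPU after sleeping\n");
    return 0;
}
```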

In single-threaded applications, is that one and only thread a kernel thread?

From Wikipedia:
A kernel thread is the "lightest" unit of kernel scheduling. At least one kernel thread exists within each process.
I've learned that a process is a container that houses memory space, file handles, device handles, system resources, etc., and that the thread is what really gets scheduled by the kernel.
So in single-threaded applications, is that one thread (the main thread, I believe) a kernel thread?
I assume you are talking about this article:
http://en.wikipedia.org/wiki/Kernel_thread
According to that article, in a single-threaded application, since you have only one thread, by definition it has to be a kernel thread; otherwise it would not get scheduled and would not run.
If you had more than one thread in your application, then it would depend on how user-mode multithreading is implemented (kernel threads, fibers, etc.).
It's important to note, however, that it would be a kernel thread running in user mode when executing the application code (unless you make a system call). Any attempt to execute a protected instruction while running in user mode causes a fault that will eventually lead to the process being terminated.
So the kernel thread here is not to be confused with supervisor/privileged mode and kernel code.
You can execute kernel code, but you have to go through a system call gate first.
No. In modern operating systems applications and the kernel run at different processor protection levels (often called rings). For example, Intel CPUs have four protection levels. Kernel code runs at Ring 0 (kernel mode) and is able to execute the most privileged processor instructions, whereas application code runs at Ring 3 (user mode) and is not allowed to execute certain operations. See http://en.wikipedia.org/wiki/Ring_(computer_security)
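You can observe this protection directly. The sketch below (x86 Linux assumed) attempts the privileged hlt instruction from user mode; the CPU raises a fault and the kernel kills the process with a fatal signal rather than letting the instruction execute.

```c
/* Attempting a privileged instruction from user mode (ring 3).
 * hlt is allowed only in ring 0; the CPU raises a general
 * protection fault and the kernel terminates the process. */
#include <stdio.h>

int main(void) {
    printf("about to execute hlt in user mode...\n");
    __asm__ volatile ("hlt");    /* fault here: process dies with SIGSEGV */
    printf("never reached\n");
    return 0;
}
```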

Looking for a Linux threadpool api with OS scheduler support

I'm looking for a thread pool abstraction in Linux that provides the same level of kernel scheduler support that the Win32 thread pool provides. Specifically, I'm interested in finding a thread pool that maintains a certain number of running threads. When a running pool thread blocks on I/O, I want the thread pool to be smart enough to start another thread running.
Anyone know of anything like this for linux?
You really can't do this without OS support. There's no good way to tell that a thread is blocked on I/O. You wind up having to atomically increment a counter before each operation that might block and decrement it after. Then you need a thread to monitor that counter and create an additional thread if it's above zero. (Remove threads if they're idle more than a second or so.)
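A minimal sketch of that counter scheme in C (the spawn_extra_worker helper and the pool plumbing are hypothetical placeholders, not a real API):

```c
/* Counter scheme sketch: pool threads wrap potentially blocking
 * calls so a monitor can see how many are blocked right now. */
#include <stdatomic.h>
#include <unistd.h>

static atomic_int blocked_count;

/* Pool threads call this instead of read() directly. */
static ssize_t pool_read(int fd, void *buf, size_t len) {
    atomic_fetch_add(&blocked_count, 1);  /* entering a blocking call */
    ssize_t n = read(fd, buf, len);
    atomic_fetch_sub(&blocked_count, 1);  /* runnable again */
    return n;
}

/* Monitor thread body: top up the pool while anyone is blocked. */
static void *monitor(void *arg) {
    (void)arg;
    for (;;) {
        if (atomic_load(&blocked_count) > 0) {
            /* spawn_extra_worker();  -- hypothetical, pool-specific */
        }
        usleep(100 * 1000);               /* re-check ~10x per second */
    }
    return NULL;
}
```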
Generally speaking, it's not worth the effort. This only works so well on Windows because it's the "Windows way" and Windows is built from the ground up for it. For Linux, you should be using epoll or boost::asio. Use something that does things the "Linux way" rather than trying to make the Windows way work on non-Windows operating systems.
You can write your own wrappers that use IOCP on Windows, epoll on Linux, and so on. But these already exist, so you need not bother.

Threads and CPU Affinity

Let's say there are two processors on a machine. Thread A is running on P1 and Thread B is running on P2.
Thread A calls Sleep(10000);
Is it possible that when Thread A starts executing again, it runs on P2?
If yes, who decides this transition? If no, why not?
Does the processor store some record of which threads it is running, or does the OS bind each thread to a processor for its full lifetime?
It is possible. This would be determined by the operating system process scheduler and may also be dependent on the application that is running. No information about previously running threads is kept by the processor, aside from whatever is in the cache.
This depends on many things and behaves differently under different operating systems. See also: Processor Affinity and Scheduling Algorithms. Under Windows you can pin a particular process to a processor core via Task Manager.
Yes, it is possible, though ultimately a thread inherits its CPU (or CPU core) from its process (the executable). Which CPU or CPU core a process runs on for its current quantum (time slice) is decided by the scheduler:
http://en.wikipedia.org/wiki/Scheduling_(computing)
The OS decides which processor to run the thread on, and it may easily change during the lifetime of that thread, especially if there is a context switch (caused by the sleep). It's completely possible if the system is loaded that both threads will be running on the same processor (or core), just at different times. Or if there isn't any load on the system, both threads may continue to run on separate processors.
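On Linux you can watch and control this from user space. A minimal sketch, assuming a multi-core machine: print the current core before and after a sleep (the thread may have migrated), then pin the thread to core 0 so the scheduler can no longer move it.

```c
/* Observe migration, then pin. Linux-specific (sched_getcpu,
 * sched_setaffinity); assumes a machine with more than one core. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    printf("before sleep: core %d\n", sched_getcpu());
    sleep(2);                                /* scheduler may move us */
    printf("after sleep:  core %d\n", sched_getcpu());

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);                        /* allow core 0 only */
    sched_setaffinity(0, sizeof set, &set);  /* pid 0 = calling thread */
    printf("after pinning: core %d\n", sched_getcpu());
    return 0;
}
```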
