Node.js kernel mode threading

I'm trying to figure out how Node.js (the Windows version of it) works behind the scenes.
I know there are user-mode threads and kernel-mode threads, and I know roughly what the processing model looks like.
I also know that moving from a kernel-mode thread to a user-mode thread is considered a context switch.
Are Node.js's C++ non-blocking worker threads kernel-mode threads? And where does the single event-loop thread live: in kernel mode or user mode?

As you know, Node.js has a single-threaded architecture: the JavaScript environment and the event loop are managed by a single thread, while internally other work is handled by a C++-level thread pool (for example, asynchronous file I/O is handled by libuv's worker threads).
To answer your question: these Node.js C++ non-blocking worker threads are not kernel-mode threads. They are user-mode threads, and so is the event-loop thread. The threads request kernel mode as and when needed.
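For a rough picture of what that C++-level offloading looks like, here is a minimal libuv sketch (libuv being the library Node.js uses internally); it is an illustration, not Node's actual code. work_cb runs on one of libuv's user-mode pool threads, and after_cb runs back on the event-loop thread once the work is done:

```c
#include <stdio.h>
#include <unistd.h>
#include <uv.h>

/* Runs on a libuv thread-pool thread -- an ordinary user-mode thread. */
static void work_cb(uv_work_t *req) {
    sleep(1);  /* stand-in for blocking work (file I/O, DNS, crypto, ...) */
}

/* Runs back on the event-loop thread once the pool thread has finished. */
static void after_cb(uv_work_t *req, int status) {
    printf("work finished, back on the event loop (status=%d)\n", status);
}

int main(void) {
    uv_work_t req;
    uv_queue_work(uv_default_loop(), &req, work_cb, after_cb);
    return uv_run(uv_default_loop(), UV_RUN_DEFAULT);
}
```

Build with something like cc demo.c -luv. Both the pool threads and the loop thread here are plain user-mode threads; they only enter kernel mode for the system calls they make.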
When the CPU is in kernel mode, it is assumed to be executing trusted software. Kernel mode is the highest privilege level, and code running there has full access to all devices. In Windows, only selected code written by Windows developers runs entirely in kernel mode. All user-mode software must request use of the kernel by means of a system call in order to perform privileged operations, such as process creation or I/O.
All processes begin execution in user mode and switch to kernel mode only when obtaining a service provided by the kernel. This change in mode is termed a mode switch, not a context switch, which is the switching of the CPU from one process to another.
I hope it is clear that even user-mode threads can perform privileged operations (such as network access) via system calls, returning to user mode when the requested task is finished. Node.js simply uses system calls.
Source : http://www.linfo.org/kernel_mode.html
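To make the "request use of the kernel by means of a system call" part concrete, here is a small sketch (assuming Linux purely for illustration; Windows does the same thing through its own system-call interface). The process runs in user mode, and each call below drops into kernel mode for the duration of the call and then returns:

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void) {
    const char msg[] = "entering kernel mode just long enough to write this\n";

    /* The usual libc wrapper around the write(2) system call... */
    write(STDOUT_FILENO, msg, strlen(msg));

    /* ...and the same system call invoked explicitly by number. Either way,
       the CPU switches to kernel mode for the call and back to user mode. */
    syscall(SYS_write, STDOUT_FILENO, msg, strlen(msg));
    return 0;
}
```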
Update
I should have mentioned that a mode switch does not always mean a context switch. Quoting Wikipedia:
When a transition between user mode and kernel mode is required in an operating system, a context switch is not necessary; a mode transition is not by itself a context switch. However, depending on the operating system, a context switch may also take place at this time.
What you mention is also correct: a mode switch can cause a context switch, but it does not always do so. It is not desirable to pay for a context switch (a heavy performance penalty) on every mode switch. What happens inside Windows is difficult to say, but most likely a mode switch does not cause a context switch every time.
Regarding the one-to-one threading model: both Windows and Linux follow it, so for each user thread (such as the Node.js event-loop thread) the OS provides a kernel thread, which takes care of the system calls. Node.js can only trigger a mode switch through system calls; context switches are controlled solely by the kernel (the thread scheduler).
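As an illustration of that one-to-one model (a sketch for Linux; Windows behaves analogously with its own APIs), each user-level pthread below is backed by its own kernel thread, visible as a distinct kernel thread ID:

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Each pthread is backed by its own kernel thread, so gettid() differs. */
static void *worker(void *arg) {
    printf("user thread %lu -> kernel thread id %ld\n",
           (unsigned long)pthread_self(), (long)syscall(SYS_gettid));
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}
```

Compile with -pthread. The kernel schedules (and context-switches) those kernel threads; the user code only ever triggers mode switches via system calls.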
Update 2
Yes, HTTP.SYS executes in kernel mode, but there is more to it. Node.js does not use many threads, so less context switching happens between threads than in IIS. The per-request context switching (mode switching) is certainly lower with HTTP.SYS; it is an improvement over the past (which happened to be a disaster), see here. But the context switching caused by IIS's many threads outweighs the reduction gained by using HTTP.SYS, so overall Node.js has fewer context switches.
HTTP.SYS also has other advantages over Node's own HTTP implementation that help IIS. It may be possible (in the future) to use HTTP.SYS from Node itself to gain those advantages, but for now I don't think HTTP.SYS/IIS comes anywhere near Node.js.

Related

Solution to Blocking system call by worker thread in Node.js

I recently learned about user-level threads and kernel-level threads in Tanenbaum's operating systems book. Since user-level threads are handled by library packages, and since I had worked with Node.js a bit, I concluded that Node.js uses libuv for handling worker threads and hence uses user-level threading.
But I wanted to know how Node.js deals with the case where some worker thread makes a blocking system call and the kernel then blocks the entire process, even though some threads are still capable of running.
This isn't what happens in a modern OS. Just because one thread in a process is reading/writing from the disk, the OS does NOT block the entire process from doing anything with its other threads.
Modern hardware uses DMA (Direct Memory Access) for reading/writing to disks precisely so that the CPU does not have to be blocked while a block of data is read from or written to a disk.
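A small sketch of that behaviour (assuming a POSIX system): one thread parks itself in a blocking read(2) while the main thread keeps running, because only the blocked kernel thread is put to sleep, not the whole process.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static int fds[2];

/* This thread blocks inside the read(2) system call until data arrives;
   only this one kernel thread sleeps, not the whole process. */
static void *blocker(void *arg) {
    char c;
    read(fds[0], &c, 1);
    printf("blocked thread woke up\n");
    return NULL;
}

int main(void) {
    pthread_t t;
    pipe(fds);
    pthread_create(&t, NULL, blocker, NULL);

    for (int i = 0; i < 3; i++) {           /* main thread keeps running */
        printf("main thread still running (%d)\n", i);
        sleep(1);
    }
    write(fds[1], "x", 1);                  /* let the blocked thread finish */
    pthread_join(t, NULL);
    return 0;
}
```

This is exactly why libuv hands blocking calls to its thread pool: the pool thread may block in the kernel, but the event-loop thread keeps going.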

Is it possible to circumvent OS security by not using the supplied System Calls?

I understand that an operating system enforces security policies on users when they use the system and filesystem, via the system calls supplied by said OS.
Is it possible to circumvent this security by issuing your own hardware instructions instead of making use of the OS's supplied system-call interface? Even writing a single bit to a file you normally have no access to would be enough.
First, for simplicity, I'm treating the OS and the kernel as the same thing.
A CPU can be in different modes when executing code.
Let's say a hypothetical CPU has just two modes of execution (Supervisor and User).
When in Supervisor mode, you are allowed to execute any instructions, and you have full access to the hardware resources.
When in User mode, there is a subset of instructions you don't have access to, such as instructions that deal with hardware or change the CPU mode. Trying to execute one of those instructions will cause the OS to be notified that your application is misbehaving, and it will be terminated. This notification is done through interrupts. Also, when in User mode, you only have access to a portion of the memory, so your application can't even touch memory it is not supposed to.
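You can see the "misbehaving application gets terminated" case on real hardware too. A sketch (assuming x86-64 Linux): cli is a privileged "disable interrupts" instruction, so executing it from user mode raises a general-protection fault and the kernel kills the process (typically with SIGSEGV):

```c
#include <stdio.h>

int main(void) {
    printf("about to execute a privileged instruction in user mode...\n");

    /* 'cli' (clear interrupt flag) is only legal in supervisor mode; from
       user mode the CPU faults and the kernel terminates this process. */
    __asm__ volatile("cli");

    printf("never reached\n");
    return 0;
}
```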
Now, the trick for this to work is that while in Supervisor Mode, you can switch to User Mode, since it's a less privileged mode, but while in User Mode, you can't go back to Supervisor Mode, since the instructions for that are not permitted anymore.
The only way to go back to Supervisor mode is through system calls, or interrupts. That enables the OS to have full control of the hardware.
A possible example of how everything fits together for this hypothetical CPU:
The CPU boots in Supervisor mode
Since the CPU starts in Supervisor Mode, the first thing to run has access to the full system. This is the OS.
The OS sets up the hardware any way it wants: memory protections, etc.
The OS launches any application you want after configuring permissions for that application. Launching the application switches to User Mode.
The application is running, and it only has access to the resources the OS allowed when launching it. Any access to hardware resources needs to go through system calls.
I've only explained the flow for a single application.
As a bonus to help you understand how this fits together with several applications running, a simplified view of how preemptive multitasking works:
In a real-world situation, the OS will set up a hardware timer before launching any applications.
When this timer expires, it causes the CPU to interrupt whatever it was doing (e.g., running an application), switch to Supervisor Mode, and execute code at a predetermined location, which belongs to the OS and which applications don't have access to.
Since we're back in Supervisor Mode and running OS code, the OS now picks the next application to run, sets up any required permissions, switches to User Mode, and resumes that application.
These timer interrupts are how you get the illusion of multitasking: the OS keeps switching between applications quickly.
The bottom line here is that unless there are bugs in the OS (or the hardware design), the only way an application can go from User Mode to Supervisor Mode is through the OS itself with a System Call.
This is the mechanism I use in my hobby project (a virtual computer) https://github.com/ruifig/G4DevKit.
HW devices are connected to the CPU through a bus, and the CPU communicates with them either via in/out instructions that read/write values at I/O ports (not used much with current HW; in the early age of home computers this was the common way), or a part of device memory is "mapped" into the CPU address space and the CPU controls the device by writing values at defined locations in that shared memory.
None of this should be accessible from the "user level" context in which the OS runs ordinary applications (an application trying to write to that shared device memory would crash with an illegal memory access; in fact that piece of memory is usually not even mapped into user space, i.e., it doesn't exist from the user application's point of view). Direct in/out instructions are blocked at the CPU level too.
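A quick sketch of both restrictions from an unprivileged program's point of view (assuming x86 Linux): it can neither open raw device/physical memory nor grant itself access to I/O ports for in/out instructions.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/io.h>     /* ioperm(); x86 Linux only */

int main(void) {
    /* Raw physical/device memory is off limits without privileges... */
    if (open("/dev/mem", O_RDWR) < 0)
        printf("open(/dev/mem) failed: %s\n", strerror(errno));

    /* ...and so is enabling in/out access to an I/O port (0x378 is the
       classic parallel-port address, used here just as an example). */
    if (ioperm(0x378, 1, 1) < 0)
        printf("ioperm(0x378) failed: %s\n", strerror(errno));
    return 0;
}
```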
The device is controlled by driver code, which either runs in a specially configured user-level context that has the particular ports and memory mapped (the micro-kernel model, where drivers are not part of the kernel, as in MINIX). This architecture is more robust (a crash in a driver can't take down the kernel; the kernel can isolate a problematic driver and restart it, or just kill it completely), but the context switches between kernel and user level are a costly operation, so data throughput suffers a bit.
Or the device driver code runs at kernel level (the monolithic kernel model, like Linux), so any vulnerability in driver code can attack the kernel directly (still not trivial, but a lot easier than trying to tunnel out of the user context through some kernel bug). But the overall I/O performance is better (especially with devices like graphics cards or RAID disk arrays, where data bandwidth runs into GiB per second). This is, for example, why early USB drivers were such a huge security risk: they tended to be quite buggy, so a specially crafted USB device could execute rogue code in a kernel-level context.
So, as Hyd already answered, under ordinary circumstances, when everything works as it should, a user-level application should not be able to emit a single bit outside of its user sandbox, and suspicious behaviour outside of system calls will either be ignored or crash the app.
If you find a way to break this rule, it's a security vulnerability, and those usually get patched ASAP once the OS vendor is notified about them.
Some of the current problems are difficult to patch, though. For example, "row hammering" of current DRAM chips can't be fixed at the SW (OS) or CPU (configuration/firmware flash) level at all! Most current PC HW is vulnerable to this kind of attack.
Or, in the mobile world, devices use radio chips based on legacy designs, with closed-source firmware developed years ago, so if you have enough resources to pay for research on these, it's very likely you could seize a particular device with a fake BTS station sending a malicious radio signal to the target device.
Etc. It's a constant war: vendors and security researchers patch vulnerabilities, while attackers look for, ideally, a zero-day exploit, or at least pick off users who don't patch their devices/SW with known bugs fast enough.
Not normally. If it is possible, it is because of an operating-system software error. If the error is discovered, it is fixed quickly, as it is considered a software vulnerability, which equals bad news.
"System" calls execute at a higher processor level than the application: generally kernel mode (but system systems have multiple system level modes).
What you see as a "system" call is actually just a wrapper that sets up registers then triggers a Change Mode Exception of some kind (the method is system specific). The system exception hander dispatches to the appropriate system server.
You cannot just write your own function and do bad things. True, sometimes people find bugs that allow circumventing the system protections. As a general principle, you cannot access devices unless you do it through the system services.

Is the execution of kernel code for handling a system call from a process considered part of the process?

(I am mainly asking the following OS questions from a computer-science point of view. In the following, if I need to be specific about the OS, I am mainly talking about Linux.)
A process is defined as an execution of one or more programs.
Yet we often distinguish between user programs and an OS kernel (which also consists of programs).
Does a process only execute user programs, not programs in an OS kernel?
When a process issues a system call, the CPU then switches from user mode to kernel mode and executes the system call handler in the kernel code. Is the execution of the system call handler (as part of the kernel code) part of the process, or is it part of the execution of the OS kernel?
Thanks.
In most operating systems, the "kernel" executes in the context of a process. There are some that work differently, but this is the general mechanism in use. A process switches between user mode and kernel mode (and some systems have additional modes).
Does a process only execute user programs, not programs in an OS kernel?
There are no programs in an OS kernel (generally). A process can execute interrupt and exception handlers in kernel mode.
When a process issues a system call, the CPU then switches from user mode to kernel mode and executes the system call handler in the kernel code. Is the execution of the system call handler (as part of the kernel code) part of the process, or is it part of the execution of the OS kernel?
The process. The same thing happens with interrupts.
Bill does an I/O request. Jim's process starts to run. Bill's I/O request completes and triggers an interrupt. Jim's process enters kernel mode and handles Bill's I/O request.
Of course, system security prevents Jim's user mode code from having any access to Bill's data.

In single-threaded applications, is that one and only thread a kernel thread?

Wikipedia says:
A kernel thread is the "lightest" unit of kernel scheduling. At least one kernel thread exists within each process.
I've learned that a process is a container that houses memory space, file handles, device handles, system resources, etc... and the thread is the one that really gets scheduled by the kernel.
So in single-threaded applications, is that one thread (the main thread, I believe) a kernel thread?
I assume you are talking about this article:
http://en.wikipedia.org/wiki/Kernel_thread
According to that article, in a single threaded application, since you have only one thread by definition, it has to be a kernel thread, otherwise it will not get scheduled and will not run.
If you had more than one thread in your application, then it would depend on how user mode multi threading is implemented (kernel threads, fibers, etc ...).
It's important to note, however, that it would be a kernel thread running in user mode when executing the application code (unless you make a system call). Any attempt to execute a protected instruction while running in user mode causes a fault that eventually leads to the process being terminated.
So "kernel thread" here is not to be confused with supervisor/privileged mode and kernel code.
You can execute kernel code, but you have to go through a system call gate first.
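As a quick way to see this on Linux (a sketch; the kernel exposes one directory entry per kernel thread of the process under /proc/self/task), a single-threaded program reports exactly one backing kernel thread:

```c
#include <dirent.h>
#include <stdio.h>

int main(void) {
    /* Each kernel thread backing this process appears as one entry in
       /proc/self/task; a single-threaded program should count exactly 1. */
    int threads = 0;
    DIR *d = opendir("/proc/self/task");
    if (d == NULL)
        return 1;
    for (struct dirent *e; (e = readdir(d)) != NULL; )
        if (e->d_name[0] != '.')
            threads++;
    closedir(d);
    printf("kernel threads in this process: %d\n", threads);
    return 0;
}
```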
No. In modern operating systems applications and the kernel run at different processor protection levels (often called rings). For example, Intel CPUs have four protection levels. Kernel code runs at Ring 0 (kernel mode) and is able to execute the most privileged processor instructions, whereas application code runs at Ring 3 (user mode) and is not allowed to execute certain operations. See http://en.wikipedia.org/wiki/Ring_(computer_security)

Threads and CPU Affinity

Let's say there are two processors on a machine. Thread A is running on P1 and Thread B is running on P2.
Thread A calls Sleep(10000);
Is it possible that when Thread A starts executing again, it runs on P2?
If yes, who decides this transition? If no, why not?
Does the processor store some record of which threads it has been running, or does the OS bind each thread to a processor for its full lifetime?
It is possible. This would be determined by the operating system process scheduler and may also be dependent on the application that is running. No information about previously running threads is kept by the processor, aside from whatever is in the cache.
This depends on many things; it behaves differently depending on the particular operating system. See also: processor affinity and scheduling algorithms. Under Windows you can pin a particular process to a processor core via Task Manager.
Yes, it is possible, though ultimately a thread inherits its CPU (or CPU core) from the process (executable). Which CPU or CPU core a process runs on for its current quantum (time slice) is decided by the scheduler:
http://en.wikipedia.org/wiki/Scheduling_(computing)
-Oisin
The OS decides which processor to run the thread on, and it may easily change during the lifetime of that thread, especially if there is a context switch (caused by the sleep). It's completely possible if the system is loaded that both threads will be running on the same processor (or core), just at different times. Or if there isn't any load on the system, both threads may continue to run on separate processors.
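For a concrete feel of this (a Linux sketch; Windows offers SetThreadAffinityMask for the same idea), sched_getcpu() reports the core the calling thread is currently on, and sched_setaffinity() pins it to an explicit core set:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* The scheduler may resume us on a different core after the sleep. */
    printf("before sleep: cpu %d\n", sched_getcpu());
    sleep(2);
    printf("after sleep:  cpu %d\n", sched_getcpu());

    /* Pin the calling thread (pid 0 = self) to core 0: explicit affinity. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    sched_setaffinity(0, sizeof(set), &set);
    printf("pinned:       cpu %d\n", sched_getcpu());
    return 0;
}
```

Without the explicit affinity mask, the placement after the sleep is entirely up to the scheduler, which is exactly the behaviour described above.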
