I understand that NodeJS uses a thread pool for blocking I/O calls. What does it do if all the threads in the thread pool are busy with some work and another request comes in?
In a case where the thread pool is needed and no workers are available, the request is queued until a worker is free. The thread pool is not the sole approach, though. There are three operation types that utilize the thread pool in libuv, as documented at the bottom of the page here under the title "File I/O".
These operation types are:
Filesystem operations
DNS functions (getaddrinfo and getnameinfo)
User-specified code
While not a direct answer to your question, I believe this post by Jason does a wonderful job of explaining thread pools in Node.js. Without going extremely in-depth, it introduces you to the functionality provided by the libuv library and has links to very informative literature on the subject of the thread pool.
Related
I suppose there is a thread pool which the web server is using to serve requests, so the controllers run within one of the threads of this pool. Say it is the 'serving' pool.
In one of my async action methods I use an async method,
var myResult = await myObject.MyMethodAsync();
// my completion logic here
As explained in many places, we do this so as not to block the valuable serving-pool thread, and instead execute MyMethodAsync on another background thread... then continue the completion logic on a serving-pool thread again, probably a different one, but with the HTTP context and some other minor things marshaled there correctly.
So the background thread on which MyMethodAsync runs must be from another thread pool, otherwise the whole thing makes no sense.
Question
Please confirm or correct my understanding, and in case it is correct, I still don't see why a thread in one pool would be a more valuable resource than a thread in another pool. At the end of the day the whole thing runs on the same hardware with a given number of cores and a given CPU performance...
There is only one thread pool in a .NET application. It has both worker threads and I/O threads, which are treated differently, but there is only one pool.
I suppose there is a thread pool which the web server is using to serve requests, so the controllers run within one of the threads of this pool. Say it is the 'serving' pool.
ASP.NET uses the .NET thread pool to serve requests, yes.
As explained in many places, we do this so as not to block the valuable serving-pool thread, and instead execute MyMethodAsync on another background thread... then continue the completion logic on a serving-pool thread again, probably a different one, but with the HTTP context and some other minor things marshaled there correctly.
So the background thread on which MyMethodAsync runs must be from another thread pool, otherwise the whole thing makes no sense.
This is the wrong part.
With truly asynchronous methods, there is no thread (as described on my blog). While the code within MyMethodAsync will run on some thread, there is no thread dedicated to running MyMethodAsync until it completes.
You can think about it this way: asynchronous code usually deals with I/O, so let's say for example that MyMethodAsync is posting something to an API. Once the post is sent, there's no point in having a thread just block waiting for a response. Instead, MyMethodAsync just wires up a continuation and returns. As a result, most asynchronous controller methods use zero threads while waiting for external systems to respond. There's no "other thread pool" because there's no "other thread" at all.
Which is kind of the point of asynchronous code on ASP.NET: to use fewer threads to serve more requests. Also see this article.
I can see clearly how the cluster method works, as it deploys separate whole processes. And I guess the professional programmers made the "worker_threads" library for some good reason... but I still need to clear up this point for my understanding:
In a normal single-threaded process, the event loop thread has the aid of the default worker pool to offload its heavy I/O tasks, so the main thread is not blocked.
At the same time, user-defined "worker threads" will be used for the same reason, with their own event loops and Node.js instances.
What's the point of spawning those event loops and Node.js instances when they are not the bottleneck, since libuv is the one that manages spawning the workers?
Does this mean that the default worker pool may not be enough? I mean, is it just a matter of quantity, or of concept?
There are two types of operations (calls) in Node.js: blocking and non-blocking.
Non-blocking calls
Node.js uses libuv for non-blocking I/O operations. Network, file, and DNS I/O operations are run asynchronously by libuv. Node.js uses the following scheme:
Asynchronous system APIs are used by Node.js whenever possible, but where they do not exist, libuv's thread pool is used to create asynchronous Node.js APIs on top of synchronous system APIs. Node.js APIs that use the thread pool are:
all fs APIs, other than the file watcher APIs and those that are explicitly synchronous
asynchronous crypto APIs such as crypto.pbkdf2(), crypto.scrypt(), crypto.randomBytes(), crypto.randomFill(), crypto.generateKeyPair()
dns.lookup()
all zlib APIs, other than those that are explicitly synchronous
So we don't have direct access to the libuv thread pool, but we may define our own uses of it through C++ add-ons.
Blocking calls
Node.js executes blocking code on the main thread. fs.readFileSync(), compression algorithms, encrypting data, image resizing, and calculating primes over a large range are some examples of blocking operations. The golden rule of Node.js is: never block the event loop (main thread). We can execute these operations asynchronously by creating a child process using the cluster module or the child_process module. But creating a child process is heavy in terms of OS resources, and that's why worker threads were born.
Using worker_threads you can execute blocking JavaScript code in a worker thread, thus unblocking the main thread, and communicate with the parent (main) thread via message passing. Worker threads are still lightweight compared to a child process.
Read more here:
https://nodesource.com/blog/worker-threads-nodejs
https://blog.insiderattack.net/deep-dive-into-worker-threads-in-node-js-e75e10546b11
I need some clarification on what exactly are the nodejs worker threads doing.
I found contradicting info on this one. Some people say worker threads handle all I/O, others say they handle only blocking POSIX requests (for which there is no async version).
For example, assume that I am not blocking the main loop myself with some unreasonable processing. I am just invoking functions from the available modules and providing the callbacks. I can see that if this requires a blocking or computationally expensive operation, then it is handed off to a worker thread. But for some async I/O, is it initiated from the main libuv loop? Or is it passed to a worker thread, to be initiated from there?
Also, would a Node.js worker thread ever initiate a blocking (synchronous) I/O operation when the OS supports an async mode to do the same thing? Is it documented anywhere what kinds of operations may end up blocking a worker thread for a longer time?
I'm asking this because there is a fixed-size worker pool and I want to avoid making mistakes with it. Thanks.
Network I/O on all platforms is done on the main thread. File I/O is a different story: on Windows it is done truly asynchronously and non-blocking, but on all other platforms synchronous file I/O operations are performed in a thread pool to present an asynchronous, non-blocking interface.
In Apache, we have a single thread for each incoming request. Each thread consumes a memory space. The memory spaces don't collide with each other, which is why each request serves its purpose.
How does this happen in Node.js, as it has single-threaded execution? A single memory space is used by all incoming requests. Why don't the requests collide with each other? What differentiates them?
As you noticed yourself, an event-based model allows the given memory to be shared more efficiently, as the overhead of re-executing a stack again and again is minimized.
However, to make an event-based or single-threaded model non-blocking, you have to get back to threads somewhere, and this is where Node's "I/O engine" libuv comes in.
libuv supplies an API which, underneath, manages I/O tasks in a thread pool when an I/O task is performed asynchronously. Using a thread pool keeps the main process from blocking; extensive JavaScript operations, however, still can block it (which is why there is the cluster module, which allows spawning multiple worker processes).
I hope this answers your question; if not, feel free to comment!
In a recent course at school about networking / operating systems I learned about thread pools. Now the basic functionality is pretty straightforward, and I understand it.
However, what's not specified in my book is what happens when the thread pool is exhausted. For example, you have a pool with 20 threads in it and you have 20 connected clients. Another client tries to connect, but there are no threads left in the pool; what happens then? Does the client go into a queue? Does the system make another thread to put in the pool? Something else?
The answer depends highly on your language, your operating system, and your pool implementation.
what happens when the thread pool is exhausted? Another client tries to connect but there's no threads left in the pool, what happens then? Does the client go in a queue?
Typically, in a server situation, it depends on the socket settings. Either the socket connection gets queued by the OS, or the connection gets refused. This is usually not handled by the thread pool. In Unix-like operating systems, this queue or "backlog" is handled by the listen call.
Does the system make another thread to put in the pool?
This depends on the thread pool. Some pools are fixed-size, so no more threads will be added. Other thread pools are "cached" thread pools, so they will reuse a free thread or create a new one if none are available. Many web servers have max-thread settings on their pools so remote users can't thrash the system by starting too many concurrent connections.
It depends on the policy used by the thread-pool:
the pool size can be static, and when a new thread is requested the caller will wait on a synchronization primitive such as a semaphore, or the request can be pushed into a queue
the pool size can be unlimited, but this may be dangerous because creating too many threads can greatly reduce performance; more often than not it is ranged between a min and a max set by the pool user
the pool can use a dynamic policy depending on the context: hardware resources like CPU or RAM, OS resources like synchronization primitives and threads, current process resources (memory, threads, handles...)
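The first policy (a fixed size with a wait queue) can be sketched in a few lines. FixedPool here is an illustrative name, not a real library, and the tasks are placeholder timers:

```javascript
// Illustrative fixed-size pool: when all slots are busy, new tasks wait
// in a FIFO queue instead of being rejected.
class FixedPool {
  constructor(size) {
    this.free = size;   // acts like a counting semaphore
    this.queue = [];    // tasks waiting for a free slot
  }
  run(task) {
    return new Promise((resolve, reject) => {
      const start = () => {
        this.free--;
        task().then(resolve, reject).finally(() => {
          this.free++;
          if (this.queue.length > 0) this.queue.shift()(); // wake next waiter
        });
      };
      if (this.free > 0) start();
      else this.queue.push(start); // pool exhausted: go in the queue
    });
  }
}

// Five tasks through a pool of two: at most two ever run at once.
const pool = new FixedPool(2);
let active = 0, maxActive = 0, done = 0;
for (let i = 0; i < 5; i++) {
  pool.run(() => new Promise((res) => {
    active++;
    maxActive = Math.max(maxActive, active);
    setTimeout(() => { active--; done++; res(); }, 10);
  }));
}
```

Real pools hand tasks to OS threads rather than timers, but the accounting is the same: a free count plus a queue of waiters.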
An example of a smart thread-pool: http://www.codeproject.com/Articles/7933/Smart-Thread-Pool
It depends on the thread pool implementation. They might be put on a queue, they might get a new thread created for them, or they might even just get an error message saying come back later. Or if you are the one implementing the thread pool, you can do whatever you want.