I know multithreading is ideal for this situation, but would there be any instance of where this situation could be applicable?
Yes you could be serving multiple clients at once from a single thread. This is typically implemented using the select() or poll() socket functions.
A single threaded select() based polling server can use less system resources than a multi-threaded server.
Related
As far as I know, all IO requests and other asynchronous tasks are done by libuv in nodejs.
I want to know if libuv is using threading. If it is, is it using all available core or not?
First of all, what is libuv. As mentioned in the documentation, it's a multi-platform support library with a focus on asynchronous I/O.
libuv doesn't use thread for asynchronous tasks, but for those that aren't asynchronous by nature.
As an example, it doesn't use threads to deal with sockets, it uses threads to make synchronous fs calls asynchronous.
When threads are involved, libuv uses a thread pool the size of which you can change at compile-time using UV_THREADPOOL_SIZE.
node.js is provided with a precompiled version of libuv and thus a fixed UV_THREADPOOL_SIZE parameter.
It goes without saying that it has nothing to do with the number of cores of your chip.
I'm tempted to affirm that you can safely ignore the topic, for libuv and thus node.js don't use threads intensively for their purposes (unless you are using them in a really perverse way or if you are running an high number of libuv work requests).
Feel free to run an instance of node.js per core if you need as most of the users do.
The design overview section of libuv is also clear enough about this point:
The I/O (or event) loop is the central part of libuv. It establishes the content for all I/O operations, and it’s meant to be tied to a single thread. One can run multiple event loops as long as each runs in a different thread.
The libuv module has a responsibility that is relevant for some particular functions in the standard library. for SOME standard library function calls, the node C++ side and libuv decide to do expensive calculations outside of the event loop entirely.They make something called a thread pool that thread pool is a series of four threads that can be used for running computationally intensive tasks such as hashing functions.
By default libuv creates four threads in this thread pool. Thread Pool in the picture is organized by the Libuv So that means that in addition to that thread used for the event loop there are four other threads that can be used to offload expensive calculations that need to occur inside of our application. Many of the functions include in the node standard library will automatically make use of this thread pool.
Network (Network IO) is responsible for api requests, File system (File IO) is fs module. so node.js single thread delegates those heavy work to the libuv
If you have too many function calls, It will use all of the cores. CPU cores do not actually speed up the processing function calls, they just allow for some amount of concurrency inside of the work that you are doing.
From here:
A single instance of Node.js runs in a single thread. To take
advantage of multi-core systems the user will sometimes want to launch
a cluster of Node.js processes to handle the load.
The cluster module allows easy creation of child processes that all
share server ports.
Multiple processes could be better than multithreading in some cases. Some people even think theads are evil. Maybe node.js is designed in such a way to take advantage of processes better than threads.
I was trying to understand how nodejs can achieve higher concurrency compared to thread-based approaches such as Servlet servers.
I already know that in nodejs "everything runs in parallel except your code", and also there is a backend thread pool in libuv to handle File IO or database calls which are usually the bottlenecks.
So here is my question: if nodejs uses thread pool to handle database calls, how it can service higher concurrent request than Servlet servers such as Tomcat given that Tomcat can also use NIO backed by epoll/kqueue to achieve high concurrency ?
For example, if there's a 100k concurrent request coming in and each requires database operations, if these 100k request are to be serviced concurrently, with nodejs we still end up creating 100k threads which might cause memory exhaustion as Tomcat does. Yes, the 100k threads is just an imagination because (I know) that nodejs has a fixed thread pool and different operations are queued in the event loop, but with Tomcat it handles things in the same way--we also can configure the thread pool size in Tomcat and it also queues request.
Or, am I wrong to say that "nodejs uses backend thread pool in libuv to handle File IO or database calls"? Does nodejs use epoll/kqueue to handle database io without a separate thread?
I was reading this similar question but still didn't get the answer.
if nodejs uses thread pool to handle database calls
That's a wrong assumption. nodejs will typically use networking to talk to a local database running in a different process or on a different host. Networking in node.js does not use threads of any kind - it uses event driven I/O. What the database does for threads is up to the database and independent of node.js since it would be the same no matter which server environment you were using.
node.js does use a thread pool for local disk access, but high scale applications are usually using a database for the crux of their disk access which run in a separate process and have their own I/O optimizations to handle lots of requests. How a given database does it is up to that implementation, but it will not be using a nodejs thread per request.
I was trying to understand how nodejs can achieve higher concurrency compared to thread-based approaches such as Servlet servers.
The general concept is that a properly written server app in node.js uses async I/O for all I/O (except perhaps startup code that only runs during server startup). This means that it can have a lot of requests in-flight at the same time with only a single Javascript thread while most of them are waiting on some type of I/O. If you're going to have a lot of requests in-flight at the same time, it can be a lot more efficient for the system to do it the node.js way of a single thread where all the requests are cooperatively switched vs. using OS threads where every thread has OS overhead associated with it and every pre-emptive thread switch has OS and CPU overhead associated with it.
In node-js, there is no pre-emptive switching between the active requests. Only one runs at a time and it runs until it either finishes or hits an asychronous operation and has nothing else to do until that async I/O operation completes. At that point, the JS engine goes back to the event queue and picks out an event (probably for one of the other requests). This type of cooperate switching can be significantly faster and more efficient than OS-level threads. There is sometimes a programming cost in that a node.js developer has to code with async I/O in order to take advantage of this which has a learning curve in order to get proficient at writing good, clean code with proper error handling and has a learning curve for debugging it too.
For example, if there's a 100k concurrent request coming in and each requires database operations, if these 100k request are to be serviced concurrently, with nodejs we still end up creating 100k threads which might cause memory exhaustion as Tomcat does.
No, you will not be creating 100k threads. A node.js database interface layer that interfaces between node.js and the actual database code in another process or on another host may be written entirely in node.js (using TCP networking to talk to the database) and introduce no new threads at all or it may have some native code and use a small number of threads for its own native code operations, but it will likely be a small number of threads and nothing even close to one per request.
Or, am I wrong to say that "nodejs uses backend thread pool in libuv to handle File IO or database calls"? Does nodejs use epoll/kqueue to handle database io without a separate thread?
For file I/O, yes it uses a thread pool in libuv. For database calls, no - While the details depend entirely upon the database implementation, usually there is not a thread per database call. The database is typically in another process and the nodejs interface library for the DB either directly uses nodejs TCP to talk to the database (which uses no threads) or it has its own native code add-on that talks to the database which probably uses a small number of threads for its work, but typically not a thread per request.
I understand how Node.js works with single thread. Mostly it is using asynchronous methods/modules in order to keep the main runtime thread free as much as possible.
However, some of the asynchronous modules internally are using threads to do their job. Example for this is reading file or other high intensive CPU task. This is done in background and it is abstracted for the Node developer.
My question is , how internally Socket.IO works, does it use threads like the above examples ? Does it use separate thread per connection ? If so , does it mean that we will have 1000 threads, if we have 1000 connected clients ?
Node does not use the thread pool (or separate threads) for sockets, instead it uses whatever platform-specific mechanism for polling sockets for data (e.g. epoll on Linux, kqueue on OS X (IIRC), I/O completion ports on Windows, etc.) on the main thread.
Socket.io works on the event loop like most node applications. No tricky thread business AFAIK. You can check out the source yourself here: https://github.com/Automattic/socket.io
We have used rpcgen to create a rpc server on Linux machine (c language).
When there are many calls to our program it still results in a single
threaded request.
I see that it's common problem from 2004, there is a new rpcgen (or other genarator) that solved this problem?
Thanks,
Kobi
rpcgen will simply generate the serialization routines. Your server might be coded to have several threads. Learn more about pthreads.
You probably should not have too many threads (e.g. at most a dozen, not thousands). You could design your program to use some thread pool, or simply to have a fixed set of worker threads which are continuously handling RPC requests (with the main thread just in charge of accepting connections, etc).
Read rpc(3). You might consider not using svc_run in your server, but instead doing it your own way with threads. Beware that if you use threads, you'll need to synchronize, perhaps with mutex.
You could also consider JSONRPC, or perhaps making your C program some specialized HTTP server (e.g. using libonion) and have your clients do HTTP requests (maybe with libcurl). See also this. And you might consider a message passing architecture, perhaps with Open-MPI.
Beware sun version is being abandoned, look for tirpc
I'm following the boost-asio tutorial and don't know how to make a multi-threaded server using boost. I've compiled and tested the daytime client and daytime synchronous server and improved the communication (server asks the client for a command, processes it and then returns the result to the client). But this server can handle only one client at one time.
I would like to use boost to make a multi-threaded server. There is also daytime asynchronous server which executes
boost::asio::io_service io_service;
tcp_server server(io_service);
io_service.run();
in the main program function. The question is - is boost creating a thread for each client somewhere inside? Is this a multi-threaded solution? If not - how to make a multi-threaded server with boost? Thanks for any advice.
have a look at this tutorial. in short terms:
io_service.run() in multiple threads gives a thread pool
multiple io_services give completely separated threads
You don't need to explicitly work with threads when you want to support multiple clients. But for that you should use asynchronous calls (as opposed to synchronous, which are used in the tutorials you listed). Have a look at the asynchronous echo tcp server example, it serves multiple clients without using threads.
is boost creating a thread for each client somewhere inside?
When working with asynchronous calls, boost asio is doing these things behind the scenes. It could use threads, but it usually doesn't because there are other, preferred mechanisms for working with multiple sockets at once. For example on linux you have epoll, select and poll (in order of preference). I'm not sure what the situation is on windows, there might be other mechanisms or the preference order might be different. But in any case, boost asio takes care of this, chooses the best mechanism there is for your platform and hides it behind those asynchronous calls.