Server with thread per request vs pool of fixed threads - multithreading

Which one of these are better to implement in a server? I was wondering about the tradeoffs between spawning a thread per request versus using a worker pool consisting of a fixed number of threads, and how the performance will vary as the number of users increases. These are some questions that came up to my mind what I started to think about the way I want to implement my server.

Related

Tuning of Server Thread pool

I was just playing with threads and see how much CPU they consume. I have checked in two scenarios.
In first scenario I created four threads and started them with infinite loop. Soon those threads consumed my all 4 CPU cores. After checking performance monitor in task manager I found CPU consumption is 100%.
In second scenario when I tried it with web application and in rest controller(using tomcat server 8.5 version) I have run infinite loop. So that if I request url 4 times with browser(with different tabs obviously). My CPU consumption should be 100%. I couldn't see 100% CPU consumption.
Why is there difference?
My Second question is: how would I tune the server thread pool. I have to use more than 4 threads because it might be possible few of them are waiting for IO operation. I am using hibernate as ORM that maintains connection pooling. So how many threads I should use in thread pool as well as connection pool. How would I decide?
We can't answer the first part of your question without seeing your code. But I suspect the problem is in the way that you have implemented the threads in the webapp case. (Because what you report shouldn't happen ...)
The answer to the second part is "trial and error". More specifically:
Make the pool sizes tunable parameters
Develop a benchmark that is representative of your expected system load.
Run benchmark with different settings, measure performance and graph results.
Based on the graph (and other criteria) pick a settings that are the best compromise between performance and resource (e.g. memory) utilization.
Thread pools and connection pools are different, and have different resource implications. The first is (largely) about memory; i.e. thread stacks and temporary objects used by the threads while they are active. The second is (largely) about resources associated with connections (active or idle).

Node: one core, many processes

I have looked up online and all I seem to find are answers related to the question of "how does Node benefit from running in a multi core cpu?"
But. If you have a machine with just one core, you can only be running one process at any given time. (I am considering task scheduling here). And node uses a single threaded model.
My question: is there any scenario in which it makes sense to run multiple node processes in one core? And if the process is a web server that listens on a port, how can this ever work given that only one process can listen?
My question: is there any scenario in which it makes sense to run
multiple node processes in one core?
Yes, there are some scenarios. See details below.
And if the process is a web server that listens on a port, how can
this ever work given that only one process can listen?
The node.js clustering module creates a scenario where there is one listener on the desired port, but incoming requests are shared among all the clustered processes. More details to follow...
You can use the clustering module to run more than one process that are all configured to handle incoming requests on the same port. If you want to know how incoming requests are shared among the different clustered processes, you can read the explanation in this blog post. In a nutshell, there ends up being only one listener on the desired port and the incoming requests are shared among the various clustered processes.
As to whether you could benefit from more processes than you have cores, the answer is that it depends on what type of benefit you are looking for. If you have a properly written server with all async I/O, then adding more processes than cores will likely not improve your overall throughput (as measured by requests per second that your server could process).
But, if you have any CPU-heavy processing in your requests, having a few more processes may provide a bit fairer scheduling between simultaneous requests because the OS will "share" the CPU among each of your processes. This will likely slightly decrease overall throughput (because of the added overhead of task switching the CPU between processes), but it may make request processing more even when there are multiple requests to be processed together.
If your requests don't have much CPU-heavy processing and are really just waiting for I/O most of the time, then there's probably no benefit to adding more processes than cores.
So, it really depends what you want to optimize for and what your situation is.

What is the benifit using netty4 NIO in the client side comparing to the one thread-per-connection blocking IO?

I see from the server side, the benefit of NIO is the capability to manage multiple network connections with fewer thread comparing to the comparing to one thread per connection blocking IO.
However, if I have a IO client which connects to thousand of servers at the same time, can I just have similar approach to manage these connections IO using fewer threads. I tried the approach in Netty 4 multiple client and found it spawn a "Reader" thread for each channel it created.
So, my questions are:
1) what are the benefits using netty/NIO in the client side?
2) is it possible to manage multiple connections with fewer threads in the client side?
Thanks!
I have uploaded the code samples in github: https://github.com/hippoz/ogop-lseb
The sample server/client class is moc.ogop.ahsp.demo.nio.MultipleConnectionNioMain and moc.ogop.ahsp.demo.nio.NettyNioServerMain
Having lots of threads creates a context-switch problem in the kernel where lots more memory is being loaded and unloaded from each core as the kernel tries to reschedule the threads across the cores.
The benefit of NIO anywhere is performance. Thats pretty much the only reason we use it. Using Blocking IO is MUCH more simple. Using the worker model and NIO you can limit the number of threads (and potential computational time) the process uses. So if you have two workers and they go bonkers using 100% cpu time the whole system won't go to a crawl because you have 2-4 more cores available.
Have fun!
https://en.wikipedia.org/wiki/Context_switch
Why should I use non-blocking or blocking sockets?

Is Vibe.D multi-threaded to handle concurrent requests?

I know that Vibe.D implementation is based on Fibers.
But I don't know how high load scenarios are handled by Vibe.D. Is it the scheduler in Vibe.D allocating fibers on multiple threads or just one thread for all fibers?
This consideration is very important because even with the high efficiency of Fibers a lot of CPU time is wasted is no more than one thread is used to attend all incoming requests.
Their front page says yes:
http://vibed.org/
this page has the details
http://vibed.org/features#multi-threading
Distributed processing of incoming connections
The HTTP server (as well as any other TCP based server) can be instructed to process incoming connections across the worker threads of the thread pool instead of in the main thread. For applications that don't need to share state across different connections in the process, this can increase the maximum number of requests per second linearly with the number of cores in the system. This feature is enabled using the HTTPServerOption.distribute or TCPListenOptions.distribute settings.

Number of threads in a middleware application

I am writing an application server (again, non-related with a question I already posted here) and I am wondering what are the strategies to use when creating worker threads that work on the database. Some preliminary dates: the server receives xml and sends back xml, all the requests query a database - each request could take a few milliseconds to a few seconds.
Say for example that your server services a small to medium number of clients which in turn send a small number of requests per connection. Is it safe to have one worker thread per connection or should it be per request? Also should a thread pool be used to limit the resources used by the server or a worker should be added each time a new connection/request is made?
Should the server limit the number of threads it creates to an upper limit?
Hope I am not too vague ... I can hardly keep my eyes open.
If you don't have extensive experience writing application servers is a daunting task. It can be eased by using frameworks like ACE that allow you to build different configurations of your app serving infrastructure like thread per connection, thread pools, leader follower and then load the appropriate configuration with an extensible service framework.
I would recommend to read these books on ACE to get
C++ Network Programming: Mastering Complexity Using ACE and Patterns
C++ Network Programming: Systematic Reuse with ACE and Frameworks
to get an idea about what the framework can do for you.
The way I write apps like this is to make the number of threads configurable via the command line and/or a configuration file. I then do some load testing with different numbers of threads - there is always an optimal number beyond which performance begins to degrade.
If you follow the model adopted by Java EE app server developers, there's a queue for incoming requests and a pool of worker threads to service them. It's one thread per request. When a worker thread fulfills a request it goes back into the pool. If the incoming requests show up faster than the worker thread pool can service them, the queue allows them to stack up until a worker thread is released. Both the queue size and the thread pool can be tuned to match for your situation.
I'd wonder why anyone would feel the need to write their own server from scratch, especially when the scenario you describe is solved so well by others. If your wish is education, good luck. If you think you're going to improve on what's been done in the past, I'd re-examine that assumption.

Resources