New thread per client connection in socket server?

I am trying to optimize a TCP socket server for many simultaneous connections.
Is it considered good practice, or even reasonable, to start a new thread in the listening server every time I receive a connection request?
At what point should I begin to worry about a server built this way? What is the maximum number of background threads I can run before it stops making sense?
The platform is C#, the framework is Mono, the target OS is CentOS, RAM is 2.4 GB, the server is hosted in the cloud, and I'm expecting about 200 connection requests per second.

No, you shouldn't have one thread per connection. Instead, you should be using the asynchronous methods (BeginAccept/EndAccept, BeginSend/EndSend, etc). These will make much more efficient use of system resources.
In particular, every thread you create adds overhead in terms of context switches, stack space, cache misses and so on. Linux is better at managing this stuff than Windows, for example, but that is no excuse to give yourself free rein to create as many threads as you like ;)
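To make the idea concrete, here is a minimal sketch using Python's asyncio rather than the C# BeginAccept/EndAccept calls named above (the language swap is mine, and the names handle_client, echo_once and demo are made up for illustration). It only shows the shape of the approach: one thread and one event loop serve every connection, and each client costs a small coroutine instead of a thread with its own stack.

```python
import asyncio


async def handle_client(reader: asyncio.StreamReader,
                        writer: asyncio.StreamWriter) -> None:
    """Echo each line back. Runs as a coroutine, not as a dedicated thread."""
    try:
        while True:
            line = await reader.readline()   # yields to the loop while waiting
            if not line:                     # empty bytes means the peer closed
                break
            writer.write(line)
            await writer.drain()
    finally:
        writer.close()
        await writer.wait_closed()


async def echo_once(host: str, port: int, message: bytes) -> bytes:
    """Hypothetical test client: send one line and return the echoed line."""
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(message + b"\n")
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply.rstrip(b"\n")


async def demo(client_count: int = 50) -> list[bytes]:
    """Start the server on a free local port and hit it with many clients."""
    server = await asyncio.start_server(handle_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        messages = [f"client-{i}".encode() for i in range(client_count)]
        return await asyncio.gather(
            *(echo_once("127.0.0.1", port, m) for m in messages)
        )


if __name__ == "__main__":
    replies = asyncio.run(demo())
    print(len(replies), "clients served on a single thread")
```

The same structure should carry over to the Begin/End callbacks in C#: the accept call returns immediately, and the runtime calls you back when a connection or data is ready.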

Related

Is using Pool instead of Client in node-postgres useful despite Nodejs being single threaded?

I am using Node.js with Express to build a REST API on a Postgres database, using the node-postgres package.
My question is whether I should use Client or Pool. I found this answer:
How can I choose between Client or Pool for node-postgres
but I don't understand what the use of a Pool would be: since Node.js is single-threaded, there will never be an attempt to use a single connection from two places at the same time, even when concurrent requests occur.
Also, by using a single connection I can benefit much more from prepared statements. I can prepare them during the initialization phase of my app and then execute them whenever a request arrives.

Yes, because PostgreSQL can still work on several connections in parallel (strictly speaking it runs one backend process per connection rather than threads).
While a database request is in flight, your process spends 0% CPU time waiting for it. Yes, you've read that right, zero.
The computer does not execute code in order to wait. Instead it sets up interrupt handlers and tells the hardware (Ethernet card or Wi-Fi module) to send it an interrupt when there is data. Regardless of the number of requests you make to your database, you still only have ONE Ethernet card in your PC. (Some servers have several and increase bandwidth by trunking, but the number of network cards you have has no relationship to the number of threads you are running; it has more to do with the amount of $money you are willing to spend.) Your hardware still basically sends all the requests out one bit at a time.
A traditional multi-threaded server therefore spends exactly the same amount of CPU time as node.js waiting for responses from the database: zero. Where node.js gains efficiency is in not needing to allocate a lot of RAM for each thread, since node only has one thread.
Even when you run your database on the same computer as your node process, communication with the database is not especially parallel: the TCP/IP stack itself more or less serializes it. And although the traffic does not go through the networking hardware, the OS still schedules the responses using OS-level events (instead of hardware interrupts).
So yes, it makes sense for your node.js process to open multiple parallel connections to the database even though node is single-threaded: it lets the database process requests in multiple backend processes at once. You are making use of your database's parallelism instead of forcing it to use only one backend to serve node's single connection.
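A small simulation may help show why the pool matters even with one application thread. This is a sketch in Python's asyncio, not node-postgres code: FakePool and peak_concurrency are hypothetical names, and the database is replaced by a short sleep. It counts how many queries the "database" gets to work on at the same time for a given pool size.

```python
import asyncio


class FakePool:
    """Hypothetical pool handing out one of `size` simulated connections."""

    def __init__(self, size: int) -> None:
        self._idle: asyncio.Queue = asyncio.Queue()
        for conn_id in range(size):
            self._idle.put_nowait(conn_id)
        self.in_flight = 0        # queries the "database" is working on now
        self.peak_in_flight = 0   # highest value in_flight ever reached

    async def query(self, sql: str, delay: float = 0.01) -> str:
        conn_id = await self._idle.get()     # wait for a free connection
        self.in_flight += 1
        self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
        try:
            # Stand-in for the database doing the work. The single
            # application thread is free to start other queries meanwhile.
            await asyncio.sleep(delay)
            return f"{sql} via connection {conn_id}"
        finally:
            self.in_flight -= 1
            self._idle.put_nowait(conn_id)   # hand the connection back


async def peak_concurrency(pool_size: int, request_count: int) -> int:
    """Fire all requests at once and report how many ran in parallel."""
    pool = FakePool(pool_size)
    await asyncio.gather(
        *(pool.query(f"SELECT {i}") for i in range(request_count))
    )
    return pool.peak_in_flight


if __name__ == "__main__":
    for size in (1, 4):
        peak = asyncio.run(peak_concurrency(size, 8))
        print(f"pool size {size}: at most {peak} queries in flight")
```

With a single connection the eight requests queue up behind each other; with four connections the database sees four at a time, although the application itself never uses more than one thread.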

What is the benefit of using Netty 4 NIO on the client side compared to one-thread-per-connection blocking IO?

I can see that on the server side the benefit of NIO is the ability to manage multiple network connections with fewer threads, compared to one-thread-per-connection blocking IO.
However, if I have an IO client which connects to thousands of servers at the same time, can I use a similar approach to manage the IO for these connections with fewer threads? I tried the approach in "Netty 4 multiple client" and found that it spawns a "Reader" thread for each channel it creates.
So, my questions are:
1) What are the benefits of using Netty/NIO on the client side?
2) Is it possible to manage multiple connections with fewer threads on the client side?
Thanks!
I have uploaded the code samples in github: https://github.com/hippoz/ogop-lseb
The sample server/client class is moc.ogop.ahsp.demo.nio.MultipleConnectionNioMain and moc.ogop.ahsp.demo.nio.NettyNioServerMain

Having lots of threads creates a context-switch problem in the kernel: far more memory is loaded into and evicted from each core's caches as the kernel reschedules the threads across the cores.
The benefit of NIO anywhere is performance. That's pretty much the only reason we use it. Using blocking IO is MUCH simpler. Using the worker model and NIO, you can limit the number of threads (and potential computational time) the process uses. So if you have two workers and they go bonkers using 100% CPU time, the whole system won't slow to a crawl, because you have 2-4 more cores available.
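For the second question (fewer threads on the client side), the underlying mechanism is a selector loop. Below is a rough sketch of that mechanism using Python's standard selectors module instead of Netty (my substitution; read_all_replies is a made-up name). To stay self-contained it uses socket.socketpair() so the remote "servers" are just the other ends of local socket pairs. One thread waits on all client sockets at once and reads whichever ones are ready.

```python
import selectors
import socket


def read_all_replies(connection_count: int = 50) -> dict[int, bytes]:
    """Collect one reply from each of many connections using a single thread."""
    selector = selectors.DefaultSelector()
    buffers: dict[int, bytes] = {}
    peers = []

    for conn_id in range(connection_count):
        # socketpair() stands in for connecting to a remote server.
        client, peer = socket.socketpair()
        client.setblocking(False)
        selector.register(client, selectors.EVENT_READ, data=conn_id)
        buffers[conn_id] = b""
        peers.append(peer)

    # Each simulated server sends its reply and closes its end.
    for conn_id, peer in enumerate(peers):
        peer.sendall(f"reply-{conn_id}".encode())
        peer.close()

    # The event loop: block until some sockets are readable, then service them.
    while selector.get_map():
        for key, _events in selector.select(timeout=1.0):
            chunk = key.fileobj.recv(4096)
            if chunk:
                buffers[key.data] += chunk
            else:                            # EOF: this connection is finished
                selector.unregister(key.fileobj)
                key.fileobj.close()

    selector.close()
    return buffers


if __name__ == "__main__":
    replies = read_all_replies()
    print(f"read {len(replies)} replies without creating any extra threads")
```

An event-loop framework does essentially this for you, typically with a small fixed group of loop threads shared by all channels rather than one thread per channel.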
Have fun!
https://en.wikipedia.org/wiki/Context_switch
Why should I use non-blocking or blocking sockets?

Multithreaded Corba Client

There is a lot written on multithreading on the CORBA server side, but I'm interested in the client side. We have a multithreaded client (Solaris, Orbix 6.3) with a CORBA singleton "manager" that initialises the ORB. At runtime, 'lsof' shows only one TCP connection to the CORBA server, so all synchronous calls made from the client worker threads are presumably serialised.
I would like to change this arrangement to take advantage of parallelism, with each thread managing its own connection. I've changed the setup so that, instead of using a singleton, each worker thread calls ORB_init(), etc.
I'm totally puzzled now: 'lsof' now shows 2 TCP connections, but there are 6 worker threads.
Something is not right; I would have expected as many TCP connections as there are worker threads. Maybe the approach is naive: does it make sense, for example, to call ORB_init() per thread?
I'd need someone's opinion on this. Sample code for a multithreaded client would help greatly. Again, this is Orbix 6.3 on Solaris.
Kind regards,
Adrian

The management of connections is implementation-specific for plain CORBA. Each vendor has its own proprietary way of configuring this behavior. The RTCORBA specification, on the other hand, has a standardized way to configure how connections between client and server are used.
I don't know how Orbix works or whether it supports RTCORBA; that is something you could probably find in their manuals. I do know that TAO has a lot of support for threading on the client side. By default, when multiple threads make an invocation to the same server, multiple TCP/IP transports can be open at the same moment.

Thank you guys for your answers. I found, as Johnny says, that this is indeed implementation-specific.
omniORB, for example, has maxGIOPConnectionPerServer (default 5), which is described as:
The maximum number of concurrent connections the ORB will open to a single server. If multiple threads on the client call the same server, the ORB opens additional connections to the server, up to the maximum specified by this parameter. If the maximum is reached, threads are blocked until a connection becomes free for them to use.
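The blocking behaviour described in that parameter text can be modelled with a counting semaphore. The following Python sketch is only a model of the documented semantics, not omniORB's implementation; ConnectionLimiter and run_clients are hypothetical names, and a sleep stands in for the remote call.

```python
import threading
import time


class ConnectionLimiter:
    """Model of a per-server cap: at most `max_connections` calls at once."""

    def __init__(self, max_connections: int = 5) -> None:
        self._slots = threading.BoundedSemaphore(max_connections)
        self._lock = threading.Lock()
        self.active = 0   # connections currently in use
        self.peak = 0     # highest number ever in use at the same time

    def invoke(self, work_seconds: float = 0.02) -> None:
        # Blocks here when every connection is busy, like the threads
        # "blocked until a connection becomes free" in the description.
        with self._slots:
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            time.sleep(work_seconds)          # stand-in for the remote call
            with self._lock:
                self.active -= 1


def run_clients(thread_count: int, max_connections: int) -> int:
    """Run `thread_count` caller threads and return the peak connections used."""
    limiter = ConnectionLimiter(max_connections)
    threads = [threading.Thread(target=limiter.invoke)
               for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return limiter.peak


if __name__ == "__main__":
    peak = run_clients(thread_count=6, max_connections=2)
    print(f"6 caller threads, cap of 2: peak connections = {peak}")
```

Under a cap like this, the number of open connections follows the limit and the momentary demand, not the number of worker threads.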
Unfortunately I haven't yet found out what's the equivalent (if any) for Orbix. It's definitely defaulting to 1 connection. Still googling...
I did find out, though, that as part of a Solaris -> Linux migration we will be moving from Orbix to TAO in a number of months. I'm hoping TAO will be more friendly and customizable.

Orbix internally uses a lot of optimization routines to ensure that connections are used efficiently. Specifically, it's not going to open up multiple connections to the same server endpoint because it's able to multiplex multiple concurrent GIOP requests over the same TCP connection. CORBA deliberately hides connection management from client and server programmers.
I don't believe this is controllable through configuration. Send a support ticket to Progress Support to confirm. You might be able to force it to happen if you move away from the singleton model and initialize a different ORB for each client (each with its own unique ID), but that would be a very heavy-handed and costly solution to a problem that is a little vague. The underlying ORB is already built to optimize for concurrent requests, so I'm not sure what problem it is you're trying to solve.

In my honest opinion, there is no such concept as a multi-threaded client for CORBA applications. On the server side, only one object is registered with the naming service, and it is available to all clients. If you look at the IOR of the object, it will be the same for all clients, so a client can establish at most one connection to that object. It also follows that you cannot get more than one remote object: however many times you look up the object from different clients, they all get the same reference. So, in order to support multi-threading, the server actually has to support different thread policies; with the POA, the server can have different thread policies. Please go through JAVA PROGRAMMING WITH CORBA for more.

I don't know exactly how Orbix works, but normally ORB initialization is done only once, even in a multithreaded setup. The multithreaded (server-side) ORB starts a number of worker threads (on demand, or a fixed number if so configured) to handle incoming connections. Each connection is handled by a worker, which looks up the servant that can handle the request. Normally the real call to the servant is also performed in an extra thread. But you won't see these threads with lsof. Try using ps -eLf or top -H, both of which show threads.
EDIT:
On the client side it depends on how many objects you want to call. One caller thread per object is possible. It is also possible to have more than one caller thread per remote object, but only if it is called from different threads in the client-side logic. (Imagine multiple threads sharing the same remote object.)

Seeking tutorials and information on load-balancing between threads

I know the term "Load Balancing" can be very broad, but the subject I'm trying to explain is more specific, and I don't know the proper terminology. What I'm building is a set of Server/Client applications. The server needs to be able to handle a massive amount of data transfer, as well as client connections, so I started looking into multi-threading.
There are essentially three ways I can see to implement any sort of threading for the server:
1) One thread handling all requests (defeats the purpose of a thread if 500 clients are logged in)
2) One thread per user (creating one thread for each of the 500 clients is risky)
3) A pool of threads which divides the work evenly for any number of clients (what I'm seeking)
The third one is what I'd like to know. This consists of a setup like this:
Maximum 250 threads running at once
500 clients will not create 500 threads, but share the 250
A Queue of requests will be pending to be passed into a thread
A thread is not tied down to a client, and vice-versa
Server decides which thread to send a request to based on activity (load balance)
I'm currently not seeking any code quite yet, but information on how a setup like this works, and preferably a tutorial to accomplish this in Delphi (XE2). Even a proper word or name to put on this subject would be sufficient so I can do the searching myself.
EDIT
I found it necessary to explain a little about what this will be used for. I will be streaming both commands and images, using a double-socket setup with one "Main Command Socket" and another "Add-on Image Streaming Socket". So really one connection is 2 socket connections.
Each connection to the server's main socket creates (or re-uses) an object representing all the data needed for that connection, including threads, images, settings, etc. For every connection to the main socket, a streaming socket is also connected. It's not always streaming images, but the command socket is always ready.
The point is that I already have a threading mechanism in my current setup (1 thread per session object) and I'd like to shift that over to a pool-like multithreading environment. The two connections together require higher-level control over these threads, and I can't rely on something like Indy to keep them synchronized. I'd rather know how things work than learn to trust something else to do the work for me.

IOCP server. It's the only high-performance solution. It's essentially asynchronous in user mode ("overlapped I/O" in M$-speak): a pool of threads issues WSARecv, WSASend and AcceptEx calls, and then all of them wait on an IOCP queue for completion records. When something useful happens, a kernel threadpool performs the actual I/O and then queues up the completion records.
You need at least a buffer class and socket class, (and probably others for high-performance - objectPool and pooledObject classes so you can make socket and buffer pools).

500 threads may not be an issue on a server class computer. A blocking TCP thread doesn't do much while it's waiting for the server to respond.
There's nothing stopping you from creating some type of work queue on the server side, served by a limited size pool of threads. A simple thread-safe TList works great as a queue, and you can easily put a message handler on each server thread for notifications.
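As a rough illustration of that work-queue arrangement, here is a sketch in Python rather than Delphi (so queue.Queue plays the role of the thread-safe list, and run_pool is a made-up name). A fixed set of worker threads pulls requests from one shared queue, so no thread is tied to a particular client, and the "work" is just upper-casing a string.

```python
import queue
import threading


def run_pool(requests, worker_count: int = 4):
    """Process (client_id, payload) requests with a fixed pool of threads."""
    work: queue.Queue = queue.Queue()
    results = []
    threads_used = set()
    results_lock = threading.Lock()

    def worker() -> None:
        while True:
            item = work.get()          # blocks until a request is available
            if item is None:           # sentinel: no more work for this thread
                work.task_done()
                break
            client_id, payload = item
            processed = payload.upper()    # stand-in for real request handling
            with results_lock:
                results.append((client_id, processed))
                threads_used.add(threading.current_thread().name)
            work.task_done()

    workers = [threading.Thread(target=worker, name=f"worker-{i}")
               for i in range(worker_count)]
    for thread in workers:
        thread.start()

    for request in requests:           # 500 clients do not mean 500 threads
        work.put(request)
    for _ in workers:                  # one sentinel per worker
        work.put(None)

    work.join()                        # wait until every item is processed
    for thread in workers:
        thread.join()
    return results, threads_used


if __name__ == "__main__":
    reqs = [(i, f"cmd{i}") for i in range(500)]
    done, used = run_pool(reqs, worker_count=8)
    print(f"{len(done)} requests handled by {len(used)} threads")
```

Idle workers simply take the next item, which gives a basic form of load balancing for free. The usual search terms for this are "thread pool" and "producer-consumer queue".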
Still, at some point you may have too much work, or too many threads, for the server to handle. This is usually handled by adding another application server.
To ensure scalability, code for the idea of multiple servers, and you can keep scaling by adding hardware.
There may be some reason to limit the number of actual work threads, such as limiting lock contention on a database. In general, however, you distribute work by adding threads and let the hardware (CPU, redirector, switch, NAS, etc.) schedule the load.

Your implementation is completely tied to the communications components you use. If you use Indy, or anything based on Indy, it is one thread per connection - period! There is no way to change this. Indy will scale to hundreds of connections, but not thousands. Your best hope of using thread pools with your communications components is IOCP, but here your choices are limited by the lack of third-party components. I have done all this investigation before, and you can see my question at stackoverflow.com/questions/7150093/scalable-delphi-tcp-server-implementation.
I have a fully working distributed development framework (threading and comms) that has been used in production for over 3 years now across more than a half-dozen separate systems and basically covers everything you have asked so far. The code can be found on the web as well.

Is there a limit on how much network bandwidth a single thread can use?

I heard there is some limit on how much network bandwidth a single thread can use. If this is true, is it the reason to use multithreaded programming to achieve maximum bandwidth?

The reason to use multithreading for network tasks is that a thread may be stuck waiting for a response from the remote server. With multiple threads, each working on a different request, there is a good chance that at least one of them is downloading at any given time.

The usual reason for issuing more than one network request at a time (either explicitly with user threads, or implicitly with kernel threads and asynchronous callbacks) is that the effects of network latency can be minimised. Latency can have a large effect. A web connection, for example, needs a DNS lookup first, then a TCP 3-way connect, then some data transfer and finally a 4-way close. If the page size is small and the bandwidth large compared with the latency, most of the time is spent waiting for protocol exchanges.
So, if you are crawling multiple servers, a multithreaded design is hugely faster even on a single-core machine. If you are downloading a single video file from one server, not so much.
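A quick way to see the latency effect is to fake it. In this Python sketch the network round trips are replaced by a fixed sleep, and the URLs and the names fake_fetch, fetch_sequential, fetch_threaded and timed are all hypothetical. It compares fetching pages one after another with fetching them from a small thread pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url: str, latency: float = 0.05) -> str:
    """Stand-in for DNS lookup + connect + transfer: just waits."""
    time.sleep(latency)
    return f"body of {url}"


def fetch_sequential(urls):
    """One request at a time: total time is roughly the sum of all latencies."""
    return [fake_fetch(url) for url in urls]


def fetch_threaded(urls, max_workers: int = 8):
    """Overlap the waits: total time approaches the slowest single request."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_fetch, urls))   # map() keeps input order


def timed(func, *args):
    """Return (result, elapsed_seconds) for one call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    pages = [f"http://example.invalid/page-{i}" for i in range(8)]
    _, sequential_time = timed(fetch_sequential, pages)
    _, threaded_time = timed(fetch_threaded, pages)
    print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

The threads spend nearly all their time asleep, which is why this helps even on a single core. For one large download limited by bandwidth rather than latency, there is little waiting to overlap.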
