libuv: multiple event loops for TCP connections? - multithreading

I have been looking at libuv, and learned that due to its non-blocking style, it can handle lots of TCP connections simultaneously on just one thread. On a multi-core machine, it follows that it could handle even more connections, so I would like to investigate this.
I have read quite a lot of material, but have not been able to find an example of using multiple event loops, each one within its own thread, for handling TCP connections. The closest I have come is the multi-echo-server example, but this uses multiple processes, not threads.
Can anyone point me at an example of using multiple event loops, each in their own thread, for TCP connections, please?
Would using multiple processes (like nginx does) - as opposed to multiple threads - be generally a better idea? (Perhaps this is why I cannot find an example of what I want?)

Related

Sockets - select / thread / both

Recently I have been learning about network programming. I know that for a server to handle multiple clients, it needs to use select or threads (at least in Python/C/C++; I don't know of anything similar to select in Java, where I only know the thread approach).
I have read that using select is better from a performance point of view and that threads are better for small servers. However, yesterday I found this page: http://www.assembleforce.com/2012-08/how-to-write-a-multi-threading-server-in-python.h and I do not understand why the author of the provided code uses both select and threads. It's difficult for me to understand exactly how it works and why it is better than the other methods I mentioned. I do not understand the idea behind this code.
Thank you.
Threads and select are not mutually exclusive.
Multi-threading is a form of parallel processing, allowing a single process to seemingly perform multiple tasks in an asynchronous manner.
Using select allows your program to monitor a file descriptor (e.g., a socket), waiting for an event.
Both can be (and, to my knowledge, frequently are) used together. In a network server environment, threading can be used to service multiple clients, while select is used so that one of the threads will not hog CPU time while idling.
Imagine that you are receiving data from multiple clients. A thread is waiting for data from client1, which is taking too long; meanwhile, client2 is sending data like crazy. You have three options:
Without select, using blocking calls: Block waiting for data from client1, and leave client2 waiting.
Without select, using non-blocking calls: Continuously poll client1, giving up after n tries without any data transfer.
With select: Monitor the clients' sockets. If they have data to transfer, read it. Else, relinquish the current thread's CPU time.
This is a simple non-blocking approach to network servers, trying to give a low-latency response to clients. There are different approaches, and for that I recommend you check the book UNIX Network Programming.
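The third option can be sketched in Python as follows. This is only a minimal illustration, not the code from the linked page: the helper names read_ready and demo are made up for this example, and socket.socketpair() stands in for real client connections so the snippet runs without a network.

```python
import select
import socket


def read_ready(sockets, timeout=1.0):
    """Return {socket: data} for every socket that has data within `timeout`."""
    # select() sleeps (without spinning the CPU) until at least one socket
    # is readable or the timeout expires.
    readable, _, _ = select.select(sockets, [], [], timeout)
    return {s: s.recv(4096) for s in readable}


def demo():
    # Each socketpair gives us (server side, client side) of one "connection".
    srv1, client1 = socket.socketpair()
    srv2, client2 = socket.socketpair()
    try:
        # client1 stays silent, client2 sends right away.
        client2.sendall(b"hello from client2")
        ready = read_ready([srv1, srv2])
        # Label the results so they are easy to inspect.
        return {("client1" if s is srv1 else "client2"): data
                for s, data in ready.items()}
    finally:
        for s in (srv1, client1, srv2, client2):
            s.close()
```

Calling demo() reports data only for client2: the silent client1 neither blocks the thread nor has to be polled in a busy loop.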

Calling accept() from multiple threads

I'm writing a concurrent TCP server that has to handle multiple connections with the 'thread per connection' approach (using a thread pool). My question is about the best way for every thread to get a different file descriptor.
I found that the following two methods are the most recommended:
A main thread calls accept() for all the incoming connections and stores their descriptors in a data structure (e.g. a queue). Then every thread is able to get an fd from the queue.
accept() is called directly from every thread. (Recommended in Unix Network Programming, Volume 1.)
Problems I see with each of them:
The static data structure that stores all the fds must be locked (mutex_lock) before a thread can read from it, so if a considerable number of threads want to read at exactly the same moment, I don't know how long it would take until all of them get their fd.
I've been reading that the Thundering Herd problem related to simultaneous accept() calls has not been totally solved on Linux yet, so maybe I would need to create an artificial solution to it that would end up making the application at least as slow as with approach 1.
Sources:
Some links talking about approach 2: does-the-thundering-herd-problem-exist-on-linux-anymore, and one (outdated) article I found about it: linux-scalability/reports/accept.html
And an SO answer that recommends approach 1: can-i-call-accept-for-one-socket-from-several-threads-simultaneously
I'm really interested in the matter, so I will appreciate any opinion about it :)
As mentioned in the Stack Overflow answer you linked, a single thread calling accept() is probably the way to go. You mention concerns about locking, but these days you will find lock-free queue implementations available in Boost.Lockfree, Intel TBB, and elsewhere. You could use one of those if you like, but you might just use a condition variable to let your worker threads sleep and wake one of them when a new connection is established.
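As a rough illustration of approach 1, here is a small Python sketch (Python rather than C, purely to keep it short). One thread calls accept() and pushes each socket onto a queue.Queue, which internally uses a lock plus condition variables, so idle workers sleep until a connection arrives. The names worker, acceptor and run_demo, the "upper-case echo" behaviour, and the None shutdown sentinel are all assumptions made for this example.

```python
import queue
import socket
import threading


def worker(conn_queue):
    """Take accepted connections off the queue and answer one message each."""
    while True:
        conn = conn_queue.get()        # sleeps until a connection is queued
        if conn is None:               # sentinel: shut this worker down
            break
        with conn:
            data = conn.recv(1024)
            conn.sendall(data.upper())


def acceptor(listener, conn_queue, n_connections):
    """The only thread that calls accept(); hands each socket to the pool."""
    for _ in range(n_connections):
        conn, _addr = listener.accept()
        conn_queue.put(conn)           # wakes exactly one sleeping worker


def run_demo(n_workers=3, n_clients=5):
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks one
    port = listener.getsockname()[1]
    conn_queue = queue.Queue()

    workers = [threading.Thread(target=worker, args=(conn_queue,))
               for _ in range(n_workers)]
    for t in workers:
        t.start()
    acc = threading.Thread(target=acceptor,
                           args=(listener, conn_queue, n_clients))
    acc.start()

    # Simulated clients: connect, send one message, read the reply.
    replies = []
    for i in range(n_clients):
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(f"msg{i}".encode())
            replies.append(client.recv(1024))

    acc.join()
    for _ in workers:                  # one sentinel per worker
        conn_queue.put(None)
    for t in workers:
        t.join()
    listener.close()
    return replies
```

The queue is only locked for the brief moment a descriptor is put or taken, so contention between workers is usually small compared with the time spent serving a connection.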

Would handling each TCP connection in a separate thread improve latency?

I have an FTP server, implemented on top of QTcpServer and QTcpSocket.
I take advantage of the signals and slots mechanism to support multiple TCP connections simultaneously, even though I have a single thread. My code returns as soon as possible to the event loop, it doesn't block (no wait functions), and it doesn't use nested event loops anywhere. That way I already have cooperative multitasking, like Win3.1 applications had.
But a lot of other FTP servers are multithreaded. Now I'm wondering if using a separate thread for handling each TCP connection would improve performance, and especially latency.
On one hand, threads add to latency because you need to start a new thread for each new connection, but on the other, with my cooperative multitasking, other TCP connections have to wait until I've returned to the main loop before their readyRead()/bytesWritten() signals can be handled.
In your current system, and ignoring file I/O time, one processor is always doing something useful if there's something useful to be done, and waiting ready-to-go if there's nothing useful to be done. If this were a single processor (single core) system you would have maximized throughput. This is often a very good design -- particularly for an FTP server where you don't usually have a human waiting on a packet-by-packet basis.
You have also minimized average latency (for a single processor system). What you do not have is consistent latency. Measuring your system's performance is likely to show a lot of jitter -- a lot of variation in the time it takes to handle a packet. Again, because this is FTP and not real-time process control or human interaction, jitter may not be a problem.
Now, however, consider that there is probably more than one processor available on your system and that it may be possible to overlap I/O time and processing time.
To take full advantage of a multi-processor (multi-core) system you need some concurrency.
This normally translates to using multiple threads, but it may be possible to achieve concurrency via asynchronous (non-blocking) file reads and writes.
However, adding multiple threads to a program opens up a huge can of worms.
If you do decide to go the MT route, I'd suggest that you consider depending on a thread-aware I/O library. Qt may provide that for you (I'm not sure). If not, take a look at boost::asio (or ACE for an older, but still solid solution). You'll discover that using the MT capabilities of such a library involves a considerable investment in learning time; however, as it turns out, the time to add on multithreading "by hand" and get it right is even worse.
So I'd say stay with your existing solution unless you are worried about unused processor cycles and/or jitter, in which case start learning Qt's multithreading support or boost::asio.
Do you need to start a new thread for each new connection? Could you not just have a pool of threads that acts on requests as and when they arrive? This should reduce some of the latency. I have to say that in general a multi-threaded FTP server should be more responsive than a single-threaded one. Is it possible to have an event-based FTP server?

What logically is an event loop in a thread?

I came across a comparison of Node.js and Python's Tornado with Apache.
They say:
Apache makes a thread for every connection.
Node.js and Tornado run an event loop on a thread, and a single thread can handle many connections.
I don't understand what, logically, can be a child of a thread.
In computer science terms:
Processes have isolated memory and share CPU with context switches.
Threads divide a process.
Therefore, a process with multiple control points is achieved by multiple threads.
Now,
How does an event loop work within a thread?
How can it handle different connections under the control of one thread?
Update:
I mean if there is communication with 3 sockets under 1 thread, how can 1 thread communicate with 3 sockets without keeping any of them waiting?
An event loop at its basic level is something like:
    while (getNextEvent(&event)) {   /* fetch the next event from the queue */
        dispatchEvent(&event);       /* hand it to its handler */
    }
In other words, it's nothing more than a loop which continuously retrieves events from a queue of some description, then dispatches the event to an event handling procedure.
It's likely you know that already but I'm just explaining it for context.
In terms of how the different servers handle it, it appears that every new connection being made in Apache has a thread created for it, and that thread is responsible for that connection and nothing else.
For the other two, it's likely that there are a "set" number of threads running (though this may actually vary based on load) and a connection is handed off to one of those threads. That means any one thread may be handling multiple connections at any point in time.
So the event in that case would have to include some details as to what connection it applies to, so the thread can keep the different connections isolated from each other.
There are no doubt pros and cons to both options. A one-connection-per-thread option would have simplified code in the thread function, since it doesn't have to deal with multiple connections, but it may end up with a lot of resource usage as the load gets high.
In a multiple-connection-per-thread scenario, the code is a little more complex but you can generally minimise thread creation and destruction overhead by simply having the maximum number of threads running all the time. Outside of high-load periods, they'll just be sitting around doing nothing, waiting on a connection event to be given to them.
And, even under high load, it may be that each thread can quite easily process five concurrent connections without dropping behind which would mean the one-connection-per-thread option was a little wasteful.
Based on your update:
I mean if there is communication with 3 sockets under 1 thread, how can 1 thread communicate with 3 sockets without keeping any of them waiting?
There are a great many ways to do this. For a start, it would generally all be abstracted behind the getNextEvent() call, which would probably be responsible for handling all connections and farming them out to the correct threads.
At the lowest levels, this could be done with something like a select call, a function that awaits activity on one of many file descriptors, and returns information relating to which file descriptor has something to say.
For example, you provide a file descriptor set of all currently open sockets and pass that to select. It will then give you back a modified set, containing only those that are of interest to you (such as ready-to-read-from).
You can then query that set and dispatch events to the corresponding thread.
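To make this concrete, here is a hedged Python sketch of one thread serving three sockets. It uses the standard selectors module (a thin wrapper over select/epoll/kqueue) in the role of getNextEvent(), and calling the stored callback plays the role of dispatchEvent(). The names run_event_loop, on_readable and demo are invented for this example, and socket.socketpair() stands in for real connections.

```python
import selectors
import socket


def run_event_loop(sel):
    """Dispatch ready sockets to their callbacks until none are registered."""
    while sel.get_map():
        # Like getNextEvent(): sleep until some registered socket is ready.
        events = sel.select(timeout=1.0)
        if not events:
            break                      # nothing happened for a while: stop
        for key, _mask in events:
            callback = key.data        # handler stored at registration time
            callback(key.fileobj)      # like dispatchEvent()


def demo():
    sel = selectors.DefaultSelector()
    names = {}
    received = []

    def on_readable(sock):
        # Called only when recv() is known not to block.
        received.append((names[sock], sock.recv(1024)))
        sel.unregister(sock)
        sock.close()

    clients = []
    for i in range(3):
        server_side, client_side = socket.socketpair()
        server_side.setblocking(False)
        names[server_side] = f"conn{i}"
        sel.register(server_side, selectors.EVENT_READ, data=on_readable)
        clients.append(client_side)

    # The peers speak in a different order than they connected.
    for i in (2, 0, 1):
        clients[i].sendall(f"hello {i}".encode())

    run_event_loop(sel)                # one thread, three connections
    for c in clients:
        c.close()
    sel.close()
    return sorted(received)
```

The thread never waits on one particular socket; it waits on all of them at once and only touches the ones that are ready, which is why no connection keeps the others waiting.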

High Performance Socket Server using Perl

I need to write a socket server using Perl which will run on 64-bit Linux (2.6.x kernel). Is there a library to support IO Completion Ports or some equivalent on Linux?
I need to listen on multiple ports (8000-8100). Is there a smart way of doing this?
The protocol has to use a length byte.
What threading library do you recommend? I have written something similar on Windows using a thread scheduler based on cooperative multitasking. I mean I want to avoid creating a thread for each socket, in order to handle more than 10,000 simultaneous connections.
Thanks in advance.
Threading in Perl is generally not advised.
Instead, for high performance, you should consider looking into non-blocking or event-driven programming.
With regular sockets, your process blocks on every I/O operation, i.e. reading from a socket that isn't ready will put your process to sleep until data is available. With non-blocking/event-driven code, you poll the sockets and get callbacks when the sockets are ready to be read from or written to, so a single process can multiplex many sockets. This scales well, since you don't need to fork new processes to handle more clients.
There are many good event-based frameworks in Perl, e.g. POE and AnyEvent. POE is a specific event loop with lots of modules and features, and AnyEvent is an abstraction layer that lets you use multiple event loops in the same code.
You should also look into libev, which is similar to POE but with a lot less overhead.
Writing event-driven code is somewhat tricky at first, since you need to be careful with the blocking code you do have, e.g. CPU-intensive operations or libraries that aren't non-blocking. Since you have only one process, if it's busy doing something, it can't do anything else, like polling the sockets and issuing callbacks.
So, if you need both non-blocking I/O and intensive computations, one way to do it is to create worker forks and use non-blocking pipes to communicate between them and your event loop, which is really straightforward with the above libraries.
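The worker-fork idea is not specific to Perl. As a language-neutral illustration, here is a minimal Python sketch (Unix only, since it relies on os.fork). It does not use POE, AnyEvent or libev; the standard selectors module plays the part of the event loop, and heavy_computation, start_worker and run_demo are hypothetical names chosen for this example.

```python
import os
import selectors


def heavy_computation(n):
    """Stand-in for CPU-bound work that would otherwise stall the event loop."""
    return sum(i * i for i in range(n))


def start_worker(n):
    """Fork a child that computes the result and writes it to a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: do the heavy work, send the result, exit at once.
        try:
            os.close(read_fd)
            os.write(write_fd, str(heavy_computation(n)).encode())
        finally:
            os._exit(0)
    # Parent process: keep only the read end and make it non-blocking.
    os.close(write_fd)
    os.set_blocking(read_fd, False)
    return pid, read_fd


def run_demo(n=1000):
    sel = selectors.DefaultSelector()
    pid, read_fd = start_worker(n)
    # In a real server the client sockets would be registered here as well.
    sel.register(read_fd, selectors.EVENT_READ)

    chunks = []
    while sel.get_map():
        for key, _mask in sel.select():
            chunk = os.read(key.fd, 4096)
            if chunk:
                chunks.append(chunk)
            else:
                # Empty read: the worker closed its end of the pipe.
                sel.unregister(key.fd)
                os.close(key.fd)
    sel.close()
    os.waitpid(pid, 0)                 # reap the child
    return int(b"".join(chunks))
```

Because the pipe is just another file descriptor in the loop, the parent keeps servicing other events while the child computes, and the result arrives as an ordinary "readable" event.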

Resources