How can a thread service two data sockets (not control sockets) equally?

Suppose that we have a single-threaded application that needs to service two clients by writing 1 GB of data to each of two separate TCP sockets (one socket per client). In this situation, how can the thread work on the two tasks equally and continually?
I think this problem exists in server applications like Apache. Take the Apache Web Server as an example: Apache sets a maximum thread limit for itself, say MAX_THREADS. If there are (MAX_THREADS + 1) outstanding requests and sockets, at least one thread must handle two sockets equally. How would Apache handle this situation?
Steve

Usually, when we want to handle several sockets in a single-threaded application, one of the following system calls is used:
select (http://en.wikipedia.org/wiki/Select_%28Unix%29)
poll (http://linux.die.net/man/2/poll)
epoll (http://en.wikipedia.org/wiki/Epoll)
More on these calls can be found in the man pages.
The general idea is to keep the single thread from blocking while it waits on any one resource. Instead, it checks which sockets are ready to send or receive data and services only those.
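To make the idea concrete, here is a minimal sketch in Python, assuming the standard `selectors` module (which wraps select/poll/epoll depending on the platform). The function name `serve_equally`, the chunk size, and the use of `socket.socketpair()` as a stand-in for two real client connections are all assumptions made for illustration; they are not from the original answer. Each ready socket gets at most one chunk per pass, so neither client is starved.

```python
import selectors
import socket


def serve_equally(payload_size=1 << 20, chunk=16 * 1024):
    """Write payload_size bytes to two sockets from one thread, one chunk at a time."""
    sel = selectors.DefaultSelector()
    # socketpair() stands in for two accepted client connections (assumption).
    pairs = [socket.socketpair() for _ in range(2)]
    pending = {}   # server-side socket -> bytes still to send
    received = {}  # client index -> bytes the simulated client has read

    for i, (server_side, client_side) in enumerate(pairs):
        server_side.setblocking(False)
        client_side.setblocking(False)
        pending[server_side] = memoryview(bytes(payload_size))
        received[i] = 0
        sel.register(server_side, selectors.EVENT_WRITE, ("write", i))
        sel.register(client_side, selectors.EVENT_READ, ("read", i))

    while any(count < payload_size for count in received.values()):
        for key, _events in sel.select():
            kind, i = key.data
            sock = key.fileobj
            if kind == "write":
                buf = pending[sock]
                try:
                    # At most one chunk per ready socket per pass.
                    sent = sock.send(buf[:chunk])
                except BlockingIOError:
                    continue  # kernel buffer full; try again on the next pass
                pending[sock] = buf[sent:]
                if len(pending[sock]) == 0:
                    sel.unregister(sock)  # nothing left to write
            else:
                try:
                    data = sock.recv(chunk)
                except BlockingIOError:
                    continue
                received[i] += len(data)

    for server_side, client_side in pairs:
        server_side.close()
        client_side.close()
    sel.close()
    return received
```

In a real server the read side would live in the remote clients; it is folded into the same loop here only so the sketch runs on its own.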

Related

Websockets: listen multiple connections simultaneously?

I am working on a project whose goal is to receive and store real-time data from financial exchanges using websockets. I have some very general questions about the technology.
Suppose that I have two websocket connections open, receiving real time data from two different servers. How do I make sure not to miss any messages? I have learned a bit of asynchronous programming (python asyncio) but it does not seem to solve the problem: when I listen to one connection, I cannot listen to the other one at the same time, right?
I can think of two solutions. The first would require that the servers use a buffer system to send their data, but I do not think this is the case (Binance, Bitfinex...). The second is to listen to each websocket on a different core. If my laptop has 8 cores, I can listen to 8 connections and be sure not to miss any messages. I guess I can then scale up by using a cloud service.
Is that correct or am I missing something? Many thanks.

when I listen to one connection, I cannot listen to the other one at the same time, right?
Wrong.
When using an evented programming design, you will be using an IO "reactor" that adds IO related events to the event loop.
This allows your code to react to events from a number of connections.
It's true that the code reacts to the events in sequence, but as long as your code doesn't "block", these events could be handled swiftly and efficiently.
Blocking code should be avoided and big / complicated tasks should be fragmented into a number of "events". There should be no point at which your code is "blocking" (waiting) on an IO read or write.
This will allow your code to handle all the connections without significant delays.
...the first one would require that the servers use a buffer system to send their data...
Many evented frameworks use an internal buffer that streams to the IO when "ready" events are raised. For example, look up the drain event in node.js (or the on_ready callback in facil.io).
This is a convenience feature rather than a requirement.
The event loop might as well add an "on ready" event and assume your code will handle buffering after partial write calls return EAGAIN / EWOULDBLOCK.
The second solution I see is to listen each websocket using a different core.
No need. A single thread on a single core with an evented design should support thousands (and tens of thousands) of concurrent clients with reasonable loads (per-client load is a significant performance factor).
Attaching TCP/IP connections to a specific core can (sometimes) improve performance, but this is a many-to-one relationship. If we had to dedicate a CPU core per connection, then server prices would shoot through the roof.
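Since the question mentions Python asyncio, the point above can be sketched as follows. This is a hedged illustration, not a websocket client: the two "exchanges" are simulated with `socket.socketpair()` and plain newline-delimited text, and the names `feed`, `listen`, `main`, and `collect` are hypothetical. It shows one thread and one event loop receiving every message from two connections, because each `await` hands control back to the loop while a connection has nothing to read.

```python
import asyncio
import socket


async def feed(sock, name, count):
    """Hypothetical stand-in for a remote exchange pushing messages."""
    _reader, writer = await asyncio.open_connection(sock=sock)
    for n in range(count):
        writer.write(f"{name}:{n}\n".encode())
        await writer.drain()    # wait only while the send buffer is full
        await asyncio.sleep(0)  # yield so the other tasks get a turn
    writer.close()
    await writer.wait_closed()


async def listen(sock, inbox):
    """Read messages from one connection until the peer closes it."""
    reader, writer = await asyncio.open_connection(sock=sock)
    while line := await reader.readline():
        inbox.append(line.decode().strip())
    writer.close()
    await writer.wait_closed()


async def main(count):
    inbox = []
    a_remote, a_local = socket.socketpair()
    b_remote, b_local = socket.socketpair()
    # All four coroutines share one thread and one event loop.
    await asyncio.gather(
        feed(a_remote, "A", count),
        feed(b_remote, "B", count),
        listen(a_local, inbox),
        listen(b_local, inbox),
    )
    return inbox


def collect(count=50):
    return asyncio.run(main(count))
```

Messages that arrive while the loop is busy with the other connection wait in the kernel's socket buffer, so nothing is lost as long as the handlers do not block.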

nodejs cluster distributing connection

The Node.js API docs say:
The cluster module supports two methods of distributing incoming
connections.
The first one (and the default one on all platforms except Windows),
is the round-robin approach, where the master process listens on a
port, accepts new connections and distributes them across the workers
in a round-robin fashion, with some built-in smarts to avoid
overloading a worker process.
The second approach is where the master process creates the listen
socket and sends it to interested workers. The workers then accept
incoming connections directly.
The second approach should, in theory, give the best performance. In
practice however, distribution tends to be very unbalanced due to
operating system scheduler vagaries. Loads have been observed where
over 70% of all connections ended up in just two processes, out of a
total of eight.
I know PM2 uses the first one, but why doesn't it use the second? Just because of the unbalanced distribution? Thanks.

The second approach may add CPU load, because every child process tries to "grab" connections from the socket the master sent.

How to set up communication between two processes?

I have the following situation:
(1) A daemon that does a privileged operation on data that is kept in memory.
(2) A multithreaded server currently running on about 30 cores handling user requests.
The daemon (1) would receive queries from (2), process them one by one, and return an answer. Each query to (1) would never block and only take a fraction of a microsecond on (1) to process, so we are guaranteed to get responses back fast unless (1) gets overrun by too much load.
Essentially, I would like to set up a situation where (1) listens to a UNIX domain socket and (2) writes requests and reads responses. However, I would like each thread of (2) to be able to read and write concurrently. My idea is to have one UNIX socket per thread for communication between (1) and (2), and have (1) block on epoll_wait on these sockets, processing requests one by one. Each thread on (2) would then read and write independently to its socket.
The problem that I see with this approach is that I can't easily grow the number of threads on (2) dynamically. Is there a way to accomplish this that is flexible with respect to runtime configuration? I guess one approach would be to have a large number of sockets; a thread on (2) would pick one socket at random, take a mutex on it, write a query, block waiting for a response, then release the mutex once it gets a response back from (1).
Anyone have better ideas?

A viable option is to go with your own proposal and have each thread create its own socket for communicating with the daemon. You can use stream (TCP) sockets, which make it easy to add more threads dynamically:
The daemon listens on a particular port, using socket(), bind() and listen(). The listening socket is initially the only thing in its epoll_wait set.
The client threads connect to this port with connect().
The daemon accepts the incoming connection (with accept()) to create a new socket, which is added to its epoll_wait set with epoll_ctl().
The above procedure can be used to arbitrarily add as many sockets as you need, all with a single epoll_wait loop on the daemon side.
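The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not the answerer's code: it uses the `selectors` module (backed by epoll on Linux) rather than raw epoll calls, a loopback TCP port chosen by the OS, and an uppercase echo as a placeholder for the privileged operation. The names `run_daemon`, `query`, and `demo` are hypothetical. The same pattern should apply to `AF_UNIX` stream sockets.

```python
import selectors
import socket
import threading


def run_daemon(listener, stop):
    """Single-threaded loop: accept new clients and answer queries one by one."""
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)  # initially the only entry
    while not stop.is_set():
        for key, _events in sel.select(timeout=0.05):
            sock = key.fileobj
            if sock is listener:
                try:
                    conn, _addr = listener.accept()
                except BlockingIOError:
                    continue
                sel.register(conn, selectors.EVENT_READ)  # grow the wait set
                continue
            data = sock.recv(4096)
            if data:
                sock.sendall(data.upper())  # placeholder for the real operation
            else:
                sel.unregister(sock)        # client closed its connection
                sock.close()
    for key in list(sel.get_map().values()):
        key.fileobj.close()
    sel.close()


def query(port, text):
    """What each server thread would do: connect, send a query, read the reply."""
    with socket.create_connection(("127.0.0.1", port)) as s:
        s.sendall(text.encode())
        return s.recv(4096).decode()


def demo(n_threads=8):
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: OS picks one
    port = listener.getsockname()[1]
    stop = threading.Event()
    daemon = threading.Thread(target=run_daemon, args=(listener, stop))
    daemon.start()
    replies = [None] * n_threads

    def worker(i):
        replies[i] = query(port, f"req{i}")

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    stop.set()
    daemon.join()
    return replies
```

Because every client thread opens its own connection, the number of threads can change at runtime without any configuration on the daemon side. The sketch assumes each query and reply fits in a single small message; a real protocol would need framing.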

nodejs - Why can Node.js handle a large number of simultaneous persistent connections?

I know Node.js is good at keeping a large number of simultaneous persistent connections open, for example a chat room with many participants.
I am wondering how it achieves this. After all, it uses TCP/IP, which is encapsulated by the underlying OS, so why can it handle persistent connections so much better than others?
What magic does it have?

Node.js makes all I/O asynchronous. It only runs in a single thread, but will do other requests or operations while waiting on I/O.
In contrast, classical web servers will not serve another request until the previous one is fully done. For this reason, Apache runs several processes at the same time; let's say there are 10 httpd processes, which normally means 10 requests can be served at any one time (*). If the requests take more time to complete, you will serve fewer requests, or will have to spawn more processes, even if a process is doing nothing but waiting for the database to chew up and return data.
A node.js process, faced with a request that will go to the database, leaves the database to work while it goes to serve another request.
*) MPM makes this not quite true, but true enough for all intents and purposes.

Well, the thing is that most web servers (like Apache) work by thread spawning: they spawn a new thread for every incoming HTTP request. These threads are synchronous and blocking in nature, which means they execute the code in the order it is written, and any further computation is blocked by the current I/O or compute task. For example, if you want to listen for an event such as a chat submission, you need a dedicated thread per user listening for this event (per user is necessary for maintaining a persistent connection; there are a few possible optimization techniques, but you can still assume one thread per user), and this thread will be blocked waiting for the event to happen. So for any thread spawning and blocking web-server
JavaScript, on the other hand, is non-blocking (and conducive to asynchronous code) by nature: you register a callback for an event, and whenever the event occurs the callback function is executed. It does not block at any point waiting for the event.
You can find more about this by reading about non-blocking or asynchronous servers.

UNIX socket magic. Recommended for high performance application?

I'm looking into using sendmsg() to transfer an accept()ed socket between processes. In short, I'm trying to build a simple load balancer that can deal with a large number of connections without having to buffer the stream data.
Is this a good idea when dealing with a large number (let's say hundreds) of concurrent TCP connections? If it matters, my system is Gentoo Linux.

You can share the file descriptor as per the previous answer here.
Personally, I've always implemented servers using pre-fork. The parent sets up the listening socket, spawns (pre-forks) children, and each child does a blocking accept. I used pipes for parent <-> child communication.

Until someone does a benchmark and establishes how "hard" it is to send a file descriptor, this remains speculation (someone might pop up: "Hey, sending the descriptor like that is dirt-cheap"). But here goes.
You will likely (see the caveat above) be better off if you just use threads. You can have the following workflow:
Start a pool of threads that just wait around for work. Alternatively, you can spawn a new thread when a request arrives (it's cheaper than you think).
Use epoll(7) to wait for traffic (new connections plus interesting traffic).
When interesting traffic arrives, dispatch a "job" to one of the threads.
Now, this does circumvent the whole descriptor-sending part. So what's the catch? The catch is that if one of the threads crashes, the whole process crashes. So it is up to you to benchmark and decide what's best for your server.
Personally I would do it the way I outlined it above. Another point: if the workers are children of the process doing the accept, sending the descriptor is unnecessary.
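For anyone who wants to measure the descriptor-passing approach from the question, here is a minimal sketch. It assumes Python 3.9+ on a Unix system, where `socket.send_fds()` and `socket.recv_fds()` wrap sendmsg()/recvmsg() with SCM_RIGHTS. The function name `pass_connection` is hypothetical, and a `socketpair()` stands in for an accept()ed TCP connection. The worker is forked before the connection exists, so it can only reach it through the passed descriptor.

```python
import os
import socket


def pass_connection():
    """Hand a connected socket to a forked worker over a UNIX-domain channel."""
    parent_chan, child_chan = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pid = os.fork()
    if pid == 0:
        # Worker process: receive one descriptor and serve the connection.
        try:
            parent_chan.close()
            _msg, fds, _flags, _addr = socket.recv_fds(child_chan, 16, 1)
            conn = socket.socket(fileno=fds[0])
            conn.sendall(b"served by worker")
            conn.close()
        finally:
            os._exit(0)  # never fall back into the parent's code path

    # Parent ("load balancer"): the connection is created only after the fork.
    child_chan.close()
    client, accepted = socket.socketpair()  # stand-in for an accept()ed socket
    socket.send_fds(parent_chan, [b"conn"], [accepted.fileno()])
    accepted.close()  # the worker now owns its own copy of the descriptor
    reply = client.recv(1024)
    os.waitpid(pid, 0)
    client.close()
    parent_chan.close()
    return reply
```

Timing a loop around the `send_fds` call would be one way to replace the speculation above with numbers for a given system.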
