How does buffering work in sockets on Linux?

How does buffering work with sockets on Linux?
i.e. if the server does not read from the socket and the client keeps sending data.
What will happen then? How big is the socket's buffer? And will the client know, so that it stops sending?

For a UDP socket the client will never know - the server side will simply start dropping packets once its receive buffer is full.
TCP, on the other hand, implements flow control. The server's kernel will gradually shrink the advertised window, so the client will be able to send less and less data. At some point the window drops to zero. At that point the client fills up its own send buffer, and send(2) either blocks or, on a non-blocking socket, fails with EWOULDBLOCK/EAGAIN.
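On a non-blocking socket that looks roughly like the sketch below (fd is assumed to be an already-connected, non-blocking TCP socket; this is an illustration, not a complete client):

```c
/* Minimal sketch: how a non-blocking TCP client notices that flow
 * control has filled its send buffer. */
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

ssize_t try_send(int fd, const char *buf, size_t len)
{
    ssize_t n = send(fd, buf, len, 0);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        /* The local send buffer is full, typically because the peer's
         * receive window has shrunk to zero. Wait for writability
         * (e.g. poll/select for POLLOUT) and retry later. */
        return 0;
    }
    return n;   /* bytes accepted by the kernel, or -1 on a real error */
}
```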

TCP sockets use buffering in the protocol stack. The stack itself implements flow control so that if the server's buffer is full, it will stop the client stack from sending more data. Your code will see this as a blocked call to send(). The buffer size can vary widely from a few kB to several MB.

I'm assuming that you're using send() and recv() for client and server communication.
So, send() will return the number of bytes that were actually sent. This doesn't necessarily equal the number of bytes you wanted to send, so it's up to you to notice that and send the rest.
Now, recv() returns the number of bytes read into the buffer. If recv() returns 0, the other end has closed the connection.
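For example (a sketch assuming a blocking TCP socket; error handling kept minimal):

```c
/* Sketch: handle partial sends on a blocking TCP socket, and treat
 * recv() == 0 as the peer having closed the connection. */
#include <sys/types.h>
#include <sys/socket.h>

int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, buf, len, 0);
        if (n < 0)
            return -1;        /* real error: check errno */
        buf += n;             /* send() may accept fewer bytes than asked */
        len -= (size_t)n;
    }
    return 0;
}

void read_once(int fd)
{
    char buf[4096];
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    if (n == 0) {
        /* the other end performed an orderly close */
    } else if (n > 0) {
        /* process n bytes from buf */
    }
}
```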

Related

Linux Unix sockets vs TCP sockets send buffer size

I am comparing how many bytes a send() call can transfer when the socket is a TCP socket and when it is a Unix domain socket.
For the Unix domain socket the number is always 219264, but for TCP the number is much higher. Why this difference? Both programs are executed on the same machine.
Note: the sockets are in non-blocking mode.
I checked the buffer sizes; these are the values:
Unix domain socket
receive buffer size = 212992
send buffer size = 212992
TCP socket
receive buffer size = 1062000
send buffer size = 2626560
Can someone explain why there is this difference?
The TCP buffer is used for packets which have been sent but not yet acknowledged by the other end, and for packets which have been received out of order and are waiting for delayed packets to arrive before being presented to the application. Of course, data will also stay in the buffer as long as the consuming application doesn't read() it.
Over Unix domain sockets there are no packets waiting for an ACK and no out-of-order packets to reassemble, therefore the buffer can be smaller.
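The sizes above can be read back with getsockopt(). A minimal sketch (the actual numbers depend on kernel sysctls such as net.core.wmem_default / net.core.rmem_default and net.ipv4.tcp_wmem / net.ipv4.tcp_rmem, and TCP may auto-tune its buffers further once the connection is established):

```c
/* Sketch: print the default SO_SNDBUF/SO_RCVBUF of a freshly created
 * TCP socket and Unix domain socket. */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

static void print_bufs(const char *label, int fd)
{
    int snd = 0, rcv = 0;
    socklen_t len = sizeof snd;
    getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &snd, &len);
    len = sizeof rcv;
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcv, &len);
    printf("%s: SO_SNDBUF=%d SO_RCVBUF=%d\n", label, snd, rcv);
}

int main(void)
{
    int tcp = socket(AF_INET, SOCK_STREAM, 0);
    int unx = socket(AF_UNIX, SOCK_STREAM, 0);
    print_bufs("tcp ", tcp);
    print_bufs("unix", unx);
    close(tcp);
    close(unx);
    return 0;
}
```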

Does NodeJS WebSockets (ws) module implement backpressure?

I'm implementing a WebSockets server on NodeJS using the ws module. The server should send an update to all clients once per minute. I have this implemented already, but I have some concerns about how it behaves when client connections stall.
I'm concerned about what happens when a connection to the client becomes inactive, for example due to the network connection breaking in a way that doesn't send a TCP RST or FIN.
I'm somewhat surprised that the send() method is not called with the await keyword in an async method. Does the send() method just queue all data to be sent? What if the socket buffers become full, can send() block in a way that causes starvation of other clients than the blocked one?
If send() never blocks, what happens if the data is queued and queued and queued...? Can it use an ever-increasing unbounded amount of memory?
Ideally, I would like to omit sending a once-per-minute update if the last update hasn't been fully sent. Can I achieve this with the ws module?
I'm concerned about what happens when a connection to the client becomes inactive, for example due to the network connection breaking in a way that doesn't send a TCP RST or FIN.
If the connection is lost in this way (perhaps by the client system being switched off or physically disconnected) then TCP at the server will detect the broken connection because it will receive no acknowledgements of sent data. It can take a couple of minutes for TCP to give up, but in this case that doesn't sound like a big problem.
The worst-case scenario is when the client system remains connected but the client process ceases to read data from the connection. In that case sent data will accumulate at the client until the client's socket receive buffer fills, and then sent data will accumulate at the server -- first in the in-kernel socket send buffer, and then in server process memory.
I'm somewhat surprised that the send() method is not called with the await keyword in an async method.
ws predates async/await and promises by years. I imagine that the API will eventually be retrofitted, but it hasn't happened yet.
Does the send() method just queue all data to be sent? What if the socket buffers become full, can send() block in a way that causes starvation of other clients than the blocked one?
WebSocket.send ends up calling the built-in Net module's Socket.write. (See the sendFrame function at bottom of https://github.com/websockets/ws/blob/master/lib/sender.js for that call, and see https://nodejs.org/docs/latest-v8.x/api/net.html#net_class_net_socket for documentation of the Socket class.)
Socket.write will buffer data in the user process if the kernel can not immediately accept the data. Data is buffered separately per-Socket, so typically this buffering will not affect transmission on other Sockets connected to other clients. However, there's no bound on the amount of data one Socket will buffer. In the extreme case one Socket's buffered data could consume all of the server process's memory, and the resulting server crash would interfere with data delivery to all clients.
There are several ways to avoid this problem. Two easy methods that spring to mind are:
provide a completion callback argument to the send call. That callback will be passed on to the Socket.write call, which will fire the callback when all of that write's data has been written into the kernel. If your server refrains from sending more data to this client until the callback fires, the amount of data buffered in user space for that connection will be limited to something close to the size of the most recent send. (It won't be precisely that size because the buffered data will include WebSocket framing, plus SSL framing and padding if your connection is encrypted, on top of the original data passed to send.) Or
examine the bufferSize property of the connection's Socket before preparing to send data on that connection. bufferSize indicates the amount of data that is currently buffered in user space for that Socket. If it's non-zero, skip the send for that client.

How does Linux select() work?

Could someone explain to me how select() works? I have the wrong mental model and can't work it out from the man page.
If I have a client with multiple sockets to different servers and I periodically read some info from the servers, how does the kernel know which socket to mark as readable? How does it know that a read() on a particular socket will not block? I don't see how that could be predictable without the kernel actually reading data from the server.
The kernel isn't predicting anything. It's telling you the current status of the socket's receive buffer. If the buffer is not empty, the socket is readable. If the buffer is empty, select() waits. When a packet arrives from the server, the kernel matches it to the correct socket using the IP address, protocol, and port numbers. The packet is put into the socket's receive queue and select() is notified that the status has changed.
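A minimal sketch of that, assuming two already-connected sockets fd1 and fd2:

```c
/* Sketch: block in select() until at least one of two connected sockets
 * has data in its receive buffer. */
#include <stdio.h>
#include <sys/select.h>

void wait_for_data(int fd1, int fd2)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd1, &readfds);
    FD_SET(fd2, &readfds);

    int maxfd = fd1 > fd2 ? fd1 : fd2;

    /* Returns once a receive buffer is non-empty (EOF and errors also
     * count as "readable"). */
    if (select(maxfd + 1, &readfds, NULL, NULL, NULL) > 0) {
        if (FD_ISSET(fd1, &readfds))
            printf("fd1 is readable: recv() will not block\n");
        if (FD_ISSET(fd2, &readfds))
            printf("fd2 is readable: recv() will not block\n");
    }
}
```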

Can af_unix socket with SOCK_SEQPACKET be limited to hundreds of bytes? [duplicate]

An example client (http://pastebin.com/hAbpFPia) and server (http://pastebin.com/9pL27hkK) using SOCK_SEQPACKET indicate that the client can queue up ~42k of data.
Using setsockopt(socket, SOL_SOCKET, SO_SNDBUF, ...) to set the size on the client side does limit the buffered data significantly. Is there some way I can enforce/set this limit from the server side? I tried setting SO_RCVBUF on the accept()ed socket and on the socket before calling accept(), but neither works for me.
Using AF_UNIX.
The server cannot enforce a SO_SNDBUF size on the client.
SO_SNDBUF limits the buffer that the sender's OS uses. The sending application can queue up to SO_SNDBUF bytes even if the underlying network cannot transmit the data right now; beyond that, a send blocks (or fails with EWOULDBLOCK on a non-blocking socket). The OS takes care of sending this data once the network stack is able to accept it.
SO_RCVBUF is the buffer that the receiver's OS uses. The network stack receives (and acknowledges to the sender) up to SO_RCVBUF bytes, even if the application does not call recv to retrieve the data. For Unix domain sockets SO_RCVBUF has no effect, but SO_SNDBUF does.
As you can see, the two buffers add up, and they sit at opposite ends of the connection. They do not affect each other, and the receiver cannot enforce an SO_SNDBUF on the sender's side, just as the sender cannot enforce an SO_RCVBUF on the receiver's side.
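So the limit has to be set by the sending side itself. A sketch of shrinking SO_SNDBUF on the client's socket before connect() (the kernel doubles the requested value for bookkeeping overhead and enforces a minimum, so read the option back to see the effective size):

```c
/* Sketch: limit buffering on the sender's side of an AF_UNIX
 * SOCK_SEQPACKET socket by shrinking SO_SNDBUF before connect(). */
#include <stdio.h>
#include <sys/socket.h>

void shrink_sndbuf(int fd)
{
    int wanted = 512;                        /* a few hundred bytes */
    setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &wanted, sizeof wanted);

    int actual = 0;
    socklen_t len = sizeof actual;
    getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len);
    printf("effective SO_SNDBUF: %d\n", actual);
}
```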

A UDP socket-based rateless file transmission

I'm new to socket programming and I need to implement a UDP-based rateless file transmission system to verify a scheme in my research. Here is what I need to do:
I want a server S to send a file to a group of peers A, B, C, etc. The file is divided into a number of packets. At the beginning, peers send a Request message to the server to initialize the transmission. Whenever S receives a request from a client, it ratelessly transmits encoded packets (how to encode is my own design; the encoding itself has erasure-correction capability, which is why I can transmit ratelessly via UDP) to that client. The client keeps collecting packets and trying to decode them. When it finally decodes all packets and reconstructs the file successfully, it sends back a Stop message to the server, and S stops transmitting to this client.
Peers request the file asynchronously (they may request it at different times), and the server has to be able to serve multiple peers concurrently. The encoded packets for different clients are different (they are all encoded from the same set of source packets, though).
Here is what I'm thinking about for the implementation. I don't have much experience with Unix network programming, so I'm wondering if you can help me assess it and see whether it is feasible or efficient.
I'm going to implement the server as a concurrent UDP server with two socket ports (similar to TFTP, according to the UNP book). One is for receiving control messages, which in my case are the Request and Stop messages. The server will maintain a flag (initially 1) for each request. When it receives a Stop message from the client, the flag is set to 0.
When the server receives a request, it will fork() a new process that uses the second socket and port to send encoded packets to the client. The server keeps sending packets to the client as long as the flag is 1. When it turns 0, the sending ends.
The client program is easy: just send a Request, recvfrom() the server, progressively decode the file, and send a Stop message at the end.
Is this design workable? My main concerns are: (1) is forking multiple processes efficient, or should I use threads? (2) If I have to use multiple processes, how can the flag be seen by the child process? Thanks for your comments.
Using UDP for file transfer is not the best idea. There is no way for the server or client to know whether any packet has been lost, so you would only find out during reconstruction, assuming you have some mechanism (like a counter) to detect lost packets. It would then be hard to request just the packets that got lost. And in the end you would have code that does what TCP sockets already do. So I suggest starting with TCP.
Typical design of a server involves a listener thread that spawns a worker thread whenever there is a new client request. That new thread would handle communication with that particular client and then end. You should keep a limit of clients (threads) that are served simultaneously. Do not spawn a new process for each client - that is inefficient and not needed as this will get you nothing that you can't achieve with threads.
Thread programming requires carefulness so do not cut corners. Otherwise you will have hard time finding and diagnosing problems.
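A rough sketch of that listener/worker pattern (handle_client() is a placeholder for the per-client logic, and the cap on concurrent threads mentioned above is omitted for brevity):

```c
/* Sketch: TCP listener that spawns one detached worker thread per client.
 * handle_client() is a placeholder for the per-client work (read the
 * request, stream the file, etc.). */
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/socket.h>

static void *handle_client(void *arg)
{
    int fd = *(int *)arg;
    free(arg);
    /* ... serve this one client ... */
    close(fd);
    return NULL;
}

void serve_forever(int listen_fd)
{
    for (;;) {
        int *fd = malloc(sizeof *fd);
        *fd = accept(listen_fd, NULL, NULL);
        if (*fd < 0) { free(fd); continue; }

        pthread_t tid;
        pthread_create(&tid, NULL, handle_client, fd);
        pthread_detach(tid);   /* thread resources released when it exits */
    }
}
```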
File transfer with UDP will be fun :(
Your struct/class for each message should contain a sequence number and a checksum. This should enable each client to detect, and ask for the retransmission of, any missing blocks at the end of the transfer.
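For instance, a block header for that could look like the sketch below (field sizes are illustrative; in real code you would serialize the fields explicitly and convert them to network byte order):

```c
/* Sketch: per-block header with a sequence number and checksum so the
 * receiver can detect missing or corrupt blocks and request
 * retransmission at the end of the transfer. */
#include <stdint.h>

#define BLOCK_DATA_SIZE 1024

struct file_block {
    uint32_t seq;                    /* block index within the file    */
    uint32_t total;                  /* total number of blocks         */
    uint16_t len;                    /* payload bytes actually used    */
    uint32_t checksum;               /* e.g. CRC32 over the payload    */
    uint8_t  data[BLOCK_DATA_SIZE];  /* payload                        */
};
```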
Where UDP might be a huge winner is on a local LAN. You could UDP-broadcast the entire file to all clients at once and then, at the end, ask each client in turn which blocks it has missing and send just those. I wish Kaspersky etc. would use such a scheme for updating all my local boxes.
I have used such a broadcast scheme on a CANBUS network where there are dozens of microcontrollers that need new images downloaded. Software upgrades take minutes instead of hours.
