What disadvantages (or advantages) are there to setting different values for the send and receive buffers on opposite sides of a connection? It seems to make the most sense (and to be the norm) to keep these values the same. But if one side (say the sender side) has the resources to double its buffer size, what implications could this have?
I guess a related question is, what disadvantages are there to setting a larger-than-required buffer size? From what I've read, it sounds like you could potentially overflow the receive buffer if your send buffer is larger. Additionally, it seems like there may not be a need to increase buffer sizes as long as your applications are keeping up with the load and can handle max-size messages. It doesn't necessarily mean you could handle more data throughput because you are still limited by the opposite endpoint. Is this correct?
The specific kernel settings in question are as follows:
net.core.wmem_max
net.core.rmem_max
A large buffer size can have a negative effect on performance in some cases. If the TCP/IP buffers are too large and applications are not processing data fast enough, paging can increase. The goal is to specify a value large enough to avoid flow control, but not so large that the buffer accumulates more data than the system can process.
A TCP send buffer smaller than the receiver's receive buffer will prevent you from using the maximum available bandwidth, because the sender cannot keep enough data in flight to fill the bandwidth-delay product.
A UDP send buffer larger than the receiver's receive buffer will prevent you from finding out at the source that a datagram was too large: it is accepted for sending but silently dropped at the other end.
Neither of these problems is major, unless you're transmitting large amounts of data in the TCP case. In the UDP case you shouldn't attempt to send datagrams larger than about 548 bytes anyway (the 576-byte minimum IPv4 reassembly size, minus 20 bytes of IP header and 8 bytes of UDP header).
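As a concrete illustration, here is a minimal C sketch (Linux assumed) of how an application requests a larger send buffer. The request is silently capped at net.core.wmem_max unless the privileged SO_SNDBUFFORCE option is used, and Linux reports back double the requested value to account for bookkeeping overhead:

    #include <stdio.h>
    #include <sys/socket.h>

    int main(void) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        int size = 1 << 20;                    /* ask for a 1MB send buffer */
        socklen_t len = sizeof(size);

        /* The kernel caps this request at net.core.wmem_max. */
        if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size)) < 0)
            perror("setsockopt");

        /* Read back what was actually granted. */
        if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, &len) == 0)
            printf("effective SO_SNDBUF: %d bytes\n", size);
        return 0;
    }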
Related
I have Python code that sends data to a socket (a rather large file). Should I divide it into 1KB chunks, or would just conn.sendall(file.read()) be acceptable?
It will make little difference to the sending operation. (I assume you are using a TCP socket for the purposes of this discussion.)
When you attempt to send 1K, the kernel will take that 1K, copy it into kernel TCP buffers, and return success (and probably begin sending to the peer at the same time). At that point, you will send another 1K and the same thing happens. Eventually, if the file is large enough and the network can't send it fast enough, or the receiver can't drain it fast enough, the kernel buffer space used by your data will reach some internal limit and your process will be blocked until the receiver drains enough data. (This limit can often be pretty high with TCP -- depending on the OS, you may be able to send a megabyte or two without ever hitting it.)
If you try to send in one shot, pretty much the same thing will happen: data will be transferred from your buffer into kernel buffers until/unless some limit is reached. At that point, your process will be blocked until data is drained by the receiver (and so forth).
However, with the first mechanism, you can send a file of any size without using undue amounts of memory -- your in-memory buffer (not including the kernel TCP buffers) only needs to be 1K long. With the sendall approach, file.read() will read the entire file into your program's memory. If you attempt that with a truly giant file (say 40G or something), that might take more memory than you have, even including swap space.
So, as a general purpose mechanism, I would definitely favor the first approach. For modern architectures, I would use a larger buffer size than 1K though. The exact number probably isn't too critical; but you could choose something that will fit several disk blocks at once, say, 256K.
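The question is about Python, but the pattern is language-independent. A rough C sketch of the chunked approach using the 256K figure above (error handling kept minimal):

    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Send an open file over a connected TCP socket in 256K chunks.
       Only one chunk is ever held in user-space memory at a time. */
    static int send_file(int sock, FILE *f) {
        static char buf[256 * 1024];
        size_t n;
        while ((n = fread(buf, 1, sizeof(buf), f)) > 0) {
            size_t off = 0;
            while (off < n) {                  /* handle short writes */
                ssize_t sent = send(sock, buf + off, n - off, 0);
                if (sent < 0)
                    return -1;
                off += (size_t)sent;
            }
        }
        return ferror(f) ? -1 : 0;
    }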
I've been working on a large-file data transfer with 10 Gigabit Ethernet, and initially, I was having issues with UDP packets being dropped. After rewriting/optimizing a lot of my code, and developing new designs, I stumbled upon an article that discussed increasing the kernel socket buffer size. After doing so, I found that I could send many more packets without fear of any being dropped. Essentially, it seemed that I could eliminate packet loss, and increase transfer speed just by making the buffer size larger and larger (until I had something sufficient). However, my question is, does increasing the socket buffer size to something very large have any unwanted side effects? My first guess would be performance/CPU/memory issues, but in my testing, that hasn't really been a noticeable issue yet. Perhaps I'm just overly skeptical, but it almost seems too good to be true.
The only side-effect is memory usage. Increase them gradually and monitor the system. As long as you leave enough memory for existing processes you should be golden.
Refer to http://www.cyberciti.biz/faq/linux-tcp-tuning/, which has useful basic information.
I'm creating a Linux device driver that creates a character device.
The data that it returns on reads is logically divided into 16-byte units.
I was planning on implementing this division by returning however many units fit into the read buffer, but I'm not sure what to do if the read buffer is too small (<16 bytes).
What should I do here? Or is there a better way to achieve the division I'm trying to represent?
You could act like the datagram socket device driver: it always returns just a single datagram. If the read buffer is smaller, the excess is discarded -- it's the caller's responsibility to provide enough space for a whole datagram (typically, the application protocol specifies the maximum datagram size).
The documentation of your device should specify that it works in 16-byte units, so there's no reason why a caller would want to provide a buffer smaller than this. So any lost data due to the above discarding could be considered a bug in the calling application.
However, it would also be reasonable to return more than 16 bytes at a time if the caller asks for it -- that suggests the application will split the data into units itself. This could give better performance, since it minimizes system calls. But if the buffer isn't a multiple of 16, you could discard the remainder of the last unit. Just make sure this is documented, so callers know to make their buffers a multiple of 16. A sketch of such a handler follows below.
If you're worried about generic applications like cat, I don't think you need to. I would expect them to use very large input buffers, simply for performance reasons.
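To make the suggestion concrete, here is a hypothetical sketch of such a read handler; the function name and backing store are invented for illustration:

    #include <linux/fs.h>
    #include <linux/kernel.h>
    #include <linux/uaccess.h>

    #define UNIT_SIZE 16

    /* Hypothetical backing store: a kernel buffer that always holds
       a whole number of 16-byte units. */
    static char device_data[4096];
    static size_t device_len;

    /* Return only whole 16-byte units; a buffer smaller than one unit
       gets -EINVAL instead of silently losing data. */
    static ssize_t mydev_read(struct file *filp, char __user *buf,
                              size_t count, loff_t *ppos)
    {
        size_t avail, n;

        if (*ppos >= device_len)
            return 0;                      /* end of data */
        avail = device_len - *ppos;

        n = min(count, avail);
        n -= n % UNIT_SIZE;                /* round down to whole units */
        if (n == 0)
            return -EINVAL;                /* buffer smaller than a unit */

        if (copy_to_user(buf, device_data + *ppos, n))
            return -EFAULT;
        *ppos += n;
        return n;
    }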
Let us assume that there is a Unix domain socket created for a typical server-client program. The client sends a 10GB buffer over the socket, and it is consumed by the server in the meantime.
Does OS (Linux/BSD) split the 10GB buffer into many packets and send/consume them, or are they sent at once?
If it is not possible to send a 10GB buffer over a domain socket in one go, then what is the practical size limit of a single packet?
Constraints:
The program will run on both Linux 2.6.32+ and FreeBSD 9+
Size of the buffer to be sent ranges from 3 bytes to 10GB maximum.
There are a number of factors which will determine the maximum size of a packet that can be sent on a Unix socket:
The wmem_max socket send buffer maximum size kernel setting, which determines the maximum size of the send buffer that can be set using setsockopt (SO_SNDBUF). The current setting can be read from /proc/sys/net/core/wmem_max and can be set using sysctl net.core.wmem_max=VALUE (add the setting to /etc/sysctl.conf to make the change persistent across reboots). Note this setting applies to all sockets and socket protocols, not just to Unix sockets.
If multiple packets are sent to a Unix socket (using SOCK_DGRAM), then the maximum amount of data which can be sent without blocking depends on both the size of the socket send buffer (see above) and the maximum number of unread packets on the Unix socket (kernel parameter net.unix.max_dgram_qlen).
Finally, a packet (SOCK_DGRAM) requires contiguous memory (as per What is the max size of AF_UNIX datagram message that can be sent in linux?). How much contiguous memory is available in the kernel will depend on many factors (e.g. the I/O load on the system, etc.).
So to maximize the performance on your application, you need a large socket buffer size (to minimize the user/kernel space context switches due to socket write system calls) and a large Unix socket queue (to decouple the producer and consumer as much as possible). However, the product of the socket send buffer size and queue length must not be so large as to cause the kernel to run out of contiguous memory areas (causing write failures).
The actual figures will depend on your system configuration and usage. You will need to determine the limits by testing... start, say, with wmem_max at 256KB and max_dgram_qlen at 32, and keep doubling wmem_max until you notice things start breaking. You will need to adjust max_dgram_qlen to balance the activity of the producer and consumer to a certain extent (although if the producer is much faster or much slower than the consumer, the queue size won't have much effect).
Note your producer will have to specifically set up the socket send buffer size to wmem_max bytes with a call to setsockopt (SO_SNDBUF) and will have to split the data into wmem_max-byte chunks (and the consumer will have to reassemble them).
Best guess: the practical limits will be around wmem_max ~8MB and max_dgram_qlen ~32.
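To make that concrete, here is a rough sketch of the producer side; the socket path and chunk size are illustrative assumptions, and error handling is minimal:

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    #define SOCKET_PATH "/tmp/demo.sock"   /* assumed server address */
    #define CHUNK (256 * 1024)             /* keep <= net.core.wmem_max */

    /* Send len bytes as a sequence of datagrams no larger than the
       configured send buffer; the consumer must reassemble them. */
    static int send_chunked(const char *data, size_t len) {
        int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
        int sndbuf = CHUNK;
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        strncpy(addr.sun_path, SOCKET_PATH, sizeof(addr.sun_path) - 1);

        /* Request a send buffer that fits one whole chunk; the kernel
           caps the request at net.core.wmem_max. */
        setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf));

        for (size_t off = 0; off < len; off += CHUNK) {
            size_t n = len - off < CHUNK ? len - off : CHUNK;
            if (sendto(fd, data + off, n, 0,
                       (struct sockaddr *)&addr, sizeof(addr)) < 0) {
                close(fd);
                return -1;   /* e.g. EMSGSIZE/ENOBUFS if limits are hit */
            }
        }
        close(fd);
        return 0;
    }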
There are no "packets" per se with domain sockets. The semantics of TCP "streams" or UDP "datagrams" are sort of simulated within the kernel to look similar to user-space apps, but that's about as far as it goes. The mechanics aren't as involved as network sockets using network protocols. What you are really interested in here is how much the kernel will buffer for you.
From your program's perspective it doesn't really matter. Think of the socket as a pipe or FIFO. When the buffer fills, you are going to block; if the socket is non-blocking, you are going to get short writes (assuming streams) or an EAGAIN error. This is true regardless of the size of the buffer. However, you should be able to query the buffer size with getsockopt and increase it with setsockopt, but I doubt you are going to get anywhere near 10GB.
Alternatively, you might look at sendfile.
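A minimal sendfile sketch, assuming Linux (FreeBSD's sendfile has a different signature); the file's pages go from the page cache to the socket with no user-space copy:

    #include <fcntl.h>
    #include <sys/sendfile.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Send an entire file into a connected stream socket. */
    static int send_whole_file(int sock, const char *path) {
        struct stat st;
        off_t off = 0;
        int in = open(path, O_RDONLY);

        if (in < 0 || fstat(in, &st) < 0) {
            if (in >= 0)
                close(in);
            return -1;
        }
        while (off < st.st_size) {
            /* sendfile advances off by the number of bytes written */
            if (sendfile(sock, in, &off, st.st_size - off) <= 0)
                break;
        }
        close(in);
        return off == st.st_size ? 0 : -1;
    }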
There are two separate things here: the size of a packet sent with SOCK_DGRAM, and the size of the buffer for the domain socket. Both depend on the options set on the domain socket, and the limits may also differ for a memory-backed (abstract) socket.
If you're talking about SOCK_DGRAM, it is easily determined by experiment. It seems a lot more likely that you're talking about SOCK_STREAM, in which case it simply does not matter: SOCK_STREAM will sort it out for you. Just write in whatever size chunks you like; the larger the better.
So, I have an incoming UDP stream composed of 272-byte packets at a data rate of about 5.12Gb/s (around 2.35e6 packets per second). This data is being sent by an FPGA-based custom board. The packet size is a limit of the digital design being run, so although it could theoretically be increased to make things more efficient, that would require a large amount of work. At the receiving end these packets are read and interpreted by a network thread and placed in a circular buffer shared with a buffering thread, which copies this data to a GPU for processing.
The above setup at the receiving end could cope with 5.12Gb/s for 4096-byte packets (used in a different design) using simple recv calls; however, with the current packet size I'm having a hard time keeping up with the packet flow: too much time is being "wasted" in context switching and in copying small data segments from kernel space to user space. I did a quick test implementation using recvmmsg; however, things didn't improve by much. On average I can process about 40% of the incoming packets.
So I was wondering whether it is possible to get a handle on the kernel's UDP data buffer from my application (mmap style), or to use some sort of zero-copy from kernel to user space?
Alternatively, do you know of any other method which would reduce this overhead and be capable of performing the required processing?
This is running on a Linux machine (kernel 3.2.0-40) using C code.
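For reference, the recvmmsg batching mentioned in the question looks roughly like this; BATCH is an arbitrary illustrative value, and each call drains up to BATCH datagrams with a single kernel transition:

    #define _GNU_SOURCE
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    #define BATCH 1024
    #define PKT_SIZE 272

    /* Receive up to BATCH datagrams in one system call; on return,
       msgs[i].msg_len holds the length of the i-th datagram. */
    static int drain_batch(int sock) {
        static char bufs[BATCH][PKT_SIZE];
        static struct iovec iov[BATCH];
        static struct mmsghdr msgs[BATCH];

        for (int i = 0; i < BATCH; i++) {
            iov[i].iov_base = bufs[i];
            iov[i].iov_len = PKT_SIZE;
            memset(&msgs[i].msg_hdr, 0, sizeof(msgs[i].msg_hdr));
            msgs[i].msg_hdr.msg_iov = &iov[i];
            msgs[i].msg_hdr.msg_iovlen = 1;
        }
        return recvmmsg(sock, msgs, BATCH, 0, NULL);
    }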
There is support for mmap packet receiving in Linux (the PACKET_RX_RING socket option).
It's not as easy to use as UDP sockets, because you receive raw link-layer frames, as from a RAW socket, and have to parse the protocol headers yourself.
See the kernel's packet_mmap documentation for more information.
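A minimal sketch of a PACKET_RX_RING setup (TPACKET_V1; requires root or CAP_NET_RAW). Packets land in a ring shared between the kernel and user space, so there is no per-packet copy; the UDP payload must be located by parsing the Ethernet/IP/UDP headers yourself:

    #include <arpa/inet.h>
    #include <linux/if_ether.h>
    #include <linux/if_packet.h>
    #include <poll.h>
    #include <sys/mman.h>
    #include <sys/socket.h>

    int main(void) {
        int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

        /* Ring geometry: 64 blocks of 4KB, two 2KB frames per block. */
        struct tpacket_req req = {
            .tp_block_size = 4096,
            .tp_block_nr = 64,
            .tp_frame_size = 2048,
            .tp_frame_nr = 128,   /* block_size * block_nr / frame_size */
        };
        setsockopt(fd, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req));

        char *ring = mmap(NULL, (size_t)req.tp_block_size * req.tp_block_nr,
                          PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

        for (unsigned i = 0; ; i = (i + 1) % req.tp_frame_nr) {
            struct tpacket_hdr *hdr =
                (struct tpacket_hdr *)(ring + i * req.tp_frame_size);

            while (!(hdr->tp_status & TP_STATUS_USER)) {
                struct pollfd pfd = { .fd = fd, .events = POLLIN };
                poll(&pfd, 1, -1);   /* wait for the kernel to fill it */
            }

            unsigned char *frame = (unsigned char *)hdr + hdr->tp_mac;
            /* ... parse Ethernet/IP/UDP headers in frame and copy out
               the payload (hdr->tp_snaplen bytes captured) ... */
            (void)frame;

            hdr->tp_status = TP_STATUS_KERNEL;   /* hand the frame back */
        }
    }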