Setting socket receive buffer size gets truncated to 244KB - Linux

I'm trying to increase my socket receive buffer size using setsockopt() on Linux. I can set it successfully to any value below 244KB; any value above 244KB gets truncated to 244KB.
There appears to be some sort of system limit in place, but I can't figure out where it is coming from, as it doesn't correspond to any of the values below:
$ cat /proc/sys/net/ipv4/tcp_rmem
4096 87380 4194304
$ cat /proc/sys/net/ipv4/tcp_wmem
4096 16384 4194304
$ cat /proc/sys/net/core/rmem_default
124928
$ cat /proc/sys/net/core/wmem_default
124928
The default value is 87380 as expected, but I can't increase it to 4194304; it gets limited to 244KB. Interestingly, that value is 2x rmem_default. Do I need to change that?
Thanks

From the man page for TCP:
The maximum sizes for socket buffers declared via the SO_SNDBUF and
SO_RCVBUF mechanisms are limited by the global net.core.rmem_max and
net.core.wmem_max sysctls. Note that TCP actually allocates twice the
size of the buffer requested in the setsockopt(2) call, and so a succeeding
getsockopt(2) call will not return the same size of buffer as requested in
the setsockopt(2) call.
So whatever you pass for SO_SNDBUF/SO_RCVBUF gets doubled at allocation time, but the value you pass is itself capped by net.core.rmem_max/net.core.wmem_max, so you cannot get the tcp_rmem maximum (4194304) through setsockopt(). The ~244KB limit you observe is most likely twice your net.core.rmem_max (which on your system appears to match rmem_default, 124928).
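To see this behaviour from a program, here is a minimal sketch (my own illustration, not from the question): it requests a 4 MB receive buffer and reads back what the kernel actually applied, which should come out as roughly twice net.core.rmem_max rather than twice the requested size.
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int requested = 4 * 1024 * 1024;   /* 4 MB, same as the tcp_rmem max above */
    int granted = 0;
    socklen_t len = sizeof(granted);

    /* The kernel clamps the request to net.core.rmem_max ... */
    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof(requested));
    /* ... and getsockopt() reports the doubled (clamped) value. */
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len);

    printf("requested %d bytes, kernel reports %d bytes\n", requested, granted);
    return 0;
}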

Related

What command shows the current maximum size of an object that can be stored in Memcached?

In the documentation I have not come across anything about getting the actual size of my Memcached capacity.
It's available in the output of the stats settings command.
It wasn't clear to me whether you wanted the maximum size of a single item or the total available memory, but both are available.
Maximum item size
Available at least in reasonably recent releases (I checked with 1.4.13). Using telnet:
telnet <hostname> <port>
> stats settings
(...)
item_size_max 1048576
(...)
Maximum capacity in bytes
Available in both stats and stats settings:
telnet <hostname> <port>
> stats settings
(...)
maxbytes 10737418240
(...)
> stats
(...)
limit_maxbytes 10737418240
(...)
Remaining capacity
As far as I know, it's not directly available; you have to compute it from the stats command output:
> stats
(...)
bytes 5349380740
(...)
limit_maxbytes 10737418240
(...)
Here I have limit_maxbytes - bytes = 5388037500 bytes remaining.
Documentation
This is confirmed by the documentation shipped with sources, in doc/protocol.txt:
| maxbytes | size_t | Maximum number of bytes allows in this cache |
| item_size_max | size_t | maximum item size |
Side note
Note that this is only the memory consumption and memory limit from memcached's point of view. The limit is the one given on the command line with the -m option. It won't tell you whether the physical memory is actually sufficient to handle that much data.
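If you need these numbers programmatically rather than over telnet, a rough C sketch along these lines should work (the host, port and the naive parsing are my own illustrative assumptions, and error handling is mostly omitted):
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(11211) };
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);   /* assumed local memcached */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0)
        return 1;

    const char *cmd = "stats\r\n";
    write(fd, cmd, strlen(cmd));

    char buf[65536];
    ssize_t total = 0, n;
    /* Read until the terminating "END" line (naive: assumes the reply fits in buf). */
    while ((n = read(fd, buf + total, sizeof(buf) - 1 - total)) > 0) {
        total += n;
        buf[total] = '\0';
        if (strstr(buf, "END\r\n"))
            break;
    }

    unsigned long long bytes = 0, limit = 0;
    for (char *line = strtok(buf, "\r\n"); line; line = strtok(NULL, "\r\n")) {
        sscanf(line, "STAT bytes %llu", &bytes);
        sscanf(line, "STAT limit_maxbytes %llu", &limit);
    }
    printf("used %llu of %llu bytes, %llu remaining\n", bytes, limit, limit - bytes);
    close(fd);
    return 0;
}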

TCP receiving window size higher than net.core.rmem_max

I am running iperf measurements between two servers, connected through 10Gbit link. I am trying to correlate the maximum window size that I observe with the system configuration parameters.
In particular, I have observed that the maximum window size is 3 MiB. However, I cannot find the corresponding values in the system files.
By running sysctl -a I get the following values:
net.ipv4.tcp_rmem = 4096 87380 6291456
net.core.rmem_max = 212992
The first value tells us that the maximum receiver window size is 6 MiB. However, TCP tends to allocate twice the requested size, so the maximum receiver window size should be 3 MiB, exactly as I have measured it. From man tcp:
Note that TCP actually allocates twice the size of the buffer requested in the setsockopt(2) call, and so a succeeding getsockopt(2) call will not return the same size of buffer as requested in the setsockopt(2) call. TCP uses the extra space for administrative purposes and internal kernel structures, and the /proc file values reflect the larger sizes compared to the actual TCP windows.
However, the second value, net.core.rmem_max, states that the maximum receiver window size cannot be more than 208 KiB. And this is supposed to be the hard limit, according to man tcp:
tcp_rmem
max: the maximum size of the receive buffer used by each TCP socket. This value does not override the global net.core.rmem_max. This is not used to limit the size of the receive buffer declared using SO_RCVBUF on a socket.
So how come I observe a maximum window size larger than the one specified in net.core.rmem_max?
NB: I have also calculated the bandwidth-delay product, window_size = bandwidth × RTT, which is about 3 MiB (10 Gbit/s × 2 ms RTT), thus verifying my traffic capture.
A quick search turned up:
https://github.com/torvalds/linux/blob/4e5448a31d73d0e944b7adb9049438a09bc332cb/net/ipv4/tcp_output.c
in void tcp_select_initial_window()
if (wscale_ok) {
        /* Set window scaling on max possible window
         * See RFC1323 for an explanation of the limit to 14
         */
        space = max_t(u32, sysctl_tcp_rmem[2], sysctl_rmem_max);
        space = min_t(u32, space, *window_clamp);
        while (space > 65535 && (*rcv_wscale) < 14) {
                space >>= 1;
                (*rcv_wscale)++;
        }
}
max_t takes the higher of its two arguments, so the bigger value takes precedence here.
One other reference to sysctl_rmem_max is made where it is used to limit the argument to SO_RCVBUF (in net/core/sock.c).
All other tcp code refers to sysctl_tcp_rmem only.
So without looking deeper into the code, you can conclude that a bigger net.ipv4.tcp_rmem will override net.core.rmem_max in all cases except when setting SO_RCVBUF (whose check can be bypassed using SO_RCVBUFFORCE).
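For reference, a minimal sketch of that SO_RCVBUFFORCE escape hatch (my own illustration; the 8 MB figure is arbitrary, and the call needs CAP_NET_ADMIN or it fails with EPERM):
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int size = 8 * 1024 * 1024;   /* deliberately above net.core.rmem_max */

    /* SO_RCVBUFFORCE skips the rmem_max check, but requires CAP_NET_ADMIN. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUFFORCE, &size, sizeof(size)) != 0)
        perror("SO_RCVBUFFORCE");

    int granted = 0;
    socklen_t len = sizeof(granted);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len);
    printf("kernel reports %d bytes\n", granted);
    return 0;
}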
net.ipv4.tcp_rmem takes precedence over net.core.rmem_max according to https://serverfault.com/questions/734920/difference-between-net-core-rmem-max-and-net-ipv4-tcp-rmem:
It seems that the tcp setting will take precedence over the common max setting
But I agree with what you say, this seems to conflict with what's written in man tcp, and I can reproduce your findings. Maybe the documentation is wrong? Please find out and comment!

How to find the socket buffer size of linux

What's the default socket buffer size on Linux? Is there any command to see it?
If you want to see your buffer size in the terminal, you can take a look at:
/proc/sys/net/ipv4/tcp_rmem (for read)
/proc/sys/net/ipv4/tcp_wmem (for write)
They contain three numbers, which are minimum, default and maximum memory size values (in byte), respectively.
To get the buffer size from a C/C++ program, the flow is as follows:
#include <sys/socket.h>
#include <netinet/in.h>

int n = 0;
socklen_t m = sizeof(n);
int fdsocket = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); // example: a UDP socket
if (getsockopt(fdsocket, SOL_SOCKET, SO_RCVBUF, (void *)&n, &m) == 0) {
    // now the variable n holds the receive buffer size (in bytes)
}
Whilst, as has been pointed out, it is possible to see the current default socket buffer sizes in /proc, it is also possible to check them using sysctl (Note: Whilst the name includes ipv4 these sizes also apply to ipv6 sockets - the ipv6 tcp_v6_init_sock() code just calls the ipv4 tcp_init_sock() function):
sysctl net.ipv4.tcp_rmem
sysctl net.ipv4.tcp_wmem
However, the default socket buffers are just set when the socket is initialised; the kernel then sizes them dynamically (unless they are set explicitly using setsockopt() with SO_SNDBUF/SO_RCVBUF). The actual size of the buffers for currently open sockets may be inspected using the ss command (part of the iproute/iproute2 package), which can also provide a lot more information on sockets, such as congestion-control parameters. For example, to list the currently open TCP (t option) sockets and associated memory (m) information:
ss -tm
Here's some example output:
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 0 192.168.56.102:ssh 192.168.56.1:56328
skmem:(r0,rb369280,t0,tb87040,f0,w0,o0,bl0,d0)
Here's a brief explanation of skmem (socket memory) - for more info you'll need to look at the kernel sources (i.e. sock.h):
r:sk_rmem_alloc
rb:sk_rcvbuf # current receive buffer size
t:sk_wmem_alloc
tb:sk_sndbuf # current transmit buffer size
f:sk_forward_alloc
w:sk_wmem_queued # persistent transmit queue size
o:sk_omem_alloc
bl:sk_backlog
d:sk_drops
I'm still trying to piece together the details, but to add to the answers already given, these are some of the important commands:
cat /proc/sys/net/ipv4/udp_mem
cat /proc/sys/net/core/rmem_max
cat /proc/sys/net/ipv4/tcp_rmem
cat /proc/sys/net/ipv4/tcp_wmem
ss -m # see `man ss`
References & help pages:
Man pages
man 7 socket
man 7 udp
man 7 tcp
man ss
https://www.linux.org/threads/how-to-calculate-tcp-socket-memory-usage.32059/
The atomic size is 4096 bytes and the maximum size is 65536 bytes; sendfile uses 16 pipe buffers of 4096 bytes each. The number of bytes currently available to read can be obtained with: ioctl(fd, FIONREAD, &buff_size).

What is NetBSD's FIONSPACE ioctl equivalent in Linux?

I'm using Linux 2.6.38 (fc14). What is the ioctl flag to get the amount of free space on a socket file descriptor (say, a TCP socket)? I found NetBSD has FIONREAD, FIONWRITE and FIONSPACE for such related purposes. But, I could only use FIONREAD in Linux.
SIOCOUTQ is the Linux equivalent of FIONWRITE. I don't believe there is a direct FIONSPACE equivalent: instead, you can subtract the value returned by SIOCOUTQ from the socket send buffer size, which can be obtained with getsockopt(s, SOL_SOCKET, SO_SNDBUF, ...).
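A hedged sketch of that subtraction (my own illustration, not a standard API; note that the SO_SNDBUF value returned by getsockopt(2) is the doubled one, as the socket(7) excerpt below explains, so treat the result as an approximation):
#include <sys/socket.h>
#include <sys/ioctl.h>
#include <linux/sockios.h>   /* SIOCOUTQ */

static int send_space(int fd)
{
    int sndbuf = 0, queued = 0;
    socklen_t len = sizeof(sndbuf);

    getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len);  /* total send buffer (doubled value) */
    ioctl(fd, SIOCOUTQ, &queued);                          /* bytes still queued for sending */

    return sndbuf - queued;   /* rough free space, FIONSPACE-style */
}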
For reference, regarding what HKK says, from socket(7):
SO_SNDBUF
Sets or gets the maximum socket send buffer in bytes. The
kernel doubles this value (to allow space for bookkeeping overhead)
when it is set using setsockopt(2), and this doubled value is
returned by getsockopt(2). The default value is set by the
/proc/sys/net/core/wmem_default file and the maximum allowed value is
set by the /proc/sys/net/core/wmem_max file. The minimum (doubled)
value for this option is 2048.

Specifying UDP receive buffer size at runtime in Linux

In Linux, one can specify the system's default receive buffer size for network packets, say UDP, using the following commands:
sysctl -w net.core.rmem_max=<value>
sysctl -w net.core.rmem_default=<value>
But I wonder: is it possible for an application (say, in C) to override the system's defaults by specifying the receive buffer size per UDP socket at runtime?
You can increase the value from the default, but you can't increase it beyond the maximum value. Use setsockopt to change the SO_RCVBUF option:
int n = 1024 * 1024;
if (setsockopt(socket, SOL_SOCKET, SO_RCVBUF, &n, sizeof(n)) == -1) {
    // deal with failure, or ignore if you can live with the default size
}
Note that this is the portable solution; it should work on any POSIX platform for increasing the receive buffer size. Linux has had autotuning for a while now (since 2.6.7, and with reasonable maximum buffer sizes since 2.6.17), which automatically adjusts the receive buffer size based on load. On kernels with autotuning, it is recommended that you not set the receive buffer size using setsockopt, as that will disable the kernel's autotuning. Using setsockopt to adjust the buffer size may still be necessary on other platforms, however.
