How to determine initial TCP window size without sending a packet? - linux

I was thinking of writing a patch for Nmap that would make it more difficult to detect a port scanning attempt by setting the initial window size to one that the sender's operating system uses. Can I somehow find out which initial window size would be used if I tried to send a packet to, say, 8.8.8.8 over wlan0?

The default should be 65536 bytes. Maybe this is what you are looking for:
/proc/sys/net/core/rmem_max – Maximum TCP Receive Window.
/proc/sys/net/core/wmem_max – Maximum TCP Send Window.

Related

How to set the maximum TCP receive window size in Linux?

I want to limit the rate of every TCP connection. Can I set the maximum TCP receive window size in Linux?
With iptables + tc can only limit IP packets. The parameters net.core.rmem_max and net.core.wmem_max didn't not work well.
man tcp:
Linux supports RFC 1323 TCP high performance extensions. These include Protection Against Wrapped Sequence Numbers (PAWS), Window Scaling and Timestamps. Window scaling allows the use of large (> 64K) TCP windows in order to support links with high latency or bandwidth. To make use of them, the send and receive buffer sizes must be increased. They can be set globally with the /proc/sys/net/ipv4/tcp_wmem and /proc/sys/net/ipv4/tcp_rmem files, or on individual sockets by using the SO_SNDBUF and SO_RCVBUF socket options with the setsockopt(2) call.

TCP Sockets send buffer size efficiency

When working with WinSock or POSIX TCP sockets (in C/C++, so no extra Java/Python/etc. wrapping), is there any efficiency pro/cons to building up a larger buffer (e.g. say upto 4KB) in user space then making as few calls to send as possible to send that buffer vs making multiple smaller calls directly with the bits of data (say 1-1000 bytes), other the the fact that for non-blocking/asynchronous sockets the single buffer is potentially easier for me to manage.
I know with recv small buffers are not recommended, but I couldn't find anything for sending.
e.g. does each send call on common platforms go to into kernel mode? Could a 1 byte send actually result in a 1 byte packet being transmitted under normal conditions?
As explained on TCP Illustrated Vol I, by Richard Stevens, TCP divides the send buffer in near to optimum segments to fit in the maximum packet size along the path to the other TCP peer. That means that it will never try to send segments that will be fragmented by ip along the route to destination (when a packet is fragmented at some ip router, it sends back an IP fragmentation ICMP packet and TCP will take it into account to reduce the MSS for this connection). That said, there is no need for larger buffer than the maximum packet size of the link level interfaces you'll have along the path. Having one, let's say, twice or thrice longer, makes you sure that TCP will not stop sending as soon as it receives some acknowledge of remote peer, because of not having its buffer filled with data.
Think that the normal interface type is ethernet and it has a maximum packet size of 1500 bytes, so normally TCP doesn't send a segment greater than this size. And it normally has an internall buffer of 8Kb per connection, so there's little sense in adding buffer size at kernel space for that (if this is the only reason to have a buffer in kernel space).
Of course, there are other factors that force you to use a buffer in user space (for example, you want to store the data to send to your peer process somewhere, as there's only 8Kb data in kernel space to buffer, and you will need more space to be able to do some other processes) An example: ircd (the Internet Relay Chat daemon) uses write buffers of up to 100Kb before dropping a connection because the other side is not receiving/acknowledging that data. If you only write(2) to the connection, you'll be put on wait once the kernel buffer is full, and perhaps that's not what you want.
The reason to have buffers in user space is because TCP makes also flow control, so when it's not able to send data, it has to be put somewhere to cope with it. You'll have to decide if you need your process to save that data up to a limit or you can block sending data until the receiver is able to receive again. The buffer size in kernel space is limited and normally out of control for the user/developer. Buffer size in user space is limited only by the resources allowable to it.
Receiving/sending small chunks of data in a TCP connection is not recommendable because of the increased overhead of TCP handshaking and headers impose. Suppose a telnet connection in which for each character sent, a header for TCP and other for IP is added (20 bytes min for TCP, 20 bytes min for IP, 14 bytes for ethernet frame and 4 for the ethernet CRC) makes up to 60 bytes+ to transmit only one character. And normally each tcp segment is acknowledged individually, so that makes a full roundtrip time to send a segment and get the acknowledge (just to be able to free the buffer resources and assume this character as transmitted)
So, finally, what's the limit? It depends on your application. If you can cope with the kernel resources available and don't need more buffers, you can pass without havin buffers in user space. If you need more, you'll need to implement buffers and be able to feed the kernel buffer with your buffer data when available.
Yes, a one byte send can - under very normal conditions - result in sending a TCP packet with only a single byte payload. Send coalescing in TCP is normally done by use of Nagle's algorithm. With Nagle's algorithm, sending data is delayed iff there is data that has already been sent but not yet acknowledged.
Conversely data will be sent immediately if there is no unacknowledged data. Which is usually true in the following situations:
The connection has just been opened
The connection has been idle for some time
The connection only received data but nothing was sent for some time
In that case the first send call that your application performs will cause a packet to be sent immediately, no matter how small. So starting communication with two or more small sends is usually a bad idea because it increases overhead and delay.
The infamous "send send recv" pattern can also cause really large delays (e.g. on Windows typically 200ms). This happens if the local TCP stack uses Nagle's algorithm (which will usually delay the second send) and the remote stack uses delayed acknowledgment (which can delay the acknowledgment of the first packet).
Since most TCP stack implementations use both, Nagle's algorithm and delayed acknowledgment, this pattern should best be avoided.

socket UDP recvfrom not working for large packets even if buffer given is enough on LINUX

I am calling a recvfrom api of a valid address , where i am trying to read data of size 9600 bytes , the buffer i have provided i of size 12KB , I am not even getting select read events.
Even tough recommended MTU size is 1.5 KB, I am able to send and receive packets of 4 KB.
I am using android NDK , (Linux) for development.
Please help . Is there a socket Option i have to set to read large buffers ?
If you send a packet larger than the MTU, it will be fragmented. That is, it'll be broken up into smaller pieces, each which fits within the MTU. The problem with this is that if even one of those pieces is lost (quite likely on a cellular connection...), the entire packet will effectively disappear.
To determine whether this is the case you'll need to use a packet sniffer on one (or both) ends of the connection. Wireshark is a good choice on a PC end, or tcpdump on the android side (you'll need root). Keep in mind that home routers may reassemble fragmented packets - this means that if you're sniffing packets from inside a home router/firewall, you might not see any fragments arrive until all of them arrive at the router (and obviously if some are getting lost this won't happen).
A better option would be to simply ensure that you're always sending packets smaller than the MTU, of course. Fragmentation is almost never the right thing to be doing. Keep in mind that the MTU may vary at various hops along the path between server and client - you can either use the common choice of a bit less than 1500 (1400 ought to be safe), or try to probe for it by setting the MTU discovery flag on your UDP packets (via IP_MTU_DISCOVER) and always sending less than the value returned by getsockopt's IP_MTU option (including on retransmits!)

How to programmatically increase the per-socket buffer for UDP sockets on LInux?

I'm trying to understand the correct way to increase the socket buffer size on Linux for our streaming network application. The application receives variable bitrate data streamed to it on a number of UDP sockets. The volume of data is substantially higher at the start of the stream and I've used:
# sar -n UDP 1 200
to show that the UDP stack is discarding packets and
# ss -un -pa
to show that each socket Recv-Q length grows to the nearly the limit (124928. from sysctl net.core.rmem_default) before packets are discarded. This implies that the application simply can't keep up with the start of the stream. After discarding enough initial packets the data rate slows down and the application catches up. Recv-Q trends towards 0 and remains there for the duration.
I'm able to address the packet loss by substantially increasing the rmem_default value which increases the socket buffer size and gives the application time to recover from the large initial bursts. My understanding is that this changes the default allocation for all sockets on the system. I'd rather just increase the allocation for the specific UDP sockets and not modify the global default.
My initial strategy was to modify rmem_max and to use setsockopt(SO_RCVBUF) on each individual socket. However, this question makes me concerned about disabling Linux autotuning for all sockets and not just UDP.
udp(7) describes the udp_mem setting but I'm confused how these values interact with the rmem_default and rmem_max values. The language it uses is "all sockets", so my suspicion is that these settings apply to the complete UDP stack and not individual UDP sockets.
Is udp_rmem_min the setting I'm looking for? It seems to apply to individual sockets but global to all UDP sockets on the system.
Is there a way to safely increase the socket buffer length for the specific UDP ports used in my application without modifying any global settings?
Thanks.
Jim Gettys is armed and coming for you. Don't go to sleep.
The solution to network packet floods is almost never to increase buffering. Why is your protocol's queueing strategy not backing off? Why can't you just use TCP if you're trying to send so much data in a stream (which is what TCP was designed for).

UDP IP Fragmentation and MTU

I'm trying to understand some behavior I'm seeing in the context of sending UDP packets.
I have two little Java programs: one that transmits UDP packets, and the other that receives them. I'm running them locally on my network between two computers that are connected via a single switch.
The MTU setting (reported by /sbin/ifconfig) is 1500 on both network adapters.
If I send packets with a size < 1500, I receive them. Expected.
If I send packets with 1500 < size < 24258 I receive them. Expected. I have confirmed via wireshark that the IP layer is fragmenting them.
If I send packets with size > 24258, they are lost. Not Expected. When I run wireshark on the receiving side, I don't see any of these packets.
I was able to see similar behavior with ping -s.
ping -s 24258 hostA works but
ping -s 24259 hostA fails.
Does anyone understand what may be happening, or have ideas of what I should be looking for?
Both computers are running CentOS 5 64-bit. I'm using a 1.6 JDK, but I don't really think it's a programming problem, it's a networking or maybe OS problem.
Implementations of the IP protocol are not required to be capable of handling arbitrarily large packets. In theory, the maximum possible IP packet size is 65,535 octets, but the standard only requires that implementations support at least 576 octets.
It would appear that your host's implementation supports a maximum size much greater than 576, but still significantly smaller than the maximum theoretical size of 65,535. (I don't think the switch should be a problem, because it shouldn't need to do any defragmentation -- it's not even operating at the IP layer).
The IP standard further recommends that hosts not send packets larger than 576 bytes, unless they are certain that the receiving host can handle the larger packet size. You should maybe consider whether or not it would be better for your program to send a smaller packet size. 24,529 seems awfully large to me. I think there may be a possibility that a lot of hosts won't handle packets that large.
Note that these packet size limits are entirely separate from MTU (the maximum frame size supported by the data link layer protocol).
I found the following which may be of interest:
Determine the maximum size of a UDP datagram packet on Linux
Set the DF bit in the IP header and send continually larger packets to determine at what point a packet is fragmented as per Path MTU Discovery. Packet fragmentation should then result in a ICMP type 3 packet with code 4 indicating that the packet was too large to be sent without being fragmented.
Dan's answer is useful but note that after headers you're really limited to 65507 bytes.

Resources