How to prepare the "correct" buffer size when receiving netlink responses? - linux

Implementing a netlink (rtnetlink) module I ran into a problem:
As with UDP, part of the message (packet) is lost when the receive buffer is not big enough (e.g. when the buffer is 1024 bytes and you receive exactly 1024 bytes, you cannot tell whether the message was truncated).
So I wondered how to prepare the "correct" receive buffer.
My first idea was to MSG_PEEK just the nlmsghdr, extract the message size from it, and then do the final receive with a buffer of that size.
As a safety measure I allocated one byte more, just to be able to complain if the receive filled the whole buffer.
Unfortunately it did!
Is there an algorithm that will work, not needing a ridiculously huge receive buffer?
Example:
nlmsg_len was 1276, so I tried to receive 1277 bytes, and I got all 1277.
So on the next attempt I blindly added 2000 bytes to the receive buffer, and the result was 2552 bytes long (1276 bytes longer than expected).
As said above, I think I cannot "continue" receiving a longer message with multiple recv calls, so I must receive it all at once.
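The numbers in the example (2552 = 2 × 1276) suggest that one datagram carried two netlink messages back to back, in which case nlmsg_len of the first header is not the size of the datagram. The following is a minimal sketch, not taken from the original post. It assumes Linux, where recv(2) documents that MSG_TRUNC makes the call report the real datagram length on netlink sockets, so combining it with MSG_PEEK gives a size to allocate before the real receive. The function names peek_datagram_size and split_netlink_messages are illustrative.

```python
import socket
import struct

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid (host byte order)
NLMSGHDR = struct.Struct("=IHHII")
NLMSG_ALIGNTO = 4


def nlmsg_align(length):
    """Round length up to the 4-byte netlink alignment (like NLMSG_ALIGN)."""
    return (length + NLMSG_ALIGNTO - 1) & ~(NLMSG_ALIGNTO - 1)


def peek_datagram_size(sock):
    """Return the full length of the next datagram without consuming it.

    Assumption: Linux, where MSG_TRUNC makes recv report the real datagram
    length even if the supplied buffer is smaller.
    """
    probe = bytearray(1)
    return sock.recv_into(probe, 1, socket.MSG_PEEK | socket.MSG_TRUNC)


def split_netlink_messages(datagram):
    """Return a list of (type, payload) tuples, one per message in the datagram."""
    messages = []
    offset = 0
    while offset + NLMSGHDR.size <= len(datagram):
        length, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(datagram, offset)
        if length < NLMSGHDR.size or offset + length > len(datagram):
            raise ValueError("truncated or malformed netlink message")
        payload = datagram[offset + NLMSGHDR.size:offset + length]
        messages.append((msg_type, payload))
        offset += nlmsg_align(length)  # the next header starts at an aligned offset
    return messages
```

A caller would allocate a buffer of the peeked size, receive once, and hand the result to split_netlink_messages, so no fixed oversized buffer is needed.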

Related

BitTorrent p2p - What to do with very large blocks of data, read from piece message from a peer (larger than piece-length)?

I am writing a BitTorrent client, where the application is receiving large blocks of data after requesting pieces from other peers. Sometimes the blocks are larger than piece-length of the torrent.
For example, with a torrent piece-length of 524288 bytes, some piece requests result in responses that are 1940718596 bytes long.
Also, the message seems valid, as the length encoded in the first four bytes happens to be the same (that large number).
Question: What should I do with that data? Should I ignore the excess bytes (after piece-length), or should I write the data into the corresponding files? The latter is concerning because it might overwrite the next pieces!
The largest chunk of a piece the protocol allows in a piece message is 16 KB (16384 bytes). So if a peer sent a 1940718596 bytes (1.8 GB) long piece message, the correct response is to disconnect from it.
Also, if a peer sends a piece message that doesn't correspond to a request message you have sent earlier, you shall also disconnect from it.
A peer that receives a request message asking for more than a 16 KB chunk, shall also disconnect the requester. Requesting a whole piece in a single request message is NOT allowed.
A request message that goes outside the end of the piece, is of course, also NOT allowed.
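The rules above can be sketched as two checks. This is an illustrative sketch, not code from any particular client: the function names and the representation of outstanding requests as a set of (index, begin, length) tuples are assumptions, and piece_length is meant to be the length of the specific piece in question (the last piece of a torrent may be shorter).

```python
MAX_BLOCK_LEN = 16 * 1024  # 16384 bytes, the limit named in the answer


def is_valid_request(begin, length, piece_length):
    """A request must ask for at most 16 KB and stay inside the piece."""
    return 0 < length <= MAX_BLOCK_LEN and begin >= 0 and begin + length <= piece_length


def should_disconnect_on_piece(index, begin, block_len, outstanding, piece_length):
    """Return True if an incoming piece message breaks one of the rules.

    outstanding is the set of (index, begin, length) requests we have sent
    and not yet seen answered (a hypothetical bookkeeping structure).
    """
    if block_len > MAX_BLOCK_LEN:
        return True  # larger than any block the protocol allows
    if begin + block_len > piece_length:
        return True  # runs past the end of the piece
    return (index, begin, block_len) not in outstanding  # unsolicited block
```

With the figures from the question, a 1940718596-byte block fails the first check immediately, so nothing would ever be written to disk.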
While it's possible that you will encounter other peers that don't follow the protocol, when writing a new client the most likely explanation is that the error is on your side.
The most important tool you can use is Wireshark. Look at how other clients behave and compare with yours.

BLE connection buffersize to packet length

I am currently working on a graduation project where I want to transmit a session token using BLE. On the server side I am using Node.js and Bleno to create the connection. After the client subscribes to the notification, the server will push the token.
A small part of the code is:
const buf1 = Buffer.from(info, 'utf8');
updateValueCallback(buf1);
At this step, I am using nRF Connect to check if everything is working. It works as intended, except that only the first 20 characters are transferred (as many as fit in one packet).
My question concerns the buffer size. When I finally connect to an Android app, will the whole string be transmitted? If so, the underlying protocols would cut the string and reassemble it on the other side, and the buffer size would not matter. Or must I negotiate the MTU to be the size of the string? In other words, must the buffer size be the size of the transmitted packet?
If the buffer is smaller than the whole string, can the whole string still be transmitted with it?
GATT requires that a notification be at most MTU - 3 bytes long. The default MTU is 23, hence the maximum notification value length is 20 bytes by default. By negotiating a larger MTU you can send longer notifications (if your BLE stack supports that).
I haven't used Bleno, but with all the stacks that I have used I needed to slice the data myself, 20 bytes at a time, and on the receiver side collect the slices and put them together again.
The stacks have been good at buffering the data and transmitting it one chunk at a time. So I have called the send function (such as your updateValueCallback()) in a loop until all the slices of my data were sent.
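The slicing described above can be sketched as follows. This is a language-neutral illustration written in Python rather than Bleno code; the function names are made up for the example, and how the receiver learns that the last slice has arrived (for instance a length prefix or a terminator) is left out and would have to be agreed on by both sides.

```python
DEFAULT_ATT_MTU = 23
ATT_NOTIFICATION_OVERHEAD = 3  # a notification carries at most MTU - 3 bytes


def slice_for_notifications(data, mtu=DEFAULT_ATT_MTU):
    """Split data into chunks that each fit into one notification."""
    chunk_size = mtu - ATT_NOTIFICATION_OVERHEAD
    if chunk_size <= 0:
        raise ValueError("MTU too small to carry any payload")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def reassemble(chunks):
    """Receiver side: concatenate the slices in arrival order."""
    return b"".join(chunks)
```

With the default MTU each slice is at most 20 bytes, so a 44-byte token would go out as three notifications of 20, 20 and 4 bytes.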
Hope it works for you.

Unused bytes by protobuf implementation (for delimiter implementation)

I need to transfer data over a serial port. In order to ensure integrity of the data, I want a small envelope protocol around each protobuf message. I thought about the following:
message type (1 byte)
message size (2 bytes)
protobuf message (N bytes)
(checksum; optional)
The message type will mostly be a mapping between messages defined in proto files. However, if a message gets corrupted or some bytes are lost, the message size will not be correct and all subsequent bytes can no longer be interpreted. One way to solve this would be the introduction of delimiters between messages, but for that I need to choose something that is not used by protobuf. Is there a byte sequence that is never used by any protobuf message?
I also thought about a different way. If the master finds out that packets are corrupted, it should reset the communication to a clean start. For that I want the master to send a RESTART command to the slave. The slave should answer with an ACK and then start sending complete messages again. All bytes received between RESTART and ACK are to be discarded by the master. I want to encode ACK and RESTART as special messages. But with that approach I face the same problem: I need to find byte sequences for ACK and RESTART that are not used by any protobuf messages.
Maybe I am also taking the wrong approach - feel free to suggest other approaches to deal with lost bytes.
Is there a byte sequence that is never used by any protobuf message?
No; it is a binary serializer and can contain arbitrary binary payloads (especially in the bytes type). You cannot use sentinel values. Length prefix is fine (your "message size" header), and a checksum may be a pragmatic option. Alternatively, you could impose an artificial sentinel to follow each message (maybe a guid chosen per-connection as part of the initial handshake), and use that to double-check that everything looks correct.
One way to help recover packet synchronization after a rare problem is to use synchronization words in the beginning of the message, and use the checksum to check for valid messages.
This means that you put a constant value, e.g. 0x12345678, before your message type field. Then if a message fails checksum check, you can recover by finding the next 0x12345678 in your data.
Even though that value could sometimes occur in the middle of the message, it doesn't matter much. The checksum check will very probably catch that there isn't a real message at that position, and you can search forwards until you find the next marker.
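The marker-plus-checksum recovery described above might look like the sketch below. The field layout (big-endian sync word, 1-byte type, 2-byte size, CRC-32 over type, size and payload) and the function names are assumptions chosen for illustration; the answer only specifies the idea and the example marker 0x12345678.

```python
import struct
import zlib

SYNC = struct.pack(">I", 0x12345678)  # example marker value from the answer
HEADER = struct.Struct(">BH")         # message type (1 byte), message size (2 bytes)
CRC = struct.Struct(">I")             # CRC-32 over header + payload (assumed choice)


def encode_frame(msg_type, payload):
    body = HEADER.pack(msg_type, len(payload)) + payload
    return SYNC + body + CRC.pack(zlib.crc32(body))


def decode_frames(stream):
    """Return (frames, leftover).

    frames is a list of (type, payload) for every frame whose checksum matches;
    leftover is the unconsumed tail that may hold the start of another frame.
    """
    frames = []
    pos = 0
    while True:
        start = stream.find(SYNC, pos)
        if start < 0:
            # Keep the last 3 bytes: they could be the beginning of a marker.
            return frames, stream[max(pos, len(stream) - (len(SYNC) - 1)):]
        head = start + len(SYNC)
        if head + HEADER.size > len(stream):
            return frames, stream[start:]  # header not complete yet
        msg_type, size = HEADER.unpack_from(stream, head)
        end = head + HEADER.size + size + CRC.size
        if end > len(stream):
            return frames, stream[start:]  # wait for more data
        body = stream[head:end - CRC.size]
        (crc,) = CRC.unpack_from(stream, end - CRC.size)
        if zlib.crc32(body) == crc:
            frames.append((msg_type, body[HEADER.size:]))
            pos = end
        else:
            pos = start + 1  # false marker or corrupted frame: search onward
```

One trade-off of this sketch: a corrupted size field can make the decoder wait for up to 64 KB of further data before the checksum fails and the search resumes, so a real implementation may want a tighter upper bound on the message size.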

Linux tty flip buffer lock when reading part of available data

I have a driver that builds on the new serdev bus in the linux kernel.
In my driver I receive messages from an external device. All messages end with a null byte (0x00), and the protocol ensures that there are no null bytes in my data (COBS). Now I try to have the TTY layer hand me full messages by scanning for zeros in my input; if there are none, I just return zero in the callback that is called from the tty layer when bytes are available.
This kind of works. Or rather it works for some messages. After a while though it locks up and the tty layer keeps sending the same size of received bytes indefinitely. My guess is that this happens when one half of the tty flip buffer is full and the rest of my message is in the other half.
I have two questions:
Am I correct in that the tty layer can "hang" until I read out all data in one half of the flip buffer?
If that is so, is there some way to prevent this from happening? I'd rather not implement my own buffering scheme on top of the tty buffer already available.
Thanks
Judging from drivers/tty/tty_buffer.c and the function flush_to_ldisc, it is not possible to do what I attempted. When the tty buffer is about to flip over, the consumer has to do a read and buffer any partial messages itself.
That is, returning zero and hoping for a larger chunk of data in your callback next time only works up until the end of the first part of the buffer; at that point the last bit of data must be read.
This is not a problem in userspace, because a read call takes an argument giving the maximum number of bytes you want, but read is free to return fewer bytes than requested.
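The "always consume, buffer partial messages yourself" approach can be illustrated with a small sketch. It is written in Python to show only the logic; a serdev driver would do the same in C with a bounded buffer. The class name is made up for the example.

```python
class NullDelimitedAssembler:
    """Collects chunks of any size and returns complete 0x00-terminated messages."""

    def __init__(self):
        self._pending = bytearray()

    def feed(self, chunk):
        """Consume the whole chunk; return the messages it completed."""
        self._pending += chunk
        # Everything before the last 0x00 is complete; the rest stays pending.
        *complete, rest = bytes(self._pending).split(b"\x00")
        self._pending = bytearray(rest)
        return complete
```

Because feed always accepts every byte it is given, the producer never has to hold data back while waiting for the terminator to arrive.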

TCP Sockets send buffer size efficiency

When working with WinSock or POSIX TCP sockets (in C/C++, so no extra Java/Python/etc. wrapping), are there any efficiency pros or cons to building up a larger buffer (say up to 4 KB) in user space and then making as few calls to send as possible, versus making multiple smaller calls directly with the bits of data (say 1-1000 bytes), other than the fact that for non-blocking/asynchronous sockets the single buffer is potentially easier for me to manage?
I know with recv small buffers are not recommended, but I couldn't find anything for sending.
E.g. does each send call on common platforms go into kernel mode? Could a 1 byte send actually result in a 1 byte packet being transmitted under normal conditions?
As explained in TCP/IP Illustrated, Vol. 1, by W. Richard Stevens, TCP divides the send buffer into near-optimal segments that fit the maximum packet size along the path to the other TCP peer. That means it tries not to send segments that would be fragmented by IP along the route to the destination (when a packet is too large for a link and may not be fragmented, the router sends back an ICMP "fragmentation needed" message, and TCP takes it into account to reduce the MSS for this connection). That said, there is no need for a buffer larger than the maximum packet size of the link-level interfaces along the path. Having one that is, say, two or three times larger ensures that TCP does not stall for lack of buffered data as soon as it receives an acknowledgment from the remote peer.
Consider that the typical interface type is Ethernet, with a maximum packet size of 1500 bytes, so TCP normally doesn't send a segment larger than this. It also normally has an internal buffer of 8 KB per connection, so there's little sense in adding buffer size in kernel space for that (if this is the only reason to have a buffer in kernel space).
Of course, there are other factors that force you to use a buffer in user space (for example, you want to store the data to send to your peer process somewhere, as there's only 8 KB in kernel space to buffer it, and you need more space to be able to carry on with other processing). An example: ircd (the Internet Relay Chat daemon) uses write buffers of up to 100 KB before dropping a connection because the other side is not receiving/acknowledging that data. If you only write(2) to the connection, you'll be blocked once the kernel buffer is full, and perhaps that's not what you want.
The reason to have buffers in user space is that TCP also performs flow control, so when it is not able to send data, that data has to be kept somewhere in the meantime. You'll have to decide whether your process needs to store that data up to a limit or whether you can block sending until the receiver is able to receive again. The buffer size in kernel space is limited and normally outside the control of the user/developer. Buffer size in user space is limited only by the resources available to the process.
Receiving or sending small chunks of data over a TCP connection is not recommended because of the overhead that TCP acknowledgments and headers impose. Take a telnet connection in which each character is sent on its own: a TCP header and an IP header are added (20 bytes minimum for TCP, 20 bytes minimum for IP), plus 14 bytes for the Ethernet header and 4 for the Ethernet CRC, which makes roughly 60 bytes to transmit a single character. And normally each TCP segment is acknowledged individually, so it takes a full round trip to send a segment and get the acknowledgment (just to be able to free the buffer resources and consider this character transmitted).
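The arithmetic behind that figure, as a small worked example. It uses only the minimum header sizes quoted above and ignores TCP/IP options, Ethernet minimum-frame padding, and the preamble, so the real cost per small packet is somewhat higher. The function names are illustrative.

```python
TCP_HEADER = 20       # minimum, no options
IP_HEADER = 20        # minimum IPv4 header
ETHERNET_HEADER = 14
ETHERNET_CRC = 4


def wire_overhead():
    """Bytes added to every segment by the headers listed above."""
    return TCP_HEADER + IP_HEADER + ETHERNET_HEADER + ETHERNET_CRC


def payload_efficiency(payload_bytes):
    """Fraction of the transmitted bytes that is application payload."""
    return payload_bytes / (payload_bytes + wire_overhead())
```

The overhead comes to 58 bytes, so a 1-byte payload is under 2% of what is transmitted, while a 1000-byte payload is about 95%.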
So, finally, what's the limit? It depends on your application. If you can cope with the kernel resources available and don't need more buffers, you can do without having buffers in user space. If you need more, you'll need to implement buffers and be able to feed the kernel buffer with your buffered data when space becomes available.
Yes, a one byte send can - under very normal conditions - result in sending a TCP packet with only a single byte payload. Send coalescing in TCP is normally done by use of Nagle's algorithm. With Nagle's algorithm, sending data is delayed iff there is data that has already been sent but not yet acknowledged.
Conversely, data will be sent immediately if there is no unacknowledged data, which is usually true in the following situations:
The connection has just been opened
The connection has been idle for some time
The connection only received data but nothing was sent for some time
In that case the first send call that your application performs will cause a packet to be sent immediately, no matter how small. So starting communication with two or more small sends is usually a bad idea because it increases overhead and delay.
The infamous "send send recv" pattern can also cause really large delays (e.g. on Windows typically 200ms). This happens if the local TCP stack uses Nagle's algorithm (which will usually delay the second send) and the remote stack uses delayed acknowledgment (which can delay the acknowledgment of the first packet).
Since most TCP stack implementations use both Nagle's algorithm and delayed acknowledgment, this pattern is best avoided.
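One way to avoid issuing several small sends in a row is to collect them in user space and hand them to the socket in a single call. The sketch below shows the idea in Python for brevity; the class name and the 4096-byte default limit (taken from the figure in the question) are assumptions, and the same structure applies to a C buffer passed to send.

```python
class CoalescingSender:
    """Accumulates small writes and passes them to the socket in one call."""

    def __init__(self, sock, limit=4096):
        self._sock = sock
        self._limit = limit
        self._buffer = bytearray()

    def write(self, data):
        """Queue data; flush automatically once the buffer reaches the limit."""
        self._buffer += data
        if len(self._buffer) >= self._limit:
            self.flush()

    def flush(self):
        """Send everything queued so far with a single sendall call."""
        if self._buffer:
            self._sock.sendall(bytes(self._buffer))
            self._buffer.clear()
```

The caller writes the pieces of a request as they are produced and calls flush once before waiting for the reply, which turns a "send send recv" sequence into "send recv".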
