I need to transfer data over a serial port. In order to ensure integrity of the data, I want a small envelope protocol around each protobuf message. I thought about the following:
message type (1 byte)
message size (2 bytes)
protobuf message (N bytes)
(checksum; optional)
The message type will mostly be a mapping between messages defined in proto files. However, if a message gets corrupted or some bytes are lost, the message size will not be correct and all subsequent bytes cannot be interpreted anymore. One way to solve this would be the introduction of limiters between messages, but for that I need to choose something that is not used by protobuf. Is there a byte sequence that is never used by any protobuf message?
I also thought about a different way. If the master finds out that packages are corrupted, it should reset the communication to a clean start. For that I want the master to send a RESTART command to the slave. The slave should answer with an ACK and then start sending complete messages again. All bytes received between RESTART and ACK are to be discarded by the master. I want to encode ACK and RESTART as special messages. But with that approach I face the same problem: I need to find byte sequences for ACK and RESTART that are not used by any protobuf messages.
Maybe I am also taking the wrong approach - feel free to suggest other approaches to deal with lost bytes.
Is there a byte sequence that is never used by any protobuf message?
No; it is a binary serializer and can contain arbitrary binary payloads (especially in the bytes type). You cannot use sentinel values. Length prefix is fine (your "message size" header), and a checksum may be a pragmatic option. Alternatively, you could impose an artificial sentinel to follow each message (maybe a guid chosen per-connection as part of the initial handshake), and use that to double-check that everything looks correct.
One way to help recover packet synchronization after a rare problem is to use synchronization words in the beginning of the message, and use the checksum to check for valid messages.
This means that you put a constant value, e.g. 0x12345678, before your message type field. Then if a message fails checksum check, you can recover by finding the next 0x12345678 in your data.
Even though that value could sometimes occur in the middle of the message, it doesn't matter much. The checksum check will very probably catch that there isn't a real message at that position, and you can search forwards until you find the next marker.
Related
Some sources say that recv should have the max length possible of a message, like recv(1024):
message0 = str(client.recv(1024).decode('utf-8'))
But other sources say that it should have the total bytes of the receiving message. If the message is "hello":
message0 = str(client.recv(5).decode('utf-8'))
What is the correct way of using recv()?
Some sources say ... But other sources say ... message ...
Both sources are wrong.
The argument for the recv is the maximum number of bytes one want to read at once.
With an UDP socket this is the message size one want to read or larger, but a single recv will only return a single message anyway. If the given size is smaller than the message it will be pruned and the rest will be discarded.
With a TCP socket (the case you ask about) there is no concept of a message in the first place since TCP is a byte stream only. recv will simply return the number of bytes available for read, up to the given size. Specifically a single recv in a TCP receiver does not not need to match a single send in the sender. It might match and it often will match if the amount of data is small, but there is no guarantee and one should never rely on it.
... message0 = str(client.recv(5).decode('utf-8'))
Note that calling decode('utf-8') directly on the data returned by recv is a bad idea. One first need to be sure that all the expected data are read and only then call decode('utf-8'). If only part of the data are read the end of the read data could be in the middle of a character, since a single character in UTF-8 might be encoded in multiple bytes (everything except ASCII characters). If decode('utf-8') is called with an incomplete encoded character it will throw an exception and your code will break.
I'm writing a small app to test out how torrent p2p works and I created a sample torrent and am seeding it from my Deluge client. From my app I'm trying to connect to Deluge and download the file.
The torrent in question is a single-file torrent (file is called A - without any extension), and its data is the ASCII string Test.
Referring to this I was able to submit the initial handshake and also get a valid response back.
Immediately afterwards Deluge is sending even more data. From the 5th byte it would seem like it is a bitfield message, but I'm not sure what to make of it. I read that torrent clients may send a mixture of Bitfield and Have messages to show which parts of the torrent they possess. (My client isn't sending any bitfield, since it is assuming not to have any part of the file in question).
If my understanding is correct, it's stating that the message size is 2: one for identifier + payload. If that's the case why is it sending so much more data, and what's that supposed to be?
Same thing happens after my app sends an interested command. Deluge responds with a 1-byte message of unchoke (but then again appends more data).
And finally when it actually submits the piece, I'm not sure what to make of the data. The first underlined byte is 84 which corresponds to the letter T, as expected, but I cannot make much more sense of the rest of the data.
Note that the link in question does not really specify how the clients should supply messages in order once the initial handshake is completed. I just assumed to send interested and request based on what seemed to make sense to me, but I might be completely off.
I don't think Deluge is sending the additional bytes you're seeing.
If you look at them, you'll notice that all of the extra bytes are bytes that already existed in the handshake message, which should have been the longest message you received so far.
I think you're reading new messages into the same buffer, without zeroing it out or anything, so you're seeing bytes from earlier messages again, following the bytes of the latest message you read.
Consider checking if the network API you're using has a way to check the number of bytes actually received, and only look at that slice of the buffer, not the entire thing.
I need to create a monitor, which will log information about packet missing using ZeroMQ ipc. Actually I don't really understand everything about it because of there are some LINX, TIPS protocols also. Can you please explain me that and answer the main question?
You could make the application self-monitoring, by including a message serial number in each message structure. The message sender keeps track of the serial number it last sent, and increments it every time it sends a message.
The recipient should then be receiving messages with ever-increasing message serial numbers embedded. If that ever jumps by 2 or more, a message has gone missing.
IPC is not lossy like a network can be - the bytes put in come out the other end. TCP is not lossy either, provided both ends are still running and the network itself hasn't failed. However, depending on the ZMQ pattern used and how it's set up whole messages can be undelivered (for example, if the recipient hasn't connected yet, etc). If that's what you mean by "packet missing", it would be revealed by including an incrementing message serial number.
Sometimes I receive this strange responses from other nodes. Transaction id match to my request transaction id as well as the remote IP so I tend to believe that node responded with this but it looks like sort of a mix of response and request
d1:q9:find_node1:rd2:id20:.éV0özý.?tjN.?.!2:ip4:DÄ.^7:nodes.v26:.ï?M.:iSµLW.Ðä¸úzDÄ.^æCe1:t2:..1:y1:re
Worst of all is that it is malformed. Look at 7:nodes.v it means that I add nodes.v to the dictionary. It is supposed to be 5:nodes. So, I'm lost. What is it?
The internet and remote nodes is unreliable or buggy. You have to code defensively. Do not assume that everything you receive will be valid.
Remote peers might
send invalid bencoding, discard those, don't even try to recover.
send truncated messages. usually not recoverable unless it happens to be the very last e of the root dictionary.
omit mandatory keys. you can either ignore those messages or return an error message
contain corrupted data
include unknown keys beyond the mandatory ones. this is not an error, just treat them as if they weren't there for the sake of forward-compatibility
actually be attackers trying to fuzz your implementation or use you as DoS amplifier
I also suspect that some really shoddy implementations are based on whatever string types their programming language supports and incorrectly handle encoding instead of using arrays of uint8 as bencoding demands. There's nothing that can be done about those. Ignore or occasionally send an error message.
Specified dictionary keys are usually ASCII-mappable, but this is not a requirement. E.g. there are some tracker response types that actually use random binary data as dictionary keys.
Here are a few examples of junk I'm seeing[1] that even fails bdecoding:
d1:ad2:id20:�w)��-��t����=?�������i�&�i!94h�#7U���P�)�x��f��YMlE���p:q9Q�etjy��r7�:t�5�����N��H�|1�S�
d1:e�����������������H#
d1:ad2:id20:�����:��m�e��2~�����9>inm�_hash20:X�j�D��nY��-������X�6:noseedi1ee1:q9:get_peers1:t2:�=1:v4:LT��1:y1:qe
d1:ad2:id20:�����:��m�e��2~�����9=inl�_hash20:X�j�D��nY���������X�6:noseedi1ee1:q9:get_peers1:t2:�=1:v4:LT��1:y1:qe
d1:ad2:id20:�����:��m�e��2~�����9?ino�_hash20:X�j�D��nY���������X�6:noseedi1ee1:q9:get_peers1:t2:�=1:v4:LT��1:y1:qe
[1] preserved char count. replaced all non-printable, ASCII-incompatible bytes with the unicode replacement character.
I am left with a few questions after reading the RFC 6520 for Heartbeat:
https://www.rfc-editor.org/rfc/rfc6520
Specifically, I don't understand why a heartbeat needs to include arbitrary payloads or even padding for that matter. From what I can understand, the purpose of the heartbeat is to verify that the other party is still paying attention at the other end of the line.
What does these variable length custom payloads provide that a fixed request and response do not?
E.g.
Alice: still alive?
Bob: still alive!
After all, FTP uses the NOOP command to keep connections alive, which seem to work fine.
There is, in fact, a reason for this payload/padding within RFC 6520
From the document:
The user can use the new HeartbeatRequest message,
which has to be answered by the peer with a HeartbeartResponse
immediately. To perform PMTU discovery, HeartbeatRequest messages
containing padding can be used as probe packets, as described in
[RFC4821].
>In particular, after a number of retransmissions without
receiving a corresponding HeartbeatResponse message having the
expected payload, the DTLS connection SHOULD be terminated.
>When a HeartbeatRequest message is received and sending a
HeartbeatResponse is not prohibited as described elsewhere in this
document, the receiver MUST send a corresponding HeartbeatResponse
message carrying an exact copy of the payload of the received
HeartbeatRequest.
If a received HeartbeatResponse message does not contain the expected
payload, the message MUST be discarded silently. If it does contain
the expected payload, the retransmission timer MUST be stopped.
Credit to pwg at HackerNews. There is a good and relevant discussion there as well.
(The following is not a direct answer, but is here to highlight related comments on another question about Heartbleed.)
There are arguments against the protocol design that allowed an arbitrary limit - either that there should have been no payload (or even echo/heartbeat feature) or that a small finite/fixed payload would have been a better design.
From the comments on the accepted answer in Is the heartbleed bug a manifestation of the classic buffer overflow exploit in C?
(R..) In regards to the last question, I would say any large echo request is malicious. It's consuming server resources (bandwidth, which costs money) to do something completely useless. There's really no valid reason for the heartbeat operation to support any length but zero
(Eric Lippert) Had the designers of the API believed that then they would not have allowed a buffer to be passed at all, so clearly they did not believe that. There must be some by-design reason to support the echo feature; why it was not a fixed-size 4 byte buffer, which seems adequate to me, I do not know.
(R..) .. Nobody thinking from a security standpoint would think that supporting arbitrary echo requests is reasonable. Even if it weren't for the heartbleed overflow issue, there may be cryptographic weaknesses related to having such control over the content the peer sends; this seems unlikely, but in the absence of a strong reason to support a[n echo] feature, a cryptographic system should not support it. It should be as simple as possible.
While I don't know the exact motivation behind this decision, it may have been motivated by the ICMP echo request packets used by the ping utility. In an ICMP echo request, an arbitrary payload of data can be attached to the packet, and the destination server will return exactly that payload if it is reachable and responding to ping requests. This can be used to verify that data is being properly sent across the network and that payloads aren't being corrupted in transit.