interleave protobuf-net and file

I need to exchange both protobuf-net objects and files between computers and am trying to figure out the best way to do that. Is there a way for A to inform B that the object that follows is a protobuf or a file? Alternately, when a file is transmitted, is there a way to know that the file has ended and the Byte[] that follows is a protobuf?
Using C# 4.0, Visual Studio 2010
Thanks, Manish

This has nothing to do with protobuf or files, and everything to do with your comms protocol, specifically "framing". This simply means: how you demarcate sub-messages in a single stream. For example, if this is a raw socket you might choose to send (all of):
a brief message-type, maybe a byte: 01 for file, 02 for a protobuf message of a particular type
a length prefix (typically 4 bytes network-byte-order)
the payload, consisting of the previous number of bytes
Then rinse and repeat for each message.
You don't state what comms you are using, so I can't be more specific.
Btw, another approach would be to treat a file as a protobuf message with a byte[] member - mainly suitable for small files, though.
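
The framing described above (type byte, 4-byte network-byte-order length, payload) can be sketched roughly as follows. This is only an illustration in Node.js, chosen to match the other code on this page; the same layout carries over to C#. The names encodeFrame, FrameDecoder, TYPE_FILE and TYPE_PROTOBUF are hypothetical, not part of protobuf-net or any library.

```javascript
// Hypothetical message-type values, following the 01 / 02 example in the answer.
const TYPE_FILE = 0x01;
const TYPE_PROTOBUF = 0x02;
const HEADER_SIZE = 5; // 1 type byte + 4 length bytes

// Build one frame: type | length (big-endian) | payload.
function encodeFrame(type, payload) {
  const header = Buffer.alloc(HEADER_SIZE);
  header.writeUInt8(type, 0);
  header.writeUInt32BE(payload.length, 1);
  return Buffer.concat([header, payload]);
}

// Accumulates raw chunks (which may split or merge frames arbitrarily)
// and returns every complete frame seen so far.
class FrameDecoder {
  constructor() {
    this.pending = Buffer.alloc(0);
  }

  push(chunk) {
    this.pending = Buffer.concat([this.pending, chunk]);
    const frames = [];
    while (this.pending.length >= HEADER_SIZE) {
      const length = this.pending.readUInt32BE(1);
      const end = HEADER_SIZE + length;
      if (this.pending.length < end) break; // payload not fully received yet
      frames.push({
        type: this.pending[0],
        payload: this.pending.subarray(HEADER_SIZE, end),
      });
      this.pending = this.pending.subarray(end);
    }
    return frames;
  }
}

// Usage: two frames sent back to back, received in two arbitrary chunks.
const stream = Buffer.concat([
  encodeFrame(TYPE_FILE, Buffer.from("file bytes")),
  encodeFrame(TYPE_PROTOBUF, Buffer.from([0x08, 0x96, 0x01])),
]);
const decoder = new FrameDecoder();
console.log(decoder.push(stream.subarray(0, 7)).length); // frame still incomplete
console.log(decoder.push(stream.subarray(7)).map((f) => f.type));
```

The receiver never has to guess where a file ends: it reads exactly as many bytes as the length prefix announces and then expects the next type byte.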

Related

Usage of Base64 encoding

I am writing a server that sends two types of messages: plain messages (without encoding) and encrypted messages (AES encryption). The transport is UDP. Each message consists of a fixed-length header (2 bytes) and a body (JSON string). The question: should I encode these messages using Base64? If so, what is the reason?

Base64 encoding has one reason to exist (and this one only): to make something that is not safe for handling and/or transport in a text-based system (such as e-mail or classic C strings) safe to do so.
UDP definitely has no such limit, so it depends on whether any other part of your application does. If not, I recommend you use the raw data.

Base64 was meant to encode binary data (which is very compact) using a set of 64 symbols from the ASCII table (which is less compact: every 3 bytes become 4 characters). Base64 is very good for storing binary data inside text files, such as image data inside an HTML document. I don't see any reason to use it in your case.
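
To make the size cost concrete, here is a small Node.js sketch. The 30-byte datagram is made up for illustration (a 2-byte header plus 28 bytes standing in for an AES-encrypted body); only the standard Buffer API is used.

```javascript
// Hypothetical datagram: 2-byte header + 28 bytes of (pretend) ciphertext.
const datagram = Buffer.concat([
  Buffer.from([0x00, 0x01]),
  Buffer.alloc(28, 0xab),
]);

// Base64 maps every 3 input bytes to 4 output characters.
const asBase64 = datagram.toString("base64");
const roundTrip = Buffer.from(asBase64, "base64");

console.log(datagram.length);            // raw size in bytes
console.log(asBase64.length);            // encoded size in characters
console.log(roundTrip.equals(datagram)); // decoding restores the original bytes
```

Since UDP carries arbitrary bytes, the extra third in size buys nothing unless some other layer of the application can only handle text.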

nodejs zlib unzip binary data from socket

I receive some binary data via a socket in Node.js, compressed using zlib.
I do not have access to the source that originates the messages.
When I try to unzip the data I "sometimes" get errors from zlib.
Below is an example; I encoded two messages in hex for convenience.
I am sure they are both zlib compressed (the header "789c" gives me that confidence), but I cannot understand why the test1 message works and the test2 message doesn't.
Maybe a dictionary is needed?
Maybe a version mismatch between the compression and decompression algorithms?
I feel I can exclude an issue with reading the data, since both messages are read the same way.
The help of a zlib expert would be much appreciated.
let test1 = "789c4d8e410e82301444f79ea2f9fb021f51206971e5c2ad7a01da7e840894d06a20c6bb8bbad0e54c66f29ed84d5dcbee34bac6f61230886057ac4439d1e0f777eabd2b84b2662e84b6bda7c933dd364b2dc139e5815d6e8d91f040aa22131bcdb7581a9e6c50f11c938c27695646b45e672a534f60aeb6c361d9a7e936e2790c3f701c2006086121e84365ba267d75b74e02b0ba74a7d9795a0202b35545e3791e48c2f17d08bf7ee1bff3ea051dd44446";
let test2 = "789c4d8e410e82301444f79ca2f9fb0205059ab4b072e156bd006d3f42544a683110e3dd455de8722633794f54f3ed4aee38bacef6125818435506a29e71f0bb3bf6de954259b39442dbdee3ec89be766b2dc139e5819ca7ce4878285d37292b38cd52cde826ce38e55b8334df2aad8d490aaeea2710d7da61bfeef33c8b294fe0074e42c642065129f04325ba457d71d34d0290b676c7c5795c5c0303629b06c7d332a084c3fb107dfda27fe7e0054ca744b6";
const zlib = require("zlib"); // required for the unzipSync calls below
console.log(zlib.unzipSync(Buffer.from(test1, "hex")).toString()); // correct output
console.log(zlib.unzipSync(Buffer.from(test2, "hex")).toString()); // ERROR: data check

The second test message is, in fact, invalid. Either it was corrupted in transit (including within your code), or it was corrupted when it was made.
One thing to check for is, if you are on a Windows operating system, whether you are using binary mode to read the file. If not, then it would in fact be possible for some inputs to be corrupted and not others.
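
As a rough illustration of how a single damaged byte produces this kind of failure: a zlib stream ends with a 4-byte Adler-32 checksum of the uncompressed data, and the decompressor rejects the stream if it does not match. The sketch below corrupts that trailer on purpose; the sample text and the tryInflate helper are made up for the demonstration.

```javascript
const zlib = require("zlib");

// Compress a sample message, then make a copy with one damaged trailer byte.
const original = Buffer.from("hello hello hello hello");
const compressed = zlib.deflateSync(original);
const corrupted = Buffer.from(compressed);
corrupted[corrupted.length - 1] ^= 0xff; // last 4 bytes hold the Adler-32 check

// Hypothetical helper: report success or the zlib error instead of throwing.
function tryInflate(buf) {
  try {
    return { ok: true, text: zlib.inflateSync(buf).toString() };
  } catch (err) {
    return { ok: false, code: err.code, message: err.message };
  }
}

console.log(tryInflate(compressed)); // intact stream decodes
console.log(tryInflate(corrupted));  // damaged stream is rejected
```

Comparing the received bytes against what the sender actually emitted (for example with a hex dump on both ends) is usually the quickest way to find where such a change crept in.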

Unused bytes by protobuf implementation (for limiter implementation)

I need to transfer data over a serial port. In order to ensure integrity of the data, I want a small envelope protocol around each protobuf message. I thought about the following:
message type (1 byte)
message size (2 bytes)
protobuf message (N bytes)
(checksum; optional)
The message type will mostly be a mapping between messages defined in proto files. However, if a message gets corrupted or some bytes are lost, the message size will not be correct and all subsequent bytes cannot be interpreted anymore. One way to solve this would be the introduction of delimiters between messages, but for that I need to choose something that is not used by protobuf. Is there a byte sequence that is never used by any protobuf message?
I also thought about a different way. If the master finds out that packets are corrupted, it should reset the communication to a clean start. For that I want the master to send a RESTART command to the slave. The slave should answer with an ACK and then start sending complete messages again. All bytes received between RESTART and ACK are to be discarded by the master. I want to encode ACK and RESTART as special messages. But with that approach I face the same problem: I need to find byte sequences for ACK and RESTART that are not used by any protobuf messages.
Maybe I am also taking the wrong approach - feel free to suggest other approaches to deal with lost bytes.

Is there a byte sequence that is never used by any protobuf message?
No; it is a binary serializer and can contain arbitrary binary payloads (especially in the bytes type). You cannot use sentinel values. Length prefix is fine (your "message size" header), and a checksum may be a pragmatic option. Alternatively, you could impose an artificial sentinel to follow each message (maybe a guid chosen per-connection as part of the initial handshake), and use that to double-check that everything looks correct.

One way to help recover packet synchronization after a rare problem is to use synchronization words in the beginning of the message, and use the checksum to check for valid messages.
This means that you put a constant value, e.g. 0x12345678, before your message type field. Then if a message fails checksum check, you can recover by finding the next 0x12345678 in your data.
Even though that value could sometimes occur in the middle of the message, it doesn't matter much. The checksum check will very probably catch that there isn't a real message at that position, and you can search forwards until you find the next marker.
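
A minimal sketch of this resynchronization idea, combining the envelope from the question (type, 2-byte size, payload, checksum) with the 0x12345678 marker. Several details are assumptions made for the example: the checksum is CRC-16/CCITT-FALSE computed by hand, sizes are big-endian, and MAX_PAYLOAD is a hypothetical sanity limit so that a corrupted size field cannot stall the parser for long.

```javascript
const SYNC = Buffer.from([0x12, 0x34, 0x56, 0x78]);
const MAX_PAYLOAD = 1024; // hypothetical upper bound on payload size

// CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF.
function crc16(buf) {
  let crc = 0xffff;
  for (const byte of buf) {
    crc ^= byte << 8;
    for (let i = 0; i < 8; i++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

// Frame layout: SYNC | type (1) | size (2) | payload | crc16 over type+size+payload.
function encodeFrame(type, payload) {
  const body = Buffer.alloc(3 + payload.length);
  body.writeUInt8(type, 0);
  body.writeUInt16BE(payload.length, 1);
  payload.copy(body, 3);
  const crc = Buffer.alloc(2);
  crc.writeUInt16BE(crc16(body), 0);
  return Buffer.concat([SYNC, body, crc]);
}

// Returns all valid frames in buf plus the unconsumed tail to keep for the next read.
function extractFrames(buf) {
  const frames = [];
  let pos = 0;
  while (true) {
    const start = buf.indexOf(SYNC, pos);
    if (start === -1) {
      // Keep up to 3 trailing bytes: they may be the start of a split marker.
      const keep = Math.min(SYNC.length - 1, buf.length - pos);
      return { frames, rest: buf.subarray(buf.length - keep) };
    }
    if (buf.length - start < SYNC.length + 3) {
      return { frames, rest: buf.subarray(start) }; // header incomplete
    }
    const size = buf.readUInt16BE(start + SYNC.length + 1);
    const end = start + SYNC.length + 3 + size + 2;
    if (size > MAX_PAYLOAD) { pos = start + 1; continue; } // implausible size, skip marker
    if (end > buf.length) return { frames, rest: buf.subarray(start) }; // wait for more data
    const body = buf.subarray(start + SYNC.length, end - 2);
    if (crc16(body) === buf.readUInt16BE(end - 2)) {
      frames.push({ type: body[0], payload: body.subarray(3) });
      pos = end;
    } else {
      pos = start + 1; // false marker or corrupted frame: search onwards
    }
  }
}
```

A marker that happens to occur inside a payload is harmless here: the checksum at that position almost certainly fails, and the search simply continues from the next byte.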

What is BitTorrent peer (Deluge) saying?

I'm writing a small app to test out how torrent p2p works and I created a sample torrent and am seeding it from my Deluge client. From my app I'm trying to connect to Deluge and download the file.
The torrent in question is a single-file torrent (file is called A - without any extension), and its data is the ASCII string Test.
Referring to this I was able to submit the initial handshake and also get a valid response back.
Immediately afterwards Deluge is sending even more data. From the 5th byte it would seem like it is a bitfield message, but I'm not sure what to make of it. I read that torrent clients may send a mixture of Bitfield and Have messages to show which parts of the torrent they possess. (My client isn't sending any bitfield, since it is assuming not to have any part of the file in question).
If my understanding is correct, it's stating that the message size is 2: one for identifier + payload. If that's the case why is it sending so much more data, and what's that supposed to be?
Same thing happens after my app sends an interested command. Deluge responds with a 1-byte message of unchoke (but then again appends more data).
And finally when it actually submits the piece, I'm not sure what to make of the data. The first underlined byte is 84 which corresponds to the letter T, as expected, but I cannot make much more sense of the rest of the data.
Note that the link in question does not really specify how the clients should supply messages in order once the initial handshake is completed. I just assumed to send interested and request based on what seemed to make sense to me, but I might be completely off.

I don't think Deluge is sending the additional bytes you're seeing.
If you look at them, you'll notice that all of the extra bytes are bytes that already existed in the handshake message, which should have been the longest message you received so far.
I think you're reading new messages into the same buffer, without zeroing it out or anything, so you're seeing bytes from earlier messages again, following the bytes of the latest message you read.
Consider checking if the network API you're using has a way to check the number of bytes actually received, and only look at that slice of the buffer, not the entire thing.
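
The effect can be reproduced with a small simulation. Here recvInto is a hypothetical stand-in for whatever socket read call the app uses (it copies the incoming bytes into a reused buffer and returns the count), and the handshake contents are placeholders; the 4-byte big-endian length prefix and the message id 5 for bitfield follow the BitTorrent wire protocol.

```javascript
// One buffer reused for every read, sized for the 68-byte handshake.
const recvBuffer = Buffer.alloc(68);

// Hypothetical stand-in for a socket read: fills target, returns bytes received.
function recvInto(target, incoming) {
  return incoming.copy(target, 0);
}

// Parse one peer message: 4-byte length, then id and payload.
// Assumes length >= 1 (keep-alive messages with length 0 carry no id).
function parseMessage(bytes) {
  const length = bytes.readUInt32BE(0);
  return { length, id: bytes[4], payload: bytes.subarray(5, 4 + length) };
}

// Placeholder handshake (19, "BitTorrent protocol", then 48 filler bytes).
const handshake = Buffer.concat([
  Buffer.from([19]),
  Buffer.from("BitTorrent protocol"),
  Buffer.alloc(48, 0xaa),
]);
// Bitfield for a single-piece torrent: length 2, id 5, one payload byte.
const bitfield = Buffer.from([0, 0, 0, 2, 5, 0x80]);

recvInto(recvBuffer, handshake);
const received = recvInto(recvBuffer, bitfield);

// Whole buffer: the 6 new bytes followed by stale handshake bytes.
console.log(recvBuffer.toString("hex"));
// Only the bytes actually received in the last read.
console.log(recvBuffer.subarray(0, received).toString("hex"));
console.log(parseMessage(recvBuffer.subarray(0, received)));
```

The first dump looks like a 2-byte message with "extra data appended", which matches the symptom described in the question; the slice shows only the real message.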

Packing 20 bytes chunk via BLE

I've never worked with Bluetooth before. I have to send data via BLE and I've found the limit of 20 bytes per chunk.
The sender is an Arduino and the receiver could be either an Android app or a Node.js app on a PC.
I have to send 9 values, stored as floats, so 4 bytes * 9 = 36 bytes. I need 2 chunks for all my data via BLE. The receiving part needs both chunks to process them. If some data is lost, I don't care.
I'm not an expert in network protocols, and I think I have to give each message an incremental timestamp so that the receiver can glue together the two chunks with the same timestamp, or discard the last one if the new timestamp is higher. But I'm not sure how to do a checksum, whether I really need one, whether I really have to care about it, or whether - for a simple beta version of my system - I can ignore all those problems.
Can anyone give me some advice? For example, similar situations handled with BLE communication?

You can get around the size limitation using the "Read Blob Request" of ATT. It allows you to read an attribute starting at a given offset. So, you can read the attribute with an offset of 0; if the value is longer than one response can carry (ATT_MTU-1 bytes), then you can request again with the offset at (ATT_MTU-1)*1, if there's still more at (ATT_MTU-1)*2, etc. (You can read about it in section 3.4.4.5 of the Bluetooth v4.1 specification; it's in the 4.0 spec too, but I don't have that in front of me right now.)
If the value changes between requests, I'm not sure how you could go about detecting such a change. You could have the attribute send notifications when there's a change, to interrupt the process in case the value changes in the middle of reading it.
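
The offset-based read can be simulated as follows for the 36-byte case from the question. This is not real BLE code: readBlob is a hypothetical stand-in for the Read / Read Blob request pair, ATT_MTU is assumed to be the default of 23 (so each response carries at most 22 bytes), and the floats are assumed to be little-endian as on typical Arduino boards. Many BLE stacks perform this loop internally when asked for a long read.

```javascript
const ATT_MTU = 23;           // assumed default ATT_MTU
const MAX_PART = ATT_MTU - 1; // value bytes per read response

// Hypothetical stand-in for Read Request (offset 0) / Read Blob Request (offset > 0).
// The "server side" value is passed in only because this is a simulation.
function readBlob(attributeValue, offset) {
  return attributeValue.subarray(offset, offset + MAX_PART);
}

// Keep requesting at increasing offsets until a part comes back short.
function readLongValue(attributeValue) {
  const parts = [];
  let offset = 0;
  while (true) {
    const part = readBlob(attributeValue, offset);
    parts.push(part);
    offset += part.length;
    if (part.length < MAX_PART) break; // short (or empty) part: end of value
  }
  return Buffer.concat(parts);
}

// Pack and unpack 32-bit little-endian floats.
function packFloats(values) {
  const buf = Buffer.alloc(values.length * 4);
  values.forEach((v, i) => buf.writeFloatLE(v, i * 4));
  return buf;
}

function unpackFloats(buf) {
  const out = [];
  for (let i = 0; i < buf.length; i += 4) out.push(buf.readFloatLE(i));
  return out;
}

// Usage: 9 floats = 36 bytes, fetched as a 22-byte part and a 14-byte part.
const sensorValues = [0.5, 1.25, -2, 3.75, 4, 5.5, -6.25, 7, 8.125];
console.log(unpackFloats(readLongValue(packFloats(sensorValues))));
```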
