Kafka-Node fetchMaxBytes parameter - node.js

Does anyone know what the two numbers in the fetchMaxBytes parameter represent?
If it's specified as 1024*1024, does that mean the consumer will fetch 1024 messages of 1 KB each? Or will it just fetch 1 MB of messages?
I was not able to find any relevant information in the documentation except this: "The maximum bytes to include in the message set for this partition. This helps bound the size of the response."
I need this parameter to get messages one by one rather than receiving several messages in a single fetch.

I am not familiar with node.js, but I assume fetchMaxBytes corresponds to replica.fetch.max.bytes. In that case, the value is the maximum buffer size in bytes (i.e. 1024*1024 = 1 MB) used when fetching messages. A buffer can contain multiple messages of arbitrary size. It basically means: a fetch does not wait any longer than it takes for the buffer to fill up.

Related

How to prepare the "correct" buffer size when receiving netlink responses?

While implementing a netlink (rtnetlink) module, I ran into a problem:
As with UDP, part of the message (packet) is lost when the receive buffer is not big enough (e.g. when the buffer is 1024 bytes and you received exactly 1024 bytes, you cannot tell whether the message was truncated).
So I wondered how to prepare the "correct" receive buffer.
Initially I had the idea to MSG_PEEK just the nlmsghdr, extract the message size from there, and then do the final receive.
As a safety measure I allocated one byte more, just to be able to complain if the received size used the full buffer.
Unfortunately it did!
Is there an algorithm that will work, not needing a ridiculously huge receive buffer?
Example:
nlmsg_len was 1276, so I tried to receive 1277 bytes, and I did.
So on the next attempt I blindly added 2000 bytes to the receive buffer, and the result was 2552 bytes long (1276 bytes longer than expected).
As said above I think I cannot "continue" to receive a longer message using multiple recvs, so I must receive all at once.
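
One approach along the lines of the MSG_PEEK idea above is to peek the whole datagram rather than just the header: peek into a buffer, and as long as the peeked data fills the buffer completely, double the buffer and peek again; only then do the consuming read. A minimal Python sketch of that loop (assuming Linux; the function name is mine, not from any library):

    import socket

    def recv_whole_datagram(sock, initial=4096, limit=1 << 20):
        """Receive one complete datagram without knowing its size up front.

        MSG_PEEK returns a copy of the pending datagram without consuming it,
        truncated to the buffer size, so keep doubling the buffer until the
        peeked data no longer fills it, then do the real (consuming) read.
        """
        size = initial
        while True:
            peeked = sock.recv(size, socket.MSG_PEEK)
            if len(peeked) < size:
                return sock.recv(size)   # the whole datagram fits: consume it
            if size >= limit:
                raise RuntimeError("datagram larger than %d bytes" % limit)
            size *= 2

    # e.g. for rtnetlink:
    # sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)

(In C on Linux, recv(2) also offers MSG_PEEK | MSG_TRUNC, where the return value is the real datagram length even when the buffer was too small, which avoids the doubling loop.)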

Incomplete Output from connection.recv buffer [duplicate]

Some sources say that recv should be given the maximum possible length of the message, like recv(1024):
message0 = str(client.recv(1024).decode('utf-8'))
But other sources say it should be given the exact number of bytes of the incoming message. If the message is "hello":
message0 = str(client.recv(5).decode('utf-8'))
What is the correct way of using recv()?
Some sources say ... But other sources say ... message ...
Both sources are wrong.
The argument to recv is the maximum number of bytes one wants to read at once.
With a UDP socket this is the size of the message one wants to read, or larger; a single recv will only return a single message anyway. If the given size is smaller than the message, the message is truncated and the rest is discarded.
With a TCP socket (the case you ask about) there is no concept of a message in the first place, since TCP is a byte stream only. recv will simply return the number of bytes available for reading, up to the given size. Specifically, a single recv in a TCP receiver does not need to match a single send in the sender. It might match, and it often will match if the amount of data is small, but there is no guarantee and one should never rely on it.
... message0 = str(client.recv(5).decode('utf-8'))
Note that calling decode('utf-8') directly on the data returned by recv is a bad idea. One first needs to be sure that all the expected data have been read and only then call decode('utf-8'). If only part of the data has been read, the end of the read data could fall in the middle of a character, since a single character in UTF-8 may be encoded as multiple bytes (everything except ASCII characters). If decode('utf-8') is called on an incompletely encoded character it will throw an exception and your code will break.
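
To expand on that for the TCP case: the usual pattern is to read in a loop until the number of bytes you expect (known from your own framing, e.g. a length prefix) has arrived, and only then decode. A minimal sketch; the helper name and the 4-byte length prefix are assumptions for illustration, not part of the question's protocol:

    def recv_exact(sock, n):
        """Read exactly n bytes from a TCP socket; fail if the peer closes early."""
        chunks = []
        remaining = n
        while remaining > 0:
            chunk = sock.recv(remaining)
            if not chunk:
                raise ConnectionError("connection closed before the full message arrived")
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)

    # Example with a 4-byte big-endian length prefix sent by the peer:
    # import struct
    # (length,) = struct.unpack(">I", recv_exact(client, 4))
    # message0 = recv_exact(client, length).decode("utf-8")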

Unused bytes by protobuf implementation (for delimiter implementation)

I need to transfer data over a serial port. In order to ensure integrity of the data, I want a small envelope protocol around each protobuf message. I thought about the following:
message type (1 byte)
message size (2 bytes)
protobuf message (N bytes)
(checksum; optional)
The message type will mostly be a mapping to the messages defined in the proto files. However, if a message gets corrupted or some bytes are lost, the message size will no longer be correct and all subsequent bytes cannot be interpreted anymore. One way to solve this would be to introduce delimiters between messages, but for that I need to choose something that is not used by protobuf. Is there a byte sequence that is never used by any protobuf message?
I also thought about a different approach. If the master finds out that messages are corrupted, it should reset the communication to a clean start. For that I want the master to send a RESTART command to the slave. The slave should answer with an ACK and then start sending complete messages again. All bytes received between RESTART and ACK are to be discarded by the master. I want to encode ACK and RESTART as special messages. But with that approach I face the same problem: I need to find byte sequences for ACK and RESTART that are not used by any protobuf message.
Maybe I am also taking the wrong approach - feel free to suggest other approaches to deal with lost bytes.
Is there a byte sequence that is never used by any protobuf message?
No; it is a binary serializer and can contain arbitrary binary payloads (especially in the bytes type). You cannot use sentinel values. Length prefix is fine (your "message size" header), and a checksum may be a pragmatic option. Alternatively, you could impose an artificial sentinel to follow each message (maybe a guid chosen per-connection as part of the initial handshake), and use that to double-check that everything looks correct.
One way to help recover packet synchronization after a rare problem is to use a synchronization word at the beginning of each message, and use the checksum to check for valid messages.
This means that you put a constant value, e.g. 0x12345678, before your message type field. Then if a message fails checksum check, you can recover by finding the next 0x12345678 in your data.
Even though that value could sometimes occur in the middle of the message, it doesn't matter much. The checksum check will very probably catch that there isn't a real message at that position, and you can search forwards until you find the next marker.
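
A minimal Python sketch of that scheme (sync word + type + length + payload + CRC32, resynchronizing on the next sync word whenever the checksum fails); the constants and function names are illustrative, not part of protobuf or any particular library:

    import struct
    import zlib

    SYNC = b"\x12\x34\x56\x78"      # the constant marker suggested above
    HEADER = struct.Struct(">BH")    # message type (1 byte), payload size (2 bytes)
    CRC = struct.Struct(">I")        # CRC32 over header + payload

    def encode_frame(msg_type, payload):
        """Wrap an already-serialized protobuf message in sync word, header and CRC."""
        body = HEADER.pack(msg_type, len(payload)) + payload
        return SYNC + body + CRC.pack(zlib.crc32(body))

    def decode_frames(buf):
        """Yield (msg_type, payload) for each valid frame found in buf.

        When the checksum fails (corruption, or a sync pattern that was really
        payload data), scan forward to the next sync word instead of giving up.
        """
        i = 0
        while True:
            i = buf.find(SYNC, i)
            if i < 0:
                return
            start = i + len(SYNC)
            if len(buf) < start + HEADER.size + CRC.size:
                return                       # incomplete frame: wait for more bytes
            msg_type, size = HEADER.unpack_from(buf, start)
            end = start + HEADER.size + size
            if len(buf) < end + CRC.size:
                return
            body = buf[start:end]
            (crc,) = CRC.unpack_from(buf, end)
            if crc == zlib.crc32(body):
                yield msg_type, body[HEADER.size:]
                i = end + CRC.size
            else:
                i += 1                       # false sync or corrupted frame: resync

In a real receiver you would accumulate incoming serial bytes in a buffer, run the decoder over it, and keep whatever incomplete tail remains for the next read.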

Packing 20-byte chunks via BLE

I've never worked with Bluetooth before. I have to send data via BLE and I've found there is a limit of 20 bytes per chunk.
The sender is an Arduino and the receiver could be either an Android app or a Node.js app on a PC.
I have to send 9 values, stored as floats, so 4 bytes * 9 = 36 bytes. I therefore need 2 chunks to send all my data via BLE. The receiving side needs both chunks in order to process them. If some data is lost, I don't care.
I'm no expert in network protocols, and I think I have to give each message an incremental timestamp so that the receiver can glue together the two chunks with the same timestamp, or discard the older chunk when a newer timestamp arrives. But I'm not sure how to do a checksum, whether I really need one, whether I really have to care about it, or whether - for a simple beta version of my system - I can ignore all of those problems.
Can anyone give me some advice? For example, examples of similar situations handled with BLE communication?
You can get around the size limitation using the "Read Blob Request" of ATT. It allows you to read an attribute while also giving an offset. So you can use it to read the attribute with an offset of 0; if there are more than ATT_MTU bytes, you can request again with the offset at ATT_MTU*1, and if there is still more, at ATT_MTU*2, etc. (You can read about it in 3.4.4.5 of the Bluetooth v4.1 specification; it's in the 4.0 spec too, but I don't have that in front of me right now.)
If the value changes between requests, I'm not sure how you could go about detecting such a change. You could have the attribute send notifications when there's a change, to interrupt the process in case the value changes in the middle of reading it.
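
If you end up pushing the data as 20-byte writes or notifications instead of using Read Blob, here is a minimal Python sketch of the sequence-counter idea from the question (two chunks of a 2-byte header plus 18 bytes of payload; it assumes the Arduino packs its floats little-endian, and all names are illustrative):

    import struct

    CHUNK_PAYLOAD = 18               # 20-byte BLE packet minus a 2-byte header

    def make_chunks(seq, values):
        """Pack 9 floats (36 bytes) into 20-byte chunks: [seq, chunk index, payload]."""
        data = struct.pack("<9f", *values)
        n = (len(data) + CHUNK_PAYLOAD - 1) // CHUNK_PAYLOAD
        return [bytes([seq & 0xFF, i]) + data[i * CHUNK_PAYLOAD:(i + 1) * CHUNK_PAYLOAD]
                for i in range(n)]

    class Reassembler:
        """Glue together the two chunks that share a sequence number; drop stale halves."""

        def __init__(self):
            self.seq = None
            self.parts = {}

        def feed(self, chunk):
            seq, index, payload = chunk[0], chunk[1], bytes(chunk[2:])
            if seq != self.seq:          # a newer reading started: forget the old half
                self.seq, self.parts = seq, {}
            self.parts[index] = payload
            if len(self.parts) == 2:     # both halves arrived: rebuild the 9 floats
                data = self.parts[0] + self.parts[1]
                self.seq, self.parts = None, {}
                return struct.unpack("<9f", data)
            return None

The receiver just calls feed() on every incoming 20-byte packet and processes the tuple of 9 floats whenever feed() returns one; a chunk whose partner never arrives is silently dropped as soon as the next sequence number shows up, which matches the "if some data is lost, I don't care" requirement.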

Read dynamic-length content using msgrcv

I use the msgrcv function to read messages from a message queue. It works fine when I read data of a known length, but in some cases my message length is variable. In such cases, how can I allocate only the required amount of memory and read the message from the message queue without losing any data? Please give me an idea of how to overcome this issue.
Note:
With IBM MQ, when we read data that exceeds the buffer length, the actual size of the message is filled into the structure that we pass to the MQGET function. Is there any way to do the same kind of operation with a SysV message queue?
From my brief reading of the msgrcv() man page, if your buffer size is too small and you don't specify the MSG_NOERROR flag, msgrcv() will return -1 (with errno set to E2BIG) and leave the message in the queue.
In this case, you could double your buffer size (up to MSGMAX, which is 8192 on Linux by default) and try again.
