ZeroMQ Pub/Sub drops messages only when subscribing to topics - multithreading

I'm losing messages only when I'm subscribing to topics.
Here's the scenario:
The subscriber subscribes to a specific topic and then calls the publisher in a different thread (with the same context and the same "subscribed topic").
The publisher receives the "subscribed topic" and publishes it.
When I run two procedures (meaning 2 subscriber threads and 2 publisher threads), I receive only one message on one of the threads (randomly).
I have no idea why I'm losing the second message.
Publisher thread:
void *publisher = zmq_socket(ptStruct->zContext, ZMQ_PUB);
assert(0 == zmq_bind(publisher, "inproc://#1"));
printf("Publishes to %d \n", ptStruct->iID);
assert(-1 != zmq_send(publisher, &(ptStruct->iID), sizeof(ptStruct->iID), 0));
zmq_close(publisher);
Subscriber thread:
void *subscriber = zmq_socket(ptStruct->zContext, ZMQ_SUB);
assert(0 == zmq_connect(subscriber, "inproc://#1"));
assert(0 == zmq_setsockopt(subscriber, ZMQ_SUBSCRIBE, &(ptStruct->iID), sizeof(ptStruct->iID)));
printf("Subscribed to %d \n", ptStruct->iID);
/* Now run the publisher in a different thread */
OSTHREAD_CreateThread(&ptThread, publishThread, ptStruct, NULL);
assert(-1 != zmq_recv(subscriber, acRec, 255, 0));
printf("Got %d \n", acRec[0]);
zmq_close(subscriber);
I run the subscriber thread twice and this is the output:
Subscribed to 1
Subscribed to 2
Publishes to 1
Got 1
Publishes to 2

You're creating two different publishers that both bind() to the same inproc endpoint, "#1". An endpoint can only be bound once, so the second publisher fails to bind() and then never sends its message.
Additionally, you'll probably want to add some delay between the publisher's bind() and its first send(), because of the slow joiner problem: the publisher could send, and then drop, your message before the publisher and subscriber have finished connecting, which would also cause you to lose the message.

Related

About checkpoint strategy in event hub processor

I use the Event Hubs processor host to receive and process events from Event Hubs. For better performance, I checkpoint every 3 minutes instead of each time events are received:
public async Task ProcessEventAsync(PartitionContext context, IEnumerable<EventData> messages)
{
    foreach (var eventData in messages)
    {
        // do something
    }
    // Checkpoint only once the 3-minute interval has elapsed.
    if (checkpointStopWatch.Elapsed > TimeSpan.FromMinutes(3))
    {
        await context.CheckpointAsync();
    }
}
The problem is that some events might never be checkpointed if no new events are sent to the event hub, because ProcessEventAsync isn't invoked when there are no new messages.
Any suggestions for making sure all processed events get checkpointed, while still checkpointing only every few minutes?
Update: Per Sreeram's suggestion, I updated the code as below:
public async Task ProcessEventAsync(PartitionContext context, IEnumerable<EventData> messages)
{
    foreach (var eventData in messages)
    {
        // do something
    }
    // Count the events processed since the last checkpoint.
    this.lastProcessedEventsCount += messages.Count();
    if (this.checkpointStopWatch.Elapsed > TimeSpan.FromMinutes(3))
    {
        this.checkpointStopWatch.Restart();
        // Skip the checkpoint when nothing new has been processed.
        if (this.lastProcessedEventsCount > 0)
        {
            await context.CheckpointAsync();
            this.lastProcessedEventsCount = 0;
        }
    }
}
Good case to cover!
You could experience loss of event checkpoints (and, as a result, event replay) in the 2 cases below.
First, when you have a sparse data flow (for example, a batch of messages every 5 minutes with a checkpoint interval of 3 minutes) and the EventProcessorHost instance closes for some reason, you could see 2 minutes of EventData being re-processed. To handle that case, keep track of the lastProcessedEvent after completing IEventProcessor.onEvents/IEventProcessor.ProcessEventsAsync, and checkpoint when you get notified on close in IEventProcessor.onClose/IEventProcessor.CloseAsync.
Second, there might be a case where no more events arrive on a specific EventHubs partition. In that case, with your checkpointing strategy, the last event would never be checkpointed. This is uncommon when you have a continuous flow of EventData and you are not sending to a specific EventHubs partition (EventHubClient.send(EventData_Without_PartitionKey)). If you think you could run into this situation, use the:
EventProcessorOptions.setInvokeProcessorAfterReceiveTimeout(true); // in java or
EventProcessorOptions.InvokeProcessorAfterReceiveTimeout = true; // in C#
flag to wake up processEventsAsync every so often. Then keep track of LastProcessedEventData and LastCheckpointedEventData, and decide whether to checkpoint when no events are received by comparing the EventData.SequenceNumber property of those two events.
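The decision logic described above can be sketched independently of the SDK. The following is a minimal illustration in plain JavaScript, assuming a hypothetical helper named CheckpointTracker (the class, its method names, and the injectable clock are all assumptions made for this sketch, not part of any Event Hubs library). It only decides when a checkpoint is due; the actual checkpoint call stays in your processor.

```javascript
// Hypothetical helper, not part of any Event Hubs SDK.
class CheckpointTracker {
  constructor(intervalMs, now = () => Date.now()) {
    this.intervalMs = intervalMs;
    this.now = now;                   // injectable clock, so the logic can be tested
    this.lastProcessedSeq = null;     // sequence number of the last processed event
    this.lastCheckpointedSeq = null;  // sequence number covered by the last checkpoint
    this.lastCheckpointAt = now();
  }

  // Call after each invocation of the processor. An empty array models the
  // wake-up that happens after a receive timeout with no events.
  recordBatch(sequenceNumbers) {
    if (sequenceNumbers.length > 0) {
      this.lastProcessedSeq = sequenceNumbers[sequenceNumbers.length - 1];
    }
  }

  // True when something has been processed but not yet checkpointed.
  hasPending() {
    return this.lastProcessedSeq !== this.lastCheckpointedSeq;
  }

  // True when the interval has elapsed and there is something new to checkpoint.
  shouldCheckpoint() {
    return this.hasPending() && this.now() - this.lastCheckpointAt >= this.intervalMs;
  }

  // Call once the real checkpoint call has succeeded.
  markCheckpointed() {
    this.lastCheckpointedSeq = this.lastProcessedSeq;
    this.lastCheckpointAt = this.now();
  }
}

// Usage with a fake clock (milliseconds).
let fakeNow = 0;
const tracker = new CheckpointTracker(3 * 60 * 1000, () => fakeNow);
tracker.recordBatch([10, 11, 12]);
console.log(tracker.shouldCheckpoint()); // false: interval not yet elapsed
fakeNow = 3 * 60 * 1000;
tracker.recordBatch([]);                 // timeout wake-up, no new events
console.log(tracker.shouldCheckpoint()); // true: events 10-12 are still uncheckpointed
tracker.markCheckpointed();
console.log(tracker.shouldCheckpoint()); // false: nothing pending
```

For the close notification, the same helper could be consulted through hasPending() alone, ignoring the timer, so that anything processed since the last checkpoint is saved before the processor shuts down.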

Using multiple MQTT node-red nodes behaving not as expected

I have multiple MQTT nodes, each configured with a different topic. I want to process the values from multiple topics and draw some conclusions from them (basically stream analytics).
My expectation:
I know that JavaScript is single-threaded, so I expected that when data for one topic is received it is processed to completion, and only then is the next topic's data received, and so on.
Reality:
It behaves as if it were multithreaded.
Test case:
Flow: MQTT ---> Process for a second ---> Output
Sleep function code (not really sleeping, more like processing):
var start = new Date().getTime();
for (var i = 0; i < 1e7; i++)
{
    if ((new Date().getTime() - start) > 1000)
    {
        break;
    }
}
return msg;
Now I publish the values from 1 to 100 continuously using a for loop.
My expectation:
The values 1, 2, 3, ..., 100 are displayed one after another with a 1-second gap, so it should take approximately 100 seconds to display the values from 1 to 100.
Reality:
It first sleeps for 100 seconds, and then the values from 1 to 100 are all displayed at once. So what is happening here?
Flow json:
[{"id":"e9a53835.09af38","type":"tab","label":"Flow 1","disabled":false,"info":""},{"id":"5ffb1b40.1405b4","type":"debug","z":"e9a53835.09af38","name":"","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","x":438,"y":216,"wires":[]},{"id":"a8406277.a78ee","type":"mqtt in","z":"e9a53835.09af38","name":"Test MQTT Queue","topic":"1","qos":"2","broker":"b4c58fab.26844","x":146,"y":120,"wires":[["629e90bb.996ad"]]},{"id":"629e90bb.996ad","type":"function","z":"e9a53835.09af38","name":"Sleep 1 seconds","func":"var start = new Date().getTime();\nfor (var i = 0; i < 1e7; i++)\n{\n if ((new Date().getTime() - start) > 1000)\n {\n break;\n }\n}\nreturn msg;","outputs":1,"noerr":0,"x":298,"y":168,"wires":[["5ffb1b40.1405b4"]]},{"id":"b4c58fab.26844","type":"mqtt-broker","z":"","name":"","broker":"127.0.0.1","port":"1883","clientid":"","usetls":false,"compatmode":true,"keepalive":"60","cleansession":true,"willTopic":"","willQos":"2","willRetain":"false","willPayload":"","birthTopic":"","birthQos":"2","birthRetain":"false","birthPayload":""}]
C# publisher function:
// Retain: false, QoS = 2 on both publisher and client.
for (int i = 1; i <= 10; i++)
{
    // The topic is a string; "1" matches the topic configured in the MQTT node.
    client.Publish("1", Encoding.UTF8.GetBytes(i.ToString()), MqttMsgBase.QOS_LEVEL_EXACTLY_ONCE, false);
}
The first message arrives and is passed to the Function node, which then runs a busy-wait loop, preventing the node.js event loop from processing any other work.
During that time, the remaining 99 messages arrive in the underlying MQTT client, and internal events are queued up to process them.
The first message then finally makes it to the Debug node. The Debug node passes the message to the websocket asynchronously, which means that piece of work is wrapped in an event and put at the end of the node.js event queue - behind the 99 messages.
The same then happens for the next 99 events: they are processed synchronously with no opportunity for the node.js event loop to make progress, and each one adds another event to the end of the queue to have its message passed to Debug.
Once the last message has been processed, the node.js event loop reaches the events that send the debug messages over the websocket, and all 100 messages appear in the Debug sidebar.
The key here is that blocking synchronously is a bad thing to do in the node.js world. If you want to delay a message, use a Delay node, which does so using timers - thereby allowing node.js to continue processing other work in the background.
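The ordering described above can be reproduced in a few lines of plain Node.js. This is a toy sketch rather than Node-RED code: the function names runDemo and busyWait are made up for illustration, the queued setImmediate callbacks stand in for MQTT messages that have already arrived, and a second setImmediate stands in for the asynchronous hand-off to the Debug sidebar.

```javascript
// Spin for roughly `ms` milliseconds. Nothing else on the event loop can run
// while this loop executes, just like the Function node's busy-wait.
function busyWait(ms) {
  const start = Date.now();
  while (Date.now() - start < ms) {
    // intentionally empty
  }
}

// Resolves with the order in which messages were processed and displayed.
function runDemo(count, busyMs) {
  return new Promise((resolve) => {
    const log = [];
    let displayed = 0;

    function onMessage(n) {
      busyWait(busyMs);              // blocking "processing" step
      log.push(`processed ${n}`);
      setImmediate(() => {           // deferred output joins the back of the queue
        log.push(`displayed ${n}`);
        displayed += 1;
        if (displayed === count) resolve(log);
      });
    }

    // All messages are already queued before the first handler runs.
    for (let n = 1; n <= count; n++) {
      setImmediate(() => onMessage(n));
    }
  });
}

runDemo(3, 20).then((log) => console.log(log.join('\n')));
```

Every "processed" line is logged before the first "displayed" line, because each deferred output is queued behind the handlers that were already waiting. Replacing the busy-wait with a timer (which is what the Delay node does) would let the event loop interleave the two.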

Event Hub receiver does not read all messages

I implemented 2 simple services in Service Fabric that communicate over Event Hub, and I'm encountering very strange behavior.
The listener service reads the messages using PartitionReceiver with the ReceiveAsync method. It always reads from the start of the partition, but even though the maxMessageCount parameter is set to a very high number, one that definitely exceeds the number of messages in the partition, it reads only a "random" number of messages. It always starts reading correctly from the beginning of the partition, but it almost never reads the full list of messages that should be present there...
Did I miss something in the documentation and this is normal behavior, or am I right that this is very strange behavior?
A code snippet of my receiver service:
PartitionReceiver receiver = eventHubClient.CreateReceiver(PartitionReceiver.DefaultConsumerGroupName, Convert.ToString(partition), PartitionReceiver.StartOfStream);
ServiceEventSource.Current.Write("RecieveStart");
IEnumerable<EventData> ehEvents = null;
int i = 0;
do
{
    try
    {
        ehEvents = await receiver.ReceiveAsync(1000);
        break;
    }
    catch (OperationCanceledException)
    {
        if (i == NUM_OF_RETRIES-1)
        {
            await eventHubClient.CloseAsync();
            StatusCode(500);
        }
    }
    i++;
} while (i < NUM_OF_RETRIES);

High performance on Nodejs RabbitMQ server

I'm building an analytics system that has to support a million users online at the same time. I use RabbitMQ as a message broker to reduce the load on the server.
Here is my diagram
My system includes 3 components.
Publisher server (producer): This system is built on Node.js. Its purpose is to publish messages to the queue.
RabbitMQ queue: This system stores the messages sent by the publisher server. After that, a connection is opened to deliver messages from the queue to the subscriber server.
Subscriber server (consumer): This system receives the messages from the queue.
Publisher server source code
var amqp = require('amqplib/callback_api');
amqp.connect("amqp://localhost", function(error, connect) {
    if (error) {
        return callback(-1, null);
    } else {
        connect.createChannel(function(error, channel) {
            if (error) {
                return callback(-3, null);
            } else {
                var q = 'logs';
                var msg = data; // object
                // convert msg object to a UTF-8 buffer (the consumer decodes it with toString())
                var new_msg = Buffer.from(JSON.stringify(msg));
                channel.assertExchange(q, 'fanout', { durable: false });
                // new_msg is already a Buffer, so the deprecated `new Buffer(...)` wrapper is not needed
                channel.publish(q, 'message_queues', new_msg);
                console.log(" [x] Sent %s", new_msg);
                return callback(null, msg);
            }
        });
    }
});
Create an exclusive "message_queues" exchange of type "fanout" to broadcast to all consumers.
Subscriber server source code
var amqp = require('amqplib/callback_api');
amqp.connect("amqp://localhost", function(error, connect) {
    if (error) {
        console.log('111');
    } else {
        connect.createChannel(function(error, channel) {
            if (error) {
                console.log('1');
            } else {
                var ex = 'logs';
                channel.assertExchange(ex, 'fanout', { durable: false });
                channel.assertQueue('message_queues', { exclusive: true }, function(err, q) {
                    if (err) {
                        console.log('123');
                    } else {
                        console.log(" [*] Waiting for messages in %s. To exit press CTRL+C", q.queue);
                        channel.bindQueue(q.queue, ex, 'message_queues');
                        channel.consume(q.queue, function(msg) {
                            console.log(" [x] %s", msg.content.toString());
                        }, { noAck: true });
                    }
                });
            }
        });
    }
});
Receive messages from the "message_queues" exchange.
When I send a single message, the system works well. However, when I benchmarked the system (with ~1000 users sending requests per second), it ran into problems. The system seems to be overloaded or to suffer a buffer overflow (or something else isn't working well).
I only started reading about RabbitMQ 2 days ago. I know its tutorials are basic examples, so I need help building a real-world system. Any solutions or suggestions?
I hope my question makes sense.
Your question is general. You should probably provide more details to help identify the bottleneck.
So, first of all, I think you should check RabbitMQ itself, to see whether it's the bottleneck or not.
There are many things that can go wrong:
The number of consumers that can consume the messages is too low (I assume you use a pool of consumers)
The network is too slow
The queues and messages are replicated between too many RabbitMQ nodes and go to disk (it's possible to use RabbitMQ like this)
The consumer can't really handle a message, and it gets constantly re-queued
So, in general, during your tests you should check RabbitMQ and see what happens there.
Once a message arrives in the queue, it is in the Ready state. It stays there until one of the consumers connected to the queue takes it for handling.
When one of the consumers (Rabbit does round-robin between them) picks the message for processing, its state turns to Unacknowledged.
If the consumer fails to handle the message, Rabbit re-queues it so that another consumer has a chance to handle it.
Of course, if the consumer handles the message successfully, the message disappears from the RabbitMQ server.
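To make the lifecycle above concrete, here is a toy in-memory model of it in JavaScript. It is only a sketch of the states and transitions: ToyQueue and its methods are invented for illustration and are not RabbitMQ's or amqplib's API, and details such as where a re-queued message lands are simplified.

```javascript
// Toy model of the Ready / Unacknowledged lifecycle (not a real broker).
class ToyQueue {
  constructor(consumerCount) {
    this.ready = [];            // messages waiting for a consumer
    this.unacked = new Map();   // message -> index of the consumer handling it
    this.consumerCount = consumerCount;
    this.next = 0;              // round-robin cursor
  }

  // A newly published message starts in the Ready state.
  publish(msg) {
    this.ready.push(msg);
  }

  // Hand the oldest Ready message to the next consumer in round-robin order.
  deliver() {
    if (this.ready.length === 0) return null;
    const msg = this.ready.shift();
    const consumer = this.next;
    this.next = (this.next + 1) % this.consumerCount;
    this.unacked.set(msg, consumer);   // now Unacknowledged
    return { msg, consumer };
  }

  // Handled successfully: the message disappears from the queue.
  ack(msg) {
    this.unacked.delete(msg);
  }

  // Handling failed: the message goes back to Ready for another attempt.
  // Simplification: it is put at the front of the Ready list.
  requeue(msg) {
    if (this.unacked.delete(msg)) this.ready.unshift(msg);
  }

  counts() {
    return { ready: this.ready.length, unacked: this.unacked.size };
  }
}

const queue = new ToyQueue(2);
queue.publish('a');
queue.publish('b');
const first = queue.deliver();   // 'a' goes to consumer 0
const second = queue.deliver();  // 'b' goes to consumer 1
queue.ack(first.msg);            // 'a' is removed
queue.requeue(second.msg);       // 'b' returns to Ready
console.log(queue.counts());     // { ready: 1, unacked: 0 }
```

The two counters returned by counts() correspond to the Ready and Unacknowledged numbers that a management view of a queue would show.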
Assuming you've installed the RabbitMQ web UI (I highly recommend it, especially for beginners), you can visually see what happens in your queue: how many messages are in the Ready state and how many are Unacknowledged.
This will help to identify a bottleneck.
For example, if you see that only one message is usually in the Unacknowledged state, this can mean that the consumer can't handle the message and sends it back to Rabbit. Meanwhile, new messages keep arriving from the producer, so the number of Ready messages will increase very fast.
It can also point to the fact that you use only one consumer that can handle only one message at a time. So you can consider parallelizing here, by running many consumers in different threads or even clustering your application (in Rabbit, consumers can reside on different machines).
I hope this helps in general. Of course, as I've said before, if you have more specific questions, please provide more information about what exactly happens during the test.

EventhubReceiver: Inconsistent messages received

Issue: When the EventhubReceiver.ReceiveAsync() method is called in rapid succession, the receiver does not receive all the messages in the event hub. Below is the sample code used to receive events.
Note: the offset passed to the method is "0" across multiple calls
public async Task<IEnumerable<EventData>> GetMessagesAsync(string messageOffset, int maxCount)
{
    var receiver = await GetEventHubReceiver(messageOffset);
    return await receiver.ReceiveAsync(maxCount);
}
Is there any inherent checkpoint with the receive call that needs to be taken into account? A different approach perhaps?

Resources