Event Hub receiver does not read all messages - azure

I implemented two simple services in Service Fabric that communicate over Event Hub, and I'm encountering very strange behavior.
The listener service reads messages using a PartitionReceiver and its ReceiveAsync method. It always reads from the start of the partition, but even though the maxMessageCount parameter is set to a very high number, which definitely exceeds the number of messages in the partition, it reads only a "random" number of messages, never the full list. It always starts reading correctly from the beginning of the partition, but it almost never reads the full list of messages that should be there.
Did I miss something in the documentation and this is normal behavior, or am I right that this is very strange behavior?
A code snippet of my receiver service:
PartitionReceiver receiver = eventHubClient.CreateReceiver(
    PartitionReceiver.DefaultConsumerGroupName,
    Convert.ToString(partition),
    PartitionReceiver.StartOfStream);

ServiceEventSource.Current.Write("RecieveStart");
IEnumerable<EventData> ehEvents = null;
int i = 0;
do
{
    try
    {
        // A single ReceiveAsync call returns at most 1000 events here.
        ehEvents = await receiver.ReceiveAsync(1000);
        break;
    }
    catch (OperationCanceledException)
    {
        if (i == NUM_OF_RETRIES - 1)
        {
            await eventHubClient.CloseAsync();
            StatusCode(500);
        }
    }
    i++;
} while (i < NUM_OF_RETRIES);
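For what it's worth, with this client ReceiveAsync treats maxMessageCount as an upper bound and returns as soon as a batch of events is available, so a single call will rarely return everything in the partition. A minimal drain loop, assuming the receiver above:

var allEvents = new List<EventData>();
IEnumerable<EventData> batch;
// Keep receiving until a call times out with no events (ReceiveAsync returns null).
while ((batch = await receiver.ReceiveAsync(1000, TimeSpan.FromSeconds(5))) != null)
{
    allEvents.AddRange(batch);
}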

Related

Event Hub Consumer in Service Fabric

I'm trying to get a Service Fabric service to consistently pull messages from an Azure event hub. I seem to have everything wired up, but I notice that my consumer just stops pulling events.
I have a hub with a couple thousand events I've pushed to it. I configured the hub with 1 partition, and my Service Fabric service also has only 1 partition, to ease debugging.
The service starts, creates the EventHubClient, and from there uses it to create a PartitionReceiver. The receiver is passed to an "EventLoop" that enters an "infinite" while loop that calls receiver.ReceiveAsync. The code for the EventLoop is below.
What I am observing is that the first time through the loop I almost always get 1 message. The second time through I get somewhere between 103 and 200-ish messages. After that, I get no messages. It also seems like if I restart the service, I get the same messages again - but that's because when I restart the service I'm having it start back at the beginning of the stream.
I would expect this to keep running until my 2000 messages were consumed, and then it would wait for me (polling occasionally).
Is there something specific I need to do with the Azure.Messaging.EventHubs 5.3.0 package to make it keep pulling events?
//Here is how I am creating the EventHubClient:
var connectionString = "something secret";
var connectionStringBuilder = new EventHubsConnectionStringBuilder(connectionString)
{
    EntityPath = "NameOfMyEventHub"
};
try
{
    m_eventHubClient = EventHubClient.Create(connectionStringBuilder);
}
catch (Exception)
{
    // (error handling elided in the question)
    throw;
}
//Here is how I am getting the partition receiver
var receiver = m_eventHubClient.CreateReceiver("$Default", m_partitionId, EventPosition.FromStart());
//The event loop which the receiver is passed to
private async Task EventLoop(PartitionReceiver receiver)
{
    m_started = true;
    while (m_keepRunning)
    {
        var events = await receiver.ReceiveAsync(m_options.BatchSize, TimeSpan.FromSeconds(5));
        if (events != null) // First 2/3 times events aren't null. After that, always null, and I know there are more in the partition.
        {
            var eventsArray = events as EventData[] ?? events.ToArray();
            m_state.NumProcessedSinceLastSave += eventsArray.Length;
            foreach (var evt in eventsArray)
            {
                // Process the event
                await m_options.Processor.ProcessMessageAsync(evt, null);
                string lastOffset = evt.SystemProperties.Offset;
                if (m_state.NumProcessedSinceLastSave >= m_options.BatchSize)
                {
                    m_state.Offset = lastOffset;
                    m_state.NumProcessedSinceLastSave = 0;
                    await m_state.SaveAsync();
                }
            }
        }
    }
    m_started = false;
}
EDIT: a question was asked about the number of partitions. The event hub has a single partition, and the SF service also has a single one.
I intend to use Service Fabric state to keep track of my offset into the hub, but that's not the concern for now.
Partition listeners are created for each partition. I get the partitions like this:
public async Task StartAsync()
{
    // slice the pie according to distribution
    // this partition can get one or more assigned Event Hub partition ids
    string[] eventHubPartitionIds = (await m_eventHubClient.GetRuntimeInformationAsync()).PartitionIds;
    string[] resolvedEventHubPartitionIds = m_options.ResolveAssignedEventHubPartitions(eventHubPartitionIds);
    foreach (var resolvedPartition in resolvedEventHubPartitionIds)
    {
        var partitionReceiver = new EventHubListenerPartitionReceiver(m_eventHubClient, resolvedPartition, m_options);
        await partitionReceiver.StartAsync();
        m_partitionReceivers.Add(partitionReceiver);
    }
}
When partitionReceiver.StartAsync is called, it actually creates the PartitionReceiver, like this (it's actually a bit more than this, but this is the branch taken):
m_eventHubClient.CreateReceiver(m_options.EventHubConsumerGroupName, m_partitionId, EventPosition.FromStart());
Thanks for any tips.
Will
How many partitions do you have? I can't see in your code how you make sure you read all partitions in the default consumer group.
Any specific reason why you are using PartitionReceiver instead of using an EventProcessorHost?
To me, SF seems like a perfect fit for using the event processor host. I see there is already an SF-integrated solution that uses stateful services for checkpointing.
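For illustration, a minimal sketch with Microsoft.Azure.EventHubs.Processor (SampleProcessor, the connection strings, and the lease container name are placeholders of mine); the host handles partition distribution and lease-based checkpointing to blob storage for you:

using Microsoft.Azure.EventHubs;
using Microsoft.Azure.EventHubs.Processor;

class SampleProcessor : IEventProcessor
{
    public Task OpenAsync(PartitionContext context) => Task.CompletedTask;
    public Task CloseAsync(PartitionContext context, CloseReason reason) => Task.CompletedTask;
    public Task ProcessErrorAsync(PartitionContext context, Exception error) => Task.CompletedTask;

    public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> events)
    {
        foreach (var evt in events)
        {
            // process the event
        }
        await context.CheckpointAsync(); // offset tracking handled per partition
    }
}

// Registering the host:
var host = new EventProcessorHost(
    "NameOfMyEventHub",                          // event hub name (from the question)
    PartitionReceiver.DefaultConsumerGroupName,  // "$Default"
    eventHubConnectionString,                    // placeholder
    storageConnectionString,                     // placeholder: blob storage for leases/checkpoints
    "leases");                                   // placeholder container name
await host.RegisterEventProcessorAsync<SampleProcessor>();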

Correct way to process batches using receiveMessages

We are using the @azure/service-bus package to process message batches from multiple topics.
The code we use to take 20 messages from the topic every 2 seconds looks like this.
let isProcessing: boolean = false;

setInterval(async () => {
  if (isProcessing === false) {
    isProcessing = true;
    try {
      const messages: Array<ServiceBusMessage> =
        await receiver.receiveMessages(Configuration.SB.batchSize as number);
      if (messages.length > 0) {
        this.logger.info(`[SB] ${topic} - ${messages.length} require processing`);
        await Promise.all([
          ...messages.map(message => this.handleMsg(receiver, message, topic, moduleRef, handler))
        ]).catch(error => {
          this.logger.error(error.message, error);
        });
      }
      isProcessing = false;
    } catch (error) {
      this.logger.error(error.message, error);
      isProcessing = false;
    }
  }
}, Configuration.SB.tickInterval as number);
My question is - is this the best way to do this? Is there a better way? It works and is fairly performant, BUT I think we are losing receiveAndDelete messages sometimes, and I am trying to work out if it's our implementation.
Thanks for any help
"It works and is fairly performant BUT I think we are losing receiveAndDelete messages sometimes and I am trying to work out if it's our implementation"
There are two modes for receiving messages:
Unsafe with ReceiveAndDelete
Safe with PeekLock
When ReceiveAndDelete mode is used, the moment messages are received by the client, they are automatically deleted from the server. So this is at-most-once delivery.
With PeekLock, a message is "leased" to the client for a maximum of 5 minutes, and the client has to either acknowledge successful processing by requesting message completion, or cancel/dead-letter it if it can't handle it. If none of these operations takes place within the defined lease time (which doesn't have to be strictly 5 minutes and could be less), the message is retried until the maximum number of delivery attempts (MaxDeliveryCount) is exceeded, at which point it is dead-lettered. Note that the message is never lost, even if it failed to process and was dead-lettered. This is therefore at-least-once delivery, which could be more suitable for your scenario. It will have a slight impact on how you code your client, but not a drastic change.
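For illustration, a minimal PeekLock sketch (shown here with the .NET Azure.Messaging.ServiceBus client; @azure/service-bus exposes the same semantics via receiveMode: "peekLock" and the receiver's completeMessage/abandonMessage methods; the topic and subscription names are placeholders):

using Azure.Messaging.ServiceBus;

await using var client = new ServiceBusClient(connectionString);
await using var receiver = client.CreateReceiver("my-topic", "my-subscription",
    new ServiceBusReceiverOptions { ReceiveMode = ServiceBusReceiveMode.PeekLock });

foreach (var message in await receiver.ReceiveMessagesAsync(maxMessages: 20))
{
    try
    {
        // ... process the message ...
        await receiver.CompleteMessageAsync(message); // settle: removes it from the subscription
    }
    catch
    {
        await receiver.AbandonMessageAsync(message);  // release the lock so it is redelivered
    }
}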

About checkpoint strategy in event hub processor

I use the Event Hubs processor host to receive and process events from Event Hubs. For better performance, I call checkpoint every 3 minutes instead of every time I receive events:
public async Task ProcessEventAsync(PartitionContext context, IEnumerable<EventData> messages)
{
    foreach (var eventData in messages)
    {
        // do something
    }
    if (checkpointStopWatch.Elapsed > TimeSpan.FromMinutes(3))
    {
        await context.CheckpointAsync();
    }
}
But the problem is that some events might never be checkpointed if no new events are sent to the event hub, as ProcessEventAsync won't be invoked when there are no new messages.
Any suggestions for making sure all processed events get checkpointed, while still only checkpointing every few minutes?
Update: Per Sreeram's suggestion, I updated the code as below:
public async Task ProcessEventAsync(PartitionContext context, IEnumerable<EventData> messages)
{
    foreach (var eventData in messages)
    {
        // do something
    }
    this.lastProcessedEventsCount += messages.Count();
    if (this.checkpointStopWatch.Elapsed > TimeSpan.FromMinutes(3))
    {
        this.checkpointStopWatch.Restart();
        if (this.lastProcessedEventsCount > 0)
        {
            await context.CheckpointAsync();
            this.lastProcessedEventsCount = 0;
        }
    }
}
Great case you are covering!
You could experience loss of event checkpoints (and as a result event replay) in the two cases below:
When you have a sparse data flow (for example, a batch of messages every 5 minutes while your checkpoint interval is 3 minutes) and the EventProcessorHost instance closes for some reason, you could see ~2 minutes of EventData re-processed. To handle that case:
Keep track of the last processed event after completing IEventProcessor.onEvents/IEventProcessor.ProcessEventsAsync, and checkpoint when you get notified on close - IEventProcessor.onClose/IEventProcessor.CloseAsync.
There might also be a case where no more events are sent to a specific Event Hubs partition. In this case, you would never see the last event being checkpointed with your checkpointing strategy. However, this is uncommon when you have a continuous flow of EventData and you are not sending to a specific Event Hubs partition (EventHubClient.send(EventData_Without_PartitionKey)). If you think you could run into this situation, use the:
EventProcessorOptions.setInvokeProcessorAfterReceiveTimeout(true); // in Java, or
EventProcessorOptions.InvokeProcessorAfterReceiveTimeout = true; // in C#
flag to wake up processEventsAsync every so often. Then keep track of the LastProcessedEventData and LastCheckpointedEventData, and when no events are received, decide whether to checkpoint based on the EventData.SequenceNumber property of those events.
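For illustration, a minimal C# sketch of that bookkeeping (field names such as lastProcessedSequenceNumber are mine, not from the question; it assumes a Microsoft.Azure.EventHubs.Processor host):

// Register the host so ProcessEventsAsync also fires on receive timeouts
// (with an empty batch), giving an idle partition a chance to checkpoint.
var options = new EventProcessorOptions { InvokeProcessorAfterReceiveTimeout = true };
await host.RegisterEventProcessorAsync<SampleProcessor>(options);

// Inside the processor: checkpoint on the timer only when there is progress
// that hasn't been checkpointed yet, even if this invocation carried no events.
public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
{
    foreach (var eventData in messages ?? Enumerable.Empty<EventData>())
    {
        // do something
        this.lastProcessedSequenceNumber = eventData.SystemProperties.SequenceNumber;
    }
    if (this.checkpointStopWatch.Elapsed > TimeSpan.FromMinutes(3)
        && this.lastProcessedSequenceNumber > this.lastCheckpointedSequenceNumber)
    {
        await context.CheckpointAsync();
        this.lastCheckpointedSequenceNumber = this.lastProcessedSequenceNumber;
        this.checkpointStopWatch.Restart();
    }
}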

Complete a message in a dead letter queue on Azure Service Bus

I want to be able to remove selected messages from my deadletter queue.
How is this accomplished?
I constantly get an error:
The operation cannot be completed because the RecieveContext is Null
I have tried every approach I can think of and have read about; here is where I am now:
public void DeleteMessageFromDeadletterQueue<T>(string queueName, long sequenceNumber)
{
    var client = GetQueueClient(queueName, true);
    var messages = GetMessages(client);
    foreach (var m in messages)
    {
        if (m.SequenceNumber == sequenceNumber)
        {
            m.Complete();
        }
        else
        {
            m.Abandon();
        }
    }
}
/// <summary>
/// Return a list of all messages in a Queue
/// </summary>
/// <param name="client"></param>
/// <returns></returns>
private IEnumerable<BrokeredMessage> GetMessages(QueueClient client)
{
    var peekedMessages = client.PeekBatch(0, peekedMessageBatchCount).ToList();
    bool getmore = peekedMessages.Count == peekedMessageBatchCount;
    while (getmore)
    {
        var moreMessages = client.PeekBatch(peekedMessages.Last().SequenceNumber, peekedMessageBatchCount).ToList();
        peekedMessages.AddRange(moreMessages);
        getmore = moreMessages.Count == peekedMessageBatchCount;
    }
    return peekedMessages;
}
Not sure why this seems to be such a difficult task.
The issue here is that you've called PeekBatch, which returns messages that are only peeked. There is no receive context which you can then use to either complete or abandon the message. The Peek and PeekBatch operations only return the messages and do not lock them at all, even if the ReceiveMode is set to PeekLock. They're mostly for skimming the queue; you can't take an action on them. Note that the docs for both Abandon and Complete state they "must only be called on a message that has been received by using a receiver operating in Peek-Lock ReceiveMode." It's not clear there, but Peek and PeekBatch don't count, as they don't actually get a receive context. This is why it fails when you attempt to call Abandon. If you actually found the one you were looking for, it would throw a different error when you called Complete.
What you want to do is use a ReceiveBatch operation instead, with the client in PeekLock ReceiveMode. This will actually pull a batch of the messages back, and when you look through them and find the one you want, you can actually complete the message. When you fire the Abandon, it will immediately release the messages that aren't the one you want back to the queue.
If your dead-letter queue is pretty small, usually this won't be bad. If it is really large, then this approach isn't the most efficient. You're treating the dead-letter queue more like a heap and digging through it rather than processing the messages in order. This isn't uncommon when dealing with dead-letter queues that require manual intervention, but if you have a LOT of these, it may be better to process the dead-letter queue into a different type of store where you can more easily find and destroy messages, but can still recreate the ones that are okay to push to a different queue for reprocessing.
There may be other options, such as using Defer, if you are manually dead lettering things. See How to use the MessageReceiver.Receive method by sequenceNumber on ServiceBus.
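For illustration, a minimal sketch of that approach with the same SDK as in the question (the connection string and queue name are placeholders); FormatDeadLetterPath gives the DLQ path, and PeekLock mode provides the receive context that Complete/Abandon need:

var dlqPath = QueueClient.FormatDeadLetterPath("myqueue");
var dlqClient = QueueClient.CreateFromConnectionString(connectionString, dlqPath, ReceiveMode.PeekLock);

foreach (var msg in dlqClient.ReceiveBatch(100, TimeSpan.FromSeconds(5)))
{
    if (msg.SequenceNumber == sequenceNumber)
        msg.Complete(); // removes the message from the dead-letter queue
    else
        msg.Abandon();  // releases the lock; the message stays in the DLQ
}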
I was unsuccessful with MikeWo's suggestion, because when I used the combination of instantiating the DLQ QueueClient with ReceiveMode.PeekLock and pulling messages with ReceiveBatch, I was using the versions of Receive/ReceiveBatch that request the message by its SequenceNumber.
[aside: in my app, I peek all the messages and list them, and have another handler to re-queue a dead-lettered message to the main queue based on its specific sequence number...]
But the call to Receive(long sequenceNumber) or ReceiveBatch(IEnumerable<long> sequenceNumbers) on the DLQ client always throws the exception "Failed to lock one or more specified messages. The message does not exist." (even when I only passed 1 and it is definitely in the queue).
Additionally, for reasons that aren't clear, ReceiveBatch(int messageCount) always returns only the next 1 message in the queue, no matter what value is used as the messageCount.
What finally worked for me was the following:
QueueClient queueClient, deadLetterClient;
GetQueueClients(qname, ReceiveMode.PeekLock, out queueClient, out deadLetterClient);
BrokeredMessage msg = null;
var mapSequenceNumberToBrokeredMessage = new Dictionary<long, BrokeredMessage>();
while (msg == null)
{
#if UseReceive
    var message = deadLetterClient.Receive();
#elif UseReceiveBatch
    var messageEnumerable = deadLetterClient.ReceiveBatch(CnCountOfMessagesToPeek).ToList();
    if ((messageEnumerable == null) || (messageEnumerable.Count == 0))
        break;
    else if (messageEnumerable.Count != 1)
        throw new ApplicationException("Invalid expectation that there'd always be only 1 msg returned by ReceiveBatch");
    // I've yet to get back more than one in the deadletter queue, but...
    var message = messageEnumerable.First();
#endif
    if (message.SequenceNumber == lMessageId)
    {
        msg = message;
        break;
    }
    else if (mapSequenceNumberToBrokeredMessage.ContainsKey(message.SequenceNumber))
    {
        // this means it's started the list over, so we didn't find it...
        break;
    }
    else
        mapSequenceNumberToBrokeredMessage.Add(message.SequenceNumber, message);
    message.Abandon();
}
if (msg == null)
    throw new ApplicationException("Unable to find a message in the deadletter queue with the SequenceNumber: " + lMessageId);
var strMessage = GetMessage(msg);
var newMsg = new BrokeredMessage(strMessage);
queueClient.Send(newMsg);
msg.Complete();

Azure Service Bus issue

I'm working on a project where we would like to use Service Bus as a classical queue, for messaging. Very often I see messages in the queue that I can't get. These messages are active (not dead-lettered). I don't receive any exceptions; the receive waits for the timeout and then the message is null, without any errors. I've already checked the size of the messages, but all are less than 1.5 KB.
try
{
    ...
    QueueClient _queueClient = QueueClient.CreateFromConnectionString(
        Properties.Settings.Default.ServiseBusConnectionString,
        Properties.Settings.Default.StatQueueName);
    var count = 1000;
    for (var i = 1; i <= count; i++)
    {
        var message = _queueClient.Receive();
        // timeout
        if (message == null)
        {
            // message is null :(
            continue;
        }
    }
    ...
}
catch
{
    ...
}
Read How to receive messages from a queue and make sure you call Complete() or Abandon() to finish with each message you receive.
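For illustration, a minimal sketch (same client as in the question; in the default PeekLock mode, every received message must be settled, or it only becomes receivable again after its lock expires):

var message = _queueClient.Receive();
if (message != null)
{
    try
    {
        // ... process the message ...
        message.Complete(); // done: removes it from the queue
    }
    catch (Exception)
    {
        message.Abandon();  // failed: releases the lock immediately for another receive
    }
}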
