Best practices for poison message handling with an Azure Service Bus topic

Dealing with poison messages (messages that throw an exception while being consumed) from Azure Service Bus can lead to a loop that lasts until the number of retries reaches the maxDeliveryCount setting of the topic subscription.
Does the SequenceNumber that Azure Service Bus assigns to a message keep increasing on each failed attempt until maxDeliveryCount is reached?
Is setting maxDeliveryCount = 1 the best practice for dealing with poison messages, so that the consumer never attempts to process a message a second time once it has failed?

Best practices depend on your application and your retry approach.
In my experience, messages most often fail for one of three reasons:
1. A dependent service is not available (Redis, SQL connection issue).
2. The message is faulty (it lacks a mandatory parameter or carries an incorrect value).
3. The processing code has an issue (a bug in the message-processing code).
For the 1st and 3rd scenarios, I created a C# WebJob that runs and reprocesses dead-letter messages.
Below is my code:
using System;
using System.Configuration;
using System.IO;
using System.Text;
using Microsoft.ServiceBus.Messaging;
internal class Program
{
    // ConfigurationManager replaces the obsolete ConfigurationSettings class.
    private static string connectionString = ConfigurationManager.AppSettings["GroupAssetConnection"];
    private static string topicName = ConfigurationManager.AppSettings["GroupAssetTopic"];
    private static string subscriptionName = ConfigurationManager.AppSettings["GroupAssetSubscription"];
    private static string databaseEndPoint = ConfigurationManager.AppSettings["DatabaseEndPoint"];
    private static string databaseKey = ConfigurationManager.AppSettings["DatabaseKey"];
    private static string deadLetterQueuePath = "/$DeadLetterQueue";
    private static void Main(string[] args)
    {
        // groupAssetSyncService, log and documentClient are initialized elsewhere;
        // their setup was not included in the original post.
        try
        {
            ReadDLQMessages(groupAssetSyncService, log);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
            throw;
        }
        finally
        {
            documentClient.Dispose();
        }
        Console.WriteLine("All messages read successfully from the dead-letter queue");
        Console.ReadLine();
    }
    public static void ReadDLQMessages(IGroupAssetSyncService groupSyncService, ILog log)
    {
        int counter = 1;
        // The dead-letter queue is addressed as a sub-path of the subscription.
        SubscriptionClient subscriptionClient = SubscriptionClient.CreateFromConnectionString(connectionString, topicName, subscriptionName + deadLetterQueuePath);
        while (true)
        {
            BrokeredMessage deadLetter = subscriptionClient.Receive(TimeSpan.FromMilliseconds(500));
            if (deadLetter == null)
            {
                break; // no more dead-lettered messages
            }
            string message;
            using (var reader = new StreamReader(deadLetter.GetBody<Stream>(), Encoding.UTF8))
            {
                message = reader.ReadToEnd();
            }
            // Reprocess the message, then remove it from the dead-letter queue.
            groupSyncService.UpdateDataAsync(message).GetAwaiter().GetResult();
            Console.WriteLine($"{counter} message received");
            counter++;
            deadLetter.Complete();
        }
        subscriptionClient.Close();
    }
}
For the 2nd scenario, we verify dead-letter messages manually (with a custom UI or Service Bus Explorer); sometimes we correct the message data, and sometimes we purge the messages and clear the queue.
I wouldn't recommend maxDeliveryCount=1. If a network or connection issue occurs, the built-in retry will process the message on a later attempt and clear it from the queue. When I was working on a finance application I kept maxDeliveryCount=5, while in my IoT application it is maxDeliveryCount=3.
If you read messages in batches, the complete batch is reprocessed if an error occurs for any message in it.
On SequenceNumber: it does not increase with each failed attempt. The broker assigns it once, when it accepts the message; the property that grows with every delivery attempt is DeliveryCount. The sequence number can be trusted as a unique identifier since it is assigned by a central and neutral authority and not by clients. It also represents the true order of arrival, and is more precise than a time stamp as an order criterion, because time stamps may not have a high enough resolution at extreme message rates and may be subject to (however minimal) clock skew in situations where the broker ownership transitions between nodes.
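To make the difference between the two counters concrete, here is a minimal in-memory sketch in plain Java (no Azure SDK involved). ToyBroker, its fields and the always-failing handler are hypothetical names made up for illustration. The model only mirrors the behaviour described in this answer: the sequence number is fixed when the message is accepted, while the delivery count grows with each attempt until maxDeliveryCount is reached.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Toy model only: this is NOT the Service Bus API.
class ToyBroker {
    static final class Message {
        final long sequenceNumber; // assigned once, when the broker accepts the message
        final String body;
        int deliveryCount;         // incremented on every delivery attempt

        Message(long sequenceNumber, String body) {
            this.sequenceNumber = sequenceNumber;
            this.body = body;
        }
    }

    private final int maxDeliveryCount;
    private long nextSequenceNumber = 1;
    private final Deque<Message> active = new ArrayDeque<>();
    final List<Message> deadLetter = new ArrayList<>();
    final List<String> attemptLog = new ArrayList<>();

    ToyBroker(int maxDeliveryCount) {
        this.maxDeliveryCount = maxDeliveryCount;
    }

    void send(String body) {
        active.add(new Message(nextSequenceNumber++, body));
    }

    // Delivers messages until the active queue is empty. The handler returns
    // true to complete a message and false to simulate a processing failure.
    void drain(Predicate<Message> handler) {
        while (!active.isEmpty()) {
            Message m = active.poll();
            m.deliveryCount++;
            attemptLog.add("seq=" + m.sequenceNumber + " delivery=" + m.deliveryCount);
            if (handler.test(m)) {
                continue;            // completed: the message is gone
            }
            if (m.deliveryCount >= maxDeliveryCount) {
                deadLetter.add(m);   // retries exhausted: move to the dead-letter list
            } else {
                active.addFirst(m);  // failed: make it available for redelivery
            }
        }
    }
}
```

With maxDeliveryCount set to 1, the same model dead-letters a message after its first failure, which is why a short outage of a dependent service would push otherwise healthy messages straight to the dead-letter queue.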

Related

Why SingleThreadExecutor throws OutOfMemoryError in Java

I have a message producer (RabbitMQ) and a Spring Boot service that receives messages from a RabbitMQ queue. The number of messages in this queue is unknown, as it depends on the traffic, that is, on how many messages are pushed to RabbitMQ. After my Spring Boot service has received messages from RabbitMQ, I store them locally in an ArrayDeque. Every message that comes through is stored in this local queue and then sent over a socket to another application. The messages have to be sent in the order in which they arrived from RabbitMQ.
Here is a snippet of my code.
public void addMessageToQueue(CML cml) throws ParseException {
    if (cml != null) {
        AgentEventData agentEventData = setAgentEventData(cml);
        log.info("Populated AgentEventData: {} ", agentEventData);
        MessageProcessor.getMessageQueue().getMessageQueue().add(agentEventData);
        // ExecutorService executorService = Executors.newFixedThreadPool(MessageProcessor.getMessageQueue().getMessageQueue().size());
        log.info("Message QUEUE Size: {}", MessageProcessor.getMessageQueue().getMessageQueue().size());
        QUEUE_MONITOR.setCachedQueue(MessageProcessor.getMessageQueue());
        /**
         * Queue has already methods for monitoring events, no need for a separate object
         */
        executeTasks();
    } else {
        log.error("CML Message is NULL, Message Cannot be added to the Message Queue.");
    }
}
private static void executeTasks() {
    ExecutorService executorService = Executors.newSingleThreadExecutor();
    try {
        executorService.execute(new MessageProcessor());
    } catch (Exception e) {
        log.error("Exception when executing Task: {}", e.getMessage());
    }
    log.info("Shutting down Executor Service........");
    executorService.shutdown();
    log.info("Executor Service Shutdown : {}", executorService.isShutdown());
}
I tried using newSingleThreadExecutor, as shown in the executeTasks() method, but after my app has been running on the server for some time I get this error on the consumer thread: java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached.
I then tried Executors.newFixedThreadPool(10) and still get the same error after some time.
What am I doing wrong, and which approach best fits my app/service?
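One thing the snippet does show is that executeTasks() builds a brand-new executor, and with it a new thread, for every incoming message. If MessageProcessor runs for a while, those threads can pile up, which would be consistent with the "unable to create native thread" error; that diagnosis is an assumption, since MessageProcessor is not shown. A minimal sketch of the alternative, with one executor created once and reused, follows. OrderedSender and its methods are hypothetical names, and the list stands in for the socket write.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OrderedSender {
    // Created once and shared by every message: a single worker thread
    // drains its internal FIFO queue, so submission order is preserved.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final List<String> sent = Collections.synchronizedList(new ArrayList<>());
    private final Set<String> workerThreads = ConcurrentHashMap.newKeySet();

    // Called for each message received from the broker.
    void enqueue(String message) {
        executor.execute(() -> {
            workerThreads.add(Thread.currentThread().getName());
            sent.add(message); // stand-in for writing the message to the socket
        });
    }

    // Called once, when the application stops. Returns true if all queued
    // messages were processed before the timeout.
    boolean shutdownAndWait() {
        executor.shutdown();
        try {
            return executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    List<String> sent() {
        return new ArrayList<>(sent);
    }

    int workerThreadCount() {
        return workerThreads.size();
    }
}
```

Because the executor already has its own queue, a separate ArrayDeque plus a per-message executor is not needed for ordering in this sketch; the executor is shut down only when the service itself stops.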

In queue-triggered Azure Webjobs can an Azure Storage Queue message be modified after webjob function failure but before poisoning?

I've got queue-triggered functions in my Azure WebJobs. The normal behavior, of course, is that when the function fails MaxDequeueCount times, the message is put into the appropriate poison queue. I would like to modify the message after the error but before it is inserted into the poison queue. Example:
Initial message:
{ "Name":"Tom", "Age":30 }
Upon failure I want to modify the message as follows and have the modified message inserted into the poison queue:
{ "Name":"Tom", "Age":30, "ErrorMessage":"Unable to find user" }
Can this be done?
According to the Webjobs documentation, messages will get put on the poison queue after 5 failed attempts to process the message:
The SDK will call a function up to 5 times to process a queue message.
If the fifth try fails, the message is moved to a poison queue. The
maximum number of retries is configurable.
Source: https://github.com/Azure/azure-webjobs-sdk/wiki/Queues#poison
This is the automatic behavior. However, you can still handle exceptions in your WebJobs function code (so that the exception never leaves your function and automatic poison message handling is not triggered) and put a modified message on the poison queue using output bindings.
Another option is to check the dequeueCount property, which indicates how many times processing of the message has been attempted.
You can get the number of times a message has been picked up for
processing by adding an int parameter named dequeueCount to your
function. You can then check the dequeue count in function code and
perform your own poison message handling when the number exceeds a
threshold, as shown in the following example.
public static void CopyBlob(
    [QueueTrigger("copyblobqueue")] string blobName, int dequeueCount,
    [Blob("textblobs/{queueTrigger}", FileAccess.Read)] Stream blobInput,
    [Blob("textblobs/{queueTrigger}-new", FileAccess.Write)] Stream blobOutput,
    TextWriter logger)
{
    if (dequeueCount > 3)
    {
        logger.WriteLine("Failed to copy blob, name=" + blobName);
    }
    else
    {
        blobInput.CopyTo(blobOutput, 4096);
    }
}
(also taken from above link).
Your function signature could look like this (the poison queue parameter is an output binding, hence the out modifier):
public static void ProcessQueueMessage(
    [QueueTrigger("myqueue")] CloudQueueMessage message,
    [Queue("myqueue-poison")] out CloudQueueMessage poisonMessage,
    TextWriter logger)
The default maximum number of attempts is 5. You can also set this value yourself through the Queues.MaxDequeueCount property of the JobHostConfiguration instance, as in the code below:
static void Main(string[] args)
{
    var config = new JobHostConfiguration();
    config.Queues.MaxDequeueCount = 5; // set the maximum number of attempts
    var host = new JobHost(config);
    host.RunAndBlock();
}
You can then update the failed queue message once the maximum number of attempts has been reached. To force the retry mechanism, you can specify a non-existent Blob container. The code looks like this:
public static void ProcessQueueMessage([QueueTrigger("queue")] CloudQueueMessage message, [Blob("container/{queueTrigger}", FileAccess.Read)] Stream myBlob, ILogger logger)
{
    // Quote the key and the value so that the modified message stays valid JSON.
    string yourUpdatedString = "\"ErrorMessage\"" + ":" + "\"Unable to find user\"";
    string str1 = message.AsString;
    if (message.DequeueCount == 5) // here, the maximum number of attempts is set to 5
    {
        message.SetMessageContent(str1.Replace("}", "," + yourUpdatedString + "}")); // modify the failed message here
    }
    // The blob is expected to be missing, so the function fails here and the message is retried.
    logger.LogInformation($"Blob name:{message} \n Size: {myBlob.Length} bytes");
}
Once this is in place, you can see the updated queue message in the queue-poison queue.
UPDATED:
Since CloudQueueMessage is a sealed class, we cannot inherit it.
For your MySpecialPoco message, you can use JsonConvert.SerializeObject(message), as in the code below:
using Newtonsoft.Json;
// Note: this counter is shared by all messages, so it is only reliable
// when a single message is being retried at a time.
static int number = 0;
public static void ProcessQueueMessage([QueueTrigger("queue")] object message, [Blob("container/{queueTrigger}", FileAccess.Read)] Stream myBlob, ILogger logger)
{
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
    CloudQueueClient queueClient = storageAccount.CreateCloudQueueClient();
    CloudQueue queue = queueClient.GetQueueReference("queue-poison"); // get the poison queue
    CloudQueueMessage msg1 = new CloudQueueMessage(JsonConvert.SerializeObject(message));
    number++;
    string yourUpdatedString = "\"ErrorMessage\"" + ":" + "\"Unable to find user\"";
    string str1 = msg1.AsString;
    if (number == 5)
    {
        // On the fifth attempt, write the modified copy to the poison queue ourselves.
        msg1.SetMessageContent(str1.Replace("}", "," + yourUpdatedString + "}"));
        queue.AddMessage(msg1);
        number = 0;
    }
    logger.LogInformation($"Blob name:{message} \n Size: {myBlob.Length} bytes");
}
The drawback is that both the original and the updated queue messages are written to the poison queue.
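One way to avoid the duplicate follows from the first suggestion in this answer: on the final attempt, catch the failure, write the enriched message yourself, and return normally so that the runtime treats the message as handled and adds no copy of its own. The sketch below models that control flow in plain Java without any WebJobs SDK types. PoisonSketch, handle and run are hypothetical names, run is only a stand-in for the host's retry loop, and the limit of 5 matches the default mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model only: plain Java, no WebJobs SDK types.
class PoisonSketch {
    static final int MAX_DEQUEUE_COUNT = 5;

    // Appends an "ErrorMessage" field before the final closing brace of a JSON object.
    static String withError(String json, String error) {
        String head = json.substring(0, json.lastIndexOf('}')).trim();
        String separator = head.endsWith("{") ? " " : ", ";
        String escaped = error.replace("\\", "\\\\").replace("\"", "\\\"");
        return head + separator + "\"ErrorMessage\":\"" + escaped + "\" }";
    }

    // Stand-in for the triggered function. Before the last attempt it rethrows
    // so the host retries; on the last attempt it writes the enriched message
    // itself and returns normally.
    static void handle(String body, int dequeueCount, Consumer<String> work, List<String> poisonQueue) {
        try {
            work.accept(body);
        } catch (RuntimeException e) {
            if (dequeueCount < MAX_DEQUEUE_COUNT) {
                throw e;
            }
            poisonQueue.add(withError(body, e.getMessage()));
        }
    }

    // Stand-in for the host: retries up to MAX_DEQUEUE_COUNT times and only
    // poisons the original message if the last attempt still throws.
    static List<String> run(String body, Consumer<String> work) {
        List<String> poisonQueue = new ArrayList<>();
        for (int dequeueCount = 1; dequeueCount <= MAX_DEQUEUE_COUNT; dequeueCount++) {
            try {
                handle(body, dequeueCount, work, poisonQueue);
                return poisonQueue; // function returned: the message is deleted
            } catch (RuntimeException e) {
                if (dequeueCount == MAX_DEQUEUE_COUNT) {
                    poisonQueue.add(body); // automatic poison handling
                }
            }
        }
        return poisonQueue;
    }
}
```

In this model the poison list ends up with a single entry, the enriched one, because the last attempt never lets the exception escape.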

How to specify EventHub Consumer Group in a WebJob?

I am using WebJob's bindings to EventHub as described here:
https://github.com/Azure/azure-webjobs-sdk/wiki/EventHub-support
While the WebJob is running, trying to run Azure Service Bus Explorer on the same hub results in this exception:
Exception: A receiver with a higher epoch '14' already exists. A new receiver with epoch 0 cannot be created.
Make sure you are creating receiver with increasing epoch value to ensure connectivity, or ensure all old epoch receivers are closed or disconnected.
From what I understand, this is caused by the two listeners (the WebJob and Service Bus Explorer) using the same consumer group.
So my question is: how can I specify a different consumer group in my WebJob?
My current code looks like this:
Program.cs:
var config = new JobHostConfiguration()
{
    NameResolver = new NameResolver()
};
string eventHubConnectionString = ConfigurationManager.ConnectionStrings["EventHub"].ConnectionString;
string eventHubName = ConfigurationManager.AppSettings["EventHubName"];
string eventProcessorHostStorageConnectionString = ConfigurationManager.ConnectionStrings["EventProcessorHostStorage"].ConnectionString;
var eventHubConfig = new EventHubConfiguration();
eventHubConfig.AddReceiver(eventHubName, eventHubConnectionString, eventProcessorHostStorageConnectionString);
config.UseEventHub(eventHubConfig);
var host = new JobHost(config);
host.RunAndBlock();
Functions.cs:
public class Functions
{
    public static void Trigger([EventHubTrigger("%EventHubName%")] string message, TextWriter log)
    {
        log.WriteLine(message);
    }
}
[Edit - Bonus Question]
I don't fully grasp the use of consumer groups and the 'epoch' concept. Is one consumer group limited to one receiver?
The EventHubTrigger attribute has an optional ConsumerGroup property. Based on that, modify the trigger like this:
public class Functions
{
    public static void Trigger([EventHubTrigger("%EventHubName%", ConsumerGroup = "MyConsumerGroup")] string message, TextWriter log)
    {
        log.WriteLine(message);
    }
}
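On the bonus question, the error message itself hints at the rule: within one consumer group, a partition that already has a receiver with a higher epoch rejects a new receiver with a lower one, and a receiver in a different consumer group is not affected. The sketch below is a toy model of that rule in plain Java, as I understand it from the exception text; it is not the Event Hubs client API. EpochModel and tryAttach are hypothetical names, and limits on concurrent non-epoch readers are not modelled.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the epoch rule only; this is NOT the Event Hubs client API.
class EpochModel {
    // Highest epoch seen so far, keyed by "consumerGroup/partition".
    private final Map<String, Long> highestEpoch = new HashMap<>();

    // Returns true if a receiver with the given epoch may attach to the
    // partition through the given consumer group.
    boolean tryAttach(String consumerGroup, int partition, long epoch) {
        String key = consumerGroup + "/" + partition;
        long current = highestEpoch.getOrDefault(key, 0L);
        if (epoch < current) {
            // Corresponds to "A receiver with a higher epoch already exists".
            return false;
        }
        highestEpoch.put(key, epoch); // the newer or equal epoch takes over
        return true;
    }
}
```

In this model, a listener that attaches to partition 0 of one group with epoch 14 blocks a later attach with epoch 0 in that same group, while the same epoch-0 attach through a second consumer group succeeds. That is the situation the ConsumerGroup property resolves.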

How to get queue message type

I'm using Azure Storage Queues and I want to write some code that retrieves all queues, and then finds a handler that can process the message in this queue. For that I defined an interface like this:
public interface IHandler<T>
I have multiple implementations of this interface, like these: IHandler<CreateAccount> or IHandler<CreateOrder>. I use 1 queue per message type, so the CreateAccount messages would go into the create-account-queue.
How do I hook these up? In order to find the right Handler class for a message, I first need to know the message type, but it seems that CloudQueueMessage objects don't contain that information.
This is not really an answer to your question, but I will share how we handle the exact same situation in our application.
In our application, we send different kinds of messages, as you do, and handle those messages in a background process.
What we do is include the message type in the message body itself. So our message typically looks like this:
message: {
    type: 'Type Of Message',
    contents: {
        //Message contents
    }
}
One key difference is that all messages go into a single queue (instead of different queues, as in your case). The receiver (a background process) polls just that one queue, gets the message, identifies its type and calls the handler for that type accordingly.
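The dispatch step described above can be sketched as a lookup table from the type tag to a handler. This is a minimal illustration in plain Java without a queue or JSON library: TypeDispatcher, Envelope and the handler signatures are hypothetical, and the envelope is assumed to be already parsed into its type and contents.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

class TypeDispatcher {
    // Mirrors the message shape above: a type tag plus the raw contents.
    static final class Envelope {
        final String type;
        final String contents;

        Envelope(String type, String contents) {
            this.type = type;
            this.contents = contents;
        }
    }

    private final Map<String, Consumer<String>> handlers = new HashMap<>();

    // Associates a message type with the code that processes its contents.
    void register(String type, Consumer<String> handler) {
        handlers.put(type, handler);
    }

    // Routes the envelope to its handler. Returns false when no handler is
    // registered for the type, so the caller can decide how to treat it.
    boolean dispatch(Envelope envelope) {
        Consumer<String> handler = handlers.get(envelope.type);
        if (handler == null) {
            return false;
        }
        handler.accept(envelope.contents);
        return true;
    }
}
```

A receiver would register one handler per type at startup (for example "CreateAccount" and "CreateOrder") and call dispatch for every message it polls; an unknown type returns false instead of throwing, which leaves room for logging or moving the message aside.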
You can associate metadata with each queue. Since you mentioned that you use one queue per message type, you could put the handler name in the metadata for each queue. You can then enumerate all queues and get the metadata per queue that tells you what type of handler you should use. Here's a quick console app that demonstrates what I think you're asking for:
using System;
using System.Collections.Generic;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;
namespace QueueDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            //get a ref to our account.
            CloudStorageAccount storageAccount = CloudStorageAccount.Parse("UseDevelopmentStorage=true;");
            CloudQueueClient cloudQueueClient = storageAccount.CreateCloudQueueClient();
            //create our queues and add metadata showing what type of class each queue contains.
            CloudQueue queue1 = cloudQueueClient.GetQueueReference("queue1");
            queue1.Metadata.Add("classtype", "classtype1");
            queue1.CreateIfNotExists();
            CloudQueue queue2 = cloudQueueClient.GetQueueReference("queue2");
            queue2.Metadata.Add("classtype", "classtype2");
            queue2.CreateIfNotExists();
            //enumerate our queues in a storage account and look at their metadata...
            QueueContinuationToken token = null;
            List<CloudQueue> cloudQueueList = new List<CloudQueue>();
            List<string> queueNames = new List<string>();
            do
            {
                QueueResultSegment segment = cloudQueueClient.ListQueuesSegmented(token);
                token = segment.ContinuationToken;
                cloudQueueList.AddRange(segment.Results);
            }
            while (token != null);
            try
            {
                foreach (CloudQueue cloudQ in cloudQueueList)
                {
                    //call this, or else your metadata won't be included for the queue.
                    cloudQ.FetchAttributes();
                    Console.WriteLine("Cloud Queue name = {0}, class type = {1}", cloudQ.Name, cloudQ.Metadata["classtype"]);
                    queueNames.Add(cloudQ.Name);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("Exception thrown listing queues: " + ex.Message);
                throw;
            }
            //clean up after ourselves and delete queues.
            foreach (string oneQueueName in queueNames)
            {
                CloudQueue cloudQueue = cloudQueueClient.GetQueueReference(oneQueueName);
                cloudQueue.DeleteIfExists();
            }
            Console.ReadKey();
        }
    }
}
However, it might be easier to subclass QueueMessage, then dequeue each message and identify what subclass you're currently looking at, then pass it to the proper handler.

Azure web jobs - parallel message processing from queues not working properly

I need to provision SharePoint Online team rooms using Azure queues and WebJobs.
I have created a console application and published it as a continuous WebJob with the following settings:
config.Queues.BatchSize = 1;
config.Queues.MaxDequeueCount = 4;
config.Queues.MaxPollingInterval = TimeSpan.FromSeconds(15);
JobHost host = new JobHost();
host.RunAndBlock();
The trigger function looks like this:
public static void TriggerFunction([QueueTrigger("messagequeue")] CloudQueueMessage message)
{
    ProcessQueueMsg(message.AsString);
}
Inside the ProcessQueueMsg function I deserialize the received JSON message into a class and run the following operations:
I create a subsite in an existing site collection;
Using the PnP provisioning engine, I provision content in the subsite (lists, uploaded files, permissions, quick launch, etc.).
If the queue holds only one message to process, everything works correctly.
However, when I send two messages to the queue a few seconds apart, the second message overwrites the class properties while the first one is still being processed, and the first message is finished.
I tried running each message in a separate thread, but then the trigger function is marked as succeeded before the processing inside my function has completed. This way I have no control over potential exceptions or message dequeuing.
I also tried limiting the number of threads to 1 with a semaphore, but saw the same behavior:
private const int NrOfThreads = 1;
private static readonly SemaphoreSlim semaphore_ = new SemaphoreSlim(NrOfThreads, NrOfThreads);
//Inside TriggerFunction
try
{
    semaphore_.Wait();
    new Thread(ThreadProc).Start();
}
catch (Exception e)
{
    Console.Error.WriteLine(e);
}
public static void ThreadProc()
{
    try
    {
        DoWork();
    }
    catch (Exception e)
    {
        Console.Error.WriteLine(">>> Error: {0}", e);
    }
    finally
    {
        // release a slot for another thread
        semaphore_.Release();
    }
}
public static void DoWork()
{
    Console.WriteLine("This is a web job invocation: Process Id: {0}, Thread Id: {1}.", System.Diagnostics.Process.GetCurrentProcess().Id, Thread.CurrentThread.ManagedThreadId);
    ProcessQueueMsg();
    Console.WriteLine(">> Thread Done. Processing next message.");
}
Is there a way I can run my processing function for parallel messages in order to provision my sites without interfering?
Please let me know if you need more details.
Thank you in advance!
You're not passing in the config object to your JobHost on construction - that's why your config settings aren't having an effect. Change your code to:
JobHost host = new JobHost(config);
host.RunAndBlock();
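Separately from the configuration fix, the symptom in the question (a second message overwriting the class properties used by the first) is what one would expect if the deserialized message is kept in static or otherwise shared fields. That is an assumption, since ProcessQueueMsg is not shown. The sketch below illustrates the alternative in plain Java: each invocation builds its own state object and passes it along, so concurrent messages cannot interfere. Provisioner, Request and the returned strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class Provisioner {
    // Per-message state: created inside the call, never stored in a static field.
    static final class Request {
        final String siteName;

        Request(String siteName) {
            this.siteName = siteName;
        }
    }

    // Stand-in for the processing function: everything it needs travels in
    // the local 'request' object.
    static String process(String rawMessage) {
        Request request = new Request(rawMessage.trim());
        // ... long-running provisioning steps would read from 'request' here ...
        return "provisioned:" + request.siteName;
    }

    // Processes several messages in parallel and returns the results in
    // submission order.
    static List<String> processAll(List<String> messages) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String message : messages) {
                Callable<String> task = () -> process(message);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // rethrows any processing failure
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("processing failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Waiting on each Future also keeps failures visible to the caller, which is the control the question says was lost when the work was moved to a detached thread.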
