Visibility timeout for a subset of consumers - Azure

I'm trying to create a resilient system based on Azure Storage Queue.
I have workers in two different Azure regions processing messages from the same queue. I would like the worker in the same region as the queue to be prioritized, so that the worker in the second region only starts handling a message if the first one doesn't.
My idea was to use a visibility timeout that varies by worker type. Is that possible?

To the best of my knowledge, it is not possible. A message has only a single visibility timeout, which is set either when the message is sent to the queue or when a worker dequeues or updates it.
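For illustration, a minimal sketch with the azure-storage-queue Python SDK showing the only two places a visibility timeout can be set; the connection string and queue name are placeholders:

```python
from azure.storage.queue import QueueClient

queue = QueueClient.from_connection_string("<connection-string>", "messages")

# 1. At send time: the message is hidden from ALL consumers for 30 seconds,
#    no matter which worker (or region) later tries to dequeue it.
queue.send_message("payload", visibility_timeout=30)

# 2. At receive time: the dequeuing worker hides the message for 60 seconds
#    while it works. The timeout belongs to the message, not to a worker type.
for msg in queue.receive_messages(visibility_timeout=60):
    # The current holder can extend its lease, but it cannot make the message
    # visible to one region's workers and invisible to another's.
    msg = queue.update_message(msg, visibility_timeout=120)
    queue.delete_message(msg)
```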

Related

Will a Queue Storage message that is placed back in the queue always be placed in the front of the queue?

The docs say for Azure Storage queues that:
Messages in Storage queues are typically first-in-first-out, but sometimes they can be out of order; for example, when a message's visibility timeout duration expires (for example, as a result of a client application crashing during processing). When the visibility timeout expires, the message becomes visible again on the queue for another worker to dequeue it. At that point, the newly visible message might be placed in the queue (to be dequeued again) after a message that was originally enqueued after it.
I only allow my function app to scale to a maximum of 1 instance, so to me it sounds like, if the function crashes, the message is placed back in the queue (at the front). And when the function restarts, it retries the same message, not the next one in the queue. In this way I would be able to guarantee ordering. Does this sound right?
I know I can guarantee ordering with Service Bus using sessions, but I'm trying to avoid it: I have to run this solution with VNETs, and then I'd have to use the premium tier, which is pricey.
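Not Azure Functions code, but a sketch (azure-storage-queue Python SDK) of the mechanics this reasoning relies on: one instance, one in-flight message, delete only after success. Names and timeouts are placeholders.

```python
import time
from azure.storage.queue import QueueClient

queue = QueueClient.from_connection_string("<connection-string>", "ordered-work")

def process(body):
    ...  # the actual work goes here

while True:
    got_one = False
    for msg in queue.receive_messages(max_messages=1, visibility_timeout=300):
        got_one = True
        process(msg.content)       # a crash here means the delete below never runs...
        queue.delete_message(msg)  # ...so the message reappears after 300 seconds
    if not got_one:
        time.sleep(5)              # queue empty: back off before polling again
```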

Azure Service Bus: 1 client -> 1 queue approach - how to manage dead-letter queues?

In my messaging app I decided to allocate an individual queue per client. This way routing and security are quite easy. But then I can't figure out how to deal with dead-letter queues (when a message expires, say). I want to use a serverless approach with Azure Functions to handle messages from the DLQ. It's easy to set up a function to trigger when a message is placed in a queue, but if I have 1000 clients, would that require 1000 functions? From what I can see, you can attach a function to only a single "trigger", meaning a single queue. Am I missing something here? What's the right approach to uniformly deal with DLQ messages?
Any help is appreciated. Thank you.
If your concern is how to deal with thousands of DLQs, one possible solution is to configure your queues so that the ForwardDeadLetteredMessagesTo property of each queue points to another queue.
That way the dead-lettered messages from all the queues end up in a single queue, and you can attach one Function to that queue to process them.
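A sketch of that setup with the azure-servicebus Python SDK's management client; queue names and the connection string are placeholders, and I'm assuming the v7 SDK, where the property is exposed as forward_dead_lettered_messages_to:

```python
from azure.servicebus.management import ServiceBusAdministrationClient

admin = ServiceBusAdministrationClient.from_connection_string("<connection-string>")

# One central queue that collects dead-lettered messages from every client queue.
admin.create_queue("all-dead-letters")

# Each per-client queue forwards its dead-lettered messages to the central queue,
# so a single Function (triggered by "all-dead-letters") handles all of them.
for client_queue in ["client-001", "client-002"]:
    admin.create_queue(
        client_queue,
        forward_dead_lettered_messages_to="all-dead-letters",
    )
```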

Failure handling for Queue Centric work pattern

I am planning to use a queue-centric design as described here for one of my applications. It essentially consists of an Azure queue into which work requests are pushed from the UI. A worker reads from the queue, processes the message, and deletes it from the queue.
The 'work' done by the worker happens within a transaction, so if the worker fails before completing, then upon restart it picks up the same message again (as it has not been deleted from the queue) and retries the operation (up to a maximum number of retries).
To scale I could use two methods:
Multiple workers, each with a separate queue. So if I have five workers W1 to W5, I have five queues Q1 to Q5; each worker knows which queue to read from, and failure handling is similar to the case with one queue and one worker.
One queue and multiple workers. Here failure/retry handling would be more involved and might end up using the 'invisibility' time in the message queue to make sure no two workers pick up the same job. The invisibility time would have to be calculated so that it's long enough for the job to complete, yet not so long that retries happen only after a long delay.
I'd like to know if the first approach is the correct way to go. What are robust ways of handling failures in the second approach above?
You would be better off taking approach 2 - a single queue, but with multiple workers.
This is better because:
The process that delivers messages to the queue only needs to know about a single queue endpoint. This reduces complexity at the sending end.
Scaling the number of workers that pull from the queue is now decoupled from any code or configuration changes - you can scale up and down much more easily (and at runtime).
If you are worried about visibility, you can initially choose a default timespan, and then, if the worker looks like it's taking too long, it can periodically call UpdateMessage() to extend the message's invisibility.
Finally, if your worker times out and fails to complete processing of the message, it'll be picked up again by some other worker to try again. You can also use the DequeueCount property of the message to manage the number of retries.
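A sketch of that competing-consumers loop with the azure-storage-queue Python SDK; you would run several copies of this process to scale out. Names, timeouts, and the retry limit are placeholders.

```python
import time
from azure.storage.queue import QueueClient

MAX_RETRIES = 5
queue = QueueClient.from_connection_string("<connection-string>", "work")

def handle(body):
    ...  # the actual work goes here

while True:
    for msg in queue.receive_messages(max_messages=1, visibility_timeout=60):
        if msg.dequeue_count > MAX_RETRIES:
            queue.delete_message(msg)   # poison message: drop it (or park it elsewhere)
            continue
        # long-running work can keep extending its lease before it expires:
        msg = queue.update_message(msg, visibility_timeout=60)
        handle(msg.content)
        queue.delete_message(msg)       # only deleted after success
    time.sleep(1)                       # brief pause between polls
```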
Multiple workers, each with a separate queue. So if I have five workers W1 to W5, I have five queues Q1 to Q5; each worker knows which queue to read from, and failure handling is similar to the case with one queue and one worker.
With this approach I see following issues:
This approach makes your architecture tightly coupled (thus defeating the whole purpose of using queues). Because each worker role listens to a dedicated queue, the web application responsible for pushing messages into the queues always needs to know how many workers are running. Any time you scale your worker role up or down, you somehow need to tell the web application so that it can start pushing messages to the appropriate queue.
If a worker role instance is taken down for whatever reason, there's a possibility that some messages may never be processed, as the other worker role instances are working on their own dedicated queues.
There is also a possibility of under- or over-utilization of worker role instances, depending on how the web application distributes messages across the queues. For optimal utilization, the web application would have to know about worker role utilization so that it can decide which queue to send a message to. This is certainly not a desirable thing for a web application to do.
I believe #2 is the correct way to go. @Brendan Green has covered your concerns about #2 excellently in his answer.

Status as never finished by one of my WebJobs while processing the message

I have a webjob which processes a message only once, using the condition (DeliveryCount = 1), because I don't want another instance to process the message if the lock time expires on the first webjob. When another webjob tries to process the message after the lock time expires, the condition (DeliveryCount = 1) is not met, so it exits the method, which deletes the message from the queue automatically.
The problem here is that if the message ends up in any state other than successfully finished, I no longer have the message in the queue to process. How do I handle this situation?
I think part of the problem is that you're trying to use the MaxDeliveryCount property to prevent concurrent message processing:
MaxDeliveryCount
The max delivery count setting is not used to prevent multiple consumers from processing a message at the same time, it's used to prevent "poison messages" where any consumer attempts to process a message whose contents prevent successful processing, and therefore the message would otherwise be processed forever.
I recommend you determine exactly what it is you're trying to accomplish. If you want a simple competing consumers scenario where multiple webjobs consume messages from a single queue, then there are standard ways to accomplish that:
good description of competing consumers
competing consumers with Service Bus queues
You can use MaxDeliveryCount in conjunction with competing consumers... if you want to prevent poison messages you can set MaxDeliveryCount to something larger than 1 and still give other consumers a chance to process messages whose locks expire.
Azure Service Bus supports dead-lettering of poison messages that exceed max delivery count, so you're able to examine such messages offline... they aren't simply deleted forever.
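For example, a sketch of examining dead-lettered messages with the azure-servicebus Python SDK (queue name and connection string are placeholders):

```python
from azure.servicebus import ServiceBusClient, ServiceBusSubQueue

with ServiceBusClient.from_connection_string("<connection-string>") as client:
    # every Service Bus queue has a dead-letter sub-queue you can receive from
    with client.get_queue_receiver("work", sub_queue=ServiceBusSubQueue.DEAD_LETTER) as dlq:
        for msg in dlq.receive_messages(max_message_count=10, max_wait_time=5):
            print(msg.dead_letter_reason, str(msg))
            dlq.complete_message(msg)   # remove it once it has been examined
```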
You might also need to add code in your webjobs to renew locks prior to their expiration... otherwise Service Bus can't differentiate between "valid messages that are taking a long time to process" and "poison messages that can't be processed". Without lock renewal, your long-running valid messages will be dead-lettered the same as poison messages, which is almost certainly not what you want.
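With the azure-servicebus Python SDK, the AutoLockRenewer helper can do this renewal in the background; a sketch, with names and durations as placeholders:

```python
from azure.servicebus import AutoLockRenewer, ServiceBusClient

def do_long_running_work(msg):
    ...  # placeholder for the slow-but-valid processing

renewer = AutoLockRenewer()
with ServiceBusClient.from_connection_string("<connection-string>") as client:
    with client.get_queue_receiver("work") as receiver:
        for msg in receiver.receive_messages(max_message_count=1, max_wait_time=5):
            # keep renewing this message's lock for up to 5 minutes
            renewer.register(receiver, msg, max_lock_renewal_duration=300)
            do_long_running_work(msg)
            receiver.complete_message(msg)
renewer.close()
```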
Good luck!

Azure queue message priority

I have a queue in Azure Storage named, for example, 'messages'. Every hour some service pushes a batch of messages to this queue that should update data. But in some cases I also push a message to this queue from another place, and I want that message to be processed immediately; I cannot set a priority for it.
What is the best solution to this problem?
Can I use two different queues ('messages' and 'messages-priority'), or is that a bad approach?
The correct approach is to use multiple queues - a 'normal priority' and a 'high priority' queue. What we have implemented is multiple queue-reader threads in a single worker role: each thread first checks the high priority queue and, if it's empty, looks in the normal queue. This way high priority messages are processed by the first available thread (pretty much immediately), and the same code runs regardless of which queue a message comes from. It also avoids having a reader continuously poll a single queue and back off because messages are rare.
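A sketch of one such reader thread's loop with the azure-storage-queue Python SDK, using the two queue names from the question; the timeouts and back-off are illustrative:

```python
import time
from azure.storage.queue import QueueClient

conn = "<connection-string>"
high = QueueClient.from_connection_string(conn, "messages-priority")
normal = QueueClient.from_connection_string(conn, "messages")

def handle(body):
    ...  # same processing code regardless of which queue the message came from

def try_one(queue):
    """Process at most one message; return True if there was one."""
    for msg in queue.receive_messages(max_messages=1, visibility_timeout=60):
        handle(msg.content)
        queue.delete_message(msg)
        return True
    return False

while True:
    if try_one(high):
        continue                 # drain high priority before touching normal
    if not try_one(normal):
        time.sleep(5)            # both queues empty: back off
```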
