Logic App (Consumption) Handling Lost Locks (not Timeout) - azure

We have instances where our Service Bus message lock is lost before the message can be completed. MS referred me to the documentation:
Important
It is important to note that the lock that PeekLock acquires on the message is volatile and may be lost in the following conditions: a service update, an OS update, or changing properties on the entity (queue, topic, subscription) while holding the lock. When the lock is lost, Azure Service Bus will generate a MessageLockLostException, which will be surfaced in the client application code. In this case, the client's default retry logic should automatically kick in and retry the operation.
We already handle the 5 minute timeout with a parallel loop. Now we need to handle a lost lock due to volatility. What is everyone's best practice here?
A resubmit is not appropriate, as it risks duplication
Dead-lettering cannot be done because the lock is lost; a second instance will already have started for the same message
The message could be completed immediately, but then we lose the ability to dead-letter it, etc.

Could this be a good solution for you?
Change the main logic app to be HTTP triggered
Add another logic app, triggered by the message, that creates a record in storage of some sort with the processing state set to 0 (for example) and then calls the first logic app
Add another logic app that sets the record to the complete state (1, for example)
When the main logic app finishes, it calls the second logic app to update the record
What happens:
message arrives
new logicapp1 picks it up and completes the message
new logicapp1 creates a record and calls your main logic app
main logic app does its processing
main logic app calls new logicapp2
new logicapp2 updates the record as completed
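The flow above can be sketched in a few lines. This is a minimal simulation, not Logic App code: a plain dict stands in for the storage table, and names like `StateStore` and `handle_message` are illustrative, assuming the state values 0/1 described in the answer.

```python
# Sketch of the state-record pattern above; a dict stands in for the storage table.

PROCESSING, COMPLETED = 0, 1

class StateStore:
    """Minimal stand-in for a storage table keyed by message id."""
    def __init__(self):
        self._records = {}

    def create(self, message_id):
        # logicapp1: record state 0 once the broker message is completed
        self._records[message_id] = PROCESSING

    def complete(self, message_id):
        # logicapp2: mark the record done after the main logic app finishes
        self._records[message_id] = COMPLETED

    def state(self, message_id):
        return self._records.get(message_id)

def handle_message(store, message_id, process):
    # logicapp1 completes the broker message first, so a lost lock can no
    # longer interrupt processing; the record tracks progress instead.
    store.create(message_id)
    process(message_id)          # the HTTP-triggered main logic app
    store.complete(message_id)   # logicapp2 updates the record
```

Because the broker message is completed up front, a lost lock can no longer cause a duplicate run; any crash leaves a record stuck at state 0, which you can sweep for and reprocess or alert on.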

Related

What happens to messages being processed by running functions when we disable the function?

We are working with Azure Functions, which are triggered on every message in the Service Bus queue. We are trying to solve a problem whereby we need to dynamically disable a function on the function app so that it does not process messages any further, without losing any messages in the process.
We can disable the functions in multiple ways, referring to link, but the problem remains the same: we are unable to figure out what happens to the function invocations already spawned when we disable the function.
Since the function is Service Bus triggered, there is always a possibility that the function is processing a message when we disable it. Does the message get processed? Is some sort of cancellation raised? Does the invocation just die with an exception?
It would be great if someone could direct me to some documentation or something. Thanks.
An Azure Service Bus triggered function will already have a lock on the message that's being processed. If the function is terminated and the message was not completed or otherwise disposed of, the lock will expire and the message will reappear on the queue. That's because the Functions runtime receives messages in PeekLock mode.
One factor to consider is the queue's MaxDeliveryCount. If a function is terminated during the last processing attempt, the message will be dead-lettered, as all processing attempts have been exhausted. That's standard Azure Service Bus behaviour.
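The lock-expiry and MaxDeliveryCount behaviour described above can be illustrated with a small pure-Python simulation (no Azure SDK involved; the max delivery count of 5 is just an assumed queue setting):

```python
# Simulation of the behaviour above: a message whose processing attempt dies
# reappears on the queue after the lock expires, and once the queue's
# MaxDeliveryCount is exhausted the broker dead-letters it.

MAX_DELIVERY_COUNT = 5  # assumed queue setting

def simulate(attempt_outcomes):
    """attempt_outcomes: list of booleans, True = the attempt completed the message."""
    delivery_count = 0
    for succeeded in attempt_outcomes:
        delivery_count += 1
        if succeeded:
            return "completed"
        if delivery_count >= MAX_DELIVERY_COUNT:
            return "dead-lettered"   # all attempts exhausted
        # otherwise: lock expires, message reappears for the next attempt
    return "pending"
```

The key point for the question: disabling the function mid-attempt simply looks like one failed attempt; the broker, not the function, decides whether the message is retried or dead-lettered.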

Service Bus Queue doesn't remove completed messages

I'm sending an HTTP POST message to a Service Bus queue, and when the queue receives it, a Logic App starts. But when the Logic App finishes, sometimes the message is not removed from the queue, and this restarts the Logic App.
What can I do to remove these messages?
This may help. When you make a Logic App resource a single instance, it is recommended to use a peek-lock trigger; otherwise the message doesn't leave the queue until the next trigger runs.
Ref:
https://social.msdn.microsoft.com/Forums/en-US/e2eb4505-cb7e-4bad-aeaf-1da2e10739d4/whenamessageisreceivedinaqueueautocomplete-trigger-is-not-deleting-the-message-off-the?forum=azurelogicapps
Make sure there were no errors in the Logic App. An error would roll back the transaction (if message consumption is part of a transaction) and leave the message on the queue for the next attempt.

Azure Web Jobs, Azure Service Bus Queue Trigger prevent message from getting deleted

I am looking into setting up a WebJob trigger to read messages from a Service Bus queue. What would be the best practice to implement retry logic in case of any errors handling the downstream systems?
Would we be able to throw an exception so that the message is not deleted from the queue and is retried after a certain time period?
Appreciate your feedback.
You don't need to define retry logic explicitly. When a message is dequeued from Service Bus, it becomes invisible on the queue for a certain time period (the lock time, 30 secs by default; you can configure it). You try to process the message; if you are successful, you simply call BrokeredMessage.CompleteAsync, which means "I am done" and marks the message as completed. If you have a problem downstream, you can abandon the message by calling BrokeredMessage.AbandonAsync. This will unlock the message and it will appear back in the queue, to be picked up and processed by a worker again, until either processing succeeds or the max retry limit is reached, after which the message is sent to the dead-letter queue.
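As a rough sketch of that control flow (pure Python, with `process` and the loop standing in for the real worker; in C# the two branches would be `CompleteAsync` and `AbandonAsync` on the received message):

```python
# Sketch of the complete/abandon cycle described above. Raising an exception
# plays the role of AbandonAsync: the message goes back on the queue and is
# delivered again, up to the queue's max delivery count.

def run_worker(message, process, max_delivery_count=10):
    """Retry until the handler succeeds or the delivery count is exhausted."""
    for delivery in range(1, max_delivery_count + 1):
        try:
            process(message)
            return "completed"        # CompleteAsync: done, remove from queue
        except Exception:
            continue                  # AbandonAsync: unlock, redeliver later
    return "dead-lettered"            # broker moves it after max deliveries
```

So the answer to the question is yes: throwing (or abandoning) keeps the message on the queue, and the broker's delivery counting provides the retry limit for free.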

Azure EventGrid Webhook timeout

I came to know from the documentation that the timeout for a webhook is 60 secs. If that's the case, are developers expected to perform asynchronous operations? I mean, what if the work I want to do as part of the webhook takes more than 60 secs? And if we make that operation asynchronous and the work fails, how do we recover from that situation, given that we have already responded to Event Grid with 200 OK? In that case, would we lose the event?
In a scenario like yours, where the event handler takes more than 60 seconds to process, the following can be implemented, based on the retrying and dead-lettering technique:
Use the primary event subscription with a retry policy and dead-lettering. This subscriber (a function) with a binding to a storage table will handle the state of the long-running (max 24 hrs) event processing, and will also forward the first event message to a storage queue to trigger the long-running process. The response from this primary subscriber will depend on the state of the StorageQueueTrigger function.
Every new retry event message will check the state of the long-running process and, based on that, a response code (for instance 200 OK or 503 Service Unavailable) is sent back to the Event Grid.
In the above scenario, the retry mechanism represents a "watchdog timer" for watching the long-running event message processing. The second function, the QueueTrigger function, decouples the Event Grid from the long-running process.
In summary, your scenario will require the following:
EventSubscriber with retry policy and dead-lettering for Webhook (EventGridTrigger or HttpTrigger function)
EventGridTrigger or HttpTrigger function
Storage Table
QueueTrigger Function
If anything unusual happens during the watchdog period, the event is dead-lettered to your container storage with a deadLetterReason.
Note that if your long-running process takes more than 5/10 minutes, the StorageQueue-triggered function needs to run on an App Service plan, or you need your own custom worker processor.
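The decision the primary subscriber makes on every retry delivery can be sketched as a small function. This is only an illustration, assuming a table keyed by event id (the dict here stubs the storage table binding; names are illustrative):

```python
# Sketch of the watchdog decision above: on each retry delivery, look up the
# long-running job's state and answer Event Grid accordingly.

OK, SERVICE_UNAVAILABLE = 200, 503

def respond(state_table, event_id):
    state = state_table.get(event_id)
    if state == "completed":
        return OK                    # success: stop Event Grid's retries
    # "running", or no record yet (first delivery, job just enqueued):
    return SERVICE_UNAVAILABLE       # keep the watchdog retries coming
```

Returning 503 is what turns Event Grid's retry policy into the watchdog: the event is redelivered until the state table shows completion, and if it never does, the retry policy expires and the event is dead-lettered instead of being lost.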
Update:
The following screen snippet shows the above solution for a "long running subscriber" with a watchdog timer:
Alternatively, a StorageQueue event handler can be used directly to yield the long-running process from the Event Grid, but in this case the function has more responsibilities, such as retrying, notification, dead-lettering, etc.; see the following picture:

Requeue or delete messages in Azure Storage Queues via WebJobs

I was hoping if someone can clarify a few things regarding Azure Storage Queues and their interaction with WebJobs:
To perform recurring background tasks (i.e. add a message to the queue once, then repeat at set intervals), is there a way to update the same message delivered in the QueueTrigger function so that its lease (visibility) can be extended, as a way to requeue it and avoid expiry?
With the above-mentioned pattern for recurring background jobs, I'm also trying to figure out a way to delete/expire a job 'on demand'. Since this doesn't seem possible outside the context of WebJobs, I was thinking of storing the messageId and popReceipt for the message(s) to be deleted in Table storage as a persistent cache, and then, upon delivery of the message in the QueueTrigger function, doing a Table lookup and calling DeleteMessage so that the message is not repeated any more.
Any suggestions or tips are appreciated. Cheers :)
Azure Storage Queues are used to store messages that may be consumed by your Azure Webjob, WorkerRole, etc. The Azure Webjobs SDK provides an easy way to interact with Azure Storage (that includes Queues, Table Storage, Blobs, and Service Bus). That being said, you can also have an Azure Webjob that does not use the Webjobs SDK and does not interact with Azure Storage. In fact, I do run a Webjob that interacts with a SQL Azure database.
I'll briefly explain how the Webjobs SDK interacts with Azure Queues. Once a message arrives in a queue (or is made 'visible'; more on this later), the function in the Webjob is triggered (assuming you're running in continuous mode). If that function returns with no error, the message is deleted. If something goes wrong, the message goes back to the queue to be processed again. You can handle the failed message accordingly. Here is an example on how to do this.
The SDK will call a function up to 5 times to process a queue message. If the fifth try fails, the message is moved to a poison queue. The maximum number of retries is configurable.
Regarding visibility: when you add a message to the queue, there is a visibility timeout property, which is zero by default. Therefore, if you want to process a message in the future (up to 7 days in the future), you can do so by setting this property to the desired value.
Optional. If specified, the request must be made using an x-ms-version of 2011-08-18 or newer. If not specified, the default value is 0. Specifies the new visibility timeout value, in seconds, relative to server time. The new value must be larger than or equal to 0, and cannot be larger than 7 days. The visibility timeout of a message cannot be set to a value later than the expiry time. visibilitytimeout should be set to a value smaller than the time-to-live value.
Now the suggestions for your app.
I would just add a message to the queue for every task that you want to accomplish. The message will obviously have the pertinent information for processing. If you need to schedule several tasks, you can run a Scheduled Webjob (on a schedule of your choice) that adds messages to the queue. Then your continuous Webjob will pick up that message and process it.
Add a GUID to each message that goes to the queue, and store that GUID in some other part of your application (a database). When you dequeue a message for processing, the first thing you do is check against your database whether the message still needs to be processed. If you need to cancel the execution of a message, instead of deleting it from the queue, just update that GUID's record in your database.
There's more info here.
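The GUID suggestion can be sketched as follows. This is a pure-Python illustration: a list stands in for the queue, a dict for the database, and all function names are made up for the example.

```python
# Sketch of the cancel-by-GUID pattern above: each message carries a GUID,
# the database row for that GUID says whether it should still run, and
# "cancelling" is just flipping the row, never touching the queue.

import uuid

def enqueue(queue, db, payload):
    task_id = str(uuid.uuid4())
    db[task_id] = "pending"                 # register before enqueueing
    queue.append({"id": task_id, "payload": payload})
    return task_id

def cancel(db, task_id):
    db[task_id] = "cancelled"               # the message stays on the queue

def process_next(queue, db, handler):
    msg = queue.pop(0)
    if db.get(msg["id"]) == "cancelled":    # first thing: check the database
        return "skipped"
    handler(msg["payload"])
    return "processed"
```

The design choice here is that the queue stays append-only from the producer's point of view; cancellation is a database write, which avoids needing the popReceipt at all.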
Hope this helps,
As for the first part of the question, you can use the Update Message operation to extend the visibility timeout of a message.
The Update Message operation can be used to continually extend the
invisibility of a queue message. This functionality can be useful if
you want a worker role to “lease” a queue message. For example, if a
worker role calls Get Messages and recognizes that it needs more time
to process a message, it can continually extend the message’s
invisibility until it is processed. If the worker role were to fail
during processing, eventually the message would become visible again
and another worker role could process it.
You can check the REST API documentation here: https://msdn.microsoft.com/en-us/library/azure/hh452234.aspx
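The "lease" loop from the quote looks roughly like this. It is a pure simulation: `renew` stands in for the Update Message REST call (which takes the popReceipt and a new visibility timeout), and the chunked-work shape is an assumption for illustration.

```python
# Sketch of the lease pattern in the quote: while work remains, keep calling
# Update Message to push the visibility timeout forward. If the worker dies,
# the renewals stop and the message becomes visible for another worker.

def process_with_lease(work_chunks, renew, lease_seconds=30):
    """work_chunks: iterable of callables doing slices of the work.
    renew: callback standing in for the Update Message REST call."""
    for chunk in work_chunks:
        renew(lease_seconds)   # extend invisibility before the next slice
        chunk()
    # after the loop, the caller deletes the message for good
```

Renewing before each slice (rather than once up front) is what gives the crash-safety the quote describes: the message is never invisible for longer than one slice plus the lease.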
For the second part of your question, there are really multiple ways, and your method of storing the id/popReceipt as a lookup is a possible option. You could also have a WebJob dedicated to receiving messages on a different queue (e.g. plz-delete-msg): you send a message containing the messageId, and this WebJob can use the Get Message operation and then delete the message. (You can make the job generic by passing the queue name!)
https://msdn.microsoft.com/en-us/library/azure/dd179474.aspx
https://msdn.microsoft.com/en-us/library/azure/dd179347.aspx
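A rough shape of that dedicated delete job, with in-memory stand-ins for both queues (the queue name "plz-delete-msg" comes from the answer; everything else here is illustrative):

```python
# Sketch of the delete job above: a second queue carries requests naming a
# target queue and a message id, and the job removes the matching message.
# Dicts and lists stub the real Get Message / Delete Message operations.

def delete_job(delete_requests, queues):
    """Drain delete requests; each names a target queue and a message id."""
    for req in delete_requests:
        target = queues[req["queue"]]       # generic: queue name travels in the request
        queues[req["queue"]] = [m for m in target
                                if m["id"] != req["message_id"]]
```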
