Azure Function and storage queue, what to do if function fails - azure

I'm working out a scenario where I post a message to an Azure Storage Queue. For testing purposes I've developed a console app where I get the message, update it with a try count, and delete the message when the logic is done.
Now I'm trying to port my code to an Azure Function. One thing that seems to be very different is, when the Azure Function is called, the message is deleted from the queue.
I find it hard to find any documentation on this specific subject and I feel I'm missing something with regard to the concept of combining these two.
My questions:
1. Am I right that when you trigger a function on a new queue item, the function takes the message and deletes it from the queue, even if the function fails?
2. If 1 is correct, how do you make sure that the message is retried and posted to a dead queue for later processing?

The runtime only deletes the queue message when your Function successfully processes it (i.e. no error has occurred). When the message is dequeued and passed to your function, it becomes invisible for a period of time (10 minutes). While your function is running, this invisibility is maintained. If your function fails, the message is not deleted; it remains in the queue in an invisible state. After the visibility timeout expires, the message becomes visible in the queue again for reprocessing.
The details of how core WebJobs SDK queue processing works can be found here. On that page, see the section "How to handle poison messages" which addresses your question. Basically you'll get all the right behaviors for free - retry handling, poison message handling, etc. :)
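To make the behavior described above concrete, here is a minimal in-memory sketch. It is not the Azure SDK: `SimulatedQueue`, `run_trigger`, and the explicit `now` clock are hypothetical names invented for illustration, and the 600-second timeout simply mirrors the 10 minutes mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Message:
    body: str
    visible_at: float = 0.0  # time from which the message can be dequeued


class SimulatedQueue:
    """Hypothetical in-memory stand-in for a storage queue (not the Azure SDK)."""

    def __init__(self, visibility_timeout: float = 600.0):
        self.visibility_timeout = visibility_timeout
        self.messages: List[Message] = []

    def put(self, body: str) -> None:
        self.messages.append(Message(body))

    def get(self, now: float) -> Optional[Message]:
        # Hand out the first visible message and hide it for the timeout period.
        for msg in self.messages:
            if msg.visible_at <= now:
                msg.visible_at = now + self.visibility_timeout
                return msg
        return None

    def delete(self, msg: Message) -> None:
        self.messages.remove(msg)


def run_trigger(queue: SimulatedQueue, handler: Callable[[str], None], now: float) -> bool:
    """Mimic the runtime: delete the message only if the handler succeeds."""
    msg = queue.get(now)
    if msg is None:
        return False
    try:
        handler(msg.body)
    except Exception:
        return False  # not deleted; it stays invisible until the timeout expires
    queue.delete(msg)
    return True


def failing_handler(body: str) -> None:
    raise RuntimeError("processing failed")


queue = SimulatedQueue(visibility_timeout=600.0)
queue.put("order-1")

first_attempt = run_trigger(queue, failing_handler, now=0.0)      # handler fails
while_hidden = queue.get(now=300.0)                               # still invisible
second_attempt = run_trigger(queue, lambda body: None, now=600.0)  # visible again
```

The failed attempt leaves the message in the queue, a read during the invisibility window finds nothing, and the retry after the timeout succeeds and removes it.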

Related

What happens to the messages being processed on functions running when we disable the function?

We are working with Azure functions, which are triggered on every message in the service bus queue. We are trying to solve a problem whereby we need to disable a function on the function app processing messages, dynamically, so that it does not process messages any further and we do not lose any message in the process as well.
We can disable the function in multiple ways, referring to link, but the problem remains the same: we are unable to figure out what happens to function executions that are already running when we disable it.
Since the function is Service Bus triggered, there is always a possibility that it is processing a message at the moment we disable it. Does the message get processed, is some sort of cancellation raised, or does the execution just die with an exception?
It would be great if someone could direct me to some documentation or something. Thanks.
An Azure Service Bus triggered function already holds a lock on the message that is being processed. If the function is terminated before the message is completed or otherwise settled, the lock will expire and the message will reappear on the queue. That's because the Functions runtime receives messages in PeekLock mode.
One factor to consider is the queue's MaxDeliveryCount. If a function is terminated upon the last processing attempt, the message will be dead-lettered as all processing attempts have been exhausted. That's a standard Azure Service Bus behaviour.
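The interaction between PeekLock and MaxDeliveryCount can be sketched with a small simulation. This is an illustrative model only, not the Service Bus client library: `SimulatedServiceBusQueue`, `receive_peek_lock`, and `lock_lost` are hypothetical names, and the model assumes a message is dead-lettered once its delivery count reaches the configured maximum.

```python
from collections import deque


class SimulatedServiceBusQueue:
    """Hypothetical in-memory model of PeekLock delivery (not the real client)."""

    def __init__(self, max_delivery_count: int = 10):
        self.max_delivery_count = max_delivery_count
        self.active = deque()
        self.dead_letter = []

    def send(self, body: str) -> None:
        self.active.append({"body": body, "delivery_count": 0})

    def receive_peek_lock(self):
        # Deliver the next message under a lock and count the delivery.
        if not self.active:
            return None
        msg = self.active.popleft()
        msg["delivery_count"] += 1
        return msg

    def lock_lost(self, msg) -> None:
        # The function died or abandoned the message: redeliver or dead-letter.
        if msg["delivery_count"] >= self.max_delivery_count:
            self.dead_letter.append(msg)
        else:
            self.active.append(msg)


bus = SimulatedServiceBusQueue(max_delivery_count=3)
bus.send("payment-42")

# Simulate the function being terminated on every processing attempt.
for _ in range(3):
    locked = bus.receive_peek_lock()
    bus.lock_lost(locked)

remaining = len(bus.active)
dead_lettered = [m["body"] for m in bus.dead_letter]
```

After the third interrupted attempt nothing remains on the active queue and the message sits in the dead-letter list, which is why disabling a function during the last attempt does not lose the message but does dead-letter it.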

Azure Service Bus - Add a message to the queue in a deferred state

I'm wondering if it is possible to send a brokered message to a queue/topic where the message is already in a deferred state?
I'm asking this because I currently have a process that does the following ...
1. The process starts and a brokered message is sent to a queue (this triggers a function that records the message body as an entity in table storage with a 'Processing' status).
2. Additional work is done in the process.
3. If we get to the end of the process without any issues, another brokered message is sent to the queue with a completion message (this triggers the same function that updates the entity in table storage with a 'Complete' status).
While this method is mostly working, it feels clunky and fragile. I would really like to be able to send a message to the queue and then have the final step make the message visible on the queue so it can be consumed by the function (Durable Function).
I thought about setting the ScheduledEnqueueTimeUtc, but I can't guarantee when the process will finish (I'm thinking worst case scenario here) so I'm not sure how long to set it.
I also looked at the Defer option for a BrokeredMessage but it seems this can only be set from the receiver and not be in a deferred state initially.
Is what I'm trying to do possible with Service Bus brokered messages? Could I set the Scheduled Enqueue time to some ridiculously long time (e.g. 2 hours) and if it reaches that time it is automatically expired and moved to the Dead Letter queue? Should I send the initial message to the Dead Letter queue and then once the process is complete, retrieve it and resubmit it?
Has anyone had any experience with implementing a process like this ... send a start message and only process the message once a completion notification has been received? I need this to be as robust as possible as I'm dealing with financial transactions in this process.
Hopefully my explanation makes sense.
I'm wondering if it is possible to send a brokered message to a queue/topic where the message is already in a deferred state?
That's not possible. You can only delay a brand new message, not defer it. Deferring requires the message to be received first so that it has a SequenceNumber.
Using ScheduledEnqueueTimeUtc has its challenges: the message is scheduled for the future, but you cannot cancel it once processing is over. Instead, you could use QueueClient.ScheduleMessageAsync(), which returns the SequenceNumber immediately. This way you can schedule the message far into the future, but also cancel it if processing finishes earlier.
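The schedule-then-cancel idea can be illustrated with a toy scheduler. This is only a sketch of the pattern, not the Service Bus API: `SimulatedScheduler` and its methods are hypothetical, and time is passed in explicitly as plain seconds.

```python
import itertools


class SimulatedScheduler:
    """Hypothetical model of scheduling with cancellable sequence numbers."""

    def __init__(self):
        self._next_seq = itertools.count(1)
        self.scheduled = {}  # sequence number -> (enqueue_time, body)

    def schedule_message(self, body: str, enqueue_time: float) -> int:
        # Return the sequence number immediately so the caller can cancel later.
        seq = next(self._next_seq)
        self.scheduled[seq] = (enqueue_time, body)
        return seq

    def cancel_scheduled_message(self, seq: int) -> None:
        self.scheduled.pop(seq, None)

    def due_messages(self, now: float):
        # Release every message whose scheduled time has arrived.
        due = sorted(s for s, (t, _) in self.scheduled.items() if t <= now)
        return [self.scheduled.pop(s)[1] for s in due]


scheduler = SimulatedScheduler()

# Worst case: the process never finishes, so the message fires after 2 hours.
stuck_seq = scheduler.schedule_message("tx-1 timed out", enqueue_time=7200.0)
# Normal case: the process finishes early and cancels its scheduled message.
done_seq = scheduler.schedule_message("tx-2 timed out", enqueue_time=7200.0)
scheduler.cancel_scheduled_message(done_seq)

early = scheduler.due_messages(now=100.0)
late = scheduler.due_messages(now=7200.0)
```

Only the message for the process that never finished is delivered; the cancelled one never appears.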
I ended up solving this issue by keeping the process of sending two messages, but refactoring my durable function to record the messages in Table Storage, check that both messages have been received and if they have, add a new message to Azure Queue Storage. A second function listens to the queue which starts its process.
After much testing, this appears to be quite a robust solution. It then doesn't matter in what order the two messages arrive, or how long they take; as long as both of them have arrived, the second function will kick off.
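The order-independent check described in this answer might look roughly like the sketch below. Everything here is an assumption made for illustration: the `received` dictionary stands in for Table Storage, `work_queue` stands in for the storage queue, and the message kinds "start" and "complete" are invented labels.

```python
REQUIRED_KINDS = {"start", "complete"}

received = {}       # stand-in for Table Storage: process id -> kinds seen so far
dispatched = set()  # process ids already handed to the second function
work_queue = []     # stand-in for the storage queue the second function listens on


def record_message(process_id: str, kind: str) -> bool:
    """Record one message; enqueue the process once both kinds have arrived."""
    kinds = received.setdefault(process_id, set())
    kinds.add(kind)
    if kinds == REQUIRED_KINDS and process_id not in dispatched:
        dispatched.add(process_id)
        work_queue.append(process_id)
        return True
    return False


# The completion message happens to arrive before the start message.
after_complete = record_message("tx-1", "complete")
after_start = record_message("tx-1", "start")
# A redelivered duplicate must not trigger the second function again.
after_duplicate = record_message("tx-1", "start")
```

The `dispatched` set is an addition of this sketch; it guards against duplicate deliveries, which at-least-once queues can produce.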

Azure Function Event Hub Trigger reliability

I'm a bit confused regarding the EventHubTrigger for Azure functions.
I've got an IoT Hub, and am using its eventhub-compatible endpoint to trigger an Azure function that is going to process and store the received data.
However, if my function fails (= throws an exception), that message (or messages) being processed during that function call will get lost. I actually would expect the Azure function runtime to process the messages at a later time again. Specifically, I would expect this behavior because the EventHubTrigger is keeping checkpoints in the Function Apps storage account in order to keep track of where in the event stream it has to continue.
The documentation of the EventHubTrigger even states that
If all function executions succeed without errors, checkpoints are added to the associated storage account
But still, even when I deliberately throw exceptions in my function, the checkpoints will get updated and the messages will not get received again.
Is my understanding of the EventHubTriggers documentation wrong, or is the EventHubTriggers implementation (or its documentation) wrong?
This piece of documentation seems confusing indeed. I guess they mean the errors of Function App host itself, not of your code. An exception inside function execution doesn't stop the processing and checkpointing progress.
The fact is that Event Hubs are not designed for individual message retries. The processor works in batches, and it can either mark the whole batch as processed (i.e. create a checkpoint after it), or retry the whole batch (e.g. if the process crashed).
See this forum question and answer.
If you still need to re-process failed events from Event Hub (and errors don't happen too often), you could implement such a mechanism yourself. E.g.
1. Add an output Queue binding to your Azure Function.
2. Add try-catch around the processing code.
3. If an exception is thrown, add the problematic event to the Queue.
4. Have another Function with a Queue trigger to process those events.
Note that the downside of this is that you will lose the ordering guarantee provided by Event Hubs (since the Queue message will be processed later than its neighbors).
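The steps above can be sketched as follows. This is a plain-Python illustration, not Azure Functions binding code: `process_batch`, `store_reading`, and the list used as `retry_queue` are hypothetical stand-ins for the function body, the processing logic, and the output Queue binding.

```python
def process_batch(events, handler, retry_queue):
    """Process every event; divert failures to a retry queue instead of raising."""
    processed = 0
    for event in events:
        try:
            handler(event)
            processed += 1
        except Exception as exc:
            # Keep the batch moving; a queue-triggered function retries this later.
            retry_queue.append({"event": event, "error": str(exc)})
    return processed


def store_reading(value):
    # Hypothetical processing step that rejects negative readings.
    if value < 0:
        raise ValueError("negative reading")


retry_queue = []
processed_count = process_batch([3, -1, 4], store_reading, retry_queue)
```

Because the exception is caught inside the loop, the checkpoint can advance safely while the failed event waits in the queue, at the cost of the ordering guarantee noted above.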
A quick fix, since a retry policy will not help if the downstream system is down for a few hours: you can call Process.GetCurrentProcess().Kill(); in the exception handler. This stops the checkpoint from moving forward. I have tested this with a Consumption-plan function app. You will not see anything in the logs, so I added an email notification that something went wrong; to avoid data loss, I kill the function instance.
Hope this helps.
I plan to write a blog post about this and about the other part of the workflow, where a Logic App stops the function in case of continuous failure of the downstream system.

Can I configure azure function to peek and read message in service bus queue but not delete it?

Per Azure Functions Service Bus bindings:
Trigger behavior
...
PeekLock behavior - The Functions runtime receives a message in PeekLock mode and calls Complete on the message if the function finishes successfully, or calls Abandon if the function fails. If the function runs longer than the PeekLock timeout, the lock is automatically renewed.
I am assuming that when azure function calls Complete on the message, it will be removed from the queue.
What should I do in my function if I want my function to spy on the message but never delete it?
Unsuccessful processing of a message resulting in function throwing an exception or an explicit abandon operation on the message will not complete the message.
That said, I see a problem with this approach. You're not truly "spying" on the messages but actively processing them, which means a given message will be redelivered and eventually end up in the dead-letter queue. If you want to spy, you should peek at the messages, but the Azure Service Bus trigger doesn't do that.
If you need a wiretap implementation, it's probably not a bad idea to use a topic with two subscriptions: one to consume the messages and another to receive a copy of every message for your wiretap function (which perhaps does some sort of analysis or logging). Without understanding the full scope of what you're doing, it's hard to provide a more specific answer.
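As a rough picture of the topic-with-two-subscriptions idea, the sketch below models a topic that copies each published message into every subscription. `SimulatedTopic` and the subscription names "processor" and "wiretap" are hypothetical; a real topic would also need subscription rules and client code.

```python
import copy


class SimulatedTopic:
    """Hypothetical in-memory topic: every subscription gets its own copy."""

    def __init__(self):
        self.subscriptions = {}

    def add_subscription(self, name: str) -> None:
        self.subscriptions[name] = []

    def publish(self, message: dict) -> None:
        for inbox in self.subscriptions.values():
            inbox.append(copy.deepcopy(message))


topic = SimulatedTopic()
topic.add_subscription("processor")  # the real consumer
topic.add_subscription("wiretap")    # the observing function
topic.publish({"id": 1, "amount": 25})

# The consumer completes (removes) its copy; the wiretap copy is unaffected.
consumed = topic.subscriptions["processor"].pop(0)
observed = list(topic.subscriptions["wiretap"])
```

Completing the message on one subscription has no effect on the other, so the wiretap function can complete its own copies without interfering with normal processing.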

Requeue or delete messages in Azure Storage Queues via WebJobs

I was hoping someone could clarify a few things regarding Azure Storage Queues and their interaction with WebJobs:
1. To perform recurring background tasks (i.e. add to queue once, then repeat at set intervals), is there a way to update the same message delivered in the QueueTrigger function so that its lease (visibility) can be extended as a way to requeue and avoid expiry?
2. With the above-mentioned pattern for recurring background jobs, I'm also trying to figure out a way to delete/expire a job 'on demand'. Since this doesn't seem possible outside the context of WebJobs, I was thinking of maybe storing the messageId and popReceipt for the message(s) to be deleted in Table storage as a persistent cache, and then, upon delivery of the message in the QueueTrigger function, doing a Table lookup to perform a DeleteMessage, so that the message is not repeated any more.
Any suggestions or tips are appreciated. Cheers :)
Azure Storage Queues are used to store messages that may be consumed by your Azure Webjob, WorkerRole, etc. The Azure Webjobs SDK provides an easy way to interact with Azure Storage (that includes Queues, Table Storage, Blobs, and Service Bus). That being said, you can also have an Azure Webjob that does not use the Webjobs SDK and does not interact with Azure Storage. In fact, I do run a Webjob that interacts with a SQL Azure database.
I'll briefly explain how the Webjobs SDK interacts with Azure Queues. Once a message arrives in a queue (or is made 'visible', more on this later), the function in the Webjob is triggered (assuming you're running in continuous mode). If that function returns with no error, the message is deleted. If something goes wrong, the message goes back to the queue to be processed again. You can handle the failed message accordingly. Here is an example on how to do this.
The SDK will call a function up to 5 times to process a queue message. If the fifth try fails, the message is moved to a poison queue. The maximum number of retries is configurable.
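The retry-then-poison behavior can be approximated with the sketch below. It collapses the repeated dequeues into a simple loop, which is a simplification of this sketch rather than how the SDK is implemented; `process_with_poison_handling` and the list used as the poison queue are hypothetical.

```python
def process_with_poison_handling(body, handler, poison_queue, max_dequeue_count=5):
    """Try the handler up to max_dequeue_count times, then poison the message."""
    for attempt in range(1, max_dequeue_count + 1):
        try:
            handler(body)
            return attempt  # number of the attempt that succeeded
        except Exception:
            continue  # the message would become visible again and be retried
    poison_queue.append(body)
    return None


calls = {"count": 0}


def flaky_handler(body):
    # Fails on the first two attempts, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")


def broken_handler(body):
    raise RuntimeError("permanent failure")


poison_queue = []
succeeded_on = process_with_poison_handling("job-a", flaky_handler, poison_queue)
gave_up = process_with_poison_handling("job-b", broken_handler, poison_queue)
```

A transient failure is absorbed by the retries, while a message that fails every attempt ends up in the poison queue for later inspection.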
Regarding visibility, when you add a message to the queue, there is a visibility timeout property. By default it is zero. Therefore, if you want to process a message in the future, you can do so (up to 7 days in the future) by setting this property to the desired value.
Optional. If specified, the request must be made using an x-ms-version of 2011-08-18 or newer. If not specified, the default value is 0. Specifies the new visibility timeout value, in seconds, relative to server time. The new value must be larger than or equal to 0, and cannot be larger than 7 days. The visibility timeout of a message cannot be set to a value later than the expiry time. visibilitytimeout should be set to a value smaller than the time-to-live value.
Now the suggestions for your app.
I would just add a message to the queue for every task that you want to accomplish. The message will obviously have the pertinent information for processing. If you need to schedule several tasks, you can run a Scheduled Webjob (on a schedule of your choice) that adds messages to the queue. Then your continuous Webjob will pick up that message and process it.
Add a GUID to each message that goes to the queue. Store that GUID in some other domain of your application (a database). So when you dequeue the message for processing, the first thing you do is check against your database if the message needs to be processed. If you need to cancel the execution of a message, instead of deleting it from the queue, just update the GUID in your database.
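The GUID-based cancellation suggested above could be sketched like this. The `job_status` dictionary stands in for the database, and the status values "active" and "cancelled" as well as the helper names are assumptions made for illustration.

```python
import uuid

job_status = {}  # stand-in for a database table: job id -> status
queue = []       # stand-in for the storage queue


def enqueue_job(payload: str) -> str:
    job_id = str(uuid.uuid4())
    job_status[job_id] = "active"
    queue.append({"id": job_id, "payload": payload})
    return job_id


def cancel_job(job_id: str) -> None:
    # The message stays in the queue; only the database record changes.
    job_status[job_id] = "cancelled"


def handle_message(message: dict, work) -> bool:
    """Check the database first and skip messages whose job was cancelled."""
    if job_status.get(message["id"]) != "active":
        return False
    work(message["payload"])
    return True


completed = []
report_id = enqueue_job("send-report")
cleanup_id = enqueue_job("cleanup")
cancel_job(cleanup_id)

results = [handle_message(msg, completed.append) for msg in queue]
```

The cancelled job's message is still dequeued, but the handler returns without doing any work, so it is removed from the queue like any successfully processed message.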
There's more info here.
Hope this helps,
As for the first part of the question, you can use the Update Message operation to extend the visibility timeout of a message.
The Update Message operation can be used to continually extend the
invisibility of a queue message. This functionality can be useful if
you want a worker role to “lease” a queue message. For example, if a
worker role calls Get Messages and recognizes that it needs more time
to process a message, it can continually extend the message’s
invisibility until it is processed. If the worker role were to fail
during processing, eventually the message would become visible again
and another worker role could process it.
You can check the REST API documentation here: https://msdn.microsoft.com/en-us/library/azure/hh452234.aspx
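The leasing idea from the quoted documentation can be pictured with a small model in which a worker keeps pushing the visibility deadline forward while it works. `LeasedMessage` and `process_in_steps` are hypothetical names; the real operation is the Update Message REST call referenced above.

```python
class LeasedMessage:
    """Hypothetical model of a dequeued message with a visibility deadline."""

    def __init__(self, body: str, now: float, visibility_timeout: float):
        self.body = body
        self.visible_at = now + visibility_timeout

    def update(self, now: float, visibility_timeout: float) -> None:
        # Mimic Update Message: the new timeout is relative to the current time.
        self.visible_at = now + visibility_timeout

    def is_visible(self, now: float) -> bool:
        return now >= self.visible_at


def process_in_steps(msg, steps: int, step_duration: float, lease: float) -> float:
    """Do long-running work in steps, renewing the lease after each one."""
    now = 0.0
    for _ in range(steps):
        now += step_duration
        if msg.is_visible(now):
            raise RuntimeError("lease lost: another worker may pick the message up")
        msg.update(now, lease)
    return now


message = LeasedMessage("nightly-report", now=0.0, visibility_timeout=30.0)
# Five 20-second steps take 100 seconds, far longer than the 30-second timeout.
finished_at = process_in_steps(message, steps=5, step_duration=20.0, lease=30.0)
```

Without the renewals the message would have become visible again after 30 seconds; with them the worker keeps the lease for the full 100 seconds, and if the worker crashed the message would reappear at most 30 seconds after its last renewal.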
For the second part of your question, there are multiple ways. Your method of storing the id/popReceipt as a lookup is a possible option. Alternatively, you can have a WebJob dedicated to receiving messages on a different queue (e.g. plz-delete-msg): you send it a message containing the "messageId", and this WebJob uses the Get Messages operation and then the Delete Message operation on the matching message. (You can make the job generic by passing the queue name!)
https://msdn.microsoft.com/en-us/library/azure/dd179474.aspx
https://msdn.microsoft.com/en-us/library/azure/dd179347.aspx
