Azure Function Queue Trigger Gets Triggered Twice, Two Instances Running? - azure

I have an Azure Function with a queue trigger. The function must process one queue message at a time, sequentially. This is important, since the function uses an external OAuth API with various limitations on requesting new access and refresh tokens.
In order to process the queue sequentially, I have the following settings:
host.json
"queues": {
"batchSize": 1,
"newBatchThreshold": 0
}
Application settings
FUNCTIONS_WORKER_PROCESS_COUNT = 1
WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT = 1
Despite these settings, the function sometimes gets triggered multiple times. Not in my tests, where I dump a lot of messages into the queue, but when the function is live.
I've dug into the Application Insights log and found that the function gets restarted a lot during the day, and when it restarts, it sometimes starts twice.
For example:
Two instances are started. I'm guessing this is what causes the double triggers from the queue.
How can I make sure the function is triggered once?

I believe the only way to further ensure that only one message is processed at a time is to use sessions. Give every message the same session ID to guarantee the desired behavior. Be aware that some things, like dead-lettering, behave differently when you switch to sessions.
You can also use the [Singleton] attribute if that fits your function app. It should also work, but it can incur additional costs on the Consumption plan, so you need to validate how it affects costs given your workload.
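For illustration, here's a minimal sketch of the [Singleton] approach (the queue name and function name are placeholders, not from the question):

using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class SequentialProcessor
{
    // [Singleton] takes a blob lease so only one invocation runs at a time,
    // even across scaled-out instances.
    [Singleton]
    [FunctionName("ProcessQueueMessage")]
    public static void Run(
        [QueueTrigger("incoming-messages")] string message, // placeholder queue name
        ILogger log)
    {
        log.LogInformation("Processing: {Message}", message);
        // Call the rate-limited OAuth API here; executions are serialized.
    }
}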
See also my other answer here: azure servicebus maxConcurrentCalls totally ignored

Related

What happens if multiple azure function apps bind to the same storage queue for input

I have function apps running in two different regions for redundancy, i.e. there are two separate apps in the Azure portal (deployed from the same code). Both apps therefore have a function that binds to the same storage queue as input. Would all messages be delivered to both, or would the messages get split between the two?
I am using C#, dotnet core, and Functions 2.0.
You do not have to worry about it. The function runtime will lock the messages using the default storage queue behavior.
From the docs:
The queue trigger automatically prevents a function from processing a queue message multiple times; functions do not have to be written to be idempotent.
Now, I do know the docs are talking about one function that is scaling out, but the same applies to two functions with the same queue binding.
So
Would all messages be delivered to both or would the messages get split between the two?
The latter: messages will be split between the two.
Is anyone seeing anything different with this? I'm using Azure Service Bus queues, but it should be the same. I can see in our logs that processing of an item starts at almost exactly the same time twice, which matches the number of function apps pointing to the same queue. It doesn't happen every time, but often enough that I can say the message is not being locked against being picked up multiple times.
This might be the case for a single function app that has logic to mitigate processing the same message twice.
However:
I have seen a scaled function app using a queue trigger sometimes getting the same message on multiple instances.
You have to be prepared for that when scaling the function.
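If you go that route, a common mitigation is to make the function idempotent. Here's a rough sketch, assuming an in-process C# function; the queue name is a placeholder, and the in-memory set would have to be replaced by shared storage (Table storage, SQL, Redis) to deduplicate across two apps:

using System.Collections.Concurrent;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class DeduplicatedProcessor
{
    // Per-instance only; a real implementation needs a shared store.
    private static readonly ConcurrentDictionary<string, bool> Processed =
        new ConcurrentDictionary<string, bool>();

    [FunctionName("ProcessItem")]
    public static void Run(
        [QueueTrigger("items")] string message, // placeholder queue name
        string id,                              // queue message Id (metadata binding)
        ILogger log)
    {
        if (!Processed.TryAdd(id, true))
        {
            log.LogInformation("Message {Id} already handled, skipping.", id);
            return;
        }

        // ... actual processing goes here ...
        log.LogInformation("Processed message {Id}.", id);
    }
}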

Controlling queue polling times

I have a piece of code that pushes a message to a service bus queue every time a new article is added on my web app. This then gets picked up by a ServiceBusTrigger with SendGrid output in my functions app that sends me an email that a new article has been added by someone.
This doesn't happen often at all, and the only reason I decided to make it behave this way is to get my feet wet with some of the awesome Azure services.
My question is: since I don't really care about receiving these notification emails in real time... how can I reduce the frequency with which the trigger checks the queue?
In my function app's host.json I've already reduced maxConcurrentCalls to 1 (the default is 16).
"serviceBus": {
"maxConcurrentCalls": 1,
"prefetchCount": 100,
"autoRenewTimeout": "00:05:00"
}
Is there a way to also set it so that my trigger only checks the queue every 30 minutes or something like that?
No. Message retrieval is managed by the scale controller, which you don't have much influence over apart from the host.json parameters you've already seen.
To implement your scenario, you would need to switch to a timer trigger running every 30 minutes and retrieve messages from Service Bus manually, arguably losing many of the benefits of Azure Functions.
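For what it's worth, a sketch of that timer-based approach might look like this (the queue name, connection setting, and 30-minute CRON expression are assumptions, using the Microsoft.Azure.ServiceBus client):

using System;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;
using Microsoft.Azure.ServiceBus.Core;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class ArticleDigest
{
    [FunctionName("SendArticleDigest")]
    public static async Task Run(
        [TimerTrigger("0 */30 * * * *")] TimerInfo timer, // every 30 minutes
        ILogger log)
    {
        var connectionString = Environment.GetEnvironmentVariable("ServiceBusConnection");
        var receiver = new MessageReceiver(connectionString, "new-articles", ReceiveMode.PeekLock);

        Message message;
        while ((message = await receiver.ReceiveAsync(TimeSpan.FromSeconds(5))) != null)
        {
            log.LogInformation("New article: {Body}", Encoding.UTF8.GetString(message.Body));
            // ... build and send the notification e-mail here ...
            await receiver.CompleteAsync(message.SystemProperties.LockToken);
        }

        await receiver.CloseAsync();
    }
}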
Update: You can now integrate Service Bus with Azure Event Grid and then use an Event Grid triggered function. Unfortunately, as of today this only works for Premium Service Bus namespaces, so you'd most probably have to wait until the feature is extended to lower tiers.

Time triggered Azure Function and queue processing

I have a function called once a day that processes all messages in a queue. But I would also like the retry and poison-message logic you get with a queue trigger. Is this somehow possible?
At that point your function is purely a timer-triggered function, and from there you are no different from a console app in terms of how you would have to process messages from a queue: your own client connection, message loop, retry and dead-lettering (poison) logic. It's honestly just not the right tool for that job.
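To make that concrete, here's a rough sketch of what such a hand-rolled loop could look like with a timer trigger and the WindowsAzure.Storage SDK; the queue names, schedule, and the 5-attempt limit are all assumptions, not something the runtime gives you here:

using System;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

public static class DailyQueueProcessor
{
    [FunctionName("ProcessQueueOnceADay")]
    public static async Task Run(
        [TimerTrigger("0 0 6 * * *")] TimerInfo timer, // 06:00 every day
        ILogger log)
    {
        var account = CloudStorageAccount.Parse(
            Environment.GetEnvironmentVariable("AzureWebJobsStorage"));
        var client = account.CreateCloudQueueClient();
        var queue = client.GetQueueReference("work-items");
        var poison = client.GetQueueReference("work-items-poison");
        await poison.CreateIfNotExistsAsync();

        CloudQueueMessage message;
        while ((message = await queue.GetMessageAsync()) != null)
        {
            if (message.DequeueCount > 5)
            {
                // Emulate the runtime's poison handling: move the message aside.
                await poison.AddMessageAsync(new CloudQueueMessage(message.AsString));
                await queue.DeleteMessageAsync(message);
                continue;
            }

            try
            {
                log.LogInformation("Processing: {Body}", message.AsString);
                // ... your processing logic ...
                await queue.DeleteMessageAsync(message);
            }
            catch (Exception ex)
            {
                // Leave the message alone: it becomes visible again after the
                // visibility timeout and DequeueCount goes up on the next read.
                log.LogError(ex, "Processing failed; the message will be retried.");
            }
        }
    }
}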
One approach I suppose you could consider, if you wanted to be creative, is to benefit from an Azure Function's built-in queue trigger behavior while still controlling what time the queue is processed by actually starting and stopping the function instance itself via something like Azure Scheduler. Scheduling the starting of the function is pretty straightforward and, once started, it will immediately begin draining the queue. The challenge is knowing how to stop it. The Azure Functions runtime with the queue binding won't ever stop on its own, as it's reading off the queue using a pull model, so it's just going to sit there waiting for new messages to arrive, right? So stopping it is really a question of understanding the needs of the business. Do you stop once there are no more messages left for that day? Do you stop at a specific time of day? There's no correct answer here; it's totally domain specific, but whatever that answer is will dictate the exact approach taken.
Honestly, as I said earlier, I'm not sure this is the right tool for the job. Yes, it's nice that you get the retry and poison handling, but you're really going against the grain of the runtime. I would personally suggest you look into scheduling this as a simple console executable, executed in a task-like fashion using Azure Container Instances (some docs on that approach here). If you were counting on the auto-scale of Azure Functions, that's totally something you can get from Azure Container Instances as well.

Requeue or delete messages in Azure Storage Queues via WebJobs

I was hoping if someone can clarify a few things regarding Azure Storage Queues and their interaction with WebJobs:
To perform recurring background tasks (i.e. add to queue once, then repeat at set intervals), is there a way to update the same message delivered in the QueueTrigger function so that its lease (visibility) can be extended as a way to requeue and avoid expiry?
With the above-mentioned pattern for recurring background jobs, I'm also trying to figure out a way to delete/expire a job 'on demand'. Since this doesn't seem possible outside the context of WebJobs, I was thinking of maybe storing the messageId and popReceipt for the message(s) to be deleted in Table storage as persistent cache, and then upon delivery of message in the QueueTrigger function do a Table lookup to perform a DeleteMessage, so that the message is not repeated any more.
Any suggestions or tips are appreciated. Cheers :)
Azure Storage Queues are used to store messages that may be consumed by your Azure WebJob, worker role, etc. The Azure WebJobs SDK provides an easy way to interact with Azure Storage (Queues, Table storage, and Blobs) as well as Service Bus. That being said, you can also have an Azure WebJob that does not use the WebJobs SDK and does not interact with Azure Storage at all. In fact, I run a WebJob that interacts with a SQL Azure database.
I'll briefly explain how the WebJobs SDK interacts with Azure queues. Once a message arrives in a queue (or is made 'visible', more on this later), the function in the WebJob is triggered (assuming you're running in continuous mode). If that function returns with no error, the message is deleted. If something goes wrong, the message goes back to the queue to be processed again. You can handle the failed message accordingly. Here is an example of how to do this.
The SDK will call a function up to 5 times to process a queue message. If the fifth try fails, the message is moved to a poison queue. The maximum number of retries is configurable.
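As a rough illustration of that knob in a classic WebJobs SDK 2.x host (the values here are just examples):

using System;
using Microsoft.Azure.WebJobs;

class Program
{
    static void Main()
    {
        var config = new JobHostConfiguration();
        config.Queues.MaxDequeueCount = 3;                           // move to poison after 3 attempts
        config.Queues.MaxPollingInterval = TimeSpan.FromSeconds(30); // optional: slow down polling
        new JobHost(config).RunAndBlock();
    }
}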
Regarding visibility: when you add a message to the queue, there is a visibility timeout property, which is zero by default. Therefore, if you want to process a message in the future, you can do so (up to 7 days in the future) by setting this property to the desired value.
Optional. If specified, the request must be made using an x-ms-version of 2011-08-18 or newer. If not specified, the default value is 0. Specifies the new visibility timeout value, in seconds, relative to server time. The new value must be larger than or equal to 0, and cannot be larger than 7 days. The visibility timeout of a message cannot be set to a value later than the expiry time. visibilitytimeout should be set to a value smaller than the time-to-live value.
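In code, setting that delay when enqueuing might look like this (queue name, connection string, and the 2-hour delay are placeholders):

using System;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

public static class DelayedEnqueue
{
    public static async Task EnqueueInTwoHoursAsync(string connectionString)
    {
        CloudQueue queue = CloudStorageAccount.Parse(connectionString)
            .CreateCloudQueueClient()
            .GetQueueReference("background-tasks");

        await queue.AddMessageAsync(
            new CloudQueueMessage("do-the-thing"),
            null,                    // timeToLive: keep the default
            TimeSpan.FromHours(2),   // initialVisibilityDelay: invisible for 2 hours
            null,                    // QueueRequestOptions
            null);                   // OperationContext
    }
}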
Now the suggestions for your app.
I would just add a message to the queue for every task that you want to accomplish. The message will obviously carry the pertinent information for processing. If you need to schedule several tasks, you can run a scheduled WebJob (on a schedule of your choice) that adds messages to the queue. Then your continuous WebJob will pick up those messages and process them.
Add a GUID to each message that goes to the queue, and store that GUID in some other part of your application (a database). When you dequeue a message for processing, the first thing you do is check the database to see whether the message still needs to be processed. If you need to cancel the execution of a message, instead of deleting it from the queue, just update that GUID's record in your database.
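A quick sketch of that second suggestion, with the GUID carried in the message body and a hypothetical status store standing in for your database (all names here are illustrative):

using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

public class RecurringTask
{
    public string TaskId { get; set; }  // the GUID you also store in the database
    public string Payload { get; set; }
}

public static class TaskStatusStore
{
    // Stub: replace with a Table storage or database lookup in a real app.
    public static Task<bool> IsCancelledAsync(string taskId) => Task.FromResult(false);
}

public static class RecurringTaskFunctions
{
    public static async Task ProcessAsync(
        [QueueTrigger("recurring-tasks")] RecurringTask task, // placeholder queue name
        TextWriter log)
    {
        if (await TaskStatusStore.IsCancelledAsync(task.TaskId))
        {
            // Returning without an error lets the SDK delete the message,
            // so the cancelled task simply stops here.
            await log.WriteLineAsync($"Task {task.TaskId} was cancelled, dropping message.");
            return;
        }

        // ... perform the recurring work and, if needed, enqueue the next occurrence ...
    }
}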
There's more info here.
Hope this helps,
As for the first part of the question, you can use the Update Message operation to extend the visibility timeout of a message.
The Update Message operation can be used to continually extend the invisibility of a queue message. This functionality can be useful if you want a worker role to “lease” a queue message. For example, if a worker role calls Get Messages and recognizes that it needs more time to process a message, it can continually extend the message’s invisibility until it is processed. If the worker role were to fail during processing, eventually the message would become visible again and another worker role could process it.
You can check the REST API documentation here: https://msdn.microsoft.com/en-us/library/azure/hh452234.aspx
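For example, extending the lease from code with the storage SDK could look roughly like this (the 5-minute extension is arbitrary):

using System;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Queue;

public static class LeaseExtender
{
    public static Task ExtendLeaseAsync(CloudQueue queue, CloudQueueMessage message)
    {
        // Pushes the message's next-visible time another 5 minutes into the future.
        return queue.UpdateMessageAsync(message, TimeSpan.FromMinutes(5), MessageUpdateFields.Visibility);
    }
}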
For the second part of your question, there are really multiple ways, and your method of storing the messageId/popReceipt as a lookup is a possible option. You could also have a WebJob dedicated to receiving messages on a different queue (e.g. plz-delete-msg): you send it a message containing the messageId, and that WebJob can use the Get Messages operation and then delete the target message. (You can make the job generic by also passing the queue name!)
https://msdn.microsoft.com/en-us/library/azure/dd179474.aspx
https://msdn.microsoft.com/en-us/library/azure/dd179347.aspx

Correct code pattern for recurrent events in Azure worker roles with sizable delays between each event

I have an Azure worker role whose job is to periodically run some code against a SQL Azure database. Here's my current code:
const int oneHour = 3600000; // milliseconds (60 * 60 * 1000)
while (true)
{
    var numConversions = SaveSeedsToSQL.ConvertRemainingPotentialQueryURLsToSeeds();
    SaveLogEntryToSQL.Save(new LogEntry { Count = numConversions });
    Thread.Sleep(oneHour);
}
Is Thread.Sleep(oneHour) the best way of programming such regular but infrequent events, or is there some kind of wake-up-and-run-again mechanism for Azure worker roles that I should be utilizing?
This code works of course, but there are some problems:
1. You can fail somewhere and this schedule gets all thrown off. That is important if you must actually do it at a specific time.
2. There is no concurrency control here. If you want something only done once, you need a mechanism such that a single instance will perform the work and the other instances won't.
There are a few solutions to this problem:
Run the Windows Scheduler on the role (built in). That solves problem 1, but not 2.
Run Quartz.NET and schedule things. That solves #1 and depending on how you do it, also #2.
Use future scheduled queue messages in either Service Bus or Windows Azure queues. That solves both.
The first two options work with caveats, so I think the last option deserves more attention. You can simply create a message (or messages) that your role(s) will understand and post it to the queue. Once the time comes, it becomes visible and your normally polling roles will see it and can work on it. The benefit here is that it is both time-accurate and handled by a single instance, since it is a queue message. When the work is completed, you can have the instance schedule the next one and post it to the queue. We use this technique all the time. You only have to be careful that if for some reason your role fails before scheduling the next one, the whole system kind of fails, so you should have some sanity checks and safeguards there.
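A sketch of that self-rescheduling pattern, using the newer Microsoft.Azure.ServiceBus client for brevity (queue name, connection setting, and the one-hour interval are assumptions):

using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;

public static class HourlyWork
{
    private static readonly IQueueClient Client = new QueueClient(
        Environment.GetEnvironmentVariable("ServiceBusConnection"), "hourly-work");

    // Wire this up at startup with Client.RegisterMessageHandler(HandleAsync, ...).
    public static async Task HandleAsync(Message message, CancellationToken token)
    {
        // 1. Do the actual work; only one instance sees this message.
        //    ... run the SQL conversion and logging here ...

        // 2. Schedule the next occurrence an hour from now. As noted above, you
        //    need safeguards in case this step fails.
        await Client.ScheduleMessageAsync(new Message(message.Body), DateTimeOffset.UtcNow.AddHours(1));

        // 3. Complete the current message so it is not redelivered.
        await Client.CompleteAsync(message.SystemProperties.LockToken);
    }
}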
