How to handle an Azure Function rerunning when using message queue binding?

I have a v1 Azure Function that is triggered by a message being written to the Azure Storage message queue.
The Azure Function needs to perform multiple updates to SharePoint Online. Occasionally these operations fail. This results in the message being returned to the queue and being reprocessed.
When I developed the function, I didn't consider that it might partially complete and then restart. I've done a little research and it sounds like I need to modify it to be re-entrant.
Is there a design pattern that I should follow to cater for this without having to add a lot of checks to determine if an operation has already been carried out by a previous execution? Alternatively, is there any Azure functionality that can help (beyond the existing message retries and poison queue)?

It sounds like you will need to do some re-engineering. Our team had a similar issue and wrote a home-grown solution years ago. But we eventually scrapped our solution and went with Azure Durable Functions.
Not gonna lie - this framework has some complexity and it took me a bit to wrap my head around it. Check out the function chaining pattern.
We have processing that requires multiple steps that all must be completed. Our processing spans multiple data stores (updating Cosmos DB, Azure SQL, Blob Storage, etc.), and there's no support for distributed transactions across multiple PaaS offerings. Durable Functions let you break your process up into discrete steps. If a step fails, the orchestrator will re-run that step based on a retry policy.
So in a nutshell, we use Durable Task Activity functions to attempt each step. If the step fails due to what we think is a transient error, we retry. If it's an unrecoverable error, we don't retry.
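To make the chaining pattern concrete, here is a minimal sketch of an orchestrator with a retry policy, shown in Python with the azure-functions-durable package (the activity names and payload are placeholders, and you would need to move off the v1 runtime to use Durable Functions):

```python
import azure.durable_functions as df

def orchestrator_function(context: df.DurableOrchestrationContext):
    item = context.get_input()  # the payload from the original queue message

    # Retry transient failures: 3 attempts, starting 5 seconds apart.
    retry = df.RetryOptions(first_retry_interval_in_milliseconds=5000,
                            max_number_of_attempts=3)

    # Each SharePoint update is its own activity function. If a later step
    # fails and the orchestration replays, earlier steps are NOT executed
    # again - their results come back from the orchestration history.
    yield context.call_activity_with_retry("UpdateSharePointListA", retry, item)
    yield context.call_activity_with_retry("UpdateSharePointListB", retry, item)
    yield context.call_activity_with_retry("WriteAuditRecord", retry, item)

main = df.Orchestrator.create(orchestrator_function)
```

Note that activities still run with at-least-once semantics, so each individual step should be kept as idempotent as possible; the orchestrator only guarantees that completed steps are not re-executed during replay.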

Durable Function triggered with delay

I'm currently facing a weird issue.
Randomly (I guess), my Azure durable function invocation is triggered with a delay of more than 10 minutes.
My understanding is that there's something wrong with the lease for the control queue.
I'm on the Consumption plan, so I'm wondering if the scale-in/out mechanism is working properly with my durable function. My feeling is that a host instance takes the lease, then goes into drain mode, recycling, etc., and holds the lease for 10 minutes before releasing it.
It seems to happen after a period of inactivity.
Have you ever seen this behavior?
I found a similar recent issue, #1148771, reported in January 2023 on the Microsoft Q&A forum for Azure Functions, where the user experiences a delay at orchestration start in durable functions hosted on the Consumption plan.
That case is still under investigation by the Microsoft support team, which has mentioned possible causes such as:
The lease for the control queue being held by a previous function instance that was supposed to be recycled.
As a basic check, make sure you have the latest versions of all the packages used in the function code.
The same Q&A thread also describes, based on the timestamps and orchestration IDs provided, the scenario that caused the issue. If your scenario is similar, you can track that discussion for the solution.
If it is a product issue, you could raise a ticket in the azure-functions-durable-extension repository on GitHub.
Refer to GitHub issue #606 and the MS docs' troubleshooting steps for orchestration start delays in Azure Durable Functions.
Today, I received an update from the product team: the issue is caused by the current lease management logic used by Durable Functions, and it can happen when an instance is shut down.
The issue is specific to the Azure Storage backend.
"This issue has been existing for a while. We're trying to refactor the current Lease Management logic with hope to fix such issue, but no clear ETA yet. You can subscribe to Orchestrations freezing mid-execution · Issue #2207 · Azure/azure-functions-durable-extension (github.com) to get the notification when the fix is ready."

How to process current and previous IoT Hub events from an Azure Function?

I have a simple scenario: I want to take the diff between the current and previous value of a parameter from IoT Hub telemetry messages, attach the result, and send it to a Time Series Insights environment (via an event hub if required).
How can I achieve this? I am studying Azure Functions but am not able to figure out exactly how to go about it.
The minimum timestamp difference between messages is 1 second, and only edge devices (at most perhaps 3) will send the telemetry data. Each edge device might be collecting data from around 500 devices.
I am looking for guidance on the logical steps involved and a few critical pieces of Python code.
Are these telemetry messages or property changes? Also, what's the scale (number of devices)? To do this effectively you need to ensure you have both the current and previous values, which means storing the last reported value and timestamp externally, since it could be a long time between messages. The event hub is not guaranteed to have all past messages (the default retention is 24 hours), so if there's a long lag between messages it's not the right store to rely on.
Durable Entities can be used to store that state (using something similar to the actor model). They are persisted in Azure Storage, so at extremely high throughput a memory-only calculation with delayed persistence might make sense, and you can build a memory-caching layer into your function to help if needed. This is likely going to be the best bet for what you want to do.
For most people the performance hit of going to Azure Storage and back is minimal, and Durable Entities will be the easiest path forward.
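As a rough illustration of that approach, here is a sketch of a Durable Entity in Python that keeps the last reported value per device and returns the diff when a new reading arrives (the entity and operation names, and the payload shape, are illustrative assumptions):

```python
import azure.durable_functions as df

def entity_function(context: df.DurableEntityContext):
    # One entity instance per device key holds the last reading seen.
    state = context.get_state(lambda: {"value": None, "timestamp": None})

    if context.operation_name == "diff":
        reading = context.get_input()  # e.g. {"value": 42.0, "timestamp": "..."}

        # First reading for this device: nothing to diff against yet.
        diff = None if state["value"] is None else reading["value"] - state["value"]

        context.set_state(reading)  # the new reading becomes the "previous" one
        context.set_result(diff)    # returned to the calling orchestration

main = df.Entity.create(entity_function)
```

With ~3 edge devices fanning out to ~500 devices each, you would key the entity per device, e.g. df.EntityId("LastReading", device_id), so roughly 1,500 small entities hold state independently. To get the diff back you call the entity from an orchestration (context.call_entity); a plain client binding can only signal entities one-way or read their state.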
If you are doing it on a near-real-time stream, the best solution is to use Azure Stream Analytics with the LAG operator. ASA has a bunch of useful features that you will need, such as PARTITION BY and event-ordering policies. Beware: ASA can be expensive to run and hard to work with, but it is a good service for commercial solutions.
If you don't need near-real time, a plain ol' Python script that queries (blob-)persisted data is a good option, and it can be wrapped in an Azure Function if it doesn't take too long to run.
Azure Functions are not recommended for stateful message processing. You simply have insufficient control over the number of function instances running, the size of each batch, and so on, so it is impossible to consistently and confidently know what the 'previous' time-series value is. With Azure Functions you have to develop on the assumption that concurrency will never be an issue, which you cannot do with streaming IoT data.

Azure Functions: how much code can be done in one?

I am a complete newbie to Azure and Azure Functions, but my team plans to move to Azure soon. Now I'm researching how I could use Azure Functions to do what I would normally do in a .NET console application.
My question is: can Azure Functions handle quite a bit of code processing?
Our team uses several console apps that effectively pick up a pipe-delimited file, apply some business logic, update a database with the data, and log everything along the way. From what I've been reading so far, Azure Functions are typically used for little pieces of code. How little do they mean? Is it best practice to have a bunch of Azure Functions replace one console app (e.g., one function that reads the file and creates a list of objects, another that loops through those items and applies the business logic, and another that writes the data to a database), or can I use one Azure Function to do all of that?
The direct answer is yes - you can run bigger pieces of code as an Azure Function; that is not a problem as long as you stay within the limitations. You can even have dependency injection. For chained scenarios, you can use Durable Functions. However, Microsoft does not recommend long-running functions because of unexpected timeouts. See the best practices for Azure Functions.
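To give a sense of scale, the whole console-app pipeline fits comfortably in a single function. A hedged sketch in Python (the trigger, parsing, and persistence details are placeholders; the same structure applies in .NET):

```python
import logging
import azure.functions as func

def apply_business_logic(fields: list) -> list:
    # Placeholder: whatever the console app does per record.
    return fields

def save_to_database(rows: list) -> None:
    # Placeholder: e.g. a pyodbc / SQLAlchemy bulk insert.
    logging.info("Saved %d rows", len(rows))

def main(inputfile: func.InputStream) -> None:
    """Blob-triggered function: read a pipe-delimited file, apply the
    business logic, write to the database, and log along the way -
    the same steps as the console app, all in one function."""
    rows = []
    for line in inputfile.read().decode("utf-8").splitlines():
        fields = line.split("|")
        rows.append(apply_business_logic(fields))
    save_to_database(rows)
    logging.info("Processed %d lines", len(rows))
```

The main limit to respect on the Consumption plan is the function timeout (5 minutes by default, configurable up to 10); if one file can take longer than that, split the work or move to another hosting plan.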
Because of that timeout risk, I would consider alternatives:
If all you need is to run a console app in Azure, you can use WebJobs. Here is an example of how to deploy a console app directly to Azure via Visual Studio.
For more complex logic, you can use a .NET Core Worker Service, which behaves like a Windows service and can be deployed to Azure as an App Service.
If you need long-running jobs, but on scheduled runs only, I have had a really great experience with Hangfire, which can be hosted in Azure as well.
This is really hard to answer because we don't know what kind of console app you have over there. I usually try to apply the same SOLID principles I would use in any class to my functions too. And whenever you need to coordinate actions or run things in parallel, use the Durable Functions framework.
The only concern is execution time: your function can get pretty expensive if you're running on the Consumption plan and don't pay attention to it. I recommend reading the following great article:
https://dev.to/azure/is-serverless-really-as-cheap-as-everyone-claims-4i9n
You can do all of that in one function.
If you need on-the-fly data processing, you can safely use Azure Functions, even if that involves reading files or communicating with a database.
What you need to be careful about and configure, though, is the timeout. Their scalability is an interesting topic as well.
If you need to host a full application, on the other hand, you need a machine (or a share of one) in Azure to do that.

Document-centric event scheduling on Azure

I'm aware of the many different ways of scheduling system-centric events in Azure. E.g. Azure Scheduler, Logic Apps, etc. These can be used for things like backups, sending batch emails, or other maintenance functions.
However, I'm less clear on what technology is available for events relating to a large volume of documents or records.
For example, imagine I have 100,000 documents in Cosmos and some of the datetime properties on those documents relate to events: e.g. expiry, reminders, escalations, timeouts, etc. Each record has a different set of dates and times.
What approaches are there to fire off code whenever one of those datetimes is reached?
Stuff I've thought of so far:
Have a scheduled task that runs once per minute and looks for anything relating to that particular minute in Cosmos then does "stuff".
Schedule tasks on Service Bus queues with a future delivery date as-and-when the Cosmos records are created, and then have something receive those messages and do "stuff" (sketched below).
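For illustration, that second idea maps to the Service Bus scheduled-messages API; a minimal Python sketch, with placeholder names and payload:

```python
from datetime import datetime, timezone
from azure.servicebus import ServiceBusClient, ServiceBusMessage

conn_str = "<service-bus-connection-string>"  # placeholder

with ServiceBusClient.from_connection_string(conn_str) as client:
    with client.get_queue_sender("document-events") as sender:
        msg = ServiceBusMessage('{"docId": "123", "event": "expiry"}')
        # Enqueued now, but delivered to receivers only at the due time.
        seq_nums = sender.schedule_messages(
            msg, datetime(2030, 5, 1, 10, 5, tzinfo=timezone.utc))
```

A nice property of this route is that sender.cancel_scheduled_messages(seq_nums) can cancel the pending delivery if the dates on a document later change.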
But are there better ways of doing this? Is there a ready-made Azure service that would take away much of the background infrastructure work and just let me schedule a single one-off event at a particular point in time and hit a webhook or something like that?
Am I mis-categorising Azure Scheduler as something that you'd use for a handful of regularly scheduled tasks rather than the mixed bag of dates and times you'd find in 100,000 Cosmos records?
FWIW, in my use-case there isn't really a precision issue - stuff scheduled for 10:05:00 happening at 10:05:32 would be perfectly acceptable, for example.
Appreciate your thoughts.
First of all, Azure Scheduler will be replaced by Azure Logic Apps:
Azure Logic Apps is replacing Azure Scheduler, which is being retired. To schedule jobs, follow this article for moving to Azure Logic Apps instead.
(source)
That said, Azure Logic Apps is one of your options, since you can define a logic app that starts a one-time job by using a delay action. See the docs for details.
It scales very well and you can pay for what you use (or use a fixed pricing model).
Another option is using a durable Azure Function with a timer in it. Once the timer elapses, you can do your thing. You can use the Consumption plan, so you pay only for what you use, or a fixed pricing model. It also scales very well, so hundreds of those instances won't be a problem.
In both cases you have to trigger the function or logic app when the Cosmos records are created. Put the due time in the trigger payload, and there you go.
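For the durable-function option, a minimal Python sketch that sleeps until the document's due time and then fires the work (the orchestration input shape and activity name are assumptions):

```python
from datetime import datetime
import azure.durable_functions as df

def orchestrator_function(context: df.DurableOrchestrationContext):
    doc = context.get_input()  # e.g. {"id": "...", "due": "2030-05-01T10:05:00+00:00"}

    # Durable timers survive restarts, and the orchestration is unloaded
    # from memory while waiting, so many pending timers stay cheap.
    yield context.create_timer(datetime.fromisoformat(doc["due"]))

    # Due time reached (give or take the slack you already said is fine):
    yield context.call_activity("HandleDocumentEvent", doc)

main = df.Orchestrator.create(orchestrator_function)
```

You would start one orchestration per document, e.g. from a Cosmos DB change-feed-triggered client function. Depending on the extension version, very long timers may need to be split into shorter hops; check the durable timers documentation for the current limit.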
Now, given your statement
I'm aware of the many different ways of scheduling system-centric events in Azure. E.g. Azure Scheduler, Logic Apps, etc. These can be used for things like backups, sending batch emails, or other maintenance functions.
That is up to you; you can do anything you want. You don't specify in your question what work needs to be done when the due time is reached, but I doubt it is something you can't do with those services.

Architecture: Azure Functions

I have an Azure Function with an Azure Storage queue trigger. It runs fine without any problems. A JSON message is written to the queue, and the function does its job.
But now we need more functionality. I'd like to extend the JSON with a functionality key. Is it better to extend the function as well:
If functionality = A go to class A
Else go to class B
Or is it better to create a new function with the same trigger?
Regards
It is okay to have different classes in the function.
To make each function responsible for only one particular process, you can split it into two functions and use Service Bus topic subscriptions instead of Storage queues. This will keep the implementation reliable, as Service Bus has a wide set of features compared to Storage queues.
You can use rules on the topic subscriptions to filter the messages, as sketched below.
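As a rough sketch of that setup (Python, using the azure-servicebus administration client; the topic, subscription, and property names are placeholders), each function listens on its own subscription and only ever sees its own functionality key:

```python
from azure.servicebus.management import ServiceBusAdministrationClient, SqlRuleFilter

conn_str = "<service-bus-connection-string>"  # placeholder

with ServiceBusAdministrationClient.from_connection_string(conn_str) as admin:
    # The filter matches a user property set by the sender, not the JSON body.
    admin.create_rule("jobs-topic", "functionality-a-sub", "OnlyA",
                      filter=SqlRuleFilter("functionality = 'A'"))
    admin.create_rule("jobs-topic", "functionality-b-sub", "OnlyB",
                      filter=SqlRuleFilter("functionality = 'B'"))
    # Remove the $Default rules, or each subscription still receives everything.
    admin.delete_rule("jobs-topic", "functionality-a-sub", "$Default")
    admin.delete_rule("jobs-topic", "functionality-b-sub", "$Default")
```

The sender then sets the key as a message application property, e.g. ServiceBusMessage(body, application_properties={"functionality": "A"}), rather than only inside the JSON body, since SQL filters cannot look into the payload.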
Functions are just like traditional apps; there's no issue in referencing a class library that handles the deserializing.
What you are looking for is a concept called message versioning. It's a heavy topic, so I can't cover it completely here, but versioning will happen.
One possibility is to treat each message as a command (read up on CQRS). You could pre-parse the version number in the message and have a command handler for each version.
That is not specific to Functions, but here is one piece of Functions-related advice: keep a single function. Once versioning happens, it will be simpler to debug and to see which versions still work and which don't.
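A minimal sketch of that dispatch idea (plain Python; the key values and handler classes are illustrative):

```python
import json

class HandlerA:
    def handle(self, payload: dict) -> None:
        print("functionality A:", payload)

class HandlerB:
    def handle(self, payload: dict) -> None:
        print("functionality B:", payload)

# One registry inside one queue-triggered function: the routing lives in
# data instead of a growing if/else chain, and new keys are one line each.
HANDLERS = {"A": HandlerA(), "B": HandlerB()}

def dispatch(message_body: str) -> None:
    msg = json.loads(message_body)
    key = msg.get("functionality")
    handler = HANDLERS.get(key)
    if handler is None:
        # Unknown keys surface as failures instead of being dropped silently.
        raise ValueError(f"Unknown functionality key: {key!r}")
    handler.handle(msg)

dispatch('{"functionality": "A", "data": 123}')
```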
