We have a timer-triggered Azure Function App on a Consumption plan for testing purposes. The app first fires a batch of stored procedures on a SQL Server. We use Task.Run(), and inside it there is just a synchronous call that runs an SP on the server. These are fire-and-forget tasks by design, and any exceptions or errors from SQL are logged to a table inside SQL Server. This app is part of a plan to migrate our SQL Agent jobs to the cloud (as we are moving towards a PaaS database). The Function App also triggers the SP across multiple databases, with a single Task.Run for each DB.
The problem is that the SP can take around 20 minutes to complete, and I see the connection being dropped at around 19 minutes. For example, an SP started at 5:00 AM and, according to the logging inside the SP, ran until 5:19 AM and then stopped (no success log). So I believe the SqlConnection from C# is dropped. The Consumption plan default timeout is 5 minutes. If this is a timeout issue, why does the SP keep running until 19 minutes before the connection is dropped? I have observed this behavior for several days now.
I cannot arrive at a feasible explanation of the above behavior.
The maximum timeout for Azure Functions on the Consumption plan is 10 minutes (the default is 5 minutes).
Change to a plan that supports a longer timeout, or use Durable Functions (intended for long-running tasks).
Durable Functions is an extension of Azure Functions that lets you
write stateful functions in a serverless compute environment. The
extension lets you define stateful workflows by writing orchestrator
functions and stateful entities by writing entity functions using the
Azure Functions programming model. Behind the scenes, the extension
manages state, checkpoints, and restarts for you, allowing you to
focus on your business logic.
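The function timeout mentioned above is configured through the functionTimeout property in host.json. The following is a minimal sketch in Python that only builds such a fragment and checks it against the Consumption plan limits stated in this answer (5-minute default, 10-minute maximum). The helper name build_host_json is hypothetical and not part of any Azure SDK.

```python
import json
from datetime import timedelta

# Consumption plan limits as stated in the answer above.
CONSUMPTION_DEFAULT = timedelta(minutes=5)
CONSUMPTION_MAX = timedelta(minutes=10)


def build_host_json(timeout: timedelta) -> str:
    """Return a host.json fragment that sets functionTimeout (hh:mm:ss).

    Hypothetical helper: it refuses values the Consumption plan cannot honor.
    """
    if timeout <= timedelta(0):
        raise ValueError("timeout must be positive")
    if timeout > CONSUMPTION_MAX:
        raise ValueError(
            f"{timeout} exceeds the Consumption plan maximum of {CONSUMPTION_MAX}"
        )
    total = int(timeout.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    value = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    return json.dumps({"version": "2.0", "functionTimeout": value}, indent=2)


if __name__ == "__main__":
    # Raise the timeout from the 5-minute default to the 10-minute maximum.
    print(build_host_json(CONSUMPTION_MAX))
    try:
        # A 20-minute stored procedure does not fit on this plan.
        build_host_json(timedelta(minutes=20))
    except ValueError as err:
        print("rejected:", err)
```

A 20-minute stored procedure is rejected here, which is the point of the answer: the work either has to move to a plan with a longer timeout or be restructured, for example with Durable Functions.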
Refs:
https://learn.microsoft.com/pl-pl/azure/azure-functions/functions-scale#function-app-timeout-duration
https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-overview?tabs=csharp
https://learn.microsoft.com/en-us/learn/modules/create-long-running-serverless-workflow-with-durable-functions/
Related
When would I prefer Azure Functions to Azure Container Instances, considering they both offer the possibility to perform run-once tasks and they bill on consumption?
Also, reading this Microsoft Learn Module:
Serverless compute can be thought of as a function as a service (FaaS), or a microservice that is hosted on a cloud platform.
Azure Functions is a platform that allows you to run plain code (instead of containers). The strength of Azure Functions is the rich set of input and output bindings it supports. If you want to execute a piece of code when something happens (e.g. a blob is added to a storage account, a timer fires, and so on), then I would definitely go with Azure Functions.
If you want to run a container-based workload for a short period of time and you don't have an orchestrator (like Azure Kubernetes Service) in place, Azure Container Instances makes sense.
Take a look at this from Microsoft doc
Source: https://learn.microsoft.com/en-us/dotnet/architecture/modernize-with-azure-containers/modernize-existing-apps-to-cloud-optimized/choosing-azure-compute-options-for-container-based-applications
If you want to simplify the application development model, and your architecture consists of fine-grained microservices in which each piece of functionality is typically reduced to a single function, then Azure Functions is worth considering.
If the solution extends an existing Azure application with event-triggered use cases, Azure Functions can be the better choice. The specific code (function) is invoked only for a specific event or trigger, and function instances are created and destroyed on demand (compute on demand, i.e. function as a service (FaaS)).
Event-driven architecture is common in IoT, where you typically define a specific trigger that causes an Azure Function to execute. Accordingly, Azure Functions has its place in the IoT ecosystem as well.
If the solution has fast bursting and scaling requirements, Container Instances can be used, whereas if scaling is predictable, VMs can be used.
Azure Functions avoids allocating extra resources (VMs), and you are charged only while the function is processing work. You do not need to take care of infrastructure such as where the code executes, server configuration, memory, etc. For ACI, billing is per second, based on the time the container runs: CaaS (container as a service).
ACI lets you quickly spawn a container to perform an operation and delete it when done, so you pay only for the few hours of usage rather than for a costly dedicated VM. ACI also lets you run a container without depending on an orchestrator like Kubernetes in scenarios where you do not need orchestration features such as service discovery, mesh, and coordination.
The key difference is that with Azure Functions the function is the unit of work, whereas with a container instance the entire container is the unit of work. So Azure Functions start and end based on event triggers, whereas microservices in containers run the entire time.
Execution time also plays a critical role: if the event handler takes 10 minutes or more to execute, it is better to host it in a VM, because the maximum timeout configurable for functions on the Consumption plan is 10 minutes.
Some solutions use both: an Azure Function is triggered for minimal processing or decision making and in turn invokes a container instance for burst processing or for the complete processing.
Also, ACI together with AKS forms a powerful deployment model for microservices: AKS handles the regular deployment of microservices and ACI handles burst workloads, which reduces the challenges of managing scaling and makes effective use of the per-second cost model.
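The rules of thumb in the answers above can be summarized as a small decision helper. This is only a sketch of that reasoning, not an official sizing guide; the function name suggest_compute and its parameters are hypothetical, and the 10-minute threshold is the Consumption plan limit quoted above.

```python
FUNCTION_MAX_MINUTES = 10  # Consumption plan limit quoted in the answer above


def suggest_compute(
    event_driven: bool,
    containerized: bool,
    bursty: bool,
    max_run_minutes: float,
) -> str:
    """Map the rules of thumb from the answers onto a single suggestion."""
    # Plain code reacting to an event, and short enough to fit the timeout.
    if event_driven and not containerized and max_run_minutes < FUNCTION_MAX_MINUTES:
        return "Azure Functions"
    # Container workload with fast bursting needs and no orchestrator in place.
    if containerized and bursty:
        return "Azure Container Instances"
    # Long-running handlers or predictable scaling: the answer suggests VMs.
    return "VM"


if __name__ == "__main__":
    print(suggest_compute(True, False, False, 2))    # blob/timer handler
    print(suggest_compute(False, True, True, 45))    # short-lived burst container
    print(suggest_compute(True, False, False, 20))   # handler over the timeout
```

Real decisions involve more factors (cost, networking, existing tooling), so treat the output as a starting point for the discussion above rather than a verdict.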
How does the concept of storage queue polling apply when an Azure Function is hosted under the consumption plan?
I get the principle of polling with classic hosted WebJob functions, and I understand that the maximum polling interval of 1 minute can be overridden. However, in the case of Consumption plan hosting there is no app-level memory-resident process, so I assume that Azure internals spin up a Function App via some other trigger beyond my control.
The motivation for this question is that I am trying to understand typical E2E function invocation propagation delays when an Azure hosted WebApp adds a message to a storage queue. In my case the WebApp, StorageQueue and pre-compiled function DLL will run in the same Azure region.
I need to cap Azure Function invocation delays to under 10 seconds with an average of <3 seconds.
Unfortunately this isn't possible on the consumption plan with the current polling model, as we poll your trigger resource every 10s to determine if there are new events requiring a function instance to be loaded/started.
If your function app runs frequently enough that it always has active instances (a new queue message every 5 min, for example) you can get the invocation delays that you want, as the instances themselves handle the polling.
The worst case (no function instances running) is ~10s polling + ~5s instance startup time to process a new event.
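The arithmetic behind that worst case can be written out as follows. This is a minimal sketch using only the two figures quoted in the answer (about 10 s of polling and about 5 s of instance startup); the average figure additionally assumes that messages arrive at a uniformly random point within the polling interval, which is an assumption and not something the answer states.

```python
POLL_INTERVAL_S = 10.0  # trigger polling interval quoted in the answer
COLD_START_S = 5.0      # approximate instance startup time quoted in the answer
REQUIRED_CAP_S = 10.0   # cap requested in the question


def cold_path_delay(poll_wait_s: float) -> float:
    """Delay when no instance is running: wait for the next poll, then start up."""
    if not 0.0 <= poll_wait_s <= POLL_INTERVAL_S:
        raise ValueError("poll wait must lie within one polling interval")
    return poll_wait_s + COLD_START_S


# Message arrives just after a poll: a full interval passes before it is noticed.
worst_case = cold_path_delay(POLL_INTERVAL_S)

# Assumption: uniform arrival, so the expected wait is half the interval.
assumed_average = cold_path_delay(POLL_INTERVAL_S / 2)

if __name__ == "__main__":
    print(f"worst case: {worst_case:.0f} s, cap met: {worst_case <= REQUIRED_CAP_S}")
    print(f"assumed average: {assumed_average:.0f} s")
```

Under these figures the cold path alone exceeds the 10-second cap, which is why the answer says the requirement can only be met while instances are already running.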
I'm using Azure WebJobs with queue-triggered functions (which rely on the Azure WebJobs SDK) to perform some background processing work. Within the WebJobs I make various connections to a SQL Azure database (using PetaPoco, which uses System.Data.SqlClient).
I want to be purposeful in my database connection strategy - specifically because there are some concurrency issues inherent to the environment.
One concurrency scenario is with the SDK's BatchSize property that you can set for queue-triggered webjobs. It's my understanding that setting BatchSize > 1 results in multiple instances of the queue-triggered function running within the same webjob process.
The second concurrency scenario is the website scale-out scenario where you're running multiple instances of the webjob itself. These of course are in different processes.
In my website I have a database connection per request (the machine handles connection pooling by default). No problems there.
How should I treat connections in the webjob scenario, accounting for the concurrency scenarios described above? Webjobs are of course just long-lived console processes (these are continuous webjobs). Should I create a database connection when my webjob starts and simply re-use that connection through the webjob's lifetime? Should I instantiate and close connections per function when I need them?
These are the types of things I'm trying to understand.
Webjobs are of course just long-lived console processes (these are continuous webjobs).
The main process is long-lived, but each triggered function runs as a sub-process that is released after the function has executed. This means that its connection is also released automatically along with the sub-process. As a matter of good programming practice, though, it is better to close the connection explicitly before the function exits.
The second concurrency scenario is the website scale-out scenario where you're running multiple instances of the webjob itself. These of course are in different processes.
The WebJobs SDK queue trigger automatically prevents a queue message from being processed by multiple instances.
If your web app runs on multiple instances, a continuous WebJob runs on each machine, and each machine will wait for triggers and attempt to run functions. The WebJobs SDK queue trigger automatically prevents a function from processing a queue message multiple times; functions do not have to be written to be idempotent. However, if you want to ensure that only one instance of a function runs even when there are multiple instances of the host web app, you can use the Singleton attribute.
It's my understanding that setting BatchSize > 1 results in multiple instances of the queue-triggered function running within the same webjob process
BatchSize is the number of queue messages that can be picked up simultaneously and executed in parallel within a WebJob.
For more on using Azure queue storage with the WebJobs SDK, including parallel execution and multiple instances, see the documentation.
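The advice above (open a connection inside the function and close it explicitly, while several messages are handled in parallel) can be sketched as follows. This is an illustration in Python, not WebJobs SDK code: sqlite3 with an in-memory database stands in for SQL Azure, a thread pool stands in for the SDK's batch processing, and the names process_message and process_batch are hypothetical.

```python
from __future__ import annotations

import sqlite3
from concurrent.futures import ThreadPoolExecutor
from contextlib import closing

BATCH_SIZE = 4  # stand-in for the SDK's BatchSize setting


def process_message(message: int) -> int:
    """Handle one queue message with its own short-lived connection."""
    # closing() guarantees conn.close() runs even if the query raises,
    # so no connection outlives the function invocation.
    with closing(sqlite3.connect(":memory:")) as conn:
        row = conn.execute("SELECT ? * 2", (message,)).fetchone()
        return row[0]


def process_batch(messages: list[int]) -> list[int]:
    """Run up to BATCH_SIZE handlers in parallel, as a batch of messages would."""
    with ThreadPoolExecutor(max_workers=BATCH_SIZE) as pool:
        # map() returns results in the order of the input messages.
        return list(pool.map(process_message, messages))


if __name__ == "__main__":
    print(process_batch([1, 2, 3, 4, 5]))
```

Because no connection object is shared between invocations, parallel handlers never contend for the same connection. With a pooled client such as the one the question mentions for the website, opening and closing per invocation is usually cheap, though that depends on the driver in use.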
I have created an azure service which is responsible for below task:
(1) Access the blob containers and download the files from there.
(2) Extract some data from downloaded files
(3) Store the extracted data in an Azure SQL Server database
I want to run this processing after every 7 days. Is there a way to achieve this? or can I use any other option than cloud service to achieve the above goal?
I would recommend using Azure Functions, as its timer-based processing (timer trigger) feature fulfills your requirements.
Timer triggers call functions based on a schedule, one time or
recurring.
Reference: Azure Functions timer trigger, Azure Functions Pricing
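For the 7-day interval in the question, a timer trigger schedule can be written as a six-field NCRONTAB expression. The sketch below is plain Python rather than Functions code: it holds one such expression (weekly, Sundays at midnight, chosen here only as an example) and computes the matching run times by hand so the schedule can be checked. The function name next_sunday_midnight is hypothetical.

```python
from datetime import datetime, timedelta

# Fields: {second} {minute} {hour} {day} {month} {day-of-week}; 0 = Sunday.
# Example choice: once a week, on Sunday at 00:00:00.
WEEKLY_NCRONTAB = "0 0 0 * * 0"


def next_sunday_midnight(after: datetime) -> datetime:
    """Return the first Sunday 00:00:00 strictly later than `after`."""
    # datetime.weekday(): Monday = 0 ... Sunday = 6
    days_ahead = (6 - after.weekday()) % 7
    candidate = (after + timedelta(days=days_ahead)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    if candidate <= after:
        # `after` is already on Sunday at or past midnight: go to next week.
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    first = next_sunday_midnight(datetime(2024, 1, 3, 12, 0))
    second = next_sunday_midnight(first)
    print(WEEKLY_NCRONTAB, first, second, second - first)
```

Consecutive runs are exactly 7 days apart, which matches the requirement in the question.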
Another great advantage of using Azure Function for your scenario is its pricing model.
Azure Functions consumption plan is billed based on resource
consumption and executions.
Consumption plan pricing includes a
monthly free grant of 1 million requests and 400,000 GB-s of resource
consumption per month.
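To see how a weekly job relates to the free grant quoted above, the resource consumption can be estimated as memory (GB) times execution time (seconds). The sketch below uses the two grant figures from the quote; the job's memory size, duration, and run count are illustrative assumptions, not measurements.

```python
FREE_REQUESTS_PER_MONTH = 1_000_000   # from the pricing quote above
FREE_GB_SECONDS_PER_MONTH = 400_000   # from the pricing quote above


def monthly_gb_seconds(runs: int, seconds_per_run: float, memory_gb: float) -> float:
    """Resource consumption in GB-s: memory multiplied by total execution time."""
    return runs * seconds_per_run * memory_gb


def within_free_grant(runs: int, seconds_per_run: float, memory_gb: float) -> bool:
    """True if both executions and GB-s stay inside the monthly free grant."""
    usage = monthly_gb_seconds(runs, seconds_per_run, memory_gb)
    return runs <= FREE_REQUESTS_PER_MONTH and usage <= FREE_GB_SECONDS_PER_MONTH


if __name__ == "__main__":
    # Assumed job: at most 5 weekly runs in a month, 10 minutes each, 0.5 GB.
    usage = monthly_gb_seconds(runs=5, seconds_per_run=600, memory_gb=0.5)
    print(usage, within_free_grant(5, 600, 0.5))
```

Under those assumptions the job uses 1,500 GB-s a month, a small fraction of the grant; actual billing details such as rounding are not modeled here.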
Certainly not natively with the Cloud Service itself. You could obviously code it so that it performs some task(s) and then sleeps for 7 days, but you would pay for all of that time, which makes no sense.
You can use Azure WebJobs, Functions and Scheduler for this purpose, or you can create a PowerShell\Cli or something else cron task\task scheduler to turn on your Azure Cloud Service, wait for it to finish processing and turn it off. But that seems like a lot of extra effort, I'd rather go with Scheduler or Functions.
Can anybody explain the difference between Azure WebJobs and Azure Scheduler?
Azure Web Jobs
Only available on Azure Websites
It is used to run code at particular intervals. E.g. a console application every day
Used to trigger and run workloads.
Mainly recommended for workloads that either scale with the website or are relatively small.
Can run persistently if "Always On" is selected; otherwise you will get the 20-minute timeout.
The code that needs to be run and schedule are defined together.
Azure Scheduler
Is not tied to Websites or Cloud Services
It allows you to call a website or add a message to a storage queue
Used for triggering events or triggering small workloads (e.g. add to queue), usually to trigger larger workloads
Mainly recommended for triggering more complex workloads.
This is only a trigger; a separate function that listens for the trigger events (e.g. on a queue) needs to be coded separately.
For many instances I prefer to use the scheduler to push to a storage queue and a worker role on each instance takes off the queue. This keeps tasks controlled granularly and can also move up or down in scale outside of your website.
With WebJobs, background tasks scale up and down with your site, so they can become overtaxed if your website is experiencing low traffic and has been scaled down.
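The pattern described in this answer (a scheduler pushes messages to a queue and workers take them off) can be sketched in-process as follows. Python's queue.Queue and threads stand in for the storage queue and worker roles; nothing here calls Azure, and the names schedule, worker, and run are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to shut down


def schedule(job_queue: queue.Queue, jobs: list) -> None:
    """Scheduler side: only enqueue work, never run it."""
    for job in jobs:
        job_queue.put(job)


def worker(job_queue: queue.Queue, done: list, lock: threading.Lock) -> None:
    """Worker side: take messages off the queue until told to stop."""
    while True:
        job = job_queue.get()
        if job is STOP:
            break
        with lock:
            done.append(job.upper())  # placeholder for the real workload


def run(jobs: list, worker_count: int = 2) -> list:
    job_queue: queue.Queue = queue.Queue()
    done: list = []
    lock = threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(job_queue, done, lock))
        for _ in range(worker_count)
    ]
    for w in workers:
        w.start()
    schedule(job_queue, jobs)
    for _ in workers:
        job_queue.put(STOP)  # one sentinel per worker, queued after all jobs
    for w in workers:
        w.join()
    return sorted(done)


if __name__ == "__main__":
    print(run(["cleanup", "report", "sync"], worker_count=2))
```

Because the scheduler and the workers only share the queue, the number of workers can change independently of whatever produces the messages, which is the decoupling the answer is after.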
Azure Scheduler - Provides a way to easily schedule http calls in a well-defined schedule, like every hour, every Friday at 9:00 am, Once a day, ...
Azure WebJobs - Provides a way to run small to medium work load (in the form of a script: .exe, .cmd, .sh, .js, ...) at the same context of an Azure Website (but can be hosted even with an empty website).
A WebJob can also run continuously (with a process that has a while loop), and Azure will make sure this WebJob is always running (with "Always On" set).
There is also an integration between Azure Scheduler and Azure WebJobs, where a WebJob runs some finite piece of work and the scheduler is responsible for scheduling that work (invoking the WebJob).
So in summary, the scheduler is about scheduling work and WebJobs is about running work load.