We have a solution where we use an Azure Storage Queue to process messages that take approximately 6 minutes each.
I've read that the maximum batchSize of queue messages processed concurrently is 32 per VM.
If the function app scales out to multiple VMs, each VM could run one instance of each queue-triggered function.
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue?tabs=in-process%2Cextensionv5%2Cextensionv3&pivots=programming-language-csharp#host-json
How does that translate to Azure Functions Premium plan?
Let's say we want to be able to process 64 messages at once using the Azure Functions Premium plan with always ready instances. If we have 2 always ready instances, can they process 2 * 32 concurrent messages? Or do they, under the hood, really need to be on separate VMs, and would 2 instances not make any difference?
In the Premium plan, you can have your app always ready on a specified number of instances. The maximum number of always ready instances is 20. When events begin to trigger the app, they are first routed to the always ready instances.
https://learn.microsoft.com/en-us/azure/azure-functions/functions-premium-plan?tabs=portal#always-ready-instances
Yes. In the Azure Functions Premium plan, each pre-warmed instance is given a dedicated VM. So if you have 2 VM instances running your function app, they can process 2 * (batchSize + newBatchThreshold) concurrent queue messages!
The Azure platform scales the function app onto new VMs as the existing instances get busier.
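For illustration, here is a minimal host.json sketch for the 64-messages-at-once goal (note that newBatchThreshold defaults to batchSize / 2, so it is pinned to 0 here to make the per-instance cap exactly batchSize):

{
  "version": "2.0",
  "extensions": {
    "queues": {
      "batchSize": 32,
      "newBatchThreshold": 0
    }
  }
}

With these values each instance processes at most 32 messages concurrently, so 2 instances give 2 * 32 = 64. With the default newBatchThreshold of 16, the per-instance maximum would instead be 32 + 16 = 48.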
I am using Azure Functions on the App Service plan. My understanding is that for every new execution, the Azure Function will create a new App Service, execute the function, and then shut the App Service down. Nothing would be shared between the multiple App Services spawned by multiple requests.
However, when I test my Function (a video-processing one), one request takes around 2-3 minutes, while multiple simultaneous requests take 10-15 minutes. Is my understanding above correct? If not, what resource is shared amongst these App Services? And how should I decide on my scaling options (manual vs. auto)?
"My understanding is for every new execution the Azure Function will create a new App Service" Nope it will not run new instance each time. Generally if there is no load on AF it will stop all instances.
Then if first request/event comes in it will start first instance. This is why we have ColdStart in Serverless. After that scale controller will measure your instance performance memory and CPU consumption and decide if it needs to scale but it wont be instant. So if lets say you sent N amount of requests to do smth with video they could go to same first instance and increase load. Then AF will scale, because of CPU spike but it wont help with old requests since they are handled at first instance. Keep in mind For non-HTTP triggers, new instances are allocated, at most, once every 30 seconds which means that your AF should have CPU spike for at least 30 second to add new instance https://learn.microsoft.com/en-us/azure/azure-functions/event-driven-scaling
I am not sure Azure Functions is a good option for video processing. A function should be used for quick work; usually I would say no more than 30 seconds. There are also execution-time limits that depend on how you run it: https://learn.microsoft.com/en-us/azure/azure-functions/functions-premium-plan?tabs=portal
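If you do stay on Functions, the execution-time limit is controlled by functionTimeout in host.json. A minimal sketch raising it to the documented Consumption plan maximum of 10 minutes (Premium and Dedicated plans allow longer or unbounded values):

{
  "version": "2.0",
  "functionTimeout": "00:10:00"
}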
I'm not sure what type of video processing you're doing, but I would have a look at Azure Media Services.
The other option, as you mentioned, is Azure Batch with low-priority VMs: https://azure.microsoft.com/en-au/blog/announcing-public-preview-of-azure-batch-low-priority-vms/. Yours is actually one of its advertised use cases: media processing and transcoding, rendering, and so on.
A small addition to Vova's answer: if you're running your Function in an App Service (also known as a Dedicated Plan), it will by default only scale instances within the possibilities of the App Service Plan you defined. This means that all of the instances of your Function App run on the same virtual machine. That is most probably the reason you're seeing increasing request times with more requests.
If you want your Functions to scale beyond the capabilities of that plan, you will need to manually scale or enable autoscaling for the App Service plan.
An App Service plan defines a set of compute resources for an app to run. These compute resources are analogous to the server farm in conventional hosting.
and
Using an App Service plan, you can manually scale out by adding more VM instances. You can also enable autoscale, though autoscale will be slower than the elastic scale of the Premium plan. [...] You can also scale up by choosing a different App Service plan.
If you run your Function App on the Consumption plan (the true serverless hosting option, since it enables scaling to zero):
The Consumption plan scales automatically, even during periods of high load.
In case you need longer execution times than those available in the Consumption plan, but the App Service plan doesn't seem to be the best hosting environment for your Functions, there's also the Premium plan.
The Azure Functions Elastic Premium plan is a dynamic scale hosting option for function apps.
Premium plan hosting provides the following benefits to your functions:
Avoid cold starts with perpetually warm instances
Virtual network connectivity.
Unlimited execution duration, with 60 minutes guaranteed.
Premium instance sizes: one core, two core, and four core instances.
More predictable pricing, compared with the Consumption plan.
High-density app allocation for plans with multiple function apps.
More info on all the different Azure Functions hosting options.
I have one App Service Plan (Premium tier), 3 Function Apps, and 3 Application Insights resources.
For the App Service Plan settings, every function app has the same configuration.
What does the number of servers mean when I open the live stream in Application Insights?
When no function has any jobs to execute, I see that the number of servers is 2 for every function app. If some function does some work, I see more servers allocated for it. As I understand it, when there are no jobs, 2 servers means 1 mandatory instance + 1 always ready instance. I also see the same servers allocated for all 3 function apps. Does that mean a "server" is not the same thing as a "function app" object?
You are right. Since all your function apps use the same App Service plan, they share the same pool of server instances. As you have understood, there will be 2 active servers when your function app is in active use with some workload. If the load increases, Azure will assign work to the pre-warmed instance and create 1 more 'always ready' instance. Unless you have set the maximum burst limit for the App Service plan, or the max scale out limit for any of the function apps, Azure will keep adding VM instances to the server pool as the workload increases.
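For reference, the per-app max scale out limit mentioned above maps to the functionAppScaleLimit site property. A hedged ARM template fragment (the resource name and apiVersion are illustrative; the limit can also be set in the portal):

{
  "type": "Microsoft.Web/sites",
  "apiVersion": "2022-03-01",
  "name": "my-function-app",
  "properties": {
    "siteConfig": {
      "functionAppScaleLimit": 5
    }
  }
}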
Per this documentation, I am using the Azure Function consumption plan and am trying to limit the parallelism of one of the queue-triggered functions so that only one instance runs at a time:
{
  "version": "2.0",
  "extensions": {
    "queues": {
      "batchSize": 1
    }
  }
}
The queue is part of an Azure Storage account; it isn't a Service Bus queue.
However, my problem is that the function is still being run in parallel if there are multiple items on the queue at once. I read in the fine print of the documentation above:
If you want to avoid parallel execution for messages received on one queue, you can set batchSize to 1. However, this setting eliminates concurrency only so long as your function app runs on a single virtual machine (VM). If the function app scales out to multiple VMs, each VM could run one instance of each queue-triggered function.
Since I am using a consumption plan, how do I know if the function app is running on a single VM or on multiple? How can I successfully limit this function's batch size to one?
In the Consumption plan, a single function app only scales out to a maximum of 200 instances. A single instance may process more than one message or request at a time, though, so there isn't a set limit on the number of concurrent executions.
Also, when you're using a Consumption plan, instances of the Azure Functions host are dynamically added and removed based on the number of incoming events.
Since you want to limit the parallelism of one of the queue-triggered functions, I suggest you use an App Service plan (kept at a single instance) to achieve it.
For more details, you could refer to this article.
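If you'd rather stay on the Consumption plan, another documented option is the WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT app setting, which caps how many instances the scale controller will create. A sketch of how it could appear in an ARM template's appSettings array (placement is illustrative; it can also be set in the portal):

[
  {
    "name": "WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT",
    "value": "1"
  }
]

With scale-out capped at 1 instance and batchSize set to 1 (newBatchThreshold defaults to batchSize / 2, i.e. 0 here), only one queue message should be processed at a time.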
I have an Azure function that is triggered every 1 second.
Every time it is triggered, it reads a batch of messages from a Service Bus queue and processes them.
It runs on an App Service plan that scales out based on the number of active messages in the queue.
However, as the plan scales out, I do not see any increase in the throughput of the function.
Does a timer-triggered Azure Function scale out as the number of App Service plan instances increases?
No. Timer triggers are singletons, meaning that at any given moment only one instance will be firing the timer-triggered function's calls.
Obviously, different Functions have independent invocations.
To scale processing of Service Bus messages you should use Service Bus Trigger directly, which can manage scaling for you.
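As a sketch of that suggestion (assuming the in-process 4.x Service Bus extension; the layout differs in 5.x), per-instance concurrency for a Service Bus trigger is governed by maxConcurrentCalls in host.json, while the scale controller adds instances based on queue length:

{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 16
      }
    }
  }
}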
How does the concept of storage queue polling apply when an Azure Function is hosted under the consumption plan?
I get the principle of polling with classic hosted WebJobs functions, and I understand that the maximum polling interval of 1 minute can be overridden. However, in the case of Consumption plan hosting there is no app-level memory-resident process, so I assume Azure internals spin up the Function App via some other trigger beyond my control.
The motivation for this question is that I am trying to understand typical E2E function invocation propagation delays when an Azure hosted WebApp adds a message to a storage queue. In my case the WebApp, StorageQueue and pre-compiled function DLL will run in the same Azure region.
I need to cap Azure Function invocation delays to under 10 seconds with an average of <3 seconds.
Unfortunately this isn't possible on the consumption plan with the current polling model, as we poll your trigger resource every 10s to determine if there are new events requiring a function instance to be loaded/started.
If your function app runs frequently enough that it always has active instances (a new queue message every 5 min, for example) you can get the invocation delays that you want, as the instances themselves handle the polling.
The worst case (no function instances running) is ~10s polling + ~5s instance startup time to process a new event.
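One knob that does help once instances are warm: the host-level polling cadence is controlled by maxPollingInterval in host.json (the host backs off from 100 ms up to this value while the queue is idle; the default maximum is 1 minute). A sketch tightening it:

{
  "version": "2.0",
  "extensions": {
    "queues": {
      "maxPollingInterval": "00:00:02"
    }
  }
}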