I've been doing research lately around Azure Function cold start time that can occur with a consumption-based plan. I understand the concepts, but how do I actually measure the cold start time of my Azure Functions? I can't seem to find any good documentation on this.
In the Azure Portal, I see there is a "Monitor" tab for each of my functions, but the only statistic shown is "Duration (MS)" and it is unclear if this includes the start up time.
In general, are there better ways to monitor this?
I don't think that there is an official metric which shows cold start time.
I've been measuring it by running a function with a predictable "hot" execution time and then measuring the total latency from a client that calls the function. The client was located in the same region as the function under test.
Also, my function was returning the ID of an instance it was running on. The first response from each instance is definitely a cold start.
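The approach above could be sketched roughly like this in Python. This is only an illustration, not the exact code I used: the `instanceId` JSON field and the helper names are assumptions for the example, and your function would need to return its instance ID in whatever shape you choose.

```python
import json
import time
import urllib.request
from typing import Iterable, List, NamedTuple, Tuple


class Sample(NamedTuple):
    instance_id: str
    latency_ms: float


def timed_call(url: str, timeout: float = 60.0) -> Sample:
    """Call the function once and measure total latency as seen by the client."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    latency_ms = (time.perf_counter() - start) * 1000.0
    # Assumption: the function responds with JSON such as {"instanceId": "..."}.
    return Sample(body["instanceId"], latency_ms)


def split_cold_and_warm(samples: Iterable[Sample]) -> Tuple[List[Sample], List[Sample]]:
    """Treat the first response seen from each instance as a cold start."""
    seen = set()
    cold, warm = [], []
    for sample in samples:
        if sample.instance_id in seen:
            warm.append(sample)
        else:
            seen.add(sample.instance_id)
            cold.append(sample)
    return cold, warm
```

Collecting samples with `timed_call` over a period that includes idle gaps, then comparing the cold latencies against the typical warm latency, gives an estimate of the cold start overhead.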
I hope you find my blog posts on cold starts useful:
Cold Starts in Numbers
Cold Starts Beyond First Request
I have read through most of the questions that seem to be similar to what I'll ask, so hopefully I'm not wasting anyone's time.
We have a Function App in Azure Cloud that contains several Durable Functions.
One of these durable functions is an HTTP-triggered REST API call.
It will normally take between 0.5 - 3 seconds to execute fully (from call to done, delivered result). But sometimes it takes 20-35 seconds. I don't know why or how I can search for errors.
The durable function fetches information from a Cosmos DB and delivers the result back to the caller.
Function App, Durable Function and Cosmos DB are all located in the same Region. (Checked that).
The Durable Function is set to B2:2 and has toggled Always On to ON.
Is there something I miss or something I should check to make sure it runs smoother?
Log of executions of the app:
I greatly appreciate everyone's time and energy they put into helping me. Thanks a lot.
---- Additions to the post after posting ----
I have checked the interactive tool and, if I read it correctly, it shows a maximum execution time of 0.8 seconds and a maximum network lag of 6 seconds. That would confirm what I suspected before writing this post: Azure needs to cold start the function. But I have Always On enabled, so why?
It doesn't seem to take 30 seconds to complete the function. It seems to take less than 1 second to complete the function and up to a maximum of 6 seconds lag, but where are the other 23 seconds going in a 30 second call?
B2:2 is the service agreement I have with Azure. B2 is the second paid tier for test environments, with scaling to 2 instances (I have changed that to 3 after posting this).
Application Insights are on and no other dependencies are present except the Cosmos DB.
AFAIK in Azure Functions,
After 5 minutes of inactivity, the Function App goes into a cold state. Coming out of it adds a delay of around 10 seconds.
Even when the Function App is in a hot state, it can take an excessive amount of time to load the external libraries it references.
The performance of the code logic itself also contributes to slowness in Azure Functions.
There are a few steps for reducing cold-start times, particularly for functions with external libraries:
Running from a package file (setting WEBSITE_RUN_FROM_PACKAGE to 1) may reduce cold-start times, particularly for JavaScript functions with large npm package trees.
In the Azure Portal, go to Diagnose and solve problems > Troubleshoot Performance category to identify the causes of slowness.
Try the Always On feature, available in the App Service plan and Premium plan of Azure Functions, to prevent such issues.
For guidance on improving the performance and reliability of Azure Functions, please refer here.
If the issue still persists, please raise an incident with Microsoft Support to get the root cause and resolution.
Try fiddling around with maxQueuePollingInterval. It helped with our cold starts quite a bit.
I have a Python Azure Function (Linux Consumption Plan) that is being set up to run multiple HttpTriggers at various times throughout the day. It's possible for more than one of these triggers to execute at or around the same time as the other triggers. To avoid exceeding the 1.5 GB memory limit, I'd like to make sure only one function invocation is allowed to run at a time. Is there any way to achieve this?
Edit: After doing a little research, would this setting allow me to avoid concurrent executions of my HttpTriggers: https://learn.microsoft.com/en-us/azure/azure-functions/functions-app-settings#website_max_dynamic_application_scale_out?
If I set WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT to 1, would that mean only one invocation could run at a time and the other HttpTriggers would wait?
Check out the full description of WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT here. It says that:
WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT sets a maximum number of instances that a function app can scale to.
This limit is not yet fully supported - it does work to limit your
scale out, but there are some cases where it might not be completely
foolproof. We're working on improving this.
I believe this is not what you are looking for.
Logically, this is possible if we choose timer trigger times in such a way that they never collide within a day (24 hours). Please note that this depends on the business requirements of the function app's HttpTriggers, namely how frequently they need to run.
Another solution is to have separate consumption plans for different HttpTriggers. In fact, in this case, you will get a monthly free grant of 1 million requests and 400,000 GB-s of resource consumption per subscription in pay-as-you-go pricing, across all function apps in that subscription.
As mentioned above, there can be different solutions for this, you need to choose as per what suits you the best.
I have tried Azure Functions for the first time. Apart from a couple of problems that I found workarounds for, it was quite easy to develop and publish my function to Azure. I even tried preview features like durable entities and they work great. I am enthusiastic.
However, I have some concerns about the timings. My function is HTTP triggered and is called by another application. Most of the time the execution time is ~1 sec, which is great. Sometimes, and I don't know why, it takes up to 30 secs to execute the same function. Is this normal? Maybe some cold start? Or am I doing something wrong? I am a newbie, so I'd like the experts' opinion. I am using the consumption plan in West Europe.
Unfortunately for this application anything > 4 sec is not acceptable because it will cause an error in the caller reflected in turn to the end user.
Here you can see a screen capture of the logs with timings. Look at the crazy slow times at the bottom.
Is there any way to ensure the timing always stays within 4 secs?
This much variation would not be expected with cold start. Generally cold start is about 2-5 seconds and should only happen after a long period with no invocations. Also, the measurement here is just execution time and doesn’t include startup time. I’d recommend looking into the logs and adding traces to see if there’s a line of code it’s hanging on.
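As a rough sketch of what "adding traces" could look like in a Python function, the snippet below wraps each step in a timing context manager and logs when it starts and ends. The step names and the `time.sleep` calls are placeholders for illustration only, standing in for whatever your function actually does.

```python
import logging
import time
from contextlib import contextmanager
from typing import Dict, Iterator

logger = logging.getLogger("function_trace")


@contextmanager
def traced(step: str, timings: Dict[str, float]) -> Iterator[None]:
    """Log the start and end of a step and record its duration in milliseconds."""
    start = time.perf_counter()
    logger.info("start %s", step)
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        timings[step] = elapsed_ms
        logger.info("end %s after %.1f ms", step, elapsed_ms)


def handle_request() -> Dict[str, float]:
    """Hypothetical handler body: each block stands in for a real step."""
    timings: Dict[str, float] = {}
    with traced("load_input", timings):
        time.sleep(0.01)  # placeholder for parsing the request
    with traced("query_database", timings):
        time.sleep(0.02)  # placeholder for the slow dependency call
    return timings
```

If a "start" line appears in the logs without a matching "end" line for many seconds, that step is the one to investigate.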
The first step is to understand what happens when you hit an Azure Function endpoint, step by step:
Azure must allocate your application to a server with capacity,
The Functions runtime must then start up on that server,
Your code then needs to execute.
I don't know why it takes up to 30 secs to execute the same function. Is this normal? Maybe some cold start?
I think the answer is related to cold start, the following image represents what happens when you trigger a function app's endpoint (Source: Understanding serverless cold start):
I had similar issues when using the Consumption plan. A dedicated plan might be a solution for your case, since half a minute to warm up an endpoint is pretty bad. To keep the function warm, you can use the Premium plan, which promises the following:
When you're using the Premium plan, instances of the Azure Functions host are added and removed based on the number of incoming events just like the Consumption plan. Premium plan supports the following features: Perpetually warm instances to avoid any cold start
You can read about this further: Premium plan (preview)
Additional information:
Be careful with the mentioned option because the pricing might be different based on the following:
Instead of billing per execution and memory consumed, billing for the Premium plan is based on the number of core seconds, execution time, and memory used across needed and reserved instances. At least one instance must be warm at all times. This means that there is a fixed monthly cost per active plan, regardless of the number of executions.
I would consider at least for testing purposes the above mentioned option, I hope the answer helps and gives you the idea why you have slow startup.
Application Insights receives my function's logs with a delay of at least 3 minutes. I realised that the log-streaming service (available in each function) shows the logs in real time, but that's not very convenient. Is there any other way to get logs in real time?
Also, according to Azure, Application Insights performs its tasks in near-real time. With a 3-minute delay, I'm wondering how it can do its tasks in near-real time.
P.S: my function app is quite simple and doesn't do heavy task.
You could consider using "Live Stream" feature of Application Insights:
https://learn.microsoft.com/en-us/azure/azure-monitor/app/live-stream
We have a simple Azure Function that makes a DocumentDB query. It seems like the first time we call it there is a long wait to finish, and then successive calls are very fast.
For example, I just opened our app and the first Function call took 10760ms, definitely noticeable by any end user. After this, all Function calls take approximately 100ms to process and are nearly imperceptible.
It seems as though there is some "wake up" cycle in Azure Functions. Is there some way to minimize this, or better yet is this documented somewhere so we can understand what's really going on here?
Function apps running on a consumption plan do indeed have an idle time after which they effectively go to sleep. The next invocation is required to "wake them up" as you've observed and people have mentioned in the comments.
As to why this happens, it's so that Microsoft can most optimally distribute compute workloads in a multi-tenant environment while ensuring that you're only billed to the second for the time where your function is actually doing work. This is the beauty of serverless.
For workloads where this is not acceptable behavior, you could consider moving off of the consumption plan and on to the actual App Service plan. Alternatively, you could implement a timer triggered function that goes off every minute for example and use that as a "keep alive" mechanism by pinging the function that you don't want to go to sleep.
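A minimal sketch of the "keep alive" ping in Python is shown below. It is plain standard-library code, not tied to the Azure Functions SDK: in practice the call to `keep_alive` would sit in the body of a timer-triggered function, and the URL in the usage line is a hypothetical placeholder.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_get_status(url: str, timeout: float = 10.0) -> int:
    """Issue a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses still mean the host answered.
        return err.code


def keep_alive(url: str, fetch: Callable[[str], int] = http_get_status) -> bool:
    """Ping the target function; return True if it answered with a 2xx status."""
    try:
        status = fetch(url)
    except OSError:
        # Covers connection failures and timeouts (URLError subclasses OSError).
        return False
    return 200 <= status < 300
```

Usage from a timer handler might be as simple as `keep_alive("https://<your-app>.azurewebsites.net/api/<your-function>")`, logging a warning when it returns `False`.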