Azure Function App is very slow, but only sometimes - azure

I have read through most of the questions that seems to be similar to what I'll ask so hopefully I'm not wasting anyone's time.
We have a Function App in Azure Cloud that contains several Durable Functions.
One of these durable functions is a HTTP trigger API REST call.
It will normally take between 0.5 - 3 seconds to execute fully (from call to done, delivered result). But sometimes it takes 20-35 seconds. I don't know why or how I can search for errors.
The durable function fetches information from a Cosmos DB and delivers the result back to the caller.
Function App, Durable Function and Cosmos DB are all located in the same Region. (Checked that).
The Durable Function is set to B2:2 and has toggled Always On to ON.
Is there something I miss or something I should check to make sure it runs smoother?
Log of executions of the app:
I greatly appreciate everyone's time and energy they put into helping me. Thanks a lot.
---- Additions to the post after posting ----
I have checked the interactive tool and if I read that correctly it tells me a maximum execution time of 0.8 seconds and a maximum network lag of 6 seconds. That would indicate something that I suspected before I set up this post and that is that Azure needs to cold start the function. But I have always on toggled on so why?
It doesn't seem to take 30 seconds to complete the function. It seems to take less than 1 second to complete the function and up to a maximum of 6 seconds lag, but where are the other 23 seconds going in a 30 second call?
B2:2 is the service agreement I have with Azure. B2 is the test environments second paid state with 2 instances scaling (I have changed that to 3 after posting this).
Application Insights are on and no other dependencies are present except the Cosmos DB.

AFAIK in Azure Functions,
After 5 minutes of inactivity, Function App goes to the cold state. To come out of it, 10 seconds delay occurs.
Even the Function App is in Hot State, it will take some excessive amount of time to load the external libraries defined in it.
In the Function App, Code Logic Performance also matters the cause of slowness in the Azure Functions.
There are few steps for reducing the cold-start times particularly for the Functions having external libraries:
Running from a package file WEBSITE_RUN_FROM_PACKAGE to 1 may reduce cold-start times, particularly for JavaScript functions with large npm package trees.
From the Azure Portal > Diagnose and solve problems > Troubleshoot Performance category to identify the causes of slowness:
Try Always On Feature available in App Service Plan and Premium Plan of the Azure Functions to prevent such issues.
Regarding the Performance and reliability improving of Azure Functions, please refer here.
If this issue persists still, then please raise an incident with Microsoft Support to get the root cause and resolution.

Try fiddling around with maxqueuepollingintervall. It helped out with our cold starts quite a bit.

Related

Logic Apps Times Out Due To Long Running Azure Functions Being Called Inside It

I am currently working on supporting an old application which uses logic apps and azure functions.
The logic apps are on consumption plan and it times out frequently due to long running azure functions which in turn calls ms sql server using EF core.
Now, we don't want to spend much time on development as it will be sunset and migrated so azure durable functions, webhooks, and event bus is not being considered.
Are there any other ways to solve this which requires no major code changes?
We are planning to move from consumption to standard logic apps to increase the timeout from 2 minutes to 3.9 minutes.
Any pointers would be highly appreciated.
I had found one alternative that you can use until loop.
Until loops run until specific condition is true. Until loop requires 200 ok Response from the request. As per MS-Doc, it says:
This loop action definition sends an HTTP request to the specified URL .
As #skin suggested, you can use webhooks, durable functions which does exactly what you need (if development is not concern). And as you said you can change to Standard to increase timeout till 4 min.

Azure function timeout/fails to complete

G'day folks,
I'm having some issues with an Azure function that I'm hoping someone might be able to help with.
We have a relatively long-running process (3-4 mins) that is being triggered from a Service Bus message, and we were having issues with the function execution ending without error and then attempting to re-process. The time take for this to happen is less than all the timeout/lock duration settings we have configured. Watching the logs (log stream, for both file system and app insights) we see the last line of the previous execution, then it kicks straight into the next.
To determine whether it's service bus related, I've also tried executing the process via a blob trigger (the process uses the file as a data source anyway) but I'm seeing the same thing except I don't see the subsequent retries.
In both scenarios I don't see anything in App insights apart from the Trace records. I don't get an exception, or even a 'request' entry. (function logic is all enclosed in try/catch blocks btw)
So my question is - Is it possible to trap these scenarios so we can determine the root cause? Currently I've got nothing to go on to try and diagnose. These errors don't happen when running locally.
FWIW we've seen this issue happen during the execution of a third-party libraries (MS Graph and an OpenXMLPowerTools library) - as we're generating documents for upload into Sharepoint. Not sure if this is relevant.
Thanking you in advance,
Tim
May be this is because of the plan that you are using , If you're using the Consumption plan, the default timeout is 5 minutes, but you can increase it to a maximum of 10 minutes. The maximum timeout on a Premium plan is 60 minutes. You can set your timeout as long as you want if you have a dedicated App Service plan.
Also try configuring the timeout of your function app i.e by changing the value of functionTimeout in host.json of your function app.
You should have a look at durable functions.
They allows us to have long running processes, i.e. import/export tasks.
I was able to wrap a long running import process, which takes about 20 mins to run successfully.

ASP.NET WebApp in Azure using lots of CPU

We have a long running ASP.NET WebApp in Azure which has no real endpoints exposed – it serves a single functional purpose primarily reading and manipulating database data, effectively a batched, scheduled task, triggered by a timer every 30 seconds.
The app runs fine most of the time but we are seeing occasional issues where the CPU load for the app goes close to the maximum for the AppServicePlan, instantaneously rather than gradually, and stops executing any more timer triggers and we cannot find anything explicitly in the executing code to account for it (no signs of deadlocks etc. and all code paths have try/catch so there should be no unhandled exceptions). More often than not we see errors getting a connection to a database but it’s not clear if those are cause or symptoms.
Note, this is the only resource within the AppService Plan. The Azure SQL database is in the same region and whilst utilised by other apps is very lightly used by them and they also exhibit none of the issues seen by the problem app.
It feels like this is infrastructure related but we have been unable to find anything to explain what is happening so if anyone has any suggestions for where we should be looking they would be gratefully received. We have enabled basic Application Insights (not SDK) but other than seeing CPU load spike prior to loss of app response there is little information of interest given our limited knowledge of how to best utilise Insights.
According to your description, I thought of two points to troubleshoot your problem. First of all, you can track the running status of your program through the code, and put a log at the beginning and end of your batch scheduled tasks to record the status of each run. If possible, record request and response information and start and end information. This can completely record the time and running status of your task.
Secondly, you can record logs before the program starts database operations, and whether the database connection is successful. The best case is to be able to record, what business will trigger CPU load when operating, and track the specific operating conditions, in order to specifically analyze what causes the database connection failure.
Because you cannot reproduce your problem, you can only guess the cause of the problem. If you still can't find where the problem is through the above two points, then modify your timer appropriately, and let the program trigger once every 5 minutes instead of 30s.

Sometime my function take a looong time to execute

I have tried for the 1st time Azure Function, besides a couple of problems where I found a workaround, it was quite easy to develop and publish my function to Azure. I even tried preview features like durable entities and it works great, I am enthusiast.
However, I had some concerns with the timings. My function is http triggered, it's called by another application. Most of the time execution time is ~1sec which is great. Sometimes, I don't know why it takes up to 30 secs to execute the same function. Is this normal? Maybe some cold start? Or it's me doing something wrong? I am a newbie so I'd like the experts opinion. I am using consumption plan in w. Europe.
Unfortunately for this application anything > 4 sec is not acceptable because it will cause an error in the caller reflected in turn to the end user.
Here you can se a screen capture of logs with timings, look at the bottom what crazy slow times.
Any way to ensure timing always within 4 secs?
This much variation would not be expected with cold start. Generally cold start is about 2-5 seconds and should only happen if a long period of no invocations. Also the measurement here is just execution time, and doesn’t include startup time. I’d recommend looking into logs and adding traces to see if there’s a line of code it’s hanging on.
First step is to understand what happens once you hit one Azure Function endpoint, step by step:
Azure must allocate your application to a server with capacity,
The Functions runtime must then start up on that server,
Your code then needs to execute.
I don't know why it takes up to 30 secs to execute the same function. Is this normal? Maybe some cold start?
I think the answer is related to cold start, the following image represents what happens when you trigger a function app's endpoint (Source: Understanding serverless cold start):
I have similar issues once using Consumption plan. A dedicated plan might be a solution for your case, half minute to warm up an endpoint is pretty bad. To keep the function warm, you have a chance to use Premium plan which promises the following:
When you're using the Premium plan, instances of the Azure Functions host are added and removed based on the number of incoming events just like the Consumption plan. Premium plan supports the following features: Perpetually warm instances to avoid any cold start
You can read about this further: Premium plan (preview)
Additional information:
Be careful with the mentioned option because the pricing might be different based on the following:
Instead of billing per execution and memory consumed, billing for the Premium plan is based on the number of core seconds, execution time, and memory used across needed and reserved instances. At least one instance must be warm at all times. This means that there is a fixed monthly cost per active plan, regardless of the number of executions.
I would consider at least for testing purposes the above mentioned option, I hope the answer helps and gives you the idea why you have slow startup.

Is there a possibility that the timer trigger will be executed with a lot of delay?

First of all, the development environment is asp.net core 2.0 (azure functions preview).
A timer trigger was triggered after a significantly later time than the defined time. (About 3 hours and 20 minutes)
It is not a time zone problem.
Under normal circumstances, how much delays should be considered?
Timers in Azure Functions are accurate. If they fail to run or run with significant delay then it indicates there is some other underlying issue, either an application problem or a functions bug (v2 still has many bugs). Assuming you already have app insights enabled, look at the exceptions table for evidence of something going wrong before the timer stopped running.

Resources