I am using Azure Application Insight on App Service
As shown in the picture taken from Application Insight, sometimes the API will wait for long time before executing the real operation which is after 22 minutes
Any idea why it will "wait and do nothing"? Its just on and off getting this senario
Thanks
The issue may be caused by various factors, including
Please verify the Application Insights SDK version; if it is old, please upgrade it to the latest version.
It appears that ServerTelemetryChannel uses sync-over-async calls when disposing due to async calls in long-running operations. **GetAwaiter().The calling thread is also blocked by GetResult()**Find reference in the tutorial.
Related
I have a long running (node.js) orchestrator in Azure Function App that calls a couple hundred activity functions. Sometimes with a group of 5 or so running in parallel with context.df.Task.all. I find that it will run steadily for about two hours then the function app itself seems to abruptly stop. The logs stop displaying in the log stream. And the records in my database that the activity functions are supposed to be writing stop writing. There are no exceptions in the logs. It will remain paused or stalled like this indefinitely... until I restart the function app. Then it will come back to to life and resume where it stopped before for a time and then stop again.
Does this behavior sound familiar to anyone?
Should I update the extension bundle to [4.0.0, 5.0.0)
Could my storage account be the problem? Should I create a new one?
We are using the "Premium Plan", Could I be running up against a limit of some kind? If so what and what should I tell the IT team to increase.
As far as I know,
Should I update the extension bundle to [4.0.0, 5.0.0)
I believe this issue is not related to extension bundles because this is regarding on the usage compatible extensions, libraries, packages used in the Function App and extension bundle is versioned where each version comprises of Rich set of supported binding extensions to be installed based on the version of the Function App.
If any timeout value is defined in the host.json, make it as unbounded (-1) as the function project is deployed/hosting in the premium plan for the longer timeout duration of function executions.
Could my storage account be the problem? Should I create a new one?
Instead of creating a new account, you can increase the quota of the Storage account to 5 PiB.
If Storage account is in consideration, then make sure that both the function app and storage account are in same region to reduce latency issues.
Also, in production environment - it is better to allocate a separate storage account for each azure function app.
We are using the "Premium Plan", Could I be running up against a limit of some kind? If so what and what should I tell the IT team to increase.
Also, you mentioned in the question that the function app stalls, with no executions after stalling and works by restart from where it has paused. I have seen some points mentioned by Microsoft even the long running functions hosted in premium plan will stops with no executions like your scenario:
Refer to the MS Doc for more information.
I've had a couple of random instances in a 1 hour period, of the Azure Search service returning a connection timeout, it is being called from a .net core web application running as an Azure App Service.
App Insights has a dependency failure for the same time (a POST to /indexes('products')/docs/search.post.search?api-version=2019-05-06) with a response of "Faulted".
Any help/idea on why this happened and how I can prevent would be appreciated.
You could be attempting to retrieve too much data at once. Or you may have a throttling issue because of too much traffic. The reason for the timeout is not possible to determine without more context.
To avoid timeouts, you could optimize and resolve your root cause by reducing the response size, limiting the number of requests, or addressing your issue's root cause.
Also, consider implementing a retry mechanism with exponential backoff. See this thread for information: Azure Search .net SDK- How to use "FindFailedActionsToRetry"?
As Dan mentioned, it is recommended to use retries since failures due to network or many other reasons can happen and this will help improve your app availability. However, if you are seeing failures happen repeatedly or need more information then please open a support issue so the support team can investigate it further.
G'day folks,
I'm having some issues with an Azure function that I'm hoping someone might be able to help with.
We have a relatively long-running process (3-4 mins) that is being triggered from a Service Bus message, and we were having issues with the function execution ending without error and then attempting to re-process. The time take for this to happen is less than all the timeout/lock duration settings we have configured. Watching the logs (log stream, for both file system and app insights) we see the last line of the previous execution, then it kicks straight into the next.
To determine whether it's service bus related, I've also tried executing the process via a blob trigger (the process uses the file as a data source anyway) but I'm seeing the same thing except I don't see the subsequent retries.
In both scenarios I don't see anything in App insights apart from the Trace records. I don't get an exception, or even a 'request' entry. (function logic is all enclosed in try/catch blocks btw)
So my question is - Is it possible to trap these scenarios so we can determine the root cause? Currently I've got nothing to go on to try and diagnose. These errors don't happen when running locally.
FWIW we've seen this issue happen during the execution of a third-party libraries (MS Graph and an OpenXMLPowerTools library) - as we're generating documents for upload into Sharepoint. Not sure if this is relevant.
Thanking you in advance,
Tim
May be this is because of the plan that you are using , If you're using the Consumption plan, the default timeout is 5 minutes, but you can increase it to a maximum of 10 minutes. The maximum timeout on a Premium plan is 60 minutes. You can set your timeout as long as you want if you have a dedicated App Service plan.
Also try configuring the timeout of your function app i.e by changing the value of functionTimeout in host.json of your function app.
You should have a look at durable functions.
They allows us to have long running processes, i.e. import/export tasks.
I was able to wrap a long running import process, which takes about 20 mins to run successfully.
We have a long running ASP.NET WebApp in Azure which has no real endpoints exposed – it serves a single functional purpose primarily reading and manipulating database data, effectively a batched, scheduled task, triggered by a timer every 30 seconds.
The app runs fine most of the time but we are seeing occasional issues where the CPU load for the app goes close to the maximum for the AppServicePlan, instantaneously rather than gradually, and stops executing any more timer triggers and we cannot find anything explicitly in the executing code to account for it (no signs of deadlocks etc. and all code paths have try/catch so there should be no unhandled exceptions). More often than not we see errors getting a connection to a database but it’s not clear if those are cause or symptoms.
Note, this is the only resource within the AppService Plan. The Azure SQL database is in the same region and whilst utilised by other apps is very lightly used by them and they also exhibit none of the issues seen by the problem app.
It feels like this is infrastructure related but we have been unable to find anything to explain what is happening so if anyone has any suggestions for where we should be looking they would be gratefully received. We have enabled basic Application Insights (not SDK) but other than seeing CPU load spike prior to loss of app response there is little information of interest given our limited knowledge of how to best utilise Insights.
According to your description, I thought of two points to troubleshoot your problem. First of all, you can track the running status of your program through the code, and put a log at the beginning and end of your batch scheduled tasks to record the status of each run. If possible, record request and response information and start and end information. This can completely record the time and running status of your task.
Secondly, you can record logs before the program starts database operations, and whether the database connection is successful. The best case is to be able to record, what business will trigger CPU load when operating, and track the specific operating conditions, in order to specifically analyze what causes the database connection failure.
Because you cannot reproduce your problem, you can only guess the cause of the problem. If you still can't find where the problem is through the above two points, then modify your timer appropriately, and let the program trigger once every 5 minutes instead of 30s.
I am experiencing performance issues with my Azure application. Application Insights shows a hot path which resolves to a call to Microsoft.Azure.CloudConfigurationManager.GetSetting(). Here is the hot-path - you can see it waits for 5 seconds:
[![App insights hot path][1]][1] [1]: https://i.stack.imgur.com/CEPXw.png
Of note:
this code has been running for years without hiccup as far as I know
I recently added deployment slots to the application. Seeing as these have their own slot-settings, perhaps there is some kind of issue there??