I've inherited an ASP.NET MVC app that takes between twenty seconds and a minute to display every single page. The vast majority of that time is spent in ActiveDirectoryClient. It looks like the original author may have read this Q&A about how to check group membership.
Calling ActiveDirectoryClient.Users.Where(...).ExecuteAsync() takes 3-10 seconds on a good day. Calling IUserFetcher.MemberOf.ExecuteAsync() takes another 5-7 seconds. Basically, every use of ActiveDirectoryClient takes several seconds, and there are many of them.
I tried using ActiveDirectoryClient.IsMemberOfAsync(...), but this just consumes 1.5 GB of RAM and never returns. (By "never" I mean I waited five minutes before stopping the debugger.)
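For reference, the pattern in question presumably looks something like this (a sketch based on the Graph client library's usual usage, not the app's actual code; `upn` is a placeholder):

```csharp
// Each of these calls is a separate network round trip to the AAD Graph API.
IPagedCollection<IUser> userResult = await activeDirectoryClient.Users
    .Where(u => u.UserPrincipalName.Equals(upn))    // reportedly 3-10 s
    .ExecuteAsync();
IUser user = userResult.CurrentPage.First();

// Fetching group membership is yet another round trip.
IPagedCollection<IDirectoryObject> groups =
    await ((IUserFetcher)user).MemberOf.ExecuteAsync();  // reportedly 5-7 s
```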
I suspect that the problem is not with these bits of code but with some overall misconfiguration of Azure or the Graph client. So maybe this question isn't even on the right site. Where do I begin troubleshooting this?
Related
I have read through most of the questions that seem similar to mine, so hopefully I'm not wasting anyone's time.
We have a Function App in Azure Cloud that contains several Durable Functions.
One of these durable functions is an HTTP-triggered REST API call.
It normally takes between 0.5 and 3 seconds to execute fully (from call to delivered result), but sometimes it takes 20-35 seconds. I don't know why, or how to search for errors.
The durable function fetches information from a Cosmos DB and delivers the result back to the caller.
The Function App, Durable Function, and Cosmos DB are all located in the same region (I checked that).
The Function App runs on B2:2 and has Always On toggled ON.
Is there something I'm missing, or something I should check, to make it run more smoothly?
Log of executions of the app:
I greatly appreciate the time and energy everyone puts into helping me. Thanks a lot.
---- Additions to the post after posting ----
I have checked the interactive tool, and if I read it correctly, it reports a maximum execution time of 0.8 seconds and a maximum network lag of 6 seconds. That would confirm something I suspected before writing this post: that Azure needs to cold-start the function. But I have Always On toggled on, so why?
The function doesn't actually seem to take 30 seconds to complete; it seems to take less than 1 second to execute, plus a maximum of 6 seconds of lag. So where are the other 23 seconds going in a 30-second call?
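One way to pin down where the time goes is to compare the end-to-end request duration with the function's own execution time in Application Insights. A query along these lines (the 20-second threshold is just an example) would list the slow calls:

```
requests
| where timestamp > ago(1d)
| where duration > 20000        // duration is in milliseconds
| project timestamp, name, duration, operation_Id
| order by duration desc
```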
B2:2 is the service plan I have with Azure: B2 is the second paid tier of the Basic (test) environment, with scaling set to 2 instances (I have changed that to 3 since posting this).
Application Insights is on, and there are no other dependencies apart from the Cosmos DB.
AFAIK, in Azure Functions:
After about 5 minutes of inactivity, a Function App goes into a cold state, and coming out of it adds a delay of roughly 10 seconds.
Even when the Function App is in a hot state, it can take a significant amount of time to load the external libraries it references.
The performance of the function's own code logic can also be a cause of slowness in Azure Functions.
There are a few steps for reducing cold-start times, particularly for functions that use external libraries:
Running from a package file (setting WEBSITE_RUN_FROM_PACKAGE to 1) may reduce cold-start times, particularly for JavaScript functions with large npm package trees (see the CLI sketch after this list).
Use Azure Portal > Diagnose and solve problems > Troubleshoot Performance to identify the causes of slowness.
Try the Always On feature, available in the App Service plan and the Premium plan, to prevent such issues.
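For the first of those steps, the app setting can be applied from the Azure CLI, for example (the app and resource group names are placeholders):

```
az functionapp config appsettings set \
  --name my-function-app \
  --resource-group my-rg \
  --settings WEBSITE_RUN_FROM_PACKAGE=1
```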
For more on improving the performance and reliability of Azure Functions, please refer here.
If the issue still persists, please raise an incident with Microsoft Support to get the root cause and a resolution.
Try fiddling around with the queue polling interval (maxPollingInterval in host.json). It helped out with our cold starts quite a bit.
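In a v2+ host.json that setting lives under the queues extension and would look something like this (the two-second value is just an example):

```json
{
  "version": "2.0",
  "extensions": {
    "queues": {
      "maxPollingInterval": "00:00:02"
    }
  }
}
```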
I have tried Azure Functions for the first time, and aside from a couple of problems for which I found workarounds, it was quite easy to develop and publish my function to Azure. I even tried preview features like durable entities, and it works great; I am enthusiastic.
However, I have some concerns about the timings. My function is HTTP-triggered and is called by another application. Most of the time the execution time is ~1 sec, which is great. Sometimes, though, it takes up to 30 secs to execute the same function, and I don't know why. Is this normal? Maybe some cold start? Or is it me doing something wrong? I am a newbie, so I'd like the experts' opinion. I am using the Consumption plan in West Europe.
Unfortunately, for this application anything > 4 sec is not acceptable, because it causes an error in the caller that is reflected in turn to the end user.
Here you can see a screen capture of logs with timings; look at the crazy slow times at the bottom.
Any way to ensure timing always within 4 secs?
This much variation would not be expected from cold start. Generally a cold start takes about 2-5 seconds and should only happen after a long period with no invocations. Also, the measurement here is just execution time and doesn't include startup time. I'd recommend looking into the logs and adding traces to see if there's a line of code it's hanging on.
The first step is to understand what happens once you hit an Azure Function endpoint, step by step:
Azure must allocate your application to a server with capacity,
The Functions runtime must then start up on that server,
Your code then needs to execute.
I don't know why it takes up to 30 secs to execute the same function. Is this normal? Maybe some cold start?
I think the answer is related to cold start; the following image represents what happens when you trigger a function app's endpoint (Source: Understanding serverless cold start):
I had similar issues when using the Consumption plan. A dedicated plan might be a solution for your case; half a minute to warm up an endpoint is pretty bad. To keep the function warm, you have the option of the Premium plan, which promises the following:
When you're using the Premium plan, instances of the Azure Functions host are added and removed based on the number of incoming events just like the Consumption plan. Premium plan supports the following features: Perpetually warm instances to avoid any cold start
You can read about this further: Premium plan (preview)
Additional information:
Be careful with the mentioned option because the pricing might be different based on the following:
Instead of billing per execution and memory consumed, billing for the Premium plan is based on the number of core seconds, execution time, and memory used across needed and reserved instances. At least one instance must be warm at all times. This means that there is a fixed monthly cost per active plan, regardless of the number of executions.
I would consider the above-mentioned option, at least for testing purposes. I hope the answer helps and gives you an idea of why you have slow startups.
Wondering if any Azure experts out there can give me some suggestions: we have an App Service app running, and we have noticed that the first few requests (even with Always On enabled) can take a very long time to get a response.
The chart below shows what we observed: one can see that it initially takes up to 2 minutes, and afterwards we get more reasonable response times of a few milliseconds/seconds.
How can we make sure that it ALWAYS responds quickly? As a simple test, it is not doing anything processing intensive, just a few simple DB queries to check if a key exists.
At the beginning (the very first few minutes), Azure SQL databases run queries slowly due to reduced memory allocation. If you look at the plan of a query that runs slowly at first and then shows good performance, you will see that the query plan is the same. On the first run you may see query waits such as MEMORY_ALLOCATION_EXT, IO_QUEUE_LIMIT, or PAGEIOLATCH_SH.
After periods of no activity, failovers, or scaling up/down between tiers, memory allocation may be reduced and queries may show poor performance for the first few minutes.
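If you want to verify this, you can inspect the database's cumulative waits directly, along these lines:

```sql
-- Top cumulative wait types for this database; after a cold start or a
-- tier change, look for MEMORY_ALLOCATION_EXT, IO_QUEUE_LIMIT or
-- PAGEIOLATCH_SH near the top.
SELECT TOP (10) wait_type, wait_time_ms, waiting_tasks_count
FROM sys.dm_db_wait_stats
ORDER BY wait_time_ms DESC;
```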
Hope this helps
I'm about to start work on an API that will literally go from 0 RPS to a couple hundred thousand HTTP RPS essentially at once and run at that rate for ~2 mins. All processing of those 30 million requests must finish by the end of that 2-min period. This would happen 7 times a WEEK.
Going serverless with Azure Functions in Consumption Plan Hosting Mode sounds appealing. This document describes a scale controller that coordinates app instances, but it doesn't really discuss what I can expect from it for HTTP triggers. I can't find any info saying the scale controller will be able to respond in the time frame I'd need.
The best information I could find was this post saying it took nearly 8 mins to scale up in his tests.
Is this a bad use case for Azure Functions in consumption mode?
Obviously, spinning up a testing harness that is capable of issuing 30 million requests within 2 minutes is an undertaking of its own, and an expensive one. I'd like to learn from others who have already done so.
Based on my experience, this scenario is not properly covered by the Consumption plan. It can scale out to many instances, but not very rapidly; 2 minutes is way too fast a window to rely on.
I was mostly working with queues, not HTTP, but I saw delays of up to 40 minutes caused by the slow pace of scaling out.
If you can predict which 2 minutes are going to be heavily loaded, your best bet could be to provision the capacity with a script (or another Function).
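As a rough sketch, on a dedicated (or Premium) plan that could be as simple as scaling out shortly before the burst and back down afterwards (the plan and resource group names are placeholders):

```
# Scale out ahead of the known 2-minute burst...
az appservice plan update --name my-plan --resource-group my-rg --number-of-workers 20
# ...and scale back down once the burst is over.
az appservice plan update --name my-plan --resource-group my-rg --number-of-workers 2
```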
UPDATE: I've figured it out. See the end of this question.
I have an Azure App Service running four sites. One of the sites has two deployment slots in addition to the primary one. Recently I've been seeing really high CPU utilization for the App Service plan as a whole.
The dark orange line shows the CPU percentage. This is just after restarting all my sites, which brought it down to this level.
However, when I look at the CPU use reported by each site, it's really low.
The darker blue line shows the CPU time, which is basically nothing. I did this for all of my sites, and all the graphs look the same. Basically, it seems that none of my sites are causing the issue.
A couple of the sites have web jobs, so I took a look at the logs but everything is running fine there. The jobs run for a few seconds every few hours.
So my question is: how can I determine the source of this CPU utilization? Any pointers would be greatly appreciated.
UPDATE: Thanks to the replies below, I was able to get more detail into what was happening. I ended up getting what I needed from SCM / Kudu tools. You can get here by going to your web app in Azure and choosing Advanced Tools from the side nav. From the Kudu dashboard, choose Process Explorer. The value in the Total CPU Time column is not directly useful, because it's the time in seconds that the process has run since it started, which might have been minutes or days ago.
However, if you make a record of the value at intervals, you can look at the change over time, and one process might jump out at you. In my case, it was my WebJobs process. Every 60 seconds, this one process was consuming about 10 seconds of processor time, just within one environment.
The great thing about this Kudu dashboard is, if you can catch the problem while it is actually happening, you can hit the Start Profiling button and capture a diagnostic session. You can then open this up in Visual Studio and get some nice details about where the CPU time is being spent.
Just in case anyone else is seeing similar issues, I'll provide more details about my particular case. As I mentioned, my WebJobs exe was the culprit, and I found that all the CPU time was being spent in StackExchange.Redis.SocketManager, which manages connections to Azure Redis Cache. In my main web app, I create only one connection, as recommended. But since my WebJobs only run every once in a while, I was creating a new connection to Azure Redis Cache each time one ran, which apparently can lead to issues. I changed my code to create the Redis Cache connection once when the WebJob process starts up and to reuse that connection whenever an individual WebJob runs.
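The fix looks roughly like this (connection-string retrieval is simplified, and the setting name is a placeholder):

```csharp
using System;
using StackExchange.Redis;

public static class RedisConnection
{
    // Lazy<T> guarantees the multiplexer is created exactly once, on first
    // use, and is then shared by every WebJob invocation in this process.
    private static readonly Lazy<ConnectionMultiplexer> _lazy =
        new Lazy<ConnectionMultiplexer>(() =>
            ConnectionMultiplexer.Connect(
                Environment.GetEnvironmentVariable("RedisConnectionString")));

    public static ConnectionMultiplexer Connection => _lazy.Value;
}
```

Each individual WebJob then calls RedisConnection.Connection.GetDatabase() instead of opening its own connection.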
Time will tell if this really fixes the issue, but I think it will. When the problem occurred, it always fit the same pattern: after a few days of running fine, my CPU would slowly ramp up over the course of about 12 hours. My thinking is that each time a WebJob ran, it created a connection object; at first this didn't cause trouble, but as WebJobs kept running every hour or two, cruft built up until finally some critical threshold was met and the CPU usage took off.
Hope this helps someone out there. Best wishes!
Maybe you should go to the web app's SCM (Kudu) site?
https://%yourAppName%.scm.azurewebsites.net
There is a page there that shows all the processes currently running on your web app (something like Process Explorer).
You can also go to the support page (linked from the top-right corner of the SCM site).
You can find some more info about your app's performance there and take a memory dump (not for this particular problem, but it's useful for performance issues).
Based on your description, I assume you could leverage the Crash Diagnoser extension to capture dump files from your Web Apps and WebJobs when the CPU usage percentage is higher than a specific threshold, to isolate this issue. For more details, you can refer to this official blog.