Slow AAD Differential Query for - azure

The Azure AD differential query is fast when we query the difference between the current Azure AD state and a previous state no older than 30-60 minutes. But when we query against a state from a week or a month ago, it takes 10 minutes to return the changes, even if the directory is small and only 3-4 attributes changed during that period, which is very slow. Is this expected behavior? Are there any workarounds?

Based on my test, this is not expected behavior. My first differential query request started on 10/10/2016, and today the differential query REST call, tested with Fiddler, took about 30 seconds.
To narrow down the issue, I suggest calling another service, or calling this REST API from a different network, to make sure the issue is not caused by the network. I also recommend testing other Azure Graph REST calls to see whether the issue is specific to Azure Active Directory.

For sure it's not... I can query 21K users in 3-4 minutes over a 24 Mbit DSL line with partial properties (only those I want), and in less than 10 minutes with all properties (and the objects have almost all properties set, so deserialization is fully in effect).
Delta queries, a few seconds, always.
Are you using your own routine over a basic HTTP client or are you using classes provided in the MS.Azure.AD assembly?

Related

Azure Function App is very slow, but only sometimes

I have read through most of the questions that seem to be similar to what I'll ask, so hopefully I'm not wasting anyone's time.
We have a Function App in Azure Cloud that contains several Durable Functions.
One of these durable functions is an HTTP-triggered REST API call.
It normally takes between 0.5 and 3 seconds to execute fully (from call to delivered result). But sometimes it takes 20-35 seconds. I don't know why, or how I can search for errors.
The durable function fetches information from a Cosmos DB and delivers the result back to the caller.
Function App, Durable Function and Cosmos DB are all located in the same Region. (Checked that).
The Function App runs on a B2:2 plan and has Always On toggled to ON.
Is there something I miss or something I should check to make sure it runs smoother?
Log of executions of the app:
I greatly appreciate everyone's time and energy they put into helping me. Thanks a lot.
---- Additions to the post after posting ----
I have checked the interactive tool and if I read that correctly it tells me a maximum execution time of 0.8 seconds and a maximum network lag of 6 seconds. That would indicate something that I suspected before I set up this post and that is that Azure needs to cold start the function. But I have always on toggled on so why?
It doesn't seem to take 30 seconds to complete the function. It seems to take less than 1 second to complete the function and up to a maximum of 6 seconds lag, but where are the other 23 seconds going in a 30 second call?
B2:2 is the service plan I have with Azure: B2 is the second paid tier for test environments, with scaling to 2 instances (I changed that to 3 after posting this).
Application Insights is on, and there are no dependencies other than the Cosmos DB.
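The figures in the additions above can be laid out as a simple time budget. This is only a sketch of the arithmetic: the variable names are illustrative, and the numbers are the ones quoted in the post (a 30-second call, 0.8 seconds maximum execution time, 6 seconds maximum network lag).

```python
# Figures quoted in the post; the names are illustrative only.
total_call_s = 30.0      # observed slow call
max_execution_s = 0.8    # maximum execution time reported by the tool
max_network_lag_s = 6.0  # maximum network lag reported by the tool


def unexplained_seconds(total, execution, network_lag):
    """Time left over once execution and network lag are subtracted."""
    return total - execution - network_lag


gap = unexplained_seconds(total_call_s, max_execution_s, max_network_lag_s)
print(f"Unaccounted for: {gap:.1f} s of a {total_call_s:.0f} s call")
```

Whatever fills that gap happens outside the measured execution and the measured network lag, so it is worth checking anything that waits before the function body starts.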
AFAIK, in Azure Functions:
After 5 minutes of inactivity, the Function App goes into a cold state. Coming out of it causes a delay of about 10 seconds.
Even when the Function App is in a hot state, loading the external libraries it references can take an excessive amount of time.
The performance of the function's own code logic can also be a cause of slowness.
There are a few steps for reducing cold-start times, particularly for functions with external libraries:
Running from a package file (setting WEBSITE_RUN_FROM_PACKAGE to 1) may reduce cold-start times, particularly for JavaScript functions with large npm package trees.
In the Azure Portal, use Diagnose and solve problems > Troubleshoot Performance to identify the causes of slowness.
Try the Always On feature, available in the App Service plan and Premium plan of Azure Functions, to prevent such issues.
For improving the performance and reliability of Azure Functions, please refer here.
If the issue still persists, please raise an incident with Microsoft Support to get the root cause and a resolution.
Try fiddling around with maxQueuePollingInterval. It helped with our cold starts quite a bit.
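As a rough illustration of the last suggestion, the sketch below builds a host.json fragment that lowers the Durable Task queue polling interval. It assumes the commenter means the Durable Functions setting maxQueuePollingInterval and the Durable Functions 2.x layout (under extensions.durableTask.storageProvider); the exact location can differ between extension versions, and the 5-second value and the helper name are purely illustrative.

```python
import json


def host_json_with_polling_interval(interval="00:00:05"):
    """Build a host.json-style dict that sets the queue polling interval.

    Assumption: Durable Functions 2.x layout, where the setting sits under
    extensions.durableTask.storageProvider. The interval is in hh:mm:ss
    format, and the 5-second default here is only an example value.
    """
    return {
        "version": "2.0",
        "extensions": {
            "durableTask": {
                "storageProvider": {
                    "maxQueuePollingInterval": interval,
                }
            }
        },
    }


# Print the fragment so it can be compared against an existing host.json.
print(json.dumps(host_json_with_polling_interval(), indent=2))
```

A shorter interval means idle orchestrations are picked up sooner, at the price of more frequent storage polling, so it is worth checking the trade-off against the documentation for the extension version in use.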

EF Core 3.1.14 Recurring Cold Start

We have deployed a very simple .NET Core 3 Web API application to Azure. The application talks to a very simple SQL Server database that is also hosted in Azure. There are two main performance problems we are noticing.
All API calls go to DB for either read or write operations. The tables only contain 4 and 5 rows and the queries are just basic select and insert queries with no joins.
The first call to the API is very slow (30 seconds to query 1 record in a table of size 10). We added a timer and noticed that the DB call takes 99.99% of the time. So I used the Azure Data Studio Profiler and realized that the query only reached SQL Server after about 29.90 seconds, so the issue is not the query itself. Also, the second, third, and later queries are super fast and return in under 30 milliseconds, so the issue is not the connectivity between the Web App and the Azure SQL Database.
The bigger problem is that, if you stop calling the API for say 2-3 minutes and then do another call, then again the first query takes 30 seconds. But the subsequent queries are faster.
If this was only happening when w3wp.exe starts then I wouldn't be concerned but if the requests to the API stop coming for 2-3 minutes then again it is down. This is of concern.
We have Always ON set to Yes.
I tried collecting .NET Trace in Azure for the web app but this is giving me this weird error.
Here are the Nuget package versions installed in the VS solution related to EF.
Here is the SQL Server pricing tier.
Is there any other way to collect a trace for an Azure Web App? I really need to see the call stack of the code for those 30 seconds to move forward. I have access to KUDU etc.
Thanks.
UPDATE 3 - 8th May 2021
I have posted the answer to my own question. This may not be root cause for other people who face similar problem but at least 1 area to look into.
UPDATE 2 - 7th May 2021
After adding EF Core logging as suggested by Ivan, it turns out he is right: opening the connection is taking too long. Why is that? And how do I stop it from happening?
UPDATE 1 - 7th May 2021
Jason Pan - We are using App Service Plan and this is the only application hosted there. The plan is P1V2 (https://azure.microsoft.com/en-us/pricing/details/app-service/windows/).
Ivan Stoev - Yes since the .NET Trace is not working for some reason as explained in my question, we captured the App Insights Profiler Trace to capture the call stack and as per call stack it appears that the connection to the SQL server was opened after 30 seconds. So I made two changes in my code:
a. Removed IDisposable from our Repository class, which had our context injected through DI. Previously, I was calling Dispose on our context class inside the Dispose method.
b. I replaced services.AddDbContext with services.AddDbContextPool
I then wrote a test program to call the API method at random intervals of 2 to 4 minutes for 1 hour. Only 1 call took 30 seconds, and the remaining 21 took a few milliseconds.
But my next step is to run a 24-hour test (1 call every 2-7 minutes, for example) to see whether this was just a fluke or actually the solution.
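A test program like the one described above could be sketched roughly as follows. This is not the author's program: call_api is a hypothetical zero-argument callable that performs one request against the API under test, the 5-second "slow" threshold is an assumption, and sleep, clock and rng are injectable so that the loop can be exercised without real waiting.

```python
import random
import time


def run_probe(call_api, duration_s=3600, min_gap_s=120, max_gap_s=240,
              slow_threshold_s=5.0, sleep=time.sleep, clock=time.monotonic,
              rng=random):
    """Call `call_api` at random intervals and record how long each call takes.

    Returns (all_durations, slow_durations), where a call counts as slow
    when it takes at least `slow_threshold_s` seconds.
    """
    results = []
    elapsed = 0.0
    while elapsed < duration_s:
        started = clock()
        call_api()
        took = clock() - started
        results.append(took)
        # Wait a random 2 to 4 minutes (by default) before the next call.
        gap = rng.uniform(min_gap_s, max_gap_s)
        sleep(gap)
        elapsed += took + gap
    slow = [t for t in results if t >= slow_threshold_s]
    return results, slow


# Dry run with no real waiting: a stand-in API call and a no-op sleep.
results, slow = run_probe(lambda: None, sleep=lambda seconds: None)
print(f"{len(results)} calls, {len(slow)} slow")
```

For the 24-hour run, the same function could be called with duration_s=24 * 3600, max_gap_s=420 and a real request in call_api.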
OK, so I am posting an answer to my own question. It turned out that there was no issue with the web application, App Service plan, SQL Server, or Entity Framework. I took a network trace of my application and of one other application that doesn't have the issue, and opened both with Network Monitor. We noted that they take different paths. After looking into the IP addresses, we realized that the other application had a virtual network set up. You can see that by going to your App Service plan and clicking the Networking option in the left menu bar, then choosing the first option, for VNet. Once we configured the VNet, all responses came back in under 1 second.
There was one other oversight on my part. The Auth0 calls were also sometimes taking 14 seconds. And when I tried running tcpping google.com from KUDU, that sometimes timed out as well, although it was working fine for other web applications.

Create database within Azure SQL elastic pool takes almost 10 minutes to complete

We use Azure SQL databases and an elastic pool (level "Standard").
Usually the creation of a new customer database takes approximately 1-2 minutes but suddenly it started taking way longer (up to 10 minutes) and I have no idea why this is happening. I checked the pool in the Azure portal and everything seems fine. We are still far away from reaching the given limits (257/500 databases; ~11GB/200GB data size). Upscaling for a short period of time has no effect.
Is there anything else I can do?
I think there are some ongoing issues with Microsoft cloud services. Check whether your issue is related to that; if it is, your issue should be temporary.

Sometimes my function takes a looong time to execute

I have tried Azure Functions for the first time. Apart from a couple of problems that I found workarounds for, it was quite easy to develop my function and publish it to Azure. I even tried preview features like durable entities, and they work great. I am enthusiastic.
However, I have some concerns about the timings. My function is HTTP-triggered and is called by another application. Most of the time the execution time is ~1 sec, which is great. Sometimes, and I don't know why, it takes up to 30 secs to execute the same function. Is this normal? Maybe some cold start? Or am I doing something wrong? I am a newbie, so I'd like the experts' opinion. I am using the Consumption plan in West Europe.
Unfortunately for this application anything > 4 sec is not acceptable because it will cause an error in the caller reflected in turn to the end user.
Here you can see a screen capture of the logs with timings; look at the crazy slow times at the bottom.
Any way to ensure timing always within 4 secs?
This much variation would not be expected with cold start. Generally, cold start takes about 2-5 seconds and should only happen after a long period with no invocations. Also, the measurement here is just execution time and doesn't include startup time. I'd recommend looking into the logs and adding traces to see if there's a line of code it's hanging on.
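One way to add such traces is to wrap each step of the function in a small timer. The sketch below is generic Python, not tied to the Azure Functions SDK: handle_request and its three steps are hypothetical stand-ins for whatever the real function does.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("function-timing")


@contextmanager
def traced(step, timings=None):
    """Log how long the wrapped block takes; optionally record it in `timings`."""
    started = time.perf_counter()
    try:
        yield
    finally:
        took = time.perf_counter() - started
        if timings is not None:
            timings[step] = took
        log.info("%s took %.3f s", step, took)


def handle_request():
    """Hypothetical function body, split into traced steps."""
    timings = {}
    with traced("load input", timings):
        payload = {"id": 1}      # stand-in for parsing the HTTP request
    with traced("call dependency", timings):
        time.sleep(0.01)         # stand-in for a database or HTTP call
    with traced("build response", timings):
        response = {"ok": True, "id": payload["id"]}
    return response, timings


response, timings = handle_request()
# Name of the slowest step in this run.
print(max(timings, key=timings.get))
```

If one step accounts for most of a 30-second invocation, the log lines point at it directly; if all steps are fast, the delay is happening before the function body runs.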
First step is to understand what happens once you hit one Azure Function endpoint, step by step:
Azure must allocate your application to a server with capacity,
The Functions runtime must then start up on that server,
Your code then needs to execute.
"I don't know why it takes up to 30 secs to execute the same function. Is this normal? Maybe some cold start?"
I think the answer is related to cold start, the following image represents what happens when you trigger a function app's endpoint (Source: Understanding serverless cold start):
I had similar issues once when using the Consumption plan. A dedicated plan might be a solution in your case; half a minute to warm up an endpoint is pretty bad. To keep the function warm, you can use the Premium plan, which promises the following:
When you're using the Premium plan, instances of the Azure Functions host are added and removed based on the number of incoming events just like the Consumption plan. Premium plan supports the following features: Perpetually warm instances to avoid any cold start
You can read about this further: Premium plan (preview)
Additional information:
Be careful with the mentioned option because the pricing might be different based on the following:
Instead of billing per execution and memory consumed, billing for the Premium plan is based on the number of core seconds, execution time, and memory used across needed and reserved instances. At least one instance must be warm at all times. This means that there is a fixed monthly cost per active plan, regardless of the number of executions.
I would consider the option mentioned above, at least for testing purposes. I hope this answer helps and gives you an idea of why you have slow startup.

SQL Azure Premium tier is unavailable for more than a minute at a time and we're around 10-20% utilization, if that

We run a web service that gets 6k+ requests per minute during peak hours and about 3k requests per minute during off hours. Lots of data feeds compiled from 3rd party web services and custom generated images. Our service and code is mature, we've been running this for years. A lot of work by good developers has gone into our service's code base.
We're migrating to Azure, and we're seeing some serious problems. For one, we are seeing our Premium P1 SQL Azure database routinely become unavailable for 1-2 full entire minutes. I'm sorry, but this seems absurd. How are we supposed to run a web service with requests waiting 2 minutes for access to our database? This is occurring several times a day. It occurs less after switching from Standard level to Premium level, but we're nowhere near our DB's DTU capacity and we're getting throttled hard far too often.
Our SQL Azure DB is Premium P1 and our load according to the new Azure portal is usually under 20% with a couple spikes each hour reaching 50-75%. Of course, we can't even trust Azure's portal metrics. The old portal gives us no data for our SQL, and the new portal is very obviously wrong at times (our DB was not down for 1/2 an hour, like the graph suggests, but it was down for more than 2 full minutes):
Azure reports the size of our DB at a little over 12GB (in our own SQL Server installation, the DB is under 1GB - that's another of many questions, why is it reported as 12GB on Azure?). We've done plenty of tuning over the years and have good indices.
Our service runs on two D4 cloud service instances. Our DB libraries are all implementing retry logic, waiting 2, 4, 8, 16, 32, and then 48 seconds before failing completely. Controllers are all async, most of our various external service calls are async. DB access is still largely synchronous but our heaviest queries are async. We heavily utilize in-memory and Redis caching. The most frequent use of our DB is 1-3 records inserted for each request (those tables are queried only once every 10 minutes to check error levels).
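The retry schedule described above can be sketched as follows. This is an illustration, not the authors' library code: the function name, the set of exceptions treated as retryable, and the choice to make one final attempt after the 48-second wait are all assumptions.

```python
import time

# Wait schedule described in the post, in seconds.
RETRY_DELAYS_S = (2, 4, 8, 16, 32, 48)


def with_retries(operation, delays=RETRY_DELAYS_S, sleep=time.sleep,
                 retry_on=(ConnectionError, TimeoutError)):
    """Run `operation`, waiting out each delay after a failure.

    Once the delays are exhausted, one last attempt is made and any
    exception from it propagates to the caller.
    """
    for delay in delays:
        try:
            return operation()
        except retry_on:
            sleep(delay)
    return operation()


# Demonstration with a database that never comes back and no real waiting.
waited = []


def always_down():
    raise ConnectionError("database unavailable")


try:
    with_retries(always_down, sleep=waited.append)
except ConnectionError:
    print(f"gave up after waiting {sum(waited)} s in total")
```

With this schedule the waits add up to 110 seconds, so a request that arrives at the start of a 1-2 minute outage can spend most of that time in retries before it either succeeds or fails completely.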
Aside from batching up those request logging inserts, there's really not much more give in our application's db access code. We're nowhere near our DTU allocation on this database, and the server our DB is on has like 2000 DTU's available to be allocated still. If we have to live with 1+ minute periods of unavailability every day, we're going to abandon Azure.
Is this the best we get?
Querying stats in the database seems to show we are nowhere near our resource limits. Also, on premium tier we should be guaranteed our DTU level second-by-second. But, again, we go more than an entire solid minute without being able to get a database connection. What is going on?
I can also say that after we experience one of these longer delays, our stats seem to reset. The above image was a couple minutes before a 1 min+ delay and this is a couple minutes after:
We have been in contact with Azure's technical staff and they confirm this is a bug in their platform that is causing our database to go through failover multiple times a day. They stated they will be deploying fixes starting this week and continuing over the next month.
Frankly, we're having trouble understanding how anyone can reliably run a web service on Azure. Our pool of Websites randomly goes down for a few minutes a few times a month, taking our public sites down. If our cloud service returns too many 500 responses something in front of it is cutting off all traffic and returning 502's (totally undocumented behavior as far as we can tell). SQL Azure has very limited performance and obviously isn't ready for prime time.
