How does one diagnose performance (response time) bottlenecks in Azure?
I've got a .NET Core website on Azure that consists of a web app service and one SQL database.
I've set up a load test and run it from the cloud against the website. The load test agents are configured as follows:
4 cores
start at 10 simultaneous users, + 10 every 20 seconds, up to 150 users
5 second think time between requests
Web app resource allocation is as follows:
2 instances, each with 4 cores and 7 GB RAM (S3 Standard)
This image shows the hardware utilization during the load test (2 tests shown, around 1pm and 1:30pm)
Seems reasonable, except my response times are, in my opinion, too slow, considering the hardware isn't stressed at all. For instance, at 10 users, my response time starts at 20ms, but at 150 users (at the end of the test), I'm seeing 5 second response times.
For the last portion of the test, my requests per second were at about 50.
Database performance, at 100DTUs, doesn't seem to be a factor:
What else can I do to diagnose slow response times? If the web server hardware isn't pegged, and the database isn't even sneezing, what other knobs can I turn on Azure?
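One quick sanity check on the load test numbers is the interactive response time law from queueing theory: throughput = users / (response time + think time). The sketch below is not from the original post. The function names are illustrative, and it assumes each simulated user sends one request, waits for the response, and then thinks for 5 seconds before the next one.

```python
import math


def users_at(t_seconds, start=10, step=10, step_every=20, max_users=150):
    """Simulated users active t_seconds into the ramp from the question."""
    steps_done = math.floor(t_seconds / step_every)
    return min(max_users, start + step * steps_done)


def closed_loop_throughput(users, response_time_s, think_time_s):
    """Interactive response time law: X = N / (R + Z), in requests per second."""
    return users / (response_time_s + think_time_s)


if __name__ == "__main__":
    # The ramp reaches its 150-user ceiling 280 seconds into the test.
    print(users_at(280))
    # Requests per second expected at full load under the stated assumption
    # (one request per user per think-time cycle).
    print(closed_loop_throughput(150, response_time_s=5.0, think_time_s=5.0))
```

Under that assumption, 150 users with a 5 second response time and a 5 second think time would produce about 15 requests per second, well below the roughly 50 reported. One possible reading is that each test step issues several HTTP requests, or that the think time is not applied between every request, so it may be worth confirming what the load test actually sends before tuning Azure.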
Long response times can have many causes, for example bandwidth restrictions, limited resources, poor application design, or dependencies on tightly coupled components. For more information about how to troubleshoot slow web app performance, please refer to the document. Here are some snippets from it.
Enable diagnostics logging for your web app.
Web Apps provides diagnostic functionality for logging information from both the web server and the web application.
For web server diagnostics, we can enable Detailed Error Logging, Failed Request Tracing, and Web Server Logging.
Use Kudu Debug console (https://. scm.azurewebsites.net/)
Kudu provides the environment settings for your application, the log stream, and diagnostic dumps.
We can also use Azure Application Insights to monitor the usage and performance of our app. It gives more detail about requests, exceptions, response times, and so on.
The more detail we have about application exceptions, failed requests, server logs, and application logs, the easier the problem is to diagnose.
There are also some related articles about how to diagnose web apps and how to use Application Insights:
Enable diagnostics logging for web apps in Azure App Service
Monitor performance in web applications
Diagnose exceptions in your web apps with Application Insights
Using Search in Application Insights
Try the New Relic extension.
It provides great insight into response times and a lot more, even with the free account.
You can also enable Application Insights on the web app. It will provide you with response times and other details.
Related
I have deployed two Azure web apps containing web jobs that perform sending of emails. It's the same code deployed with minor web.config changes and pointing to different dbs with the same structure. They use the same SMTP channel (smtp.office365.com, port 587), and server A is on a higher spec and takes 6 seconds to send an email, and server B is on a lower spec and sends an email in under a second. Both are located in South Africa North. The performance measurement is strictly around the sending of the email, so it can't be a db issue.
Both servers are operating well and I can't see any obvious performance issues. The times taken to send emails are consistently around the same speed throughout the day.
Where should I look to resolve this difference?
You can follow these troubleshooting steps to understand the web app's slow performance:
Yes, it may not be a database issue.
Recheck the minor web.config changes that point the two apps at different databases.
Troubleshoot the slow performance on the web apps themselves.
Useful steps include Service Health checks, monitoring the Azure WebJobs, and reviewing metrics.
Check whether any application code or database queries are inefficient.
You can enable application diagnostics and use the Application Insights Profiler.
Use AutoHeal, which recycles the worker process (WP) based on settings you choose, such as configuration changes or the time taken to execute a request. You can also restart the web app manually from the portal; if you need that to happen automatically, AutoHeal is the way to do it.
To automate this, add triggers to the web.config file in the root of your web app.
Also confirm that sending the emails does not depend on the database, that is, that you are not fetching data from the database before sending each email.
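To see where the extra seconds go, it can help to time each stage of the SMTP conversation separately. The sketch below is an assumption about how one might do that, not the poster's code: `stage` and `time_smtp_stages` are hypothetical helper names, it uses Python's standard `smtplib` rather than the WebJob's own mail library, and it stops short of sending a message. The same staging idea applies if you wrap the equivalent calls in the WebJob itself.

```python
import smtplib
import time
from contextlib import contextmanager


@contextmanager
def stage(timings, name):
    """Record how long the wrapped block takes, in seconds, under timings[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = time.perf_counter() - start


def time_smtp_stages(host, port, user, password, timeout=30):
    """Time connect, STARTTLS, login, and one round trip against an SMTP server."""
    timings = {}
    with stage(timings, "connect"):
        server = smtplib.SMTP(host, port, timeout=timeout)
    try:
        with stage(timings, "starttls"):
            server.ehlo()
            server.starttls()
            server.ehlo()
        with stage(timings, "login"):
            server.login(user, password)
        with stage(timings, "noop_round_trip"):
            server.noop()
    finally:
        server.close()
    return timings
```

Running something like `time_smtp_stages("smtp.office365.com", 587, user, password)` from both environments and comparing the dictionaries would show whether the slow side loses its time in connecting (DNS or network), in the TLS handshake, or in authentication.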
We had a problem on the 24th of February between 12:35 and 12:50 pm (UTC). Our app started to respond slowly, which led to failures, but Availability and Performance did not show our app as overloaded.
Could the change in system memory and CPU have slowed down our app's responses? If so, how can we avoid this problem next time?
Here is a screenshot of our app slowdown chart: Web App Slow
During this time we detected a change in physical memory and CPU: Memory Analysis, High CPU Analysis
The Availability and Performance tools detected the top 5 slowest request executions. They are requests from our app service to external services. Could these requests affect the overall performance of the app service or the App Service plan?
Yes. Low compute resources can lead to slow performance (for example, when the application is using a lot of memory or CPU).
Other causes of performance issues include application-level problems, network requests taking a long time, inefficient application code or database queries, or the application crashing due to an exception. To isolate and avoid such issues in the future, you may try these steps.
First, review the service health for any issues reported during that time frame:
You can track the health of the service on the Azure portal
From the screenshots you have shared, it looks like there were 2 5xx errors and 48 4xx errors. You may review the logs for more details on the issue.
-Access Kudu (https://.scm.azurewebsites.net/) to analyze logs and collect diagnostic dumps as required.
Enable diagnostics logging for apps in Azure App Service
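Once web server logging is enabled, the 4xx and 5xx counts mentioned above can be broken down from the raw logs. The following is a minimal sketch, assuming the logs are in W3C extended format with a `#Fields:` header that includes `sc-status`; the function name and the sample lines are made up for illustration.

```python
from collections import Counter

# Made-up sample lines in W3C extended log format, for illustration only.
SAMPLE = """#Software: Microsoft Internet Information Services
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2000-01-01 12:36:01 GET /api/orders 200 120
2000-01-01 12:37:15 GET /api/orders 500 30015
2000-01-01 12:38:02 GET /missing 404 15
""".splitlines()


def tally_status_classes(log_lines):
    """Count responses per status class (2xx, 4xx, 5xx, ...)."""
    fields = []
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the data lines that follow it.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or "sc-status" not in fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed lines rather than guess
        status = values[fields.index("sc-status")]
        if status.isdigit():
            counts[status[0] + "xx"] += 1
    return counts


if __name__ == "__main__":
    print(tally_status_classes(SAMPLE))
```

Extending the tally to group by `cs-uri-stem` or to sum `time-taken` would show which URLs produced the errors and the slowest responses during the incident window.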
Typically, in Azure App Service, you can increase performance and throughput by adjusting the scale at which you are running your application. I'm not sure which App Service Plan (ASP) you're using.
If you have multiple apps under a single ASP, the compute resources are shared by all of those running apps. Depending on your requirements and usage, you may consider either moving your App Service plan to a higher pricing tier or scaling out to more instances.
I am doing performance testing for my API hosted in Azure App Service. My API response time is increasing whenever there is a spike in the Working Memory Set graph. But my app service plan is showing 50-55% of memory with only one instance running.
Can you clarify why the API response time increases every time there is a spike in the 'Working Memory Set', even though my App Service plan memory is only around 50%?
Response Time Graph
Working Memory Set Graph
This is tough to answer without access to your site logs, but in general your API consumes resources when it is called and then rests again, similar to how your heart rate increases when you start running and should return to normal when you stop.
Is the response time not within your allowed time range? What is the experience from the customer side? Performance testing on cloud environments can be a slippery slope.
I would suggest using the built-in Diagnose and Solve blade of your web app, which has the same troubleshooting tools a support engineer would use to assist you in a paid technical support ticket. This should help tell you whether there are any issues with your site that might be impacting performance.
Also, please note that if you're running on the free or shared tier that perf testing is not really applicable as we do not support running production apps on those tiers.
I know this has been asked before, but I tried all known solutions and still no luck. I have a request that returns roughly 26MB of JSON. It is returning a 502 on my azure web app. I have set maxRequestLength and maxAllowedContentLength to their max allowed values as detailed here.
How to set the maxAllowedContentLength to 500MB while running on IIS7?
I have also set the applicationHost.xdt on the site folder of my webapp and verified it is applied as detailed here.
ApplicationHost.xdt in Azure Web Apps
Nonetheless, my request times out at exactly 4 minutes every time. I can run the same request against my localhost running on IIS Express pointed at the Azure SQL database and it returns the data, so I know this is something specific to the Azure web app.
I have enabled all types of logging in "App Service Logs" section of my webapp. I see other failed request traces for 401 when session expires, but this request doesn't log a failed request trace, or an application error. In live log stream it shows the request as a 200 response in the web server logs.
Any other ideas?
Thanks for the detailed question and for sharing the solutions you have already tried. I'm not sure whether the "Always On" feature is turned on for your web app. This kind of time-out error may occur when it is off, so kindly enable it and let us know the result for further investigation.
Additionally, Azure Load Balancer has a default idle timeout of approximately four minutes (230 seconds). This is a general idle request timeout that causes clients to be disconnected after 230 seconds; however, the command continues running server-side after that. For a typical scenario, this is generally a reasonable response time limit for a web request. In such scenarios, you could look at async methods to run additional reports. WebJobs or Azure Functions are other options.
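The async approach usually means the first request only starts the work and returns a job id, and the client then polls with short requests, none of which comes close to the 230 second limit. The sketch below simulates that pattern in a single process. `JobStore`, `poll_until_done`, and `slow_report` are hypothetical names, and in a real app the work would run in something like a WebJob or Azure Function with job state kept in durable storage rather than in memory.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobStore:
    """In-memory stand-in for a background job service (assumption, not an Azure API)."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def submit(self, fn, *args):
        """Start fn in the background and return a job id at once (like an HTTP 202)."""
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = self._pool.submit(fn, *args)
        return job_id

    def status(self, job_id):
        """Return (state, result); result is None while the job is still running."""
        future = self._jobs[job_id]
        if not future.done():
            return "running", None
        return "done", future.result()


def poll_until_done(store, job_id, interval_s=0.05, timeout_s=5.0):
    """Client side: repeat short status checks instead of holding one long request open."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        state, result = store.status(job_id)
        if state == "done":
            return result
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} still running after {timeout_s}s")


def slow_report(rows):
    time.sleep(0.2)  # stand-in for a report that would take minutes
    return {"rows": rows}


if __name__ == "__main__":
    store = JobStore()
    job_id = store.submit(slow_report, 3)
    print(poll_until_done(store, job_id))
```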
If the ‘Always On’ setting is not turned on, please turn it on. Always On helps keep the app loaded even when there is no traffic by sending a request to the root of your application. Whatever file is delivered when a request is made to / is the one that will be warmed up. This feature comes with the App Service plan and is not charged separately.
1) From the Azure Portal, go to your WebApp.
2) Select Settings > Configuration > General settings.
3) For Always On, select On.
What would be the best way to monitor when our Azure web app is being unloaded when no requests have been made to the web app for a certain amount of time?
Enabling Logstream for the web server doesn't seem to reveal anything of use..
Any hints much appreciated!
You can use Azure Application Insights to create a web test that will alert you when the site is not available anymore. It will ping your site from the data centers you select and perform some action you select (mail, webhook, etc).
However, if you want your web app to stay online, you could upgrade its plan to at least Basic and enable Always On under settings.
In addition to kim's response:
If you are running your web app in the Standard pricing tier, Web Apps lets you monitor two endpoints from three geographic locations.
Endpoint monitoring configures web tests from geo-distributed locations that test response time and uptime of web URLs. The test performs an HTTP GET operation on the web URL to determine the response time and uptime from each location. Each configured location runs a test every five minutes.
Uptime is monitored using HTTP response codes, and response time is measured in milliseconds. A monitoring test fails if the HTTP response code is greater than or equal to 400 or if the response takes more than 30 seconds. An endpoint is considered available if its monitoring tests succeed from all the specified locations.
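The pass/fail rule described above can be written down as a small sketch. The function names and location labels are made up for illustration; only the thresholds (status code 400 or higher, or more than 30 seconds) come from the text.

```python
def check_passed(status_code, elapsed_s, max_elapsed_s=30.0):
    """A monitoring test fails on HTTP status >= 400 or a response slower than 30 s."""
    return status_code < 400 and elapsed_s <= max_elapsed_s


def endpoint_available(results_by_location):
    """The endpoint counts as available only if every configured location passed.

    results_by_location maps a location label to (status_code, elapsed_seconds).
    An empty mapping is treated as unavailable, since nothing was verified.
    """
    if not results_by_location:
        return False
    return all(
        check_passed(code, secs) for code, secs in results_by_location.values()
    )


if __name__ == "__main__":
    # Hypothetical results from three test locations.
    results = {
        "location-a": (200, 0.4),
        "location-b": (200, 1.1),
        "location-c": (503, 0.2),
    }
    print(endpoint_available(results))
```

Because a single failing location marks the endpoint as unavailable, a regional network problem can trigger an alert even when the app itself is healthy, which is worth keeping in mind when reading the results.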
Web Apps also provides you with the ability to troubleshoot issues related to your web app by looking at HTTP logs, event logs, process dumps, and more. You can access all this information using our Support portal at http://.scm.azurewebsites.net/Support
The Azure App Service support portal provides you with three separate tabs to support the three steps of a common troubleshooting scenario:
-Observe current behavior
-Analyze by collecting diagnostics information and running the built-in analyzers
-Mitigate
If the issue is happening right now, click Analyze > Diagnostics > Diagnose Now to create a diagnostic session for you, which collects HTTP logs, event viewer logs, memory dumps, PHP error logs, and PHP process report.
Once the data is collected, the support portal runs an analysis on the data and provides you with an HTML report.
In case you want to download the data, by default, it would be stored in the D:\home\data\DaaS folder.
Hope this helps.