Azure - App Availability percentage is Zero - azure

our Api app is in UAT on Azure with service plan (Standard 3 large). What should we do if App Availability is Zero. It is getting slow response or timeout issue. When i restart the application it is up to normal. (We are using Parallel Language programming.(Async/Await)
How to find the route cause from it for slowness issue.

Ensure that Always On feature is enabled.
Such problems may be caused by application level issues, such as:
network requests taking a long time
application code or database queries being inefficient
application using high memory/CPU
application crashing due to an exception
You could enable web server diagnostics to fetch more details on the issue.
Detailed Error Logging - Detailed error information for HTTP status codes that indicate a failure (status code 400 or greater). This may contain information that can help determine why the server returned the error code.
Failed Request Tracing - Detailed information on failed requests, including a trace of the IIS components used to process the request and the time taken in each component. This can be useful if you are attempting to improve web app performance or isolate what is causing a specific HTTP error.
Web Server Logging - Information about HTTP transactions using the W3C extended log file format. This is useful when determining overall web app metrics, such as the number of requests handled or how many requests are from a specific IP address.
Also, Azure Application Insights collects telemetry from your application to help analyze its operation and performance. You can use this information to identify problems that may be occurring or to identify improvements to the application that would most impact users. This tutorial takes you through the process of analyzing the performance of both the server components of your application and the perspective of the client: https://learn.microsoft.com/en-us/azure/application-insights/app-insights-tutorial-performance
Ref: https://learn.microsoft.com/en-us/azure/app-service/app-service-web-troubleshoot-performance-degradation

Related

Azure slow communication between APIs

In some 1-5% of our requests, we are seeing slow communication between APIs (REST API requests). Both APIs are developed by us and hosted on Azure, each app service on its own app service plan in the same region, P1v2 tier.
What we are seeing on application insights is that POST or GET requests on origin API can take a few seconds to execute, while real execution time on destination API is only a few milliseconds.
Examples (first line POST request on origin, second execution time on destination API): slow req 1, slow req 2
Our best guess is that the time difference is lost in communication between components. We don't have an explanation for it since the payload is really small and in most cases, communication takes less than 5 milliseconds.
We dismiss the possible explanation it could be due to component cold start since it happens during constant load and no horizontal scaling was performed.
Do you have any idea what might cause it or how to do additional analysis in order to discover it?
If you're running multiple sites on the App Service Plan, then enable the "Always On" setting for your web app > All Settings > Application Settings > Click on Always On
See here for details: https://azure.microsoft.com/en-us/documentation/articles/web-sites-configure/
When Always On is off, the site is shut down after 20 minutes of inactivity to free up resources for any additional websites that might be using the same App Service Plan.
The amount of information it needs to collect, process and then present itself requires some time, and involve internal calls as well, that is why considering the server load and usage, it takes around 6 to 7 seconds sometimes even more.
To Troubleshoot that latency, try this steps, provided by Microsoft.

Why I got an error request every 5 minutes in an Azure App Service

I have a java web app on Azure, and I got failed requests in it's Application Insights. It look likes someone are calling 'http://myApp.azurewebsites.net/error' every 5 minutes, but I do not have this interface, so there are many failed requests with 404 in Application Insights. Then I add this interface in app, but there are still many failed requests with 404 code. I have no idea about those requests, I do not know where are them from or what do them want to do. Did I set wrong configurations in my app?
There is a setting named 'Always on' in App Service's configuration, and it's works fine when I turned off this setting.
To narrow down this issues, you can enable the Diagnostic log for your web apps. Web Server Diagnostic logging helps you to trace the exception details originate from components. And if you suspect error comes from your application then "Application Diagnostic" is the source to trace the reason for errors.
Also, Enable the log stream on your web app so irrespective of peak or off peak hours, you can monitor the live log stream , how your web app performs and respond to each request.
It's caused by "Always On" being ON under the Configuration / General settings of your AppService.
As per the docs:
Always On: Keeps the app loaded even when there's no traffic. When Always On is not turned on (default), the app is unloaded after 20 minutes without any incoming requests. The unloaded app can cause high latency for new requests because of its warm-up time. When Always On is turned on, the front-end load balancer sends a GET request to the application root every five minutes. The continuous ping prevents the app from being unloaded.
To mitigate the impact, you can add a controller/ action that handles the default route.

Azure App service returns 502 bad gateway from HttpClient

I have an app service (plan B2) running on Azure.
My integration tests running from docker container are calling some app service endpoints one by one and sometimes receive 500 or 502 error.
When I debug tests I make some pauses between calls and all requests work successfully. Also, when I scale up my app service, everything works properly.(I don't want to scale up because cpu and other params are low.)
In my tests I have only one HttpClient and I dispose it at the end so I don't think there should be any connections leaks.
Also, in TCP Connections I have around 60 total connections while in Azure docs the limit is 1,920.
This app is not accessed by any users but here it says that I had the maximum connections. Is there any way how can I track these connections? Why when I receive these 5xx errors I don't see anything in app insights? Also how 15 connections can exceed the limit when the limit is 1920? Are these connections related to my errors and how they can be fixed?
You don't see them in Application Insights because they're happening at IIS level which is breaking the request, and because of that, data is not being sent to Application Insights.
The place to look for information is "Diagnose and solve problems", then "Availability and Performance". More info in here:
https://learn.microsoft.com/en-us/azure/app-service/overview-diagnostics
PS: I do think the problem is related to the Dispose of your HTTPClient. It's a well known issue and the reason why they've introduced HttpClientFactory. More info in here:
https://www.stevejgordon.co.uk/httpclient-creation-and-disposal-internals-should-i-dispose-of-httpclient
https://stackoverflow.com/a/15708633/1384539

Load test on Azure

I am running a load test using JMeter on my Azure web services.
I scale my services on S2 with 4 instances and run JMeter 4 instances with 500 threads on each.
It starts perfectly fine but after a while calls start failing and giving Timeout error (HTTP status:500).
I have checked HTTP request queue on azure and found that on 2nd instance it is very high and two instances it is very low.
Please help me to success my load test.
I assume you are using Azure App Service. If you check the settings of your App, you will notice ARR’s Instance Affinity will be enabled by default. A brief explanation:
ARR cleverly keeps track of connecting users by giving them a special cookie (known as an affinity cookie), which allows it to know, upon subsequent requests, to which server instance they were talking to. This way, we can be sure that once a client establishes a session with a specific server instance, it will keep talking to the same server as long as his session is active.
This is an important feature for session-sensitive applications, but if it's not your case then you can safely disable it to improve the load balance between your instances and avoid situations like the one you've described.
Disabling ARR’s Instance Affinity in Windows Azure Web Sites
It might be due to caching of network names resolution on JVM or OS level so all your requests are hitting only one server. If it is the case - add DNS Cache Manager to your Test Plan and it should resolve your issue.
See The DNS Cache Manager: The Right Way To Test Load Balanced Apps article for more detailed explanation and configuration instructions.

NServicebus: Programmatic reading of error queue

I’m currently building an application using NServicebus and Azure.
The regular processes are working, but now I’d like to do more about the management and monitoring aspect of the application.
The customer wants to see a dashboard where he can see the health of the application and also be able to correct issues.
What I’d like to do is:
Detect when things are sent to an error queue (to be able to send an alert to an admin)
Allow admin to handle messages on error queue from management application, without
resorting to the provided command line tool.
Is there a way to programmatically do error handling in NServicebus? I know which errors are transient and which errors might need manual intervention.
Is it possible to plug in logic to the error handling logic of nservicebus?
Is it possible to handle messages on the error queue programmatically?
Thanks,
Erwin
Regarding "dashboard where he can see the health of the application and also be able to correct issues":
Please take a look at ServicePulse (http://particular.net/ServicePulse) for production and online monitoring.
This provides both endpoint health indicators and Failed message indicators (including "Retry" capabilities).
For advanced debugging and visualization of your process you should also consider ServiceInsight (http://particular.net/ServiceInsight).
Behind the scenes of ServicePulse there's the ServiceControl server which exposes REST HTTP API with programmatic access to audited and error messages.
HTH,
Danny.

Resources