IIS logs show much longer duration than Insights

I have a Web API application hosted under an on-prem (not Azure) IIS which logs to Insights and is showing Request durations very similar to the times reported in the IIS logs.
However, every so often, there is an entry in the IIS logs with a TimeTakenMS duration which is much longer than the Insights Request duration.
For example, a relatively simple request to read a small amount of data from the DB is logged with a total duration by Insights as 984.126ms but IIS is logging it with TimeTakenMS as 43718.
I have conclusively linked the two requests (they show a unique URL). I have eliminated application start up/recycle times (the application is clearly already started and serving other requests and the recycle boundary is hours away from the logged time stamps) and I have eliminated exceptions (Insights is not logging any exceptions at this time).
I also have a Stopwatch set up in the controller for every web method, and I TrackTrace the elapsed milliseconds to Insights in a finally block, just before the method returns.
What other factors could cause an IIS hosted application to fail to log actual execution times to Insights but cause IIS to log much longer times?
I've considered network + processing time, but 42 seconds seems an unreasonably long duration (even for the start up time of this particular application.)
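One way to find every affected request, rather than spotting them by eye, is to scan the IIS log for entries above a threshold and then look each URL up in Insights. A minimal sketch in Python, assuming the log is in W3C extended format with a `#Fields:` directive that names the `cs-uri-stem` and `time-taken` columns (the TimeTakenMS column in the question may be named differently, so the column name is a parameter). The sample lines and the 10-second threshold are made up for illustration.

```python
from typing import Iterable, Iterator


def parse_w3c_log(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per request line, keyed by the names in '#Fields:'."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive lists the column names for the lines that follow.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split()
            if len(values) == len(fields):
                yield dict(zip(fields, values))


def slow_requests(lines, threshold_ms=10_000, column="time-taken"):
    """Return (uri, duration_ms) pairs at or above the threshold."""
    return [
        (row["cs-uri-stem"], int(row[column]))
        for row in parse_w3c_log(lines)
        if int(row[column]) >= threshold_ms
    ]


# Hypothetical sample data, not taken from the real logs.
SAMPLE = """\
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2024-01-01 10:00:00 GET /api/items/1 200 950
2024-01-01 10:00:05 GET /api/items/2 200 43718
""".splitlines()

print(slow_requests(SAMPLE))  # [('/api/items/2', 43718)]
```

The resulting list gives the URLs and timestamps to compare against the Request duration and the Stopwatch trace for the same call.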

Related

Azure Functions service not recognizing request sent from outside client

We have a service which pings our EP1 Premium service, and yesterday we received 3 client-side timeout errors after 2 minutes of waiting. When opening the trace in Application Insights, the requests which timed out are not even logged and show no trace of ever being received on the Azure side, so they stay unanswered. Looking at the metrics provided in the Azure Functions app, I found that 1-2 minutes after the request was sent, the app loses all ability to work: its Total App Domains falls to 0, as do all connections, threads and so on. This state lasts until the next request is received, therefore "skipping" the request that happened beforehand. This is a big issue as I need to make sure requests get answered in a timely manner.
The client service sent HTTP requests to the Azure Functions app expecting an answer, only to time out while the Azure-side doesn't have any record of ever receiving the request.
I believe this issue is related to the cold start behaviour of the Azure Functions Consumption plan. The "skipping" mechanism is explained below:
Apps may scale to zero when idle, meaning some requests may have additional latency at startup. The consumption plan does have some optimizations to help decrease cold start time, including pulling from pre-warmed placeholder functions that already have the function host and language processes running. https://learn.microsoft.com/en-us/azure/azure-functions/functions-scale#cold-start-behavior
Please also have a look at this article, which explains the behaviour: https://azure.microsoft.com/en-us/blog/understanding-serverless-cold-start/

Azure slow communication between APIs

In some 1-5% of our requests, we are seeing slow communication between APIs (REST API requests). Both APIs are developed by us and hosted on Azure, each app service on its own app service plan in the same region, P1v2 tier.
What we are seeing on application insights is that POST or GET requests on origin API can take a few seconds to execute, while real execution time on destination API is only a few milliseconds.
Examples (first line POST request on origin, second execution time on destination API): slow req 1, slow req 2
Our best guess is that the time difference is lost in communication between components. We don't have an explanation for it since the payload is really small and in most cases, communication takes less than 5 milliseconds.
We dismiss the possible explanation it could be due to component cold start since it happens during constant load and no horizontal scaling was performed.
Do you have any idea what might cause it or how to do additional analysis in order to discover it?
If you're running multiple sites on the App Service Plan, then enable the "Always On" setting for your web app: All Settings > Application Settings > Always On.
See here for details: https://azure.microsoft.com/en-us/documentation/articles/web-sites-configure/
When Always On is off, the site is shut down after 20 minutes of inactivity to free up resources for any additional websites that might be using the same App Service Plan.
The amount of information it needs to collect, process and then present requires some time and involves internal calls as well. That is why, depending on server load and usage, it takes around 6 to 7 seconds, sometimes even more.
To troubleshoot that latency, try these steps provided by Microsoft.
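For the additional analysis the question asks about, one option is to export the durations from both APIs and compute the gap per call. A rough sketch in Python, assuming each call carries a shared correlation id that appears in both exports. The ids, durations and the 1-second limit below are hypothetical, and the function names are mine rather than anything from Application Insights.

```python
def transit_overhead(origin_ms, destination_ms):
    """Map each correlation id to origin duration minus destination duration."""
    shared_ids = origin_ms.keys() & destination_ms.keys()
    return {op: origin_ms[op] - destination_ms[op] for op in shared_ids}


def slow_hops(overhead, limit_ms=1000.0):
    """Return the ids whose time outside the destination API exceeds the limit."""
    return sorted(op for op, gap in overhead.items() if gap > limit_ms)


# Hypothetical durations in milliseconds, keyed by correlation id.
ORIGIN = {"op-1": 12.0, "op-2": 3400.0, "op-3": 9.0}
DESTINATION = {"op-1": 8.0, "op-2": 6.0, "op-3": 5.0}

overhead = transit_overhead(ORIGIN, DESTINATION)
print(slow_hops(overhead))  # ['op-2']
```

Grouping the flagged ids by timestamp or by instance may show whether the slow hops cluster around a particular time or host.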

High response duration on first request for .net core api on Azure

I have deployed a .Net Core API to Azure as an App Service.
I have set the Always on feature to true.
When I log the requests, I see that Azure Always on requests are coming every 5 minutes.
I call the API over HTTPS, but the Always On requests are sent over HTTP. I don't know whether this is the cause.
The first request sometimes takes 10 seconds, but subsequent requests take around 100 ms.
What is missing here?
I have logged the durations:
There are quite a few reasons why this might be the case:
You're connecting to resources that take time to connect to the first time
Some information is being cached and needs to be read the first time
There is initialization code present
Lazy instantiation of (static/singleton) instances
... other ...
Add some logging to your application, maybe enable Application Insights if you haven't done so already and go try to find the culprit.
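To make the lazy instantiation point concrete, here is a small Python sketch (not .NET, and not the asker's code) of a lazily built singleton. The `get_client` function and the 0.2 second sleep are stand-ins I made up for whatever expensive setup the real API does on first use.

```python
import time
from functools import lru_cache

INIT_CALLS = 0  # counts how often the expensive setup actually runs


@lru_cache(maxsize=None)
def get_client():
    """Hypothetical expensive dependency, built lazily on first use."""
    global INIT_CALLS
    INIT_CALLS += 1
    time.sleep(0.2)  # stands in for opening connections, filling caches, etc.
    return {"connected": True}


def handle_request():
    """Return how long one simulated request took, in milliseconds."""
    start = time.perf_counter()
    get_client()  # only the first call pays the setup cost
    return (time.perf_counter() - start) * 1000.0


first = handle_request()
second = handle_request()
print(f"first: {first:.0f} ms, second: {second:.0f} ms")
```

Logging a timing like this around each suspect dependency is one way to see which of them accounts for the slow first request.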

Why I got an error request every 5 minutes in an Azure App Service

I have a Java web app on Azure, and I see failed requests in its Application Insights. It looks like someone is calling 'http://myApp.azurewebsites.net/error' every 5 minutes, but I do not have this endpoint, so there are many failed requests with 404 in Application Insights. I then added this endpoint to the app, but there are still many failed requests with a 404 code. I have no idea where these requests come from or what they are trying to do. Did I set a wrong configuration in my app?
There is a setting named 'Always on' in App Service's configuration, and it works fine when I turn this setting off.
To narrow down this issue, you can enable diagnostic logging for your web app. Web server diagnostic logging helps you trace exception details that originate from components. If you suspect the error comes from your application, then "Application Diagnostic" is the source to trace the reason for the errors.
Also, enable the log stream on your web app so that, irrespective of peak or off-peak hours, you can monitor live how your web app performs and responds to each request.
It's caused by "Always On" being ON under the Configuration / General settings of your App Service.
As per the docs:
Always On: Keeps the app loaded even when there's no traffic. When Always On is not turned on (default), the app is unloaded after 20 minutes without any incoming requests. The unloaded app can cause high latency for new requests because of its warm-up time. When Always On is turned on, the front-end load balancer sends a GET request to the application root every five minutes. The continuous ping prevents the app from being unloaded.
To mitigate the impact, you can add a controller/action that handles the default route.
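The idea is simply that a GET to the application root should return 200 instead of falling through to a 404. The question is about a Java app, so take the following as a language-neutral illustration in Python using only the standard library. `RootHandler` and `probe_root` are names invented for this sketch, and the probe plays the role of the Always On ping.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RootHandler(BaseHTTPRequestHandler):
    """Answers GET / with 200 so a periodic root ping does not log a 404."""

    def do_GET(self):
        if self.path == "/":
            body = b"OK"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def probe_root():
    """Start the server on a free local port, GET /, and return the status."""
    server = HTTPServer(("127.0.0.1", 0), RootHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/")
        status = conn.getresponse().status
        conn.close()
        return status
    finally:
        server.shutdown()
        server.server_close()


print(probe_root())  # 200
```

In a real app the equivalent is a lightweight action mapped to the default route that returns an empty 200 response.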

Slow response times from free web app server every day at same time

Every day at about 3:00PM-4:00PM GMT the response times start to increase (no memory increase or CPU increase)
There is an Azure availability test going to the server every 10 minutes.
As this is a dev site there is no traffic to it other than me (at the odd time) and the availability test
I log to a variable internally the startup time and this shows that the site is not restarting
The first request via a browser when this starts happening is very slow (2 minutes - probably some timeout).
After that it runs perfectly. That seems like the site is shutting down and then starting up on first request, but the pings are keeping it alive so the site is not shutting down (as far as I know)
On the odd log entry I get - I seem to be getting 502 errors - but I can't confirm this as the FREB logs are usually off at this time.
FREB logs turn off automatically after 1 hour and as this is the middle of the night for me (NZDT) - I don't get a chance to turn them on.
See attached images - as you can see the response times just increase at same time
Ignore the requests where they are above 20 - that's me going to it via browser
I always check the azure dashboard BEFORE viewing site in browser
Just got this error (from the web browser, randomly, while repeatedly accessing the same page):
502: The specified CGI application encountered an error and the server terminated the process.
Other relevant Info (Perhaps):
I initially had the availability test ping going to a ping endpoint /ping that only returned a 200 and empty string when I noticed this happening
It now points to the sites homepage to see if it changed anything - still the same.
Assuming the database is not the issue as the /ping endpoint doesn't touch the database - just a straight controller return.
Internal Exception handling is catching nothing
Service: Azure Free Web App (Development)
There are no web jobs or timed events on this site
Azure Dashboard Initial
Current tests:
Uploading as new site to a Basic 1 Small
Restarting dev site 12 hours before issues (usually 20 hours before)
Results:
Restarting free web-app 12ish hours before issue - same result at same time - so it's not the app slowly overloading or it would be much later
Basic 1 Small: no problems - could it be something with the dev server ?
Azure Dashboard From Today
Observations:
Same behavior with /ping endpoint (just return empty string 200 OK) and Main home page endpoint (database lookups [w/caching] / Razor)
If anyone has any ideas what might be going on - I would very much appreciate it
:-)
Update:
It seems to have stopped (on its own) about 11/1/2016 1:50:49 AM GMT - my internal timestamp says it restarted - and then the errors started again at the same time as usual. Note: no-one is using the app. The Basic 1 Small server is still going fine.
Sorry I can't add anymore images (not enough rep)
By default, web apps are unloaded if they are idle for some period of time, which could cause slow responses from the web site during that period. Besides, this article covers troubleshooting HTTP "502 Bad Gateway" and HTTP "503 Service Unavailable" errors in Azure web apps; you could read it. From the article, scaling the web app could mitigate the issue.

Resources