This week we have started getting 502 errors on our Web App. They are random: sometimes they happen under consistent load, other times with even a single request.
I have checked Event Viewer and there is no application crash, and I don't see any really slow requests in the IIS logs either. I had auto-heal enabled, which is now disabled. I have also enabled autoscale, and even with 4 instances running I get a 502 error every once in a while. There is no log entry for this 502 in the IIS logs, so I am guessing something upstream is returning this response; I just don't know why it's doing that and why it's so random.
Related
We have a blue/green site and server farm set up for zero-downtime deployment, which works fine, but in the web logs for the site that does the routing we are seeing that every now and then requests are being sent to the wrong server.
2023-02-19 09:50:05 /DocumentView DID=23932&SERVER-ROUTED=LIVE-GREEN
2023-02-19 09:50:09 /FileDownload FID=50516&SERVER-ROUTED=LIVE-BLUE
2023-02-19 09:50:13 /Publish DocID=9154358&SERVER-ROUTED=LIVE-BLUE
2023-02-19 09:50:15 /SiteView DID=23932&SERVER-ROUTED=LIVE-GREEN
In this instance it tried to send two requests to LIVE-BLUE, which is set to Unhealthy and whose site is Stopped. The Health Test interval is set to 1 second (at the moment).
It's not happening a lot (maybe once every 25k requests) but it is very annoying to those who do get the 502 error message. (502.3 - Bad Gateway: Forwarder Connection Error (ARR).)
Anyone have any ideas on how to fix or diagnose the issue?
I have a function app running on the Consumption plan (Y1). As you can see in the screenshots below, the execution count suddenly dropped to 0 for some time (~30 min). During that time the app was still receiving requests, but it was returning 4xx errors.
Why is this happening?
That's an interesting little period you have in your graph there. Is it all HTTP 4xx? If you drill into the Application Insights logs, can you see whether they are 400s or 404s by any chance? I have a feeling the client calling the function app was passing bad data, resulting in 400 Bad Request responses during that period.
If it were something wrong with the core function app you would get a 502 or a 500. Was there a deployment or code change during that period that could have triggered bad requests from the client side?
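Something along these lines in the Application Insights Logs (KQL) blade should show the split of result codes; the time range below is only a placeholder for the window you see in your graph:

    requests
    | where timestamp between (datetime(2023-01-01 00:00) .. datetime(2023-01-01 00:30)) // replace with the gap from your graph
    | summarize count() by resultCode, name
    | order by count_ desc

If the rows come back as 400s against one particular operation name, that points at the caller's payload rather than the function app itself.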
I have a Java web app on Azure, and I see failed requests in its Application Insights. It looks like someone is calling 'http://myApp.azurewebsites.net/error' every 5 minutes, but I do not have that endpoint, so there are many failed requests with 404 in Application Insights. I then added that endpoint to the app, but there are still many failed requests with a 404 code. I have no idea about those requests; I do not know where they come from or what they want to do. Did I set a wrong configuration in my app?
There is a setting named 'Always on' in the App Service configuration, and everything works fine when I turn this setting off.
To narrow down this issue, you can enable diagnostic logging for your web app. Web server diagnostic logging helps you trace exception details originating from the platform components, and if you suspect the error comes from your application, then application diagnostic logging is the source to trace the reason for the errors.
Also, enable the log stream on your web app so that, irrespective of peak or off-peak hours, you can monitor live how your web app performs and responds to each request.
It's caused by "Always On" being turned on under the Configuration > General settings of your App Service.
As per the docs:
Always On: Keeps the app loaded even when there's no traffic. When Always On is not turned on (default), the app is unloaded after 20 minutes without any incoming requests. The unloaded app can cause high latency for new requests because of its warm-up time. When Always On is turned on, the front-end load balancer sends a GET request to the application root every five minutes. The continuous ping prevents the app from being unloaded.
To mitigate the impact, you can add a controller/action that handles the default route, for example as in the sketch below.
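For example, assuming your app is Spring Boot / Spring MVC (adjust to whatever framework you actually use; the class name here is purely illustrative), a minimal controller on the root path is enough for the Always On ping to get a 200:

    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RestController;

    // Minimal mapping so the Always On GET to the application root returns 200
    // instead of falling through to the 404 / "/error" handling.
    @RestController
    public class KeepAliveController {

        @GetMapping("/")
        public ResponseEntity<String> root() {
            return ResponseEntity.ok("OK");
        }
    }

If the app is not Spring-based, the equivalent is simply any servlet or route mapped to "/" that returns a 200.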
Got a bit of a weird issue occurring with an ASP.NET Web API (.NET Framework 4.8) application running on IIS 8.5.
The application will be responding to requests in < 1 sec and will suddenly spike to > 1 sec.
At the same time a large number of requests appear in the HTTPERR log with the error Request_Cancelled.
Failed request tracing shows the request hitting 1 second in AspNetHttpHandlerEnter. This would lead me to think that the application itself is performing poorly; however, monitoring doesn't show any performance issues elsewhere.
Eventually these cancelled requests cause the DB connection pool to run out of connections.
Is there any way to get more info about what might be causing the Request_Cancelled errors in a Web API application?
Every day at about 3:00 PM-4:00 PM GMT the response times start to increase (no memory increase or CPU increase).
There is an Azure availability test hitting the server every 10 minutes.
As this is a dev site there is no traffic to it other than me (at the odd time) and the availability test.
I internally log the startup time to a variable, and this shows that the site is not restarting.
The first request via a browser when this starts happening is very slow (2 minutes - probably some timeout).
After that it runs perfectly. That looks like the site is shutting down and then starting up on the first request, but the pings are keeping it alive, so the site should not be shutting down (as far as I know).
On the odd log entry I do get, I seem to be getting 502 errors, but I can't confirm this as the FREB logs are usually off at this time.
FREB logs turn off automatically after 1 hour, and as this is the middle of the night for me (NZDT) I don't get a chance to turn them back on.
See attached images - as you can see, the response times just increase at the same time.
Ignore the requests where they are above 20 - that's me going to the site via a browser.
I always check the Azure dashboard BEFORE viewing the site in a browser.
Just got this error (from the web browser, randomly, while accessing the same page):
502: The specified CGI application encountered an error and the server terminated the process.
Other relevant Info (Perhaps):
I initially had the availability test pinging a /ping endpoint that only returned a 200 and an empty string when I noticed this happening.
It now points to the site's homepage to see if that changed anything - still the same.
I'm assuming the database is not the issue, as the /ping endpoint doesn't touch the database - it's just a straight controller return.
Internal Exception handling is catching nothing
Service: Azure Free Web App (Development)
There are no web jobs or timed events on this site
Azure Dashboard Initial
Current tests:
Uploading it as a new site to a Basic 1 Small plan.
Restarting the dev site 12 hours before the issue (usually it has been 20 hours before).
Results:
Restarting the free web app ~12 hours before the issue - same result at the same time - so it's not the app slowly overloading, or it would be much later.
Basic 1 Small: no problems - could it be something with the dev server?
Azure Dashboard From Today
Observations:
Same behavior with the /ping endpoint (just returns an empty string, 200 OK) and the main home page endpoint (database lookups [with caching] / Razor).
If anyone has any ideas what might be going on - I would very much appreciate it
:-)
Update:
It seems to have stopped (on its own) at about 11/1/2016 1:50:49 AM GMT - my internal timestamp says it restarted - and then the errors started again at the same time as usual. Note: no one is using the app. The Basic 1 Small server is still going fine.
Sorry I can't add any more images (not enough rep).
By default, web apps are unloaded if they are idle for some period of time, which can cause slow responses while the site reloads. Also, this article is about troubleshooting an HTTP "502 Bad Gateway" error or an HTTP "503 Service Unavailable" error in Azure web apps; you could read through it. From the article we can see that scaling the web app can mitigate the issue.