Azure Web App restarted on 1 node, then startup timeout

We have a lightweight .NET Core API running in an Azure Web App. In the past we had frequent downtime (3 times a week) out of nowhere: for some reason the Web App was restarted on 1 node and couldn't start up in time, failing with 502.5 ASP.NET Core Process Startup Error. First of all, there should be no reason for it not to start within 2 minutes, as it just has a few service registrations and that's it. But secondly, why did it restart at all?
One time it was because Microsoft was updating their Azure Storage, which Web Apps rely on, and that caused a restart. So I enabled local caching on the Web App so that updates to the storage don't affect it. Since then everything was stable for months, until today around 13:00.
The same thing happened: 1 node went down and the API couldn't start anymore, showing just this error message: 502.5 ASP.NET Core Process Startup Error. I rebooted the API and it started working again.
I have no idea what to do to prevent this. Has anyone else experienced the same issue lately? There is no stack trace whatsoever, as it looks like it's the runtime that causes the issue and not our code.
Help highly appreciated :)
Regards, Peter

Related

Azure slot swapping cause HTTP Error 500.30 - ANCM In-Process Start Failure

I've got a simple ASP.NET Core 2.2 API. It is configured to deploy to Azure as soon as we check in to the master branch.
The Azure DevOps release pipeline is configured to deploy it to a staging slot first. Then it runs a smoke web test (by calling one endpoint), and if that succeeds it swaps the slot with production.
After the slot is swapped, it runs the same smoke web test (by calling the same endpoint on production) to check that it still works. A lot of the time I then get an HTTP Error 500.30 - ANCM In-Process Start Failure.
Deploying the same build again fixes this problem most of the time. But I cannot find any logs or details on why this error occurs or how to fix it.
Any idea how to debug an HTTP Error 500.30 - ANCM In-Process Start Failure on an Azure Web App?
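One way to make the post-swap smoke test more tolerant of a slow cold start is to retry the probe a few times before failing the release. The following is a minimal Python sketch, not the pipeline's actual test: `http_probe`, `wait_until_healthy`, the attempt count, and the delay are all hypothetical names and values. Retrying only masks a slow start; it does not explain or fix a 500.30.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_probe(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered with an error status (e.g. 500 or 502).
        return error.code
    except OSError:
        # Connection refused, DNS failure, or timeout.
        return None


def wait_until_healthy(
    probe: Callable[[], Optional[int]],
    attempts: int = 5,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to `attempts` times; succeed on the first 200."""
    for attempt in range(attempts):
        if probe() == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # give the app time to finish starting
    return False
```

A release step could then call `wait_until_healthy(lambda: http_probe(smoke_url))`, where `smoke_url` stands for the production endpoint the existing smoke test already hits.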
Turns out Azure has an internally known (I guess they are not eager to share the news about this) problem with 'Application Insights'.
So turn that feature off (if it's on), and see if it solves the issue. That step solved the problem for me.
I had the same error with an Azure ASP.NET Core 2.2 app that had been running fine for several weeks and suddenly started generating this error from Oct 15 to Oct 17.
Microsoft tech support folks tried to help for a couple of days, but they couldn't figure out why the stdout logs were blank. Then, after 2 days, it turned out that it was a known problem on Microsoft's side and they promised to fix it. Indeed, after about 8 hours the application started working again (no change or redeployment of the application on my side!).
I asked for an explanation but they told me it was too sensitive.
Today, after 2 weeks of working well, the same application is back to showing the exact same error: "HTTP Error 500.30 - ANCM In-Process Start Failure".
So, most likely, the problem is not in your code or deployment procedure. Instead, the problem is in Azure (perhaps in how they provision the .NET Core 2.2 runtime). But for some odd reason Microsoft is not willing to share the details of the problem with their user community (or permanently solve it). Very disappointing!

Azure Web Application Crashing Every 10 to 30 Minutes + App Pool Recycles

I am seeing these errors in "Application Crashes"
88 crashes due to (0xC0000005 - Native Access Violation), 4 crashes due to (0xE0434352 - CLR Exception)
App Service is running on S3 app service plan. Memory and CPU don't seem to be an issue.
It doesn't seem to be consistent: it tends to crash every 20 to 30 minutes, but can sometimes be quicker. Always On is enabled.
There isn't a lot to go on here, but I would suggest the following to try and narrow it down...
Make sure any 3rd party libs you're using are supported on the version of .NET in the app service.
Enable diagnostic logs to get additional details on the failure and see if the problem area can be narrowed down.
Enable Application Insights to help narrow it down.
If the above doesn't help, you could try to recreate the issue locally so you can debug as described here
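Before digging into logs, it can help to see which failure class dominates the crash summary. The following is a minimal Python sketch, assuming the summary text has the same shape as the line quoted in the question; `tally_crashes` and `dominant_share` are hypothetical helper names, not part of any Azure tooling.

```python
import re
from collections import Counter
from typing import Optional, Tuple

# Matches entries such as "88 crashes due to (0xC0000005 - Native Access Violation)".
CRASH_ENTRY = re.compile(r"(\d+) crash(?:es)? due to \((0x[0-9A-Fa-f]+) - ([^)]+)\)")


def tally_crashes(summary: str) -> Counter:
    """Count crashes per (exception code, label) pair found in the summary."""
    counts: Counter = Counter()
    for count, code, label in CRASH_ENTRY.findall(summary):
        counts[(code, label.strip())] += int(count)
    return counts


def dominant_share(counts: Counter) -> Optional[Tuple[Tuple[str, str], float]]:
    """Return the most frequent crash type and its fraction of all crashes."""
    total = sum(counts.values())
    if total == 0:
        return None
    key, count = counts.most_common(1)[0]
    return key, count / total
```

For the numbers in the question, native access violations account for 88 of 92 crashes, which is consistent with checking native or third-party components first.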

IIS - Service Unavailable

Recently we have been getting "Service Unavailable" when opening our web reports URL in Internet Explorer.
Restarting the IIS service resolves the issue, but we didn't find any logs or errors in Event Viewer to track what is causing IIS to fail.
Is there any other way to troubleshoot this?
Many thanks...
To actually help you out, SO needs more information, but the following are the more common causes.
There is not enough memory for the application to run when it tries to start. If there are multiple applications in your IIS, this can happen because the other applications take priority and consume the memory.
Your application has an unhandled exception that causes it to shut down, and sometimes that also causes the worker process to stop.
If your application is .NET based (this is not the case for you, because it runs successfully after an IIS restart), a .NET runtime version conflict can also create such a problem.

Access to _vti_bin/sharedaccess.asmx Throttles SharePoint

We have SharePoint 2013 Service Pack 1 with the May 2015 CU.
Lately we see lots of POST requests to SharePoint for the endpoint "_vti_bin/sharedaccess.asmx".
These requests just wait in the IIS pipeline for 3+ hours.
Once IIS can't take more requests, SharePoint throttles and no one can access anything.
Any idea why this web service is hanging? What can be done to fix this?
It turns out this was a problem with Distributed Cache. The AppFabric service on the WFE had stopped. After restarting the service and doing an iisreset, everything seems OK.
NOTE:
In case AppFabric keeps stopping, you might have to remove the cache host and add it again.

Is it possible to deploy ASP.NET Core website to Azure without taking it offline?

When we try to deploy ASP.NET Core website to Azure we are getting this error:
Error Code: ERROR_INSUFFICIENT_ACCESS_TO_SITE_FOLDER
More Information: Unable to perform the operation ("Delete File") for the specified directory ("D:\home\site\wwwroot\TestAspNetCore.exe"). This can occur if the server administrator has not authorized this operation for the user credentials you are using. Learn more at: http://go.microsoft.com/fwlink/?LinkId=221672#ERROR_INSUFFICIENT_ACCESS_TO_SITE_FOLDER.
The problem is IIS locks the .exe file. We can take the website offline but with continuous delivery it would be nice to have no downtime.
Note that ASP.NET 4.5 does not have this problem.
See also https://github.com/aspnet/IISIntegration/issues/226 and https://github.com/aspnet/Hosting/issues/141
I've had a similar headache and it seems not to be possible. The most reliable solution I have come up with is to have 1 build for a private local version that can be taken offline and then restarted right after deployment. I then have a second build that takes the private version into production every night in the early hours.
This way I can make updates regularly through the day and ensure my site is offline for no more than 20s during the early hours, when it is least likely to be used.
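The "take it offline briefly" approach can be sketched as follows. This is a minimal Python sketch under stated assumptions: it relies on the ASP.NET Core Module convention that an `app_offline.htm` file in the content root makes the module shut the app down and release its file locks, and it assumes the site folder is reachable as an ordinary filesystem path. `site_offline` and `deploy` are hypothetical names, and this still causes a short outage rather than a zero-downtime deployment.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path

OFFLINE_MARKER = "app_offline.htm"


@contextmanager
def site_offline(site_root: Path, message: str = "Updating, back shortly."):
    """Keep the offline marker in site_root for the duration of the block."""
    marker = site_root / OFFLINE_MARKER
    marker.write_text(message, encoding="utf-8")
    try:
        # A real deployment may need to wait here until the running
        # process has actually exited and released the .exe.
        yield marker
    finally:
        # Removing the marker lets the next request start the app again.
        marker.unlink(missing_ok=True)


def deploy(build_dir: Path, site_root: Path) -> None:
    """Copy the build output over the site while the marker is in place."""
    with site_offline(site_root):
        shutil.copytree(build_dir, site_root, dirs_exist_ok=True)
```

The context manager guarantees the marker is removed even if the copy fails, so a broken deployment does not leave the site permanently offline.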
