Azure slot swapping causes HTTP Error 500.30 - ANCM In-Process Start Failure

I have a simple ASP.NET Core 2.2 API. It is configured to deploy to Azure as soon as we check in to the master branch.
The Azure DevOps release pipeline deploys it to a staging slot first. It then runs a smoke web test (by requesting one endpoint) and, if that succeeds, swaps the slot with production.
After the swap, it runs the same smoke web test (by requesting the same endpoint on production) to check that the app still works. Quite often I then get an HTTP Error 500.30 - ANCM In-Process Start Failure.
Deploying the same build again fixes the problem most of the time. But I cannot find any logs or details explaining why this error occurs or how to fix it.
Any idea how to debug an HTTP Error 500.30 - ANCM In-Process Start Failure on an Azure Web App?
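The question does not show how the smoke web test is implemented. As a rough sketch only, a check that retries a few times and records every status it sees can help tell a slow start right after the swap apart from a persistent start failure. The names `fetch_status` and `smoke_test`, the retry count, and the delay are all hypothetical; this does not fix the 500.30 error itself.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code of a GET request, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses (the client only sees 500, not the .30 sub-status)
        # are raised as HTTPError; keep the status code.
        return exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def smoke_test(url, attempts=5, delay_seconds=10.0,
               fetch=fetch_status, sleep=time.sleep):
    """Poll url until it returns 200 or the attempts run out.

    Returns (succeeded, statuses) so the pipeline log shows every response seen.
    """
    statuses = []
    for attempt in range(attempts):
        status = fetch(url)
        statuses.append(status)
        if status == 200:
            return True, statuses
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next attempt
    return False, statuses
```

A run that ends in `[500, 500, 200]` would point to a slow start, while `[500, 500, 500, 500, 500]` would point to a start failure that does not recover on its own.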

Turns out Azure has an internally known (I guess they are not eager to share the news about this) problem with 'Application Insights'.
So turn that feature off (if it's on), and see if it solves the issue. That step solved the problem for me.

I had the same error with an ASP.NET Core 2.2 app on Azure that had been running fine for several weeks and suddenly started generating this error from Oct 15 to Oct 17.
Microsoft tech support tried to help for a couple of days, but they couldn't figure out why the stdout logs were blank. Then, after 2 days, it turned out to be a known problem on Microsoft's side, and they promised to fix it. Indeed, after about 8 hours the application started working again (no change or redeployment of the application on my side!).
I asked for an explanation, but they told me it was too sensitive.
Today, after 2 weeks of working well, the same application is back to showing the exact same error: "HTTP Error 500.30 - ANCM In-Process Start Failure"
So, most likely, the problem is not in your code or deployment procedure. Instead, the problem is Azure (perhaps how they provision the .NET Core 2.2 runtime). But for some odd reason Microsoft is not willing to share the details of the problem with their user community (or permanently solve it). Very disappointing!

Related

Azure Webapp restarted on 1 node, then startup timeout

We have a lightweight .NET Core API running in an Azure Webapp. In the past we had frequent downtime (3 times a week) out of nowhere, because for some reason the Webapp was restarted on 1 node and couldn't start up in time: 502.5 ASP.NET Core Process Startup Error. First of all, there should be no reason for it not to start within 2 minutes, as it just has a few service registrations and that's it. But secondly, why did it restart at all?
One time it was because Microsoft was updating Azure Storage, which Webapps rely on, and that caused a restart. So I enabled local caching on the Webapp so that storage updates don't affect it. Since then everything was stable for months, until today around 13:00.
The same thing happened: 1 node went down and the API couldn't start anymore. Just the error message 502.5 ASP.NET Core Process Startup Error. I rebooted the API and it started working again.
I have no idea what to do to prevent this. Has anyone else experienced the same issue lately? There is no stack trace whatsoever, as it looks like it's the runtime that causes the issue and not our code.
Help highly appreciated :)
Regards, Peter

msbuild is failing at CheckAzureNet46Support for a webjob

I'm trying to build and publish a web job via MSBuild, and it fails at '_CheckAzureNet46Support' with the error VerifyAzureNet46Support:
[VerifyAzureNet46Support] C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v14.0\Web\Microsoft.Web.Publishing.targets(4755,7): Your hosting provider does not yet support ASP.NET 4.6, which your application is configured to use.
I've published other projects as web jobs to this web app with no issue, but this error occurs intermittently. Is something wrong with my web app configuration?
The MSBuild arguments used for the build are:
/P:Configuration=Release /p:DeployOnBuild=true
/P:PublishProfileRootFolder="%heckoutDir\BuildConfigurations\publishProfile"
/p:PublishProfile="WebsitePublishProfile"
/P:Password=WebsitepublishProfilePassword
/P:outputdirectory=Bin/Release
This issue occurs if the App Service plan has high RAM or CPU utilization.
To mitigate it, you can scale up the App Service plan.
It is highly unlikely that this has anything to do with the App Service plan. That analysis simply does not make sense.
Rather, a build-time check fails while verifying whether the deployment target in Azure supports ASP.NET 4.6. The check performs an HTTP GET request to a hard-coded URL, http://go.microsoft.com/fwlink/?LinkID=613106&clcid=0x409. If that request returns 200 OK, the check fails.
Therefore, the more likely explanation is that the support page in question was back up for a few hours by mistake. We had the same problem, but it seems to work now.
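The rule described in this answer can be sketched roughly as follows. This is not the code in Microsoft.Web.Publishing.targets; it only mirrors the behaviour as the answer states it. The helper names `http_get_status` and `net46_check_passes` are hypothetical, and treating every non-200 outcome (including a network failure) as a pass is an assumption.

```python
import urllib.error
import urllib.request

# URL quoted in the answer above.
CHECK_URL = "http://go.microsoft.com/fwlink/?LinkID=613106&clcid=0x409"


def http_get_status(url, timeout=10.0):
    """Return the HTTP status code of a GET request, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code
    except (urllib.error.URLError, OSError):
        return None


def net46_check_passes(get_status=http_get_status, url=CHECK_URL):
    """Mirror the rule as stated in the answer: a 200 OK response fails the check.

    Assumption: any other outcome, including a failed request, counts as a pass.
    """
    return get_status(url) != 200
```

Under this reading, the build outcome depends on whether that one page responds at build time, not on anything in the project or the App Service plan, which would explain why the failure is intermittent.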

Is it possible to deploy ASP.NET Core website to Azure without taking it offline?

When we try to deploy an ASP.NET Core website to Azure, we get this error:
Error Code: ERROR_INSUFFICIENT_ACCESS_TO_SITE_FOLDER
More Information: Unable to perform the operation ("Delete File") for the specified directory ("D:\home\site\wwwroot\TestAspNetCore.exe"). This can occur if the server administrator has not authorized this operation for the user credentials you are using. Learn more at: http://go.microsoft.com/fwlink/?LinkId=221672#ERROR_INSUFFICIENT_ACCESS_TO_SITE_FOLDER.
The problem is that IIS locks the .exe file. We can take the website offline, but with continuous delivery it would be nice to have no downtime.
Note that ASP.NET 4.5 does not have this problem.
See also https://github.com/aspnet/IISIntegration/issues/226 and https://github.com/aspnet/Hosting/issues/141
I've had a similar headache, and it seems not to be possible. The most reliable solution I have come up with is to have one build for a private local version that can be taken offline and then restarted right after deployment. A second build then takes the private version into production every night in the early hours.
This way I can make updates regularly through the day and ensure my site is offline for no more than 20 seconds during the early hours, when it is least likely to be used.

Infamous 'Load operation failed for query 'GetUser'. The remote server returned an error: NotFound'

It looks like I have just stumbled on an easily reproducible fix for the infamous
'Load operation failed for query 'GetUser'.
The remote server returned an error: NotFound'
issue in a WCF RIA Services (Silverlight 5) web setup: when the VS2012 Web Publishing Wizard is used with the 'Precompile during publishing' option checked, the issue raises its ugly head. When the 'Precompile during publishing' option is unchecked, the deployed WCF RIA Services app works well with the Silverlight client. Please check on your system.
I tried fuslogvw and other tools, and nothing helped. I had never used the Web Publishing Wizard before, since xcopy was my friend. This time I wanted to automate the whole deployment process, and I lost a couple of hours of precious time.
I still don't know what causes the issue, but I can proceed with my work. When I have more free time, I will probably use this technique to narrow down the causes of the issue.
Edit for clarification: I have effectively solved the issue on my localhost, but when releasing to an external web hosting environment it still appears in one program and not in another, and I have yet to find the cause. Do you know any good sources where the various solutions to this issue are classified and accompanied by walk-throughs?

Loading profiler failed during CoCreateinstance - error, but I'm not using a profiler

Recently I published to my Azure staging server (ASP.NET MVC app) and my app wouldn't come up. I checked the event logs on the machine, and this was the error:
.NET Runtime version 4.0.30319.18033 - Loading profiler failed during CoCreateInstance. Profiler CLSID: '{F1260058-1A1F-4738-8BE2-0BF9D3A64219}'. HRESULT: 0x8007007e. Process ID (decimal): 1872. Message ID: [0x2504].
The thing is that I am not using a profiler, and everything worked fine yesterday (with a day-old publish). Any ideas what could be causing this, and how I could fix it? Thank you.
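As a side note on reading the log entry, the HRESULT value can be split into its standard fields. The short sketch below (the function name `decode_hresult` is hypothetical) shows that 0x8007007e is a failure in the Win32 facility carrying error code 126, which is ERROR_MOD_NOT_FOUND ("The specified module could not be found"). That would be consistent with a profiler being registered while its DLL cannot be loaded, although the thread does not confirm this.

```python
FACILITY_WIN32 = 7
ERROR_MOD_NOT_FOUND = 126  # "The specified module could not be found."


def decode_hresult(value):
    """Split a 32-bit HRESULT into its severity, facility and code fields."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # bit 31: severity (1 = failure)
        "facility": (value >> 16) & 0x7FF,  # bits 16-26: facility
        "code": value & 0xFFFF,             # bits 0-15: error code
    }


fields = decode_hresult(0x8007007E)
print(fields)
print(fields["facility"] == FACILITY_WIN32, fields["code"] == ERROR_MOD_NOT_FOUND)
```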
There may well be a better fix (I tried everything I could find elsewhere, and nothing seemed to relate to my specific problem), but here is what I ended up doing: simply delete your deployment and re-publish. This must reset whatever setting turns the profiler on.
Remember that if this is a non-domain DNS instance, your address will change. Hope this can save someone a few hours.

Resources