Processing termination of activated Application Host after creation of Service Fabric application - Azure

We have roughly 12 Service Fabric clusters running in Azure. They run correctly in both our Production and Test environments. We recently found that one of them will not start locally. We have not run this one locally in quite a while, and I am having a hard time tracking down what might have happened to cause this error. It happens on any machine I try to run it on locally.
Specifically, after the type is registered, and the app created, the host process immediately terminates:
"Message": "EventName: ApplicationProcessExited Category: StateTransition EventInstanceId 158f38d1-47ac-4b70-9830-0d8d3cdf8f9c ApplicationName fabric:/Office.Ocv.CustomerTalkback.AutomatedService.ServiceFabric Application terminated, ServiceName=fabric:/Office.Ocv.CustomerTalkback.AutomatedService.ServiceFabric/MS.Internal.Office.Ocv.Services.CustomerTalkback.Automated, ServicePackageName=MS.Internal.Office.Ocv.Services.CustomerTalkback.Automated.Package, ServicePackageActivationId=d58e53d1-af22-42fb-9003-3154bcb8d00b, IsExclusive=True, CodePackageName=Code, EntryPointType=Exe, ExeName=MS.Internal.Office.Ocv.Services.CustomerTalkback.Automated.exe, ProcessId=16756, HostId=e27ccd9d-cff6-4317-b168-5a4b7b724808, ExitCode=2147516563, UnexpectedTermination=True, StartTime=06/18/2019 15:47:26. ",
This is .NET Core 2.2.0. All of our Service Fabric apps run with the same settings, dependencies, etc. Only this one fails locally.
I have tried moving the local cluster to a larger drive (800 GB free) and deploying manually via PowerShell (we usually deploy via VS 2019).
Any help (even just a suggestion of troubleshooting steps) would be much appreciated, as I have been working on this for about 16 hours over the last three days.
thanks!

The problem I had was with the full name of the assembly. The local path was short (like d:\src\adm), but the full assembly name was ~65 characters. It appears that the PowerShell script that deploys locally fails silently on this. When I dropped the length of the name down to about 35 characters, it started working.
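A quick pre-deployment check can guard against this kind of silent failure. Below is a minimal sketch, assuming the classic Windows MAX_PATH bound of 260 characters is the relevant limit; the helper names and the `margin` parameter are my own, not part of any Service Fabric tooling:

```python
import os

MAX_PATH = 260  # classic Windows path-length limit

def longest_deploy_path(package_root):
    """Walk a package folder and return (path, length) for the longest full file path."""
    longest, length = package_root, len(package_root)
    for dirpath, _dirnames, filenames in os.walk(package_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if len(full) > length:
                longest, length = full, len(full)
    return longest, length

def check_package(package_root, margin=0):
    """Return True if every file path stays under MAX_PATH minus a safety margin."""
    _path, length = longest_deploy_path(package_root)
    return length <= MAX_PATH - margin
```

Running this against the service package folder before deploying would have flagged the overly long assembly name (deploy root length + assembly file name) instead of letting the deployment die silently.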


Parse Server periodically gets slow

parse-server version 2.7.4 (Azure on a Standard_B4ms)
mongoDB-server version 3.4.14 (Azure on a separate Standard_B4ms)
I have an iOS & Android app with LiveQuery (set up on the parse-server's VM) that is used heavily for chatting, with usually ±50 simultaneous users. The thing is, after a few hours of continuous usage, the server's cloud code responses get REALLY slow! Not just a specific one... all cloud functions!
I'm using screen to run the parse server, and I found that if I restart the parse server (not the VM), the app gets back to normal.
I also have logs enabled at all times (just mentioning it in case it could be the issue).
I can't understand why this is happening!
Any ideas?

Many log files on Azure Service Fabric

I have an Azure Service Fabric development cluster running locally with two applications.
After a two-week holiday I came back to find my hard drive completely full; consequently, nothing really works anymore.
The sfdevcluster\log\traces folder has many *.etl files, all larger than 100 MB, and all kinds of other log files larger than 250 MB are present.
So my questions: how do I disable tracing/logging on Azure Service Fabric, and are there tools to administer log files?
The PowerShell script that does the cluster setup magic is:
Program Files\Microsoft SDKs\Service Fabric\ClusterSetup\DevClusterSetup.ps1
Looking inside, there is a function called DeployNodeConfiguration which sets the log and data paths using the PowerShell command New-ServiceFabricNodeConfiguration. Unfortunately, it does not seem that there is a way to limit the size of those folders.
I believe that your slowness/freeze is due to insufficient space on the OS drive (happened to me too, haha). A workaround can be to move those folders to a non-OS drive, so that the limited space on the OS drive does not fill up.
Hope this helps
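As a stopgap while the local cluster is stopped, the oversized trace files can also be pruned by hand. A minimal sketch (the function names and the size threshold are my own; it assumes you point it at the sfdevcluster\log\traces folder after stopping the cluster):

```python
import os

def find_large_logs(log_root, min_bytes=100 * 1024 * 1024, extensions=(".etl",)):
    """Return (path, size) pairs for trace files at or above min_bytes, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(log_root):
        for name in filenames:
            if name.lower().endswith(extensions):
                full = os.path.join(dirpath, name)
                size = os.path.getsize(full)
                if size >= min_bytes:
                    hits.append((full, size))
    return sorted(hits, key=lambda item: item[1], reverse=True)

def prune_large_logs(log_root, min_bytes=100 * 1024 * 1024):
    """Delete oversized trace files and return the number of bytes reclaimed."""
    reclaimed = 0
    for path, size in find_large_logs(log_root, min_bytes):
        os.remove(path)
        reclaimed += size
    return reclaimed
```

This reclaims disk space but does not stop the cluster from generating new traces, so it only buys time until the log location is moved or the SDK is upgraded.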
This turned out to be a bug in Service Fabric; upgrade your local cluster to the latest version, 6.1.472.9494, which fixes the issue. More details here

Azure Server Inaccessible

One of my 10 Azure VMs running Windows has suddenly become inaccessible! The Azure Management Console shows the state of this VM as "running", but the Dashboard shows no server activity since my last RDP logout 16 hours ago. I tried restarting the instance with no success; it is still inaccessible (no RDP access, hosted application down, unsuccessful ping...).
I changed the instance size from A5 to A6 using the management portal and everything went back to normal. Windows Event Viewer showed no errors except the unexpected shutdown today after my instance size change. Nothing was logged between my RDP logout yesterday and the system startup today after changing the size.
I can't afford having the server down for 16 hours! Luckily this server was the development server.
How can I know what went wrong? Anyone faced a similar issue with Azure?
Thanks
There is no easy way to troubleshoot this without capturing it in a stuck state.
It sounds like you followed the recommended steps, i.e.:
- Check the VM is running (portal/PowerShell/CLI).
- Check the endpoints are valid.
- Restart the VM.
- Force a redeployment by changing the instance size.
To understand why it happened it would be necessary to leave it in a stuck state and open a support case to investigate.
There is work underway to make both self-service diagnosis and redeployment easier for situations like this.
Apparently nothing was wrong! After the reboot, the machine was installing updates to complete the reboot. When I panicked, I rebooted it again, stopped it, started it again, and even changed its configuration, thinking it was dead, while in fact it was only installing updates.
Too bad that we cannot disable the automatic reboot or estimate the time it takes to complete.

Azure VM refuses to start "Starting (Could not start)"

We set up a small Azure VM (plain Windows 2012 R2 image as provided by Microsoft) with a lightweight demo application that happily runs with SQL Express and 1 GB RAM. This VM ran quite fine for a month. A few days ago we shut down the VM and the cloud service to save some credits until we need the demo again.
Today the VM refuses to start and cycles continuously between "stopped", "starting", and "stopped (could not start)".
No useful error details are listed, but the operation logs of the management service note:
<OperationStatus>
<ID>98502f34-08bd-70a3-b0c3-f7e08976dd38</ID>
<Status>Failed</Status>
<HttpStatusCode>500</HttpStatusCode>
<Error>
<Code>InternalError</Code>
<Message>The server encountered an internal error. Please retry the request.</Message>
</Error>
</OperationStatus>
Can I expect that this situation will eventually resolve itself, or is there any other measure I could try to get my VM up and running again? It is not an option to download the 20 GB virtual disk to examine it locally and upload it again; that would take forever and a day.
Are the Azure services just crap?
If you haven't already tried it, a typical workaround is to delete the failing VM (keeping the disk/stateful information intact!) and then recreate it using the same disk.

Cloud environment on Windows Azure platform

I've got 6 web sites, 2 databases, and 1 cloud environment set up on my account.
I used the cloud environment to run some tasks via the Windows Task Manager; everything was installed on my D drive, but between last week and today, the 8th of March, my folder containing the "exe" to run has been removed.
I had also installed TortoiseSVN to get the files deployed, and it is not installed anymore.
I wonder if somebody has a clue about my problem.
Best Regards
Franck merlin
If you're using Cloud Services (web/worker roles), these are stateless virtual machines. That is: Windows Azure provides the operating system, then brings your deployment package into the environment after bootup. Every single virtual machine instance booted this way starts from a clean OS image, along with the exact same set of code bits from you.
Should you RDP into the box and manually install anything, it is going to be temporary at best. Your changes will likely survive reboots. However, if the OS needs updating (especially the underlying host OS), your changes will be lost as a fresh OS is brought up.
This is why, with Cloud Services, all customizations should be done via startup tasks or the OnStart() event. You should never manually install anything via RDP since:
- Your changes will be temporary.
- Your changes won't propagate to additional instances; you'd be required to RDP into every single box to perform the same changes.
You may want to download the Azure Training Kit and look through some of the Cloud Service labs to get a better feel for startup tasks.
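For reference, a startup task is declared in the service definition file (ServiceDefinition.csdef) inside the web/worker role element. A minimal sketch, where install.cmd is a hypothetical batch script you would ship in the deployment package:

```xml
<Startup>
  <!-- install.cmd is a hypothetical script included in the deployment package;
       it would silently (re)install tools such as an SVN client on every instance start,
       so the software survives instance reimaging and scales to new instances. -->
  <Task commandLine="install.cmd" executionContext="elevated" taskType="simple" />
</Startup>
```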
In addition to what David said, check out http://blogs.msdn.com/b/kwill/archive/2012/10/05/windows-azure-disk-partition-preservation.aspx for the scenarios where the different drives will be destroyed.
Also take a look at http://blogs.msdn.com/b/kwill/archive/2012/09/19/role-instance-restarts-due-to-os-upgrades.aspx which points you to the RSS feed and MSDN article where you can see whether a new OS is currently being deployed.
