We have set up a small Azure VM (the plain Windows Server 2012 R2 image provided by Microsoft) with a lightweight demo application that runs happily with SQL Server Express and 1 GB of RAM. This VM ran fine for a month. A few days ago we shut down the VM and the cloud service to save some credits until we need the demo again.
Today the VM refuses to start and cycles continuously between "stopped", "starting" and "stopped (could not start)".
No useful error details are listed, but the operation log of the management service shows:
<OperationStatus>
<ID>98502f34-08bd-70a3-b0c3-f7e08976dd38</ID>
<Status>Failed</Status>
<HttpStatusCode>500</HttpStatusCode>
<Error>
<Code>InternalError</Code>
<Message>The server encountered an internal error. Please retry the request.</Message>
</Error>
</OperationStatus>
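For anyone scripting around this, the operation status above can be read programmatically. The following is a minimal sketch using only the Python standard library. The helper name `summarize_operation` and the "5xx means server-side" rule are illustrative assumptions, not part of any Azure SDK, and the sketch assumes the XML carries no namespace, as in the excerpt.

```python
import xml.etree.ElementTree as ET

# The operation status exactly as it appears in the management log above.
OPERATION_STATUS = """<OperationStatus>
<ID>98502f34-08bd-70a3-b0c3-f7e08976dd38</ID>
<Status>Failed</Status>
<HttpStatusCode>500</HttpStatusCode>
<Error>
<Code>InternalError</Code>
<Message>The server encountered an internal error. Please retry the request.</Message>
</Error>
</OperationStatus>"""


def summarize_operation(xml_text):
    """Extract the fields of an OperationStatus document into a dict.

    Hypothetical helper for illustration only. It reads just the elements
    shown in the log and assumes the document has no XML namespace.
    """
    root = ET.fromstring(xml_text)
    http_status = root.findtext("HttpStatusCode")
    return {
        "id": root.findtext("ID"),
        "status": root.findtext("Status"),
        "http_status": int(http_status) if http_status else None,
        "error_code": root.findtext("Error/Code"),
        "error_message": root.findtext("Error/Message"),
        # Assumption: a 5xx status points at the service, not at the request.
        "server_side": bool(http_status) and http_status.startswith("5"),
    }


summary = summarize_operation(OPERATION_STATUS)
print(summary["status"], summary["http_status"], summary["error_code"])
```

An HTTP 500 with `InternalError` says the failure is on the service side, which is why the log itself gives no actionable detail.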
Can I expect this situation to resolve itself eventually, or is there anything else I could try to get my VM up and running again? Downloading the 20 GB virtual disk to examine it locally and then uploading it again is not an option. This would take forever and a day.
Are the Azure services just crap?
If you haven't already tried it, a typical workaround step is to delete the failing VM (keeping the disk/stateful information intact!) and then recreate it using the same disk.
I created a Standard B1s Windows VM instance where I'm running the OpenSSH service and using it as an SFTP server.
Everything works perfectly fine for about 2 hours: I can RDP to the VM and the SSH connection works.
After about 2 hours the connection to the VM becomes very slow: RDP takes around a minute to connect and the SSH connection times out every time.
What fixes the problem for a short time is restarting the VM or resizing it to any other tier. Everything then works fine for about 2 hours, and the problem appears again.
I'm aware that B1s is a burstable VM, but we are using it as a simple SFTP server where one document is uploaded 2-3 times a day, so it does not need much CPU or memory. I also tried resizing it to a non-B-series VM, but the problem is the same. We are located in the eastern US and the server is in the East US Azure region.
Any help is appreciated! Thanks
Try accessing the VM from somewhere else. Maybe it's a network-related problem? Create a VM in West Europe and run the same operations. If that causes no issues, I would dig deeper into network-related topics.
We have roughly 12 Service Fabric clusters running in Azure. They are running correctly in both our Production and Test environments. We recently found that one of them will not start locally. We have not run this one locally in quite a while, and I am having a hard time tracking down what might have changed to cause this error. It happens on every machine I try to run it on locally.
Specifically, after the type is registered, and the app created, the host process immediately terminates:
"Message": "EventName: ApplicationProcessExited Category: StateTransition EventInstanceId 158f38d1-47ac-4b70-9830-0d8d3cdf8f9c ApplicationName fabric:/Office.Ocv.CustomerTalkback.AutomatedService.ServiceFabric Application terminated, ServiceName=fabric:/Office.Ocv.CustomerTalkback.AutomatedService.ServiceFabric/MS.Internal.Office.Ocv.Services.CustomerTalkback.Automated, ServicePackageName=MS.Internal.Office.Ocv.Services.CustomerTalkback.Automated.Package, ServicePackageActivationId=d58e53d1-af22-42fb-9003-3154bcb8d00b, IsExclusive=True, CodePackageName=Code, EntryPointType=Exe, ExeName=MS.Internal.Office.Ocv.Services.CustomerTalkback.Automated.exe, ProcessId=16756, HostId=e27ccd9d-cff6-4317-b168-5a4b7b724808, ExitCode=2147516563, UnexpectedTermination=True, StartTime=06/18/2019 15:47:26. ",
This is .NET Core 2.2.0. All of our Service Fabric apps run with the same settings, dependencies, etc. Only this one fails locally.
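The `ExitCode` in the event above is printed as an unsigned decimal number. Error codes with the high bit set are usually documented in hexadecimal, so converting it first tends to make searching easier. A minimal sketch in Python (the helper name `exit_code_forms` is made up for illustration, and the code makes no claim about what this particular value means):

```python
def exit_code_forms(exit_code):
    """Return the hex and signed 32-bit forms of a process exit code."""
    unsigned = exit_code & 0xFFFFFFFF          # clamp to 32 bits
    high_bit_set = bool(unsigned & 0x80000000)
    # Two's-complement view, which is how some tools print the same value.
    signed = unsigned - 0x100000000 if high_bit_set else unsigned
    return {
        "hex": "0x%08X" % unsigned,
        "signed": signed,
        "high_bit_set": high_bit_set,
    }


# ExitCode value copied from the ApplicationProcessExited event.
forms = exit_code_forms(2147516563)
print(forms["hex"], forms["signed"])
```

Searching for the hex form together with the host name (here a .NET Core executable) is usually more productive than searching for the decimal value.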
I have tried moving the local cluster to a larger drive (800 GB free) and deploying manually via PowerShell (we usually deploy from VS 2019).
Any help (even if it is just a suggestion of troubleshooting steps) would be much appreciated, as I have been working on this for about 16 hours over the last three days.
Thanks!
The problem I had was with the full name of the assembly. The local path was short (like d:\src\adm), but the full assembly name was ~65 characters. It appears that the PowerShell script that deploys locally fails silently on this. When I cut the name down to about 35 characters, it started working.
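To check other services for the same problem, a quick length scan can help. This is only a sketch: the function name `long_names` is hypothetical, and the 35-character limit is taken from the anecdote above, not from any documented Service Fabric limit. It is also not clear from the answer exactly which string counts toward the length, so treat the result as a hint.

```python
def long_names(names, limit=35):
    """Return (name, length) pairs for names longer than `limit` characters.

    The default limit is an assumption based on the length that worked
    in the answer above; adjust it to whatever works in your environment.
    """
    return [(name, len(name)) for name in names if len(name) > limit]


# Name taken from the event log in the question (ExeName without ".exe").
candidates = ["MS.Internal.Office.Ocv.Services.CustomerTalkback.Automated"]
flagged = long_names(candidates)
for name, length in flagged:
    print(f"{length:3d}  {name}")
```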
I use an Azure VM. When the VM is restarted, it fails with a "failed to start VM" error and stays in the Updating state. I don't use Azure Recovery Services.
Probably more of a Server Fault question.
A couple of things you can try:
Stop the VM completely so that it reaches the Stopped (Deallocated) state, then boot it back up. If you have trouble stopping it from the portal, use the PowerShell cmdlet.
Resize the VM. This normally moves it to another host, which gets it off the problematic one.
Delete the VM but retain the disks, then create another VM from the disk.
There is also a good chance that all of the above will fail. If you have an RTO, you might want to start preparing a failover or recreating the machine in parallel.
If someone is still seeing the VM stuck in the Updating status, the VM's disk has probably run out of space. I increased the disk size and it booted fine.
We've got a classic VM on Azure. All it does is run SQL Server with a lot of databases (we've got another VM, a web server, which is the web-facing side and accesses the classic SQL VM for data).
The problem is that since yesterday morning we have been experiencing outages every 2-3 hours. There doesn't seem to be any reason for it. We've been working with Azure support, but they still seem to be struggling to work out what the issue is. There doesn't seem to be anything in the event logs that gives us any information.
All that happens is that we receive a Pingdom alert saying the box is down; we then can't remote into it because the connection times out, and all database calls to it fail. Five minutes later it comes back up. It doesn't seem to fully reboot or anything, it just halts.
Any ideas on what could be causing this? Or any places we could look for better information? Or ways to prevent this from happening?
The only thing that seems to be in the event logs that occurs around the same time is a DNS Client Event "Name resolution for the name [DNSName] timed out after none of the configured DNS servers responded."
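To narrow down when each outage starts and ends, it can help to probe the VM from a second machine and compare the timestamps with the event log. A minimal sketch in Python using only the standard library; the function name `probe_tcp` is hypothetical, and the host and port in the usage note are placeholders.

```python
import socket
import time


def probe_tcp(host, port, timeout=3.0):
    """Attempt one TCP connection and return (reachable, seconds_taken).

    A refused connection, a timeout, or a DNS failure all count as
    "not reachable"; the elapsed time helps tell a fast refusal from a hang.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        return False, time.monotonic() - start
```

For example, calling `probe_tcp("my-sql-vm.example.com", 1433)` every 30 seconds and logging the result with a timestamp would show whether the SQL Server port (1433 by default) and the RDP port (3389) go dark at the same moment, which would suggest the whole VM or its network path is affected and not just SQL Server.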
Smartest or quickest recovery:
Did you check SQL Server by connecting from inside the VM using localhost or 127.0.0.1 and the instance name? If you can connect to SQL Server internally without any issue, capture or snapshot the SQL Server VM and create a new VM from the captured image (i.e. without losing any data).
This issue may be caused by one of the following:
Azure Network Firewall
Windows Server Update
This ended up being a fault with the node/sector that our VM was on. I fixed it by increasing the size of our VM instance (from 4 cores to 8 cores). This forced Azure to move it to another node/sector, which resolved the issue.
I use a Microsoft Azure virtual machine with IIS running and hosting several web sites. What happens when I resize the VM from Extra Small to Small? Should I take a backup, or is nothing required? There are some IIS bindings; do I need to move those settings?
I think it is similar to giving a VMware machine more memory. But since this is vital for us, I am asking the question.
The VM will be gracefully shut down and brought back up (most likely on a completely different VM host), so you will see some downtime while that transition occurs. The actual data for your machine is stored in blob storage, not on the VM host itself, so you don't specifically need a backup just because you are making this change. That said, if this is a production machine you need to be thinking about backups anyway.