Azure worker role getting stuck in Role state Unknown - azure

Azure toolkit 1.5
Create New project
Add worker role
Hit F5
The deployments get stuck in:
[fabric] Role Instance: deployment(189).WindowsAzureProject1.WorkerRole1.0
[fabric] Role state Unknown
Eventually the deployment times out.
Any ideas on how to debug this?

I personally solved this problem by removing *:808 binding in IIS Manager for Default Website.

The required Azure assemblies may be missing from the package you are deploying to Azure.
Double check that each Azure assembly your project is referencing has the copy to local property set to true.
The following article may help to debug the problem:
Debugging MSDN article

Got it working, turned out that Windows Process Activation Service wasn't running correctly on my machine. Reinstalled and enabled tcp activation and now its working!

I had the same problem: roles were permanently stuck in Unknown state and never started properly. Turns out that Net.Tcp Port Sharing Service (SMSvcHost.exe) had taken port 808 and this prevented dev fabric from starting the roles. I restarted the service, and now my roles run fine in dev fabric.
So if you run into the same problem, see if port 808 has been taken by some other process.

Andreas,
You're probably missing an assembly reference or have a startup script issue, best way to continue is to try the deployment with intellitrace enabled.

Related

Service Fabric deployment hanging on GetApplicationExistence

I have created a Service Fabric cluster in Azure from scratch, with security provided by an X509 certificate. There is one node type with five nodes, all of which appear healthy when I look in Service Fabric Explorer.
I have a solution in Visual Studio 2017 (v15.3.5) containing a Service Fabric project along with one other project which is a Stateless Service. When I attempt to deploy this, I select the cluster in the publish window, and hit go.
It then sits indefinately displaying "Started executing script 'GetApplicationExistence'" in the Build output window in VS. No amount of waiting will result in either success nor an error message. I can't find anything in the logs in Azure.
Does anyone have any idea what could be going wrong?
Check if your LocalCluster is healthy. There should not be any error/warning. I had warning System/DnsService was not up.
Reset the LocalCluster worked for me.
It turns out that there was something wrong with my development environment; the service fabric installation had got messed up somehow. When I tried with the exact same code from a different machine it all worked properly. I updated the Service Fabric elements of Visual Studio to the latest version and all started working again.

Web deployment task failed. Destination not reachable

I've been successfully using Azure for months. Today I'm getting the following error when I publish from via Web Deploy from Visual Studio 2013
Error 5 Web deployment task failed. (Could not connect to the remote computer ("waws-prod-hk1-001.publish.azurewebsites.windows.net"). On the remote computer, make sure that Web Deploy is installed and that the required process ("Web Management Service") is started. Learn more at: http://go.microsoft.com/fwlink/?LinkId=221672#ERROR_DESTINATION_NOT_REACHABLE.)
I tried downloaded a new Publishing Profile from the Azure Dashboard and turning off my firewall - no change.
I can't ping the listed server.
It seems to be the same issue as http://social.msdn.microsoft.com/Forums/windowsazure/en-US/83d4e635-2851-4526-b21b-31101d00aa86/web-deployment-task-failed-errorcouldnotconnecttoremotesvc?forum=windowsazurewebsitespreview
Any ideas? Thank you.
I had the exact same problem for a couple of days (sometimes I could published and sometimes not).
The problem was my internet configuration (not related to Microsoft or Azure in anyway).
To solve this issue I disconnected my modem (its actually modem+router) for 10 min and reconnect it and it was fixed!
Hope it will help someone someday...
Nevermind. It would appear that Azure was in a 'challenged' state.
It's working fine now..
What worked for me was disabling and re-enabling my network adapter.
The reasoning is, quite possibly naive, that having Fiddler running during a deployment will fail but it also changes the network state and that state continues
to exist even after exiting Fiddler (which disables the proxy).

Your role instances have recycled a number of times during an update or upgrade operation

I am trying to deploy a Cloud Service with 1 Web Role to Azure.
When I do so, I get this message:
Your role instances have recycled a number of times during an update or upgrade operation. This indicates that the new version of your service or the configuration settings you provided when configuring the service prevent the role instances from running. Verify your code does not throw unhandled exceptions and that your configuration settings are correct and then start another update or upgrade operation.
The project runs just fine locally, and I'm having a hard time figuring out how to start debugging this issue. Are there any common problems that cause this message or steps to figure out what is causing it?
See https://learn.microsoft.com/en-us/archive/blogs/kwill/windows-azure-paas-compute-diagnostics-data. This will walk through all of the diagnostic data available as well as how to troubleshoot the most common issues.
We also had this annoying problem and in our case:
We use local storage, but it wasn't defined in service definition (or Worker Role's properties)
Our worker role project has reference to a service project which has reference to data layer project. But, the worker role project doesn't have reference to the data layer project. As soon as we added reference to data layer project in worker role project, it deploys successfully.
Problem #1 can be easily noticed if you first run the project in your local machine. Exception will be thrown.
Problem #2, however, is more difficult, mainly because it runs just fine in local machine. After 5 days of trouble shooting, we finally found the problem. So, check all references and try to add sub-reference projects, those that are referenced by other references.
We had similar problem, and it was due to some DDLs failed to load. (due to different version from the one MS have deployed to the VM)
Try to set CopyLocal to "true" for all the References in the project, and re-deploy.
I would either remote desktop to the cloud instance and review the Windows Event Logs for exceptions or redeploy with IntelliTrace Enabled. If you choose the later, you can download the IntelliTrace logs from Visual Studio and debug
http://msdn.microsoft.com/en-us/library/windowsazure/ff683671.aspx
One way to find out the actual error is to click on the " 1 instance" at the top of Dashboard after trying to deploy your web role. It will tell you the status of the role instance. The status should include more information about the type of error which blocks your deployment.
It depends on what your case is. For me, the status claimed that I had an unhandled Security exception. After some investigation, it turned out that under my role's OnStart(), I tried to create a event source. However, Azure service doesn't have the permission to create an event source.
For more possible issues, check http://blogs.msdn.com/b/kwill/archive/2013/09/06/troubleshooting-scenario-3-role-stuck-in-busy.aspx
For me, the issue was with my SQL Azure DB firewall rules. My Azure SQL Database servers are not set to "Allow Access to Azure services", so I have to explicitly list IPs that are allowed.
I discovered this after wrapping my code in a try/catch that swallowed all exceptions, refactoring my OnStart() and RunAsync() methods, and setting all my references to Copy Local = True. None of that worked, then I saw that I had this line in my RunAsync() method:
log4net.Config.XmlConfigurator.Configure();
I am using the AdoNetAdapter for log4net and connecting to an Azure SQL DB for logging, so that led me to check the firewall rules.
For me, I had some differing version of nuget packages in my various projects. Once I consolidated everything to the same version(s), it worked fine.
With the release of Windows Azure SDK version 2.2 for Visual Studio 2012 and 2013, Now you can Remote Debug Cloud Resources within Visual Studio.
Once your cloud service is published and running live in the cloud, you can simply set a breakpoint in your local source code. This may help you in digging out what's going wrong!

Azure: Worker role looping through "recycling"

I'm currently working on an Azure project that works 100% locally with emulator resources. I'm now trying to deploy a worker role, but I'm running into an issue that I'm not sure how to troubleshoot.
Upon deploying the worker role in my Azure portal, the two instances continually loop through "recycling".
I can try to RDP into the role, but I only have about a minute to look around before the connection closes, I'm assuming due to the recycling.
After some searching it doesn't seem like this is a super common problem. Is there something trivial I'm overlooking that could be causing this issue? How would you go about troubleshooting this? Thank you for your time :)
In case of missing Reference you can troubleshoot this issue by:
Unzip your CSPKG file and then again unzip .CSSX file (just rename CSSX to zip) and match that everything references and static content is all there.. This way you can match what is on VM. Also in 2 minute windows when you RDP, try to look for Application event log for exception and get it because that would be the key to find the root cause.
IF you could see the exception in event log and look for the exception, you sure can find where it was generated. You can also use Intellitrace which might require you to redeploy the app.
Also there are ways were copying WinDBG and locking to the specific process you can debug it. I am not sure how much you would want to try but just copy the WinDBG to VM and use it would be enough (not sure how much experience you have with WinDBG though and how much time you would want to spent.)
Also been pestered by this role recycle issue numerous times. Here is the sequence of steps to debug persistent role recycles:
Debugging Azure Role Recycles
Enable Remote Access to your role - RDP login
Check eventvwr.msc (Windows Logs -> Application, App & Service Logs->Windows Azure)
Review the Azure text file logs across both C:\logs and c:\resources
Review custom logs in the Volume E: or F: for any custom role startup logging
Run AzureTools and attach to startup processes (download WinDBG, use Utils->Attach Debugger, select process - WaWorkerHost/WaIISHost, etc), use G to continue and watch debugger output for assemblies failing to load.
Installing Azure Debugging Tools via Powershell
PS> md c:\tools; Import-Module bitstransfer; Start-BitsTransfer http://dsazure.blob.core.windows.net/azuretools/AzureTools.exe c:\tools\AzureTools.exe; c:\tools\AzureTools.exe
If all items above fail - try using other tools in the AzureTools treasure trove - such as fusion logging, etc, this approach above will work!
WinDBG Sample Output - Failing to Locate Assembly (WaIISHost)
The most likely cause is that you have a missing assembly. One tactic to catch this is to wrap any startup processing in a master try/catch that manual logs the error to Azure storage.
If you added any referrences, check to make sure they're set to copylocal=true and that any external assets that were included in your service package were also set to be included.
From Avkash above:
Yes. this mean some issue in your Worker Role code is causing your Worker Role Host Process to crash.. If you look your fault stack you must see the function or the link from your code which generate this fault. IF you need help open a free Azure Support incident to Windows Azure Support team and they will help you.
Just a suggestion: Also Check the installable(if any)and any other references you use are 64bit.Azure VMs have 64bit OS. Once i was stuck up with this kind of problem due to 32/64 bit issues.
Are your worker roles exiting their work loop? A local recycle is very fast and you might not notice it, but spin-up time in the cloud can be long.
If the issue is caused by a startup batch file, I have stopped the loop by editing the batch file on the instance to include "exit /b 0" at the beginning. This will tell Azure that the startup was successful and you then have all the time you need to diagnose issues without the VM getting killed.

How do I debug a Worker Role using Remote Desktop with Windows Azure?

I now have my Windows Azure environment set up so that I can access my Worker Role with Remote Desktop. However, I'm not sure how to proceed at the moment. After much digging I found a web site that was offline but in Google's cache there was mention of attaching to the Worker Role running in the Azure Cloud from the Visual Studio debugger. But I only have Visual Developer (not studio) 2010 and I have searched all over and as far as I can see there is no such option to attach to a remote server. I am able to publish my project to the Azure Cloud without error and I have a "healthy" instance of my Worker Role showing as active and running.
I did connect with RDP through the Azure Management portal. The login worked fine and up came the remote desktop window. I searched through much of what I could find and was unable to find my Worker Role. I must have the wrong impression of RDP, because I had hoped to see the Worker Role's main display form when I logged in, just like I do when I debug it locally in the Cloud Emulator. But instead all I saw was a blank desktop with some base level server inspection and management routines. I even checked the Event Viewer for Application related messages and saw none.
So now I'm stuck wondering if my Worker Role is actually running or not, despite the seemingly positive status messages from the Management Portal, and I still want to attach to my Worker Role for debugging through Visual Developer, if it's possible, but I am unable to figure out how.
Anyone with experience in this area that can give me some solid tips on what to do next, please respond.
UPDATE: I believe my worker role may be running because I opened a command window and did a Netstat and saw it listening on the correct port. However, that may just be my Worker Role shell class that starts the custom EXE I have it launch as a spawned proces. I still haven't confirmed if my custom EXE is running yet.
UPDATE-2: Just ran TaskList from a command window and the custom EXE is listed.
UPDATE-3: Everything is working as I just ran a remote test of the service so that's not a problem. Still want to know how to attach to the Worker Role from Visual Developer 2010 for remote debugging, and if it's possible to see the custom EXE's display form like I do when doing local debugging in the Cloud Emulator.
-- roschler
There is a set of articles here which goes in length on how to set up for remote debugging in Azure:
http://blogs.u2u.be/peter/post/2011/06/21/Remote-debugging-an-Azure-Worker-role-using-Azure-Connect-Remote-desktop-and-the-remote-debugger.aspx
http://blogs.u2u.be/peter/post/2011/06/24/Remote-debugging-an-Azure-worker-role-using-Azure-Connect-remote-desktop-and-remote-debugger-part-2.aspx
http://blogs.u2u.be/peter/post/2011/06/26/Remote-debugging-a-Windows-Azure-Worker-Role-using-Azure-Connect-Remote-desktop-and-the-remote-debugger-part-3.aspx
The key takeaway is that you don't need to actually install Visual Studio on Azure, you only need to copy the Remote Debugger bits and then use Azure Connect to add your developer machine to the Virtual Network.
You can setup Remote Debugging with Visual Studio 2012
http://code.msdn.microsoft.com/Remote-Debugging-Windows-dedaaec9
When you say:
But instead all I saw was a blank desktop with some base level server inspection and management routines.
this is exactly what you get with an Azure VM. It's a basic OS install, plus the bare minimum of Azure stuff it needs to run and the code you've uploaded. There's no fancy monitoring or health checks available on the machine by default, you're expected to have provided those yourself to have them available without having to RDP into the machine to check on it.
RDP is very good for tracking down certain problems, like checking that a startup task will run, checking which directories items are installed in and just generally being nosey. If you need extra tools to track down a problem, you can just install them while you're connected to the server. For example I have RDPed into a server and installed the Microsoft Debugging Tools, to track down a memory issue.
I suppose you could remote into your VM, install Visual Studio there, and debug the process...
I also suppose it might be possible to enable remote debugging (not sure what's involved there, but such a thing exists, and it works over TCP) and debug from a local instance of Visual Studio.
To my knowledge, neither is commonly done.
Based on other answers, you would be better off writing a log file to a local storage. You can read the file from RDP if you reallyhace to. Keep in mind, debugging on Azure isn't really simple, and rightly so.
What I was thinking though was, maybe you could run the process using the user's credentials. I can't verify at the moment, but you have a better shot of seeing the ui when you rdp.

Resources