What could be causing my WebRole to never start? - azure

I have a web service hosted on azure as a web role.
In this web role I override the Run() method and perform some DB operations in the following order, so that it basically acts as a worker role:
1. Go to blob storage to pull a small set of data.
2. Cache this data for future use.
3. Delete some blobs that are too old.
4. Thread.Sleep(900000);
Basically the operation repeats every 15 minutes in the background.
It runs fine on DevFabric, but when deployed, the Azure portal gets stuck in a loop of "stabilizing role" and then "preparing node", never actually starting up either instance.
I have enabled diagnostics and it isn't showing me anything to suggest there is a problem.
I'm at a loss for why this could be happening.

Sounds like an error is being thrown in OnStart. Can you put a try/catch around the whole function and write the error to the Windows event log? From there you would be able to remote into the instance, open Event Viewer, and investigate the error.
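A minimal sketch of that try/catch idea, written in Python only to keep it short and runnable (the role itself would be C#). The names `guarded` and `risky_startup` are hypothetical, and `sink` stands in for whatever writes to the event log. The point is to wrap the whole startup routine in one handler and record the full stack trace, so there is something to read after remoting into the instance.

```python
import logging
import traceback

log = logging.getLogger("role")


def guarded(step, sink):
    """Run step(); on any exception, pass the full traceback to sink and return False.

    `sink` is a stand-in for the event log: any callable that accepts a string.
    """
    try:
        step()
        return True
    except Exception:
        sink(traceback.format_exc())
        return False


def risky_startup():
    # Hypothetical startup work that fails only with the cloud configuration.
    raise RuntimeError("storage connection string points at development storage")


if __name__ == "__main__":
    logging.basicConfig(level=logging.ERROR)
    ok = guarded(risky_startup, log.error)
    print("started" if ok else "startup failed, see log")
```

Returning a flag instead of swallowing the error silently keeps the failure visible to the caller while still leaving a trace behind.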
Most likely the configuration you deployed to the cloud is different from the one running in the emulator (SQL Azure firewall permissions, pointers to local dev storage instead of ATS, etc.). Also, make sure that your diagnostics configuration points to a real Azure storage account instead of local dev storage.

I would suggest moving this code to Run(). In OnStart(), your role instance isn't yet visible to the load balancer, and since you're introducing a very long (OK, infinite) delay into OnStart(), this is likely why you're seeing the messages about the role trying to stabilize (more info on these messages is here).

I don't typically like to answer my own question when someone else has made an effort to help me; however, I feel that the approach I used to solve this should be documented.
I enabled IntelliTrace when deploying to Azure, and I was able to see all the exceptions being thrown and investigate their causes.
IntelliTrace was critical in solving my deployment issues. I would recommend it to anyone seeing an inconsistency between deploying to DevFabric and deploying to Azure.

Related

Does backing up Azure API Management service take the service down during the backup process at all?

I'm wondering what implications, if any, backing up a production API Management service has on production traffic. The reason I'm asking is that I recently tested running the backup cmdlet
Backup-AzApiManagement ..
and during the process (which took anywhere between 15 and 25 minutes), the main APIM page in the Azure Portal for the service I was backing up stated "Updating service..."
I just want to be sure I understand whether there is any potential downtime that needs to be accounted for during the backup. If so, perhaps I need to run the backup during non-peak hours. I would hate to run the backup operation during peak hours, unaware of potential downtime, only to kick myself later when I find out that the service is expected to be down intermittently during the backup operation.
Any insight is appreciated. Thank you in advance for the help.
There will be no downtime if you stay within the same region. However, if it is a different region and you use the same API Management instance name, there will be downtime.
There will be no downtime. The APIs keep working during the backup. However, the developer portal is locked for editing. (I have tested running queries against the APIs while a backup was in progress.)
The only downside is that APIM also raises an event in the Resource health log. This means that if you have alerts configured for your APIM, there will be an alert every time you run a backup! I am digging for a solution to this, which is why I ended up here.

Microsoft Azure Reports "Your app experienced failure(s) due to a transient storage access issue."

I have an Azure WebApp that continually reports "Your app experienced failure(s) due to a transient storage access issue." The suggested solution is "Explore Local Cache feature for your web app." but my webapp exceeds the maximum storage (3GB) for this option.
The problem mostly occurs between midnight and 6am, when the site is LEAST active, but there seems to be an increasing number of occurrences during the day.
What are the underlying causes of this problem? Is this something to do with my WebApp, or is it the Azure infrastructure? In either case, how do I determine the underlying issue(s) and resolve them?
"Your app experienced failure(s) due to a transient storage access issue."
The Web Apps environment provides diagnostic functionality for logging information from both the web server and the web application. You could try enabling logging and checking the logs that were generated within that period of time.
According to the error, it seems that a temporary issue causes the app failure, and it suggests enabling local cache. You could follow the suggested solution and see whether it helps resolve the issue.
Besides, you could try to scale your web app (which would incur additional charges) and check whether that mitigates the issue.
Updates:
As we know, App Service offers shared, persistent storage for the application. Something may be going wrong when the instances in the farm access that shared storage, which could be the cause of the issue.
To determine the underlying issue, you may try to enable diagnostics logging for your web app. This should provide more information on what is happening at the storage level and what kind of activity is going on.

why worker role instance is busy?

I am learning Azure cloud services.
I deployed the Contoso Ads cloud service (https://azure.microsoft.com/en-us/documentation/articles/cloud-services-dotnet-get-started/)
to a staging slot.
The Ads worker role instance is having problems (busy status).
Any tips on troubleshooting? Clicking on the instance just shows high-level info.
In a non-Azure application, looking at the event log would be useful. Should I follow these instructions: https://www.opsgility.com/blog/2011/09/20/windows-event-logs-with-windows-azure-diagnostics-and-powershell/
Thanks,Peter
You should first check this link to understand the various lifecycle stages a cloud service goes through. My understanding is that something written in the OnStart() method might be the cause, but without any code in the question I can't be sure. Then you can enable diagnostics (logs) using this and this link, so that you can clearly see which line of code has executed and what is keeping the cloud service busy.

Alternate to run window service in Azure cloud

We currently have a Windows service which sends notification emails to users after doing some processing on a database (SQL database). It runs once a day.
We want to move this to the Azure cloud. One option is to put it on an Azure VM as is, but I am looking for a better solution.
I have read about recurring and on-demand WebJobs, but I am not sure whether this is the best solution.
Also, is there any possibility of updating the service's configuration in App.config without redeploying the service code to the cloud? I mean, can we manage the configuration from the Azure portal?
Thanks in advance.
Update 11/4/2016
Since this was written, there are 2 additional features available in Azure that are both excellent choices depending on what functionality you need:
Azure Functions (which was based on the WebJobs described below): Serverless code that can be triggered/invoked in various ways, and has scaling support.
Azure Service Fabric: Microservice platform, with support for actor model, stateful and stateless services.
You've got 3 basic options:
Windows service running on VM
WebJob
Cloud service
There's a lot of information out there on the tradeoffs between these choices, but here's a brief summary.
VM - Advantages: you can move your service basically as it is without having to change much or any of your code. VMs also have the easiest connectivity with other resources in Azure (blob storage, virtual networks, etc.). The disadvantage is you're giving up all of the PaaS advantages and are still stuck managing your own VM infrastructure.
WebJob - Advantages: Multiple invocation options (queues, blobs, manually, queue receive loops, continuous while-loop style, etc), scheduled (would cover your case). Easy to deploy (can go with website, as a console app, automatically through Kudu), has some built in logging in Azure portal - and yes, to answer your question, you can alter the configuration in the portal itself for connection strings and app settings.
Disadvantages - you'll need to update code, you don't have access to underlying resources (if you need that), and more of something to keep in mind than a disadvantage - it uses the same resources as the webapp it's deployed with.
Web Jobs are the newest of the options, but at the same time appear to have active development going on to increase the functionality and usefulness.
Cloud Service - like a managed VM, has some deployment options, access to underlying VM if needed. Would require some code changes from your existing service.
There's nothing you've mentioned in your use case that makes me think a Web Job shouldn't be first thing you try.
(Edit: Troy Hunt has a great and relatively recent blog post illustrating most of the points I've mentioned about Web Jobs above: http://www.troyhunt.com/2015/01/azure-webjobs-are-awesome-and-you.html)
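To illustrate the configuration point: App Service exposes the app settings entered in the portal to the running process as environment variables, so a WebJob can pick up changed values on its next run without a redeploy. A minimal sketch in Python (one of the script types WebJobs accept), assuming hypothetical setting names `NOTIFY_SENDER` and `NOTIFY_BATCH_SIZE` and made-up defaults:

```python
import os


def load_settings(env=None):
    """Read job settings from environment variables, with defaults for local runs.

    NOTIFY_SENDER and NOTIFY_BATCH_SIZE are hypothetical app setting names.
    """
    env = os.environ if env is None else env
    return {
        "sender": env.get("NOTIFY_SENDER", "noreply@example.com"),
        "batch_size": int(env.get("NOTIFY_BATCH_SIZE", "50")),
    }


if __name__ == "__main__":
    # Values set in the portal win over the defaults above.
    print(load_settings())
```

In a .NET WebJob the equivalent would be reading the same keys through the usual configuration API, since portal values override the ones in App.config at runtime.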

Why does Azure give me an intermittent Error 503. The service is unavailable?

I have an Azure service that has been running for a long period of time. It builds a Word or PowerPoint document based on arguments in the request and returns a URI to the built document. This is accessed via a Visualforce page: when you click a button, it calls the service and displays a link to the document that has just been built. Simple.
All of a sudden, I get an apparently random 503 Service Unavailable error. Sometimes I click the button, no problem. Other times a 503 error. Each time the button triggers exactly the same request. Does anyone know why this might be happening?
Apparently doing the same thing over and over again and expecting a different result, is not insanity!
Thanks for taking the time to read this.
Looking at the monitoring on my service told me the processor was never exceeding 6% of usage, so it couldn't be a lack of resource causing these intermittent 503 errors. It's bizarre and I'm afraid I have no explanation for it, but simply redeploying the cloud service to Azure appears to have done the trick. It now works perfectly. The solution has not changed, so I can only imagine that whatever 'reboot' is necessary after deployment, has rectified whatever the problem was. All I can suggest is that you try the same thing if you are getting intermittent 503 errors.
For me the error went away when I set up auto-scaling. I think failover requests were getting routed to my second VM, and the second VM took some time to spin up because it wasn't ready for the activity. Auto-scaling shut down my second VM and the error no longer appears (I'm assuming it will spin up if/when I get enough traffic to use it).
Hope this also helps someone.
I get this error whenever I create an Azure Function with a storage account in the South Central US. If I use a storage account in a different region the function works.
Try a storage account in a different region than the one you are currently using to see if it resolves your issue.
A 503 error simply shows that your application pool was inaccessible. It was intermittent because your application pool was restarting due to a lack of resources (processor, memory, etc.).
Scale up your instance (Cloud Services or VM) to get more resources for the application pool.
