How to tell Azure not to remove a particular server during scale down

I have a .NET app running on Azure App Service.
Autoscale is set up, and sometimes it goes up to 10 instances and then back down to 3.
I have a background task (Hangfire) that runs every hour on one of the instances (I don't know which one; it is random).
Is there a way to tell Azure, during scale down, not to remove the server on which the task is currently executing?

You should never rely on such a thing. Instead, design your background job processors so that they can shut down gracefully.
This is why you should use cancellation tokens in your jobs, and why a job should be able to pick up from where it left off.
Hangfire has its own implementation (IJobCancellationToken). In other cases you can use the .NET CancellationToken.
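
A minimal sketch of that idea, using only the plain .NET CancellationToken. The InMemoryCheckpoint type and the ResumableJob.Run helper are hypothetical names made up for illustration; a real job would keep the checkpoint somewhere durable (a database row, a blob) so that another instance can read it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical checkpoint store; a real job would persist this outside the process.
public sealed class InMemoryCheckpoint
{
    public int NextIndex { get; set; }
}

public static class ResumableJob
{
    // Processes items starting at the saved checkpoint and stops cleanly when
    // cancellation is requested. Returns true if every item was processed.
    public static bool Run(IReadOnlyList<string> items, InMemoryCheckpoint checkpoint,
                           Action<string> processItem, CancellationToken token)
    {
        while (checkpoint.NextIndex < items.Count)
        {
            if (token.IsCancellationRequested)
            {
                return false; // shutting down; progress so far is already recorded
            }
            processItem(items[checkpoint.NextIndex]);
            checkpoint.NextIndex++; // record progress only after the item succeeded
        }
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var items = new[] { "a", "b", "c", "d" };
        var checkpoint = new InMemoryCheckpoint();
        var processed = new List<string>();

        using (var cts = new CancellationTokenSource())
        {
            // Simulate a shutdown signal arriving while the second item is processed.
            bool finished = ResumableJob.Run(items, checkpoint, item =>
            {
                processed.Add(item);
                if (processed.Count == 2) cts.Cancel();
            }, cts.Token);
            Console.WriteLine($"finished={finished}, next index={checkpoint.NextIndex}");
        }

        // A later run (possibly on another instance) resumes from the checkpoint.
        bool resumed = ResumableJob.Run(items, checkpoint, processed.Add, CancellationToken.None);
        Console.WriteLine($"finished={resumed}, processed={string.Join(",", processed)}");
    }
}
```

The token is only checked between items, so each item should be small enough to finish within whatever shutdown grace period the host gives you.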


Azure Autoscaling: Scale down after process ends on instance

I have an azure cloud service which scales instances out and in. This works fine using some app insights metrics to manage the auto-scaling rules.
The issue arises when the service scales in and Azure eliminates hosts. Is there a way for it to scale in an instance only once that instance has finished processing its task?
There is no way to do this automatically. Azure will always scale in the highest-numbered instance.
The ideal solution is to make the work idempotent and chunked so that if an instance that was doing some set of work is interrupted (scaling in, VM reboot, power loss, etc), then another instance can pick up the work where it left off. This lets you recover from a lot of possible scenarios such as power loss, instead of just trying to design something specific for scale in.
Having said that, you can manually create a scaling solution that only removes instances that are not doing work, but doing so will require a fair bit of code on your part. Essentially, you run a signaling mechanism in each instance that lets some external service (a Logic App, a WebJob, or something like that) know when an instance is free or busy, and that external service can delete the free instances using the Delete Role Instances API.
For more discussion on this topic see:
How to Stop single Instance/VM of WebRole/WorkerRole
Azure autoscale scale in kills in use instances
Another solution, although it breaks the assumption that you are using an Azure cloud service: if you use App Service instead, you can set up autoscaling on the App Service plan, which effectively takes care of the instance drops you are experiencing.
This is an infrastructure change, so it is not a two-click fix, but I believe App Service is better suited to many situations, including this one.
You can weigh the pros and cons, but if your product is traffic managed, the switch will not be painful.
Kwill, thanks for the links/information, the top item in the second link was the best compromise.
The processing time was usually under 5 minutes, and the service already re-handled failed processes. So, after some research, it was decided to track when the service was processing a queue item and to use a while loop in the RoleEnvironment.Stopping event to delay restart and scale-in events until the process had a chance to finish.
App Insights was used to log custom events during the stopping event, to measure how often processing completes versus restarts during the delay cycles.
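
For anyone wanting to try the same thing, here is a rough sketch of the busy-tracking part, written without the Azure SDK so it runs on its own. WorkTracker is a hypothetical helper, not part of any library; in a cloud service role the WaitUntilIdle call would sit inside the RoleEnvironment.Stopping handler. The timeout is an assumption on my part: shutdown cannot be delayed forever, so the loop gives up after a fixed period.

```csharp
using System;
using System.Threading;

// Hypothetical helper that tracks whether this instance is in the middle of a queue item.
public sealed class WorkTracker
{
    private int _active; // number of items currently being processed

    public void Begin() => Interlocked.Increment(ref _active);
    public void End() => Interlocked.Decrement(ref _active);
    public bool IsBusy => Volatile.Read(ref _active) > 0;

    // Polls until in-flight work finishes or the timeout elapses.
    // Returns true if the instance became idle in time.
    public bool WaitUntilIdle(TimeSpan timeout, TimeSpan pollInterval)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (IsBusy)
        {
            if (DateTime.UtcNow >= deadline) return false;
            Thread.Sleep(pollInterval);
        }
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var tracker = new WorkTracker();
        tracker.Begin(); // a queue item has been picked up
        var worker = new Thread(() =>
        {
            Thread.Sleep(200); // simulated processing
            tracker.End();
        });
        worker.Start();

        // This is the part that would run when the instance is asked to stop.
        bool drained = tracker.WaitUntilIdle(TimeSpan.FromSeconds(5), TimeSpan.FromMilliseconds(50));
        Console.WriteLine(drained ? "idle, safe to stop" : "timed out, stopping anyway");
        worker.Join();
    }
}
```

Because the wait can time out, the work still has to be safe to re-run, which matches the re-handling of failed processes described above.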

How to host long running process into Azure Cloud?

I have a C# console application which extracts 15GB FireBird database file on a server location to multiple files and loads the data from files to SQLServer database. The console application uses System.Threading.Tasks.Parallel class to perform parallel execution of the dataload from files to sqlserver database.
It is a weekly process and it takes 6 hours to complete.
What is the best option for moving this process (console application) to the Azure cloud: a WebJob, a WorkerRole, or some other cloud service?
How can I reduce the execution time (6 hours) after moving to the cloud?
How do I implement the suggested option? Please provide pointers, code samples, etc.
Your help in detail comments is very much appreciated.
Let me give some thought to this question of yours:
"What is best option to move this (console application) process to azure cloud - WebJob or WorkerRole or Any other cloud service ?"
First, you can achieve the task with either a WebJob or a WorkerRole, but I would suggest going with a WebJob.
The pros of a WebJob are:
Deployment is quicker: you can turn your console app, without any change, into a continuously running WebJob within minutes.
Built-in timer support, whereas with a WorkerRole you need to handle scheduling on your own.
Fault tolerance: when your WebJob fails, there is built-in resume logic.
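You might want to check out Azure Functions. You pay only for the processing time you use. Do check the execution time limit for your hosting plan, though: the Consumption plan enforces a function timeout, so a multi-hour run would have to be split into shorter pieces or hosted on a dedicated plan.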
They can be set up on a schedule or kicked off from other events.
If you are already doing work in parallel you could break out some of the parallel tasks into separate azure functions. Aside from that, how to speed things up would require specific knowledge of what you are trying to accomplish.
In the past, when I've tried to speed up work like this, I would start by writing log messages during processing that contain the current time or the measured duration (using the Stopwatch class), and then find out which areas can be improved. The slowness may also be on the SQL Server side. More investigation would be needed on your part, but the first step is always capturing metrics.
Since Azure Functions can scale out horizontally, you might want to first break the data from the files into smaller chunks and let the functions handle each chunk, then process multiple chunks in parallel. Be sure not to spin up more parallel work than your SQL Server can handle.
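
To make the chunking and measuring concrete, here is a small sketch that stays with the Parallel class the question already uses rather than Azure Functions. ChunkedLoader, the chunk size, and the loadChunk callback (a stand-in for the real bulk insert into SQL Server) are all assumptions for illustration. MaxDegreeOfParallelism is the knob that keeps the load within what the database can take.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ChunkedLoader
{
    // Splits rows into chunks of at most chunkSize items.
    public static List<List<T>> Chunk<T>(IReadOnlyList<T> rows, int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        var chunks = new List<List<T>>();
        for (int start = 0; start < rows.Count; start += chunkSize)
        {
            int end = Math.Min(start + chunkSize, rows.Count);
            var chunk = new List<T>(end - start);
            for (int i = start; i < end; i++) chunk.Add(rows[i]);
            chunks.Add(chunk);
        }
        return chunks;
    }

    // Loads all chunks with at most maxParallel running at once and returns the elapsed time.
    public static TimeSpan LoadAll<T>(List<List<T>> chunks, Action<List<T>> loadChunk, int maxParallel)
    {
        var stopwatch = Stopwatch.StartNew();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        Parallel.ForEach(chunks, options, loadChunk);
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}

public static class Program
{
    public static void Main()
    {
        List<int> rows = Enumerable.Range(1, 10).ToList();
        List<List<int>> chunks = ChunkedLoader.Chunk(rows, 4);

        int loaded = 0;
        // The lambda stands in for a bulk insert; here it only counts rows.
        TimeSpan elapsed = ChunkedLoader.LoadAll(chunks, chunk => Interlocked.Add(ref loaded, chunk.Count), 2);

        Console.WriteLine($"{chunks.Count} chunks, {loaded} rows, {elapsed.TotalMilliseconds} ms");
    }
}
```

Logging the elapsed time per stage (extract, per-chunk load) is what tells you whether the extraction or the SQL Server side is the bottleneck.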

Creating a Web Crawler using Windows Azure

I want to create a Web Crawler, that takes the content of some website and saves it in a blob storage. What is the right way to do that on Azure? Should I start a Worker role, and use the Thread.Sleep method to make it run once a day?
I also wonder, if I use this Worker Role, how would it work if I create two instances of it? I noticed using "Compute Emulator UI" that the command "Trace.WriteLine" works on both instances at the same time, can someone clarify this point.
I created the same crawler using PHP and set a cron job to start the script once a day, but it took 6 hours to grab the whole content, which is why I want to use Azure.
This is the right way to do it: in January 2014 Microsoft introduced Azure WebJobs, which let you create a project (a console app, for example) and run it as a scheduled task (one-time or recurring).
Considering that a worker role is basically Windows Server 2008, you can run the same code you'd run on-premises.
Consider, though, that there are several reasons why a role instance might reboot: OS updates, crash, etc. In these cases, it's possible you'd lose the work being done. So... you can handle this in a few ways:
Queue. Place a message on a command queue. If it's a once-a-day task, you can just push the message on the queue when done processing the previous message. Note that you can put an invisibility timeout on the message, so it doesn't appear for a day. In the event of failure during processing, the message will re-appear on the queue and a different instance can pick it up. You can also modify the message as you go, to keep track of your status.
Scheduler. Just make sure there's only one instance running (by way of a mutex). An easy way to do this is to attempt to obtain a write-lock on a blob (there can only be one).
One thing to consider is breaking up your web crawl into separate tasks (one per URL, perhaps) and placing those individually on the queue. With this, you'd be able to scale, running either multiple instances or, potentially, multiple threads in the same instance (since web crawling is likely to be a blocking operation, rather than a CPU- and bandwidth-intensive one).
A single worker role running once a day is probably the best approach. I would not use Thread.Sleep, though, since you may want to restart the instance, and then, depending on your programming, it may start earlier or later than one day after the previous run. What about putting the task command as a message on the Azure Queue, dequeuing it once it has been picked up by a worker role, and then adding a new task command to the queue once the work is done?
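
Here is a rough sketch of the one-message-per-URL idea with several workers pulling from a shared queue. To keep it runnable without a storage account, an in-memory ConcurrentQueue stands in for the Azure Queue, and fetchAndStore is a hypothetical placeholder for "download the page and save it to blob storage". With a real Azure Queue, a message whose worker dies would reappear after its invisibility timeout; the in-memory stand-in does not model that.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class CrawlQueue
{
    // Runs workerCount workers that each pull one URL at a time until the queue is empty.
    // Returns the URLs that were processed.
    public static async Task<ConcurrentBag<string>> DrainAsync(
        ConcurrentQueue<string> queue, int workerCount, Func<string, Task> fetchAndStore)
    {
        var completed = new ConcurrentBag<string>();
        var workers = Enumerable.Range(0, workerCount).Select(async _ =>
        {
            while (queue.TryDequeue(out string url))
            {
                await fetchAndStore(url);
                completed.Add(url);
            }
        }).ToList();
        await Task.WhenAll(workers);
        return completed;
    }
}

public static class Program
{
    public static async Task Main()
    {
        var queue = new ConcurrentQueue<string>();
        foreach (var url in new[] { "http://example.com/a", "http://example.com/b", "http://example.com/c" })
        {
            queue.Enqueue(url);
        }

        // Task.Delay simulates a download that spends most of its time waiting on the network.
        var done = await CrawlQueue.DrainAsync(queue, 2, url => Task.Delay(50));
        Console.WriteLine($"crawled {done.Count} urls, {queue.Count} left in queue");
    }
}
```

Since the work is mostly waiting on the network, raising workerCount within one instance is usually the cheap first step before adding more instances.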

WF4 Affinity on Windows Azure and other NLB environments

I'm using Windows Azure and WF4, and my workflow service is hosted in a web role (with N instances). My job now is to find out how to do affinity, so that I can send messages to the right workflow instance. To explain the scenario: my workflow (attached) starts with a "StartWorkflow" Receive activity, creates 3 "Person" objects and, in a parallel for-each, waits for confirmation from these 3 people (the "ConfirmCreation" Receive activity).
I then started to search for how affinity is done in other NLB environments (I mainly looked for information about how this works on Windows Server AppFabric), but I didn't find a precise answer. So how is it done in other NLB environments?
My next task is to find out how I could implement a system to handle this affinity on Windows Azure, and how much this solution would cost (in price, time and amount of work), to see whether it's viable or whether it's better to work with only one web role instance while we wait for the WF4 host for Azure AppFabric. The only way I found was to persist the workflow instance. Are there other ways of doing this?
My third, but not last, task is to find out how WF4 handles multiple messages received at the same time. In my scenario, this means how it would behave if the 3 people confirmed at the same time and the confirmation messages were also received at the same time. Since the most logical answer to this problem seems to be a queue, I started looking for information about queues in WF4 and found people talking about MSMQ. But what is the native WF4 message handling system? Is this handler really a queue, or is it another system? How is this concurrency handled?
You shouldn't need any affinity. In fact that's kinda the whole point of durable Workflows. Whilst your workflow is waiting for this confirmation it should be persisted and unloaded from any one server.
As far as persistence goes for Windows Azure you would either need to hack the standard SQL persistence scripts so that they work on SQL Azure or write your own InstanceStore implementation that sits on top of Azure Storage. We have done the latter for a workflow we're running in Azure, but I'm unable to share the code. On a scale of 1 to 10 for effort, I'd rank it around an 8.
As far as multiple messages go, the messages will be received and delivered to the workflow instance one message at a time. Now, it's possible that every one of those messages goes to the same server, or maybe each one goes to a different server. No matter how it happens, the workflow runtime will attempt to load the workflow from the instance store, see that it is currently locked, and block/retry until the workflow becomes available to process the next message. So you don't have to worry about concurrent access to the same workflow instance, as long as you configure everything correctly and the InstanceStore implementation is doing its job.
Here are a few other suggestions:
Make sure you use the PersistBeforeSend option on your SendReply activities.
Configure the following workflow service options:
<workflowIdle timeToUnload="00:00:00" />
<sqlWorkflowInstanceStore ... instanceLockedExceptionAction="AggressiveRetry" />
Using the out of the box SQL instance store with SQL Azure is a bit of a problem at the moment with the Azure 1.3 SDK as each deployment, even if you made 0 code changes, results in a new service deployment meaning that already persisted workflows can't continue. That is a bug that will be solved but a PITA for now.
As Drew said, your workflow instance should just move from server to server as needed; there is no need to pin it to a specific machine. And even if you could, that would hurt scalability and reliability, so it is something to be avoided.
Sending messages through MSMQ using the WCF NetMsmqBinding works just fine. Internally WF uses a completely different mechanism called bookmarks that allow a workflow to stop and resume. Each Receive activity, as well as others like Delay, will create a bookmark and wait for that to be resumed. You can only resume existing bookmarks. Even resuming a bookmark is not a direct action but put into an internal queue, not MSMQ, by the workflow scheduler and executed through a SynchronizationContext. You get no control over the scheduler but you can replace the SynchronizationContext when using the WorkflowApplication and so get some control over how and where activities are executed.

How to create a job in IIS capable of running an extended process

I have a web service app, I have 1 web service call that could take anything from 1 hour to 14 hours, depending on the data that needs to be processed and the time of the month.
Is there any way to create a job in IIS that could be capable of running this extended process. I also need job management and reporting to be able to see if jobs are running, so that new jobs aren't created on top of others.
I will be working with IIS6 primarily. And would like to use C# code.
Right now I am using a web service call, but I don't like the idea of having web services run for such a long time, and due to the nature of the web service, I can't split the functionality any more.
IIS jobs would be awesome if they are available. Any ideas?
If I were you, I would make a command-line app that is kicked off by the web service. Running a command-line app is pretty straightforward, basically:
// requires: using System.Diagnostics;
Process p = new Process();
p.StartInfo.UseShellExecute = false;
p.StartInfo.FileName = "appname.exe";
p.Start(); // launch the job; the web service call can then return right away
There is a limited number of worker processes per machine, and they aren't really meant for long-running jobs.
One possibility, with a bit of setup cost, is to have your processing run as a Windows service that listens to a message queue (MSMQ or similar), and have your web service simply post the request onto the message queue to be handled by the processing service.
Monitoring jobs is more difficult; your web service would need to have a way of querying your processing service to find out its state. This is an IPC (interprocess communication) problem, which has many different solutions with various tradeoffs that depend on your environment and circumstances.
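
As a starting point for the "is a job already running?" part, here is a minimal in-process sketch. JobRegistry and JobState are hypothetical names invented for this example; the registry would live inside the processing service, and how the web service queries it (the IPC part) is deliberately left out.

```csharp
using System;
using System.Collections.Generic;

public enum JobState { Running, Completed, Failed }

// Hypothetical registry kept by the processing service to track job status.
public sealed class JobRegistry
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, JobState> _jobs = new Dictionary<string, JobState>();

    // Marks the job as running. Returns false if a job with this name is already running.
    public bool TryStart(string name)
    {
        lock (_gate)
        {
            JobState state;
            if (_jobs.TryGetValue(name, out state) && state == JobState.Running) return false;
            _jobs[name] = JobState.Running;
            return true;
        }
    }

    // Records the outcome so that the job can be started again later.
    public void Finish(string name, bool succeeded)
    {
        lock (_gate) { _jobs[name] = succeeded ? JobState.Completed : JobState.Failed; }
    }

    // Returns the last known state, or null if the job has never been started.
    public JobState? GetState(string name)
    {
        lock (_gate)
        {
            JobState state;
            return _jobs.TryGetValue(name, out state) ? state : (JobState?)null;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var registry = new JobRegistry();
        Console.WriteLine(registry.TryStart("month-end")); // first request starts the job
        Console.WriteLine(registry.TryStart("month-end")); // second request is rejected while it runs
        registry.Finish("month-end", true);
        Console.WriteLine(registry.GetState("month-end"));
    }
}
```

The state here is lost if the service restarts. For jobs that run up to 14 hours you would probably want to persist it (a database table, for example) so that reporting survives a restart.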
That said, for simple cases, Matt's solution is probably sufficient.
