Windows Azure Inter-Role communication - azure

I want to create an Azure application which does the following:
User is presented with an MVC 4 website (web role) which shows a list of commands.
When the user selects a command, it is broadcast to all worker roles.
Worker roles process the task, store the results and notify the web role.
The web role displays the combined results of the worker roles.
From what I've been reading there seem to be two ways of doing this: the Windows Azure Service Bus or using Queues. Each worker role also stores the results in the database.
The Service Bus seems more appropriate with its publish/subscribe model, so all worker roles would get the same command at roughly the same time. Queues seem easier to use, though.
Can the Service Bus be used locally with the emulator when developing? I am using a free trial and cannot keep the application constantly running whilst still developing. Also, when using queues how can you notify the web role that processing is complete?

I agree. ServiceBus is a better choice for this messaging requirement. You could, with some effort, do the same with queues. But, you'll be writing a lot of code to implement things that the ServiceBus already gives you.
There is no local emulator for ServiceBus like there is for the Azure Storage service (queues/tables/blobs). However, you could still use the ServiceBus for messaging between roles while they are running locally in your development environment.
As for your last question about notifying the web role that processing is complete, there are several ways to go here. Just a few thoughts (not an exhaustive list):
Table storage where the web role can periodically check the status of the unit of work.
Another ServiceBus Queue/topic for completed work.
Internal endpoints. You'll have to have logic to know if it's just an update from worker role N or if it is indicating a completed unit of work for all worker roles.

I agree with Rick's answer, but would also add the following things to think about:
If you choose the Service Bus Topic approach, then as each worker role comes online it would need to create a subscription to the topic. You'll need to think about subscription maintenance: when one of the workers fails and is recycled, or for any number of other reasons, a stale subscription may be left out there.
Telling the web role that all the workers are complete is interesting. The options Rick provides are good ones, but you'll need to think about some things here. The web role needs to know just how many workers are out there, or it needs some other mechanism to decide when all have reported done. You could have the situation where five worker roles receive a message and start working, then one of them starts to repeatedly fail processing. The other four report their completion, but now the web role is waiting on the fifth. How long do you wait for a reply? Can you continue? What if you just told the system to scale down, and while the web role thinks there are 5 workers there are now only 4? These are things you'll need to think about, and they all depend on your requirements.
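To make the "how long do you wait?" question concrete, here is a minimal sketch of the bookkeeping the web role could keep per command. It is only an illustration under my own assumptions: the class name CompletionTracker, the worker ids and the status strings are all hypothetical, and in a real deployment the reports would arrive through table storage, a Service Bus queue or an internal endpoint rather than direct method calls.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class CompletionTracker:
    """Tracks which workers have reported completion for one command."""

    expected: Set[str]   # worker ids the web role believes are out there
    deadline: float      # point in time (seconds) after which we stop waiting
    done: Set[str] = field(default_factory=set)

    def report_done(self, worker_id: str) -> None:
        # Ignore reports from workers we are not waiting on.
        if worker_id in self.expected:
            self.done.add(worker_id)

    def remove_worker(self, worker_id: str) -> None:
        # Call this when an instance is scaled down or removed for good.
        self.expected.discard(worker_id)
        self.done.discard(worker_id)

    def missing(self) -> List[str]:
        return sorted(self.expected - self.done)

    def status(self, now: Optional[float] = None) -> str:
        now = time.monotonic() if now is None else now
        if self.done >= self.expected:
            return "complete"
        if now >= self.deadline:
            return "timed_out"   # show partial results, or retry the missing workers
        return "waiting"
```

Whether "timed_out" means "display what we have" or "fail the whole command" is exactly the requirements decision mentioned above; the sketch only makes the three states explicit.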

Based on your question, you could use either queue service and get good results. But each of them is going to have different challenges to overcome as well as advantages.
Some advantages of Service Bus queues are that they provide blocking receive with a persistent connection (up to 100 connections), they can monitor messages for completion, and they can send larger messages (256KB).
Some advantages of storage queues over the Service Bus solution are that they're slightly faster (if 15 ms matters to you), you can use a single storage system (since you'll probably be using Storage for blob and table services anyway), and auto-scaling is simple. If you need to auto-scale your worker roles based on the load, passing the requests through a storage queue makes auto-scaling trivial: you just set up auto-scaling in the Azure Cloud Service UI under the Scale tab.
A more in-depth comparison of the two azure queue services can be found here: http://msdn.microsoft.com/en-us/library/hh767287.aspx
Also, when using queues how can you notify the web role that processing is complete?
For the Azure Storage Queues solution, I've written a library that can help: https://github.com/brentrossen/AzureDistributedService.
It provides a proxy layer that facilitates RPC style communication from web roles to worker roles and back through Storage Queues.
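For readers who want to see the general idea behind RPC-style communication over queues, here is a minimal in-memory sketch of the request/reply pattern with correlation ids. This is not the API of the library linked above; the function names are hypothetical, and Python's queue.Queue stands in for the two storage queues.

```python
import queue
import time
import uuid


def send_request(requests, payload):
    """Web role side: enqueue a request and return its correlation id."""
    correlation_id = str(uuid.uuid4())
    requests.put({"id": correlation_id, "payload": payload})
    return correlation_id


def worker_step(requests, responses, handler):
    """Worker side: process one pending request; False if the queue is empty."""
    try:
        message = requests.get_nowait()
    except queue.Empty:
        return False
    # The reply carries the same id so the caller can match it to its request.
    responses.put({"id": message["id"], "result": handler(message["payload"])})
    return True


def wait_for_response(responses, correlation_id, timeout=5.0):
    """Web role side: block until the matching reply arrives or time runs out."""
    deadline = time.monotonic() + timeout
    others = []  # replies that belong to other requests
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise TimeoutError("no reply for " + correlation_id)
            try:
                message = responses.get(timeout=remaining)
            except queue.Empty:
                raise TimeoutError("no reply for " + correlation_id)
            if message["id"] == correlation_id:
                return message["result"]
            others.append(message)
    finally:
        # Put back replies we did not consume so their callers can find them.
        for message in others:
            responses.put(message)
```

The correlation id is what lets several web role requests share one response queue; a real implementation would also need to handle poison messages and replies that arrive after the caller gave up.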

Related

Azure Service Bus Queues vs. Topics (Pub/Sub)

Need a bit of architectural guidance. I have a set of stateless services that do various functions. My architecture allows for multiple copies of each service to run at the same time (as they are stateless), allowing me to:
scale up as needed for handling larger workloads
have fault-tolerance (if one instance of a service fails, no problem as there will be others to take on that work).
However, I don't want duplication of work.
If Service A, Instance 1 has already taken Job ABC, I don't want Service A, Instance 2, to take on that same job. So, I could avoid this problem by using Azure Service Bus Queues. Only a single worker would get a particular item from the queue, and the item would only be reassigned to another worker if the first worker didn't mark it as complete within a set time.
So what's an appropriate use-case for Topics (Pub/Sub)? It seems like if I ever have multiple copies of the same service, I must rely on Queues. Is that right?
Asked another way, is there a way to use Topics in Azure Service Bus or similar products/services but avoid duplication of work? Also, if there is a way to lock a message (for a short period of time) when using Topics, is it possible to lock that message to just one instance of Service A (so no other instances of Service A will have access to it) but the message will be broadcast to Service B, Service C, etc.?
is there a way to use Topics in Azure Service Bus or similar
products/services but avoid duplication of work?
Yes, there is. Basically, you would need to use each subscription as a queue. Define filters so that each kind of message is delivered to a single subscription (that way it acts as a queue), and have multiple listeners (service instances in your case) listen to that specific subscription only.
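As a rough illustration of "each subscription acts as a queue", here is a small in-memory model. It is a sketch only, not the Service Bus SDK: the Topic and Subscription classes, the "kind" property and the lambda filters are hypothetical stand-ins for a real topic, its subscriptions and their filter rules.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional

Message = Dict[str, str]


class Subscription:
    """Holds the messages that matched this subscription's filter rule."""

    def __init__(self, rule: Callable[[Message], bool]) -> None:
        self.rule = rule
        self.messages: Deque[Message] = deque()

    def receive(self) -> Optional[Message]:
        # Competing listeners call receive(); each message is handed out once.
        return self.messages.popleft() if self.messages else None


class Topic:
    def __init__(self) -> None:
        self.subscriptions: Dict[str, Subscription] = {}

    def subscribe(self, name: str, rule: Callable[[Message], bool]) -> Subscription:
        self.subscriptions[name] = Subscription(rule)
        return self.subscriptions[name]

    def publish(self, message: Message) -> None:
        # Every subscription whose rule matches gets its own copy.
        for subscription in self.subscriptions.values():
            if subscription.rule(message):
                subscription.messages.append(dict(message))
```

With one subscription per service, all instances of Service A share the "service-a" subscription and compete for its messages, while any other subscription whose filter matches still receives its own copy.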
Also, if there is a way to lock a message (for a short period of time)
when using Topics, is it possible to lock that message to just one
instance of Service A (so no other instances of Service A will have
access to it) but the message will be broadcast to Service B, Service C,
etc.?
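It is certainly possible to lock a message. For that you will need to fetch messages in Peek-Lock mode. However, the lock applies per subscription: if multiple receivers listen on the same subscription, only one of them will be able to lock the message and access it, and for the others the message will be invisible. Within one subscription you can't have a scenario where one receiver acquires the lock and the other receivers still get the message. To have the message reach Service B, Service C, etc. as well, give each service its own subscription, since every subscription receives its own copy of the message.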
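To make the Peek-Lock behaviour concrete, here is a minimal in-memory sketch of the semantics: a received message is hidden from other receivers until it is completed or its lock expires. This is a simplified model, not the Service Bus client; the PeekLockQueue class is hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
from typing import Dict, Optional, Tuple


class PeekLockQueue:
    """Toy model of peek-lock: receive locks a message, complete removes it."""

    def __init__(self, lock_seconds: float) -> None:
        self.lock_seconds = lock_seconds
        self._bodies: Dict[str, str] = {}            # message id -> body, in arrival order
        self._locked_until: Dict[str, float] = {}    # message id -> lock expiry time

    def send(self, message_id: str, body: str) -> None:
        self._bodies[message_id] = body

    def receive(self, now: float) -> Optional[Tuple[str, str]]:
        # Hand out the first message that is not currently locked.
        for message_id, body in self._bodies.items():
            if now >= self._locked_until.get(message_id, float("-inf")):
                self._locked_until[message_id] = now + self.lock_seconds
                return message_id, body
        return None

    def complete(self, message_id: str, now: float) -> bool:
        # Completing only succeeds while the lock is still held.
        held = now < self._locked_until.get(message_id, float("-inf"))
        if message_id in self._bodies and held:
            del self._bodies[message_id]
            del self._locked_until[message_id]
            return True
        return False
```

For example, with a 30 second lock, a message received at time 0 is invisible to a second receiver at time 10, becomes available again at time 31 if it was never completed, and disappears for good once a receiver completes it while holding the lock.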
Azure Functions triggers would provide everything you are looking for out of the box.
If you are not leveraging any advanced queuing features of service bus then I would recommend you look at storage queues to save some money.
If you need service bus then you can use service bus triggers.
Hope that helps.

Architecture recommendation - Azure Webjob

I have a webjob that subscribes to an Azure Service Bus topic. The webjob automates a very important business process. The Service Bus is on the Premium SKU and has Geo-Recovery configured. My question is about the best practice for setting up high availability for my webjob (to ensure that the process always runs). I already have the App Service Plan deployed in two regions, and the webjob is installed in both regions. However, I would like the webjob in the secondary region to run only if the primary region is down, for example temporarily due to an outage. How can this be implemented? If I run both webjobs in parallel, that will create some serious duplication issues. Is there an architectural pattern I can refer to, or any feature within App Service or Azure that I can use to implement this?
With ServiceBus, when you pick up a message it is locked, so it shouldn't be picked up by another process unless the lock expires; once you issue a complete back to Service Bus, the message is removed. In your case, if you are using Peek Lock, you can use it to prevent the same message being picked up by different instances. See docs
You can also make use of sessions, which are available in the premium instance of ServiceBus. This way, you can group messages into sessions, and each service instance handles its own session unless the other instance is not available.
Since a WebJob is associated with an App Service, it really depends on how you have configured this. You already mentioned that the WebJobs are in 2 regions, which means you have App Services running in 2 regions (make sure you have multiple instances running in each region, in different Availability Zones).
Now it comes down to what configuration you have for the standby region. Is it active/passive with hot standby, active/passive with cold standby, or active/active? If your secondary region is active, with at least one instance running, then your webjob there is actually processing messages.
I would recommend reading through these patterns to understand the options:
Standby Regions Configuration, Multi Region Config
Regarding Service Bus: when you are processing a message with Peek-Lock, the message is not visible in the queue, so no other instance will pick it up. If your webjob is not able to process it in time, fails, or crashes, the message becomes visible in the queue again and any other instance can pick it up. So no two instances can hold the same message at the same time.
Better Approach
I would recommend using Azure Functions to process queue messages. They are a serverless offering with a monthly credit of free invocations and are naturally highly available.
You can find out more here:
Azure Function Svc Bus Trigger

Running WebJobs within an Azure Worker Role

I have an AzureWorker that receives SMTP messages from TCP ports and pushes them to queues. Other threads pick up these messages from the queues and process them. Currently, the processing threads have their own queue polling logic: they simply check the queues and increase the wait interval if the queues are empty.
I want to simplify the queue logic and make use of other Webjobs functionalities in this AzureWorker.
Is it possible to start a WebJobs thread in this AzureWorker and let that thread handle the details? Are there any limitations that I need to know?
Azure Worker Roles are a feature of Azure Cloud Services. Azure WebJobs are a feature of Azure App Service. Both are built to provide a similar ability to run background processing tasks within the context of your application. However, since they are features of different Azure services, they can't be run together in the nested fashion you are asking about.
Is it possible to start a WebJobs thread in this AzureWorker and let that thread handle the details?
I agree with Chris Pietschmann: it is not possible to start a WebJobs thread directly in an Azure Worker Role.
Other threads pick up these messages from the queues and process them. Currently, the processing threads have their own queue polling logic: they simply check the queues and increase the wait interval if the queues are empty.
I want to simplify the queue logic and make use of other Webjobs functionalities in this AzureWorker.
If you’d like to complete this task by using WebJobs, you could write a program and run it as a WebJob in your Azure App Service. The WebJobs API also provides a way to start/stop WebJobs dynamically via REST, which you could use to manage your WebJobs from your Worker Role.
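If the polling logic stays inside the Worker Role, the behaviour described in the question (check the queue, increase the wait interval while it is empty) can be isolated in one small loop. The following is a sketch under my own assumptions: poll_with_backoff and its parameters are hypothetical names, queue.Queue stands in for the real queue, and the sleep function is injectable so the loop can be exercised without actually waiting.

```python
import queue
import time


def poll_with_backoff(q, handle, sleep=time.sleep, max_polls=10,
                      minimum=1.0, maximum=60.0, factor=2.0):
    """Poll q up to max_polls times, backing off while it stays empty."""
    wait = minimum
    for _ in range(max_polls):
        try:
            item = q.get_nowait()
        except queue.Empty:
            # Nothing to do: wait, then lengthen the next wait up to the cap.
            sleep(wait)
            wait = min(wait * factor, maximum)
        else:
            handle(item)
            # Work arrived, so go back to polling at the shortest interval.
            wait = minimum
```

Capping the interval keeps the worst-case latency bounded, and resetting it after a message means a burst of mail is drained quickly.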

Migration to Azure Service Fabric - Architectural considerations

We have been on Azure since 2010 and have benefited greatly from its performance and reliability in our application. Azure offers a lot of enterprise-level services, and I think that the new "Azure Service Fabric" is great.
What I cannot understand from reading the documentation is the approach to migrating an "old" Cloud Service to the new Service Fabric. Why do we want to migrate? For horizontal scaling and more reliability.
Currently we have a single-instance cloud service that spins up a lot of subservices. Those subservices are great candidates for microservices. The only problem is that some of these subservices are "runners", i.e. they just cycle through our users database and decide whether an operation (service) has to be run for a particular user or not.
How would you migrate a service like this considering that more than one instance may run this service?
Thanks
The first thing to keep in mind is that once a service is started it keeps running, and its lifecycle and uptime are controlled by Service Fabric (e.g. it will be restarted automatically if it crashes). The second thing to keep in mind is that you will end up with multiple instances of the service running at the same time (on different nodes), so they will end up doing the exact same thing on different nodes of your cluster.
Your first reflex could be to have one stateless service kind/instance per runner "subservice" that keeps running and leverages RunAsync (https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-reliable-services-advanced-usage). Personally, I wouldn't take that approach, since it would require some kind of synchronization between instances to prevent useless concurrency, given that they do the exact same thing independently.
A better approach would be to have your runner services run only once in a while, when requested by the "main" service acting as an orchestrator. You could use a Queue-based approach where the "main" service submits tasks (messages) to be processed by the runners, which listen concurrently on the same Queue, ensuring that at most one service instance completes each task.
For the Queue, think Service Bus or Reliable Concurrent Queue (https://learn.microsoft.com/en-us/dotnet/api/microsoft.servicefabric.data.collections.preview.ireliableconcurrentqueue-1).
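As an illustration of the Queue-based approach, here is a small in-process sketch in which an orchestrator submits tasks and several runners compete for them. It is only a model: the function run_runners and the task names are hypothetical, threads stand in for service instances, and Python's queue.Queue stands in for Service Bus or a Reliable Concurrent Queue, both of which have their own delivery and retry semantics.

```python
import queue
import threading
from typing import List, Tuple


def run_runners(tasks: List[str], runner_count: int) -> List[Tuple[int, str]]:
    """Submit tasks to one shared queue and let runner threads compete for them."""
    work: "queue.Queue[str]" = queue.Queue()
    for task in tasks:              # the "main" service submitting messages
        work.put(task)

    processed: List[Tuple[int, str]] = []
    lock = threading.Lock()

    def runner(runner_id: int) -> None:
        while True:
            try:
                task = work.get_nowait()   # each task is dequeued by one runner only
            except queue.Empty:
                return
            with lock:
                processed.append((runner_id, task))

    threads = [threading.Thread(target=runner, args=(i,)) for i in range(runner_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return processed
```

Because every task is taken off the shared queue exactly once, no synchronization between the runners themselves is needed; which runner handles which task is not deterministic.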

Manage Scaling of Worker Role Programmatically

I'm looking into restructuring the application that I'm working on to find a way that will cut costs and give us a lot more room for scalability. In essence, the application is currently hosted on Azure as one large web app which users can log into, do some computationally expensive work on data stored in memory on the web app, and then eventually log off.
When looking into another way to scale this, one idea was to use Worker Roles. Instead of doing the processing on the web app, which currently requires us to use a fairly expensive pricing tier, we could use Service Bus to pass messages with the relevant data to a Worker Role instance, which would do this processing and send back the results.
The most cost-effective way to do this it seems, would be to create a small instance of a Worker Role for each user that logs on, which would deal exclusively with their requests (using, for example, a queue named after the user's ID) and then be destroyed when the user's session ends.
I have the code to determine when to spin up an instance, how to pass these messages back and forth and when to shut an instance down, but I'm having difficulty finding documentation for any methods or API calls that would allow me to do this easily. The closest I can find for deleting an instance is described here, but I can't find anything for creating them.
What is the best way to spin instances up and down on Azure? What alternatives are available to me? I'm also happy to hear alternative proposals on how to architect this.
The most cost-effective way to do this it seems, would be to create a
small instance of a Worker Role for each user that logs on, which
would deal exclusively with their requests (using, for example, a
queue named after the user's ID) and then be destroyed when the user's
session ends.
I would not recommend this approach. Here are my reasons:
The number of Virtual Machine cores is limited in a subscription. Imagine a scenario where you get 1000s of users logged into your application. Creating 1000s of Worker Role instances would not be allowed by Azure. You would need to get special permission from Microsoft to do that.
Spinning up a VM takes time. When you create a new Worker Role deployment for your user, it is not instantaneous. Depending on the complexity of your role, it may take anywhere from 5 to 10 minutes to start a new Worker Role instance.
It's not an effective approach. Your basic idea of creating a new Worker Role instance when a user logs in is based on the assumption that the user will do some compute-intensive task. What if the user doesn't want to perform this intensive task? (I may be wrong here because I don't know much about your application.) In that case, you have created a VM instance which is of no use. Again, your assumption is that the user will always log out. What if the user simply closes the browser? How will you detect that the user has left your application so that you can terminate the worker role instance you created for that user?
It's not an efficient approach. The whole premise of Cloud Computing is built around shared resources. Having a VM dedicated for a user does not sound like an efficient approach.
Possible Solution
Instead of spinning up new worker role instances, may I suggest you take a look at scaling options. Basically the idea is to start with a shared pool of Worker Role instances. When a user logs in and starts a task, the web role writes a message to a Service Bus queue, which gets dequeued by a worker role instance that does the work and returns the result. Set a maximum number of tasks a worker role can process. If you exceed that count, spin up a new worker role instance. You can take a look at the auto-scaling feature available in the Azure Management Portal or at 3rd party services which can do this scaling for you.
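The "maximum number of tasks per worker role" rule boils down to a small calculation. Here is a sketch of it; the function name, the default bounds and the idea of feeding it the current queue length are my own assumptions, and actually changing the instance count would be done by the auto-scaling feature or a management API, which is not shown.

```python
import math


def desired_instances(pending_tasks: int, max_tasks_per_instance: int,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Return how many worker instances the shared pool should have."""
    if max_tasks_per_instance <= 0:
        raise ValueError("max_tasks_per_instance must be positive")
    # Enough instances so that none exceeds its task limit...
    needed = math.ceil(pending_tasks / max_tasks_per_instance)
    # ...but never below the floor or above the ceiling (e.g. the core quota).
    return max(min_instances, min(needed, max_instances))
```

For example, with a limit of 5 tasks per instance, 12 pending tasks call for 3 instances, and an empty queue falls back to the minimum of 1.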
Using dedicated instances for each user is not a good idea. Utilization will be low, costs will be high and each subscription has a cap of 20 CPU cores by default, so you'll have to first ask support to increase the quota.
A better approach would be to combine the web and worker role into one; once more load comes in, you just scale it out. You can still use whatever is convenient for you to store the user requests: a queue or whatever else. So the IIS part of the role will push data there, the "infinite loop" part (the role entry point Run()) will process the data and store the results, and then the web server will fetch the results and feed them back to users.

Resources