Azure Availability Set vs Affinity Group

I'm a little bit confused about when to use Azure Availability Set and when to use Azure Affinity Group.

Let's briefly look at the key purpose of an Availability Set and an Affinity Group to begin with.
Availability Set: predominantly provides high availability for your deployment. Azure does this via fault domains and upgrade domains.
Fault domain: essentially a different hardware rack in the same datacenter. The solution is deployed across two different hardware racks.
Upgrade domain: functionally similar to a fault domain, but it protects against planned upgrades rather than hardware failures. An upgrade domain is a logical unit of instance separation that determines which instances in a particular service are upgraded at a given point in time.
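As a rough illustration with today's (Resource Manager) tooling, here is a minimal Azure CLI sketch of requesting explicit fault and update domain counts; the group, name and location are placeholders, and the counts actually available depend on the region:

# Create a resource group to hold the availability set (names are illustrative)
az group create --name myResourceGroup --location eastus

# Availability set spread across 2 fault domains and 5 update (upgrade) domains
az vm availability-set create \
  --resource-group myResourceGroup \
  --name myAvailabilitySet \
  --platform-fault-domain-count 2 \
  --platform-update-domain-count 5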
Affinity Group: To explain it, we need to take a peek into an Azure data center. Windows Azure data centers are purpose-built; you might see rows and rows of containers (something like shipping containers) that hold clusters and racks. Each of those containers hosts specific services, for example Compute and Storage, SQL Azure, Service Bus, Access Control Service, and so on. Those containers are spread across the data center.
When you deploy a service using the Portal or PowerShell, the request goes to RDFE (Red Dog Front End). RDFE controls the data center and its nodes, and each cluster of nodes is controlled by a Fabric Controller. When you specify an Affinity Group, the Fabric Controller places all the required elements of a deployment close together. This has a number of advantages, such as reduced latency (since the required elements are near each other) and simpler networking.
There are recent changes related to network affinity groups; you can refer to them here: https://azure.microsoft.com/en-us/documentation/articles/virtual-networks-migrate-to-regional-vnet/
To address your question:
You would use an Availability Set when you want a highly available system and also want an SLA for compute. Without an Availability Set there is no SLA for your VMs or PaaS instances; in other words, single instances of a VM (IaaS) or a PaaS role have no SLA and are prone to downtime during hardware failures and OS upgrades.
An Availability Set can also be implemented after the deployment. Do note there is a cost associated with an Availability Set: since you are running additional instances, those instances will be charged.
An Affinity Group has to be included at the time the service is created; it cannot be added after creation, so it is very important to specify it up front. There is no additional charge for including an Affinity Group.
Do share your feedback if the response addresses your question.

Related

asp.net core application redundancy

I need an ASP.NET Core application in Azure to have redundancy: if one instance fails, another should take over its work and stay online. I didn't find anything that I can use as a guide. Thanks for your help.
Azure VMs HA options:
Use an Availability Set: An availability set is a logical grouping of VMs that allows Azure to understand how your application is built in order to provide redundancy and availability. (SLA 99.95%)
Scale Sets: Azure virtual machine scale sets let you create and manage a group of load-balanced VMs. The number of VM instances can automatically increase or decrease in response to demand or a defined schedule. Scale sets provide high availability to your applications, and allow you to centrally manage, configure, and update many VMs (see the sketch below).
Load Balancing
Also follow this decision tree as a starting point to choose whatever fits your needs.
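For the scale set option, a minimal Azure CLI sketch looks roughly like this; the names, image alias and SKU are placeholders, and depending on CLI version and orchestration mode the command also puts the instances behind an Azure load balancer by default:

# Create a small, load-balanced scale set with 2 instances
az vmss create \
  --resource-group myResourceGroup \
  --name myScaleSet \
  --image Ubuntu2204 \
  --vm-sku Standard_B2ms \
  --instance-count 2 \
  --upgrade-policy-mode automatic \
  --admin-username azureuser \
  --generate-ssh-keys

The ASP.NET Core app would then be deployed onto each instance (for example via a custom script extension or a pre-built image), and the load balancer spreads traffic across whichever instances are healthy.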

What's the difference between primary and non-primary nodes in Azure Service Fabric?

I can't find any specific documentation that says what's the difference between primary node and a non-primary node, and how are they being used. Can somebody shed light on it? Thanks.
If you compare Service Fabric to other Orchestration Tools like Kubernetes, you will notice a small difference on how clusters are defined.
Kubernetes uses the concept of a Master to host cluster management services and Minions (nodes) to host your application services (containers). Until version 1.1 it was not possible to run containers on the masters, the idea being that masters should be isolated so that containers running on them cannot interfere with them by consuming too much memory, disk, CPU, and so on.
On Service Fabric this is a bit different. When you define a NodeType as Primary, it means that inside the cluster this NodeType will be responsible for hosting the Service Fabric management services (the services needed to control cluster health, orchestration and so on).
When you deploy a cluster via the Azure Portal, depending on the durability tier you choose (Bronze, Silver, Gold), it will require a certain number of nodes on the Primary NodeType to keep the cluster management healthy. For production workloads, 5 nodes is the minimum recommended size for the Primary NodeType (or for a non-primary NodeType that hosts stateful workloads). The minimum supported VM SKU is Standard D1 or Standard D1_V2.
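As a hedged sketch of what that production minimum looks like from the command line: the az sf cluster create parameters shown here are assumptions based on the CLI's Service Fabric command group, the certificate handling is heavily simplified, and all names and values are placeholders:

# 5-node cluster, so the primary node type meets the production minimum described above
az sf cluster create \
  --resource-group mySfResourceGroup \
  --location eastus \
  --cluster-name mysfcluster \
  --cluster-size 5 \
  --vm-sku Standard_D2_v2 \
  --vm-user-name azureadmin \
  --vm-password 'ReplaceWith@Strong1Password' \
  --certificate-subject-name mysfcluster.eastus.cloudapp.azure.com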
There is a catch for the Primary NodeType: changing the VMSS SKU (size) is not supported. You can do it at your own risk, but it is a recipe for disaster, because the risk of losing the management services is too high.
For non-primary NodeTypes there is no overall difference other than what is mentioned above. Every NodeType gets a VMSS and a load balancer (with a domain name) on which you can configure access rules, and every NodeType is limited to 100 nodes.
Compared to Kubernetes, SF does not add any constraints to prevent you from deploying your services alongside the management services on primary nodes. Every node is part of the pool of resources (including the primary ones), so the default behaviour is to deploy applications on every available node no matter the NodeType.
When you plan bigger clusters (100+ nodes), it is important to take that into account: isolate your Primary NodeType from your workloads to take the pressure off the nodes running the management services.
Having multiple node types can be useful in these situations:
You want to run services exposed to the internet & services not exposed. The first set would run on a node type (VMSS) attached to the Load Balancer and the second on a scale set that isn't.
You need to run services for certain customers on premium hardware and trials on cheaper hardware. The first set would run on nodes with lots of CPU and lots of RAM, the second on lower SKUs.
You want to build a cluster that exceeds the max node count that one VMSS can hold.
Or you need to add scale sets on the fly, to support huge growth.
And: The primary nodes run your system services, the secondaries don't.
There is not much of a difference. Nodes of different node types all share the same characteristics of a Service Fabric Cluster. They all participate in load balancing etc.
Except for one thing: system services run on the nodes of the primary node type only (source):
Primary node type is where the system services run, so the VM SKU you choose for it, must take into account the overall peak load you plan to place into the cluster. Here is an analogy to illustrate what I mean here - Think of the primary node type as your "Lungs", it is what provides oxygen to your brain, and so if the brain does not get enough oxygen, your body suffers.
An important purpose of node types is to constrain service placement to specific node types. For example, you can have several node types, one using VMs with higher CPU capacity and one focused on the amount of memory. Then you can place memory-hungry services on one node type and CPU-intensive services on the other.

Azure - Linux Standard B2ms - Turned off automatically?

I have an Azure Linux Standard B2ms virtual machine. I have disabled the auto-shutdown feature you see in the dashboard under Operations. For some reason this server was still shut down after running for about 8 days.
What reasons are there that could shut down this server if I haven't changed anything on it in the last three days?
What reasons are there that could shut down this server if I haven't changed anything on it in the last three days?
There are many reasons that could shut down this VM; we should try to find some logs about it.
First, check Azure Alerts via the Azure portal and try to find logs about your VM.
Second, check this VM's performance; high CPU or memory usage may be involved, and we can find logs in /var/log/*.
Also check whether there is an Azure service issue: go to Service Health -> Health history to see whether there are issues in your region.
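If you have the Azure CLI available, the subscription activity log is another quick place to look for platform-initiated events (redeploys, deallocations) around the time the VM stopped; the resource group name below is a placeholder:

# Show the last week of activity-log entries for the VM's resource group
az monitor activity-log list \
  --resource-group myResourceGroup \
  --offset 7d \
  --query "[].{time:eventTimestamp, operation:operationName.value, status:status.value}" \
  --output table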
By the way, with just one VM in Azure we can't avoid a single point of failure. Microsoft recommends that two or more VMs be created within an availability set to provide a highly available application and to meet the 99.95% Azure SLA.
An availability set is composed of two additional groupings that protect against hardware failures and allow updates to safely be applied - fault domains (FDs) and update domains (UDs).
Fault domains:
A fault domain is a logical group of underlying hardware that share a common power source and network switch, similar to a rack within an on-premises datacenter. As you create VMs within an availability set, the Azure platform automatically distributes your VMs across these fault domains. This approach limits the impact of potential physical hardware failures, network outages, or power interruptions.
Update domains:
An update domain is a logical group of underlying hardware that can undergo maintenance or be rebooted at the same time. As you create VMs within an availability set, the Azure platform automatically distributes your VMs across these update domains. This approach ensures that at least one instance of your application always remains running as the Azure platform undergoes periodic maintenance. The order of update domains being rebooted may not proceed sequentially during planned maintenance, but only one update domain is rebooted at a time.
In your scenario, there may have been an unplanned maintenance event: when Microsoft updates the VM host, your VM is shut down and migrated to another host.
To achieve high availability, we should create at least two VMs in one availability set, along these lines:
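A minimal Azure CLI sketch (placeholder names and image alias; assumes the availability set already exists in the resource group):

# Two identical VMs in the same availability set, which is what the 99.95% SLA requires
for i in 1 2; do
  az vm create \
    --resource-group myResourceGroup \
    --name "myVM$i" \
    --availability-set myAvailabilitySet \
    --image Ubuntu2204 \
    --size Standard_B2ms \
    --admin-username azureuser \
    --generate-ssh-keys
done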

How to autoscale virtual machines(IaaS approach) in azure

How do you autoscale virtual machines (the IaaS approach) in Azure, instead of web/worker role autoscaling?
You can now autoscale virtual machines in Azure directly in the Azure Management Portal. ScottGu has a post about it on his blog.
The important thing about autoscaling VMs is that you must proactively provision the maximum number of VMs you think you'll need to handle your peak capacity, and add them all to the same availability set.
For example, if on the busiest day of the week it takes 6 machines to handle all of your traffic, then you need to create 6 instances, install your application on each, configure them to handle traffic, and so on, and then add them to an availability set together.
Once you've done this, you can navigate to the Cloud Service that contains all of your virtual machines and click on the Scale tab. You should see a list of your availability sets, and it should tell you the number of machines you can scale over. Choose a metric (either CPU or Queue today) and then the range of machines you want to scale between. You can scale between 1 and the total number of machines.
When load is low, Azure will turn machines off (so you don't have to pay for them), and when load is high, Azure will turn those machines back on.
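For what it's worth, the newer equivalent of that portal flow is a VM scale set with autoscale rules rather than pre-provisioned VMs in an availability set; a rough Azure CLI sketch (placeholder names, arbitrary thresholds) looks like this:

# Autoscale profile for an existing scale set: keep between 2 and 6 instances, start at 2
az monitor autoscale create \
  --resource-group myResourceGroup \
  --resource myScaleSet \
  --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --name cpuAutoscale \
  --min-count 2 --max-count 6 --count 2

# Add one instance when average CPU is above 70% for 5 minutes, remove one when below 30%
az monitor autoscale rule create --resource-group myResourceGroup --autoscale-name cpuAutoscale \
  --condition "Percentage CPU > 70 avg 5m" --scale out 1
az monitor autoscale rule create --resource-group myResourceGroup --autoscale-name cpuAutoscale \
  --condition "Percentage CPU < 30 avg 5m" --scale in 1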
Auto-scaling at the IaaS level doesn't really make sense on its own. Even if Azure could detect high CPU usage and start a new VM based on it, what then? You still need to install your application on that VM automatically somehow.
What you are looking for is something that runs your app on Azure and installs new instances on new VMs if necessary. That "something" is called a PaaS enabler. Basically it is another abstraction level between your app and the Azure IaaS.
There are a couple of them out there:
Cloudify, Cloud Foundry, Juju
As far as I know, the only one that supports Azure is Cloudify. You can check out how to configure Azure using Cloudify here: Configuring Azure
You can also check out the community - Cloudify Forum - or post questions here for assistance.
Disclaimer: I work for Gigaspaces, developing the Cloudify product line.
According to this post, it's possible to scale out IaaS with availability sets by pre-provisioning the number of boxes: https://blogs.msdn.microsoft.com/kaevans/2015/02/20/autoscaling-azurevirtual-machines/

Proper setup for high-availability Azure VMs

I would like to achieve a high-availability scenario on two VMs in Azure.
I understand and can follow the directions here:
https://www.windowsazure.com/en-us/manage/windows/common-tasks/manage-vm-availability/
However, my question is this: are the two VMs supposed to be exact replicas of each other, so that when one goes down, the other takes over? Or does the Availability Set look after this, so that the two VMs can have totally different content and still utilise each others' free resources?
If you're working with Virtual Machines (currently in Preview), then each VM lives in its own VHD. You can make additional instances by creating a VM from an image you build, but at that point, the new VM lives in its own VHD and the actual disk image will then deviate from any other instance as time goes forward. Of course, if each VM is created from the same image, with the same initialization tasks, etc., then they'd have the same software as well. You'd be responsible for upgrading software versions on all the VMs. If you then put these multiple Virtual Machines in an Availability Set, you'd be assured that the Host OS (underlying OS at machine-level) for the VMs you have would not be updated at the same time. You'd also know that different VMs in the Availability Set would be situated in different racks, network segments, etc.
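If you take the "build an image and create additional instances from it" route described above, a hedged Azure CLI sketch of that flow looks roughly like this; it assumes the source VM has already been generalized inside the guest (sysprep on Windows, waagent -deprovision on Linux), and all names are placeholders:

# Capture a reusable image from a generalized, deallocated VM
az vm deallocate --resource-group myResourceGroup --name mySourceVM
az vm generalize --resource-group myResourceGroup --name mySourceVM
az image create --resource-group myResourceGroup --name myAppImage --source mySourceVM

# Create a replica VM from that image and place it in an availability set
az vm create \
  --resource-group myResourceGroup \
  --name myReplicaVM \
  --image myAppImage \
  --availability-set myAvailabilitySet \
  --admin-username azureuser \
  --generate-ssh-keys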
More on Availability Sets: Within an Availability Set, you may have any variety of Virtual Machines - Linux, Windows, different functionality. And... you may define more than one Availability Set.
In the PaaS world, where you set up a Cloud Service with Web and/or Worker roles, those VMs are spawned the exact same way. So adding instances means adding more of the equivalent VMs. If the disk crashed, a new VM would be created just like the others. There are no persistent changes to those OS disks. In the case of Cloud Services, there's fault domains and upgrade domains, which are very similar to availability sets.
