Azure VM - Update Domain and Fault Domain in Availability set - azure

I am reading about availability sets in Azure Virtual Machines, and they seem a bit confusing to me.
I have a few questions and would appreciate it if someone could answer them.
The Microsoft documentation says that two or more machines in an availability set give 99.95% availability. If this is the case, why is there a maximum of 3 fault domains and 20 update domains? If I choose the maximum of both, would I get more availability than 99.95%? If not, what is the purpose of having more than 2 update or fault domains?
If I have 3 fault domains and 20 update domains, how many physical machines will be created: 20, i.e. max(update_domain, fault_domain), or 23, i.e. update_domain + fault_domain?
Is it possible to have fewer update domains than fault domains, e.g. 2?

No, you would not get a higher SLA, but in theory the more FDs/UDs you have, the more reliable your VMs are (and it's free of charge, so there is no point in having fewer, tbh).
Physical machines are not going to be created (how??), but your VMs would be split across 20 hypervisors, and those hypervisors would be split between 3 fault domains (racks).
I don't see why not, but I don't see why you would want to do that.

Related

Do you know of a good explanation for fault domains and update domains?

I'm new to Azure and have been struggling with two concepts: update domains and fault domains. I'm having a harder time understanding update domains.
So as I understand it, having 3 VMs in 3 fault domains would be essentially having those VMs spread out to three racks? Is that correct?
Like this:
Fault domain 1 | Fault domain 2 | Fault domain 3
VM 1           | VM 2           | VM 3
If that is wrong, please correct me. So then what is an update domain? A lot of the documentation I have seen shows a demonstration for the fault domain similar to the table above and will describe what kind of sounds like the fault domain.
If you have a link to a good explanation that would be a big help or if you think you could dumb it down for me a bit, that would work too.
Each virtual machine in your availability set has an update domain and fault domain assigned.
Fault domains indicate the group of virtual machines that share a common power source and network switch, limiting the impact of potential physical hardware failures, network outages, or power interruptions.
Update domains indicate the group of virtual machines and underlying physical hardware that can be rebooted at the same time, ensuring that some virtual machines remain available during planned maintenance.
Link: https://learn.microsoft.com/bs-latn-ba/azure/virtual-machines/availability-set-overview#how-do-availability-sets-work

Availability set azure fault domain & update domain

Q. I have 2 servers, so I'll have 2 FDs (FD0, FD1) and 2 UDs (UD0, UD1). What happens if UD0 is down and FD1 goes down at the same time for some reason?
If I correlate the question with the diagram in Ashok's answer, there are two scenarios:
1) An update domain is down only while an update is in progress (planned or unplanned). So if FD1 goes down, no update will happen in UD0, because there are no other servers to take the load. UD0 will have to wait until FD1 comes back online before it is updated.
2) If an update is in progress in UD1 or UD2, UD0 will definitely be running and handling the traffic. If FD0 goes down at that time, your app will be down. To overcome this scenario you should have 3 FDs.
Very simple: both of your servers would be out.
This isn't even specific to Azure. If you have 2 machines hosted in two locations by 2 different providers, and the first is down for maintenance when the second one crashes, you'll end up with everything down. So fault domains and update domains will not protect you from a full outage in such an event.
This is how FDs and UDs are useful in the case of two machines:
Having each machine in its own FD and its own UD allows you to avoid a full outage in the event of an unexpected outage in one FD and avoid full outage in the event of an update
Having both machines in the same FD but in different UDs allows you to avoid full outage during update operations, but does not prevent full outage in the event of an unexpected FD outage
Having both machines in the same UD, but in different FDs (yes it's possible) allows you to avoid full outage in the event of an unexpected outage in one FD, but you'll have full outage for each update operation
Having both machines in the same FD and in the same UD would not protect you from anything, you'll have a full outage for both unexpected FD outages and update outages
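The four placements above can be checked with a small sketch. This is only an illustration in Python, not an Azure API: the `survivors` helper and the `(fault_domain, update_domain)` tuple layout are assumptions made up for this example.

```python
def survivors(placements, failed_fd=None, updating_ud=None):
    """Return indices of VMs still up when one fault domain has failed
    and/or one update domain is rebooting.

    placements: list of (fault_domain, update_domain) tuples, one per VM.
    """
    return [
        i for i, (fd, ud) in enumerate(placements)
        if fd != failed_fd and ud != updating_ud
    ]

# Hypothetical layouts for two VMs, matching the four cases above.
layouts = {
    "own FD, own UD": [(0, 0), (1, 1)],
    "same FD, different UDs": [(0, 0), (0, 1)],
    "different FDs, same UD": [(0, 0), (1, 0)],
    "same FD, same UD": [(0, 0), (0, 0)],
}

for name, vms in layouts.items():
    # A layout survives an event type if at least one VM stays up
    # no matter which single FD fails (or which single UD reboots).
    fd_ok = all(survivors(vms, failed_fd=fd) for fd in {f for f, _ in vms})
    ud_ok = all(survivors(vms, updating_ud=ud) for ud in {u for _, u in vms})
    print(f"{name}: survives FD outage={fd_ok}, survives update={ud_ok}")

# The case from the question: UD0 is updating while FD1 fails.
print(survivors(layouts["own FD, own UD"], failed_fd=1, updating_ud=0))
```

The last line prints an empty list: with one VM rebooting for an update and the other lost to a fault domain outage, nothing is left running, which is the "both servers out" outcome described in this answer.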
For all Virtual Machines that have two or more instances deployed in the same Availability Set, Microsoft guarantees you will have Virtual Machine Connectivity to at least one instance at least 99.95% of the time.
For any Single Instance Virtual Machine using premium storage for all Operating System Disks and Data Disks, Microsoft guarantees you will have Virtual Machine Connectivity of at least 99.9%.
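To put the two percentages in perspective, here is a rough conversion to allowed downtime. It is a sketch that assumes a 30-day measurement period. The actual SLA terms define how the period and downtime are measured, and the function name is made up for this example.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime in the period that still meet the given SLA.

    Assumes a plain `days`-long period; the real SLA text defines the
    exact measurement window.
    """
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100

for sla in (99.95, 99.9):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% over 30 days -> about {minutes:.1f} minutes of downtime")
```

Under that assumption, 99.95% works out to roughly 21.6 minutes per 30 days and 99.9% to roughly 43.2 minutes.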
Each virtual machine in your availability set is assigned an update domain and a fault domain by the underlying Azure platform. For a given availability set, five non-user-configurable update domains are assigned by default (Resource Manager deployments can then be increased to provide up to 20 update domains) to indicate groups of virtual machines and underlying physical hardware that can be rebooted at the same time. When more than five virtual machines are configured within a single availability set, the sixth virtual machine is placed into the same update domain as the first virtual machine, the seventh in the same update domain as the second virtual machine, and so on. The order of update domains being rebooted may not proceed sequentially during planned maintenance, but only one update domain is rebooted at a time. A rebooted update domain is given 30 minutes to recover before maintenance is initiated on a different update domain.
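The wrap-around described above (the sixth VM shares an update domain with the first, the seventh with the second) behaves like a modulo over the number of update domains. The sketch below only models that description. It is not how you query Azure, and the function name and zero-based numbering are assumptions for this example.

```python
def update_domain_for(vm_index, update_domain_count=5):
    """Zero-based update domain for the VM at zero-based position vm_index,
    assuming VMs are assigned to update domains in round-robin order."""
    return vm_index % update_domain_count

for n in range(7):
    print(f"VM {n + 1} -> UD {update_domain_for(n)}")
```

With the default of five update domains, VM 6 lands in UD 0 alongside VM 1, and VM 7 lands in UD 1 alongside VM 2. Passing `update_domain_count=20` models the Resource Manager maximum mentioned above.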
Fault domains define the group of virtual machines that share a common power source and network switch. By default, the virtual machines configured within your availability set are separated across up to three fault domains for Resource Manager deployments (two fault domains for Classic). While placing your virtual machines into an availability set does not protect your application from operating system or application-specific failures, it does limit the impact of potential physical hardware failures, network outages, or power interruptions.
Here is an article which helps you to understand Fault Domains and Update Domains

Reduce costs of Azure availability set

I am planning on running SharePoint Foundation on one VM of size A3 and SQL Server on another of size A6. As far as I understand, this is not enough to qualify for the SLA, and I should use 2 more instances, one for SharePoint and one for SQL Server, configured in 2 separate availability sets.
Can I use scaling (by CPU usage) to turn off one instance and leave only one running at a time in an availability set? This would reduce the costs, but I wonder if this solution will be good enough to qualify for Azure's SLA. The way I see it, one instance is running at a time while the other one is shut down, so I am billed for one instance. When there is an update or failure, the instance that had been running until then is shut down and the other one comes online. Is this the way it works? Can I cut the costs of availability sets like this?
No, the SLA requires two running instances. However, if you want to control your costs, the approach you describe will work. Just keep in mind that the duration of a disruption will depend on how quickly you detect that the primary VM has failed and how fast you can start the secondary VM. And depending on the nature of the service disruption, it may not be possible for you to start the secondary at all. So it's a risk.

Azure changing hardware

I have a product which uses CPU ID, network MAC, and disk volume serial numbers for validation. Basically when my product is first installed these values are recorded and then when the app is loaded up, these current values are compared against the old ones.
Something very mysterious happened recently. Inside of an Azure VM that had not been restarted in weeks, my app failed to load because some of these values were different. Unfortunately the person who caught the error deleted the VM before it was brought to my attention.
My question is, when an Azure VM is running, what hardware resources may change? Is that even possible?
Thanks!
Answering this requires a short rundown of how Azure works.
In each data centre there are thousands of individual machines. Each machine runs a hypervisor which allows a number of operating systems to share the same underlying hardware.
When you start a role, Azure looks for available resources (disk space, CPU, RAM, etc.) and boots up a copy of the appropriate OS VM on those available resources. I understand from your question that this is a VM role, so this VM is the one you uploaded or created.
As long as your VM is running, the underlying virtual resources provided by the hypervisor are not likely to change. (The caveat to this is that Windows Server 2012's hypervisor can move virtual machines around over the network even while they are running. Whether Azure takes advantage of this, I don't know.)
Now, Azure keeps charging you even when your role has stopped, because it considers your role "deployed". So in theory, those underlying resources still "belong" to your role.
This is not guaranteed. Azure could decide to boot up your VM on a different set of virtualized hardware for any number of reasons, with hardware failure at the top of the list and insufficient capacity second.
It is even possible (though unlikely) for your resources to be provided by different hardware nodes.
An additional point of consideration is that it is Azure policy that disaster recovery (or other major event) may include transferring your roles to run in a separate data centre entirely.
My point is that the underlying hardware is virtual and treating it otherwise is most unwise. Roles are at the mercy of the Azure Management Routines, and we can't predict in advance what decisions they may make.
So the answer to your question is that ALL of the underlying resources may change. And it is very, very possible.
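For reference, the kind of check described in the question (record a few hardware values at install time, compare them at startup) can be sketched as below. The field names and values are hypothetical, and reading the real CPU ID, MAC and volume serial is left out because it is platform-specific. The sketch only shows the comparison step and which fields it would flag.

```python
def changed_fields(recorded, current):
    """Return the sorted names of fingerprint fields whose values differ.

    Both arguments are plain dicts mapping a field name to its value;
    a field present in only one of them counts as changed.
    """
    keys = set(recorded) | set(current)
    return sorted(k for k in keys if recorded.get(k) != current.get(k))

# Hypothetical values; a real product would read these from the OS.
recorded = {"cpu_id": "A1", "mac": "00:0d:3a:00:00:01", "volume_serial": "1234-ABCD"}
current = {"cpu_id": "A1", "mac": "00:0d:3a:00:00:02", "volume_serial": "1234-ABCD"}

print(changed_fields(recorded, current))
```

Given the points above, any of these fields may legitimately differ after the platform moves a VM, so a strict all-fields-equal check is likely to produce failures like the one reported.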

Windows Azure, MSDN offer, 750 small compute hours

I'm an MSDN subscriber and I'm looking at Azure as a possible platform for a new website that will test the waters for a new service. This website is expecting low to very low traffic at the time of launch. I've heard that Azure is very expensive for this level of traffic, but since there is this MSDN offer, I thought I'd finally take a look at Azure.
In the offer, I'm looking at getting "750 small compute hours per month". From the reading I've done, it seems that if I purchase nothing more than what's given (although the subscription itself is thousands of dollars, of course), an entire month would be covered. Since 24 (hours) x 31 (max days in a month) = 744, I'm still below my allotted 750 for the month.
Am I missing something else from this simple equation? Are there further aspects that could cause the site to be "turned off" temporarily that should be considered?
Yes, you can indeed run a small instance during a whole month. Or you can have 2 extra-small instances instead (having 2 instances means you're covered by the SLA).
There are 2 other things you need to consider:
Depending on your subscription you can have maximum 45GB of storage (blob/table/queue). If you use Virtual Machines you need to know that the system disk (and additional data disks) are persisted as blobs, so make sure not to reach the limit here.
There are also other limits active, but the most important one besides storage is the data transfer limit which is also very limited (max 35GB out).
If you're expecting very low traffic, did you ever consider Windows Azure Web Sites? You get 10 of those for free during 12 months. The free ones run on shared instances, but they are perfect to host the first low-traffic version of your app.
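The 24 x 31 arithmetic from the question can be double-checked for every month with a few lines of Python. The 750-hour figure is the one quoted in the question. The year is arbitrary, and the sketch assumes a single instance running non-stop.

```python
import calendar

def hours_in_month(year, month):
    """Number of hours in the given calendar month."""
    days = calendar.monthrange(year, month)[1]
    return days * 24

ALLOWANCE = 750  # small compute hours per month, as quoted in the question

# Longest month of an arbitrary year, for one instance running 24/7.
worst_case = max(hours_in_month(2024, m) for m in range(1, 13))
print(worst_case, worst_case <= ALLOWANCE)
```

One always-on instance peaks at 744 hours, which stays under the allowance. Two instances of the same size running all month would need twice that.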

Resources