Azure VMs high-availability setup for data disk or storage

I'm currently looking into a high-availability approach for a file server in Azure, for which I will need to deploy VMs. The data on the file server will be changing constantly. From what I've read so far, I will need at least 2 VMs, grouped together into a shared availability set, along with creating a cloud service. Although this addresses the application and server aspect, what about the storage and the data on the disks?
I understand that I can't attach a single disk to multiple VMs, so I'm a bit lost on how to proceed. Any thoughts or ideas on how to move forward with this?
In short, I have a VM with a data disk attached directly to it, and I'm looking to provide high availability in the event that the VM goes offline, whether through an outage, host patching, hardware maintenance, etc.

Have a look into Azure Blob Storage - don't worry about disks, etc.; just let the Azure fabric handle the data redundancy and scalability for you!
Here's an "all you need" introduction to Windows Azure Storage:
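For instance, instead of keeping the files on a per-VM data disk, the application on each VM can read and write them in one shared blob container. Below is a minimal sketch using the current azure-storage-blob Python package; the connection string, container, and blob names are placeholders, not anything from the question.

```python
# Minimal sketch: both file-server VMs talk to the same blob container, so there is
# no per-VM disk that needs to be made highly available.
# The connection string and names are placeholders; the container is assumed to exist.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<storage-connection-string>")
container = service.get_container_client("shared-files")

# Upload (or overwrite) a file from either VM.
with open("report.docx", "rb") as f:
    container.upload_blob("report.docx", f, overwrite=True)

# Read it back from the other VM.
data = container.get_blob_client("report.docx").download_blob().readall()
print(len(data), "bytes read back from shared storage")
```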

Related

Storage to cloud

We have a working NetApp with ESXi (VMware 5.5) setup, with multiple VMs running on 3 ESXi hosts but residing entirely on NetApp storage.
We are thinking of moving this entire setup to a private cloud consisting of HP Nimble storage. This cloud is currently owned by one of our departments, which is ready to give us space (in terms of storage) and ESXi hosts (a VM cluster) to run our VMs on a rental basis. The immediate advantages for us would be more space, more network speed, a DR setup, and no longer having to worry about the hardware.
Of course this is still in the discussion phase, but I would like to ask you experts the following questions.
1. NetApp storage is all about the data plus its configuration (snapshots, user quota policies, export rules, etc.). When we talk about storage space in the cloud, how will we control/administer those configuration items? Or will it no longer be possible for us to administer them, with the cloud administrators taking that control into their hands and us depending on them for every configuration change? This is a very important factor.
2. Can VMs running on NetApp storage be migrated without much effort? Is there a documented method for this?
Your view on this will be really helpful.
Thanks in advance.
Regards,
Admin
On point #1, a common way to provide multi-tenant administrator access on NetApp is to create a separate SVM [1] (Storage Virtual Machine) that a tenant administrator can use to manage volume capacity, snapshots, quotas, etc.
For #2, a common migration path for moving VMware VMs is to use Storage vMotion [2]. The private cloud provider can remap the ESXi hosts in your environment to be managed under their vCenter Server first. Then from there, they will have the ability to (non-disruptively, in most cases) move the VMs from your old NetApp datastores to new datastores on their array. They can do the same for vMotioning these VMs over to their managed ESXi hosts.
[1] https://docs.netapp.com/us-en/ontap/concepts/storage-virtualization-concept.html
[2] https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.vcenterhost.doc/GUID-AB266895-BAA4-4BF3-894E-47F99DC7B77F.html

Azure IO transfer

We have a VM set up to run SQL Server in Azure. During testing, we are seeing disk writes of about 0.6 MB/s with WRITE THROUGH. We have tried numerous different things, from changing the Azure VM type (D series, L series, etc.) to creating different RAID-based disks. Is this a limit in Azure, where non-cached disks can only sustain a certain rate rather than the advertised 500 MB/s? Any help to improve the WRITE THROUGH rate?
In order to prevent blocking issues and improve IO performance, we need to:
1. Prevent VM-level throttling at all costs.
2. Prevent disk-level throttling if the application has a dependent blocking issue due to its software design. Adding more disks to create a storage pool may help.
For more information about Azure VM storage performance and throttling, see:
Azure VM Storage Performance and Throttling
Since you are using SQL Server, here is an article about how to optimize SQL Server performance in an Azure virtual machine: Performance best practices for SQL Server in Azure Virtual Machines
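If you want to double-check the raw write-through rate of a data disk outside SQL Server, a rough benchmark can help isolate whether the bottleneck is the disk or the workload. This is only a minimal sketch in Python; the file path, chunk size, and total size are my own placeholders, and you would adjust them for the disk under test.

```python
# Rough write-through throughput check: write fixed-size chunks and fsync() after
# each write so the data must reach the disk before the next write starts.
import os
import time

PATH = "F:/ioprobe.tmp"      # hypothetical path on the data disk being tested
CHUNK = 64 * 1024            # 64 KiB per write
TOTAL = 256 * 1024 * 1024    # 256 MiB total

buf = os.urandom(CHUNK)
start = time.time()
with open(PATH, "wb") as f:
    written = 0
    while written < TOTAL:
        f.write(buf)
        f.flush()
        os.fsync(f.fileno())  # force the write through the cache to the disk
        written += CHUNK
elapsed = time.time() - start
print(f"write-through throughput: {TOTAL / elapsed / 1024 / 1024:.1f} MB/s")
os.remove(PATH)
```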

Possible to keep two VHDs in sync on Azure VMs

This is more of a scenario than a specific technical question.
I have two Azure VMs that run a web application in load-balanced mode,
as per this article: http://asheej.blogspot.in/2014/03/load-balancing-using-windows-azure.html
Both virtual machines have an additional disk attached which stores images that are referenced from the web application hosted in the VMs' IIS.
Now, what would be the best way to keep the contents of the two VM hard drives in sync?
For example, if I delete or add data on the VHD of the first VM, that change should also be reflected on the second VM.
Is there anything possible, perhaps a common VHD for both machines, which would take syncing out of the question?
Before going into the solution, let me briefly touch on the VM and disk relationship.
Typically a VM has three kinds of disks attached to it: 1. the OS disk, 2. the temporary disk, and 3. data disks. The VM holds a lease on all these disks; the only way to write to the data disks is via the VM.
The C: disk is persistent, meaning that when the VM gets rebooted the data on the disk is retained. But D:\ is non-persistent; when you reboot, the disk will be wiped clean. So D:\ should never be used to store any user data.
So writing a process to sync between two VMs just to keep pictures in sync is not ideal. You might know this already, but I wanted to set the context for the choice of options provided below.
Your potential options are as follows:
1. You can set up a file share using the new Azure File Service (in preview): http://blogs.technet.com/b/uspartner_ts2team/archive/2014/06/09/setting-up-a-file-share-for-the-new-azure-file-service.aspx. This will be a single source for all your images, and you don't need to worry about syncing files (see the sketch after these options).
2. Store the images in Azure Blob storage and access them from the application running in the VM: http://blogs.msdn.com/b/yaohuang1/archive/2012/07/02/asp-net-web-api-and-azure-blob-storage.aspx and http://www.nickharris.net/2012/11/how-to-upload-an-image-to-windows-azure-storage-using-mobile-services/
3. Host another VM as a web server and serve your images from there. Then the two VMs can both reference the images. The cost here is hosting that extra VM.
The key point with all three options is that there is no need to sync files in two different places; everything lives in a single place.
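As a rough illustration of option 1, here is a sketch that stores and reads the shared images through an Azure file share rather than a per-VM disk. It uses the current azure-storage-file-share Python package purely for illustration (the 2014-era preview API differed, and in practice you might simply mount the share over SMB); the connection string, share name, and file names are placeholders.

```python
# Minimal sketch (placeholders throughout): both web VMs read images from one shared
# Azure file share instead of their own data disks, so nothing has to be synced.
from azure.storage.fileshare import ShareFileClient

CONN_STR = "<storage-connection-string>"
SHARE = "images"

def upload_image(local_path: str, remote_name: str) -> None:
    client = ShareFileClient.from_connection_string(
        CONN_STR, share_name=SHARE, file_path=remote_name)
    with open(local_path, "rb") as data:
        client.upload_file(data)

def read_image(remote_name: str) -> bytes:
    client = ShareFileClient.from_connection_string(
        CONN_STR, share_name=SHARE, file_path=remote_name)
    return client.download_file().readall()

# Either VM uploads once; both VMs read the same copy.
upload_image("logo.png", "logo.png")
print(len(read_image("logo.png")), "bytes served from the shared file share")
```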
Edited based on new information:
In your scenario, hosting your files on the VMs is not the right approach. You should take the following into consideration, even for a short-term solution, if you are using the Azure load balancer.
The Azure Load Balancer uses a 5-tuple (source IP, source port, destination IP, destination port, protocol type) to calculate a hash that maps traffic to the available servers, and the distribution is fairly random. So if you load-balance the VMs, you cannot control which VM the images are accessed from (a toy illustration of this hashing follows below).
Manual updates are not possible in this scenario.
You either need to set up a virtual network that allows you to create and share a Windows file share, OR you should investigate the use of the Azure File Service for creating a share that both VMs connect to (see: http://blogs.technet.com/b/uspartner_ts2team/archive/2014/06/09/setting-up-a-file-share-for-the-new-azure-file-service.aspx).
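To make the 5-tuple point concrete, here is a toy model of hash-based distribution. It is not the actual Azure Load Balancer algorithm, just a sketch of why two requests from the same client can land on different VMs, and therefore why per-VM local files are effectively served at random.

```python
# Illustrative only: how a 5-tuple hash spreads connections across back-end VMs.
import hashlib

backends = ["vm-0", "vm-1"]

def pick_backend(src_ip, src_port, dst_ip, dst_port, proto):
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
    h = int(hashlib.sha256(key).hexdigest(), 16)
    return backends[h % len(backends)]

# Two connections from the same client on different source ports can map to
# different VMs, so you cannot control which VM serves a given image.
print(pick_backend("203.0.113.7", 50231, "10.0.0.4", 80, "tcp"))
print(pick_backend("203.0.113.7", 50232, "10.0.0.4", 80, "tcp"))
```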

Azure Traffic Manager for Cloud Services - What about storage access?

I have finally got the time to start looking at Azure. It looks good, with easy scaling.
Azure SQL, Table storage, and Blob storage should cover most of my needs: fast access to data, auto-replication, and failover to another datacenter.
Should the need arise for an app that requires fast global access, Traffic Manager is there, and one can route users by "Failover" or "Performance".
"Performance" routing is very nice for Cloud Services and web roles / worker roles... BUT... what about access to data in SQL Azure / Table storage / Blob storage?
I have tried searching the web for what to do about this, but haven't found anything about Traffic Manager that mentions how to access data in such a scenario.
Have I missed anything?
Do people access the storage in the original data center (and if that fails use the Geo Replication feature)? Is that fast enough? Is internal traffic on the MS network free across datacenters?
This seems like such a simple ...
Take a look at the guidance by Microsoft: Replicating, Distributing, and Synchronizing Data. You could use Service Bus to keep the data centers in sync. This can cover SQL databases, storage, search indexes like Solr or ElasticSearch, and so on. The advantage over solutions like SQL Data Sync is that it's technology-independent and can keep virtually all your data in sync.
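As a rough sketch of that pattern (not code from the guidance), the idea is to publish a "data changed" event to a Service Bus topic and let a worker in each datacenter apply the change to its local store. The connection string, topic name, and message shape below are placeholders, using the current azure-servicebus Python package for illustration.

```python
# Hedged sketch: fan out change events via a Service Bus topic so every datacenter
# can update its local copy of the data. Names and payloads are placeholders.
import json
from azure.servicebus import ServiceBusClient, ServiceBusMessage

CONN_STR = "<service-bus-connection-string>"
TOPIC = "data-changes"

def publish_change(entity_id: str, payload: dict) -> None:
    with ServiceBusClient.from_connection_string(CONN_STR) as client:
        with client.get_topic_sender(topic_name=TOPIC) as sender:
            body = json.dumps({"id": entity_id, "data": payload})
            sender.send_messages(ServiceBusMessage(body))

# Each datacenter runs a subscriber on its own subscription and writes the change
# to its local SQL database / storage account, keeping the regions in sync.
publish_change("order-42", {"status": "shipped"})
```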
In this episode of Channel 9 they state that Traffic Manager is only for Cloud Services as of now (Jan 2014), but support is coming for Azure Web Sites and other services. I agree that you should be able to request a blob using a single global URL and expect the content to be served from the closest datacenter.
There isn't a one-click, easy-to-implement solution for this issue. The way you solve it will depend on where the data lives (i.e. SQL Azure, Blob storage, etc.) and your access patterns.
Do you have a small number of data requests that are not on a performance critical path in your code? Consider just using the main datacenter.
Do you have a large number of read-only requests? Consider replicating the data to another datacenter.
Do you do a large number of reads and only a few writes? Consider duplicating the data in all datacenters, having each write go to all datacenters at the same time (incurring a performance penalty) and doing all reads against the local datacenter (fast reads); a sketch of this dual-write pattern follows after this list.
Is your data in SQL Azure? Consider using SQL Data Sync to keep multiple datacenters in sync.
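Here is what the dual-write, local-read pattern mentioned above might look like for blob data. This is my own sketch under stated assumptions, not an established library feature: connection strings, container, and region names are placeholders, and error handling / retries are omitted.

```python
# Illustrative dual-write pattern: each write fans out to a blob container in every
# region; reads come from the local region's storage account. Placeholders throughout.
from azure.storage.blob import BlobServiceClient

REGION_ACCOUNTS = {
    "us-east": "<east-connection-string>",
    "eu-west": "<west-connection-string>",
}
LOCAL_REGION = "us-east"

clients = {region: BlobServiceClient.from_connection_string(conn)
           for region, conn in REGION_ACCOUNTS.items()}

def write_everywhere(container: str, name: str, data: bytes) -> None:
    # Synchronous fan-out: slower writes, but every region holds a local copy.
    for client in clients.values():
        client.get_container_client(container).upload_blob(name, data, overwrite=True)

def read_local(container: str, name: str) -> bytes:
    # Fast read from the datacenter closest to this instance.
    blob = clients[LOCAL_REGION].get_container_client(container).get_blob_client(name)
    return blob.download_blob().readall()
```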

How do I make my Windows Azure application resistant to Azure datacenter catastrophic event?

AFAIK Amazon AWS offers so-called "regions" and "availability zones" to mitigate the risk of a partial or complete datacenter outage. It looks like if I have copies of my application in two "regions" and one "region" goes down, my application can still continue working as if nothing happened.
Is there something like that with Windows Azure? How do I address risk of datacenter catastrophic outage with Windows Azure?
Within a single data center, your Windows Azure application has the following benefits:
Going beyond one compute instance, your VMs are divided into fault domains, across different physical areas. This way, even if an entire server rack went down, you'd still have compute running somewhere else.
With Windows Azure Storage and SQL Azure, storage is triple replicated. This is not eventual replication - when a write call returns, at least one replica has been written to.
Ok, that's the easy stuff. What if a data center disappears? Here are the features that will help you build DR into your application:
For SQL Azure, you can set up Data Sync. This facility synchronizes your SQL Azure database with either another SQL Azure database (presumably in another data center), or an on-premises SQL Server database. More info here. Since this feature is still considered a Preview feature, you have to go here to set it up.
For Azure storage (tables, blobs), you'll need to handle replication to a second data center yourself, as there is no built-in facility today. This can be done with, say, a background task that pulls data every hour and copies it to a storage account somewhere else (see the sketch after this list). EDIT: Per Ryan's answer, there's data geo-replication for blobs and tables. HOWEVER: aside from a mention in this blog post in December, and possibly at PDC, this is not live.
For Compute availability, you can set up Traffic Manager to load-balance across data centers. This feature is currently in CTP - visit the Beta area of the Windows Azure portal to sign up.
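The "background task" idea above could look roughly like the sketch below: an hourly loop that asks the secondary storage account to copy each blob from the primary account. This is only an illustration; the connection strings and container name are placeholders, and the source blobs would need to be reachable by the copy operation (for example via a SAS URL).

```python
# Hypothetical hourly replication task: server-side copy of each blob from the
# primary storage account to a secondary account in another datacenter.
import time
from azure.storage.blob import BlobServiceClient

primary = BlobServiceClient.from_connection_string("<primary-connection-string>")
secondary = BlobServiceClient.from_connection_string("<secondary-connection-string>")
CONTAINER = "app-data"

def replicate_once() -> None:
    src = primary.get_container_client(CONTAINER)
    dst = secondary.get_container_client(CONTAINER)
    for blob in src.list_blobs():
        # The source URL must be readable by the copy service (public or SAS-signed).
        source_url = src.get_blob_client(blob.name).url
        dst.get_blob_client(blob.name).start_copy_from_url(source_url)

while True:
    replicate_once()
    time.sleep(3600)  # run hourly, as described above
```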
Remember that, with DR, whether in the cloud or on-premises, there are additional costs (such as bandwidth between data centers, storage costs for duplicate data in a secondary data center, and compute instances in additional data centers).
Just like with on-premises environments, DR needs to be carefully thought out and implemented.
David's answer is pretty good, but one piece is incorrect. For Windows Azure blobs and tables, your data is actually geographically replicated today between sub-regions (e.g. North and South US). This is an async process that has a target of about a 10 min lag or so. This process is also out of your control and is purely for a data center loss. In total, your data is replicated 6 times in 2 different data centers when you use Windows Azure blobs and tables (impressive, no?).
If a data center was lost, they would flip over your DNS for blob and table storage to the other sub-region and your account would appear online again. This is true only for blobs and tables (not queues, not SQL Azure, etc).
So, for a true disaster recovery, you could use Data Sync for SQL Azure and Traffic Manager for compute (assuming you run a hot standby in another sub-region). If a datacenter was lost, Traffic Manager would route to the new sub-region and you would find your data there as well.
The one failure that you didn't account for is the possibility of an error being replicated across data centers. In that scenario, you may want to consider running Azure PaaS as part of the HP Cloud offering, in either a load-balanced or failover configuration.
