When using virtual servers on Windows Azure and attaching disks: would these go down if the hard disk or server should fail? Or are they guaranteed to be backed up? Should we still keep an external backup ourselves, or can we depend on Azure?
Also, would the primary hard disk that's part of the server follow the same backup principles? And does the same apply to blob storage - would that need to be backed up?
Azure attached disks, just like the OS disk, are stored as vhds, each inside a blob in Azure Storage. This is durable storage: triple-replicated within the data center, and optionally geo-replicated to a neighboring data center.
That said: If you delete something, it's instantly deleted. No going back. So... then the question is, how to deal with backups from an app-level perspective. To that end:
You can make snapshots of the blob storing the vhd. A snapshot is basically a linked list of all the pages in use. If you then make changes, you start incurring additional storage as new pages are allocated. The neat thing is: you can take a snapshot now and, at any point in the future, copy that snapshot to its own vhd. It's essentially rolling backups with very little space used, provided you mostly add data rather than modify existing data.
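To make that concrete, here's a minimal sketch of taking a snapshot with the current azure-storage-blob Python SDK (v12, which post-dates the classic-era APIs in this thread); the connection string, container, and blob names are placeholders:

    from azure.storage.blob import BlobClient

    # Point at the page blob that backs the vhd (names are placeholders).
    blob = BlobClient.from_connection_string(
        "<connection-string>", container_name="vhds", blob_name="datadisk.vhd")

    # Take a point-in-time, read-only snapshot. Only pages changed after this
    # point incur additional storage.
    snap = blob.create_snapshot()
    print("snapshot id:", snap["snapshot"])  # timestamp identifying the snapshot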
You can make blob copies. This is an async operation that is near-instant within a single data center; I've seen copies take upwards of an hour going cross-data center. But the point is that you can make a copy any time you want, and it will live in its own blob.
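And a hedged sketch of the async copy, again with the v12 Python SDK and placeholder names:

    import time

    from azure.storage.blob import BlobClient

    src = BlobClient.from_connection_string("<connection-string>", "vhds", "datadisk.vhd")
    dst = BlobClient.from_connection_string("<connection-string>", "backups", "datadisk-copy.vhd")

    # Kick off a server-side (asynchronous) copy. Within one storage account the
    # plain blob URL suffices; a cross-account copy needs a SAS on the source URL.
    dst.start_copy_from_url(src.url)

    # The call returns immediately; poll until the service finishes the copy.
    while dst.get_blob_properties().copy.status == "pending":
        time.sleep(5)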
Related
Why would I need to create a blob snapshot and incur additional cost if Azure already provides GRS (geo-redundant storage) or ZRS (zone-redundant storage)?
Redundancy (ZRS/GRS/RA-GRS) is a means of achieving high availability for your resources (blobs, in your scenario). By enabling redundancy you ensure that a copy of your blob is available in another zone/region in case the primary zone/region becomes unavailable. It also guards against corruption of the primary copy.
When you take a snapshot of your blob, a read-only copy of that blob in its current state is created and stored. If needed, you can restore the blob from a snapshot. This is well suited to keeping different versions of the same blob.
However, please keep in mind that neither redundancy nor snapshots are a backup: if you delete the base blob, all snapshots associated with that blob are deleted with it, and the copies of the blob in other zones/regions are deleted as well.
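To see that deletion behavior from code, here's a small sketch with the azure-storage-blob Python SDK (v12); names are placeholders:

    from azure.storage.blob import BlobClient

    blob = BlobClient.from_connection_string("<connection-string>", "vhds", "datadisk.vhd")

    # The service refuses to delete a base blob that still has snapshots unless
    # you explicitly take them along. After this call, the blob and all of its
    # snapshots are gone everywhere, including the geo-replicated copies.
    blob.delete_blob(delete_snapshots="include")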
I guess you need to understand the difference between Backup and Redundancy.
Backups make sure that if something is lost, corrupted, or stolen, a copy of the data is available at your disposal.
Redundancy makes sure that if something fails (your computer dies, a drive gets fried, or a server freezes), you are able to keep working regardless of the problem. Redundancy means that all your changes are replicated to another location. In case of a failover, your slave can theoretically function as a master and serve the (hopefully) latest state of your file system.
You could also turn on soft delete. That keeps a copy of every blob, across every change made to it, even if someone deletes it. You then set a retention period so that those retained blobs are automatically removed after some period of time.
https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-soft-delete
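For reference, enabling soft delete looks roughly like this with the v12 Python SDK (the 14-day retention window is just an example):

    from azure.storage.blob import BlobServiceClient, RetentionPolicy

    service = BlobServiceClient.from_connection_string("<connection-string>")

    # Enable blob soft delete: deleted or overwritten data is retained and
    # recoverable for 14 days, then removed automatically.
    service.set_service_properties(
        delete_retention_policy=RetentionPolicy(enabled=True, days=14))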
I am using two Microsoft Azure Virtual Machines (marked as classic), both running Linux. One is used for test purposes and internal demos; the other is production, running a few clients' instances.
What I would like to do is change the size of a virtual machine. I understand this is quite a common process, can easily be done from the Azure Management Portal, and should not affect data. However, when I changed the size of our testing machine, exactly that happened: we lost all data.
Azure Support answer received was:
"We recommend you delete the VM by keeping the attached disks and create a new VM with the required size." Not sure why this would be better?
Any data stored on the ephemeral (internal-to-chassis) scratch disk is at risk, as it's a non-durable disk (and will in all likelihood be destroyed/recreated upon resizing a VM).
The only way to have durable data is to use Azure Storage (blobs, vhds as attached disks, Azure File storage) or an external database. Azure Storage is durable (minimum 3 copies) and is not stored with your VM.
One more thing: The VM's OS Disk is a VHD in Azure Storage (so the OS disk is durable, just like attached vhd's).
You have more than one way to do that, and keep in mind what David said: data on OS disks, attached disks, and blobs is the only durable data.
To avoid losing data, and since you're using classic VMs, you can do the following:
1- Go to your VM in the portal and capture an image of it.
2- Go to your new image and create a new VM from it, specifying the new specs that you need.
3- When done, connect to your new VM while keeping the old one running (don't terminate it yet).
4- Check that all your data is there; if yes, you can remove the old one. (In case you need the old IP, you can still assign it to the new one.)
Cheers.
I have a VM on Azure which is my content management system using nodejs and mongodb.
One of the things the CMS does is provide a social sharing function, where HTML pages are created and users are given the URL to each page.
I expect a large volume of users (probably 5000 at a given time) to access these pages. I do not want this load to be on the same server as my CMS.
So I was thinking about moving the HTML pages to another server. My question is: do I need to look at Azure blob storage to do this, or should I just use another VM and put the files there?
The files are very small and minified. I want to keep my costs down, whilst at the same time, if I get more than 5000 requests, the server should auto-scale.
The question itself is somewhat subjective/opinion-soliciting. And how you solve this problem is really up to you.
But from an objective perspective:
Blobs themselves are not the same as local file storage. If you're going to store content in them, either your CMS needs to support them natively or you're going to need to build that support into it (if that's even possible). Since blobs have their own REST API (and related SDKs), you cannot simply do file I/O operations against them. They are, however, accessible via URI (which may be made private or public).
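To illustrate the difference from file I/O, here's a hedged sketch of pushing a rendered page into blob storage with the Python SDK (the asker's stack is Node.js, which has an equivalent SDK; container and blob names are placeholders):

    from azure.storage.blob import BlobClient, ContentSettings

    page = BlobClient.from_connection_string(
        "<connection-string>", container_name="pages", blob_name="share/1234.html")

    # Upload a rendered HTML page through the SDK -- not fopen()/fwrite().
    with open("1234.html", "rb") as f:
        page.upload_blob(
            f, overwrite=True,
            content_settings=ContentSettings(content_type="text/html"))

    # Users can then fetch the page straight from storage, bypassing the CMS
    # server entirely (assuming the container's access level allows public reads).
    print(page.url)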
Azure VMs store their disks (vhds) in page blobs (so, technically speaking, you're already using blob storage). Each VM may also have attached disks (1TB each), again in page blobs, at two disks per core (so a dual-core VM supports 4 attached 1TB disks). Just like your OS disk, these attached disks are durable, in blob storage. A CMS may access an attached disk once it's formatted and given a drive letter (Windows) or mounted (Linux). EDIT - forgot to mention: if you go with the attached-disk approach, you need to consider that these disks are per-VM; they are not shared across multiple VMs (in the event you scale your CMS to multiple instances).
Azure File Service is an SMB share sitting atop Azure Blob Storage. Again: durable storage, and drive-mappable. EDIT: unlike attached disks, Azure File Service SMB shares are accessible across multiple VMs.
I need to do an automatic periodic backup of an Azure blob storage to another Azure blob storage.
This is in order to guard against any kind of malfunction in the software.
Are there any services that do that? Azure doesn't seem to have this built in.
As @Brent mentioned in the comments on Roberto's answer, the replicas are for HA: if you delete a blob, that delete is replicated instantly.
For blobs, you can very easily create asynchronous copies to a separate blob (even in a separate storage account). You can also make snapshots, which capture a blob at a moment in time. At first, a snapshot doesn't cost anything, but once you start modifying the blocks/pages it refers to, new blocks/pages are allocated and you begin paying for them. Over time, you'll want to purge old snapshots. This is a great way to keep data as-is over time and revert back to a snapshot if there's a malfunction in your software.
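A sketch of what the periodic copy job could look like (v12 Python SDK; you'd schedule this yourself, e.g. via cron; account/container names and the SAS token are placeholders, the SAS being needed because the copy crosses accounts):

    from azure.storage.blob import BlobClient, ContainerClient

    src = ContainerClient.from_connection_string("<source-conn-string>", "data")
    sas = "<read-sas-for-source-account>"  # assumption: a read-access SAS

    # Start a server-side copy of every blob into the backup account; no data
    # flows through the machine running this script.
    for item in src.list_blobs():
        dst = BlobClient.from_connection_string(
            "<backup-conn-string>", container_name="data-backup", blob_name=item.name)
        dst.start_copy_from_url(f"{src.url}/{item.name}?{sas}")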
With queues, the malfunction story isn't quite the same, as typically you'd only have a small number of queue items present at any moment (at least that's the hope; if you have thousands of queue messages, it's typically a sign that your software is falling behind). In any event: when writing queue messages, you could also write them to blob storage for archival purposes, in case there's a malfunction. I wouldn't recommend blob-based messaging for scaling/parallel processing, since blobs don't have the mechanisms queues do, but you could fall back to the archived copies manually in case of malfunction.
There's no copy function for tables. You'd need to write to two tables during your write.
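As a sketch of that dual-write (using the modern azure-data-tables Python package, which replaced the classic table SDK; table names and the entity are placeholders):

    from azure.data.tables import TableClient

    primary = TableClient.from_connection_string("<conn-string>", table_name="orders")
    backup = TableClient.from_connection_string("<conn-string>", table_name="ordersbackup")

    entity = {"PartitionKey": "2013-06", "RowKey": "order-001", "Total": 42.5}

    # "Backup" for tables is simply writing each entity twice.
    primary.create_entity(entity)
    backup.create_entity(entity)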
Azure keeps 3 redundant copies of your data in different locations in the same data centre where your data is hosted (to guard against hardware failure).
This applies to blob, table and queue storage.
Additionally, you can enable geo-replication on all of your storage. Azure will automatically keep redundant copies of your data in separate data centres. This guards against anything happening to the data centre itself.
I currently have a Rackspace Cloud Server that I'd like to migrate to an Azure Virtual Machine. I recently got an MSDN subscription which gives me a certain level of hosting via Azure at no cost, where I'm currently paying for that level of service with Rackspace.
However, one of the nice things about Rackspace is that I can schedule nightly/weekly backups of the VM image. Is there any mechanism for doing this on Azure? I'm also worried about protecting against corruption of the database (i.e. what if someone were to run an UPDATE statement and forget the WHERE clause?).
I know the VMs are stored as .vhd files in my Azure storage account, but the VM image is 127 GB. Downloading that nightly, even with FiOS internet, isn't really going to fly as a solution.
You can perform an asynchronous blob copy to make a physical copy of a vhd. See here for REST API details. This operation is very fast within the same data center (maybe a few seconds?). You don't need to make raw REST calls, though: there's a method already implemented in the Azure cross-platform command-line interface, available here. The command is:
azure vm disk upload
You can also take blob snapshots and return to a previous snapshot later. A snapshot is read-only (you can copy from it later) and takes up no space initially. However, as storage pages are changed, the snapshot starts to incur storage for the original pages it preserves.
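A hedged sketch of the snapshot-and-restore cycle with the v12 Python SDK (blob names are placeholders; note the disk must be detached, since an active VM lease blocks the copy):

    from azure.storage.blob import BlobClient

    blob = BlobClient.from_connection_string("<connection-string>", "vhds", "os.vhd")

    # At backup time: take a snapshot and keep its id somewhere safe.
    snap = blob.create_snapshot()
    snapshot_url = f"{blob.url}?snapshot={snap['snapshot']}"

    # ...later, to roll back: server-side copy of the read-only snapshot over
    # the base blob ("promoting" the snapshot).
    blob.start_copy_from_url(snapshot_url)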
One question, though: why such a large VM image? Are you storing OS + data on the same vhd? If so, it may make more sense to mount a separate Azure Drive (also stored as a vhd in blob storage) to hold your data, and make independent copies/snapshots of that.