I am thinking of using Azure Blob Storage for a document management system which I am developing. All blobs (images, videos, Word/Excel/PDF files, etc.) will be stored in Azure Blob Storage. As I understand it, I need to create a container, and these files can then be stored within the container.
I would like to know how to safeguard against accidental/malicious deletion of the container. If a container is deleted, all the files it contains will be lost. I am trying to figure out how to put a backup and recovery mechanism in place for my storage account so that it is always guaranteed that if something happens to a container, I can recover the files inside it.
Does Microsoft Azure provide any way to do such backup and recovery, or do I need to explicitly write code so that files are stored in two separate Blob Storage accounts?
Anyone with access to your storage account's key (primary or secondary; there are two keys for a storage account) can manipulate the storage account in any way they see fit. The only way to ensure nothing happens? Don't give anyone access to the key(s). If you place the storage account within a resource group that only you have permissions on, you'll at least prevent others with access to the subscription from discovering the storage account and accessing it.
Within the subscription itself, you can place a lock on the actual resource (the storage account), so that nobody with access to the subscription accidentally deletes the entire storage account.
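If you script your deployments, the same lock can be applied from code. Here's a rough sketch using the Python management SDK (azure-mgmt-resource); the subscription, resource group, and account names are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource.locks import ManagementLockClient
from azure.mgmt.resource.locks.models import ManagementLockObject

# Placeholder values -- substitute your own subscription, group, and account.
subscription_id = "<subscription-id>"
lock_client = ManagementLockClient(DefaultAzureCredential(), subscription_id)

# A CanNotDelete lock blocks deletion of the storage account itself (not of
# individual blobs or containers inside it) until the lock is removed.
lock_client.management_locks.create_or_update_at_resource_level(
    resource_group_name="my-docs-rg",
    resource_provider_namespace="Microsoft.Storage",
    parent_resource_path="",
    resource_type="storageAccounts",
    resource_name="mydocsstorage",
    lock_name="do-not-delete",
    parameters=ManagementLockObject(level="CanNotDelete"),
)
```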
Note: with storage account keys, you do have the ability to regenerate the keys at any time. So if you ever suspect a key has been compromised, you can regenerate it.
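Key regeneration can also be scripted. A minimal sketch with the azure-mgmt-storage package, again with placeholder names:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

storage_client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Regenerate key1; anything signed with the old key (including SAS tokens
# derived from it) stops working once the new key is issued.
keys = storage_client.storage_accounts.regenerate_key(
    "my-docs-rg", "mydocsstorage", {"key_name": "key1"}
)
for key in keys.keys:
    print(key.key_name, key.value)
```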
Backups
There are several backup solutions offered for Blob storage in case containers get deleted. More product info can be found here: https://azure.microsoft.com/en-us/services/backup/
Redundancy
If you are concerned about availability: "The data in your Microsoft Azure storage account is always replicated to ensure durability and high availability. Replication copies your data, either within the same data center, or to a second data center, depending on which replication option you choose." There are several replication options:
Locally redundant storage (LRS)
Zone-redundant storage (ZRS)
Geo-redundant storage (GRS)
Read-access geo-redundant storage (RA-GRS)
More details can be found here:
https://learn.microsoft.com/en-us/azure/storage/common/storage-redundancy
Managing Access
Finally, managing access to your storage account is the best way to secure it and ensure you avoid any loss of your data. If you don't want anyone to delete files, folders, etc., you can provide read-only access through SAS (Shared Access Signatures), which allow you to create policies and grant access based on Read, Write, List, Delete, and so on. A quick GIF demo can be seen here: https://azure.microsoft.com/en-us/updates/manage-stored-access-policies-for-storage-accounts-from-within-the-azure-portal/
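For example, you can issue a read/list-only SAS tied to a stored access policy on a container, so consumers can never delete anything. A rough sketch with the azure-storage-blob (v12) Python SDK; the account, key, and container names are placeholders:

```python
from datetime import datetime, timedelta, timezone
from azure.storage.blob import (
    BlobServiceClient, AccessPolicy, ContainerSasPermissions, generate_container_sas,
)

account_name = "mydocsstorage"          # placeholder
account_key = "<account-key>"           # placeholder
container_name = "documents"            # placeholder

service = BlobServiceClient(
    f"https://{account_name}.blob.core.windows.net", credential=account_key)
container = service.get_container_client(container_name)

# Stored access policy: read + list only, valid for 30 days.
policy = AccessPolicy(
    permission=ContainerSasPermissions(read=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(days=30),
)
container.set_container_access_policy(signed_identifiers={"read-only": policy})

# SAS token tied to that policy -- hand this (not the account key) to clients.
sas = generate_container_sas(
    account_name, container_name, account_key=account_key, policy_id="read-only")
print(f"https://{account_name}.blob.core.windows.net/{container_name}?{sas}")
```

The advantage of binding the token to a stored access policy is that revoking or editing the "read-only" policy immediately invalidates every token issued against it.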
We are using Blob storage to store documents and for document management.
To protect against deletion of blobs, you can now enable soft delete, as described here:
https://azure.microsoft.com/en-us/blog/soft-delete-for-azure-storage-blobs-ga/
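Soft delete can also be enabled (and a deleted blob recovered) straight from the data-plane SDK. A small sketch, assuming the azure-storage-blob (v12) package; the connection string and container name are placeholders:

```python
from azure.storage.blob import BlobServiceClient, RetentionPolicy

service = BlobServiceClient.from_connection_string("<connection-string>")  # placeholder

# Keep deleted blobs recoverable for 14 days.
service.set_service_properties(
    delete_retention_policy=RetentionPolicy(enabled=True, days=14))

# Later, to recover something that was deleted within the retention window:
container = service.get_container_client("documents")         # placeholder container
for blob in container.list_blobs(include=["deleted"]):
    if blob.deleted:
        container.get_blob_client(blob.name).undelete_blob()  # restores the blob
```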
You can also create your own automation around PowerShell and AzCopy to do incremental and full backups.
The last element would be to use RA-GRS storage, where you can read from a read-only secondary copy in another region in case the primary data center goes down.
Designing Highly Available Applications using RA-GRS
https://learn.microsoft.com/en-us/azure/storage/common/storage-designing-ha-apps-with-ragrs?toc=%2fazure%2fstorage%2fqueues%2ftoc.json
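With RA-GRS, the replica is exposed at a secondary endpoint (the account name with a -secondary suffix). A rough sketch of falling back to it for reads with the Python SDK; the account, container, and blob names are placeholders:

```python
from azure.core.exceptions import AzureError
from azure.storage.blob import BlobServiceClient

account_name = "mydocsstorage"   # placeholder
account_key = "<account-key>"    # placeholder

primary = BlobServiceClient(
    f"https://{account_name}.blob.core.windows.net", credential=account_key)
# RA-GRS exposes a read-only replica at the "-secondary" endpoint.
secondary = BlobServiceClient(
    f"https://{account_name}-secondary.blob.core.windows.net", credential=account_key)

def read_blob(container: str, name: str) -> bytes:
    try:
        return primary.get_blob_client(container, name).download_blob().readall()
    except AzureError:
        # Primary region unreachable -- fall back to the read-only secondary.
        return secondary.get_blob_client(container, name).download_blob().readall()

data = read_blob("documents", "contract.pdf")  # placeholder names
```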
Use Microsoft's Azure Storage Explorer. It will allow you to download the full contents of blob containers including folders and subfolders with blobs. Conversely, you can upload to containers in the same way. Simple and free!
Related
We have a Terraform state file stored in an Azure Storage account. If the storage account goes down, we will be screwed. What is the best way to store the file, and where?
AFAIK, there are two methods to store a Terraform state file, i.e. locally on your machine or in a storage account in Azure.
If the storage account goes down, we will be screwed. What is the best way to store the file, and where?
As confirmed, you are using Standard_LRS, which is not preferred as per the Microsoft documentation if you are looking for high availability.
Locally redundant storage (LRS) copies your data synchronously three times within a single physical location in the primary region. LRS is the least expensive replication option, but is not recommended for applications requiring high availability or durability.
So, as a solution, you can change the storage account type as per your requirement to Standard_GRS or Standard_ZRS, so that your data is present in more than one location, i.e. replicated.
You can change it in the portal by going to your storage account >> Configuration >> Replication.
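If you'd rather script it than click through the portal, the replication type is the account's SKU and can be changed with the management SDK. A minimal sketch, assuming the azure-mgmt-storage package; the subscription, resource group, and account names are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Switch the account's replication from Standard_LRS to Standard_GRS.
client.storage_accounts.update(
    "my-rg", "mystorageacct",            # placeholder resource group / account
    {"sku": {"name": "Standard_GRS"}},
)
```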
If you want more details on disaster recovery (if one location is down) or protecting data from accidental deletes, please refer to the documents below:
Disaster recovery and storage account failover - Azure Storage | Microsoft Docs
Soft delete for containers - Azure Storage | Microsoft Docs
My Azure WebApp stores data in Azure Storage Tables and Blob storage.
There is a backup functionality, but as I understand it, it just does not support Azure tables/blobs... however, I would like to automatically back up my tables to protect against accidental data corruption by users or by a software issue...
I would like to back up MyProdTables into the MyBackupBlob container. Is there a way to do it, actually?
I read something about AzCopy, but it seems to work with virtual machines' hard drives, and we have the web application as a service, so I am not sure it will work in our case...
Edit: There is partial (negative) MS feedback on the question, as mentioned in this answer, but it focuses rather on migration and entire-account snapshots. I am focused rather on Table storage, and maybe even on the possibility of backing up individual tables... it is strange that nothing is possible in this area, because MS Azure Storage Explorer can easily back up tables as CSV files.
There is no built-in backup feature for blobs and tables, as you've surmised. However: blobs do offer snapshots (a point-in-time snapshot may be taken at any time).
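Taking a snapshot is a single call per blob in the v12 Python SDK. A small sketch; the connection string and container name are placeholders:

```python
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")  # placeholder
container = service.get_container_client("documents")                     # placeholder

# Snapshot every blob in the container; each snapshot is a read-only,
# point-in-time copy that lives alongside the base blob.
for blob in container.list_blobs():
    snapshot = container.get_blob_client(blob.name).create_snapshot()
    print(blob.name, "->", snapshot["snapshot"])   # snapshot timestamp/id
```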
There are also Shared Access Signatures (and Policies) to limit exposure to your storage. And you can even protect the storage account itself from deletion.
As for AzCopy: that has nothing to do with VM disks. That's specifically for moving content in and out of blobs and tables.
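For the tables themselves there is nothing built in, but a small job can dump each table's entities into a blob, which is roughly what Storage Explorer's CSV export does. A sketch assuming the azure-data-tables and azure-storage-blob packages; MyProdTables and the backup container come from your question (container names must be lowercase), and the connection string is a placeholder:

```python
import json
from datetime import datetime, timezone
from azure.data.tables import TableServiceClient
from azure.storage.blob import BlobServiceClient

conn_str = "<connection-string>"   # placeholder

tables = TableServiceClient.from_connection_string(conn_str)
blobs = BlobServiceClient.from_connection_string(conn_str)
backup_container = blobs.get_container_client("mybackupblob")

# Read every entity from the table and store the dump as a dated JSON blob.
entities = [dict(e) for e in tables.get_table_client("MyProdTables").list_entities()]
blob_name = f"MyProdTables-{datetime.now(timezone.utc):%Y%m%d-%H%M%S}.json"
backup_container.upload_blob(blob_name, json.dumps(entities, default=str), overwrite=True)
print(f"Backed up {len(entities)} entities to {blob_name}")
```

Run it on a schedule (e.g. from a WebJob or Azure Function) and you get dated restore points without touching the production table.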
I've searched the web and contacted technical support, yet no one seems to be able to give me a straight answer on whether items in Azure Blob Storage are backed up or not.
What I mean is, do I need to create a twin storage account as a "backup" and program copies of all content from one storage to another, or are the contents of a client's Blob Storage automatically redundantly backed up by Microsoft?
I know with AWS, storage is redundantly backed up via onsite drives as well as across other nodes in the cluster.
do I need to create a twin storage account as a "backup" and program copies of all content from one storage to another, or are the contents of a client's Blob Storage automatically redundantly backed up by Microsoft?
Yes, you will need to do backups manually. Azure Storage does not back up the contents of your storage account automatically.
Azure Storage does provide geo-redundant replication (provided you configure the redundancy level for your storage account as GRS or RA-GRS), but that is not a backup. Once you delete content from your primary account (location), it will automatically be removed from the secondary account (geo-redundant location).
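So if you want an actual backup, you have to copy the content yourself (with AzCopy, or with a bit of code). A rough sketch of a server-side copy into a second account with the Python SDK; both connection strings and the container name are placeholders:

```python
from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient

source = BlobServiceClient.from_connection_string("<primary-connection-string>")  # placeholder
backup = BlobServiceClient.from_connection_string("<backup-connection-string>")   # placeholder

src_container = source.get_container_client("documents")   # placeholder container
dst_container = backup.get_container_client("documents")
try:
    dst_container.create_container()
except ResourceExistsError:
    pass

# Server-side copy of every blob; the data moves between accounts without
# being downloaded to this machine. A private source needs a SAS appended
# to the URL for the copy to be authorized.
for blob in src_container.list_blobs():
    src_url = src_container.get_blob_client(blob.name).url
    dst_container.get_blob_client(blob.name).start_copy_from_url(src_url)
```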
Both the AWS (EBS) and Azure (Blob Storage) options provide durability by replicating the data across different data centers. This is for high availability and durability of the data, which is what the cloud provider guarantees.
In order to ensure that your data is durable, Azure Storage has the ability to keep (and manage) multiple copies of your data. This is called replication, or sometimes redundancy. When you set up your storage account, you select a replication type. In most cases, this setting can be modified after the storage account is set up.
For more details, refer to the replication section in the documentation.
If you need to capture changes to the storage and allow restores to previous versions (e.g., in situations like data corruption, or application requirements like restore points and backups), you need to take a snapshot manually. This is common to both AWS and Azure.
For more details on creating a snapshot of a blob in Azure, refer to the documentation.
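Restoring from a snapshot is then just a copy of the snapshot back over the base blob. A small sketch with the v12 Python SDK; the connection string, container, and blob names are placeholders:

```python
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")  # placeholder
container = service.get_container_client("documents")                     # placeholder
blob_client = service.get_blob_client("documents", "contract.pdf")        # placeholder

# Find the most recent snapshot of this blob and copy it over the base blob,
# which rolls the blob's content back to that point in time.
snapshots = [
    b for b in container.list_blobs(name_starts_with="contract.pdf", include=["snapshots"])
    if b.name == "contract.pdf" and b.snapshot
]
if snapshots:
    latest = max(snapshots, key=lambda b: b.snapshot)
    blob_client.start_copy_from_url(f"{blob_client.url}?snapshot={latest.snapshot}")
```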
I have recently bought the Microsoft Azure product for the storage purposes of a NAS.
At first I chose "read-access geo-redundant" and made a schedule for my NAS to back up.
Today I changed it to locally redundant (I saw the price difference), but my Synology NAS has not finished backing up yet. Will it automatically change to locally redundant, or should I cancel the backup and redo it?
To answer your question, you don't have to do anything. Azure will automatically convert your storage account's redundancy type from RA-GRS to LRS.
To elaborate: essentially, the way RA-GRS works is that data is written to the primary location and then replicated to the secondary location through a background process. Once you change the redundancy to LRS, that replication stops.
One more point I would like to mention: if you're storing your data in Blob storage for backup purposes only, may I suggest that you look at the Cool storage offering from Azure Storage. Compared to standard storage accounts, the cost of storing data in Cool storage is much lower.
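The access tier can be set per blob at upload time (or changed later), so backup data can go straight into Cool. A small sketch with the v12 Python SDK; the connection string, container, and file names are placeholders:

```python
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")  # placeholder
container = service.get_container_client("nas-backups")                    # placeholder

# Upload straight into the Cool tier: cheaper storage, slightly higher
# read/transaction costs, so it suits data that is rarely accessed.
with open("backup-2024-01.tar.gz", "rb") as data:              # placeholder file
    container.upload_blob("backup-2024-01.tar.gz", data,
                          standard_blob_tier="Cool",
                          overwrite=True)
```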
For an HDInsight cluster, there has to be at least one Azure storage account which is its default storage account -- it is required so that it can be treated as the cluster's filesystem. This I get. But what about optional linked Azure storage accounts? From the ADF (Azure Data Factory) perspective at least, do we need to have a storage account added as a linked storage account to an HDInsight cluster? An Azure storage account is accessible purely by providing just two pieces of information: the account name and the key. Both of these are specified in linked services in ADF, which guarantees access to the storage account. What is the real benefit of having an account added as a linked storage account, from the ADF point of view or otherwise? Basically, what I am asking is: is there anything that we can't do purely using the account name and key, without adding the account as linked storage for the given HDInsight cluster?
The main reason to have additional accounts is that they have limits. A storage account can hold 500 TB of data and handle 20,000 requests per second. Depending on the size of your cluster and workload, you might hit the request limit. If you are worried about those limits and don't want to manage a lot of storage accounts, you should look into Azure Data Lake.
I think I sort of figured out the answer. With linked storage accounts, the cluster, when used as compute, can directly access blobs on those storage accounts without requiring us to specify the storage keys separately in queries. That's the use case for which linked storage is a must-have.