Is copying blob within the same Azure storage account instant? - azure

I'm using StartCopyFromBlob to copy a 2 GB blob from container A to container B within the same storage account. I noticed that it appears to be an instant operation: the CopyState status is Success right away. This is very good for us, so I want to confirm that we can actually rely on this.
I can't find any MSDN documentation about this "copy optimization" when copying within the same storage account. Is there a document on this copy behavior within the same account? I just want to make sure it is officially supported.

Only storage accounts created on or after June 7th, 2012 allow the Copy Blob operation to copy from another storage account. http://msdn.microsoft.com/en-us/library/windowsazure/dd894037.aspx
You might also find this post interesting: Introducing Asynchronous Cross-Account Copy Blob http://blogs.msdn.com/b/windowsazurestorage/archive/2012/06/12/introducing-asynchronous-cross-account-copy-blob.aspx
I hope this helps. Let me know if you need anything else.
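For reference, here is a minimal sketch of the same operation with the current Python SDK (azure-storage-blob); the connection string and container/blob names are placeholders. Within one account the service typically completes the copy server-side very quickly, but it is still worth checking the copy status before relying on the destination blob.

# Minimal sketch using the azure-storage-blob Python SDK.
# The connection string and container/blob names are placeholders.
import time
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
source = service.get_blob_client("container-a", "big-file.bin")
dest = service.get_blob_client("container-b", "big-file.bin")

# Server-side copy; within the same account this normally reports success immediately.
copy = dest.start_copy_from_url(source.url)
print(copy["copy_status"])

# Defensive check: poll until the copy is no longer pending.
props = dest.get_blob_properties()
while props.copy.status == "pending":
    time.sleep(1)
    props = dest.get_blob_properties()
print(props.copy.status, props.copy.progress)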

Copying new Azure blobs to different container

We have 5 vendors that are SFTPing files to Blob Storage. When the files come in, I need to copy them to another container and create a folder in that container named with the date to put the files in. From the second container, I need to copy the files to a file share on an Azure server. What is the best way to go about this?
I'm very new to Azure and unsure what the best way is to accomplish what I am being asked to do. Any help would be greatly appreciated.
I'd recommend using Azure Synapse for this task. It will let you move data to and from different storage securely and with little-to-no code.
Specifically, I'd put a blob storage trigger on the SFTP blob container so that the Synapse pipeline that moves the data runs automatically when your vendors drop their files.
Note that when you look for documentation on how to do things in Synapse, most of the time the Azure Data Factory documentation will also be applicable, since most of Data Factory's functionality is now in Synapse.
The ADF and Synapse YouTube channels are excellent resources, as well as the Microsoft Learn courses on Data Engineering.
I need to copy them to another container and create a folder in that container named with the date to put the files in.
You can use AzCopy to copy the files to another container by using a SAS token.
Command:
azcopy copy 'https://<storage account>.blob.core.windows.net/test/files?SAS' 'https://<storage account>.blob.core.windows.net/mycontainer/12-01-2023?SAS' --recursive
I need to copy the files to a file share on an Azure server
You can also copy the files from the container to a file share by using AzCopy.
Command:
azcopy copy 'https://<storage account>.blob.core.windows.net/test?SAS' 'https://<storage account>.file.core.windows.net/fileshare/12-01-2023?SAS' --recursive
You can get the SAS token through the portal:
Go to the portal -> your storage account -> Shared access signature -> check the resource types -> click Generate SAS and connection string.
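If you prefer to script this instead of clicking through the portal, a SAS can also be generated with the Python SDK. This is only a sketch: the account name/key, permissions and expiry below are assumptions you should adjust, and the token produced here covers the blob endpoint only.

# Sketch: generate an account-level SAS token programmatically (blob service only).
# Account name/key, permissions and expiry are placeholders.
from datetime import datetime, timedelta, timezone
from azure.storage.blob import generate_account_sas, ResourceTypes, AccountSasPermissions

sas = generate_account_sas(
    account_name="<storage account>",
    account_key="<account key>",
    resource_types=ResourceTypes(container=True, object=True),
    permission=AccountSasPermissions(read=True, write=True, list=True, create=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=8),
)
print(sas)  # append as '?<sas>' to the blob URLs used in the azcopy commands above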
AzCopy is probably a good way to move all or part of the blobs from one container to another, but I would suggest automating it with Azure Functions. It can be automated by triggering an Azure Function every time a blob or set of blobs (Azure can process a batch of blobs) is uploaded to the source container.
A note on Azure Functions: depending on the quantity of blobs to be moved and the time it could take, durable functions may be the better solution to avoid timeout exceptions. A durable function returns an immediate response but keeps running in the background.
Consider this article for a more complete approach to this solution:
https://build5nines.com/azure-functions-copy-blob-between-azure-storage-accounts-in-c/
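As a rough illustration of that pattern, here is a sketch using the Python v2 programming model for Azure Functions. The container names, the connection app setting and the date format are assumptions, and for large batches you would replace this with a durable function as noted above.

# Sketch of a blob-triggered Azure Function (Python v2 programming model).
# Container names, the "AzureWebJobsStorage" setting and the date format are assumptions.
import datetime
import os
import azure.functions as func
from azure.storage.blob import BlobServiceClient

app = func.FunctionApp()

@app.blob_trigger(arg_name="blob", path="sftp-landing/{name}",
                  connection="AzureWebJobsStorage")
def copy_to_dated_folder(blob: func.InputStream):
    service = BlobServiceClient.from_connection_string(os.environ["AzureWebJobsStorage"])
    source_name = blob.name.split("/", 1)[1]              # strip the container prefix
    folder = datetime.date.today().strftime("%m-%d-%Y")   # date-named "virtual folder"
    source_url = service.get_blob_client("sftp-landing", source_name).url
    dest = service.get_blob_client("processed", f"{folder}/{source_name}")
    dest.start_copy_from_url(source_url)                  # server-side copy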

Azure blob container backup and recovery

I am thinking of using Azure Blob Storage for a document management system which I am developing. All blobs (images, videos, Word/Excel/PDF files, etc.) will be stored in Azure Blob Storage. As I understand it, I need to create a container, and these files can be stored within the container.
I would like to know how to safeguard against accidental/malicious deletion of the container. If a container is deleted, all the files it contains will be lost. I am trying to figure out how to put a backup and recovery mechanism in place for my storage account so that it is guaranteed that if something happens to a container, I can recover the files inside it.
Is there any mechanism provided by Microsoft Azure for such backup and recovery, or do I need to explicitly write code so that files are stored in two separate blob storage accounts?
Anyone with access to your storage account's key (primary or secondary; there are two keys for a storage account) can manipulate the storage account in any way they see fit. The only way to ensure nothing happens? Don't give anyone access to the key(s). If you place the storage account within a resource group that only you have permissions on, you'll at least prevent others with access to the subscription from discovering the storage account and accessing it.
Within the subscription itself, you can place a lock on the actual resource (the storage account), so that nobody with access to the subscription accidentally deletes the entire storage account.
Note: with storage account keys, you do have the ability to regenerate the keys at any time. So if you ever suspected a key was compromised, you can perform a re-gen action.
Backups
There are several backup solutions offered for blob storage in case containers get deleted. More product info can be found here: https://azure.microsoft.com/en-us/services/backup/
Redundancy
If you are concerned about availability: "The data in your Microsoft Azure storage account is always replicated to ensure durability and high availability. Replication copies your data, either within the same data center, or to a second data center, depending on which replication option you choose." There are several replication options:
Locally redundant storage (LRS)
Zone-redundant storage (ZRS)
Geo-redundant storage (GRS)
Read-access geo-redundant storage (RA-GRS)
More details can be found here:
https://learn.microsoft.com/en-us/azure/storage/common/storage-redundancy
Managing Access
Finally, managing access to your storage account is the best way to secure your data and avoid any loss. You can provide read-only access if you don't want anyone to delete files, folders, etc., through the use of SAS (Shared Access Signatures), which allow you to create policies and grant access based on Read, Write, List, Delete, etc. A quick GIF demo can be seen here: https://azure.microsoft.com/en-us/updates/manage-stored-access-policies-for-storage-accounts-from-within-the-azure-portal/
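As a rough illustration (the account name/key, container name and expiry below are placeholders), a read-and-list-only SAS for a single container can be generated with the Python SDK like this:

# Sketch: issue a read/list-only SAS for one container so holders cannot delete anything.
# Account name/key, container name and expiry are placeholders.
from datetime import datetime, timedelta, timezone
from azure.storage.blob import generate_container_sas, ContainerSasPermissions

sas = generate_container_sas(
    account_name="<storage account>",
    container_name="documents",
    account_key="<account key>",
    permission=ContainerSasPermissions(read=True, list=True),  # no write/delete
    expiry=datetime.now(timezone.utc) + timedelta(days=7),
)
url = f"https://<storage account>.blob.core.windows.net/documents?{sas}"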
We are using blob to store documents and for documents management.
To prevent deletion of blobs, you can now enable soft delete, as described here:
https://azure.microsoft.com/en-us/blog/soft-delete-for-azure-storage-blobs-ga/
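Soft delete can also be turned on programmatically; here is a sketch with the Python SDK (the 14-day retention window is an arbitrary example):

# Sketch: enable blob soft delete with a retention window (14 days chosen arbitrarily).
from azure.storage.blob import BlobServiceClient, RetentionPolicy

service = BlobServiceClient.from_connection_string("<connection-string>")
service.set_service_properties(
    delete_retention_policy=RetentionPolicy(enabled=True, days=14)
)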
You can also create your own automation around PowerShell and AzCopy to do incremental and full backups.
The last element would be to use RA-GRS, where you can read from a secondary copy of the blob in another region in case the primary data center goes down.
Designing Highly Available Applications using RA-GRS
https://learn.microsoft.com/en-us/azure/storage/common/storage-designing-ha-apps-with-ragrs?toc=%2fazure%2fstorage%2fqueues%2ftoc.json
Use Microsoft's Azure Storage Explorer. It will allow you to download the full contents of blob containers including folders and subfolders with blobs. Conversely, you can upload to containers in the same way. Simple and free!

How to archive Azure blob storage content?

I need to store some temporary files for maybe 1 to 3 months. I only need to keep the last three months' files; older files need to be deleted. How can I do this in Azure Blob Storage? Is there any other option in this case besides blob storage?
IMHO, the best option to store files in Azure is either Blob Storage or File Storage; however, neither of them supports automatic expiration of content (based on age or some other criteria).
This feature was requested for Blob Storage a long time ago, but unfortunately no progress has been made so far (https://feedback.azure.com/forums/217298-storage/suggestions/7010724-support-expiration-auto-deletion-of-blobs).
You could, however, write something of your own to achieve this. It's rather simple: periodically (say once a day) your program fetches the list of blobs and compares each blob's last-modified date with the current date. If the last-modified date of the blob is older than the desired period (1 or 3 months, like you mentioned), you simply delete the blob.
You can use WebJobs, Azure Functions or Azure Automation to schedule your code to run on a periodic basis. In fact, there's ready-made code available to you if you want to use the Azure Automation service: https://gallery.technet.microsoft.com/scriptcenter/Remove-Storage-Blobs-that-aae4b761.
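A minimal sketch of that cleanup with the Python SDK (the connection string, container name and the 90-day cutoff are assumptions; schedule it with any of the services mentioned above):

# Sketch: delete blobs whose last-modified date is older than a cutoff (90 days here).
# Connection string and container name are placeholders.
from datetime import datetime, timedelta, timezone
from azure.storage.blob import BlobServiceClient

cutoff = datetime.now(timezone.utc) - timedelta(days=90)
service = BlobServiceClient.from_connection_string("<connection-string>")
container = service.get_container_client("temp-files")

for blob in container.list_blobs():
    if blob.last_modified < cutoff:
        container.delete_blob(blob.name)
        print(f"deleted {blob.name}")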
As far as I know, Azure Blob Storage is an appropriate approach for storing some temporary files. For your scenario, there is no built-in option to delete the old files, so you need to delete your temporary files programmatically or manually.
As a simple approach, you could upload your blobs (files) with a date-based naming convention (e.g. https://<your-storagename>.blob.core.windows.net/containerName/2016-11/fileName or https://<your-storagename>.blob.core.windows.net/2016-11/fileName), then manage your files manually via Microsoft Azure Storage Explorer.
Also, you could check your files and delete the old ones before uploading a new temporary file. For more details, you could follow storage-blob-dotnet-store-temp-files and override the method CleanStorageIfReachLimit to implement your logic for deleting blobs (files).
Additionally, you could leverage a scheduled Azure WebJob to clean up your blobs (files).
You can use Azure Cool Blob Storage.
It is cheaper than the Hot tier of blob storage and is more suitable for archives.
You can store your less frequently accessed data in the Cool access tier at a low storage cost (as low as $0.01 per GB in some regions), and your more frequently accessed data in the Hot access tier at a lower access cost.
Here is a document that explains its features:
https://azure.microsoft.com/en-us/blog/introducing-azure-cool-storage/
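The access tier can also be set per blob after upload; here is a quick sketch with the Python SDK (the container and blob names are placeholders):

# Sketch: move an existing blob to the Cool access tier.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")
blob = service.get_blob_client("archive", "2016-11/report.pdf")
blob.set_standard_blob_tier("Cool")  # "Hot" or "Cool"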

Is this a sensible Azure Blob Storage setup and are there restructuring tools to help me migrate to it?

I think we have gone slightly wrong in the way we have used Azure Storage in a SaaS system. We created a storage account per client (security was the prime consideration) and containers per system area, e.g. Vehicle, Work, etc.
Having done further reading, it seems the suggestion is that we should have used one account for all clients. Each client would have a container (so we can create it programmatically) which we then secure, and files would just be structured using a "virtual" folder structure, e.g. a container called "Client A", with files for Jobs (in the Work area of the system) stored like Work/Jobs/{entity id}/blah.pdf. Does this sound sensible?
If so, we now have about 10 accounts that we need to restructure. Are there any tools that will let us easily copy one account's contents into containers in another account? I appreciate we probably can't move the files between accounts (we set them up ages ago, so we can't use the native copy function), so I guess it would be some sort of copy. There are GBs of files across all the accounts.
It may not be such a bad idea to keep different storage accounts per client. The benefits of doing that (to me) are:
Better security as mentioned by you.
You'll be able to achieve better throughput per client, as each client will have their own storage account. If you keep one storage account for all clients and one client starts hitting that account heavily, other clients will be impacted.
Better scalability. Each storage account can hold up to 200 TB of data. So if you keep just one storage account and assuming each client consumes 100 GB of data, you'll be able to accommodate only 2000 clients (I hope my math is right :)). With individual storage accounts, you won't be restricted in that sense.
There're some downsides as well. Some of them are:
Management would be a nightmare. Imagine you have 2000 customers: you would end up managing 2000 storage accounts.
You may be limited by Windows Azure. Currently, by default you get about 10 or 20 storage accounts per subscription, and you would need to contact support to manually raise that limit. They can do that for you, but I would imagine you would want this to be a self-service model where you could create as many storage accounts as you want without contacting support.
Now, coming to your question about tooling: you could possibly write something of your own which makes use of the Copy Blob functionality. This functionality allows you to copy blob data across storage accounts asynchronously. Basically, this is what you would do (a rough sketch follows the steps below):
First create a blob container for each client in the target storage account.
Enumerate all blob containers in source storage account.
For each blob container in source storage account, enumerate the blobs.
Copy each blob asynchronously to target storage account in the client's blob container.
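A rough sketch of those steps with the Python SDK, assuming the source (per-client) account is readable via a SAS token and maps to a single container in the target account; the connection strings, the SAS token and the container name are all placeholders:

# Sketch of the restructuring copy: one client's source account -> one container in
# the shared target account, keeping source container names as "virtual folders".
from azure.storage.blob import BlobServiceClient

source = BlobServiceClient.from_connection_string("<source-account-connection-string>")
target = BlobServiceClient.from_connection_string("<target-account-connection-string>")
source_sas = "<sas-token-with-read-and-list-on-the-source-account>"

# Step 1: create the client's container in the target account.
client_container = target.get_container_client("client-a")
if not client_container.exists():
    client_container.create_container()

# Steps 2-4: enumerate source containers and blobs, then copy each asynchronously.
for container in source.list_containers():
    src_container = source.get_container_client(container.name)
    for blob in src_container.list_blobs():
        src_url = f"{src_container.get_blob_client(blob.name).url}?{source_sas}"
        dest = client_container.get_blob_client(f"{container.name}/{blob.name}")
        dest.start_copy_from_url(src_url)  # server-side, asynchronous copy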
If you're a PowerShell fan, you can also look into Cerebrata's Azure Management Cmdlets (http://www.cerebrata.com/Products/AzureManagementCmdlets), which wrap this functionality. I could have recommended Cerebrata's Azure Management Studio as well, but I haven't tried this functionality there just yet. [Disclosure: I'm one of the devs on the Cerebrata team.]
Hope this helps.
Adding to Gaurav Mantri's answer...
You can have a shared storage account for customers and use Shared Access Signatures (SAS) to limit access to a particular container or blobs (as well as tables and queues)...
http://msdn.microsoft.com/en-us/library/windowsazure/hh508996.aspx

How to convert exist Block Blob to PageBlob

I used CloudBerry Explorer to copy a VM (IaaS) disk file to another storage account.
But when the copy finished, I found that the newly created blob is a block blob, not a page blob.
The tool didn't preserve the source blob type, which is a page blob.
Is there any way to convert a block blob to a page blob? Thanks.
No. Once a blob is created/uploaded, you can't change the blob type. Unfortunately, you would need to recreate/re-upload the blob. However, I'm somewhat surprised: you mentioned that you copied the blob from one storage account to another, and the Copy Blob operation within Windows Azure (i.e. from one storage account to another) preserves the source blob type. It may be a bug in CloudBerry Explorer. I wrote a blog post some days ago about moving virtual machines from one subscription to another (http://gauravmantri.com/2012/07/04/how-to-move-windows-azure-virtual-machines-from-one-subscription-to-another/) and it has some sample code and other useful information for copying blobs across storage accounts. You may want to take a look at that. HTH.
It has been a while since the original question, but it seems that the solution I used is not well known, or at least is not being used.
In Azure Storage you cannot change the blob type of an existing file. Some people recommend downloading the files and uploading them again, but you can also use azcopy from the Cloud Shell in the Azure portal. At least in PowerShell the azcopy utility is available; I haven't tried it in Bash.
You need two SAS URLs with adequate permissions to read from the original container and to write to the destination. You also need the LIST permission. Having that, open the Cloud Shell and run the command:
azcopy copy 'https://<source-storage-account-name>.blob.core.windows.net/<source-container-name>?<SAS-token>' 'https://<dest-storage-account-name>.blob.core.windows.net/<dest-container-name>?<SAS-token>' --recursive --blob-type=BlockBlob
After copying, just delete the old page blobs.
More options for azcopy copy command can be found in the documentation.
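If you would rather do the re-creation in code than with azcopy, here is a hedged sketch with the Python SDK: it downloads the blob and uploads it again with the other blob type. Note that page blobs require a size that is a multiple of 512 bytes, the zero-padding shown is purely illustrative (a real VHD is already aligned), and for a multi-GB disk you would stream or use azcopy rather than buffering in memory.

# Sketch: re-create a blob with a different blob type (block -> page in this example).
# Names are placeholders; page blobs need a size that is a multiple of 512 bytes,
# so the padding below is only a crude illustration.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<connection-string>")

src = service.get_blob_client("disks", "vm-disk.vhd")
data = src.download_blob().readall()           # buffers in memory; fine for a sketch only

if len(data) % 512:
    data += b"\x00" * (512 - len(data) % 512)  # pad to a 512-byte boundary

dst = service.get_blob_client("disks-fixed", "vm-disk.vhd")
dst.upload_blob(data, blob_type="PageBlob", overwrite=True)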