Faster blob storage copy tools across regions - azure

I need to copy containers in Blob Storage across regions and want a solution that does it without downloading the data locally and uploading it again. For example, I am trying to copy a container from East US to a container in Southeast Asia. I used AzCopy to do that and the throughput I got was 22 Mb/s at best. I am not using /SyncCopy either, so is this the best throughput the tool provides across regions? Are there any other external tools that give faster results? Thanks.

AzCopy is your best bet for moving data quickly within Azure. If you have an urgent timeline and a large amount of data to transfer, you could also consider the Azure Import/Export service:
The Azure Import/Export service securely transfers large amounts of data to Azure Blob storage and Azure Files by shipping disk drives to an Azure data center. The service can also be used to transfer data from Azure storage to hard disk drives that are shipped to your on-premises sites. Data from a single internal SATA disk drive can be imported either to Azure Blob storage or Azure Files.
There are also some external tools:
https://www.signiant.com/signiant-flight-for-fast-large-file-transfers-to-azure-blob-storage/
and:
http://asperasoft.com/fast-file-transfer-with-aspera-sod-azure/
https://learn.microsoft.com/en-us/azure/storage/common/storage-import-export-service
https://learn.microsoft.com/en-us/azure/storage/common/storage-moving-data

Related

When should we use file share in azure as compared to Azure Blobs?

Could someone please give some examples of where Azure file shares should be used instead of Azure Blobs? Whenever I search the internet, I only find that a file share can be mounted or that it uses the SMB protocol, but I still cannot think of a single case where I would use an Azure file share.
I looked at the similar question "When to use Azure blob storage versus Azure file share?", but it doesn't answer my question.
Azure provides a variety of storage tools and services, including Azure Storage. To determine which Azure technology is best suited for your scenario, see Review your storage options in the Azure Cloud Adoption Framework.
For detailed information and examples refer to this article: https://learn.microsoft.com/en-us/azure/storage/common/storage-introduction
It depends mostly on your use case and how you plan to access the data. If you simply want to mount and access your files, Azure Files will be your best fit. If you are looking for the lowest cost and want to access your data programmatically through your application, Azure Blob would be a better fit. Both are accessible through the portal or Azure Storage Explorer.
I also recommend this Learn module which covers the difference in data types and solutions.
Additional information: Azure Blob Storage vs Azure File Storage
Cost details of Azure Blob Storage pricing & Azure Files pricing
In short: if you have an application that needs to store or access files in the cloud, use Blob Storage; if you need a file share that can be used by, for instance, a server, use File Shares.
Azure Files shares can be mounted concurrently by cloud or on-premises deployments of Windows, Linux, and macOS. Azure Files shares can also be cached on Windows Servers with Azure File Sync for fast access near where the data is being used.
This means a File Share is, somewhat simplified, similar to a network share you would have in a local environment.
Azure Blob Storage helps you create data lakes for your analytics needs, and provides storage to build powerful cloud-native and mobile apps. Optimize costs with tiered storage for your long-term data, and flexibly scale up for high-performance computing and machine learning workloads.
This means Blob Storage is what you need when you're building powerful cloud-native and mobile apps.

Azure blob storage streaming performance issue

Until now my application worked with local zip files: it returned a FileStream directly (return new FileStream()) for a zip file located on an SSD or network drive path (the zip files can be hundreds of GB).
I configured the application to work with Azure Blob Storage, so each FileStream that was returned is now obtained through the Azure Blob SDK method:
GetBlobStreamAsync(ContainerName, BlobName).ConfigureAwait(false).GetAwaiter().GetResult()
I uploaded some zip files to a container in the blob storage and set the connection string in the application to work with that storage account.
The application is deployed and running on a Windows virtual machine located in the same region as the Azure Storage account.
Note: This is a private cloud network.
When the app streams the zip file from Azure Blob Storage, performance drops by a factor of at least 8-9 (which is a problem with hundreds of GB).
The speed comparison is between the local C: drive of the Windows virtual machine that the application runs on and an Azure Storage account located in the same region.
Note: the network bandwidth of the VM on Azure is 50 GB.
Solutions that I tried:
Azure blob Premium Performance storage - Didn’t improve performance
.NET Core - to take advantage of its performance enhancements (we work with .NET Framework, so this does not apply to us).
Network File System (NFS) 3.0 performance considerations in Azure Blob storage - (Does not work with private cloud).
Hot, Cool, and Archive access tiers for blob data - The default is Hot so we already tried this scenario with no improvements.
Solutions I want to try:
Azure Files Share Storage as a cache solution
.NET Framework configuration - several quick configuration settings that are documented as giving significant performance improvements
Question:
Does anyone have any suggestions on how I can optimize streaming from Azure Blob Storage?
Azure Files (share) or Storage Blob services are likely not the right services to be utilized for this scenario. There are two possible paths:
Break a single file into multiple files and leverage Storage Blob service that handles throughput better than Azure Files. Azure Files performs better with small(er) files which are typical to user documents (PDFs, Word, Excel, etc.)
Switch over to a more dedicated service that is designed specifically for large-size data transfer if breaking up a single file into multiple blobs is not an option.
The recommendation for each option will highly depend on the implementation details, requirements and constraints of the system.
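The first path (splitting one large file into several smaller pieces) can be sketched roughly as follows. This is only an illustration and not part of any Azure SDK: split_into_ranges is a hypothetical helper that computes byte ranges, and the chunk size is an assumption you would tune. Each range could then be stored as its own blob, or read as a separate ranged request in parallel.

```python
from typing import List, Tuple


def split_into_ranges(total_size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges that together cover total_size bytes.

    Hypothetical helper for illustration; it does no I/O.
    """
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    ranges = []
    start = 0
    while start < total_size:
        # The last range may be shorter than chunk_size.
        end = min(start + chunk_size, total_size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    # A 10-byte "file" split into parts of at most 4 bytes.
    print(split_into_ranges(10, 4))  # [(0, 3), (4, 7), (8, 9)]
```

Whether the pieces become separate blobs or ranged reads of one blob depends on the constraints mentioned above; the range computation is the same in both cases.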

Is there something like transfer acceleration in Azure Blob Storage?

I would like to create an Azure Storage Account, and use blob storage in the region US West.
However, my business need is to upload and download files from all over the world, not just from US West.
When I download or upload files from India or other places far from US West, performance degrades severely.
For downloads I could use a geo-redundant read replica. This partially solves the problem, but it increases the cost significantly. Replication also takes several minutes, which does not work for me.
In AWS S3 storage, there is a feature called transfer acceleration. Transfer acceleration speeds up the uploads/downloads by optimizing the routing of packets. Is there any similar feature in Azure?
You may use AzCopy (a command-line utility that you can use to copy blobs or files to or from a storage account), Fast Data Transfer, or Azure Data Factory (a fully managed, serverless data integration solution for ingesting, preparing, and transforming all your data at scale).
High-Throughput with Azure Blob Storage
You should look at the Azure Storage Data Movement Library: https://azure.microsoft.com/en-us/blog/introducing-azure-storage-data-movement-library-preview-2/
You should also take into account the performance guidelines from the Azure Storage Team https://azure.microsoft.com/en-us/documentation/articles/storage-performance-checklist/
Choose an Azure solution for data transfer: this article provides an overview of some of the common Azure data transfer solutions. It also links out to recommended options depending on the network bandwidth in your environment and the size of the data you intend to transfer.

Backup files to Azure Storage

We are migrating from an on-premises virtual machine to Azure cloud. The virtual machine will eventually be decommissioned and we have many files and folders that we don't want to lose, like old websites and databases, scripts, programs etc.
We use an Azure storage account for storing and retrieving images via blob containers for the live websites.
Q: What is the best and most cost-effective way to back up a large number of files that are unused in production and rarely accessed from an on-premises virtual machine to Azure cloud?
Changing the access tier to Azure Archive Storage (if you store the data in blobs) would be your best option. A few notes:
The Archive storage tier is only available at the blob level and not at the storage account level.
Archive storage is offline and offers the lowest storage costs but also the highest access costs
Hot, Cool, and Archive tiers can be set at the object level.
Additional info can be found here: https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-storage-tiers
My recommendation would be to move those unused files to Azure Archive Storage, which is cost effective. Note that archived blobs are offline and must be rehydrated to an online tier before they can be read.
https://azure.microsoft.com/en-us/services/storage/archive/

Upload 650,000 documents to Azure

I can't seem to find any reference to bulk uploading data to Azure.
I have a document store with 650,000 PDF documents that take up about 1.2 TB of disk space.
Uploading those files to Azure via the web will be difficult. Is there a way I can mail a hard drive and have your team upload them for me?
If not can you recommend the best way to upload this many documents?
Maybe not the answer you expected, but you could use Amazon's AWS Import/Export (this allows you to mail them a HDD and they'll import it in your S3 account).
To transfer the data to a Windows Azure Storage Account you can leverage one of the new features in the 1.7.1 SDK: the StartCopyFromBlob method. This method allows you to copy a file at a specific url in an asynchronous way (you could use this to copy all files from your S3 to your Azure storage account).
Read the following blogpost for a fully working example: How to Copy a Bucket from Amazon S3 to Windows Azure Blob Storage using “Copy Blob”
While Azure doesn't offer a physical ingestion process today, if you talk nicely to the Azure team they can do this as a one off. If you like I can get a contact on the product team for you (dave at greenbutton dot com).
Alternatively, there are solutions such as Aspera, which provide accelerated data transfers over UDP and are being beta tested in Azure along with the Azure Media Services offering.
We have some tools that help with this as well http://www.greenbutton.com and leverage Aspera's technology.
As disk shipment is not supported by Windows Azure, your best bet is to use a third-party application (or write your own) that supports parallel upload. This way you can still upload much faster. Third-party applications like Gladinet and Cloudberry could be used to upload the data, but I am not sure how configurable they are for getting the maximum parallelism and the fastest upload.
If you decide to write by yourself here is the starting point: Asynchronous Parallel Block Blob Transfers with Progress Change Notification
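As a rough idea of what writing your own parallel uploader involves, here is a minimal sketch using a thread pool. It is not taken from the linked article and does not call any Azure API: upload_all and the upload_one callback are hypothetical names, and upload_one stands in for whatever function actually sends a single file to Blob storage. The worker count of 8 is an arbitrary assumption to tune against your bandwidth.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Dict, Iterable


def upload_all(
    paths: Iterable[Path],
    upload_one: Callable[[Path], None],
    max_workers: int = 8,
) -> Dict[Path, BaseException]:
    """Run upload_one for every path concurrently.

    Returns a mapping of the paths that failed to the exception raised,
    so the caller can retry only those files.
    """
    failures: Dict[Path, BaseException] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which future belongs to which file.
        futures = {pool.submit(upload_one, path): path for path in paths}
        for future in as_completed(futures):
            error = future.exception()
            if error is not None:
                failures[futures[future]] = error
    return failures


if __name__ == "__main__":
    def fake_upload(path: Path) -> None:
        # Placeholder for a real upload call (assumption).
        print(f"uploading {path}")

    failed = upload_all(Path(".").glob("*.pdf"), fake_upload)
    print(f"{len(failed)} file(s) failed")
```

Collecting failures instead of stopping at the first error matters with hundreds of thousands of files, because a single bad file should not abort the whole run.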
I know this is a bit too late for the OP, but in the Azure Management Portal, under Storage, pick your storage instance, then click the Import/Export link at the top. At the bottom of that screen, there is a "Create Import Job" link and icon. Also, if you click the blue help icon on the far right side, it says this:
You can use the Windows Azure Import/Export service to transfer large amounts of file data to Windows Azure Blob storage in situations where uploading over the network is prohibitively expensive or infeasible. You can also use the Import/Export service to transfer large quantities of data resident in Blob storage to your on-premises installations in a timely and cost-effective manner. Use the Windows Azure Import/Export Service to Transfer Data to Blob Storage
To transfer a large set of file data into Blob storage, you can send one or more hard drives containing that data to a Microsoft data center, where your data will be uploaded to your storage account. Similarly, to export data from Blob storage, you can send empty hard drives to a Microsoft data center, where the Blob data from your storage account will be copied to your hard drives and then returned to you. Before you send in a drive that contains data, you'll encrypt the data on the drive; when Microsoft exports your data to send to you, the data will also be encrypted before shipping.
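To judge when uploading over the network stops being practical, a back-of-the-envelope estimate can help. The sketch below uses the 1.2 TB figure from the question; the link speeds are made-up examples, decimal units are assumed (1 TB = 10^12 bytes, 1 Mb/s = 10^6 bits per second), and protocol overhead and per-file latency are ignored, so real transfers will take longer.

```python
def transfer_hours(size_bytes: float, megabits_per_second: float) -> float:
    """Estimate the hours needed to move size_bytes at a sustained link speed."""
    if megabits_per_second <= 0:
        raise ValueError("megabits_per_second must be positive")
    seconds = (size_bytes * 8) / (megabits_per_second * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    size = 1_200_000_000_000  # 1.2 TB, as in the question
    for speed in (10, 100, 1000):  # hypothetical sustained uplink speeds in Mb/s
        print(f"{speed:>5} Mb/s: {transfer_hours(size, speed):.1f} hours")
    # Prints:
    #    10 Mb/s: 266.7 hours
    #   100 Mb/s: 26.7 hours
    #  1000 Mb/s: 2.7 hours
```

At the slow end of that range, shipping drives may well be quicker; on a fast link a parallel network upload is usually simpler.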
Both Windows Azure Storage PowerShell and AzCopy can bulk upload data to Azure.
For Azure Storage PowerShell, you could use ls -File -Recurse | Set-AzureStorageBlobContent -Container upload.
You can refer to http://msdn.microsoft.com/en-us/library/dn408487.aspx for more details.
For AzCopy, you can refer to this article: http://blogs.msdn.com/b/windowsazurestorage/archive/2012/12/03/azcopy-uploading-downloading-files-for-windows-azure-blobs.aspx
