Is there something like transfer acceleration in Azure Blob Storage?

I would like to create an Azure Storage Account, and use blob storage in the region US West.
However, my business needs require uploading/downloading files from all over the world, not just US West.
When I download/upload files from India or places that are far from US West, there is a severe degradation in performance.
For downloads I could use a geo-redundant read replica, which partially solves the problem. However, this increases the cost significantly, and replication takes several minutes, which does not work for me.
In AWS S3 there is a feature called Transfer Acceleration, which speeds up uploads/downloads by optimizing the routing of packets. Is there any similar feature in Azure?

You may use AzCopy (a command-line utility that you can use to copy blobs or files to or from a storage account), Fast Data Transfer, or Azure Data Factory (a fully managed, serverless data integration service for ingesting, preparing, and transforming data at scale).
High-Throughput with Azure Blob Storage
You should look at the Azure Storage Data Movement Library: https://azure.microsoft.com/en-us/blog/introducing-azure-storage-data-movement-library-preview-2/
You should also take into account the performance guidelines from the Azure Storage team: https://azure.microsoft.com/en-us/documentation/articles/storage-performance-checklist/
For an overview of common Azure data transfer solutions, with recommended options depending on the network bandwidth in your environment and the size of the data you intend to transfer, see Choose an Azure solution for data transfer.
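For illustration, here is a minimal sketch of a parallel, chunked upload with the azure-storage-blob v12 Python SDK. The account name, container, file name, and SAS token are all placeholders; parallelism is one of the levers the performance checklist above discusses for high-latency links.

```python
# pip install azure-storage-blob
# Hypothetical names: "myaccount", "mycontainer", and the SAS token
# are placeholders for your own values.
from azure.storage.blob import BlobClient

blob = BlobClient(
    account_url="https://myaccount.blob.core.windows.net",
    container_name="mycontainer",
    blob_name="bigfile.bin",
    credential="<sas-token>",
)

with open("bigfile.bin", "rb") as data:
    # max_concurrency uploads blocks in parallel, which is the main
    # throughput lever for a single large blob over a long-haul link.
    blob.upload_blob(data, overwrite=True, max_concurrency=8)
```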

Related

When should we use a file share in Azure as compared to Azure Blobs?

Could someone please give some examples of where we would use an Azure file share instead of Azure Blobs? Whenever I search the internet, I only find that it can be mounted or that it follows the SMB protocol, but I still do not understand a single case where we would use an Azure file share.
I tried looking into When to use Azure blob storage versus Azure file share?
This is a similar question, but it doesn't answer my question.
Azure provides a variety of storage tools and services, including Azure Storage. To determine which Azure technology is best suited for your scenario, see Review your storage options in the Azure Cloud Adoption Framework.
For detailed information and examples refer to this article: https://learn.microsoft.com/en-us/azure/storage/common/storage-introduction
It depends mostly on your use case and how you plan to access the data. If you simply want to mount and access your files, Azure Files will be your best fit. If you are looking for the lowest cost and want to access your data programmatically through your application, Azure Blob Storage would be a better fit. Both are accessible through the portal or Azure Storage Explorer.
I also recommend this Learn module which covers the difference in data types and solutions.
Additional information: Azure Blob Storage vs Azure File Storage
Cost details of Azure Blob Storage pricing & Azure Files pricing
In short: if you ...
have an application that needs to store or access files in the cloud, use Blob Storage
need a file share that can be used by, for instance, a server, use File Shares
Azure Files shares can be mounted concurrently by cloud or on-premises deployments of Windows, Linux, and macOS. Azure Files shares can also be cached on Windows Servers with Azure File Sync for fast access near where the data is being used.
This means a File Share is, somewhat simplified, similar to a network share you would have in a local environment.
Azure Blob Storage helps you create data lakes for your analytics needs, and provides storage to build powerful cloud-native and mobile apps. Optimize costs with tiered storage for your long-term data, and flexibly scale up for high-performance computing and machine learning workloads.
This means Blob Storage is what you need when your application itself stores and accesses the data programmatically, as is typical for cloud-native and mobile apps.
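To make the distinction concrete, here is a hedged sketch of the two access patterns; the connection string, container, share mount point, and file names are all hypothetical.

```python
# pip install azure-storage-blob
from pathlib import Path
from azure.storage.blob import BlobServiceClient

# Blob Storage: the application talks to the service over HTTP(S).
service = BlobServiceClient.from_connection_string("<connection-string>")
container = service.get_container_client("app-data")  # hypothetical container
container.upload_blob("reports/2024.pdf", b"%PDF- ...", overwrite=True)

# Azure Files: once the share is mounted over SMB it is just a path,
# so existing file-based code works unchanged.
report = Path("/mnt/myshare/reports/2024.pdf")  # or Z:\reports\2024.pdf on Windows
data = report.read_bytes() if report.exists() else None
```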

Data Transfer between S3 and Blob Storage

We have a large amount of (live) data, about 1 PB, that we have to transfer periodically between S3 and Azure Blob Storage. What tools do you use for that? And what strategy do you use to minimize transfer cost and downtime?
We have evaluated a number of solutions, including AzCopy, but none of them satisfy all of our requirements. We are a small startup so we want to avoid homegrown solutions.
Thank you
Azure Data Factory is probably your best bet.
Access the ever-expanding portfolio of more than 80 prebuilt connectors (including Azure data services, on-premises data sources, Amazon S3 and Redshift, and Google BigQuery) at no additional cost. Data Factory provides efficient and resilient data transfer by using the full capacity of underlying network bandwidth, delivering up to 1.5 GB/s throughput.
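If Data Factory is heavier than you need for part of the data, one lighter-weight pattern (a sketch, not Data Factory's own mechanism) is Azure's server-side copy from a presigned S3 URL, so the bytes never pass through your own machine. Bucket, account, container, and key names below are placeholders.

```python
# pip install boto3 azure-storage-blob
import boto3
from azure.storage.blob import BlobClient

# Presign the S3 object so Azure can read it directly (placeholder names).
s3 = boto3.client("s3")
src_url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "data/part-0001.parquet"},
    ExpiresIn=3600,
)

dst = BlobClient(
    account_url="https://myaccount.blob.core.windows.net",
    container_name="landing",
    blob_name="data/part-0001.parquet",
    credential="<sas-token>",
)
dst.start_copy_from_url(src_url)  # asynchronous server-side copy
```

At 1 PB you would still want Data Factory (or Import/Export) for orchestration, retries, and monitoring; the sketch only shows the per-object mechanics.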

Azure blob: how many read/write operations?

I do not understand how to find my usage stats on Azure Blob Storage. Egress and ingress are shown as data volume, not as reads/writes, and I do not think they necessarily correspond to data operations, because there is no way something is downloading 20 GB of data a day from the blob storage (which is what the egress shows). Pricing, on the other hand, is all about read/write operations.
I want to find out the usage statistics on my blob storage so I could adapt the storage strategy, put the relevant stuff in hot/cold storage, archive things appropriately. I need practical data for analysis.
The metrics in portal are mostly error counts.
Azure Storage Analytics provides more detailed metrics (aggregated per minute and per hour) about the usage of all services in the storage account (Blob, File, Table, and Queue), for example:
user;GetBlob -> TotalRequests, TotalBillableRequests, TotalIngress, TotalEgress, Availability, etc.
Find more details at https://learn.microsoft.com/en-us/azure/storage/common/storage-analytics.
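If classic (table-based) metrics are enabled, the hourly rows land in a hidden table that, as far as I know, can be queried like any other table; a sketch with a placeholder connection string:

```python
# pip install azure-data-tables
from azure.data.tables import TableClient

# Hidden analytics table written by Storage Analytics (classic metrics).
table = TableClient.from_connection_string(
    "<connection-string>", table_name="$MetricsHourPrimaryTransactionsBlob"
)

# PartitionKey is the hour bucket (yyyyMMddTHHmm); RowKey looks like "user;GetBlob".
for row in table.query_entities("PartitionKey ge '20240101T0000'"):
    print(row["RowKey"], row.get("TotalRequests"), row.get("TotalEgress"))
```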

Faster blob storage copy tools across regions

I need to copy containers in Blob Storage across regions and wanted a solution that would do it without having to download the data locally and then upload it again. For example, I am trying to copy a container from East US to a container in Southeast Asia. I used AzCopy to do that and the throughput I got was 22 Mb/s at best. I am not using /SyncCopy either, so is this the best throughput the tool provides cross-region? Are there any other external tools that provide faster results? Thanks.
AzCopy is your best bet when it comes to rapid data movement within Azure. You could also consider the Azure Import/Export service if you have an urgent timeline for a large data transfer:
Use the Azure Import/Export service to securely transfer large amounts of data to Azure Blob storage and Azure Files by shipping disk drives to an Azure datacenter. This service can also be used to transfer data from Azure Storage to hard disk drives and ship them to your on-premises sites. Data from a single internal SATA disk drive can be imported either to Azure Blob storage or Azure Files.
There are also some external tools:
https://www.signiant.com/signiant-flight-for-fast-large-file-transfers-to-azure-blob-storage/
and:
http://asperasoft.com/fast-file-transfer-with-aspera-sod-azure/
https://learn.microsoft.com/en-us/azure/storage/common/storage-import-export-service
https://learn.microsoft.com/en-us/azure/storage/common/storage-moving-data
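For reference, the asynchronous server-side copy that AzCopy relies on when /SyncCopy is not used is also exposed directly in the SDK; a minimal sketch with placeholder account names, container, and SAS tokens:

```python
# pip install azure-storage-blob
import time
from azure.storage.blob import BlobClient

# Source blob URL must be readable by the target service (e.g. via SAS).
src = "https://eastusacct.blob.core.windows.net/videos/video.mp4?<sas>"

dst = BlobClient(
    account_url="https://seasiaacct.blob.core.windows.net",
    container_name="videos",
    blob_name="video.mp4",
    credential="<dst-sas>",
)

dst.start_copy_from_url(src)
# Cross-region copies run inside Azure and can take a while; poll the status.
while dst.get_blob_properties().copy.status == "pending":
    time.sleep(5)
print(dst.get_blob_properties().copy.status)  # "success" or "failed"
```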

Is this a sensible Azure Blob Storage setup and are there restructuring tools to help me migrate to it?

I think we have gone slightly wrong in the way we have used Azure Storage in a SaaS system. We created a storage account per client (security was a prime consideration) and containers per system area, e.g. Vehicle, Work, etc.
Having done further reading, it seems the suggestion would be that we should have used one account for all clients. Each client should have a container (so we can create it programmatically) which we then secure. Files should then be structured using a "virtual" folder structure, e.g. a container called "Client A", with files for the Jobs (in the Work area of the system) stored like Work/Jobs/{entity id}/blah.pdf. Does this sound sensible?
If so, we now have about 10 accounts that we need to restructure. Are there any tools that will let us easily copy one account's contents to another account's containers? I appreciate we probably can't move the files between accounts (as we set them up ages ago, we can't use the native copy function), so I guess we need some sort of copy. There are GBs of files across all the accounts.
It may not be such a bad idea to keep different storage accounts per client. The benefits of doing that (to me) are:
Better security as mentioned by you.
You'll be able to achieve better throughput per client, as each client will have their own storage account. If you keep one storage account for all clients and one client starts hitting that account heavily, other clients will be impacted.
Better scalability. Each storage account can hold up to 200 TB of data. So if you keep just one storage account and assuming each client consumes 100 GB of data, you'll be able to accommodate only 2000 clients (I hope my math is right :)). With individual storage accounts, you won't be restricted in that sense.
There're some downsides as well. Some of them are:
Management would be a nightmare. Imagine you have 2,000 customers: you would end up managing 2,000 storage accounts.
You may be limited by Windows Azure. Currently, by default you get about 10 or 20 storage accounts per subscription, and you would need to contact support to manually raise that limit. They can do that for you, but I would imagine you would want this to be a self-service model where you can create as many storage accounts as you want without contacting support.
Now coming to your question about tooling: you could write something yourself that makes use of the Copy Blob functionality, which copies blob data across storage accounts asynchronously. Basically, this is what you would do (see the sketch after these steps):
First create a blob container for each client in the target storage account.
Enumerate all blob containers in source storage account.
For each blob container in source storage account, enumerate the blobs.
Copy each blob asynchronously to target storage account in the client's blob container.
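A hedged sketch of those four steps for one client, using today's azure-storage-blob v12 Python SDK rather than the library available when this answer was written; the connection strings and the client container name are placeholders.

```python
# pip install azure-storage-blob
from azure.storage.blob import BlobServiceClient

source = BlobServiceClient.from_connection_string("<source-conn-string>")
target = BlobServiceClient.from_connection_string("<target-conn-string>")

# Step 1: a container for the client in the target account.
client_container = target.get_container_client("client-a")
if not client_container.exists():
    client_container.create_container()

# Steps 2-4: enumerate source containers and blobs, then start an
# asynchronous server-side copy of each, keeping the old container
# name as a "virtual" folder prefix.
for container in source.list_containers():
    src_container = source.get_container_client(container.name)
    for blob in src_container.list_blobs():
        src_url = src_container.get_blob_client(blob.name).url  # append a SAS if private
        dest = client_container.get_blob_client(f"{container.name}/{blob.name}")
        dest.start_copy_from_url(src_url)
```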
If you're a PowerShell fan, you can look into Cerebrata's Azure Management Cmdlets (http://www.cerebrata.com/Products/AzureManagementCmdlets) as well which wraps this functionality. I could have recommended Cerebrata's Azure Management Studio as well but I haven't tried this functionality just yet there [Disclosure: I'm one of the devs on Cerebrata team].
Hope this helps.
Adding to Gaurav Mantri's answer...
You can have a shared storage account for customers and use a Shared Access Signature (SAS) to limit access to a particular container or blob (as well as tables and queues):
http://msdn.microsoft.com/en-us/library/windowsazure/hh508996.aspx
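For illustration, a minimal sketch of generating a container-scoped SAS with today's v12 Python SDK (the linked article predates it); the account name, key, and container are placeholders.

```python
# pip install azure-storage-blob
from datetime import datetime, timedelta, timezone
from azure.storage.blob import generate_container_sas, ContainerSasPermissions

sas = generate_container_sas(
    account_name="sharedacct",            # placeholder shared account
    container_name="client-a",            # this client's container
    account_key="<account-key>",
    permission=ContainerSasPermissions(read=True, write=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=1),
)

# Hand this URL to client A; it grants access to their container only.
url = f"https://sharedacct.blob.core.windows.net/client-a?{sas}"
```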
