How to upload a large file in chunks with parallelism in Azure SDK v12?

In Azure SDK v11, we could specify the ParallelOperationThreadCount through BlobRequestOptions. In Azure SDK v12, I see that BlobClientOptions does not have this, and on BlockBlobClient (previously CloudBlockBlob in v11) parallelism is only mentioned for the download methods.
We have three files: one 200 MB, one 150 MB, and one 20 MB. For each file, we want it to be split into blocks and have those blocks uploaded in parallel. Is this done automatically by BlockBlobClient? If possible, we would also like to upload the three files in parallel.

You can make use of StorageTransferOptions in v12.
Sample code:
BlobServiceClient blobServiceClient = new BlobServiceClient(conn_str);
BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient("xxx");
BlobClient blobClient = containerClient.GetBlobClient("xxx");

// Configure chunking and parallelism here.
StorageTransferOptions transferOptions = new StorageTransferOptions
{
    // Maximum number of workers used for a parallel transfer.
    MaximumConcurrency = 8,
    // Maximum size of each chunk (available in recent Azure.Storage versions).
    MaximumTransferSize = 4 * 1024 * 1024
};
blobClient.Upload("xxx", transferOptions: transferOptions);
By the way, for uploading large files you can also use the Microsoft Azure Storage Data Movement Library for better performance; a sketch follows.
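A minimal Data Movement sketch, assuming the Microsoft.Azure.Storage.DataMovement package (older releases use the Microsoft.WindowsAzure.Storage.* namespaces instead); the container, blob, and file names are placeholders:
using Microsoft.Azure.Storage;
using Microsoft.Azure.Storage.Blob;
using Microsoft.Azure.Storage.DataMovement;

// Inside an async method:
CloudStorageAccount account = CloudStorageAccount.Parse(conn_str);
CloudBlockBlob destBlob = account.CreateCloudBlobClient()
    .GetContainerReference("xxx")
    .GetBlockBlobReference("largefile.bin");

// Raise the number of parallel transfer workers (the default scales with CPU count).
TransferManager.Configurations.ParallelOperations = 16;

// The library splits the file into chunks and uploads them in parallel.
await TransferManager.UploadAsync(@"C:\data\largefile.bin", destBlob);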

Using Fiddler, I verified that BlockBlobClient does indeed upload the files in chunks without any extra work. To process each of the major files in parallel, I simply created a task for each one, added it to a list of tasks, and used await Task.WhenAll(tasks), as in the sketch below.
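A minimal sketch of that pattern, assuming the v12 Azure.Storage.Blobs package; the connection string, container, and file names are placeholders:
using System.Collections.Generic;
using System.Threading.Tasks;
using Azure.Storage;
using Azure.Storage.Blobs;

// Inside an async method:
BlobContainerClient containerClient =
    new BlobServiceClient(conn_str).GetBlobContainerClient("xxx");
StorageTransferOptions transferOptions = new StorageTransferOptions { MaximumConcurrency = 8 };
string[] files = { "file200mb.bin", "file150mb.bin", "file20mb.bin" };

List<Task> tasks = new List<Task>();
foreach (string file in files)
{
    BlobClient blobClient = containerClient.GetBlobClient(file);
    // Each UploadAsync chunks its own file; Task.WhenAll runs the three uploads concurrently.
    tasks.Add(blobClient.UploadAsync(file, transferOptions: transferOptions));
}
await Task.WhenAll(tasks);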


Does Azure blob storage use gzip across the wire?

I want to know if there is a benefit to zipping files before sending them to Azure Blob Storage - strictly for transfer purposes. Put another way, will pre-zipping files make file transfers any faster when going to/from blob storage? Or does this automatically happen at the transport level by using gzip?
As of 12 August 2015, Azure blob storage (when mounted to the CDN) supports automatic GZip compression:
Compression method - Supported compression methods are gzip/deflate/bzip2; a supported method must be set in the Accept-Encoding request header.
See: Improve performance by compressing files
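As an illustration (not part of the original answer; the CDN endpoint URL is a placeholder), a client opts in by sending the Accept-Encoding header, which HttpClient does automatically when automatic decompression is enabled:
using System.Net;
using System.Net.Http;

// Inside an async method. AutomaticDecompression makes HttpClient send
// "Accept-Encoding: gzip, deflate" and transparently decompress the response.
var handler = new HttpClientHandler
{
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
};
using (var client = new HttpClient(handler))
{
    byte[] css = await client.GetByteArrayAsync("https://mycdn.azureedge.net/content/site.css");
}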
UPDATE
I'm unsure of what I originally did and how, but all I can think is that I was reading the results incorrectly. Everything I can find about Azure (from MSDN to the code itself) now tells me that Azure does not support gzip for transfer purposes. I do not know under what circumstances I was able to get the following results and am unable to reproduce them now. Needless to say, I'm very disappointed.
(THIS ANSWER IS INCORRECT, SEE THE UPDATE ABOVE) The answer is no, there is no transfer-speed benefit to zipping a file before sending it to blob storage. With Fiddler running, you can see that the transport level automatically gzips content across the wire (screenshots omitted here).
Edit 1 - Quick Clarifications for Gaurav
The byte array that comes back in code has a length of 386803, but the network card only saw 23505 bytes go by, because it was gzipped by Azure in the response. I didn't have to do anything for that to happen.
Here is the code I'm using to initiate the request from Blob Storage:
public Byte[] Read(string containerName, string filename)
{
    CheckContainer(containerName);
    Initialize();

    // Retrieve reference to a previously created container.
    CloudBlobContainer container = _blobClient.GetContainerReference(containerName);

    // Retrieve reference to the requested blob.
    CloudBlockBlob blockBlob = container.GetBlockBlobReference(filename);

    byte[] buffer;

    // Download the blob contents into a memory stream and copy them into a byte array.
    using (var stream = new MemoryStream())
    {
        blockBlob.DownloadToStream(stream);
        stream.Seek(0, SeekOrigin.Begin);
        buffer = new byte[stream.Length];
        stream.Read(buffer, 0, (int)stream.Length);
    }
    return buffer;
}

If using ImageResizer with Azure blobs do I need the AzureReader2 plugin?

I'm working on a personal project to manage users of my club. It's hosted on the free Azure package (for now at least), partly as an experiment to try out Azure. Part of creating a user's record is adding a photo, so I've got a Contact Card view that lets me see who they are, when they came, and a photo.
I have installed ImageResizer and it's really easy to resize the 10 MP photos from my camera and save them to the local file system, but it seems that for Azure I need to use blobs to upload pictures to Windows Azure Web Sites, and that's new to me. The ImageResizer documentation says I need the AzureReader2 plugin to work with Azure blobs, but it isn't free. It also says in their best practices (#5) to:
Use dynamic resizing instead of pre-resizing your images.
That is not what I was planning; I was going to resize to 300x300 and 75x75 (for the thumbnail) when creating the user's record. But if I should be storing full-size images as blobs and dynamically resizing on the way out, can I just use standard means to upload a blob into a container, and then, when I want to display the images, use ImageResizer and pass it each image to resize as required? That way I wouldn't need AzureReader2 - or have I misunderstood what it does and how it works?
Is there another way to consider?
I've not yet implemented cropping, but that's next to tackle once I've worked out how to actually store the images properly.
With some trepidation, I'm going to disagree with astaykov here. I believe you CAN use ImageResizer with Azure WITHOUT needing AzureReader2. Maybe I should qualify that by saying 'It works on my setup' :)
I'm using ImageResizer in an MVC 3 application. I have a standard Azure account with an images container.
Here's my test code for the view:
@using (Html.BeginForm("UploadPhoto", "BasicProfile", FormMethod.Post, new { enctype = "multipart/form-data" }))
{
    <input type="file" name="file" />
    <input type="submit" value="OK" />
}
And here's the corresponding code in the Post Action method:
// This action handles the form POST and the upload
[HttpPost]
public ActionResult UploadPhoto(HttpPostedFileBase file)
{
    // Verify that the user selected a file
    if (file != null && file.ContentLength > 0)
    {
        string newGuid = Guid.NewGuid().ToString();
        CloudStorageAccount storageAccount = CloudStorageAccount.Parse(ConfigurationManager.AppSettings["StorageConnectionString"]);

        // Create the blob client.
        CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();

        // Retrieve reference to a previously created container.
        CloudBlobContainer container = blobClient.GetContainerReference("images");

        // Retrieve reference to the blob we want to create
        CloudBlockBlob blockBlob = container.GetBlockBlobReference(newGuid + ".jpg");

        // Populate our blob with contents from the uploaded file.
        using (var ms = new MemoryStream())
        {
            ImageResizer.ImageJob i = new ImageResizer.ImageJob(file.InputStream,
                ms, new ImageResizer.ResizeSettings("width=800;height=600;format=jpg;mode=max"));
            i.Build();
            blockBlob.Properties.ContentType = "image/jpeg";
            ms.Seek(0, SeekOrigin.Begin);
            blockBlob.UploadFromStream(ms);
        }
    }

    // redirect back to the index action to show the form once again
    return RedirectToAction("UploadPhoto");
}
This is 'rough and ready' code to test the theory and could certainly stand improvement, but it does work both locally and when deployed on Azure. I can also view the images I've uploaded, which are correctly resized.
Hope this helps someone.
The answer to the concrete question:
If using ImageResizer with Azure blobs do I need the AzureReader2
plugin?
is YES. As described in the ImageResizer documentation, that plugin is used to read/process/serve images out of Blob Storage. So there is no doubt: if you are going to use ImageResizer, AzureReader2 is the plugin you need. It will take care of blob uploads/serving.
Although I question the ImageResizer team's competency on Windows Azure, since they reference Azure SDK v2.0 while the most current version of the Azure SDK is 1.8. What they mean is the Azure Storage Client Library, which has versions 1.7 and 2.x; version 2.x is the recommended one and ships with Azure SDK 1.8. So do not search for Azure SDK 2.0 - install the latest one, which is 1.8. And, by the way, use the NuGet Package Manager to install Azure Storage Client Library v2.0.x.
You can also upload resized versions to Azure. First upload the original image as a blob, say with the name /original/xxx.jpg; then create a resized copy and upload it to Azure with a name like /thumbnail/xxx.jpg, as in the sketch below. If you want to create the resized versions on the fly or on a separate thread, you may need to temporarily save the original to disk.
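A minimal sketch of that approach, reusing the ImageResizer API from the answer above ("container" is an existing CloudBlobContainer and "file" an uploaded HttpPostedFileBase; the names are illustrative):
string name = Guid.NewGuid().ToString() + ".jpg";

// Upload the original image as-is under original/.
CloudBlockBlob original = container.GetBlockBlobReference("original/" + name);
original.UploadFromStream(file.InputStream);

// Build a 75x75 thumbnail and upload it under thumbnail/.
using (var ms = new MemoryStream())
{
    file.InputStream.Seek(0, SeekOrigin.Begin); // rewind after the first upload
    new ImageResizer.ImageJob(file.InputStream, ms,
        new ImageResizer.ResizeSettings("width=75;height=75;format=jpg;mode=max")).Build();
    ms.Seek(0, SeekOrigin.Begin);
    CloudBlockBlob thumbnail = container.GetBlockBlobReference("thumbnail/" + name);
    thumbnail.UploadFromStream(ms);
}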

Azure Blob Storage download

I am learning Azure. I have successfully uploaded and listed files in my containers. When I run the code below on my home PC everything works fine, with no exceptions; however, when I run it on my work PC I catch an exception that states:
Blob data corrupted. Incorrect number of bytes received '12288' / '-1'
The file seems to download to my local drive just fine; I just cannot figure out why the exact same code behaves differently on two different PCs.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse("My connection string");
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference("mycontainer");
CloudBlockBlob blockBlob = container.GetBlockBlobReference("ARCS.TXT");

using (var fileStream = System.IO.File.OpenWrite(@"c:\a\ARCS.txt"))
{
    blockBlob.DownloadToStream(fileStream);
}
Maybe your organization's firewall is blocking a specific port. I have written a blog post which discusses similar port-related issues; please check it against your setup: http://nabaruns.blogspot.in/2012/11/common-port-related-troubleshoot-in.html
Regards,
Nabarun
Your code looks correct.
That is a weird issue. It's all the more weird because the file gets downloaded properly even after the error. I would recommend running Azure Storage Explorer on both of your machines.
If Azure Storage Explorer works fine on both machines, the next step would be to check the SDK version on each; such errors are more likely with older versions of the SDK.
You may also want to try a command-line downloader to troubleshoot your issue.
Note - Azure Storage Explorer and the command-line downloader are open source. If downloading through them works fine, you can also grab their code and debug through it.
I'd recommend trying CloudBlob.DownloadToFile or CloudBlob.DownloadToStream instead of the CloudBlockBlob methods; a sketch follows.
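For example, a minimal sketch of the DownloadToFile route, assuming a 2.x-era storage client library where the method takes a FileMode (the 1.7 signature takes only the path):
CloudStorageAccount storageAccount = CloudStorageAccount.Parse("My connection string");
CloudBlobContainer container = storageAccount.CreateCloudBlobClient().GetContainerReference("mycontainer");
CloudBlockBlob blockBlob = container.GetBlockBlobReference("ARCS.TXT");

// DownloadToFile opens and disposes the output file stream itself,
// ruling out stream-handling mistakes in caller code.
blockBlob.DownloadToFile(@"c:\a\ARCS.txt", System.IO.FileMode.Create);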

Automating App Deployment in Azure with LocalResource

I'm currently attempting to automate the deployment of an application to an Azure Worker Role by pulling a file into the role from blob storage and working with it via a batch script, also located in blob storage. I'm using OnStart to accomplish this. Here's a reduced version of my OnStart method:
Getting ready to pull the files down:
public override bool OnStart()
{
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
    CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
    CloudBlobContainer container = blobClient.GetContainerReference("mycontainer");
    container.CreateIfNotExist();
    CloudBlob file = container.GetBlobReference("file.bat");
Actually getting the files into the role:
    LocalResource localResource = RoleEnvironment.GetLocalResource("localStore");
    string filePath = System.IO.Path.Combine(localResource.RootPath, "file.bat");
    using (var fileStream = System.IO.File.OpenWrite(filePath))
    {
        file.DownloadToStream(fileStream);
    }
This is how I get the batch file and its dependencies into the role. My problem now is that I originally built the batch file assuming the other files would be dropped right on C:\ - for example, C:\installer.exe, C:\archive.zip, etc. But now the files are in local storage.
I'm thinking I can either (A) somehow tell the batch file where local storage is by dynamically writing the script in OnStart, or (B) change local storage to use C:\.
I'm not sure how to do either, or what the best thing to do here would be. Thoughts?
I would not change the local storage to use C: (how would you do this anyway?). Take a look at Steve's blog post: Using a Local Storage Resource From a Startup Task. He explains how you can get a LocalResource using PowerShell (and even call that script from a batch file).
And why not use the Windows Azure Bootstrapper? It's a little tool that helps you configure your role without having to write any code: you simply call it from a startup task and it can download files (also from blob storage, as you're doing), work with local resources, ...
bootstrapper.exe -get http://download.microsoft.com/download/F/3/1/F31EF055-3C46-4E35-AB7B-3261A303A3B6/AspNetMVC3ToolsUpdateSetup.exe -lr $lr(temp) -run $lr(temp)\AspNetMVC3ToolsUpdateSetup.exe -args /q
Note: instead of using absolute references in your batch file, make it use relative paths via %~dp0 (which expands to the directory the script runs from). A sketch of launching the script from OnStart follows.
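Since file.bat is downloaded into the local resource, %~dp0-relative paths in the script will resolve there automatically. A minimal sketch of launching it from the OnStart code above (filePath and localResource are the variables from the question):
// Run the batch file via cmd.exe, with the local resource as the working directory.
var psi = new System.Diagnostics.ProcessStartInfo
{
    FileName = "cmd.exe",
    Arguments = "/c \"" + filePath + "\"",
    WorkingDirectory = localResource.RootPath,
    UseShellExecute = false
};
using (var process = System.Diagnostics.Process.Start(psi))
{
    process.WaitForExit();
}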

How to create a sub-container in an Azure storage location

How can I create a sub-container in an Azure storage location?
Windows Azure doesn't provide the concept of hierarchical containers, but it does provide a mechanism to traverse hierarchy by convention and API. All containers are stored at the same level. You can gain similar functionality by using naming conventions for your blob names.
For instance, you may create a container named "content" and create blobs with the following names in that container:
content/blue/images/logo.jpg
content/blue/images/icon-start.jpg
content/blue/images/icon-stop.jpg
content/red/images/logo.jpg
content/red/images/icon-start.jpg
content/red/images/icon-stop.jpg
Note that these blobs are a flat list against your "content" container. That said, using "/" as a conventional delimiter provides you with the functionality to traverse them in a hierarchical fashion.
protected IEnumerable<IListBlobItem> GetDirectoryList(string directoryName, string subDirectoryName)
{
    CloudStorageAccount account =
        CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
    CloudBlobClient client = account.CreateCloudBlobClient();
    CloudBlobDirectory directory = client.GetBlobDirectoryReference(directoryName);
    CloudBlobDirectory subDirectory = directory.GetSubdirectory(subDirectoryName);
    return subDirectory.ListBlobs();
}
You can then call this as follows:
GetDirectoryList("content/blue", "images")
Note the use of the GetBlobDirectoryReference and GetSubdirectory methods and the CloudBlobDirectory type instead of CloudBlobContainer. These provide the traversal functionality you are likely looking for.
This should help you get started. Let me know if this doesn't answer your question:
[ Thanks to Neil Mackenzie for inspiration ]
Are you referring to blob storage? If so, the hierarchy is simply StorageAccount/Container/BlobName. There are no nested containers.
Having said that, you can use slashes in your blob names to simulate nested containers in the URI; see this article on MSDN for naming details, and the sketch below.
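A short illustration of that convention, assuming the 2.x storage client library ("container" is an existing CloudBlobContainer and "stream" a readable stream; the names are placeholders):
// Upload to a "nested" path - no directory objects are actually created.
CloudBlockBlob blob = container.GetBlockBlobReference("content/blue/images/logo.jpg");
blob.UploadFromStream(stream);

// A non-flat listing surfaces a CloudBlobDirectory entry for each "/" level.
foreach (IListBlobItem item in container.ListBlobs("content/blue/", useFlatBlobListing: false))
{
    Console.WriteLine(item.Uri);
}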
I agree with tobint's answer, and I want to add something about this situation because I had the same need: uploading my game's HTML to Azure Storage with directories like these:
Games\Beautyshop\index.html
Games\Beautyshop\assets\apple.png
Games\Beautyshop\assets\aromas.png
Games\Beautyshop\customfont.css
Games\Beautyshop\jquery.js
Following those recommendations, I tried to upload my content with the Azure Storage Explorer tool; you can download the tool and its source code here: Azure Storage Explorer.
At first the tool didn't seem to allow hierarchical directory upload, but that's because you don't need directories at all: How to create sub directory in a blob container.
Finally, I debugged the Azure Storage Explorer source code and edited the Background_UploadBlobs method and the UploadFileList field in StorageAccountViewModel.cs. You can edit it however you want. That's only my recommendation.
If you are trying to upload files from the Azure portal:
To create a sub-folder in a container while uploading a file, go to Advanced options and select "Upload to a folder", which will create a new folder in the container and upload the file into it.
Kotlin Code
val blobClient = blobContainerClient.getBlobClient("$subDirNameTimeStamp/$fileName$extension");
This will create a directory named with the timestamp, with your blob file inside it. Note the use of the slash (/) in the code above, which nests the blob file by creating a folder named after the string before the slash.
On the portal it shows up as a folder.
Sample code
string myfolder = "<folderName>";
string myfilename = "<fileName>";
string fileName = String.Format("{0}/{1}.csv", myfolder, myfilename);
CloudBlockBlob blob = container.GetBlockBlobReference(fileName);
