Storing groups of files in Azure Blob Storage - azure

I have groups of files using the following structure:
RandomFolderName1 [File1.jpg, File2.jpg, File3.jpg...]
RandomFolderName2 [File1.jpg, File2.jpg, File3.jpg...]
I wonder what would be the best way to store this in Blob Storage.
Should I use GUID.jpg for every file name and manage the folder structure in the DB?
Should I use FolderName+FileName.jpg, but again have to manage the folder structure in the DB?
Should I use a container per folder and have File1.jpg, File2.jpg, File3.jpg... inside it?
Should I store the whole FolderName as a zip and have all the files inside?
Is there any other way to define a folder structure in Blob Storage?
Edit: The files will be accessed on a folder basis

You can use blob names in Azure like "randomfoldername1/file1.jpg". It will look like a folder structure, and some GUI clients will even let you navigate it like one. But the reality is that the container is the only real grouping factor, and from there it's just a matter of filtering the blobs in that container by partial name (prefix).
So to answer your question, you'll likely be fine putting all the files into a single container. Containers help control access policy, and each blob has its own performance target. Aside from ACL reasons, the only other reason to split the files across multiple containers is if you have so many blobs that querying them starts to degrade due to the sheer number (or you're exceeding the storage account throughput targets).
You can find out more about Azure Storage abstractions and throughput targets at: http://www.windows-azure.net/windows-azure-storage-abstractions-and-their-scalability-targets/
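To make the "virtual folder" idea concrete, here is a minimal sketch in plain Python (no Azure SDK involved) of how a flat list of blob names can be grouped by prefix and delimiter. The function name list_folder and the sample names are made up for illustration; a real listing would come from the storage service.

```python
def list_folder(blob_names, prefix="", delimiter="/"):
    """Return (subfolders, files) found directly under `prefix`.

    Mimics how a delimiter-based listing turns flat blob names
    into something that looks like folders.
    """
    subfolders, files = set(), []
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # There is another delimiter further down: report a virtual folder.
            subfolders.add(prefix + head + delimiter)
        else:
            files.append(name)
    return sorted(subfolders), sorted(files)


# Hypothetical blob names, all stored flat in one container.
blobs = [
    "randomfoldername1/file1.jpg",
    "randomfoldername1/file2.jpg",
    "randomfoldername2/file1.jpg",
]
print(list_folder(blobs))
print(list_folder(blobs, prefix="randomfoldername1/"))
```

Since the files are accessed on a folder basis, a prefix filter like this is all the "folder" lookup that is needed; nothing has to be tracked in the DB for it.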

Related

Azure WebApp storing Files

I am updating a system that had all of its files stored inside SQL Server.
It's going from an on-prem server to an Azure Web App.
My questions are:
I think I should be using a storage blob for these files. Is that correct or is there a better option inside of Azure that I should be using?
Is there a quick way to migrate files from sql to that blob?
For storage purposes, do I write the file to the blob and then store the hyperlink to that file?
The staging environment gets updated with the latest data from production when they do a release, is there a way to migrate storage blob to a different resource group for when they do this?
Yes, I would use blob.
The quickest way would be a short PowerShell or CLI script, or a console app, that pulls the files from the database and uploads them to blob storage.
I don't store the entire hyperlink to the file in the database, just the path. That way the storage account and container can be environment configurations.
I would recommend against doing this... we've found since we started doing automated continuous deployment, we haven't had a reason to move backwards, which has eliminated a lot of effort. That being said, AzCopy is a utility that allows you to do server-side copy of blobs between storage accounts (along with many other types of source and destination if needed). That should do what you need.
To answer your questions:
"I think I should be using a storage blob for these files. Is that correct or is there a better option inside of Azure that I should be using?"
That's correct. Blob storage is meant for exactly this purpose.
"Is there a quick way to migrate files from sql to that blob?"
I'm not aware of any automated way to do that. You would need to read the binary data from the SQL database, create a stream out of it, and upload that stream. You can use the Azure Storage SDK for the upload.
"For storage purposes, do I write the file to the blob and then store the hyperlink to that file?"
Under normal circumstances, that is the recommended approach. However, considering you need a staging environment that will be a copy of the production environment (including the database, I am assuming), I would recommend you store two things in your database: the blob container name and the blob name (or you could store a relative URL, e.g. <container-name>/<blob-name>). Assuming you keep the storage account name somewhere in the configuration file, you can create the URL dynamically using the https://<account-name>.blob.core.windows.net/<container-name>/<blob-name> pattern.
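As a rough illustration of building the URL at runtime, the sketch below combines a per-environment account name with the container and blob name kept in the database. The account names, the row contents and the blob_url helper are all hypothetical; only the URL pattern comes from the answer above.

```python
from urllib.parse import quote


def blob_url(account_name, container_name, blob_name):
    """Build a blob URL from its three parts."""
    # Percent-encode the blob name but keep "/" so virtual folders survive.
    return "https://{}.blob.core.windows.net/{}/{}".format(
        account_name, container_name, quote(blob_name, safe="/")
    )


# Hypothetical per-environment configuration (would live in app settings).
ACCOUNT_BY_ENVIRONMENT = {"production": "prodaccount", "staging": "stagingaccount"}

# Hypothetical database row: only container and blob name are stored.
row = {"container": "documents", "blob": "invoices/2016/report 1.pdf"}

for environment, account in ACCOUNT_BY_ENVIRONMENT.items():
    print(environment, blob_url(account, row["container"], row["blob"]))
```

Because the row never mentions the account, the same database copy works in both environments once the blobs have been copied across.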
"The staging environment gets updated with the latest data from production when they do a release, is there a way to migrate storage blob to a different resource group for when they do this?"
Azure Storage provides Copy Blob functionality, which lets you copy blobs from one blob container to another in the same or a different storage account. You can use that to copy data from the production environment to the staging environment.

Azure storage for files in specific folder structure

Currently I have an FTP server with a deep structure of folders and files on it. It can be as much as 10 levels down from the root folder. As I have already successfully migrated my local database to an Azure database, I wonder whether there is any Azure FTP I could use to migrate this as well. I know there is Azure Storage, where I could create a container of type File or Blob. Could one of those be used like my particular FTP? Could I somehow create a folder structure there using a container of either type, and how does that work? Is either Blob or File suited for this purpose?
Let me add to what NDJ has written. So both Azure Blobs and Files would serve your purpose.
As mentioned by NDJ, Azure Blob Storage is a 2-level hierarchy system. At the top you have a blob container, and each blob container contains 0 or more blobs. So it does not support a folder structure per se, but as NDJ mentioned, you can create the illusion of a sub folder by using an appropriate blob delimiter (usually /). If you were to compare it with a local file system, a directory at the root level (C:) is a container in blob storage, and the files go in there. So imagine you have a folder called images in C:\ of your computer; that would be a container in blob storage. Now imagine that you have 2 sub folders beneath this folder (let's call them hires and lores) and both of them contain some files (say image1.png). When you move them to Azure Blob Storage, the container name would be images but the blob names would be hires/image1.png and lores/image1.png. Some storage explorers take this delimiter (/) and show you that your container contains 2 folders with an image called image1.png inside each, but in reality there are only 2 blobs in that blob container.
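The mapping described above can be sketched in a few lines of Python. This only computes the blob names a local folder tree would turn into; the upload itself is left out, and the helper name blob_names_for is made up for the example. The demo builds a throwaway images folder so the sketch runs on its own.

```python
import tempfile
from pathlib import Path


def blob_names_for(local_root):
    """Map every file under `local_root` to a blob name with "/" delimiters."""
    root = Path(local_root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )


# Recreate the images/hires and images/lores example in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "images"
    for sub in ("hires", "lores"):
        (images / sub).mkdir(parents=True)
        (images / sub / "image1.png").write_bytes(b"")
    names = blob_names_for(images)

# The top-level folder name plays the role of the container name.
print(images.name, names)
```

The same function works for a tree that is 10 levels deep, since each extra level simply adds another "/" segment to the blob name.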
Azure File Service is a close match to your local file system. At the top level, you've got a Share, and each share will contain directories and files. Each directory can again contain many directories and files.
As NDJ mentioned, there's no FTP access to Azure Storage but there are many tools that will allow you to upload files from local computer to Azure Storage and many of them will preserve the file hierarchy. You can always write code to upload the files yourself. If you decide to use Azure Files, you can simply mount a File Storage Share as a network drive on your local computer and then transfer the files from your local computer to Azure Files as if you're transferring files from one drive to another.
UPDATE
Regarding difference between Azure Blob Storage and File Storage, both are used to store files. There are a few differences that I could think of:
A Share in Azure File Storage can be mounted as a network drive on your local computer/Azure VM whereas a Blob Container in Azure Blob Storage can't. So if you have an application which writes files to the local file system, you can take the application as is, make use of Azure File Storage, and write the files to that network drive without making many changes to your code (a typical example of a Lift-And-Shift kind of application).
You can set an ACL on a Blob Container whereas you can't do the same on a Share. This makes Azure Blob Storage ideal for storing static content (images, css, js) for your websites. For exposing files in File Storage, you would need to resort to a Shared Access Signature.
You can set the size of a Share (default is 5GB) whereas no such thing exists for a Blob Container. A blob container can go up to the size of a storage account.
To understand Azure Files, I would recommend reading this: https://azure.microsoft.com/en-in/documentation/articles/storage-dotnet-how-to-use-files/.
Azure Blob supports 10 levels down (up to 254). Basically the files are stored non-hierarchically, but each / separator gives the appearance of directories.
It's relatively trivial to write something to move files to Azure. As far as I know there is no FTP functionality yet, but it has been requested. It looks like some people have already created some code for this.
You can now use Storage Explorer across all platforms to easily work within any folder structure.

Are generated files persisted on Azure?

I have an Azure Web App, which will generate pdf files at runtime and write them to disk. Can I trust that these files will be persisted?
I am concerned that if my image is spun down and brought back up again then the files might have disappeared.
Or perhaps Azure decides to move the website to a different machine or different datacentre, where these files would not exist.
I know there are cloud based options such as blob storage, but I would prefer the simplicity of writing to disk and having access over FTP.
Anything that you write under the d:\home folder is guaranteed to be persisted. See the File System section in this for more details on this topic.

Azure WCF accessing disk files

I have a WCF service hosted on Windows Azure as a "cloud service." When the service starts, it needs to load data from files on disk into memory so it can be accessed fast (cached, in other words). Right now I'm using a folder like C:\Documents\Filestoprocess, so that the WCF service reads that folder and loads the data in it into memory. I have about 5,000 small files. How do I do this in Azure? Is there a folder path that I can use within the WCF service so that it can open each of these files and load its data? I'm not really looking for complicated Blob access over the network using bandwidth. I'm looking for simple disk I/O access to these files from the WCF "cloud service" that is running on its own public web address.
You should try to use a cloud storage service to store data, because anything you write to the local file system can be destroyed when the service restarts or is recycled.
You can look into using the Azure Drive service, which is like creating a disk drive. It is built on top of blob storage.
But if you really want to write and read data on the local file system check out this blog post http://blog.codingoutloud.com/2011/06/12/azure-faq-can-i-write-to-the-file-system-on-windows-azure/
It talks about setting up your service definition to allow writing to the local file system.
Depending on the size of your instances you'll get a non-persistent disk where you can store this kind of temporary data. The minimum is 20GB for an extra small instance. You shouldn't access the disk directly; instead, use a local resource, which you can configure in your service definition file or in Visual Studio (double click your Web / Worker Role).
This storage is non-persistent, which means that if you delete your deployment, decrease the number of instances, run into hardware problems, ... you lose all data saved here. If you want to persist your files you should use blob storage instead. But in your case, where you need the files as some kind of caching mechanism, local resources are perfect.
And if your goal is to cache data you might want to take a look at the caching features included in Windows Azure: Caching in Windows Azure
Blob access is not complex. In fact, you could do a single download of a zip file from blob storage to local disk, unzip it, then prime your wcf service from those 5,000 small files.
Check out this MSDN page documenting DownloadToFile(). The essential parts:
// Requires: using Microsoft.WindowsAzure; using Microsoft.WindowsAzure.StorageClient;
// blobEndpoint, accountName and accountKey come from your own configuration.
CloudBlobClient blobClient =
    new CloudBlobClient(blobEndpoint, new StorageCredentialsAccountAndKey(accountName, accountKey));
// Return a reference to the blob.
CloudBlob blob = blobClient.GetBlobReference("mycontainer/myblob.txt");
// Download the blob to a local file.
blob.DownloadToFile("c:\\mylocalblob.txt");
Now: I don't agree with saving to the root folder on C:. Rather, you should grab some local storage (easily configurable). Once you configure local storage in your role configuration, just ask the role environment for it, and ask for root path:
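// RoleEnvironment lives in Microsoft.WindowsAzure.ServiceRuntime.
// "mylocalstorage" must match the local resource name in your role configuration.
var localResource = RoleEnvironment.GetLocalResource("mylocalstorage");
var rootPath = localResource.RootPath;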
Note: As @KingPancake mentioned, you could use an Azure drive. However, remember that an Azure drive can only be writeable by one instance. You'd need to make additional snapshots for your other instances. I think it's much simpler for you to go with a simple blob, copy your files down (either as a single zip or as individual files), and go from there.
You mentioned concern with network+bandwidth. You don't pay for bandwidth within the same data center. Also, it's extremely fast: 100Mbps per core. So even with a Small instance, you'll have your files copied down very quickly, more so when you go to larger instance sizes.
One last thought: The only other ways to gain access to your 5,000 files, without using blob storage or Azure Drives (which are mounted as VHDs in blob storage), would be to either download the files from an external source or bundle them with your Windows Azure package (and then they'd show up in your app's folder, under whatever subfolder you stuck them in). Bundling has two downsides:
Longer time to upload your deployment package due to added size
Inability to change any of the individual files without redeploying the package.
By storing in a blob, you can easily change one (or all) of your small files without redeploying your code - you'd just need to signal it to either re-read from blob storage or restart the instances so they automatically download the new files.
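The "download one zip, unzip it, prime the service" idea can be sketched as follows, here in Python rather than the C# of the snippets above. The download step is deliberately left out: the demo writes a small zip to a temporary folder to stand in for the blob that would have been downloaded, and the names prime_cache and localstorage are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def prime_cache(zip_path, local_dir):
    """Extract the archive into local storage and load every file into memory."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(local_dir)
    root = Path(local_dir)
    return {
        path.relative_to(root).as_posix(): path.read_bytes()
        for path in root.rglob("*")
        if path.is_file()
    }


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the zip that would have been downloaded from blob storage.
    zip_path = Path(tmp) / "files.zip"
    with zipfile.ZipFile(zip_path, "w") as archive:
        for i in range(3):
            archive.writestr("file{}.txt".format(i), "data {}".format(i))

    # Stand-in for the role's local storage root path.
    cache = prime_cache(zip_path, Path(tmp) / "localstorage")

print(sorted(cache))
```

Re-running prime_cache after a new zip has been downloaded is one way to pick up changed files without redeploying.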

Where to store things like user pictures using Azure? Blob Storage?

I have just migrated a project of mine to Microsoft Azure for testing.
But for functionality such as an avatar upload I need write access to files on the hard drive. This is a cloud, though, so that is not possible.
How can I build such functionality instead? Should I use Blob Storage or is there a better solution?
Does it make sense to store all website images (e.g. layout images) in Blob Storage? So I would have a cookie-free domain for my static content?
Blob storage is definitely the place to put dynamic images like avatars. While you can write to the disk on the VM you'll be running in, you can't rely on this to be present - if your app gets moved off to another machine (which could happen for any number of reasons) this storage will be erased.
One thing you could do is store your images in blob storage, and cache them on the local VM disk (using the standard file IO mechanisms). This way you'll get pretty good performance and will save on a few storage transactions while still making sure you're not storing in volatile storage.
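A read-through cache of that kind might look like the sketch below. It is plain Python with the blob download passed in as a function, so nothing here is a real storage API; get_image, fake_fetch and the avatar name are placeholders for illustration.

```python
import tempfile
from pathlib import Path


def get_image(name, cache_dir, fetch_from_blob):
    """Return image bytes, downloading from blob storage only on a cache miss."""
    cached = Path(cache_dir) / name
    if not cached.exists():
        # Cache miss: fetch once and keep a copy on the (volatile) local disk.
        cached.parent.mkdir(parents=True, exist_ok=True)
        cached.write_bytes(fetch_from_blob(name))
    return cached.read_bytes()


calls = []


def fake_fetch(name):
    """Stand-in for a real blob download; records each storage transaction."""
    calls.append(name)
    return b"image-bytes-for-" + name.encode()


with tempfile.TemporaryDirectory() as cache_dir:
    first = get_image("avatars/user1.png", cache_dir, fake_fetch)
    second = get_image("avatars/user1.png", cache_dir, fake_fetch)

print(len(calls))
```

If the local disk is wiped, the next request simply misses the cache and goes back to blob storage, so nothing is lost.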
If you've got images which will be completely static, they can just be bundled with your application and referenced like a normal file. But if you ever need to change them, you'd need to redeploy the application, so only use this technique for images which won't need to change.
Be aware there are two types of Blobs in Windows Azure: Block Blobs and Page Blobs. Block Blobs are appropriate for media file serving, whereas Page Blobs are optimized for other work patterns.
Also consider use of the Azure Content Distribution Network (CDN) for lowering latency to clients.
Azure also has streaming capabilities which work in concert with Silverlight Smooth Streaming (http://blog.smarx.com/posts/smooth-streaming-with-windows-azure-blobs-and-cdn if interested).
"Does it make sense to store all website images (f.e. layout images) in the Blob Storage? So I would have a Cookie-free Domain for my static content?"
Yes I think so - this is what I'm rolling out right now actually.
