Azure storage for files in specific folder structure - azure

Currently I have an FTP server holding a deep structure of folders and files, possibly as many as 10 levels below the root folder. Having already migrated my local database to an Azure database successfully, I wonder whether there is an Azure FTP service I could use to migrate this as well. I know there is Azure Storage, where I could create a container of type File or Blob. Could either of those be used like a regular FTP server? Could I somehow create a folder structure there using either File or Blob storage, and how does it work? Is either of them suited to this purpose?

Let me add to what NDJ has written. So both Azure Blobs and Files would serve your purpose.
As mentioned by NDJ, Azure Blob Storage is a 2-level hierarchy system. At the top you have a blob container, and each blob container contains 0 or more blobs (files). So it does not support a folder structure per se but, as NDJ mentioned, you can create the illusion of a sub folder by using an appropriate delimiter (usually /) in blob names. If you were to compare it with a local file system, a directory at the root level (C:) is a container in blob storage, and the files go in there. So imagine you have a folder called images in C:\ on your computer; that would be a container in blob storage. Now imagine that you have 2 sub folders beneath this folder (let's call them hires and lores) and both of them contain some files (say image1.png). When you move them to Azure Blob Storage, the container name would be images but the blob names would be hires/image1.png and lores/image1.png. Some storage explorers take this delimiter (/) and show you that your container contains 2 folders, each holding an image called image1.png, but in reality there are only 2 blobs in that blob container.
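To make the "illusion of folders" concrete, here is a minimal sketch in plain Python (no Azure SDK involved) of how a storage explorer could derive one level of virtual folders from a flat list of blob names. The function name `list_virtual_level` is made up for this illustration and is not part of any Azure library.

```python
def list_virtual_level(blob_names, prefix="", delimiter="/"):
    """Return (folders, blobs) visible directly under `prefix`.

    Folders are not stored anywhere; they are derived from the blob names.
    """
    folders, blobs = set(), []
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter remains, so the blob sits inside a virtual folder.
            folders.add(prefix + head + delimiter)
        else:
            blobs.append(name)
    return sorted(folders), sorted(blobs)


# The two blobs from the example above, stored flat in the "images" container.
names = ["hires/image1.png", "lores/image1.png"]
print(list_virtual_level(names))            # (['hires/', 'lores/'], [])
print(list_virtual_level(names, "hires/"))  # ([], ['hires/image1.png'])
```

The container still holds only two blobs; the "folders" exist only in the listing logic.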
Azure File Service is a close match to your local file system. At the top level you've got a share, and each share contains directories and files. Each directory can in turn contain more directories and files.
As NDJ mentioned, there's no FTP access to Azure Storage, but there are many tools that will allow you to upload files from your local computer to Azure Storage, and many of them will preserve the file hierarchy. You can always write code to upload the files yourself. If you decide to use Azure Files, you can simply mount a File Storage share as a network drive on your local computer and then transfer the files to Azure Files as if you were copying them from one drive to another.
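If you do write the upload code yourself, the part that preserves the hierarchy is just a directory walk that turns each relative path into a blob name. The following is a rough sketch under that assumption; `plan_uploads` and `upload_tree` are hypothetical helper names, and `upload_file` stands in for whatever upload call your SDK or tool provides.

```python
import os


def plan_uploads(local_root):
    """Yield (local_path, blob_name) for every file under local_root."""
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for filename in filenames:
            local_path = os.path.join(dirpath, filename)
            relative = os.path.relpath(local_path, local_root)
            # Blob names use "/" regardless of the local path separator.
            yield local_path, relative.replace(os.sep, "/")


def upload_tree(local_root, upload_file):
    """Call upload_file(local_path, blob_name) for each file; return the count.

    upload_file is a caller-supplied (hypothetical) callable that performs
    the actual transfer.
    """
    count = 0
    for local_path, blob_name in plan_uploads(local_root):
        upload_file(local_path, blob_name)
        count += 1
    return count
```

Passing the upload step in as a callable keeps the walk independent of any particular storage client, so the same plan could feed Blob Storage or a mounted file share.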
UPDATE
Regarding difference between Azure Blob Storage and File Storage, both are used to store files. There are a few differences that I could think of:
A Share in Azure File Storage can be mounted as a network drive on your local computer/Azure VM, whereas a Blob Container in Azure Blob Storage can't. So if you have an application which writes files to the local file system, you can take the application as is, point it at Azure File Storage, and write the files to that network drive without making many changes to your code (a typical example of a Lift-And-Shift kind of application).
You can set an ACL on a Blob Container, whereas you can't do the same on a Share. This makes Azure Blob Storage ideal for storing static content (images, css, js) for your websites. For exposing files in File Storage, you would need to resort to a Shared Access Signature.
You can set the size of a Share (default is 5GB), whereas no such thing exists for a Blob Container. A blob container can go up to the size of a storage account.
To understand Azure Files, I would recommend reading this: https://azure.microsoft.com/en-in/documentation/articles/storage-dotnet-how-to-use-files/.

Azure Blob Storage supports 10 levels down (up to 254 levels). Basically the files are stored non-hierarchically, but each / separator gives the appearance of directories.
It's relatively trivial to write something to move files to Azure. As far as I know there is no FTP functionality yet, but it has been requested. It looks like some people have already created some code for this.

You can now use Storage Explorer across all platforms to easily work within any folder structure.

Related

Doubts with Designing an Azure web app file manager

I am designing a web application that needs to run as an Azure web app. The app is focused on managing files, so it needs to upload files and store them.
As it is a cloud app, I suppose that I am not able to create a directory in the web app service. My question is whether I have to create a Storage Account to get the benefits of Azure and, if that is the solution, which storage option would be best: Blob or File?
Thank you in advance.
Best wishes
A container belongs to Blob Storage, which is a great option for programmable storage, where our program can read from and write to the storage account.
If we don't want to allow websites and the public to access the files, we can choose from the options below.
Blob Storage containers can hold any binary files (binary large objects). There is no ordering or hierarchy, but we can have a virtual folder structure.
Access to files in containers is usually shared programmatically, using a Shared Access Signature and an access policy.
Regarding "I suppose that I am not able to create a directory in the web app service":
Azure Files is more useful for mounting a file share to a server, and multiple servers can mount the same file share. It can have a quota.
A file share has a directory structure: unlike with containers, we can create directories and subdirectories in a hierarchical manner.
The Connect option in the file share gives you details on how to mount the drive on a Windows/Linux machine.
Use file storage if you need the shared drive protocol; if not, we can design the application to use blob storage.
As per your requirement, if you want to create directories you can choose an Azure file share.
Reference: the Azure Blob and file share storage link mentioned by @deherman-MSFT.

Azure: Is there a way to cache/reuse files downloaded from Azure blob storage?

I have a file upload/download service that uploads files to Blob storage. I have another service (a job service) that needs to download files from file service (using the blob storage URLs) and process those files. The files are read-only (they are not going to change during their lifetime). In many cases, the same file can be used in different jobs. I am trying to figure out if there is a way to download a file once and all the instances of my job service use that downloaded file. So can I store the downloaded file in some shared location and access it from all the instances of my service? Does it even make sense to do it this way? Would the cost of fetching the file from blob be the same as reading it from a shared location (if that is even possible)?
Azure also provides file storage. Azure File Storage lets you mount the storage as a drive and access its contents.
But for this you need to download the file once and then upload it to File Storage.
Then you can mount the share on any virtual machine instance or as a local drive.
That is an alternative way to achieve your goal.
Check this
http://www.ntweekly.com/?p=10034
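The "download once, reuse everywhere" idea from the question can be sketched as a small cache keyed by the blob URL, with the cache directory living on the mounted share. This is only an illustration: `cached_path` is a made-up name, `download` is a hypothetical callable you would supply, and the rename step is atomic on a local file system, while on a network share that guarantee depends on the share.

```python
import hashlib
import os
import tempfile


def cached_path(url, cache_dir, download):
    """Return a local path for `url`, downloading only on a cache miss.

    `download(url, destination_path)` is a hypothetical callable that
    writes the blob's bytes to destination_path.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # The URL is hashed so it can serve as a safe, fixed-length file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    target = os.path.join(cache_dir, key)
    if os.path.exists(target):
        return target
    # Download to a temporary file first, then rename it into place, so
    # other instances do not pick up a partially written file.
    fd, tmp = tempfile.mkstemp(dir=cache_dir)
    os.close(fd)
    try:
        download(url, tmp)
        os.replace(tmp, target)
    finally:
        if os.path.exists(tmp):
            os.remove(tmp)
    return target
```

Because the files are read-only for their lifetime, no invalidation logic is needed; two instances racing on the same miss would at worst download the file twice.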

Access Azure Storage Blob as file system

We have a Worker role on Windows Azure that runs ffmpeg to convert media files using MediaHandler Pro. The files that we want to process are saved in blob storage, and the resulting files should also be stored there.
Our problem is that ffmpeg works on local files and not on URIs from the blob storage. Is there any way to mount a blob storage container and access the files there directly as a file system?
If this is not possible, is it OK to download the files (they can be quite large, perhaps 1-2 GB) to the local file system*, process them there and then upload them? This sounds redundant.
*) We have set up a CloudDrive that downloads this blob to a virtual disc
You have a couple ways of doing this - you can either create a cloud drive (VHD uploaded as page blob) and mount it or you can download the source files locally and work on scratch (local temp) disk. Of the two choices, I would download locally and use scratch disk.
If you were to use a cloud drive there would be 3 primary problems - the first is that it is a VHD and you have to mount it to get the files. The second is that only 1 instance can mount for RW, so you cannot split the work of encoding source files with multiple workers saving to same drive. The 3rd problem is that it is the slowest of all the storage options. For encoding, probably not a great choice.
Your best bet is to download the source files from blob storage (that is very fast, btw) into a 'Local Resource' (aka scratch disk) and work from there. Upload the resulting file into blob storage.
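The download, process, upload flow described above could look roughly like this. It is a sketch, not the worker role's actual code: `process_via_scratch` and the `converted/` prefix are invented for the example, and `download`, `process` and `upload` are hypothetical callables (the `process` step is where ffmpeg would run against local paths).

```python
import os
import posixpath
import tempfile


def process_via_scratch(blob_name, download, process, upload, scratch_root=None):
    """Download a blob to scratch disk, process it locally, upload the result.

    download(blob_name, path), process(src_path, dst_path) and
    upload(path, blob_name) are hypothetical callables supplied by the caller.
    scratch_root would be the root path of the local resource, if any.
    """
    # The temporary directory is deleted on exit, freeing the scratch space.
    with tempfile.TemporaryDirectory(dir=scratch_root) as workdir:
        base = posixpath.basename(blob_name)  # keep the file extension
        source = os.path.join(workdir, base)
        result = os.path.join(workdir, "out-" + base)
        download(blob_name, source)
        process(source, result)  # e.g. invoke ffmpeg on the local files
        # Assumed naming scheme for results; adjust to your own layout.
        result_name = "converted/" + blob_name
        upload(result, result_name)
        return result_name
```

Since every worker uses its own scratch directory, several instances can encode different source files in parallel, which the single-writer cloud drive does not allow.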
If your systems support SMB 3.0, you can map an Azure file share as a drive using the Azure Files features now available. Note that this mounts a file share rather than a blob container, so the files would have to be stored in Azure Files.

Azure WCF accessing disk files

I have a WCF service hosted on Windows Azure as a "cloud service." When the service starts, it needs to load data from files on disk into memory so that it can be accessed fast (cached, in other words). Right now I'm using a folder like C:\Documents\Filestoprocess, and the WCF service reads the data in that folder into memory. I have about 5,000 small files. How do I do this in Azure? Is there a folder path I can use within the WCF service so that it opens each of these files and loads its data? I'm not really looking for complicated blob access over the network that uses bandwidth. I'm looking for simple disk I/O access to these files from the WCF "cloud service" that is running at its own public web address.
You should try to use a cloud storage service to store data, because anything you write to the local file system can be destroyed when the service restarts or is recycled.
You can look into using the Azure drive service, which is like creating a disk drive. It sits on top of blob storage.
But if you really want to write and read data on the local file system check out this blog post http://blog.codingoutloud.com/2011/06/12/azure-faq-can-i-write-to-the-file-system-on-windows-azure/
It talks about setting up your service definition to allow writing to the local file system.
Depending on the size of your instances you'll get a non-persistent disk where you can store this kind of temporary data. The minimum is 20GB for an extra small instance. You shouldn't access the disk directly; instead, use a local resource, which you can configure in your service definition file or in Visual Studio (double-click your Web / Worker Role).
This storage is non-persistent: if you delete your deployment, decrease the number of instances, or run into hardware problems, you lose all data saved here. If you want to persist your files you should use blob storage instead. But in your case, where you need the files as some kind of caching mechanism, local resources are perfect.
And if your goal is to cache data you might want to take a look at the caching features included in Windows Azure: Caching in Windows Azure
Blob access is not complex. In fact, you could do a single download of a zip file from blob storage to local disk, unzip it, then prime your WCF service from those 5,000 small files.
Check out this MSDN page documenting DownloadBlobToFile(). The essential parts:
    using Microsoft.WindowsAzure;               // StorageCredentialsAccountAndKey
    using Microsoft.WindowsAzure.StorageClient; // CloudBlobClient, CloudBlob

    // blobEndpoint, accountName and accountKey are placeholders for your storage account settings.
    CloudBlobClient blobClient =
        new CloudBlobClient(blobEndpoint, new StorageCredentialsAccountAndKey(accountName, accountKey));
    // Return a reference to the blob.
    CloudBlob blob = blobClient.GetBlobReference("mycontainer/myblob.txt");
    // Download the blob to a local file.
    blob.DownloadToFile("c:\\mylocalblob.txt");
Now: I don't agree with saving to the root folder on C:. Rather, you should grab some local storage (easily configurable). Once you configure local storage in your role configuration, just ask the role environment for it, and ask for root path:
    // "mylocalstorage" must match the local storage name configured for the role.
    var localResource = RoleEnvironment.GetLocalResource("mylocalstorage");
    var rootPath = localResource.RootPath;
Note: As @KingPancake mentioned, you could use an Azure drive. However, remember that an Azure drive can only be writeable by one instance. You'd need to make additional snapshots for your other instances. I think it's much simpler for you to go with a simple blob, copy your files down (either as a single zip or as individual files), and go from there.
You mentioned concern with network and bandwidth. You don't pay for bandwidth within the same data center. It's also extremely fast: 100 Mbps per core. So even with a Small instance, you'll have your files copied down very quickly, more so when you go to larger instance sizes.
One last thought: the only other ways to gain access to your 5,000 files, without using blob storage or Azure Drives (which are mounted as VHDs in blob storage), would be to either download the files from an external source or bundle them with your Windows Azure package (and then they'd show up in your app's folder, under whatever subfolder you stuck them in). Bundling has two downsides:
Longer time to upload your deployment package due to added size
Inability to change any of the individual files without redeploying the package.
By storing in a blob, you can easily change one (or all) of your small files without redeploying your code - you'd just need to signal it to either re-read from blob storage or restart the instances so they automatically download the new files.
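The "single zip, unzip, then prime the cache" step suggested earlier in this answer needs nothing beyond a standard library. Here is a minimal sketch in Python, assuming the zip has already been downloaded to local storage; `prime_cache_from_zip` is a made-up name, and the archive is assumed to be one you created yourself (trusted, with ordinary relative entry names).

```python
import os
import zipfile


def prime_cache_from_zip(zip_path, extract_dir):
    """Unzip the archive and return {relative_name: file_bytes}."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(extract_dir)
        # Skip directory entries; only files carry data.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    cache = {}
    for name in names:
        # Zip entry names always use "/" as the separator.
        with open(os.path.join(extract_dir, *name.split("/")), "rb") as fh:
            cache[name] = fh.read()
    return cache
```

Here `extract_dir` would be a folder under the local resource's root path, so a recycled instance simply downloads and primes again on startup.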

Storing groups of files in Azure Blob Storage

I have groups of files using the following structure:
RandomFolderName1 [File1.jpg, File2.jpg, File3.jpg...]
RandomFolderName2 [File1.jpg, File2.jpg, File3.jpg...]
I wonder what would be the best way to store this in Blob Storage.
Should I use GUID.jpg for every file name and manage the folder structure in the DB?
Should I use FolderName+FileName.jpg, but again have to manage the folder structure in the DB?
Should I use a container for each folder and inside have File1.jpg, File2.jpg, File3.jpg...?
Should I store the whole FolderName as a zip and have all the files inside?
Is there any other way to define a folder structure in Blob Storage?
Edit: The files will be accessed on a folder basis
So you can use blob names in Azure like "randomfoldername1/file1.jpg". It will look like a folder structure, and some GUI clients will even let you navigate it like one. But the reality is that the container is the only real grouping factor, and from there it's just a matter of filtering the blobs in that container based on partial names.
So to answer your question, you'll likely be fine putting all the files into a single container. Containers help control access policy, and each blob has its own performance target. Aside from ACL reasons, the only other reason to split the blobs up instead of keeping them in one container is if you have enough blobs that querying them starts to degrade due to the sheer number (or you're exceeding the storage account throughput targets).
You can find out more about Azure Storage abstractions and throughput targets at: http://www.windows-azure.net/windows-azure-storage-abstractions-and-their-scalability-targets/
