Azure Filestore & Backup

I have a physical server that holds documents: customer orders in XML and the resulting order PDFs. The locations are mapped as network drives from the application server that generates the files and from the desktops that need to access them.
These files need to be kept for a number of years for regulatory purposes, and the current file server needs expanding. The data has to be held for approximately 10 years, after which it can be removed, so I expect it to grow to about 5-8 TB over time.
I could create a VM in Azure with the appropriate storage and then, I presume, use MARS to create a backup strategy as if this were an on-site server. But to meet the disk sizing I would need a large VM, even though the server needs very little processing power; it is just storage.
I would still need to be able to map the server and desktops to the drive where the files are stored.
So I was wondering if anyone could suggest an approach. The data from the desktop would need to be available for the application to access for up to 18 months. Older data could then be archived, but it still needs to be backed up, as retrieval of archive data would be via a manual search.
You can map an Azure file share as a network drive on your desktops, and then back up the files stored in it for long-term retention.
Azure Backup for Files is in Limited Preview today. It enables you to take scheduled copies of your files and can be managed from the Recovery Services Vault. If you are interested in signing up for this preview, you can drop in a mail with your subscription ID(s) to
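The retention rules described in the question (roughly 18 months of live access, then archive, then removal after about 10 years) can be written down as a simple age-based classification. The following is only a sketch: the tier names and the `classify` helper are hypothetical, and months and years are approximated as 30 and 365 days.

```python
from datetime import date, timedelta

# Thresholds taken from the question; the day counts are approximations.
LIVE_PERIOD = timedelta(days=18 * 30)        # ~18 months on the mapped drive
RETENTION_PERIOD = timedelta(days=10 * 365)  # ~10 years of regulatory retention


def classify(created: date, today: date) -> str:
    """Return a hypothetical storage tier for a document created on `created`."""
    age = today - created
    if age <= LIVE_PERIOD:
        return "live"     # must stay reachable by the application
    if age <= RETENTION_PERIOD:
        return "archive"  # can move to cheaper storage, still backed up
    return "remove"       # past the retention period
```

A scheduled job could walk the share, call `classify` with each file's creation date, and move or delete files accordingly.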


What are the advantages of using Azure File Sync over azcopy sync to migrate local data to Azure Storage?

I have to choose the best tool to migrate data from on-premises to Azure.
The ideal solution would sync the on-prem filesystem to an Azure storage account and support “differential sync” (delta sync) to handle incremental updates to large files.
Here are the features and benefits of using Azure File Sync:
Multiple file servers at multiple locations can all sync to a single Azure file storage. Commonly used files are cached on the local server. If the local server goes down, you can quickly install another server or VM and sync the Azure files to it.
Older, rarely accessed files are moved to Azure, freeing up space on your local file server.
A Sync Group helps to manage the locations that should be kept in sync with each other. Every Sync Group has one common cloud storage, so a Sync Group has one Azure endpoint and multiple server endpoints. Sync is two-way: changes made in the cloud are replicated to the local server within 12 to 24 hours, while changes on a local server are replicated to all endpoints within 5 minutes.
An agent is installed on the server endpoint. There is no need to change or relocate data to a different volume, so the agent is non-disruptive.
Each server endpoint syncs with the Azure file share in the storage account. The end-user experience is unchanged.
While a particular local file is being synced, it is locked, but only for a few seconds.
It is a disaster recovery solution for file servers. If the local file server is destroyed, set up a VM or physical server, join it to the previous Sync Group, and you get “rapid restore”.
When a file is renamed or moved, its metadata is preserved.
It's different from OneDrive. OneDrive is for personal document management and is not a general-purpose file server. It is primarily meant for collaborating on Office files and is not optimized for very large files, CAD drawings, or multimedia development projects.
Azure File Sync works with on-premises AD and not Azure AD.
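To make the "differential sync" idea from the question concrete, here is a minimal file-level sketch in Python. It does not show how Azure File Sync or azcopy work internally; it only illustrates comparing two manifests so that unchanged files are skipped. The function names are hypothetical, and the comparison is per file, not per block within a large file.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to a SHA-256 digest of its content."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def plan_sync(source: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """List what a one-way sync from source to target would have to do."""
    return {
        "upload": sorted(p for p in source if p not in target),
        "update": sorted(p for p in source if p in target and source[p] != target[p]),
        "delete": sorted(p for p in target if p not in source),
    }
```

Only the paths listed under `upload` and `update` would need to be transferred; everything else is already identical on both sides.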

SharePoint - external storage for documents

I am now extending some features of our corporate SharePoint, and one of my customer's wishes is the following:
The SharePoint farm has little space allocated, about 2 GB per department. The department I am working with has hundreds of midsized documents (4-10 MB on average; pptx, doc, pdf...) per month to be stored permanently. Currently they do this on a NAS file share. However, they want to store them in SharePoint for accessibility while avoiding the 2 GB limit. Is there a way to integrate external file storage with SharePoint?
Alternatively, is it possible to store the file data in a database (e.g. Access or MS SQL) while avoiding farm-wide installations of frameworks like RBS?
You can use Remote BLOB Storage (RBS) as part of your data storage solution.
Check the article: Manage RBS in SharePoint Server
Before you migrate your BLOBs out of the database, you’ll need to choose an option. Here are several typical options:
1. File System: You can use a normal file system (perhaps a large disk partition on a file server) to store your BLOBs.
2. SAN/NAS Storage: Storage area network (SAN) and network attached storage (NAS) are usually high-end storage options. They’re expensive, but well-suited if the business value of your documents and their size can justify the cost. Both SAN and NAS provide data replication and mirroring and seamless growth into terabytes of data.
3. Cloud Storage: This is useful in at least two situations. First, when you’re running SharePoint in the cloud but still want to externalize the BLOBs, your natural choice is to store them in a nearby, vendor-provided cloud storage. The second is when you’re running SharePoint within your own datacenter but want to store or archive all BLOBs in the cloud due to space limitations or reliability issues in your datacenter. Archiving is the most common reason in this situation. Make sure that when a user creates or modifies a document destined for cloud storage, your EBS or RBS provider does the transfer in the background, as otherwise it could degrade performance. You also want to make sure your third-party EBS or RBS provider supports storing BLOBs in the cloud storage.
More information is here: Optimize SharePoint with External Storage
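The idea behind externalizing BLOBs, whichever of the options above is chosen, is that the database keeps only metadata and a pointer while the bytes live elsewhere. The sketch below illustrates that split with SQLite and a local directory. It is not the SharePoint or SQL Server RBS API; the table layout and all function names are hypothetical.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog that stores metadata only, never file bytes."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE documents (name TEXT PRIMARY KEY, blob_id TEXT, size INTEGER)")
    return db


def open_blob_dir() -> Path:
    """Create a temporary directory standing in for the external BLOB store."""
    return Path(tempfile.mkdtemp(prefix="blobs-"))


def store_document(db: sqlite3.Connection, blob_dir: Path, name: str, data: bytes) -> str:
    """Write the bytes to the external store and record only a pointer in the database."""
    blob_id = hashlib.sha256(data).hexdigest()
    (blob_dir / blob_id).write_bytes(data)
    db.execute(
        "INSERT INTO documents (name, blob_id, size) VALUES (?, ?, ?)",
        (name, blob_id, len(data)),
    )
    db.commit()
    return blob_id


def load_document(db: sqlite3.Connection, blob_dir: Path, name: str) -> bytes:
    """Resolve the pointer stored in the database and read the bytes from the external store."""
    row = db.execute("SELECT blob_id FROM documents WHERE name = ?", (name,)).fetchone()
    if row is None:
        raise KeyError(name)
    return (blob_dir / row[0]).read_bytes()
```

The database stays small because each row holds a name, an identifier, and a size, regardless of how large the document is.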
RBS is the way to leverage non-SQL Server storage for your BLOBs. Any other approach you take will be a hack/workaround. Only your customer's requirements can tell us if the limitations of these workarounds are acceptable. For example, you could store the files elsewhere (network share or cloud share) and have SharePoint Search index them. In this case you lose out on a consistent UI for managing content, and you'd still need the SP hosting team's help to set up the search crawling.
The real answer is to work with your customer to document their business needs and why IT's offering doesn't meet that need so that they can give you more space.

Do I need Azure blob storage or just a simple web server on a VM?

I have a VM on Azure which runs my content management system, built on Node.js and MongoDB.
One of the things the CMS provides is a social sharing function, where HTML pages are created and users are given the URL to the page.
I expect a large volume of users (probably 5000 at a given time) to access these pages. I do not want this load to be on the same server as my CMS.
So I was thinking about moving the html pages to another server. My question is do I need to look at Azure blob storage to do this or should I just use another VM and put files there?
The files are very small and minified. I want to keep my costs down whilst at the same time if I get more than 5000 requests, the server should auto scale.
The question itself is somewhat subjective/opinion-soliciting. And how you solve this problem is really up to you.
But from an objective perspective:
Blobs themselves are not the same as local file storage. If you're going to store content in them, either your CMS needs to support them natively or you're going to need to build that support into it (if that's even possible). Since they have their own REST API (and related SDKs), you cannot simply do file I/O operations against them. They are, however, accessible via URI (which may be made private or public).
Azure VMs store their disks (VHDs) in page blobs (so you're already using blob storage, technically speaking). Each VM may also have attached disks (1TB each), also in page blobs, two disks per core (so a dual-core VM supports 4 attached 1TB disks). Just like your OS disk, these attached disks are durable, in blob storage. A CMS may access an attached disk once it's formatted and given a drive letter (Windows) or mounted (Linux). EDIT - forgot to mention: If you go with the attached-disk approach, you need to consider the fact that these disks are per-VM. That is, they are not shared across multiple VMs (in the event you scale your CMS to multiple instances).
Azure File Service is an SMB share sitting atop Azure Blob Storage. Again, durable storage, and drive-mappable. EDIT: unlike attached disks, Azure File Service SMB shares are accessible across multiple VMs.
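The attached-disk sizing rule quoted above (two disks per core, 1TB each) is simple arithmetic. A small sketch in Python, using only the figures from this answer; actual limits depend on the VM size and may have changed since, so treat the constants as assumptions.

```python
DISKS_PER_CORE = 2  # figure quoted in the answer above
DISK_SIZE_TB = 1    # figure quoted in the answer above


def max_attached_storage_tb(cores: int) -> int:
    """Upper bound on attached-disk capacity for a VM with `cores` cores."""
    if cores < 1:
        raise ValueError("a VM needs at least one core")
    return cores * DISKS_PER_CORE * DISK_SIZE_TB
```

For the dual-core example in the answer, `max_attached_storage_tb(2)` gives 4, matching the four 1TB disks mentioned.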

Backup Azure Virtual Machine local folders to blob storage?

I've just set up an extra small VM instance in Windows Azure to run a help console for our company. The help files can be updated and published through a simple .NET interface. Obviously the flat HTML files are getting deployed to the local drive on the VM and exposed publicly through IIS. I'm just wondering how stable this is? If the VM suffers a hardware failure, presumably there's no automatic failover and any edits we've made to the help system will be lost?
Can anyone recommend a way I can shuttle the source files out of the VM into blob storage? I could write an application to do this, I'm just wondering if there is an out-of-the-box solution out there?
Additional information:
The VM instance is running Server 2008 R2 SP1 (As a Virtual Machine not a web-role)
A backup needs to be created once every 24 hours
Aged backups (3+ days old) need to be automatically cleared from the blob container
The help system we use is called HelpConsole 2012
New pages are added at a rate of maybe 2-3 per week
The answer depends on whether you are running this in a Windows Azure Virtual Machine or on a Windows Azure Web role.
If you are running this on a Windows Azure Virtual Machine, then the VHD is stored in BLOB storage and, if the site is running off the C: drive and not on a data disk, the system has host caching turned on for both reads and writes. In this scenario it is possible (depending on the methods you use to write your files out) that the data is not pushed back to the VHD in BLOB storage before a failure occurs. You can either ensure that your writing methods do a write-through operation, or turn off the write caching. Better yet, attach a data disk for your web site files. By default data disks have both read and write caching off (you could turn on read caching). Since the VHDs are persisted, you don't have to worry about the edits getting lost. You can script taking a snapshot of the files and moving them to BLOB storage separately, or even push them somewhere else. Another thing to think about with this option is that you have to care for the VM instances and keep them patched and up to date.
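A scripted snapshot of the kind mentioned above, combined with the question's requirement to clear backups older than 3 days, might look like the following Python sketch. A local directory stands in for the blob container, so the actual upload to BLOB storage is left out; the `help-` file prefix and both function names are hypothetical. Running it once every 24 hours would be left to a scheduler such as Windows Task Scheduler.

```python
import shutil
import time
from pathlib import Path

MAX_AGE_SECONDS = 3 * 24 * 60 * 60  # "3+ days old", from the question


def create_backup(source_dir: Path, backup_dir: Path, now: float) -> Path:
    """Zip source_dir into backup_dir under a UTC-timestamped name."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S", time.gmtime(now))
    archive = shutil.make_archive(
        str(backup_dir / f"help-{stamp}"), "zip", root_dir=str(source_dir)
    )
    return Path(archive)


def prune_backups(backup_dir: Path, now: float) -> list[str]:
    """Delete archives last modified more than MAX_AGE_SECONDS before `now`."""
    removed = []
    for archive in sorted(backup_dir.glob("help-*.zip")):
        if now - archive.stat().st_mtime > MAX_AGE_SECONDS:
            archive.unlink()
            removed.append(archive.name)
    return removed
```

A daily run would call `create_backup(site_dir, backup_dir, time.time())` followed by `prune_backups(backup_dir, time.time())`.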
If you are running a Web Role, then yes, if a failure occurs and the VM goes through self-healing, it will indeed redeploy with the older files. In this case I'd recommend changing the code in the web role so that when it writes the updates to the local file, it also puts a copy of the local file into BLOB storage. In addition, in the web role OnStart you could reach out to BLOB storage and pull down all the new content locally. BE VERY CAREFUL with this approach though, because it only really works well for ONE instance, not multiple. If you plan on running multiple instances of the server (and you will have to if you want the SLA for uptime), then your code will need to be a little more robust: do writes out to BLOB storage and then alert all instances of the role that there is a new file to pull down locally.
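The write-through and pull-on-start pattern described above can be modelled in a few lines. This Python sketch uses an in-memory stand-in for the blob container and hypothetical class and function names; it is not Azure SDK code and only shows the order of operations, including the notification step needed for multiple instances.

```python
class BlobStoreStub:
    """In-memory stand-in for a blob container (hypothetical, not an Azure SDK class)."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}


class WebInstance:
    """One role instance with its own local copy of the content."""

    def __init__(self, store: BlobStoreStub) -> None:
        self.store = store
        self.local: dict[str, bytes] = {}
        self.on_start()

    def on_start(self) -> None:
        """Pull all published content down locally, as the role's OnStart would."""
        self.local = dict(self.store.blobs)

    def refresh(self, name: str) -> None:
        """Handle a 'new file' alert by pulling that one file from the store."""
        self.local[name] = self.store.blobs[name]


def publish(author: WebInstance, instances: list[WebInstance], name: str, data: bytes) -> None:
    """Write locally, copy to the durable store, then alert every other instance."""
    author.local[name] = data
    author.store.blobs[name] = data
    for instance in instances:
        if instance is not author:
            instance.refresh(name)
```

Because a freshly created `WebInstance` pulls everything from the store in `on_start`, a redeployed instance starts with the latest content instead of the files it was originally deployed with.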
Another option for web roles is to also write a handler for the content, so that incoming requests are mapped directly to a file in BLOB storage. Updates can then be made as direct edits to the file in BLOB storage. This offloads the serving of the flat files from your compute nodes to BLOB storage, and you could even implement some caching and stream the content back through the handler rather than having clients hit BLOB storage directly if you wanted to.
Another option is to use Windows Azure Web Sites for this. The underlying storage of the web site files in Windows Azure Web Sites is a shared location, so updating the files in it will immediately be reflected for all instances. Also, the content for the site is stored in BLOB storage and can be updated via FTP, source control, or directly from code. Lots of options here. You may end up moving to reserved instances to help keep away from some of the quotas that Web Sites have. Web Sites may not be an option for you currently, depending on other requirements (as in how much control you need over the environment, since you don't get a lot of control with Web Sites).

Azure blobs, what are they for?

I'm reading about Azure blobs and storage, and there are things I don't understand.
First, you can hire Azure for just hosting, but when you create a web role, do you need storage for the .dll's and other files (.js and .css)? Or is there a small storage quota in a worker role you can use? How large is it? I cannot understand getting charged every time a browser downloads a CSS file, so I guess I can store those things in another kind of storage.
Second, you get charged for transactions and bandwidth, so it's not a good idea to provide direct links to the blobs in your websites. Then what do you do? Download it from your web site code and write to the client output stream on the fly from ASP.NET? I think I've read that internal traffic/transactions are free, so it looks like a "too good to be true" solution :D
Is the traffic between hosting and storage also free?
Thanks in advance.
First, to answer your main question: blobs are best used for dynamic data files. If you run a YouTube sorta site, you would use blobs to store videos in every compressed state and the thumbnail images generated from those videos. Tables within table storage are best for dynamic data that does not require files. For example, comments on YouTube videos would likely be best stored in tables in ATS.
You generally want a storage account for at least two things: publishing your deployments into Azure, and giving your compute nodes somewhere to transfer their diagnostic data, so you can monitor them once you're deployed.
Even though you publish your deployments THROUGH a storage account, the deployment code lives on your compute nodes. .CSS/.HTML files served by your app are served from your node's storage space, which you get plenty of (it is NOT a good place for your dynamic data, however).
You pay for traffic/data that crosses the Azure data center boundary, regardless of where it came from. Furthermore, transactions (reads or writes) between your Azure table storage and anywhere else are not free. You also pay for storing the data in the storage account (storing data on compute nodes themselves is not metered). Data that does not leave the data center is not subject to transfer fees. Now in reality, the costs are so low that you have to be pushing gigabytes per day to start noticing.
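The three billable components named above (stored data, traffic leaving the data center, and transactions) can be combined into a rough monthly estimate. In this Python sketch the function name and the per-10,000-transactions unit are assumptions, and the prices are parameters the caller must supply; no real Azure rates are implied.

```python
def monthly_storage_cost(
    stored_gb: float,
    egress_gb: float,
    transactions: int,
    *,
    price_per_gb_stored: float,
    price_per_gb_egress: float,
    price_per_10k_transactions: float,
) -> float:
    """Rough monthly estimate; every price is a caller-supplied placeholder."""
    storage = stored_gb * price_per_gb_stored
    egress = egress_gb * price_per_gb_egress  # only data leaving the data center
    tx = transactions / 10_000 * price_per_10k_transactions
    return storage + egress + tx
```

Traffic that stays inside the data center is modelled by passing `egress_gb=0`, which leaves only the storage and transaction terms.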
Don't store any dynamic data only on compute instances. That data will get purged whenever you redeploy your app or whenever they decide to move your app onto a different node.
Hope this helps
