I have an Azure virtual machine that holds some application-specific CSV files (retrieved via FTP from on-premises) which need to be stored in a blob (and eventually will be read and pushed into an Azure SQL DB via a worker role). My question is about pushing the files from the VM to the blob. Is it possible to get AzCopy without installing the SDK to have the files copied to the blob? Is there a better solution than this? Please read the points below for further information.
Points to note:
1) Though the files could be uploaded directly to a blob rather than being pulled into the VM first and copied from there, for security reasons the files have to be pulled into the VM, and this cannot be changed.
2) I also thought about a worker role talking to a VM folder share (via a common virtual network) to pull the files and upload them to the blob, but after reading some blogs this does not appear to be the right solution, as it requires changes to both VMs (the worker role VM and the IaaS VM).
3) Azure File Service is still in preview (?) and hence cannot be used.
Is it possible to get AzCopy without installing the SDK to have the files copied to the blob?
Absolutely yes. You can download the AzCopy binaries directly, without installing the SDK, using the following links:
Version 3.1.0: http://aka.ms/downloadazcopy
Version 4.1.0: http://aka.ms/downloadazcopypr
Source: http://blogs.msdn.com/b/windowsazurestorage/archive/2015/01/13/azcopy-introducing-synchronous-copy-and-customized-content-type.aspx
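For illustration, once the AzCopy binaries are on the VM, a single command can push all CSV files from a local folder to a container. This is only a sketch: the folder path, storage account, container name, and key below are placeholders, not values from the question.

AzCopy /Source:C:\FtpDrop /Dest:https://<youraccount>.blob.core.windows.net/<yourcontainer> /DestKey:<your-storage-account-key> /Pattern:*.csv /Y

You could wrap this in a scheduled task on the VM so the copy runs right after each FTP pull, and let the worker role pick the blobs up from there.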
I currently have a Rackspace Cloud Server that I'd like to migrate to an Azure Virtual Machine. I recently got an MSDN subscription which gives me a certain level of hosting via Azure at no cost, whereas I'm currently paying for that level of service with Rackspace.
However, one of the nice things about Rackspace is that I can schedule nightly/weekly backups of the VM image. Is there any mechanism for doing this on Azure? I'm worried about protecting against corruption of the database (i.e. what if someone were to run an UPDATE statement and forget the WHERE clause).
I know the VMs are stored as .VHD files in my local Azure storage, but the VM image is 127 gigs. Downloading that nightly even with FIOS internet isn't really going to fly as a solution.
You can perform an asynchronous blob copy to make a physical copy of a vhd. See here for REST API details. This operation is very fast within the same data center (maybe a few seconds?). You don't need to make raw REST calls though: There's a method already implemented in the Azure cross-platform command line interface, available here. The command is:
azure vm disk upload
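If you would rather stay in PowerShell than use the cross-platform CLI, the same server-side copy can be kicked off with the Azure PowerShell storage cmdlets. A minimal sketch; the account, container, and blob names are placeholders:

$ctx = New-AzureStorageContext -StorageAccountName "mystorageaccount" -StorageAccountKey "<key>"
# Start the asynchronous, server-side copy of the VHD blob
Start-AzureStorageBlobCopy -SrcContainer "vhds" -SrcBlob "myvm-os.vhd" -DestContainer "backups" -DestBlob "myvm-os-backup.vhd" -Context $ctx
# Poll the copy status until it completes
Get-AzureStorageBlobCopyState -Container "backups" -Blob "myvm-os-backup.vhd" -Context $ctx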
You can also take blob snapshots, and return to a previous snapshot later. A snapshot is read-only (you can copy from it later) and initially takes up no space. However, as storage pages change, the snapshot grows.
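Snapshots can also be taken from PowerShell by dropping down to the storage client object underneath the cmdlets. A sketch, reusing the placeholder names and the $ctx context from above:

$blob = Get-AzureStorageBlob -Container "vhds" -Blob "myvm-os.vhd" -Context $ctx
# Create a read-only, point-in-time snapshot of the VHD blob
$snapshot = $blob.ICloudBlob.CreateSnapshot()
# The snapshot is identified by its timestamp
$snapshot.SnapshotTime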
One question though: why such a large VM image? Are you storing OS + data on the same VHD? If so, it may make more sense to mount a separate Azure Drive (also stored as a VHD in blob storage) to store data, and make independent copies / snapshots.
I've just set up an extra small VM instance in Windows Azure to run a help console for our company. The help files can be updated and published through a simple .NET interface. Obviously the flat HTML files are getting deployed to the local drive on the VM and exposed publicly through IIS. I'm just wondering how stable this is? If the VM suffers a hardware failure, presumably there's no automatic failover and any edits we've made to the help system will be lost?
Can anyone recommend a way I can shuttle the source files out of the VM into blob storage? I could write an application to do this; I'm just wondering if there is an out-of-the-box solution out there.
Additional information:
The VM instance is running Server 2008 R2 SP1 (as a Virtual Machine, not a web role)
A backup needs to be created once every 24 hours
Aged backups (3+ days old) need to be automatically cleared from the blob container
The help system we use is called HelpConsole 2012
New pages are added at a rate of maybe 2-3 per week
The answer depends on whether you are running this on a Windows Azure Virtual Machine or on a Windows Azure Web role.
If you are running this on a Windows Azure Virtual Machine, then the VHD is stored in BLOB storage and, if the site is running off the C: drive and not on a data disk, the system has some host caching turned on for both reads and writes. In this scenario it is possible (depending on the methods you use to write your files out) that the data is not pushed back to the VHD in BLOB storage before a failure occurs. You can either ensure that your writing methods do a write-through operation, or turn off the write caching. Better yet, attach a data disk for your web site files. By default data disks have both read and write caching off (you could turn on read caching). Since the VHDs are persisted, you don't have to worry about the edits getting lost. You can script out taking a snapshot of the files and moving them to BLOB storage separately, or even push them somewhere else. Another thing to think about with this option is that you have to care for the VM instances and keep them patched and up to date.
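To make the "script it out" part concrete for the once-a-day backup and three-day retention described above, here is a rough PowerShell sketch using the Azure storage cmdlets; the site path, storage account, and container name are placeholders, not details from the question:

# Upload today's copy of the help site and prune backups older than 3 days
$ctx = New-AzureStorageContext -StorageAccountName "mystorageaccount" -StorageAccountKey "<key>"
$root = "C:\inetpub\helpconsole"
$stamp = Get-Date -Format "yyyyMMdd"

# Push every file under the site root into a date-stamped "folder" in the container
Get-ChildItem $root -Recurse | Where-Object { -not $_.PSIsContainer } | ForEach-Object {
    $relative = $_.FullName.Substring($root.Length).TrimStart('\').Replace('\', '/')
    Set-AzureStorageBlobContent -File $_.FullName -Container "helpbackups" -Blob "$stamp/$relative" -Context $ctx -Force
}

# Delete any backup blob older than three days
$cutoff = (Get-Date).ToUniversalTime().AddDays(-3)
Get-AzureStorageBlob -Container "helpbackups" -Context $ctx |
    Where-Object { $_.LastModified.UtcDateTime -lt $cutoff } |
    Remove-AzureStorageBlob

Run it from Windows Task Scheduler once every 24 hours and both requirements in the question are covered.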
If you are running a Web Role, then yes, if a failure occurs and the VM goes through self-healing it will indeed redeploy with the older files. In this case I'd recommend changing the code in the web role so that when it writes updates to the local file it also puts a copy of the file into BLOB storage. In addition, in the web role's OnStart you could reach out to BLOB storage and pull down all the new content locally. BE VERY CAREFUL with this approach though, because it only really works well for ONE instance, not multiple. If you plan on running multiple instances of the server (and you will have to if you want the SLA for uptime) then your code will need to be a little more robust and do writes out to BLOB storage and then alert all instances of the role that there is a new file to pull down locally.
Another option for web roles is to write a handler for the content so that incoming requests are mapped directly to a file in BLOB storage. Updates then go straight to the file in BLOB storage. This offloads the serving of the flat files from your compute nodes to BLOB storage, and you could even implement some caching and stream the content back through the handler rather than having requests hit BLOB storage directly if you wanted to.
Now, another option is to use Windows Azure Web Sites for this. The underlying storage of the web site files in Windows Azure Web Sites is a shared location, so updating the files in it will immediately be reflected for all instances. Also, the content for the site is stored in BLOB storage and can be updated via FTP, source control, or directly from code. Lots of options here. You may end up moving to reserved instances to help keep away from some of the quotas that Web Sites have. Web Sites may not be an option for you currently depending on other requirements (such as how much control you need over the environment, since you don't get a lot of control with Web Sites).
I have just implemented Umbraco in an Azure Cloud Instance. I was able to migrate my existing SQL Database to run on SQL Azure and everything runs fine, except for the images and documents inside the media folder.
By default the media folder resides in [siteroot]/Media.
Is there a way to map this folder to azure storage? If not I don't think I'm going to be able to scale up my cloud instances, since the images depend on the virtual server's local storage.
Edit: Bounty Started
What I have so far is this:
Define a standalone web role which would hold the media directory and all the files.
Map this folder to the Azure Blob Storage service with Cloud Drive, in order to minimize the risk of losing data and relying on a single point of storage.
Somehow (and this is the part I don't know how to accomplish) keep the [siteRoot]/media folder synced with this shared drive on all running instances.
I've seen a similar approach taken with the Azure Accelerator project from Umbraco here: http://azureaccelerators.codeplex.com/releases
But they haven't updated the release since 2011, and I'm not sure it would work with the current version of Azure.
Edit 2:
Umbraco has its own accelerator, but they've deprecated it in favor of using Web Sites instead of Web Roles:
https://github.com/Microsoft-DPE/wa-accelerator-umbraco
This release works with the 1.6 SDK. The current SDK version is 1.8, I believe...
I'm not sure about a way of mapping the path to storage, but depending on the version of Umbraco you are using, I think from 4.9 (possibly 4.10) they introduced a FileSystemProviders configuration which may help solve your problem.
My understanding is that it allows you to replace the default Umbraco FileSystemProvider, Umbraco.Core.IO.PhysicalFileSystem, with your own custom implementation. I'm pretty sure you could implement an Azure-based provider that writes to and reads from blob storage. In the source it looks fairly straightforward, a matter of implementing their IFileSystem interface.
Ended up using Matt Brailsford's Universal Media Picker solution:
http://our.umbraco.org/projects/backoffice-extensions/universal-media-picker
The final solution actually circumvents the Umbraco media folder and reads directly from Blob Storage, so I had to rewrite all the macros and templates that previously rendered images and point them directly to the Blob Storage account.
Unfortunately there's no way to map an NTFS directory to blob storage directly.
Have a look at the CloudDrive class of the Windows Azure SDK. This feature allows you to upload a Virtual Hard Disk file (.vhd file) into your blob storage and mount it as a local drive inside Windows Azure Instances.
You should know that (if you're using multiple instances) only one cloud instance can mount the VHD in read/write mode. The rest of them have only read access to the drive. If the "Media" folder stores static content that you update manually only a few times, this is okay. But if user content is placed there too, you might want only one instance to mount the VHD and grant the other instances access to it via a network share.
This package provided by Ali Sheikh Taheri solves the problem of the media folder
http://our.umbraco.org/projects/backoffice-extensions/ast-azure-media-sync
How would I write to a tmp/temp directory in a Windows Azure web site? I can write to a blob, but I'm using an npm package that requires me to give it file names so that it can write directly to those files.
Are you using Cloud Services (PaaS) or Virtual Machines (IaaS)?
If PaaS, look at Windows Azure Local Storage. This option gives you up to 250 GB of disk space per core. It's a great location for temporary storage of information in a way that traditional apps will be familiar with. However, it's not persistent, so if you put anything there that needs to survive the VM instance getting repaved, copy it to Blob storage as well. Also, this storage is specific to a given role instance. So if you have two instances of the same role, they each have their own local storage buckets.
Alternatively, you can use Azure Drive, which allows you to keep the information persisted, but still doesn't allow multiple parallel writes.
If IaaS, then you can just mount a data disk to the VM and write to it directly. Data disks are already persisted to blob storage so there's little risk of data loss.
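For reference, attaching a new empty data disk can be done in one pipeline with the (service-management era) Azure PowerShell module; a sketch, where the cloud service name, VM name, and size are placeholders:

# Attach a new 50 GB data disk to an existing IaaS VM
Get-AzureVM -ServiceName "mycloudservice" -Name "myvm" |
    Add-AzureDataDisk -CreateNew -DiskSizeInGB 50 -DiskLabel "scratch" -LUN 0 |
    Update-AzureVM
# Then initialize and format the disk inside the VM and write to it like any other drive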
This is just from my understanding, so please correct me if anything is wrong.
In Windows Azure Web Sites, the content of your website is stored in blob storage and mounted as a drive, which is shared by all instances your web site is using. And since it's in blob storage, it's persistent. So if you need the local file system, I think you can use the folders under your web site root path. But I don't think you can use the system tmp or temp folder.
I have a build script that I'd like to configure to dump some files into Azure blob storage so they can be picked up by my Azure web role.
My preferred plan was to find some way of mounting the blob storage on my build server as a mapped drive and simply using Robocopy to copy the files over. This will involve the least amount of friction, as I am already deploying some files like this to other web servers using WebDrive.
I found a piece of software that will allow me to do that: http://www.gladinet.com/
However on further investigation I found that it needs port 80 to run without some hairy looking hacking about on the server.
So is there another piece of software I could use or perhaps another way I haven't considered, such as deploying the files to a local folder that is automagically synced with blob storage?
Update in response to @David Makogon
I am using http://waacceleratorumbraco.codeplex.com/ which performs two-way synchronisation between the blob storage and the web roles. I have tested this with http://cloudberrylab.com/ and I can deploy files manually to the blob and they are deployed correctly to the web roles. I have also done the reverse and updated files in the web roles, which were then synced back to the blob, and I subsequently edited/downloaded them from blob storage.
What I'm really looking for is a way to automate the CloudBerry side of things, so I don't have a manual step to copy a few files over. I will investigate the PowerShell solutions in the meantime.
I know this is an old post - but in case someone else comes here... the answer is now "yes". I've been working on a CodePlex project to do exactly that. (All source code is available).
http://azuredrive.codeplex.com/
If you're comfortable using PowerShell in your build process then you could use the Cerebrata cmdlets to upload the files. If that doesn't work for you, you could write a custom activity (but this sounds quite a bit more involved).
Mounting a cloud drive from a non-Windows Azure compute instance (e.g. your local build machine) is not supported.
Having said that: Even if you could mount a Cloud Drive from your build machine, your compute instances would need access to it too, and there can only be one writer. If your compute instances only needed read-only access, they'd need to create a snapshot after you upload new files.
This really doesn't sound like a good idea though. As knightpfhor suggested, the Cerebrata cmdlets provide this capability (look at Import-File). This allows you to push individual files into their own blobs. You can optimize further by pushing a single ZIP file into a blob. You can then use a technique similar to the one described by Nate Totten in his multi-tenant web role sample, to detect new zip files and expand them to your local storage. Nate's blog post is here.
Oh, and if you don't want to use the Cerebrata cmdlets, you can upload blobs directly with the Windows Azure Storage REST API (though the cmdlets are very simple to use and integrate seamlessly with PowerShell).
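If you go with the stock Azure PowerShell storage cmdlets rather than the Cerebrata ones, the build step boils down to a few lines; a sketch, where the storage account, container, and build output folder are placeholders:

# Push each file in the build output to its own blob for the web role to pick up
$ctx = New-AzureStorageContext -StorageAccountName "mystorageaccount" -StorageAccountKey "<key>"
Get-ChildItem ".\BuildOutput" | Where-Object { -not $_.PSIsContainer } | ForEach-Object {
    Set-AzureStorageBlobContent -File $_.FullName -Container "deployfiles" -Blob $_.Name -Context $ctx -Force
}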