I was reading a blog post by Microsoft's Brad Calder (link below) about how Azure uses VHD files backed by page blobs. One interesting bit of information reads:
"It is also important to note that when you delete files within the file system used by the VHD, most operating systems do not clear or zero these ranges, so you can still be paying capacity charges within a blob for the data that you deleted via a disk/drive."
I take this to mean that if I attach a 1TB drive, fill it up, and then delete all the files, I will still be using 1TB of backing pages because they weren't cleared, even though the drive appears empty inside the VM.
I decided to test this out, and my results were the opposite of what Brad states. I created a Win2012 R2 VM in Azure, attached a 1GB drive, and wrote some code to measure the amount of page blob data I was using. I then copied files onto the drive and again recorded the amount of page blob data in use; this number went up as expected. To my surprise, when I deleted the files, the page blob data in use returned to the original number for an empty drive.
I ran this test multiple times with different size drives and different types of data. Each time my backing page blob data size accurately reflected what was on the drive (i.e. I never saw "ghost" data remaining).
Can anyone shed some light on my results? Is this to be expected? Did something change in Azure? I can't find any information regarding this topic besides Brad's blog post.
Thanks!
Brad Calder's blog: http://blogs.msdn.com/b/windowsazure/archive/2012/06/28/data-series-exploring-windows-azure-drives-disks-and-images.aspx
After posting I managed to discover what's going on. Back in October 2013, Microsoft added TRIM support to Windows Server 2012 VMs (Windows Server 2008 is supported with some caveats). When you delete a file in Windows, a TRIM command is now sent, which causes Azure to clear the backing pages in the VHD's page blob.
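You can verify this in your own VM, since Windows exposes TRIM as a filesystem setting. A quick sanity check from an elevated command prompt (a value of 0 means delete notifications, i.e. TRIM, are enabled):

fsutil behavior query DisableDeleteNotify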
Here's a post that discusses the addition of TRIM:
http://mvwood.com/blog/trim-support-comes-to-windows-azure-virtual-machines/
Here's a YouTube video of Mark Russinovich talking about how Azure uses TRIM:
https://www.youtube.com/watch?v=5PZ6wFXQ9-4
Posting here, as Server Fault doesn't seem to have the detailed Azure knowledge.
I have an Azure storage account with a file share. The file share is connected to an Azure VM as a mapped drive. An FTP server on the VM accepts a stream of files and stores them directly in the file share.
There are no other connections. Only I have Azure admin access; a limited set of support people have access to the VM.
Last week, for unknown reasons, 16 million files nested in many sub-folders (by origin and date) moved instantly into an unrelated subfolder three levels deep.
I'm baffled as to how this can happen. There is a clear, instant cut-off at the moment the files moved.
As a result, I'm seeing increased costs on LRS, presumably because Azure storage is internally replicating the change at my expense.
I have attempted to copy the files back using a VM and AzCopy. This process crashed midway through, leaving me with a half-completed copy operation. The failed attempt took days, which makes me confident it wasn't the support guys accidentally dragging and dropping a folder.
Questions:
Is it possible to instantly move so many files? If so, how?
Is there a solid way I can move the files back, taking into account the half-copied files? I mean an Azure backend operation, rather than writing an app / PowerShell / AzCopy.
Is there a cost-efficient way of doing this? (I'm on the transaction-optimized tier.)
Do I have a case to get Microsoft to do something? We didn't move the files; I assume something messed up internally.
Thanks
A tool that supports server-side copy (like AzCopy) can move the files quickly, because only the metadata is updated. If you want to investigate the root cause, your best bet is to file a support ticket; the Azure support team can guide you on a best-effort basis.
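For the bulk move itself, something along these lines should work with AzCopy v10, which copies server-side between file shares (a sketch only: the account, share, and folder names are placeholders, both URLs need a SAS token, and --overwrite=ifSourceNewer is intended to let it resume over your half-completed attempt without re-copying everything):

azcopy copy "https://youraccount.file.core.windows.net/yourshare/level1/level2/level3/?<SAS>" "https://youraccount.file.core.windows.net/yourshare/?<SAS>" --recursive --overwrite=ifSourceNewer

Bear in mind a copy leaves the source files in place, so you would still have to delete the stray folder afterwards, paying transactions both ways.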
We have a VPS running on Google Cloud which had a very important folder in a user directory. An employee of ours deleted that folder, and we can't figure out how to recover it. I came across extundelete, but it seems the partition needs to be unmounted for it to work, and I don't understand how I would do that on Google Cloud. This project took more than a year, and that was the latest copy after a fire took out the last copy on our local servers.
Could anyone please help or guide me in the right direction?
Getting any files back from your VM's disk may be tricky (at best) or impossible (most probably) if the files got overwritten.
The easiest way would be to get them back from a copy or snapshot of your VM's disk. If you have a snapshot of the disk (taken either manually or automatically) from before the folder in question got deleted, then you can get your files back.
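With the gcloud CLI, restoring comes down to creating a new disk from the snapshot and attaching it (a sketch; the snapshot, disk, VM, and zone names are placeholders):

gcloud compute snapshots list
gcloud compute disks create recovered-disk --source-snapshot=my-snapshot --zone=us-central1-a
gcloud compute instances attach-disk my-vm --disk=recovered-disk --zone=us-central1-a

Then mount the new disk inside the VM and copy the folder back.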
If you don't have any backups then you may try to recover the files - I've found many guides and tutorials, let me just link the ones I believe would help you the most:
Unix/Linux undelete/recover deleted files
Recovering accidentally deleted files
Get list of files deleted by rm -rf
------------- UPDATE -----------
Your last chance in this battle is to make two clones of the disk, then detach the original disk from the VM and attach one of the clones to keep your VM running. Use the second clone for any experiments, and keep the original untouched in case you mess up the second clone.
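Sketched with gcloud (disk, VM, and zone names are placeholders; this assumes the disk is a non-boot data disk that can be detached from the running VM):

gcloud compute disks create clone-1 --source-disk=original-disk --zone=us-central1-a
gcloud compute disks create clone-2 --source-disk=original-disk --zone=us-central1-a
gcloud compute instances detach-disk my-vm --disk=original-disk --zone=us-central1-a
gcloud compute instances attach-disk my-vm --disk=clone-1 --zone=us-central1-a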
Now create a new Windows VM and attach your second clone as an additional disk. At this point you're ready to try various data recovery software:
UFS Explorer
Virtual Machine Data Recovery
There are plenty of others to try from too.
Another approach would be to create an image from the original disk and export it as a VMDK image (saving it to a storage bucket). Then download it to your local computer and use, for example, VMware VMDK Recovery or other specialized software for extracting data from virtual machine disk images.
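With gcloud that would look roughly like this (image, bucket, and zone names are placeholders, and the export runs through Cloud Build, so it can take a while):

gcloud compute images create recovery-image --source-disk=original-disk --source-disk-zone=us-central1-a
gcloud compute images export --image=recovery-image --destination-uri=gs://my-bucket/recovery.vmdk --export-format=vmdk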
The scenario is as follows: a large text file is put somewhere. At a certain time of day (or manually, or after x number of files), a virtual machine with BizTalk installed should start automatically to process these files. The files should then be put in some output location and the VM should be shut down. I don't know how long processing these files takes.
What is the best way to build such a solution? Preferably, the solution should be reusable for similar scenarios in the future.
I was thinking of Logic Apps for the workflow, blob storage or FTP for input/output of the files, an API App for starting/shutting down the VM. Can Azure Functions be used in some way?
EDIT:
I also asked the question elsewhere, see link.
https://social.msdn.microsoft.com/Forums/en-US/19a69fe7-8e61-4b94-a3e7-b21c4c925195/automated-processing-of-large-text-files?forum=azurelogicapps
Just create an Azure Automation Runbook with a schedule. Have the Runbook check for specific files in a storage account; if they exist, start the VM and wait until the files are gone (meaning BizTalk processed them, deleted them, and put the results where they belong), at which point the Runbook stops the VM.
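The Runbook itself would be PowerShell, but the logic is simple enough to sketch with the Azure CLI (the resource group, VM, account, and container names below are placeholders, and it assumes you are logged in with access to the storage account):

# Any files waiting to be processed?
count=$(az storage blob list --account-name mystore --container-name input --query "length(@)" -o tsv)
if [ "$count" -gt 0 ]; then
  az vm start --resource-group myRG --name biztalk-vm
  # Poll until BizTalk has processed and removed the files
  while [ "$(az storage blob list --account-name mystore --container-name input --query "length(@)" -o tsv)" -gt 0 ]; do
    sleep 60
  done
  # Deallocate (not just stop) so you are not billed for compute
  az vm deallocate --resource-group myRG --name biztalk-vm
fi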
I have a few questions concerning Azure. At this moment I have created a VHD image pre-installed with all my software, so I can easily rebuild the same server. All this works perfectly, but the next thing I'm working on is backups.
There is a lot of material on the web concerning this, but none of it involves Linux (or I can't find it). As I read it, there are a few options.
The first option is to create a snapshot and store it in blob storage. The next question is: HOW? I installed the Azure CLI tools via npm, but how do I use them? There is nothing on the web about using them on the command line.
The second option is to store a ZIP file as blob data, so I can manage the backups manually instead of taking a complete snapshot. I don't know whether this is better or worse, but the same question applies: how does it work?
I hope someone can point me in the right direction, because I am stuck at this point. As you might imagine, backups are essential, so without them I can't use Azure.
Thanks for your answer but I am still not able to do this.
root@DEBIAN:/backup# curl https://mystore01.blob.core.windows.net/backup/myblob?comp=snapshot
<?xml version="1.0" encoding="utf-8"?><Error><Code>UnsupportedHttpVerb</Code><Message>The resource doesn't support specified Http Verb.
RequestId:09d3323f-73ff-4f7a-9fa2-dc4e219faadf
Time:2013-11-02T11:59:08.9116736Z</Message></Error>root@DEBIAN:/backup# curl https://mystore01.blob.core.windows.net/backup/myblob?comp=snapshot -i
HTTP/1.1 405 The resource doesn't support specified Http Verb.
Allow: PUT
Content-Length: 237
Content-Type: application/xml
Server: Microsoft-HTTPAPI/2.0
x-ms-request-id: f9cad24e-4935-46e1-bcfe-a268b9c0107b
Date: Sat, 02 Nov 2013 11:59:18 GMT
<?xml version="1.0" encoding="utf-8"?><Error><Code>UnsupportedHttpVerb</Code><Message>The resource doesn't support specified Http Verb.
RequestId:f9cad24e-4935-46e1-bcfe-a268b9c0107b
Time:2013-11-02T11:59:19.8100533Z</Message></Error>root@HSTOP40-WEB01:/backup# ^C
Hope you can help me get it working since the documentation on Azure + Linux is very bad
I don't believe snapshots are implemented in the CLI. You can either work with the REST API for snapshotting directly, or use one of the language SDKs that wrap this functionality (such as Node.js createBlobSnapshot()).
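Incidentally, that also explains the 405 in your trace: the response even says "Allow: PUT". The Snapshot Blob REST operation has to be sent as an authenticated PUT, not a GET. Roughly like this (a sketch only; <SAS> is a placeholder for a shared access signature, since hand-crafting a SharedKey Authorization header in curl is painful, which is exactly why the SDKs are easier):

curl -X PUT -H "x-ms-version: 2012-02-12" -H "Content-Length: 0" "https://mystore01.blob.core.windows.net/backup/myblob?comp=snapshot&<SAS>"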
Note that snapshots are point-in-time lists of committed blocks/pages. They're not actual bit-for-bit copies (yet they represent the exact contents of a blob at the moment you take the snapshot). You can then copy the snapshot to a new blob if you want and do anything you want with it (spin up a new vm, whatever). You can even do a blob-copy to a storage account in a separate data center, if you're looking at a DR strategy.
Snapshots will initially take up very little space. If you start modifying the blocks or pages in a blob, the storage consumed by the snapshot starts growing (since the blocks/pages representing the original content must be retained). You can take unlimited snapshots, but you should consider purging them over time.
If you needed to restore your VM image to a particular point in time, you can copy any one of your snapshots to a new blob (or overwrite the original blob) and restart your VM based on the newly-copied vhd.
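Via REST, that restore is a Copy Blob call whose source is the snapshot URL (again just a sketch; the snapshot timestamp and the <SAS> tokens are placeholders, and the source must also be accessible to the copy operation):

curl -X PUT -H "x-ms-version: 2012-02-12" -H "Content-Length: 0" -H "x-ms-copy-source: https://mystore01.blob.core.windows.net/backup/myblob?snapshot=2013-11-02T12:00:00.0000000Z" "https://mystore01.blob.core.windows.net/backup/restored.vhd?<SAS>"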
You can store anything you want in a blob, including zip files. Not sure what the exact question is on that, but just create a zip and upload it to a blob.
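For example, a small archive can be pushed up with a single Put Blob request (the <SAS> is a placeholder again; note that a single Put Blob call is capped at 64 MB in this service version, so larger archives need to be uploaded as blocks or via an SDK/tool):

curl -X PUT -H "x-ms-version: 2012-02-12" -H "x-ms-blob-type: BlockBlob" --data-binary @backup.zip "https://mystore01.blob.core.windows.net/backup/backup.zip?<SAS>"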
I've been told that you can create virtual directories in IIS hosted on Azure, but I'm struggling to find any info on this as it's a relatively new feature. I'd like to point the virtual directory at an Azure Drive (XDrive, an NTFS drive) so that I can reference resources on the drive.
I'm migrating an on-premises website to Azure and need to minimise the amount of rework / redevelopment required. Currently the website has access to shared content folders, and I'm trying to mimic a similar set-up due to tight timescales.
Does anyone have any knowledge of this or pointers for me as I can't find any information on how to do this?
Any information / pointers you have would be great
Thanks
Steve
I haven't had a moment to check myself, but get the latest copy of the Windows Azure Platform Training Kit. I'm fairly certain it has a hands-on lab that demonstrates the new feature. However, I do not believe that lab includes creating a virtual directory on an Azure Drive. Even if you can point it there, you may run into some .NET security limitations. http://www.microsoft.com/downloads/en/details.aspx?FamilyID=413e88f8-5966-4a83-b309-53b7b77edf78&displaylang=en
Another resource to look into might be the stuff Cory Fowler is doing http://blog.syntaxc4.net/ He's been spending some time of late really digging into the internals of the new 1.3 roles. So he might be able to lend you a hand.
I've been kicking this issue around for some time now. I can upload a VHD to Azure, and I can create a virtual directory in Azure that points to a physical location on my PC (when running in the dev fabric), but here's the catch:
I can't find any examples of doing both at the same time, i.e. mounting a drive and then mapping a virtual directory to it.
I've had a look at the 1.3 SDK and at various blogs, but I can't see any pointers on this - I guess I may have got hold of the wrong end of the stick. If anyone knows how or whether this can be done, that would be great.