I have a few questions concerning Azure. So far I have created a VHD image pre-installed with all my software so I can easily rebuild the same server. All of this works perfectly, but the next thing I'm working on is backups.
There is a lot of material on the web about this, but none of it involves Linux (or I can't find it). As far as I can tell, there are a few options.
The first option is to create a snapshot and store it in blob storage. But HOW? I installed the Azure CLI tools via npm, but how do I use them? I can't find anything on the web about using them from the command line.
The second option is to store a ZIP file as blob data, so I can manage the backups manually instead of taking a complete snapshot. I don't know whether this is better or worse, but the same question applies: how does it work?
I hope someone can point me in the right direction, because I am stuck at this point. As you might guess, backups are essential here, so without them I can't use Azure.
Thanks for your answer but I am still not able to do this.
root@DEBIAN:/backup# curl https://mystore01.blob.core.windows.net/backup/myblob?comp=snapshot
<?xml version="1.0" encoding="utf-8"?><Error><Code>UnsupportedHttpVerb</Code><Message>The resource doesn't support specified Http Verb.
RequestId:09d3323f-73ff-4f7a-9fa2-dc4e219faadf
Time:2013-11-02T11:59:08.9116736Z</Message></Error>root@DEBIAN:/backup# curl https://mystore01.blob.core.windows.net/backup/myblob?comp=snapshot -i
HTTP/1.1 405 The resource doesn't support specified Http Verb.
Allow: PUT
Content-Length: 237
Content-Type: application/xml
Server: Microsoft-HTTPAPI/2.0
x-ms-request-id: f9cad24e-4935-46e1-bcfe-a268b9c0107b
Date: Sat, 02 Nov 2013 11:59:18 GMT
<?xml version="1.0" encoding="utf-8"?><Error><Code>UnsupportedHttpVerb</Code><Message>The resource doesn't support specified Http Verb.
RequestId:f9cad24e-4935-46e1-bcfe-a268b9c0107b
Time:2013-11-02T11:59:19.8100533Z</Message></Error>root@HSTOP40-WEB01:/backup# ^C
I hope you can help me get this working, since the documentation on Azure + Linux is very sparse.
I don't believe snapshots are implemented in the CLI. You can either work with the REST API for snapshotting directly, or use one of the language SDKs that wrap this functionality (such as createBlobSnapshot() in the Node.js SDK). Incidentally, the 405 you're seeing is because the snapshot operation is a PUT request (note the Allow: PUT header in the response), while your curl command is issuing a GET; it also needs an Authorization header or SAS.
Note that snapshots are point-in-time lists of committed blocks/pages. They're not actual bit-for-bit copies, yet they represent the exact contents of a blob at the moment you take the snapshot. You can then copy the snapshot to a new blob if you want and do anything you want with it (spin up a new VM, whatever). You can even do a blob copy to a storage account in a separate data center if you're looking at a DR strategy.
Snapshots initially take up very little space. If you start modifying the blocks or pages in the base blob, the snapshot starts growing (as there need to be blocks/pages retained to represent the original content). You can take unlimited snapshots, but you should consider purging them over time.
If you needed to restore your VM image to a particular point in time, you can copy any one of your snapshots to a new blob (or overwrite the original blob) and restart your VM based on the newly-copied vhd.
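For what it's worth, here's a minimal sketch of that snapshot-and-restore flow using the current Python SDK (azure-storage-blob) rather than the Node.js library mentioned above; the container and blob names are placeholders, and a same-account copy may still need a SAS on the source URL depending on your setup:
# pip install azure-storage-blob  -- a sketch, not the only way to do this
import os
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])
vhd = service.get_blob_client(container="vhds", blob="myserver.vhd")  # placeholder names

# Take a point-in-time snapshot of the VHD page blob (cheap until the base blob changes).
snap = vhd.create_snapshot()
snapshot_time = snap["snapshot"]  # identifies this particular snapshot

# Later, to restore: copy the snapshot back over the base blob (or to a new blob).
# The VM should be stopped and the disk detached before its VHD is overwritten.
copy = vhd.start_copy_from_url(f"{vhd.url}?snapshot={snapshot_time}")
print("copy status:", copy["copy_status"])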
You can store anything you want in a blob, including zip files. Not sure what the exact question is on that, but just create a zip and upload it to a blob.
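And for the ZIP route, a similar hedged sketch (the container and file names are made up):
# pip install azure-storage-blob
import os
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])
backup = service.get_blob_client(container="backups", blob="backup-2013-11-02.zip")  # placeholders

# Upload the ZIP as a block blob; overwrite=True replaces any existing blob with the same name.
with open("/backup/backup-2013-11-02.zip", "rb") as data:
    backup.upload_blob(data, overwrite=True)

# Download it again when you need to restore.
with open("/backup/restore.zip", "wb") as out:
    out.write(backup.download_blob().readall())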
Posting here as Server Fault doesn't seem to have the detailed Azure knowledge.
I have an Azure storage account with a file share. The file share is connected to an Azure VM through a mapped drive. An FTP server on the VM accepts a stream of files and stores them directly in the file share.
There are no other connections. Only I have Azure admin access; a limited number of support people have access to the VM.
Last week, for unknown reasons, 16 million files, nested in many sub-folders (by origin and date), moved instantly into an unrelated subfolder three levels deep.
I'm baffled as to how this could happen. There is a clear, instant cut-off point when the files moved.
As a result, I'm seeing increased costs on LRS, I assume because Azure Storage is internally replicating the change at my expense.
I have attempted to copy the files back using a VM and AzCopy. The process crashed midway through, leaving me with a half-completed copy operation. That failed attempt took days, which makes me confident it wasn't the support guys accidentally dragging and dropping a folder.
Questions:
Is it possible to instantly move so many files? If so, how?
Is there a solid way I can move the files back, taking into account the half-copied files? I mean an Azure back-end operation rather than writing an app / PowerShell / AzCopy.
Is there a cost-efficient way of doing this? (I'm on the Transaction Optimized tier.)
Do I have a case here to get Microsoft to do something? We didn't move the files... I assume something internal messed up.
Thanks
A move within the share can happen that quickly because a rename is essentially a metadata update, and a tool that supports server-side copy (like AzCopy) can move the files back without the data ever passing through your VM. If you want to investigate the root cause, I recommend opening a support case; the Azure support team can help dig into this on a best-effort basis.
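If you do end up scripting the move back yourself, the per-file primitive AzCopy uses is a server-side copy. A rough, hedged sketch with the Python SDK (azure-storage-file-share); the share name, paths, and SAS token are placeholders, the destination directory must already exist, and for 16 million files you would wrap this in a resumable loop:
# pip install azure-storage-file-share
import os
from azure.storage.fileshare import ShareFileClient

conn = os.environ["AZURE_STORAGE_CONNECTION_STRING"]
sas = os.environ["SOURCE_READ_SAS"]  # read SAS for the source file (placeholder)

src = ShareFileClient.from_connection_string(conn, share_name="ftpshare",
                                              file_path="misplaced/folder/origin1/20240101/file1.dat")
dst = ShareFileClient.from_connection_string(conn, share_name="ftpshare",
                                              file_path="origin1/20240101/file1.dat")

# Server-side copy: the storage service moves the bytes itself; nothing flows through the VM.
dst.start_copy_from_url(f"{src.url}?{sas}")

# Delete the misplaced copy only after confirming the copy completed successfully.
# src.delete_file()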
Background:
I am moving a legacy app that stored images and documents on the web server's local disk over to a PaaS Azure Web App, using Azure File Storage to store and serve the files.
Question:
I have noticed that sometimes the URL for a file download fails the first time: either image links on a page are broken until I refresh, or a download fails the first time and then succeeds the next. I am guessing this is due to some issue with how Azure File Storage works and that it hasn't started up or something. The only consistent thread I have observed is that this seems to happen once or twice in the morning when I first start working with it. I am guessing maybe my account has to ramp up or something, so it's not ready on the first go-round. I tried to come up with steps to reproduce the problem, but I could not reproduce the symptom. If my hunch is correct, I will have to wait until tomorrow morning to try again. I will post more detailed error information if/when I can.
// Build the path within the file share, collapsing any accidental double slashes.
var fullRelativePath = $"{_fileShareName}/{_fileRoot}/{relativePath}".Replace("//","/");
// Prepend the storage root URL and append the SAS token so the file can be fetched directly.
return $"{_fileStorageRootUrl}{fullRelativePath}{_fileStorageSharedAccessKey}";
Thanks!
So it's been a while, but I remember I was able to resolve this, so I'll be writing this from memory. To be able to access an image from File Storage via a URL, you need to use a SAS token. I already had one, which is why I was perplexed about this. I'm not sure if this is the ideal solution, but what I wound up doing was just appending some random characters to the end of the URL, after the SAS token, and that made it work. My guess is this somehow made the URL unique, which may have helped it bypass some caching mechanism that was behaving erratically.
I'll see if I can dig up a working example from my archive. If so, I'll append it to this answer.
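In the meantime, a minimal sketch of the cache-busting idea described above (not the original code, and in Python rather than the C# shown in the question):
import uuid

def cache_busted(file_url_with_sas: str) -> str:
    # Append a throwaway query parameter after the SAS token so every request uses a
    # unique URL and cannot be served from a stale or misbehaving cache entry.
    separator = "&" if "?" in file_url_with_sas else "?"
    return f"{file_url_with_sas}{separator}cb={uuid.uuid4().hex}"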
The scenario is as follows: a large text file is put somewhere. At a certain time of day (or manually, or after x number of files), a virtual machine with BizTalk installed should start automatically to process these files. The files should then be put in some output location and the VM should be shut down. I don't know how long processing these files will take.
What is the best way to build such a solution? Preferably, the solution should be reusable for similar scenarios in the future.
I was thinking of Logic Apps for the workflow, Blob Storage or FTP for input/output of the files, and an API App for starting/shutting down the VM. Can Azure Functions be used in some way?
EDIT:
I also asked the question elsewhere, see link.
https://social.msdn.microsoft.com/Forums/en-US/19a69fe7-8e61-4b94-a3e7-b21c4c925195/automated-processing-of-large-text-files?forum=azurelogicapps
Just create an Azure Automation Runbook with a schedule. Have the Runbook check for specific files in a storage account; if they exist, start up the VM and wait until the files are gone (meaning BizTalk has processed them, deleted them, and put the output where it belongs). Once the files are gone, the Runbook stops the VM.
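A hedged sketch of that Runbook logic (Azure Automation also supports Python runbooks); the subscription, resource group, VM, and container names are placeholders, and error handling is omitted:
# pip install azure-identity azure-mgmt-compute azure-storage-blob
import time
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
from azure.storage.blob import ContainerClient

SUBSCRIPTION_ID = "<subscription-id>"                                   # placeholder
RESOURCE_GROUP, VM_NAME = "rg-biztalk", "vm-biztalk"                    # placeholders
INPUT_CONTAINER_URL = "https://mystore01.blob.core.windows.net/input"   # placeholder

credential = DefaultAzureCredential()
compute = ComputeManagementClient(credential, SUBSCRIPTION_ID)
inbox = ContainerClient.from_container_url(INPUT_CONTAINER_URL, credential=credential)

# Are there any files waiting to be processed?
if any(inbox.list_blobs()):
    # Start the BizTalk VM so it can pick the files up.
    compute.virtual_machines.begin_start(RESOURCE_GROUP, VM_NAME).wait()

    # Wait until BizTalk has processed (and removed) all input files.
    while any(inbox.list_blobs()):
        time.sleep(60)

    # Deallocate the VM so it stops incurring compute charges.
    compute.virtual_machines.begin_deallocate(RESOURCE_GROUP, VM_NAME).wait()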
I was reading a blog post by Microsoft's Brad Calder (link below) about how Azure uses VHD files backed by page blobs. One interesting bit of information reads:
"It is also important to note that when you delete files within the file system used by the VHD, most operating systems do not clear or zero these ranges, so you can still be paying capacity charges within a blob for the data that you deleted via a disk/drive."
I take this to mean that if I attach a 1 TB drive, fill it up, and then delete all the files, I will still be using 1 TB of backing pages because they weren't cleared, even though the drive in the VM will appear empty.
I decided to test this out, and my results were the opposite of what Brad states. I created a Win2012 R2 VM in Azure, attached a 1 GB drive, and wrote some code to see the amount of page blob data I was using. I then copied files onto the drive and again recorded the amount of page blob data in use; this number went up as expected. To my surprise, when I deleted the files, the page blob data in use returned to the original number for an empty drive.
I ran this test multiple times with different size drives and different types of data. Each time my backing page blob data size accurately reflected what was on the drive (i.e. I never saw "ghost" data remaining).
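For reference, a measurement along those lines boils down to summing the blob's committed page ranges; a hedged sketch with the current Python SDK (not the original code; the container and blob names are made up):
# pip install azure-storage-blob
import os
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])
disk = service.get_blob_client(container="vhds", blob="datadisk1.vhd")  # placeholders

# get_page_ranges() returns the ranges that actually hold data (plus any cleared ranges).
valid_ranges, _cleared = disk.get_page_ranges()
used_bytes = sum(r["end"] - r["start"] + 1 for r in valid_ranges)

props = disk.get_blob_properties()
print(f"provisioned: {props.size / 2**30:.2f} GiB, actually backed: {used_bytes / 2**20:.2f} MiB")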
Can anyone shed some light on my results? Is this to be expected? Did something change in Azure? I can't find any information regarding this topic besides Brad's blog post.
Thanks!
Brad Calder's blog: http://blogs.msdn.com/b/windowsazure/archive/2012/06/28/data-series-exploring-windows-azure-drives-disks-and-images.aspx
After posting I managed to discover what's going on. Back in October 2013, Microsoft added TRIM support to Windows Server 2012 VMs (Win2008 is supported with some caveats). When you delete a file in Windows, a TRIM command is now sent, which causes Azure to clear the backing VHD pages.
Here's a post that discusses the addition of TRIM:
http://mvwood.com/blog/trim-support-comes-to-windows-azure-virtual-machines/
Here's a YouTube video of Mark Russinovich talking about how Azure uses TRIM:
https://www.youtube.com/watch?v=5PZ6wFXQ9-4
Is an Azure blob available for download while it is being overwritten with a new version?
From my tests using Cloud Storage Studio, the download is blocked until the overwrite is completed; however, my tests are from the same machine, so I can't be sure this is correct.
If it isn't available during an overwrite, then I presume the solution (to maintain availability) would be to upload under a different blob name and then rename once complete. Does anyone have a better solution than this?
The blob is available during an overwrite. What you see will depend on whether you are using a block blob or a page blob, however. For block blobs, you will download the older version until the final block commit. That final PutBlockList operation atomically updates the blob to the new version. I am not actually sure what happens to a very large blob that you are in the middle of downloading when a PutBlockList atomically updates it. The options are: (a) the request continues with the older blob, (b) the connection is broken, or (c) you start downloading bytes of the new blob. What a fun thing to test!
If you are using page blobs (without a lease), you will read inconsistent data as the page ranges are updated underneath you. Each page-range update is atomic, but the result will look weird unless you lease the blob and keep other writers out (readers can snapshot a leased blob and read from the snapshot to get a consistent state).
I might try to test the block blob update in middle of read scenario to see what happens. However, your core question should be answered: the blob is available.
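To illustrate the lease-plus-snapshot pattern from the paragraph above, a hedged Python sketch (the container and blob names are placeholders):
# pip install azure-storage-blob
import os
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string(os.environ["AZURE_STORAGE_CONNECTION_STRING"])
blob = service.get_blob_client(container="data", blob="bigfile.bin")  # placeholders

# The writer holds a lease so nothing else can modify the blob mid-update.
lease = blob.acquire_lease()  # infinite lease by default
try:
    # A reader snapshots the leased blob and reads the frozen, consistent state.
    snap = blob.create_snapshot(lease=lease)
    frozen = service.get_blob_client("data", "bigfile.bin", snapshot=snap["snapshot"])
    consistent_bytes = frozen.download_blob().readall()  # stable even while the base blob is rewritten
finally:
    lease.release()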