Delete a "folder" in Azure Blob Storage

I would like to delete a folder from a container in my Azure Blob Storage account. The container holds more than 3,000,000 files, and deleting them with Azure Storage Explorer is a very slow process (about 1,000 files every 5 minutes), so I would like to know whether it is possible to delete the folder in one operation.
I am aware there is no real "folder" in Azure Blob Storage and that it is just a virtual path used to address blobs, but that makes batch deletion of a huge number of blobs problematic.

I'd recommend using this PowerShell script, which deletes blobs in batches of 10,000 at a time:
This PowerShell script, designed to run in Azure Automation, deletes a huge number of blobs in a container by processing them in chunks of 10,000 blobs at a time. Once the number of blobs grows beyond a few thousand, the usual method of deleting one blob at a time may simply get suspended without completing the task.
It can be used to delete all blobs (when the retentionDays parameter is supplied as 0), or only those blobs that have not been modified for the last retentionDays days.
Script can be downloaded here: https://gallery.technet.microsoft.com/Delete-large-number-of-97e04976
<#
.Synopsis
Deletes a large number of blobs in a container of a Storage account, which are older than x days
.DESCRIPTION
This Runbook deletes a huge number of blobs in a container by processing them in chunks of 10,000 blobs at a time. Once the number of blobs grows beyond a few thousand, the usual method of deleting one blob at a time may simply get suspended without completing the task.
.PARAMETER CredentialAssetName
The Credential asset which contains the credential for connecting to subscription
.PARAMETER Subscription
Name of the subscription attached to the credential in CredentialAssetName
.PARAMETER container
Container name from which the blobs are to be deleted
.PARAMETER AzStorageName
The storage account to which the container belongs
.PARAMETER retentionDays
Retention days. Blobs older than this many days will be deleted. To delete all blobs, use 0
.NOTES
AUTHOR: Anurag Singh, MSFT
LASTEDIT: March 30, 2016
#>
function delete-blobs
{
    param (
        [Parameter(Mandatory=$true)]
        [String] $CredentialAssetName,
        [Parameter(Mandatory=$true)]
        [String] $Subscription,
        [Parameter(Mandatory=$true)]
        [String] $container,
        [Parameter(Mandatory=$true)]
        [String] $AzStorageName,
        [Parameter(Mandatory=$true)]
        [Int] $retentionDays
    )

    # Authenticate using the Automation credential asset
    $Cred = Get-AutomationPSCredential -Name $CredentialAssetName
    $Account = Add-AzureAccount -Credential $Cred
    if (!$Account)
    {
        Write-Output "Connection to Azure Subscription using the Credential asset failed..."
        break
    }
    Set-AzureSubscription -SubscriptionName $Subscription

    $AzStorageKey = (Get-AzureStorageKey -StorageAccountName $AzStorageName).Primary
    $context = New-AzureStorageContext -StorageAccountName $AzStorageName -StorageAccountKey $AzStorageKey

    $MaxReturn = 10000
    $Total = 0
    $Token = $null
    $TotalDel = 0
    $dateLimit = (Get-Date).AddDays(-$retentionDays)

    try
    {
        do
        {
            Write-Output "Retrieving blobs"
            $Blobs = Get-AzureStorageBlob -Container $container -Context $context -MaxCount $MaxReturn -ContinuationToken $Token
            # Check for an empty page before touching the last element
            if ($Blobs.Count -le 0)
            {
                break
            }
            $Total += $Blobs.Count
            Write-Output "$Total total blobs retrieved"
            # The continuation token is carried on the last blob of the page
            $Token = $Blobs[$Blobs.Count - 1].ContinuationToken

            $blobstodelete = $Blobs | Where-Object LastModified -LE $dateLimit
            if ($blobstodelete.Count -le 0)
            {
                continue
            }
            $TotalDel += $blobstodelete.Count
            $blobstodelete | Remove-AzureStorageBlob -Force
            Write-Output "$TotalDel blobs deleted"
        }
        while ($Token -ne $null)
    }
    catch
    {
        Write-Output $_
    }
}

rclone is a great tool for interacting with cloud storage.
Try rclone purge, which deletes the given path together with all of its contents.
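As a sketch (this assumes you have already configured an Azure Blob remote with rclone config; the remote name azblob and the container/folder names here are placeholders):

```
# Preview what would be removed, then purge the virtual folder and everything under it
rclone purge --dry-run azblob:mycontainer/myfolder
rclone purge azblob:mycontainer/myfolder
```

The --dry-run pass is worth keeping: purge is not prompted and removes the whole path in one go.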

As others have already mentioned, you cannot delete a "folder" in Azure Blob Storage as such. You have to use a workaround, such as listing all blobs with a given prefix and then deleting each of them in a loop.
In PowerShell, you can combine these two steps into one line by running the following command (using the AzureRM module):
Get-AzureStorageBlob -Context $context -Container $container -Blob 'FolderName*' | Remove-AzureStorageBlob -WhatIf
The -WhatIf switch prints out the exact actions the command is going to take. What I observed is that it prints What if: Performing the operation "Remove blob" on target ... for each file in the "folder", which suggests the blobs are still being deleted one at a time.
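If you are on the newer Az module, an equivalent one-liner (a sketch; the container name and prefix are placeholders) can use the -Prefix parameter, which filters on the server side instead of relying on a client-side wildcard:

```powershell
# List every blob under the virtual folder and pipe each one to deletion
Get-AzStorageBlob -Context $context -Container $container -Prefix 'FolderName/' |
    Remove-AzStorageBlob -WhatIf   # drop -WhatIf to actually delete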

Related

Azure storage blob list with continuation token from specific point

I'm working on a script to list the blobs in a container which has a ridiculous number of blobs (over 30 million!).
I'm using the code from https://learn.microsoft.com/en-us/powershell/module/az.storage/get-azstorageblob?view=azps-3.8.0, which uses a continuation token to page through the blobs 10,000 at a time.
$MaxReturn = 10000
$ContainerName = "abc"
$Total = 0
$Token = $null
do
{
    $Blobs = Get-AzStorageBlob -Container $ContainerName -MaxCount $MaxReturn -ContinuationToken $Token
    $Total += $Blobs.Count
    if ($Blobs.Length -le 0) { break }
    $Token = $Blobs[$Blobs.Count - 1].ContinuationToken
}
while ($Token -ne $null)
Write-Output "Total $Total blobs in container $ContainerName"
The problem is that this always ends up hanging or getting stuck and never completes.
It usually gets about halfway, and then I have to restart it, which kicks off the entire process all over again.
However, I already have the data from the first run. Is there a way to make it start from a specific point rather than from the beginning?
Let's say I already have the records I need for the first 3 million blobs. How do I tell it to start from 3 million instead of 0?
Or am I misunderstanding how the process works?
Just a summary of the issue for others who run into the same thing.
How do I tell it to start from 3 million instead of 0?
Since the data in the container is static, you can store the latest ContinuationToken. The next time, run the script with that ContinuationToken to fetch the remaining blobs.
For more details you could refer to this article.
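As a sketch of that idea: persist the NextMarker of the last token and rebuild a token from it on the next run. Note the checkpoint file path is a placeholder, and the token type shown (Microsoft.Azure.Storage.Blob.BlobContinuationToken) is an assumption that depends on your Az.Storage version; check $Token.GetType() in your environment first.

```powershell
$markerFile = "C:\temp\lastMarker.txt"
$Token = $null
# Resume: rebuild a continuation token from a saved NextMarker, if one exists
if (Test-Path $markerFile) {
    $Token = New-Object -TypeName Microsoft.Azure.Storage.Blob.BlobContinuationToken
    $Token.NextMarker = Get-Content $markerFile
}
$Total = 0
do
{
    $Blobs = Get-AzStorageBlob -Container $ContainerName -MaxCount 10000 -ContinuationToken $Token
    if ($Blobs.Count -le 0) { break }
    $Total += $Blobs.Count
    $Token = $Blobs[$Blobs.Count - 1].ContinuationToken
    # Checkpoint progress so an interrupted run can resume from here
    if ($Token -ne $null) { $Token.NextMarker | Set-Content $markerFile }
}
while ($Token -ne $null)
```

This only works because the container's contents are static; if blobs are being added or removed, a saved marker may skip or repeat entries.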

Get-AzureRmStorageAccount, Dig into Container files and get "Modified" property

I need a PS script to get all Storage Accounts whose last modified date is 6 months ago.
I couldn't find any cmdlet or function that provides this information. I thought it would be enough to sort by 'LastModifiedTime', but when I dug deeper I saw that I have a lot of new files inside containers with a "Modified" property. The question is: how can I access these files with PowerShell? Any cmdlet, function, etc.?
Here is what I used to get SA before:
function check_stores {
    $stores = Get-AzureRmResource -ODataQuery "`$filter=resourcetype eq 'Microsoft.Storage/storageAccounts'"
    $x = (Get-Date).AddDays(-180)
    foreach ($store in $stores) {
        $storename = $store.Name
        $dates = (Get-AzureRmStorageContainer -ResourceGroupName $store.ResourceGroupName -StorageAccountName $store.Name).LastModifiedTime
        if (!($dates -ge $x)) {
            "Storage Account Name: $storename"
        }
    }
}
check_stores
I'm not sure whether you just want to get the blobs whose LastModifiedTime (LMT) falls within the last 180 days.
If so, you don't need to check the container LMT, since it is unrelated to the blobs' last modified times (the container LMT only reflects modifications to container properties).
The following script works with the pipeline. If you don't need to check the container LMT, just remove that check:
$x = (Get-Date).AddDays(-180)

# Get all storage accounts of the current subscription
$accounts = Get-AzStorageAccount
foreach ($a in $accounts)
{
    # Get the containers of this storage account whose LMT is within 180 days
    $containers = $a | Get-AzStorageContainer | ? { $_.LastModified -ge $x }
    # If you don't need the container LMT check, use: $containers = $a | Get-AzStorageContainer

    # Get the blobs of those containers whose LMT is within 180 days
    $blobs = $containers | Get-AzStorageBlob | ? { $_.LastModified -ge $x }

    # Add code here to handle the blobs
    echo $blobs
}

Rename azure blob with specific condition

I have a Storage account with a Test container, and this container contains 24 folders, each of which contains n blobs. What I want is to rename all the folders and the blobs beneath them in Pic 1 to match Pic 2 exactly.
I've used this rename script:
How to rename a blob file using powershell
But the above script is only practical for one or two blobs, and it doesn't rename folders (that part can be solved by copying the blob to a new location), whereas in my case I have n blobs. I tried to tackle that with:
$textinfo = (Get-Culture).TextInfo
$text2=($textinfo.ToTitleCase($text)).Split(" ")
But this doesn't help either, since it renames exchangerate to Exchangerate but not to ExchangeRate (which is the real requirement).
I thought of this:
foreach ($file in $text)
{
    Get-AzStorageBlob -Container 'test' -Context $storageContext -Blob $file |
        Rename-AzStorageBlob -NewName ''
}
But I can't work out how to make the -NewName parameter dynamic. How should I approach this?
If you want to rename a blob from exchangerate to ExchangeRate, there is no automatic way to do it.
As Martin mentioned, you'd better build a mapping where you can look up exchangerate -> ExchangeRate (and likewise for every other name containing more than one word).
Otherwise, your code has no way of knowing whether a letter in the new blob name should be uppercase or not.
I modified my code to something like this:
## Lookup: hash table with lowercase names as keys and the correctly cased names as values
$hash = @{...}
## Renaming the files
foreach ($file in $text)
{
    if ($hash.ContainsKey($file))
    {
        $newName = $hash[$file]
        Write-Host 'Old name' $file
        Write-Host 'New name' $newName
        Get-AzStorageBlob -Container 'test' -Context $storageContext -Blob $file |
            Rename-AzStorageBlob -NewName $newName
    }
}
So this is working for me.

Is it possible to use Azure Automation Runbook to delete another Runbook output (an Azure File share snapshot)?

I want to use a runbook to delete another runbook's output (an Azure File Share snapshot).
Is it possible? If you know anything about this, please share it here.
Runbook 1: Create an Azure File share snapshot
$context = New-AzureStorageContext -StorageAccountName "<account-name>" -StorageAccountKey "<account-key>"
$share = Get-AzureStorageShare -Context $context -Name "sharefile"
$snapshot = $share.Snapshot()
Runbook 2: Delete the first runbook's output. The problem is that this deletes all snapshots rather than just the one created by the first runbook.
$allsnapshots = Get-AzureStorageShare -Context $context | Where-Object { $_.Name -eq "sharefile" -and $_.IsSnapshot -eq $true }
foreach ($snapshot in $allsnapshots) {
    if ($snapshot.SnapshotTime -lt (Get-Date).AddHours()) {
        $snapshot.Delete()
    }
}
The sample code is below. I tested it in a runbook and it works well (it creates a snapshot, then deletes it after 3 minutes); the other snapshots are unaffected.
The code in my PowerShell runbook:
param(
    [string]$username,
    [string]$password,
    [string]$filesharename
)

$context = New-AzureStorageContext -StorageAccountName $username -StorageAccountKey $password
$share = Get-AzureStorageShare -Context $context -Name $filesharename
$s = $share.Snapshot()

# Get the snapshot name, which is always a UTC-time-formatted value
$s2 = $s.SnapshotQualifiedStorageUri.PrimaryUri.ToString()
# $snapshottime is actually equal to the snapshot name
$snapshottime = $s2.Substring($s2.IndexOf('=') + 1)
Write-Output "create a snapshot"
Write-Output $snapshottime

# Wait 180 seconds, then delete the snapshot
Start-Sleep -s 180
Write-Output "delete the snapshot"
$snap = Get-AzureStorageShare -Context $context -SnapshotTime $snapshottime -Name $filesharename
$snap.Delete()
Write-Output "deleted successfully after 3 minutes"
After it runs, you can see in the Azure portal that the snapshot has been created. Once the runbook completes, the specified snapshot is deleted while the others remain (you may need to open a new page to see the change because of caching).

How to get size of Azure Container in PowerShell

Similar to the question How to get size of Azure CloudBlobContainer: how can one get the size of an Azure container in PowerShell? I can see a suggested script at https://gallery.technet.microsoft.com/scriptcenter/Get-Billable-Size-of-32175802 but want to know if there is a simpler way to do it in PowerShell.
With Azure PowerShell, you can list all blobs in the container with Get-AzureStorageBlob, using the Container and Context parameters, like so:
$ctx = New-AzureStorageContext -StorageAccountName youraccountname -storageAccountKey youraccountkey
$blobs = Get-AzureStorageBlob -Container containername -Context $ctx
The output of Get-AzureStorageBlob is an array of AzureStorageBlob objects, each of which has an ICloudBlob property; you can read each blob's length from its Properties, then sum the lengths of all blobs to get the total content length of the container.
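Putting those steps together (a minimal sketch; the account name, key, and container name are placeholders):

```powershell
$ctx = New-AzureStorageContext -StorageAccountName "<account-name>" -StorageAccountKey "<account-key>"
# Sum the length of every blob in the container
$length = 0
Get-AzureStorageBlob -Container "<container-name>" -Context $ctx |
    ForEach-Object { $length += $_.ICloudBlob.Properties.Length }
Write-Output "Container size: $length bytes"
```

Note this enumerates every blob client-side, so on very large containers it will take a while.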
The following PowerShell script is a direct translation of the C# code in the accepted answer to the question How to get size of Azure CloudBlobContainer. Hope this suits your needs.
Login-AzureRmAccount
$accountName = "<your storage account name>"
$keyValue = "<your storage account key>"
$containerName = "<your container name>"
$storageCred = New-Object Microsoft.WindowsAzure.Storage.Auth.StorageCredentials ($accountName, $keyValue)
$storageAccount = New-Object Microsoft.WindowsAzure.Storage.CloudStorageAccount ($storageCred, $true)
$container = $storageAccount.CreateCloudBlobClient().GetContainerReference($containerName)
$length = 0
$blobs = $container.ListBlobs($null, $true, [Microsoft.WindowsAzure.Storage.Blob.BlobListingDetails]::None, $null, $null)
$blobs | ForEach-Object {$length = $length + $_.Properties.Length}
$length
Note: the leading Login-AzureRmAccount command loads the necessary .dll for you. If you know the path of Microsoft.WindowsAzure.Storage.dll, you can replace it with [Reflection.Assembly]::LoadFile("$StorageLibraryPath") | Out-Null. The path is usually something like "C:\Program Files\Microsoft SDKs\Azure\.NET SDK\v2.7\ToolsRef\Microsoft.WindowsAzure.Storage.dll".
Here's the solution I hammered through today. The examples above didn't give me what I wanted, which was (1) a byte total of all blobs in a container and (2) a list of each blob with its path and size, so that the results can be compared to a du -b on the Linux origin.
Login-AzureRmAccount
$ResourceGroupName = ""
$StorageAccountName = ""
$StorageAccountKey = ""
$ContainerName = ""
New-AzureStorageContext -StorageAccountName $StorageAccountName -StorageAccountKey $StorageAccountKey
# Don't NEED the resource group but, without it, the screen fills with red as it searches each RG...
$size = 0
$blobs = Get-AzureRmStorageAccount -ResourceGroupName $ResourceGroupName -Name $StorageAccountName -ErrorAction Ignore | Get-AzureStorageBlob -Container $ContainerName
foreach ($blob in $blobs) { $size = $size + $blob.Length }
Write-Host "The container is $size bytes."
$properties = @{Expression={$_.Name};Label="Name";width=180}, @{Expression={$_.Length};Label="Bytes";width=80}
$blobs | ft $properties | Out-String -Width 800 | Out-File -Encoding ASCII AzureBlob_files.txt
I then moved the file to Linux, did some massaging of it together with find output, and created a list of files to feed into blobxfer. That was a solution to a different problem, but perhaps it suits your needs as well.