How to move certain files from one storage blob container to another? - azure

I have been trying to find the best way to do the following:
I need to move a large number of JSON files, named in the format "yyyymmdd-hhmmss.json", from one blob container to another container in a different storage account. These files are nested inside several different folders.
I only have to move the files that were created (or are named) before a certain date, for example: move all files created/named before 01/01/2022.
What would be the best way to do so quickly? This is a one-time migration so it won't be recurring.

To copy files in bulk from a Source to a Destination Blob Container:
Connect-AzAccount
Get-AzSubscription
Select-AzSubscription -Subscription "My Subscription"
$srcResourceGroupName = "RG-DEMO-WE"
$srcStorageAccountName = "storageaccountdemowe"
$srcContainer = "sourcefolder"
$blobName = "dataDisk.vhd"
$destResourceGroupName = "RG-TRY-ME"
$destStorageAccountName = "storageaccounttryme"
$destContainer = "destinationfolder"
# Set Source & Destination Storage Keys and Context
$srcStorageKey = Get-AzStorageAccountKey -Name $srcStorageAccountName -ResourceGroupName $srcResourceGroupName
$destStorageKey = Get-AzStorageAccountKey -Name $destStorageAccountName -ResourceGroupName $destResourceGroupName
$srcContext = New-AzStorageContext -StorageAccountName $srcStorageAccountName -StorageAccountKey $srcStorageKey.Value[0]
$destContext = New-AzStorageContext -StorageAccountName $destStorageAccountName -StorageAccountKey $destStorageKey.Value[0]
# Optional step
New-AzStorageContainer -Name $destContainer -Context $destContext
# The copy operation
$copyOperation = Start-AzStorageBlobCopy -SrcBlob $blobName `
    -SrcContainer $srcContainer `
    -Context $srcContext `
    -DestBlob $blobName `
    -DestContainer $destContainer `
    -DestContext $destContext
REF: https://www.jorgebernhardt.com/copy-blob-powershell/
Since you need to copy individual files based on date, instead of Start-AzStorageBlobCopy the best option is to follow the Microsoft documentation for the asynchronous az storage file copy command:
az storage file copy start --destination-path
--destination-share
[--account-key]
[--account-name]
[--connection-string]
[--file-endpoint]
[--file-snapshot]
[--metadata]
[--sas-token]
[--source-account-key]
[--source-account-name]
[--source-blob]
[--source-container]
[--source-path]
[--source-sas]
[--source-share]
[--source-snapshot]
[--source-uri]
[--timeout]
REF: https://learn.microsoft.com/en-us/cli/azure/storage/file/copy?view=azure-cli-latest
I'll leave the code to loop through the files based on date to the reader; for local files it would be something like:
Get-ChildItem | Where-Object {$_.LastWriteTime -lt (Get-Date).AddDays(-30)}
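For the blob scenario in the question, a rough sketch of that loop (untested; it reuses the $srcContext/$destContext, $srcContainer and $destContainer variables from above, and assumes the blob names end in "yyyymmdd-hhmmss.json") could look like this:
$cutoff = Get-Date -Year 2022 -Month 1 -Day 1
# Enumerate every blob in the source container (the virtual folder structure is part of the blob name)
Get-AzStorageBlob -Container $srcContainer -Context $srcContext | ForEach-Object {
    # Take the file-name part of the blob and strip the ".json" extension
    $leaf = ($_.Name -split '/')[-1] -replace '\.json$', ''
    # Parse "yyyymmdd-hhmmss"; blobs that don't match the pattern are skipped
    $parsed = [datetime]::MinValue
    if ([datetime]::TryParseExact($leaf, 'yyyyMMdd-HHmmss', $null, 'None', [ref]$parsed) -and $parsed -lt $cutoff) {
        Start-AzStorageBlobCopy -SrcBlob $_.Name -SrcContainer $srcContainer -Context $srcContext `
            -DestBlob $_.Name -DestContainer $destContainer -DestContext $destContext
    }
}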

You can iterate over each blob in the source container (regardless of the folder structure, since blob "folders" are just virtual prefixes), parse the blob name to match the "yyyymmdd-hhmmss" pattern, and extract the date. If it is older than the cutoff date you choose as a condition, you can easily copy the blob from your source to the destination container, and finally delete the blob from the source container. I'm not sure about PowerShell, but it's easy with any supported programming language.
Here's an example of doing this with .NET:
// Requires the Azure.Storage.Blobs NuGet package
using System.Globalization;
using System.Linq;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Specialized;

BlobContainerClient sourceContainerClient = new BlobContainerClient("<source-connection-string>", "<source-container-name>");
BlobContainerClient destinationContainerClient = new BlobContainerClient("<destination-connection-string>", "<destination-container-name>");
var blobList = sourceContainerClient.GetBlobs();
DateTime givenDateTime = DateTime.Now;
foreach (var blobItem in blobList)
{
    try
    {
        var sourceBlob = sourceContainerClient.GetBlobClient(blobItem.Name);
        // Take the file name (last URI segment) without its extension, e.g. "20211231-235959"
        string blobName = sourceBlob.Uri.Segments.Last().Substring(0, sourceBlob.Uri.Segments.Last().IndexOf('.'));
        // "HH" (24-hour clock) is needed here; "hh" would fail to parse hours greater than 12
        if (DateTime.Compare(DateTime.ParseExact(blobName, "yyyyMMdd-HHmmss", CultureInfo.InvariantCulture), givenDateTime) < 0)
        {
            // Keep the full blob name so the virtual folder path and extension are preserved
            var destinationBlob = destinationContainerClient.GetBlockBlobClient(blobItem.Name);
            // Server-side copy; the source must be readable by the destination account (e.g. public or via SAS)
            var copyOperation = destinationBlob.StartCopyFromUri(sourceBlob.Uri);
            copyOperation.WaitForCompletion();
            // Only delete the source once the copy has finished
            sourceBlob.Delete();
        }
    }
    catch { }
}

Related

Public Containers in Storage Accounts

First post. My PowerShell knowledge isn't great. I am trying to list all containers that have the "PublicAccess" attribute set to On. I am trying to use the script provided by MS.
$rgName = "<Resource Group name>"
$accountName = "<Storage Account Name>"
$storageAccount = Get-AzStorageAccount -ResourceGroupName $rgName -Name $accountName
$ctx = $storageAccount.Context
Get-AzStorageContainer -Context $ctx | Select Name, PublicAccess
However, I need to do this on a large number of storage accounts. In the past I have used "foreach($item in $list)" to pass things into a small script, but never for multiple lists. Can anyone help?
Based on your extended requirements in the comments, this should work. (I've not tested it and haven't handled any potential errors related to permissions or anything else.)
# Create a collection for the items found.
$StorageAccounts = [System.Collections.Generic.List[System.Object]]::new()
# Loop through the available Azure contexts.
foreach ($Context in (Get-AzContext -ListAvailable)) {
    # Set each subscription in turn, voiding the output.
    [System.Void](Set-AzContext -Context $Context)
    # Create an object with the container name, public access values and the name of the storage account/subscription.
    $StorageAccountInfo = Get-AzStorageAccount | Get-AzStorageContainer | Select-Object Name, PublicAccess, @{l = "StorageAccountName"; e = { $_.Context.StorageAccountName } }, @{l = "Subscription"; e = { $Context.Subscription.Name } }
    # If there is data found, add it to the collection.
    if ($null -ne $StorageAccountInfo) {
        $StorageAccounts.AddRange(@($StorageAccountInfo))
    }
}
# Export the collected information to Csv.
$StorageAccounts | Export-Csv -Path .\myStorageAccounts.csv -NoClobber -Encoding utf8
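If you only care about containers where public access is actually switched on (the PublicAccess property is typically Off, Blob, or Container), a quick follow-up filter on the collected data could look like this (untested):
$StorageAccounts |
    Where-Object { $_.PublicAccess -and $_.PublicAccess -ne 'Off' } |
    Export-Csv -Path .\publicContainers.csv -NoClobber -Encoding utf8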

Upload multiple folders from local storage to Azure as new containers with folder contents

We have Azure Blob Storage Accounts with 100s of containers. The file structure is something like below:
container_01
|
--somemedia.jpg
--anothermedia.jpg
container_02
|
--secondcontainersmedia.jpg
--andSoOn
--AndSoOnAndSoOn
My client wants to download all of the containers to local storage so that, if necessary, they can be re-uploaded to Azure. After doing some research I found this blog post. Updating the script from there to suit my needs (just switching from AzureRM to Az and using my own connection string and local path), I came up with the following script for downloading the files.
$destination_path = 'C:\Storage Dump Test'
$connection_string = '[Insert Connection String]'
$storage_account = New-AzStorageContext -ConnectionString $connection_string
$containers = Get-AzStorageContainer -Context $storage_account
Write-Host 'Starting Storage Dump...'
foreach ($container in $containers)
{
    Write-Host -NoNewline "Processing: $($container.Name)..."
    $container_path = $destination_path + '\' + $container.Name
    if (!(Test-Path -Path $container_path))
    {
        New-Item -ItemType directory -Path $container_path
    }
    $blobs = Get-AzStorageBlob -Container $container.Name -Context $storage_account
    Write-Host -NoNewline ' Downloading files...'
    foreach ($blob in $blobs)
    {
        $fileNameCheck = $container_path + '\' + $blob.Name
        if (!(Test-Path $fileNameCheck))
        {
            Get-AzStorageBlobContent `
                -Container $container.Name -Blob $blob.Name -Destination $container_path `
                -Context $storage_account
        }
    }
    Write-Host ' Done.'
}
Write-Host 'Download complete.'
So now I have a directory on my local storage with hundreds of folders containing media items. I need to create a PS script (or find some other way) to basically do the opposite: take all the folders in that directory, create containers using the folder names, and upload the items within each folder to the matching container.
How should I start going about this?
You'd have a lot more success, and much quicker, using azcopy instead of the Azure cmdlets. To copy:
azcopy copy '<local-file-path>' 'https://<storage-account-name>.<blob|dfs>.core.windows.net/<container-name>/<blob-name>'
It can also create containers:
azcopy make 'https://mystorageaccount.blob.core.windows.net/mycontainer'
azcopy can also transfer an entire directory or container (in either direction) without you having to specify each file. Use --recursive
See: https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10
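To go the other way (the upload half of the question), a rough PowerShell wrapper around azcopy might look like the sketch below. This is untested; it assumes azcopy is on the PATH, the storage account URL is a placeholder you replace with your own, and a SAS token with create/write permissions is appended to each URL:
$localRoot = 'C:\Storage Dump Test'
$accountUrl = 'https://mystorageaccount.blob.core.windows.net'   # placeholder account
$sas = '?<your-sas-token>'                                       # placeholder SAS token
foreach ($folder in Get-ChildItem -Path $localRoot -Directory) {
    # Container names must be lowercase; reuse the folder name as the container name
    $container = $folder.Name.ToLower()
    # Create the container, then upload the folder contents recursively
    azcopy make "$accountUrl/$container$sas"
    azcopy copy "$($folder.FullName)\*" "$accountUrl/$container$sas" --recursive
}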

How to get list of Azure container files?

I'm working on a PS script to list all storage accounts that contain files with a modified date older than X.
I'm able to list all storage account containers, which is not a big deal, but I'm not sure how to go further and list all files inside a particular container.
$storageAccCtx = (Get-AzStorageAccount -Name acc_name -ResourceGroupName acc_rg).Context
Get-AzStorageContainer -Context $storageAccCtx
I couldn't find any cmdlet for this.
Could anyone, please, advise what should I use next? Thanks.
Once you have the storage context, you can use the Azure Storage management cmdlet below to list all block blobs.
Get-AzStorageBlob -Container containerName -Context $storageAccCtx
For the full list of Azure Storage management cmdlets, please see this documentation: https://learn.microsoft.com/en-us/powershell/module/az.storage/?view=azps-4.8.0
You can use Get-AzStorageBlob to list the blobs in a storage container; the cmdlet is documented here. In a script you could use it as follows to return all the blobs older than a particular date:
$CutOffDate = Get-Date -Year 2020 -Month 10 -Day 19
$OldBlobs = @()
$StorageAccounts = Get-AzStorageAccount
foreach ($StorageAccount in $StorageAccounts) {
    $Containers = Get-AzStorageContainer -Context $StorageAccount.Context
    foreach ($Container in $Containers) {
        $ContainerBlobs = Get-AzStorageBlob -Container $Container.Name -Context $StorageAccount.Context
        $OldBlobs += $ContainerBlobs | Where-Object { $_.LastModified -lt $CutOffDate }
    }
}
$OldBlobs
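If the end goal is to know which storage accounts contain the old blobs (rather than just the blob names), one option (untested sketch) is to tag each blob with its account and container via calculated properties inside the inner loop, and de-duplicate afterwards:
# Replace the "$OldBlobs +=" line inside the inner foreach with:
$OldBlobs += $ContainerBlobs | Where-Object { $_.LastModified -lt $CutOffDate } |
    Select-Object Name, LastModified, @{l = 'StorageAccount'; e = { $StorageAccount.StorageAccountName }}, @{l = 'Container'; e = { $Container.Name }}
# The affected accounts are then:
$OldBlobs | Select-Object -ExpandProperty StorageAccount -Unique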

Foreach loop for listing unmanaged disks in Azure

It's very easy to list Azure managed disks in PS, but unmanaged ones are very tricky to list, as they're not objects from Azure's point of view. I tried to write a foreach loop to list all unmanaged disks (i.e. *.vhd files) for each storage account. This is the code I wrote:
$StorageAccounts = Get-AzureRmStorageAccount
$sa = $StorageAccounts | ForEach-Object {
    #Get the Management key for the storage account
    $key1 = (Get-AzureRmStorageAccountKey -ResourceGroupName $_.ResourceGroupName -Name $_.StorageAccountName)[0].Value
    #Get the Storage Context to access the Storage Container
    $storageContext = New-AzureStorageContext -StorageAccountName $_.StorageAccountName -StorageAccountKey $key1
    #Get the Storage Container in the Variable
    $storageContainer = Get-AzureStorageContainer -Context $storageContext
    $blob = Get-AzureStorageBlob -Container $storageContainer.Name -Context $storageContext
    [PSCustomObject]@{
        "Name" = $blob.Name
        "Length" = $blob.Length
        "Storage Account Name" = $_.StorageAccountName
    }
}
I want the loop to fetch all the VHDs for each storage account and parse them into a PSCustomObject, listing all VHDs from all storage accounts, but I get an error:
Get-AzureStorageBlob : Cannot validate argument on parameter
'Container'. The argument is null or empty. Provide an argument that
is not null or empty, and then try the command again. At line:13
char:41
Get-AzureStorageBlob : Cannot convert 'System.Object[]' to the type
'System.String' required by parameter 'Container'. Specified method is not supported.
At line:13 char:41
Why is the loop not passing data to $storageContainer in line 11? I can see what's inside the other two vars, $key1 and $storageContext.
You can rewrite your script in this fashion:
$StorageAccounts = Get-AzureRmStorageAccount
$StorageAccounts.foreach{
    $ctx = $_.Context
    $containers = Get-AzureStorageContainer -Context $ctx
    $containers.foreach{
        $blobs = Get-AzureStorageBlob -Container $_.Name -Context $ctx
        $blobs.foreach{
            do_something
        }
    }
}
You don't need to get keys to construct the context, because the storage account variable already contains the context. Then you just need to iterate the containers and blobs.
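For the specific goal of listing unmanaged disks, a rough sketch of what do_something could be (untested; it filters on the .vhd extension and emits one object per blob) is:
$StorageAccounts = Get-AzureRmStorageAccount
$vhds = $StorageAccounts.foreach{
    $ctx = $_.Context
    $accountName = $_.StorageAccountName
    (Get-AzureStorageContainer -Context $ctx).foreach{
        (Get-AzureStorageBlob -Container $_.Name -Context $ctx).foreach{
            if ($_.Name -like '*.vhd') {
                [PSCustomObject]@{
                    'Name'                 = $_.Name
                    'Length'               = $_.Length
                    'Storage Account Name' = $accountName
                }
            }
        }
    }
}
$vhds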

How to set Cache headers for images in Azure Blob Storage (please note I am not using Azure CDN)

Is there a way to set cache headers for images in Azure Blob Storage? Please suggest one - we have around 2M images and want to add max-age. I tried PowerShell solutions, but they fail with a "Properties does not exist" / null reference error...
Is there a workaround to set this up?
Please advise, thank you!
Edited:
PowerShell script:
#connection info
$StorageName="storagename given here"
$StorageKey="Key Replaced here"
$ContainerName="container name given here"
$BlobUri="added correct URI here"+$ContainerName
#get all the blobs under your container
$StorageCtx = New-AzureStorageContext -StorageAccountName $StorageName -StorageAccountKey $StorageKey
$blobs = Get-AzureStorageBlob -Container $ContainerName -Context $StorageCtx
#create CloudBlobClient
Add-Type -Path "C:\Program Files\Microsoft SDKs\Azure\.NET SDK\v2.9\ref\Microsoft.WindowsAzure.StorageClient.dll"
$storageCredentials = New-Object Microsoft.WindowsAzure.StorageCredentialsAccountAndKey -ArgumentList $StorageName,$StorageKey
$blobClient = New-Object Microsoft.WindowsAzure.StorageClient.CloudBlobClient($BlobUri,$storageCredentials)
#set Properties and Metadata
$cacheControlValue = "public, max-age=60480"
foreach ($blob in $blobs)
{
    $blobRef = $blobClient.GetBlobReference($blob.Name)
    #set Properties
    $blobRef.Properties.CacheControl = $cacheControlValue
    $blobRef.SetProperties()
}
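For reference, a shorter route with the current Az.Storage module (untested sketch; it assumes the blob objects returned by Get-AzStorageBlob expose the underlying ICloudBlob object, which avoids loading the old StorageClient DLL) would be:
#reuses $StorageName, $StorageKey and $ContainerName from the script above
$ctx = New-AzStorageContext -StorageAccountName $StorageName -StorageAccountKey $StorageKey
$cacheControlValue = "public, max-age=60480"
foreach ($blob in (Get-AzStorageBlob -Container $ContainerName -Context $ctx))
{
    #set the CacheControl property on the underlying blob object and persist it
    $blob.ICloudBlob.Properties.CacheControl = $cacheControlValue
    $blob.ICloudBlob.SetProperties()
}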
