Iterate the files in an ADLS Gen2 Azure Data Lake directory given a SAS URL

I'd like to download the files from an ADLS Gen2 storage blob directory. I have only a SAS URL to that directory, and I would like to recursively download all the files in it.
It is very clear how to do this given the storage credentials, and there are many examples that show how, but I couldn't find any that use a SAS URL.
Any clues or documentation links would be much appreciated!

I reproduced this in my environment and got the expected results with the code below, which I adapted from Roger Zander's blog:
function DownloadBlob {
    param (
        [Parameter(Mandatory)]
        [string]$URL,
        [string]$Path = (Get-Location)
    )

    # Split the SAS URL into the container URI and the SAS token
    $uri = $URL.Split('?')[0]
    $sas = $URL.Split('?')[1]

    # List all blobs in the container through the Blob REST API
    $newurl = $uri + "?restype=container&comp=list&" + $sas
    $body = Invoke-RestMethod -Uri $newurl

    # The response is prefixed with a BOM; strip everything before the first '<'
    $xml = [xml]$body.Substring($body.IndexOf('<'))
    $files = $xml.ChildNodes.Blobs.Blob.Name

    # Recreate each blob's folder structure locally, then download it
    $files | ForEach-Object {
        $_
        New-Item (Join-Path $Path (Split-Path $_)) -ItemType Directory -ErrorAction SilentlyContinue | Out-Null
        (New-Object System.Net.WebClient).DownloadFile($uri + "/" + $_ + "?" + $sas, (Join-Path $Path $_))
    }
}
Then call the DownloadBlob function and pass it the SAS URL.
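For example, with a placeholder SAS URL (substitute your own account, container, and token):
DownloadBlob -URL "https://<account>.blob.core.windows.net/<container>?<sas-token>" -Path "C:\Downloads"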
Output: the files are downloaded to the local machine, preserving the blob folder structure.

Related

Upload multiple folders from local storage to Azure as new containers with folder contents

We have Azure Blob Storage Accounts with 100s of containers. The file structure is something like below:
container_01
|
--somemedia.jpg
--anothermedia.jpg
container_02
|
--secondcontainersmedia.jpg
--andSoOn
--AndSoOnAndSoOn
My client wants to download all of the containers to local storage so that, if necessary, they can be re-uploaded to Azure. After doing some research I found this blog post. Updating the script from there to suit my needs (just moving from AzureRM to Az and using my own connection string and local path), I came up with the following script for downloading the files.
$destination_path = 'C:\Storage Dump Test'
$connection_string = '[Insert Connection String]'
$storage_account = New-AzStorageContext -ConnectionString $connection_string
$containers = Get-AzStorageContainer -Context $storage_account
Write-Host 'Starting Storage Dump...'
foreach ($container in $containers)
{
    Write-Host -NoNewline "Processing: $($container.Name)..."
    $container_path = $destination_path + '\' + $container.Name
    if (!(Test-Path -Path $container_path))
    {
        New-Item -ItemType Directory -Path $container_path
    }
    $blobs = Get-AzStorageBlob -Container $container.Name -Context $storage_account
    Write-Host -NoNewline ' Downloading files...'
    foreach ($blob in $blobs)
    {
        $fileNameCheck = $container_path + '\' + $blob.Name
        if (!(Test-Path $fileNameCheck))
        {
            Get-AzStorageBlobContent `
                -Container $container.Name -Blob $blob.Name -Destination $container_path `
                -Context $storage_account
        }
    }
    Write-Host ' Done.'
}
Write-Host 'Download complete.'
So now I have a directory in my local storage with hundreds of folders containing media items. I need to create a PS script (or find some other way) to do basically the opposite: take all the folders in that directory, create containers using the names of the folders, and upload the items within each folder to the appropriate container.
How should I start going about this?
You'd have a lot more success, quicker, using azcopy instead of working with the azure cmdlets. To copy:
azcopy copy '<local-file-path>' 'https://<storage-account-name>.<blob| dfs>.core.windows.net/<container-name>/<blob-name>'
It can also create containers:
azcopy make 'https://mystorageaccount.blob.core.windows.net/mycontainer'
azcopy can download an entire container without you having to specify each file; use --recursive.
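For instance, to pull down everything in a container in one call (a sketch assuming a SAS token appended to the container URL):
azcopy copy 'https://mystorageaccount.blob.core.windows.net/mycontainer?<sas-token>' 'C:\local\download' --recursive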
See: https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10
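For the upload direction the question asks about, a minimal sketch (assuming azcopy v10 on PATH, authentication via azcopy login or a SAS token appended to each URL, and the local dump produced by the download script above):
$destination_path = 'C:\Storage Dump Test'
$account_url = 'https://mystorageaccount.blob.core.windows.net'   # hypothetical account

foreach ($folder in Get-ChildItem -Path $destination_path -Directory) {
    # Create a container named after the local folder
    # (container names must be lowercase; adjust folder names if needed)
    azcopy make "$account_url/$($folder.Name)"
    # Then upload the folder's contents into it
    azcopy copy "$($folder.FullName)\*" "$account_url/$($folder.Name)" --recursive
}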

How to create folder structure inside Azure Storage Container using PowerShell

I have created the storage account along with a container in Azure using ARM templates. But I want to create a folder structure inside the container using a PowerShell script.
For example:
Folder1>SubFolder1>
Folder2>SubFolder2>
...... etc.
So, can anyone please advise on this?
There are no directories inside an Azure Storage container. You create a container and then blobs inside of it.
Alternatively, you can include '/' in the blob name to simulate a folder hierarchy.
Eg:
account/container/2020/09/24/sample.txt
where "2020/09/24/sample.txt" is the blobname
You can't create directories inside a container.
All blobs must reside in a blob container, which is simply a logical grouping of blobs. An account can contain an unlimited number of containers, and each container can store an unlimited number of blobs. By including '/' in the blob name ("folder/1.txt") you can simulate a folder structure.
This SO thread may help you in your scenario
Creating an Azure Blob Hierarchy
You can create the folder structure using the PowerShell script below.
# connect to the storage account
$storageAccount = Get-AzStorageAccount -ResourceGroupName "<ResourceGroupName>" -AccountName "<storage account>"
$ctx = $storageAccount.Context
# container name
$filesystemName = "<Container>"
# directories to be created
$folders1 = @('folder1/folder2/folder3', 'folder4/folder5/folder6')
$FolderArray = ""
for ($i = 0; $i -le ($folders1.Length - 1); $i += 1) {
    $dirname = ""
    $path = ""
    $filter = ""
    $FolderArray = $folders1[$i].Split("/")
    for ($j = 0; $j -le ($FolderArray.Length - 1); $j += 1) {
        $dirname = $dirname + $FolderArray[$j] + "/"
        # print the directory name
        $dirname
        $path = $path + $(if ($j -eq 0) { "/" } else { "" })
        $filter = $filter + $(if ($j -eq 0) { "" } else { "/" }) + $FolderArray[$j]
        # check whether the directory already exists
        $present = Get-AzDataLakeGen2ChildItem -Context $ctx -FileSystem $filesystemName -Path $path |
            Where-Object { $_.Name -eq "$($filter)" } -ErrorAction SilentlyContinue
        # create the directory
        if (!$present) {
            New-AzDataLakeGen2Item -Context $ctx -FileSystem $filesystemName -Path $dirname -Directory
            Write-Host "The directory $dirname is created"
        }
        # show the existing folder
        else {
            Write-Host "Folder named $dirname already exists"
            Get-AzDataLakeGen2ChildItem -Context $ctx -FileSystem $filesystemName -Path $path
        }
        $path = $(if ($path -eq "/") { $FolderArray[$j] } else { "" }) +
            $(if ($path -ne "/") { $path + "/" } else { "" }) +
            $(if ($path -ne "/") { $FolderArray[$j] } else { "" })
    }
}

Escaping characters on AzureBlobContent

I have a problem setting content in Azure Blob Storage.
Locally, I succeeded in replacing characters in each file in a directory.
$sourceFolder = "C:\MyDirectory"
$targetFolder = "C:\MyDirectoryEncodeded"
$fileList = Dir $sourceFolder -Filter *.dat
MkDir $targetFolder -ErrorAction Ignore
ForEach ($file in $fileList) {
    $file | Get-Content | %{ $_ -replace '"', '' } | %{ $_ -replace ',', '.' } | Set-Content -Path "tempDirectory\$file"
    $newFile = Get-Content "tempDirectory\$file"
    $Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
    [System.IO.File]::WriteAllLines("targetDirectory\$file", $newFile, $Utf8NoBomEncoding)
}
exit
exit
But now I need to do the same in Microsoft Azure.
I get the content from Azure Blob Storage, escape the characters, encode the file as UTF-8 without BOM, and then set the encoded file into a new blob directory.
Nevertheless, I hit an issue when I try to set the new content with the escaped characters (first line in my loop).
$storageContext = New-AzureStorageContext -ConnectionString "DefaultEndpointsProtocol=https;AccountName=<myAccountName>;AccountKey=<myAccountKey>;"
$sourceFolder = Get-AzureStorageBlob -Container "datablobnotencoded" -Blob "*.dat" -Context $storageContext
$targetFolder = Get-AzureStorageBlob -Container "datablob" -Context $storageContext
MkDir $targetFolder -ErrorAction Ignore
ForEach ($file in $sourceFolder) {
    Get-AzureStorageBlob -Container "datablobnotencoded" -Blob $file.Name -Context $storageContext | Get-AzureStorageBlobContent | %{ $_ -replace '"', '' } | %{ $_ -replace ',', '.' } | Set-AzureStorageBlobContent -File $file.Name -Context $storageContext -CloudBlob $file
    $newFile = Get-AzureStorageFileContent -Path $file
    $Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
    [System.IO.File]::WriteAllLines($file, $newFile, $Utf8NoBomEncoding)
}
I've got this error:
Set-AzureStorageBlobContent : Cannot bind parameter 'CloudBlob'.
Cannot convert the
"Microsoft.WindowsAzure.Commands.Storage.Model.ResourceModel.AzureStorageBlob"
value of type
"Microsoft.WindowsAzure.Commands.Storage.Model.ResourceModel.AzureStorageBlob"
to type "Microsoft.WindowsAzure.Storage.Blob.CloudBlob". At line:7
char:264
+ ... lobContent -File $file.Name -Context $storageContext -CloudBlob $file
+ ~~~~~
+ CategoryInfo : InvalidArgument: (:) [Set-AzureStorageBlobContent], ParameterBindingException
+ FullyQualifiedErrorId : CannotConvertArgumentNoMessage,Microsoft.WindowsAzure.Commands.Storage.Blob.SetAzureBlobContentCommand
Thank you for your answers!
There are some mistakes in your PowerShell scripts:
1. You may have misunderstood the usage of Get-AzureStorageBlobContent: it downloads a blob to a local path; it does not return the blob's content. More details here.
2. In the loop you used $newFile = Get-AzureStorageFileContent -Path $file, but the Get-AzureStorageFileContent cmdlet is for File share storage, not for Blob storage.
You can use Get-AzureStorageBlobContent to download the blobs to a local folder, then operate on the local files downloaded from blob storage. After a file is modified, use Set-AzureStorageBlobContent to upload it to the specified Azure blob container.
Sample code below; it works fine on my side:
$context = New-AzureStorageContext -ConnectionString "xxxx"
# download the blobs in the specified container
$sourceFolder_blob = Get-AzureStorageBlob -Container "test-1" -Blob "*.txt" -Context $context
# the target azure container, which you want to upload the modified blobs to
$target_container = "test-2"
# the local paths used to store the downloaded blobs; make sure the folders exist before use
$sourceFolder_local = "d:\test\blob1\"
$targetFolder_local = "d:\test\blob2\"
foreach ($file in $sourceFolder_blob)
{
    # download the specified blob to the local path
    Get-AzureStorageBlobContent -Container "test-1" -Blob $file.name -Destination $sourceFolder_local -Context $context
    # get the local file path
    $local_file_path = $sourceFolder_local + $file.name
    # the file path in the target local folder
    $local_target_file_path = "$targetFolder_local" + $file.name
    # since the files are downloaded locally, you can do any operation on the local file
    Get-Content $local_file_path | %{ $_ -replace '-', '!' } | %{ $_ -replace ',', '.' } | Set-Content -Path $local_target_file_path
    $newFile = Get-Content -Path $local_target_file_path
    $Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
    [System.IO.File]::WriteAllLines($local_target_file_path, $newFile, $Utf8NoBomEncoding)
    # the last step: upload the modified file to the other azure container
    Set-AzureStorageBlobContent -File $local_target_file_path -Context $context -Container $target_container
}

using credentials to get-childitem on other server

I'm working on a script that uses Get-ChildItem on another server, but I need to change it so it uses the credentials of a local account on that server. When I was using Active Directory for this, I saved the task in our scheduler with my AD login, and it worked against the other server using the UNC path. But we recently decided to change it to the local login there, and I'm getting an error message when trying to use net use. Does anyone know of a good way to do this with the UNC path instead? Or any idea why the following gives an error message?
function GetSecureLogin() {
    $global:username = "stuff"
    $global:password = Get-Content C:\filename.txt | ConvertTo-SecureString
}
function Cleanup([string]$Drive) {
    try {
        $deleteTime = -42
        $now = Get-Date
        # this is saying cannot find path '\\name.na.xxx.net\20xServerBackup\V' (name truncated)
        Get-ChildItem -Path $Drive -Recurse -Force | Where-Object { $_.LastWriteTime -lt $limit } | Remove-Item -Force
    }
    Catch {
        Write-Host "Failed"
    }
}
#####################start of script####################
$share = '\\name.na.xxx.net\20xServerBackup\'
$TheDrive = '\\name.na.xxx.net\20xServerBackup\VMs\'
$global:password = ""
$global:username = ""
GetSecureLogin
net use $share $global:password /USER:$global:username
[array]$DriveArray = @($TheDrive)
try {
    $i = 0
    for ($i = $DriveArray.GetLowerBound(0); $i -le $DriveArray.GetUpperBound(0); $i++) {
        $tempDrv = $DriveArray[$i]
        Cleanup $tempDrv
    }
}
catch [Exception] {
    Write-Host $_.Exception.Message
}
As you can see, I started from the example at this link with net use, but it's not doing the trick to use credentials to access the other server.
I got it to work this way, with New-PSDrive as @robert.westerlund suggests above:
$DestPath = Split-Path "$Drive" -Parent # this gives the format without a slash at the end, which makes PowerShell *very happy*
New-PSDrive -Name target -PSProvider FileSystem -Credential $global:cred -Root "$DestPath" | Out-Null
$temp1 = Get-ChildItem -Path target:\VMs\ -Recurse -Force | Where-Object { $_.LastWriteTime -lt $limit }
Get-ChildItem -Path $Drive -Recurse -Force | Where-Object { $_.LastWriteTime -lt $limit } | Remove-Item -Force
Remove-PSDrive target
I had to add the cred part like this too:
$global:cred = new-object -typename System.Management.Automation.PSCredential -argumentlist $global:username, $global:password
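As a side note, the same credential object can also be built interactively with the built-in prompt (a small sketch):
$global:cred = Get-Credential -UserName $global:username -Message "Enter the password for $($global:username)"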

downloading a file from SharePoint Online with PowerShell

I have a requirement to download files from a SharePoint Online document library using PowerShell.
I've managed to get to the point where the download should happen, but no luck.
I know it's something to do with how I am using the stream/writer.
Any hints would be greatly appreciated.
Edit: no error messages are thrown, just 0-length files in my local directory.
$SPClient = [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint.Client")
$SPRuntime = [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint.Client.Runtime")
$webUrl = Read-Host -Prompt "HTTPS URL for your SP Online 2013 site"
$username = Read-Host -Prompt "Email address for logging into that site"
$password = Read-Host -Prompt "Password for $username" -AsSecureString
$folder = "PoSHTest"
$destination = "C:\\test"
$ctx = New-Object Microsoft.SharePoint.Client.ClientContext($webUrl)
$ctx.Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($username, $password)
$web = $ctx.Web
$lists = $web.Lists.GetByTitle($folder)
$query = [Microsoft.SharePoint.Client.CamlQuery]::CreateAllItemsQuery(10000)
$result = $lists.GetItems($query)
$ctx.Load($Lists)
$ctx.Load($result)
$ctx.ExecuteQuery()
# Edited the foreach as per @JNK
foreach ($File in $result) {
    Write-Host "Url: $($File["FileRef"]), title: $($File["FileLeafRef"])"
    $binary = [Microsoft.SharePoint.Client.File]::OpenBinaryDirect($ctx, $File["FileRef"])
    $Action = [System.IO.FileMode]::Create
    $new = $destination + "\\" + $File["FileLeafRef"]
    $stream = New-Object System.IO.FileStream $new, $Action
    $writer = New-Object System.IO.BinaryWriter($stream)
    $writer.write($binary)
    $writer.Close()
}
You could also utilize the WebClient.DownloadFile method, providing SharePoint Online credentials, to download the resource from SharePoint Online as demonstrated below.
Prerequisites
The SharePoint Online Client Components SDK has to be installed on the machine running the script.
How to download a file in SharePoint Online/O365 in PowerShell
Download-File.ps1 function:
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint.Client")
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint.Client.Runtime")
Function Download-File([string]$UserName, [string]$Password, [string]$FileUrl, [string]$DownloadPath)
{
    if ([string]::IsNullOrEmpty($Password)) {
        $SecurePassword = Read-Host -Prompt "Enter the password" -AsSecureString
    }
    else {
        $SecurePassword = $Password | ConvertTo-SecureString -AsPlainText -Force
    }
    $fileName = [System.IO.Path]::GetFileName($FileUrl)
    $downloadFilePath = [System.IO.Path]::Combine($DownloadPath, $fileName)
    $client = New-Object System.Net.WebClient
    $client.Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($UserName, $SecurePassword)
    $client.Headers.Add("X-FORMS_BASED_AUTH_ACCEPTED", "f")
    $client.DownloadFile($FileUrl, $downloadFilePath)
    $client.Dispose()
}
Usage
Download-File -UserName "username@contoso.onmicrosoft.com" -Password "password" -FileUrl "https://consoto.sharepoint.com/Shared Documents/SharePoint User Guide.docx" -DownloadPath "c:\downloads"
I was able to download the file successfully with the following relevant code snippet. You should be able to extend it for your situation.
Add-Type -Path "C:\Program Files\Common Files\microsoft shared\Web Server Extensions\15\ISAPI\Microsoft.SharePoint.Client.dll"
Add-Type -Path "C:\Program Files\Common Files\microsoft shared\Web Server Extensions\15\ISAPI\Microsoft.SharePoint.Client.Runtime.dll"
$siteUrl = Read-Host -Prompt "Enter web URL"
$username = Read-Host -Prompt "Enter your username"
$password = Read-Host -Prompt "Enter password" -AsSecureString
$source = "/filepath/sourcefilename.dat" #server relative URL here
$target = "C:/destinationfilename.dat" # path of the file stored locally
$ctx = New-Object Microsoft.SharePoint.Client.ClientContext($siteUrl)
$credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($username, $password)
$ctx.Credentials = $credentials
[Microsoft.SharePoint.Client.FileInformation] $fileInfo = [Microsoft.SharePoint.Client.File]::OpenBinaryDirect($ctx,$source);
[System.IO.FileStream] $writeStream = [System.IO.File]::Open($target,[System.IO.FileMode]::Create);
$fileInfo.Stream.CopyTo($writeStream);
$writeStream.Close();
While the CSOM code above can likely be made to work, I find it easier to use the WebClient method.
(from http://soerennielsen.wordpress.com/2013/08/25/use-csom-from-powershell/)
I've used the code below to retrieve a bunch of files (metadata from CSOM queries) to a folder (it uses your $result collection; other params should be adjusted a bit):
# $siteUrlString site collection url
# $outPath path to export directory
$siteUri = [Uri]$siteUrlString
$client = New-Object System.Net.WebClient
$client.UseDefaultCredentials = $true
if (-not (Test-Path $outPath)) {
    New-Item $outPath -Type Directory | Out-Null
}
$result | % {
    $url = New-Object Uri($siteUri, $_["FileRef"])
    $fileName = $_["FileLeafRef"]
    $outFile = Join-Path $outPath $fileName
    Write-Host "Downloading $url to $outFile"
    try {
        $client.DownloadFile($url, $outFile)
    }
    catch {
        # one simple retry...
        try {
            $client.DownloadFile($url, $outFile)
        }
        catch {
            Write-Error "Failed to download $url, $_"
        }
    }
}
The trick here is the
$client.UseDefaultCredentials=$true
which will authenticate the webclient for you (as the current user).
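Note that, as mentioned further down, default credentials work for regular NTLM-authenticated sites but not for SharePoint Online; there you would assign SharePointOnlineCredentials to the WebClient instead, as in the earlier answer (a sketch assuming the CSOM SDK is loaded and $username/$password are already set):
$client = New-Object System.Net.WebClient
$client.Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($username, $password)
$client.Headers.Add("X-FORMS_BASED_AUTH_ACCEPTED", "f")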
The direct and almost shortest answer to the question is simply:
$url = 'https://the.server/path/to/the/file.txt'
$outfile = "$env:userprofile\file.txt"
Invoke-WebRequest -Uri $url -OutFile $outfile -Credential (Get-Credential)
This works at least in PowerShell 5.1...
So I gave up on this. It turned out to be much easier to write an SSIS script component to do the job.
I have awarded Soeren, as he posted some code that will work for regular websites, but not sodding SharePoint Online.
Thanks, Soeren!
A short and easy approach to download a file from SharePoint Online, using just PowerShell and the SharePoint Online URL (no PnP PowerShell).
This approach can also be used to perform SharePoint REST queries, with just PowerShell and the SharePoint REST API.
# required MS dependencies
# feel free to download them from here https://www.microsoft.com/en-us/download/details.aspx?id=42038
Add-Type -Path 'C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.dll' -ErrorAction Stop
Add-Type -Path 'C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.Runtime.dll' -ErrorAction Stop
# prepare credentials
$spCredential = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($user, $(ConvertTo-SecureString -AsPlainText $pass -Force))
# prepare and perform the REST API query
$Context = New-Object Microsoft.SharePoint.Client.ClientContext($targetSiteUrl)
$Context.Credentials = $spCredential
try {
    # this may return an error, but will still finish the context setup
    $Context.ExecuteQuery()
}
catch {
    Write-Host "TODO: fix executeQuery() err 400 bug" -ForegroundColor Yellow
}
$AuthenticationCookie = $Context.Credentials.GetAuthenticationCookie($targetSiteUrl, $true)
$WebSession = New-Object Microsoft.PowerShell.Commands.WebRequestSession
$WebSession.Credentials = $Context.Credentials
$WebSession.Cookies.SetCookies($targetSiteUrl, $AuthenticationCookie)
$WebSession.Headers.Add("Accept", "application/json;odata=verbose")
Invoke-WebRequest -Uri $spFileUrl -OutFile $outputFilePath -WebSession $WebSession -ErrorAction Stop
Where:
$outputFilePath is the target output file in which you want to save the remote file.
$targetSiteUrl is the target SP site URL.
$spFileUrl is the full SharePoint URL of the file.
$user is the plain-text SP user email.
$pass is the plain-text SP user password.
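For example, with hypothetical values (substitute your own site, file, and account):
$targetSiteUrl = 'https://contoso.sharepoint.com/sites/TeamSite'
$spFileUrl = "$targetSiteUrl/Shared Documents/report.docx"
$outputFilePath = 'C:\downloads\report.docx'
$user = 'user@contoso.onmicrosoft.com'
$pass = 'YourPasswordHere'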
