Downloading a file from ADLS Gen1 using PowerShell is not working - Azure

I am trying to download a file from Azure Data Lake Store using a PowerShell script. The code is triggered from a runbook in Azure Automation. It looks like Export-AdlStoreItem is not working as expected. I don't get any error messages or compilation errors; in fact, when this command is executed, a zero-byte file is generated in the destination. The name of that file is TemporaryFile2020-06-02_14-56-57.datc18371c2-d39c-4588-9af0-93aa3e136b01Segments
What is happening? Please help!
$LocalDwldPath = "T:\ICE_PROD_DATA_SOURCING\FILE_DOWNLOAD_PATH\TemporaryFile$($TimeinSec).dat"
$SourcePath = "Dataproviders/Landing/GCW/HPIndirect/Orders/AMS/gcw_hp_indirect_orders_ams_745_20200601_04_34_01.dat"
$PRODAdlsName = "itgdls01"
Export-AdlStoreItem -Account $PRODAdlsName -Path $("/" + $SourcePath.Trim()) -Destination $LocalDwldPath -Force -ErrorAction Stop
if( Test-Path $LocalDwldPath.Trim() )
{
    Get-Content -Path $LocalDwldPath.Trim() -ReadCount 1000 | % { $FileCount += $_.Count }
    Remove-Item $LocalDwldPath.Trim()
    Set-Content -Path $cntCaptureFile -Value $FileCount
    $TimeinSec = TimeStamp2
    Add-Content -Value "$TimeinSec Log: Identified file for getting count is $($SourcePath.Trim()) and the count is $FileCount" -Path $logfile
}
else
{
    $TimeinSec = TimeStamp2
    Add-Content -Value "$TimeinSec Error: Identified file for getting count is $($SourcePath.Trim()) and the count capture failed as local file is not found!" -Path $logfile
}

According to my research, if you want to download a file from Azure Data Lake with PowerShell, you can use the Export-AzDataLakeStoreItem cmdlet.
For example:
Export-AzDataLakeStoreItem -Account <> -Path '/test/test.csv' -Destination 'D:\myDirectory\test.csv'
For more details, please refer to the documentation.
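For illustration, a slightly fuller sketch, assuming an interactive sign-in first; the account name and paths below are placeholders, not values from the question:
# Placeholder account name and paths; sign in, then download a single file from ADLS Gen1
Connect-AzAccount
Export-AzDataLakeStoreItem -Account 'mydatalakeaccount' -Path '/test/test.csv' -Destination 'D:\myDirectory\test.csv' -Force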

The issue was with the local download path/destination
("T:\ICE_PROD_DATA_SOURCING\FILE_DOWNLOAD_PATH\TemporaryFile$($TimeinSec).dat").
The T:\ drive is a virtual/network drive mapped to an Azure file share.
Instead of T:\, I pointed the destination to a local drive ("F:\ICE_PROD_DATA_SOURCING\FILE_DOWNLOAD_PATH\TemporaryFile$($TimeinSec).dat") and it worked fine.
It's surprising that PowerShell didn't give any error message when it was unable to save the file to a network path backed by an Azure file share.
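For anyone hitting the same silent failure, a minimal guard (not part of the original runbook) is to validate the destination right after the export, reusing the variables from the script above:
# Not from the original runbook: fail loudly if the export silently produces nothing
Export-AdlStoreItem -Account $PRODAdlsName -Path $("/" + $SourcePath.Trim()) -Destination $LocalDwldPath -Force -ErrorAction Stop
if (-not (Test-Path $LocalDwldPath) -or (Get-Item $LocalDwldPath).Length -eq 0)
{
    throw "Download to $LocalDwldPath failed or produced an empty file."
}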

Related

Download a list of specific files from Azure Blob

I've got an issue downloading files from an Azure Blob container. This is not as trivial as it may seem. I saw many examples of how to download one file, but there's a problem if you need to download a bulk of them.
Issue definition:
I have an Azure Blob container that has ~30k files in it
At the same time, I've got a list of exact file names locally (around 300 files) that I want to download from that Azure Blob container (i.e., I need to download a whole bunch of files selectively, by their names)
I know that all these files exist in the given blob. So, I need a way to iterate over the list of the files and download them from the blob.
What I tried:
I tried the 'azcopy copy' command. It works all right if you need to copy one or several files from the blob to your local disk, but there's no way to pass a huge list of files as a parameter to download them.
I tried to search for PowerShell examples that can do the same, but no luck.
Please advise.
Please try something like the following. It makes use of the Get-AzStorageBlobContent cmdlet.
The idea is to have an array of blobs you wish to download and then loop over that array and call this Cmdlet for each item.
$accountName = "account-name"
$accountKey = "account-key"
$containerName = "container-name"
$context = New-AzStorageContext -StorageAccountName $accountName -StorageAccountKey $accountKey
$destination = "C:\temp"
$blobNames = @("blob1.txt", "blob2.txt", "blob3.txt", "blob4.txt")
For ($i=0; $i -lt $blobNames.Length; $i++) {
    $blob = $blobNames[$i]
    Write-Host "Downloading $blob. Please wait."
    Get-AzStorageBlobContent -Blob $blob -Container $containerName -Destination $destination -Context $context -Verbose
}
Have you tried the "Azure Storage Explorer" software?
I was able to download a whole folder from blob storage with it.
If the data in the blob container is in a folder, just right-click the folder > Download.
If the files are directly at the root of the container (not stored in a subfolder), you'll have to select all files with the "Select All > Select all files in cache" option and then click "Download".
$accountName = "account-name"
$accountKey = "account-key"
$containerName = "container-name"
$context = New-AzStorageContext -StorageAccountName $accountName -StorageAccountKey $accountKey
$destination = "C:\temp"
$blobNames = Get-Content $destination\list.txt
For ($i=0; $i -lt $blobNames.Length; $i++) {
    $blob = $blobNames[$i]
    Write-Host "Downloading $blob. Please wait."
    Get-AzStorageBlobContent -Blob $blob -Container $containerName -Destination $destination -Context $context -Verbose
}
It'll work like a charm, given that the text file that holds a list of file names for the blobs you need to download is located in the '$destination' directory (could be any directory on your PC, though).
P.S. The file names just need to be listed in one column (separated by a carriage return, i.e., "\n" at the end of each file name).
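For illustration, the list file could be created like this (the blob names are hypothetical):
# Hypothetical blob names, one per line
@"
reports/blob1.txt
reports/blob2.txt
images/photo1.jpg
"@ | Set-Content "$destination\list.txt"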
Thanks @Gaurav Mantri for the solution.

Blazor Wasm PWA IIS Deployment integrity error

I created a new Blazor PWA WebAssembly project (latest version, default template) and deployed it to IIS on Windows Server to try PWA.
I installed the latest .NET Core Hosting Bundle.
After publishing it, I ran the script from the Microsoft Docs to rename the dll files:
dir .\_framework\_bin | rename-item -NewName { $_.name -replace ".dll\b",".bin" }
((Get-Content .\_framework\blazor.boot.json -Raw) -replace '.dll"','.bin"') | Set-Content .\_framework\blazor.boot.json
And the serviceworker renaming code too:
((Get-Content .\service-worker-assets.js -Raw) -replace '.dll"','.bin"') | Set-Content .\service-worker-assets.js
Then I deleted the compressed files as the docs say (a sketch of the removal commands follows the list):
wwwroot\service-worker-assets.js.br
wwwroot\service-worker-assets.js.gz
wwwroot\_framework\blazor.boot.json.br
wwwroot\_framework\blazor.boot.json.gz
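Removing them with PowerShell could look like this, assuming the published output folder is the current directory:
# Run from the publish output folder; removes the stale precompressed copies listed above
Remove-Item .\wwwroot\service-worker-assets.js.br, .\wwwroot\service-worker-assets.js.gz
Remove-Item .\wwwroot\_framework\blazor.boot.json.br, .\wwwroot\_framework\blazor.boot.json.gz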
But I am still getting an error when I load the app:
What am I missing here?
I guess it has to do with the hashes and the renaming, but I can't find any solution in Blazor's GitHub issues.
As a result of your modifications to the blazor.boot.json file, the integrity check fails. service-worker-assets.js contains a list of files and their integrity hashes, which are calculated at publish time.
You can manually recalculate the hashes using Bash/PowerShell. Since you're using IIS, here is the PowerShell script I used for a similar issue:
# make sure you're in the wwwroot folder of the published application
$JsFileContent = Get-Content -Path service-worker-assets.js -Raw
# remove JavaScript from contents so it can be interpreted as JSON
$Json = $JsFileContent.Replace("self.assetsManifest = ", "").Replace(";", "") | ConvertFrom-Json
# grab the assets JSON array
$Assets = $Json.assets
foreach ($Asset in $Assets) {
    $OldHash = $Asset.hash
    $Path = $Asset.url
    $Signature = Get-FileHash -Path $Path -Algorithm SHA256
    $SignatureBytes = [byte[]] -split ($Signature.Hash -replace '..', '0x$& ')
    $SignatureBase64 = [System.Convert]::ToBase64String($SignatureBytes)
    $NewHash = "sha256-$SignatureBase64"
    If ($OldHash -ne $NewHash) {
        Write-Host "Updating hash for $Path from $OldHash to $NewHash"
        # slashes are escaped in the js-file, but PowerShell unescapes them automatically,
        # we need to re-escape them
        $OldHash = $OldHash.Replace("/", "\/")
        $NewHash = $NewHash.Replace("/", "\/")
        $JsFileContent = $JsFileContent.Replace("""$OldHash""", """$NewHash""")
    }
}
Set-Content -Path service-worker-assets.js -Value $JsFileContent -NoNewline
This script iterates over all files listed inside of service-worker-assets.js, calculates the new hash for each file and updates the hash in the JavaScript file if it's different.
You have to execute the script with the published wwwroot folder as the current working directory.
I described this in more detail on my blog: Fix Blazor WebAssembly PWA integrity checks

Upload blob with Set-AzStorageBlobContent via pipeline and set ContentType property

Using the Az PowerShell module, I'm trying to enumerate a directory on disk and pipe the output to Set-AzStorageBlobContent to upload to Azure, while preserving the folder structure. This works great, except the ContentType property of all blobs is set to application/octet-stream. I'd like to set it dynamically based on the file extension of the blob being uploaded.
Here's example code for the base case:
Get-ChildItem $SourceRoot -Recurse -File |
Set-AzStorageBlobContent -Container $ContainerName -Context $context -Force
To set the ContentType, I need to add a Properties parameter to Set-AzStorageBlobContent with a value like @{ "ContentType" = "<content type>" }. The content type should be determined from the specific file extension being uploaded. I've written a separate pipelined function that can add a MimeType property to the file object, but I can't figure out how to reference that property for the parameter in the pipeline. Example:
function Add-MimeType {
    [cmdletbinding()]
    param(
        [parameter(
            Mandatory = $true,
            ValueFromPipeline = $true)]
        $pipelineInput
    )
    Process {
        $mimeType = Get-MimeType $pipelineInput.Extension
        Add-Member -InputObject $pipelineInput -NotePropertyName "MimeType" -NotePropertyValue $mimeType
        return $pipelineInput
    }
}
function Get-MimeType(
    [string]$FileExtension
)
{
    switch ($FileExtension.ToLowerInvariant())
    {
        '.txt' { return 'text/plain' }
        '.xml' { return 'text/xml' }
        default { return 'application/octet-stream' }
    }
}
Get-ChildItem $SourceRoot -Recurse -File |
Add-MimeType |
Set-AzStorageBlobContent -Container $ContainerName -Properties @{"ContentType" = "$($_.MimeType)"} -Context $context -Force
It seems that $_ isn't usable in this context. Is there another way to accomplish this?
The reason I'd like to continue using pipelining is that it appears to work much faster than using a ForEach-Object loop to call the function (where $_ does work).
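One hedged workaround (not from the original post) is to keep the pipeline but wrap the upload in a small function with its own process block, so each file's MIME type is available as a named variable instead of $_. Send-BlobWithContentType and the relative-path blob naming below are assumptions for illustration, reusing the Get-MimeType function from the question:
# Hypothetical helper (assumption, not an established cmdlet): uploads each piped file
# with a ContentType derived from its extension via the Get-MimeType function above.
function Send-BlobWithContentType {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [System.IO.FileInfo]$File,
        [Parameter(Mandatory = $true)][string]$ContainerName,
        [Parameter(Mandatory = $true)]$Context,
        [Parameter(Mandatory = $true)][string]$SourceRoot
    )
    Process {
        # Assumption: name the blob by its path relative to $SourceRoot to keep the folder structure
        $blobName = $File.FullName.Substring($SourceRoot.Length).TrimStart('\').Replace('\', '/')
        Set-AzStorageBlobContent -File $File.FullName -Blob $blobName -Container $ContainerName `
            -Properties @{ "ContentType" = (Get-MimeType $File.Extension) } `
            -Context $Context -Force
    }
}

Get-ChildItem $SourceRoot -Recurse -File |
    Send-BlobWithContentType -ContainerName $ContainerName -Context $context -SourceRoot $SourceRoot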
If you are open to completely different solutions, you can also use AzCopy.
You can upload your whole folder with one command, and AzCopy can also automatically guess the correct mime type based on the file extension. There is also support for Azure Pipelines, if that is part of your setup.
Command could look something like this:
# AzCopy v10 will automatically guess the content type unless you pass --no-guess-mime-type
azcopy copy 'C:\myDirectory' 'https://mystorageaccount.blob.core.windows.net/mycontainer' --recursive
# AzCopy v8 (uses /Source and /Dest rather than positional arguments)
AzCopy /Source:"C:\myDirectory" /Dest:"https://mystorageaccount.blob.core.windows.net/mycontainer" /S /SetContentType
Taken from the output of AzCopy.exe copy --help:
AzCopy automatically detects the content type of the files when uploading from the local disk, based on the file extension or content (if no extension is specified).
The built-in lookup table is small, but on Unix, it is augmented by the local system's mime.types file(s) if available under one or more of these names:
/etc/mime.types
/etc/apache2/mime.types
/etc/apache/mime.types
On Windows, MIME types are extracted from the registry. This feature can be turned off with the help of a flag. Please refer to the flag section.

Using Set-AzStorageBlobContent to upload only new content without prompts

I'm enumerating a local folder and uploading to Azure storage. I want to only upload new content to my Azure storage. If I use Set-AzStorageBlobContent with -Force, it'll overwrite everything. If I use it without -Force, it'll prompt on items that already exist. I can use Get-AzStorageBlob to check if the item already exists, but it prints red errors if the item does not exist. I can't find a combination of these items that gracefully uploads only new content without printing any errors or prompting. Am I using the wrong approach?
FINAL EDIT: adding working solution based on suggestions from Ivan Yang. Now only new files are uploaded, without any error messages. The key was to use -ErrorAction Stop to convert the error message into an exception, and then catch the exception.
# In my code this is part of a Test-Blob function that returns $blobFound
$blobFound = $false
try
{
    $blobInfo = Get-AzStorageBlob `
        -Container $containerName `
        -Context $storageContext `
        -Blob $blobPath `
        -ErrorAction Stop
    $blobFound = ($null -ne $blobInfo)
}
catch [Microsoft.WindowsAzure.Commands.Storage.Common.ResourceNotFoundException]
{
    # Eat the error that'd otherwise be printed
}
# Note in my code this is actually a call to my Test-Blob function
if ($false -eq $blobFound)
{
    Set-AzStorageBlobContent `
        -Container $containerName `
        -Context $storageContext `
        -File $sourcePath `
        -Blob $blobPath `
        -Force # -Force is unnecessary but just being paranoid to avoid prompts
}
I see you have mentioned trying Get-AzStorageBlob; why not keep using it?
The trick here is that you can use try-catch-finally, which properly handles the error if the blob does not exist in Azure.
The sample code below works on my side for uploading a single file; you can modify it to upload multiple files:
$account_name = "xxx"
$account_key = "xxx"
$context = New-AzStorageContext -StorageAccountName $account_name -StorageAccountKey $account_key
# use this flag to determine if a blob exists or not in Azure, and assume it exists at first
$is_exist = $true
try
{
    Get-AzStorageBlob -Container test3 -Blob a.txt -Context $context -ErrorAction Stop
}
catch [Microsoft.WindowsAzure.Commands.Storage.Common.ResourceNotFoundException]
{
    # if the blob does not exist in Azure, do the following
    $is_exist = $false
    Write-Output "the blob DOES NOT exist."
}
finally
{
    # only upload when the blob does not exist in Azure blob storage
    if (!$is_exist)
    {
        Set-AzStorageBlobContent -Container test3 -File "d:\myfolder\a.txt" -Blob a.txt -Context $context
        Write-Output "uploaded!"
    }
}
Not a PowerShell solution, but I would suggest that you take a look at AzCopy. It's like RoboCopy but for Azure Storage: a command-line tool that lets you sync, copy, move and more. It's free, works on macOS, Linux and Windows, and it is fast!
I use AzCopy from PowerShell scripts and it makes life a lot easier (I'm managing millions of files, and the stability and speed of AzCopy really help).
This command is not smart enough to detect which files are new. You need to keep in the folder just the files you want to upload.
Simply use Set-AzStorageBlobContent -Force all the time.
The alternative is to check for the existing blob, download its content, compare the files, and upload only if they differ. The amount of processing/IO will only increase this way.
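For illustration, a hedged sketch of that compare-before-upload flow, reusing the cmdlets and variable names from the accepted solution above; it also shows why the extra download and hashing tends to cost more than an unconditional -Force:
# Sketch only: download the existing blob to a temp file, hash both sides, upload if they differ
$tempFile = Join-Path $env:TEMP ([System.IO.Path]::GetRandomFileName())
$remoteHash = $null
try
{
    Get-AzStorageBlobContent -Container $containerName -Blob $blobPath -Destination $tempFile `
        -Context $storageContext -ErrorAction Stop | Out-Null
    $remoteHash = (Get-FileHash $tempFile -Algorithm SHA256).Hash
}
catch
{
    # assume the blob does not exist yet
}
$localHash = (Get-FileHash $sourcePath -Algorithm SHA256).Hash
if ($localHash -ne $remoteHash)
{
    Set-AzStorageBlobContent -Container $containerName -File $sourcePath -Blob $blobPath `
        -Context $storageContext -Force
}
Remove-Item $tempFile -ErrorAction SilentlyContinue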

unable to append data to sharepoint file via Azure Automation

Ok I have asked a question like this but now I am trying to perform the task via Azure Automation. I can connect to the SharePoint site via Azure Automation (powershell). with the correct credentials. I can download the file and append data to it. But I can when I try and upload the file back to SharePoint it adds the contents 3 times and then Azure Automation suspends the Runbook after 3 times.
It does run perfect if I upload this file as a different file name.
$siteurl = "https://abc.sharepoint.com/sites/xxx/teamsites/os"
$credSP = Get-AutomationPSCredential -Name 'test'
$fileFolder = "$Env:temp"
Connect-PnPOnline -Url $siteurl -Credentials $credSP
Get-PnPFile -Url "/sites/xxx/teamsites/os/Directory and Operating Systems/test.csv" -Path $fileFolder -Filename test.csv -AsFile -Force
$test = "31-07-2019 -11:35"
Add-Content -Path $fileFolder\test.csv $test
Add-PnPFile -Path $fileFolder\test.csv -Approve -Folder "Directory and Operating Systems" #-ErrorAction Ignore
Here are the results
test test
31-07-2019 -11:35
31-07-2019 -11:35
31-07-2019 -11:35
As you can see, it added $test three times. But I don't have this issue if I upload it under a new file name.
OK, after a while I have fixed the issue.
After the Add-PnPFile call, you pipe it to | Out-Null.
That's it. The script stops cleanly after the upload.
Happy days.
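Concretely, the original upload line from the question becomes:
Add-PnPFile -Path $fileFolder\test.csv -Approve -Folder "Directory and Operating Systems" | Out-Null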
