I have a container called container1 in my Storage Account storageaccount1, with the following files:
blobs/tt-aa-rr/data/0/2016/01/03/02/01/20.txt
blobs/tt-aa-rr/data/0/2016/01/03/02/02/12.txt
blobs/tt-aa-rr/data/0/2016/01/03/02/03/13.txt
blobs/tt-aa-rr/data/0/2016/01/03/03/01/10.txt
I would like to delete the first 3, for that I use the following command:
az storage blob delete-batch --source container1 --account-key XXX --account-name storageaccount1 --pattern 'blobs/tt-aa-rr/data/0/2016/01/03/02/*' --debug
The files are not deleted and I see the following log:
urllib3.connectionpool : Starting new HTTPS connection (1): storageaccount1.blob.core.windows.net:443
urllib3.connectionpool : https://storageaccount1.blob.core.windows.net:443 "GET /container1?restype=container&comp=list HTTP/1.1" 200 None
What is wrong with my pattern?
If I try to delete file by file it works.
As stated in the comments, you cannot apply patterns to subfolders, only to first-level folders, as documented here. But if you want, you can easily write a script that lists the blobs in your container with az storage blob list, using a prefix to filter them, and then deletes each of the resulting blobs.
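A minimal sketch of that list-then-delete approach (assuming bash, with the account key in a hypothetical ACCOUNT_KEY variable, and using the prefix from your example):
az storage blob list \
    --container-name container1 \
    --account-name storageaccount1 \
    --account-key "$ACCOUNT_KEY" \
    --prefix 'blobs/tt-aa-rr/data/0/2016/01/03/02/' \
    --query "[].name" --output tsv |
while IFS= read -r blob; do
    # delete each blob returned by the prefix-filtered listing
    az storage blob delete \
        --container-name container1 \
        --account-name storageaccount1 \
        --account-key "$ACCOUNT_KEY" \
        --name "$blob"
done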
Here is what just worked for me — applied to the command you listed above.
az storage blob delete-batch --source container1 --account-key XXX --account-name storageaccount1 --pattern blobs/tt-aa-rr/data/0/2016/01/03/02/\* --debug
I didn't quote the pattern argument and I added an escape before the *. Using iTerm2 on a Mac. I didn't try --debug but the --dryrun argument was really helpful in getting it to tell me what it had matched (or not!).
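For example, previewing the matches first (same command as above, with --dryrun instead of --debug):
az storage blob delete-batch --source container1 --account-key XXX --account-name storageaccount1 --pattern blobs/tt-aa-rr/data/0/2016/01/03/02/\* --dryrun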
I am using the Azure CLI to add blobs to my storage account. Via the Azure CLI, I am successfully able to soft delete blobs; I can confirm this by viewing the soft-deleted blobs in the Azure Portal. I now want to restore a blob that I deleted via the Azure CLI, but I am having trouble. I have attempted to use the az storage blob undelete command to do this. It is reportedly successful; I know this by adding the --verbose flag and seeing the 200 HTTP status returned from the API call that the CLI triggers. The response is:
{
"undeleted": null
}
And when I look at the list of blobs in the Azure Portal again, there is no indication that the blob was actually restored/undeleted. Has anyone else had success using the undelete Azure CLI command previously?
Here is some terminal output; hopefully it is helpful in understanding what I'm trying to do:
PS C:\Users\admin> az storage blob list --account-name azartbackupstore01 -c backupcontainer01 -o table
Name IsDirectory Blob Type Blob Tier Length Content Type Last Modified Snapshot
------------------------------------------------------------------- ------------- ----------- ----------- -------- ------------------------ ------------------------- ----------
20/20162F8E84F43EEAAEC0DB0010545C32D8D1A0CF60284CA2E9A57884B55C2445 BlockBlob 47 application/octet-stream 2021-08-05T15:25:59+00:00
92/92D536261E45E93DB4A8F063A98102BF443DD7EC16B1075F7D13A1A326544035 BlockBlob 11458 application/octet-stream 2021-08-05T15:22:47+00:00
PS C:\Users\admin> az storage blob delete --account-name azartbackupstore01 -c backupcontainer01 --name 20/20162F8E84F43EEAAEC0DB0010545C32D8D1A0CF60284CA2E9A57884B55C2445
PS C:\Users\admin> az storage blob undelete --account-name azartbackupstore01 -c backupcontainer01 --name 20/20162F8E84F43EEAAEC0DB0010545C32D8D1A0CF60284CA2E9A57884B55C2445
{
"undeleted": null
}
PS C:\Users\admin> az storage blob list --account-name azartbackupstore01 -c backupcontainer01 -o table
Name IsDirectory Blob Type Blob Tier Length Content Type Last Modified Snapshot
------------------------------------------------------------------- ------------- ----------- ----------- -------- ------------------------ ------------------------- ----------
92/92D536261E45E93DB4A8F063A98102BF443DD7EC16B1075F7D13A1A326544035 BlockBlob 11458 application/octet-stream 2021-08-05T15:22:47+00:00
Apparently, when you have both soft-delete and versioning enabled on blobs, something weird happens (even in the Azure portal, the blob is shown as deleted but the blob state is null, and the versions of the deleted blob are still shown as active).
But I found some kind of a workaround.
In short:
Get the (latest) versionId of the blob you want to undelete
Get the blob URI and add the versionId and SAS token as query parameters to the URI.
Copy the blob where source URI is the deleted blob including versionId (I found this solution in the code here)
When a versioned blob is soft-deleted, it will show up with the command:
az storage blob list --account-name azartbackupstore01 -c backupcontainer01 -o table --include v
I only added the --include (v)ersion flag at the end, which shows all versions of the blobs. The --include (d)eleted flag will not work, because the blob somehow does not have the state deleted.
Here is how I've done it:
$blobName="20/20162F8E84F43EEAAEC0DB0010545C32D8D1A0CF60284CA2E9A57884B55C2445"
$containerName="backupcontainer01"
$accountName="azartbackupstore01"
$sas="replace with your sas token"
# query all blobs where name equals $blobName, and reverse sort by versionId (which is the date) so most recent will be the first in the list
$versionId=az storage blob list --account-name $accountName -c $containerName --include v -o json --query "reverse(sort_by([?name=='$blobName'], &versionId))[0].versionId"
$blobUriRoot=az storage blob url --account-name $accountName -c $containerName --name $blobName
# The blobUriRoot and versionId values come back wrapped in extra quotes, so these need to be removed.
$blobUri=$($blobUriRoot + "?versionId=" + $versionId).Replace('"', "")
$blobUriWithSas = $blobUri + "&" + $sas
az storage blob copy start --account-name $accountName --destination-blob $blobName --destination-container $containerName --source-uri $blobUriWithSas
After running the above commands, the specified blob is active again.
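To verify, you can list the blobs again as in the question; the restored blob should show up without any --include flags:
az storage blob list --account-name azartbackupstore01 -c backupcontainer01 -o table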
I have a user that is putting a lot of whitespaces in their filenames and this is causing a download script to go bad.
To get the names of the blobs I use this:
BLOBS=$(az storage blob list --container-name $c \
--account-name $AZURE_STORAGE_ACCOUNT --account-key $AZURE_STORAGE_KEY \
--query "[].{name:name}" --output tsv)
What is happening for a blob like blob with space.pdf is that it gets stored as blob\twith\tspace.pdf, where \t is a tab. When I iterate in an effort to download, obviously I can't get at the file.
How can I do this correctly?
You can use the command az storage blob download-batch.
I tested it in the Azure portal's Cloud Shell; all the blobs, including those whose names contain whitespace, were downloaded.
The command:
c=container_name
AZURE_STORAGE_ACCOUNT=xx
AZURE_STORAGE_KEY=xx
# download the blobs to clouddrive
cd clouddrive
az storage blob download-batch -d . -s $c --account-name $AZURE_STORAGE_ACCOUNT --account-key $AZURE_STORAGE_KEY
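If you do need to iterate over blob names yourself, note that the original loop breaks because iterating an unquoted $BLOBS lets the shell word-split names on spaces and tabs. A sketch that reads one name per line instead (assuming bash and the same variables as above):
az storage blob list --container-name $c \
    --account-name $AZURE_STORAGE_ACCOUNT --account-key $AZURE_STORAGE_KEY \
    --query "[].name" --output tsv |
while IFS= read -r name; do
    # create any subfolders implied by '/' in the blob name
    mkdir -p "$(dirname "$name")"
    az storage blob download --container-name $c \
        --account-name $AZURE_STORAGE_ACCOUNT --account-key $AZURE_STORAGE_KEY \
        --name "$name" --file "$name"
done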
I have a PowerShell script that currently deletes all blobs in my $web container.
az storage blob delete-batch --account-name myaccountname --source $web
This works great, but now I want to exclude two directories from being deleted. I've looked over the documentation and I'm still not sure how the exclusion syntax is supposed to look.
I'm certain that I have to use the --pattern parameter.
The pattern used for globbing files or blobs in the source. The supported patterns are '*', '?', '[seq]', and '[!seq]'.
I'm hoping someone can let me know what the value of the --pattern param should look like so that I can delete everything in the $web container except the blobs in the /aaa and /bbb directories.
az storage blob delete-batch --account-name myaccountname --source $web --pattern ???
According to my test, if you want the --pattern parameter to exclude a directory from being deleted, you can use a '[!seq]' expression:
az storage blob delete-batch --source <container_name> --account-name <storage_account_name> --pattern '[!<folder name>]*'
Note that '[!seq]' is a character class matching a single character, so this pattern actually keeps every blob whose name starts with any of the listed characters; it only cleanly excludes a folder when no other top-level name begins with one of those characters.
For example:
[Screenshot: the structure of my container before running the command]
Run the command:
az storage blob delete-batch --source test --account-name blobstorage0516 --pattern '[!user&&!people]*'
[Screenshot: the structure of my container after running the command]
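Applied to the question's $web container, a sketch under the same caveat (this assumes no other top-level blob or folder name starts with the letter a or b, and note that $web must be single-quoted so the shell does not expand it as a variable):
az storage blob delete-batch --account-name myaccountname --source '$web' --pattern '[!ab]*'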
Consider this list of blobs (or any storage data):
backup-2018-08-29-0000.archive
backup-2018-08-29-0100.archive
backup-2018-08-29-0200.archive
backup-2018-08-29-0300.archive
backup-2018-08-29-0400.archive
backup-2018-08-29-0500.archive
backup-2018-08-29-0600.archive
backup-2018-08-29-0700.archive
backup-2018-08-29-0800.archive
backup-2018-08-29-0900.archive
backup-2018-08-29-1000.archive
backup-2018-08-29-1100.archive
backup-2018-08-29-1200.archive
backup-2018-08-29-1300.archive
backup-2018-08-29-1400.archive
backup-2018-08-29-1500.archive
backup-2018-08-29-1600.archive
backup-2018-08-29-1700.archive
backup-2018-08-29-1800.archive
backup-2018-08-29-1900.archive
backup-2018-08-29-2000.archive
backup-2018-08-29-2100.archive
backup-2018-08-29-2200.archive
backup-2018-08-29-2300.archive
I wish to delete all files except one. So my initial idea is to use the --pattern flag.
--pattern
The pattern used for globbing files or blobs in the source. The
supported patterns are '*', '?', '[seq]', and '[!seq]'.
But I cannot find info about how '*', '?', '[seq]', and '[!seq]' work.
In the command below, what pattern will match all files excluding backup-2018-08-29-0000.archive?
$ az storage blob delete-batch --source mycontainer --pattern <pattern>
Update
An additional issue is that I have about 10,000 backups collected over more than a year. Using non-batch operations seems impractical.
I doubt there is an easy way to do that with wildcards (it would be easy with regex).
[seq] and [!seq] work like this:
--pattern backup-2018-08-29-[01]???.archive
would delete all files where the first digit after 29- is either 0 or 1:
backup-2018-08-29-0000.archive
backup-2018-08-29-0100.archive
backup-2018-08-29-0200.archive
backup-2018-08-29-0300.archive
backup-2018-08-29-0400.archive
backup-2018-08-29-0500.archive
backup-2018-08-29-0600.archive
backup-2018-08-29-0700.archive
backup-2018-08-29-0800.archive
backup-2018-08-29-0900.archive
backup-2018-08-29-1000.archive
backup-2018-08-29-1100.archive
backup-2018-08-29-1200.archive
backup-2018-08-29-1300.archive
backup-2018-08-29-1400.archive
backup-2018-08-29-1500.archive
backup-2018-08-29-1600.archive
backup-2018-08-29-1700.archive
backup-2018-08-29-1800.archive
backup-2018-08-29-1900.archive
[!seq] just negates that:
--pattern backup-2018-08-29-[!01]???.archive
This would delete:
backup-2018-08-29-2000.archive
backup-2018-08-29-2100.archive
backup-2018-08-29-2200.archive
backup-2018-08-29-2300.archive
To answer your question: I would rename (copy) the blob to e.g. backup-keep.archive and then delete the remaining backups using the pattern backup-2018-08-29-????.archive
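A sketch of that copy-then-delete approach (assuming the credentials are in hypothetical ACCOUNT_NAME and ACCOUNT_KEY variables and the container is mycontainer as in the question):
az storage blob copy start \
    --account-name $ACCOUNT_NAME --account-key $ACCOUNT_KEY \
    --source-container mycontainer --source-blob backup-2018-08-29-0000.archive \
    --destination-container mycontainer --destination-blob backup-keep.archive
# copy start is asynchronous; confirm the copy finished before deleting the source
az storage blob show \
    --account-name $ACCOUNT_NAME --account-key $ACCOUNT_KEY \
    --container-name mycontainer --name backup-keep.archive \
    --query properties.copy.status
az storage blob delete-batch \
    --account-name $ACCOUNT_NAME --account-key $ACCOUNT_KEY \
    --source mycontainer --pattern 'backup-2018-08-29-????.archive'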
You could acquire a lease on the blob (in the portal, or with az storage blob lease acquire), then use the command az storage blob delete-batch to delete the other blobs. A leased blob cannot be deleted; if you want to delete it later, just break the lease in the portal or with az storage blob lease break.
My test commands (I specify a lease duration of 15 seconds):
az storage blob lease acquire --blob-name "azureProfile.txt" --container-name "testdel" --account-key "accountkey" --account-name "storagename" --lease-duration "15"
az storage blob delete-batch --source "testdel" --account-key "accountkey" --account-name "storagename"
It gives a warning, but it works fine on my side.
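If you later want to remove the kept blob as well, break the lease first and then delete it (reusing the names from above):
az storage blob lease break --blob-name "azureProfile.txt" --container-name "testdel" --account-key "accountkey" --account-name "storagename"
az storage blob delete --name "azureProfile.txt" --container-name "testdel" --account-key "accountkey" --account-name "storagename"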
I solved the problem by doing two batch-delete commands:
#!/bin/bash
set -e
# AZURE_CONNECTION_STRING is taken from the environment
CONTAINER=backups
DATES="201[78]-??-??"
# delete blobs with a range of 1000-2300 timestamps
az storage blob delete-batch \
--connection-string "$AZURE_CONNECTION_STRING" \
--source $CONTAINER \
--pattern "$DATES-[1-2][0-9]00--mongo.archive"
# delete blobs with a range of 0100-0900 timestamps
az storage blob delete-batch \
--connection-string "$AZURE_CONNECTION_STRING" \
--source $CONTAINER \
--pattern "$DATES-0[1-9]00--mongo.archive"
With this script, I'm deleting all backups excluding backups made at midnight (with 0000 timestamp).
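To confirm that only the midnight backups remain, you can list what is left (reusing the same variables):
az storage blob list \
    --connection-string "$AZURE_CONNECTION_STRING" \
    --container-name $CONTAINER \
    --query "[].name" --output tsv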
I'm trying to upload a sample file to Azure from my Ubuntu machine using AzCopy for Linux but I keep getting the below error no matter what permission/ownership I change to.
$ azcopy --source ../my_pub --destination https://account-name.blob.core.windows.net/mycontainer --dest-key account-key
Incomplete operation with same command line detected at the journal directory "/home/jmis/Microsoft/Azure/AzCopy", do you want to resume the operation? Choose Yes to resume, choose No to overwrite the journal to start a new operation. (Yes/No) Yes
[2017/11/18 22:06:24][ERROR] Error parsing source location "../my_pub": Failed to enumerate directory /home/jmis/my_pub/ with file pattern *. Cannot find the path '/home/jmis/my_pub/'.
I have dug all over the internet to find solutions; without any luck, I eventually ended up asking a question here.
Although AzCopy was having issues on Linux, I was able to do the above operation seamlessly with the Azure CLI. The code below, listed in the Azure docs, helped me do it:
#!/bin/bash
# A simple Azure Storage example script
export AZURE_STORAGE_ACCOUNT=<storage_account_name>
export AZURE_STORAGE_ACCESS_KEY=<storage_account_key>
export container_name=<container_name>
export blob_name=<blob_name>
export file_to_upload=<file_to_upload>
export destination_file=<destination_file>
echo "Creating the container..."
az storage container create --name $container_name
echo "Uploading the file..."
az storage blob upload --container-name $container_name --file $file_to_upload --name $blob_name
echo "Listing the blobs..."
az storage blob list --container-name $container_name --output table
echo "Downloading the file..."
az storage blob download --container-name $container_name --name $blob_name --file $destination_file --output table
echo "Done"
Going forward I will be using the cool Azure CLI, which works on Linux and is simple too.
We can use this script to upload a single file with AzCopy (Linux):
azcopy \
--source /mnt/myfiles \
--destination https://myaccount.file.core.windows.net/myfileshare/ \
--dest-key <key> \
--include abc.txt
Use --include to specify which file you want to upload. Here is an example, please check it:
root@jasonubuntu:/jason# pwd
/jason
root@jasonubuntu:/jason# ls
test1
root@jasonubuntu:/jason# azcopy --source /jason/ --destination https://jasondisk3.blob.core.windows.net/jasonvm/ --dest-key m+kQwLuQZiI3LMoMTyAI8K40gkOD+ZaT9HUL3AgVr2KpOUdqTD/AG2j+TPHBpttq5hXRmTaQ== --recursive --include test1
Finished 1 of total 1 file(s).
[2017/11/20 07:45:57] Transfer summary:
-----------------
Total files transferred: 1
Transfer successfully: 1
Transfer skipped: 0
Transfer failed: 0
Elapsed time: 00.00:00:02
root@jasonubuntu:/jason#
For more information about AzCopy on Linux, please refer to this link.