Use the AzureCLI task to delete containers - azure

I know that az storage blob delete-batch lets you delete blobs from a blob container recursively. I need to delete containers instead of single blobs. In particular, I need to delete containers older than two years. Is there any way to accomplish this?

As I see it, there are two parts to your problem:
Deleting Multiple Containers: For this you can write a script that first lists the containers using az storage container list, then loops over that list to delete each container individually using az storage container delete (a rough sketch follows after these two points).
Find Containers Older Than 2 Years: This is going to be a tricky thing because currently there's no way to find out when a blob container was created. It does have a Last Modified Date property but that gets changed every time an operation is performed on that blob container (not including the operations performed on the blobs inside that container).
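For the first part, a minimal Bash sketch could look like the following; the account name and key are placeholders you would substitute:

# List every container in the account and delete each one in turn.
ACCOUNT="<storage-account-name>"
KEY="<storage-account-key>"
for c in $(az storage container list --account-name "$ACCOUNT" --account-key "$KEY" --query "[].name" -o tsv); do
  az storage container delete --name "$c" --account-name "$ACCOUNT" --account-key "$KEY"
done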

Related

How to add filter to Container for deleting blobs except some blobs in a virtual folder?

I have a set of folders in a container named records (Azure storage account). In general, whatever blobs (folders) are present in the records container get deleted by a lifecycle management rule.
Rule: if a blob exists for more than 30 days, the rule deletes it.
But in my case, all blobs (folders) should be deleted except the one blob (folder) named Backup in the container.
Is there any way to add a rule so that a particular blob (in my case a folder) is not deleted?
In other words, the Backup folder shouldn't be deleted when the existing rule runs.
Create a lease for the particular blob using the azure portal for example. A lease prevents processes from doing anything with the blob. This includes lifecycle management rules.
You can also acquire or break a lease using the REST API or one of the many storage SDKs.
Another option would be to not use lifecycle management rules at all, but instead write a scheduled Azure Function that deletes blobs older than 30 days except the ones having Backup in their name.
Please do note: if you have enabled "Hierarchical namespace" then you have the concept of directories, but those cannot be leased. If you did not then you should realise that folders are a virtual construct and as such cannot be leased as they are actually blobs. See the docs. So in that case you have to individually take a lease on each blob or write a script that does it once.
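A lease can also be acquired from the Azure CLI. A rough sketch, where the container, blob name, and account details are placeholders:

# Acquire an infinite lease (-1) on the blob so lifecycle management
# cannot delete it; break the lease later if you want it cleaned up.
az storage blob lease acquire \
  --container-name "records" \
  --blob-name "Backup/somefile.txt" \
  --lease-duration -1 \
  --account-name "<storage-account-name>" \
  --account-key "<storage-account-key>"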

Is there a way to filter by tier in azure blob storage

I would like to list all the files stored in a particular tier. This is what I tried:
az storage fs file list \
--file-system 'cold-backup' \
--query "[?contains(properties.blobTier, 'Cold')==\`true\`].properties.blobTier"
But it doesn't work. I also tried with "blobTier" only. No luck.
This is the error I get:
Invalid jmespath query supplied for '--query': In function contains(), invalid type for value: None, expected one of: ['array', 'string'], received: "null"
The command az storage fs file list is for the ADLS Gen2 file system; there is no blobTier property in its output, so you cannot query on it. Also, the tier should be Cool instead of Cold.
If you want to list files filtered by blobTier, you can use az storage blob list; it applies to Blob storage, but it can also be used with an ADLS Gen2 file system.
Sample:
az storage blob list --account-name '<storage-account-name>' --account-key 'xxxxxx' --container-name 'cold-backup' --query "[?properties.blobTier=='Cool']"
If you want to output the blobTier, use --query "[?properties.blobTier=='Cool'].properties.blobTier" instead in the command.
The accepted answer works perfectly fine. However, if you have a lot of files, the results will be paginated: the CLI returns a NextMarker, which has to be passed to the subsequent call via the --marker parameter. With a huge number of files this will have to be scripted, using something like PowerShell or Bash (see the sketch below). Also, az storage blob list makes --container-name mandatory, which means only one container can be queried at a time.
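A rough Bash sketch of that paging loop, assuming --show-next-marker appends the continuation token as the last element of the JSON output and that jq is available; the account details are placeholders:

# Page through the container, printing the names of blobs in the Cool tier.
marker=""
while :; do
  page=$(az storage blob list \
    --account-name "<storage-account-name>" --account-key "<storage-account-key>" \
    --container-name "cold-backup" \
    --num-results 5000 --show-next-marker \
    ${marker:+--marker "$marker"} -o json)
  # Blob entries carry a "name"; the trailing element carries "nextMarker".
  echo "$page" | jq -r '.[] | select(has("name")) | select(.properties.blobTier == "Cool") | .name'
  marker=$(echo "$page" | jq -r 'last.nextMarker // empty')
  [ -z "$marker" ] && break
done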
Blob Inventory
I have a ton of files and many containers. I found an alternate method that worked best for me. Under Data management there is an option called Blob Inventory.
This will basically generate a report of all the blobs across all the containers in a storage account. The report can be customized to include the fields of your choice, for example: Name, Access Tier, Blob Type etc. There are also options to filter certain blobs (include and exclude filters).
The report will be generated in CSV or Parquet format and stored in the container of your choice at a daily or weekly frequency. The only downside is that the report can't be generated on-demand (only scheduled).
Further, if you wish to run SQL on the inventory report (CSV/Parquet file), you can simply use DBeaver.

How to copy one storage account's container's blobs to another storage account's container's blobs

I have two storage accounts (storage1 and storage2), and both of them have a container called data.
Now, storage1's data container contains a folder called database-files, which contains lots of folders recursively. I mean, it's kind of huge.
What I am trying to do is I want to copy database-files and everything that's in it from storage1's data container to storage2's data container. Note: both storage accounts are in the same resource group and subscription.
Here is what I've tried:
az storage blob copy start-batch --source-account-name "storage1" --source-container "data" --account-name "storage2" --destination-container "data"
This worked fine, but the problem is that it takes a ridiculously long time, and I can't wait that long because I want to run this command as part of one of my releases. I need it to be as fast as possible so that my deployment happens quickly.
Is there any way to make it faster? Maybe zip it, copy it, and unzip it? Even if I use AzCopy, I have no idea how it would help with timing; all it seems to help with is not having a single point of failure, and I also have no idea how to use it via the Azure CLI.
How can I proceed?
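For what it's worth, AzCopy is a standalone tool rather than an Azure CLI command. A typical server-side, container-to-container copy looks roughly like this; the URLs and SAS tokens are placeholders:

# Copy the database-files virtual folder from storage1's data container
# into storage2's data container, recursively.
azcopy copy \
  "https://storage1.blob.core.windows.net/data/database-files?<source-SAS>" \
  "https://storage2.blob.core.windows.net/data?<destination-SAS>" \
  --recursive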

Azure - is one 'block blob' seen as one file?

Question background:
This may be a simple question, but I can't find an answer to it. I've just started using Azure storage (for storing images) and want to know whether one 'blob' holds a maximum of one file.
This is my container called fmfcpics:
Within the container I have a block blob named myBlob and within this I have one image:
Through the following code, if I upload another image file to the myBlob block blob then it overwrites the image already in there:
// Gets a reference to the blob named "myblob" and uploads the local image;
// uploading to an existing blob name always replaces its previous content.
CloudBlockBlob blockBlob = container.GetBlockBlobReference("myblob");
using (var fileStream = System.IO.File.OpenRead(@"C:\Users\Me\Pictures\Image1.jpg"))
{
    blockBlob.UploadFromStream(fileStream);
}
Is this overwriting correct? Or should I be able to store multiple files at the myBlob?
Each blob is a completely separate entity, direct-addressable via uri:
http(s)://storageaccountname.blob.core.windows.net/containername/blobname
If you want to manage multiple entities (such as image JPGs in your case), you would upload each one to a separate blob name (and you're free to store as many as you want within a single container, and you may have as many containers as you want).
Note: These are block blobs. There are also page blobs that have random-access capability, and this is the basis for vhd storage (and in that case, the vhd would have a formatted file system within it, with multiple files).
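For instance, with the Azure CLI each image can be uploaded to its own blob name within the same container. A sketch; the account name and key are placeholders:

# Each upload targets a distinct blob name, so nothing gets overwritten.
az storage blob upload --container-name fmfcpics --name Image1.jpg --file Image1.jpg \
  --account-name "<storage-account-name>" --account-key "<storage-account-key>"
az storage blob upload --container-name fmfcpics --name Image2.jpg --file Image2.jpg \
  --account-name "<storage-account-name>" --account-key "<storage-account-key>"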
The Azure Blob storage documentation explains how the Blob service works and the concepts behind this storage service.
Within a few minutes you can start using the service easily.

How to clean an Azure storage Blob container?

I just want to clean (dump, zap, del .) an Azure Blob container. How can I do that?
Note: The container is used by IIS (running Webrole) logs (wad-iis-logfiles).
A one-liner using the Azure CLI 2.0:
az storage blob delete-batch --account-name <storage_account_name> --source <container_name>
Substitute <storage_account_name> and <container_name> with the appropriate values in your case.
You can see the help of the command by running:
az storage blob delete-batch -h
There is only one way to bulk delete blobs and that is by deleting the entire container. As you've said there is a delay between deleting the container and when you can use that container name again.
Your only other choice is to delete them one at a time. If you can do the deleting from the same data centre where the blobs are stored, it will be faster than running the delete locally. This probably means writing code (or you could RDP into one of your instances and install cloud explorer). If you're writing code then you can speed up the overall process by deleting the items in parallel. Something similar to this would work:
// Deletes every blob returned by the listing in parallel (classic storage SDK).
Parallel.ForEach(myCloudBlobClient.GetContainerReference(myContainerName).ListBlobs(), x => ((CloudBlob) x).Delete());
Update: Easier way to do it now (in 2018) is to use the Azure CLI. Check joanlofe's answer :)
Easiest way to do it in 2016 is using Microsoft Azure Storage Explorer IMO.
Download Azure Storage Explorer and install it
Sign in with the appropriate Microsoft Account
Browse to the container you want to empty
Click on the Select All button
Click on the Delete button
Try using the CloudBerry product for Windows Azure.
This is the link: http://www.cloudberrylab.com/free-microsoft-azure-explorer.aspx
You can search the blobs for a specific extension, select multiple blobs, and delete them.
If you mean you want to delete a container, I would suggest checking http://msdn.microsoft.com/en-us/library/windowsazure/dd179408.aspx to see if the Delete Container operation (the container and any blobs contained within it are later deleted during garbage collection) could fulfill the requirement.
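The Azure CLI equivalent would be roughly as follows; the account details are placeholders:

# Delete the container; the blobs inside are garbage-collected afterwards.
az storage container delete --name wad-iis-logfiles \
  --account-name "<storage-account-name>" --account-key "<storage-account-key>"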
If you are interested in a CLI way, then the following piece of code will help you out:
for i in `az storage blob list -c "Container-name" --account-name "Storage-account-name" --account-key "Storage-account-access-key" --output table | awk {'print $1'} | sed '1,2d' | sed '/^$/d'`; do az storage blob delete --name $i -c "Container-name" --account-name "Storage-account-name" --account-key "Storage-account-access-key" --output table; done
It first fetches the list of blobs in the container and deletes them one by one.
If you are using a Spark (HDInsight) cluster which has access to that storage account, then you can use HDFS commands on the command line:
hdfs dfs -rm -r wasbs://container_name@account_name.blob.core.windows.net/path_goes_here
The real benefit is that the cluster is unlikely to go down, and if you have screen running on it, then you won't lose your session whilst you delete away.
For this case the better option is to list the items found in the container and then delete each item from the container individually. If you delete the container itself, you may get a runtime error the next time you try to use that container name...
You can use Cloud Combine to delete all the blobs in your Azure container.
