Azure - Check if a new blob is uploaded to a container

Are there ways to check if a container in Azure has a new blob (it doesn't matter which blob it is)? LastModifiedUtc does not seem to change when a blob is dropped into the container.

You should use a BlobTrigger function in an App Service resource.
Documentation
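For reference, here's a minimal sketch of such a blob trigger (C# Azure Functions class library; the function and container names are placeholders, not from the original answer):

using System.IO;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class OnNewBlob
{
    // Fires whenever a blob is added to (or updated in) "mycontainer".
    [FunctionName("OnNewBlob")]
    public static void Run(
        [BlobTrigger("mycontainer/{name}")] Stream blob,
        string name,
        ILogger log)
    {
        log.LogInformation($"New or updated blob: {name} ({blob.Length} bytes)");
    }
}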

Windows Azure Blob Storage does not provide this functionality out of the box, so you would need to handle it on your end. A few things come to mind (just thinking out loud):
If the blobs are uploaded by your application (and not through 3rd-party tools), then after a blob is uploaded you could update the container properties (say, add or update a metadata entry with information about the last blob uploaded; see the sketch after this list). You could also make an entry in Azure Table Storage and keep updating it with information about the last blob uploaded. As noted, this method only works if all blobs are uploaded through your application.
You could periodically iterate through the blobs in the container and sort them by last modified date. This works fine for a container with a small number of blobs. If there are many blobs (say, tens of thousands), you would end up fetching a very long list, because blob storage only sorts blobs by name.
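A rough sketch of the first option, assuming uploads go through your own code (classic WindowsAzure.Storage SDK; the connection string, container, and blob names are placeholders):

using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

class TrackedUpload
{
    static void Main()
    {
        var container = CloudStorageAccount.Parse("<your-connection-string>")
                                           .CreateCloudBlobClient()
                                           .GetContainerReference("mycontainer");

        var blob = container.GetBlockBlobReference("report.csv");
        blob.UploadFromFile("report.csv");

        // Record the upload in container metadata, so "has anything new
        // arrived?" becomes a single metadata read instead of a full listing.
        container.FetchAttributes();
        container.Metadata["LastBlobUploaded"] = blob.Name;
        container.Metadata["LastBlobUploadedUtc"] = DateTime.UtcNow.ToString("o");
        container.SetMetadata();
    }
}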

Related

How can I get an alert on “aged files” in Azure Storage account containers?

I have an Azure Storage account which contains files in a blob container. I want to send an alert if a file has been sitting in one of the subfolders of a container for more than a day.
So, can anyone suggest how to get an alert on “aged files” in Azure Storage account containers?
I designed a workflow in an Azure Logic App. You can use the Azure Blob Storage connector to list the blobs in your container, then use a For each action to traverse them:
Within the For each action, you need a Condition action to compare timestamps. Here, I compare the current time minus 24 hours with the last modified time of each blob.
addHours expression
addHours(utcNow('yyyy-MM-ddTHH:mm:ss'),-24,'yyyy-MM-ddTHH:mm:ss')
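If you'd rather do the same check in code instead of a Logic App, here's a hedged sketch with the classic WindowsAzure.Storage SDK (the connection string, container name, and subfolder prefix are placeholders):

using System;
using System.Linq;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

class AgedFileCheck
{
    static void Main()
    {
        var container = CloudStorageAccount.Parse("<your-connection-string>")
                                           .CreateCloudBlobClient()
                                           .GetContainerReference("mycontainer");

        // Equivalent of the Logic App condition: current time minus 24 hours.
        var cutoff = DateTimeOffset.UtcNow.AddHours(-24);

        var aged = container.ListBlobs("subfolder/", useFlatBlobListing: true)
                            .OfType<CloudBlob>()
                            .Where(b => b.Properties.LastModified < cutoff);

        foreach (var blob in aged)
            Console.WriteLine($"Aged: {blob.Name} ({blob.Properties.LastModified})");
    }
}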

Load the files from the latest folder from Azure Blob Storage to Azure Data Warehouse

I am new to Azure. I get 150 CSV files every day through SFTP into blob storage, and they are stored in separate containers each day. The containers are numbered as 0000,00001,00002 with daily files. How do I load the files from the latest folder into Azure Data Warehouse? How do I point the copy activity at the latest folder dynamically? What is the best way to do it? Many thanks for your help.
Unfortunately there's no direct way to find the latest blob container.
Considering that a new blob container is created each day and the container names are in sequential order, the only way to find the latest container is to list all blob containers in the storage account and then either take the last container in the result set, or sort the results in descending order and take the first.
There's a Last Modified Date property on a blob container, but it changes any time the container is modified, so you can't reliably use it to find the latest container. Again, you would need to list the blob containers (you simply can't avoid this step).
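A rough sketch of that listing approach with the classic WindowsAzure.Storage desktop SDK (the connection string is a placeholder); since the container names are zero-padded and sequential, sorting them as strings works:

using System;
using System.Linq;
using Microsoft.WindowsAzure.Storage;

class LatestContainer
{
    static void Main()
    {
        var client = CloudStorageAccount.Parse("<your-connection-string>")
                                        .CreateCloudBlobClient();

        // Containers are listed by name; with zero-padded sequential names,
        // the lexicographically last one is the newest.
        var latest = client.ListContainers()
                           .OrderByDescending(c => c.Name)
                           .FirstOrDefault();

        if (latest != null)
            Console.WriteLine($"Latest container: {latest.Name}");
    }
}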

How to watch a folder of an Azure Storage blob container

We have some blob containers in Azure Storage.
I would like to have a dashboard with links to some specific folders, e.g. to see at a glance the latest files in a specific folder of the blob container.
At the moment this is only possible with several clicks, navigating down and sorting within the folder.
I already tried to create a Metrics chart on the dashboard, but it only gives me blob count and stats for the whole blob store, not for granular folders.
Any ideas how to watch specific folders immediately?
Thing is, folders don't exist in Azure Blob Storage. There are only containers and blobs inside containers. Blob names define virtual folders: tools like the Azure Portal or Azure Storage Explorer use the / separator in the blob URL to present a virtual folder structure.
So the answer is that it isn't possible, since there are no physical folders, as stated in the docs as well:
Blob storage offers three types of resources:
The storage account.
A container in the storage account.
A blob in a container.
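That said, since a “folder” is just a blob-name prefix, you can point code at a prefix directly. An illustrative sketch with the classic WindowsAzure.Storage SDK (connection string, container, and prefix are placeholders) that lists the latest files under one virtual folder:

using System;
using System.Linq;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

class WatchVirtualFolder
{
    static void Main()
    {
        var container = CloudStorageAccount.Parse("<your-connection-string>")
                                           .CreateCloudBlobClient()
                                           .GetContainerReference("mycontainer");

        // GetDirectoryReference makes no service call; it just scopes the
        // listing to blobs whose names start with this prefix.
        CloudBlobDirectory folder = container.GetDirectoryReference("reports/2020");

        var latest = folder.ListBlobs(useFlatBlobListing: true)
                           .OfType<CloudBlob>()
                           .OrderByDescending(b => b.Properties.LastModified)
                           .Take(5);

        foreach (var blob in latest)
            Console.WriteLine($"{blob.Name}  {blob.Properties.LastModified}");
    }
}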

Stop Azure blob trigger function from being triggered on existing blobs when function is published to the cloud

I have an Azure Function which is initiated on a blob trigger. Interestingly, if I publish an updated version of this Azure Function to the cloud and if there are blobs already existing, then the Azure Function will be triggered on each of those already-existing blobs.
This is not the functionality I would like. Instead, I would like a newly published Azure Function to only be triggered on newly uploaded blobs, not on blobs that already exist. How can I disable triggering on existing blobs?
How can I disable triggering on existing blobs?
There is currently no way to do this, and it's not recommended.
Internally we track which blobs we have processed by storing receipts in our control container azure-webjobs-hosts. Any blob without a receipt, or with an old receipt (based on the blob's ETag), will be processed (or reprocessed). That's why your existing blobs are being processed: they don't have receipts.
BlobTrigger is currently designed to ensure that all blobs in a container matching the path pattern are eventually processed, and reprocessed any time they are updated.
So once all the existing blobs have receipts, only newly uploaded blobs will trigger the function.
For more details, you could refer to this article.
The blob trigger function is triggered when files are uploaded to or updated in Azure Blob storage. If triggering on existing blobs were disabled, an updated blob would not be reprocessed with its latest content, which is why this isn't recommended.
Workaround:
If you still want to trigger only on newly uploaded blobs, you could add a check when the function is invoked.
List all of the existing blobs in the container, and when the trigger fires, check whether the blob's name is in that list; if it isn't, go ahead with the triggered method. A sketch of this guard follows.
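A rough sketch of that guard (C# Azure Functions; the snapshot set, names, and container are hypothetical — in practice you would persist the list of pre-existing blob names somewhere durable, such as Table Storage, at deployment time):

using System.Collections.Generic;
using System.IO;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class NewBlobsOnly
{
    // Hypothetical snapshot of the blobs that existed before deployment;
    // a real implementation would load this from durable storage.
    private static readonly HashSet<string> PreExisting = new HashSet<string>
    {
        "old-report.csv",
        "old-image.png"
    };

    [FunctionName("NewBlobsOnly")]
    public static void Run(
        [BlobTrigger("mycontainer/{name}")] Stream blob,
        string name,
        ILogger log)
    {
        if (PreExisting.Contains(name))
        {
            // Completing successfully still writes a receipt, so this blob
            // won't fire the trigger again.
            log.LogInformation($"Skipping pre-existing blob: {name}");
            return;
        }

        log.LogInformation($"Processing new blob: {name}");
        // ... actual processing ...
    }
}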

How to determine whether Azure storage is Page or Block blob type?

I'm trying to configure an online backup to an Azure Storage account. Some of the files I am backing up are larger than 200 GB, so I have to use page blob storage.
I believe that, at the moment, this is the kind of storage I have configured; however, my backup of the files larger than 200 GB fails, stating that the “block blob maximum size is 200GB.”
How can I check what kind of storage my Azure Storage account is configured as? And how can I ensure that, in the future, I am configuring the correct type of storage?
An Azure Storage account can contain Block, Append, and Page blobs in the same container. There is no configuration for this at the account or container level. The difference is that we need to use different APIs in the SDK (or different REST APIs) for the different types of blobs.
You can refer to https://msdn.microsoft.com/en-us/library/azure/dd135733.aspx for more info.
As for your requirement, for blobs that will be larger than 200 GB, you can divide them into several block blobs, and you can set a custom MIME type on the pieces to indicate that they are parts of a particular file.
If you have any further concerns, please feel free to let me know.
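For the “how can I check” part: the blob type is a property of each individual blob, so you can read it from the blob itself. A sketch with the classic WindowsAzure.Storage SDK (connection string and names are placeholders):

using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

class CheckBlobType
{
    static void Main()
    {
        var container = CloudStorageAccount.Parse("<your-connection-string>")
                                           .CreateCloudBlobClient()
                                           .GetContainerReference("mycontainer");

        CloudBlob blob = container.GetBlobReference("somefile.dat");
        blob.FetchAttributes(); // populates Properties from the service

        // BlobType will be BlockBlob, PageBlob, or AppendBlob.
        Console.WriteLine($"{blob.Name}: {blob.Properties.BlobType}");
    }
}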
It depends on how you upload the files to Azure Storage: you specify what type of blob you want to create, either a Page, Block, or Append blob.
Ex:
// Getting a page-blob reference (rather than a block-blob one) determines the type.
CloudPageBlob blob = container.GetPageBlobReference("file name");
blob.Properties.ContentType = "binary/octet-stream";
// Page blobs are created with a fixed size, which must be a multiple of 512 bytes.
blob.Create(size);
Then you divide your stream into pages, iterate over them, and upload each page to the blob, for example as in the sketch below.
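A hedged sketch of that upload loop with the classic WindowsAzure.Storage SDK (connection string and file names are placeholders; page blobs require 512-byte alignment, which the sketch handles by zero-padding the final chunk):

using System;
using System.IO;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

class PageUpload
{
    static void Main()
    {
        var container = CloudStorageAccount.Parse("<your-connection-string>")
                                           .CreateCloudBlobClient()
                                           .GetContainerReference("mycontainer");

        CloudPageBlob blob = container.GetPageBlobReference("file name");

        using (var file = File.OpenRead("local-file.bin"))
        {
            // A page blob's size is fixed at creation and must be a multiple
            // of 512 bytes, so round the file length up.
            long size = (file.Length + 511) / 512 * 512;
            blob.Create(size);

            byte[] buffer = new byte[4 * 1024 * 1024]; // 4 MB, a multiple of 512
            long offset = 0;
            int read;
            while ((read = Fill(file, buffer)) > 0)
            {
                // Each write must also be 512-byte aligned; zero-pad the tail.
                int aligned = (read + 511) / 512 * 512;
                Array.Clear(buffer, read, aligned - read);
                using (var chunk = new MemoryStream(buffer, 0, aligned))
                    blob.WritePages(chunk, offset);
                offset += aligned;
            }
        }
    }

    // Read until the buffer is full or the stream ends, so only the final
    // chunk can be short.
    static int Fill(Stream s, byte[] buf)
    {
        int total = 0, n;
        while (total < buf.Length && (n = s.Read(buf, total, buf.Length - total)) > 0)
            total += n;
        return total;
    }
}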
