Azure blob trigger fired once for multiple file uploads in Azure Blob Storage

I need help with the following scenario.
I have set up an Azure Blob trigger on one of my Azure Blob Storage accounts.
Now suppose I upload 3 files from Media Shuttle to that blob container. The blob trigger function will be called for each file, but I want to run some logic only when the function is called for the last (3rd) file. For the 1st and 2nd files the trigger function should still be called, but it should not execute the logic that is supposed to run for the last file.
So basically we need to maintain a count somewhere (for the total upload) and a count of how many times the blob trigger function has been called, compare the two, and run the final logic when the condition is satisfied, but I am not sure how to do that.
I'm using .NET Core to write the Azure Blob trigger function.
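One way to approach the "run logic only on the last file" requirement is a shared counter compared against the expected batch size. Below is a minimal, language-agnostic sketch in Python (the question uses .NET Core; the idea translates directly). The `UploadTracker` class and its names are hypothetical; in a real Function the counter must live in shared storage (e.g. a Table Storage entity or a small state blob) and be incremented with optimistic concurrency (ETag checks), because concurrent trigger invocations can race on an in-memory counter.

```python
# Hypothetical sketch: decide whether to run the "final" batch logic once
# all expected files have been processed. In a real Azure Function this
# counter would live in shared storage (Table Storage / a state blob)
# and be updated with optimistic concurrency, not kept in process memory.

class UploadTracker:
    def __init__(self, expected_total):
        self.expected_total = expected_total  # known number of files in the batch
        self.processed = 0                    # how many times the trigger has fired

    def record_blob_processed(self):
        """Increment the counter; return True only when this was the last file."""
        self.processed += 1
        return self.processed == self.expected_total

tracker = UploadTracker(expected_total=3)
results = [tracker.record_blob_processed() for _ in range(3)]
print(results)  # [False, False, True] - only the 3rd call completes the batch
```

The trigger function would call `record_blob_processed()` on every invocation and run the final logic only when it returns `True`.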

Related

How to invoke a POST web api when a file arrives in azure blob storage, and post the file to the api?

I have video files arriving in Azure Blob Storage.
When a file arrives, I want to invoke a REST API and pass the file to the API's POST method.
What is the Azure Blob Storage setting wherein I can configure the API call trigger (when a new file arrives) and specify that the file needs to be POSTed?
There are many ways to accomplish this.
One possible solution is to make use of Blob Triggered Azure Function. The Function will be triggered when a blob is created. In your Function code, you can invoke your API and post the blob data.
Another option would be to use Azure Logic Apps, where you can define a workflow that gets invoked when a blob is created.
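To make the Function option concrete, here is a hedged sketch of the HTTP side: building a POST request that ships the blob's bytes to a REST API. The URL, the `X-Blob-Name` header, and the helper name are all placeholders, not part of any Azure API; a real Function would receive the blob bytes from the trigger binding and then send this request.

```python
import urllib.request

def build_blob_post(api_url, blob_name, blob_bytes):
    """Prepare a POST request carrying a blob's raw bytes to a REST API.
    api_url and the X-Blob-Name header are illustrative placeholders."""
    return urllib.request.Request(
        api_url,
        data=blob_bytes,
        method="POST",
        headers={
            "Content-Type": "application/octet-stream",
            "X-Blob-Name": blob_name,  # hypothetical header carrying the file name
        },
    )

req = build_blob_post("https://example.com/api/videos", "clip.mp4", b"...")
print(req.get_method())  # POST
```

Sending it is then one call (`urllib.request.urlopen(req)`), which the Function would make after reading the blob content from the trigger binding.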

How to get an excel file from web and store it in an azure blob storage

I have an ADF pipeline that processes an Excel file in Azure Blob Storage. The Excel file is actually downloaded from
here and then manually uploaded to the Azure Blob Storage.
I want to automate this process of downloading the Excel file from the link and then loading it into Azure Blob Storage. Is there any way to do it using ADF or any other Azure service?
The non-code option that comes to mind is Logic apps.
Your Logic App will look like this: after the trigger, you will need an HTTP action followed by a Create blob action to copy that content into your storage account.
Your Create blob step will look like this: the blob content will be the response body of the previous HTTP request.
You can have this scheduled at a regular interval.
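If a small script (for example, run on a schedule or from an ADF custom activity) is acceptable instead of Logic Apps, the "HTTP action" half can be sketched as below. The upload half would use the `azure-storage-blob` SDK with real credentials, so it is omitted; the `data:` URL in the demo exists only so the sketch runs without network access.

```python
import urllib.request

def fetch_file_bytes(url):
    """Download a file and return its raw bytes. In the real workflow these
    bytes would become the blob content of the Create blob step (or be
    uploaded with the azure-storage-blob SDK)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()

# data: URL used purely so the sketch runs offline; in practice this would
# be the Excel file's download link.
content = fetch_file_bytes("data:text/plain,hello")
print(content)  # b'hello'
```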

In data factory does Azure Event trigger wait till full file is copied?

Let us say that I am copying a 10 GB file to an ADLS location and this location is being monitored by an Azure event trigger. Will the event trigger wait for the full 10 GB file to be copied, or will it trigger the pipeline as soon as the file starts copying? If the pipeline gets kicked off as soon as the file starts to copy, how can we delay it so that the pipeline waits until the full file is copied?
Based on my knowledge, the ADF pipeline is triggered once the entire file is uploaded when using an event trigger.
Based on the documentation: https://learn.microsoft.com/en-us/azure/data-factory/how-to-create-event-trigger
Once the file is created (i.e., fully uploaded), the trigger fires.
According to the docs, it depends on which API was used to copy the file.
If it's the Blob REST APIs:
"In that case, the Microsoft.Storage.BlobCreated event is triggered when the CopyBlob operation is initiated and not when the Block Blob is completely committed."
If it's the Azure Data Lake Storage Gen2 REST APIs, the event is triggered only
"when clients use the CreateFile and FlushWithClose operations that are available in the Azure Data Lake Storage Gen2 REST API."
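If the event consumer receives BlobCreated events for operations that do not mean "file fully written", one option is to inspect the `data.api` field of the Event Grid payload and act only on completing operations. The sketch below assumes the Event Grid `Microsoft.Storage.BlobCreated` schema with its `data.api` field; the exact set of "completed" operation names (here `FlushWithClose`, `PutBlob`, `PutBlockList`) should be checked against the Event Grid documentation for your scenario.

```python
# Sketch: act only on BlobCreated events that indicate a fully written file.
# Payload shape mirrors an Event Grid Microsoft.Storage.BlobCreated event;
# the set of "completed" api values is an assumption to verify against docs.

COMPLETED_OPS = {"FlushWithClose", "PutBlob", "PutBlockList"}

def is_fully_committed(event):
    return (event.get("eventType") == "Microsoft.Storage.BlobCreated"
            and event.get("data", {}).get("api") in COMPLETED_OPS)

events = [
    {"eventType": "Microsoft.Storage.BlobCreated", "data": {"api": "CreateFile"}},
    {"eventType": "Microsoft.Storage.BlobCreated", "data": {"api": "FlushWithClose"}},
]
flags = [is_fully_committed(e) for e in events]
print(flags)  # [False, True] - only FlushWithClose marks the file as committed
```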

Trigger Azure data factory pipeline - Blob upload ADLS Gen2 (programmatically)

We are uploading files into Azure Data Lake Storage using the Azure SDK for Java. After a file is uploaded, an Azure Data Factory pipeline needs to be triggered; a BLOB CREATED trigger is attached to the pipeline.
The main problem is that after each file upload, the pipeline gets triggered twice.
To upload a file into ADLS Gen2, Azure provides a different SDK than for conventional Blob Storage.
The SDK uses the package azure-storage-file-datalake:
DataLakeFileSystemClient - to get the container
DataLakeDirectoryClient.createFile - to create a file //this call may be raising a blob created event
DataLakeFileClient.uploadFromFile - to upload the file //this call may also be raising a blob created event
I think the ADF trigger has not been upgraded to capture blob created events appropriately from ADLS Gen2.
Is there any option to achieve this? There are restrictions in my org on using Azure Functions; otherwise an Azure Function could be triggered by a Storage Queue or Service Bus message, and the ADF pipeline could be started using the Data Factory REST API.
You could try Azure Logic Apps with a blob trigger and a Data Factory action.
Trigger: When a blob is added or modified (properties only):
This operation triggers a flow when one or more blobs are added or modified in a container. This trigger will only fetch the file metadata. To get the file content, you can use the "Get file content" operation. The trigger does not fire if a file is added/updated in a subfolder. If it is required to trigger on subfolders, multiple triggers should be created.
Action: Get a pipeline run
Get a particular pipeline run execution
Hope this helps.

Stop Azure blob trigger function from being triggered on existing blobs when function is published to the cloud

I have an Azure Function which is initiated on a blob trigger. Interestingly, if I publish an updated version of this Azure Function to the cloud and if there are blobs already existing, then the Azure Function will be triggered on each of those already-existing blobs.
This is not the functionality I would like. Instead, I would like a newly published Azure Function to only be triggered on newly uploaded blobs, not on blobs that already exist. How can I disable triggering on existing blobs?
How can I disable triggering on existing blobs?
There is currently no way to do this, and it is not recommended.
Internally we track which blobs we have processed by storing receipts in our control container azure-webjobs-hosts. Any blob not having a receipt, or an old receipt (based on blob ETag) will be processed (or reprocessed). That's why your existing blobs are being processed, they don't have receipts.
BlobTrigger is currently designed to ensure that all blobs in a container matching the path pattern are eventually processed, and reprocessed any time they are updated.
So once all the existing blobs have receipts, an upload will trigger the function only for the newly uploaded blob.
For more details, you could refer to this article.
The blob trigger function fires when files are uploaded to or updated in Azure Blob Storage. If triggering on existing blobs were disabled, an updated blob would never be reprocessed with its latest content, which is why this is not recommended.
Workaround:
If you still want to act only on newly uploaded blobs, you could add a check when the function is invoked:
list all of the existing blobs in the container, and when a trigger invocation fires, check whether the blob name is in that list; if it is not, proceed with the triggered logic.
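The workaround above can be sketched as a simple membership check. The names below are hypothetical; in a real deployment `existing_blobs` would be built once, at publish time, from the storage SDK's list-blobs call, and persisted somewhere the Function can read it.

```python
# Sketch of the workaround: snapshot the names of blobs that already exist
# when the updated Function is deployed, then have the trigger skip any
# blob in that snapshot. In practice 'existing_blobs' comes from a
# list-blobs call made once at deploy time.

existing_blobs = {"old-report.csv", "archive.zip"}  # listed at deploy time

def should_process(blob_name, preexisting=existing_blobs):
    """Return True only for blobs uploaded after the snapshot was taken."""
    return blob_name not in preexisting

print(should_process("old-report.csv"))   # False - existed before deployment
print(should_process("new-upload.csv"))   # True  - newly uploaded
```

Note the caveat from the answer still applies: with this check, updates to pre-existing blobs are also skipped, so the function never sees their latest content.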
