Triggering a pipeline using an Azure Function when data is uploaded - Azure

I was looking for other questions that could be similar to mine, but wasn't able to find anything like this.
Question: Pictures are uploaded to a platform, which I can access using an API, sometimes 3 times a day, sometimes once a week. Instead of running a scheduled pipeline, we want to trigger the pipeline when new data is uploaded to the platform. At the moment I'm using a timer function (every 5 minutes), but I cannot find out how to trigger a specific pipeline. So how can I do this?
Good to know: the pipeline starts a job in Azure ML on a compute cluster.
Does anyone know the solution, or where I can find more information about this?

If your Azure ML job (or pipeline) can send the picture to Azure Blob Storage, then your Azure Function can be triggered whenever a new image is uploaded. There are tutorials on how to do this in both C# and JavaScript.
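For illustration, a minimal Python sketch of that pattern, assuming the v1 Azure Functions programming model (a blob trigger declared in function.json) and an already-published Azure ML pipeline; the workspace details, service principal values, and pipeline ID below are placeholders, not anything from the question:

```python
# __init__.py -- blob-triggered function that submits a published Azure ML pipeline.
# Assumes a function.json with a blobTrigger binding on the container the platform
# writes to, plus the azureml-core and azureml-pipeline-core packages.
import logging

import azure.functions as func
from azureml.core import Workspace
from azureml.core.authentication import ServicePrincipalAuthentication
from azureml.pipeline.core import PublishedPipeline

# Placeholder identifiers -- replace with your own workspace and pipeline.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
WORKSPACE_NAME = "<workspace-name>"
PUBLISHED_PIPELINE_ID = "<published-pipeline-id>"


def main(myblob: func.InputStream) -> None:
    logging.info("New blob detected: %s", myblob.name)

    # Interactive login is not possible inside a Function App, so authenticate
    # with a service principal that has access to the workspace.
    auth = ServicePrincipalAuthentication(
        tenant_id="<tenant-id>",
        service_principal_id="<client-id>",
        service_principal_password="<client-secret>",
    )
    ws = Workspace.get(
        name=WORKSPACE_NAME,
        auth=auth,
        subscription_id=SUBSCRIPTION_ID,
        resource_group=RESOURCE_GROUP,
    )

    # Submit the published pipeline; it runs on the compute cluster that was
    # configured when the pipeline was published.
    pipeline = PublishedPipeline.get(workspace=ws, id=PUBLISHED_PIPELINE_ID)
    run = pipeline.submit(ws, experiment_name="blob-triggered-run")
    logging.info("Submitted pipeline run %s", run.id)
```

This replaces the 5-minute timer function entirely: the function only runs when a new picture lands in the container, and each run submits one pipeline job.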

Related

Azure Pipeline storage file trigger doesn't fire

I've created an Azure Synapse Analytics pipeline that must be triggered by the creation of a file within an Azure Data Lake Storage Gen2 account.
Somehow the blob creation event (i.e., when I upload the file to the corresponding container and folder) doesn't fire anything and the pipeline does not start. I've registered the Microsoft.EventGrid and Microsoft.Synapse resource providers in the subscription, as suggested by the official Microsoft documentation.
Am I missing anything? As far as I know, and according to the Microsoft documentation and the many tutorials I've read, I don't need any Event Grid topic/event subscription...
Can you please check the content type of the file?
Usually, when that is blank, the event trigger is not initiated.
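For reference, a small Python sketch of uploading a file with an explicit content type using the azure-storage-blob package, so the blob's ContentType property is not left blank; the connection string, container, and blob names are placeholders:

```python
# Upload a file with an explicit content type so the blob's ContentType
# property is not blank. Connection string and names are placeholders.
from azure.storage.blob import BlobServiceClient, ContentSettings

service = BlobServiceClient.from_connection_string("<storage-connection-string>")
blob = service.get_blob_client(container="landing", blob="input/data.csv")

with open("data.csv", "rb") as fh:
    blob.upload_blob(
        fh,
        overwrite=True,
        content_settings=ContentSettings(content_type="text/csv"),
    )
```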
I tried to reproduce your scenario in my environment, and it works for me (i.e., the trigger fires when I upload the file to the corresponding container and folder). Let me share my implementation so you can compare it with yours.
This is the setup for the trigger: [screenshot]
The trigger is firing as expected: [screenshots of the file upload timestamps and the trigger firing timestamps]
I still didn't figure out what is not working, so I implemented a workaround: a simple ADF pipeline looping over the files in the landing zone. The pipeline is associated with a normal schedule trigger (it runs 3 times a day) and in turn calls the pipeline I originally wanted to be triggered by the file creation event.

Azure LogicApp for migration of millions of files

I have the following requirements, where I consider using Azure LogicApp:
Files placed in Azure Blob Storage must be migrated to a custom destination (which can differ from case to case)
The number of files is around 1,000,000
When the process is over, we should have a report saying how many records (files) failed
If the process stopped somewhere in the middle, the next run must take only files that have not been migrated
The process must be as fast as it can be, and files must be migrated within N hours
But what makes me worried is the fact that I cannot find any examples or articles (including the official Azure documentation) where the same thing is achieved with Azure Logic Apps.
I have some ideas about my requirements and Azure Logic App:
I think that I must use pagination to deal with this number of files, because an Azure Logic App will not be able to read millions of file names - https://learn.microsoft.com/en-us/azure/logic-apps/logic-apps-exceed-default-page-size-with-pagination
I can add a record into Azure Table Storage to track failed migrations (something like creating a record to say that the process started and updating it when the file is moved to the destination)
I have no idea how I can restart the Azure Logic App without using a custom tracking mechanism (for instance, it could be the same Azure Table Storage instance)
And the question about splitting the work across several units is still open
Do you think that Azure Logic Apps is the right choice for my needs, or should I consider something else? If an Azure Logic App can work for me, could you please share your thoughts and ideas on how I can achieve the given requirements?
I don't think a Logic App is a good solution for implementing this requirement, because the number of files is about 1,000,000, which is far too many. For this requirement, I suggest you use Azure Data Factory.
To migrate data in Azure Blob Storage with Data Factory, you can refer to this document.
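If it helps, here is a rough Python sketch of the tracking idea from the question (server-side copies plus a status record per file in Azure Table Storage, so a rerun only picks up files that have not completed). The connection strings, container names, and table name are placeholders:

```python
# Rough sketch: copy blobs between containers and track per-file status in
# Azure Table Storage, so a rerun only picks up files that have not finished.
# Connection strings, container names, and the table name are placeholders.
from azure.core.exceptions import ResourceNotFoundError
from azure.data.tables import TableServiceClient
from azure.storage.blob import ContainerClient

SOURCE_CONN = "<source-connection-string>"
DEST_CONN = "<destination-connection-string>"

source = ContainerClient.from_connection_string(SOURCE_CONN, "source-container")
dest = ContainerClient.from_connection_string(DEST_CONN, "dest-container")
tracking = TableServiceClient.from_connection_string(
    SOURCE_CONN
).create_table_if_not_exists(table_name="migrationstatus")

migrated = failed = 0

# list_blobs() pages through results lazily, which covers the pagination concern.
for props in source.list_blobs():
    row_key = props.name.replace("/", "|")  # '/' is not allowed in a RowKey

    # Skip files that a previous run already completed.
    try:
        if tracking.get_entity("migration", row_key)["Status"] == "done":
            continue
    except ResourceNotFoundError:
        pass

    entity = {"PartitionKey": "migration", "RowKey": row_key, "Status": "started"}
    tracking.upsert_entity(entity)
    try:
        # Server-side copy; across accounts the source URL must be readable
        # (e.g. append a SAS token to src.url).
        src = source.get_blob_client(props.name)
        dest.get_blob_client(props.name).start_copy_from_url(src.url)
        entity["Status"] = "done"
        migrated += 1
    except Exception as exc:
        entity["Status"] = f"failed: {exc}"
        failed += 1
    tracking.upsert_entity(entity)

# The final report asked for in the requirements.
print(f"migrated={migrated} failed={failed}")
```

At a million files, a Data Factory copy activity (or AzCopy) will be far faster than this one-file-at-a-time loop; the sketch is only meant to show the restart and reporting bookkeeping.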

Set schedule for Azure Logic App to run once a week

I have an Azure Logic App that dynamically gets blob contents from my Azure storage account and sends an email with the attachment. I want to set a schedule for my Logic App to run once a week.
Any idea how I can achieve this?
Here's my current workflow:
It depends on what you are trying to do. If you want to get an email every time your blob is updated, your current Logic App is the way to go. If you change the trigger to a Recurrence trigger, as Rob Ert stated, then you could potentially lose updates (the blob could have many updates in a week). If you don't care about the individual updates, then Recurrence is the proper trigger.
I think you are looking for the Recurrence trigger.
It works much like the timer triggers available in regular Azure Functions.
Here are instructions on how to create one in your Logic App:
https://learn.microsoft.com/en-us/azure/connectors/connectors-native-recurrence
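Since the comparison to Azure Functions timer triggers came up: for what it's worth, the equivalent weekly schedule there is just a CRON expression on the binding. A minimal Python sketch, where the schedule (Mondays at 09:00 UTC) and function body are only example values; in a Logic App you configure the Recurrence trigger in the designer instead:

```python
# Azure Functions equivalent of a weekly recurrence: a timer trigger whose
# schedule in function.json is an NCRONTAB expression, e.g. "0 0 9 * * Mon"
# (Mondays at 09:00 UTC).
import datetime
import logging

import azure.functions as func


def main(weeklytimer: func.TimerRequest) -> None:
    if weeklytimer.past_due:
        logging.info("Timer is running late.")
    logging.info("Weekly run started at %s", datetime.datetime.utcnow().isoformat())
    # ...fetch the blob contents and send the email here...
```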

Monitor the amount of blobs entering into an Azure container

Basically, I have a storage account with containers that contain blobs of unhandled errors. My task is to somehow generate a metric that shows how many blobs were uploaded to that container every hour. I tried using the built-in Azure metrics, but it seems that might limit me to the entire storage account rather than just one container. I did some research on Power BI and thought that might be a good place to start, but again I came up empty.
If anyone has a good starting place for me, that would be incredible. I'm assuming this will end up being something that requires some SQL queries, or perhaps something I can do programmatically in Visual Studio. Apologies if this was posted in the wrong place, but it seemed like the best fit in my opinion.
Thanks!
You should take a look at Azure Event Grid with its Blob Storage integration. In short, whenever a blob is created, Azure Event Grid raises an event. You can consume this event and post the event data to an HTTP endpoint (or call an Azure Function), which can save the information about the event in some persistent storage (Azure Tables, for example). You can then create reports by querying this data.
For more information about this, you may find this link helpful: https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blob-event-overview.
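A small Python sketch of that idea, assuming an Event Grid subscription on the container's BlobCreated events that invokes an Azure Function, which writes one row per blob into Azure Table Storage partitioned by hour; the table name and the STORAGE_CONNECTION app setting are placeholders:

```python
# Event Grid-triggered function: record each BlobCreated event in Azure Table
# Storage, partitioned by hour, so "blobs per hour" becomes a partition count.
# The connection string app setting and table name are placeholders.
import datetime
import os

import azure.functions as func
from azure.data.tables import TableServiceClient


def main(event: func.EventGridEvent) -> None:
    data = event.get_json()  # BlobCreated payload: url, contentType, contentLength, ...
    now = datetime.datetime.now(datetime.timezone.utc)

    table = TableServiceClient.from_connection_string(
        os.environ["STORAGE_CONNECTION"]
    ).create_table_if_not_exists(table_name="blobuploads")

    table.upsert_entity({
        "PartitionKey": now.strftime("%Y-%m-%d-%H"),  # hour bucket
        "RowKey": event.id,                           # event id is unique
        "Subject": event.subject,                     # includes container and blob path
        "Url": data.get("url", ""),
    })
```

Counting the rows in an hour's partition (a query filtered on PartitionKey) then gives the per-hour figure for just that one container, which the built-in storage metrics don't provide.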

Azure Data factory pipeline: How to display Custom Activity Progress in azure portal

I have a time-consuming custom activity running in an Azure Data Factory pipeline.
It copies files from Blob storage to an FTP server recursively.
The entire activity takes 3-4 hours, depending on the number of files in the folder.
But when I run the pipeline, it shows the progress as 0%.
How can I update the pipeline progress from the custom activity?
In short, I doubt you will be able to. The services are very disconnected from each other.
You might be better off writing out to the generic Azure activity log and monitoring progress directly from the custom activity method. This is an assumption, though.
Hope this helps.
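In that spirit, one option is to have the custom activity itself write periodic progress lines; anything printed to stdout ends up in the activity's logs in the linked storage account, and could equally be pushed to Application Insights. A rough Python sketch of the blob-to-FTP copy loop with progress logging, where every connection string, host, and path is a placeholder:

```python
# Rough sketch: copy blobs to an FTP server and log progress periodically.
# In an ADF custom activity, stdout lands in the activity logs in the linked
# storage account, which is where you would monitor this output.
# Connection strings, host names, and paths are placeholders.
import ftplib
import io

from azure.storage.blob import ContainerClient

source = ContainerClient.from_connection_string(
    "<storage-connection-string>", "source-container"
)
blobs = list(source.list_blobs(name_starts_with="export/"))
total = len(blobs)

ftp = ftplib.FTP("<ftp-host>")
ftp.login("<user>", "<password>")

for i, props in enumerate(blobs, start=1):
    payload = source.download_blob(props.name).readall()
    ftp.storbinary(f"STOR {props.name.split('/')[-1]}", io.BytesIO(payload))
    if i % 100 == 0 or i == total:
        # Progress line; shows up in the custom activity's stdout log.
        print(f"progress: {i}/{total} files copied ({i * 100 // total}%)", flush=True)

ftp.quit()
```

This doesn't move the pipeline's own progress indicator off 0%, but it gives you a number you can actually watch during the 3-4 hour run.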
