I am working on a project where I need to consume the entries of a storage queue from a Data Factory pipeline.
Files are uploaded to blob storage, which triggers an Azure Function. This Azure Function writes entries into a storage queue, and now I want to consume those entries. Since the storage queue offers a REST API for reading messages, I could call it from a web client in Azure Data Factory scheduled every few minutes. But I would prefer a more direct way, so that my pipeline starts as soon as the storage queue has been filled.
I am quite new to the Azure world, so I am searching for a solution. Is there a way to subscribe to the storage queue? I can see that it is possible to create custom triggers in Data Factory; how would I connect one to a storage queue? Or is there another way?
Thank you @Scott Mildenberger for pointing me in the right direction. After taking your inputs and reproducing this on my end, it worked when I used the queue trigger called When a specified number of messages are in a given queue (V2), where we can specify the threshold of queue messages at which the flow gets triggered. Below is the flow of my Logic App.
RESULTS: screenshots of the run in the Logic App and in ADF.
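For reference, the same threshold check can also be done in code instead of the Logic App designer. Below is a minimal sketch (the queue name adf-input, the pipeline name ProcessQueuePipeline, the resource names and the azure-storage-queue/azure-identity packages are all assumptions, not part of the original setup) that reads the queue's approximate message count and starts the ADF pipeline once the threshold is reached:

```python
import os

import requests
from azure.identity import DefaultAzureCredential
from azure.storage.queue import QueueClient

THRESHOLD = 10  # same idea as the Logic App trigger's "number of messages"

# Read the approximate message count of the queue the Azure Function writes into.
queue = QueueClient.from_connection_string(os.environ["STORAGE_CONNECTION"], "adf-input")
count = queue.get_queue_properties().approximate_message_count

if count >= THRESHOLD:
    # Start the ADF pipeline through the management REST API (createRun).
    token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
    url = (
        "https://management.azure.com/subscriptions/{sub}/resourceGroups/{rg}"
        "/providers/Microsoft.DataFactory/factories/{factory}"
        "/pipelines/ProcessQueuePipeline/createRun?api-version=2018-06-01"
    ).format(sub=os.environ["SUBSCRIPTION_ID"], rg="my-rg", factory="my-adf")
    requests.post(url, headers={"Authorization": f"Bearer {token}"}, json={})
```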
Related
I have an Azure storage container where I will be getting many files on a daily basis. My requirement is a trigger in Azure Data Factory or Databricks each time 500 new files have arrived, so I can process them.
In Data Factory we have an event trigger which fires for each new file (with filename and path), but is it possible to get multiple new files and their details at the same time?
Which Azure services can I use for this scenario: Event Hubs, Azure Functions, queues?
One of the characteristics of a serverless architecture is to execute something whenever a new event occurs. Based on that, you can't use those services alone.
Here's what I would do:
#1 Azure Functions with Blob Trigger, to execute whenever a new file arrives. This would not start the processing of the file, but just 'increment' the count of files that would be stored on Cosmos DB.
#2 Azure Cosmos DB also offers a Change Feed, which works like event sourcing: it emits an event whenever something changes in a collection. Since the document created/modified in #1 holds the count, you can use another Azure Function (#3) to consume the change feed.
#3 This function will just contain an if statement which will "monitor" the current count, and if it's above the threshold, start the processing.
After that, you just need to update the document and reset the count.
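A rough sketch of how #1 and #3 could look, assuming Python Azure Functions (v2 programming model) and the azure-cosmos SDK; the database/container names, the partition key of /id, the threshold of 500 and the start_processing hand-off are all placeholders:

```python
import os

import azure.functions as func
from azure.cosmos import CosmosClient, exceptions

app = func.FunctionApp()

THRESHOLD = 500  # illustrative; match it to your batch size

# Counter document lives in Cosmos DB (database/container names are placeholders,
# partition key assumed to be /id).
cosmos = CosmosClient.from_connection_string(os.environ["COSMOS_CONNECTION"])
counters = cosmos.get_database_client("filetracking").get_container_client("counters")


# 1 - Blob trigger: do not process the file, only bump the counter document.
@app.blob_trigger(arg_name="blob", path="uploads/{name}", connection="AzureWebJobsStorage")
def count_new_blob(blob: func.InputStream):
    try:
        doc = counters.read_item(item="counter", partition_key="counter")
    except exceptions.CosmosResourceNotFoundError:
        doc = {"id": "counter", "count": 0, "files": []}
    doc["count"] = doc.get("count", 0) + 1
    doc.setdefault("files", []).append(blob.name)
    # NOTE: read-then-upsert is racy under heavy parallel uploads; a partial-document
    # patch (increment) or etag-based optimistic concurrency is safer in practice.
    counters.upsert_item(doc)


# 2/3 - Change feed trigger: fire the real processing once the threshold is reached,
#       then reset the count. Decorator parameter names vary slightly between
#       Cosmos DB extension versions (e.g. container_name vs collection_name).
@app.cosmos_db_trigger(arg_name="documents",
                       database_name="filetracking",
                       container_name="counters",
                       connection="COSMOS_CONNECTION",
                       lease_container_name="leases",
                       create_lease_container_if_not_exists=True)
def on_counter_change(documents: func.DocumentList):
    for doc in documents:
        if doc.get("count", 0) >= THRESHOLD:
            start_processing(list(doc.get("files", [])))
            doc["count"] = 0
            doc["files"] = []
            counters.upsert_item(dict(doc))


def start_processing(files):
    # Placeholder: kick off ADF / Databricks here (e.g. a pipeline createRun call).
    pass
```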
I'm a newbie to .NET and Azure. I'm trying to design a pipeline to process files created in an Azure container. I have created an Event Grid blob trigger on the container to get the metadata of the created files. I have two options now:
Use an Azure Function to consume the metadata from Event Grid and process the file. I believe the function can scale out based on the Event Grid traffic, but the files can be large, up to 60 GB in size, and I have read that Azure Functions are not ideal for long-running processing. Does an Azure Function work for my case?
Use a storage queue to consume the metadata from Event Grid, and create an application that consumes from the queue and processes the files.
Please suggest what kind of application I could develop and deploy so that it scales out/in based on the queue traffic and processes large blobs efficiently.
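To make option 2 concrete, here is a rough sketch of the kind of consumer I have in mind (the queue name blob-events, the credentials and the hosting choice are assumptions on my part); it streams each blob in chunks rather than loading 60 GB into memory:

```python
import base64
import json
import os

from azure.storage.blob import BlobClient
from azure.storage.queue import QueueClient

# Queue that Event Grid forwards BlobCreated events to (name is a placeholder).
queue = QueueClient.from_connection_string(os.environ["STORAGE_CONNECTION"], "blob-events")

# Long visibility timeout so a message is not redelivered while a large blob is still being processed.
for msg in queue.receive_messages(visibility_timeout=3600):
    try:
        event = json.loads(msg.content)
    except ValueError:
        # Depending on configuration, Event Grid may deliver the event base64-encoded.
        event = json.loads(base64.b64decode(msg.content))

    blob_url = event["data"]["url"]
    blob = BlobClient.from_blob_url(blob_url, credential=os.environ["STORAGE_KEY"])

    # Stream the blob in chunks instead of loading it all into memory.
    for chunk in blob.download_blob().chunks():
        pass  # ... process the chunk ...

    queue.delete_message(msg)  # delete only after successful processing
```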
What are possible ways to implement such a scenario?
I can think of an Azure Function that periodically checks the share for new files. Are there any other possibilities?
I have also been thinking about duplicating the files to Blob storage and generating the notifications from there.
A storage content trigger is available by default for blobs, so if you are considering migrating to Blob storage, you can use a BlobTrigger Azure Function. For triggering on new files in a File Share, below are my suggestions, as requested:
A TimerTrigger Azure Function that polls for files added since the previous run (a minimal polling sketch follows below).
A Recurrence trigger in a Logic App to poll and check for new content.
A continuous WebJob that keeps polling the File Share for new content.
In my opinion, duplicating the files to Blob storage just to make your notifications work is not a great option, because the duplication itself again requires a polling mechanism like the ones mentioned above, so it only adds an unnecessary step.
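If you go with the polling approach, here is a minimal sketch of the idea (the share name, the flat directory layout and the place where you persist the "already seen" state are placeholders):

```python
import os

from azure.storage.fileshare import ShareClient

# Connect to the file share (share name is a placeholder).
share = ShareClient.from_connection_string(os.environ["STORAGE_CONNECTION"], share_name="incoming")

# Persist this somewhere durable (a blob, a table, ...); kept in memory here only for brevity.
seen = set()


def poll_once():
    """Compare the current listing against previously seen names and hand off new files."""
    listing = share.list_directories_and_files()
    current = {item["name"] for item in listing if not item["is_directory"]}
    for name in sorted(current - seen):
        handle_new_file(name)
        seen.add(name)


def handle_new_file(name):
    # Placeholder: enqueue a message, start a pipeline, etc.
    print(f"new file detected: {name}")
```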
Currently, I have an Azure Function App with a timer trigger that runs every hour, pulls data from Azure Table storage, and updates an NSG. I only did it this way because Function Apps currently DON'T support Azure Table triggers; however, Function Apps DO support Azure Queue triggers.
With that said, I'd like a message to be sent to the queue every time my Azure Table is updated. That way, the Azure Table updates can be processed immediately instead of once an hour. I haven't figured out how to send messages to an Azure Queue from Azure Tables, though.
Any help?
There is no change feed, update trigger, etc. on Azure Table storage. You could achieve this by switching to the Table API on Cosmos DB, which does have a Change Feed.
We have a set of blob-triggered functions and we are planning to use a new AzureWebJobsStorage account for these Azure Functions. My question: since the new storage account has no record of the already processed files, will the blobs be reprocessed? If yes, can we avoid this reprocessing, and how?
I think you're talking about the blob receipts feature.
When you use a new AzureWebJobsStorage account for the Azure Function, it will definitely re-process the already processed files, since the blob receipts are stored in that storage account. This is by design.
The only way I can think of is that, when using a new AzureWebJobsStorage account, you keep your own record of all the processed files in your function code, and when the code detects that a file has already been processed, it simply does nothing with it.
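A rough sketch of that workaround; instead of a list held in memory, it records processed blob names in a small tracking table so the record survives restarts (the container path, table name and connection settings are placeholders):

```python
import os

import azure.functions as func
from azure.core.exceptions import ResourceNotFoundError
from azure.data.tables import TableClient

app = func.FunctionApp()

# Tracking table in a storage account you keep (name and connection are placeholders).
processed = TableClient.from_connection_string(os.environ["TRACKING_CONNECTION"], table_name="ProcessedBlobs")


@app.blob_trigger(arg_name="blob", path="input/{name}", connection="AzureWebJobsStorage")
def process_blob(blob: func.InputStream):
    row_key = blob.name.replace("/", "_")  # table keys cannot contain '/'

    try:
        processed.get_entity(partition_key="blobs", row_key=row_key)
        return  # already handled under the old AzureWebJobsStorage account
    except ResourceNotFoundError:
        pass

    # ... actual processing of the blob goes here ...

    processed.create_entity({"PartitionKey": "blobs", "RowKey": row_key})
```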