Scalable Azure Function with blob trigger - azure

I made an Azure Function on a Consumption Plan with a blob trigger. Then I add lots of files to the blob container, and I expect the Azure Function to be invoked every time a file is added.
And because I use Azure Function and Consumption Plan, I would expect that there is no scalability problem, right? WRONG.
I can easily add files to the container faster than the Azure Function can process them. A hundred users can add files, but there seems to be only one instance of the Azure Function working at any one time, meaning it can easily fall behind.
I thought the platform would just create more instances of the Azure Function as needed. Well, it seems it doesn't.
Any advice on how I can configure my Azure Function to be truly scalable with a blob trigger?

This is because you are affected by cold start.
As per the note in the documentation:
When you're using a blob trigger on a Consumption plan, there can be
up to a 10-minute delay in processing new blobs. This delay occurs
when a function app has gone idle. After the function app is running,
blobs are processed immediately. To avoid this cold-start delay, use
an App Service plan with Always On enabled, or use the Event Grid
trigger.
In your case, consider the Event Grid trigger instead of a blob trigger; Event Grid has built-in support for blob events as well.
When to consider Event Grid?
Use Event Grid instead of the Blob storage trigger for the following scenarios:
Blob storage accounts
High scale
Minimizing cold-start delay
Read more here
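To give an idea of what an Event Grid driven function receives instead of the blob itself, here is a minimal sketch that pulls the container and blob name out of a blob-created event. It assumes the standard Event Grid event shape for storage events (an eventType of Microsoft.Storage.BlobCreated and a subject of the form /blobServices/default/containers/<container>/blobs/<name>); the function name parse_blob_created and the sample event are made up for illustration and are not part of any SDK.

```python
from typing import Optional, Tuple

# Event type that Azure Storage publishes to Event Grid when a blob is created.
BLOB_CREATED = "Microsoft.Storage.BlobCreated"

# Subjects of blob events are assumed to look like:
#   /blobServices/default/containers/<container>/blobs/<blob path>
_PREFIX = "/blobServices/default/containers/"
_SEPARATOR = "/blobs/"


def parse_blob_created(event: dict) -> Optional[Tuple[str, str]]:
    """Return (container, blob_name) for a BlobCreated event, else None."""
    if event.get("eventType") != BLOB_CREATED:
        return None
    subject = event.get("subject", "")
    if not subject.startswith(_PREFIX) or _SEPARATOR not in subject:
        return None
    # Container names cannot contain "/", so the first "/blobs/" is the separator.
    container, blob_name = subject[len(_PREFIX):].split(_SEPARATOR, 1)
    return container, blob_name


# Hypothetical sample event, trimmed to the fields used above.
sample_event = {
    "eventType": "Microsoft.Storage.BlobCreated",
    "subject": "/blobServices/default/containers/uploads/blobs/2020/report.csv",
    "data": {"url": "https://example.blob.core.windows.net/uploads/2020/report.csv"},
}

print(parse_blob_created(sample_event))
```

Inside a real function you would then fetch the blob yourself (for example with a blob input binding or the storage SDK), since the event only describes the blob and does not carry its content.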
Update in 2020
Azure Functions has a new plan called Premium, where you can avoid the cold start.
E.g, YouTube Video

Related

Concurrency problem with Azure Blob Storage function

My Azure Function is processing multiple files in blob storage at the same time.
This is causing duplicate data to be created in Dynamics CRM, because the function processes multiple files in parallel. Can someone help me restrict the Azure Function to process one file at a time?
According to the section "Trigger - concurrency and memory usage" of the official document "Azure Blob storage bindings for Azure Functions":
The blob trigger uses a queue internally, so the maximum number of
concurrent function invocations is controlled by the queues
configuration in host.json. The default settings limit concurrency to
24 invocations. This limit applies separately to each function that
uses a blob trigger.
So you can set the queues.batchSize value to 1 in host.json to restrict an Azure Function with a blob trigger to processing one file at a time.
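As a rough sketch of what that host.json could look like, the snippet below builds the queue settings and prints them as JSON. It assumes the Functions runtime 2.x+ layout, where the queues section sits under extensions (in runtime 1.x it sits at the top level), and it also sets newBatchThreshold to 0, because the per-instance ceiling is batchSize plus newBatchThreshold. The helper names queue_host_config and max_concurrency are made up for this example.

```python
import json


def queue_host_config(batch_size: int, new_batch_threshold: int) -> dict:
    """Build a host.json body (runtime 2.x+ layout, assumed) with queue limits."""
    if batch_size < 1 or new_batch_threshold < 0:
        raise ValueError("batch_size must be >= 1 and new_batch_threshold >= 0")
    return {
        "version": "2.0",
        "extensions": {
            "queues": {
                "batchSize": batch_size,
                "newBatchThreshold": new_batch_threshold,
            }
        },
    }


def max_concurrency(config: dict) -> int:
    """Per-instance ceiling on concurrent invocations: batchSize + newBatchThreshold."""
    queues = config["extensions"]["queues"]
    return queues["batchSize"] + queues["newBatchThreshold"]


# The defaults (16 and 8) give the 24 concurrent invocations quoted above.
assert max_concurrency(queue_host_config(16, 8)) == 24

# One file at a time on a single instance.
one_at_a_time = queue_host_config(batch_size=1, new_batch_threshold=0)
print(json.dumps(one_at_a_time, indent=2))
```

Keep in mind that this limit applies per instance. If the function app scales out to several instances, files can still be processed in parallel across them, so you may also need to cap the scale-out.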
As references, there are two similar SO threads which you can also refer to.
Azure Functions - Limiting parallel execution
Throttling Azure Storage Queue processing in Azure Function App

Azure Blob Triggers sometime taking too much time to get triggered

I am using an App Service plan for my Azure Function and have added blob triggers, but when a file is uploaded to the blob container, the functions are either not triggered or take a long time before they start triggering.
Any suggestion will be appreciated.
It should trigger the function as soon as a new file is uploaded to the blob container.
This looks like a case of cold start.
As per the note in the documentation:
When you're using a blob trigger on a Consumption plan, there can be
up to a 10-minute delay in processing new blobs. This delay occurs
when a function app has gone idle. After the function app is running,
blobs are processed immediately. To avoid this cold-start delay, use
an App Service plan with Always On enabled, or use the Event Grid
trigger.
In your case, consider the Event Grid trigger instead of a blob trigger; Event Grid has built-in support for blob events as well.
Since you say that you are already running the functions on an App Service plan, it's likely that you don't have the Always On setting enabled. You can enable it for the app from the Application Settings -> General Settings tab in the portal.
Note that Always On is only applicable to Azure Functions bound to an App Service plan; it isn't available on the serverless Consumption plan.
Another possible cause is not clearing the blobs out of the container after you process them.
From the documentation:
If the blob container being monitored contains more than 10,000 blobs (across all containers), the Functions runtime scans log files to watch for new or changed blobs. This process can result in delays. A function might not get triggered until several minutes or longer after the blob is created.
And when using the Consumption Plan, here's another link warning about the potential of delays.

Which plan to select for my Azure function : Consumption Plan or App Service Plan?

We have created a blob-triggered Azure Function to process files placed in blob storage. The load on this container will not be consistent.
For example, during some hours hundreds or even thousands of files will be placed in the container every minute. On the other hand, there will be hours during which we will not see even a single file.
Some files will be processed in a few seconds and some can take more than 10-15 minutes.
So my question is: in this type of unpredictable scenario, which plan will be better for us? App Service plan or Consumption plan?
If you can optimize your code so that the maximum processing time stays within 10 minutes, the Consumption plan is your best option from a cost perspective, given your fluctuating workload.
As @Peter Bons mentioned in the comments, this is your best reference
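To make the 10-minute point concrete, here is a small sketch that formats a timeout for the functionTimeout setting in host.json and checks whether a worst-case processing time fits. It assumes the Consumption plan ceiling is 10 minutes, as stated above; the helper names function_timeout and fits_consumption_plan are hypothetical.

```python
from datetime import timedelta

# Assumed Consumption plan ceiling for functionTimeout (10 minutes).
CONSUMPTION_MAX = timedelta(minutes=10)


def function_timeout(limit: timedelta) -> str:
    """Format a timeout as the hh:mm:ss string used by host.json's functionTimeout."""
    if limit <= timedelta(0) or limit > CONSUMPTION_MAX:
        raise ValueError("timeout must be > 0 and <= 10 minutes on the Consumption plan")
    total = int(limit.total_seconds())
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def fits_consumption_plan(worst_case: timedelta) -> bool:
    """True if the slowest expected file can finish within the ceiling."""
    return worst_case <= CONSUMPTION_MAX


# host.json fragment that raises the timeout to the assumed maximum.
host_config = {"version": "2.0", "functionTimeout": function_timeout(CONSUMPTION_MAX)}
print(host_config)

# A 15-minute file does not fit, so the code would need optimizing or splitting.
print(fits_consumption_plan(timedelta(minutes=15)))
```

With the 10-15 minute files described in the question, the check fails, which is why the processing time has to come down before the Consumption plan becomes an option.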
Edit
According to the above document,
if your function app is on the Consumption plan, there can be up to a
10-minute delay in processing new blobs if a function app has gone
idle.
If you want to avoid that delay and still use the Consumption plan to benefit from its cost effectiveness, you can replace the Blob Trigger with an Event Grid Trigger, although at the time of writing it is not fully supported by Azure Functions.

Azure function not triggered by blob events

I have an Azure Function created from an ARM template using PowerShell.
The function is a blob-trigger function running on the Consumption plan; it copies blobs from a source storage account to a destination storage account.
When I upload a blob to the source storage, it is not copied. That means the function is not executed.
When I browse the function app through the portal, the function gets invoked and does the required work as expected. Thereafter it works fine. This only happens when the function app is initially deployed by a PowerShell script using ARM templates.
So I guess the issue is that when I create the function app using an ARM template and deploy it using PowerShell, it is in idle mode and is never triggered by blob events. Is my assumption correct, or could you please help me find the issue? Thanks.
Be careful here. The Blob Storage documentation mentions that there may be a delay for this trigger on the Consumption plan (emphasis mine):
When your function app runs in the default Consumption plan, there may be a delay of up to several minutes between the blob being added or updated and the function being triggered. If you need low latency in your blob triggered functions, consider running your function app in an App Service plan.
Perhaps the behavior you are seeing is a manifestation of the above. Try converting to an App Service Plan and see if you still see the delay in the trigger.
I suspect it has nothing to do with your deployment method.

Azure Functions: Can I have different configuration for BlobTriggered function?

I have a .NET project which contains multiple triggers in the same Azure Functions project (a blob-triggered function and a queue-triggered function).
I need a different concurrency for my blob-triggered function than for my queue-triggered function.
I know that the blob trigger uses a queue internally.
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob#trigger---poison-blobs
Is there any way I can achieve it?
Like @Sebastian has said, I am afraid you can only achieve this by putting the blob trigger in another Function app.
Settings in host.json regulate the behavior of the whole Function app, and we can't customize them separately for each trigger.
In your case, the queue message concurrency settings (batchSize and newBatchThreshold) influence all triggers that consume messages concurrently.
Rather than using a blob trigger, you should try the Event Grid trigger:
Reacting to Blob storage events
Event Grid trigger for Azure Functions
With the Event Grid trigger, which is a "custom" HTTP trigger, your endpoint is called without any delay any time a blob is added to or deleted from any container of your storage account.
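To illustrate what "your endpoint will be called" means in practice, here is a minimal sketch of the logic behind a plain HTTP webhook that receives Event Grid posts. It assumes the Event Grid event schema (a JSON array of events, with a Microsoft.EventGrid.SubscriptionValidationEvent handshake that is answered by echoing the validationCode); the function name handle_event_grid_post and the returned summary dict are made up for this example. As far as I know, the built-in Event Grid trigger in Azure Functions takes care of the handshake for you, so this mainly matters if you wire up a generic HTTP endpoint yourself.

```python
import json

VALIDATION = "Microsoft.EventGrid.SubscriptionValidationEvent"
BLOB_CREATED = "Microsoft.Storage.BlobCreated"
BLOB_DELETED = "Microsoft.Storage.BlobDeleted"


def handle_event_grid_post(body: str) -> dict:
    """Handle one webhook POST body, assumed to be a JSON array of events."""
    events = json.loads(body)
    created, deleted = [], []
    for event in events:
        event_type = event.get("eventType")
        if event_type == VALIDATION:
            # Echo the code back so Event Grid accepts the endpoint.
            return {"validationResponse": event["data"]["validationCode"]}
        if event_type == BLOB_CREATED:
            created.append(event["data"]["url"])
        elif event_type == BLOB_DELETED:
            deleted.append(event["data"]["url"])
    # Hypothetical summary; a real handler would process each blob here.
    return {"created": created, "deleted": deleted}


# Hypothetical payloads for illustration.
handshake = json.dumps([{"eventType": VALIDATION, "data": {"validationCode": "abc123"}}])
blob_events = json.dumps([
    {"eventType": BLOB_CREATED, "data": {"url": "https://example.blob.core.windows.net/in/a.csv"}},
    {"eventType": BLOB_DELETED, "data": {"url": "https://example.blob.core.windows.net/in/b.csv"}},
])

print(handle_event_grid_post(handshake))
print(handle_event_grid_post(blob_events))
```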

Resources