I have these settings in my host.json, but every time I run the function it executes in parallel, spawning many more threads than 1 (as many as there are messages in the queue).
{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "prefetchCount": 1,
      "messageHandlerOptions": {
        "maxConcurrentCalls": 1
      }
    }
  }
}
My function:
[FunctionName(nameof(ResourceEventProcessorFunction))]
public async Task Run([ServiceBusTrigger("%TopicName%", "%SubscriptionName%", Connection = "ServiceBusConnection", IsSessionsEnabled = true)]Message message, IMessageSession messageSession, ILogger log)
Leveraging sessions
Since you are using sessions, you can use the same sessionId for all messages, and they will be processed in order by a single instance, regardless of the settings in your host.json.
https://learn.microsoft.com/en-us/azure/service-bus-messaging/message-sessions
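As a rough sketch of the sending side (assuming the Microsoft.Azure.ServiceBus SDK, which matches the Message type in your trigger; the connection string, topic name, payload, and session id here are placeholders):

using System.Text;
using Microsoft.Azure.ServiceBus;

var client = new TopicClient(serviceBusConnectionString, "my-topic");
var message = new Message(Encoding.UTF8.GetBytes(payload))
{
    // Same SessionId on every message => one session, drained in order by a single consumer.
    SessionId = "shared-session"
};
await client.SendAsync(message);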
Using Singleton attribute
If you can't use the sessionId for your purpose, you should try the [Singleton] attribute on your function. This will ensure that only one instance across all of your function instances will process the request.
We have this working successfully for WebJobs in production, and it should work just the same for Azure Functions. If you have dedicated app service plans, using this attribute should be enough. This is not recommended for a consumption plan.
[Singleton] does work on functions. The Azure Function host will create or wait for a lock in the Azure Storage account. The lock is the host ID which should be the same for all hosts of an app across all instances - so all instances share this lock and will only allow one execution to occur at a time.
To test this I put 1000 queue messages at once on a function with [Singleton]. The function would wake up, emit the invocation ID, sleep, and then emit the invocation ID. After processing all 1000 I looked at logs and never saw invocation IDs overlap. Only one invocation would happen globally at a time.
https://github.com/Azure/azure-functions-host/issues/912#issuecomment-419608830
[Singleton]
[FunctionName(nameof(ResourceEventProcessorFunction))]
public async Task Run([ServiceBusTrigger("%TopicName%", "%SubscriptionName%", Connection = "ServiceBusConnection", IsSessionsEnabled = true)]Message message, IMessageSession messageSession, ILogger log)
In a consumption plan
Continuing the quote above:
With that said I think the recommendation is: [Singleton] isn't recommended for consumption hosted function plans. If you have a dedicated app service plan it's fine (as you are paying for the instance anyway). If you want to enforce [Singleton] like behavior in a consumption plan you are likely best to:
Set WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT to 1 so you never scale to more than one instance
Set the host.json file to only allow 1 concurrent execution at a time for that trigger (for instance a batch size of 1 for Azure Queues).
https://github.com/Azure/azure-functions-host/issues/912#issuecomment-419608830
{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "prefetchCount": 1,
      "messageHandlerOptions": {
        "maxConcurrentCalls": 1
      }
    }
  }
}
Maybe you can set WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT to 1 to make the function app run on only one instance at a time.
If you develop locally, you can set it in local.settings.json; if you develop in the Azure portal, you can set it under Configuration -> Application settings.
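For example, in local.settings.json (a sketch; the storage value is a placeholder, and scale-out itself only happens once deployed to Azure):

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT": "1"
  }
}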
Note:
1. If you set WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT to 1, your function will not scale out and can only run in one instance.
2. In addition to setting WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT, you still need to set maxConcurrentCalls to 1
3. This setting is in preview. An app property for function max scale out has been added and is the recommended way to limit scale out.
For more details, you can refer to this official document.
So the problem was that every message had a different sessionId.
Disabling sessions on the subscription in Azure solved this problem.
Details below for the bounty :D
The Azure docs don't exactly specify how to limit the thread count, but I dug a bit deeper.
There is a MessageReceivePump and a SessionReceivePump: one uses MaxConcurrentCalls, the other uses MaxConcurrentSessions and MaxConcurrentAcceptSessionCalls.
Be aware of this if you enable sessions on your subscription: MaxConcurrentCalls doesn't work there; it only takes effect when the session id is the same.
When the session ids differ, try MaxConcurrentSessions or MaxConcurrentAcceptSessionCalls instead, but be aware there are no docs about this...
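For reference, here is roughly how the session-side knob surfaces in host.json for the v2+ Service Bus extension (a sketch based on the host.json reference; verify the property names against your extension version):

{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "sessionHandlerOptions": {
        "maxConcurrentSessions": 1
      }
    }
  }
}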
Related
Currently I'm working on a project where I'm using a storage queue to pick up items for processing. The storage-queue-triggered function picks up the item from the queue and starts a durable orchestration. Normally, according to the documentation (https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue), the storage queue trigger picks up 16 messages (by default) in parallel for processing, but since the orchestration is just being started (a simple and quick process), if I have a lot of messages in the queue I will end up with a lot of orchestrations running at the same time. I would like to be able to start the orchestration and wait for it to complete before the next batch of messages is picked up for processing, in order to avoid overloading my systems. The solution I came up with, which seems to work, is:
public class QueueTrigger
{
    [FunctionName(nameof(QueueTrigger))]
    public async Task Run(
        [QueueTrigger("queue-processing-test", Connection = "AzureWebJobsStorage")] Activity activity,
        [DurableClient] IDurableOrchestrationClient starter,
        ILogger log)
    {
        log.LogInformation($"C# Queue trigger function processed: {activity.ActivityId}");
        string instanceId = await starter.StartNewAsync<Activity>(nameof(ActivityProcessingOrchestrator), activity);
        log.LogInformation($"Started orchestration with ID = '{instanceId}'.");

        // Poll the orchestration's status until it is neither pending nor running.
        DurableOrchestrationStatus status;
        do
        {
            await Task.Delay(TimeSpan.FromSeconds(1)); // small delay so the loop doesn't busy-wait against the status store
            status = await starter.GetStatusAsync(instanceId);
        } while (status.RuntimeStatus == OrchestrationRuntimeStatus.Running || status.RuntimeStatus == OrchestrationRuntimeStatus.Pending);
    }
}
which basically picks up the message, starts the orchestration, and then in a do/while loop waits while the status is Pending or Running.
Am I missing something here, or is there any better way of doing this? (I could not find much online.)
Thanks in advance for your comments or suggestions!
This might not work, since you could either hit timeouts causing duplicate orchestration runs, or just force your function app to scale out, defeating the purpose of your code altogether.
Instead, you could rely on the concurrency throttles that Durable Functions come with. While the queue trigger would queue up orchestration runs, only the defined maximum would run at any time on a single instance of a function.
This would still cause your function app to scale out, so you would have to consider that as well when setting this limit, and you could also set the WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT app setting to control how many instances your function app can scale out to.
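Something like the following in host.json should cap concurrency per instance (property names per the Durable Functions host.json reference; the values of 1 are illustrative):

{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentOrchestratorFunctions": 1,
      "maxConcurrentActivityFunctions": 1
    }
  }
}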
It could be that the function app's built-in scale throttling does not reduce load on downstream services, because it is per app and will just cause the app to scale more. What is needed then is a distributed max instance count that all app instances adhere to. I have built this functionality into my Durable Function orchestration app with a scaleGroupId and its max instance count. It has an API call to save this info, and the scaleGroupId is a string that can be set to anything that describes the resource you want to protect from overloading. Here is my app that can do this:
Microflow
I have written a blob-triggered function that uploads data to a Cosmos DB database using the Gremlin API, on Azure Functions version 2.0. Whenever the function is triggered, it reads the blob, extracts the relevant information, and then queries the database to upload the data.
However, when all the files are uploaded to blob storage at the same time, the function processes them all at once, which results in too many requests for the database to handle. To avoid this, I ensured that the Azure Function would only process one file at a time by setting batchSize to 1 in the host.json file:
{
  "extensions": {
    "queues": {
      "batchSize": 1,
      "maxDequeueCount": 1,
      "newBatchThreshold": 0
    }
  },
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "version": "2.0"
}
This worked perfectly fine for 20 files at a time.
Now we are trying to process 300 files at a time, and this feature doesn't seem to work anymore: the function processes all the files at the same time again, which results in the database not being able to handle all the requests.
What am I missing here? Is there some scaling issue I'm not aware of?
From here:
If you want to avoid parallel execution for messages received on one queue, you can set batchSize to 1. However, this setting eliminates concurrency as long as your function app runs only on a single virtual machine (VM). If the function app scales out to multiple VMs, each VM could run one instance of each queue-triggered function.
You need to combine this with the app setting WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT when you run on the Consumption plan.
Or, according to the docs, the better way would be through the Function property functionAppScaleLimit: https://learn.microsoft.com/en-us/azure/azure-functions/event-driven-scaling#limit-scale-out
WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT would work of course.
You can also run multiple function workers within one host: fewer hosts, with a higher FUNCTIONS_WORKER_PROCESS_COUNT per host. The cost implications would depend on your plan.
Note that all workers within a host share resources, so this is recommended for more IO-bound workloads.
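As a sketch, this is just an app setting (shown here in local.settings.json; the value 4 is illustrative, and note my assumption that it only affects out-of-process language workers, not in-process .NET):

{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_PROCESS_COUNT": "4"
  }
}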
I've seen this problem expressed a lot but I've yet to find a working solution.
In short, I periodically have a large batch of processing operations to be done. Each operation is handled by an Azure Function. Each operation makes calls to a database. If I have too many concurrent functions running at the same time, this overloads the database and I get timeout errors. So, I want to be able to limit the number of concurrent Azure Function calls that are run at a single time.
I've switched the function to be queue-triggered and tweaked the batchSize / newBatchThreshold / maxDequeueCount host.json settings in many ways based on what I've seen online. I've also set the WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT application setting to 1 in my function application settings to prevent more than one VM from being spawned.
Yet still, every time I fill that queue, multiple functions will spawn indiscriminately and my database will fall over.
How can I throttle the number of concurrent operations?
The problem ended up being a difference in the format of host.json between V1 and V2 Functions. Below is the correct configuration (using Microsoft.Azure.WebJobs.Extensions.Storage version 3.0.1 or later). The following host.json configures a single function app to process queue messages sequentially.
{
  "version": "2.0",
  "extensions": {
    "queues": {
      "batchSize": 1,
      "newBatchThreshold": 0
    }
  }
}
Setting App Setting WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT = 1 restricts a function app from dynamically scaling out beyond one instance.
I have an Azure queue trigger associated with a queue, and I want to ensure that the trigger only reads and executes one message at a time, so that when a message has been executed (successfully or not) it processes the next message.
What is happening is that the queue executes one message, yet it begins to execute another message. My host.json:
"queues": {
"maxPollingInterval": 20000,
"visibilityTimeout": "00:01:00",
"batchSize": 1,
"maxDequeueCount": 5,
"newBatchThreshold": 1
}
Following instructions from MS link:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue#trigger---configuration
If you want to avoid parallel execution for messages received on one queue, you can set batchSize to 1
So it would be expected to run only one message at a time (I'm using the Consumption plan).
This is critical because I need to ensure that only one message is processed at a time.
Is there any setting that I could change?
Or is a queue trigger not a good option for this requirement?
Storage Queues do NOT guarantee ordering, so if you need sequential processing because the order of delivery matters, you need to consider Azure Service Bus and set, in the function's host.json:
maxConcurrentCalls = 1
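In host.json that would look something like this (a sketch following the shape of the Service Bus extension's host.json reference; verify against your extension version):

{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 1
      }
    }
  }
}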
Even if you do the trick of setting the maximum number of instances that a function app can scale out to, as follows, ordering is still not guaranteed with Azure Storage Queues.
WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT = 1
The Microsoft documentation is not perfect; it is continuously being updated.
If you want to minimize parallel execution for queue-triggered functions in a function app, you can set the batch size to 1. But this setting eliminates concurrency only so long as your function app runs on a single virtual machine (VM).
If you have multiple virtual machines with function instances on each VM, one message will be processed for each function instance running on each virtual machine.
This microsoft document explains concurrency on triggers.
For those coming across this question looking to debug locally and finding that multiple queue items make this difficult, you can add the following to your local.settings.json to override the default behavior on your machine only:
{
  "IsEncrypted": false,
  "Values": {
    "AzureFunctionsJobHost__extensions__queues__batchSize": "1"
  }
}
Documentation
This is my first time using Azure functions and service bus.
I'm trying to build a function app in Visual Studio, and I am able to connect to the queue topic (which I don't control). The problem is that every time I hit a breakpoint, tens of messages are processed at once in VS, which makes local testing very difficult (not to mention the problems arising from database pools).
How do I ensure that only 1 message gets processed at once until I hit complete?
public static void Run([ServiceBusTrigger("xxx", "yyy", AccessRights.Manage)] BrokeredMessage msg, TraceWriter log)
{
    // do something here, for one message at a time
}
Set maxConcurrentCalls to 1 in host.json. You can also find the answer in the host.json reference for Azure Functions:
maxConcurrentCalls (default: 16) - The maximum number of concurrent calls to the callback that the message pump should initiate. By default, the Functions runtime processes multiple messages concurrently. To direct the runtime to process only a single queue or topic message at a time, set maxConcurrentCalls to 1.
For function apps on version 2.0 you need to update the host.json like this:
{
  "version": "2.0",
  ...
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 1
      }
    }
  }
}
Since the host.json file is usually published to Azure along with the function itself, it is preferable not to modify it for debugging and development purposes. Any change to host.json can be made to local.settings.json instead.
Here is how to set maxConcurrentCalls to 1:
{
  "IsEncrypted": false,
  "Values": {
    ...
    "AzureFunctionsJobHost:Extensions:ServiceBus:MessageHandlerOptions:MaxConcurrentCalls": "1"
  }
}
This override functionality is described here: https://learn.microsoft.com/en-us/azure/azure-functions/functions-host-json#override-hostjson-values
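One caveat worth noting (my assumption, based on how .NET configuration maps environment variables): on Linux, environment variable names cannot contain colons, so the double-underscore form used in the earlier local.settings.json example (for instance AzureFunctionsJobHost__extensions__serviceBus__messageHandlerOptions__maxConcurrentCalls) is the more portable spelling of the same key.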