Azure Functions host.json settings per function or global?

Do the settings in host.json apply to each function individually, or apply to all of the functions as a whole?
For example, I have two functions in the same project that both get messages from Azure Service Bus queues.
If I set maxConcurrentCalls to 10 in host.json, does that mean that only 10 concurrent calls to Service Bus will be made as a whole, or that it is 10 per function, so there will be 20 concurrent calls?
Thanks in advance.

The host.json file is shared by all functions in a Function App. That is to say, the maxConcurrentCalls value will apply to all functions of the app, as will any other setting.
The effect of maxConcurrentCalls is independent for each function, however. In your example, each function will have up to 10 messages processed concurrently. If you set it to 1, there will be 1 thread working per function.
Note that maxConcurrentCalls applies per instance. If you have multiple instances running, the max concurrency increases proportionally.
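For reference, here is where that setting lives in a host.json file using the v2 schema (in Functions v1 the serviceBus section sits at the root instead); the value 10 mirrors the example above:

```json
{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 10
      }
    }
  }
}
```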

What is the alternative to global variables in Azure Function Apps?

Let's say I want to have a TimerTrigger function app that executes every 10 seconds and prints an increasing count (1...2...3...).
How can I achieve this WITHOUT using an environment variable?
You're already using an Azure Storage account for your function. Create a table within that storage account, and increment the counter there. This has the added benefit of persisting across function restarts.
Since you're using a TimerTrigger, it's implicit that there will only ever be one instance of the function running. If this were not the case, you could end up in a race condition with two or more instances interleaving to incorrectly increment your counter.
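A minimal sketch of that table-backed counter, assuming a client that behaves like `azure.data.tables.TableClient` (with `get_entity` / `upsert_entity`); the partition key, row key, and property name are placeholders:

```python
def next_count(table):
    """Read, increment, and persist the counter; returns the new value.

    `table` is assumed to expose get_entity(partition_key, row_key) and
    upsert_entity(entity), as the azure-data-tables TableClient does.
    """
    try:
        entity = table.get_entity(partition_key="counter", row_key="timer")
        entity["Value"] += 1
    except Exception:
        # First run: the counter entity does not exist yet.
        entity = {"PartitionKey": "counter", "RowKey": "timer", "Value": 1}
    table.upsert_entity(entity)
    return entity["Value"]
```

Because the count lives in Table storage rather than memory, it survives host restarts and redeployments, which an in-process global cannot.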
I suggest you look into Durable Functions. This is an extension for Azure Functions that allows state in your (orchestrator) functions.
In your case, you can have a single HTTP triggered starter function that starts a long running orchestrator function. The HTTP function passes the initial count value to the orchestrator function. You can use the Timer functionality of Durable Functions to have the orchestrator wait for the specified amount of time before continuing/restarting. After the timer expires, the count value is incremented and you can restart the orchestrator function with this new count value by calling the ContinueAsNew method.
This periodic work example is almost what you need I think. You still need to add the initial count to be read as the input, and increment it before the ContinueAsNew method is called.
If you need more details about Durable Functions, I have quite a few videos that explain the concepts.
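A rough sketch of that eternal-orchestration pattern in Python, assuming the orchestration context methods (`get_input`, `current_utc_datetime`, `create_timer`, `continue_as_new`) from the azure-functions-durable SDK; the 10-second interval matches the question:

```python
from datetime import timedelta

def counter_orchestrator(context):
    """Durable orchestrator: count, wait 10 seconds, restart with count + 1."""
    count = context.get_input() or 0
    # ... do the periodic work here, e.g. log or print the count ...
    fire_at = context.current_utc_datetime + timedelta(seconds=10)
    yield context.create_timer(fire_at)  # durable timer: wait 10 seconds
    # Restart the orchestration with the incremented count as its new input.
    context.continue_as_new(count + 1)
```

The starter function seeds the initial count as the orchestration input; ContinueAsNew then carries the state forward without any external storage.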

Azure (Durable) Functions - Managing parallelism

I'm posting this question to see if I'm understanding parallelism in Azure Functions correctly, and particularly Durable Functions.
The ability to set max degree of parallelism was recently added to Azure Functions using az cli:
https://github.com/Azure/azure-functions-host/issues/1207
az resource update --resource-type Microsoft.Web/sites -g <resource_group> -n <function_app_name>/config/web --set properties.functionAppScaleLimit=<scale_limit>
I've applied this to my Function App, but what I'm unsure of is how this plays with the MaxConcurrentOrchestratorFunctions and MaxConcurrentActivityFunctions settings for Durable Functions.
Would the below lead to a global max of 250 concurrent activity functions?
functionAppScaleLimit: 5
MaxConcurrentOrchestratorFunctions: 5
MaxConcurrentActivityFunctions: 10
Referring to the link you shared: functionAppScaleLimit specifies the maximum number of instances for your function app. MaxConcurrentOrchestratorFunctions sets the maximum number of orchestrator functions that can be processed concurrently on a single host instance, and MaxConcurrentActivityFunctions sets the maximum number of activity functions that can be processed concurrently on a single host instance. Refer to this.
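These Durable Functions settings live in the durableTask section of host.json (v2 schema shown; the values mirror the ones in the question):

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentOrchestratorFunctions": 5,
      "maxConcurrentActivityFunctions": 10
    }
  }
}
```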
Now, I will explain what MaxConcurrentOrchestratorFunctions does, which should help you understand how it works:
MaxConcurrentOrchestratorFunctions controls how many orchestrator functions can be loaded into memory at any given time. If you set concurrency to 1 and then start 10 orchestrator functions, only one will be loaded in memory at a time. Remember that if an orchestrator function calls an activity function, the orchestrator function will unload from memory while it waits for a response. During this time, another orchestrator function may start. The effect is that you will have as many as 10 orchestrator functions running in an interleaved way, but only 1 should actually be executing code at a time.
The motivation for this feature is to limit the CPU and memory used by orchestrator code. It's not going to be useful for implementing any kind of singleton pattern. If you want to limit the number of active orchestrations, you will need to implement this yourself.
Your global max of activity functions would be 50. This is based on 5 app instances as specified by functionAppScaleLimit and 10 activity functions as specified by MaxConcurrentActivityFunctions. The relationship between the number of orchestrator function executions and activity function executions depends entirely on your specific implementation. You could have 1-1,000 orchestration(s) that spawn 1-1,000 activities. Regardless, the settings you propose will ensure there are never more than 5 orchestrations and 10 activities running concurrently on a single function instance.
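The arithmetic behind those ceilings is simply the per-instance limit multiplied by the instance cap:

```python
# Global concurrency ceilings implied by the settings in the question.
function_app_scale_limit = 5         # max app instances (functionAppScaleLimit)
max_orchestrators_per_instance = 5   # MaxConcurrentOrchestratorFunctions
max_activities_per_instance = 10     # MaxConcurrentActivityFunctions

global_max_orchestrators = function_app_scale_limit * max_orchestrators_per_instance
global_max_activities = function_app_scale_limit * max_activities_per_instance

print(global_max_orchestrators)  # 25
print(global_max_activities)     # 50
```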

AWS lambda throttle concurrent invocations from a specific event source

I've created dynamic SQS standard queues which are used as an event source for my Lambda function. Whenever any message is pushed into the queues, the Lambda function is invoked. Now, I want to add some throttling on my Lambda function such that a single event source can have only one active invocation of the Lambda at a time. There are some existing answers, but they only cover throttling overall Lambda concurrency.
From Managing Concurrency for a Lambda Function:
When a function has reserved concurrency, no other function can use that concurrency. Reserved concurrency also limits the maximum concurrency for the function, and applies to the function as a whole, including versions and aliases.
Therefore, reserved concurrency can be used to limit the number of concurrent executions of a specific AWS Lambda function.
AWS Lambda functions can also be triggered from an Amazon SQS FIFO queue.
From New for AWS Lambda – SQS FIFO as an event source | AWS Compute Blog:
In SQS FIFO queues, using more than one MessageGroupId enables Lambda to scale up and process more items in the queue using a greater concurrency limit. Total concurrency is equal to or less than the number of unique MessageGroupIds in the SQS FIFO queue.
So, it seems that if you specify all messages with the same MessageGroupId and a batch size of 1, then it will only process one message at a time.
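A hedged sketch of the sending side: build the kwargs for boto3's `sqs.send_message` so that every message lands in the same message group. The queue URL and group id below are placeholders, and a content-based deduplication id is just one possible choice:

```python
import hashlib

def fifo_send_params(queue_url, body, group_id="single-lane"):
    """Build kwargs for sqs.send_message(**params) on a FIFO queue."""
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        # One shared MessageGroupId caps Lambda concurrency for this queue at 1.
        "MessageGroupId": group_id,
        # FIFO queues need a dedup id unless content-based dedup is enabled.
        "MessageDeduplicationId": hashlib.sha256(body.encode()).hexdigest(),
    }
```

Pair this with a trigger batch size of 1 so each invocation handles exactly one message.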
Short answer is yes it can be done but only in a roundabout way.
When you have a Lambda function set as a triggered function on an SQS queue, the Lambda service polls the queue and handles the receiving and deletion of a message from the queue. The only control you have over how many messages the Lambda service reads and how many instances of your function the Lambda service invokes is (a) batch size, and (b) function concurrency.
Neither of these will help you when applied directly to your function, because setting the batch size to a small number (e.g. 1) will result in more instances being started (takes longer to process 1 message at a time), and setting it to a high number may not be desirable in your case, and if it is then it still won't help if the number of messages is higher than the batch size or they are received frequently and your function is already busy processing the previous batch. And you already said function concurrency is a no go because you only want to limit the concurrency from a source, not overall.
So here's a way it can be accomplished: create another function with a concurrency limit of 1, set it as the triggered function instead of your function. That function will receive messages, and it in turn will invoke your function with said message(s). It will wait for your function to return before returning itself. Only when the new function returns can it receive another message/batch from the Lambda service, and invoke your function again. So your "real" function has no overall concurrency limit, but there is only ever one instance invoked/running at a time from your SQS source (via the new function).
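A minimal sketch of that proxy function, assuming boto3's Lambda `invoke` API; the target function name is a placeholder, and the client is created lazily so the handler can be exercised without AWS access:

```python
import json

TARGET_FUNCTION = "my-real-function"  # placeholder: name of your real function

def handler(event, context=None, client=None):
    """Proxy Lambda (reserved concurrency = 1): relay each SQS record to the
    real function synchronously so only one target invocation runs at a time."""
    if client is None:
        import boto3  # created lazily; assumes IAM permission to invoke TARGET
        client = boto3.client("lambda")
    for record in event.get("Records", []):
        client.invoke(
            FunctionName=TARGET_FUNCTION,
            InvocationType="RequestResponse",  # block until the target returns
            Payload=json.dumps({"body": record["body"]}).encode(),
        )
```

Give the proxy a reserved concurrency of 1 (e.g. `aws lambda put-function-concurrency --function-name my-proxy --reserved-concurrent-executions 1`) and set it, not the real function, as the queue's triggered function.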

Azure Function batchSize

I am wondering about the parallelism principle in Azure Functions. Say I have a batchSize of 32 and a newBatchThreshold of 16. If the queue grows too large, the scale controller spins up a new function instance to withstand the pressure. I understand this bit. What I don't understand is: does a single instance work on the batch? That is, do I have only one function running per batch, or does the runtime scale out and run multiple threads with the function?
Could I risk having two instances running, each with 32 messages, and concurrently 32 threads running 32 functions at once?
Imagine I have a function calling a web API. This means the API would get 64 calls at once, which I don't want.
What I want is 2 instances working on 32 messages each, making 1 call per message per function.
I hope you guys understand.
Yes. That is indeed how scaling works. The same is explained in a bit more detail in the docs as well.
According to that, your function (one instance) could run up to 48 messages at a time (32 from a new batch + 16 from the existing batch) and could potentially scale to multiple instances depending on the queue length.
To achieve the scenario you've mentioned, you would have to
Set the batchSize to 1 to avoid parallel processing per instance
Set the WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT app setting to 2 to limit scale out to a max of 2 instances
Note that neither instance will load all 32 messages at once; with a batchSize of 1, each instance works through the queue one message at a time.
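The first of those two settings goes in host.json (v2 schema shown below; newBatchThreshold is set to 0 so no new batch is fetched while one is in flight), while WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT is an application setting, not a host.json property:

```json
{
  "version": "2.0",
  "extensions": {
    "queues": {
      "batchSize": 1,
      "newBatchThreshold": 0
    }
  }
}
```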

azure function max execution time

I would like to have a function called on a timer (every X minutes) but I want to ensure that only one instance of this function is running at a time. The work that is happening in the function shouldn't take long, but if for some reason it takes longer than the scheduled timer (X minutes) I don't want another instance to start and the processes to step on each other.
The simplest way that I can think of would be to set a maximum execution time on the function to also be X minutes. I would want to know how to accomplish this in both the App Service and Consumption plans, even if they are different approaches. I also want to be able to set this on an individual function level.
This type of feature is normally built-in to a FaaS environment, but I am having the hardest time google-binging it. Is this possible in the function.json? Or also are there different ways to make sure that this runs only once?
(PS. I know I could do this in my own code by wrapping the work in a thread with a timeout. But I was hoping for something more idiomatic.)
Timer functions already have this behavior - they take out a blob lease from the AzureWebJobsStorage storage account to ensure that only one instance is executing the timer function. Also, the timer will not execute while a previous scheduled execution is in flight.
Another roll-your-own possibility is to handle this with storage queues and visibility timeout - when the queue has finished processing, push a new queue message with visibility timeout to match the desired schedule.
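A sketch of that roll-your-own schedule, assuming a client that behaves like `azure.storage.queue.QueueClient` (with a `visibility_timeout` keyword on `send_message`); the interval and the work itself are placeholders:

```python
INTERVAL_SECONDS = 600  # placeholder: run every 10 minutes

def handle_message(message, queue):
    """Queue-triggered handler: do the work, then schedule the next run."""
    do_work(message)  # the actual job; runs to completion before re-queueing
    # Enqueue the next trigger, invisible until the interval elapses, so the
    # next run can never start while this one is still in progress.
    queue.send_message(message, visibility_timeout=INTERVAL_SECONDS)

def do_work(message):
    pass  # placeholder for the real work
```

Because the next message is only enqueued after the current run finishes, overlap is impossible by construction, at the cost of drift when a run takes long.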
I want to mention that the functionTimeout host.json property will add a timeout to all of your functions, but has the side effect that your function will fail with a timeout error and that function instance will restart, so I wouldn't rely on it in this case.
You can specify 'functionTimeout' property in host.json
https://github.com/Azure/azure-webjobs-sdk-script/wiki/host.json
// Value indicating the timeout duration for all functions.
// In Dynamic SKUs, the valid range is from 1 second to 10 minutes and the default value is 5 minutes.
// In Paid SKUs there is no limit and the default value is null (indicating no timeout).
"functionTimeout": "00:05:00"
There is a new Azure Functions plan called Premium (in public preview as of May 2019) that allows for unlimited execution duration:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-scale
It will probably end up being the go-to plan for most Enterprise scenarios.
