Azure function takes a really long time to trigger

We have an Azure Functions v3 app running Node on the Consumption plan, with an input trigger connected to a Cosmos database. The function.json looks like this:
{
  "disabled": false,
  "bindings": [
    {
      "type": "cosmosDBTrigger",
      "name": "productDocuments",
      "collectionName": "products",
      "direction": "in",
      "connectionStringSetting": "DB_CONNECTION_STRING",
      "databaseName": "product-management",
      "createLeaseCollectionIfNotExists": true,
      "maxItemsPerInvocation": 1
    },
    {
      "name": "productDocument",
      "type": "cosmosDB",
      "databaseName": "product-management",
      "collectionName": "products",
      "createIfNotExists": true,
      "connectionStringSetting": "DB_CONNECTION_STRING",
      "direction": "out"
    }
  ],
  "scriptFile": "dist/nameOfFunction.js"
}
But this trigger is really slow and unreliable. If we add an item to the DB, it sometimes triggers straight away, sometimes it seems to take hours, and sometimes not at all. I am manually monitoring the Cosmos DB, so I can see that items are added.
I am looking at the function's Monitor page in the portal, and most of the time nothing happens. I don't know how else to debug this.
Should it really take hours for an invocation to show up here? Or is it the trigger that's unreliable?

General guidance is in this doc: https://learn.microsoft.com/azure/cosmos-db/troubleshoot-changefeed-functions#my-changes-take-too-long-to-be-received
What happens on the Consumption plan is that, after a period of inactivity, instances are deprovisioned. When a new instance is provisioned, it hits a cold start.
The key part here is that, while your instances are deprovisioned, they are not checking the Change Feed for events, so how does Functions know when to "wake them up"?
There is a periodic check done by an external component that looks for new changes; if there are any, it provisions new instances of your Function to start consuming them.
In your case, this external component could be having issues or delays in these checks.
If you have no Function logs for an hour even though you are making changes to the monitored collection, I would contact Azure Support to understand why your Function is not "waking up".
One of the known issues I've heard about was related to where the Cosmos DB connection string was stored. Apparently this component at some point (it may already be fixed) could not access the connection string if it was saved in the "Connection Strings" section of the Function configuration, because it was looking for it only in "App Settings". In those cases it could not wake up the Function, and the Function only woke up when someone opened it in the Azure Portal. My recommendation would be to check where you are storing your connection string, move it to "App Settings" if it is elsewhere, and see how the Function behaves.
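For local development, the equivalent of "App Settings" is the Values block of local.settings.json; a minimal sketch with placeholder values, the setting name matching the connectionStringSetting used in the function.json above:

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "<storage-connection-string>",
    "FUNCTIONS_WORKER_RUNTIME": "node",
    "DB_CONNECTION_STRING": "AccountEndpoint=https://<account>.documents.azure.com:443/;AccountKey=<key>;"
  }
}

In the portal, the same name/value pair belongs under Configuration > Application settings rather than under Connection strings.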

Our problem with this was that we had two separate functions that both had a CosmosDBTrigger on the same collection but shared the same lease collection, and apparently you can't do that. It was solved by giving each function its own lease (we used leaseCollectionPrefix in function.json), as in the sketch below.
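For reference, a minimal sketch of the two trigger bindings with distinct prefixes (the prefix values here are made up; any two different strings work):

Function A's trigger binding:

{
  "type": "cosmosDBTrigger",
  "name": "productDocuments",
  "direction": "in",
  "databaseName": "product-management",
  "collectionName": "products",
  "connectionStringSetting": "DB_CONNECTION_STRING",
  "createLeaseCollectionIfNotExists": true,
  "leaseCollectionPrefix": "functionA"
}

Function B's trigger binding is identical except for:

"leaseCollectionPrefix": "functionB"

With distinct prefixes, each function keeps its own checkpoints in the shared leases collection, so both receive every change. Alternatively, leaseCollectionName can point each function at a completely separate leases collection.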

Related

Azure Functions ServiceBus Trigger Scaling Behavior

We are currently running load tests on our Azure Function App, but the throughput is not what we expected.
There are multiple functions in the Function App, but the ones with the most traffic are one with an Event Hub trigger and one with a Service Bus trigger consuming messages from a session-enabled queue.
When the system is under load, messages in the session-enabled queue wait for up to 10 minutes until they get processed by the consuming function.
I know there are some settings in host.json to tune this behavior, but it's still far from what we expect.
This is our host.json
{
  "version": "2.0",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "extensions": {
    "serviceBus": {
      "prefetchCount": 100,
      "sessionHandlerOptions": {
        "autoComplete": true,
        "messageWaitTimeout": "00:00:30",
        "maxAutoRenewDuration": "00:55:00",
        "maxConcurrentSessions": 200
      },
      "batchOptions": {
        "maxMessageCount": 1000,
        "operationTimeout": "00:01:00",
        "autoComplete": true
      }
    }
  }
}
So I would expect the Function App to process up to 200 sessions concurrently, but in fact, although the Functions runtime provisions lots of instances, most of them seem to sit there and idle out. So it seems there is still another setting limiting the throughput of the Function App.
I know it would improve performance if we split the functions into separate Function Apps, but as the load on both functions is quite similar, my plan was to postpone this step to a later stage and still get acceptable throughput with a single Function App.
We are using Azure Functions 3 on .NET Core 3.1 with
Microsoft.Azure.Functions.Extensions 1.1.0
Microsoft.Azure.WebJobs.Extensions.ServiceBus 5.0.0
Microsoft.Azure.WebJobs.Extensions.EventHubs 5.0.0
on a Windows Consumption Plan.
Thank you for any hints on how to achieve acceptable throughput.
I figured out that handling messages in batches (receiving ServiceBusMessage[] instead of single messages) in the Service Bus-triggered function, combined with enabled sessions, has a massively negative impact on scalability.
After changing this to single messages, the behavior of the system was as expected and the sessionHandlerOptions in host.json were respected.
I am wondering, though, what the reason for this is. I guess it could be related to the fact that Azure Functions instances lease a number of sessions from Service Bus to process, but I could not find anything in the documentation on that.
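For illustration, a sketch of the change described above (class, function, and queue names are made up; with the 5.x Service Bus extension listed in the question, the received type is ServiceBusReceivedMessage):

using Azure.Messaging.ServiceBus;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class OrderProcessing
{
    // Before: batch receiving. Combined with sessions, this scaled poorly
    // and the sessionHandlerOptions appeared to be ignored.
    [FunctionName("ProcessOrderBatch")]
    public static void RunBatch(
        [ServiceBusTrigger("orders", Connection = "ServiceBusConnection", IsSessionsEnabled = true)]
        ServiceBusReceivedMessage[] messages,
        ILogger log)
    {
        foreach (var message in messages)
        {
            log.LogInformation("Processing {MessageId}", message.MessageId);
        }
    }

    // After: single-message receiving. With this signature,
    // maxConcurrentSessions in host.json was respected.
    [FunctionName("ProcessOrder")]
    public static void Run(
        [ServiceBusTrigger("orders", Connection = "ServiceBusConnection", IsSessionsEnabled = true)]
        ServiceBusReceivedMessage message,
        ILogger log)
    {
        log.LogInformation("Processing {MessageId}", message.MessageId);
    }
}

Only one of the two would exist at a time; they are shown together to contrast the signatures.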

Python Azure Functions alert if timeout greater than 10 minutes

I have an Azure Functions app that is running on a timer trigger that I don't expect to exceed the 10 minute timeout limit, but I would like to receive an alert in the unlikely event that the application runs longer than 10 minutes. Is this possible to do in Application Insights? I didn't see an alert trigger for this use case. In Application Insights there is a "Long dependency duration" in the Smart Detection settings where I can add an email and there's also a "Failure Anomalies" alert rule already set up. Will either of these alert me if a function is running longer than 10 minutes?
I'd also like an alert if an individual function instance encounters any type of exception. I can set this up myself in the Python code by wrapping my code in a try except block and emailing if an exception is caught, but it would be easier if this was possible in Application Insights.
You can go to the host.json of your Function App and change it like this:
{
  "version": "2.0",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[1.*, 2.0.0)"
  },
  "functionTimeout": "00:10:00"
}
Then, if an invocation runs for more than 10 minutes, the runtime will terminate it with a timeout error (on the Consumption plan, 10 minutes is also the maximum allowed functionTimeout; the default is 5 minutes).
So you can go to the Application Insights resource of your function app and create a new alert rule on those failures.
You can set the action type to Email/SMS and give it your email address; then you will receive an email when the timer trigger times out.

How to set up a multi-user environment for Azure Functions using queues?

We have started to use the queue binding in our Azure Functions for longer-running tasks such as sending bulk e-mails and clean-up tasks for Cosmos DB. We develop locally with the Functions emulator, then commit into VSTS/Azure DevOps, which auto-deploys into our Function App.
It seems that pretty quickly we're going to have multiple functions (two local emulators and one cloud function) all listening to the same queue. We tried disabling and renaming the queue locally, but these all seem like awkward workarounds that require too much manual work and risk pushing the wrong queue name forward into VSTS.
How do we configure the queue name in function.json to read an environment variable? The connection setting in the binding takes the name of an environment variable, but the queueName setting wants a string.
{
  "disabled": false,
  "bindings": [
    {
      "name": "myQueueItem",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "emailer",
      "connection": "STORAGE_CONNECTION_STRING"
    }
  ]
}
Just wrap the variable name with % and the function will read its value from Application settings in the portal, and from Values in local.settings.json locally:
"queueName": "%myqueue%"
The connection property of triggers and bindings is a special case and automatically resolves its value as an app setting, without percent signs.
See Binding expressions - app settings.
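Putting it together for the binding above, a sketch (the setting name QUEUE_NAME is a made-up example; any app setting name works):

{
  "disabled": false,
  "bindings": [
    {
      "name": "myQueueItem",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "%QUEUE_NAME%",
      "connection": "STORAGE_CONNECTION_STRING"
    }
  ]
}

Each developer can then point QUEUE_NAME at a personal queue (for example emailer-alice) in local.settings.json, while the deployed app's Application settings carry the shared value (emailer), so nobody's local emulator competes with the cloud function for messages.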

Disabled Azure Functions still running

I am just starting to test Microsoft Azure Functions. I have publishing from VS2017 set up and my function is working nicely. I currently have one function that I am working with. It is set on a timer of every 5 minutes.
However, it appears that that function is executing even when I have it "disabled". This can be seen in the Monitor and in one of the systems that it is interacting with. The only way that I am able to stop it is to stop the overall function group. When I then start the function group, it starts the disabled function running every 5 minutes again.
Am I missing something? Does the disabling of an individual function have some other purpose?
How do I get an individual function within a function group to not execute on its defined schedule?
Thanks.
What you are experiencing is expected behavior, though not an ideal one: it is a bug in the portal experience.
The Functions runtime directly consumes the metadata in the binary files of pre-compiled functions. Here is a sample annotation for a disabled function:
[TimerTrigger("0 */5 * * * *"), Disable()]
This is the function.json generated by Visual Studio from the above annotations:
{
  "generatedBy": "Microsoft.NET.Sdk.Functions.MSBuild-1.0.2",
  "configurationSource": "attributes",
  "bindings": [
    {
      "type": "timerTrigger",
      "schedule": "0 */5 * * * *",
      "useMonitor": true,
      "runOnStartup": false,
      "name": "myTimer"
    }
  ],
  "disabled": true,
  "scriptFile": "..\\bin\\FunctionApp3.dll",
  "entryPoint": "FunctionApp3.Function1.Run"
}
The function.json generated for pre-compiled functions is consumed by the portal, and that is what the portal shows. When you change the disabled state of the function in the portal, the disabled property is changed in function.json, but because configurationSource is "attributes" it is not consumed by the Functions runtime. Hence the function continues to execute.
When you deploy it in a disabled state, the runtime is aware of it and honors it as expected.
I have opened this bug to fix the portal experience.
https://github.com/Azure/azure-functions-ux/issues/1857
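In newer runtime versions there is also a documented way to disable an individual function that the runtime itself honors: an app setting named AzureWebJobs.<FunctionName>.Disabled. A sketch of the local equivalent, assuming the function from the function.json above is named Function1 (in Azure, the same pair goes under Application settings):

{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "AzureWebJobs.Function1.Disabled": "true"
  }
}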
I ran into the same problem today. After disabling the function in Azure, I recommend restarting the Function App, because Azure needs to refresh the metadata and a restart is one way to accomplish that.
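If you prefer to script that restart, the Azure CLI has a command for it (the resource names are placeholders):

az functionapp restart --name <function-app-name> --resource-group <resource-group>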

Azure function is not triggering on scheduled time

Note: even though it may seem a duplicate, my issue is different, and I ask you to read the complete description before hastily downvoting the question based on its title alone.
I opened an "Azure function is not triggering on scheduled time" issue on the official azure-webjobs-sdk-script GitHub repository on May 24, 2017, but there has been no reply yet. So I am re-asking it here.
I am using an Azure Function on the Consumption plan and have scheduled it to execute every day at 4:00 AM UTC by setting the following CRON expression (the six fields are second, minute, hour, day, month, day-of-week) in function.json:
{
  "bindings": [
    {
      "name": "myTimer",
      "type": "timerTrigger",
      "direction": "in",
      "schedule": "0 0 4 * * *"
    }
  ],
  "disabled": false
}
The Azure Function executes on time only if I am logged into the portal or click on the function blade. It is not invoked when I am logged out of the portal (suggesting that either the system listener or the function goes to sleep after some interval).
The official documentation states that functions on the Consumption plan do not require any other settings (like Always On) to stay alive: instances of the Azure Functions host are dynamically added and removed based on the number of incoming events. So, per the documentation, there are no other settings I have to configure to run the function on the Consumption plan.
What I have tried?
From "Timer triggered azure function not getting triggered" question on SO, I re-checked and ensured my plan (consumption plan) and time zone. (I want it to run in utc, so no explicit setting is required)
From "#1445: Azure function timer trigger not firing" git issue, I checked whether it was just logs that are not appearing. But I am certainly sure, that its not the logs but the actual function does not get triggered unless I am having my portal on or trigger it manually.
I tried to check whether this behavior exists if I change the schedule to a more closer recursive invocations--I scheduled function to get executed at every 2 hours, and this schedule perfectly worked even when I logged out or did not awake function manually. This means, there is some issue when schedule is set to run on larger set of intervals (in my case each day)
As discussed here in #1534, deleting and redeploying the function completely in new function app did not reproduce the issue. So deleting the function and/or function app-and redeploying the same should make things working. Meanwhile, Azure team has announced to add internal logging which will help future diagnosis.
