Monitoring Timer Triggered Azure Functions For Inactivity - azure

We have a couple of functions in a function app. Two of them are triggered by a timer, do some processing and write to queues to trigger other functions.
They normally work very well until recently where the timer trigger just stopped triggering. We fixed this by restarting the application which resolved the issue. The problem is that we were completely unaware of the trigger stopping as there were no failures and the function app is not constantly 'looked at' by our people.
I'd like to configure automatic monitoring and alerting for this special case. I configured Application Insights for the function app and tried to write an alert which watches the count metric of the functions which are triggered by a timer. If the metric is below the set threshold (below 1 in the last 5 minutes) the alert should be triggered.
I tested this by just stopping the function app. My reasoning behind this was that a function app that does not run should fullfill this condition and should trigger an alert within a reasonable time frame. Unfortunately this was not the case. Apparently a non-existing count is not measured and the alert will never be triggered.
Did someone else experience a similar problem and has a way to work around this?

I've added Application Insights alert:
Type: Custom log search
Search query:
requests | where cloud_RoleName =~ '<FUNCTION_APP_NAME_HERE>' and name == '<FUNCTION_NAME_HEER>'
Alert logic: Number of results less than 1
Evaluated based on: Over last N hours, Run every M hours
Alert fires if there are no launches over last N hours.

Related

Azure Function Proxy - Cold Startup - Error 429 Too many requests

I've set up a function app in Azure. I've added a proxy to the function (so I can assign it a different URI).
When the proxy and function have been torn down and its time to wake it up, I sometimes get the error code 429: Too many requests from a single Postman/insomnia request to wake it up.
How do I stop this from happening?
For the time being, I've added a logic app to ping it every 5 mins.
Seems to be something with the last release of https://github.com/Azure/azure-functions-host/releases/tag/v3.0.15185
On the date of this release we started receiving 429s, a lot, on the functions we had running for a long time.
We fixed it by adding the following to the hosts.json:
"extensions": {
"http": {
"dynamicThrottlesEnabled": false
}
}
Doc: https://learn.microsoft.com/pt-br/azure/azure-functions/functions-bindings-http-webhook-output
My guess is that they've changed some default values.
EDIT:
We are operating for a long time using BOTH, the hosts.json update from above and the pinned version, stated by sanjo (https://stackoverflow.com/a/65311645/10585914).
You can follow the entire discussion here: https://github.com/Azure/azure-functions-host/issues/6984
And the PR: https://github.com/Azure/azure-functions-host/pull/6986
We are also experiencing 429's in our azure-function and has been advised by MS to force the Azure Functions Extensions to a lower version by setting FUNCTIONS_EXTENSION_VERSION to 3.0.14916.0 instead of ~3
We're still evaluating the "solution".
From Microsoft support, there are 2 workarounds:
Cassio's answer, which actually worked for us for a couple hours but then stopped working. We had been getting very consistent 429s for multiple days, then a brief stoppage after the change, then it came back.
Update your FUNCTIONS_EXTENSION_VERSION app setting to the previous version (3.0.14916.0). This has worked again in the short time since we've changed it.
App Setting Update
I don't think your 5 minute ping is a problem like the answer from Hury Shen. We have recently begun receiving 429 requests anytime our functions wake from a cold period. I don't know what has changed at Azure side but it is not good! One fix you could try is simply redeploy your function, we did this and it worked at least for a time! Will report back if we find anything else
It seems the error was caused by the logic app ping the function every 5 mins. Per my understanding, you schedule the logic app request function to keep the function awake.
If so, you do not need to create the logic specifically to wake it up. You can choose Premium plan for your function app when you create it.
And then go to "Scale out" tab of your function app, you can set Always Ready Instances as 1. Then your function will have one instance always awake, function will not cold start when a request come.
As Premium plan plan provides the same features and scaling mechanism used on the Consumption plan (based on number of events) with no cold start, so it will cost much more than Consumption plan. You can refer to this page about function cost.

AppInsights - Monitor for Hung Processes

We are looking at implementing AppInsights for our non-web application. One of the things that we want to monitor for is processes that may be "hung" for more than N number of seconds or minutes. I have been unable to find something built in that does this. The closest thing I have seen or thought of would be to log 2 custom events for the start and end of a process, and then have an alert for a custom log that queries events with no matching "end" event after N minutes.
Is there another way to monitor for hung processes using AppInsights that I am not seeing? Thanks for any help.
If you choose to use application insights, here is the suggestion just for your reference(but if you have another better solution, you can ignore this):
As per this post, you can leverage heartbeat feature, details as below:
if this application runs more than several seconds, you can leverage heartbeat
feature - it sends metric every N minutes/seconds (configurable) and the absence of such
metric will indicate that application is no longer actively running. However, if
Application Insights thread survives, then heartbeat will still be reported.
You can rely on presense/absense of the telemetry from this app in general as well as
couple custom events as you outlined above - Azure Monitor allows to set an alert on
analytics query, so you'll be able to craft a query that returns nothing in case of
application issues and set an alert on 0 count returned by such a query.

Delay in Azure function triggering off IOThub

I have data going from my system to an azure iot. I timestamp the data packet when I send it.Then I have an azure function that is triggered by the iothub. In the azure function I get the message and get the timestamp and record how long it took the data to get to the function. I also have another program running on my system that listens for data on the iothub and records that time too.
So most of the time, the time in the azure function is in millisecs, but sometimes, I see a large time for the azure function to be triggered(I conclude it is this because the program that reads from the iot hub shows that the data reached the iot hub quickly and there was no delay).
Would anybody know the reasons for why azure function might be triggering late
Is this the same question that was asked here? https://github.com/Azure/Azure-Functions/issues/711
I'll copy/paste my answer for others to see:
Based on what I see in the logs and your description, I think the latency can be explained as being caused by a cold-start of your function app process. If a function app goes idle for approximately 20 minutes, then it is unloaded from memory and any subsequent trigger will initiate a cold start.
Basically, the following sequence of events takes place:
The function app goes idle and is unloaded (this happened about 5 minutes before the trigger you mentioned).
You send the new event.
The event eventually gets noticed by our scale controller, which polls for events on a 10 second interval.
Our scale controller initiates a cold-start of your function app. This can add a few more seconds depending on the content of your function app (it was about 6 seconds in this case).
So unfortunately this is a known behavior with the consumption plan. You can read up on this issue here: https://blogs.msdn.microsoft.com/appserviceteam/2018/02/07/understanding-serverless-cold-start/. The blog post also discusses some ways you can work around this if it's problematic for your scenario.

Azure Function app periodically not firing on trigger or timer

I have an Azure Function app with 4 functions
one triggered on a timer every 24 hours
one triggered on events from IoT Hub
two others triggered on events from Service Bus as a result of the previous function
All functions work as expected when first deployed but after a period of time the functions stop running and the app appears to be scaled down with no servers online. At this point the functions are never triggered again unless I either restart the app, or drill into a function and view details of it (supposedly, forcing the function to start up).
I have the exact same code deployed to a different environment and it runs perfectly and has never encountered this issue. I've checked all the settings and configuration and can't see any material differences between the two.
This is really frustrating and is becoming a big issue. Any help would be much appreciated.
Function App is hosted in Australia Southeast.
This is the last execution (as of now)
10:45 PM UTC - Function started (Id=4d29555b-d3af-43d7-95e9-1a4a2d43dc46)
The event triggered function should run every few minutes as the IoT Hub it's triggering from has a steady stream of events coming in. When I prod the function (or restart it) and it comes to life it quickly churns through a backlog of messages queued in the IoT Hub.
I see the problem: you have comments in your host.json, which makes it invalid and throws off the parser at the scale controller level.
Admittedly, the error handling is quite poor here. But anyway, remove the commented out logger, and it should all work.

Queue trigger in azure apparently not clearing up after succesful functions run

I am very new to Azure so I am not sure if my question is stated correctly but I will do my best:
I have an App that sends data in the form (1.bin, 2.bin, 3.bin...) always in consecutive order to a blob input container, when this happens it triggers an Azure function via QueueTrigger and the output of the function (1output.bin, 2output.bin, 3output.bin...) is stored in a blog output container.
When azure crashes the program tries 5 times before giving up. When azure succeeds it will run just once and that's it.
I am not sure what happened last week but since last week after each successful run, functions is idle like for 7 minutes and then it starts the process again as if it was the first time. So for example the blob receives 22.bin and functions process 22.bin and generates 22output.bin, it is supossed to stop after that but after seven minutes is processing 22.bin again.
I don't think is the app because each time the app sends data, even if it is the same one it will name the data with the next number (in my example 23.bin) but this is not the case it is just doing 22.bin again as if the trigger queue was not clear after the successful azure run, and it keeps doing it over and over again until I have to stop functions and make it crash i order to stop it.
Any idea in why is this happening and what can I try to correct it is greatly appreciated. I am just starting to learn about all this stuff.
One thing that could be possibly happening is that, the function execution time is exceeding 5 mins. Since this is a hard limit, function runtime would terminate the current execution and restart the function host.
One way to test this would be to create a Function app using Standard App Service plan instead of Consumption plan. Function app created with standard plan does not have execution time limit. You can log function start time and end time to see if it is taking longer than 5 mins to complete processing queue message.

Resources