I want an email notification for every logic app run with a Failed status, like in the screenshot below.
I tried to configure "Runs Failed" alerts in the logic app, but the settings are not very clear to me.
What should the exact entries be for Threshold value, Operator, Aggregation type, Period, and Frequency to get an alert notification on every failed run?
For this requirement, I think you can choose Static under "Threshold" and set the condition to Greater than, Count, 0. Under "Evaluated based on", set "Aggregation granularity (Period)" to 5 minutes and "Frequency of evaluation" to 5 minutes, as shown in the screenshot below:
The "Evaluated based on" setting you chose (a 24-hour period, evaluated every 5 minutes) is not a good fit. Once the alert is triggered, its "Monitor condition" becomes "Fired", and the alert will not be triggered again until it has been resolved. For example, if your logic app fails at 1:00, the alert is triggered within 5 minutes. With a 24-hour period, that failure stays inside the evaluation window at every 5-minute evaluation, so the alert remains fired and any further failure during those 24 hours produces no new notification.
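To make the difference between the two period settings concrete, here is a small simulation of the behaviour described above. It is only a simplified model, not Azure's actual evaluation logic: the function name, the failure times, and the assumption that the alert resolves as soon as the count in the window drops back to 0 are all mine.

```python
def simulate_alerts(failure_minutes, period, frequency, horizon):
    """Return the minutes at which a notification would be sent.

    Simplified model (an assumption, not Azure's real implementation):
    every `frequency` minutes the rule counts failures in the last
    `period` minutes. It notifies only when it goes from resolved to
    fired, and it resolves once the count is back to 0.
    """
    notifications = []
    fired = False
    for now in range(frequency, horizon + 1, frequency):
        count = sum(1 for t in failure_minutes if now - period < t <= now)
        if count > 0 and not fired:
            fired = True
            notifications.append(now)
        elif count == 0:
            fired = False
    return notifications


# Hypothetical runs that fail at minute 1 and minute 62.
failures = [1, 62]
short_period = simulate_alerts(failures, period=5, frequency=5, horizon=120)
long_period = simulate_alerts(failures, period=24 * 60, frequency=5, horizon=120)
print(short_period)  # [5, 65]: one notification per failure
print(long_period)   # [5]: the second failure is hidden by the fired state
```

With the 5-minute period the first failure leaves the window before the second one arrives, so the rule resolves and can fire again. With the 24-hour period it never resolves in between.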
By the way, you can also test this yourself. You can create a logic app like the one below; it can be saved, but it will fail when it runs.
Related
I use the Application Insights "Availability" feature to check a web site's availability and send an alert if it is down.
Now Application Insights sends an alert every 5 minutes, even though the "alert failure time window" is 15 minutes. The test frequency is 5 minutes.
So I get an alert after 5 minutes, then after 10 minutes, then after 15 minutes! I get 3 alerts when I need only one alert after 15 minutes. It looks like a bug to me.
How can I prevent the Application Insights Availability feature from sending alerts every 5 minutes?
The email (notification) is sent the moment the alert condition is satisfied. It doesn't wait for the alert failure time window to elapse.
Example: if the alerting rule is set to send a notification when 3 locations out of 5 turn red, and 3 locations turn red within the first second, the notification is sent during that same second. It will not wait for 5 (or 15) minutes.
This is by design, with the goal of reducing TTD (time to detect).
There are two ways to handle noise:
Configure retries (test will retry 2 times during red => green state switch)
Increase the number of locations to trigger alert (for instance, 14 out of 16)
Either way, only one notification is supposed to be sent, not one every 5/15 minutes. Multiple notifications suggest either a bug in tracking the current state of an alert (a bug in the product) or an application that fails intermittently (so the alerting rule constantly changes state, green => red => green => ..., and an email is sent on every transition). Do you get an alert every 5 minutes when the tests are red all the time?
The alert failure time window defines what a failed location means. A 5-minute test interval with a 5-minute alert failure window means that the last result alone decides whether the location failed. A 5-minute test interval with a 15-minute alert failure window means that the last 3 results decide whether the location failed. So if any one of those 3 test runs failed, the location is considered failed (even though the 2 results after it might have been successes).
Increasing the alert failure time window makes the alerting rule more aggressive (and noisier for intermittently failing apps).
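The "failed location" rule described above can be sketched in a few lines. This is just an illustration of the explanation, not the product's code; the function names and the sample history are made up.

```python
def location_failed(results, test_interval_min, failure_window_min):
    """results: list of booleans, oldest first, True = test passed.

    A location counts as failed if ANY result inside the alert failure
    time window failed.
    """
    # Number of most recent results that fall inside the failure window.
    n = max(1, failure_window_min // test_interval_min)
    recent = results[-n:]
    return any(not passed for passed in recent)


def should_alert(results_by_location, test_interval_min,
                 failure_window_min, min_failed_locations):
    """True when enough locations are currently considered failed."""
    failed = sum(
        location_failed(r, test_interval_min, failure_window_min)
        for r in results_by_location.values()
    )
    return failed >= min_failed_locations


# One failure followed by two successes (hypothetical history).
history = [True, False, True, True]
print(location_failed(history, 5, 5))   # False: only the last result counts
print(location_failed(history, 5, 15))  # True: the failure is still in the window
```

This is why a longer window is noisier for flaky apps: a single red result keeps the location "failed" until it slides out of the window.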
We can set the idle timeout for triggered WebJobs using the WEBJOBS_IDLE_TIMEOUT property in the Web App's app settings (in Azure). To set it to 24 hours, I currently set this property to 86400, but I would rather set the idle timeout to infinite. Is that possible? If so, how? If not, what is the maximum value?
I am looking for this because the development/test setups go unused for a while, and when we start using them again, we don't want to start the WebJob manually. The initial start is done in CI after/during the release.
As far as I know, WEBJOBS_IDLE_TIMEOUT has neither an infinite value nor a maximum value; you can simply set the idle timeout to a very large number.
Actually, another approach is also feasible. As the Configuration Settings say,
WEBJOBS_IDLE_TIMEOUT - Time in seconds after which we'll abort a
running triggered job's process if it's in idle, has no cpu time or
output (Only for triggered jobs).
So we could add a heartbeat-style Console write at regular intervals.
For example:
    // Stop once the job has been running for more than 24 hours
    if (DateTime.Now.Subtract(StartTime).TotalHours >= 24)
    {
        ThereAreItemsInQueue = false;
    }
    // Write a heartbeat every 25 iterations so the process is not treated as idle
    Counter++;
    if (Counter % 25 == 0)
    {
        Console.WriteLine("Heartbeat");
    }
There are two articles for you to refer to, 1 and 2.
I know you can add a monthly spending limit for Azure Functions, but I need a way to limit the number of executions of an Azure Function per day. The function I am developing calls a third-party API where we have a limit of 25,000 calls per day. When we reach that limit, we get the response "LIMIT_REACH". I want to pause the function's execution until 12 AM the next day. I am using a Storage queue to trigger the function. I know one option is the "disabled": false setting in function.json, but I would need to set it programmatically, and then I would have to trigger a process to turn the function on again.
Why not keep a flag, or a "next valid execution time", in Table Storage once you have hit the LIMIT_REACH response? Each time the function triggers, check that time and either execute or abort. Update the flag / next execution time when you are able to hit that third-party API again.
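A minimal sketch of that idea, in Python for brevity. The `InMemoryStore` class is a hypothetical stand-in for a Table Storage entity, `call_api` stands in for the third-party call, and treating "12 AM" as midnight UTC is my assumption. What to do with a skipped message (for example, re-queue it) is left open here.

```python
from datetime import datetime, timedelta, timezone


class InMemoryStore:
    """Hypothetical stand-in for the Table Storage entity holding the flag."""

    def __init__(self):
        self.next_valid_execution = None


def next_midnight_utc(now):
    """First 00:00 UTC strictly after `now` (assumes `now` is UTC-aware)."""
    tomorrow = (now + timedelta(days=1)).date()
    return datetime(tomorrow.year, tomorrow.month, tomorrow.day, tzinfo=timezone.utc)


def handle_message(store, call_api, now):
    """Gate one queue-triggered execution on the stored timestamp."""
    if store.next_valid_execution and now < store.next_valid_execution:
        return "skipped"  # still paused, do not touch the API
    response = call_api()
    if response == "LIMIT_REACH":
        # Remember when the API can be called again.
        store.next_valid_execution = next_midnight_utc(now)
        return "paused"
    return "processed"


store = InMemoryStore()
t0 = datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc)
print(handle_message(store, lambda: "LIMIT_REACH", t0))               # paused
print(handle_message(store, lambda: "OK", t0 + timedelta(minutes=30)))  # skipped
print(handle_message(store, lambda: "OK", t0 + timedelta(hours=2)))     # processed
```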
I've asked the Azure Functions team to introduce a "pause" button for Azure Functions. You can see the discussion and possible implementations here: https://github.com/Azure/azure-functions-host/issues/7888
Unfortunately, there are no APIs to programmatically enable/disable an Azure Function at present.
However, you could achieve this in a few ways:
First, upon receiving LIMIT_REACH, have the queue function modify its own function.json to set disabled to true. This will trigger a restart after all currently executing functions finish.
Then, at the time you wish to re-enable processing, run a different function to change disabled from true to false:
Use a timer trigger with a schedule to run at midnight daily (0 0 0 * * *)
or
Use another queue and set the visibility time to schedule when the message becomes visible, upon which time you re-enable the function.
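For the second option, the delay for the "re-enable" message is just the time left until the next midnight. A small sketch of that calculation follows; the function name is mine, and which time zone "midnight" refers to is an assumption you would have to settle for your API's quota reset.

```python
import math
from datetime import datetime, timedelta, timezone


def seconds_until_next_midnight(now):
    """Whole seconds from `now` until the next 00:00 in `now`'s time zone.

    Rounded up so the message never becomes visible before midnight.
    """
    midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return math.ceil((midnight - now).total_seconds())


# Hypothetical moment at which LIMIT_REACH was received.
hit_limit_at = datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc)
delay = seconds_until_next_midnight(hit_limit_at)
print(delay)  # 3600
```

The resulting value would be used as the initial visibility delay when the re-enable message is put on the second queue.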
Why don't you use a rate limiter to limit the function executions? There are a lot of frameworks that do that. Here is one example:
https://github.com/David-Desmaisons/RateLimiter
I hope that it'll help you!!!
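To show the general idea of a per-day limit (this is not the API of the library linked above, just a hand-rolled fixed-window counter in Python), something like the following could work. The class name is hypothetical, and the counter lives in memory, so with several function instances it would have to be kept in shared storage instead.

```python
from datetime import date


class DailyQuota:
    """Allow at most `limit` calls per calendar day (in-memory sketch)."""

    def __init__(self, limit):
        self.limit = limit
        self.day = None
        self.used = 0

    def try_acquire(self, today):
        """Return True and count the call if today's quota is not used up."""
        if today != self.day:
            # A new day has started: reset the counter.
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


# Tiny limit for demonstration; the question's limit would be 25000.
quota = DailyQuota(limit=2)
day1, day2 = date(2024, 1, 1), date(2024, 1, 2)
print([quota.try_acquire(day1) for _ in range(3)])  # [True, True, False]
print(quota.try_acquire(day2))                      # True
```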
I am testing Azure Application Insights alert functionality. It seems to be either buggy or I don't know how to use it.
If I create a new alert, based on the metric 'Server Exceptions', it seems to work once then never again. Once it fires, it seems to go into a state of 'Active' where there is an orange triangle with an !. See the image below. I created a new one, that I haven't triggered, and as can be seen in the image it has a green circle with a tick.
This sort of implies to me that an alert won't fire again until one 'acknowledges' the alert, which is not a bad idea, but I can't see how to do that.
Edit :
I have just tried to use the 'Exception Rate' as suggested, but I think the minimum threshold to fire the alert would be an average of 1 exception per second over a 5 minute period.
I must say it seems strange that my use case isn't handled. I have a lightweight Web API service that is so simple it should never fail, but it could, and if an exception occurs I want to receive an alert straight away.
An alert is supposed to resolve, and its state is supposed to return to green, when the condition of the alert is no longer fulfilled.
This is exceptionally hard to achieve with "Count" metrics because they go up and up and almost never down. It means that, once fired, the alert won't resolve because the value of the metric stays over the threshold all the time.
You can try to set an alert on the "Rate" metric instead and you should see that the state is returning to green when the "Rate" is within the limits you set.
This is now fixed. Please let us know if you see any issues. Some things to keep in mind:
Alert rules are evaluated on a sliding window: an alert would trigger/resolve based on how the condition evaluates on a sliding window from the instant a sample arrives.
A caveat to the above for exception count based alert rules: we will resolve an alert if there are no exceptions reported for the time window configured in the rule.
Note: this is different from metrics based rules – lack of data does not result in the alert being resolved for those.
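The sliding-window behaviour described above can be modelled roughly like this. It is only a sketch of the description, not the service's implementation; the class and method names are made up, and timestamps are plain seconds.

```python
from collections import deque


class ExceptionCountAlert:
    """Toy model of an exception-count rule evaluated on a sliding window."""

    def __init__(self, window_seconds, threshold):
        self.window = window_seconds
        self.threshold = threshold
        self.timestamps = deque()

    def _trim(self, now):
        # Drop samples that have slid out of the window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()

    def evaluate(self, now):
        """True while the alert is active at time `now`.

        An empty window (no exceptions reported for the whole window)
        evaluates to False, i.e. the alert resolves.
        """
        self._trim(now)
        return len(self.timestamps) > self.threshold

    def record_exception(self, now):
        """Add a sample and evaluate from the instant it arrives."""
        self.timestamps.append(now)
        return self.evaluate(now)


alert = ExceptionCountAlert(window_seconds=300, threshold=0)
print(alert.record_exception(0))     # True: fires as soon as the sample arrives
print(alert.evaluate(299))           # True: the exception is still in the window
print(alert.evaluate(300))           # False: a full window without exceptions
print(alert.record_exception(1000))  # True: it can fire again after resolving
```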
"Server exception" metric works as OP expects now in 2018. My use case below:
To get an email whenever an exception happens, use the "Server exception" metric.
That metric is smart enough to auto-resolve after waiting the period's length of time after the initial alert, if the error has not occurred again.
So you'll have the initial "Alert"; then, after 5 minutes with no exceptions, it returns to a "Healthy" state.
And since it auto-resolved, if the error happens again tomorrow it will do the "Alert" again.
Note this was using App Insights with a Function App. The Function App Failure metric had problems and wasn't reliable for this (Azure kept logging 0.2 Exception/s and thinking that was over the 1 in 5 min threshold...)
I'm building a reservation system and would appreciate your thoughts on the best way to tackle this. There are bookable 'slots' every day (from 7am - 1pm & 2pm - 7pm). I'm making the application in jQuery/Laravel and have already built the part that reserves the slot. The slot should be reserved for a maximum of 15 mins. After this time, if the booking has not been confirmed, the slot should become available again.
What is the best way for me to check if the reserved spot has expired? I have a number of ideas:
1) Insert an expires_at timestamp in the database when the slot becomes reserved. Then have a cron job run every minute to see if the slot has expired. If so, change the status back to Available.
2) Alternatively, have an on-page jQuery timer which starts as soon as the slot is reserved. As soon as it hits 15 mins, send an ajax request to set the status back to 'Available' again.
Does anyone have any further suggestions?
Thanks
Number 2 is not a recommended solution, because the slot will not become available again if the browser is closed.
Instead of a cron job, you could make it part of the query.
[show records where status is not "confirmed" and the reservation is more than 15 minutes old, i.e. expires_at has passed]
You would show all these records as "open" to the user.
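Here is a rough illustration of putting the expiry check into the query itself, using Python with an in-memory SQLite database instead of Laravel so it can be run on its own. The table layout, the status values, and storing expires_at as a Unix timestamp are assumptions of mine, not something taken from the question's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE slots (id INTEGER PRIMARY KEY, status TEXT, expires_at INTEGER)"
)

now = 1_000_000  # hypothetical current Unix time
conn.executemany(
    "INSERT INTO slots VALUES (?, ?, ?)",
    [
        (1, "available", None),
        (2, "reserved", now + 600),  # hold still valid for 10 more minutes
        (3, "reserved", now - 60),   # hold expired a minute ago
        (4, "confirmed", now - 60),  # confirmed bookings never reopen
    ],
)


def open_slot_ids(conn, now):
    """Slots to show as open: never reserved, or reserved with an expired hold."""
    cur = conn.execute(
        "SELECT id FROM slots "
        "WHERE status = 'available' "
        "   OR (status = 'reserved' AND expires_at <= ?) "
        "ORDER BY id",
        (now,),
    )
    return [row[0] for row in cur]


print(open_slot_ids(conn, now))  # [1, 3]
```

No row ever has to be flipped back to "available" by a background job; an expired hold is simply treated as open at read time. The same condition would also need to be applied when a new reservation is written, so that an expired hold can be taken over.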