Creating alerts for specific exceptions with Application Insights (Microsoft Azure)

I'm relatively new to Azure and am trying to see if there's a way to get notifications in real time (or close to it) when certain exceptions, and only those, occur, using Application Insights.
Right now I'm able to track exceptions and to trigger metric alerts when the number of exceptions crosses a threshold over a certain amount of time, but I can't figure out how to make these alerts sensitive to only certain kinds of exceptions. My first thought was to add properties to an exception as I tracked it with a telemetry client's 'TrackException' method, then create an alert specific to that property, but I'm still unable to figure out how to do it.
Any help is appreciated.

A couple of years later, there's now a way to mostly do this with built-in functionality.
There isn't an easy way to do this on every exception as it occurs, though. Some apps have literally billions of exceptions per day, so evaluating your function every time an exception occurs would be pretty expensive.
Things like this are generally done with custom alerts that run a query and check whether anything meeting the criteria exists in the new time period.
You'd do this with "log alerts", documented here: https://learn.microsoft.com/en-us/azure/azure-monitor/platform/alerts-unified-log
Instead of getting an email every time a specific exception occurs, your query runs every N minutes, and if any rows meet the criteria, you get a single mail (or whatever you have the alert configured to do). You keep getting a mail every N minutes for as long as rows that meet the criteria are found.
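To make the evaluation model concrete, here is a minimal Python sketch of what one scheduled evaluation amounts to conceptually. It is not the Azure Monitor API: `ExceptionRow` and `evaluate_window` are hypothetical names, and the rows stand in for whatever the log query would return.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class ExceptionRow:
    # Hypothetical shape of one logged exception; not an Azure schema.
    timestamp: datetime
    exception_type: str


def evaluate_window(
    rows: List[ExceptionRow],
    window_end: datetime,
    window: timedelta,
    wanted_type: str,
) -> Optional[str]:
    """Return one notification for the whole window, or None if nothing matched."""
    window_start = window_end - window
    matches = [
        r for r in rows
        if window_start < r.timestamp <= window_end
        and r.exception_type == wanted_type
    ]
    if not matches:
        return None  # no rows met the criteria, so no mail for this run
    # However many rows matched, the run produces a single notification.
    return f"{len(matches)} {wanted_type} exception(s) in the last window"
```

Calling `evaluate_window` once per N-minute period gives at most one notification per period, which is the behaviour described above.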

There are two options:
1. Call TrackMetric (with a metric name of your choice) in addition to TrackException whenever an exception of the particular type happens. Then configure an alert based on this metric.
2. Write a tool, service, or Azure Function that runs a query in Application Insights Analytics every few minutes and posts the result as a metric (using TrackMetric). Then configure an alert from the portal.
Right now the AI team is working on providing #2 out of the box.
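The first option could look roughly like the Python sketch below. The `Telemetry` interface is a hypothetical stand-in that only mirrors the TrackException and TrackMetric calls named above; a real SDK client will differ in detail. The metric name and the `ALERT_METRICS` mapping are assumptions chosen for illustration.

```python
from typing import Dict, List, Protocol, Tuple, Type


class Telemetry(Protocol):
    # Hypothetical interface mirroring TrackException/TrackMetric;
    # not the signature of any real SDK client.
    def track_exception(self, exc: BaseException) -> None: ...
    def track_metric(self, name: str, value: float) -> None: ...


# Assumed mapping from exception type to the metric an alert would watch.
ALERT_METRICS: Dict[Type[BaseException], str] = {
    TimeoutError: "TimeoutErrorCount",
}


def report(client: Telemetry, exc: BaseException) -> None:
    """Track every exception, plus a metric for the types we alert on."""
    client.track_exception(exc)
    for exc_type, metric_name in ALERT_METRICS.items():
        if isinstance(exc, exc_type):
            client.track_metric(metric_name, 1.0)


class RecordingClient:
    """In-memory stand-in so the sketch runs without any Azure dependency."""

    def __init__(self) -> None:
        self.exceptions: List[BaseException] = []
        self.metrics: List[Tuple[str, float]] = []

    def track_exception(self, exc: BaseException) -> None:
        self.exceptions.append(exc)

    def track_metric(self, name: str, value: float) -> None:
        self.metrics.append((name, value))
```

An alert rule would then be configured on the custom metric rather than on the general exception count.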

Related

Microsoft Flow with File Created Action is not triggered all the time

I have a local folder synced with OneDrive, and files added to this folder are synced to a SharePoint site. I also have a Flow that gets triggered for every file added.
The detailed article about what I am achieving here can be found here.
The problem is that it is not triggered every time. Let's say I added 100 files and the Flow triggered only 78 times. Is there a limit on how many times a Flow can run in a given timeframe? Has anyone else faced this issue? Any help is really appreciated.
Finally, after spending a few hours, I got it working with 120 files at the same time. The flow runs smoothly and efficiently now. Here is what I did.
Click on the three dots on your trigger in the flow, and then click on Settings.
In the new screen, enable Split On (without this my Flow was not getting triggered) and set the Array value. Clicking on the array dropdown will give you the matching value. Then turn on Concurrency Control and set the Degree of Parallelism to the maximum (50 as of now).
According to Microsoft:
Concurrency Control is to Limit the number of concurrent runs of the flow or leave it off to run as many as possible at the same time. Concurrency control changes the way new runs are queued. It cannot be undone once enabled.

Azure Logic Apps Trigger not fired

I am testing Azure Logic Apps for a use case where I want to parse new tweets and write them to SQL. The flow works seamlessly.
But the problem is that although I have selected 1 sec for the "How often do you want to check for items?" field, it seems triggers are not fired automatically. I have to press Run Trigger to capture new tweets.
Is there any idea how to overcome this problem?
Thank you
The "How often do you want to check for items" setting means the trigger checks every 1 second whether a new tweet was posted; it does not mean the Logic App runs every 1 second. If the problem is not caused by this misunderstanding, please check if any of the following limits are not met:
You can find more information on this tutorial.

Can I track unexpected lack of changes using change feeds, cosmos db and azure functions?

I am trying to understand change feeds in Azure. I see I can trigger an event when something changes in Cosmos DB. This is useful. However, in some situations, I expect a document to be changed after a while. A question should have a status change showing that it has been answered. After a while an order should have a status change to "confirmed", and a problem should have a status change to "resolved" or a priority change (to "low"). It is useful to trigger an event when such a change happens for a certain document. However, it is even more useful to trigger an event when such a change does not happen within a specified time (like 1 hour). A problem needs to be resolved after a while, an order needs to be confirmed after a while, and so on. Can I use change feeds and Azure Functions for that too? Or do I need something different? It is great that I can visualize changes (for example in Power BI) once they happen, but I am also interested in visualizing changes that do not occur when they are expected to.
Achieving that with Change Feed doesn't sound possible, because as you describe it, Change Feed is reacting based on operations/events that happen.
In your case it sounds as if you need an agent that runs every X amount of time (maybe an Azure Function with a TimerTrigger?) and executes a query to find items in state X that have not been modified in the past Y, a pre-defined interval (possibly the time interval associated with the TimerTrigger). This could be done by checking the _ts field of the state documents or your own timestamp field; see https://stackoverflow.com/a/39214165/5641598.
If your goal is to just deploy it on a dashboard, you could query using Power BI too.
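The check that such a timer-driven agent performs might be sketched as follows. This is plain Python over already-fetched documents, not a Cosmos DB query; it assumes each document carries the `_ts` system property (last-modified time in epoch seconds) and a hypothetical `status` field, and `find_stale` is an illustrative name.

```python
import time
from typing import Any, Dict, List, Optional


def find_stale(
    docs: List[Dict[str, Any]],
    pending_status: str,
    max_age_seconds: int,
    now: Optional[float] = None,
) -> List[Dict[str, Any]]:
    """Return documents still in pending_status and unmodified for too long."""
    current = time.time() if now is None else now
    cutoff = current - max_age_seconds
    return [
        d for d in docs
        # "status" is an assumed field name; "_ts" is last-modified epoch seconds.
        if d.get("status") == pending_status and d["_ts"] < cutoff
    ]
```

In practice the same filter would more likely be pushed into the query itself so that only stale items are returned to the function.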
As long as you don't need too much time precision (the Change Feed notifications are usually delayed by a few seconds) for this task, the Azure CosmosDB Change Feed could be easily used as a solution, but it would require some extra work from the Microsoft team to also support capturing deletion TTL expiration events.
A potential solution, if the Change Feed were to capture such TTL expiration events, would be: whenever you insert (or in your use case: change priority of) a document for which you want to monitor lack of changes, you also insert another document (possibly in another collection) that acts as a timer, specifying a TTL of 1h.
You would delete the timer document manually or by consuming the Change Feed for changes, in case a change actually happened.
You could also easily consume from the Change Feed the TTL expiration event and assert that if the TTL expired then there were no changes in the specified time window.
If you'd like this feature, you should consider voting on issues such as this one: https://github.com/Azure/azure-cosmos-dotnet-v2/issues/402 and feature requests such as this one: https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/14603412-execute-a-procedure-when-ttl-expires, which would make the Change Feed a perfect fit for scenarios such as yours. Sadly it is not available yet :(
TL;DR No, the Change Feed as it stands would not be a right fit for your use case. It would need some extra functionalities that are planned but not implemented yet.
PS. In case you'd like to know more about the Change Feed and its main use cases anyways, you can check out this article of mine :)

Azure Application Insights Alerts work only once

I am testing Azure Application Insights alert functionality. It seems to be either buggy or I don't know how to use it.
If I create a new alert, based on the metric 'Server Exceptions', it seems to work once then never again. Once it fires, it seems to go into a state of 'Active' where there is an orange triangle with an !. See the image below. I created a new one, that I haven't triggered, and as can be seen in the image it has a green circle with a tick.
This sort of implies to me that an alert won't fire again until one 'acknowledges' the alert, which is not a bad idea, but I can't see how to do that.
Edit :
I have just tried to use the 'Exception Rate' as suggested, but I think the minimum threshold to fire the alert would be an average of 1 exception per second over a 5-minute period.
I must say it seems strange that my use case isn't handled. I have a lightweight Web API service that is so simple it should never fail, but it could, and if an exception occurs I want to receive an alert straight away.
An alert is supposed to resolve, and its state to return to green, when the condition of the alert is no longer fulfilled.
This is exceptionally hard to achieve with "Count" metrics because they go up and up and almost never down. It means that, once fired, the alert won't resolve because the value of the metric stays over the threshold all the time.
You can try setting an alert on the "Rate" metric instead, and you should see the state return to green when the "Rate" is within the limits you set.
This is now fixed. Please let us know if you see any issues. Some things to keep in mind:
Alert rules are evaluated on a sliding window: an alert would trigger/resolve based on how the condition evaluates on a sliding window from the instant a sample arrives.
A caveat to the above for exception count based alert rules: we will resolve an alert if there are no exceptions reported for the time window configured in the rule.
Note: this is different from metrics based rules – lack of data does not result in the alert being resolved for those.
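The sliding-window behaviour for exception count rules can be illustrated with a toy Python model. This is an assumption-laden sketch of the description above, not how Application Insights is implemented: `ExceptionCountAlert` is a hypothetical name, and firing when the count in the window reaches the threshold is an assumed rule.

```python
from collections import deque
from typing import Deque


class ExceptionCountAlert:
    """Toy model: active while the exception count in the window >= threshold."""

    def __init__(self, window_seconds: float, threshold: int) -> None:
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.samples: Deque[float] = deque()  # timestamps of recent exceptions
        self.active = False

    def record_exception(self, now: float) -> None:
        # The rule is re-evaluated from the instant a sample arrives.
        self.samples.append(now)
        self.evaluate(now)

    def evaluate(self, now: float) -> bool:
        # Drop samples that have slid out of the window.
        while self.samples and self.samples[0] <= now - self.window_seconds:
            self.samples.popleft()
        # With no exceptions left in the window the count is 0, so the
        # alert resolves (for any threshold of at least 1).
        self.active = len(self.samples) >= self.threshold
        return self.active
```

With a 300-second window and a threshold of 1, a single exception activates the alert, and it resolves once 300 seconds pass without another one.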
The "Server exception" metric works as the OP expects now, in 2018. My use case is below.
Goal: get an email whenever an exception happens.
Use the "Server exception" metric.
That metric is smart enough to auto-resolve after waiting the period's length of time after the initial alert, if the error has not occurred again.
So you'll have the initial "Alert", then 5 minutes later of no Exceptions, it returns a "Healthy" state.
And since it auto-resolved, if the error happens again tomorrow it will do the "Alert" again.
Note this was using App Insights with a Function App. The Function App Failure metric had problems and wasn't reliable for this (Azure kept logging 0.2 Exception/s and thinking that was over the 1 in 5 min threshold...)

Patterns for idempotent operations in Azure?

Does anybody know patterns for designing idempotent operations in Azure, especially for Table Storage? The most common approach is to generate an operation ID and cache it to detect repeated executions, but if I have dozens of workers processing operations this approach becomes more complicated. :-))
Thanks
Ok, so you haven't provided an example, as requested by knightpfhor and codingoutloud. That said, here's one very common way to deal with idempotent operations: Push your needed actions to a Windows Azure queue. Then, regardless of the number of worker role instances you have, only one instance may work on a specific queue item at a time. When a queue message is read from the queue, it becomes invisible for the amount of time you specify.
Now: a few things can happen during processing of that message:
You complete processing after your timeout period. When you go to delete the message, you get an exception.
You realize you're running out of time, so you increase the queue message timeout (today, you must call the REST API to do this; one day it'll be included in the SDK).
Something goes wrong, causing an exception in your code before you ever get to delete the message. Eventually, the message becomes visible in the queue again (after specified invisibility timeout period).
You complete processing before the timeout and successfully delete the message.
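These outcomes can be reproduced with a small in-memory model. The Python sketch below is not the Azure Storage SDK; `ToyQueue` and `Message` are hypothetical, time is passed in explicitly, and the rule that a late delete raises simply follows the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    body: str
    visible_at: float = 0.0  # time from which other readers may see the message


class ToyQueue:
    """In-memory model of visibility timeouts; not a real queue client."""

    def __init__(self) -> None:
        self.messages: List[Message] = []

    def put(self, body: str) -> None:
        self.messages.append(Message(body))

    def get(self, now: float, visibility_timeout: float) -> Optional[Message]:
        for m in self.messages:
            if m.visible_at <= now:
                # Hide the message from other readers until the timeout lapses.
                m.visible_at = now + visibility_timeout
                return m
        return None

    def delete(self, message: Message, now: float) -> None:
        # Simplification: deleting after the timeout has lapsed always fails.
        if message.visible_at <= now:
            raise RuntimeError("visibility timeout expired before delete")
        self.messages.remove(message)
```

A worker that crashes never calls `delete`, so the message is handed out again by a later `get` once the timeout has passed.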
That deals with concurrency. For idempotency, that's up to you to ensure you can repeat an operation without side-effects. For example, you calculate someone's weekly pay, queue up a print job, and store the weekly pay in a Table row. For some reason, a failure occurs and you either don't ever delete the message or your code aborts before getting an opportunity to delete the message.
Fast-forward in time, and another worker instance (or maybe even the same one) re-reads this message. At this point, you should theoretically be able to simply re-perform the needed actions. If this isn't really possible in your case, you don't have an idempotent operation. However, there are a few mechanisms at your disposal to help you work around this:
Each queue message has a DequeueCount. You can use this to determine if the queue message has been processed before and, if so, take appropriate action (maybe examine the Table row for that employee, for example).
Maybe there are stages of your processing pipeline that can't be repeated. In that case: you now have the ability to modify the queue message contents while the queue message is still invisible to others and being processed by you. So, imagine appending something like |SalaryServiceCalled . Then a bit later, appending |PrintJobQueued and so on. Now, if you have a failure in your pipeline, you can figure out where you left off, the next time you read your message.
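The marker idea can be sketched in Python as below. The message format (`payload|Marker1|Marker2`), the `run_stages` function, and the `save` callback are all assumptions for illustration; `save` stands in for whatever call updates the contents of the still-invisible queue message.

```python
from typing import Callable, Dict


def run_stages(
    message: str,
    stages: Dict[str, Callable[[], None]],
    save: Callable[[str], None],
) -> None:
    """Run each stage whose marker is not yet in the message, saving progress."""
    payload, *done = message.split("|")
    for marker, action in stages.items():
        if marker in done:
            continue  # completed on an earlier attempt; do not repeat it
        action()
        done.append(marker)
        # Persist progress so a retry after a crash resumes from here.
        save("|".join([payload, *done]))
```

A crash between `action()` and `save(...)` would still repeat that one stage on retry, so this narrows the window for duplicates rather than closing it.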
Hope that helps. Kinda shooting in the dark here, not knowing more about what you're trying to achieve.
EDIT: I guess I should mention that I don't see the connection between idempotency and Table Storage. I think that's more of a concurrency issue, as idempotency would need to be dealt with whether using Table Storage, SQL Azure, or any other storage container.
I believe you can use a replay log kept in storage to solve this problem.
