I am trying to create a dashboard of my services in Azure. I added an Azure Metrics chart for each service and then wanted to add, under each chart, details about the operations included in that service.
But when I query the logs for this, I get a much higher number of requests. KQL:
requests
| where cloud_RoleName startswith "notificationengine"
| summarize Count = count() by operation_Name
| order by Count
And result:
The problem is that for some metrics charts the values differ minimally or match exactly, while for others, like the one shown, the values are completely different. I tried modifying the KQL and searching for what might be wrong, but never got anywhere.
My guess is that these are two different values, but in that case why are both labeled "requests", and what is the actual difference between them?
I tested with an Azure Function App containing two HTTP trigger functions whose names both start with “HttpTrigger”, and ran both functions a couple of times.
Test Case 1:
In the Logs workspace, this is the request count returned for the two functions whose names start with “HttpTrigger”:
But I pinned the request count chart of only one function to the Azure dashboard:
I suspect your query counts requests from all the services/applications whose names start with “notificationengine”, while the chart pinned to the dashboard covers only some of those apps/services.
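To make the suspected mismatch concrete, here is a minimal Python sketch (not KQL) of the two ways of counting. The role names and rows are made up for illustration; only the "notificationengine" prefix comes from the question.

```python
# Hypothetical request rows; the role names below are invented for illustration.
REQUESTS = [
    {"cloud_RoleName": "notificationengine-api", "operation_Name": "Send"},
    {"cloud_RoleName": "notificationengine-api", "operation_Name": "Send"},
    {"cloud_RoleName": "notificationengine-worker", "operation_Name": "Send"},
    {"cloud_RoleName": "notificationengine-worker", "operation_Name": "Retry"},
    {"cloud_RoleName": "billing-api", "operation_Name": "Charge"},
]


def count_for_prefix(rows, role_prefix):
    """Count requests from every role whose name starts with the prefix."""
    return sum(1 for r in rows if r["cloud_RoleName"].startswith(role_prefix))


def count_for_role(rows, role_name):
    """Count requests from exactly one role (what a single pinned chart covers)."""
    return sum(1 for r in rows if r["cloud_RoleName"] == role_name)


prefix_total = count_for_prefix(REQUESTS, "notificationengine")
single_total = count_for_role(REQUESTS, "notificationengine-api")
print(prefix_total, single_total)
```

If the prefix matches more than one app, the log query total will be larger than the number on a chart scoped to a single app, which would look like the difference described in the question.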
Test Case 2:
Related
When looking at the metrics from our app services in Azure, I'm very confused by the Sum and Count aggregations for requests. They should be the same, according to the MS tech doc.
Count: The number of measurements captured during the aggregation interval.
When the metric is always captured with the value of 1, the count aggregation is equal to the sum aggregation. This scenario is common when the metric tracks the count of distinct events and each measurement represents one event. The code emits a metric record every time a new request arrives.
And this MS tech doc as well.
Though not the case in this example, Count is equal to Sum in cases where a metric is always captured with the value of 1. This is common when a metric tracks the occurrence of a transactional event--for example, the number of HTTP failures mentioned in a previous example in this article.
So, let's say that in a specific period there are 10 HTTP requests: the count of requests is 10, and the sum of requests is also 10.
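A small Python sketch of the two aggregations as the quoted docs describe them. The second data set is a hypothetical case where each record carries a value other than 1; it only shows when the two aggregations can diverge and is not a confirmed explanation for any particular App Service metric.

```python
from dataclasses import dataclass


@dataclass
class Aggregates:
    count: int    # number of measurements captured in the interval
    total: float  # sum of the measurement values in the interval


def aggregate(measurements):
    """Compute the Count and Sum aggregations for one interval."""
    return Aggregates(count=len(measurements), total=sum(measurements))


# Ten requests, each emitted as one measurement with value 1.
per_request = aggregate([1] * 10)

# Hypothetical: the same ten requests emitted as three pre-aggregated records.
batched = aggregate([4, 3, 3])

print(per_request)
print(batched)
```

In the first case Count and Sum are both 10; in the second, Sum is still 10 but Count is 3, because Count only says how many measurements were recorded.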
But ours are all different. Below are the Sum and Count metrics of one web app service; you can see they are very different. But why?
From the official REST API, we can see that Count and Sum are still different.
If you want more explanation, you can refer to the following post, or raise a support request for help.
Related Post:
Azure App Service Metrics - How to interpret Sum vs. Count related to requests?
I've added a chart using KQL and logs from Azure Log Analytics to a dashboard. I'm using make-series which works great but the catch is the following:
The logs I'm getting might not extend to the whole time range dictated by the dashboard. So basically I need access to the starttime/endtime (and time granularity) to make make-series cover the whole timerange.
e.g.
logs
| make-series
P90 = percentile(Elapsed, 90) default = 0,
Average = avg(Elapsed) default = 0
// ??? need start/end time to use in from/to
on TimeGenerated step 1m
Currently, this is not supported. There are feedback items requesting this feature: Support for time granularity selected in Azure Portal Dashboard, and Retrieve the portal time span and use it inside the kusto query.
Some people posted workarounds under the first feedback item; you can give them a try.
I posted on another question on this subject - you can do a bit of a hack in your KQL to get this working: https://stackoverflow.com/a/73064218/5785878
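For anyone unsure what "covering the whole time range" means here, this is a rough Python sketch of the effect an explicit start and end would have: every bin in the range gets a value, with a default where no logs exist. The start and end values are hypothetical stand-ins for the dashboard time range, which the query cannot read, and the half-open range [start, end) is an assumption of the sketch.

```python
from datetime import datetime, timedelta


def fill_series(points, start, end, step, default=0.0):
    """Return one (bin_start, value) pair per step in [start, end).

    `points` maps a bin start time to its aggregated value; bins with no
    data get `default`, like the `default = 0` in the question's query.
    """
    series = []
    t = start
    while t < end:
        series.append((t, points.get(t, default)))
        t += step
    return series


# Hypothetical dashboard range: 10:00 to 10:05, one-minute bins.
start = datetime(2024, 1, 1, 10, 0)
end = datetime(2024, 1, 1, 10, 5)

# Logs exist only for two of the five minutes.
observed = {
    datetime(2024, 1, 1, 10, 1): 120.0,
    datetime(2024, 1, 1, 10, 2): 95.0,
}

series = fill_series(observed, start, end, timedelta(minutes=1))
values = [v for _, v in series]
print(values)
```

Without the explicit range, only the two observed bins would exist and the chart would not span the full dashboard window.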
I want to monitor consecutive exceptions.
For example if I get 'X' amount of '500' exceptions in a row, I want it to trigger an action group.
How to write this in Kusto?
I know how to monitor amount of exceptions over a 1min period but I'm a bit stuck on how to monitor consecutive exceptions.
What you are looking for is a custom log alert on Application Insights.
Here is the step-by-step guide on how to set it up.
You can use the following query with the summarize operator:
exceptions
| where timestamp >= datetime('2019-01-01')
| summarize min(timestamp) by operation_Id
Please use the query like below:
exceptions
| summarize count() by xxx
For more details about summarize operator, refer to this article.
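Note that a plain count per time window does not tell you whether the failures were consecutive. As a language-neutral illustration of the "in a row" part, here is a minimal Python sketch (not Kusto). It assumes the result codes are already sorted by timestamp; the function names and sample data are made up for the example.

```python
def longest_failure_run(result_codes, failure_code="500"):
    """Length of the longest unbroken run of `failure_code`.

    `result_codes` must be ordered by timestamp, oldest first.
    """
    longest = 0
    current = 0
    for code in result_codes:
        # Extend the run on a failure, reset it on anything else.
        current = current + 1 if code == failure_code else 0
        longest = max(longest, current)
    return longest


def should_alert(result_codes, threshold):
    """True when at least `threshold` failures occurred in a row."""
    return longest_failure_run(result_codes) >= threshold


codes = ["200", "500", "500", "200", "500", "500", "500", "200"]
print(longest_failure_run(codes), should_alert(codes, 3), should_alert(codes, 4))
```

The sample contains five failures in total, but the longest run is three, so a threshold of 4 would not trigger. A Kusto version would need the same two ingredients: rows ordered by timestamp, and a run counter that resets on every non-failure.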
I want to get the workload for each team member from the Azure DevOps API, so that I can build a visualization like the one in the picture and here: https://learn.microsoft.com/en-us/azure/devops/boards/sprints/adjust-work?view=azure-devops
I already saw that there is a capacity endpoint: https://learn.microsoft.com/en-us/rest/api/azure/devops/work/capacities/list?view=azure-devops-rest-5.1.
But this shows only the available hours for each member in a week. I want all work items per member (hours summed up).
Is there a possible way to achieve this? Am I missing something?
I’m afraid there is currently no REST API that returns this value directly. The value is computed at the backend in several steps before it is displayed on the page. After you assign the work items to members, the page looks like the following: it shows all work items per member (hours summed up) along with the capacity hours.
If you want to get the value through the REST API, you can capture the request the page makes with Fiddler and then follow that request format with your own values. Alternatively, you can get the member capacity and the iteration work days, then use a script to calculate the value manually.
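The manual calculation could look roughly like this Python sketch. It assumes you have already fetched the work items and capacities; the dictionary keys (`assigned_to`, `remaining_hours`) and the sample numbers are hypothetical and are not the actual field names of the Azure DevOps responses.

```python
from collections import defaultdict


def workload_per_member(work_items):
    """Sum the remaining hours of all work items per assigned member."""
    totals = defaultdict(float)
    for item in work_items:
        totals[item["assigned_to"]] += item["remaining_hours"]
    return dict(totals)


def capacity_hours(capacity_per_day, working_days):
    """Total capacity of one member over the iteration's working days."""
    return capacity_per_day * working_days


# Hard-coded sample data standing in for the API responses.
items = [
    {"assigned_to": "alice", "remaining_hours": 6.0},
    {"assigned_to": "bob", "remaining_hours": 4.0},
    {"assigned_to": "alice", "remaining_hours": 2.5},
]

workload = workload_per_member(items)
alice_capacity = capacity_hours(capacity_per_day=6, working_days=10)
print(workload, alice_capacity)
```

Comparing each member's summed hours against their capacity gives the same kind of per-member bar that the sprint page shows. Days off would have to be subtracted from `working_days` separately.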
I have an Azure Function App with Azure Functions that I individually want to monitor with the following rule: If an Azure Function didn't execute for N amount of minutes, send out an email/notification.
I am wondering if this is possible with the Application Insights Alerts, which do provide signal logic for the count on a per-function basis. But this count is never 0; in the graphs it appears that any count < 1 is not treated as a number. It displays as --, as you can see in the graph for my test function below:
testfunction chart (don't have enough reputation to post images)
The peak on the chart is seen as a 3, but if I use the condition "Whenever the testfunction Count is Less than 1" then the alert is never triggered.
Changing the aggregation granularity doesn't really do much, since the signal logic doesn't ever seem to record a count of 0, or any count smaller than 1.
There are lots of (slightly) more inconvenient ways to do this type of monitoring, but it seemed very possible with the nice built-in Azure Application Insights Alerts and I'd like to use that if at all possible.
Am I trying to misuse Application Insights Alerts or is there something obvious that I'm not getting? I would think it should be possible to have monitoring rules based on a lack of executions.
you might have to do this with log/query alerts instead. If you're doing metric based alerts, some of those don't send 0's as data. so if nothing happened during a time range, there's no 0's to alert on, since nothing is submitting 0, 0, 0, 0.
instead, you'd create alerts based on queries: https://learn.microsoft.com/en-us/azure/azure-monitor/platform/alerts-unified-log
the doc has this exact scenario listed:
In some cases, you may want to create an alert in the absence of an event.
For example, a process may log regular events to indicate that it's working properly. If it doesn't log one of these events within a particular time period, then an alert should be created. In this case, you would set the threshold to less than 1. [emphasis added, this is your scenario, correct]?
Example of Number of Records type log alert
Consider a scenario where you want to know when your web-based App gives a response to users with code 500 (that is) Internal Server Error. You would create an alert rule with the following details:
Query: requests | where resultCode == "500"
Time period: 30 minutes
Alert frequency: five minutes
Threshold value: Greater than 0
in that example the query would end up being something like requests | where timestamp > ago(30m) | where resultCode == "500" because of the time period set. (the query itself can then filter that time range/result set down however you want)
so for yours, you'd probably just do requests with no where condition at all, and whatever time period and frequency you have, and "less than one" as the threshold.
you could make much more complicated queries as well, to filter out test data, etc.
one thing to watch out for is that I believe log alerts will fire an alert every time the frequency elapses. so if you had a requests < 1 alert set up for every 5 minutes, and your function had no calls for 2 hours, the alert is going to fire every 5 minutes, sending you 24 emails or whatever. maybe you want that :)
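here's a rough python sketch of that behavior, assuming the rule is evaluated once per frequency interval and fires on every evaluation that meets the "less than one" condition. the function names and timestamps are made up; it only simulates the counting, it doesn't talk to Azure.

```python
from datetime import datetime, timedelta


def fires(event_times, evaluated_at, window, threshold=1):
    """True when fewer than `threshold` events fall in the lookback window."""
    window_start = evaluated_at - window
    count = sum(1 for t in event_times if window_start < t <= evaluated_at)
    return count < threshold


def count_firings(event_times, start, duration, frequency, window):
    """Evaluate the rule every `frequency` and count how often it fires."""
    firings = 0
    evaluated_at = start + frequency
    while evaluated_at <= start + duration:
        if fires(event_times, evaluated_at, window):
            firings += 1
        evaluated_at += frequency
    return firings


quiet_start = datetime(2024, 1, 1, 12, 0)

# No function executions at all for two hours, checked every five minutes.
quiet_firings = count_firings(
    event_times=[],
    start=quiet_start,
    duration=timedelta(hours=2),
    frequency=timedelta(minutes=5),
    window=timedelta(minutes=5),
)
print(quiet_firings)
```

with a two-hour gap and a five-minute frequency that's 24 evaluations, and every one of them sees zero records. a longer frequency is the simplest way to cut the noise if that's too many.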