Is it possible to reduce AzureDiagnostics logs (Log Analytics)?

Looking for a way to reduce the size of the AzureDiagnostics table in Azure Log Analytics.
Is it possible to reduce the log collection frequency? For example, get CPU or disk stats every 1h instead of every 5m, for all or selected resources?
Or, alternatively, is there a way to clean up logs sooner than the default retention period of 31 days?

As per the updated Microsoft documentation, it is still not possible to reduce the retention period below the default for cleaning up logs:
You can set the workspace default retention policy in the Azure portal to 30, 31, 60, 90, 120, 180, 270, 365, 550, and 730 days. To set a different policy, use the Resource Manager configuration method described below. If you're on the free tier, you need to upgrade to the paid tier to change the data retention period.
If you are looking for the average of CPU stats, you can use the query below, which aggregates over 5-minute bins:
Perf
| where ObjectName == "Processor Information" and CounterName == "% Processor Time" and InstanceName == "_Total"
| summarize AggregatedValue = avg(CounterValue) by Computer, bin(TimeGenerated, 5m)
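Note that widening the bin only changes the query-time aggregation, not how often the agent samples the counter (the sample interval is configured under the workspace's agent configuration). If hourly granularity is enough for your charts, a variant of the same query binned at 1h:
Perf
| where ObjectName == "Processor Information" and CounterName == "% Processor Time" and InstanceName == "_Total"
| summarize AggregatedValue = avg(CounterValue) by Computer, bin(TimeGenerated, 1h) // hourly bins instead of 5-minute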
You can also raise a feature request over here, which may also help other people with the same issue.

Related

Alerts with Azure Monitor Agent Metrics

I am using the Azure Monitor Agent (AMA) to monitor a virtual machine.
I need to create an alert if the free disk space is less than 10%.
For this purpose I'm using the guest metric "disk/free_percent", with mean as the type of data aggregation.
On the graph, are the values on the ordinate the percentage of free disk? Using the df command on the virtual machine, I get quite different values than the ones shown on the dashboard.
I have to create an alert if free disk space is below 10%. What query do I have to write using "disk/free_percent" to accomplish that task?
I've tried using the "less than" operator, "number" as the unit, and 10 as the threshold value.
Disk space is generally reported in GB/MB units.
Instead of monitoring on a percentage basis, create an alert that checks whether the free disk space is less than 10 GB.
As discussed in this Microsoft Q&A thread, I tried this in my environment with a few modifications and got the expected output for disk space.
Query:
let setgbvalue = 10; // alert threshold in GB
Perf
| where ObjectName == "LogicalDisk" and CounterName == "Free Megabytes"
| where InstanceName !contains "C:" // exclude the OS disk
| where InstanceName !contains "_Total" // exclude the aggregate instance
| extend FreeSpaceGB = CounterValue / 1024 // convert MB to GB
| summarize FreeSpace = max(FreeSpaceGB) by InstanceName
| where FreeSpace < setgbvalue
Output:
If the requirement is strictly percentage-based, you can derive it with similar arithmetic, dividing the free space by the disk's total size and multiplying by 100.
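Alternatively, a minimal sketch using the LogicalDisk "% Free Space" counter (assuming your agents collect that counter), which reports percent free directly:
let percentThreshold = 10; // alert threshold in percent
Perf
| where ObjectName == "LogicalDisk" and CounterName == "% Free Space"
| where InstanceName !contains "_Total" // exclude the aggregate instance
| summarize FreePercent = min(CounterValue) by Computer, InstanceName
| where FreePercent < percentThreshold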

Access dashboard's time range and granularity from KQL

I've added a chart to a dashboard using KQL and logs from Azure Log Analytics. I'm using make-series, which works great, but the catch is the following:
The logs I'm getting might not extend across the whole time range dictated by the dashboard. So I need access to the start time/end time (and time granularity) to make make-series cover the whole time range.
e.g.
logs
| make-series
    P90 = percentile(Elapsed, 90) default = 0,
    Average = avg(Elapsed) default = 0
    // ??? need start/end time to use in from/to
    on TimeGenerated step 1m
Currently, it's not supported. There is some feedback asking for this feature: Support for time granularity selected in Azure Portal Dashboard, and Retrieve the portal time span and use it inside the kusto query.
Some people provided workarounds in the first feedback item; you can give those a try.
I posted an answer on another question on this subject - you can do a bit of a hack in your KQL to get this working: https://stackoverflow.com/a/73064218/5785878
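For illustration, a minimal sketch of the general idea (not necessarily the linked hack verbatim): derive the series bounds from whatever data survives the dashboard's time filter; logs and Elapsed are placeholders from the question.
let _start = toscalar(logs | summarize min(TimeGenerated));
let _end = toscalar(logs | summarize max(TimeGenerated));
logs
| make-series
    P90 = percentile(Elapsed, 90) default = 0,
    Average = avg(Elapsed) default = 0
    on TimeGenerated from _start to _end step 1m
This only spans the extent of the returned data rather than the full dashboard range, which is exactly the gap the linked hack tries to close.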

Azure Response Time Monitoring per Url with a range

I am trying to configure a dashboard consisting of a few business-critical functions that we need to monitor for performance against our SLAs.
For example, a landing page URL that retrieves records needs to be fast; the accepted SLA is:
Green < 1sec
Amber 1 sec - 2 secs
Red > 2 secs
We were able to configure the same in Splunk based on flat-file logs. However, we have not been able to configure anything similar in Azure.
As of now I have not been able to create a dashboard for our requirement. Any type of graphical representation is OK for us. Based on this monitoring, we may need to react and improve performance over time when it becomes slow.
You can use the below Kusto query in Application Insights:
requests
| where timestamp > ago(2h) // set the time range
| where url == "http://localhost:54917/" // set the url here
| summarize avg_time = avg(duration)
| extend my_result = case(
    avg_time <= 1000, "good", // 1000 milliseconds
    avg_time <= 2000, "normal", // 2000 milliseconds
    "bad"
)
Note:
1. The unit of avg_time is milliseconds.
2. When avg_time <= 1000 ms, the dashboard shows "good"; when <= 2000 ms, it shows "normal"; when > 2000 ms, it shows "bad".
The query result (change it to Chart):
Then in dashboard:
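If you need the status per URL rather than for a single page, a sketch applying the same thresholds across all request URLs (same assumptions as above):
requests
| where timestamp > ago(2h) // set the time range
| summarize avg_time = avg(duration) by url
| extend my_result = case(
    avg_time <= 1000, "good",
    avg_time <= 2000, "normal",
    "bad"
)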
An approximate solution that can serve your purpose:
use a request-time-vs-time chart along with reference lines for your SLA thresholds.
That way you can see at any moment whether the response time is below or above the threshold.
// Response time trend
// Chart request duration over the last 12 hours with SLA reference lines
requests
| where timestamp > ago(12h)
| summarize avgRequestDuration = avg(duration) by bin(timestamp, 10m) // use a time grain of 10 minutes
| extend Green = 200, Amber = 400, Red = 800 // reference lines in milliseconds
| render timechart // render must come last, after the reference columns are added
It would look something like the below.
I think this is much more useful than the meter-like UI you had before, which gives a health indication only for the current moment; with a continuous time plot you get a better picture of the trend.
If you run the same query in Azure Workbooks, you can use the "thresholds" renderer in grids or tiles to apply that kind of if/then/else coloring to each range.
That would get you:
You can then pin that grid/tiles/graph to an Azure dashboard. (If the query uses a workbooks time range parameter, it will inherit the dashboard's time range and auto-update as well.)

How to change retention duration for Azure Application Insights?

At the moment most of the data is retained for 90 days by default. I was wondering if there is a way to change this setting to 30-40 days. I know that I can export the data to keep it longer, but what I'm looking for is mainly to keep the data for a shorter duration because of upcoming regulations.
Update
The default retention for Application Insights resources is 90 days. Different retention periods can be selected for each Application Insights resource. The full set of available retention periods is 30, 60, 90, 120, 180, 270, 365, 550 or 730 days.
Note: If you need to keep data longer than 730 days, you can use Continuous Export to copy it to a storage account during data ingestion.
To change the retention, from your Application Insights resource, go to the Usage and Estimated Costs page and select the Data Retention option:
Reference
Sometimes the only answer is a no. In this case, you can't. From the docs:
Raw data points (that is, items that you can query in Analytics and inspect in Search) are kept for up to 90 days. If you need to keep data longer than that, you can use continuous export to copy it to a storage account.
Aggregated data (that is, counts, averages and other statistical data that you see in Metric Explorer) are retained at a grain of 1 minute for 90 days.
I remember that a long time ago the pricing tier dictated the maximum retention period but it is now fixed to 90 days for all plans.
You can give your feedback / ask for this feature here.
It is now available as an option in the Azure portal. If you don't see it, you need to get in touch with Azure support to have it activated.

Azure CPU metric (Live Metrics Stream)

Below shows the 'CPU Total' as displayed on the Azure Live Metrics page for our Web App, which is scaled out to 4 x S3 instances.
It's not clear to me (despite much research) whether this CPU Total is a percentage of the max CPU available for the instance or something else. I have noticed that the CPU Total has crept above 100% from time to time, which makes me question whether it is a percentage of the total.
If this metric is not a % of the total: is there anywhere in the Azure portal that will show the CPU usage of your servers as a plain percentage, and not a percentage multiplied by core count or anything else?
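For comparison, a sketch that charts a plain per-instance CPU percentage from the Application Insights performanceCounters table (assuming performance counters are being collected for the app):
performanceCounters
| where name == "% Processor Time" // reported per instance as a plain percentage
| summarize avg(value) by bin(timestamp, 5m), cloud_RoleInstance
| render timechart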
