Grafana azure log analytics transfer query from logs - azure

I have this query that works in Azure Logs when I set the scope to the specific Application Insights resource I want to use:
let usg_events = dynamic(["*"]);
let mainTable = union pageViews, customEvents, requests
| where timestamp > ago(1d)
| where isempty(operation_SyntheticSource)
| extend name =replace("\n", "", name)
| where '*' in (usg_events) or name in (usg_events)
;
let queryTable = mainTable;
let cohortedTable = queryTable
| extend dimension =tostring(client_CountryOrRegion)
| extend dimension = iif(isempty(dimension), "<undefined>", dimension)
| summarize hll = hll(user_Id) by tostring(dimension)
| extend Users = dcount_hll(hll)
| order by Users desc
| serialize rank = row_number()
| extend dimension = iff(rank > 5, 'Other', dimension)
| summarize merged = hll_merge(hll) by tostring(dimension)
| project ["Country or region"] = dimension, Counts = dcount_hll(merged);
cohortedTable
But trying to use the same query in Grafana just gives an error:
"'union' operator: Failed to resolve table expression named 'pageViews'"
This is the same error I get in Azure Logs if I don't set the scope to the specific Application Insights resource. So my question is: how do I make Grafana target this specific scope inside the logs? The query just gets the countries of the users that log in.

As far as I know, there is currently no option to set the scope in Grafana.
The scope setting is available only in the Azure Log Analytics workspace.
If you want this feature, please raise a ticket in the Grafana Community, where all the issues are officially addressed.

Related

Connector name from Kusto query

I am very new to the syntax of Kusto queries. My goal is to create a Kusto query that retrieves which Logic App has a system error and in which action the error occurred. Additionally, I would like to know which connector the failed action belongs to. For example, if the action "Move Email" failed, I would like to get the connector name, in this case Office 365 Outlook or something similar, in order to classify the action.
My query to achieve this goal is based on the table "AzureDiagnostics":
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.LOGIC"
| where Category == "WorkflowRuntime"
| where status_s == "Failed"
| where code_s !has 'ActionFailed'
| where OperationName has "workflowActionCompleted" or OperationName has "workflowTriggerCompleted"
| extend ResourceName = coalesce(resource_actionName_s, resource_triggerName_s)
| extend ResourceCategory = substring(OperationName, 34, strlen(OperationName) - 43)
| project
LogicAppName = resource_workflowName_s,
ResourceCategory,
ResourceName,
LogicAppId = resource_runId_s,
ErrorCode = code_s,
ErrorMessage = error_message_s,
ErrorTime = format_datetime(startTime_t,'dd.MM.yyyy')
The connector name would let me classify the failed Logic Apps, so that I can create a report showing which types of connector we are having issues with.
Thanks in advance for your help, or for another workaround to classify the failed Logic Apps.
After reproducing this on our end, one workaround is to fetch the action name of the failed step along with its status using the query below.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.LOGIC"
| where Category == "WorkflowRuntime"
| where status_s == "Failed"
| extend Status = code_s
| project
LogicAppName = resource_workflowName_s,
ResourceRunID = resource_runId_s,
Operation = OperationName,
ActionName = coalesce(resource_actionName_s, resource_triggerName_s),
Status
Updated Answer
There is no direct way to get the connector's name. One workaround is to use tracked properties to save the connector name and retrieve it through the logs. It is not perfect, but it achieves the requirement.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.LOGIC"
| where OperationName == "Microsoft.Logic/workflows/workflowActionCompleted"
| where status_s == "Failed"
| extend Status = code_s
| project
LogicAppName = resource_workflowName_s,
ResourceRunID = resource_runId_s,
Operation = OperationName,
ActionName = coalesce(resource_actionName_s, resource_triggerName_s),
Status,
ConnectorName = trackedProperties_ConnectorName_s

Azure Custom Log Alert - Not Firing

I am trying to debug an issue with an Azure Alert not firing. This alert should run every 30 minutes and find any devices that have not emitted a heartbeat in the last 30 minutes up to the hour. In addition, an alert should only be fired once for each device until it becomes healthy again.
The Kusto query is:
let missedHeartbeatsFrom30MinsAgo = traces
| where message == "Heartbeat"
| summarize arg_max(timestamp, *) by tostring(customDimensions.id)
| project Id = customDimensions_id, LastHeartbeat = timestamp
| where LastHeartbeat < ago(30m);
let missedHeartbeatsFrom1HourAgo = traces
| where message == "Heartbeat"
| summarize arg_max(timestamp, *) by tostring(customDimensions.id)
| project Id = customDimensions_id, LastHeartbeat = timestamp
| where LastHeartbeat <= ago(1h);
let unhealthyIds = missedHeartbeatsFrom30MinsAgo
| join kind=leftanti missedHeartbeatsFrom1HourAgo on Id;
let deviceDetails = customEvents
| where name == "Heartbeat"
| distinct tostring(customDimensions.deviceId), tostring(customDimensions.fullName)
| project Id = customDimensions_deviceId, FullName = customDimensions_fullName;
unhealthyIds
| join kind=leftouter deviceDetails on Id
| project Id, FullName, LastHeartbeat
| order by FullName asc
The rules for this alert are:
When I pull the plug on a device, wait ~30 minutes, and run the query manually in App Insights, I see the device in the results data set. However, no alert gets generated (nothing shows up in the Alerts history page and no one in the Action Group gets notified). Any help in this matter would be greatly appreciated!
Your KQL query takes a long time to execute and consumes a lot of resources.
Optimize the query to reduce resource utilization and get the result back faster.
Make sure the Status of your alert processing rule is Enabled.
Once that is done, make sure the alert condition is set to a query result Greater than or equal to 1, so that the rule fires when the query returns at least one row.
If the alert still does not fire, delete it, run your query in the query editor, and create a new alert rule from there.
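To check what the question's query is expected to return at a given moment, the window logic can be reproduced outside KQL. This is only a sketch under stated assumptions: the function name `unhealthy_ids` and the sample device data are hypothetical, and the set difference stands in for the `leftanti` join.

```python
from datetime import datetime, timedelta


def unhealthy_ids(last_heartbeats, now):
    """Return IDs whose latest heartbeat is older than 30 minutes
    but not older than (or equal to) 1 hour, mirroring the query."""
    older_than_30m = {
        device for device, ts in last_heartbeats.items()
        if ts < now - timedelta(minutes=30)
    }
    older_than_1h = {
        device for device, ts in last_heartbeats.items()
        if ts <= now - timedelta(hours=1)
    }
    # Equivalent of: missedHeartbeatsFrom30MinsAgo | join kind=leftanti ... on Id
    return older_than_30m - older_than_1h


now = datetime(2024, 1, 1, 12, 0)
sample = {
    "a": datetime(2024, 1, 1, 11, 50),  # 10 minutes ago
    "b": datetime(2024, 1, 1, 11, 20),  # 40 minutes ago
    "c": datetime(2024, 1, 1, 10, 30),  # 90 minutes ago
    "d": datetime(2024, 1, 1, 11, 0),   # exactly 1 hour ago
}
result = unhealthy_ids(sample, now)
```

In this sketch a device is returned only while its last heartbeat is between 30 and 60 minutes old, so a device that stays offline matches for at most 30 minutes.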

Excluding data in KQL SLA charts

We show SLA charts for URLs, VPNs and VMs. If there is planned scheduled maintenance, we want to exclude that time from the KQL SLA charts because it is known downtime.
We disable alerts via PowerShell during this time, and we pass the columns below to a Log Analytics custom table.
"resourcename": "$resourcename",
"Alertstate": "Enabled",
"Scheduledmaintenance" : "stop",
"Environment" : "UAT",
"timestamp": "$TimeStampField",
Now we want to join the SLA chart queries with the custom table data and exclude the scheduled maintenance time range from the SLA charts.
Adding query as per request
---------------------------
url_json_CL
| where Uri_s contains "xxxx"
| extend Availablity = iff(StatusCode_d ==200,1.000,0.000)
| extend urlhit = 1.000
| summarize PassCount = sum(Availablity), TestCount = sum(urlhit) by Uri_s ,ClientName_s
| extend AVLPERCENTAGE = ((PassCount / TestCount ) * 100)
| join kind=leftouter
( scheduledmaintenance2_CL
| where ResourceName_s == "VMname"
| where ScheduledMaintenance_s == "start"
| extend starttime = timestamp_t)
on ClientName_s
| join kind= leftouter
(scheduledmaintenance2_CL
| where ResourceName_s == "VMname"
| where ScheduledMaintenance_s == "stop"
| extend stoptime = timestamp_t )
on ClientName_s
| extend excludedtime=stoptime - starttime
| project ClientName_s, ResourceName_s, excludedtime, AVLPERCENTAGE , Uri_s
| top 3 by ClientName_s desc
You can perform cross-resource log queries in Azure Monitor.
From the Application Insights logs explorer you can also query Log Analytics workspace custom tables:
workspace("/subscriptions/xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx/resourcegroups/rgname/providers/Microsoft.OperationalInsights/workspaces/workspacename").Event | count
From the Log Analytics logs explorer you can query the Application Insights availability results:
app("applicationinsightsinstancename").availabilityResults
You can use any of the above options to query the required tables and join the tables. Please refer to this documentation on joins.
Additional documentation reference.
Hope this helps.
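The calculation the question is after, availability with maintenance windows left out, can be sketched independently of how the tables are joined. The Python below is an illustration only and is not the KQL solution: the function name `availability_excluding_maintenance`, the tuple layout and the sample data are assumptions, with each start/stop pair standing in for a "start" and "stop" row from the custom table.

```python
from datetime import datetime


def availability_excluding_maintenance(probes, windows):
    """probes: list of (timestamp, status_code).
    windows: list of (start, stop) maintenance intervals.
    Returns the pass percentage over probes outside every window,
    or None when no probe is left to count."""
    counted = [
        (ts, code) for ts, code in probes
        if not any(start <= ts <= stop for start, stop in windows)
    ]
    if not counted:
        return None
    passed = sum(1 for _, code in counted if code == 200)
    return 100.0 * passed / len(counted)


probes = [
    (datetime(2024, 1, 1, 10, 0), 200),
    (datetime(2024, 1, 1, 10, 5), 500),   # falls inside maintenance
    (datetime(2024, 1, 1, 10, 15), 200),
    (datetime(2024, 1, 1, 10, 20), 200),
    (datetime(2024, 1, 1, 10, 25), 500),
]
maintenance = [(datetime(2024, 1, 1, 10, 3), datetime(2024, 1, 1, 10, 10))]

with_exclusion = availability_excluding_maintenance(probes, maintenance)
without_exclusion = availability_excluding_maintenance(probes, [])
```

The point of the sketch is that the exclusion has to be applied to individual probe rows before the counts are summed; subtracting a duration after the percentage has been computed does not change the percentage.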

Search Query should contain 'AggregatedValue' and 'bin(timestamp, [roundTo])' for Metric alert type

I'm trying to create a custom metric alert based on some metrics in my Application Insights logs. Below is the query I'm using:
let start = customEvents
| where customDimensions.configName == "configName"
| where name == "name"
| extend timestamp, correlationId = tostring(customDimensions.correlationId), configName = tostring(customDimensions.configName);
let ending = customEvents
| where customDimensions.configName == "configName"
| where name == "anotherName"
| where customDimensions.taskName == "taskName"
| extend timestamp, correlationId = tostring(customDimensions.correlationId), configName = tostring(customDimensions.configName), name= name, nameTimeStamp= timestamp ;
let timeDiffs = start
| join (ending) on correlationId
| extend timeDiff = nameTimeStamp- timestamp
| project timeDiff, timestamp, nameTimeStamp, name, anotherName, correlationId;
timeDiffs
| summarize AggregatedValue=avg(timeDiff) by bin(timestamp, 1m)
When I run this query in Analytics page, I get results, however when I try to create a custom metric alert, I got the error Search Query should contain 'AggregatedValue' and 'bin(timestamp, [roundTo])' for Metric alert type
The only response I found was adding AggregatedValue which I already have, I'm not sure why custom metric alert page is giving me this error.
I found what was wrong with my query. Essentially, the aggregated value needs to be numeric, but AggregatedValue=avg(timeDiff) produces a timespan value. Because the values were in the range of seconds, this was a bit hard to notice. Converting the result to an int solves the problem.
I have just updated the last bit as follows:
timeDiffs
| summarize AggregatedValue=toint(avg(timeDiff)/time(1ms)) by bin(timestamp, 5m)
This raises another challenge with the Aggregate On setting when creating the alert, because AggregatedValue is not one of the grouping columns that follow the by clause.
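The conversion trick, dividing a duration by a unit duration to get a plain number, is not specific to KQL. As an analogy only, here is the same idea in Python, where `timedelta` plays the role of the KQL timespan; the helper name `timespan_to_ms` and the sample values are made up for illustration.

```python
from datetime import timedelta


def timespan_to_ms(span: timedelta) -> int:
    # Dividing one duration by another yields a plain number,
    # comparable to `avg(timeDiff) / time(1ms)` in the query above.
    return int(span / timedelta(milliseconds=1))


diffs = [timedelta(seconds=2), timedelta(seconds=3)]
average = sum(diffs, timedelta()) / len(diffs)   # still a duration
aggregated_value = timespan_to_ms(average)       # now a number of milliseconds
```

Here `average` is still a duration, which is the kind of value the alert rejected, while `aggregated_value` is an integer count of milliseconds.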

Azure Log Analytics - Search REST API - How to Paginate through results

When fetching search results using the Azure Log Analytics Search REST API, I'm able to receive only the first 5000 results (as per the specs, at the top of the document), but I know there are many more (from the "total" attribute in the response metadata).
Is there a way to paginate so as to get the entire result set?
One hacky way would be to attempt to break down the desired time-range iteratively until the "total" is less than 5000 for that timeframe, and do this process iteratively for the entire desired time-range - but this is guesswork that will cost many redundant requests.
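For reference, the time-range splitting idea described above could look roughly like the following. This is a sketch, not a client for the REST API: `count_in_range` is a hypothetical callable that would issue a request and return the "total" for a window, and `fake_count` is a stand-in that pretends there is one row per second.

```python
from datetime import datetime, timedelta


def split_until_under_limit(start, end, count_in_range, limit=5000,
                            min_width=timedelta(seconds=1)):
    """Return contiguous (start, end) windows whose row count fits the limit.
    A window is halved until its count is small enough or it reaches min_width."""
    total = count_in_range(start, end)
    if total <= limit or (end - start) <= min_width:
        return [(start, end)]
    mid = start + (end - start) / 2
    return (split_until_under_limit(start, mid, count_in_range, limit, min_width)
            + split_until_under_limit(mid, end, count_in_range, limit, min_width))


def fake_count(start, end):
    # Hypothetical stand-in for a real request: one row per second.
    return int((end - start).total_seconds())


range_start = datetime(2024, 1, 1, 0, 0)
range_end = datetime(2024, 1, 1, 4, 0)
windows = split_until_under_limit(range_start, range_end, fake_count)
```

Each split costs one extra counting request per half, which is the overhead the question refers to.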
While there doesn't appear to be a way to paginate using the REST API itself, you can use your query to perform the pagination. The two key operators here are TOP and SKIP:
Suppose you want page n with pagesize x (starting at page 1), then append to your query:
query | skip (n-1) * x | top x.
For a full reference list, see https://learn.microsoft.com/en-us/azure/log-analytics/log-analytics-search-reference
Yes, the skip operation is not available anymore, but there is still an option if you want to create pagination. You need to count the total number of entries, then use simple math and two opposite sorts.
The prerequisites for this query are the values ContainerName, Namespace, SearchText, Page and PageSize.
I'm using it in a Workbook where these values are set by fields.
let containers = KubePodInventory
| where ContainerName matches regex '^.*{ContainerName}$' and Namespace == '{Namespace}'
| distinct ContainerID
| project ContainerID;
let TotalCount = toscalar(ContainerLog
| where ContainerID in (containers)
| where LogEntry contains '{SearchText}'
| summarize CountOfLogs = count()
| project CountOfLogs);
ContainerLog
| where ContainerID in (containers)
| where LogEntry contains '{SearchText}'
| extend Log=replace(@'(\x1b\[[0-9]*m|\x1b\[0 [0-9]*m)','', LogEntry)
| project TimeGenerated, Log
| sort by TimeGenerated asc
| take {PageSize}*{Page}
| top iff({PageSize}*{Page} > TotalCount, TotalCount - ({PageSize}*({Page} - 1)) , {PageSize}) by TimeGenerated desc;
// The '| extend' step is not needed if the logs do not contain the annoying special characters
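The arithmetic behind the take/top combination may be easier to follow outside KQL. Below is a minimal Python sketch, not part of the original answer, that emulates the two opposite sorts on an in-memory list. The function name `page_via_take_and_top` and the sample rows are illustrative assumptions.

```python
def page_via_take_and_top(rows, page, page_size):
    """Emulate: sort asc | take PageSize*Page | top N by ... desc."""
    total = len(rows)
    # First sort ascending and keep everything up to the end of the page.
    taken = sorted(rows)[: page_size * page]
    # The last page may be shorter than page_size.
    if page_size * page > total:
        keep = total - page_size * (page - 1)
    else:
        keep = page_size
    # Guard added in this sketch only: pages past the end come back empty.
    keep = max(keep, 0)
    # Opposite sort: the newest `keep` rows of what was taken form the page.
    return sorted(taken, reverse=True)[:keep]


rows = list(range(1, 12))            # 11 rows standing in for log entries
page_1 = page_via_take_and_top(rows, 1, 5)
page_2 = page_via_take_and_top(rows, 2, 5)
page_3 = page_via_take_and_top(rows, 3, 5)
```

Each page comes back in descending order, as it does after the final `top ... desc` in the query. The `max(keep, 0)` guard has no counterpart in the query above.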
