Application Insights: Analytics - how to extract string at specific position - azure

I'd like to:
Extract the "query" values where param=1, as shown in "2." below.
Get the pageViews counts in Analytics as a table, as shown in "3." below.
1. Actual urls included in pageView
https://example.com/dir01/?query=apple&param=1
https://example.com/dir01/?query=apple&param=1
https://example.com/dir01/?query=lemon+juice&param=1
https://example.com/dir01/?query=lemon+juice&param=0
https://example.com/dir01/?query=tasteful+grape+wine&param=1
2. Value expected to extract
apple
lemon+juice
tasteful+grape+wine
3. Expected output in AI Analytics
Query Parameters | Count
apple | 2
lemon+juice | 1
tasteful+grape+wine | 1
What I tried
https://learn.microsoft.com/en-us/azure/application-insights/app-insights-analytics-reference#parseurl
https://aka.ms/AIAnalyticsDemo
I think extract or parseurl(url) should be useful. I tried the latter, parseurl(url), but I don't know how to extract "Query Parameters" as one column.
pageViews
| where timestamp > ago(1d)
| extend parsed_url=parseurl(url)
| summarize count() by tostring(parsed_url)
| render barchart
url
http://aiconnect2.cloudapp.net/FabrikamProd/
parsed_url
{"Scheme":"http","Host":"aiconnect2.cloudapp.net","Port":"","Path":"/FabrikamProd/","Username":"","Password":"","Query Parameters":{},"Fragment":""}

Yes, parseurl is the way to do it. It returns a dynamic value that you can index into like JSON.
To get the "query" value of the query parameters:
pageViews
| where timestamp > ago(1d)
| extend parsed_url=parseurl(url)
| extend query = tostring(parsed_url["Query Parameters"]["query"])
and to summarize by the param value:
pageViews
| where timestamp > ago(1d)
| extend parsed_url=parseurl(url)
| extend query = tostring(parsed_url["Query Parameters"]["query"])
| extend param = toint(parsed_url["Query Parameters"]["param"])
| summarize sum(param) by query
You can see how it works on your example values in the demo portal:
let vals = datatable(url:string)["https://example.com/dir01/?query=apple&param=1",
"https://example.com/dir01/?query=apple&param=1",
"https://example.com/dir01/?query=lemon+juice&param=1",
"https://example.com/dir01/?query=lemon+juice&param=0",
"https://example.com/dir01/?query=tasteful+grape+wine&param=1"];
vals
| extend parsed = parseurl(url)
| extend query = tostring(parsed["Query Parameters"]["query"])
| extend param = toint(parsed["Query Parameters"]["param"])
| summarize sum(param) by query
Hope this helps,
Asaf
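If the goal is the exact table from "3." in the question, one option is to filter on param == 1 and count rows rather than summing param. This is a minimal sketch using the sample URLs from the question. parse_url() is the current name for parseurl(), and the Count column name is only an illustrative choice:

```kusto
// Sample URLs copied from the question.
let vals = datatable(url:string)
[
    "https://example.com/dir01/?query=apple&param=1",
    "https://example.com/dir01/?query=apple&param=1",
    "https://example.com/dir01/?query=lemon+juice&param=1",
    "https://example.com/dir01/?query=lemon+juice&param=0",
    "https://example.com/dir01/?query=tasteful+grape+wine&param=1"
];
vals
| extend parsed = parse_url(url)
| extend query = tostring(parsed["Query Parameters"]["query"])
| extend param = toint(parsed["Query Parameters"]["param"])
// Keep only the rows where param=1, then count them per query value.
| where param == 1
| summarize Count = count() by query
```

Unlike sum(param), this drops query values that never appear with param=1 instead of listing them with a total of 0.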

Related

Azure Kusto syntax

I need to run a very simple query
requests
| where cloud_RoleName == "blabla"
| summarize Count=count() by url
| order by Count desc
The only thing I need is to get the data from just the past 5 minutes.
If I try this:
requests | where timestamp < ago(5m)
| where cloud_RoleName == "blabla"
| summarize Count=count() by url
| order by Count desc
or this:
requests
| where cloud_RoleName == "blabla" and timestamp < ago(5m)
| summarize Count=count() by url
| order by Count desc
both of them return data older than 5 minutes.
I've read the docs and I see no other way of writing this query.
Can anyone assist?
Make sure to check if the timestamp is greater than the result of ago().
It returns the timestamp from e.g. 5 minutes ago, so if you want the data that is within last 5 minutes, you want the ones with a timestamp higher than that.
So the query should be:
requests
| where timestamp > ago(5m)
| where cloud_RoleName == "blabla"
| summarize Count=count() by url
| order by Count desc
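To see why the comparison direction matters, it can help to print what ago() returns. A small sketch (the column names are illustrative):

```kusto
// ago(5m) is a point in time five minutes before now(),
// so rows from the last five minutes satisfy timestamp > ago(5m).
print cutoff = ago(5m), current = now()
| extend cutoff_is_earlier = cutoff < current
```

cutoff_is_earlier should come back true. A filter of timestamp < ago(5m) therefore keeps everything before the cutoff, which is the older data.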

Semantic error: Unsupported calculated column name GET /dbs/*/colls/*/pkranges Kusto

I have calculated columns in my Kusto query. One of the column names is 'GET /dbs/*/colls/*/pkranges'. When I run the query, I get this error:
Semantic error: Unsupported calculated column name GET /dbs/*/colls/*/pkranges Kusto
Can someone help with replacing the column name dynamically, or during the calculation itself?
My query is below:
dependencies
| where operation_Id in (operation_ids)
| where timestamp > ago(7d)
| summarize duration_list=make_list_with_nulls(duration) by tostring(name), operation_Id
| extend p = pack(tostring(name), duration_list)
| summarize bag = make_bag(p) by operation_Id
| evaluate bag_unpack(bag);
Thanks in advance!!
You can replace the invalid character (* in this case) in the key with something else, using replace_string(), as follows:
dependencies
| where operation_Id in (operation_ids)
| where timestamp > ago(7d)
| summarize duration_list=make_list_with_nulls(duration) by tostring(name), operation_Id
| extend p = pack(replace_string(name, '*', '_'), duration_list)
| summarize bag = make_bag(p) by operation_Id
| evaluate bag_unpack(bag);
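To check the replacement on its own before running the full query, a one-row sketch like this can be used (the column names original and sanitized are illustrative):

```kusto
// Replace every '*' in the dependency name with '_'.
print original = 'GET /dbs/*/colls/*/pkranges'
| extend sanitized = replace_string(original, '*', '_')
// sanitized: GET /dbs/_/colls/_/pkranges
```

replace_string() does a plain-text replacement, so the * does not need to be escaped as it would in a regular expression.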

KQL time graph number of vulnerabilities

I have a query that fetches the number of unique vulnerabilities found in our images in our Azure Container Registry:
securityresources
| where type == 'microsoft.security/assessments/subassessments'
| where id matches regex '(.+?)/providers/Microsoft.Security/assessments/dbd0cb49-b563-45e7-9724-889e799fa648/'
| parse id with registryResourceId '/providers/Microsoft.Security/assessments/' *
| parse registryResourceId with * "/providers/Microsoft.ContainerRegistry/registries/" registryName
| extend imageDigest = tostring(properties.additionalData.imageDigest), repository = tostring(properties.additionalData.repositoryName)
| project
registryName,
repository,
imageDigest,
severity = properties.status.severity,
vulnId = properties.id,
displayName = properties.displayName,
description = properties.description,
remediation = properties.remediation,
category = properties.category,
impact = properties.impact,
timeGenerated = properties.timeGenerated
| distinct tostring(vulnId)
| summarize count()
I would like to have a graph that shows the number of vulnerabilities over a period of time, so we can see (visually) whether the number of vulnerabilities is going down (or up), but I have no clue how to do this. Hopefully someone can help me achieve this.
Instead of distinct tostring(vulnId) | summarize count(), try either of the following:
summarize dcount() by bin(timeGenerated, 1h)
make-series dcount() on timeGenerated step 1h
and then add | render timechart at the end.
For example:
securityresources
| where type == 'microsoft.security/assessments/subassessments'
| where id matches regex '(.+?)/providers/Microsoft.Security/assessments/dbd0cb49-b563-45e7-9724-889e799fa648/'
| extend vulnId = tostring(properties.id), timeGenerated = todatetime(properties.timeGenerated)
| summarize dcount(vulnId) by bin(timeGenerated, 1h)
| render timechart

How do I exclude List B from List A in Kusto?

I would like to get an overview of recent SpecialEvents, the ones that already have a comment named 'Skip' need to be excluded from list A. Since comments is an array I can't simply put everything in one query with a where clause (it will not process Comments since it only contains value: '[]').
How do I combine these two tables (Show everything from List A except the ones that are in List B)?
// List A: Show all Event created less than 1 hour ago
SpecialEvent
| where TimeGenerated < ago(1h)
| distinct uniqueNumber
| project uniqueNumber
// List B: Don't add the ones that contain 'skip'
SpecialEvent
| mvexpand parsejson(Comments)
| extend commentMsg = Comments.message
| where commentMsg contains 'SKIP'
| distinct uniqueNumber
| project uniqueNumber
If I understand your question correctly, you could use the !in() operator or an anti-join.
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/inoperator
https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/joinoperator
For example:
let list_a =
SpecialEvent
| where TimeGenerated < ago(1h)
| distinct uniqueNumber
;
SpecialEvent
| where uniqueNumber !in(list_a)
| mv-expand parsejson(Comments) // you could also use 'mv-apply' and perform the filters on 'SKIP' under that scope
| extend commentMsg = Comments.message
| where commentMsg contains 'SKIP'
| distinct uniqueNumber
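The anti-join route can be sketched as follows. This is only an illustration under stated assumptions: the datatable stands in for SpecialEvent with made-up rows, Comments is assumed to be a JSON string holding an array of objects with a message field, and the time filter is left out for brevity.

```kusto
// Hypothetical sample rows standing in for the real SpecialEvent table.
let SpecialEvent = datatable(uniqueNumber:string, Comments:string)
[
    "E1", '[]',
    "E2", '[{"message":"skip this one"}]',
    "E3", '[{"message":"looks fine"}]'
];
// List B: events that have at least one comment containing 'skip'.
let list_b =
    SpecialEvent
    | mv-expand comment = parse_json(Comments)
    | where tostring(comment.message) contains "skip"
    | distinct uniqueNumber;
// List A minus List B: keep only the events with no match in list_b.
SpecialEvent
| distinct uniqueNumber
| join kind=leftanti (list_b) on uniqueNumber
```

With these sample rows, only E2 lands in list_b, so the result should contain E1 and E3. Replacing the join line with | where uniqueNumber !in (list_b) is the equivalent !in() form.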

Application Insights Log Query Get Latest Row in a Group

I'm trying to find the latest row of each member of a group in Application Insights.
Here's the query:
traces |
where timestamp > ago(1h) |
where message startswith "TEST DONE" |
order by timestamp desc nulls last |
extend json=parse_json(substring(message,10)) |
summarize any(timestamp, tostring(json.status)) by tostring(json.testKey)
It does return just one row but it's not the latest, it's any random row from the set of possible rows.
I think you're looking for the arg_max function?
https://learn.microsoft.com/en-us/azure/kusto/query/arg-max-aggfunction
Something like:
traces |
where timestamp > ago(1h) |
where message startswith "TEST DONE" |
order by timestamp desc nulls last |
extend json=parse_json(substring(message,10)) |
extend testKey = tostring(json.testKey) |
extend status = tostring(json.status) |
summarize arg_max(timestamp, status) by testKey
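A small self-contained sketch of how arg_max picks the latest row per group, using made-up rows in place of the parsed traces:

```kusto
// Hypothetical sample data: two results for "alpha", one for "beta".
datatable(timestamp:datetime, testKey:string, status:string)
[
    datetime(2024-01-01 10:00:00), "alpha", "failed",
    datetime(2024-01-01 10:05:00), "alpha", "passed",
    datetime(2024-01-01 10:02:00), "beta",  "passed"
]
// For each testKey, keep the row with the greatest timestamp
// and return its status alongside it.
| summarize arg_max(timestamp, status) by testKey
```

For "alpha" this returns the 10:05 row with status "passed". arg_max does not depend on the input order, so a preceding order by is not needed for it to work.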
You can use makelist([column name], 1) to pick the first one, then refer to it by index. Using this technique, I was able to solve the above problem on my dataset.
Here is an adaptation to your query:
traces |
where timestamp > ago(1h) |
where message startswith "TEST DONE" |
order by timestamp desc nulls last |
extend json=parse_json(substring(message,10)) |
extend testKey = tostring(json.testKey) |
summarize timeStampList=makelist(timestamp, 1), statusList=makelist(tostring(json.status), 1) by testKey |
project timeStampList[0], statusList[0], testKey
