I've been writing some queries against AppInsights and noticed that in my data there are two places where a username can appear in the telemetry.
customEvents
| where tostring(parse_json(tostring(customDimensions)).username) != '' or tostring(parse_json(tostring(customDimensions.Properties)).username) != ''
| project
Username = tostring(parse_json(tostring(customDimensions)).username),
timestamp = timestamp
| distinct Username, bin(timestamp, 1d)
| summarize count() by bin(timestamp, 1d)
| render timechart
I'm a bit stuck. Notice that the first where clause checks two places to determine whether a record is valid. How do I change the projection to say "if username is here, take it from here, else check in customDimensions.Properties"?
I assume we need a union from somewhere?
You could use the coalesce() function:
datatable(customDimensions: string)
[
'{"username": "user1"}',
'{"Properties": {"username": "user2"}}'
]
| where customDimensions has 'username'
| extend cd = parse_json(customDimensions)
| project UserName = tostring(coalesce(cd.username, cd.Properties.username))
| where isnotempty(UserName)
UserName
user1
user2
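Applied to the query from the question, the same idea might look like the sketch below. It assumes customDimensions.Properties may arrive either as a nested object or as a JSON string (hence the extra parse_json(tostring(...)), as in the question); the column names cd, props, Day and Users are illustrative, not from the original.

```kusto
customEvents
| extend cd = parse_json(tostring(customDimensions))
// Assumption: Properties may be stringified JSON, so re-parse it defensively
| extend props = parse_json(tostring(cd.Properties))
// Take the top-level username if present, else fall back to Properties.username
| extend Username = tostring(coalesce(cd.username, props.username))
| where isnotempty(Username)
// One row per user per day, then count the users in each day
| summarize by Username, Day = bin(timestamp, 1d)
| summarize Users = count() by Day
| render timechart
```

No union is needed here, because both candidate values live on the same row.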
There is an AKS cluster running that is connected to Log Analytics in Azure.
I'm trying to view logs of named PODs using the following query snippet:
let KubePodLogs = (clustername:string, podnameprefix:string) {
let ContainerIdList = KubePodInventory
| where ClusterName =~ clustername
| where Name startswith strcat(podnameprefix, "-")
| where strlen(ContainerID)>0
| distinct ContainerID;
ContainerLog
| where ContainerID in (ContainerIdList)
| join (KubePodInventory | project ContainerID, Name, PodLabel, Namespace, Computer) on ContainerID
| project TimeGenerated, Node=Computer, Namespace, PodName=Name1, PodLabel, ContainerID, LogEntry
};
KubePodLogs('aks-my-cluster', 'my-service') | order by TimeGenerated desc
The above query does return rows for the matching PODs, but not all the rows that are actually available.
I tried to check the results of the partial queries by inspecting the POD details:
KubePodInventory
| where ClusterName =~ 'aks-my-cluster'
| where Name startswith 'my-service-'
| where strlen(ContainerID)>0
| distinct ContainerID;
gives me a container ID. Feeding this container ID into another query shows more
results than the combined query from above. Why?
ContainerLog
| where ContainerID == "aec001...fc31"
| order by TimeGenerated desc
| project TimeGenerated, ContainerID, LogEntry
One thing I noticed is that the results of the latter, simple query contain log rows whose LogEntry field was parsed from JSON-formatted output of the POD. In the results I can expand LogEntry into more fields corresponding to the original JSON data of that POD's log output.
In other words, it seems like the combined query (with a join) skips those ContainerLog entries that have a JSON LogEntry, but why?
As far as I can see, the combined query doesn't filter on the LogEntry field in any way.
A changed query seems to produce the results I would expect.
I replaced the join with a lookup and added more columns to the distinct over KubePodInventory:
let KubePodLogs = (clustername:string, podnameprefix:string) {
let ContainerIdList = KubePodInventory
| where ClusterName =~ clustername
| where Name startswith strcat(podnameprefix, "-")
| where strlen(ContainerID)>0
| distinct ContainerID, PodLabel, Namespace, PodIp, Name;
ContainerLog
| where ContainerID in (ContainerIdList)
| lookup kind=leftouter (ContainerIdList) on ContainerID
| project-away Image, ImageTag, Repository, Name, TimeOfCommand
| project-rename PodName=Name1
};
KubePodLogs('aks-my-cluster', 'my-service') | order by TimeGenerated desc
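One likely explanation for the missing rows, which the thread itself does not confirm: a join written without an explicit kind defaults to kind=innerunique, which de-duplicates the left side on the join key before matching. With ContainerLog on the left, that would leave only one log row per ContainerID. The sketch below reproduces the effect with made-up data; the table and value names are purely illustrative.

```kusto
let Logs = datatable(ContainerID: string, LogEntry: string)
[
    "c1", "first line",
    "c1", "second line",
    "c1", "third line"
];
let Pods = datatable(ContainerID: string, Name: string)
[
    "c1", "my-service-abc"
];
// Default kind (innerunique): the left rows are de-duplicated on ContainerID,
// so only one of the three log lines is returned.
Logs
| join (Pods) on ContainerID
// Swapping in kind=inner keeps all three log lines:
// Logs | join kind=inner (Pods) on ContainerID
```

lookup does not de-duplicate the left table, which would be consistent with the changed query returning the expected rows.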
We collect custom events in Application Insights for each message a user sends to a chatbot. The event is called User_Message.
We use a custom dimension field, customDimensions.conversationId, to know which message belongs to which conversation.
I want to see the first message of each conversation, so basically the event with the "oldest" timestamp for each conversation ID.
I tried to work with arg_max but couldn't figure out how it works.
customEvents
| extend itemType = iif(itemType == 'customEvent',itemType,"")
| where (itemType == 'customEvent')
| where name == 'User_Message'
I was able to show all user messages ordered by the conversation ID; however, it shows multiple rows per conversation and I only need the first message of each conversation.
Datamodel:
timestamp [UTC] 2019-04-05T13:24:10.359Z
name User_Message
itemType customEvent
customDimensions
confidence N/A
conversationId BNu0SqC5RfA1S0lZmdxxxxx
intent N/A
userMessage user text
operation_Name POST /api/messages
operation_Id xxxxxxxa5d422eadebfebb2
operation_ParentId xxxxx545a5d422eadebfebb2.99811380_13.f033f887_
application_Version 1.0.0
client_Type PC
client_OS Windows_NT 10.0.14393
client_IP 0.0.0.0
client_City Amsterdam
client_StateOrProvince North Holland
client_CountryOrRegion Netherlands
cloud_RoleName Web
cloud_RoleInstance XXXXXXXFF74D594
appId ccccccc-8b24-41bb-a02a-1cb101da84e5
appName bot-XXXXX
iKey XXXXXX
sdkVersion node:XX
itemId XXXXXXXX-57a6-11e9-a5a7-ebc91e7cf64e
itemCount 1
SOLUTION
customEvents
| extend itemType = iif(itemType == 'customEvent',itemType,"")
| where (itemType == 'customEvent')
| where (name=='User_Message')
| summarize list=makeset(customDimensions.userMessage) by
tostring(customDimensions.conversationId)
| mv-expand firstMessage=list[0]
Update:
customEvents
| where name == "User_Message"
| summarize timestamp=min(timestamp) by myconid=tostring(customDimensions.conversationId)
| join kind= inner (
customEvents
| where name == "User_Message"
| extend myconid = tostring(customDimensions.conversationId)
) on myconid,timestamp
You can use an inner join to do that.
I don't have your data, so in your case the code would look like the following (you may need to make a few changes):
customEvents
| summarize timestamp=min(timestamp) by conversationID
| join kind= inner (
customEvents
) on conversationID,timestamp
| project-away conversationID1,timestamp1
Please let me know if you have more issues.
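Since the question mentions arg_max, it may be worth noting that its counterpart arg_min can return the earliest row per group without a self-join. This is a minimal sketch, assuming the conversation ID is stored under customDimensions.conversationId as shown in the data model above; the output column names conversationId and userMessage are illustrative.

```kusto
customEvents
| where name == "User_Message"
// For each conversation, keep the whole row that has the smallest timestamp
| summarize arg_min(timestamp, *) by conversationId = tostring(customDimensions.conversationId)
| project conversationId, timestamp, userMessage = tostring(customDimensions.userMessage)
```

arg_min(timestamp, *) returns the minimum timestamp plus every other column from that same row, which is why customDimensions is still available in the final project.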
I'm trying to create a custom metric alert based on some metrics in my Application Insights logs. Below is the query I'm using:
let start = customEvents
| where customDimensions.configName == "configName"
| where name == "name"
| extend timestamp, correlationId = tostring(customDimensions.correlationId), configName = tostring(customDimensions.configName);
let ending = customEvents
| where customDimensions.configName == "configName"
| where name == "anotherName"
| where customDimensions.taskName == "taskName"
| extend timestamp, correlationId = tostring(customDimensions.correlationId), configName = tostring(customDimensions.configName), name= name, nameTimeStamp= timestamp ;
let timeDiffs = start
| join (ending) on correlationId
| extend timeDiff = nameTimeStamp- timestamp
| project timeDiff, timestamp, nameTimeStamp, name, anotherName, correlationId;
timeDiffs
| summarize AggregatedValue=avg(timeDiff) by bin(timestamp, 1m)
When I run this query on the Analytics page, I get results. However, when I try to create a custom metric alert, I get the error "Search Query should contain 'AggregatedValue' and 'bin(timestamp, [roundTo])' for Metric alert type".
The only suggestion I found was to add AggregatedValue, which I already have, so I'm not sure why the custom metric alert page is giving me this error.
I found what was wrong with my query. Essentially, the aggregated value needs to be numeric, but AggregatedValue=avg(timeDiff) produces a timespan value. It was displayed in seconds, so this was a bit hard to notice. Converting it to an int solves the problem.
I have just updated the last bit as follows:
timeDiffs
| summarize AggregatedValue=toint(avg(timeDiff)/time(1ms)) by bin(timestamp, 5m)
This brings another challenge with Aggregate On while creating the alert, as AggregatedValue is not part of the grouping that comes after the by statement.
I need to combine the requests and customMetrics tables by parsed URL. The output should contain the common parsed URL, the average duration of requests, and the average value from customMetrics.
This code doesn't work :(
let parseUrlOwn = (stringUrl:string) {
let halfparsed = substring(stringUrl,157);
substring(halfparsed,0 , indexof(halfparsed, "?"))
};
customMetrics
| where name == "Api.GetData"
| extend urlURI = tostring(customDimensions.RequestedUri)
| extend urlcustomMeticsParsed = parseUrlOwn(urlURI)
| extend unionColumnUrl = urlcustomMeticsParsed
| summarize summaryCustom = avg(value) by unionColumnUrl
| project summaryCustom, unionColumnUrl
| join (
requests
| where isnotempty(cloud_RoleName)
| extend urlRequestsParsed = parseUrlOwn(url)
| extend unionColumnUrl = urlRequestsParsed
| summarize summaryRequests =sum(itemCount), avg(duration)
| project summaryRequests, unionColumnUrl
) on unionColumnUrl
Instead of inventing your own URL parsing, how about using parse_url (https://docs.loganalytics.io/docs/Language-Reference/Scalar-functions/parse_url())?
It also appears that the summarize line in the requests side of the join isn't summarizing by URL, so I'm not sure how that works.
Shouldn't this line:
| summarize summaryRequests =sum(itemCount), avg(duration)
be
| summarize summaryRequests =sum(itemCount), avg(duration) by unionColumnUrl
like it is in the metrics part of the query? Also, why are you calculating the average in that summarize? You're just throwing it away by not projecting it on the next line.
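Putting both suggestions together, the query might look roughly like the sketch below. It assumes that the Path component returned by parse_url is the part of the URL the original substring logic was trying to extract, which may not hold for your data; avgDuration is an illustrative column name.

```kusto
requests
| where isnotempty(cloud_RoleName)
// parse_url returns a dynamic bag; Path is the URL without host or query string
| extend unionColumnUrl = tostring(parse_url(url).Path)
| summarize summaryRequests = sum(itemCount), avgDuration = avg(duration) by unionColumnUrl
| join kind=inner (
    customMetrics
    | where name == "Api.GetData"
    | extend unionColumnUrl = tostring(parse_url(tostring(customDimensions.RequestedUri)).Path)
    | summarize summaryCustom = avg(value) by unionColumnUrl
) on unionColumnUrl
| project unionColumnUrl, avgDuration, summaryRequests, summaryCustom
```

Both sides are summarized by unionColumnUrl before joining, so each URL contributes exactly one row per side.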
I want to perform a subselect on a related set of data. That subdata needs to be filtered using data from the main query:
customEvents
| extend envId = tostring(customDimensions.EnvironmentId)
| extend organisation = tostring(customDimensions.OrganisationName)
| extend version = tostring(customDimensions.Version)
| extend app = tostring(customDimensions.Appname)
| where customDimensions.EventName contains "ApiSessionStartStart"
| extend dbInfo = toscalar(
customEvents
| extend dbInfo = tostring(customDimensions.dbInfo)
| extend serverEnvId = tostring(customDimensions.EnvironmentId)
| where customDimensions.EventName == "ServiceSessionStart" or customDimensions.EventName == "ServiceSessionContinuation"
| where serverEnvId == envId // This gives an error
| project dbInfo
| take 1)
| order by timestamp desc
| project timestamp, customDimensions.OrganisationName, customDimensions.Version, customDimensions.onBehalfOf, customDimensions.userId, customDimensions.Appname, customDimensions.apiKey, customDimensions.remoteIp, session_Id , dbInfo, envId
The above query results in an error:
Failed to resolve entity 'envId'
How can I filter the data in the subselect based on the field envId in the main query?
I believe you'd need to use join instead, and join to get that value from the second query.
Docs for join: https://docs.loganalytics.io/docs/Language-Reference/Tabular-operators/join-operator
The left-hand side of the join is your "outer" query and the right-hand side is the "inner" query. Instead of doing take 1, though, you'd probably write a simpler query that just gets the distinct values of serverEnvId and dbInfo.
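A rough sketch of that join, under a few assumptions that are not in the original: kind=leftouter is chosen so that session rows without a matching environment are kept (as the toscalar version would have done), and take_any picks one arbitrary dbInfo per environment in place of take 1. Only a subset of the original output columns is projected.

```kusto
customEvents
| where tostring(customDimensions.EventName) contains "ApiSessionStartStart"
| extend envId = tostring(customDimensions.EnvironmentId)
| join kind=leftouter (
    customEvents
    | where tostring(customDimensions.EventName) in ("ServiceSessionStart", "ServiceSessionContinuation")
    | extend envId = tostring(customDimensions.EnvironmentId),
             dbInfo = tostring(customDimensions.dbInfo)
    // One row per environment, so the join cannot multiply the outer rows
    | summarize dbInfo = take_any(dbInfo) by envId
) on envId
| order by timestamp desc
| project timestamp, envId, dbInfo, session_Id
```

The inner query cannot see envId from the outer rows, which is why the correlation has to be expressed as a join key rather than as a where clause inside toscalar.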