Azure Query Analytics average all values in column

I am using Application Insights to record custom measurements about our application. I have a customEvent that has data stored in the customMeasurements object. The object contains 4 key-value pairs. I have many of these customEvents, and I am trying to average the key-value pairs from all the events and display the results in a 2-column table.
I want one table with 2 columns. The first column is the key name, and the second column is that key's value averaged across all the events.
For example, event1 has key1's value set to 2. event2 has key1's value set to 6. If those are the only two events I received in the last 7 days, I want my table to show the number 4 in the row containing data for key1.
I can only average 1 key per query since I cannot put multiple summarizes inside of 1 query... Here is what I have for averaging the first key in the customMeasurements object:
customEvents
| where name == "PerformanceMeasurements"
| where timestamp > ago(7d)
| summarize key1average=avg(toint(customMeasurements.key1))
| project key1average
But I need to average all the keys inside of this object and build 1 table as described above.
For reference, I have attached a screenshot of the layout of a customEvent customMeasurements object:

If the number of keys is limited and known beforehand, then I'd recommend using multiple aggregations within a single | summarize operator, separated by commas:
| summarize key1average=avg(toint(customMeasurements.key1)), key2average=avg(toint(customMeasurements.key2)), key3average=avg(toint(customMeasurements.key3))
If the keys may vary, then you'd need to flatten out the custom dimensions first with the | mvexpand operator (spelled mv-expand in current Kusto):
customEvents
| where timestamp > ago(1h)
| where name == "EventName"
| project customDimensions
| mvexpand bagexpansion=array customDimensions
| extend Key = customDimensions[0], Value = customDimensions[1]
| summarize avg(toint(Value)) by tostring(Key)
In this case, each Key-Value pair from customDimensions will become its own row and you will be able to operate on those with the standard query language constructs.
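The flatten-then-aggregate idea can also be sketched outside Kusto. The Python below is only an illustration of the logic, not of how Application Insights evaluates the query; the `events` list and the `average_per_key` helper are hypothetical stand-ins for the property bags and the expand/summarize steps.

```python
from collections import defaultdict

# Hypothetical data: each dict stands in for one event's property bag.
events = [
    {"key1": 2, "key2": 10},
    {"key1": 6, "key2": 20},
]

def average_per_key(bags):
    """Turn every key-value pair into its own row, then average by key."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for bag in bags:
        for key, value in bag.items():  # one "row" per pair, like the expand step
            totals[key] += value
            counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}

averages = average_per_key(events)
for key, avg in sorted(averages.items()):
    print(key, avg)
```

With the two events from the question (key1 set to 2 and to 6), the row for key1 comes out as 4.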

Related

How can I combine duplicates of 1 column, then have multiple results in the same row of another column?

I am very new to KQL, and I am stuck on this query. I want the query to display which users have had sign-ins from different states. I created this query, but I do not know how to count the results in the column "names".
SigninLogs
| project tostring(LocationDetails.state), UserDisplayName
| extend p =pack( 'Locations', LocationDetails_state)
| summarize names = make_set(p) by UserDisplayName
This generates a column "names" with a row like so:
[{"Locations":"Arkansas"},{"Locations":"Iowa"},{"Locations":""}]
Here is a simple query that grabs all sign-ins from users and another column with the locations.
SigninLogs
| where ResultType == "0"
| summarize by UserDisplayName, tostring(LocationDetails.state)
Is there a way to combine the duplicates in the users column and then display each location in the second column? If so, could I count the locations in order to filter for users where the location count is > 1?

I am looking to have query display which users have had sign-ins from different states
Assuming I understood your question correctly, this could work (using array_length()):
SigninLogs
| project State = tostring(LocationDetails.state), UserDisplayName
| summarize States = make_set(State) by UserDisplayName
| where array_length(States) > 1 // filter users who had sign-ins from *more than 1 state*

How can I do a "GROUP BY WITH ROLLUP" in Kusto?

In T-SQL, when grouping results, you can also get a total row by specifying "WITH ROLLUP".
How can I achieve this in Kusto? Consider the following query:
customEvents | summarize counter = count() by name
The query above gives me a list of event names and how often they occurred. This is what I need, but I also want a row with the total (the count of all events).
It feels like there should be an easy way to achieve this, but I haven't found anything in the docs ...

You can write 2 queries: the first counts the occurrences of each event, and the second counts all events. Then use the union operator to combine them.
The query looks like this:
customEvents
| count
| extend name = "total",counter=Count
| project name,counter
| union
(customEvents
| summarize counter = count() by name)
Test result is as below:

Trying to calculate the average on a count of records in my query results

I'm trying to create a query in Application Insights that can show me the absolute and average number of messages in conversations over a particular time period. I'm using the LUIS trace example to get the context+LUIS information, which is where I'm pulling the conversationID from. I can get a table showing the number of messages per conversation, but I would also like to have an average number of messages for the data set. Either a static average or a rolling average (by pulling in timestamp) would be fine. I can get this value by adding a second summarize statement, but then I lose the granularity from the first. Here is my query.
requests
| where url endswith "messages"
| where timestamp > ago(30d)
| project timestamp, url, id
| parse kind = regex url with *"(?i)http://"botName".azurewebsites.net/api/messages"
| join kind= inner (
traces | extend id = operation_ParentId
) on id
| where message == "LUIS"
| extend convID = tostring(customDimensions.LUIS_botContext_conversation_id)
| order by timestamp desc nulls last
| project timestamp, botName, convID
| summarize messages=count() by conversation=convID
This gives me a table of conversation IDs with the message count for each conversation. I would also like to see the average number of messages per conversation. For example, if I have 4 conversations with 100 messages total, I want to see that the average is 25. I can get this result by doing a second summarize statement | summarize messages=sum(messages), avgMessages=avg(messages), but then of course I can no longer see the individual conversations. Is there any way to see both in the same table?

You can write 2 queries: one that "gives me a table of conversation IDs with the message count for each conversation", and another for "the average number of messages per conversation". Consider using a let statement to share the common part of the query.
The trick is that, in both queries, you add a line like | extend myidentifier="aaa" after the summarize statement.
Then you can join the 2 queries on myidentifier, so every conversation row is paired with the single average row.
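The join-on-a-constant trick can be illustrated outside Kusto. The Python below is a minimal sketch of the logic only, assuming made-up per-conversation counts; `per_conversation` and `with_overall_average` are hypothetical names, and the constant "aaa" mirrors the identifier suggested above.

```python
# Hypothetical per-conversation message counts (what the first summarize returns).
per_conversation = [
    {"conversation": "a", "messages": 10},
    {"conversation": "b", "messages": 40},
    {"conversation": "c", "messages": 30},
    {"conversation": "d", "messages": 20},
]

def with_overall_average(rows):
    # Query 1: the detail rows, each tagged with the same constant identifier.
    detail = [dict(row, myidentifier="aaa") for row in rows]
    # Query 2: a single summary row, tagged with the same identifier.
    summary = [{
        "myidentifier": "aaa",
        "avgMessages": sum(r["messages"] for r in rows) / len(rows),
    }]
    # Join on the constant: every detail row matches the one summary row.
    joined = []
    for d in detail:
        for s in summary:
            if d["myidentifier"] == s["myidentifier"]:
                merged = {**d, **s}
                del merged["myidentifier"]  # the helper column is no longer needed
                joined.append(merged)
    return joined

result = with_overall_average(per_conversation)
for row in result:
    print(row)
```

Each of the 4 conversations keeps its own count, and all of them carry the same average (100 messages over 4 conversations, so 25).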

I couldn't figure out how to do this without losing granularity from the first list (i.e. I couldn't figure out how to calculate average per period e.g. day), but the following query does at least get me the average across whatever timestamp filter I set, which ultimately gets me at the data I was looking for.
requests
| where url endswith "messages"
| where timestamp > ago(30d)
| project timestamp, url, id
| parse kind = regex url with *"(?i)http://"botName".azurewebsites.net/api/messages"
| join kind= inner (
traces | extend id = operation_ParentId
) on id
| where message == "LUIS"
| extend convID = tostring(customDimensions.LUIS_botContext_conversation_id)
| order by timestamp desc nulls last
| project timestamp, botName, convID
| summarize messages=count() by conversation=convID
| summarize conversations=count(), messageAverage=avg(messages)

Find by value on leveldb

I've been playing with leveldb and it's really good at what it's designed to do--storing and getting key/value pairs based on keys.
But now I want to do something more advanced and find myself immediately stuck. Is there no way to find a record by value? The only way I can think of is to iterate through the entire database until I find an entry with the value I'm looking for. This becomes worse if I'm looking for multiple entries with the value (basically a "where" query) since I have to iterate through the entire database every time I try to do this type of query.
Am I trying to do what Leveldb isn't designed to do and should I be using another database instead? Or is there a nice way to do this?

You are right: LevelDB only looks records up by key. What you need to know about is key composition.
Also note that even in an SQL WHERE clause you don't query by the whole value itself, but with a boolean condition on one field, like age = 42.
To answer your particular question, imagine you have a first key-value namespace in LevelDB where you store your objects, with the value serialized as JSON for instance:
        key          | value
---------------------+------------------------
namespace    | uid   | value
=====================+========================
users        | 1     | {name:"amz", age=32}
---------------------+------------------------
users        | 2     | {name:"abki", age=42}
In another namespace, you index the users' uid by age:
            key                  | value
---------------------------------+-------
namespace      | age  | uid      | value
=================================+=======
users-by-uid   | 32   | 1        | empty
---------------------------------+-------
users-by-uid   | 42   | 2        | empty
Here the value is empty because the key must be unique. What we would normally think of as the value of each row, the uid column, is composed into the key instead, so that each row's key is unique even when two users have the same age.
In that second namespace, every key that starts with (users-by-uid, 32) matches a record that answers the query age = 32.
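A minimal sketch of this layout, assuming an in-memory sorted store in place of a real LevelDB binding. `OrderedStore`, `put_user`, `find_by_age` and the third user are hypothetical, introduced only for illustration. A real LevelDB stores byte-string keys, so the tuples used here would have to be encoded so that byte order matches the intended order.

```python
import bisect
import json

class OrderedStore:
    """Tiny in-memory stand-in for an ordered key-value store such as LevelDB."""

    def __init__(self):
        self._keys = []    # kept sorted, like keys in LevelDB
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values[key]

    def scan_prefix(self, prefix):
        """Yield keys that start with the given tuple prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break  # keys are sorted, so the matching range has ended
            yield key

store = OrderedStore()

def put_user(uid, user):
    # Primary record: (namespace, uid) -> JSON value.
    store.put(("users", uid), json.dumps(user))
    # Index record: (namespace, age, uid) -> empty value.
    store.put(("users-by-uid", user["age"], uid), b"")

def find_by_age(age):
    # Range-scan the index, then fetch each primary record by the uid in the key.
    return [json.loads(store.get(("users", key[2])))
            for key in store.scan_prefix(("users-by-uid", age))]

put_user(1, {"name": "amz", "age": 32})
put_user(2, {"name": "abki", "age": 42})
put_user(3, {"name": "third", "age": 32})  # hypothetical extra user sharing an age

print(find_by_age(32))
```

The lookup touches only the keys under the (users-by-uid, 32) prefix instead of iterating over the whole database. The cost is that every write must update both namespaces.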

Cassandra: can you add dynamic columns within existing column clustering?

I'm using Cassandra 1.2.12 with CQL 3, and am having trouble modeling my column family.
I currently store snapshots of customer data at particular times. Works great:
CREATE TABLE data (
cust_id varchar,
time timeuuid,
data_text text,
PRIMARY KEY (cust_id, time)
);
The cust_id is the partition key and time is the clustering column, so, as I understand it, I can think of each row in the table like:
| cust_id | timeuuid1 : data_text | timeuuid2 : data_text |
| CUST1 | data at this time | data at this time |
Now I'd like to store another group of metrics for each snapshot - but the name of each of these columns isn't fixed. So something like:
| cust_id | timeuuid1 : data_text | timeuuid1 : dynamicCol1 | timeuuid1 : dynamicCol2 | timeuuid1 : dynamicColN |
| CUST1 | data |{some value} |{some value} |{some value} |
I've achieved dynamic columns for timestamp by using a composite primary key, but I can't see how to achieve this within each cluster of columns, if you see what I mean.
If I add, say, "dynamicColumnName" to the existing composite key, I'll end up with customer data stored for each dynamic column, which is not what I want.
Is this possible, without using a Map column? Hope you can help, thanks!

I am not a CQL user... With the thrift API you dynamically add a column to a column family by inserting/updating a record with a value for a column with name X. The column X will start to exist right there and then for that record.
Have you tried an INSERT statement specifying a column that you have not explicitly defined? I would expect that to have the same effect (column is created).
