How to output multiple variables using Azure Kusto?

I'm fairly new to the Azure Kusto Query Language. I'm trying to output 2 variables. This has to be something very simple; I just don't know how. I have tried the datatable, make-series, print, etc. functions to no avail. Here's my current code:
let allrequests = requests | project itemCount, resultCode, success, timestamp | where timestamp > now(-1h) and timestamp < now(-5m);
let requestcount = allrequests | summarize sum(itemCount);
let errorcount = allrequests | where toint(resultCode) >= 400 and toint(resultCode) <= 499 | summarize sum(itemCount);
requestcount; errorcount

Using union is one way, but if you want them on a single row, use the print statement (docs):
let requestcount = requests
| summarize sum(itemCount);
let errorcount = exceptions
| summarize count();
print requests = toscalar(requestcount), exceptions = toscalar(errorcount)
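As a side note (not from the original answer), a single summarize with sumif can also return both values on one row, assuming the same requests schema as in the question:
requests
| where timestamp > now(-1h) and timestamp < now(-5m)
// sumif() only adds up itemCount where the condition holds; resultCode is a string, hence toint()
| summarize requestcount = sum(itemCount)
          , errorcount = sumif(itemCount, toint(resultCode) between (400 .. 499))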

I figured it out. You can combine results using the union operator.
let allrequests = requests | project itemCount, resultCode, success, timestamp | where timestamp > now(-1h) and timestamp < now(-5m);
let requestcount = allrequests | summarize sum(itemCount);
let errorcount = allrequests | where toint(resultCode) >= 400 and toint(resultCode) <= 499 | summarize sum(itemCount);
errorcount | union requestcount
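Note that union returns the two aggregates as separate rows, and both rows carry the same column name (sum_itemCount), so you can't tell which is which. One sketch of a workaround (not part of the original answer) is to label each sub-result before the union:
let allrequests = requests | where timestamp > now(-1h) and timestamp < now(-5m) | project itemCount, resultCode;
// Give each aggregate a common column name plus a label column before combining them
let requestcount = allrequests | summarize value = sum(itemCount) | extend metric = "requests";
let errorcount = allrequests | where toint(resultCode) between (400 .. 499) | summarize value = sum(itemCount) | extend metric = "errors";
errorcount | union requestcount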

Related

Sphinx Results Take Huge Time To Show (Slow Index)

I'm new to Sphinx. I have a simple table tbl_urls with two columns (domain_id, url).
I created my index as below to get the domain id and number of URLs for any given keyword:
source src2
{
type = mysql
sql_host = 0.0.0.0
sql_user = spnx
sql_pass = 123
sql_db = db_spnx
sql_port = 3306 # optional, default is 3306
sql_query = select id,domain_id,url from tbl_domain_urls
sql_attr_uint = domain_id
sql_field_string = url
}
index url_tbl
{
source = src2
path = /var/lib/sphinx/data/url_tbl
}
indexer
{
mem_limit = 2047M
}
searchd
{
listen = 0.0.0.0:9312
listen = 0.0.0.0:9306:mysql41
listen = /home/charlie/sphinx-3.4.1/bin/searchd.sock:sphinx
log = /var/log/sphinx/sphinx.log
query_log = /var/log/sphinx/query.log
read_timeout = 5
max_children = 30
pid_file = /var/run/sphinx/sphinx.pid
max_filter_values = 20000
seamless_rotate = 1
preopen_indexes = 0
unlink_old = 1
workers = threads # for RT indexes to work
binlog_path = /var/lib/sphinx/data
max_batch_queries = 128
}
The problem is that the time taken to show results is over one minute:
SELECT domain_id,count(*) as url_counter
FROM ul_tbl WHERE MATCH('games')
group by domain_id limit 1000000 OPTION max_matches=1000000;show meta;
+-----------+------+
| domain_id | url  |
+-----------+------+
|      9900 |  444 |
|     41309 |   48 |
|     62308 |  491 |
|     85798 |  401 |
|       595 | 4851 |
13545 rows in set (3 min 22.56 sec)

+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| total         | 13545  |
| total_found   | 13545  |
| time          | 1.406  |
| keyword[0]    | games  |
| docs[0]       | 456667 |
| hits[0]       | 514718 |
+---------------+--------+
The table tbl_domain_urls has 100,821,614 rows.
Dedicated server: HP ProLiant 2x L5420, 16 GB RAM, 2x 1 TB HDD.
I need your support to optimize my query or config settings; I need the results in the lowest time possible, and I really appreciate any new idea to test.
Note: I tried a distributed index to use multiple cores for processing, without any noticeable results.

Exclude Temporary Storage (D:) from KQL QUERY

I have a KQL query for disk logs from Azure Log Analytics. Please let me know how to exclude a particular drive, like D: or any other temporary storage, from this query.
InsightsMetrics
| where Name == "FreeSpaceMB"
| extend Tags = parse_json(Tags)
| extend mountId = tostring(Tags["vm.azm.ms/mountId"])
,diskSizeMB = toreal(Tags["vm.azm.ms/diskSizeMB"])
| project-rename FreeSpaceMB = Val
| summarize arg_max(TimeGenerated, diskSizeMB, FreeSpaceMB) by Computer, mountId
,FreeSpacePercentage = round(FreeSpaceMB / diskSizeMB * 100, 1)
| extend diskSizeGB = round(diskSizeMB / 1024, 1)
,FreeSpaceGB = round(FreeSpaceMB / 1024, 1)
| project TimeGenerated, Computer, mountId, diskSizeGB, FreeSpaceGB, FreeSpacePercentage
| order by Computer asc, mountId asc
You just need to add a where clause:
| where mountId != "D:"
So your query becomes:
InsightsMetrics
| where Name == "FreeSpaceMB"
| extend Tags = parse_json(Tags)
| extend mountId = tostring(Tags["vm.azm.ms/mountId"])
,diskSizeMB = toreal(Tags["vm.azm.ms/diskSizeMB"])
| where mountId != "D:"
| project-rename FreeSpaceMB = Val
| summarize arg_max(TimeGenerated, diskSizeMB, FreeSpaceMB) by Computer, mountId
,FreeSpacePercentage = round(FreeSpaceMB / diskSizeMB * 100, 1)
| extend diskSizeGB = round(diskSizeMB / 1024, 1)
,FreeSpaceGB = round(FreeSpaceMB / 1024, 1)
| project TimeGenerated, Computer, mountId, diskSizeGB, FreeSpaceGB, FreeSpacePercentage
| order by Computer asc, mountId asc
And if you want to exclude multiple drives from the query, you can use the !in operator. It will look like this:
InsightsMetrics
| where Name == "FreeSpaceMB"
| extend Tags = parse_json(Tags)
| extend mountId = tostring(Tags["vm.azm.ms/mountId"])
,diskSizeMB = toreal(Tags["vm.azm.ms/diskSizeMB"])
| where mountId !in ("D:", "E:")
| project-rename FreeSpaceMB = Val
| summarize arg_max(TimeGenerated, diskSizeMB, FreeSpaceMB) by Computer, mountId
,FreeSpacePercentage = round(FreeSpaceMB / diskSizeMB * 100, 1)
| extend diskSizeGB = round(diskSizeMB / 1024, 1)
,FreeSpaceGB = round(FreeSpaceMB / 1024, 1)
| project TimeGenerated, Computer, mountId, diskSizeGB, FreeSpaceGB, FreeSpacePercentage
| order by Computer asc, mountId asc
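As an aside (not part of the original answer), if the mount IDs might differ in letter case, the case-insensitive variant !in~ can be used instead:
| where mountId !in~ ("d:", "e:")  // case-insensitive "not in"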

Best way to show today vs yesterday vs week in KQL (Azure Monitor)

I am trying to show the count for today (rolling 24 hours) vs yesterday (again rolling) vs the weekly average.
I've got the code to work, but I am getting an error as well.
The error is (Query succeeded with warnings: There were some errors when processing your query.: "Partial query failure: Unspecified error (message: 'shard: 5eeb9282-0854-4569-a674-10f8daef9f7d, source: (Error { details: Rest(404, "HEAD qh1kustorageoiprdweu16.blob.core.windows.net/jgtb64673c4e98a07fa116b4e49211-0d2a81b5bf3540e087ff2cc0e4e57c98/13da174e-3951-4b54-9a45-1f9cbe5759b4/426a5a10-4e91-4...")
The code:
let Yes_End = ago(24h);
let Yes_Start = ago(48h);
let N = ago(1m);
let LW_end = ago(14d);
let Lw_start = ago(7d);
let Curr = customMetrics
|extend Dec_Reasion = tostring(customDimensions["DeclineReason"])
|extend Type = tostring(customDimensions["AcquiringInstitutionId"])
|extend dw = dayofweek(timestamp)
|where name =='TransactionsDeclined'
|where timestamp between (Yes_End..N)
|summarize CurrentVal=sum(valueCount) by tostring(Dec_Reasion);
let Trend = customMetrics
|extend Dec_Reasion = tostring(customDimensions["DeclineReason"])
|extend Type = tostring(customDimensions["AcquiringInstitutionId"])
|where timestamp between (Yes_Start .. Yes_End)
|where name =='TransactionsDeclined'
|summarize Yesterday_total=sum(valueCount) by tostring(Dec_Reasion);
let weekTrend =customMetrics
|extend Dec_Reasion = tostring(customDimensions["DeclineReason"])
|extend Type = tostring(customDimensions["AcquiringInstitutionId"])
|extend dw = dayofweek(timestamp)
|where toint(dw) <6
|where timestamp between (LW_end .. Lw_start)
|where name =='TransactionsDeclined'
|summarize Week_Avg=sum(valueCount)/5 by tostring(Dec_Reasion) ;
Curr
|join kind=leftouter Trend on Dec_Reasion
|join kind=leftouter weekTrend on Dec_Reasion
|project Dec_Reasion,CurrentVal,Yesterday_total,Week_Avg
This query can be written in a way that does not require joins.
You might want to give it a try.
let Yes_End = ago(24h);
let Yes_Start = ago(48h);
let N = ago(1m);
let LW_end = ago(14d);
let Lw_start = ago(7d);
customMetrics
| where timestamp between (LW_end .. Lw_start)
or timestamp between (Yes_Start .. N)
| where name == 'TransactionsDeclined'
| extend Dec_Reasion = tostring(customDimensions["DeclineReason"])
,Type = tostring(customDimensions["AcquiringInstitutionId"])
| summarize CurrentVal = sumif(valueCount, timestamp between (Yes_End .. N))
,Yesterday_total = sumif(valueCount, timestamp between (Yes_Start .. Yes_End))
,Week_Avg = sumif(valueCount, timestamp between (LW_end .. Lw_start) and toint(dayofweek(timestamp)) < 6) / 5
by Dec_Reasion
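If it helps to see the conditional aggregation in isolation, here is a minimal, self-contained sketch on made-up toy data (the datatable, column names, and values are hypothetical, only to illustrate sumif() computing several windowed totals in one pass):
// Hypothetical toy data for illustration only
datatable(ts:datetime, reason:string, cnt:long)
[
    datetime(2024-01-01 10:00), "InsufficientFunds", 3,
    datetime(2024-01-02 10:00), "InsufficientFunds", 5,
    datetime(2024-01-02 11:00), "Expired", 2,
]
| summarize Day1 = sumif(cnt, ts < datetime(2024-01-02))
          , Day2 = sumif(cnt, ts >= datetime(2024-01-02))
    by reason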

KQL query showing preceding logs from a specific log

I'm working on a query where I need the log that has a message of "Compromised" in it, and then I want it to return the preceding 5 "deny" logs. I'm new to KQL and just don't know the operator, so I appreciate the help!
Current query:
| sort by TimeGenerated
| where SourceIP == "555.555.555.555"
| where TimeGenerated between (datetime(10/20/2021, 16:25:41.750).. datetime(10/20/2021, 16:35:41.750))
| where AdditionalExtensions has "Compromised" or DeviceAction == "deny"
Ideally in my head it would be something like:
Needed query:
| sort by TimeGenerated
| where SourceIP == "555.555.555.555"
| where AdditionalExtensions has "Compromised"
| // show preceding 5 logs that have DeviceAction == "deny"
Thank you!
You can use the prev() and next() functions. Here's how you do it:
let N = 5; // Number of records before/after records for which Cond is true
YourTable
| extend Cond = (SourceIP == "555.555.555.555") and (AdditionalExtensions has "Compromised") and (DeviceAction == "deny") // The predicate to "identify" relevant records
| sort by TimeGenerated asc
| extend rn = row_number(0, Cond)
| extend nxt = next(rn, N), prv = prev(rn, N)
| where nxt < N or (rn <= N and isnotnull(prv)) or Cond
| project-away rn, nxt, prv, Cond
Note that the sorting is done after the extend, and not before - this is more efficient (it's always best to push the sorting as far down as possible).
(Courtesy of #RoyO)
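For readers new to the window functions used above, here is a minimal sketch on toy data (not from the original answer) showing what prev() and next() return:
// prev()/next() need a serialized (ordered) input, hence the sort
range i from 1 to 5 step 1
| sort by i asc
| extend previous = prev(i), following = next(i)
// previous is empty on the first row, following is empty on the last row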

How to use series_divide() in Kusto?

I am not able to correctly divide one time series by another.
I get data from my TestTable, which results in the following view:
TagId, sdata
8862, [0,0,0,0,2,2,2,3,4]
6304, [0,0,0,0,2,2,2,3,2]
I want to divide the sdata series for TagId 8862 by the series from 6304.
I expect the following result:
[NaN,NaN,NaN,NaN,1,1,1,1,2]
When I try the code below, I only get two empty ddata rows in my S2 results:
TestTable
| where TagId in (8862,6304)
| make-series sdata = avg(todouble(Value)) default=0 on TimeStamp in range (datetime(2019-06-27), datetime(2019-06-29), 1m) by TagId
| as S1;
S1 | project ddata = series_divide(sdata[0].['sdata'], sdata[1].['sdata'])
| as S2
What am I doing wrong?
Both arguments to series_divide() can't come from two separate rows in the dataset - they have to sit in the same row.
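As a quick illustration of that constraint (a made-up snippet, not from the original answer), series_divide() works when both arrays are columns of a single row:
print a = dynamic([0, 0, 2, 4]), b = dynamic([0, 1, 2, 2])
| extend r = series_divide(a, b)
// r == ["NaN", 0.0, 1.0, 2.0]  (0/0 yields NaN)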
Here's an example of how you could achieve that (based on the limited, and perhaps not fully representative of your real use case, sample shown in your question):
let T =
datatable(tag_id:long, sdata:dynamic)
[
8862, dynamic([0,0,0,0,2,2,2,3,4]),
6304, dynamic([0,0,0,0,2,2,2,3,2]),
]
;
let get_value_from_T = (_tag_id:long)
{
toscalar(
T
| where tag_id == _tag_id
| take 1
| project sdata
)
};
print sdata_1 = get_value_from_T(8862), sdata_2 = get_value_from_T(6304)
| extend result = series_divide(sdata_1, sdata_2)
which returns:
| sdata_1             | sdata_2             | result                                        |
|---------------------|---------------------|-----------------------------------------------|
| [0,0,0,0,2,2,2,3,4] | [0,0,0,0,2,2,2,3,2] | ["NaN","NaN","NaN","NaN",1.0,1.0,1.0,1.0,2.0] |
