I have a query that returns a few columns. In one of the columns I am parsing JSON to retrieve an object value, but there are multiple entries in it, and I want to retrieve each entry of the JSON in a loop and display it.
Below is the query:
let forEach_table = AzureDiagnostics
| where Parameters_LOAD_GROUP_s contains 'LOAD(AUTO)';
let ParentPlId = '';
let ParentPlName = '';
let commonKey = '';
forEach_table
| where Category == 'PipelineRuns'
| extend pplId = parse_json(Predecessors_s)[0].PipelineRunId, pplName = parse_json(Predecessors_s)[0].PipelineName
| extend dbMapName = tostring(parse_json(Parameters_getMetadataList_s)[0].dbMapName)
| summarize count() by Resource, Status = status_s, Name=pipelineName_s, Loadgroup = Parameters_LOAD_GROUP_s, dbMapName, Parameters_LOAD_GROUP_s, Parameters_getMetadataList_s, pipelineName_s, Category, CorrelationId, start_t, end_t, TimeGenerated
| project ParentPL_ID = ParentPlId, ParentPL_Name = ParentPlName, LoadGroup_Name = Loadgroup, Map_Name = dbMapName, Status,Metadata = Parameters_getMetadataList_s, Category, CorrelationId, start_t, end_t
| project-away ParentPL_ID, ParentPL_Name, Category, CorrelationId
Here, in the above code,
extend dbMapName = tostring(parse_json(Parameters_getMetadataList_s)[0].dbMapName)
I am retrieving the 0th element by default, but I would like to retrieve all elements in sequence. Can somebody suggest how to achieve this?
bag_keys() is just what you need.
For example, take a look at this query:
datatable(myjson: dynamic) [
dynamic({"a": 123, "b": 234, "c": 345}),
dynamic({"dd": 123, "ee": 234, "ff": 345})
]
| project keys = bag_keys(myjson)
Its output is:
|---------|
| keys |
|---------|
| [ |
| "a", |
| "b", |
| "c" |
| ] |
|---------|
| [ |
| "dd", |
| "ee", |
| "ff" |
| ] |
|---------|
If you want to have every key in a separate row, use mv-expand, like this:
datatable(myjson: dynamic) [
dynamic({"a": 123, "b": 234, "c": 345}),
dynamic({"dd": 123, "ee": 234, "ff": 345})
]
| project keys = bag_keys(myjson)
| mv-expand keys
The output of this query will be:
|------|
| keys |
|------|
| a |
| b |
| c |
| dd |
| ee |
| ff |
|------|
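As a side note, a minimal sketch (my addition, not part of the original answer) for the case where you need the values as well as the keys: expand bag_keys() and look each value up by key:
datatable(myjson: dynamic) [
dynamic({"a": 123, "b": 234, "c": 345}),
dynamic({"dd": 123, "ee": 234, "ff": 345})
]
| mv-expand key = bag_keys(myjson)      // one row per key
| extend value = myjson[tostring(key)]  // look up the value for that key
This yields one row per key/value pair, e.g. a / 123.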
The extend and mv-expand operators help resolve this kind of scenario.
Solution:
| extend rows = parse_json(Parameters_getMetadataList_s)
| mv-expand rows
| project Parameters_LOAD_GROUP_s, rows
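Applied to the original query, a minimal sketch (reusing the forEach_table definition and column names from the question) that yields one row per metadata entry:
forEach_table
| where Category == 'PipelineRuns'
| extend rows = parse_json(Parameters_getMetadataList_s)
| mv-expand rows                               // one row per array element
| extend dbMapName = tostring(rows.dbMapName)  // extract the field from each element
| project Parameters_LOAD_GROUP_s, dbMapName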
Related
I'm running Cilium inside an Azure Kubernetes cluster and want to parse the Cilium log messages in Azure Log Analytics. The log messages have a format like
key1=value1 key2=value2 key3="if the value contains spaces, it's wrapped in quotation marks"
For example:
level=info msg="Identity of endpoint changed" containerID=a4566a3e5f datapathPolicyRevision=0
I couldn't find a matching parse_xxx method in the docs (e.g. https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/parsecsvfunction ). Is there a possibility to write a custom function to parse this kind of log message?
Not a fun format to parse... But this should work:
let LogLine = "level=info msg=\"Identity of endpoint changed\" containerID=a4566a3e5f datapathPolicyRevision=0";
print LogLine
| extend KeyValuePairs = array_concat(
extract_all("([a-zA-Z_]+)=([a-zA-Z0-9_]+)", LogLine),
extract_all("([a-zA-Z_]+)=\"([a-zA-Z0-9_ ]+)\"", LogLine))
| mv-apply KeyValuePairs on
(
extend p = pack(tostring(KeyValuePairs[0]), tostring(KeyValuePairs[1]))
| summarize dict=make_bag(p)
)
The output will be:
| print_0 | dict |
|--------------------|-----------------------------------------|
| level=info msg=... | { |
| | "level": "info", |
| | "containerID": "a4566a3e5f", |
| | "datapathPolicyRevision": "0", |
| | "msg": "Identity of endpoint changed" |
| | } |
|--------------------|-----------------------------------------|
With the help of Slavik N, I came up with a query that works for me:
let containerIds = KubePodInventory
| where Namespace startswith "cilium"
| distinct ContainerID
| summarize make_set(ContainerID);
ContainerLog
| where ContainerID in (containerIds)
| extend KeyValuePairs = array_concat(
extract_all("([a-zA-Z0-9_-]+)=([^ \"]+)", LogEntry),
extract_all("([a-zA-Z0-9_]+)=\"([^\"]+)\"", LogEntry))
| mv-apply KeyValuePairs on
(
extend p = pack(tostring(KeyValuePairs[0]), tostring(KeyValuePairs[1]))
| summarize JSONKeyValuePairs=parse_json(make_bag(p))
)
| project TimeGenerated, Level=JSONKeyValuePairs.level, Message=JSONKeyValuePairs.msg, PodName=JSONKeyValuePairs.k8sPodName, Reason=JSONKeyValuePairs.reason, Controller=JSONKeyValuePairs.controller, ContainerID=JSONKeyValuePairs.containerID, Labels=JSONKeyValuePairs.labels, Raw=LogEntry
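A small side note on naming (my addition, not from the answers above): in current Kusto documentation pack() is listed under the name bag_pack(), and the query works the same with either spelling:
| mv-apply KeyValuePairs on
(
extend p = bag_pack(tostring(KeyValuePairs[0]), tostring(KeyValuePairs[1]))
| summarize JSONKeyValuePairs=parse_json(make_bag(p))
)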
The error I am getting:
invalid string interpolation: `$$', `$'ident or `$'BlockExpr expected
Spark SQL:
val sql =
s"""
|SELECT
| ,CAC.engine
| ,CAC.user_email
| ,CAC.submit_time
| ,CAC.end_time
| ,CAC.duration
| ,CAC.counter_name
| ,CAC.counter_value
| ,CAC.usage_hour
| ,CAC.event_date
|FROM
| xyz.command AS CAC
| INNER JOIN
| (
| SELECT DISTINCT replace(split(get_json_object(metadata_payload, '$.configuration.name'), '_')[1], 'acc', '') AS account_id
| FROM xyz.metadata
| ) AS QCM
| ON QCM.account_id = CAC.account_id
|WHERE
| CAC.event_date BETWEEN '2019-10-01' AND '2019-10-05'
|""".stripMargin
val df = spark.sql(sql)
df.show(10, false)
You added the s prefix, which means you want the string to be interpolated: every token prefixed with $ will be replaced with the local variable of the same name. From your code it looks like you do not use this feature, so you can just remove the s prefix from the string:
val sql =
"""
|SELECT
| ,CAC.engine
| ,CAC.user_email
| ,CAC.submit_time
| ,CAC.end_time
| ,CAC.duration
| ,CAC.counter_name
| ,CAC.counter_value
| ,CAC.usage_hour
| ,CAC.event_date
|FROM
| xyz.command AS CAC
| INNER JOIN
| (
| SELECT DISTINCT replace(split(get_json_object(metadata_payload, '$.configuration.name'), '_')[1], 'acc', '') AS account_id
| FROM xyz.metadata
| ) AS QCM
| ON QCM.account_id = CAC.account_id
|WHERE
| CAC.event_date BETWEEN '2019-10-01' AND '2019-10-05'
|""".stripMargin
Otherwise, if you really need the interpolation, you have to escape the $ sign by doubling it, like this:
val sql =
s"""
|SELECT
| ,CAC.engine
| ,CAC.user_email
| ,CAC.submit_time
| ,CAC.end_time
| ,CAC.duration
| ,CAC.counter_name
| ,CAC.counter_value
| ,CAC.usage_hour
| ,CAC.event_date
|FROM
| xyz.command AS CAC
| INNER JOIN
| (
| SELECT DISTINCT replace(split(get_json_object(metadata_payload, '$$.configuration.name'), '_')[1], 'acc', '') AS account_id
| FROM xyz.metadata
| ) AS QCM
| ON QCM.account_id = CAC.account_id
|WHERE
| CAC.event_date BETWEEN '2019-10-01' AND '2019-10-05'
|""".stripMargin
I am using express-validator in my project.
My JSON from the client is
{"name": "john doe"}
My express validation code is
[check('name', 'invalid name').isAlpha()]
Why is this code returning "invalid name" while this is a string?
I also tried isString(), but it is not working either; it behaves the same way as isAlpha().
The error JSON response to the client is
{
"errors": [
{
"value": "john doe",
"msg": "invalid name",
"param": "name",
"location": "body"
}
]
}
Does the isAlpha() function consider only one word as a string?
How can I fix this?
There is an option of .isAlpha() you can use to ignore whitespace:
check('name', 'invalid name').isAlpha('en-US', {ignore: ' '})
The first parameter 'en-US' is AlphaLocale. For example I use 'es-ES' to validate Spanish special characters. You can use one of these to validate other languages: 'ar' | 'ar-AE' | 'ar-BH' | 'ar-DZ' | 'ar-EG' | 'ar-IQ' | 'ar-JO' | 'ar-KW' | 'ar-LB' | 'ar-LY' | 'ar-MA' | 'ar-QA' | 'ar-QM' | 'ar-SA' | 'ar-SD' | 'ar-SY' | 'ar-TN' | 'ar-YE' | 'az-AZ' | 'bg-BG' | 'cs-CZ' | 'da-DK' | 'de-DE' | 'el-GR' | 'en-AU' | 'en-GB' | 'en-HK' | 'en-IN' | 'en-NZ' | 'en-US' | 'en-ZA' | 'en-ZM' | 'es-ES' | 'fa-AF' | 'fa-IR' | 'fr-FR' | 'he' | 'hu-HU' | 'id-ID' | 'it-IT' | 'ku-IQ' | 'nb-NO' | 'nl-NL' | 'nn-NO' | 'pl-PL' | 'pt-BR' | 'pt-PT' | 'ru-RU' | 'sk-SK' | 'sl-SI' | 'sr-RS' | 'sr-RS#latin' | 'sv-SE' | 'th-TH' | 'tr-TR' | 'uk-UA' | 'vi-VN'.
The second parameter is the options object, IsAlphaOptions. It only contains an optional parameter 'ignore', which can have the value of a string, a string[] or a RegExp.
So you can also ignore whitespace with the RegExp \s:
.isAlpha('en-US', {ignore: /\s/})
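For context, a minimal sketch (the route path and handler are assumptions, not from the answers here) of how the check plugs into an Express app:
const express = require('express');
const { check, validationResult } = require('express-validator');

const app = express();
app.use(express.json());

// Hypothetical route: accepts 'name' made of letters and spaces only.
app.post('/users',
  [check('name', 'invalid name').isAlpha('en-US', { ignore: ' ' })],
  (req, res) => {
    const errors = validationResult(req);
    if (!errors.isEmpty()) {
      return res.status(400).json({ errors: errors.array() });
    }
    res.json({ name: req.body.name });
  });

app.listen(3000);
With this in place, {"name": "john doe"} passes, while a value such as "john99" returns the 400 error body shown in the question.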
I got the answer: I used a custom validation method, and it resolved my issue.
[check('name').custom((value, { req }) => {
    // isNaN(value) is true for any string that cannot be parsed as a number,
    // so this accepts any non-numeric string (including punctuation).
    if (isNaN(value)) {
        return true;
    } else {
        throw new Error('invalid name');
    }
})]
To check, using express-validator, that a string contains only letters and spaces, you can use a regular expression:
check('name').custom((value) => {
return value.match(/^[A-Za-z ]+$/);
})
"john doe" consisting white space " ". Due to this white-space isAlpha() throwing error. isAlpha allows only a-zA-Z.
Hopefully I'm not late to the party.
With class-validator@0.13.2, we can use
@Matches(/^[a-zA-Z0-9 -]*$/)
Just tweak the regex to satisfy your needs. In my case, I wanted @IsAlphanumeric() but with spaces and hyphens/dashes.
Simply replace isAlpha() or isAlphanumeric() with isAlphaWithSpace() / isAlphanumericWithSpace().
Hi, I'm trying to get to a log event by nesting a query in the "where" of another query. Is this possible?
AzureDiagnostics
| where resource_workflowName_s == "[Workflow Name]"
| where resource_runId_s == (AzureDiagnostics | where trackedProperties_PayloadID_g == "[GUID]" | distinct resource_runId_s)
Try:
AzureDiagnostics
| where resource_workflowName_s == "[Workflow Name]"
| where resource_runId_s in (
toscalar(AzureDiagnostics
| where trackedProperties_PayloadID_g == "[GUID]"
| distinct resource_runId_s))
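As a side note (my addition, not part of the original answer), the in operator can also take the tabular expression directly, which covers the case where the inner query returns more than one run ID:
AzureDiagnostics
| where resource_workflowName_s == "[Workflow Name]"
| where resource_runId_s in (
    AzureDiagnostics
    | where trackedProperties_PayloadID_g == "[GUID]"
    | distinct resource_runId_s)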
The enqueue_task_rt function in ./kernel/sched/rt.c is responsible for queuing a task on the run queue. enqueue_task_rt contains a call to enqueue_rt_entity, which calls dequeue_rt_stack. Most of the code seems logical, but I am a bit lost with the function dequeue_rt_stack, unable to understand what it does. Can somebody tell me what logic I am missing, or suggest some good reading?
Edit: The following is the code for the dequeue_rt_stack function:
static void dequeue_rt_stack(struct sched_rt_entity *rt_se)
{
    struct sched_rt_entity *back = NULL;

    /* the macro for_each_sched_rt_entity is defined as
       for (; rt_se; rt_se = rt_se->parent) */
    for_each_sched_rt_entity(rt_se) {
        rt_se->back = back;
        back = rt_se;
    }

    for (rt_se = back; rt_se; rt_se = rt_se->back) {
        if (on_rt_rq(rt_se))
            __dequeue_rt_entity(rt_se);
    }
}
More specifically, I do not understand why there is a need for this code:
for_each_sched_rt_entity(rt_se) {
rt_se->back = back;
back = rt_se;
}
What is its relevance?
When a task is to be added to some queue, it must first be removed from the queue that it currently is on, if any.
With the group scheduler, a task is always at the lowest level of the tree, and might have multiple ancestors:
NULL
^
|
+-----parent------+
| |
| top-level group |
| |
+-----------------+
^ ^_____________
| \
+-----parent------+ +-----parent------+
| | | |
| mid-level group | | other group | ...
| | | |
+-----------------+ +-----------------+
^ ^_____________
| \
+-----parent------+ +-----------------+
| | | |
| task | | other task | ...
| | | |
+-----------------+ +-----------------+
To remove the task from the tree, it must be removed from all groups' queues, and this must be done first at the top-level group (otherwise, the scheduler might try to run an already partially-removed task). Therefore, dequeue_rt_stack uses the back pointers to construct a list in the opposite direction:
NULL back
^ |
| V
+-parent----------+
| |
| top-level group |
| |
+----------back---+
^ | ^_____________
| V \
+-parent----------+ +-----parent------+
| | | |
| mid-level group | | other group | ...
| | | |
+----------back---+ +-----------------+
^ | ^_____________
| V \
+-parent----------+ +-----------------+
| | | |
| task | | other task | ...
| | | |
+----------back---+ +-----------------+
|
V
NULL
That back list can then be used to walk down the tree to remove the entities in the correct order.
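To make the pointer threading concrete, here is a minimal standalone sketch (the struct and names are invented for illustration, not kernel code) of the same reverse-then-walk pattern:
#include <stdio.h>

struct entity {
    struct entity *parent;  /* child -> enclosing group, ending at NULL */
    struct entity *back;    /* threaded by the first pass, top -> bottom */
    const char *name;
};

static void dequeue_stack(struct entity *e)
{
    struct entity *back = NULL;

    /* First pass: walk up via ->parent, threading ->back in reverse. */
    for (; e; e = e->parent) {
        e->back = back;
        back = e;
    }

    /* Second pass: 'back' now points at the top-level entity; walk down. */
    for (e = back; e; e = e->back)
        printf("dequeue %s\n", e->name);
}

int main(void)
{
    struct entity top = { .parent = NULL, .name = "top-level group" };
    struct entity mid = { .parent = &top, .name = "mid-level group" };
    struct entity task = { .parent = &mid, .name = "task" };

    /* Prints top-level group, then mid-level group, then task. */
    dequeue_stack(&task);
    return 0;
}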
I am new to kernel hacking, and this is my first time answering a Linux kernel question; maybe this helps you.
I read the source code, and I think it relates to group scheduling.
The kernel guards these code paths with:
#ifdef CONFIG_RT_GROUP_SCHED
It means that we can collect several scheduling entities into one scheduling group.
static void enqueue_rt_entity(struct sched_rt_entity *rt_se, bool head)
{
    dequeue_rt_stack(rt_se);
    for_each_sched_rt_entity(rt_se)
        __enqueue_rt_entity(rt_se, head);
}
The function dequeue_rt_stack(rt_se) takes all the scheduling entities that belong to the group off their run queues; enqueue_rt_entity then adds them back to the run queue from the top down.
Hierarchical group I/O scheduling
CFS group scheduling