Parse `key1=value1 key2=value2` in Kusto - azure

I'm running Cilium inside an Azure Kubernetes cluster and want to parse the Cilium log messages in Azure Log Analytics. The log messages have a format like
key1=value1 key2=value2 key3="if the value contains spaces, it's wrapped in quotation marks"
For example:
level=info msg="Identity of endpoint changed" containerID=a4566a3e5f datapathPolicyRevision=0
I couldn't find a matching parse_xxx function in the docs (e.g. https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/parsecsvfunction ). Is there a way to write a custom function to parse this kind of log message?

Not a fun format to parse... But this should work:
let LogLine = "level=info msg=\"Identity of endpoint changed\" containerID=a4566a3e5f datapathPolicyRevision=0";
print LogLine
| extend KeyValuePairs = array_concat(
    extract_all("([a-zA-Z_]+)=([a-zA-Z0-9_]+)", LogLine),
    extract_all("([a-zA-Z_]+)=\"([a-zA-Z0-9_ ]+)\"", LogLine))
| mv-apply KeyValuePairs on
(
    extend p = pack(tostring(KeyValuePairs[0]), tostring(KeyValuePairs[1]))
    | summarize dict = make_bag(p)
)
The output will be:
| print_0 | dict |
|--------------------|-----------------------------------------|
| level=info msg=... | { |
| | "level": "info", |
| | "containerID": "a4566a3e5f", |
| | "datapathPolicyRevision": "0", |
| | "msg": "Identity of endpoint changed" |
| | } |
|--------------------|-----------------------------------------|
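Outside Kusto, the two extract_all() patterns can be sanity-checked with any regex engine. Here is a minimal Python sketch of the same approach, using the sample line from the question (Python is used only for illustration; the Kusto query above is the actual solution):

```python
import re

# Sample Cilium log line from the question.
line = 'level=info msg="Identity of endpoint changed" containerID=a4566a3e5f datapathPolicyRevision=0'

# Same two patterns as the extract_all() calls above:
# unquoted key=value pairs, then quoted key="value with spaces" pairs.
unquoted = re.findall(r'([a-zA-Z_]+)=([a-zA-Z0-9_]+)', line)
quoted = re.findall(r'([a-zA-Z_]+)="([a-zA-Z0-9_ ]+)"', line)

# Equivalent of array_concat + make_bag: merge both lists into one dict.
pairs = dict(unquoted + quoted)
print(pairs['msg'])  # Identity of endpoint changed
```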

With the help of Slavik N, I came up with a query that works for me:
let containerIds = KubePodInventory
| where Namespace startswith "cilium"
| distinct ContainerID
| summarize make_set(ContainerID);
ContainerLog
| where ContainerID in (containerIds)
| extend KeyValuePairs = array_concat(
    extract_all("([a-zA-Z0-9_-]+)=([^ \"]+)", LogEntry),
    extract_all("([a-zA-Z0-9_]+)=\"([^\"]+)\"", LogEntry))
| mv-apply KeyValuePairs on
(
    extend p = pack(tostring(KeyValuePairs[0]), tostring(KeyValuePairs[1]))
    | summarize JSONKeyValuePairs = parse_json(make_bag(p))
)
| project TimeGenerated, Level=JSONKeyValuePairs.level, Message=JSONKeyValuePairs.msg, PodName=JSONKeyValuePairs.k8sPodName, Reason=JSONKeyValuePairs.reason, Controller=JSONKeyValuePairs.controller, ContainerID=JSONKeyValuePairs.containerID, Labels=JSONKeyValuePairs.labels, Raw=LogEntry
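The broadened patterns in this version accept hyphens in keys and arbitrary non-space characters in unquoted values. A quick Python check of those two regexes (the sample line and its field names are made up for illustration):

```python
import re

# Hypothetical log line; the field names are illustrative only.
line = 'time="2023-01-01T00:00:00Z" subsys=endpoint ipv4=10.0.0.1 k8s-pod=cilium-abc msg="regenerating datapath"'

# The refined patterns from the query above: keys may contain digits
# and hyphens, unquoted values may be anything except space or quote.
unquoted = re.findall(r'([a-zA-Z0-9_-]+)=([^ "]+)', line)
quoted = re.findall(r'([a-zA-Z0-9_]+)="([^"]+)"', line)

pairs = dict(unquoted + quoted)
print(pairs)
```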


Azure Log Analytics parse json

I have a query which results in a few columns. In one of the columns I am parsing JSON to retrieve an object value, but there are multiple entries in it, and I want to retrieve each entry of the JSON in a loop and display it.
Below is the query:
let forEach_table = AzureDiagnostics
| where Parameters_LOAD_GROUP_s contains 'LOAD(AUTO)';
let ParentPlId = '';
let ParentPlName = '';
let commonKey = '';
forEach_table
| where Category == 'PipelineRuns'
| extend pplId = parse_json(Predecessors_s)[0].PipelineRunId, pplName = parse_json(Predecessors_s)[0].PipelineName
| extend dbMapName = tostring(parse_json(Parameters_getMetadataList_s)[0].dbMapName)
| summarize count(runId_g) by Resource, Status = status_s, Name=pipelineName_s, Loadgroup = Parameters_LOAD_GROUP_s, dbMapName, Parameters_LOAD_GROUP_s, Parameters_getMetadataList_s, pipelineName_s, Category, CorrelationId, start_t, end_t, TimeGenerated
| project ParentPL_ID = ParentPlId, ParentPL_Name = ParentPlName, LoadGroup_Name = Loadgroup, Map_Name = dbMapName, Status,Metadata = Parameters_getMetadataList_s, Category, CorrelationId, start_t, end_t
| project-away ParentPL_ID, ParentPL_Name, Category, CorrelationId
In the above code,
extend dbMapName = tostring(parse_json(Parameters_getMetadataList_s)[0].dbMapName)
retrieves the 0th element by default, but I would like to retrieve all elements in sequence. Can somebody suggest how to achieve this?
bag_keys() is just what you need.
For example, take a look at this query:
datatable(myjson: dynamic) [
    dynamic({"a": 123, "b": 234, "c": 345}),
    dynamic({"dd": 123, "ee": 234, "ff": 345})
]
| project keys = bag_keys(myjson)
Its output is:
|---------|
| keys |
|---------|
| [ |
| "a", |
| "b", |
| "c" |
| ] |
|---------|
| [ |
| "dd", |
| "ee", |
| "ff" |
| ] |
|---------|
If you want to have every key in a separate row, use mv-expand, like this:
datatable(myjson: dynamic) [
    dynamic({"a": 123, "b": 234, "c": 345}),
    dynamic({"dd": 123, "ee": 234, "ff": 345})
]
| project keys = bag_keys(myjson)
| mv-expand keys
The output of this query will be:
|------|
| keys |
|------|
| a |
| b |
| c |
| dd |
| ee |
| ff |
|------|
The extend and mv-expand operators resolve this kind of scenario.
Solution:
extend rows = parse_json(Parameters_getMetadataList_s)
| mv-expand rows
| project Parameters_LOAD_GROUP_s, rows
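For readers more comfortable outside Kusto, the bag_keys() / mv-expand combination corresponds to plain dictionary iteration plus flattening. A small Python analogy (variable names are illustrative):

```python
# Two rows, each holding a property bag, like the datatable above.
rows = [
    {"a": 123, "b": 234, "c": 345},
    {"dd": 123, "ee": 234, "ff": 345},
]

# project keys = bag_keys(myjson): one list of keys per row.
keys_per_row = [list(row.keys()) for row in rows]

# mv-expand keys: flatten so that every key becomes its own row.
expanded = [k for row_keys in keys_per_row for k in row_keys]
print(expanded)  # ['a', 'b', 'c', 'dd', 'ee', 'ff']
```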

invalid string interpolation: `$$', `$'ident or `$'BlockExpr expected -> Spark SQL

The error I am getting:
invalid string interpolation: `$$', `$'ident or `$'BlockExpr expected
Spark SQL:
val sql =
s"""
|SELECT
| ,CAC.engine
| ,CAC.user_email
| ,CAC.submit_time
| ,CAC.end_time
| ,CAC.duration
| ,CAC.counter_name
| ,CAC.counter_value
| ,CAC.usage_hour
| ,CAC.event_date
|FROM
| xyz.command AS CAC
| INNER JOIN
| (
| SELECT DISTINCT replace(split(get_json_object(metadata_payload, '$.configuration.name'), '_')[1], 'acc', '') AS account_id
| FROM xyz.metadata
| ) AS QCM
| ON QCM.account_id = CAC.account_id
|WHERE
| CAC.event_date BETWEEN '2019-10-01' AND '2019-10-05'
|""".stripMargin
val df = spark.sql(sql)
df.show(10, false)
You added the s prefix, which means you want the string to be interpolated: every token prefixed with $ will be replaced by the local variable of the same name. From your code it looks like you do not use this feature, so you can simply remove the s prefix from the string:
val sql =
"""
|SELECT
| ,CAC.engine
| ,CAC.user_email
| ,CAC.submit_time
| ,CAC.end_time
| ,CAC.duration
| ,CAC.counter_name
| ,CAC.counter_value
| ,CAC.usage_hour
| ,CAC.event_date
|FROM
| xyz.command AS CAC
| INNER JOIN
| (
| SELECT DISTINCT replace(split(get_json_object(metadata_payload, '$.configuration.name'), '_')[1], 'acc', '') AS account_id
| FROM xyz.metadata
| ) AS QCM
| ON QCM.account_id = CAC.account_id
|WHERE
| CAC.event_date BETWEEN '2019-10-01' AND '2019-10-05'
|""".stripMargin
Otherwise, if you really need the interpolation, you have to escape the $ sign by doubling it, like this:
val sql =
s"""
|SELECT
| ,CAC.engine
| ,CAC.user_email
| ,CAC.submit_time
| ,CAC.end_time
| ,CAC.duration
| ,CAC.counter_name
| ,CAC.counter_value
| ,CAC.usage_hour
| ,CAC.event_date
|FROM
| xyz.command AS CAC
| INNER JOIN
| (
| SELECT DISTINCT replace(split(get_json_object(metadata_payload, '$$.configuration.name'), '_')[1], 'acc', '') AS account_id
| FROM xyz.metadata
| ) AS QCM
| ON QCM.account_id = CAC.account_id
|WHERE
| CAC.event_date BETWEEN '2019-10-01' AND '2019-10-05'
|""".stripMargin
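The same doubling rule appears in other $-based interpolators. For instance, Python's string.Template also uses $$ for a literal dollar sign, which makes a convenient way to illustrate the fix (this is an analogy, not Scala code):

```python
from string import Template

# $user is substituted; $$ collapses to a single literal dollar sign,
# exactly like $$ inside Scala's s"..." interpolator.
t = Template("get_json_object(metadata_payload, '$$.configuration.name') -- run by $user")
sql_fragment = t.substitute(user="alice")
print(sql_fragment)
```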

white space issue in isAlpha() function of express-validator

I am using express-validator in my project.
My JSON from the client is
{"name": "john doe"}
My express validation code is
[check('name', 'invalid name').isAlpha()]
Why is this code returning "invalid name" when the value is a string?
I also tried isString(), but it behaves the same way as isAlpha().
Error json response to the client is
{
"errors": [
{
"value": "john doe",
"msg": "invalid name",
"param": "name",
"location": "body"
}
]
}
Does the isAlpha() function only accept a single word?
How can I fix this?
There is an option of .isAlpha you can use to ignore white spaces:
check('name', 'invalid name').isAlpha('en-US', {ignore: ' '})
The first parameter 'en-US' is AlphaLocale. For example I use 'es-ES' to validate Spanish special characters. You can use one of these to validate other languages: 'ar' | 'ar-AE' | 'ar-BH' | 'ar-DZ' | 'ar-EG' | 'ar-IQ' | 'ar-JO' | 'ar-KW' | 'ar-LB' | 'ar-LY' | 'ar-MA' | 'ar-QA' | 'ar-QM' | 'ar-SA' | 'ar-SD' | 'ar-SY' | 'ar-TN' | 'ar-YE' | 'az-AZ' | 'bg-BG' | 'cs-CZ' | 'da-DK' | 'de-DE' | 'el-GR' | 'en-AU' | 'en-GB' | 'en-HK' | 'en-IN' | 'en-NZ' | 'en-US' | 'en-ZA' | 'en-ZM' | 'es-ES' | 'fa-AF' | 'fa-IR' | 'fr-FR' | 'he' | 'hu-HU' | 'id-ID' | 'it-IT' | 'ku-IQ' | 'nb-NO' | 'nl-NL' | 'nn-NO' | 'pl-PL' | 'pt-BR' | 'pt-PT' | 'ru-RU' | 'sk-SK' | 'sl-SI' | 'sr-RS' | 'sr-RS#latin' | 'sv-SE' | 'th-TH' | 'tr-TR' | 'uk-UA' | 'vi-VN'.
The second parameter is the object IsAlphaOptions. It only contains an optional parameter 'ignore', and it can have the value of a string, string[] or RegExp.
So you can also ignore whitespace with a RegExp matching \s (note that it must be a RegExp value; in a plain string literal, '\s' is just the letter s):
.isAlpha('en-US', {ignore: /\s/g})
I got the answer: I used a custom validation method, and it resolved my issue. (Note that isNaN only rejects purely numeric values, so any non-numeric string passes this check.)
[check('name').custom((value, { req }) => {
    if (isNaN(value)) {
        return true;
    } else {
        throw new Error('invalid name');
    }
})]
To check, using express-validator, that a string contains only letters and spaces, you can use a regular expression:
check('name').custom((value) => {
    return value.match(/^[A-Za-z ]+$/);
})
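The same pattern can be verified quickly in any regex engine. A Python sketch of the letters-and-spaces check (the sample inputs are illustrative):

```python
import re

# Letters and spaces only, anchored at both ends,
# mirroring the custom validator's pattern above.
pattern = re.compile(r'^[A-Za-z ]+$')

def is_valid_name(value: str) -> bool:
    return pattern.fullmatch(value) is not None

print(is_valid_name("john doe"))    # True
print(is_valid_name("john doe 3"))  # False: digits are rejected
```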
"john doe" contains a white space " ". Because of this whitespace, isAlpha() throws an error: isAlpha allows only a-zA-Z.
Hopefully I'm not late to the party.
With class-validator#0.13.2, we can use
@Matches(/^[a-zA-Z0-9 -]*$/)
Just tweak the regex to satisfy your needs. In my case, I wanted @IsAlphanumeric() but with spaces and a hyphen/dash.
Simply replace isAlpha() or isAlphanumeric()
with
isAlphaWithSpace() or isAlphanumericWithSpace().

Perl threads don't suspend/ resume

I am using Thread::Suspend to start threads from remote modules. Some of the $subroutine calls take longer than 30 seconds.
my $thr = threads->create(sub {
    capture(EXIT_ANY, $^X, $pathToModule, $subroutine, %arguments)
});
return $thr->tid();
My issue is that I am not able to suspend/resume a created thread. Here is the code executed to suspend a thread:
use IPC::System::Simple qw (capture $EXITVAL EXIT_ANY);
use threads;
use Thread::Suspend;
use Try::Tiny;
sub suspendThread {
    my $msg;
    my $threadNumber = shift;
    foreach (threads->list()) {
        if ($_->tid() == $threadNumber) {
            if ($_->is_suspended() == 0) {
                try {
                    # here the execution of the thread is not paused
                    threads->suspend($_);
                } catch {
                    print "error: " . $! . "\n";
                };
                $msg = "Process $threadNumber paused";
            } else {
                $msg = "Process $threadNumber has to be resumed\n";
            }
        }
    }
    return $msg;
}
And this is the code from the module that I load dynamically:
sub run {
    no strict 'refs';
    my $funcRef = shift;
    my %paramsRef = @_;
    print &$funcRef(%paramsRef);
}
run(@ARGV);
I guess the problem is that the sub passed to the threads constructor calls capture (from the IPC::System::Simple module). I also tried to create the thread with my $thr = threads->create(capture(EXIT_ANY, $^X, $pathToModule, $subroutine, %arguments));. Any ideas how to resolve this?
These are the threads you have:
Parent process Process launched by capture
+---------------------+ +---------------------+
| | | |
| Main thread | | Main thread |
| +---------------+ | | +---------------+ |
| | | | | | | |
| | $t->suspend() | | | | | |
| | | | | | | |
| +---------------+ | | +---------------+ |
| | | |
| Created thread | | |
| +---------------+ | | |
| | | | | |
| | capture() | | | |
| | | | | |
| +---------------+ | | |
| | | |
+---------------------+ +---------------------+
You claim the thread you created wasn't suspended, but you have practically no way of determining whether it was suspended or not. After all, capture does not print anything or change any external variables. In fact, you have no reason to believe it wasn't suspended.
Now, you might want the program you launched to freeze, but you have not done anything to suspend it or its main thread. As such, it will keep on running[1].
If you wanted to suspend an external process, you could send SIGSTOP to it (and SIGCONT to resume it). For that, you'll need the process's PID. I recommend replacing capture with an IPC::Run pump loop.
[1] Well, it will eventually block when it tries to write to STDOUT, because the pipe fills up: you actually did suspend the thread running capture.
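The SIGSTOP/SIGCONT idea suggested above can be sketched in a few lines of Python (POSIX only; the child command and timings are arbitrary and purely illustrative):

```python
import signal
import subprocess
import time

# Launch an external process that would otherwise just keep running.
child = subprocess.Popen(["sleep", "5"])

child.send_signal(signal.SIGSTOP)  # freeze the whole external process
time.sleep(0.1)                    # ...it makes no progress here...
child.send_signal(signal.SIGCONT)  # resume it

child.terminate()                  # clean up the demo child
child.wait()
print("child was stopped and resumed via signals")
```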

Understanding dequeue_rt_stack() for RT scheduling class linux

The enqueue_task_rt function in ./kernel/sched/rt.c is responsible for queuing a task on the run queue. enqueue_task_rt contains a call to enqueue_rt_entity, which in turn calls dequeue_rt_stack. Most of the code seems logical, but I am a bit lost with the function dequeue_rt_stack: I am unable to understand what it does. Can somebody explain the logic I am missing, or suggest some good reading?
Edit: The following is the code for dequeue_rt_stack function
struct sched_rt_entity *back = NULL;

/* macro for_each_sched_rt_entity is defined as
   for (; rt_se; rt_se = rt_se->parent) */
for_each_sched_rt_entity(rt_se) {
    rt_se->back = back;
    back = rt_se;
}

for (rt_se = back; rt_se; rt_se = rt_se->back) {
    if (on_rt_rq(rt_se))
        __dequeue_rt_entity(rt_se);
}
More specifically, I do not understand why there is a need for this code:
for_each_sched_rt_entity(rt_se) {
    rt_se->back = back;
    back = rt_se;
}
What is its relevance?
When a task is to be added to some queue, it must first be removed from the queue that it currently is on, if any.
With the group scheduler, a task is always at the lowest level of the tree, and might have multiple ancestors:
NULL
^
|
+-----parent------+
| |
| top-level group |
| |
+-----------------+
^ ^_____________
| \
+-----parent------+ +-----parent------+
| | | |
| mid-level group | | other group | ...
| | | |
+-----------------+ +-----------------+
^ ^_____________
| \
+-----parent------+ +-----------------+
| | | |
| task | | other task | ...
| | | |
+-----------------+ +-----------------+
To remove the task from the tree, it must be removed from all of its groups' queues, and this must be done first at the top-level group (otherwise, the scheduler might try to run an already partially-removed task). Therefore, dequeue_rt_stack uses the back pointers to construct a list in the opposite direction:
NULL back
^ |
| V
+-parent----------+
| |
| top-level group |
| |
+----------back---+
^ | ^_____________
| V \
+-parent----------+ +-----parent------+
| | | |
| mid-level group | | other group | ...
| | | |
+----------back---+ +-----------------+
^ | ^_____________
| V \
+-parent----------+ +-----------------+
| | | |
| task | | other task | ...
| | | |
+----------back---+ +-----------------+
|
V
NULL
That back list can then be used to walk down the tree to remove the entities in the correct order.
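The two loops are easy to model outside the kernel. A small Python sketch of the back-pointer construction and the top-down walk (the class and field names mirror the C code, but this is only an illustration of the pointer manipulation, not kernel code):

```python
# Model a scheduling entity with a parent link (upward) and a
# back link (downward), as in struct sched_rt_entity.
class Entity:
    def __init__(self, name, parent=None):
        self.name, self.parent, self.back = name, parent, None

top = Entity("top-level group")
mid = Entity("mid-level group", parent=top)
task = Entity("task", parent=mid)

# First loop: for_each_sched_rt_entity(rt_se) { rt_se->back = back; back = rt_se; }
# Walk the parent chain upward while threading the back pointers.
back = None
se = task
while se is not None:
    se.back = back
    back, se = se, se.parent

# Second loop: walk the back list to dequeue from the top downward.
order = []
se = back
while se is not None:
    order.append(se.name)
    se = se.back
print(order)  # ['top-level group', 'mid-level group', 'task']
```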
I am new to kernel hacking, and this is my first time answering a Linux kernel question, but maybe this helps. I read the source code, and I think it relates to group scheduling.
When the kernel is built with CONFIG_RT_GROUP_SCHED, several scheduling entities can be collected into one scheduling group.
static void enqueue_rt_entity(struct sched_rt_entity *rt_se, bool head)
{
    dequeue_rt_stack(rt_se);
    for_each_sched_rt_entity(rt_se)
        __enqueue_rt_entity(rt_se, head);
}
The function dequeue_rt_stack(rt_se) first removes every scheduling entity in the group hierarchy from its run queue; the loop that follows then re-adds them with __enqueue_rt_entity.
Hierarchical group I/O scheduling
CFS group scheduling
