Multiline in Logstash with timestamp in each line

I have a multiline log written in a file as follows:
INFO | jvm 1 | main | 2014/11/06 13:41:30.112 | ERROR [appHTTP50] [appEmployeeAuthenticationProvider] Can't login with username 'username'
INFO | jvm 1 | main | 2014/11/06 13:41:30.112 | org.framework.security.authentication.BadCredentialsException: Bad credentials
INFO | jvm 1 | main | 2014/11/06 13:41:30.112 | at de.app.platform.security.CoreAuthenticationProvider.authenticate(CoreAuthenticationProvider.java:133)
INFO | jvm 1 | main | 2014/11/06 13:41:30.112 | at ca.canadiantire.security.appEmployeeAuthenticationProvider.authenticate(appEmployeeAuthenticationProvider.java:39)
INFO | jvm 1 | main | 2014/11/06 13:41:30.112 | at org.framework.security.authentication.ProviderManager.authenticate(ProviderManager.java:156)
INFO | jvm 1 | main | 2014/11/06 13:41:30.112 | at org.framework.security.authentication.ProviderManager.authenticate(ProviderManager.java:177)
However, the following prefix appears at the beginning of every line of the trace:
INFO | jvm 1 | main | 2014/11/06 13:41:30.112 |
Does anyone know how to keep this prefix at the beginning of the line containing "ERROR", drop it from the remaining trace lines with grok, and get the full trace as a single message in Logstash? Any other solutions are welcome.

I would think gsub{} is the answer. Use a conditional stanza that removes the preface from the subsequent lines, e.g.:
if [message] !~ /\| ERROR / {
  mutate {
    gsub => [ "message", "^.* \| ", "" ]
  }
}
which, because ".*" is greedy, should leave you with a line like this:
org.framework.security.authentication.BadCredentialsException: Bad credentials
which could then be combined with a subsequent multiline{} filter.
Obviously, you'd need to make both regexps generic enough to handle each log level that you're expecting.
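To make the intended result concrete, here is a minimal Python sketch of the same idea outside Logstash: strip the preface from every line that does not start a new event, then append the remainder to the previous event. The names merge_trace, PREFACE and EVENT_START, and the list of log levels, are assumptions for illustration and are not part of Logstash.

```python
import re

# Greedy match of everything up to the last " | ", i.e. the wrapper preface
# "INFO | jvm 1 | main | 2014/11/06 13:41:30.112 | ".
PREFACE = re.compile(r"^.* \| ")
# A line starts a new event when a log level follows the preface
# (the level list is an assumption; extend it to match your logs).
EVENT_START = re.compile(r"\| (ERROR|WARN|INFO|DEBUG) \[")


def merge_trace(lines):
    """Group wrapper-prefixed lines into events, one string per event."""
    events = []
    for line in lines:
        line = line.rstrip("\n")
        if EVENT_START.search(line) or not events:
            # First line of an event: keep the preface as it is.
            events.append(line)
        else:
            # Continuation line: drop the preface and append to the event.
            events[-1] += "\n" + PREFACE.sub("", line, count=1)
    return events
```

Feeding it the lines of the log above yields a single event whose first line keeps the timestamp and whose following lines are the bare stack trace.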

Related

Parse `key1=value1 key2=value2` in Kusto

I'm running Cilium inside an Azure Kubernetes Cluster and want to parse the cilium log messages in the Azure Log Analytics. The log messages have a format like
key1=value1 key2=value2 key3="if the value contains spaces, it's wrapped in quotation marks"
For example:
level=info msg="Identity of endpoint changed" containerID=a4566a3e5f datapathPolicyRevision=0
I couldn't find a matching parse_xxx method in the docs (e.g. https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/parsecsvfunction ). Is there a possibility to write a custom function to parse this kind of log messages?
Not a fun format to parse... But this should work:
let LogLine = "level=info msg=\"Identity of endpoint changed\" containerID=a4566a3e5f datapathPolicyRevision=0";
print LogLine
| extend KeyValuePairs = array_concat(
    extract_all("([a-zA-Z_]+)=([a-zA-Z0-9_]+)", LogLine),
    extract_all("([a-zA-Z_]+)=\"([a-zA-Z0-9_ ]+)\"", LogLine))
| mv-apply KeyValuePairs on
(
    extend p = pack(tostring(KeyValuePairs[0]), tostring(KeyValuePairs[1]))
    | summarize dict=make_bag(p)
)
The output will be:
| print_0 | dict |
|--------------------|-----------------------------------------|
| level=info msg=... | { |
| | "level": "info", |
| | "containerID": "a4566a3e5f", |
| | "datapathPolicyRevision": "0", |
| | "msg": "Identity of endpoint changed" |
| | } |
|--------------------|-----------------------------------------|
With the help of Slavik N, I came up with a query that works for me:
let containerIds = KubePodInventory
    | where Namespace startswith "cilium"
    | distinct ContainerID
    | summarize make_set(ContainerID);
ContainerLog
| where ContainerID in (containerIds)
| extend KeyValuePairs = array_concat(
    extract_all("([a-zA-Z0-9_-]+)=([^ \"]+)", LogEntry),
    extract_all("([a-zA-Z0-9_]+)=\"([^\"]+)\"", LogEntry))
| mv-apply KeyValuePairs on
(
    extend p = pack(tostring(KeyValuePairs[0]), tostring(KeyValuePairs[1]))
    | summarize JSONKeyValuePairs=parse_json(make_bag(p))
)
| project TimeGenerated, Level=JSONKeyValuePairs.level, Message=JSONKeyValuePairs.msg, PodName=JSONKeyValuePairs.k8sPodName, Reason=JSONKeyValuePairs.reason, Controller=JSONKeyValuePairs.controller, ContainerID=JSONKeyValuePairs.containerID, Labels=JSONKeyValuePairs.labels, Raw=LogEntry
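For checking the patterns outside Log Analytics, the same key=value extraction can be sketched in Python. This is only an illustration of the parsing rule (bare values stop at whitespace, quoted values run to the closing quote); the function name parse_key_values is hypothetical and the sketch is not Kusto code.

```python
import re

# Either key="quoted value" or key=bare_value (no spaces or quotes).
PAIR = re.compile(r'([A-Za-z0-9_-]+)=(?:"([^"]*)"|([^\s"]+))')


def parse_key_values(line):
    """Return a dict with the key=value pairs found in one log line."""
    result = {}
    for key, quoted, bare in PAIR.findall(line):
        # Exactly one of the two value groups is non-empty for a match
        # (both are empty only for an empty quoted value such as k="").
        result[key] = quoted if quoted else bare
    return result
```

Called on the example line from the question, it returns a dictionary with the keys level, msg, containerID and datapathPolicyRevision.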

white space issue in isAlpha() function of express-validator

I am using express-validator in my project.
The JSON from the client is:
{"name": "john doe"}
My express-validator code is:
[check('name', 'invalid name').isAlpha()]
Why does this code return "invalid name" even though the value is a string?
I also tried isString(), but it behaves the same way as isAlpha().
The error JSON returned to the client is:
{
  "errors": [
    {
      "value": "john doe",
      "msg": "invalid name",
      "param": "name",
      "location": "body"
    }
  ]
}
Does isAlpha() only accept a single word as a string?
How can I fix this?
isAlpha accepts an option you can use to ignore whitespace:
check('name', 'invalid name').isAlpha('en-US', {ignore: ' '})
The first parameter 'en-US' is AlphaLocale. For example I use 'es-ES' to validate Spanish special characters. You can use one of these to validate other languages: 'ar' | 'ar-AE' | 'ar-BH' | 'ar-DZ' | 'ar-EG' | 'ar-IQ' | 'ar-JO' | 'ar-KW' | 'ar-LB' | 'ar-LY' | 'ar-MA' | 'ar-QA' | 'ar-QM' | 'ar-SA' | 'ar-SD' | 'ar-SY' | 'ar-TN' | 'ar-YE' | 'az-AZ' | 'bg-BG' | 'cs-CZ' | 'da-DK' | 'de-DE' | 'el-GR' | 'en-AU' | 'en-GB' | 'en-HK' | 'en-IN' | 'en-NZ' | 'en-US' | 'en-ZA' | 'en-ZM' | 'es-ES' | 'fa-AF' | 'fa-IR' | 'fr-FR' | 'he' | 'hu-HU' | 'id-ID' | 'it-IT' | 'ku-IQ' | 'nb-NO' | 'nl-NL' | 'nn-NO' | 'pl-PL' | 'pt-BR' | 'pt-PT' | 'ru-RU' | 'sk-SK' | 'sl-SI' | 'sr-RS' | 'sr-RS#latin' | 'sv-SE' | 'th-TH' | 'tr-TR' | 'uk-UA' | 'vi-VN'.
The second parameter is an IsAlphaOptions object. It only contains the optional property 'ignore', which can be a string, string[] or RegExp.
So you can also ignore all whitespace with a RegExp. Pass a regex literal with the g flag rather than a string, because the string '\s' is just 's' in JavaScript:
.isAlpha('en-US', {ignore: /\s/g})
I found an answer: I used a custom validation method, and it resolved my issue.
[check('name').custom((value, {req}) => {
    // Note: isNaN only rejects numeric input; any non-numeric string passes.
    if (isNaN(value)) {
        return true;
    } else {
        throw new Error('invalid name');
    }
})]
To check with express-validator that a string contains only letters and spaces, you can use a regular expression:
check('name').custom((value) => {
    return value.match(/^[A-Za-z ]+$/);
})
"john doe" contains a space, and that space is why isAlpha() reports an error. By default, isAlpha() allows only a-zA-Z.
Hopefully I'm not late to the party.
With class-validator@0.13.2, we can use
@Matches(/^[a-zA-Z0-9 -]*$/)
Just tweak the regex to suit your needs. In my case, I wanted @IsAlphanumeric() behaviour, but with spaces and hyphens allowed.
Simply replace isAlpha() or isAlphaNumeric() with isAlphanumericWithSpace() / isAlphaWithSpace().
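As a language-neutral illustration of the letters-and-spaces rule discussed in the answers above, here is a minimal Python sketch. It is not express-validator code, and the function name is_alpha_with_spaces is hypothetical; it only shows what the regular expression ^[A-Za-z ]+$ accepts and rejects.

```python
import re

# One or more ASCII letters or spaces, and nothing else.
LETTERS_AND_SPACES = re.compile(r"[A-Za-z ]+")


def is_alpha_with_spaces(value):
    """True when value is a non-empty string of ASCII letters and spaces."""
    return isinstance(value, str) and LETTERS_AND_SPACES.fullmatch(value) is not None
```

With this rule "john doe" is accepted, while "john123", the empty string and non-string values are rejected.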

Perl threads don't suspend/ resume

I am using Thread::Suspend to start threads from remote modules. Some of the $subroutine calls take longer than 30 seconds.
my $thr = threads->create(sub {
    capture(EXIT_ANY, $^X, $pathToModule, $subroutine, %arguments)
});
return $thr->tid();
My issue is that I am not able to suspend/resume a created thread. Here is the code I execute to suspend a thread:
use IPC::System::Simple qw(capture $EXITVAL EXIT_ANY);
use threads;
use Thread::Suspend;
use Try::Tiny;
sub suspendThread {
    my $msg;
    my $threadNumber = shift;
    foreach (threads->list()) {
        if ($_->tid() == $threadNumber) {
            if ($_->is_suspended() == 0) {
                try {
                    # here the execution of the thread is not paused
                    threads->suspend($_);
                } catch {
                    # Try::Tiny passes the caught error in $_, not $!
                    print "error: $_\n";
                };
                $msg = "Process $threadNumber paused";
            } else {
                $msg = "Process $threadNumber has to be resumed\n";
            }
        }
    }
    return $msg;
}
And this is the code from the module that I load dynamically:
sub run {
    no strict 'refs';
    my $funcRef = shift;
    my %paramsRef = @_;
    print &$funcRef(%paramsRef);
}
run(@ARGV);
I guess that the problem is that the sub passed to the threads constructor calls capture (from the IPC::System::Simple module). I also tried to create the thread with my $thr = threads->create(capture(EXIT_ANY, $^X, $pathToModule, $subroutine, %arguments)); Any ideas how to resolve it?
These are the threads you have:
Parent process               Process launched by capture
+---------------------+      +---------------------+
|                     |      |                     |
|   Main thread       |      |   Main thread       |
|   +---------------+ |      |   +---------------+ |
|   |               | |      |   |               | |
|   | $t->suspend() | |      |   |               | |
|   |               | |      |   |               | |
|   +---------------+ |      |   +---------------+ |
|                     |      |                     |
|   Created thread    |      |                     |
|   +---------------+ |      |                     |
|   |               | |      |                     |
|   | capture()     | |      |                     |
|   |               | |      |                     |
|   +---------------+ |      |                     |
|                     |      |                     |
+---------------------+      +---------------------+
You claim the thread you created wasn't suspended, but you have practically no way of determining whether it was suspended or not. After all, capture does not print anything or change any external variables. In fact, you have no reason to believe it wasn't suspended.
Now, you might want the program you launched to freeze, but you have not done anything to suspend it or its main thread. As such, it will keep on running[1].
If you wanted to suspend an external process, you could send SIGSTOP to it (and SIGCONT to resume it). For that, you'll need the process's PID. I recommend replacing capture with an IPC::Run pump loop.
[1] Well, it will eventually block when it tries to write to STDOUT, because the pipe fills up once the thread running capture has actually been suspended.
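The SIGSTOP/SIGCONT idea can be sketched as follows. This is a minimal Python illustration for POSIX systems, not the IPC::Run pump loop recommended above; the helper names start_child, suspend and resume are hypothetical, and the sleeping child only stands in for the real external program.

```python
import os
import signal
import subprocess
import sys


def start_child():
    """Launch a long-running child process (a stand-in for the real program)."""
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])


def suspend(proc):
    """Freeze the child process; SIGSTOP cannot be caught or ignored."""
    os.kill(proc.pid, signal.SIGSTOP)


def resume(proc):
    """Let a stopped child process continue where it left off."""
    os.kill(proc.pid, signal.SIGCONT)
```

The key point is that the signals are sent to the PID of the external process, so the launcher has to keep that PID around instead of hiding it behind a blocking capture call.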

Parsing in Linux

I want to parse the compute zones from OpenStack command output like the following:
+-----------------------+----------------------------------------+
| Name | Status |
+-----------------------+----------------------------------------+
| internal | available |
| |- controller | |
| | |- nova-conductor | enabled :-) 2016-07-07T08:09:57.000000 |
| | |- nova-consoleauth | enabled :-) 2016-07-07T08:10:01.000000 |
| | |- nova-scheduler | enabled :-) 2016-07-07T08:10:00.000000 |
| | |- nova-cert | enabled :-) 2016-07-07T08:10:00.000000 |
| Compute01 | available |
| |- compute01 | |
| | |- nova-compute | enabled :-) 2016-07-07T08:09:53.000000 |
| Compute02 | available |
| |- compute02 | |
| | |- nova-compute | enabled :-) 2016-07-07T08:10:00.000000 |
| nova | not available |
+-----------------------+----------------------------------------+
I want to parse the result as below, taking only the zones that have a nova-compute service:
Compute01;Compute02
I used below command:
nova availability-zone-list | awk 'NR>2 {print $2}' | grep -v '|' | tr '\n' ';'
but it returns output like this
;internal;Compute01;Compute02;nova;;
In Perl (and written rather more verbosely than is really necessary):
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
my $node;          # Store current node name
my @compute_nodes; # Store known nova-compute nodes
while (<>) {       # Read from STDIN (or from files named on the command line)
    # If we find the start of line, followed by a pipe, a space and
    # a series of word characters...
    if (/^\| (\w+)/) {
        # Store the series of word characters (i.e. the node name) in $node
        $node = $1;
    }
    # If we find a line that contains "nova-compute", add the current
    # node name to @compute_nodes
    push @compute_nodes, $node if /nova-compute/;
}
# Print out all of the values in @compute_nodes
say join ';', @compute_nodes;
I detest one-line programs except for the simplest of applications. They are unnecessarily cryptic, they have none of the usual programming support, and they are stored only in the terminal buffer. Want to do the same thing tomorrow? You must start coding again.
Here's a Perl solution. Run it as
$ perl nova-compute.pl command-output.txt
use strict;
use warnings 'all';
my ($node, @nodes);
while ( <> ) {
    $node = $1 if /^ \| \s* (\w+) /x;
    push @nodes, $node if /nova-compute/;
}
print join(';', @nodes), "\n";
Output:
Compute01;Compute02
Now all of that is saved on disk. It may be run again at any time, modified for similar results, or fixed if you got it wrong. It is also readable. No contest.
$ nova availability-zone-list | awk '/^[|] [^|]/{node=$2} node && /nova-compute/ {s=s ";" node} END{print substr(s,2)}'
Compute01;Compute02
How it works:
/^[|] [^|]/{node=$2}
Any time a line begins with | followed by space followed by a character not |, then save the second field as a node name.
node && /nova-compute/ {s=s ";" node}
If node is non-empty and the current line contains nova-compute, then append node to the string s.
END{print substr(s,2)}
After we have read all the lines, print out string s minus its first character which is a superfluous ;.
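The same state-machine logic (remember the last top-level name, emit it when a nova-compute line appears) can be sketched in Python. The function name compute_zones is hypothetical, and unlike the awk version this sketch skips a zone that was already recorded, which is an added assumption.

```python
def compute_zones(table_text):
    """Return the ';'-joined zone names that contain a nova-compute service."""
    zones = []
    current = None
    for line in table_text.splitlines():
        # A top-level row starts with "| " followed by something other than "|";
        # nested rows start with "| |".
        if line.startswith("| ") and not line.startswith("| |"):
            current = line.split("|")[1].strip()
        # Record the current zone once when a nova-compute service is listed.
        if current and "nova-compute" in line and current not in zones:
            zones.append(current)
    return ";".join(zones)
```

It could be fed the command output through a pipe, for example by passing sys.stdin.read() as the argument.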

Understanding dequeue_rt_stack() for RT scheduling class linux

The enqueue_task_rt function in ./kernel/sched/rt.c is responsible for adding a task to the run queue. enqueue_task_rt calls enqueue_rt_entity, which in turn calls dequeue_rt_stack. Most of the code seems logical, but I am a bit lost because I cannot understand what dequeue_rt_stack does. Can somebody explain the logic that I am missing, or suggest some good reading?
Edit: The following is the code for dequeue_rt_stack function
struct sched_rt_entity *back = NULL;
/* macro for_each_sched_rt_entity defined as
   for(; rt_se; rt_se = rt_se->parent) */
for_each_sched_rt_entity(rt_se) {
    rt_se->back = back;
    back = rt_se;
}
for (rt_se = back; rt_se; rt_se = rt_se->back) {
    if (on_rt_rq(rt_se))
        __dequeue_rt_entity(rt_se);
}
More specifically, I do not understand why there is a need for this code:
for_each_sched_rt_entity(rt_se) {
    rt_se->back = back;
    back = rt_se;
}
What is its relevance?
When a task is to be added to some queue, it must first be removed from the queue that it currently is on, if any.
With the group scheduler, a task is always at the lowest level of the tree, and might have multiple ancestors:
       NULL
        ^
        |
+-----parent------+
|                 |
| top-level group |
|                 |
+-----------------+
        ^      ^_____________
        |                    \
+-----parent------+   +-----parent------+
|                 |   |                 |
| mid-level group |   |   other group   |   ...
|                 |   |                 |
+-----------------+   +-----------------+
        ^      ^_____________
        |                    \
+-----parent------+   +-----------------+
|                 |   |                 |
|      task       |   |   other task    |   ...
|                 |   |                 |
+-----------------+   +-----------------+
To remove the task from the tree, it must be removed from all groups' queues, and this must be done first at the top-level group (otherwise, the scheduler might try to run an already partially-removed task). Therefore, dequeue_rt_stack uses the back pointers to construct a list in the opposite direction:
   NULL    back
    ^       |
    |       V
+-parent----------+
|                 |
| top-level group |
|                 |
+----------back---+
    ^       |  ^_____________
    |       V                \
+-parent----------+   +-----parent------+
|                 |   |                 |
| mid-level group |   |   other group   |   ...
|                 |   |                 |
+----------back---+   +-----------------+
    ^       |  ^_____________
    |       V                \
+-parent----------+   +-----------------+
|                 |   |                 |
|      task       |   |   other task    |   ...
|                 |   |                 |
+----------back---+   +-----------------+
            |
            V
          NULL
That back list can then be used to walk down the tree to remove the entities in the correct order.
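The two loops can be modelled in a few lines of Python to show the resulting order. This is only a toy model under stated assumptions: the Entity class, its on_rq flag and the dequeue_stack function are hypothetical stand-ins for sched_rt_entity, on_rt_rq() and dequeue_rt_stack, not kernel code.

```python
class Entity:
    """Toy stand-in for a scheduling entity with parent and back pointers."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.back = None
        self.on_rq = True  # pretend every entity is currently queued


def dequeue_stack(entity):
    """Dequeue entity and its ancestors, top-level group first."""
    back = None
    node = entity
    # First loop: walk up via parent, recording the way back down.
    while node is not None:
        node.back = back
        back = node
        node = node.parent
    # Second loop: back now points at the top-most entity; walk down.
    order = []
    node = back
    while node is not None:
        if node.on_rq:
            node.on_rq = False
            order.append(node.name)
        node = node.back
    return order
```

For a task nested under a mid-level group inside a top-level group, the returned order is top-level group, mid-level group, task, which is the reverse of the order in which the parent pointers are followed.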
I am new to kernel hacking, and this is my first time answering a Linux kernel question.
Maybe this will help you.
I read the source code, and I think it relates to group scheduling.
When the kernel is built with this option:
#ifdef CONFIG_RT_GROUP_SCHED
it means that scheduling entities can be collected into a scheduling group.
static void enqueue_rt_entity(struct sched_rt_entity *rt_se, bool head)
{
    dequeue_rt_stack(rt_se);
    for_each_sched_rt_entity(rt_se)
        __enqueue_rt_entity(rt_se, head);
}
The function dequeue_rt_stack(rt_se) removes all the scheduling entities in the task's group hierarchy from their run queues; enqueue_rt_entity then adds them back to the run queue.
Hierarchical group I/O scheduling
CFS group scheduling
