How to match multiple values in one field in Logstash

This is the field
device_version => 2.6.1.1280 [eng:v1.3.26.0 rul:v2018.07.12.09 act:v2018.01.20.01 sws:v2018.07.12.09]
How can I extract the eng, rul, ... values and put each of them into a new field?
Thanks

If you just want to match the eng and rul values, you can simply match them using %{DATA}:
eng:%{DATA:eng}\srul:%{DATA:rul}\s
This will output:
{
  "eng": [
    [
      "v1.3.26.0"
    ]
  ],
  "rul": [
    [
      "v2018.07.12.09"
    ]
  ]
}
You can test it at https://grokdebug.herokuapp.com/
Edit:
filter {
  grok {
    match => { "device_version" => "eng:%{DATA:eng}\srul:%{DATA:rul}\s" }
  }
}
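If you want all four values (eng, rul, act, sws), a minimal sketch extending the same idea, assuming the bracketed format shown above:
filter {
  grok {
    match => { "device_version" => "eng:%{DATA:eng}\srul:%{DATA:rul}\sact:%{DATA:act}\ssws:%{DATA:sws}\]" }
  }
}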
You should also have a look at the default grok patterns available: https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns

Related

match pattern before and after delimiter in grok

I have a pattern similar to "ApplicationID##EVENTREFERENCE" and I need to split it into a key (before the delimiter ##) and a value (after ##).
In the grok debugger I tried something like (?<key>[^#]*), but this matches the value before ##.
Expected results:
{
  "ApplicationID": [
    [
      "EVENTREFERENCE"
    ]
  ]
}
You can use the common option "add_field" at the end of your grok filter, after you have matched your data, to create a new field that contains the data you matched.
It should look something like this:
filter {
  grok {
    match => { "message" => "%{DATA:key}#\#%{GREEDYDATA:value}" }
    add_field => { "%{key}" => "%{value}" }
  }
}
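For the sample input "ApplicationID##EVENTREFERENCE", this should produce an event along these lines (a sketch of the resulting fields, not verbatim Logstash output):
  "key" => "ApplicationID"
  "value" => "EVENTREFERENCE"
  "ApplicationID" => "EVENTREFERENCE"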

using Grok to skip parts of message or logs

I have just started using grok for Logstash and I am trying to parse my log file with a grok filter.
My log line is something like the one below:
03-30-2017 13:26:13 [00089] TIMER XXX.TimerLog: entType [organization], queueType [output], memRecno = 446323718, audRecno = 2595542711, elapsed time = 998ms
I want to capture only the initial date/time stamp, entType [organization], and elapsed time = 998ms.
However, it looks like I have to match a pattern for every word and number in the line. Is there a way I can skip them? I tried to look everywhere but couldn't find anything. Kindly help.
As per Charles Duffy's comment.
There are two ways of doing this:
The GREEDYDATA way (GREEDYDATA is just .*):
grok {
  match => { "message" => "^%{DATE_US:dte}\s*%{TIME:tme}\s*\[%{GREEDYDATA}elapsed time\s*=\s*%{BASE10NUM:elapsedTime}" }
}
Or, telling grok not to stop at the first successful match (break_on_match => false), so each pattern in the list gets a chance to contribute fields:
grok {
  break_on_match => false
  match => { "message" => "^%{DATE_US:dte}\s*%{TIME:tme}\s*\[" }
  match => { "message" => "elapsed time\s*=\s*%{BASE10NUM:elapsedTime}" }
}
You can then rejoin the date & time into a single field and convert it to a timestamp.
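A minimal sketch of that rejoining step, assuming the dte and tme fields captured above (the tmp_timestamp field name is just illustrative; DATE_US parses as month-day-year):
filter {
  mutate {
    add_field => { "tmp_timestamp" => "%{dte} %{tme}" }
  }
  date {
    match => [ "tmp_timestamp", "MM-dd-yyyy HH:mm:ss" ]
  }
}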
As Charles Duffy suggested, you can simply bypass data you don't need, using .* to do so. The following will produce the output you want:
%{DATE_US:dateTime}.*entType\s*\[%{WORD:org}\].*elapsed time\s*=\s*%{BASE10NUM}
Explanation:
\s* matches zero or more whitespace characters.
\[ matches a literal [ character.
%{WORD:org} matches a single word and places it in the new field org.
Output:
{
  "dateTime": [
    [
      "03-30-2017"
    ]
  ],
  "MONTHNUM": [
    [
      "03"
    ]
  ],
  "MONTHDAY": [
    [
      "30"
    ]
  ],
  "YEAR": [
    [
      "2017"
    ]
  ],
  "org": [
    [
      "organization"
    ]
  ],
  "BASE10NUM": [
    [
      "998"
    ]
  ]
}
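Note that the elapsed time above is only captured under the generic BASE10NUM key; if you want it in a named field, name the capture, for example:
%{DATE_US:dateTime}.*entType\s*\[%{WORD:org}\].*elapsed time\s*=\s*%{BASE10NUM:elapsedTime}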
A list of all available grok patterns is here: https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns

Logstash grok plugin, add field when matched

I have a grok match like this:
grok { match => [ "message", "Duration: %{NUMBER:duration}", "Speed: %{NUMBER:speed}" ] }
I also want to add another field to the captured variables if it matches a grok pattern. I know I can use the mutate plugin and if-else to add new fields, but I have too many matches and it would get too long that way. As an example, I want to produce the right-hand fields for the given texts:
"Duration: 12" => [duration: "12", type: "duration_type"]
"Speed: 12" => [speed: "12", type: "speed_type"]
Is there a way to do this?
I am not 100% sure if that is what you need, but I did something similar: I have a basic parsing pass for my message, and then I additionally analyse a specific field with optional matches.
grok {
  break_on_match => false
  patterns_dir => "/etc/logstash/conf.d/patterns"
  match => {
    "message" => "\[%{LOGLEVEL:level}\] \[%{IPORHOST:from}\] %{TIMESTAMP_ISO8601:timestamp} \[%{DATA:thread}\] \[%{NOTSPACE:logger}\] %{GREEDYDATA:msg}"
    "thread" => "(%{GREEDYDATA}%{REQUEST_TYPE:reqType}%{SPACE}%{URIPATH:reqPath}(%{URIPARAM:reqParam})?)?"
  }
}
As you can see, the first pattern simply matches the complete message. I have a field thread that is basically the logger information. However, in my setup, HTTP requests append some info to the thread name. In these cases, I want to OPTIONALLY match these as well.
With the above setup, the fields reqType, reqPath and reqParam are only created if thread can match them; otherwise they aren't.
I hope this is what you wanted.
Thanks,
Artur
Something like this?
filter {
  grok { match => [ "message", "%{GREEDYDATA:types}: %{NUMBER:value}" ] }
  mutate {
    lowercase => [ "types" ]
    add_field => { "%{types}" => "%{value}"
                   "type" => "%{types}_type" }
    remove_field => [ "value", "types" ]
  }
}
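For an input of "Duration: 12", this should yield roughly the following fields (a sketch of the result, not verbatim Logstash output):
  "duration" => "12"
  "type" => "duration_type"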

logstash generate @timestamp from parsed message

I have file containing series of such messages:
component+branch.job 2014-09-04_21:24:46 2014-09-04_21:24:49
It is a string, some whitespace, the first date and time, more whitespace, and the second date and time. Currently I'm using this filter:
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{WORD:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
}
I would like to convert dateStart and timeStart into @timestamp for that message.
I found that there is a date filter, but I don't know how to use it on two separate fields.
I have also tried something like this as a filter:
date {
  match => [ "message", "YYYY-MM-dd_HH:mm:ss" ]
}
but it didn't work as expected.
Based on the duplicate suggested by Magnus Bäck, I created a solution for my problem. The solution was to mutate the parsed data into one field:
mutate {
  add_field => { "tmp_start_timestamp" => "20%{dateStart}_%{timeStart}" }
}
and then parse it as I suggested in my question.
So the final solution looks like this (the date filter writes the parsed value to @timestamp by default, which is what the question asks for):
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{DATA:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
  mutate {
    add_field => { "tmp_start_timestamp" => "20%{dateStart}_%{timeStart}" }
  }
  date {
    match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  }
}
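If you want to be explicit about where the parsed value goes, the date filter's target option does that; @timestamp is already the default, so this is only for clarity:
date {
  match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  target => "@timestamp"
}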

How can I have logstash drop all events that do not match a group of regular expressions?

I'm trying to match event messages against several regular expressions. I was going to use the grep filter, but it's deprecated, so I'm trying the drop filter with negation.
The functionality I'm looking for is to have all events dropped unless the message matches several regular expressions.
The filter below does not work, though both expressions work fine when tested individually.
What am I missing?
filter {
  if ([message] !~ ' \[critical\]: ' or [message] !~ '\[crit\]: ') {
    drop { }
  }
}
I was reading a bit more and went with tagging the events in grok and dropping them at the end if the tag was not there:
filter {
  grok {
    add_tag => [ "valid" ]
    match => [
      "message", ".+ \[critical\]: ?(.+)",
      "message", ".+ \[crit\]: ?(.+) ",
      "message", '.+ (Deadlock found.+) ',
      "message", "(.+: Could not record email: .+) "
    ]
  }
  if "valid" not in [tags] {
    drop { }
  }
  mutate {
    remove_tag => [ "valid" ]
  }
}
if "_grokparsefailure" in [tags] {
  drop {}
}
You're using a regexp in your conditional, but you're not passing the argument in the correct format. The docs show this:
if [status] =~ /^5\d\d/ {
  nagios { ... }
}
Note that the regexp is unquoted and surrounded by slashes.
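Applied to the filter from the question, that would look something like this (a sketch; note that keeping events that match either pattern requires combining the negated conditions with and, not or):
filter {
  if [message] !~ / \[critical\]: / and [message] !~ /\[crit\]: / {
    drop { }
  }
}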
