match pattern before and after delimiter in grok - logstash-grok

I have input similar to "ApplicationID##EVENTREFERENCE" and I need to split it into a key (before the delimiter ##) and a value (after ##).
I tried this through the grok debugger:
(?<ApplicationID>[^##]*) but this matches the value before ##, not the one after it.
Expected results:
{
  "ApplicationID": [
    [
      "EVENTREFERENCE"
    ]
  ]
}

You can use the common option "add_field" at the end of your grok filter, after you have matched your data, to create a new field that contains the data you matched.
It should look something like this:
filter {
  grok {
    match => { "message" => "%{DATA:key}#\#%{GREEDYDATA:value}" }
    add_field => { "%{key}" => "%{value}" }
  }
}
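With the sample input from the question, the resulting event should then look roughly like this (a sketch; the intermediate key and value fields stay on the event unless you remove them, for example with mutate's remove_field):
{
  "message": "ApplicationID##EVENTREFERENCE",
  "key": "ApplicationID",
  "value": "EVENTREFERENCE",
  "ApplicationID": "EVENTREFERENCE"
}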

Logstash Add field from grok filter

Is it possible to match a message to a new field in logstash using grok and mutate?
Example log:
"<30>Dec 19 11:37:56 7f87c507df2a[20103]: [INFO] 2018-12-19 16:37:56 _internal (MainThread): 192.168.0.6 - - [19/Dec/2018 16:37:56] \"\u001b[37mGET / HTTP/1.1\u001b[0m\" 200 -\r"
I am trying to create a new key value where I match container_id to 7f87c507df2a.
filter {
  grok {
    match => [ "message", "%{SYSLOG5424PRI}%{NONNEGINT:ver} +(?:%{TIMESTAMP_ISO8601:ts}|-) +(?:%{HOSTNAME:service}|-) +(?:%{NOTSPACE:containerName}|-) +(?:%{NOTSPACE:proc}|-) +(?:%{WORD:msgid}|-) +(?:%{SYSLOG5424SD:sd}|-|) +%{GREEDYDATA:msg}" ]
  }
  mutate {
    add_field => { "container_id" => "%{containerName}" }
  }
}
The resulting log renders this, where the value of containerName isn't being referenced from grok; it is just a string literal:
"container_id": "%{containerName}"
I am trying to have the conf create:
"container_id": "7f87c507df2a"
Obviously the value of containerName isn't being linked from grok. Is what I want to do even possible?
As explained in the comments, my grok pattern was incorrect: it never matched, so the %{containerName} reference was never substituted. For anyone who wanders towards this post needing help with grok, use a grok debugger (e.g. https://grokdebug.herokuapp.com/) to make building your pattern less time consuming.
Here was the working snapshot:
filter {
  grok {
    match => [ "message", "\A%{SYSLOG5424PRI}%{SYSLOGTIMESTAMP}%{SPACE}%{BASE16NUM:docker_id}%{SYSLOG5424SD}%{GREEDYDATA:python_log_message}" ]
    add_field => { "container_id" => "%{docker_id}" }
  }
}
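Run against the example log line, that pattern should extract roughly the following (a sketch; the leading ": " ends up in python_log_message because nothing in the pattern consumes it):
{
  "docker_id": "7f87c507df2a",
  "container_id": "7f87c507df2a",
  "python_log_message": ": [INFO] 2018-12-19 16:37:56 _internal (MainThread): ..."
}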

how to match multiple times in one field in logstash

This is the field:
device_version => 2.6.1.1280 [eng:v1.3.26.0 rul:v2018.07.12.09 act:v2018.01.20.01 sws:v2018.07.12.09]
How can I get the eng and rul ... values and put them individually into new fields?
Thanks
If you just want to match the eng and rul values, you can simply match them using %{DATA}:
eng:%{DATA:eng}\srul:%{DATA:rul}\s
This will output:
{
  "eng": [
    [
      "v1.3.26.0"
    ]
  ],
  "rul": [
    [
      "v2018.07.12.09"
    ]
  ]
}
You can test it at https://grokdebug.herokuapp.com/
Edit:
filter {
  grok {
    match => { "device_version" => "eng:%{DATA:eng}\srul:%{DATA:rul}\s" }
  }
}
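If you also need the act and sws values, the same approach extends; a sketch, anchored on the closing bracket so the final DATA knows where to stop:
filter {
  grok {
    match => { "device_version" => "eng:%{DATA:eng}\srul:%{DATA:rul}\sact:%{DATA:act}\ssws:%{DATA:sws}\]" }
  }
}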
You should also have a look at the default grok patterns available: https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns

logstash generate @timestamp from parsed message

I have file containing series of such messages:
component+branch.job 2014-09-04_21:24:46 2014-09-04_21:24:49
It is a string, some whitespace, the first date and time, more whitespace, and the second date and time. Currently I'm using this filter:
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{WORD:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
}
I would like to convert dateStart and timeStart to @timestamp for that message.
I found that there is a date filter, but I don't know how to use it on two separate fields.
I have also tried something like this as a filter:
date {
  match => [ "message", "YYYY-MM-dd_HH:mm:ss" ]
}
but it didn't work as expected.
Based on the duplicate suggested by Magnus Bäck, I created a solution for my problem. The solution was to mutate the parsed data into one field:
mutate {
  add_field => { "tmp_start_timestamp" => "20%{dateStart}_%{timeStart}" }
}
and then parse it as I suggested in my question.
So the final solution looks like this:
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{DATA:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
  mutate {
    add_field => { "tmp_start_timestamp" => "20%{dateStart}_%{timeStart}" }
  }
  date {
    match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  }
}
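If you don't want the helper field left on the indexed event, the date filter's common option remove_field is applied only when the date parses successfully, which makes it a convenient place to drop it (a sketch):
date {
  match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  remove_field => [ "tmp_start_timestamp" ]
}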

need custom fields of log through grok filter in logstash

I have Logstash, Kibana and Elasticsearch installed on my system, with this filter configuration:
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    mutate {
      add_field => {
        "timestamp" => "%{TIME} %{MONTH} %{monthday}"
      }
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
and I am receiving output in Kibana, but I need some fields which are as follows:
@timestamp
@version
_id
_index
_type
_file
Log Level
Host Name
Host IP
Process Name
Response Time
I tried adding the timestamp field, but it prints the same string instead of a dynamic result.
You're confusing patterns with fields.
A pattern is a short-hand notation that represents a regular expression, such as %{WORD} as a shortcut for "\b\w+\b".
A field is where data - including information matched by patterns - is stored. It's possible to put a pattern into a field like this: %{WORD:my_field}
In your grok{}, you match with: %{SYSLOGTIMESTAMP:syslog_timestamp}, which puts everything that was matched into a single field called syslog_timestamp. This is the month, monthday, and time seen at the front of syslog messages.
Even though SYSLOGTIMESTAMP is itself defined as "%{MONTH} +%{MONTHDAY} %{TIME}", those inner patterns don't have the ":name" syntax, so no fields are created for MONTH, MONTHDAY, and TIME.
Assuming that you really do want to make a new field in the format you describe, you'd need to either:
1. make a new pattern to replace all of SYSLOGTIMESTAMP that would make fields out of the pieces of information, or
2. use the existing pattern to create the syslog_timestamp field as you're doing, and then grok{} that field with a simple pattern to split it apart.
I'd recommend #2, so you'd end up with something like this:
grok {
  match => { "syslog_timestamp" => "%{MONTH:month} +%{MONTHDAY:monthday} %{TIME:time}" }
}
That should do it.
Please note that your field will be a string, so it won't be of any use in range queries, etc. You should use the date{} filter to replace @timestamp with your syslog_timestamp information.
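A minimal sketch of that, reusing the two timestamp layouts already matched in the question's config (syslog timestamps carry no year, so the date filter assumes the current one):
date {
  match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
}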
Good luck.

How can I have logstash drop all events that do not match a group of regular expressions?

I'm trying to match event messages with several regular expressions. I was going to use the grep filter, but it's deprecated, so I'm trying drop with negation.
The functionality I'm looking for is to have all events dropped unless the message matches several regular expressions.
The filter below does not work, but tested individually, both expressions work fine.
What am I missing?
filter {
  if ([message] !~ ' \[critical\]: ' or [message] !~ '\[crit\]: ') {
    drop { }
  }
}
I read a bit more and went with tagging the events via grok, adding a tag on match and dropping them at the end if the tag was not there:
filter {
  grok {
    add_tag => [ "valid" ]
    match => [
      "message", ".+ \[critical\]: ?(.+)",
      "message", ".+ \[crit\]: ?(.+) ",
      "message", '.+ (Deadlock found.+) ',
      "message", "(.+: Could not record email: .+) "
    ]
  }
  if "valid" not in [tags] {
    drop { }
  }
  mutate {
    remove_tag => [ "valid" ]
  }
}
if "_grokparsefailure" in [tags] {
drop {}
}
You're using a regexp in your conditional, but not passing in the argument in the correct format. The doc shows this:
if [status] =~ /^5\d\d/ {
  nagios { ... }
}
Note the regexp is unquoted and surrounded with slashes.
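Applied to the filter from the question, a corrected sketch would look like this (note that the boolean also has to be and: with !~ on both sides, or would drop every message that fails either one of the two tests, i.e. all of them):
filter {
  if [message] !~ / \[critical\]: / and [message] !~ /\[crit\]: / {
    drop { }
  }
}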
