logstash generate @timestamp from parsed message - logstash

I have a file containing a series of messages like this:
component+branch.job 2014-09-04_21:24:46 2014-09-04_21:24:49
Each message is a string, some whitespace, a first date and time, more whitespace, and a second date and time. I'm currently using this filter:
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{WORD:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
}
I would like to convert dateStart and timeStart into the @timestamp for that message.
I found that there is a date filter, but I don't know how to use it on two separate fields.
I have also tried something like this as a filter:
date {
  match => [ "message", "YYYY-MM-dd_HH:mm:ss" ]
}
but it didn't work as expected.

Based on the duplicate suggested by Magnus Bäck, I created a solution for my problem. The solution was to use mutate to combine the parsed data into one field:
mutate {
  add_field => {"tmp_start_timestamp" => "20%{dateStart}_%{timeStart}"}
}
and then parse that field with the date filter, as I suggested in my question.
So the final solution looks like this:
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{DATA:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
  mutate {
    add_field => {"tmp_start_timestamp" => "20%{dateStart}_%{timeStart}"}
  }
  date {
    match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  }
}
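A possible variation, offered as a sketch rather than as part of the accepted solution: capturing each full timestamp in a single inline named group should remove the need for both the "20" prefix workaround and the mutate step. The field names startTimestamp and stopTimestamp are made up for illustration; YEAR, MONTHNUM, MONTHDAY and TIME are standard grok patterns.

```
filter {
  grok {
    # (?<name>...) is grok's inline named-capture syntax; each group keeps
    # the whole "2014-09-04_21:24:46" style value in one field.
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{DATA:jobType}\s+(?<startTimestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY}_%{TIME})\s+(?<stopTimestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY}_%{TIME})" ]
  }
  date {
    # With no target set, the date filter writes the parsed value to @timestamp.
    match => [ "startTimestamp", "yyyy-MM-dd_HH:mm:ss" ]
  }
}
```

The date filter is pointed at the captured field rather than at "message" because the format string has to describe the entire value of the field it parses.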

Related

match pattern before and after delimiter in grok

I have a pattern similar to "ApplicationID##EVENTREFERENCE" and I need to split it into a key (the part before the delimiter ##) and a value (the part after ##).
I tried this in the grok debugger:
(?[^##]*) but this matches only the value before ##
Expected results:
{
"ApplicationID": [
[
"EVENTREFERENCE"
]
]
}
You can use the common option "add_field" at the end of your grok filter, after you have matched your data, to create a new field that contains the data you matched.
It should look something like this:
filter {
  grok {
    match => { "message" => "%{DATA:key}#\#%{GREEDYDATA:value}" }
    add_field => { "%{key}" => "%{value}" }
  }
}

how to match multiple times in one field in logstash

This is the field:
device_version => 2.6.1.1280 [eng:v1.3.26.0 rul:v2018.07.12.09 act:v2018.01.20.01 sws:v2018.07.12.09]
How can I get the eng, rul, ... values and put each of them into a new field?
Thanks
If you just want to match the eng and rul values, you can simply match them using %{DATA}:
eng:%{DATA:eng}\srul:%{DATA:rul}\s
This will output:
{
"eng": [
[
"v1.3.26.0"
]
],
"rul": [
[
"v2018.07.12.09"
]
]
}
You can test it at https://grokdebug.herokuapp.com/
Edit:
filter {
  grok {
    match => { "device_version" => "eng:%{DATA:eng}\srul:%{DATA:rul}\s" }
  }
}
You should also have a look at the default grok patterns available: https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns
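If all four values (eng, rul, act, sws) are wanted, one option is to pull out the bracketed part with grok and hand it to the kv filter. This is only a sketch: version_parts is a hypothetical temporary field name, and it assumes the pairs are always separated by single spaces with ":" between key and value, as in the sample line.

```
filter {
  grok {
    # Capture everything between the square brackets.
    match => { "device_version" => "\[%{DATA:version_parts}\]" }
  }
  kv {
    source      => "version_parts"
    field_split => " "   # pairs are separated by spaces
    value_split => ":"   # key and value are separated by a colon
    remove_field => [ "version_parts" ]
  }
}
```

With the sample input this should create the fields eng, rul, act and sws without listing each key in the pattern.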

Logstash grok plugin, add field when matched

I have a grok match like this:
grok{ match => [ "message", "Duration: %{NUMBER:duration}", "Speed: %{NUMBER:speed}" ] }
I also want to add another field to captured variables if it matches a grok pattern. I know I can use mutate plugin and if-else to add new fields but I have too many matches and it will be too long that way. As an example, I want to capture right-side fields for given texts.
"Duration: 12" => [duration: "12", type: "duration_type"]
"Speed: 12" => [speed: "12", type: "speed_type"]
Is there a way to do this?
I am not 100% sure if that is what you need, but I did something similar. I have a basic parsing for my message, and then I analyse a specific field additionally with optional matches.
grok {
  break_on_match => false
  patterns_dir => "/etc/logstash/conf.d/patterns"
  match => {
    "message" => "\[%{LOGLEVEL:level}\] \[%{IPORHOST:from}\] %{TIMESTAMP_ISO8601:timestamp} \[%{DATA:thread}\] \[%{NOTSPACE:logger}\] %{GREEDYDATA:msg}"
    "thread" => "(%{GREEDYDATA}%{REQUEST_TYPE:reqType}%{SPACE}%{URIPATH:reqPath}(%{URIPARAM:reqParam})?)?"
  }
}
As you can see, the first pattern simply matches the complete message. I have a field thread, which is basically the logger information. However, in my setup, HTTP requests append some info to the thread name. In these cases, I want to OPTIONALLY match that info as well.
With the above setup, the fields reqType, reqPath and reqParam are only created if the thread field matches them. Otherwise they aren't.
I hope this is what you wanted.
Thanks,
Artur
Something like this?
filter {
  grok { match => [ "message", "%{GREEDYDATA:types}: %{NUMBER:value}" ] }
  mutate {
    lowercase => [ "types" ]
    add_field => {
      "%{types}" => "%{value}"
      "type" => "%{types}_type"
    }
    remove_field => [ "value", "types" ]
  }
}
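Another way to read the question is one grok block per pattern, relying on the fact that add_field (like the other common options) is applied only when the filter succeeds. The sketch below assumes the event does not already carry a type field; if it does, add_field would append to the existing value instead of replacing it.

```
filter {
  grok {
    match => { "message" => "Duration: %{NUMBER:duration}" }
    add_field => { "type" => "duration_type" }   # added only on a match
    tag_on_failure => []                          # no _grokparsefailure tag for non-matching lines
  }
  grok {
    match => { "message" => "Speed: %{NUMBER:speed}" }
    add_field => { "type" => "speed_type" }
    tag_on_failure => []
  }
}
```

This needs no if/else, but it does grow by one block per pattern, so the mutate-based approach above is shorter when there are many patterns with the same "Name: number" shape.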

logstash calculate elapsed time not working

I have a file containing a series of messages like this:
component+branch.job 2014-09-04_21:24:46 2014-09-04_21:24:49
Each message is a string, some whitespace, a first date and time, more whitespace, and a second date and time. I'm currently using this filter:
grok {
  match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{DATA:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
}
mutate {
  add_field => {"tmp_start_timestamp" => "20%{dateStart}_%{timeStart}"}
  add_field => {"tmp_stop_timestamp" => "20%{dateStop}_%{timeStop}"}
}
date {
  match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  add_tag => [ "jobStarted" ]
}
date {
  match => [ "tmp_stop_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  target => "stop_timestamp"
  remove_field => ["tmp_stop_timestamp", "tmp_start_timestamp", "dateStart", "timeStart", "dateStop", "timeStop"]
  add_tag => [ "jobStopped" ]
}
elapsed {
  start_tag => "jobStarted"
  end_tag => "jobStopped"
  unique_id_field => "message"
}
As a result I get "@timestamp" and "stop_timestamp" fields with date and time data, plus the two tags, but no elapsed time calculation. What am I missing?
UPDATE
I tried splitting the event into two separate events (as @Rumbles suggested), but somehow logstash creates two identical events:
input {
  stdin { type => "time" }
}
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{DATA:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
  mutate {
    add_field => {"tmp_start_timestamp" => "20%{dateStart}_%{timeStart}"}
    add_field => {"tmp_stop_timestamp" => "20%{dateStop}_%{timeStop}"}
    update => [ "type", "start" ]
  }
  clone {
    clones => ["stop"]
  }
  if [type] == "start" {
    date {
      match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
      target => ["start_timestamp"]
      add_tag => [ "jobStarted" ]
    }
  }
  if [type] == "stop" {
    date {
      match => [ "tmp_stop_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
      target => "stop_timestamp"
      remove_field => ["tmp_stop_timestamp", "tmp_start_timestamp", "dateStart", "timeStart", "dateStop", "timeStop"]
      add_tag => [ "jobStopped" ]
    }
  }
  elapsed {
    start_tag => "jobStarted"
    end_tag => "jobStopped"
    unique_id_field => "message"
    timeout => 15
  }
}
output {
  stdout { codec => rubydebug }
}
I've never used this filter, however I have just had a quick read of the documentation, and I think I understand the issue you are having.
From your description, I believe you are trying to run the elapsed filter on a single event. According to the documentation, the filter expects two events, one with the start time and one with the end time, with a common ID that lets the filter pair them up:
The events managed by this filter must have some particular properties. The event describing the start of the task (the “start event”) must contain a tag equal to ‘start_tag’. On the other side, the event describing the end of the task (the “end event”) must contain a tag equal to ‘end_tag’. Both these two kinds of event need to own an ID field which identify uniquely that particular task. The name of this field is stored in ‘unique_id_field’.
Each message is considered an event, so you would need to split each message into two events and give each pair a unique identifier so the filter can link them back together. It's not exactly a tidy solution (splitting an event in two and then reconnecting the halves later), and there may be a better solution that I am not aware of.
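One possible alternative, sketched here and not taken from the elapsed documentation: because both timestamps already sit on the same event, a ruby filter placed after the two date filters could compute the difference directly. The sketch assumes the event API of Logstash 5.0 or later (event.get / event.set; older releases used event['field'] instead), that @timestamp holds the start time and stop_timestamp the end time, and elapsed_seconds is a hypothetical field name.

```
filter {
  ruby {
    # Assumption: both values are parsed timestamps set by the date filters
    # in the question. The result is the difference in seconds as a float.
    code => "
      start = event.get('@timestamp')
      stop  = event.get('stop_timestamp')
      event.set('elapsed_seconds', stop.to_f - start.to_f) if start && stop
    "
  }
}
```

For the sample line (21:24:46 to 21:24:49) this would be expected to yield a value of about 3 seconds, with no need to clone the event or keep state between events.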

need custom fields of log through grok filter in logstash

I have logstash, kibana and elasticsearch installed on my system, with this filter configuration:
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    mutate {
      add_field => {
        "timestamp" => "%{TIME} %{MONTH} %{monthday}"
      }
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
and I am receiving output in Kibana, but I need the following fields:
@timestamp
@version
_id
_index
_type
_file
Log Level
Host Name
Host IP
Process Name
Response Time
I tried adding a timestamp field, but it prints the literal string instead of the dynamic value.
You're confusing patterns with fields.
A pattern is a short-hand notation that represents a regular expression, such as %{WORD} as a shortcut for "\b\w+\b".
A field is where data - including information matched by patterns - is stored. It's possible to put a pattern into a field like this: %{WORD:my_field}
In your grok{}, you match with: %{SYSLOGTIMESTAMP:syslog_timestamp}, which puts everything that was matched into a single field called syslog_timestamp. This is the month, monthday, and time seen at the front of syslog messages.
Even though SYSLOGTIMESTAMP is itself defined as "%{MONTH} +%{MONTHDAY} %{TIME}", they don't have that ":name" syntax, so no fields are created for MONTH, MONTHDAY, and TIME.
Assuming that you really do want to make a new field in the format you describe, you'd need to either:
1. make a new pattern to replace all of SYSLOGTIMESTAMP that would make fields out of the pieces of information.
2. use the existing pattern to create the syslog_timestamp field as you're doing, and then grok{} that with a simple pattern to split it apart.
I'd recommend #2, so you'd end up with something like this:
grok {
  match => { "syslog_timestamp" => "%{MONTH:month} +%{MONTHDAY:monthday} %{TIME:time}" }
}
That should do it.
Please note that your new field will be a string, so it won't be of any use in range queries, etc. You should use the date{} filter to replace @timestamp with your syslog_timestamp information.
Good luck.
