Logstash: Add field from grok filter

Is it possible to match a message to a new field in logstash using grok and mutate?
Example log:
"<30>Dec 19 11:37:56 7f87c507df2a[20103]: [INFO] 2018-12-19 16:37:56 _internal (MainThread): 192.168.0.6 - - [19/Dec/2018 16:37:56] \"\u001b[37mGET / HTTP/1.1\u001b[0m\" 200 -\r"
I am trying to create a new key/value pair that maps container_id to 7f87c507df2a.
filter {
  grok {
    match => [ "message", "%{SYSLOG5424PRI}%{NONNEGINT:ver} +(?:%{TIMESTAMP_ISO8601:ts}|-) +(?:%{HOSTNAME:service}|-) +(?:%{NOTSPACE:containerName}|-) +(?:%{NOTSPACE:proc}|-) +(?:%{WORD:msgid}|-) +(?:%{SYSLOG5424SD:sd}|-|) +%{GREEDYDATA:msg}" ]
  }
  mutate {
    add_field => { "container_id" => "%{containerName}" }
  }
}
The resulting log file renders this, where the value of containerName isn't resolved from the grok capture; it is just a string literal:
"container_id": "%{containerName}"
I am trying to have the conf create:
"container_id": "7f87c507df2a"
Clearly the value of containerName isn't being resolved from grok. Is what I want to do even possible?

As explained in the comments, my grok pattern was incorrect. For anyone who wanders toward this post needing help with grok, use a grok debugger to make building your pattern less time consuming.
Here was the working configuration:
filter {
  grok {
    match => [ "message", "\A%{SYSLOG5424PRI}%{SYSLOGTIMESTAMP}%{SPACE}%{BASE16NUM:docker_id}%{SYSLOG5424SD}%{GREEDYDATA:python_log_message}" ]
    add_field => { "container_id" => "%{docker_id}" }
  }
}
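As a side note, the add_field step isn't strictly needed: the capture can be named container_id directly in the pattern. A minimal sketch of that variant, assuming the same sample message:
filter {
  grok {
    # Name the BASE16NUM capture container_id directly instead of docker_id
    match => [ "message", "\A%{SYSLOG5424PRI}%{SYSLOGTIMESTAMP}%{SPACE}%{BASE16NUM:container_id}%{SYSLOG5424SD}%{GREEDYDATA:python_log_message}" ]
  }
}
With the example log above, this should yield "container_id": "7f87c507df2a" without the intermediate docker_id field.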

Related

Kibana does not show fields from grok filter in filebeat

I have a log file with Apache logs that I want to show in Kibana.
The logs start with an IP address. I have debugged my pattern and it matches.
I'm trying to add fields in the Beats input configuration file, but they are not shown in Kibana even after refreshing the fields.
Here is the configuration file
filter {
  if [type] == "apache" {
    grok {
      match => { "message" => "%{HOST:log_host}%{GREEDYDATA:remaining}" }
      add_field => { "testip" => "%{log_host}" }
      add_field => { "data_left" => "%{remaining}" }
    }
  }
...
Just to add: I have restarted all the services (Logstash, Elasticsearch, Kibana) after applying the new configuration.
The issue could be that your grok pattern is too rigid.
Chances are that HOST should be IPORHOST, judging by the name of your testip field.
Assuming that the data is actually coming in with the type defined as apache, then it should be:
filter {
  if [type] == "apache" {
    grok {
      match => {
        message => "%{IPORHOST:log_host}%{GREEDYDATA:remaining}"
      }
      add_field => {
        testip => "%{log_host}"
        data_left => "%{remaining}"
      }
    }
  }
}
Having said that, your usage of add_field is completely unnecessary. The grok pattern itself is creating two fields: log_host and remaining, so there's no need to define extra fields called testip and data_left.
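In other words, a trimmed sketch of the same filter without the redundant fields would be:
filter {
  if [type] == "apache" {
    grok {
      # log_host and remaining are created by the pattern itself
      match => { "message" => "%{IPORHOST:log_host}%{GREEDYDATA:remaining}" }
    }
  }
}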
Perhaps even more usefully, you don't need to craft your own Apache web log grok pattern. The COMBINEDAPACHELOG pattern already exists, which gives all of the standard fields automatically.
filter {
  if [type] == "apache" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    # Set @timestamp to the log's time and drop the unneeded timestamp
    date {
      match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
      remove_field => "timestamp"
    }
  }
}
You can see a more complete example of this in the Logstash documentation.

Logstash grok plugin, add field when matched

I have a grok match like this:
grok { match => [ "message", "Duration: %{NUMBER:duration}", "Speed: %{NUMBER:speed}" ] }
I also want to add another field to the captured variables if it matches a grok pattern. I know I can use the mutate plugin with if/else conditionals to add new fields, but I have too many matches and it would get too long that way. As an example, for the given texts I want to capture the fields on the right:
"Duration: 12" => [duration: "12", type: "duration_type"]
"Speed: 12" => [speed: "12", type: "speed_type"]
Is there a way to do this?
I am not 100% sure if that is what you need, but I did something similar. I have a basic parsing for my message, and then I analyse a specific field additionally with optional matches.
grok {
  break_on_match => false
  patterns_dir => "/etc/logstash/conf.d/patterns"
  match => {
    "message" => "\[%{LOGLEVEL:level}\] \[%{IPORHOST:from}\] %{TIMESTAMP_ISO8601:timestamp} \[%{DATA:thread}\] \[%{NOTSPACE:logger}\] %{GREEDYDATA:msg}"
    "thread" => "(%{GREEDYDATA}%{REQUEST_TYPE:reqType}%{SPACE}%{URIPATH:reqPath}(%{URIPARAM:reqParam})?)?"
  }
}
As you can see, the first pattern simply matches the complete message. I have a field thread that is basically the logger information. However, in my setup, HTTP requests append some info to the thread name. In those cases, I want to optionally match that as well.
With the above setup, the fields reqType, reqPath, and reqParam are only created if thread matches them. Otherwise, they aren't created.
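One caveat: REQUEST_TYPE is not a stock grok pattern; it comes from the patterns_dir referenced above. Its actual definition isn't shown in the answer, but an entry in a pattern file would look something like this (an assumed definition, purely for illustration):
# contents of a file in /etc/logstash/conf.d/patterns (hypothetical)
REQUEST_TYPE (GET|POST|PUT|DELETE|HEAD|OPTIONS|PATCH)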
I hope this is what you wanted.
Thanks,
Artur
Something like this?
filter {
  grok { match => [ "message", "%{GREEDYDATA:types}: %{NUMBER:value}" ] }
  mutate {
    lowercase => [ "types" ]
    add_field => {
      "%{types}" => "%{value}"
      "type" => "%{types}_type"
    }
    remove_field => [ "value", "types" ]
  }
}
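For the two sample messages, this should produce roughly the output asked for (assuming the grok match succeeds): "Duration: 12" becomes duration: "12" with type: "duration_type", and "Speed: 12" becomes speed: "12" with type: "speed_type". The trick is that add_field keys are sprintf-expanded just like values, so "%{types}" in the key position names the field dynamically.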

logstash: how to extract data from a log4j message?

I am trying to extract data from my log4j messages with Logstash.
The message looks like this:
Method findAll - Start by : bokc
I would like to extract the method name : "findAll" and the user "bokc".
How can I do this?
I use Logstash 1.5.2 and my config is:
input {
  log4j {
    mode => "server"
    type => "log4j-artemis"
    port => 4560
  }
}
filter {
  multiline {
    type => "log4j-artemis"
    pattern => "^\\s"
    what => "previous"
  }
  mutate {
    add_field => [ "source_ip", "%{host}" ]
  }
}
Use a grok filter:
filter {
  grok {
    match => [
      "message",
      "^Method %{WORD:method} - Start by : %{USER:user}"
    ]
    tag_on_failure => []
  }
}
This extracts the two words into the fields "method" and "user". Setting tag_on_failure to an empty array makes sure that non-matching messages aren't tagged with _grokparsefailure. Since most messages aren't supposed to match the pattern, it doesn't make sense to mark them as failures.
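For the sample line above, the event should end up with method: "findAll" and user: "bokc".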

logstash generate @timestamp from parsed message

I have file containing series of such messages:
component+branch.job 2014-09-04_21:24:46 2014-09-04_21:24:49
It is a string, some whitespace, the first date and time, more whitespace, and the second date and time. Currently I'm using this filter:
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{WORD:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
}
I would like to convert dateStart and timeStart into @timestamp for that message.
I found that there is a date filter, but I don't know how to use it on two separate fields.
I have also tried something like this as a filter:
date {
  match => [ "message", "YYYY-MM-dd_HH:mm:ss" ]
}
but it didn't work as expected.
Based on the duplicate suggested by Magnus Bäck, I created a solution for my problem. The solution was to mutate the parsed data into one field:
mutate {
  add_field => { "tmp_start_timestamp" => "20%{dateStart}_%{timeStart}" }
}
and then parse it as I suggested in my question.
So final solution looks like this:
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{DATA:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
  mutate {
    add_field => { "tmp_start_timestamp" => "20%{dateStart}_%{timeStart}" }
  }
  date {
    match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  }
}
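If you don't want the helper field kept on the event, the date filter can also drop it; remove_field, like other common filter options, is only applied when the filter succeeds. A small sketch of that variant:
date {
  match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  # only removed if the date parse succeeded
  remove_field => [ "tmp_start_timestamp" ]
}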

need custom fields of log through grok filter in logstash

I have Logstash, Kibana, and Elasticsearch installed on my system, with this filter configuration:
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    mutate {
      add_field => {
        "timestamp" => "%{TIME} %{MONTH} %{monthday}"
      }
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
and I am receiving output in Kibana (screenshot not included), but I need some fields which are as follows:
@timestamp
@version
_id
_index
_type
_file
Log Level
Host Name
Host IP
Process Name
Response Time
I tried adding the timestamp, but it prints the same string instead of a dynamic result.
You're confusing patterns with fields.
A pattern is a short-hand notation that represents a regular expression, such as %{WORD} as a shortcut for "\b\w+\b".
A field is where data - including information matched by patterns - is stored. It's possible to put a pattern into a field like this: %{WORD:my_field}
In your grok{}, you match with: %{SYSLOGTIMESTAMP:syslog_timestamp}, which puts everything that was matched into a single field called syslog_timestamp. This is the month, monthday, and time seen at the front of syslog messages.
Even though SYSLOGTIMESTAMP is itself defined as "%{MONTH} +%{MONTHDAY} %{TIME}", those inner patterns don't have the ":name" syntax, so no fields are created for MONTH, MONTHDAY, and TIME.
Assuming that you really do want to make a new field in the format you describe, you'd need to either:
1. make a new pattern to replace all of SYSLOGTIMESTAMP that would make fields out of the pieces of information, or
2. use the existing pattern to create the syslog_timestamp field as you're doing, and then grok{} that with a simple pattern to split it apart.
I'd recommend #2, so you'd end up with something like this:
grok {
  match => { "syslog_timestamp" => "%{MONTH:month} +%{MONTHDAY:monthday} %{TIME:time}" }
}
That should do it.
Please note that your field will be a string, so it won't be of any use in range queries, etc. You should use the date{} filter to replace @timestamp with your syslog_timestamp information.
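Your configuration already contains a date filter doing exactly that; as a sketch, pointed at the raw syslog field:
date {
  # the double space in the first format accommodates single-digit days
  match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
}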
Good luck.
