Facing errors in logstash - logstash-grok

When I define a pattern to parse Apache Tomcat and application log files in Logstash, I get the following error.
A sample log line is:
2014-08-20 12:35:26,037 INFO [routerMessageListener-74] PoolableRuleEngineFactory Executing the rule -->ECE Tagging Rule
The config file is:
filter {
  grok {
    type => "log4j"
    #pattern => "%{TIMESTAMP_ISO8601:logdate} %{LOGLEVEL:severity} \[\w+\[%{GREEDYDATA:thread},.*\]\] %{JAVACLASS:class} - %{GREEDYDATA:message}"
    pattern => "%{TIMESTAMP_ISO8601:logdate}"
    #add_tag => [ "level_%{level}" ]
  }
  date {
    match => [ "logdate", "YYYY-MM-dd HH:mm:ss,SSS" ]
  }
}
The error is:
Unknown setting 'timestamp' for date {:level=>:error}

Your post does not show a 'timestamp' setting for your date filter. I suspect you started with the example here, which used the timestamp setting from older versions of the date filter. You correctly switched to the match setting used by newer versions of Logstash, but perhaps had not saved your change. I have no problems using the above filter with logstash-1.5.3.
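For contrast, the two styles would look roughly like this. The exact shape of the old setting is an assumption reconstructed from the error message and the note above about older releases; the current form is the one from the filter above.
# old style (assumed) -- produces "Unknown setting 'timestamp'" on newer releases
date {
  timestamp => "YYYY-MM-dd HH:mm:ss,SSS"
}
# current style
date {
  match => [ "logdate", "YYYY-MM-dd HH:mm:ss,SSS" ]
}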
Here is my complete config file. Note that I am still testing it, but it seems to work for importing a JBoss log with Log4j messages read from an existing log file.
input {
  tcp {
    type => "log4j"
    port => 4560
  }
  stdin {
    type => "log4j"
  }
}
filter {
  grok {
    type => "log4j"
    #pattern => "%{TIMESTAMP_ISO8601:logdate} %{LOGLEVEL:severity} \[\w+\[%{GREEDYDATA:thread},.*\]\] %{JAVACLASS:class} - %{GREEDYDATA:message}"
    pattern => "%{TIMESTAMP_ISO8601:logdate}"
    #add_tag => [ "level_%{level}" ]
  }
  date {
    type => "log4j"
    match => [ "logdate", "YYYY-MM-dd HH:mm:ss,SSS" ]
    exclude_tags => "_grokparsefailure"
  }
  # Catches normal space-indented lines; probably could be removed because the other multiline should do everything we need
  multiline {
    type => "log4j"
    tags => ["_grokparsefailure"] # exclude anything we already handled
    pattern => ".*"
    what => "previous"
    add_tag => "notgrok"
  }
}
output {
  gelf {
    host => "localhost"
    custom_fields => ["environment", "PROD", "service", "BestServiceInTheWorld"]
  }
  # Print each event to stdout.
  stdout {
    codec => json
  }
}
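Fed the sample line from the question via stdin, this config should produce an event roughly like the following (a sketch of the json/stdout output; the host value and exact @timestamp depend on your machine and timezone, and GELF-specific fields are omitted):
{
       "message" => "2014-08-20 12:35:26,037 INFO [routerMessageListener-74] PoolableRuleEngineFactory Executing the rule -->ECE Tagging Rule",
      "@version" => "1",
    "@timestamp" => "2014-08-20T12:35:26.037Z",
          "type" => "log4j",
          "host" => "your-host",
       "logdate" => "2014-08-20 12:35:26,037"
}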

Related

Fields parsed from log path not added in logstash

I'm parsing multiple log files with Logstash and want to add fields to my output based on the paths of the files. Here are the relevant parts of the config file:
input {
  file {
    path => "/mnt/logs/**/console-20200108*.log"
    type => "tomcat"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  if [type] == "tomcat" {
    grok {
      patterns_dir => "/usr/share/logstash/patterns"
      match => {
        "message" => [ "%{TOMCAT_LOG_1}", "%{TOMCAT_LOG_2}" ]
        "path" => "\/mnt\/logs\/%{DATA:site}\/%{DATA:version}\/node%{NUMBER:node}\/store\/tomcat\/%{DATA:file}\.log"
      }
    }
  }
}
Here's a sample output:
{
         "level" => "INFO",
          "type" => "tomcat",
          "data" => "Finished indexer cronjob.\r",
     "timestamp" => "08-Jan-2020 11:00:05.860",
    "qualifier1" => "[update-backofficeIndex-CronJob::ServicelayerJob]",
    "@timestamp" => 2020-01-08T11:04:47.364Z,
          "path" => "/mnt/logs/protec/qa/node1/store/tomcat/console-20200108.log",
    "qualifier3" => "[SolrIndexerJob]",
          "host" => "elk",
      "@version" => "1",
       "message" => "INFO | jvm 1 | srvmain | 08-Jan-2020 11:00:05.860 INFO [update-backofficeIndex-CronJob::ServicelayerJob] (update-backofficeIndex-CronJob) [SolrIndexerJob] Finished indexer cronjob.\r",
    "qualifier2" => "(update-backofficeIndex-CronJob)"
}
Based on this, I was expecting to get the relevant fields from parsing the message and a few more fields from parsing the path. Yet none of the fields from parsing "path" are added.
What am I missing? How do I add site, version, node and file fields?
Creating an answer from a comment that solved the problem (hence community wiki).
Apparently it comes from the break_on_match option, which defaults to true. From the docs:
The first successful match by grok will result in the filter being finished. So if %{TOMCAT_LOG_1} or %{TOMCAT_LOG_2} matches, it won't try to match the path field.
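A minimal sketch of the fix, reusing the custom TOMCAT_LOG_1/TOMCAT_LOG_2 patterns from the question: either set break_on_match => false on the existing grok, or split the path match into its own grok block so it always runs.
filter {
  if [type] == "tomcat" {
    grok {
      patterns_dir => "/usr/share/logstash/patterns"
      match => { "message" => [ "%{TOMCAT_LOG_1}", "%{TOMCAT_LOG_2}" ] }
    }
    # Separate grok so the path is parsed regardless of which message pattern matched
    grok {
      match => { "path" => "\/mnt\/logs\/%{DATA:site}\/%{DATA:version}\/node%{NUMBER:node}\/store\/tomcat\/%{DATA:file}\.log" }
    }
  }
}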

Kibana does not show fields from grok filter in filebeat

I have a log file with Apache logs that I want to show in Kibana.
The logs start with an IP address. I have debugged my pattern and it passes.
I'm trying to add fields in the Beats input configuration file, but they are not shown in Kibana even after refreshing the fields.
Here is the configuration file:
filter {
  if [type] == "apache" {
    grok {
      match => { "message" => "%{HOST:log_host}%{GREEDYDATA:remaining}" }
      add_field => { "testip" => "%{log_host}" }
      add_field => { "data_left" => "%{remaining}" }
    }
  }
...
Just to add that I have restarted all the services (Logstash, Elasticsearch, Kibana) after the new configuration.
The issue could be that your grok pattern is too rigid.
Chances are that HOST should be IPORHOST, based on your testip field's name.
Assuming that the data is actually coming in with the type defined as apache, then it should be:
filter {
  if [type] == "apache" {
    grok {
      match => {
        "message" => "%{IPORHOST:log_host}%{GREEDYDATA:remaining}"
      }
      add_field => {
        "testip" => "%{log_host}"
        "data_left" => "%{remaining}"
      }
    }
  }
}
Having said that, your usage of add_field is completely unnecessary. The grok pattern itself is creating two fields: log_host and remaining, so there's no need to define extra fields called testip and data_left.
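In other words, the filter above could be trimmed down to just the match (a sketch based on the snippet above):
filter {
  if [type] == "apache" {
    grok {
      # grok itself creates log_host and remaining; no add_field needed
      match => { "message" => "%{IPORHOST:log_host}%{GREEDYDATA:remaining}" }
    }
  }
}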
Perhaps even more usefully, you don't need to craft your own Apache web log grok pattern. The COMBINEDAPACHELOG pattern already exists, which gives all of the standard fields automatically.
filter {
  if [type] == "apache" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    # Set @timestamp to the log's time and drop the unneeded timestamp field
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
      remove_field => "timestamp"
    }
  }
}
You can see this in a more complete example in the Logstash documentation here.
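As a rough illustration of what %{COMBINEDAPACHELOG} captures (the field names come from the stock grok pattern set and may differ slightly between versions), a combined-format line would yield something like:
# 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08"
# becomes, roughly:
#   clientip    => "127.0.0.1"
#   auth        => "frank"
#   timestamp   => "10/Oct/2000:13:55:36 -0700"
#   verb        => "GET"
#   request     => "/apache_pb.gif"
#   httpversion => "1.0"
#   response    => "200"
#   bytes       => "2326"
#   referrer    => "\"http://www.example.com/start.html\""
#   agent       => "\"Mozilla/4.08\""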

logstash : how to extract data from log4j message?

I'm trying to extract data from my Log4j messages with Logstash.
The messages look like this:
Method findAll - Start by : bokc
I would like to extract the method name ("findAll") and the user ("bokc").
How can I do this?
I use Logstash 1.5.2 and my config is:
input {
  log4j {
    mode => "server"
    type => "log4j-artemis"
    port => 4560
  }
}
filter {
  multiline {
    type => "log4j-artemis"
    pattern => "^\\s"
    what => "previous"
  }
  mutate {
    add_field => [ "source_ip", "%{host}" ]
  }
}
Use a grok filter:
filter {
  grok {
    match => [
      "message",
      "^Method %{WORD:method} - Start by : %{USER:user}"
    ]
    tag_on_failure => []
  }
}
This extracts the two words into the fields "method" and "user". The setting of tag_on_failure makes sure that non-matching messages aren't tagged with _grokparsefailure. Since most messages aren't supposed to match the pattern it doesn't make sense to mark them as failures.
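With the sample line above, the resulting event would then contain roughly this (a sketch of rubydebug output; other fields omitted):
{
    "message" => "Method findAll - Start by : bokc",
     "method" => "findAll",
       "user" => "bokc"
}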

logstash output not showing the desired timestamp

I am trying to get the desired timestamp format in the Logstash output. I can't get it if I use this format in syslog.
Please share your thoughts about converting to the other format that's in the _source field, like the Yyyy-mm-ddThh:mm:ss.sssZ format?
filter {
  grok {
    match => [ "logdate", "Yyyy-mm-ddThh:mm:ss.sssZ" ]
    overwrite => ["host", "message"]
  }
_source: {
message: "activity_log: {"created_at":1421114642210,"actor_ip":"192.168.1.1","note":"From system","user":"4561c9d7aaa9705a25f66d","user_id":null,"actor":"4561c9d7aaa9705a25f66d","actor_id":null,"org_id":null,"action":"user.failed_login","data":{"transaction_id":"d6768c473e366594","name":"user.failed_login","timing":{"start":1422127860691,"end":14288720480691,"duration":0.00257},"actor_locatio
I am using this code in the syslog config file:
filter {
  if [message] =~ /^activity_log: / {
    grok {
      match => ["message", "^activity_log: %{GREEDYDATA:json_message}"]
    }
    json {
      source => "json_message"
      remove_field => "json_message"
    }
    date {
      match => ["created_at", "UNIX_MS"]
    }
    mutate {
      rename => ["[json][repo]", "repo"]
      remove_field => "json"
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
thanks
"message" => "<134>feb 1 20:06:12 {\"created_at\":1422765535789, pid=5450 tid=28643 version=b0b45ac proto=http ip=192.168.1.1 duration_ms=0.165809 fs_sent=0 fs_recv=0 client_recv=386 client_sent=0 log_level=INFO msg=\"http op done: (401)\" code=401" }
"#version" => "1",
"#timestamp" => "2015-02-01T20:06:12.726Z",
"type" => "activity_log",
"host" => "192.168.1.1"
The pattern in your grok filter doesn't make sense. You're using a Joda-Time pattern (normally used for the date filter) and not a grok pattern.
It seems your message field contains a JSON object. That's good, because it makes it easy to parse. Extract the part that comes after "activity_log: " to a temporary json_message field,
grok {
  match => ["message", "^activity_log: %{GREEDYDATA:json_message}"]
}
and parse that field as JSON with the json filter (removing the temporary field if the operation was successful):
json {
  source => "json_message"
  remove_field => ["json_message"]
}
Now you should have the fields from the original message field at the top level of your event, including the created_at field with the timestamp you want to extract. That number is the number of milliseconds since the epoch, so you can use the UNIX_MS pattern in a date filter to extract it into @timestamp:
date {
  match => ["created_at", "UNIX_MS"]
}
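As a quick worked example (using the created_at value from the _source snippet above; the conversion is to the millisecond):
# created_at = 1421114642210 milliseconds since the epoch
# 1421114642210 ms = 1421114642.210 s  ->  2015-01-13T02:04:02.210Z (UTC)
# so after the date filter: "@timestamp" => "2015-01-13T02:04:02.210Z"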

How to remove trailing newline from message field

I am shipping GlassFish 4 log files with Logstash to an Elasticsearch sink. How can I remove the trailing newline from a message field with Logstash?
My event looks like this:
{
    "@timestamp" => "2013-11-21T13:29:33.081Z",
       "message" => "[2013-11-21T13:29:32.577+0000] [glassfish 4.0] [INFO] [] [javax.resourceadapter.mqjmsra.lifecycle] [tid: _ThreadID=142 _ThreadName=Thread-43] [timeMillis: 1385040572577] [levelValue: 800] [[\n MQJMSRA_RA1101: GlassFish MQ JMS Resource Adapter stopped.]]\n",
      "@version" => "1",
          "tags" => ["multiline", "date_filtered"],
          "host" => "myhost",
          "path" => "../server.log"
}
A second solution is to use Logstash's mutate filter, which allows you to strip leading and trailing whitespace from the value of a field.
filter {
  # Remove leading and trailing whitespace (including newlines)
  mutate {
    strip => "message"
  }
}
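Applied to the event above, the effect is roughly this (a sketch; the other fields are unchanged):
# before: "message" => "[2013-11-21T13:29:32.577+0000] ... stopped.]]\n"
# after:  "message" => "[2013-11-21T13:29:32.577+0000] ... stopped.]]"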
You have to use the multiline filter with the correct pattern to tell Logstash that every line with preceding whitespace belongs to the line before. Add these lines to your conf file.
filter {
  ...
  multiline {
    type => "gflogs"
    pattern => "\[\#\|\d{4}"
    negate => true
    what => "previous"
  }
  ...
}
You can also include the grok plugin to handle the timestamp and keep irregular lines from being indexed.
Here is the complete stack with a single Logstash instance on the same machine:
input {
  stdin {
    type => "stdin-type"
  }
  file {
    path => "/path/to/glassfish/logs/*.log"
    type => "gflogs"
  }
}
filter {
  multiline {
    type => "gflogs"
    pattern => "\[\#\|\d{4}"
    negate => true
    what => "previous"
  }
  grok {
    type => "gflogs"
    pattern => "(?m)\[\#\|%{TIMESTAMP_ISO8601:timestamp}\|%{LOGLEVEL:loglevel}\|%{DATA:server_version}\|%{JAVACLASS:category}\|%{DATA:kv}\|%{DATA:message}\|\#\]"
    named_captures_only => true
    singles => true
  }
  date {
    type => "gflogs"
    match => [ "timestamp", "ISO8601" ]
  }
  kv {
    type => "gflogs"
    exclude_tags => "_grokparsefailure"
    source => "kv"
    field_split => ";"
    value_split => "="
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch { embedded => true }
}
This worked for me. Please also look at this post on the logstash-users group. I can also recommend the great and up-to-date Logstash book; it's also a good way to support the work of the Logstash author.
Hope to see you at a JUG Berlin event!
