I'm having trouble configuring Logstash properly. There are two lines in the postfix logs that I care about:
Jun 14 09:06:22 devmailforwarder postfix/smtp[1994]: A03CA9F532: to=<person@gmail.com>, relay=server[0.0.0.0]:25, delay=0.02, delays=0.01/0.01/0/0.01, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as A0B4D5C49)
Jun 14 09:15:04 devmailforwarder postfix/cleanup[2023]: 0E1969F533: warning: header Subject: subjectline from server[0.0.0.0]; from=<from@gmail.com> to=<to@gmail.com> proto=SMTP helo=<server>
My grok filter patterns are:
POSTFIX_QUEUEID ([0-9A-F]{6,}|[0-9a-zA-Z]{15,})
POSTFIX_STATUS (?<=status=)(.*)(?= \()
POSTFIX_PROCESS (?=postfix\/)(.*?\[)(.*?)(?=: )
POSTFIX_TO (?<=to=<)(.*?)(?=>,)
POSTFIX_RELAY (?<=relay=)(.*?)(?=,)
POSTFIX_SUBJECT (?<=Subject: )(.*)(?= from )
SMTP ^%{SYSLOGTIMESTAMP:timestamp}%{SPACE}%{DATA:hostname}%{SPACE}%{POSTFIX_PROCESS:process}%{GREEDYDATA}%{POSTFIX_QUEUEID:queueid}%{GREEDYDATA}%{POSTFIX_TO:to}%{GREEDYDATA}%{POSTFIX_RELAY:relay}%{GREEDYDATA}%{POSTFIX_STATUS:status}%{SPACE}%{GREEDYDATA:response}
CLEANUP ^%{SYSLOGTIMESTAMP:timestamp}%{SPACE}%{DATA:hostname}%{SPACE}%{POSTFIX_PROCESS:process}:%{SPACE}%{POSTFIX_QUEUEID:queueid}%{GREEDYDATA}%{POSTFIX_SUBJECT:subject}%{GREEDYDATA:something2}
My (non-working) Logstash config is:
input {
file {
path => "/var/log/mail.log*"
exclude => "*.gz"
start_position => "beginning"
type => "postfix"
}
}
filter {
grok {
patterns_dir => ["/etc/logstash/conf.d/patterns"]
match => { "message" => ["%{SMTP}", "%{SUBJECT}"] }
}
if "_grokparsefailure" in [tags] {
drop {}
}
mutate {
add_field => { "logstashSource" => "source-server" }
}
aggregate {
task_id => "%{POSTFIX_QUEUEID}"
code => "
map['to'] ||= event.get('to')
map['from'] ||= event.get('from')
map['relay'] ||= event.get('relay')
map['status'] ||= event.get('status')
map['response'] ||= event.get('response')
map['from'] ||= event.get('timestamp')
map['relay'] ||= event.get('hostname')
map['status'] ||= event.get('process')
map['response'] ||= event.get('queueid')
map['subject'] ||= event.get('subject')
"
map_action => "create_or_update"
push_previous_map_as_event => true
timeout => 2
timeout_tags => ['aggregated']
}
}
output {
if [type] == "postfix" {
file {
path => "/var/log/logstash/postfix.log"
}
}
}
My goal is to end up with one Elasticsearch document per message, with every field populated. The cleanup messages always appear first in the logs, and the two lines share a unique queue ID. I'm struggling with getting the aggregate piece working.
Solved it. Config below. Also needed to update the logstash.yml to add
pipeline.workers: 1
filter {
grok {
patterns_dir => ["/etc/logstash/conf.d/patterns"]
match => { "message" => ["%{SMTP}", "%{SUBJECT}", "%{CONNECTION}"] }
}
if "_grokparsefailure" in [tags] {
drop {}
}
mutate {
add_field => { "logstashSource" => "logstash-server-name" }
}
if ("" in [queueid]) {
aggregate {
task_id => "%{queueid}"
code => "
map['to'] ||= event.get('to')
map['from'] ||= event.get('from')
map['relay'] ||= event.get('relay')
map['status'] ||= event.get('status')
map['response'] ||= event.get('response')
map['from'] ||= event.get('timestamp')
map['relay'] ||= event.get('hostname')
map['status'] ||= event.get('status')
map['subject'] ||= event.get('subject')
map['queueid'] ||= event.get('queueid')
"
timeout => 2
timeout_tags => ['aggregated']
map_action => 'create_or_update'
push_map_as_event_on_timeout => true
}
}
}
output {
if ("aggregated" in [tags] or "" in [connection])
{
elasticsearch {
index => "postfix-%{+YYYY.MM.dd}"
hosts => "your-es-host-here"
}
file {
path => "/var/log/logstash/postfix.log"
}
}
}
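For anyone wondering about the pipeline.workers change: the aggregate filter only works reliably with a single pipeline worker, otherwise the cleanup and smtp events for the same queue ID can be processed out of order (the aggregate plugin documentation recommends the same). A minimal sketch of the setting, assuming the default config path:
# /etc/logstash/logstash.yml (default path assumed)
pipeline.workers: 1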
I've got a similar problem. In this case I see a PatternError:
Pipeline error {:pipeline_id=>"main", :exception=>#<Grok::PatternError: pattern %{SUBJECT} not defined>, :backtrace=>["/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/jls-grok-0.11.5/lib/grok-pure.rb:123:in `block in compile'
Could you post your updated code again? The yml and the patterns, please.
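For what it's worth, that PatternError usually means a name referenced in the match line does not exist in any file under patterns_dir: the accepted config references %{SUBJECT} and %{CONNECTION}, but the patterns posted at the top of this thread only define SMTP and CLEANUP (plus the POSTFIX_* helpers). A minimal variant of the grok block that only uses the patterns actually shown here would be:
filter {
  grok {
    patterns_dir => ["/etc/logstash/conf.d/patterns"]
    # Only names defined in the patterns_dir files can be referenced below;
    # SUBJECT and CONNECTION would each need their own entry in the patterns file.
    match => { "message" => ["%{SMTP}", "%{CLEANUP}"] }
  }
}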
Related
Logstash 7.8.1
I'm trying to create two documents from one input with Logstash: different templates, different output indexes. Everything worked fine until I tried to change a value only on the cloned doc.
I need one field to exist in both documents but with different values - is that possible with the clone filter plugin?
Doc A - [test][event]- trn
Doc B (cloned doc) - [test][event]- spn
I thought it would work if I used remove_field and then add_field in the clone plugin, but there seems to be an ordering problem - maybe remove_field is applied after add_field (the field was only removed, not re-added with the new value).
Next I tried adding the value to the cloned document first and then to the original, but that always produced an array with both values (original and cloned), and I need only one value in that field.
Can someone help me please?
Config:
input {
file {
path => "/opt/test.log"
start_position => beginning
}
}
filter {
grok {
match => {"message" => "... grok...."
}
}
mutate {
add_field => {"[test][event]" => "trn"}
}
clone {
clones => ["cloned"]
#remove_field => [ "[test][event]" ] #remove the field completely
add_field => {"[test][event]" => "spn"} #not added
add_tag => [ "spn" ]
}
}
output {
if "spn" in [tags] {
elasticsearch {
index => "spn-%{+yyyy.MM}"
hosts => ["localhost:9200"]
template_name => "templ1"
}
stdout { codec => rubydebug }
} else {
elasticsearch {
index => "trn-%{+yyyy.MM}"
hosts => ["localhost:9200"]
template_name => "templ2"
}
stdout { codec => rubydebug }
}
}
If you want to make the added field conditional on whether the event is the clone or the original, check the [type] field.
clone { clones => ["cloned"] }
if [type] == "cloned" {
mutate { add_field => { "foo" => "spn" } }
} else {
mutate { add_field => { "foo" => "trn" } }
}
add_field is always done before remove_field.
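Applied to the [test][event] field and the spn tag from the question, a minimal sketch (untested) could look like this:
filter {
  grok {
    match => { "message" => "... grok ..." }
  }
  clone {
    clones => ["cloned"]
  }
  if [type] == "cloned" {
    # the clone filter sets [type] to the clone name, so this branch only touches the copy
    mutate {
      add_field => { "[test][event]" => "spn" }
      add_tag => [ "spn" ]
    }
  } else {
    mutate {
      add_field => { "[test][event]" => "trn" }
    }
  }
}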
I am new to logstash and grok and I am trying to parse AWS ECS logs in an S3 bucket in the following format -
File Name - my-logs-s3-bucket/3d265ee3-d2ee-4029-a3d9-fd2255d69b92/ecs-fargate-container-8ff0e472-c76f-4f61-a363-64c2b80aa842/000000.gz
Sample Lines -
2019-05-09T16:16:16.983Z JBoss Bootstrap Environment
2019-05-09T16:16:16.983Z JBOSS_HOME: /app/jboss
2019-05-09T16:16:16.983Z JAVA_OPTS: -server -XX:+UseCompressedOops -Djboss.server.log.dir=/var/log/jboss -Xms128m -Xmx4096m
And logstash.conf
input {
s3 {
region => "us-east-1"
bucket => "my-logs-s3-bucket"
interval => "7200"
}
}
filter {
grok {
match => ["message", "%{TIMESTAMP_ISO8601:tstamp}"]
}
date {
match => ["tstamp", "ISO8601"]
}
mutate {
remove_field => ["tstamp"]
add_field => {
"file" => "%{[#metadata][s3][key]}"
}
######### NEED HELP HERE - START #########
#grok {
# match => [ "file", "ecs-fargate-container-%{DATA:containerlogname}"]
#}
######### NEED HELP HERE - END #########
}
}
output {
stdout { codec => rubydebug {
#metadata => true
}
}
}
When I run Logstash with the above configuration, I can see all the logs parsed and the file name extracted; the file name in the output looks like below -
"file" => "myapp-logs/3d265ee3-d2ee-4029-a3d9-fd2255d69b92/ecs-fargate-container-8ff0e472-c76f-4f61-a363-64c2b80aa842/000000.gz",
I am trying to use grok to extract the file name as either ecs-fargate-container-8ff0e472-c76f-4f61-a363-64c2b80aa842 or 8ff0e472-c76f-4f61-a363-64c2b80aa842 by uncommenting the grok config lines between the NEED HELP HERE - START and END markers, but it ends with the error below -
Expected one of #, => at line 21, column 10 (byte 536) after filter {\n grok {\n match => [\"message\", \"%{TIMESTAMP_ISO8601:tstamp}\"]\n }\n date {\n match => [\"tstamp\", \"ISO8601\"]\n }\n mutate {\n #remove_field => [\"tstamp\"]\n add_field => {\n \"file\" => \"%{[@metadata][s3][key]}\"\n }\n grok ", :
I am not sure where I am going wrong with this. Please advise.
Your grok filter was inside the mutate filter; try the following.
filter {
grok {
match => ["message", "%{TIMESTAMP_ISO8601:tstamp}"]
}
date {
match => ["tstamp", "ISO8601"]
}
mutate {
remove_field => ["tstamp"]
add_field => { "file" => "%{[#metadata][s3][key]}" }
}
grok {
match => [ "file", "ecs-fargate-container-%{DATA:containerlogname}"]
}
}
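One caveat, as an untested suggestion on top of the answer above: %{DATA} is non-greedy, so with nothing after it the containerlogname capture tends to come out empty. Anchoring the pattern on the trailing slash, or using the stock UUID grok pattern, should pin down the id, for example:
grok {
  # capture everything between "ecs-fargate-container-" and the next "/"
  match => [ "file", "ecs-fargate-container-%{DATA:containerlogname}/" ]
}
or, to capture just the UUID part:
grok {
  match => [ "file", "ecs-fargate-container-%{UUID:containerlogname}" ]
}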
I'm building an ELK setup and it's working fine. However, I want to remove certain fields from my system log data while it is processed through Logstash, using remove_field and remove_tag, which I've defined in my Logstash configuration file, but that isn't working.
Looking for any expert advice to correct the config and get it working; thanks very much in advance.
My logstash configuration file:
[root@sandbox-prd~]# cat /etc/logstash/conf.d/syslog.conf
input {
file {
path => [ "/data/SYSTEMS/*/messages.log" ]
start_position => beginning
sincedb_path => "/dev/null"
max_open_files => 64000
type => "sj-syslog"
}
}
filter {
if [type] == "sj-syslog" {
grok {
match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp } %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
add_field => [ "received_at", "%{#timestamp}" ]
remove_field => ["#version", "host", "_type", "_index", "_score", "path"]
remove_tag => ["_grokparsefailure"]
}
syslog_pri { }
date {
match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
}
}
}
output {
if [type] == "sj-syslog" {
elasticsearch {
hosts => "sandbox-prd02:9200"
manage_template => false
index => "sj-syslog-%{+YYYY.MM.dd}"
document_type => "messages"
}
}
}
Data sample appearing on the Kibana Portal
syslog_pid:6662 type:sj-syslog syslog_message:(root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok) syslog_severity:notice syslog_hostname:dbaprod01 syslog_severity_code:5 syslog_timestamp:Feb 11 10:25:02 @timestamp:February 11th 2019, 23:55:02.000 message:Feb 11 10:25:02 dbaprod01 CROND[6662]: (root) CMD (LANG=C LC_ALL=C /usr/bin/mrtg /etc/mrtg/mrtg.cfg --lock-file /var/lock/mrtg/mrtg_l --confcache-file /var/lib/mrtg/mrtg.ok) syslog_facility:user-level syslog_facility_code:1 syslog_program:CROND received_at:February 11th 2019, 10:25:03.353 _id:KpHo2mgBybCgY5IwmRPn _type:messages
_index:sj-syslog-2019.02.11 _score: -
MY Resource Details:
OS version : Linux 7
Logstash Version: 6.5.4
You can't remove _type and _index; those are metadata fields Elasticsearch needs in order to work - they carry the index name and the mapping type of your document. The _score field is also a metadata field, generated at search time; it is not part of your document.
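A short sketch of what can actually be removed on the Logstash side - only fields that exist on the event itself (and note, as a side remark, that add_field/remove_field/remove_tag set on a grok filter are only applied when the grok match succeeds):
mutate {
  # these exist on the Logstash event, so they can be removed before indexing
  remove_field => [ "@version", "host", "path" ]
}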
I am trying to process an Nginx access log with the help of Logstash (version 2.1.3).
Based on the endpoints found in the Nginx access log, I want to send the data to different queues, or sometimes to different RabbitMQ servers.
Here is my Logstash configuration:
input {
stdin {}
}
filter {
grok {
match => { "message" => "(?<status>.*?)!~~!(?<req_tm>.*?)!~~!(?<time>.*?)!~~!(?<req_method>.*?)!~~!(?<req_uri>.*)" }
tag_on_failure => ["first_grok_failed"]
}
if "/endpoint1" in [req_uri] {
mutate { add_field => { "[queue]" => "endpoint_one" } }
mutate { add_field => { "[rmqshost]" => "10.10.10.1" } }
}
else if "/endpoint2" in [req_uri] {
mutate { add_field => { "[queue]" => "endpoint_two" } }
mutate { add_field => { "[rmqshost]" => "10.10.10.2" } }
}
else {
mutate { add_field => { "[queue]" => "endpoint_other" } }
mutate { add_field => { "[rmqshost]" => "10.10.10.3" } }
}
}
output {
rabbitmq {
exchange => "%{[queue]}_exchange"
exchange_type => "direct"
host => "%{[rmqshost]}"
key => "%{[queue]}_key"
password => "mypassword"
user=>"myuser"
vhost=>"myvhost"
durable=>false
}
stdout {
codec => rubydebug
}
}
In the filter section of the above configuration, I add the dynamic fields "queue" and "rmqshost".
In the output section, I tried using those variables inside the rabbitmq plugin block.
I am getting the following error, which shows that the "rmqshost" variable has not been replaced:
Connection to %{[rmqshost]}:5672 refused: host unknown
{:exception=>"MarchHare::ConnectionRefused", :backtrace=>
["/opt/logstash/vendor/bundle/jruby/1.9/gems/march_hare-2.15.0-
java/lib/march_hare/session.rb:473:in `converting_rjc_exceptions_to_ruby'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/march_hare-2.15.0-java/lib/march_hare/session.rb:500:in `new_connection_impl'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/march_hare-2.15.0-java/lib/march_hare/session.rb:136:in `initialize'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/march_hare-2.15.0-java/lib/march_hare/session.rb:109:in `connect'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/march_hare-2.15.0-java/lib/march_hare.rb:20:in `connect'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-mixin-rabbitmq_connection-2.3.0-java/lib/logstash/plugin_mixins/rabbitmq_connection.rb:137:in `connect'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-mixin-rabbitmq_connection-2.3.0-java/lib/logstash/plugin_mixins/rabbitmq_connection.rb:94:in `connect!'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-rabbitmq-3.0.7-java/lib/logstash/outputs/rabbitmq.rb:40:in `register'", "org/jruby/RubyArray.java:1613:in `each'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.3-java/lib/logstash/pipeline.rb:192:in `start_outputs'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.3-java/lib/logstash/pipeline.rb:102:in `run'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.3-java/lib/logstash/agent.rb:165:in `execute'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.3-java/lib/logstash/runner.rb:90:in `run'", "org/jruby/RubyProc.java:281:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-core-2.1.3-java/lib/logstash/runner.rb:95:in `run'", "org/jruby/RubyProc.java:281:in `call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.22/lib/stud/task.rb:24:in `initialize'"], :level=>:error}
I am running logstash as follows:
/opt/logstash/bin/logstash -f /etc/logstash/conf.d/nginx-filter.conf
with the following sample data:
200!~~!0.004!~~!14/Apr/2017:05:15:27 +0000!~~!GET!~~!/endpoint1?key1=val1
200!~~!0.004!~~!14/Apr/2017:05:17:25 +0000!~~!GET!~~!/endpoint2?key1=val2
The sprintf replacement only works for the key field in the rabbitmq output plugin.
@hare_info.exchange.publish(message, :routing_key => event.sprintf(@key), :properties => symbolize(@message_properties.merge(:persistent => @persistent)))
The actual connection is established using https://github.com/logstash-plugins/logstash-mixin-rabbitmq_connection, which does not provide the option to replace the host via variables:
:hosts => @host
So this is currently not supported; only the key field can be replaced by Logstash variables.
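One possible workaround (a sketch only, untested, reusing the hosts, queue names and credentials from the question): branch in the output section and hard-code one rabbitmq block per host, since host and exchange are fixed when the plugin registers, while the key can still use sprintf:
output {
  if [rmqshost] == "10.10.10.1" {
    rabbitmq {
      host => "10.10.10.1"
      exchange => "endpoint_one_exchange"
      exchange_type => "direct"
      key => "%{[queue]}_key"
      user => "myuser"
      password => "mypassword"
      vhost => "myvhost"
      durable => false
    }
  } else if [rmqshost] == "10.10.10.2" {
    rabbitmq {
      host => "10.10.10.2"
      exchange => "endpoint_two_exchange"
      exchange_type => "direct"
      key => "%{[queue]}_key"
      user => "myuser"
      password => "mypassword"
      vhost => "myvhost"
      durable => false
    }
  } else {
    rabbitmq {
      host => "10.10.10.3"
      exchange => "endpoint_other_exchange"
      exchange_type => "direct"
      key => "%{[queue]}_key"
      user => "myuser"
      password => "mypassword"
      vhost => "myvhost"
      durable => false
    }
  }
}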
I am trying to parse
[7/1/05 13:41:00:516 PDT]
This is the grok configuration I have written for it:
\[%{DD/MM/YY HH:MM:SS:S Z}\]
With the date filter:
input {
file {
path => "logstash-5.0.0/bin/sta.log"
start_position => "beginning"
}
}
filter {
grok {
match =>" \[%{DATA:timestamp}\] "
}
date {
match => ["timestamp","DD/MM/YY HH:MM:SS:S ZZZ"]
}
}
output {
stdout{codec => "json"}
}
Above is the configuration I have used.
And this is my sta.log file content:
[7/1/05 13:41:00:516 PDT]
I am getting this error:
[2017-01-31T12:37:47,444][ERROR][logstash.agent ] fetched an invalid config {:config=>"input {\nfile {\npath => \"logstash-5.0.0/bin/sta.log\"\nstart_position => \"beginning\"\n}\n}\nfilter {\ngrok {\nmatch =>\"\\[%{DATA:timestamp}\\]\"\n}\ndate {\nmatch => [\"timestamp\"=>\"DD/MM/YY HH:MM:SS:S ZZZ\"]\n}\n}\noutput {\nstdout{codec => \"json\"}\n}\n\n", :reason=>"Expected one of #, {, ,, ] at line 12, column 22 (byte 184) after filter {\ngrok {\nmatch =>\"\\[%{DATA:timestamp}\\]\"\n}\ndate {\nmatch => [\"timestamp\""}
Can anyone help here?
You forgot to specify the source field ("message") for your grok filter. A correct configuration would look like this:
input {
file {
path => "logstash-5.0.0/bin/sta.log"
start_position => "beginning"
}
}
filter {
grok {
match => {"message" => "\[%{DATA:timestamp} PDT\]"}
}
date {
match => ["timestamp","dd/MM/yy HH:mm:ss:SSS"]
}
}
output {
stdout{codec => "json"}
}
For further reference, check out the grok documentation.
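A quick way to check the pattern and the date format (a sketch, assuming a local Logstash 5.x install and a config file named test.conf) is to feed the sample line through stdin instead of the file input:
input { stdin {} }
filter {
  grok {
    match => { "message" => "\[%{DATA:timestamp} PDT\]" }
  }
  date {
    match => [ "timestamp", "dd/MM/yy HH:mm:ss:SSS" ]
  }
}
output { stdout { codec => "json" } }
Run it with something like echo "[7/1/05 13:41:00:516 PDT]" | logstash-5.0.0/bin/logstash -f test.conf and check the @timestamp field in the JSON printed to stdout.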