KV filter logstash value_split - logstash

Hi, I am using the kv filter to split my string, and I wanted to know how to use the values after I split them. For example:
My logs look like below:
47.30.221.46 - - [04/Sep/2017:13:24:44 +0530] "GET /api/v1.2/places/search/json?username=gaurav.saxena889&location=28.5506382,77.2689024&query=sunrise%20hy&explain=true&bridge=true HTTP/1.1" 200 2522 45402
47.30.221.46 - - [04/Sep/2017:13:24:46 +0530] "GET /api/v1.2/places/search/json?username=gaurav.saxena889&location=28.5506382,77.2689024&query=hy&explain=true&bridge=true HTTP/1.1" 200 2169 55267
47.30.221.46 - - [04/Sep/2017:13:24:47 +0530] "GET /api/v1.2/places/search/json?username=gaurav.saxena889&location=28.5506382,77.2689024&query=hyun&explain=true&bridge=true HTTP/1.1" 200 2530 29635
47.30.221.46 - - [04/Sep/2017:13:24:47 +0530] "GET /api/v1.2/places/search/json?username=gaurav.saxena889&location=28.5506382,77.2689024&query=hyunda&explain=true&bridge=true HTTP/1.1" 200 2572 25449
47.30.221.46 - - [04/Sep/2017:13:24:48 +0530] "GET /api/v1.2/places/search/json?username=gaurav.saxena889&location=28.5506382,77.2689024&query=hyundai&explain=true&bridge=true HTTP/1.1" 200 3576 28007
47.30.221.46 - - [04/Sep/2017:13:24:58 +0530] "GET /api/v1.2/places/search/json?username=gaurav.saxena889&location=28.5506382,77.2689024&query=su&explain=true&bridge=true HTTP/1.1" 200 2354 96861
47.30.221.46 - - [04/Sep/2017:13:24:58 +0530] "GET /api/v1.2/places/search/json?username=gaurav.saxena889&location=28.5506382,77.2689024&query=sun&explain=true&bridge=true HTTP/1.1" 200 3224 50897
My logstash config file looks like below:
input {
beats {
port => 5044
client_inactivity_timeout => 86400
}
}
filter {
grok {
match => {
"message" => "%{IPORHOST:client_ip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:method} /api/v%{NUMBER:version}/%{DATA:resource}/%{DATA:subresource}/%{DATA:response_type}\?%{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response_code} (?:%{NUMBER:data_transfered}|-) %{NUMBER:response_time}"
}
}
kv {
source => "request"
field_split => "&"
}
if [query] {
mutate {
rename => { "query" => "searched_keword" }
}
} else if [keyword] {
mutate {
rename => { "keyword" => "searched_keyword" }
}
}
if [refLocation] {
mutate {
rename => { "refLocation" => "location" }
}
}
mutate {
convert => { "response_code" => "integer" }
}
mutate {
convert => { "data_transfered" => "integer" }
}
mutate {
convert => { "version" => "float" }
}
mutate {
convert => { "response_time" => "integer" }
}
if [location] {
kv {
source => "location"
value_split => ","
}
}
}
output {
elasticsearch {
hosts => ["http://localhost:9200"]
index => "logstash_apachelogs"
document_type => "log"
}
}
If you have a look at the last kv filter, I've split my location value on a ,. I have 2 questions:
As you can see from the logs, I have location=28.5506382,77.2689024. Using the kv filter I split the values on the ,. Now how do I use the split values in a geoip filter, which takes values as below:
geoip {
source => "ClientIP"
target => "geoip"
add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
}
How do I replace the %20 in the query parameter with a whitespace?
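(For reference, one possible approach, sketched with assumed field names. Note that kv with value_split => "," would treat the text before the comma as a field name, so a mutate split is probably closer to what is wanted here; the lat/lon field names and the urldecode plugin, which may need to be installed separately, are assumptions, not part of the original config.)
filter {
  # Sketch only: split "location" ("28.5506382,77.2689024") into an array
  # and copy the parts into assumed lat/lon fields.
  if [location] {
    mutate {
      split => { "location" => "," }
      add_field => {
        "lat" => "%{[location][0]}"
        "lon" => "%{[location][1]}"
      }
    }
    mutate {
      convert => { "lat" => "float" }
      convert => { "lon" => "float" }
    }
  }
  # Decode percent-encoding (e.g. %20) in the searched keyword.
  if [searched_keyword] {
    urldecode { field => "searched_keyword" }
    # or, to replace only %20:
    # mutate { gsub => [ "searched_keyword", "%20", " " ] }
  }
}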

Related

Grok pattern working fine in grok debugger but the same pattern is not working when running with logstash

**Input File:**
123.123.12.123 - - [09/Jan/2021:00:00:41 -0500] "GET /abcde/common/abcde.jsp HTTP/1.1" 401 1944 1
Output in Grok Debugger
{
"Path": "GET /abcde/common/abcde.jsp HTTP/1.1",
"ResponseCode": "401",
"KnowCode": "1944",
"ExitCode": "1",
"UserInfo": "-",
"HostName": "123.123.12.123",
"Date": "09/Jan/2021:00:00:41 -0500"
}
GROK filter in logstash
grok {
match => {"message" => " %{IP:HostName}\s\-\s%{USERNAME:UserInfo}\s\[%{GREEDYDATA:Date}\]\s\"%{GREEDYDATA:Path}\"\s%{BASE10NUM:ResponseCode}\s%{BASE10NUM:KnowCode}\s%{BASE10NUM:ExitCode}"}
}
Whereas when I run the same grok pattern through the Logstash filter, the result shown in Kibana looks like this:
"#version" => "1",
"#timestamp" => 2021-02-12T16:38:28.141Z,
"type" => "access_logs",
"path" => "C:/Temp/BOHLogs/CatalinaAccess/localhost_access_log.2021-01-09.txt",
"tags" => [
[0] "_grokparsefailure"
],
"host" => "AB-1SM433",
"message" => "123.123.12.123 - - [09/Jan/2021:00:00:41 -0500] \"GET /abcde/common/abcde.jsp HTTP/1.1\" 401 1944 1"
You have a space at the start of your pattern that needs to be removed.
grok {
match => {"message" => " %{IP:HostName}\s\-\s%{USERNAME:UserInfo}\s\[%{GREEDYDATA:Date}\]\s\"%{GREEDYDATA:Path}\"\s%{BASE10NUM:ResponseCode}\s%{BASE10NUM:KnowCode}\s%{BASE10NUM:ExitCode}"}
^ Delete this
}
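In other words, the corrected filter is simply the same pattern with that leading space removed:
grok {
match => {"message" => "%{IP:HostName}\s\-\s%{USERNAME:UserInfo}\s\[%{GREEDYDATA:Date}\]\s\"%{GREEDYDATA:Path}\"\s%{BASE10NUM:ResponseCode}\s%{BASE10NUM:KnowCode}\s%{BASE10NUM:ExitCode}"}
}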

Logstash stopped processing because of an error: (SystemExit) exit

We are trying to index Nginx access and error logs separately in Elasticsearch. For that we have created the Filebeat and Logstash configs below.
Below is our /etc/filebeat/filebeat.yml configuration
filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/*access*.log
  exclude_files: ['\.gz$']
  exclude_lines: ['*ELB-HealthChecker*']
  fields:
    log_type: type1
- type: log
  paths:
    - /var/log/nginx/*error*.log
  exclude_files: ['\.gz$']
  exclude_lines: ['*ELB-HealthChecker*']
  fields:
    log_type: type2
output.logstash:
  hosts: ["10.227.XXX.XXX:5400"]
Our Logstash config /etc/logstash/conf.d/logstash-nginx-es.conf is as below
input {
beats {
port => 5400
}
}
filter {
if ([fields][log_type] == "type1") {
grok {
match => [ "message" , "%{NGINXACCESS}+%{GREEDYDATA:extra_fields}"]
overwrite => [ "message" ]
}
mutate {
convert => ["response", "integer"]
convert => ["bytes", "integer"]
convert => ["responsetime", "float"]
}
geoip {
source => "clientip"
target => "geoip"
add_tag => [ "nginx-geoip" ]
}
date {
match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
remove_field => [ "timestamp" ]
}
useragent {
source => "user_agent"
}
} else {
grok {
match => [ "message" , "(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))"(, upstream: "%{GREEDYDATA:upstream}")?, host: "%{DATA:host}"(, referrer: "%{GREEDYDATA:referrer}")?"]
overwrite => [ "message" ]
}
mutate {
convert => ["response", "integer"]
convert => ["bytes", "integer"]
convert => ["responsetime", "float"]
}
geoip {
source => "clientip"
target => "geoip"
add_tag => [ "nginx-geoip" ]
}
date {
match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
remove_field => [ "timestamp" ]
}
useragent {
source => "user_agent"
}
}
}
output {
if ([fields][log_type] == "type1") {
amazon_es {
hosts => ["vpc-XXXX-XXXX.ap-southeast-1.es.amazonaws.com"]
region => "ap-southeast-1"
aws_access_key_id => 'XXXX'
aws_secret_access_key => 'XXXX'
index => "nginx-access-logs-%{+YYYY.MM.dd}"
}
} else {
amazon_es {
hosts => ["vpc-XXXX-XXXX.ap-southeast-1.es.amazonaws.com"]
region => "ap-southeast-1"
aws_access_key_id => 'XXXX'
aws_secret_access_key => 'XXXX'
index => "nginx-error-logs-%{+YYYY.MM.dd}"
}
}
stdout {
codec => rubydebug
}
}
And we are receiving the below error while starting Logstash.
[2020-10-12T06:05:39,183][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.9.2", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 25.265-b01 on 1.8.0_265-b01 +indy +jit [linux-x86_64]"}
[2020-10-12T06:05:39,861][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-10-12T06:05:41,454][ERROR][logstash.agent ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", \"{\", \",\", \"]\" at line 32, column 263 (byte 918) after filter {\n if ([fields][log_type] == \"type1\") {\n grok {\n match => [ \"message\" , \"%{NGINXACCESS}+%{GREEDYDATA:extra_fields}\"]\n overwrite => [ \"message\" ]\n }\n mutate {\n convert => [\"response\", \"integer\"]\n convert => [\"bytes\", \"integer\"]\n convert => [\"responsetime\", \"float\"]\n }\n geoip {\n source => \"clientip\"\n target => \"geoip\"\n add_tag => [ \"nginx-geoip\" ]\n }\n date {\n match => [ \"timestamp\" , \"dd/MMM/YYYY:HH:mm:ss Z\" ]\n remove_field => [ \"timestamp\" ]\n }\n useragent {\n source => \"user_agent\"\n }\n } else {\n grok {\n match => [ \"message\" , \"(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \\[%{LOGLEVEL:severity}\\] %{POSINT:pid}#%{NUMBER:threadid}\\: \\*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: \"", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:32:in `compile_imperative'", "org/logstash/execution/AbstractPipelineExt.java:183:in `initialize'", "org/logstash/execution/JavaBasePipelineExt.java:69:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:44:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:52:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:357:in `block in converge_state'"]}
[2020-10-12T06:05:41,795][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2020-10-12T06:05:46,685][INFO ][logstash.runner ] Logstash shut down.
[2020-10-12T06:05:46,706][ERROR][org.logstash.Logstash ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit
There seems to be some formatting issue. Please help us figure out what the problem is.
=================================UPDATE===================================
For all those who are looking for a robust grok filter for nginx access and error logs, please try the filter patterns below.
Access_Logs - %{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:access_time}\] \"%{WORD:http_method} %{URIPATHPARAM:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent_bytes} \"%{SPACE:referrer}\" \"%{DATA:agent}\" %{NUMBER:duration} req_header:\"%{DATA:req_header}\" req_body:\"%{DATA:req_body}\" resp_header:\"%{DATA:resp_header}\" resp_body:\"%{GREEDYDATA:resp_body}\"
Error_Logs - (?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{DATA:errormessage}, client: %{IP:client}, server: %{IP:server}, request: \"(?<httprequest>%{WORD:httpcommand} %{NOTSPACE:httpfile} HTTP/(?<httpversion>[0-9.]*))\", host: \"%{NOTSPACE:host}\"(, referrer: \"%{NOTSPACE:referrer}\")?
The grok pattern on line 32 is the issue: all the " characters inside it need to be escaped.
Below is an escaped version of the grok.
grok {
match => [ "message" , "(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME})\[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))\"(, upstream: \"%{GREEDYDATA:upstream}\")?, host: \"%{DATA:host}\"(, referrer: \"%{GREEDYDATA:referrer}\")?"]
overwrite => [ "message" ]
}
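As an alternative to escaping every quote, Logstash config strings can also be single-quoted; inside single quotes the embedded double quotes need no escaping. A sketch of the same filter written that way:
grok {
  match => [ "message" , '(?<timestamp>%{YEAR}[./]%{MONTHNUM}[./]%{MONTHDAY} %{TIME}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER:threadid}\: \*%{NUMBER:connectionid} %{GREEDYDATA:message}, client: %{IP:client}, server: %{GREEDYDATA:server}, request: "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion}))"(, upstream: "%{GREEDYDATA:upstream}")?, host: "%{DATA:host}"(, referrer: "%{GREEDYDATA:referrer}")?' ]
  overwrite => [ "message" ]
}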

Logs are overwritten in the specified index under the same _id

I'm using Filebeat 6.5.1, Logstash 6.5.1 and Elasticsearch 6.5.1.
I'm using multiple grok filters in a single config file and trying to send the logs into Elasticsearch.
Below is my Filebeat.yml
filebeat.prospectors:
- type: log
  paths:
    - var/log/message
  fields:
    type: apache_access
  tags: ["ApacheAccessLogs"]
- type: log
  paths:
    - var/log/indicate
  fields:
    type: apache_error
  tags: ["ApacheErrorLogs"]
- type: log
  paths:
    - var/log/panda
  fields:
    type: mysql_error
  tags: ["MysqlErrorLogs"]
output.logstash:
  # The Logstash hosts
  hosts: ["logstash:5044"]
Below is my logstash config file -
input {
beats {
port => 5044
tags => [ "ApacheAccessLogs", "ApacheErrorLogs", "MysqlErrorLogs" ]
}
}
filter {
if "ApacheAccessLogs" in [tags] {
grok {
match => [
"message" , "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}",
"message" , "%{COMMONAPACHELOG}+%{GREEDYDATA:extra_fields}"
]
overwrite => [ "message" ]
}
mutate {
convert => ["response", "integer"]
convert => ["bytes", "integer"]
convert => ["responsetime", "float"]
}
geoip {
source => "clientip"
target => "geoip"
add_tag => [ "apache-geoip" ]
}
date {
match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
remove_field => [ "timestamp" ]
}
useragent {
source => "agent"
}
}
if "ApacheErrorLogs" in [tags] {
grok {
match => { "message" => ["[%{APACHE_TIME:[apache2][error][timestamp]}] [%{LOGLEVEL:[apache2][error][level]}]( [client %{IPORHOST:[apache2][error][client]}])? %{GREEDYDATA:[apache2][error][message]}",
"[%{APACHE_TIME:[apache2][error][timestamp]}] [%{DATA:[apache2][error][module]}:%{LOGLEVEL:[apache2][error][level]}] [pid %{NUMBER:[apache2][error][pid]}(:tid %{NUMBER:[apache2][error][tid]})?]( [client %{IPORHOST:[apache2][error][client]}])? %{GREEDYDATA:[apache2][error][message1]}" ] }
pattern_definitions => {
"APACHE_TIME" => "%{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}"
}
remove_field => "message"
}
mutate {
rename => { "[apache2][error][message1]" => "[apache2][error][message]" }
}
date {
match => [ "[apache2][error][timestamp]", "EEE MMM dd H:m:s YYYY", "EEE MMM dd H:m:s.SSSSSS YYYY" ]
remove_field => "[apache2][error][timestamp]"
}
}
if "MysqlErrorLogs" in [tags] {
grok {
match => { "message" => ["%{LOCALDATETIME:[mysql][error][timestamp]} ([%{DATA:[mysql][error][level]}] )?%{GREEDYDATA:[mysql][error][message]}",
"%{TIMESTAMP_ISO8601:[mysql][error][timestamp]} %{NUMBER:[mysql][error][thread_id]} [%{DATA:[mysql][error][level]}] %{GREEDYDATA:[mysql][error][message1]}",
"%{GREEDYDATA:[mysql][error][message2]}"] }
pattern_definitions => {
"LOCALDATETIME" => "[0-9]+ %{TIME}"
}
remove_field => "message"
}
mutate {
rename => { "[mysql][error][message1]" => "[mysql][error][message]" }
}
mutate {
rename => { "[mysql][error][message2]" => "[mysql][error][message]" }
}
date {
match => [ "[mysql][error][timestamp]", "ISO8601", "YYMMdd H:m:s" ]
remove_field => "[apache2][access][time]"
}
}
}
output {
if "ApacheAccessLogs" in [tags] {
elasticsearch { hosts => ["elasticsearch:9200"]
index => "apache"
document_id => "apacheaccess"
}
}
if "ApacheErrorLogs" in [tags] {
elasticsearch { hosts => ["elasticsearch:9200"]
index => "apache"
document_id => "apacheerror"
}
}
if "MysqlErrorLogs" in [tags] {
elasticsearch { hosts => ["elasticsearch:9200"]
index => "apache"
document_id => "sqlerror"
}
}
stdout { codec => rubydebug }
}
The data is sent to Elasticsearch, but only 3 documents are created, one per document_id, in the same index.
Every new incoming log is overwritten onto the same document_id and the old one is lost.
Can you guys please help me out?
The purpose of document_id is to provide a unique document id for an event. In your case, as the ids are static (apacheaccess, apacheerror, sqlerror), there will be only one document per id ingested into Elasticsearch, overwritten by the newest event.
As you have 3 distinct data types, what you seem to be looking for is a different index for each event type (ApacheAccessLogs, ApacheErrorLogs, MysqlErrorLogs), as follows:
output {
if "ApacheAccessLogs" in [tags] {
elasticsearch {
hosts => ["elasticsearch:9200"]
index => "apache-access"
}
}
if "ApacheErrorLogs" in [tags] {
elasticsearch {
hosts => ["elasticsearch:9200"]
index => "apache-error"
}
}
if "MysqlErrorLogs" in [tags] {
elasticsearch {
hosts => ["elasticsearch:9200"]
index => "mysql-error"
}
}
stdout {
codec => rubydebug
}
}
There are not many cases where you need to set the id manually (e.g. when re-ingesting data), as Logstash and Elasticsearch will manage that by themselves.
But if that is your case and you can't use an existing field to identify each event individually, you could use logstash-filter-fingerprint, which is made for exactly that.
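A minimal sketch of that approach (the hash method and field names here are illustrative assumptions, not from the original configs):
filter {
  fingerprint {
    # Hash the raw message so the same log line always maps to the same id.
    source => ["message"]
    target => "[@metadata][fingerprint]"
    method => "SHA1"
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "apache-access"
    document_id => "%{[@metadata][fingerprint]}"
  }
}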

Grok help for a custom metric

I have a log line like this:
09 Nov 2018 15:51:35 DEBUG api.MapAnythingProvider - Calling API For Client: XXX Number of ELEMENTS Requested YYY
I want to ignore all other log lines and only keep those that contain the words "Calling API For Client". Further, I am only interested in the string XXX and the number YYY.
Thanks for the help.
input {
file {
path => ["C:/apache-tomcat-9.0.7/logs/service/service.log"]
sincedb_path => "nul"
start_position => "beginning"
}
}
filter {
grok {
match => {
"message" => "%{MONTHDAY:monthDay} %{MONTH:mon} %{YEAR:year} %{TIME:ts} %{WORD:severity} %{JAVACLASS:claz} - %{GREEDYDATA:logmessage}"
}
}
grok {
match => {
"logmessage" => "%{WORD:keyword} %{WORD:customer} %{WORD:key2} %{NUMBER:mapAnythingCreditsConsumed:float} %{WORD:key3} %{NUMBER:elementsFromCache:int}"
}
}
if "_grokparsefailure" in [tags] {
drop {}
}
mutate {
remove_field => [ "monthDay", "mon", "ts", "severity", "claz", "keyword", "key2", "path", "message", "year", "key3" ]
}
}
output {
if [logmessage] =~ /ExecutingJobFor/ {
elasticsearch {
hosts => ["localhost:9200"]
index => "test"
manage_template => false
}
stdout {
codec => rubydebug
}
}
}
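An alternative (or additional) way to express the "only lines containing Calling API For Client" requirement is to drop non-matching events explicitly before the grok stage; a small sketch:
filter {
  # Keep only the lines of interest; everything else is dropped up front.
  if "Calling API For Client" not in [message] {
    drop {}
  }
}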

bro-ids logstash filter not working

I've set up an ELK stack on CentOS 7 and am forwarding logs from a FreeBSD 11 host which runs Bro. However, my filters are not correctly parsing the Bro logs.
This is the current set up:
freebsd filebeat client
filebeat.yml
filebeat:
  registry_file: /var/run/.filebeat
  prospectors:
    -
      paths:
        - /var/log/messages
        - /var/log/maillog
        - /var/log/auth.log
        - /var/log/cron
        - /var/log/debug.log
        - /var/log/devd.log
        - /var/log/ppp.log
        - /var/log/netatalk.log
        - /var/log/setuid.today
        - /var/log/utx.log
        - /var/log/rkhunter.log
        - /var/log/userlog
        - /var/log/sendmail.st
        - /var/log/xferlog
      input_type: log
      document_type: syslog
    -
      paths:
        - /var/log/bro/current/app_stats.log
      input_type: log
      document_type: bro_app_stats
    -
      paths:
        - /var/log/bro/current/communication.log
      input_type: log
      document_type: bro_communication
    -
      paths:
        - /var/log/bro/current/conn.log
      input_type: log
      document_type: bro_conn
    -
      paths:
        - /var/log/bro/current/dhcp.log
      input_type: log
      document_type: bro_dhcp
    -
      paths:
        - /var/log/bro/current/dns.log
      input_type: log
      document_type: bro_dns
    -
      paths:
        - /var/log/bro/current/dpd.log
      input_type: log
      document_type: bro_dpd
    -
      paths:
        - /var/log/bro/current/files.log
      input_type: log
      document_type: bro_files
    -
      paths:
        - /var/log/bro/current/ftp.log
      input_type: log
      document_type: bro_ftp
    -
      paths:
        - /var/log/bro/current/http.log
      input_type: log
      document_type: bro_http
    -
      paths:
        - /var/log/bro/current/known_certs.log
      input_type: log
      document_type: bro_app_known_certs
    -
      paths:
        - /var/log/bro/current/known_hosts.log
      input_type: log
      document_type: bro_known_hosts
    -
      paths:
        - /var/log/bro/current/known_services.log
      input_type: log
      document_type: bro_known_services
    -
      paths:
        - /var/log/bro/current/notice.log
      input_type: log
      document_type: bro_notice
    -
      paths:
        - /var/log/bro/current/smtp.log
      input_type: log
      document_type: bro_smtp
    -
      paths:
        - /var/log/bro/current/software.log
      input_type: log
      document_type: bro_software
    -
      paths:
        - /var/log/bro/current/ssh.log
      input_type: log
      document_type: bro_ssh
    -
      paths:
        - /var/log/bro/current/ssl.log
      input_type: log
      document_type: bro_ssl
    -
      paths:
        - /var/log/bro/current/weird.log
      input_type: log
      document_type: bro_weird
    -
      paths:
        - /var/log/bro/current/x509.log
      input_type: log
      document_type: bro_x509
Then on the CentOS ELK server I have 4 configs:
/etc/logstash/conf.d/02-beats-input.conf
input {
beats {
port => 5044
}
}
/etc/logstash/conf.d/10-syslog-filter.conf
filter {
if [type] == "syslog" {
grok {
match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
add_field => [ "received_at", "%{#timestamp}" ]
add_field => [ "received_from", "%{host}" ]
}
syslog_pri { }
date {
match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
}
}
}
/etc/logstash/conf.d/20-bro-ids-filter.conf
filter {
# bro_app_stats ######################
if [type] == "bro_app" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<ts_delta>(.*?))\t(?<app>(.*?))\t(?<uniq_hosts>(.*?))\t(?<hits>(.*?))\t(?<bytes>(.*))" ]
}
}
# bro_conn ######################
if [type] == "bro_conn" {
grok {
match => [
"message", "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<proto>(.*?))\t(?<service>(.*?))\t(?<duration>(.*?))\t(?<orig_bytes>(.*?))\t(?<resp_bytes>(.*?))\t(?<conn_state>(.*?))\t(?<local_orig>(.*?))\t(?<missed_bytes>(.*?))\t(?<history>(.*?))\t(?<orig_pkts>(.*?))\t(?<orig_ip_bytes>(.*?))\t(?<resp_pkts>(.*?))\t(?<resp_ip_bytes>(.*?))\t(?<tunnel_parents>(.*?))\t(?<orig_cc>(.*?))\t(?<resp_cc>(.*?))\t(?<sensorname>(.*))",
"message", "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<proto>(.*?))\t(?<service>(.*?))\t(?<duration>(.*?))\t(?<orig_bytes>(.*?))\t(?<resp_bytes>(.*?))\t(?<conn_state>(.*?))\t(?<local_orig>(.*?))\t(?<missed_bytes>(.*?))\t(?<history>(.*?))\t(?<orig_pkts>(.*?))\t(?<orig_ip_bytes>(.*?))\t(?<resp_pkts>(.*?))\t(?<resp_ip_bytes>(.*?))\t(%{NOTSPACE:tunnel_parents})"
]
}
}
# bro_notice ######################
if [type] == "bro_notice" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<fuid>(.*?))\t(?<file_mime_type>(.*?))\t(?<file_desc>(.*?))\t(?<proto>(.*?))\t(?<note>(.*?))\t(?<msg>(.*?))\t(?<sub>(.*?))\t(?<src>(.*?))\t(?<dst>(.*?))\t(?<p>(.*?))\t(?<n>(.*?))\t(?<peer_descr>(.*?))\t(?<actions>(.*?))\t(?<suppress_for>(.*?))\t(?<dropped>(.*?))\t(?<remote_location.country_code>(.*?))\t(?<remote_location.region>(.*?))\t(?<remote_location.city>(.*?))\t(?<remote_location.latitude>(.*?))\t(?<remote_location.longitude>(.*))" ]
}
}
# bro_dhcp ######################
if [type] == "bro_dhcp" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<mac>(.*?))\t(?<assigned_ip>(.*?))\t(?<lease_time>(.*?))\t(?<trans_id>(.*))" ]
}
}
# bro_dns ######################
if [type] == "bro_dns" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<proto>(.*?))\t(?<trans_id>(.*?))\t(?<query>(.*?))\t(?<qclass>(.*?))\t(?<qclass_name>(.*?))\t(?<qtype>(.*?))\t(?<qtype_name>(.*?))\t(?<rcode>(.*?))\t(?<rcode_name>(.*?))\t(?<AA>(.*?))\t(?<TC>(.*?))\t(?<RD>(.*?))\t(?<RA>(.*?))\t(?<Z>(.*?))\t(?<answers>(.*?))\t(?<TTLs>(.*?))\t(?<rejected>(.*))" ]
}
}
# bro_software ######################
if [type] == "bro_software" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<bro_host>(.*?))\t(?<host_p>(.*?))\t(?<software_type>(.*?))\t(?<name>(.*?))\t(?<version.major>(.*?))\t(?<version.minor>(.*?))\t(?<version.minor2>(.*?))\t(?<version.minor3>(.*?))\t(?<version.addl>(.*?))\t(?<unparsed_version>(.*))" ]
}
}
# bro_dpd ######################
if [type] == "bro_dpd" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<proto>(.*?))\t(?<analyzer>(.*?))\t(?<failure_reason>(.*))" ]
}
}
# bro_files ######################
if [type] == "bro_files" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<fuid>(.*?))\t(?<tx_hosts>(.*?))\t(?<rx_hosts>(.*?))\t(?<conn_uids>(.*?))\t(?<source>(.*?))\t(?<depth>(.*?))\t(?<analyzers>(.*?))\t(?<mime_type>(.*?))\t(?<filename>(.*?))\t(?<duration>(.*?))\t(?<local_orig>(.*?))\t(?<is_orig>(.*?))\t(?<seen_bytes>(.*?))\t(?<total_bytes>(.*?))\t(?<missing_bytes>(.*?))\t(?<overflow_bytes>(.*?))\t(?<timedout>(.*?))\t(?<parent_fuid>(.*?))\t(?<md5>(.*?))\t(?<sha1>(.*?))\t(?<sha256>(.*?))\t(?<extracted>(.*))" ]
}
}
# bro_http ######################
if [type] == "bro_http" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<trans_depth>(.*?))\t(?<method>(.*?))\t(?<bro_host>(.*?))\t(?<uri>(.*?))\t(?<referrer>(.*?))\t(?<user_agent>(.*?))\t(?<request_body_len>(.*?))\t(?<response_body_len>(.*?))\t(?<status_code>(.*?))\t(?<status_msg>(.*?))\t(?<info_code>(.*?))\t(?<info_msg>(.*?))\t(?<filename>(.*?))\t(?<http_tags>(.*?))\t(?<username>(.*?))\t(?<password>(.*?))\t(?<proxied>(.*?))\t(?<orig_fuids>(.*?))\t(?<orig_mime_types>(.*?))\t(?<resp_fuids>(.*?))\t(?<resp_mime_types>(.*))" ]
}
}
# bro_known_certs ######################
if [type] == "bro_known_certs" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<bro_host>(.*?))\t(?<port_num>(.*?))\t(?<subject>(.*?))\t(?<issuer_subject>(.*?))\t(?<serial>(.*))" ]
}
}
# bro_known_hosts ######################
if [type] == "bro_known_hosts" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<bro_host>(.*))" ]
}
}
# bro_known_services ######################
if [type] == "bro_known_services" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<bro_host>(.*?))\t(?<port_num>(.*?))\t(?<port_proto>(.*?))\t(?<service>(.*))" ]
}
}
# bro_ssh ######################
if [type] == "bro_ssh" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<status>(.*?))\t(?<direction>(.*?))\t(?<client>(.*?))\t(?<server>(.*?))\t(?<remote_location.country_code>(.*?))\t(?<remote_location.region>(.*?))\t(?<remote_location.city>(.*?))\t(?<remote_location.latitude>(.*?))\t(?<remote_location.longitude>(.*))" ]
}
}
# bro_ssl ######################
if [type] == "bro_ssl" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<version>(.*?))\t(?<cipher>(.*?))\t(?<server_name>(.*?))\t(?<session_id>(.*?))\t(?<subject>(.*?))\t(?<issuer_subject>(.*?))\t(?<not_valid_before>(.*?))\t(?<not_valid_after>(.*?))\t(?<last_alert>(.*?))\t(?<client_subject>(.*?))\t(?<client_issuer_subject>(.*?))\t(?<cert_hash>(.*?))\t(?<validation_status>(.*))" ]
}
}
# bro_weird ######################
if [type] == "bro_weird" {
grok {
match => [ "message", "(?<ts>(.*?))\t(?<uid>(.*?))\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t(?<name>(.*?))\t(?<addl>(.*?))\t(?<notice>(.*?))\t(?<peer>(.*))" ]
}
}
# bro_x509 #######################
if [type] == "bro_x509" {
csv {
#x509.log:#fields ts id certificate.version certificate.serial certificate.subject certificate.issuer certificate.not_valid_before certificate.not_valid_after certificate.key_alg certificate.sig_alg certificate.key_type certificate.key_length certificate.exponent certificate.curve san.dns san.uri san.email san.ip basic_constraints.ca basic_constraints.path_len
columns => ["ts","id","certificate.version","certificate.serial","certificate.subject","icertificate.issuer","certificate.not_valid_before","certificate.not_valid_after","certificate.key_alg","certificate.sig_alg","certificate.key_type","certificate.key_length","certificate.exponent","certificate.curve","san.dns","san.uri","san.email","san.ip","basic_constraints.ca","basic_constraints.path_len"]
#If you use a custom delimiter, change the following value in between the quotes to your delimiter. Otherwise, leave the next line alone.
separator => " "
}
#Let's convert our timestamp into the 'ts' field, so we can use Kibana features natively
date {
match => [ "ts", "UNIX" ]
}
}
if [type]== "bro_intel" {
grok {
match => [ "message", "(?<ts>(.*?))\t%{DATA:uid}\t(?<id.orig_h>(.*?))\t(?<id.orig_p>(.*?))\t(?<id.resp_h>(.*?))\t(?<id.resp_p>(.*?))\t%{DATA:fuid}\t%{DATA:file_mime_type}\t%{DATA:file_desc}\t(?<seen.indicator>(.*?))\t(?<seen.indicator_type>(.*?))\t(?<seen.where>(.*?))\t%{NOTSPACE:sources}" ]
}
}
}
date {
match => [ "ts", "UNIX" ]
}
}
filter {
if "bro" in [type] {
if [id.orig_h] {
mutate {
add_field => [ "senderbase_lookup", "http://www.senderbase.org/lookup/?search_string=%{id.orig_h}" ]
add_field => [ "CBL_lookup", "http://cbl.abuseat.org/lookup.cgi?ip=%{id.orig_h}" ]
add_field => [ "Spamhaus_lookup", "http://www.spamhaus.org/query/bl?ip=%{id.orig_h}" ]
}
}
mutate {
add_tag => [ "bro" ]
}
mutate {
convert => [ "id.orig_p", "integer" ]
convert => [ "id.resp_p", "integer" ]
convert => [ "orig_bytes", "integer" ]
convert => [ "resp_bytes", "integer" ]
convert => [ "missed_bytes", "integer" ]
convert => [ "orig_pkts", "integer" ]
convert => [ "orig_ip_bytes", "integer" ]
convert => [ "resp_pkts", "integer" ]
convert => [ "resp_ip_bytes", "integer" ]
}
}
}
filter {
if [type] == "bro_conn" {
#The following makes use of the translate filter (stash contrib) to convert conn_state into human text. Saves having to look up values for packet introspection
translate {
field => "conn_state"
destination => "conn_state_full"
dictionary => [
"S0", "Connection attempt seen, no reply",
"S1", "Connection established, not terminated",
"S2", "Connection established and close attempt by originator seen (but no reply from responder)",
"S3", "Connection established and close attempt by responder seen (but no reply from originator)",
"SF", "Normal SYN/FIN completion",
"REJ", "Connection attempt rejected",
"RSTO", "Connection established, originator aborted (sent a RST)",
"RSTR", "Established, responder aborted",
"RSTOS0", "Originator sent a SYN followed by a RST, we never saw a SYN-ACK from the responder",
"RSTRH", "Responder sent a SYN ACK followed by a RST, we never saw a SYN from the (purported) originator",
"SH", "Originator sent a SYN followed by a FIN, we never saw a SYN ACK from the responder (hence the connection was 'half' open)",
"SHR", "Responder sent a SYN ACK followed by a FIN, we never saw a SYN from the originator",
"OTH", "No SYN seen, just midstream traffic (a 'partial connection' that was not later closed)"
]
}
}
}
# Resolve @source_host to FQDN if possible, if missing for some types of logging, using source_host_ip from above
filter {
if [id.orig_h] {
if ![id.orig_h-resolved] {
mutate {
add_field => [ "id.orig_h-resolved", "%{id.orig_h}" ]
}
dns {
reverse => [ "id.orig_h-resolved" ]
action => "replace"
}
}
}
}
filter {
if [id.resp_h] {
if ![id.resp_h-resolved] {
mutate {
add_field => [ "id.resp_h-resolved", "%{id.resp_h}" ]
}
dns {
reverse => [ "id.resp_h-resolved" ]
action => "replace"
}
}
}
}
and /etc/logstash/conf.d/30-elasticsearch-output.conf
output {
elasticsearch {
hosts => ["localhost:9200"]
manage_template => false
index => "%{[#metadata][beat]}-%{+YYYY.MM.dd}"
document_type => "%{[#metadata][type]}"
}
}
I've leveraged this gist and tailored it to my configuration. While running I get the following error in /var/log/logstash/logstash-plain.log:
[2016-11-06T15:30:36,961][ERROR][logstash.agent ] ########\n\t if [type] == \"bro_dhcp\" {\n\t\tgrok { \n\t\t match => [ \"message\", \"(?<ts>(.*?))\\t(?<uid>(.*?))\\t(?<id.orig_h>(.*?))\\t(?<id.orig_p>(.*?))\\t(?<id.resp_h>(.*?))\\t(?<id.resp_p>(.*?))\\t(?<mac>(.*?))\\t(?<assigned_ip>(.*?))\\t(?<lease_time>(.*?))\\t(?<trans_id>(.*))\" ]\n\t\t}\n\t }\n\n\t# bro_dns ######################\n\t if [type] == \"bro_dns\" {\n\t\tgrok {\n\t\t match => [ \"message\", \"(?<ts>(.*?))\\t(?<uid>(.*?))\\t(?<id.orig_h>(.*?))\\t(?<id.orig_p>(.*?))\\t(?<id.resp_h>(.*?))\\t(?<id.resp_p>(.*?))\\t(?<proto>(.*?))\\t(?<trans_id>(.*?))\\t(?<query>(.*?))\\t(?<qclass>(.*?))\\t(?<qclass_name>(.*?))\\t(?<qtype>(.*?))\\t(?<qtype_name>(.*?))\\t(?<rcode>(.*?))\\t(?<rcode_name>(.*?))\\t(?<AA>(.*?))\\t(?<TC>(.*?))\\t(?<RD>(.*?))\\t(?<RA>(.*?))\\t(?<Z>(.*?))\\t(?<answers>(.*?))\\t(?<TTLs>(.*?))\\t(?<rejected>(.*))\" ]\n\t\t}\n\t }\n\n\t# bro_software ######################\n\t if [type] == \"bro_software\" {\n\t\tgrok { \n\t\t match => [ \"message\", \"(?<ts>(.*?))\\t(?<bro_host>(.*?))\\t(?<host_p>(.*?))\\t(?<software_type>(.*?))\\t(?<name>(.*?))\\t(?<version.major>(.*?))\\t(?<version.minor>(.*?))\\t(?<version.minor2>(.*?))\\t(?<version.minor3>(.*?))\\t(?<version.addl>(.*?))\\t(?<unparsed_version>(.*))\" ]\n\t\t}\n\t }\n\n\t# bro_dpd ######################\n\t if [type] == \"bro_dpd\" {\n\t\tgrok {\n\t\t match => [ \"message\", \"(?<ts>(.*?))\\t(?<uid>(.*?))\\t(?<id.orig_h>(.*?))\\t(?<id.orig_p>(.*?))\\t(?<id.resp_h>(.*?))\\t(?<id.resp_p>(.*?))\\t(?<proto>(.*?))\\t(?<analyzer>(.*?))\\t(?<failure_reason>(.*))\" ]\n\t\t}\n\t }\n\n\t# bro_files ######################\n\t if [type] == \"bro_files\" {\n\t\tgrok {\n\t\t match => [ \"message\", \"(?<ts>(.*?))\\t(?<fuid>(.*?))\\t(?<tx_hosts>(.*?))\\t(?<rx_hosts>(.*?))\\t(?<conn_uids>(.*?))\\t(?<source>(.*?))\\t(?<depth>(.*?))\\t(?<analyzers>(.*?))\\t(?<mime_type>(.*?))\\t(?<filename>(.*?))\\t(?<duration>(.*?))\\t(?<local_orig>(.*?))\\t(?<is_orig>(.*?))\\t(?<seen_bytes>(.*?))\\t(?<total_bytes>(.*?))\\t(?<missing_bytes>(.*?))\\t(?<overflow_bytes>(.*?))\\t(?<timedout>(.*?))\\t(?<parent_fuid>(.*?))\\t(?<md5>(.*?))\\t(?<sha1>(.*?))\\t(?<sha256>(.*?))\\t(?<extracted>(.*))\" ]\n\t\t}\n\t }\n\n\t# bro_http ######################\n\t if [type] == \"bro_http\" {\n\t\tgrok {\n\t\t match => [ \"message\", \"(?<ts>(.*?))\\t(?<uid>(.*?))\\t(?<id.orig_h>(.*?))\\t(?<id.orig_p>(.*?))\\t(?<id.resp_h>(.*?))\\t(?<id.resp_p>(.*?))\\t(?<trans_depth>(.*?))\\t(?<method>(.*?))\\t(?<bro_host>(.*?))\\t(?<uri>(.*?))\\t(?<referrer>(.*?))\\t(?<user_agent>(.*?))\\t(?<request_body_len>(.*?))\\t(?<response_body_len>(.*?))\\t(?<status_code>(.*?))\\t(?<status_msg>(.*?))\\t(?<info_code>(.*?))\\t(?<info_msg>(.*?))\\t(?<filename>(.*?))\\t(?<http_tags>(.*?))\\t(?<username>(.*?))\\t(?<password>(.*?))\\t(?<proxied>(.*?))\\t(?<orig_fuids>(.*?))\\t(?<orig_mime_types>(.*?))\\t(?<resp_fuids>(.*?))\\t(?<resp_mime_types>(.*))\" ]\n\t\t}\n\t }\n\n\t# bro_known_certs ######################\n\t if [type] == \"bro_known_certs\" {\n\t\tgrok {\n\t\t match => [ \"message\", \"(?<ts>(.*?))\\t(?<bro_host>(.*?))\\t(?<port_num>(.*?))\\t(?<subject>(.*?))\\t(?<issuer_subject>(.*?))\\t(?<serial>(.*))\" ]\n\t\t}\n\t }\n\n\t# bro_known_hosts ######################\n\t if [type] == \"bro_known_hosts\" {\n\t\tgrok {\n\t\t match => [ \"message\", \"(?<ts>(.*?))\\t(?<bro_host>(.*))\" ]\n\t\t}\n\t }\n\n\t# bro_known_services ######################\n\t if [type] == \"bro_known_services\" {\n\t\tgrok {\n\t\t match => [ \"message\", 
\"(?<ts>(.*?))\\t(?<bro_host>(.*?))\\t(?<port_num>(.*?))\\t(?<port_proto>(.*?))\\t(?<service>(.*))\" ]\n\t\t}\n\t }\n\n\t# bro_ssh ######################\n\t if [type] == \"bro_ssh\" {\n\t\tgrok {\n\t\t match => [ \"message\", \"(?<ts>(.*?))\\t(?<uid>(.*?))\\t(?<id.orig_h>(.*?))\\t(?<id.orig_p>(.*?))\\t(?<id.resp_h>(.*?))\\t(?<id.resp_p>(.*?))\\t(?<status>(.*?))\\t(?<direction>(.*?))\\t(?<client>(.*?))\\t(?<server>(.*?))\\t(?<remote_location.country_code>(.*?))\\t(?<remote_location.region>(.*?))\\t(?<remote_location.city>(.*?))\\t(?<remote_location.latitude>(.*?))\\t(?<remote_location.longitude>(.*))\" ]\n\t\t}\n\t }\n\n\t# bro_ssl ######################\n\t if [type] == \"bro_ssl\" {\n\t\tgrok {\n\t\t match => [ \"message\", \"(?<ts>(.*?))\\t(?<uid>(.*?))\\t(?<id.orig_h>(.*?))\\t(?<id.orig_p>(.*?))\\t(?<id.resp_h>(.*?))\\t(?<id.resp_p>(.*?))\\t(?<version>(.*?))\\t(?<cipher>(.*?))\\t(?<server_name>(.*?))\\t(?<session_id>(.*?))\\t(?<subject>(.*?))\\t(?<issuer_subject>(.*?))\\t(?<not_valid_before>(.*?))\\t(?<not_valid_after>(.*?))\\t(?<last_alert>(.*?))\\t(?<client_subject>(.*?))\\t(?<client_issuer_subject>(.*?))\\t(?<cert_hash>(.*?))\\t(?<validation_status>(.*))\" ]\n\t\t}\n\t }\n\n\t# bro_weird ######################\n\tif [type] == \"bro_weird\" {\n\t\tgrok {\n\t\t\tmatch => [ \"message\", \"(?<ts>(.*?))\\t(?<uid>(.*?))\\t(?<id.orig_h>(.*?))\\t(?<id.orig_p>(.*?))\\t(?<id.resp_h>(.*?))\\t(?<id.resp_p>(.*?))\\t(?<name>(.*?))\\t(?<addl>(.*?))\\t(?<notice>(.*?))\\t(?<peer>(.*))\" ]\n\t\t\t}\n\t}\n\t\n\t# bro_x509 #######################\n\tif [type] == \"bro_x509\" {\n\t\tcsv {\n\t\n\t\t #x509.log:#fields\tts\tid\tcertificate.version\tcertificate.serial\tcertificate.subject\tcertificate.issuer\tcertificate.not_valid_before\tcertificate.not_valid_after\tcertificate.key_alg\tcertificate.sig_alg\tcertificate.key_type\tcertificate.key_length\tcertificate.exponent\tcertificate.curve\tsan.dns\tsan.uri\tsan.email\tsan.ip\tbasic_constraints.ca\tbasic_constraints.path_len\n\t\t columns => [\"ts\",\"id\",\"certificate.version\",\"certificate.serial\",\"certificate.subject\",\"icertificate.issuer\",\"certificate.not_valid_before\",\"certificate.not_valid_after\",\"certificate.key_alg\",\"certificate.sig_alg\",\"certificate.key_type\",\"certificate.key_length\",\"certificate.exponent\",\"certificate.curve\",\"san.dns\",\"san.uri\",\"san.email\",\"san.ip\",\"basic_constraints.ca\",\"basic_constraints.path_len\"]\n\t\n\t\t #If you use a custom delimiter, change the following value in between the quotes to your delimiter. 
Otherwise, leave the next line alone.\n\t\t separator => \"\t\"\n\t\t}\n\t\n\t\t#Let's convert our timestamp into the 'ts' field, so we can use Kibana features natively\n\t\tdate {\n\t\t match => [ \"ts\", \"UNIX\" ]\n\t\t}\n\t\n\t }\n\t\n\tif [type]== \"bro_intel\" {\n\t grok {\n\t\tmatch => [ \"message\", \"(?<ts>(.*?))\\t%{DATA:uid}\\t(?<id.orig_h>(.*?))\\t(?<id.orig_p>(.*?))\\t(?<id.resp_h>(.*?))\\t(?<id.resp_p>(.*?))\\t%{DATA:fuid}\\t%{DATA:file_mime_type}\\t%{DATA:file_desc}\\t(?<seen.indicator>(.*?))\\t(?<seen.indicator_type>(.*?))\\t(?<seen.where>(.*?))\\t%{NOTSPACE:sources}\" ]\n\t }\n }\n }\n date {\n\tmatch => [ \"ts\", \"UNIX\" ]\n }\n}\n\nfilter {\n if \"bro\" in [type] {\n\tif [id.orig_h] {\n\t mutate {\n\t\tadd_field => [ \"senderbase_lookup\", \"http://www.senderbase.org/lookup/?search_string=%{id.orig_h}\" ]\n\t\tadd_field => [ \"CBL_lookup\", \"http://cbl.abuseat.org/lookup.cgi?ip=%{id.orig_h}\" ]\n\t\tadd_field => [ \"Spamhaus_lookup\", \"http://www.spamhaus.org/query/bl?ip=%{id.orig_h}\" ]\n\t }\n\t}\n\tmutate {\n\t add_tag => [ \"bro\" ]\n\t}\n\tmutate {\n\t convert => [ \"id.orig_p\", \"integer\" ]\n\t convert => [ \"id.resp_p\", \"integer\" ]\n\t convert => [ \"orig_bytes\", \"integer\" ]\n\t convert => [ \"resp_bytes\", \"integer\" ]\n\t convert => [ \"missed_bytes\", \"integer\" ]\n\t convert => [ \"orig_pkts\", \"integer\" ]\n\t convert => [ \"orig_ip_bytes\", \"integer\" ]\n\t convert => [ \"resp_pkts\", \"integer\" ]\n\t convert => [ \"resp_ip_bytes\", \"integer\" ]\n\t}\n }\n}\n\nfilter {\n if [type] == \"bro_conn\" {\n\t#The following makes use of the translate filter (stash contrib) to convert conn_state into human text. Saves having to look up values for packet introspection\n\ttranslate {\n\t field => \"conn_state\"\n\t destination => \"conn_state_full\"\n\t dictionary => [ \n\t\t\"S0\", \"Connection attempt seen, no reply\",\n\t\t\"S1\", \"Connection established, not terminated\",\n\t\t\"S2\", \"Connection established and close attempt by originator seen (but no reply from responder)\",\n\t\t\"S3\", \"Connection established and close attempt by responder seen (but no reply from originator)\",\n\t\t\"SF\", \"Normal SYN/FIN completion\",\n\t\t\"REJ\", \"Connection attempt rejected\",\n\t\t\"RSTO\", \"Connection established, originator aborted (sent a RST)\",\n\t\t\"RSTR\", \"Established, responder aborted\",\n\t\t\"RSTOS0\", \"Originator sent a SYN followed by a RST, we never saw a SYN-ACK from the responder\",\n\t\t\"RSTRH\", \"Responder sent a SYN ACK followed by a RST, we never saw a SYN from the (purported) originator\",\n\t\t\"SH\", \"Originator sent a SYN followed by a FIN, we never saw a SYN ACK from the responder (hence the connection was 'half' open)\",\n\t\t\"SHR\", \"Responder sent a SYN ACK followed by a FIN, we never saw a SYN from the originator\",\n\t\t\"OTH\", \"No SYN seen, just midstream traffic (a 'partial connection' that was not later closed)\" \n\t ]\n\t}\n }\n}\n# Resolve #source_host to FQDN if possible if missing for some types of ging using source_host_ip from above\nfilter {\n if [id.orig_h] {\n\tif ![id.orig_h-resolved] {\n\t mutate {\n\t\tadd_field => [ \"id.orig_h-resolved\", \"%{id.orig_h}\" ]\n\t }\n\t dns {\n\t\treverse => [ \"id.orig_h-resolved\" ]\n\t\taction => \"replace\"\n\t }\n\t}\n }\n}\nfilter {\n if [id.resp_h] {\n\tif ![id.resp_h-resolved] {\n\t mutate {\n\t\tadd_field => [ \"id.resp_h-resolved\", \"%{id.resp_h}\" ]\n\t }\n\t dns {\n\t\treverse => [ \"id.resp_h-resolved\" ]\n\t\taction => \"replace\"\n\t }\n\t}\n 
}\n}\n\noutput {\n elasticsearch {\n hosts => [\"localhost:9200\"]\n #sniffing => true\n manage_template => false\n index => \"%{[#metadata][beat]}-%{+YYYY.MM.dd}\"\n document_type => \"%{[#metadata][type]}\"\n }\n}\n\n", :reason=>"Expected one of #, input, filter, output at line 158, column 3 (byte 8746) after "}
To the best of my ability I've reviewed my Logstash configuration and I can't see any errors. Can anyone help me figure out what's wrong with it?
I'm running
logstash.noarch 1:5.0.0-1 @elasticsearch
elasticsearch.noarch 5.0.0-1 @elasticsearch
Many thanks
If you match the opening curly brace at the top of 20-bro-ids-filter.conf, you'll see it pairs with the closing curly brace just before your date{} stanza. That puts date{} outside the filter{}, generating the message that Logstash expects input{}, filter{}, or output{}.
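In other words, the tail of 20-bro-ids-filter.conf should look roughly like this, with one closing brace removed so the date{} stanza stays inside filter{} (the bro_intel grok pattern is unchanged and elided here):
  if [type] == "bro_intel" {
    grok {
      match => [ "message", "..." ]   # same pattern as above
    }
  }
  date {
    match => [ "ts", "UNIX" ]
  }
}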
