Logstash ignores(?) one grok filter

This is my config:
input {
  beats {
    port => 55556
  }
}
filter {
  if "iis" in [tags] {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:log_timestamp} %{WORD:S-SiteName} %{WORD:S-ComputerName} %{IPORHOST:S-IP} %{WORD:CS-Method} %{URIPATH:CS-URI-Stem} %{NOTSPACE:CS-URI-QUERY} %{NUMBER:S-Port}$ %{NUMBER:SC-Win32-Status} %{NUMBER:SC-Bytes} %{NUMBER:CS-Bytes} %{NUMBER:Time-Taken}"}
    }
  }
}
filter {
  if "nx" in [tags] {
    grok {
      match => { "message" => "\[%{TIMESTAMP_ISO8601:log_timestamp}\] (?<LogLevel>\[\w+\s*\]) (?<thread>\[\s*\d*\]) (?<snName>\[\w*\]) (?<snId>\[\d*\s*\d*\]) %{GREEDYDATA:message}"}
    }
  }
}
output {
  if "nx" in [tags] {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "nx-%{+YYYY.ww}"
      user => "user"
      password => "pass"
    }
  }
  if "iis" in [tags] {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "iis-%{+YYYY.ww}"
      user => "user"
      password => "pass"
    }
  }
}
Here's a log sample:
[2018-02-18 15:19:04.668] [INFO ] [ 155] [AliveReportCommand] [875076019 53033] - ProcessRequest Ended: elapsed time=00:00:00.0967851, _parser.Device.IsSuccess=False
[2018-02-18 15:25:32.716] [DEBUG] [ 181] [] [] - Web Facade called: streamIDParam=-1
This log corresponds to the "nx" logs.
For some reason it just does not get ingested. I ran a grok simulation with the pattern above against that log line and it matched fine. However, it's as if the "nx" filter is completely ignored. Even if I remove the "iis" filter, no data shows up in Kibana, and there are no errors whatsoever from Logstash. I have the same setup for both inputs in Filebeat:
- type: log
  enabled: true
  paths: c:\logs\*\*.log
  exclude_files: ['mybeat.*']
  tags: ["nx"]
  multiline.pattern: '^\['
  multiline.negate: true
  multiline.match: after
- type: log
  enabled: true
  paths: C:\inetpub\logs\LogFiles\*\*.log
  tags: ["iis"]

I think you should use only one filter block. In your config you used the filter keyword twice:
filter {
  if "iis" in [tags] {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:log_timestamp} %{WORD:S-SiteName} %{WORD:S-ComputerName} %{IPORHOST:S-IP} %{WORD:CS-Method} %{URIPATH:CS-URI-Stem} %{NOTSPACE:CS-URI-QUERY} %{NUMBER:S-Port}$ %{NUMBER:SC-Win32-Status} %{NUMBER:SC-Bytes} %{NUMBER:CS-Bytes} %{NUMBER:Time-Taken}"}
    }
  }
  if "nx" in [tags] {
    grok {
      match => { "message" => "\[%{TIMESTAMP_ISO8601:log_timestamp}\] (?<LogLevel>\[\w+\s*\]) (?<thread>\[\s*\d*\]) (?<snName>\[\w*\]) (?<snId>\[\d*\s*\d*\]) %{GREEDYDATA:message}"}
    }
  }
}

I feel so embarrassed...
The problem wasn't with the filter to begin with.
It was the time field selected for the index pattern in Kibana.
After selecting the right field, the data showed up immediately.
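In case anyone hits the same issue: a date filter that maps the parsed log_timestamp into @timestamp avoids depending on which time field the Kibana index pattern uses. A minimal sketch, assuming the log_timestamp field produced by the "nx" grok above:
filter {
  if "nx" in [tags] {
    date {
      # "log_timestamp" looks like "2018-02-18 15:19:04.668"
      match => [ "log_timestamp", "yyyy-MM-dd HH:mm:ss.SSS" ]
      target => "@timestamp"
    }
  }
}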

Related

Logs are overwritten in the specified index under the same _id

I'm using Filebeat 6.5.1, Logstash 6.5.1 and Elasticsearch 6.5.1.
I'm using multiple grok filters in a single config file and trying to send the logs into Elasticsearch.
Below is my Filebeat.yml
filebeat.prospectors:
- type: log
  paths:
    - var/log/message
  fields:
    type: apache_access
  tags: ["ApacheAccessLogs"]
- type: log
  paths:
    - var/log/indicate
  fields:
    type: apache_error
  tags: ["ApacheErrorLogs"]
- type: log
  paths:
    - var/log/panda
  fields:
    type: mysql_error
  tags: ["MysqlErrorLogs"]
output.logstash:
  # The Logstash hosts
  hosts: ["logstash:5044"]
Below is my logstash config file -
input {
  beats {
    port => 5044
    tags => [ "ApacheAccessLogs", "ApacheErrorLogs", "MysqlErrorLogs" ]
  }
}
filter {
  if "ApacheAccessLogs" in [tags] {
    grok {
      match => [
        "message" , "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}",
        "message" , "%{COMMONAPACHELOG}+%{GREEDYDATA:extra_fields}"
      ]
      overwrite => [ "message" ]
    }
    mutate {
      convert => ["response", "integer"]
      convert => ["bytes", "integer"]
      convert => ["responsetime", "float"]
    }
    geoip {
      source => "clientip"
      target => "geoip"
      add_tag => [ "apache-geoip" ]
    }
    date {
      match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
      remove_field => [ "timestamp" ]
    }
    useragent {
      source => "agent"
    }
  }
  if "ApacheErrorLogs" in [tags] {
    grok {
      match => { "message" => ["[%{APACHE_TIME:[apache2][error][timestamp]}] [%{LOGLEVEL:[apache2][error][level]}]( [client %{IPORHOST:[apache2][error][client]}])? %{GREEDYDATA:[apache2][error][message]}",
        "[%{APACHE_TIME:[apache2][error][timestamp]}] [%{DATA:[apache2][error][module]}:%{LOGLEVEL:[apache2][error][level]}] [pid %{NUMBER:[apache2][error][pid]}(:tid %{NUMBER:[apache2][error][tid]})?]( [client %{IPORHOST:[apache2][error][client]}])? %{GREEDYDATA:[apache2][error][message1]}" ] }
      pattern_definitions => {
        "APACHE_TIME" => "%{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}"
      }
      remove_field => "message"
    }
    mutate {
      rename => { "[apache2][error][message1]" => "[apache2][error][message]" }
    }
    date {
      match => [ "[apache2][error][timestamp]", "EEE MMM dd H:m:s YYYY", "EEE MMM dd H:m:s.SSSSSS YYYY" ]
      remove_field => "[apache2][error][timestamp]"
    }
  }
  if "MysqlErrorLogs" in [tags] {
    grok {
      match => { "message" => ["%{LOCALDATETIME:[mysql][error][timestamp]} ([%{DATA:[mysql][error][level]}] )?%{GREEDYDATA:[mysql][error][message]}",
        "%{TIMESTAMP_ISO8601:[mysql][error][timestamp]} %{NUMBER:[mysql][error][thread_id]} [%{DATA:[mysql][error][level]}] %{GREEDYDATA:[mysql][error][message1]}",
        "%{GREEDYDATA:[mysql][error][message2]}"] }
      pattern_definitions => {
        "LOCALDATETIME" => "[0-9]+ %{TIME}"
      }
      remove_field => "message"
    }
    mutate {
      rename => { "[mysql][error][message1]" => "[mysql][error][message]" }
    }
    mutate {
      rename => { "[mysql][error][message2]" => "[mysql][error][message]" }
    }
    date {
      match => [ "[mysql][error][timestamp]", "ISO8601", "YYMMdd H:m:s" ]
      remove_field => "[apache2][access][time]"
    }
  }
}
output {
  if "ApacheAccessLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
      document_id => "apacheaccess"
    }
  }
  if "ApacheErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
      document_id => "apacheerror"
    }
  }
  if "MysqlErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache"
      document_id => "sqlerror"
    }
  }
  stdout { codec => rubydebug }
}
The data is sent to Elasticsearch, but only 3 records are created, one per document_id, in the same index.
Every new incoming log is overwritten onto the same document_id and the old one is lost.
Can you guys please help me out?
The purpose of document_id is to provide a unique document id for each event. In your case, since the ids are static (apacheaccess, apacheerror, sqlerror), there will be only one document per id in Elasticsearch, overridden by the newest event each time.
As you have 3 distinct data types, what you seem to be looking for is a different index for each event type (ApacheAccessLogs, ApacheErrorLogs, MysqlErrorLogs), as follows:
output {
  if "ApacheAccessLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache-access"
    }
  }
  if "ApacheErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache-error"
    }
  }
  if "MysqlErrorLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "mysql-error"
    }
  }
  stdout {
    codec => rubydebug
  }
}
There are not many cases where you need to set the id manually (e.g. when reingesting data), as Logstash and Elasticsearch will manage that by themselves.
But if that is your case and you can't use a field to identify each event individually, you could use the logstash-filter-fingerprint plugin, which is made for exactly that.
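For completeness, a minimal sketch of the fingerprint approach; the target field, method, and key below are illustrative choices, and it assumes the raw message line identifies an event well enough to deduplicate on:
filter {
  fingerprint {
    # hash the raw log line into a stable, reproducible id
    source => ["message"]
    target => "[@metadata][fingerprint]"
    method => "SHA256"
    key => "any-static-key"
  }
}
output {
  if "ApacheAccessLogs" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "apache-access"
      document_id => "%{[@metadata][fingerprint]}"
    }
  }
}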

Add log4net Level field to logstash.conf file

I'm trying to add a LEVEL field (so it shows up in Kibana).
Input:
2018-03-18 15:43:40.7914 - INFO: Tick
2018-03-18 15:43:40.7914 - ERROR: Tock
My logstash.conf file:
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => {
      "message" => "(?m)^%{TIMESTAMP_ISO8601:timestamp}~~\[%{DATA:thread}\]~~\[%{DATA:user}\]~~\[%{DATA:requestId}\]~~\[%{DATA:userHost}\]~~\[%{DATA:requestUrl}\]~~%{DATA:level}~~%{DATA:logger}~~%{DATA:logmessage}~~%{DATA:exception}\|\|"
    }
    match => {
      "levell" => "(?m)^%{DATA:level}"
    }
    add_field => {
      "received_at" => "%{@timestamp}"
      "received_from" => "%{host}"
      "level" => "levell"
    }
    remove_field => ["message"]
  }
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss:SSS" ]
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    sniffing => true
    index => "filebeat-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
    #user => "elastic"
    #password => "changeme"
  }
  stdout { codec => rubydebug }
}
This prints out "levell" as the level instead of "INFO"/"ERROR" etc.
EDIT:
Input:
2018-03-18 15:43:40.7914 - INFO: Tick
configuration:
# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "(?m)^%{TIMESTAMP_ISO8601:timestamp}~~\[%{DATA:thread}\]~~\[%{DATA:user}\]~~\[%{DATA:requestId}\]~~\[%{DATA:userHost}\]~~\[%{DATA:requestUrl}\]~~%{DATA:level}~~%{DATA:logger}~~%{DATA:logmessage}~~%{DATA:exception}\|\|" }
    add_field => {
      "received_at" => "%{@timestamp}"
      "received_from" => "%{host}"
    }
  }
  grok {
    match => { "message" => "- %{LOGLEVEL:level}" }
    remove_field => ["message"]
  }
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss:SSS" ]
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    sniffing => true
    index => "filebeat-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
    #user => "elastic"
    #password => "changeme"
  }
  stdout { codec => rubydebug }
}
The output I'm getting is still missing received_at and level.
In that part of the configuration:
add_field => {
  "received_at" => "%{@timestamp}"
  "received_from" => "%{host}"
  "level" => "levell"
}
When using "level" => "levell", you just put the string levell into the field level. To put in the value of the field named levell, you have to use %{levell}. So in your case, it would look like:
add_field => {
  "received_at" => "%{@timestamp}"
  "received_from" => "%{host}"
  "level" => "%{levell}"
}
Also, about grok's match option, according to the documentation:
A hash that defines the mapping of where to look, and with which patterns.
So trying to match on the levell field won't work, since it looks like that field doesn't exist yet. And the grok pattern you're using to match the message field doesn't match the example you provided.
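For reference, here is a minimal sketch of a filter that would match the sample lines shown above ("2018-03-18 15:43:40.7914 - INFO: Tick"); the field names and the date format are illustrative assumptions:
filter {
  grok {
    # e.g. "2018-03-18 15:43:40.7914 - INFO: Tick"
    match => { "message" => "^%{TIMESTAMP_ISO8601:timestamp} - %{LOGLEVEL:level}: %{GREEDYDATA:logmessage}" }
    add_field => {
      "received_at" => "%{@timestamp}"
      "received_from" => "%{host}"
    }
  }
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss.SSSS" ]
  }
}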

Grok help for a custom metric

I have a log line like this:
09 Nov 2018 15:51:35 DEBUG api.MapAnythingProvider - Calling API For Client: XXX Number of ELEMENTS Requested YYY
I want to ignore all other log lines and only keep those that contain the words "Calling API For Client". Further, I am only interested in the string XXX and the number YYY.
Thanks for the help.
input {
  file {
    path => ["C:/apache-tomcat-9.0.7/logs/service/service.log"]
    sincedb_path => "nul"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => {
      "message" => "%{MONTHDAY:monthDay} %{MONTH:mon} %{YEAR:year} %{TIME:ts} %{WORD:severity} %{JAVACLASS:claz} - %{GREEDYDATA:logmessage}"
    }
  }
  grok {
    match => {
      "logmessage" => "%{WORD:keyword} %{WORD:customer} %{WORD:key2} %{NUMBER:mapAnythingCreditsConsumed:float} %{WORD:key3} %{NUMBER:elementsFromCache:int}"
    }
  }
  if "_grokparsefailure" in [tags] {
    drop {}
  }
  mutate {
    remove_field => [ "monthDay", "mon", "ts", "severity", "claz", "keyword", "key2", "path", "message", "year", "key3" ]
  }
}
output {
  if [logmessage] =~ /ExecutingJobFor/ {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "test"
      manage_template => false
    }
    stdout {
      codec => rubydebug
    }
  }
}
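If the goal is strictly the "Calling API For Client" lines, a more direct sketch is below; the pattern and field names are illustrative and assume the client id is a single word, as in the sample line:
filter {
  grok {
    # matches e.g. "... - Calling API For Client: XXX Number of ELEMENTS Requested YYY"
    match => {
      "message" => "Calling API For Client: %{WORD:client} Number of ELEMENTS Requested %{NUMBER:elementsRequested:int}"
    }
  }
  if "_grokparsefailure" in [tags] {
    # every line that is not a "Calling API For Client" line is dropped
    drop {}
  }
}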

how to assign particular value from logstash output

In Logstash I am getting the output below; however, I only want to get repo: "username/logstashrepo" out of it. Please share your thoughts on how to extract only that value and assign it to a variable.
message: "github_audit: {"actor_ip":"192.168.1.1","from":"repositories#create","actor":"username","repo":"username/logstashrepo","user":"username","created_at":1416299104782,"action":"repo.create","user_id":1033,"repo_id":44744,"actor_id":1033,"data":{"actor_location":{"location":{"lat":null,"lon":null}}}}",
#version: "1",
#timestamp: "2014-11-18T08:25:05.427Z",
host: "15-274-145-63",
type: "syslog",
syslog5424_pri: "190",
timestamp: "Nov 18 00:25:05",
actor_ip: "192.168.1.1",
from: "repositories#create",
actor: "username",
repo: "username/logstashrepo",
user: "username",
created_at: 1416299104782,
action: "repo.create",
user_id: 1033,
repo_id: 44744,
actor_id: 1033,
I am using this in my config file:
input {
  tcp {
    port => 8088
    type => syslog
  }
  udp {
    port => 8088
    type => syslog
  }
}
filter {
  grok {
    match => [
      "message",
      "%{SYSLOG5424PRI}%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:host} %{GREEDYDATA:message}"
    ]
    overwrite => ["host", "message"]
  }
  if [message] =~ /^github_audit: / {
    grok {
      match => ["message", "^github_audit: %{GREEDYDATA:json_payload}"]
    }
    json {
      source => "json_payload"
      remove_field => "json_payload"
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
I actually posted the question here; for some reason I can't edit it and follow up:
how to grep particular field from logstash output
You can have the json filter store the expanded JSON object in a subfield, then use the mutate filter to move the "repo" field to the top level and delete the whole subfield. A partial example from the json filter onwards:
json {
  source => "json_payload"
  target => "json"
  remove_field => "json_payload"
}
mutate {
  rename => ["[json][repo]", "repo"]
  remove_field => "json"
}

how to grep particular field from logstash output

I am trying to grep only a few fields from this Logstash output: 1. repositories#create 2. \"repo\":\"username/reponame\". Please share your ideas on how to extract particular info from this output and assign it to another variable.
"message" => "<190>Nov 01 20:35:15 10-254-128-66 github_audit: {\"actor_ip\":\"192.168.1.1\",\"from\":\"repositories#create\",\"actor\":\"myuserid\",\"repo\":\"username/reponame\",\"action\":\"staff.repo_route\",\"created_at\":1516286634991,\"repo_id\":44743,\"actor_id\":1033,\"data\":{\"actor_location\":{\"location\":{\"lat\":null,\"lon\":null}}}}",
I am using this syslog.conf file to get the output.
input {
  tcp {
    port => 8088
    type => syslog
  }
  udp {
    port => 8088
    type => syslog
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp}" }
    }
    grep {
      match => { "message" => "repositories#create" }
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
I am not able to add comments to your reply; thank you so much for it.
Could you please share your ideas on how to get only username: and repo: from this output? I'm trying to assign the values from this particular output. Thanks again.
message: "github_audit: {"actor_ip":"192.168.1.1","from":"repositories#create","actor":"username","repo":"username/logstashrepo","user":"username","created_at":1416299104782,"action":"repo.create","user_id":1033,"repo_id":44744,"actor_id":1033,"data":{"actor_location":{"location":{"lat":null,"lon":null}}}}",
#version: "1",
#timestamp: "2014-11-18T08:25:05.427Z",
host: "15-274-145-63",
type: "syslog",
syslog5424_pri: "190",
timestamp: "Nov 18 00:25:05",
actor_ip: "10.239.37.185",
from: "repositories#create",
actor: "username",
repo: "username/logstashrepo",
user: "username",
created_at: 1416299104782,
action: "repo.create",
user_id: 1033,
repo_id: 44744,
actor_id: 1033,
Use a grok filter to extract the JSON payload into a separate field, then use a json filter to extract the fields from the JSON object. The example below works, but it only extracts the JSON payload from messages prefixed with "github_audit: ". I'm also guessing that the field after the timestamp is a hostname that should overwrite whatever might currently be in the "host" field. Don't forget to add a date filter to parse the string in the "timestamp" field into "@timestamp".
filter {
  grok {
    match => [
      "message",
      "%{SYSLOG5424PRI}%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:host} %{GREEDYDATA:message}"
    ]
    overwrite => ["host", "message"]
  }
  if [message] =~ /^github_audit: / {
    grok {
      match => ["message", "^github_audit: %{GREEDYDATA:json_payload}"]
    }
    json {
      source => "json_payload"
      remove_field => "json_payload"
    }
  }
}
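A minimal sketch of that date filter (the syslog timestamp in the sample, e.g. "Nov 18 00:25:05", carries no year, so a month-day-time pattern is assumed):
date {
  # "timestamp" was captured by SYSLOGTIMESTAMP above, e.g. "Nov 18 00:25:05"
  match => [ "timestamp", "MMM dd HH:mm:ss", "MMM  d HH:mm:ss" ]
  remove_field => ["timestamp"]
}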
