Nginx access log with grok pattern - logstash

My nginx log format is as follows. Certain log lines will have empty values (for example, no request_time, http_x_forwarded_for or http_host), and in that case the grok pattern does not work: the message body is not split into separate fields but stays as one complete message. Is there any way to skip the empty values?
For example, if $request_time is empty, skip this field and proceed with the others.
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$request_time" "$http_x_forwarded_for" $http_host ';
grok {
  match => { "message" => "%{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] \"%{WORD:method}%{DATA:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent:bytes} (?:\"(?:%{DATA:referrer}|-))\" \"%{DATA:agent}\" (?:\"(?:%{NUMBER:request_time}|-))\" (?:\"(?:%{DATA:http_x_forwarded_for}|-))\" (?:%{IPORHOST:http_host}) " }
  remove_field => "message"
}
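One way this is usually handled (a sketch on my part, not an answer taken from this thread) is to mark the fields that may be empty as optional in the pattern, so an empty "" or a "-" still matches and the field is simply left unset. The field names below come from the question; the exact grouping is my assumption:
# sketch: request_time may be a number, "-" or empty; the trailing http_host is optional
grok {
  match => { "message" => "%{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] \"%{WORD:method} %{DATA:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent:bytes} \"(?:%{DATA:referrer}|-)\" \"%{DATA:agent}\" \"(?:%{NUMBER:request_time}|-)?\" \"%{DATA:http_x_forwarded_for}\"(?: (?:%{IPORHOST:http_host}|-))?" }
  remove_field => "message"
}
With the optional groups, lines that are missing request_time or http_host still parse instead of falling back to the raw message.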

Related

Logstash - Grok Syntax Issues

I'm using Filebeat to send logs to Logstash, but I'm having issues with the grok syntax in Logstash. I used the Grok Debugger in Kibana and managed to come to a solution.
The problem is that I can't find the equivalent syntax for Logstash.
The original log:
{"log":"188.188.188.188 - tgaro [22/Aug/2022:11:37:54 +0200] \"PROPFIND /remote.php/dav/files/xxx#yyyy.com/ HTTP/1.1\" 207 1035 \"-\" \"Mozilla/5.0 (Windows) mirall/2.6.1stable-Win64 (build 20191105) (Nextcloud)\"\n","stream":"stdout","time":"2022-08-22T09:37:54.782377901Z"}
The message received in Logstash:
"message" => "{\"log\":\"188.188.188.188 - tgaro [22/Aug/2022:11:37:54 +0200] \\\"PROPFIND /remote.php/dav/files/xxx#yyyy.com/ HTTP/1.1\\\" 207 1035 \\\"-\\\" \\\"Mozilla/5.0 (Windows) mirall/2.6.1stable-Win64 (build 20191105) (Nextcloud)\\\"\\n\",\"stream\":\"stdout\",\"time\":\"2022-08-22T09:37:54.782377901Z\"}",
The grok pattern I used in the Grok Debugger (Kibana):
{\\"log\\":\\"%{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] \\\\\\"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\\\\\\" (?:-|%{NUMBER:response}) (?:-|%{NUMBER:bytes}) \\\\\\("%{DATA:referrer}\\\\\\") \\\\\\"%{DATA:user-agent}\\\\\\"
The real problem is that I can't even manage to get the IP (188.188.188.188).
I tried:
match => { "message" => '{\\"log\\":\\"%{IPORHOST:clientip}' # backslash to escape the backslash
match => { "message" => '{\\\"log\\\":\\\"%{IPORHOST:clientip}' # backslash to escape the quote
match => { "message" => "{\\\"log\\\":\\\"%{IPORHOST:clientip}" # backslash to escape the quote
Help would be appreciated.
Thanks!
P.S.: The log used here is shortened. The real log is a mix of JSON and plain strings, so I can't send it as JSON from Filebeat.
OK, so I managed to make it work by using this:
grok {
  match => { "message" => '%{SYSLOGTIMESTAMP:syslog_timestamp} %{IPORHOST:syslog_server} %{WORD:syslog_tag}: %{GREEDYDATA:jsonMessage}' }
}
json {
  source => "jsonMessage"
}
grok {
  match => { "jsonMessage" => '%{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] \\"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\\" (?:-|%{NUMBER:response}) (?:-|%{NUMBER:bytes}) \\("%{DATA:referrer}\\") \\"%{DATA:user-agent}\\"' }
}
with a log like this:
Aug 24 00:00:01 hostname containers: {"log":"188.188.188.188 - user.name#things.com [23/Aug/2022:23:59:52 +0200] \"PROPFIND /remote.php/dav/files/ HTTP/1.1\" 207 1159 \"-\" \"Mozilla/5.0 (Linux) mirall/3.4.2-1ubuntu1 (Nextcloud, ubuntu-5.15.0-46-generic ClientArchitecture: x86_64 OsArchitecture: x86_64)\"\n","stream":"stdout","time":"2022-08-23T21:59:52.612843092Z"}
The first grok fetches the first three fields (time, hostname and tag), and then captures everything after the ':' into jsonMessage with the GREEDYDATA pattern.
Then the json filter is applied to jsonMessage. From there, the information we need is in the new log field created by the json filter.
I still don't understand why my grok works in the Kibana debugger but not in Logstash. It's probably because some characters need to be escaped, but even when I escaped them it didn't work.
@RaiZy_Style has already extracted the JSON using the json filter and is trying to match the jsonMessage field with a grok filter.
I used the Grok Debugger to create a grok pattern that will match the jsonMessage field.
The jsonMessage that I assume results from the json filter for the log example above is:
188.188.188.188 - user.name#things.com [23/Aug/2022:23:59:52 +0200] \\"PROPFIND /remote.php/dav/files/ HTTP/1.1\\" 207 1159 \\"-\\" \\"Mozilla/5.0 (Linux) mirall/3.4.2-1ubuntu1 (Nextcloud, ubuntu-5.15.0-46-generic ClientArchitecture: x86_64 OsArchitecture: x86_64)\\"\\n"
Here is pattern that will work:
%{IPORHOST:clientip} %{USER:ident} %{DATA:auth} \[%{HTTPDATE:timestamp}\] \\\\"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \\\\"
Note: If you want to expand the rawrequest field value into individual fields, that can be done as well; see the sketch below.
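A rough sketch (my assumption, not from the original answer) of a second grok that splits rawrequest into verb, request and httpversion whenever it was captured:
# only run when the fallback rawrequest field exists
if [rawrequest] {
  grok {
    match => { "rawrequest" => "%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?" }
  }
}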

Filebeat: How to remove log if some key or value exist?

filebeat version 7.17.3
I have 3 different logs, for example:
{"level":"debug","message":"Start proxy checking","module":"proxy","timestamp":"2022-05-18 23:22:15 +0200"}
{"level":"info","message":"Attempt to get proxy","module":"proxy","timestamp":"2022-05-18 23:22:17 +0200"}
{"campaign":"18","level":"warn","message":"Missed or empty list","module":"loader","session":"pYpifim","timestamp":"2022-05-18 23:27:46 +0200"}
How is it possible to not forward / to filter out the log to Logstash or Elasticsearch if level equals "info"?
How is it possible to not forward / to filter out the log if the key campaign does not exist?
In Filebeat I have the following:
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
  - decode_json_fields:
      fields: ["message"]
      process_array: false
      max_depth: 1
      target: ""
      overwrite_keys: true
      add_error_key: true
  - drop_fields:
      fields: ["agent", "host", "log", "ecs", "input", "location"]
But with drop_fields I can only remove some fields, while I need to not save the log at all if a certain key or value exists!
In Logstash, deleting those events is no problem (see below), but how can I do this in Filebeat?
/etc/logstash/conf.d/40-filebeat-to-logstash.conf
input {
  beats {
    port => 5044
    include_codec_tag => false
  }
}
filter {
  if "Start proxy checking" in [message] {
    drop { }
  }
  if "Attempt to get proxy" in [message] {
    drop { }
  }
}
output {
  elasticsearch {
    hosts => ["http://xxx.xxx.xxx.xxx:9200"]
    # index => "myindex"
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+yyyy.MM.dd}"
  }
}
Thank you in advance.
In Filebeat there is a drop_event processor:
processors:
  - drop_event:
      when:
        condition
https://www.elastic.co/guide/en/beats/filebeat/7.17/drop-event.html
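For the two cases in the question, a sketch could look like this (assuming level and campaign end up at the event root after decode_json_fields; the exact field paths are an assumption on my part):
processors:
  # drop events whose level is exactly "info"
  - drop_event:
      when:
        equals:
          level: "info"
  # drop events that do not contain a campaign field at all
  - drop_event:
      when:
        not:
          has_fields: ["campaign"]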

How to use a HELM-3 value in a multi-line string

I have an nginx ConfigMap YAML file which is then mounted as nginx.conf. This ConfigMap contains the config in a multi-line string starting like the following:
data:
  nginx.conf: |
    worker_processes auto;
    pid /tmp/nginx.pid;
    :
and in this multi-line string I'd like to inject a value from the values.yaml, for example:
data:
  nginx.conf: |
    worker_processes auto;
    pid /tmp/nginx.pid;
    :
    :
    log_format combined '$remote_addr - $remote_user [$time_local] '
                        '"$request" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent"';
    access_log /var/log/nginx/access.log {{ .Values.nginx.log.format | default "combined" | quote }};
    error_log /var/log/nginx/error.log;
    :
but using the syntax above I'm getting an unexpected EOF error. Any way/work-around to get this done?
Well, I would suggest avoiding this. Instead, you can create a folder such as configuration, sibling to your templates folder, and create a file named nginx.conf in it with your entire string content, which of course can contain the required placeholders. Then, in the ConfigMap definition, you can pull in your file like this:
data:
{{ (.Files.Glob "configuration/nginx.conf").AsConfig | indent 2 }}
This would create your ConfigMap.
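One caveat (my addition, not part of the original answer): files read via .Files are not rendered as templates, so if configuration/nginx.conf itself contains placeholders such as {{ .Values.nginx.log.format }}, a common workaround is to pass the file through tpl, roughly like this:
data:
  nginx.conf: |-
{{ tpl (.Files.Get "configuration/nginx.conf") . | indent 4 }}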

_grokparsefailure Tag in all parsed logs with multiple grok filter

I am trying to parse a Minecraft log with the Elastic Stack and I've faced a very strange problem (likely only strange for me!).
All lines of my log get parsed correctly, but I get the _grokparsefailure tag in every one of them.
My Logstash pipeline config is this:
input {
  file {
    path => [ "/path/to/my/log" ]
    #start_position => "beginning"
    tags => ["minecraft"]
  }
}
filter {
  if "minecraft" in [tags] {
    # mutate {
    #   gsub => [
    #     "message", "\\n", ""
    #   ]
    # }
    #############################
    #           Num 1           #
    #############################
    grok {
      match => [ "message", "\[%{TIME:timestamp}] \[(?<originator>[^\/]+)?/%{LOGLEVEL:level}]: %{GREEDYDATA:message}" ]
      overwrite => [ "message" ]
      break_on_match => false
    }
    #############################
    #           Num 2           #
    #############################
    grok {
      match => [ "message", "UUID of player %{USERNAME} is %{UUID}" ]
      add_tag => [ "player", "uuid" ]
      break_on_match => true
    }
    #############################
    #           Num 3           #
    #############################
    grok {
      match => [ "message", "\A(?<player>[a-zA-Z0-9_]+)\[/%{IPV4:ip_address}:%{POSINT}\] logged in with entity id %{POSINT:entity_id} at \(\[(?<world>[a-zA-Z]+)\](?<pos>[^\)]+)\)\Z" ]
      add_tag => [ "player", "join" ]
      break_on_match => true
    }
    #
    # grok {
    #   match => [ "message", "^(?<player>[a-zA-Z0-9_]+) has just earned the achievement \[(?<achievement>[^\[]+)\]$" ]
    #   add_tag => [ "player", "achievement" ]
    # }
    #
    # grok {
    #   match => [ "message", "^(?<player>[a-zA-Z0-9_]+) left the game$" ]
    #   add_tag => [ "player", "part" ]
    # }
    #
    # grok {
    #   match => [ "message", "^<(?<player>[a-zA-Z0-9_]+)> .*$" ]
    #   add_tag => [ "player", "chat" ]
    # }
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch:xxxx"]
    user => "xxxx"
    password => "xxxxxx"
    index => "minecraft_s1v15_%{+YYYY.MM.dd}"
  }
}
And a sample of my log is:
[11:21:46] [User Authenticator #7/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[11:21:46] [Server thread/INFO]: MyAwsomeUsername joined the game
[11:21:46] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45140] logged in with entity id 6868 at ([world]61.45686149445207, 70.9375, -175.44700729217607)
[11:21:49] [Server thread/INFO]: MyAwsomeUsername issued server command: //efererg
[11:21:52] [Async Chat Thread - #1/INFO]: <MyAwsomeUsername> egerg
[11:21:54] [Async Chat Thread - #1/INFO]: <MyAwsomeUsername> ef
[12:00:19] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:19] [Server thread/INFO]: MyAwsomeUsername left the game
[12:00:21] [User Authenticator #8/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[12:00:21] [Server thread/INFO]: MyAwsomeUsername joined the game
[12:00:21] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45470] logged in with entity id 11767 at ([world]61.45686149445207, 70.9375, -175.44700729217607)
[12:00:27] [Server thread/INFO]: MyAwsomeUsername issued server command: /wgergerger
[12:00:29] [Async Chat Thread - #2/INFO]: <MyAwsomeUsername> gerg
[12:00:33] [Async Chat Thread - #2/INFO]: <MyAwsomeUsername> gerger
[12:00:35] [Async Chat Thread - #2/INFO]: <MyAwsomeUsername> rerg
[12:00:37] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:37] [Server thread/INFO]: MyAwsomeUsername left the game
[12:00:38] [User Authenticator #8/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[12:00:38] [Server thread/INFO]: MyAwsomeUsername joined the game
[12:00:38] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45476] logged in with entity id 11793 at ([world]62.97573252632079, 71.0, -179.01739415148737)
[12:00:40] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:40] [Server thread/INFO]: MyAwsomeUsername left the game
[12:00:51] [User Authenticator #8/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[12:00:51] [Server thread/INFO]: MyAwsomeUsername joined the game
[12:00:51] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45486] logged in with entity id 11805 at ([world]62.97573252632079, 71.0, -179.01739415148737)
[12:00:55] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:55] [Server thread/INFO]: MyAwsomeUsername left the game
Explanation:
I commented out the other grok filters to explain the problem more simply (exact same problem when I uncomment them).
I've tested 3 situations:
With 2 and 3 (as well as the others) commented out and only 1 active, every line of the log got parsed without any _grokparsefailure in the record.
With only 3 (as well as the others) commented out and 1 and 2 active, the log lines matching grok number 2 were parsed with no _grokparsefailure and the others got _grokparsefailure. This still makes sense!
In the last situation I uncommented all 3 groks (1, 2 and 3 were active) and every line of the log got parsed, BUT with _grokparsefailure in it! Even though break_on_match is true by default, so when a line matches grok 2 it shouldn't be tested against grok 3.
I've read some other questions similar to mine on Stack Overflow (Similar Question 1) and added the mutate block before the grok filters (because every line of the log ends with \n), but nothing changed and the problem persists!
Another thing I should mention: I know that adding more groks besides grok 2 (3 and the others) causes this tag for some logs, because some of them don't match grok 2 at all and have to be covered by another pattern. But for now, at least the logs that match grok 2 should be OK (no _grokparsefailure), yet they are not! (I read this in a Stack Overflow question: Similar Question 2.)
In fact, this is the expected behavior; you are confusing a little how Logstash and grok work.
First, all filters are independent from each other. Using break_on_match in a grok only affects that grok; it makes no difference for other grok filters that appear after it in your pipeline. break_on_match also only makes sense when you have more than one pattern in the same grok, which is not your case.
Second, since Logstash processes filters serially and you are not using any conditionals, your grok filters will be applied to every message in your pipeline, whether or not it was already parsed. This is what is making your lines get the _grokparsefailure tag.
To fix that you need to use conditionals.
You don't need conditionals on your first two grok filters: the first one just takes the variable part of your log lines and overwrites it into the message field, and the second is just your first test. For every grok after the second one you will need the following configuration.
if "_grokparsefailure" in [tags] {
grok {
match => "your pattern"
add_tag => "your tags"
remove_tag => ["_grokparsefailure"]
}
}
This grok will only be applied if the message has _grokparsefailure in the tags field. If the message matches your pattern, the tag is removed; if it does not match, the tag remains and the message can be tested by the following groks.
In the end your grok config should look something like this:
grok {
  "your first grok"
}
grok {
  "your second grok, can be any of the others"
}
if "_grokparsefailure" in [tags] {
  grok {
    "your grok N"
    remove_tag => ["_grokparsefailure"]
  }
}
This is only needed because you are adding different tags for each kind of message. If you move that tagging logic into a mutate filter, for example, you can use only two grok filters; the second one would be a multi-pattern grok with break_on_match set to true:
grok {
  match => {
    "message" => [
      "pattern from grok 2",
      "pattern from grok 3",
      "pattern from grok N"
    ]
  }
  break_on_match => true
}
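A possible follow-up sketch (my assumption, not part of the original answer) for re-adding the per-pattern tags with a conditional mutate, keyed on fields that only one pattern produces; this assumes the UUID pattern is changed to use named captures such as %{USERNAME:player} and %{UUID:uuid}:
# tag based on which fields the multi-pattern grok actually set
if [entity_id] {
  mutate { add_tag => [ "player", "join" ] }
} else if [uuid] {
  mutate { add_tag => [ "player", "uuid" ] }
}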

grok filter for processing log4j logs pattern in Logstash

I am stuck finding a grok filter for processing the conversion pattern %d{HH:mm:ss.SSS} %-5p [%t][%c] %m%n in log4j logs.
Here are some example log entries:
2018-02-12 12:10:03 INFO classname:25 - Exiting application.
2017-12-31 05:09:06 WARN foo:133 - Redirect Request : login
2015-08-19 08:07:03 INFO DBConfiguration:47 - Initiating DynamoDb Configuration...
2016-02-12 11:06:49 ERROR foo:224 - Error Code : 500
Can anyone help in finding the Logstash grok filter?
Here is the filter I found for your log4j pattern:
filter {
  mutate {
    gsub => ['message', "\n", " "]
  }
  grok {
    match => { "message" => "(?<date>[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}) (?:%{LOGLEVEL:loglevel}) +(?:%{WORD:caller_class}):(?:%{NONNEGINT:caller_line}) - (?:%{GREEDYDATA:msg})" }
  }
}
However, this is specific to the above log.
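Not part of the original answer, but if you also want the captured timestamp to drive @timestamp, a date filter could follow the grok (a sketch, reusing the date field captured above):
# parse the "date" field produced by the grok into @timestamp
date {
  match => [ "date", "yyyy-MM-dd HH:mm:ss" ]
}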
