_grokparsefailure Tag in all parsed logs with multiple grok filter - logstash

I am trying to parse a Minecraft log with the Elastic Stack and I've run into a very strange problem (at least strange to me!).
Every line of my log gets parsed correctly, but I get the _grokparsefailure tag on every one of them.
My Logstash pipeline config is this:
input {
  file {
    path => [ "/path/to/my/log" ]
    #start_position => "beginning"
    tags => ["minecraft"]
  }
}
filter {
  if "minecraft" in [tags] {
    # mutate {
    #   gsub => [
    #     "message", "\\n", ""
    #   ]
    # }
    #############################
    #           Num 1           #
    #############################
    grok {
      match => [ "message", "\[%{TIME:timestamp}] \[(?<originator>[^\/]+)?/%{LOGLEVEL:level}]: %{GREEDYDATA:message}" ]
      overwrite => [ "message" ]
      break_on_match => false
    }
    #############################
    #           Num 2           #
    #############################
    grok {
      match => [ "message", "UUID of player %{USERNAME} is %{UUID}" ]
      add_tag => [ "player", "uuid" ]
      break_on_match => true
    }
    #############################
    #           Num 3           #
    #############################
    grok {
      match => [ "message", "\A(?<player>[a-zA-Z0-9_]+)\[/%{IPV4:ip_address}:%{POSINT}\] logged in with entity id %{POSINT:entity_id} at \(\[(?<world>[a-zA-Z]+)\](?<pos>[^\)]+)\)\Z" ]
      add_tag => [ "player", "join" ]
      break_on_match => true
    }
    #
    # grok {
    #   match => [ "message", "^(?<player>[a-zA-Z0-9_]+) has just earned the achievement \[(?<achievement>[^\[]+)\]$" ]
    #   add_tag => [ "player", "achievement" ]
    # }
    #
    # grok {
    #   match => [ "message", "^(?<player>[a-zA-Z0-9_]+) left the game$" ]
    #   add_tag => [ "player", "part" ]
    # }
    #
    # grok {
    #   match => [ "message", "^<(?<player>[a-zA-Z0-9_]+)> .*$" ]
    #   add_tag => [ "player", "chat" ]
    # }
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch:xxxx"]
    user => "xxxx"
    password => "xxxxxx"
    index => "minecraft_s1v15_%{+YYYY.MM.dd}"
  }
}
And here is a sample of my log:
[11:21:46] [User Authenticator #7/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[11:21:46] [Server thread/INFO]: MyAwsomeUsername joined the game
[11:21:46] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45140] logged in with entity id 6868 at ([world]61.45686149445207, 70.9375, -175.44700729217607)
[11:21:49] [Server thread/INFO]: MyAwsomeUsername issued server command: //efererg
[11:21:52] [Async Chat Thread - #1/INFO]: <MyAwsomeUsername> egerg
[11:21:54] [Async Chat Thread - #1/INFO]: <MyAwsomeUsername> ef
[12:00:19] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:19] [Server thread/INFO]: MyAwsomeUsername left the game
[12:00:21] [User Authenticator #8/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[12:00:21] [Server thread/INFO]: MyAwsomeUsername joined the game
[12:00:21] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45470] logged in with entity id 11767 at ([world]61.45686149445207, 70.9375, -175.44700729217607)
[12:00:27] [Server thread/INFO]: MyAwsomeUsername issued server command: /wgergerger
[12:00:29] [Async Chat Thread - #2/INFO]: <MyAwsomeUsername> gerg
[12:00:33] [Async Chat Thread - #2/INFO]: <MyAwsomeUsername> gerger
[12:00:35] [Async Chat Thread - #2/INFO]: <MyAwsomeUsername> rerg
[12:00:37] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:37] [Server thread/INFO]: MyAwsomeUsername left the game
[12:00:38] [User Authenticator #8/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[12:00:38] [Server thread/INFO]: MyAwsomeUsername joined the game
[12:00:38] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45476] logged in with entity id 11793 at ([world]62.97573252632079, 71.0, -179.01739415148737)
[12:00:40] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:40] [Server thread/INFO]: MyAwsomeUsername left the game
[12:00:51] [User Authenticator #8/INFO]: UUID of player MyAwsomeUsername is d800b63e-c2d2-3140-83a7-32315d09feca
[12:00:51] [Server thread/INFO]: MyAwsomeUsername joined the game
[12:00:51] [Server thread/INFO]: MyAwsomeUsername[/111.111.111.111:45486] logged in with entity id 11805 at ([world]62.97573252632079, 71.0, -179.01739415148737)
[12:00:55] [Server thread/INFO]: MyAwsomeUsername lost connection: Disconnected
[12:00:55] [Server thread/INFO]: MyAwsomeUsername left the game
Explanation:
I commented out the other grok filters to keep the problem simple (the exact same problem occurs when I uncomment them).
I've tested 3 situations:
1. 2 and 3 were commented out (as well as the others) and only 1 was active. In this case every line of the log got parsed without any _grokparsefailure in the record.
2. Only 3 was commented out (as well as the others) and 1 and 2 were active. In this case the log lines that matched grok number 2 were parsed with no _grokparsefailure and the others got _grokparsefailure. This still makes sense!
3. In the last situation I uncommented all 3 groks (1, 2 and 3 were active) and every line of the log got parsed, BUT with _grokparsefailure in it! Even though break_on_match is true by default, once a line matches grok 2 it shouldn't be tested against grok 3.
I've read some other questions similar to mine on Stack Overflow (Similar Question 1) and added the mutate block before the grok filters (because every line of the log ends with \n), but nothing changed and the problem persists!
Another thing I should mention: I know that adding more groks besides grok 2 (3 and the others) causes this tag, because some of the log lines don't match grok 2 at all and I would have to cover them with a regex. But for now, at least the lines that match grok 2 should be OK (no _grokparsefailure), yet they are not! (I read about this in another Stack Overflow question: Similar Question 2.)

In fact, this is the expected behavior; you are a little confused about the way Logstash and grok work.
First, all filters are independent of each other. Using break_on_match in a grok only affects that grok; it makes no difference to other grok filters that appear after it in your pipeline. break_on_match also only makes sense when you have more than one pattern in the same grok, which is not your case.
Second, since Logstash applies filters serially and you are not using any conditionals, your grok filters will be applied to every message in your pipeline; it doesn't matter whether the message was already parsed. This is what is making your lines get the _grokparsefailure tag.
To fix that you need to use conditionals.
You don't need conditionals for your first two grok filters: the first one just takes the varying part of your log lines and overwrites it into the message field, and the second one is just your first test. For every grok after the second one you will need the following configuration.
if "_grokparsefailure" in [tags] {
grok {
match => "your pattern"
add_tag => "your tags"
remove_tag => ["_grokparsefailure"]
}
}
This grok will only be applied if the message has _grokparsefailure in its tags field. If the message matches your pattern, the tag will be removed; if it does not match, the tag remains and the message can be tested by the following groks.
In the end, your grok config should look something like this:
grok {
  "your first grok"
}
grok {
  "your second grok, can be any of the others"
}
if "_grokparsefailure" in [tags] {
  grok {
    "your grok N"
    remove_tag => ["_grokparsefailure"]
  }
}
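For example, applied to the grok Num 3 pattern from the question, that conditional block could look something like this (a sketch reusing the pattern as-is):
if "_grokparsefailure" in [tags] {
  grok {
    match => [ "message", "\A(?<player>[a-zA-Z0-9_]+)\[/%{IPV4:ip_address}:%{POSINT}\] logged in with entity id %{POSINT:entity_id} at \(\[(?<world>[a-zA-Z]+)\](?<pos>[^\)]+)\)\Z" ]
    add_tag => [ "player", "join" ]
    remove_tag => [ "_grokparsefailure" ]
  }
}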
This is only needed because you are adding different tags for every message. If you move that tagging logic to a mutate filter, for example, you can use only two grok filters: your second one would be a multi-pattern grok with break_on_match set to true.
grok {
  match => {
    "message" => [
      "pattern from grok 2",
      "pattern from grok 3",
      "pattern from grok N"
    ]
  }
  break_on_match => true
}
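For example, a rough sketch of that combined approach, reusing the patterns from grok 2 and 3 above; note that the UUID pattern is given field names here (player, uuid) so that the conditionals afterwards can tell which pattern matched:
grok {
  match => {
    "message" => [
      "UUID of player %{USERNAME:player} is %{UUID:uuid}",
      "\A(?<player>[a-zA-Z0-9_]+)\[/%{IPV4:ip_address}:%{POSINT}\] logged in with entity id %{POSINT:entity_id} at \(\[(?<world>[a-zA-Z]+)\](?<pos>[^\)]+)\)\Z"
    ]
  }
  break_on_match => true
}
# tag the event based on which fields the grok captured
if [uuid] {
  mutate { add_tag => [ "player", "uuid" ] }
} else if [entity_id] {
  mutate { add_tag => [ "player", "join" ] }
}
Lines that match neither pattern will still get _grokparsefailure, which you can then handle with the conditional approach described above.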

Related

Logstash - Grok Syntax Issues

I'm using Filebeat to send logs to Logstash, but I'm having issues with the grok syntax in Logstash. I used the Grok Debugger in Kibana and managed to come to a solution.
The problem is that I can't find the same syntax for Logstash.
The original log:
{"log":"188.188.188.188 - tgaro [22/Aug/2022:11:37:54 +0200] \"PROPFIND /remote.php/dav/files/xxx#yyyy.com/ HTTP/1.1\" 207 1035 \"-\" \"Mozilla/5.0 (Windows) mirall/2.6.1stable-Win64 (build 20191105) (Nextcloud)\"\n","stream":"stdout","time":"2022-08-22T09:37:54.782377901Z"}
The message received in Logstash:
"message" => "{\"log\":\"188.188.188.188 - tgaro [22/Aug/2022:11:37:54 +0200] \\\"PROPFIND /remote.php/dav/files/xxx#yyyy.com/ HTTP/1.1\\\" 207 1035 \\\"-\\\" \\\"Mozilla/5.0 (Windows) mirall/2.6.1stable-Win64 (build 20191105) (Nextcloud)\\\"\\n\",\"stream\":\"stdout\",\"time\":\"2022-08-22T09:37:54.782377901Z\"}",
The grok pattern I used in the Grok Debugger (Kibana):
{\\"log\\":\\"%{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] \\\\\\"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\\\\\\" (?:-|%{NUMBER:response}) (?:-|%{NUMBER:bytes}) \\\\\\("%{DATA:referrer}\\\\\\") \\\\\\"%{DATA:user-agent}\\\\\\"
The real problem is that I can't even manage to get the IP (188.188.188.188).
I tried:
match => { "message" => '{\\"log\\":\\"%{IPORHOST:clientip}' # backslash to escape the backslash
match => { "message" => '{\\\"log\\\":\\\"%{IPORHOST:clientip}' # backslash to escape the quote
match => { "message" => "{\\\"log\\\":\\\"%{IPORHOST:clientip}" # backslash to escape the quote
Help would be appreciated. Thanks!
PS: the log used here is shortened. The real log is a mix of JSON and plain string, so I can't send it as JSON from Filebeat.
OK, so I managed to make it work by using this:
grok {
match => { "message" => '%{SYSLOGTIMESTAMP:syslog_timestamp} %{IPORHOST:syslog_server} %{WORD:syslog_tag}: %{GREEDYDATA:jsonMessage}' }
}
json {
source => "jsonMessage"
}
grok {
match => { "jsonMessage" => '%{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] \\"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\\" (?:-|%{NUMBER:response}) (?:-|%{NUMBER:bytes}) \\("%{DATA:referrer}\\") \\"%{DATA:user-agent}\\"'}
}
with log like this :
Aug 24 00:00:01 hostname containers: {"log":"188.188.188.188 - user.name#things.com [23/Aug/2022:23:59:52 +0200] \"PROPFIND /remote.php/dav/files/ HTTP/1.1\" 207 1159 \"-\" \"Mozilla/5.0 (Linux) mirall/3.4.2-1ubuntu1 (Nextcloud, ubuntu-5.15.0-46-generic ClientArchitecture: x86_64 OsArchitecture: x86_64)\"\n","stream":"stdout","time":"2022-08-23T21:59:52.612843092Z"}
The first match fetches the first 3 fields (time, hostname and tag), and then takes everything after the colon with the GREEDYDATA pattern into jsonMessage.
Then the json filter is applied to jsonMessage. From there, we have the information we need in the new log field created by the json filter.
I still don't understand why my grok works in the Kibana debugger but not in Logstash. It's probably because some characters need to be escaped, but even when I escaped them it didn't work.
@RaiZy_Style has already filtered out the JSON using the json filter and is trying to match the jsonMessage using a grok filter.
I used the Grok Debugger to create a grok pattern that will match the jsonMessage field.
The jsonMessage that I assume comes out of the json filter for the above log example is:
188.188.188.188 - user.name#things.com [23/Aug/2022:23:59:52 +0200] \\"PROPFIND /remote.php/dav/files/ HTTP/1.1\\" 207 1159 \\"-\\" \\"Mozilla/5.0 (Linux) mirall/3.4.2-1ubuntu1 (Nextcloud, ubuntu-5.15.0-46-generic ClientArchitecture: x86_64 OsArchitecture: x86_64)\\"\\n"
Here is pattern that will work:
%{IPORHOST:clientip} %{USER:ident} %{DATA:auth} \[%{HTTPDATE:timestamp}\] \\\\"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \\\\"
Output Screenshot:
Note: if you want to expand the rawrequest field value into individual fields, that can be done as well.

If "keyword" in message not working for logstash

I am receiving logs from 5 different sources on one single port. In fact, it is a collection of files being sent through syslog from a server in real time. The server stores logs from 4 VPN servers and one DNS server. The server admin started sending all 5 types of files on a single port, although I had asked for something different. Anyway, I thought I would make this work as well now.
Below are the different types of samples-
------------------
<13>Sep 30 22:03:28 xx2.20.43.100 370 <134>1 2021-09-30T22:03:28+05:30 canopus.domain1.com1 PulseSecure: - - - id=firewall time="2021-09-30 22:03:28" pri=6 fw=xx2.20.43.100 vpn=ive user=System realm="google_auth" roles="" proto= src=1xx.99.110.19 dst= dstname= type=vpn op= arg="" result= sent= rcvd= agent="" duration= msg="AUT23278: User Limit realm restrictions successfully passed for /google_auth "
------------------
<134>Sep 30 22:41:43 xx2.20.43.101 1 2021-09-30T22:41:43+05:30 canopus.domain1.com2 PulseSecure: - - - id=firewall time="2021-09-30 22:41:43" pri=6 fw=xx2.20.43.101 vpn=ive user=user22 realm="google_auth" roles="Domain_check_role" proto= src=1xx.200.27.62 dst= dstname= type=vpn op= arg="" result= sent= rcvd= agent="" duration= msg="NWC24328: Transport mode switched over to SSL for user with NCIP xx2.20.210.252 "
------------------
<134>Sep 30 22:36:59 vpn-dns-1 named[130237]: 30-Sep-2021 22:36:59.172 queries: info: client #0x7f8e0f5cab50 xx2.30.16.147#63335 (ind.event.freefiremobile.com): query: ind.event.freefiremobile.com IN A + (xx2.31.0.171)
------------------
<13>Sep 30 22:40:31 xx2.20.43.101 394 <134>1 2021-09-30T22:40:31+05:30 canopus.domain1.com2 PulseSecure: - - - id=firewall time="2021-09-30 22:40:31" pri=6 fw=xx2.20.43.101 vpn=ive user=user3 realm="google_auth" roles="Domain_check_role" proto= src=1xx.168.77.166 dst= dstname= type=vpn op= arg="" result= sent= rcvd= agent="" duration= msg="NWC23508: Key Exchange number 1 occurred for user with NCIP xx2.20.214.109 "
Below is my config file:
input {
syslog {
port => 1301
ecs_compatibility => disabled
tags => ["vpn"]
}
}
I tried to apply a condition first to get the VPN logs (1st sample log line) and pass them to dissect:
filter {
if "vpn" in [tags] {
#if ([message] =~ /vpn=ive/) {
if "vpn=ive" in [message] {
dissect {
mapping => { "message" => "%{reserved} id=firewall %{message1}" }
# using id=firewall to get KV pairs in message1
}
}
}
else { drop {} }
# \/ end of filter brace
}
But when I run with this config file, I am getting a mixture of all 5 types of logs in Kibana, and I don't see any dissect failures either. I remember this working on another server for a different type of log, but it is not working here.
Another question: if I have to process all 5 types of logs in one config file, would the approach below be a good one?
if "VPN-logline" in [message] { use KV plugin and add tag of "vpn" }
else if "DNS-logline" in [message] { use JSON plugin and tag of "dns"}
else if "something-irrelevant" in [message] { drop {} }
Or can it be done in the input section of the config?
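For illustration only, a rough sketch of that kind of routing (the marker strings and tags here are placeholders, not tested against the real logs):
filter {
  if "vpn=ive" in [message] {
    mutate { add_tag => ["vpn"] }
    # kv / dissect parsing of the VPN key=value pairs would go here
  } else if "named" in [message] {
    mutate { add_tag => ["dns"] }
    # grok / dissect parsing of the DNS query lines would go here
  } else {
    drop {}
  }
}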
So, the problem was that I was assigning every log line the vpn tag. I was doing so because I had to merge this config into a larger config file that carries many more tags. Anyway, I have now decided to keep this config file separate.
input {
syslog {
port => 1301
ecs_compatibility => disabled
}
}
filter {
if "vpn=ive" in [message] {
dissect {
mapping => { "message" => "%{reserved} id=firewall %{message1}" }
}
}
else { drop {} }
}
output {
elasticsearch {
hosts => "localhost"
index => "vpn1oct"
user => "elastic"
password => "xxxxxxxxxx"
}
stdout { }
}

Grokparsefailure and type problems in logstash configuration file

I have several problems with my configuration file. My goal is to parse three types of logs (for the moment). Here they are :
[29/05/2020 07:41:51.354] - ih912865 - 10.107.119.121 - 93 - Transaction 7635 COMPLETED 318 ms wait time 3183 ms
[29/05/2020 10:30:01.318] - Process status database sync - us1salx08167.corpnet2.com:8400(#52279) (load 0 grace period 5 minutes) : current date 2020/02/02 21:30:01 update date 2020/02/02 21:29:58 old state OK new state OK
31730 31626 464 10980020 52:25 /plw/modules/bin/Lx86_64/opx2-intranet.exe -I /plw/modules/bin/Lx86_64/opx2-intranet.dxl -H /plw/modules/bin/Lx86_64 -L /plw/PLW_PROD/modules/preload-intranet.ini -- plw-sysconsole -port 8400 -logdir /plw/PLW_PROD/httpdocs/admin/log/ -slaves 2
Two of these logs can be in slave files named intranet-2020-06-25-8401.log or intranet-2020-06-25-8400.log; the last one is in a master file named intranet-2020-06-25-8402.log.
For my tests I simplified the architecture of my log files, so I have a Log-test folder in which I put a slave file and a master file.
In these files I only put the corresponding logs and a different log to be able to see how to manage this case.
Here is the content of a "slave" :
[29/05/2020 07:41:51.354] - ih912865 - 10.107.199.125 - 93 - Transaction 7635 COMPLETED 318 ms wait time 3183 ms
[29/05/2020 10:30:01.318] - Process status database sync - us1salx08167.corpnet2.com:8400(#52279) (load 0 grace period 5 minutes) : current date 2020/02/02 21:30:01 update date 2020/02/02 21:29:58 old state OK new state OK
[29/05/2020 13:49:20.635] - Main process - Transaction SYSTEM 105238-12 SQL done 1 ms
Here is the content of a "master" :
31730 31626 464 10980020 52:25 /plw/modules/bin/Lx86_64/opx2-intranet.exe -I /plw/modules/bin/Lx86_64/opx2-intranet.dxl -H /plw/modules/bin/Lx86_64 -L /plw/PLW_PROD/modules/preload-intranet.ini -- plw-sysconsole -port 8400 -logdir /plw/PLW_PROD/httpdocs/admin/log/ -slaves 2
[26/06/2020 21:38:01.386] - Main process - Starting HTTP service on port 8402 (socket #<MULTIVALENT stream socket waiting for connection at */8402 # #x1022d2ddbb2>)
Now that you have a better understanding of my environment and my purpose, here's the problem. When I launch my Logstash configuration, I retrieve my data in Kibana, but Kibana shows me that every log has been treated as coming from a slave file, even though I also have a log coming from a master file, which should not get the same processing.
For a better understanding, here is my configuration file:
input {
file {
path => "/home/mathis/Documents/**/intranet*.log"
exclude =>"*8402.log"
sincedb_path => '/dev/null'
start_position => beginning
type => "slave"
}
file {
path => "/home/mathis/Documents/**/intranet*8402.log"
sincedb_path => '/dev/null'
type => "master"
}
}
filter {
if [type] == "slave" {
grok {
match => { "message" => ["\[%{DATESTAMP:eventtime}\] \- %{USERNAME:user} \- %{IPV4:clientip} \- %{NUMBER} \- %{WORD} %{NUMBER:exectime} %{WORD} %{NUMBER:time} %{GREEDYDATA:data} %{NUMBER:waittime}","\[%{DATESTAMP:eventtime}\] \- Process status database sync \- %{WORD}\.%{WORD}\.%{WORD}\:%{NUMBER:slavenumb}\(\#%{NUMBER}\) \(load %{NUMBER:nbutilisateur} grace period 5 minutes\) %{GREEDYDATA}"] }
remove_field => "message"
}
date {
match => [ "eventtime", "dd/MM/YYYY HH:mm:ss.SSS" ]
target => "#timestamp"
}
}
if [type] == "master" {
grok {
match => {"message" => ["%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}(?<starttime>((?!<[0-9])%{HOUR}:)?%{MINUTE}(?::%{SECOND})(?![0-9]))"]}
remove_field => "message"
}
date {
match => [ "starttime", "HH:mm:ss","mm:ss" ]
}
}
}
output {
elasticsearch {
hosts => "127.0.0.1:9200"
index => "logstash-local3-%{+YYYY.MM.dd}"
}
}
And now this is what Kibana shows me:
As you can see, the type field is slave for all logs, but we can also see that the logs from the slave file "intranet-2020-06-25-8401.log" are correctly parsed, and that the extra log line that does not interest me has the _grokparsefailure tag (the middle line in the picture).
The other problem is that the other logs (the first two lines in the image) are reported by Kibana as coming from a slave file (which is not true), so I guess they are processed by my first grok, which would explain why they also have the _grokparsefailure tag.
So I guess there are several errors in my input and filter sections. I've been searching for a long time and doing a lot of testing; could you help me fix my config file, please?

"_grokparsefailure" even though the grok pattern works

I am trying to parse different log lines from two different types of file: slave and master. I tested my patterns in the Grok Debugger and they work fine, but the tags field in Kibana shows _grokparsefailure.
Here is my config file
input {
file {
type => "slave"
path => "/home/mathis/Documents/**/intranet*.log"
exclude =>"*8402.log"
sincedb_path => '/dev/null'
start_position => beginning
}
file {
type => "master"
path => "/home/mathis/Documents/**/intranet*8402.log"
sincedb_path => '/dev/null'
}
}
filter {
if [type] == "slave" {
grok {
match => { "message" => ["\[%{DATESTAMP:eventtime}\] \- %{USERNAME:user} \- %{IPV4:clientip} \- %{NUMBER} \- %{WORD} %{NUMBER:exectime} %{WORD} %{NUMBER:time} %{GREEDYDATA:data} %{NUMBER:waittime}","\[%{DATESTAMP:eventtime}\] \- Process status database sync \- %{WORD}\.%{WORD}\.%{WORD}\:%{NUMBER:slavenumb}\(\#%{NUMBER}\) \(load %{NUMBER:nbutilisateur} grace period 5 minutes\) %{GREEDYDATA}"] }
remove_field => "message"
}
date {
match => [ "eventtime", "dd/MM/YYYY HH:mm:ss.SSS" ]
target => "#timestamp"
}
}
if [type] == "master" {
grok {
match => {"message" => ["%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}(?<starttime>((?!<[0-9])%{HOUR}:)?%{MINUTE}(?::%{SECOND})(?![0-9]))"]}
remove_field => "message"
}
date {
match => [ "starttime", "HH:mm:ss","mm:ss" ]
}
}
}
output {
elasticsearch {
hosts => "127.0.0.1:9200"
index => "logstash-local3-%{+YYYY.MM.dd}"
}
}
Here are the 3 log lines that I want to parse (they are in the same order as the groks in my conf file):
[24/06/2020 21:57:29.548] - Process status database sync - us1salx08167.corpnet2.com:8100(#53738) (load 0 grace period 5 minutes) : current date 2020/06/24 21:57:29 update date 2020/06/24 21:55:44 old state OK new state OK
[29/05/2020 07:41:51.354] - ih912865 - 10.104.149.128 - 93 - Transaction 7635 COMPLETED 318 ms wait time 3183 ms
31730 31626 464 10970020 52:25 /plw/modules/bin/Lx86_64/opx2-intranet.exe -I /plw/modules/bin/Lx86_64/opx2-intranet.dxl -H /plw/modules/bin/Lx86_64 -L /plw/PLW_PROD/modules/preload-intranet.ini -- plw-sysconsole -port 8400 -logdir /plw/PLW_PROD/httpdocs/admin/log/ -slaves 2
So, I don't know if you've already resolved this, but below is something you could use.
N.B. I added a couple of extra fields, but you can easily remove those [https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-remove_field].
When trying the expressions you provided, one of them actually failed in the grok debugger, so I just took it upon myself to rewrite them all from scratch while still maintaining variable names.
I noticed there was a lot of data that you simply didn't glean. If you want more captured, let me know.
Line 1:
[24/06/2020 21:57:29.548] - Process status database sync - us1salx08167.corpnet2.com:8100(#53738) (load 0 grace period 5 minutes) : current date 2020/06/24 21:57:29 update date 2020/06/24 21:55:44 old state OK new state OK
Pattern 1:
\[(?<eventtime>%{DATESTAMP})\] - Process status database sync - (?<host>%{HOSTNAME}):(?<slavenumber>%{NUMBER})(?<zz>\(#[\d]+\)) \(load (?<nbutilisateur>%{NUMBER}) grace period 5 minutes\)%{GREEDYDATA}
Line 2:
[29/05/2020 07:41:51.354] - ih912865 - 10.104.149.128 - 93 - Transaction 7635 COMPLETED 318 ms wait time 3183 ms
Pattern 2:
\[(?<eventtime>%{DATESTAMP})\] - (?<user>%{USER}) - (?<clientip>%{IPV4}) - %{NUMBER} - %{WORD} (?<exectime>%{NUMBER}) %{WORD} (?<ctime>%{NUMBER}) (?<ctimeunits>%{WORD}) wait time (?<waittime>%{NUMBER}) (?<waittimeunits>%{WORD})
Line 3:
31730 31626 464 10970020 52:25 /plw/modules/bin/Lx86_64/opx2-intranet.exe -I /plw/modules/bin/Lx86_64/opx2-intranet.dxl -H /plw/modules/bin/Lx86_64 -L /plw/PLW_PROD/modules/preload-intranet.ini -- plw-sysconsole -port 8400 -logdir /plw/PLW_PROD/httpdocs/admin/log/ -slaves 2
Pattern 3:
%{GREEDYDATA}(?<starttime>(?<=[\s])([\d]+:[\d]+))%{GREEDYDATA}
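If it helps, here is a sketch of how these rewritten patterns might be dropped into the original filter (patterns 1 and 2 in the slave branch, pattern 3 in the master branch); it is untested against the full data:
if [type] == "slave" {
  grok {
    match => { "message" => [
      "\[(?<eventtime>%{DATESTAMP})\] - (?<user>%{USER}) - (?<clientip>%{IPV4}) - %{NUMBER} - %{WORD} (?<exectime>%{NUMBER}) %{WORD} (?<ctime>%{NUMBER}) (?<ctimeunits>%{WORD}) wait time (?<waittime>%{NUMBER}) (?<waittimeunits>%{WORD})",
      "\[(?<eventtime>%{DATESTAMP})\] - Process status database sync - (?<host>%{HOSTNAME}):(?<slavenumber>%{NUMBER})(?<zz>\(#[\d]+\)) \(load (?<nbutilisateur>%{NUMBER}) grace period 5 minutes\)%{GREEDYDATA}"
    ] }
  }
}
if [type] == "master" {
  grok {
    match => { "message" => "%{GREEDYDATA}(?<starttime>(?<=[\s])([\d]+:[\d]+))%{GREEDYDATA}" }
  }
}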

Logstash receive strange "<133>" code at the start of receiving TrendMicro log

My Logstash server is CentOS Linux release 8.1.1911.
logstash.version"=>"7.7.0"
I have a capture of what I received on port UDP 5514 with :
nc -lvu 5514 -o log.txt
The content of log.txt
<133>Jun 05 09:23:35 TMCM:EVT_URL_CONTENT_FILTERING Security product="OfficeScan" Security product node="N/A" Security product IP="xx.xx.xx.xx;xxxx::xxxx:xxxx:xxxx:4490" Event time="4/25/2020 11:46:01 PM (UTC)" URL="http://xxxxxxx.xxxxxxx.intranet/SMS_MP/.sms_pol?DEP-Z0120115-ScopeId_B14503FF-F7AA-49EC-A38C-F50D813EEC6E/Application_57a673e1-3e65-4f1c-8ce2-0f4cc1b38acc.SHA256:5EF20484EEC38EA203D7A885EAA48BE2DFDC4F130ED8BF5BEA333378875B2516" Source IP="" Destination IP="yyy.yyy.yyy.yyy" Policy rule="" Blocking type="Web reputation" Domain="xxxx-xxxxx" Event time (local)="4/25/2020 7:46:01 PM" Client host name="N/A" Reputation Score="81"`
myfilter.conf
input
{
udp
{
port => 5514
type => syslog
}
}
filter
{
grok
{
match =>
{ "message" => "(?<user_agent>[^>]*)(?<user_agent>[^:]*)%{POSINT}\s%{WORD:logfrom}\s%{WORD:logtag}\:\s%{NOTSPACE:eventname}\s([^=]*)\=%{QUOTEDSTRING:security_product} ([^=]*)\=%{QUOTEDSTRING:security_prod_node}\s([^=]*)\=\"%{IPV4:security_prod_ip}([^=]*)\=\"(?<agent_detected_time>%{MONTHNUM}\/%{MONTHDAY}\/%{YEAR} %{TIME}\s(?:AM|am|PM|pm)\s*\s\(%{TZ:tz}\)).*URL\=\"%{URI:url}\" ([^=]*)\=%{QUOTEDSTRING:src_ip}\s([^=]*)\=\"%{IPV4:dest_ipv4}\"\s([^=]*)\=%{QUOTEDSTRING:policy_rule} ([^=]*)\=%{QUOTEDSTRING:bloking_type} ([^=]*)\=%{QUOTEDSTRING:domain} ([^=]*)\=\"(?<server_alert_time>%{MONTHNUM}\/%{MONTHDAY}\/%{YEAR} %{TIME}\s(?:AM|am|PM|pm))\"\s([^=]*)\=%{QUOTEDSTRING:client_hostname} ([^=]*)\=\"%{BASE10NUM:reputation_score}/?"
}
}
}
output
{
stdout { codec => rubydebug }
}
The example of the output of logstash:
[2020-06-08T13:11:02,253][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
"type" => "syslog",
"#timestamp" => 2020-06-08T18:06:39.090Z,
"message" => "<133>Jun 08 14:06:38 TMCM:EVT_URL_CONTENT_FILTERING Security product=\"OfficeScan\" Security product node=\"N/A\" Security product IP=\"xx.xx.xx.xx;xxxx::xxxx:xxx:xxxx:4490\" Event time=\"4/26/2020 7:33:36 AM (UTC)\" URL=\"http://blabnlabla.bla-blabla.intranet/SMS_MP/.sms_pol?DEP-Z0120105-ScopeId_B14503FF-F7AA-49EC-A38C-F50D813EEC6E/Application_2be50193-9121-4239-a70f-ba06ad7bbfbd.SHA256:6FF12991BBA769F9C15F7E1FA3E3058E22B4D918F6C5659CF7B976059082510D\" Source IP=\"\" Destination IP=\"xxx.xx.xxx.xx\" Policy rule=\"\" Blocking type=\"Web reputation\" Domain=\"bla-blabla\" Event time (local)=\"4/26/2020 3:33:36 AM\" Client host name=\"N/A\" Reputation Score=\"81\"",
"#version" => "1",
"host" => "xx.xxx.xx.xx",
"tags" => [
[0] "_grokparsefailure"
]
}
I have also tried "\<133\>" but it still appears. I have no idea what this <133> is.
P.S. I have been learning this by myself for the last 2 weeks.
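For what it's worth, the leading <133> is most likely the syslog priority header (facility and severity), which a plain udp input does not strip the way a syslog input would. A hedged sketch of simply consuming it at the start of the pattern with the stock SYSLOG5424PRI grok pattern (the field names here are only for illustration):
grok {
  match => { "message" => "%{SYSLOG5424PRI}%{SYSLOGTIMESTAMP:logtime} %{WORD:logfrom}:%{NOTSPACE:eventname} %{GREEDYDATA:kvpairs}" }
}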

Resources