_grokparsefailure and type problems in Logstash configuration file

I have several problems with my configuration file. My goal is to parse three types of logs (for the moment). Here they are:
[29/05/2020 07:41:51.354] - ih912865 - 10.107.119.121 - 93 - Transaction 7635 COMPLETED 318 ms wait time 3183 ms
[29/05/2020 10:30:01.318] - Process status database sync - us1salx08167.corpnet2.com:8400(#52279) (load 0 grace period 5 minutes) : current date 2020/02/02 21:30:01 update date 2020/02/02 21:29:58 old state OK new state OK
31730 31626 464 10980020 52:25 /plw/modules/bin/Lx86_64/opx2-intranet.exe -I /plw/modules/bin/Lx86_64/opx2-intranet.dxl -H /plw/modules/bin/Lx86_64 -L /plw/PLW_PROD/modules/preload-intranet.ini -- plw-sysconsole -port 8400 -logdir /plw/PLW_PROD/httpdocs/admin/log/ -slaves 2
Two of these logs can appear in slave files named intranet-2020-06-25-8401.log or intranet-2020-06-25-8400.log; the last one appears in a master file named intranet-2020-06-25-8402.log.
For my tests I simplified the architecture of my log files, so I have a Log-test folder in which I put a slave file and a master file.
In these files I put only the corresponding logs, plus one unrelated log line so I can see how that case is handled.
Here is the content of a "slave":
[29/05/2020 07:41:51.354] - ih912865 - 10.107.199.125 - 93 - Transaction 7635 COMPLETED 318 ms wait time 3183 ms
[29/05/2020 10:30:01.318] - Process status database sync - us1salx08167.corpnet2.com:8400(#52279) (load 0 grace period 5 minutes) : current date 2020/02/02 21:30:01 update date 2020/02/02 21:29:58 old state OK new state OK
[29/05/2020 13:49:20.635] - Main process - Transaction SYSTEM 105238-12 SQL done 1 ms
Here is the content of a "master" :
31730 31626 464 10980020 52:25 /plw/modules/bin/Lx86_64/opx2-intranet.exe -I /plw/modules/bin/Lx86_64/opx2-intranet.dxl -H /plw/modules/bin/Lx86_64 -L /plw/PLW_PROD/modules/preload-intranet.ini -- plw-sysconsole -port 8400 -logdir /plw/PLW_PROD/httpdocs/admin/log/ -slaves 2
[26/06/2020 21:38:01.386] - Main process - Starting HTTP service on port 8402 (socket #<MULTIVALENT stream socket waiting for connection at */8402 # #x1022d2ddbb2>)
Now that you have a better understanding of my environment and my purpose, here is the problem. When I launch my Logstash configuration, I do retrieve my data in Kibana, but Kibana shows every log as coming from a slave file, even though one log comes from a master file and should not receive the same processing.
For a better understanding, here is my configuration file:
input {
  file {
    path => "/home/mathis/Documents/**/intranet*.log"
    exclude => "*8402.log"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    type => "slave"
  }
  file {
    path => "/home/mathis/Documents/**/intranet*8402.log"
    sincedb_path => "/dev/null"
    type => "master"
  }
}
filter {
  if [type] == "slave" {
    grok {
      match => { "message" => ["\[%{DATESTAMP:eventtime}\] \- %{USERNAME:user} \- %{IPV4:clientip} \- %{NUMBER} \- %{WORD} %{NUMBER:exectime} %{WORD} %{NUMBER:time} %{GREEDYDATA:data} %{NUMBER:waittime}","\[%{DATESTAMP:eventtime}\] \- Process status database sync \- %{WORD}\.%{WORD}\.%{WORD}\:%{NUMBER:slavenumb}\(\#%{NUMBER}\) \(load %{NUMBER:nbutilisateur} grace period 5 minutes\) %{GREEDYDATA}"] }
      remove_field => "message"
    }
    date {
      match => [ "eventtime", "dd/MM/YYYY HH:mm:ss.SSS" ]
      target => "@timestamp"
    }
  }
  if [type] == "master" {
    grok {
      match => { "message" => ["%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}(?<starttime>((?!<[0-9])%{HOUR}:)?%{MINUTE}(?::%{SECOND})(?![0-9]))"] }
      remove_field => "message"
    }
    date {
      match => [ "starttime", "HH:mm:ss", "mm:ss" ]
    }
  }
}
output {
  elasticsearch {
    hosts => "127.0.0.1:9200"
    index => "logstash-local3-%{+YYYY.MM.dd}"
  }
}
And this is what Kibana shows me:
As you can see, the type field is "slave" for all logs. We can also observe that the logs from the slave file "intranet-2020-06-25-8401.log" are correctly parsed, and that the extra log line that does not interest me has a tags field of _grokparsefailure (the middle line in the picture).
The other problem is that, according to Kibana, the other logs (the first two lines in the image) also come from a slave file, which is not true; so I guess they are processed by my first grok, which would explain why they also carry the _grokparsefailure tag.
So I guess there are several errors in my input and filter sections. I have been searching for a long time and doing a lot of testing. Could you help me fix my config file, please?
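One way to narrow this down (a debugging sketch using the same paths as above, not a fix): tag each file input and print every event to stdout, so it is possible to see which input actually claimed each file. The tags option is a standard common option on inputs, and rubydebug is the stock stdout codec:
input {
  file {
    path => "/home/mathis/Documents/**/intranet*.log"
    exclude => "*8402.log"
    sincedb_path => "/dev/null"
    start_position => "beginning"
    type => "slave"
    tags => ["from_slave_input"]   # marks events claimed by this input
  }
  file {
    path => "/home/mathis/Documents/**/intranet*8402.log"
    sincedb_path => "/dev/null"
    start_position => "beginning"  # added here so the master file is read from the top during the test
    type => "master"
    tags => ["from_master_input"]
  }
}
output {
  stdout { codec => rubydebug }    # prints each event with its type and tags
}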

Related

I'm having issues sending logs from a file to a destination using Logstash

I'm very new to Logstash configuration. Below is the Logstash pipeline configuration that I'm using to send a log file to Coralogix, without success.
input {
  file {
    path => "/etc/logstash/conf.d/CSE-Candidate-Exercise-LOG.log"
    type => "log" # a type to identify those logs (will need this later)
    start_position => "beginning"
    sincedb_path => "NUL"
  }
}
output {
  http {
    url => "https://api.coralogix.com/logs/rest/singles "
    http_method => "post"
    headers => ["private_key", "xxxxxxxx-cccc-7777-ffff-000000000000"]
    format => "json_batch"
    codec => "line"
    mapping => {
      "applicationName" => "%{[@metadata][application]}"
      "subsystemName" => "%{[@metadata][subsystem]}"
      "computerName" => "%{host}"
      "text" => "%{[@metadata][event]}"
    }
    http_compression => true
    automatic_retries => 5
    retry_non_idempotent => true
    connect_timeout => 30
    keepalive => false
  }
}
The input should be OK, but I think my issue is with the output. I'm not sure which "codec" plugin to use; do I need the "format" field?
Each time I run Logstash to send the log file specified in the path, all I get is the log snippet below:
[INFO ] 2022-08-02 13:25:02.659 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[INFO ] 2022-08-02 13:25:03.588 [Converge PipelineAction::Create<main>] Reflections - Reflections took 135 ms to scan 1 urls, producing 124 keys and 408 values
[INFO ] 2022-08-02 13:25:04.158 [Converge PipelineAction::Create<main>] javapipeline - Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[INFO ] 2022-08-02 13:25:04.278 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/etc/logstash/conf.d/logstash-config.conf"], :thread=>"#<Thread:0x5a5e32c1 run>"}
[INFO ] 2022-08-02 13:25:05.154 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.87}
[INFO ] 2022-08-02 13:25:05.235 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2022-08-02 13:25:05.316 [[main]<file] observingtail - START, creating Discoverer, Watch with file and sincedb collections
[INFO ] 2022-08-02 13:25:05.344 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
I see no logs in the Coralogix platform. I simply want to send the raw log lines in the file, without any filtering or manipulation, to the destination. Any help would be appreciated.
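Two things worth checking (guesses, not confirmed diagnoses): the url value above contains a trailing space inside the quotes, which is worth removing; and with format => "json_batch" the http output builds its body from the mapping, which here references [@metadata] fields that are never populated anywhere, so those payload fields would come out empty. A minimal sketch of a filter that sets them first; the application and subsystem values are placeholders to replace:
filter {
  mutate {
    # [@metadata] fields are never shipped on their own;
    # they exist only so the mapping in the http output can read them.
    add_field => {
      "[@metadata][application]" => "my-app"        # placeholder
      "[@metadata][subsystem]"   => "my-subsystem"  # placeholder
      "[@metadata][event]"       => "%{message}"    # ship the raw line as text
    }
  }
}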

If "keyword" in message not working for logstash

I am receiving logs from 5 different sources on one single port. In fact, it is a collection of files being sent through syslog from a server in real time. The server stores logs from 4 VPN servers and one DNS server. Now the server admin has started sending all 5 types of files on a single port, although I asked for something different. Anyway, I thought I would make this work as well.
Below are the different types of samples-
------------------
<13>Sep 30 22:03:28 xx2.20.43.100 370 <134>1 2021-09-30T22:03:28+05:30 canopus.domain1.com1 PulseSecure: - - - id=firewall time="2021-09-30 22:03:28" pri=6 fw=xx2.20.43.100 vpn=ive user=System realm="google_auth" roles="" proto= src=1xx.99.110.19 dst= dstname= type=vpn op= arg="" result= sent= rcvd= agent="" duration= msg="AUT23278: User Limit realm restrictions successfully passed for /google_auth "
------------------
<134>Sep 30 22:41:43 xx2.20.43.101 1 2021-09-30T22:41:43+05:30 canopus.domain1.com2 PulseSecure: - - - id=firewall time="2021-09-30 22:41:43" pri=6 fw=xx2.20.43.101 vpn=ive user=user22 realm="google_auth" roles="Domain_check_role" proto= src=1xx.200.27.62 dst= dstname= type=vpn op= arg="" result= sent= rcvd= agent="" duration= msg="NWC24328: Transport mode switched over to SSL for user with NCIP xx2.20.210.252 "
------------------
<134>Sep 30 22:36:59 vpn-dns-1 named[130237]: 30-Sep-2021 22:36:59.172 queries: info: client #0x7f8e0f5cab50 xx2.30.16.147#63335 (ind.event.freefiremobile.com): query: ind.event.freefiremobile.com IN A + (xx2.31.0.171)
------------------
<13>Sep 30 22:40:31 xx2.20.43.101 394 <134>1 2021-09-30T22:40:31+05:30 canopus.domain1.com2 PulseSecure: - - - id=firewall time="2021-09-30 22:40:31" pri=6 fw=xx2.20.43.101 vpn=ive user=user3 realm="google_auth" roles="Domain_check_role" proto= src=1xx.168.77.166 dst= dstname= type=vpn op= arg="" result= sent= rcvd= agent="" duration= msg="NWC23508: Key Exchange number 1 occurred for user with NCIP xx2.20.214.109 "
Below is my config file:
input {
  syslog {
    port => 1301
    ecs_compatibility => disabled
    tags => ["vpn"]
  }
}
I tried to apply a condition first to catch the VPN logs (the 1st sample log line) and pass them to dissect:
filter {
  if "vpn" in [tags] {
    #if ([message] =~ /vpn=ive/) {
    if "vpn=ive" in [message] {
      dissect {
        mapping => { "message" => "%{reserved} id=firewall %{message1}" }
        # using id=firewall to get KV pairs in message1
      }
    }
  }
  else { drop {} }
# end of filter brace
}
But when I run with this config file, I get a mixture of all 5 types of logs in Kibana, and I don't see any dissect failures either. I remember this approach worked on another server for another type of log, but it is not working here.
Another question: if I have to process all 5 types of logs in one config file, would the below be a good approach?
if "VPN-logline" in [message] { use KV plugin and add tag of "vpn" }
else if "DNS-logline" in [message] { use JSON plugin and tag of "dns"}
else if "something-irrelevant" in [message] { drop {} }
Or can it be done in the input section of the config?
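For what it's worth, a sketch of that branching approach (the "vpn=ive" and "named" markers come from the sample lines above; the kv details would need tuning against the real data):
filter {
  if "vpn=ive" in [message] {
    mutate { add_tag => ["vpn"] }
    kv { source => "message" }        # split the key=value pairs into fields
  } else if "named" in [message] {    # the DNS sample lines contain named[...]
    mutate { add_tag => ["dns"] }
  } else {
    drop {}                           # discard anything irrelevant
  }
}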
So, the problem was that every log line was being assigned the tag of vpn. I was doing that because I had to merge this config into a larger config file that carries many more tags. Anyway, I have now decided to keep this config file separate.
input {
  syslog {
    port => 1301
    ecs_compatibility => disabled
  }
}
filter {
  if "vpn=ive" in [message] {
    dissect {
      mapping => { "message" => "%{reserved} id=firewall %{message1}" }
    }
  }
  else { drop {} }
}
output {
  elasticsearch {
    hosts => "localhost"
    index => "vpn1oct"
    user => "elastic"
    password => "xxxxxxxxxx"
  }
  stdout { }
}
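If the key=value pairs in message1 then need to become fields, a kv filter could follow the dissect (a sketch; kv's default = and whitespace separators should handle most of the PulseSecure pairs, though quoted values such as time="..." may need extra kv options):
filter {
  kv {
    source => "message1"   # e.g. time="2021-09-30 22:41:43" pri=6 fw=xx2.20.43.101 vpn=ive ...
  }
}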

"_grokparsefailure" even though the grok pattern works

I am trying to parse different log lines from two different types of file: slave and master. I tested my pattern in the Grok Debugger and it works fine, but the tags field in Kibana shows _grokparsefailure.
Here is my config file:
input {
  file {
    type => "slave"
    path => "/home/mathis/Documents/**/intranet*.log"
    exclude => "*8402.log"
    sincedb_path => "/dev/null"
    start_position => "beginning"
  }
  file {
    type => "master"
    path => "/home/mathis/Documents/**/intranet*8402.log"
    sincedb_path => "/dev/null"
  }
}
filter {
  if [type] == "slave" {
    grok {
      match => { "message" => ["\[%{DATESTAMP:eventtime}\] \- %{USERNAME:user} \- %{IPV4:clientip} \- %{NUMBER} \- %{WORD} %{NUMBER:exectime} %{WORD} %{NUMBER:time} %{GREEDYDATA:data} %{NUMBER:waittime}","\[%{DATESTAMP:eventtime}\] \- Process status database sync \- %{WORD}\.%{WORD}\.%{WORD}\:%{NUMBER:slavenumb}\(\#%{NUMBER}\) \(load %{NUMBER:nbutilisateur} grace period 5 minutes\) %{GREEDYDATA}"] }
      remove_field => "message"
    }
    date {
      match => [ "eventtime", "dd/MM/YYYY HH:mm:ss.SSS" ]
      target => "@timestamp"
    }
  }
  if [type] == "master" {
    grok {
      match => { "message" => ["%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}%{NUMBER}%{SPACE}(?<starttime>((?!<[0-9])%{HOUR}:)?%{MINUTE}(?::%{SECOND})(?![0-9]))"] }
      remove_field => "message"
    }
    date {
      match => [ "starttime", "HH:mm:ss", "mm:ss" ]
    }
  }
}
output {
  elasticsearch {
    hosts => "127.0.0.1:9200"
    index => "logstash-local3-%{+YYYY.MM.dd}"
  }
}
Here are the 3 log lines that I want to parse (in the order of the groks in my conf file):
[24/06/2020 21:57:29.548] - Process status database sync - us1salx08167.corpnet2.com:8100(#53738) (load 0 grace period 5 minutes) : current date 2020/06/24 21:57:29 update date 2020/06/24 21:55:44 old state OK new state OK
[29/05/2020 07:41:51.354] - ih912865 - 10.104.149.128 - 93 - Transaction 7635 COMPLETED 318 ms wait time 3183 ms
31730 31626 464 10970020 52:25 /plw/modules/bin/Lx86_64/opx2-intranet.exe -I /plw/modules/bin/Lx86_64/opx2-intranet.dxl -H /plw/modules/bin/Lx86_64 -L /plw/PLW_PROD/modules/preload-intranet.ini -- plw-sysconsole -port 8400 -logdir /plw/PLW_PROD/httpdocs/admin/log/ -slaves 2
So, I don't know if you've already resolved this, but below is something you could use.
N.B. I added a couple of extra fields, but you can easily remove those [https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-remove_field].
When I tried the expressions you provided, one of them actually failed in the Grok Debugger, so I took it upon myself to rewrite them all from scratch while keeping your variable names.
I noticed there was a lot of data that you simply didn't capture. If you want more captured, let me know.
Line 1:
[24/06/2020 21:57:29.548] - Process status database sync - us1salx08167.corpnet2.com:8100(#53738) (load 0 grace period 5 minutes) : current date 2020/06/24 21:57:29 update date 2020/06/24 21:55:44 old state OK new state OK
Pattern 1:
\[(?<eventtime>%{DATESTAMP})\] - Process status database sync - (?<host>%{HOSTNAME}):(?<slavenumber>%{NUMBER})(?<zz>\(#[\d]+\)) \(load (?<nbutilisateur>%{NUMBER}) grace period 5 minutes\)%{GREEDYDATA}
Line 2:
[29/05/2020 07:41:51.354] - ih912865 - 10.104.149.128 - 93 - Transaction 7635 COMPLETED 318 ms wait time 3183 ms
Pattern 2:
\[(?<eventtime>%{DATESTAMP})\] - (?<user>%{USER}) - (?<clientip>%{IPV4}) - %{NUMBER} - %{WORD} (?<exectime>%{NUMBER}) %{WORD} (?<ctime>%{NUMBER}) (?<ctimeunits>%{WORD}) wait time (?<waittime>%{NUMBER}) (?<waittimeunits>%{WORD})
Line 3:
31730 31626 464 10970020 52:25 /plw/modules/bin/Lx86_64/opx2-intranet.exe -I /plw/modules/bin/Lx86_64/opx2-intranet.dxl -H /plw/modules/bin/Lx86_64 -L /plw/PLW_PROD/modules/preload-intranet.ini -- plw-sysconsole -port 8400 -logdir /plw/PLW_PROD/httpdocs/admin/log/ -slaves 2
Pattern 3:
%{GREEDYDATA}(?<starttime>(?<=[\s])([\d]+:[\d]+))%{GREEDYDATA}
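Assembled into a filter block, that would look something like the following (a sketch; grok tries the patterns in order and stops at the first match, since break_on_match defaults to true):
filter {
  grok {
    match => { "message" => [
      "\[(?<eventtime>%{DATESTAMP})\] - Process status database sync - (?<host>%{HOSTNAME}):(?<slavenumber>%{NUMBER})(?<zz>\(#[\d]+\)) \(load (?<nbutilisateur>%{NUMBER}) grace period 5 minutes\)%{GREEDYDATA}",
      "\[(?<eventtime>%{DATESTAMP})\] - (?<user>%{USER}) - (?<clientip>%{IPV4}) - %{NUMBER} - %{WORD} (?<exectime>%{NUMBER}) %{WORD} (?<ctime>%{NUMBER}) (?<ctimeunits>%{WORD}) wait time (?<waittime>%{NUMBER}) (?<waittimeunits>%{WORD})",
      "%{GREEDYDATA}(?<starttime>(?<=[\s])([\d]+:[\d]+))%{GREEDYDATA}"
    ] }
    remove_field => "message"
  }
}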

Logstash receives a strange "<133>" code at the start of TrendMicro logs

My Logstash server is CentOS Linux release 8.1.1911.
"logstash.version" => "7.7.0"
I have a capture of what I received on UDP port 5514, obtained with:
nc -lvu 5514 -o log.txt
The content of log.txt:
<133>Jun 05 09:23:35 TMCM:EVT_URL_CONTENT_FILTERING Security product="OfficeScan" Security product node="N/A" Security product IP="xx.xx.xx.xx;xxxx::xxxx:xxxx:xxxx:4490" Event time="4/25/2020 11:46:01 PM (UTC)" URL="http://xxxxxxx.xxxxxxx.intranet/SMS_MP/.sms_pol?DEP-Z0120115-ScopeId_B14503FF-F7AA-49EC-A38C-F50D813EEC6E/Application_57a673e1-3e65-4f1c-8ce2-0f4cc1b38acc.SHA256:5EF20484EEC38EA203D7A885EAA48BE2DFDC4F130ED8BF5BEA333378875B2516" Source IP="" Destination IP="yyy.yyy.yyy.yyy" Policy rule="" Blocking type="Web reputation" Domain="xxxx-xxxxx" Event time (local)="4/25/2020 7:46:01 PM" Client host name="N/A" Reputation Score="81"
myfilter.conf
input {
  udp {
    port => 5514
    type => syslog
  }
}
filter {
  grok {
    match => {
      "message" => "(?<user_agent>[^>]*)(?<user_agent>[^:]*)%{POSINT}\s%{WORD:logfrom}\s%{WORD:logtag}\:\s%{NOTSPACE:eventname}\s([^=]*)\=%{QUOTEDSTRING:security_product} ([^=]*)\=%{QUOTEDSTRING:security_prod_node}\s([^=]*)\=\"%{IPV4:security_prod_ip}([^=]*)\=\"(?<agent_detected_time>%{MONTHNUM}\/%{MONTHDAY}\/%{YEAR} %{TIME}\s(?:AM|am|PM|pm)\s*\s\(%{TZ:tz}\)).*URL\=\"%{URI:url}\" ([^=]*)\=%{QUOTEDSTRING:src_ip}\s([^=]*)\=\"%{IPV4:dest_ipv4}\"\s([^=]*)\=%{QUOTEDSTRING:policy_rule} ([^=]*)\=%{QUOTEDSTRING:bloking_type} ([^=]*)\=%{QUOTEDSTRING:domain} ([^=]*)\=\"(?<server_alert_time>%{MONTHNUM}\/%{MONTHDAY}\/%{YEAR} %{TIME}\s(?:AM|am|PM|pm))\"\s([^=]*)\=%{QUOTEDSTRING:client_hostname} ([^=]*)\=\"%{BASE10NUM:reputation_score}/?"
    }
  }
}
output {
  stdout { codec => rubydebug }
}
An example of the Logstash output:
[2020-06-08T13:11:02,253][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
          "type" => "syslog",
    "@timestamp" => 2020-06-08T18:06:39.090Z,
       "message" => "<133>Jun 08 14:06:38 TMCM:EVT_URL_CONTENT_FILTERING Security product=\"OfficeScan\" Security product node=\"N/A\" Security product IP=\"xx.xx.xx.xx;xxxx::xxxx:xxx:xxxx:4490\" Event time=\"4/26/2020 7:33:36 AM (UTC)\" URL=\"http://blabnlabla.bla-blabla.intranet/SMS_MP/.sms_pol?DEP-Z0120105-ScopeId_B14503FF-F7AA-49EC-A38C-F50D813EEC6E/Application_2be50193-9121-4239-a70f-ba06ad7bbfbd.SHA256:6FF12991BBA769F9C15F7E1FA3E3058E22B4D918F6C5659CF7B976059082510D\" Source IP=\"\" Destination IP=\"xxx.xx.xxx.xx\" Policy rule=\"\" Blocking type=\"Web reputation\" Domain=\"bla-blabla\" Event time (local)=\"4/26/2020 3:33:36 AM\" Client host name=\"N/A\" Reputation Score=\"81\"",
      "@version" => "1",
          "host" => "xx.xxx.xx.xx",
          "tags" => [
        [0] "_grokparsefailure"
    ]
}
I have also tried "\<133\>" but it still appears. I have no idea what this <133> is.
P.S. I've been teaching myself Logstash for the last 2 weeks.
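For reference (an assumption, not a confirmed answer): the leading <133> looks like a standard syslog priority header, <PRI> with PRI = facility * 8 + severity, which the plain udp input passes through verbatim. A sketch that strips and decodes it before any further parsing, using the stock syslog_pri filter:
filter {
  grok {
    # peel off the leading <PRI> header, keep the rest of the line
    match => { "message" => "^<%{NONNEGINT:syslog_pri}>%{GREEDYDATA:syslog_message}" }
  }
  syslog_pri { }   # decodes the syslog_pri field into facility/severity fields
}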

ELK | Log file grok-filtered data not pushing into Elasticsearch

I have a log file in the below format that I want to extract into Elasticsearch, but the data filtered by Logstash is not being pushed into Elasticsearch.
With the same grok filter configuration, I am able to get the expected result in Kibana Dev Tools.
Sample logfile:
OCDE - 2019-05-22 13:24:34.000 ERROR org.ramyam.ocde.task.NBALookupTask.checkResponsesToBeProcessed - checkResponsesToBeProcessed started : Wed May 22 13:24:34 IST 2019
Filebeat configuration:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:\data\logs\OCDE.log
  document_type: ocde
Logstash configuration:
input {
  file {
    type => "ocde"
    path => "C:\data\logs\OCDE.log"
  }
  beats {
    port => 5044
    ssl => false
  }
}
filter {
  grok {
    match => [ "message", '%{DATA:moduleName} - %{TIMESTAMP_ISO8601:loggerTime}\s+%{LOGLEVEL:level}\s+%{JAVACLASS:className}\.%{DATA:methodName} - %{GREEDYDATA:loggermsg}}' ]
  }
}
output {
  if [type] == "ocde" {
    elasticsearch {
      hosts => ["localhost:9200"]
      #manage_template => false
      index => "enliven_be_log_yyyymmdd"
      document_type => "ocde"
    }
  }
}
I am expecting the below result in Elasticsearch from the above configuration:
{
  "level": "ERROR",
  "loggerTime": "2019-05-22 13:24:34.000",
  "moduleName": "OCDE",
  "methodName": "checkResponsesToBeProcessed",
  "className": "org.ramyam.ocde.task.NBALookupTask",
  "loggermsg": "checkResponsesToBeProcessed started : Wed May 22 13:24:34 IST 2019"
}
Can anyone please explain what I am missing, or share a sample configuration?
You can try the below grok pattern:
%{DATA:moduleName}%{SPACE}*-%{SPACE}*%{TIMESTAMP_ISO8601:loggerTime}%{SPACE}*%{LOGLEVEL:level}%{SPACE}*%{JAVACLASS:className}\.%{DATA:methodName}%{SPACE}*-%{SPACE}*%{GREEDYDATA:loggermsg}
Change your grok from:
%{DATA:moduleName} - %{TIMESTAMP_ISO8601:loggerTime}\s+%{LOGLEVEL:level}\s+%{JAVACLASS:className}\.%{DATA:methodName} - %{GREEDYDATA:loggermsg}}
to:
%{DATA:moduleName} - %{TIMESTAMP_ISO8601:loggerTime}\s+%{LOGLEVEL:level}\s+%{JAVACLASS:className}\.%{DATA:methodName} - %{GREEDYDATA:loggermsg}
To validate this, use http://grokdebug.herokuapp.com/ and paste the log message you provided into the "input" field, with the pattern in the "pattern" field.
Your pattern works fine; you just had one extra bracket at the end.
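For completeness, the corrected pattern dropped back into a filter block (a sketch using the original field names):
filter {
  grok {
    match => [ "message", '%{DATA:moduleName} - %{TIMESTAMP_ISO8601:loggerTime}\s+%{LOGLEVEL:level}\s+%{JAVACLASS:className}\.%{DATA:methodName} - %{GREEDYDATA:loggermsg}' ]
  }
}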
