Grok Patterns for SSSD Logs - logstash-grok

I am trying to parse SSSD daemon logs using Logstash grok patterns for better visibility.
Log samples:
(Mon Nov 9 12:08:56 2020) [sssd[nss]] [client_recv] (0x0200): Client disconnected!
(Mon Nov 9 12:08:56 2020) [sssd[nss]] [client_close_fn] (0x2000): Terminated client [0x55ffd29d93c0][22]
I have created custom Grok patterns as stated below:
SSSD_TIME [ \(%{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}\)]+
SSSD_DEMON \[[a-z]*\[[a-z]*\]\]+
SSSD_FUNCTION \[[a-z,_]*\]+
SSD_LOG_LEVEL (\(\dx\d*\))+
Using the above custom patterns, the query stated below produces this output:
%{SSSD_TIME:time} %{SSSD_DEMON:demon} %{SSSD_FUNCTION:function} %{SSD_LOG_LEVEL:loglevel}[:]\s+%{GREEDYDATA:message}
Output:
{
  "function": "[client_recv]",
  "loglevel": "(0x0200)",
  "time": "(Mon Nov 9 12:08:56 2020)",
  "demon": "[sssd[nss]]",
  "message": "Client disconnected!"
}
I need to extract only the values within the brackets and not the whole content.
I tried skipping the brackets, but it only works for the first value.
The query below skips the first bracket:
\(%{SSSD_TIME:time}\) %{SSSD_DEMON:demon} %{SSSD_FUNCTION:function} %{SSD_LOG_LEVEL:loglevel}[:]\s+%{GREEDYDATA:message}
I need to get the output below:
{
  "function": "client_recv",
  "loglevel": "0x0200",
  "time": "Mon Nov 9 12:08:56 2020",
  "demon": "sssd[nss]",
  "message": "Client disconnected!"
}
If anyone can help me with this, that would be great.
Thanks

Here is the grok pattern for your desired output:
\((?<timestamp>%{DAY} %{MONTH} %{MONTHNUM} %{TIME} %{YEAR})\) \[(?<daemon>(.*))\] \[%{DATA:function}\] \(%{DATA:log_level}\): %{GREEDYDATA:message}
I used the Grok Debugger to create the pattern.
If you want, you can then remove unnecessary fields like DAY, MONTH, etc. using Logstash's mutate filter, as in the sketch below.
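For example, a minimal Logstash filter built around this pattern might look like the following sketch; the overwrite option and the mutate cleanup are my assumptions, not part of the original answer:
filter {
  grok {
    # single-quoted so the backslashes in the pattern need no extra escaping
    match => { "message" => '\((?<timestamp>%{DAY} %{MONTH} %{MONTHNUM} %{TIME} %{YEAR})\) \[(?<daemon>(.*))\] \[%{DATA:function}\] \(%{DATA:log_level}\): %{GREEDYDATA:message}' }
    # keep only the trailing text in the message field
    overwrite => [ "message" ]
  }
  # with grok's default named_captures_only => true, unnamed patterns such as
  # DAY or MONTH are not emitted as fields; drop them here only if they appear
  mutate {
    remove_field => [ "DAY", "MONTH", "MONTHNUM", "TIME", "YEAR" ]
  }
}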

Related

Logstash - Grok Syntax Issues

I'm using Filebeat to send logs to Logstash, but I'm having issues with the grok syntax in Logstash. I used the grok debugger in Kibana and managed to come to a solution.
The problem is that I can't find the equivalent syntax for Logstash.
The original log:
{"log":"188.188.188.188 - tgaro [22/Aug/2022:11:37:54 +0200] \"PROPFIND /remote.php/dav/files/xxx#yyyy.com/ HTTP/1.1\" 207 1035 \"-\" \"Mozilla/5.0 (Windows) mirall/2.6.1stable-Win64 (build 20191105) (Nextcloud)\"\n","stream":"stdout","time":"2022-08-22T09:37:54.782377901Z"}
The message received in Logstash:
"message" => "{\"log\":\"188.188.188.188 - tgaro [22/Aug/2022:11:37:54 +0200] \\\"PROPFIND /remote.php/dav/files/xxx#yyyy.com/ HTTP/1.1\\\" 207 1035 \\\"-\\\" \\\"Mozilla/5.0 (Windows) mirall/2.6.1stable-Win64 (build 20191105) (Nextcloud)\\\"\\n\",\"stream\":\"stdout\",\"time\":\"2022-08-22T09:37:54.782377901Z\"}",
The grok pattern I used in the Grok Debugger (Kibana):
{\\"log\\":\\"%{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] \\\\\\"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\\\\\\" (?:-|%{NUMBER:response}) (?:-|%{NUMBER:bytes}) \\\\\\("%{DATA:referrer}\\\\\\") \\\\\\"%{DATA:user-agent}\\\\\\"
The real problem is that I can't even manage to get the IP (188.188.188.188).
I tried:
match => { "message" => '{\\"log\\":\\"%{IPORHOST:clientip}' # backslash to escape the backslash
match => { "message" => '{\\\"log\\\":\\\"%{IPORHOST:clientip}' # backslash to escape the quote
match => { "message" => "{\\\"log\\\":\\\"%{IPORHOST:clientip}" # backslash to escape the quote
Help would be appreciated.
Thanks!
P.S.: The log shown here is shortened. The real log mixes JSON and plain strings, so I can't send it as JSON in Filebeat.
OK, so I managed to make it work by using this:
grok {
  match => { "message" => '%{SYSLOGTIMESTAMP:syslog_timestamp} %{IPORHOST:syslog_server} %{WORD:syslog_tag}: %{GREEDYDATA:jsonMessage}' }
}
json {
  source => "jsonMessage"
}
grok {
  match => { "jsonMessage" => '%{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth} \[%{HTTPDATE:timestamp}\] \\"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\\" (?:-|%{NUMBER:response}) (?:-|%{NUMBER:bytes}) \\("%{DATA:referrer}\\") \\"%{DATA:user-agent}\\"' }
}
with a log like this:
Aug 24 00:00:01 hostname containers: {"log":"188.188.188.188 - user.name#things.com [23/Aug/2022:23:59:52 +0200] \"PROPFIND /remote.php/dav/files/ HTTP/1.1\" 207 1159 \"-\" \"Mozilla/5.0 (Linux) mirall/3.4.2-1ubuntu1 (Nextcloud, ubuntu-5.15.0-46-generic ClientArchitecture: x86_64 OsArchitecture: x86_64)\"\n","stream":"stdout","time":"2022-08-23T21:59:52.612843092Z"}
The first match fetches the first three fields (time, hostname, and tag), and then captures everything after the colon into jsonMessage with the GREEDYDATA pattern.
Then the json filter is applied to jsonMessage. After that, the information we need is in the new log field created by the json filter.
I still don't understand why my grok works in the Kibana debugger but not in Logstash. It's probably because some characters need to be escaped, but even when I escaped them it didn't work.
@RaiZy_Style has already extracted the JSON using the json filter and is trying to match the jsonMessage with the grok filter.
I used the Grok Debugger to create a grok pattern that will match the jsonMessage field.
The jsonMessage that I assume comes out of the json filter for the above log example is:
188.188.188.188 - user.name#things.com [23/Aug/2022:23:59:52 +0200] \\"PROPFIND /remote.php/dav/files/ HTTP/1.1\\" 207 1159 \\"-\\" \\"Mozilla/5.0 (Linux) mirall/3.4.2-1ubuntu1 (Nextcloud, ubuntu-5.15.0-46-generic ClientArchitecture: x86_64 OsArchitecture: x86_64)\\"\\n"
Here is a pattern that will work:
%{IPORHOST:clientip} %{USER:ident} %{DATA:auth} \[%{HTTPDATE:timestamp}\] \\\\"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) \\\\"
Note: if you want to expand the rawrequest field value into individual fields, that can be done as well; see the sketch below.
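For example, a hypothetical follow-up grok on the rawrequest field could split it the same way the main alternation does (only useful on events where the fallback capture was taken):
grok {
  # split a raw request such as "PROPFIND /remote.php/dav/files/ HTTP/1.1"
  match => { "rawrequest" => "%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?" }
  tag_on_failure => []
}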

Grok expression to Parse Log data

I have just started using grok for Logstash, and I am trying to parse the log line below using a grok filter.
10.210.57.60 0x756682x2 connectadmin [12/May/2020:00:00:00 +0530] "GET /rest/auth/1/session HTTP/1.1" 200 286 456 "-" "Jersey/2.11 (HttpUrlConnection 1.8.0_171)" "1twyrho"
I am interested in:
IP: 10.210.57.60
user: connectadmin
timestamp: 12/May/2020:00:00:00 +0530
URL: /rest/auth/1/session
Response Code: 200
I am currently stuck at the grok expression %{IPV4:client_ip} %{WORD:skip_me1} %{USERNAME}, with which I am able to get the IP and username. Can you please help me proceed?
Thank you.
I used the grok debugger at https://grokdebug.herokuapp.com/ to get the desired output.
Below is the grok pattern that will match your requirement:
%{IPV4:IP} %{GREEDYDATA:girbish} %{USERNAME:user} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{GREEDYDATA:girbish}
Note: you can remove the unnecessary fields using Logstash's mutate filter, as in the sketch below.
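For example, wrapped in a Logstash filter with the throwaway girbish field dropped afterwards (a sketch; the single quotes avoid escaping the double quotes inside the pattern, and the mutate step is my assumption):
filter {
  grok {
    match => { "message" => '%{IPV4:IP} %{GREEDYDATA:girbish} %{USERNAME:user} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{GREEDYDATA:girbish}' }
  }
  mutate {
    # both throwaway captures are named girbish, so one entry removes them
    remove_field => [ "girbish" ]
  }
}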

ELK | Log file grok filtered format not pushing into elastic search

I have a log file in the format below that I want to ingest into Elasticsearch, but the Logstash-filtered data is not being pushed to Elasticsearch.
With the same grok filter configuration, I am able to get results from Kibana Dev Tools.
Sample logfile:
OCDE - 2019-05-22 13:24:34.000 ERROR org.ramyam.ocde.task.NBALookupTask.checkResponsesToBeProcessed - checkResponsesToBeProcessed started : Wed May 22 13:24:34 IST 2019
Filebeat configuration:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:\data\logs\OCDE.log
  document_type: ocde
Logstash configuration:
input {
  file {
    type => "ocde"
    path => "C:\data\logs\OCDE.log"
  }
  beats {
    port => 5044
    ssl => false
  }
}
filter {
  grok {
    match => [ "message", '%{DATA:moduleName} - %{TIMESTAMP_ISO8601:loggerTime}\s+%{LOGLEVEL:level}\s+%{JAVACLASS:className}\.%{DATA:methodName} - %{GREEDYDATA:loggermsg}}' ]
  }
}
output {
  if [type] == "ocde" {
    elasticsearch {
      hosts => ["localhost:9200"]
      #manage_template => false
      index => "enliven_be_log_yyyymmdd"
      document_type => ocde
    }
  }
}
I am expecting the result below in Elasticsearch from the above configuration:
{
  "level": "ERROR",
  "loggerTime": "2019-05-22 13:24:34.000",
  "moduleName": "OCDE",
  "methodName": "checkResponsesToBeProcessed",
  "className": "org.ramyam.ocde.task.NBALookupTask",
  "loggermsg": "checkResponsesToBeProcessed started : Wed May 22 13:24:34 IST 2019"
}
Can anyone please explain what I am missing, or share a sample configuration?
You can try the grok pattern below:
%{DATA:moduleName}%{SPACE}*-%{SPACE}*%{TIMESTAMP_ISO8601:loggerTime}%{SPACE}*%{LOGLEVEL:level}%{SPACE}*%{JAVACLASS:className}\.%{DATA:methodName}%{SPACE}*-%{SPACE}*%{GREEDYDATA:loggermsg}
Change your grok from:
%{DATA:moduleName} - %{TIMESTAMP_ISO8601:loggerTime}\s+%{LOGLEVEL:level}\s+%{JAVACLASS:className}\.%{DATA:methodName} - %{GREEDYDATA:loggermsg}}
to:
%{DATA:moduleName} - %{TIMESTAMP_ISO8601:loggerTime}\s+%{LOGLEVEL:level}\s+%{JAVACLASS:className}\.%{DATA:methodName} - %{GREEDYDATA:loggermsg}
To validate this, use http://grokdebug.herokuapp.com/ and paste in the log message you provided along with the corrected pattern.
Your pattern works fine; you just had one extra bracket at the end.
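For reference, the question's filter block with the corrected pattern would be:
filter {
  grok {
    match => [ "message", '%{DATA:moduleName} - %{TIMESTAMP_ISO8601:loggerTime}\s+%{LOGLEVEL:level}\s+%{JAVACLASS:className}\.%{DATA:methodName} - %{GREEDYDATA:loggermsg}' ]
  }
}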

How to use IF ELSE condition in grok pattern in logstash

I have web and API logs combined, and I want to save them separately in Elasticsearch. So I want to write one conditional pattern: if the request is for the API, the if branch should execute; if the request is for the web, the else branch should execute.
Below are a few web and API logs.
00:06:27,778 INFO [stdout] (ajp--0.0.0.0-8009-38) 00:06:27.777 [ajp--0.0.0.0-8009-38] INFO c.r.s.web.rest.WidgetController - Method getWidgetDetails() started to get widget details.
00:06:27,783 INFO [stdout] (ajp--0.0.0.0-8009-38) ---> HTTP GET http://api.survey.me/v1/getwidgetdetails?profileName=jeremy-steffens&profileLevel=INDIVIDUAL&companyProfileName=premier-nationwide-lending&hideHistory=true
00:06:27,817 INFO [stdout] (ajp--0.0.0.0-8009-38) <--- HTTP 200 http://api.survey.me/v1/getwidgetdetails?profileName=jeremy-steffens&profileLevel=INDIVIDUAL&companyProfileName=premier-nationwide-lending&hideHistory=true (29ms)
00:06:27,822 INFO [stdout] (ajp--0.0.0.0-8009-38) 00:06:27.822 [ajp--0.0.0.0-8009-38] INFO c.r.s.web.rest.WidgetController - Method getWidgetDetails() finished.
00:06:27,899 INFO [stdout] (ajp--0.0.0.0-8009-40) 00:06:27.899 [ajp--0.0.0.0-8009-40] INFO c.r.s.web.controller.LoginController - Inside initLoginPage() of LoginController
I tried to write a condition, but it's not working. It works only up to the thread name. After the thread I have multiple types of logs, so I am not able to write a pattern without an if condition.
(?:%{TIME:CREATED_ON})(?:%{SPACE})%{WORD:LEVEL}%{SPACE}\[%{NOTSPACE}\]%{SPACE}\(%{NOTSPACE:THREAD}\)
Can anybody give me a suggestion?
You don't need an if/else condition to do this; you can use multiple patterns, one matching the API log lines and the other matching the web log lines.
For the API log lines you can use the following pattern:
(?:%{TIME:CREATED_ON})(?:%{SPACE})%{WORD:LEVEL}%{SPACE}\[%{NOTSPACE}\]%{SPACE}\(%{NOTSPACE:THREAD}\)%{SPACE}(?:%{DATA})%{SPACE}\[%{DATA}\]%{SPACE}%{WORD}%{SPACE}%{GREEDYDATA:MSG}
And your result will be something like this:
{
  "MSG": "c.r.s.web.controller.LoginController - Inside initLoginPage() of LoginController",
  "CREATED_ON": "00:06:27,899",
  "LEVEL": "INFO",
  "THREAD": "ajp--0.0.0.0-8009-40"
}
For the web lines you can use the following pattern:
(?:%{TIME:CREATED_ON})(?:%{SPACE})%{WORD:LEVEL}%{SPACE}\[%{NOTSPACE}\]%{SPACE}\(%{NOTSPACE:THREAD}\)%{SPACE}%{DATA}%{WORD:PROTOCOL}%{SPACE}%{WORD:MethodOrStatus}%{SPACE}%{GREEDYDATA:ENDPOINT}
And the result will be:
{
  "CREATED_ON": "00:06:27,783",
  "PROTOCOL": "HTTP",
  "ENDPOINT": "http://api.survey.me/v1/getwidgetdetails?profileName=jeremy-steffens&profileLevel=INDIVIDUAL&companyProfileName=premier-nationwide-lending&hideHistory=true",
  "LEVEL": "INFO",
  "THREAD": "ajp--0.0.0.0-8009-38",
  "MethodOrStatus": "GET"
}
To use multiple patterns in grok, just do this:
grok {
  match => ["message", "pattern1", "pattern2"]
}
Or you can save your patterns to a file and use patterns_dir to point to the file's directory, as in the sketch below.
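For example, using a patterns file (the ./patterns directory and the APILOG/WEBLOG names are hypothetical; the pattern bodies are the two patterns above):
# ./patterns/applogs
APILOG (?:%{TIME:CREATED_ON})(?:%{SPACE})%{WORD:LEVEL}%{SPACE}\[%{NOTSPACE}\]%{SPACE}\(%{NOTSPACE:THREAD}\)%{SPACE}(?:%{DATA})%{SPACE}\[%{DATA}\]%{SPACE}%{WORD}%{SPACE}%{GREEDYDATA:MSG}
WEBLOG (?:%{TIME:CREATED_ON})(?:%{SPACE})%{WORD:LEVEL}%{SPACE}\[%{NOTSPACE}\]%{SPACE}\(%{NOTSPACE:THREAD}\)%{SPACE}%{DATA}%{WORD:PROTOCOL}%{SPACE}%{WORD:MethodOrStatus}%{SPACE}%{GREEDYDATA:ENDPOINT}

grok {
  patterns_dir => ["./patterns"]
  # patterns are tried in order and grok stops at the first match,
  # so the more specific APILOG goes first
  match => ["message", "%{APILOG}", "%{WEBLOG}"]
}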
If you still want to use a conditional, just check for a distinctive string in the message, for example:
if "HTTP" in [message] {
grok { your grok for the web messages }
} else {
grok { your grok for the api messages }
}

conditional matching with grok for logstash

I have PHP logs in this format:
[Day Mon DD HH:MM:SS YYYY] [Log-Type] [client <ipv4 ip address>] <some php error type>: <other msg with /path/of/a/php/script/file.php and something else>
[Day Mon DD HH:MM:SS YYYY] [Log-Type] [client <ipv4 ip address>] <some php error type>: <other msg without any file name in it>
[Day Mon DD HH:MM:SS YYYY] [Log-Type] [client <ipv4 ip address>] <some msg with out semicolon in it but /path/of/a/file inside the message>
I am trying to send these to Graylog2 after processing them through Logstash. Using this post here, I was able to get started. Now I would like to get some additional fields, so that my final version would look something like this:
{
  "message" => "<The entire error message goes here>",
  "#version" => "1",
  "#timestamp" => "converted timestamp from Day Mon DD HH:MM:SS YYYY",
  "host" => "<ipv4 ip address>",
  "logtime" => "Day Mon DD HH:MM:SS YYYY",
  "loglevel" => "Log-Type",
  "clientip" => "<ipv4 ip address>",
  "php_error_type" => "<some php error type>",
  "file_name_from_the_log" => "/path/of/a/file || /path/of/a/php/script/file.php",
  "errormsg" => "<the error message after first colon (:) found>"
}
I have expressions for the individual lines, or at least I think they should parse, based on the grok debugger. Something like this:
%{DATA:php_error_type}: %{DATA:message_part1}%{URIPATHPARAM:file_name}%{GREEDYDATA:errormsg}
%{DATA:php_error_type}: %{GREEDYDATA:errormsg}
%{DATA:message_part1}%{URIPATHPARAM:file_name}%{GREEDYDATA:errormsg}
But somehow I am finding it very difficult to make this work for the entire log file.
Any suggestions, please? Also, I am not sure whether other types of error messages might appear in the log file, but the intention is to get the same format for all. Any suggestions on how to tackle these logs to get the above-mentioned format?
The grok filter can be configured with multiple patterns:
grok {
  match => [
    "message", "%{DATA:php_error_type}: %{DATA:message_part1}%{URIPATHPARAM:file_name}%{GREEDYDATA:errormsg}",
    "message", "%{DATA:php_error_type}: %{GREEDYDATA:errormsg}",
    "message", "%{DATA:message_part1}%{URIPATHPARAM:file_name}%{GREEDYDATA:errormsg}"
  ]
}
(Instead of a single filter with multiple patterns you could have multiple grok filters, but then you'd probably want to disable the _grokparsefailure tagging with tag_on_failure => [].)
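That multi-filter variant might look like this sketch (using the patterns above; tag_on_failure => [] keeps a non-matching grok from tagging the event):
grok {
  match => [ "message", "%{DATA:php_error_type}: %{DATA:message_part1}%{URIPATHPARAM:file_name}%{GREEDYDATA:errormsg}" ]
  tag_on_failure => []
}
grok {
  match => [ "message", "%{DATA:php_error_type}: %{GREEDYDATA:errormsg}" ]
  tag_on_failure => []
}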
If some part of your log line is sometimes missing, you can use the following syntax:
(?:%{PATTERN1}|%{PATTERN2})
or
(?:%{PATTERN1}|)
to allow PATTERN1 or '' (empty).
Using this, you have only one pattern to manage:
grok {
  match => [
    "message", "(?:%{DATA:php_error_type}: |)(?:%{DATA:message_part1}:)(?:%{URIPATHPARAM:file_name}|)%{GREEDYDATA:errormsg}"
  ]
}
If you have problems, try replacing %{DATA} with a more restrictive pattern.
You can also use this syntax (more regex-like):
(?:%{PATTERN1})?
To debug a complex grok pattern, I recommend:
https://grokconstructor.appspot.com/do/match (multiline option, multiple input lines at the same time, and other options)
https://grokdebug.herokuapp.com/ (simpler to use)
