Grok filter pattern: unable to get required values from square brackets (logstash-grok)

Part of a sample log entry is given below:
[\"request_type\", \"post\"], [\"status_code\", 404], [\"u_id\", 111111]
I need to get the values post, 404 and 111111 using grok. It worked for the example below:
[\"user_id\", 111111] => %{GREEDYDATA:A}, %{WORD:U_ID}\] gives this output:
{
"A": [
[
"[\\"user_id\\""
]
],
"U_ID": [
[
"111111"
]
]
}
When I try to keep "user_id" as literal text and extract only its value with the filter
[\"user_id\", %{WORD:U_ID}\]
I get a compile error or an empty result ({}).
Please let me know where the error is, or whether anything is missing from the filter pattern.

Here is the grok pattern that matches your log:
%{GREEDYDATA:no_use}%{POSINT:status_code}\]%{GREEDYDATA:no_use} %{POSINT:UID}\]
I have used the Grok Debugger to validate the grok pattern.
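A likely cause of the compile error in the question is the unescaped opening bracket: [ starts a regex character class, and because the closing bracket is escaped as \], the class never ends. The following is a minimal sketch, not a tested configuration, that escapes both brackets and names each value. It assumes the log line arrives in the message field, that the default config.support_escapes: false setting is in effect, and that the backslashes before the quotes may or may not be present in the raw event (hence the optional \\?). The field names request_type, status_code and u_id are illustrative choices.

```
filter {
  grok {
    # Single-quoted string, so the double quotes need no config-level escaping.
    # \[ and \] match literal brackets; \\? matches an optional literal backslash.
    match => {
      "message" => '\[\\?"request_type\\?", \\?"%{WORD:request_type}\\?"\], \[\\?"status_code\\?", %{INT:status_code}\], \[\\?"u_id\\?", %{INT:u_id}\]'
    }
  }
}
```

Anchoring on the literal key names avoids the two GREEDYDATA captures, which tends to make the pattern both faster and easier to read.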

Related

Regex group from within custom grok pattern

I'm trying to create custom grok patterns to extract various data using Logstash, and I am racking my brain trying to get the syntax right to pull the equivalent of regex group 1 from my log rows. I've looked at a ton of threads on this over the past 2 days, but nothing out there fits my example, and none of the canned grok patterns seem like they will pull the value I need.
Three example log file rows look similar to this (with abbreviated data for the examples):
2022-04-07 12:52:06,184:INFO :Thread-70_SCHEDULE.0001: MsgID=63759111848731967
2022-04-07 07:23:39,876:INFO :Thread-53_OrderInterfaceIntServer: MsgID=21316889724753182|
07:23:40,482 INFO [stdout] (http-/0.0.0.0:8080-20) 2022-04-07 07:23:40,482:ERROR
I want to create a custom grok pattern called SERVICE that extracts a pattern match using a regex match string:
Thread-[0-9]{2}_(.*?)\:
that for the 3 rows would return:
SCHEDULE.0001
OrderInterfaceIntServer
""
In the log:
SERVICE will always be prefixed by "Thread-xx_" where xx = 2-digit number followed by underscore. Some logs may not have this pattern at all (like row 3). In that case, no match.
SERVICE is always followed by a colon
In grok, I can define this in 2 ways:
SERVICE Thread-[0-9]{2}_(.*?)\:
or as a field using (?<service>Thread-[0-9]{2}_(.*?)\:)
However, for row 1, I get this response value:
{
"service": [
[
"Thread-70_SCHEDULE.0001:"
]
]
}
What I want is:
{
"service": [
[
"SCHEDULE.0001"
]
]
}
Which is the equivalent of the regex group 1 response. I can't figure out how to manage the grok patterns to get the result I need.
You do not have to include all of the pattern in the capture group. You can use
grok { match => { "message" => "Thread-[0-9]{2}_(?<service>.*?):" } }
That will result in the following for the first two events, respectively,
"service" => "SCHEDULE.0001",
"service" => "OrderInterfaceIntServer",
and a "_grokparsefailure" tag on the third event.
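If a reusable named pattern is still wanted, one option is to define only the part to be captured as the custom pattern and keep the Thread-xx_ prefix and the trailing colon outside it. The sketch below uses the grok filter's pattern_definitions and tag_on_failure options. SERVICE_NAME is a hypothetical pattern name, and [^:]+ (everything up to the next colon) is an assumption standing in for the original (.*?).

```
filter {
  grok {
    # Hypothetical custom pattern: one or more characters that are not a colon.
    pattern_definitions => { "SERVICE_NAME" => "[^:]+" }
    # Only the SERVICE_NAME part is stored in the service field.
    match => { "message" => "Thread-[0-9]{2}_%{SERVICE_NAME:service}:" }
    # Rows without a Thread-xx_ prefix get no service field and no failure tag.
    tag_on_failure => []
  }
}
```

Setting tag_on_failure to an empty list is a design choice for logs where a missing service is expected; leave it at its default if a missing match should be flagged.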

Logstash Filter : syntax

I've recently begun learning Logstash and the syntax is confusing me.
For example, I have seen match used in several different forms:
match => [ "%{[date]}" , "YYYY-MM-dd HH:mm:ss" ]
match => { "message" => "%{COMBINEDAPACHELOG}" }
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
What does each of these keys ("%{[date]}", "message", "timestamp") mean? And where can I find proper documentation that explains all the keywords and syntax?
Please help and provide links if possible.
The grok{} filter has a match parameter that takes a field and a pattern. It will apply the pattern, trying to extract new fields from it. Your second example is from grok, so it will try to apply the COMBINEDAPACHELOG pattern against the text in the "message" field.
The doc for grok{} is here, and there are detailed blogs, too.
The other two examples look like they're from the date{} filter, which does a similar thing. It takes a field containing a string that represents a date, parses that field with the given format, and (by default) replaces the value in the @timestamp field.
The doc for date{} is here and examples here.
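As a rough illustration of how the two filters fit together, the sketch below first uses grok to split an Apache access log line into fields and then hands one of those fields to date. It assumes the legacy COMBINEDAPACHELOG pattern, which stores the request time in a field named timestamp; the field name may differ with other pattern sets.

```
filter {
  grok {
    # Key = field to read, value = pattern to apply to that field.
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    # First element = field holding the date string, remaining elements = formats to try.
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    # @timestamp is already the default target; it is spelled out here for clarity.
    target => "@timestamp"
  }
}
```

So in both filters the first string names a field on the event, and the second describes what that field is expected to look like: a grok pattern in one case, a date format in the other.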

How to pull specific data out of a message in LogStash

I am trying to take log data from a custom application that has a well defined format. I am trying to pick out certain pieces of the data using the grok filter, but I am not having any luck. Here is a sample log:
- System.Data.SqlClient.SqlException (0x80131904): Arithmetic overflow error converting IDENTITY to data type int.
Arithmetic overflow occurred.
What I would like to do is extract SqlException from the string. Here is the grok that I am using:
grok {
  match => {
    "message" => [
      "(?m)%{DATE:TIMESTAMP_DATE}%{SPACE}%{TIME:TIMESTAMP_TIME}%{SPACE}%{WORD:LOG_LEVEL}%{SPACE}(?<THREAD>[^\s]+)%{SPACE}(?<HOST>[^\s]+)%{SPACE}%{GREEDYDATA:MESSAGE}",
      "(?<EXCEPTION>[.*]+)"
    ]
  }
}
I have tried several different ways, but I guess I am not completely understanding the documentation. What I would expect is that all of the fields I extract with the first pattern would be accompanied by the result of the second pattern. In other words:
TIMESTAMP_DATE,TIMESTAMP_TIME,LOG_LEVEL,THREAD,HOST,MESSAGE,EXCEPTION
I am getting the other fields perfectly; it is just the additional match that I am missing. Any help would be appreciated. Thanks
If you specify multiple patterns, grok by default only checks them until the first match is encountered. If you want to match against both patterns regardless of whether the first one matched, you can change the behaviour like this:
grok {
  # Try every pattern in the list instead of stopping at the first match.
  break_on_match => false
  match => {
    "message" => [
      "(?m)%{DATE:TIMESTAMP_DATE}%{SPACE}%{TIME:TIMESTAMP_TIME}%{SPACE}%{WORD:LOG_LEVEL}%{SPACE}(?<THREAD>[^\s]+)%{SPACE}(?<HOST>[^\s]+)%{SPACE}%{GREEDYDATA:MESSAGE}",
      "(?<EXCEPTION>[.*]+)"
    ]
  }
}
Check out the docs under: https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#plugins-filters-grok-break_on_match
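Note that break_on_match alone may not be enough here: [.*]+ is a character class, so it matches only runs of literal dots and asterisks rather than arbitrary text. A hedged sketch of one alternative follows. It assumes the exception type is a dotted name ending in Exception, as in the sample log; the pattern is an illustration and not part of the original answer.

```
filter {
  grok {
    # Assumed shape: a dotted identifier that ends in "Exception",
    # such as System.Data.SqlClient.SqlException in the sample log.
    match => { "message" => "(?<EXCEPTION>[A-Za-z_][\w.]*Exception)" }
    # Events without an exception are left untagged.
    tag_on_failure => []
  }
}
```

Keeping this in its own grok block, placed after the main one, means the exception lookup runs whether or not the first pattern matched.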

How can you parse 2 properties out of 1 value with grok in logstash?

Some context:
I want to parse the following log statement using grok in logstash
07:51:45,729 TRACE [com.company.Class] (ajp-/1.2.3.4:8080-251) USERID called path: /url and took: 1000 ms
I am now using the following syntax to parse the complete message:
%{DATA:time}\s%{DATA:level}\s%{DATA:class}\s%{DATA:thread}\s%{DATA:userid}\s.*path:\s%{DATA:url}\s.*:\s%{NUMBER:duration:int}\sms
Which gives me all the properties that I have defined.
My question:
I want to parse this part (ajp-/1.2.3.4:8080-251) into a 'thread' property and an ip property.
The result needs to be:
thread: (ajp-/1.2.3.4:8080-251)
ip: 1.2.3.4
How can I do this?
Thanks
Just add a second grok filter after your working one. Do not put this in your existing grok filter because it will finish after the first match.
Example:
grok {
  match => [ 'thread', '%{IP:ip}' ]
}
This reads your previously extracted field thread => "(ajp-/1.2.3.4:8080-251)" and adds a new field ip => "1.2.3.4"
Apart from that, I would recommend being more specific with your pattern. You used DATA every time, which is rather imprecise. Start with something like this:
%{TIME:timestamp} %{WORD:method} \[%{JAVACLASS:class}\] \(%{DATA:thread}\) %{NUMBER:userid} %{DATA}%{URIPATH:uri}%{DATA}
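Put together, the two-step approach could look like the sketch below. The first grok uses the pattern from the question unchanged, and the second reads only the thread field. The tag_on_failure setting is an added assumption, for the case where some threads contain no IP address.

```
filter {
  # First pass: the pattern from the question, unchanged.
  grok {
    match => { "message" => "%{DATA:time}\s%{DATA:level}\s%{DATA:class}\s%{DATA:thread}\s%{DATA:userid}\s.*path:\s%{DATA:url}\s.*:\s%{NUMBER:duration:int}\sms" }
  }
  # Second pass: look for an IP address inside the thread field only.
  grok {
    match => { "thread" => "%{IP:ip}" }
    # Assumption: threads without an IP should not be tagged as failures.
    tag_on_failure => []
  }
}
```

Because the second filter never touches message, the thread field keeps its full original value while ip is added alongside it.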

Different structure in a few lines in my log file

My log file contains lines with different structures, and I cannot grok it. I don't know whether it is possible to test by line or by attribute; I'm still a beginner.
If that is unclear, here are some examples:
input :
id=firewall action=bloc type=web
id=firewall fw="ER" type=filter
id=firewall fw="Az" tz="loo" action=bloc
Pattern:
id=%{WORD:id} ...
I thought about wrapping some patterns in ()? to make them optional,
but I don't know exactly how to do it.
You can use this site to test it: http://grokdebug.herokuapp.com/
Any help, please? What should I do?
Logstash supports key-value parsing through the kv filter; take a look at http://logstash.net/docs/1.4.2/filters/kv.
Or you could use multiple match values:
grok {
  patterns_dir => "./patterns"
  match => [
    "message", "%{BASE_PATTERN} %{EXTRA_PATTERN}",
    "message", "%{BASE_PATTERN}",
    "message", "%{SOME_OTHER_PATTERN}"
  ]
}
I am not sure I understood your question well, but I will try to answer. I think the first thing you have to do is parse the different fields from your input. An example pattern to parse your first input line:
PATTERN %{NOTSPACE} %{NOTSPACE} %{NOTSPACE} (in $LOGSTASH_HOME/pattern/extra)
Then, in your Logstash configuration file:
filter {
  grok {
    patterns_dir => "$LOGSTASH_HOME/pattern"
    match => { "message" => "%{PATTERN}" }
  }
}
This will match your first line as 3 fields ("id=firewall" "action=bloc" "type=web"); you will have to adapt it if you have more than 3 fields.
The last thing you seem to be looking for is splitting each field into a key-value pair, so that id=firewall becomes id => "firewall". This can be done with the kv plugin. I have never used it, but I recommend the Logstash docs for it.
If I did not understand your question, please clarify.
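For lines made up entirely of key=value pairs, as in the sample input, the kv filter mentioned in both answers may be all that is needed. The following is a minimal sketch, assuming the raw line is in the message field. The two split settings shown are the filter's defaults and are written out only to make the behaviour explicit.

```
filter {
  kv {
    # Field that holds the raw line.
    source => "message"
    # Pairs are separated by spaces.
    field_split => " "
    # Key and value are separated by "=".
    value_split => "="
  }
}
```

With this approach each line produces only the keys it actually contains (id, action, type, fw, tz), so the differing structures do not need separate grok patterns.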

Resources