grok filtering pattern issue - logstash

I'm trying to match the log level of a log file with a grok filter, but I still get a _grokparsefailure. The problem may be the space between the [ and the log level.
Example log line: 2017-04-21 10:12:03,004 [ INFO] Message
My filter:
filter {
  grok {
    match => {
      "log.level" => "\[ %{LOGLEVEL:loglevel}\]"
    }
  }
}
I also tried some other solutions without success:
"\[ *%{LOGLEVEL:loglevel}\]"
"\[%{SPACE}%{LOGLEVEL:loglevel}\]"
Thanks in advance for your help

The issue is with the match option in your filter: this option is a hash that tells the filter which field to look at and which pattern to match it against.
Your regex is fine (you can check with http://grokconstructor.appspot.com/do/match), the issue is with the field name; it should be message.
So in your case, your filter should look like this:
grok {
  match => {
    "message" => "\[ %{LOGLEVEL:loglevel}\]"
  }
}

The point is that the default field is message, and you need to match the whole string:
filter {
  grok {
    match => {
      "message" => "%{TIMESTAMP_ISO8601:logDate} \[ %{LOGLEVEL:loglevel}\]%{GREEDYDATA:messages}"
    }
  }
}
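If you also want the event's @timestamp to reflect the log line rather than the time of ingest, the captured logDate field can be fed to a date filter inside the same filter block. A minimal sketch, assuming the timestamp format from the sample line (comma before the milliseconds):
  date {
    # "2017-04-21 10:12:03,004" from the sample log line
    match => ["logDate", "yyyy-MM-dd HH:mm:ss,SSS"]
  }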

Related

grok/kv filter audit logs logstash

I'm learning how to use the grok plugin. I have a message string like so
"type=CRYPTO_SESSION msg=audit(111111.111:111111): pid=22730 uid=0 auid=123 ses=123 subj=system_u:system_r:sshd_t:a1-a1:a1.a1234 msg='op=xx=xx cipher=xx ksize=111 mac=xx pfs=xxx spid=111 suid=000 rport=000 laddr=11.11.111.111 lport=123 exe=\"/usr/sbin/sshd\" hostname=? addr=11.111.111.11 terminal=? res=success'"
I'd like to extract the fields laddr, addr, and lport. I created a patterns directory with the following structure
patterns
|
-- laddr
|
-- addr
My filter is written like so
filter {
  grok {
    patterns_dir => ["./patterns"]
    match => { "messaage" => "%{LADDR:laddr} %{ADDR:addr}"}
  }
}
I was expecting to extract at least laddr and addr. I get matches using https://grokdebug.herokuapp.com/ with these patterns:
(?<laddr>\b(laddr=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b)
(?<addr>\b(addr=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b)
but the configuration fails to compile. I'm just going off of these docs: https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html. I've also tried using a kv filter; the issue I run into when I try something like
filter {
  kv {
    value_split => "="
  }
}
is that I end up with the msg field showing up twice. I'd really like to figure out how to get the properties out of this string. Any help would be greatly appreciated.
I think it's field_split:
filter {
  kv {
    field_split => "&?"
  }
}
Did you try this? Are you getting any error messages?
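If the kv route still appeals, the filter also has an include_keys option, which might sidestep the duplicate msg problem by keeping only the keys you care about. A rough sketch, assuming the audit line is space-separated (this option choice is my suggestion, not something from the thread):
filter {
  kv {
    field_split => " "
    value_split => "="
    # keep only the keys from the question; everything else is ignored
    include_keys => ["laddr", "addr", "lport"]
  }
}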

logstash converting ruby code to logstash filters

I was wondering what would be the best way to implement the following task in Logstash.
I have the following field that contains multiple paths divided by ':':
my_field : "/var/log/my_custom_file.txt:/var/log/otherfile.log/:/root/aaa.jar"
I want to add a new field called "first_file" that will contain only the file name (without suffix) of the first path:
first_file : my_custom_file
I implemented it with the following Ruby code:
code => 'event.set("first_file",event.get("[my_field]").split(":")[0].split("/")[-1].split(".")[0])'
How can I use Logstash filters (add_field, split, grok) to do the same task? I feel like using Ruby code should be my last option.
You could do it using just grok, but I think it would be clearer to use mutate to pull out the first value
mutate { split => { "my_field" => ":" } }
mutate { replace => { "my_field" => "%{[my_field][0]}" } }
grok { match => { "my_field" => "/(?<my_field>[^/]+)\.%{WORD}$" } overwrite => [ "my_field" ] }
rather than
grok { match => { "my_field" => "/(?<my_field>[^/]+)\.%{WORD}:" } overwrite => [ "my_field" ] }
The (?<my_field>[^/]+) is a custom pattern (documented here) which creates a field called [my_field] from a sequence of one or more (+) characters which are not /
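Assembled into a single filter block (my_field is the field name from the question), the three steps might look like this:
filter {
  # turn "a:b:c" into an array, then keep only the first element
  mutate { split => { "my_field" => ":" } }
  mutate { replace => { "my_field" => "%{[my_field][0]}" } }
  # strip the directory part and the extension, overwriting my_field
  grok {
    match => { "my_field" => "/(?<my_field>[^/]+)\.%{WORD}$" }
    overwrite => [ "my_field" ]
  }
}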
Yes, with a basic grok you could match every field in the value.
This kind of filter should work (put it in your Logstash configuration file); it extracts the "basename" of each file (the filename without extension and path):
filter {
  grok {
    match => { "my_field" => "%{GREEDYDATA}/%{WORD:filename}.%{WORD}:%{GREEDYDATA}/%{WORD:filename2}.%{WORD}:%{GREEDYDATA}/%{WORD:filename3}.%{WORD}" }
  }
}
You could be more strict in grok by using PATH in place of GREEDYDATA; I'll let you determine the approach that works best in your context.
You could debug the grok pattern with the online tool grokdebug.

Getting optional field out of message in logstash grok filter

I'm trying to extract the number of ms in this log line:
20190726160424 [INFO] [concurrent/forasdfMES-managedThreadFactory-Thread-10] - Metricsdceptor: ## End of call: Historirrtory.getHistrrOrder took 2979 ms
The problem is that not all log lines contain that string.
Now I want to extract it optionally into a duration field. I tried this, but nothing happened: no error, but also no result.
grok
{
  match => ["message", "(took (?<duration>[\d]+) ms)?"]
}
What am I doing wrong?
Thanks guys!
A solution would be to apply the grok filter only to the log lines ending with ms. It can be done using conditionals in your configuration:
if [message] =~ /took \d+ ms$/ {
  grok {
    match => ["message", "took %{NUMBER:duration} ms"]
  }
}
I cannot explain why, but it works if you anchor it
grok { match => { "message" => "(took (?<duration>\d+) ms)?$" } }
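Another option, if the only concern about a plain non-optional match is the _grokparsefailure tag on lines without the duration, is to drop the optional group and silence the tag with tag_on_failure (a sketch, not from the answers above):
grok {
  # lines containing "took ... ms" get a numeric duration field;
  # other lines simply don't match and are left untagged
  match => { "message" => "took %{NUMBER:duration:int} ms" }
  tag_on_failure => []
}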

Logstash grok filter error "_grokparsefailure"

2017-08-09T12:01:43.049963+05:30 55.3.244.1 11235 GET
This is my log data.
I am trying to filter this data using custom patterns. I am getting "_grokparsefailure" error.
My pattern file data is:
TIMESTAMP_LOG [0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{6}\+[0-9]{2}:[0-9]{2}
my filter is:
filter {
  grok {
    patterns_dir => ["./patterns"]
    match => { "message" => "%{TIMESTAMP_LOG:time} %{IP:client} %{NUMBER:bytes} %{WORD:method}" }
  }
}
Can anyone tell me where I went wrong? Thanks.
Your timestamp is actually in a standard format, ISO8601. So instead of defining a custom pattern for the timestamp, you can use one built into Logstash. I tested this grok pattern and it worked with your sample log:
%{TIMESTAMP_ISO8601:time} %{IP:client} %{NUMBER:bytes} %{WORD:method}
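Dropped into the filter from the question (the patterns_dir is no longer needed), that would look roughly like:
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:time} %{IP:client} %{NUMBER:bytes} %{WORD:method}" }
  }
}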

How to use distinct grok filters for similar logs

I have these two kinds of logs for DHCPACK:
Jun 30 06:34:18 HOSTNAME dhcpd: DHCPACK to IP (MAC) via eth2
Jun 30 06:34:28 HOSTNAME dhcpd: DHCPACK on IP to MAC via eth2
How can I get grok to use two different matches?
I have these two matches for DHCPACK, but only the first one is used:
((%{SYSLOGTIMESTAMP:timestamp})\s*(%{HOSTNAME:hostname})\sdhcpd\S+\s(%{WORD:dhcp_action})?.[for|on] (%{IPV4:dhcp_client_ip})?.[from|to] (%{COMMONMAC:dhcp_client_mac})?.*via (%{USERNAME:interface}))
((%{SYSLOGTIMESTAMP:timestamp})\s*(%{HOSTNAME:hostname})\sdhcpd\S+\s(%{WORD:dhcp_action})?.*[to] (%{IPV4:dhcp_client_ip})?.*via (%{USERNAME:interface}))
Someone can help?
I would suggest pulling the common stuff (up to the colon) off first and then processing the more specific stuff with more specific patterns. Some details here.
As shown in the doc, grok{} can take multiple patterns:
filter {
  grok {
    match => { "message" => [
      "Duration: %{NUMBER:duration}",
      "Speed: %{NUMBER:speed}"
    ] }
  }
}
By default, it will stop processing after the first match, but that's configurable.
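The setting that controls this is break_on_match; setting it to false makes grok try every pattern in the list instead of stopping at the first hit, for example:
filter {
  grok {
    break_on_match => false
    match => { "message" => [
      "Duration: %{NUMBER:duration}",
      "Speed: %{NUMBER:speed}"
    ] }
  }
}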
EDIT:
Based on your other comments, you can also branch based on conditionals:
if [myField] == "someValue" {
  grok {
    ...
  }
}
else {
  grok {
    ...
  }
}
In this case, you're running a comparison ("==") or regexp ("=~") to see if you should run a regexp (grok{}). Depending on the full business logic, this seems like a waste.
I solved the problem with this:
filter
{
  grok
  {
    match => ["message", "(dhcpd\S+\s*(%{WORD:dhcp_action_test}))"]
  }
  if "DHCPINFORM" in [message]
  {
    grok
    {
      match => ["message","((%{SYSLOGTIMESTAMP:timestamp})\s* (%{HOSTNAME:hostname})\sdhcpd\S+\s(%{WORD:dhcp_action})?.[from] (%{IPV4:dhcp_client_ip}))"]
    }
  }
  else if "DHCPDISCOVER" in [message]
  {
    grok
    {
      match => ["message","((%{SYSLOGTIMESTAMP:timestamp})\s(%{HOSTNAME:hostname})\sdhcpd\S+\s(%{WORD:dhcp_action})?.*[from] (%{COMMONMAC:dhcp_client_mac}))"]
    }
  }
  else
  {
    drop {}
  }
}
I want to do something like this: in
((%{SYSLOGTIMESTAMP:timestamp})\s*(%{HOSTNAME:hostname})\sdhcpd\S+\s(%{WORD:dhcp_action})?.[for|on] (%{IPV4:dhcp_client_ip})?.[from|to] (%{COMMONMAC:dhcp_client_mac})?.*via (%{USERNAME:interface}))
I want to get just dhcp_action and then use an if statement, like:
if (mCursor != null && mCursor.moveToFirst()) {
......
} else {
......
}
Is that possible?
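A sketch of that idea in Logstash terms, building on the conditional form from the EDIT above (the patterns inside the branches are placeholders to adapt, not tested against your logs):
filter {
  # first extract only the action word, as your dhcp_action_test grok already does
  grok {
    match => ["message", "(dhcpd\S+\s*(%{WORD:dhcp_action_test}))"]
  }
  if [dhcp_action_test] == "DHCPACK" {
    grok {
      # DHCPACK-specific pattern goes here
      match => ["message", "DHCPACK (to|on) %{IPV4:dhcp_client_ip}"]
    }
  } else {
    grok {
      # pattern for the other message types goes here
      match => ["message", "%{GREEDYDATA:rest}"]
    }
  }
}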
