How to parse CEF format with Logstash - logstash

My input message, from the file test.log:
Dec 8 07:18:23 XXXXXXXX69.XXXX.COM CEF: 0|F-Secure|F-Secure Client Security Premium|15.30|online_safety.harmful_page.block|Harmful website blocked.|10|msg=Harmful website blocked. URL: http://sairuscars.weebly.com suser=XXXX\\mark.markovic domainTreePath=XXXX/Office/TES0PC0677 shost=TESTC0677
My current config file test.config for parsing:
input { stdin { } }
output { stdout { codec => rubydebug } }
filter {
  # Manipulate the message
  mutate {
    split => ["message", "|"]
  }
  kv {
    field_split => " "
    value_split => "="
  }
}
Result from running the command "cat /var/log/test.log | logstash -f /root/tmp/test.config":
{
    "host" => "xxxapp0001.test.com",
    "suser" => "XXXX\\\\mark.markovic",
    "domainTreePath" => "XXXX/Office/TTESTC0677",
    "message" => [
        [0] "Dec 8 07:18:23 XXXXXXXX69.XXXX.COM CEF: 0",
        [1] "F-Secure",
        [2] "F-Secure Client Security Premium",
        [3] "15.30",
        [4] "online_safety.harmful_page.block",
        [5] "Harmful website blocked.",
        [6] "10",
        [7] "msg=Harmful website blocked. URL: http://sairuscars.weebly.com suser=XXXX\\\\mark.markovic domainTreePath=XXXX/Office/TTESTC0677 shost=TESTPC0677"
    ],
    "@version" => "1",
    "msg" => "Harmful",
    "@timestamp" => 2022-12-08T10:21:01.355Z,
    "shost" => "TESTPC0677"
}
I don't know how to split message[7] into key:value pairs.
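There is no accepted answer shown here, but one common approach is to split the CEF header off from the extension first, and only then apply the kv filter, using a lookahead as the field split so that values containing spaces (such as msg) are not cut short. Below is a sketch along those lines; the grok field names (cef_vendor, cef_extension, and so on) are my own choices, not anything prescribed by CEF or Logstash:
filter {
  # Separate the syslog prefix, the seven pipe-delimited CEF header
  # fields, and the trailing key=value extension.
  grok {
    match => {
      "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_host} CEF: %{INT:cef_version}\|%{DATA:cef_vendor}\|%{DATA:cef_product}\|%{DATA:cef_device_version}\|%{DATA:cef_signature_id}\|%{DATA:cef_name}\|%{INT:cef_severity}\|%{GREEDYDATA:cef_extension}"
    }
  }
  # Parse only the extension. Splitting on a space that is followed by
  # another key= keeps multi-word values such as msg= in one piece.
  kv {
    source => "cef_extension"
    value_split => "="
    field_split_pattern => "\s(?=[\w.]+=)"
  }
}
Alternatively, the logstash-codec-cef plugin can decode the CEF header and extension on the input side, though the syslog prefix in front of "CEF:" may still need to be stripped first, depending on how the events arrive.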

Related

How to combine filters with logstash?

I'm currently exploring Elasticsearch, Kibana and Logstash with Docker (version 7.1.1). The three containers are running well.
I have data files containing lines like this one:
foo=bar type=alpha T=20180306174204527
My logstash.conf contains:
input {
  file {
    path => "/tmp/data/*.txt"
    start_position => "beginning"
  }
}
filter {
  kv {
    field_split => "\t"
    value_split => "="
  }
}
output {
  elasticsearch { hosts => ["elasticsearch:9200"] }
  stdout {
    codec => rubydebug
  }
}
This gives me the following event:
{
    "host" => "07f3051a3bec",
    "foo" => "bar",
    "message" => "foo=bar\ttype=alpha\tT=20180306174204527",
    "T" => "20180306174204527",
    "@timestamp" => 2019-06-17T13:47:14.589Z,
    "path" => "/tmp/data/ucL12018_03_06.txt",
    "type" => "alpha",
    "@version" => "1"
}
The first step of the job is done.
Now I want to add a filter to transform the value of the key T into a timestamp:
{
    ...
    "T" => "2018-03-06T17:42:04.527Z",
    "@timestamp" => 2019-06-17T13:47:14.589Z,
    ...
}
I do not know how to do it. I tried adding a second filter just after the kv filter, but nothing changes when I add new files.
Add this filter after the kv filter:
date {
  match => [ "T", "yyyyMMddHHmmssSSS" ]
  target => "T"
}
The date filter will try to parse the field T using the provided pattern and create a date from it; the result is written back into the T field (by default the date filter overwrites the @timestamp field, which is why target is needed here).
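Putting it together, the whole filter section from the question would look roughly like this (a sketch that simply combines the kv filter from the question with the date filter above):
filter {
  kv {
    field_split => "\t"
    value_split => "="
  }
  date {
    # Parse values such as 20180306174204527 and store the result
    # back into T instead of overwriting @timestamp.
    match  => [ "T", "yyyyMMddHHmmssSSS" ]
    target => "T"
  }
}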

How to write multiple inputs and a logstash filter for hostname based on the message field

As a newbie to Logstash I would like some guidance: I have two types of logs, Linux system logs and Cisco switch logs, and I want to create different inputs and filters for both.
I have defined the type for Linux logs as syslog and for the Cisco switches as APIC, and I want to define the filter section accordingly. A sample of my Cisco log pattern is below; the switch name is the 7th field in the message, so I wonder how to use that 7th field as the hostname for the switches.
Aug 23 16:36:58 Aug 23 11:06:58.830 mydc-leaf-3-5 %LOG_-1-SYSTEM_MSG [E4210472][transition][info][sys] sent user message to syslog group:Syslog_Elastic_Server:final
Below is my logstash-syslog.conf file, which works for syslog but still needs work for the Cisco logs, i.e. type => APIC.
# cat logstash-syslog.conf
input {
  file {
    path => [ "/scratch/rsyslog/*/messages.log" ]
    type => "syslog"
  }
  file {
    path => [ "/scratch/rsyslog/Aug/messages.log" ]
    type => "APIC"
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
  if [type] == "APIC" {
    grok {
      match => { "message" => "%{CISCOTIMESTAMP:syslog_timestamp} %{CISCOTIMESTAMP} %{SYSLOGHOST:syslog_hostname} %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
  }
}
output {
  #if "automount" in [message] or "ldap" in [message] {
  elasticsearch {
    hosts => "noida-elk:9200"
    index => "syslog-%{+YYYY.MM.dd}"
    #index => "%{[type]}-%{+YYYY.MM.dd}"
    #index => "%{index}-%{+YYYY.MM.dd}"
    #type => "%{type}
    document_type => "messages"
  }
}
The filter works correctly for the message below and I get the field syslog_hostname as expected; in this case I get linuxdev.
Aug 24 10:34:02 linuxdev automount[1905]: key ".P4Config" not found in map source(s).
The filter does not work for the message below:
Aug 24 10:26:22 Aug 24 04:56:22.444 my-apic-1 %LOG_-3-SYSTEM_MSG [F1546][soaking_clearing][packets-dropped][minor][dbgs/ac/sdvpcpath-207-208-to-109-110/fault-F1546] 2% of packets were dropped during the last collection interval
After some grokking here is my pattern for Cisco APIC syslogs:
%{SYSLOG5424PRI:initial_code}%{CISCOTIMESTAMP:cisco_timestamp}%{SPACE}%{TZ}%{ISO8601_TIMEZONE}%{SPACE}%{URIHOST:uri_host}%{SPACE}%{SYSLOGPROG:syslog_prog}%{SPACE}%{SYSLOG5424SD:message_code}%{SYSLOG5424SD:message_type}%{SYSLOG5424SD:message_class}%{NOTSPACE:message_dn}%{SPACE}%{GREEDYDATA:message_content}
Feedback on how to improve it is welcome.
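To come back to the original goal of using the 7th field as the hostname: with this pattern the switch name ends up in uri_host, so the APIC branch of the filter could look roughly like this (a sketch; copying uri_host into syslog_hostname is my own assumption about the desired outcome):
if [type] == "APIC" {
  grok {
    match => { "message" => "%{SYSLOG5424PRI:initial_code}%{CISCOTIMESTAMP:cisco_timestamp}%{SPACE}%{TZ}%{ISO8601_TIMEZONE}%{SPACE}%{URIHOST:uri_host}%{SPACE}%{SYSLOGPROG:syslog_prog}%{SPACE}%{SYSLOG5424SD:message_code}%{SYSLOG5424SD:message_type}%{SYSLOG5424SD:message_class}%{NOTSPACE:message_dn}%{SPACE}%{GREEDYDATA:message_content}" }
    # uri_host now holds the switch name; copy it into syslog_hostname
    # so the APIC branch mirrors the syslog branch.
    add_field => { "syslog_hostname" => "%{uri_host}" }
  }
}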

Logstash grok doesn't match for specific field

I have a tab-separated string and I want to extract each field using the grok plugin.
The tab-separated string looks like this:
http://www.allaboutpc.co.kr 2016110913 d6123c6caa12f08852c82b876bdd3ceceb166d5e 0 0 1 0 /Event/QuizChoice.asp?IdxEvent=3141
I would like to get the fields as url, datetime, hashvalue, count1, count2, count3, count4, path.
I used %{DATA:hashvalue} for the 3rd field to extract hashvalue, but Logstash didn't print hashvalue.
Here is my conf file:
input {
  stdin { }
  file {
    path => "/Users/Projects/webmastermrinput/20161021/17/*"
    codec => plain
  }
}
filter {
  # tab to space
  mutate {
    gsub => ["message", "\t", " "]
  }
  grok {
    match => {
      'message' => "%{DATA:url} %{NUMBER:datetime2} %{DATA:hashvalue} % {NUMBER:count1} %{NUMBER:count2} %{NUMBER:count3} %{NUMBER:count4} % {URIPATHPARAM:path}"
    }
  }
}
output {
  stdout { codec => rubydebug }
}
output {
stdout { codec => rubydebug }
}
Logstash output for the input "http://www.allaboutpc.co.kr 2016110913 d6123c6caa12f08852c82b876bdd3ceceb166d5e 0 0 1 0 /Event/QuizChoice.asp?IdxEvent=3141":
{
    "@timestamp" => 2016-11-11T02:26:01.828Z,
    "@version" => "1",
    "host" => "MacBook-Air-10.local",
    "datetime" => "2016110913",
    "message" => "http://www.allaboutpc.co.kr 2016110913 d6123c6caa12f08852c82b876bdd3ceceb166d5e 0 0 1 0 /Event/QuizChoice.asp?IdxEvent=3141",
    "url" => "http://www.allaboutpc.co.kr"
}
Your grok works perfectly well; you just need to remove the spaces between % and { in % {NUMBER:count1} and % {URIPATHPARAM:path}, so that the pattern becomes:
'message' => "%{DATA:url} %{NUMBER:datetime2} %{DATA:hashvalue} %{NUMBER:count1} %{NUMBER:count2} %{NUMBER:count3} %{NUMBER:count4} %{URIPATHPARAM:path}"
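With the two spaces removed, the same input line should yield all eight fields, roughly like this (a sketch of the expected rubydebug output; the second field is named datetime2 here because that is the name used in the pattern):
{
    "url" => "http://www.allaboutpc.co.kr",
    "datetime2" => "2016110913",
    "hashvalue" => "d6123c6caa12f08852c82b876bdd3ceceb166d5e",
    "count1" => "0",
    "count2" => "0",
    "count3" => "1",
    "count4" => "0",
    "path" => "/Event/QuizChoice.asp?IdxEvent=3141"
}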

Logstash 1.4.2 grok filter: _grokparsefailure

I am trying to parse this log line:
- 2014-04-29 13:04:23,733 [main] INFO (api.batch.ThreadPoolWorker) Command-line options for this run:
Here's the logstash config file I use:
input {
  stdin {}
}
filter {
  grok {
    match => [ "message", " - %{TIMESTAMP_ISO8601:time} \[%{WORD:main}\] %{LOGLEVEL:loglevel} %{JAVACLASS:class} %{DATA:mydata} "]
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch {
    host => "localhost"
  }
  stdout { codec => rubydebug }
}
Here's the output I get:
{
    "message" => " - 2014-04-29 13:04:23,733 [main] INFO (api.batch.ThreadPoolWorker) Commans run:",
    "@version" => "1",
    "@timestamp" => "2015-02-02T10:53:58.282Z",
    "host" => "NAME_001.corp.com",
    "tags" => [
        [0] "_grokparsefailure"
    ]
}
Can anyone help me find where the problem is in the grok pattern?
I tried to parse that line in http://grokdebug.herokuapp.com/ but it parses only the timestamp, %{WORD} and %{LOGLEVEL}; the rest is ignored!
There are two errors in your config.
First
The error in the grok pattern is around JAVACLASS: you have to include the parentheses in the pattern, for example \(%{JAVACLASS:class}\).
Second
The date filter's match takes two values: the first is the field you want to parse, so in your example it is time, not timestamp; the second value is the date pattern. You can refer to the date filter documentation for the pattern syntax.
Here is the config:
input {
  stdin {
  }
}
filter {
  grok {
    match => [ "message", " - %{TIMESTAMP_ISO8601:time} \[%{WORD:main}\] %{LOGLEVEL:loglevel} \(%{JAVACLASS:class}\) %{GREEDYDATA:mydata}" ]
  }
  date {
    match => [ "time" , "YYYY-MM-dd HH:mm:ss,SSS" ]
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
FYI. Hope this can help you.
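For reference, with that corrected config the sample line should come out roughly like this (a sketch only; the exact @timestamp depends on the local timezone the date filter applies):
{
    "message" => " - 2014-04-29 13:04:23,733 [main] INFO (api.batch.ThreadPoolWorker) Command-line options for this run:",
    "@version" => "1",
    "@timestamp" => "2014-04-29T13:04:23.733Z",
    "host" => "NAME_001.corp.com",
    "time" => "2014-04-29 13:04:23,733",
    "main" => "main",
    "loglevel" => "INFO",
    "class" => "api.batch.ThreadPoolWorker",
    "mydata" => "Command-line options for this run:"
}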

How do I check for the presence of fields in logstash events?

Given the following logstash configuration:
input {
  stdin {}
}
filter {
  grok {
    match => ["message", "foo (?<bar>.*)",
              "message", "quux (?<stuff>.*)"
    ]
  }
  if "bar" in [tags] {
    mutate {
      add_field => { "had_bar" => "yup" }
    }
  }
}
output { stdout { codec => rubydebug } }
I would expect a message starting with "foo " to get the field had_bar added to my event. However, when I try it:
* bin/logstash -f simple.conf
Picked up JAVA_TOOL_OPTIONS: -Xmx1G
foo bar quux
{
    "message" => "foo bar quux",
    "@version" => "1",
    "@timestamp" => "2014-05-14T09:46:15.498Z",
    "host" => "my-dev-machine.com",
    "bar" => "bar quux"
}
What have I done wrong? I'm aware that grok also provides an add_field option, but I only want to add the field when I encounter the first pattern.
You're checking for fields in tags, but tags is just a normal field on the event. What you want is:
if [bar] {
  # ...
}
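Applied to the configuration from the question, the filter section becomes (a sketch):
filter {
  grok {
    match => ["message", "foo (?<bar>.*)",
              "message", "quux (?<stuff>.*)"
    ]
  }
  # The grok captures create fields, not tags, so test the field directly.
  if [bar] {
    mutate {
      add_field => { "had_bar" => "yup" }
    }
  }
}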
