logstash add_field and remove_field

I'm attempting to simplify my Logstash config. I want to split the program field into separate fields (as shown below); however, I would prefer to use just one grok statement (if that's at all possible!).
Of the two examples below I get a _grokparsefailure on the second example, but not the first. Since grok has the add_field and remove_field options, I would assume that I could combine it all into one grok statement. Why is this not the case? Have I missed some ordering or syntax somewhere?
Sample log:
2016-02-16T16:42:06Z ubuntu docker/THISTESTNAME[892]: 172.16.229.1 - - [16/Feb/2016:16:42:06 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36" "-"
Why does this work:
filter {
  # Extracts the docker name, ID, image etc elements
  mutate {
    add_field => { "[@metadata][program]" => "%{program}" }
    remove_field => "[program]"
  }
  grok {
    patterns_dir => "/logstash/patterns_dir/docker"
    match => { "[@metadata][program]" => "%{D_ID}" }
  }
}
But this does not:
filter {
  grok {
    add_field => { "[@metadata][program]" => "%{program}" }
    remove_field => "[program]"
    patterns_dir => "/logstash/patterns_dir/docker"
    match => { "[@metadata][program]" => "%{D_ID}" }
  }
}

add_field and remove_field are applied only if the underlying filter succeeds. In your second example, [@metadata][program] doesn't exist yet at the point where grok{} tries to match against it.

This was directly answered by @Alan; however, I found the approach below a little more readable, and it compressed my code even more:
grok {
  patterns_dir => "/logstash/patterns_dir/docker-patterns"
  match => { "program" => "%{D_ID}" }
  overwrite => [ "program" ]
}
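For reference, the D_ID definition itself isn't shown in the question. A custom pattern file along these lines (the file name and field names here are illustrative guesses, not the actual pattern) would split a program value such as docker/THISTESTNAME[892] into separate fields:

# /logstash/patterns_dir/docker-patterns/docker (hypothetical file)
D_ID %{WORD:container_engine}/%{DATA:container_name}\[%{POSINT:container_pid}\]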

Related

Grok Filter not adding new fields | Logstash

We have the below grok filter configured for our journalbeat. The same configuration, deployed locally with filebeat, was working fine, but it isn't adding the new fields with journalbeat.
filter {
  grok {
    patterns_dir => ["/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns"]
    match => { "message" => [
      '%{IPV4:client_ip} - - \[%{HTTPDATE:date}\] "%{WORD:method} %{URIPATH:request} %{URIPROTO:protocol}\/[1-9].[0-9]" (%{NUMBER:status}|-) (%{NUMBER:bytes}|-) "(%{URI:url}|-)" %{QUOTEDSTRING:client}'
    ] }
    break_on_match => false
    tag_on_failure => ["failed_match"]
  }
}
We tried adding the mutate filter below to add the new fields, but it isn't resolving the values and is printing the literal placeholders instead (for example: %{client_ip}).
mutate {
  add_field => {
    "client_ip" => "%{client_ip}"
    "date" => "%{date}"
    "method" => "%{method}"
    "status" => "%{status}"
    "request" => "%{request}"
  }
}
The log line which we are trying to match is below.
::ffff:172.65.205.3 - - [09/Jul/2020:11:32:52 +0000] "POST /v1-get-profile HTTP/1.1" 404 71 "https://mycompany.com/customer/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36"
Could someone let us know what exactly we are doing wrong? Thanks in advance.
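A literal %{client_ip} in the indexed event means that field was never created, i.e. the grok match did not succeed for that line, so the %{...} references in mutate have nothing to resolve; the mutate block is also redundant, because grok already creates those fields whenever it matches. A minimal way to confirm this (a sketch) is to print the full event and check whether the failed_match tag from tag_on_failure above is present:

output {
  stdout { codec => rubydebug }
}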

Logstash grok filter along with the kv plugin can't process the events

I am new to ELK. When I onboarded the below log file, the events went to the dead letter queue because Logstash couldn't process them. I have written a grok filter to parse the events, but Logstash still can't process them. Any help would be appreciated.
Below is the sample log format.
25193662345 [http-nio-8080-exec-44] DEBUG c.s.b.a.m.PerformanceMetricsFilter - method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=31, totalTime=33 tenantId=b9sdfs-1033-4444-aba5-csdfsdfsf, immutableBlobId=bss_c_586331/Sample_app12-sdas-157123148464.txt, blobSize=2862, domain=abc
2519366789 [http-nio-8080-exec-47] DEBUG q.s.b.y.m.PerformanceMetricsFilter - method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=32, totalTime=33 tenantId=b0csdfsd-1066-4444-adf4-ce7bsdfssdf, immutableBlobId=bss_c_586334/Sample_app15-615223-157sadas6648465.txt, blobSize=2862, domain=cde
GROK filter:
dissect { mapping => { "message" => "%{NUMBER:number} [%{thread}] %{level} %{class} - %{[@metadata][msg]}" } }
kv { source => "[@metadata][msg]" field_split => "," }
Thanks
You have basically two problems in your configuration.
1.) You are using the dissect filter, not grok. Both are used to parse messages, but grok uses regular expressions to validate the value of a field, while dissect is purely positional and performs no validation: if a WORD value sits in the position of a field that expects a NUMBER, grok will fail, but dissect will not.
If your log lines always have the same pattern, you should continue to use dissect, since it is faster and needs less CPU.
Your correct dissect mapping should be:
dissect {
  mapping => { "message" => "%{number} [%{thread}] %{level} %{class} - %{[@metadata][msg]}" }
}
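(For comparison, if you did want grok's regex validation instead of dissect, a rough equivalent of the same mapping would be the sketch below; the patterns chosen for thread, level and class are illustrative.)

grok {
  match => { "message" => "%{NUMBER:number} \[%{DATA:thread}\] %{LOGLEVEL:level} %{NOTSPACE:class} - %{GREEDYDATA:[@metadata][msg]}" }
}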
2.) The field that contains the kv message is wrong: it has pairs separated both by spaces and by commas, and kv won't work that way.
After your dissect filter, this is the content of [@metadata][msg]:
method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=32, totalTime=33 tenantId=b0csdfsd-1066-4444-adf4-ce7bsdfssdf, immutableBlobId=bss_c_586334/Sample_app15-615223-157sadas6648465.txt, blobSize=2862, domain=cde
To solve this, you should use a mutate filter to remove the commas from [@metadata][msg] and then use the kv filter with its default configuration.
This should be your filter configuration:
filter {
  dissect {
    mapping => { "message" => "%{number} [%{thread}] %{level} %{class} - %{[@metadata][msg]}" }
  }
  mutate {
    gsub => ["[@metadata][msg]", ",", ""]
  }
  kv {
    source => "[@metadata][msg]"
  }
}
Your output should be something like this:
{
  "number" => "2519366789",
  "@timestamp" => 2019-11-03T16:42:11.708Z,
  "thread" => "http-nio-8080-exec-47",
  "appLogicTime" => "1",
  "domain" => "cde",
  "method" => "PUT",
  "level" => "DEBUG",
  "blobSize" => "2862",
  "@version" => "1",
  "immutableBlobId" => "bss_c_586334/Sample_app15-615223-157sadas6648465.txt",
  "streamInTime" => "0",
  "status" => "201",
  "blobStorageTime" => "32",
  "message" => "2519366789 [http-nio-8080-exec-47] DEBUG q.s.b.y.m.PerformanceMetricsFilter - method=PUT status=201 appLogicTime=1, streamInTime=0, blobStorageTime=32, totalTime=33 tenantId=b0csdfsd-1066-4444-adf4-ce7bsdfssdf, immutableBlobId=bss_c_586334/Sample_app15-615223-157sadas6648465.txt, blobSize=2862, domain=cde",
  "totalTime" => "33",
  "tenantId" => "b0csdfsd-1066-4444-adf4-ce7bsdfssdf",
  "class" => "q.s.b.y.m.PerformanceMetricsFilter"
}

Logstash parser error, timestamp is malformed

Can somebody tell me what I'm doing wrong, or why Logstash doesn't want to parse an ISO8601 timestamp?
The error message I get is:
Failed action ... "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"2017-03-24 12:14:50\" is malformed at \"17-03-24 12:14:50\""}}
Sample log file line (last byte in IP address replaced with 000 on purpose)
2017-03-24 12:14:50 87.123.123.000 12345678.domain.com GET /smil:stream_17.smil/chunk_ctvideo_ridp0va0r600115_cs211711500_mpd.m4s - HTTP/1.1 200 750584 0.714 "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36" https://referrer.domain.com/video/2107 https fra1 "HIT, MISS" 12345678.domain.com
GROK pattern (use http://grokconstructor.appspot.com/do/match to verify)
RAW %{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{IPV4:clientip}%{SPACE}%{HOSTNAME:http_host}%{SPACE}%{WORD:verb}%{SPACE}\/(.*:)?%{WORD:stream}%{NOTSPACE}%{SPACE}%{NOTSPACE}%{SPACE}%{WORD:protocol}\/%{NUMBER:httpversion}%{SPACE}%{NUMBER:response}%{SPACE}%{NUMBER:bytes}%{SPACE}%{SECOND:request_time}%{SPACE}%{QUOTEDSTRING:agent}%{SPACE}%{URI:referrer}%{SPACE}%{WORD}%{SPACE}%{WORD:location}%{SPACE}%{QUOTEDSTRING:cache_status}%{SPACE}%{WORD:account}%{GREEDYDATA}
Logstash configuration (input side):
input {
  file {
    path => "/subfolder/logs/*"
    type => "access_logs"
    start_position => "beginning"
  }
}
filter {
  # skip first two lines in log file with comments
  if [message] =~ /^#/ {
    drop { }
  }
  grok {
    patterns_dir => ["/opt/logstash/patterns"]
    match => { "message" => "%{RAW}" }
  }
  date {
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss" ]
    locale => "en"
  }
  # ... (rest of the config omitted for readability)
}
So I am pretty sure this is caused by the timestamp field being mapped in Elasticsearch to a type that the value doesn't parse into. If you post your index mapping, I'd be happy to look at it.
A note: you can quickly solve this by adding remove_field, because if the date filter succeeds, the value of that field is pulled into @timestamp anyway. Right now you have the same value stored in two fields; dropping timestamp means you don't have to worry about its mapping at all. :)
date {
  match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss" ]
  locale => "en"
  remove_field => [ "timestamp" ]
}
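Folded back into the filter above, the relevant section would then read like this (a sketch; everything except the added remove_field is unchanged from your config):

filter {
  # skip first two lines in log file with comments
  if [message] =~ /^#/ {
    drop { }
  }
  grok {
    patterns_dir => ["/opt/logstash/patterns"]
    match => { "message" => "%{RAW}" }
  }
  date {
    match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss" ]
    locale => "en"
    remove_field => [ "timestamp" ]
  }
  # ... (rest of the config omitted for readability)
}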

Issue with GROK match for access logs

I am getting a _grokparsefailure on some of these Apache logs, which is not making sense to me. One of the Kibana tags for these events is _grokparsefailure. Obviously something is wrong here, but I am having trouble figuring out what.
Example log entry that resulted in a failure:
127.0.0.1 - - [10/Oct/2016:19:05:54 +0000] "POST /v1/api/query.random HTTP/1.1" 201 - "-" "-" 188
Logstash output config file:
filter {
  if [type] == "access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
}
filter {
  if [type] == "requests" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
}
output {
  elasticsearch {
    hosts => ["http://ESCLUSTER:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "[type]"
  }
  stdout {
    codec => rubydebug
  }
}
There are two spaces instead of one between the two - characters and between the second - and the [ in the failing lines: 127.0.0.1 -  -  [.
The pattern (%{IPORHOST:clientip} %{HTTPDUSER:ident} %{HTTPDUSER:auth}) expects only one space at these points.
So either correct your log format so that all lines share the same layout, or replace %{COMBINEDAPACHELOG} with:
%{IPORHOST:clientip} %{HTTPDUSER:ident}%{SPACE}%{HTTPDUSER:auth}%{SPACE}\[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}
This pattern is equivalent to the COMBINEDAPACHELOG pattern, except that the spaces at the beginning are replaced by the %{SPACE} pattern, which matches one or more spaces.
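Dropped into the existing access filter, that would look like this (a sketch; only the match string changes):

filter {
  if [type] == "access" {
    grok {
      match => { "message" => "%{IPORHOST:clientip} %{HTTPDUSER:ident}%{SPACE}%{HTTPDUSER:auth}%{SPACE}\[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}" }
    }
  }
}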

Create tag from field using regex

I am using logstash to collect my apache logs, and as such I have a field called request_url which contains values that look like:
POST /api_v1/services/order_service HTTP/1.1
POST /api_v2/services/user_service HTTP/1.0
I want to create separate tags containing the API version and the service name, e.g.
POST /api_v1/services/order_service HTTP/1.1 -> ["v1", "order_service"]
POST /api_v2/services/user_service HTTP/1.0 -> ["v2", "user_service"]
How do I achieve this in the Logstash configuration? Thanks for any pointers.
Using the following grok filter, you can separate the components you need and then add the appropriate tags
filter {
  grok {
    match => { "message" => "%{WORD:verb} /api_%{NOTSPACE:version}/services/%{WORD:service}" }
    add_tag => ["%{version}", "%{service}"]
  }
}
The event you'll get will look like this:
{
  "message" => "POST /api_v1/services/order_service HTTP/1.1",
  "@version" => "1",
  "@timestamp" => "2016-06-07T02:52:01.136Z",
  "host" => "iMac.local",
  "verb" => "POST",
  "version" => "v1",
  "service" => "order_service",
  "tags" => [
    [0] "v1",
    [1] "order_service"
  ]
}
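If the value actually lives in the request_url field mentioned in the question rather than in message, the same approach keyed on that field should work (a sketch):

grok {
  match => { "request_url" => "%{WORD:verb} /api_%{NOTSPACE:version}/services/%{WORD:service}" }
  add_tag => ["%{version}", "%{service}"]
}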
