Logstash parser error, timestamp is malformed - logstash

Can somebody tell me what I'm doing wrong, or why Logstash doesn't want to parse an ISO8601 timestamp?
The error message I get is
Failed action ... "error"=>{"type"=>"mapper_parsing_exception",
"reason"=>"failed to parse [timestamp]",
"caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid
format: \"2017-03-24 12:14:50\" is malformed at \"17-03-24
12:14:50\""}}
Sample log file line (last byte in IP address replaced with 000 on purpose)
2017-03-24 12:14:50 87.123.123.000 12345678.domain.com GET /smil:stream_17.smil/chunk_ctvideo_ridp0va0r600115_cs211711500_mpd.m4s - HTTP/1.1 200 750584 0.714 "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36" https://referrer.domain.com/video/2107 https fra1 "HIT, MISS" 12345678.domain.com
GROK pattern (use http://grokconstructor.appspot.com/do/match to verify)
RAW %{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{IPV4:clientip}%{SPACE}%{HOSTNAME:http_host}%{SPACE}%{WORD:verb}%{SPACE}\/(.*:)?%{WORD:stream}%{NOTSPACE}%{SPACE}%{NOTSPACE}%{SPACE}%{WORD:protocol}\/%{NUMBER:httpversion}%{SPACE}%{NUMBER:response}%{SPACE}%{NUMBER:bytes}%{SPACE}%{SECOND:request_time}%{SPACE}%{QUOTEDSTRING:agent}%{SPACE}%{URI:referrer}%{SPACE}%{WORD}%{SPACE}%{WORD:location}%{SPACE}%{QUOTEDSTRING:cache_status}%{SPACE}%{WORD:account}%{GREEDYDATA}
Logstash configuration (input side):
input {
file {
path => "/subfolder/logs/*"
type => "access_logs"
start_position => "beginning"
}
}
filter {
# skip first two lines in log file with comments
if [message] =~ /^#/ {
drop { }
}
grok {
patterns_dir => ["/opt/logstash/patterns"]
match => { "message" => "%{RAW}" }
}
date {
match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss" ]
locale => "en"
}
# ... (rest of the config omitted for readability)
}

So I am pretty sure this is being caused by the field timestamp being mapped to a type in Elasticsearch that it doesn't parse to. If you post your index mapping, I'd be happy to look at it.
A note: You can quickly solve this by adding remove_field because if the date filter is successful, the value of that field will be pulled into #timestamp. Right now you have the same value stored in two fields. Then you don't have to worry about the mapping for the field. :)
date {
match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss" ]
locale => "en"
remove_field => [ "timestamp" ]
}

Related

How to pass hardcoded data in logstash input generated section

How to pass an hardcoded data from input generated, When I am passing this input through input section logstash is successfully executed but didnt produce any filtered output in console. Is there any way to pass json in input generator section??
Note- I have taken this input data from some log file...
input {
generator {
lines => [\"clientContextId\":\"INTERNAL-b8d19563-94a1-442d-9a09-dde36743fb7d\",\"description\":\"A N1QL EXPLAIN statement was executed\",\"id\":28673,\"isAdHoc\":true,\"metrics\":{\"elapsedTime\":\"11.921ms\",\"executionTime\":\"11.764ms\",\"resultCount\":1,\"resultSize\":649},\"name\":\"EXPLAIN statement\",\"node\":\"127.0.0.1:8091\",\"real_userid\":{\"domain\":\"builtin\",\"user\":\"Administrator\"},\"remote\":{\"ip\":\"127.0.0.1\",\"port\":44695},\"requestId\":\"958a7e12-d5a6-4d7b-bd40-ac9bb60cf4a3\",\"statement\":\"explain INSERT INTO `Guardium` (KEY, VALUE) \\nVALUES ( \\\"id::5554\\n\\\", { \\\"Emp Name\\\": \\\"Test4\\\", \\\"Emp Company\\\" : \\\"GS Lab\\\", \\\"Emp Country\\\" : \\\"India\\\"} )\\nRETURNING *;\",\"status\":\"errors\",\"timestamp\":\"2021-01-07T09:37:00.486Z\",\"userAgent\":\"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36 (Couchbase Query Workbench (6.6.1-9213-enterprise))\"]
}
}
filter{
// filter logic
}
output{
stdout { codec => rubydebug }
}
Yes, you can have JSON in a generator input. For example
input { generator { count => 1 lines => [ '{ "id": "43fb7d", "description": "..." }' ] } }
filter { json { source => "message" } }
will result in
"description" => "...",
"id" => "43fb7d",

Grok Filter not adding new fields | Logstash

We have the below grok filter configured for our journlabeat. The same was deployed on our local for filebeat was working fine but isn't adding the new fields on journalbeat.
filter {
grok {
patterns_dir => ["/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-patterns-core-4.1.2/patterns"]
match => { "message" => [
'%{IPV4:client_ip} - - \[%{HTTPDATE:date}\] "%{WORD:method} %{URIPATH:request} %{URIPROTO:protocol}\/[1-9].[0-9]" (%{NUMBER:status}|-) (%{NUMBER:bytes}|-) "(%{URI:url}|-)" %{QUOTEDSTRING:client}'
]
break_on_match => false
tag_on_failure => ["failed_match"]
}
}
}
We tried adding the mutate filter for adding new fields using below but it isn't fetching the value and is printing the scalar values itself (example: %{client_ip}).
mutate {
add_field => {
"client_ip" => "%{client_ip}"
"date" => "%{date}"
"method" => "%{method}"
"status" => "%{status}"
"request" => "%{request}"
}
}
The log which we are trying to match is as below.
::ffff:172.65.205.3 - - [09/Jul/2020:11:32:52 +0000] "POST /v1-get-profile HTTP/1.1" 404 71 "https://mycompany.com/customer/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36"
Could someone let me know what exactly are we doing wrong. Thanks in Advance.

Create tag from field using regex

I am using logstash to collect my apache logs, and as such I have a field called request_url which contains values that look like:
POST /api_v1/services/order_service HTTP/1.1
POST /api_v2/services/user_service HTTP/1.0
I want to create separate tags containing on the API version and the service name, e.g.
POST /api_v1/services/order_service HTTP/1.1 -> ["v1", "order_service"]
POST /api_v2/services/user_service HTTP/1.0 -> ["v2", "user_service"]
How do achieve this in logstash configuration? Thanks for any pointers.
Using the following grok filter, you can separate the components you need and then add the appropriate tags
filter {
grok {
"match" => { "message" => "%{WORD:verb} /api_%{NOTSPACE:version}/services/%{WORD:service}" }
"add_tag" => ["%{version}", "%{service}"]
}
}
The event you'll get will look like this:
{
"message" => "POST /api_v1/services/order_service HTTP/1.1",
"#version" => "1",
"#timestamp" => "2016-06-07T02:52:01.136Z",
"host" => "iMac.local",
"verb" => "POST",
"version" => "v1",
"service" => "order_service",
"tags" => [
[0] "v1",
[1] "order_service"
]
}

logstash add_field and remove_field

I'm attempting to simplify my logstash config. I want to split the program field into separate fields (as show below) however I would prefer to use just one grok statement (if it's at all possible!)
Of the two examples below I get an _grokparsefailure on the second example, but not the first. Since grok has the add_field and remove_field options I would assume that I could combine it all into one grok statement. Why is this not the case? Have I missed some ordering/syntax somewhere?
Sample log:
2016-02-16T16:42:06Z ubuntu docker/THISTESTNAME[892]: 172.16.229.1 - - [16/Feb/2016:16:42:06 +0000] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.109 Safari/537.36" "-"
Why does this work:
filter {
# Extracts the docker name, ID, image etc elements
mutate {
add_field => { "[#metadata][program]" => "%{program}" }
remove_field => "[program]"
}
grok {
patterns_dir => "/logstash/patterns_dir/docker"
match => { "[#metadata][program]" => "%{D_ID}" }
}
}
But this does not:
filter {
grok {
add_field => { "[#metadata][program]" => "%{program}" }
remove_field => "[program]"
patterns_dir => "/logstash/patterns_dir/docker"
match => { "[#metadata][program]" => "%{D_ID}" }
}
}
add_field and remove_field only run if the underlying filter works. In your second example, the [#metadata][program] doesn't yet exist for you to run grok{} against.
This was directly answered by #Alan, however I found this way a little more readable and compressed my code even more:
grok {
patterns_dir => "/logstash/patterns_dir/docker-patterns"
match => { "program" => "%{D_ID}" }
overwrite => [ "program" ]
}

Logstash does not update #timestamp from apache log

If you are backfilling logs into logstash you are supposed to try and pull somehow the proper timestamps. Otherwise they get assigned to the time the log line was received by logstash.
This is achieved using date filter like:
date { match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ] }
But unfortunately this does not work for me.
So i have the following apache logline:
10.80.161.251 - - [15/Oct/2015:09:13:45 +0000] "- -" "POST /xxx HTTP/1.1" 200 696 29416 "-" "xxx" 4026
And the following pattern
ACCESS_LOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:[#metadata][timestamp]}\] "(?:TLSv%{NUMBER:tlsversion}|-) (?:%{NOTSPACE:cypher}|-)" "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes_in}|-) (?:%{NUMBER:bytes_out}|-) %{QS:referrer} %{QS:agent} %{NUMBER:tts}
And the following logstash config
# INPUTS
input {
file {
path => '/var/log/test.log'
type => 'apache-access'
}
}
# filter/mix/match
filter {
if [type] == 'apache-access' {
grok {
patterns_dir => [ '/root/logstash-patterns' ]
match => [ "message", "%{ACCESS_LOG}" ]
}
if !("_grokparsefailure" in [tags]) {
mutate { add_field => ["timestamp_submitted", "%{#timestamp}"] }
date {
match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
}
}
# now output
output {
stdout { codec => rubydebug }
}
What i am doing wrong here. I tried adding timezones, locales and what not. And it still does not work. Any help is greatly appreciated (plus a drink of choice if you happen to be in sofia, bulgaria).
Note to self: read more carefully
The issue here is not matching against the proper field.
Because of the default pattern for apache logs the timestamp from the logline is in [#metadata][timestamp] and not in timestamp
So the date match filter should be:
date {
match => [ "[#metadata][timestamp]", "dd/MMM/yyyy:HH:mm:ss Z" ]
}

Resources