Multiple If in single filter - logstash

if [CREATION_DATE] == "" {
  mutate {
    convert => [ "CREATION_DATE", "string" ]
  }
} else {
  date {
    locale => "en"
    match  => [ "CREATION_DATE", "dd-MMM-yy hh.mm.ss.SSS a" ]
    target => "CREATION_DATE"
  }
}
if [SUBMITTED_DATE] == "" {
  mutate {
    convert => [ "SUBMITTED_DATE", "string" ]
  }
} else {
  date {
    locale => "en"
    match  => [ "SUBMITTED_DATE", "dd-MMM-yy hh.mm.ss.SSS a" ]
    target => "SUBMITTED_DATE"
  }
}
if [LAST_MODIFIED_DATE] == "" {
  mutate {
    convert => [ "LAST_MODIFIED_DATE", "string" ]
  }
} else {
  date {
    locale => "en"
    match  => [ "LAST_MODIFIED_DATE", "dd-MMM-yy hh.mm.ss.SSS a" ]
    target => "LAST_MODIFIED_DATE"
  }
}
I am getting output when all three fields (CREATION_DATE, SUBMITTED_DATE, LAST_MODIFIED_DATE) are in date format. If any of them is a string, that log line never shows up in my output.
For example, my input is:
12-JUL-13 11.33.56.259 AM,12-JUL-13 03.59.36.136 PM,12-JUL-13 04.00.05.584 PM
14-JUL-13 11.33.56.259 AM,11-JUL-13 04.00.05.584 PM
The output comes through successfully for
12-JUL-13 11.33.56.259 AM,12-JUL-13 03.59.36.136 PM,12-JUL-13 04.00.05.584 PM
but not for the second line.
In short, Logstash indexes an event only when all three if clauses see dates.
Help me out. Thanks in advance!

The issue with your if statements is pointed out in the comments by @Fairy and @alain-collins.
if [CREATION_DATE] == ""
does not check whether the field exists; it checks whether it is an empty string.
Instead, you could use a regex check to see whether the field has any content:
if [CREATION_DATE] =~ /.+/
and run your date filter when this returns true.
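Applied to the original filter, a minimal sketch of that approach (the same block would be repeated for SUBMITTED_DATE and LAST_MODIFIED_DATE):
if [CREATION_DATE] =~ /.+/ {
  date {
    locale => "en"
    match  => [ "CREATION_DATE", "dd-MMM-yy hh.mm.ss.SSS a" ]
    target => "CREATION_DATE"
  }
}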

The issue is solved when I change the input format:
(New format) 11-JUL-13 06.36.33.425000000 PM,13-JUL-13 06.36.33.425000000 PM,,
instead of
(Old format) 11-JUL-13 06.36.33.425000000 PM,13-JUL-13 06.36.33.425000000 PM,"",
But my question is still open. I posted this because the solution might be useful to someone.
Thanks!!!

Related

Unable to parse date and time from csv log into logstash

I want to combine two fields from a logfile and use the result as the timestamp for Logstash.
The logfile is in CSV format and the date format is somewhat confusing. Date and time are formatted like this:
Datum => 17|3|19
Zeit => 19:21:50
I tried the following code.
filter {
  csv {
    separator => ","
    columns => [ "Datum", "Zeit" ]
  }
  mutate {
    merge => { "Datum" => "Zeit" }
  }
  date {
    match => [ "Datum", "d M yy HH:mm:ss" ]
  }
}
The merge part seems to work, with this result:
"Datum" => [
  [0] "17|3|19",
  [1] "23:32:37"
]
but for the conversion of the date I get the following error:
"_dateparsefailure"
Can someone please help me?
With an event with the following fields:
"Datum" => "17|3|19"
"Zeit" => "19:21:50"
I got a working configuration:
mutate {
  merge => { "Datum" => "Zeit" }
}
mutate {
  join => { "Datum" => "," }
}
date {
  match => [ "Datum", "d|M|yy,HH:mm:ss" ]
}
This gives me in the output: "@timestamp":"2019-03-17T18:21:50.000Z"
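An equivalent sketch, assuming the csv filter has left Datum and Zeit as separate fields (no merge), is to build the combined value with add_field and drop the scratch field once parsing succeeds; timestamp_tmp is just an illustrative name:
mutate {
  # build e.g. "17|3|19 19:21:50" in a scratch field
  add_field => { "timestamp_tmp" => "%{Datum} %{Zeit}" }
}
date {
  # same day|month|year pattern as above, with a space instead of a comma
  match => [ "timestamp_tmp", "d|M|yy HH:mm:ss" ]
  # remove the scratch field once the date has been parsed
  remove_field => [ "timestamp_tmp" ]
}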

Logstash 6.2.4 - match time does not default to current date

I am using logstash 6.2.4 with the following config:
input {
  stdin { }
}
filter {
  date {
    match => [ "message", "HH:mm:ss" ]
  }
}
output {
  stdout { }
}
With the following input:
10:15:20
I get this output:
{
  "message" => "10:15:20",
  "@version" => "1",
  "host" => "DESKTOP-65E12L2",
  "@timestamp" => 2019-01-01T09:15:20.000Z
}
I have only the time information, but I would like it parsed with the current date.
Note that the current date is 1 March 2019, so I guess 2019-01-01 is some sort of default?
How can I parse the time information and add the current date to it?
I am not really interested in any replace or other blocks, because according to the documentation, parsing the time should default to the current date.
You need to add a new field that merges the current date with the field containing your time information (in your example, the message field). Your date filter is then matched against this new field. You can do this with the following configuration:
filter {
  mutate {
    add_field => { "current_date" => "%{+YYYY-MM-dd} %{message}" }
  }
  date {
    match => [ "current_date", "YYYY-MM-dd HH:mm:ss" ]
  }
}
The result will be something like this:
{
  "current_date" => "2019-03-03 10:15:20",
  "@timestamp" => 2019-03-03T13:15:20.000Z,
  "host" => "elk",
  "message" => "10:15:20",
  "@version" => "1"
}
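The three-hour shift in that output is presumably the date filter applying the host's local time zone, since none was specified. If the time should be read in a specific zone instead, the filter's timezone option can be set explicitly; a minimal sketch:
date {
  match => [ "current_date", "YYYY-MM-dd HH:mm:ss" ]
  # interpret 10:15:20 as UTC rather than the host's local zone
  timezone => "UTC"
}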

grok filter fails for ISO8601 timestamps since 5.2

Since I upgraded our ELK stack from 5.0.2 to 5.2, our grok filters fail and I have no idea why. Maybe I've overlooked something in the changelogs?
Filter
filter {
  if [type] == "nginx_access" {
    grok {
      match => { "message" => "%{IPORHOST:remote_addr} - %{USERNAME:remote_user} \[%{TIMESTAMP_ISO8601:timestamp}\] \"%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}\" %{INT:status} %{INT:body_bytes_sent} %{QS:http_referer} %{QS:http_user_agent} \"%{DATA:host_uri}\" \"%{DATA:proxy}\" \"%{DATA:upstream_addr}\" \"%{WORD:cache_status}\" \[%{NUMBER:request_time}\] \[(?:%{NUMBER:proxy_response_time}|-)\]" }
      add_field => [ "received_at", "%{@timestamp}" ]
    }
    mutate {
      convert => {
        "proxy_response_time" => "float"
        "request_time" => "float"
        "body_bytes_sent" => "integer"
      }
    }
  }
}
Error
Invalid format: \"2017-02-05T15:55:38+01:00\" is malformed at \"-02-05T15:55:38+01:00\"
Full Error
[2017-02-05T15:55:49,500][WARN ][logstash.outputs.elasticsearch] Failed action. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-2017.02.05", :_type=>"nginx_access", :_routing=>nil}, 2017-02-05T14:55:38.000Z proxy2 4.3.2.1 - - [2017-02-05T15:55:38+01:00] "HEAD / HTTP/1.1" 200 0 "-" "Zabbix" "example.com" "host1:10040" "1.2.3.4:10040" "MISS" [0.095] [0.095]], :response=>{"index"=>{"_index"=>"filebeat-2017.02.05", "_type"=>"nginx_access", "_id"=>"AVoOxh7p5p68dsalXDFX", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"2017-02-05T15:55:38+01:00\" is malformed at \"-02-05T15:55:38+01:00\""}}}}}
The whole thing works perfectly on http://grokconstructor.appspot.com and the TIMESTAMP_ISO8601 still seems the right choice (https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns)
Techstack
Ubuntu 16.04
Elasticsearch 5.2.0
Logstash 5.2.0
Filebeat 5.2.0
Kibana 5.2.0
Any ideas?
Cheers,
Finn
UPDATE
So this version works for some reason
filter {
  if [type] == "nginx_access" {
    grok {
      match => { "message" => "%{IPORHOST:remote_addr} - %{USERNAME:remote_user} \[%{TIMESTAMP_ISO8601:timestamp}\] \"%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}\" %{INT:status} %{INT:body_bytes_sent} %{QS:http_referer} %{QS:http_user_agent} \"%{DATA:host_uri}\" \"%{DATA:proxy}\" \"%{DATA:upstream_addr}\" \"%{WORD:cache_status}\" \[%{NUMBER:request_time}\] \[(?:%{NUMBER:proxy_response_time}|-)\]" }
      add_field => [ "received_at", "%{@timestamp}" ]
    }
    date {
      match  => [ "timestamp", "yyyy-MM-dd'T'HH:mm:ssZ" ]
      target => "timestamp"
    }
    mutate {
      convert => {
        "proxy_response_time" => "float"
        "request_time" => "float"
        "body_bytes_sent" => "integer"
      }
    }
  }
}
If someone can shed some light why I have to redefine a valid ISO8601 date I would be happy to know.
Make sure you specify the format of the timestamp you are expecting in your documents. The mapping could look like this:
PUT index
{
  "mappings": {
    "your_index_type": {
      "properties": {
        "date": {
          "type": "date",
          "format": "yyyy-MM-ddTHH:mm:ss+01:SS" <-- make sure to give the correct one
        }
      }
    }
  }
}
If you do not specify it correctly, Elasticsearch will expect the timestamp value in ISO format. Alternatively, you could do a date match for your timestamp field, which could look something like this within your filter:
date {
  match => [ "timestamp", "yyyy-MM-ddTHH:mm:ss+01:SS" ] <-- match the timestamp (I'm not sure what +01:ss stands for, make sure it matches)
  target => "timestamp"
  locale => "en"
  timezone => "UTC"
}
Or you could add a new field, run the date match against that, and then remove it if you aren't really using it, since the parsed timestamp ends up on the new field. Hope it helps.
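For reference, the date filter also accepts the built-in ISO8601 keyword, which covers offsets such as +01:00 without spelling the pattern out by hand; a minimal sketch using the same timestamp field:
date {
  # ISO8601 matches values like 2017-02-05T15:55:38+01:00
  match  => [ "timestamp", "ISO8601" ]
  target => "timestamp"
}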

logstash generate @timestamp from parsed message

I have a file containing a series of messages like this:
component+branch.job 2014-09-04_21:24:46 2014-09-04_21:24:49
That is a string, some whitespace, the first date and time, some whitespace, and the second date and time. Currently I'm using this filter:
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{WORD:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
}
I would like to convert dateStart and timeStart to @timestamp for that message.
I found that there is a date filter, but I don't know how to use it on two separate fields.
I have also tried something like this as a filter:
date {
  match => [ "message", "YYYY-MM-dd_HH:mm:ss" ]
}
but it didn't work as expected.
Based on the duplicate suggested by Magnus Bäck, I created a solution for my problem. The solution was to mutate the parsed data into one field:
mutate {
  add_field => { "tmp_start_timestamp" => "20%{dateStart}_%{timeStart}" }
}
and then parse it as suggested in my question.
So the final solution looks like this:
filter {
  grok {
    match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{DATA:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
  }
  mutate {
    add_field => { "tmp_start_timestamp" => "20%{dateStart}_%{timeStart}" }
  }
  date {
    match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  }
}
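The stop timestamp in the same message can be parsed the same way; since only one value can become @timestamp, the parsed stop time would go to its own field via target. A sketch of that extension (tmp_stop_timestamp and stop_time are illustrative names):
mutate {
  add_field => { "tmp_stop_timestamp" => "20%{dateStop}_%{timeStop}" }
}
date {
  match  => [ "tmp_stop_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
  # store the parsed stop time in its own field instead of @timestamp
  target => "stop_time"
}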

Negative regexp in logstash configuration

I cannot get negative regexp expressions working within Logstash (as described in the docs).
Consider the following positive regex which works correctly to detect fields that have been assigned a value:
if [remote_ip] =~ /(.+)/ {
  mutate { add_tag => ["ip"] }
}
However, the negative expression seems to return false even when the field is blank:
if [remote_ip] !~ /(.+)/ {
  mutate { add_tag => ["no_ip"] }
}
Am I misunderstanding the usage?
Update - this was fuzzy thinking on my part. There were issues with my config file. If the rest of your config file is sane, the above should work.
This was fuzzy thinking on my part - there were issues with the rest of my config file.
Based on Ben Lim's example, I came up with an input that is easier to test:
input {
  stdin { }
}
filter {
  if [message] !~ /(.+)/ {
    mutate { add_tag => ["blank_message"] }
  }
  if [noexist] !~ /(.+)/ {
    mutate { add_tag => ["tag_does_not_exist"] }
  }
}
output {
  stdout { debug => true }
}
The output for a blank message is:
{
  "message" => "",
  "@version" => "1",
  "@timestamp" => "2014-02-27T01:33:19.285Z",
  "host" => "benchmark.example.com",
  "tags" => [
    [0] "blank_message",
    [1] "tag_does_not_exist"
  ]
}
The output for a message with the content "test message" is:
test message
{
  "message" => "test message",
  "@version" => "1",
  "@timestamp" => "2014-02-27T01:33:25.059Z",
  "host" => "benchmark.example.com",
  "tags" => [
    [0] "tag_does_not_exist"
  ]
}
Thus, the negative check !~ /(.+)/ returns true only when the field is empty or the field does not exist.
The negative check !~ /(.*)/ returns true only when the field does not exist; if the field exists (whether empty or with a value), it returns false.
Below is my configuration. The type field does not exist, so the negative expression returns true.
input {
  stdin { }
}
filter {
  if [type] !~ /(.+)/ {
    mutate { add_tag => ["aa"] }
  }
}
output {
  stdout { debug => true }
}
The regexp /(.+)/ matches any value with at least one character, so a field that exists with content always matches it. Therefore, in your example, if the remote_ip field exists and is non-empty, your "negative expression" will always return false.
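Putting the two checks together, and assuming the behaviour shown in the stdin test above, the three cases can be routed in one conditional (empty_ip and no_ip_field are illustrative tag names):
filter {
  if [remote_ip] =~ /.+/ {
    mutate { add_tag => ["ip"] }           # field exists and has content
  } else if [remote_ip] =~ /.*/ {
    mutate { add_tag => ["empty_ip"] }     # field exists but is blank
  } else {
    mutate { add_tag => ["no_ip_field"] }  # field does not exist
  }
}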
