Parse error for float values - logstash

I'm trying to get a float value from a log line but logstash mutate filter rounds the value and converts it into integer.
The log line is
f413e89e-8c2f-e411-97a5-005056820dbe|0,0033
and the configuration file is
input {
file {
path => "log.txt"
}
}
filter {
grok {
match => ["message", "%{UUID:request_object_id}[/|]%{LOCALNUM:total_time}"]
}
mutate {
gsub => ["total_time", "[,]", "."]
convert => [ "total_time", "float" ]
}
}
output {
elasticsearch { host => localhost }
}
LOCALNUM is a custom pattern and it is
(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:[,][0-9]+)?)|(?:[,][0-9]+)))
(uses "," instead of "." in floating numbers).
With this configuration, total_time is 0 instead of 0.0033.

Looking at the logstash source code it does this:
convert(event) if @convert
gsub(event) if @gsub
So it does the convert before the gsub. Try splitting your mutate into two different mutates and it will fix your problem.
mutate {
gsub => ["total_time", "[,]", "."]
}
mutate {
convert => [ "total_time", "float" ]
}

Oh, I found my mistake. I used two separate mutate blocks, one for gsub and the other for convert, and that solved the problem.
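For anyone who wants to check this in isolation, here is a minimal sketch of a test pipeline. It assumes the LOCALNUM pattern from the question is saved in a file under a ./patterns directory (that path is an assumption, not something from the question). It reads lines from stdin and prints the parsed event so the type of total_time is visible.

```
input {
  stdin { }
}
filter {
  grok {
    # Assumption: ./patterns holds a file that defines LOCALNUM
    patterns_dir => ["./patterns"]
    match => ["message", "%{UUID:request_object_id}[/|]%{LOCALNUM:total_time}"]
  }
  # First mutate: turn the decimal comma into a decimal point
  mutate {
    gsub => ["total_time", ",", "."]
  }
  # Second mutate: convert only after the gsub above has run
  mutate {
    convert => [ "total_time", "float" ]
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
```

Pasting the log line from the question into stdin should then show total_time as a number rather than a quoted string, if the ordering explanation above is right.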

Related

Creating dynamic Key-Value pairs in logstash

I have the following data in logstash output:
"Details" => "SAID,:EGT1_M2P7_01,::LIP,:10-168-98-203::RIP,:10-81-122-84:",
I want to create dynamic key-value pairs according to the delimiters:
",:" means that "SAID" is the key and "EGT1_M2P7_01" is the value.
"::" means a new line, and again ",:" means that "LIP" is the key and "10-168-98-203" is the value.
I need to know how to do this. Looking forward to answers.
For the input you have given,
"SAID,:EGT1_M2P7_01,::LIP,:10-168-98-203::RIP,:10-81-122-84:"
this filter and stdout output
filter {
kv {
source => "Details"
field_split => "::"
value_split => ":"
}
mutate {
remove_field => ["host", "@timestamp", "@version", "message", "sequence" ]
}
}
output {
stdout {
codec => rubydebug
}
}
gives you
{
"LIP," => "10-168-98-203",
"SAID," => "EGT1_M2P7_01,",
"RIP," => "10-81-122-84"
}
Remove any additional fields that are specific to your host system by adding them to the remove_field list above.

How do I replace a string in a field in Logstash

I have an IP address field from the Windows event log that contains characters like "::ffff:" in front of the IP address. I cannot change the source here, so I have to fix this in Logstash.
I must suck at googling, but I really can't find a simple way to just strip these characters from the ip-address fields in logstash.
I have tried for example
if ("" in [event_data][IpAddress]) {
mutate {
add_field => { "client-host" => "%{[event_data][IpAddress]}"}
gsub => ["client-host", ":", ""]
}
dns {
action => "replace"
reverse => [ "client-host" ]
}
}
but no luck, the colon is still there. How can I replace "::ffff:" in the string "::ffff:10.0.36.39" in Logstash?
The add_field isn't executed until after the gsub, so you need to break it up into two mutate blocks.
mutate {
add_field => { "client-host" => "%{[event_data][IpAddress]}"}
}
mutate {
gsub => ["client-host", "::ffff:", ""]
}
The specific order that mutate works in:
rename(event) if @rename
update(event) if @update
replace(event) if @replace
convert(event) if @convert
gsub(event) if @gsub
uppercase(event) if @uppercase
lowercase(event) if @lowercase
strip(event) if @strip
remove(event) if @remove
split(event) if @split
join(event) if @join
merge(event) if @merge
filter_matched(event)
where filter_matched applies all of the standard actions, such as add_field.
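Putting that ordering together with the original conditional, a sketch of the whole filter section might look like the following. Two things here are my own changes rather than part of the question: the condition just tests that the field exists, and the gsub pattern is anchored with ^ so that only a leading "::ffff:" is stripped.

```
filter {
  # Run only when the source field is present on the event
  if [event_data][IpAddress] {
    # Step 1: copy the address into a new field
    mutate {
      add_field => { "client-host" => "%{[event_data][IpAddress]}" }
    }
    # Step 2: a separate mutate, so the field already exists when gsub runs
    mutate {
      gsub => ["client-host", "^::ffff:", ""]
    }
    # Step 3: reverse-resolve the cleaned address
    dns {
      action => "replace"
      reverse => [ "client-host" ]
    }
  }
}
```

The general rule this illustrates: whenever one mutate operation depends on the result of another that runs later in the list above, put them in separate mutate blocks in the order you need.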

Retrieving RESTful GET parameters in logstash

I am trying to get logstash to parse key-value pairs in an HTTP get request from my ELB log files.
The request field looks like:
http://aaa.bbb/get?a=1&b=2
I'd like there to be a field for a and b in the log line above, and I am having trouble figuring it out.
My logstash conf (formatted for clarity) is below which does not load any additional key fields. I assume that I need to split off the address portion of the URI, but have not figured that out.
input {
file {
path => "/home/ubuntu/logs/**/*.log"
type => "elb"
start_position => "beginning"
sincedb_path => "log_sincedb"
}
}
filter {
if [type] == "elb" {
grok {
match => [ "message", "%{TIMESTAMP_ISO8601:timestamp}
%{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int}
%{IP:backend_ip}:%{NUMBER:backend_port:int}
%{NUMBER:request_processing_time:float}
%{NUMBER:backend_processing_time:float}
%{NUMBER:response_processing_time:float}
%{NUMBER:elb_status_code:int}
%{NUMBER:backend_status_code:int}
%{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int}
%{QS:request}" ]
}
date {
match => [ "timestamp", "ISO8601" ]
}
kv {
field_split => "&?"
source => "request"
exclude_keys => ["callback"]
}
}
}
output {
elasticsearch { host => localhost }
}
kv will take a URL and split out the params. This config works:
input {
stdin { }
}
filter {
mutate {
add_field => { "request" => "http://aaa.bbb/get?a=1&b=2" }
}
kv {
field_split => "&?"
source => "request"
}
}
output {
stdout {
codec => rubydebug
}
}
stdout shows:
{
"request" => "http://aaa.bbb/get?a=1&b=2",
"a" => "1",
"b" => "2"
}
That said, I would encourage you to create your own versions of the default URI patterns so that they set fields. You can then pass the querystring field off to kv. It's cleaner that way.
UPDATE:
For "make your own patterns", I meant to take the existing ones and modify them as needed. In logstash 1.4, installing them was as easy as putting them in a new file in the 'patterns' directory; I don't know about patterns for >1.4 yet.
MY_URIPATHPARAM %{URIPATH}(?:%{URIPARAM:myuriparams})?
MY_URI %{URIPROTO}://(?:%{USER}(?::[^@]*)?@)?(?:%{URIHOST})?(?:%{MY_URIPATHPARAM})?
Then you could use MY_URI in your grok{} pattern and it would create a field called myuriparams that you could feed to kv{}.
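As a rough sketch of how that could be wired up, assuming the two MY_ patterns above are saved in a file under a ./patterns directory (the path is my assumption) and that the request field has already been extracted by the earlier grok:

```
filter {
  grok {
    # Assumption: ./patterns holds a file with MY_URIPATHPARAM and MY_URI
    patterns_dir => ["./patterns"]
    # Pull the URI out of the already-extracted request field
    match => [ "request", "%{MY_URI}" ]
  }
  kv {
    # myuriparams still starts with "?", so keep "?" as a split character
    source => "myuriparams"
    field_split => "&?"
    exclude_keys => ["callback"]
  }
}
```

The benefit over running kv on the whole request is that only the query string is split, so nothing in the scheme, host or path can be picked up as a key by accident.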

logstash generate @timestamp from parsed message

I have a file containing a series of messages like this:
component+branch.job 2014-09-04_21:24:46 2014-09-04_21:24:49
Each one is a string, some whitespace, a first date and time, more whitespace, and a second date and time. Currently I'm using this filter:
filter {
grok {
match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{WORD:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
}
}
I would like to convert dateStart and timeStart to @timestamp for that message.
I found that there is a date filter, but I don't know how to use it on two separate fields.
I have also tried something like this as a filter:
date {
match => [ "message", "YYYY-MM-dd_HH:mm:ss" ]
}
but it didn't work as expected.
Based on the duplicate suggested by Magnus Bäck, I created a solution for my problem. The solution was to combine the parsed data into one field:
mutate {
add_field => {"tmp_start_timestamp" => "20%{dateStart}_%{timeStart}"}
}
and then parse it as I suggested in my question.
So the final solution looks like this:
filter {
grok {
match => [ "message", "%{WORD:componentName}\+%{WORD:branchName}\.%{DATA:jobType}\s+20%{DATE:dateStart}_%{TIME:timeStart}\s+20%{DATE:dateStop}_%{TIME:timeStop}" ]
}
mutate {
add_field => {"tmp_start_timestamp" => "20%{dateStart}_%{timeStart}"}
}
date {
match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
}
}
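If the stop time is also needed as a real date, the same trick can probably be repeated with the date filter's target option. This is only a sketch: stop_timestamp and tmp_stop_timestamp are hypothetical field names I made up, and it assumes the grok above has already populated dateStart, timeStart, dateStop and timeStop.

```
filter {
  mutate {
    add_field => {
      "tmp_start_timestamp" => "20%{dateStart}_%{timeStart}"
      "tmp_stop_timestamp" => "20%{dateStop}_%{timeStop}"
    }
  }
  # Start time goes into @timestamp (the date filter's default target)
  date {
    match => [ "tmp_start_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
    # Drop the temporary field once parsing has succeeded
    remove_field => [ "tmp_start_timestamp" ]
  }
  # Stop time goes into its own field instead of @timestamp
  date {
    match => [ "tmp_stop_timestamp", "YYYY-MM-dd_HH:mm:ss" ]
    target => "stop_timestamp"
    remove_field => [ "tmp_stop_timestamp" ]
  }
}
```

Because remove_field is one of the standard actions that only run when the filter succeeds, the temporary fields should stay on the event if parsing fails, which makes a bad format easier to spot.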

Negative regexp in logstash configuration

I cannot get negative regexp expressions working within LogStash (as described in the docs)
Consider the following positive regex which works correctly to detect fields that have been assigned a value:
if [remote_ip] =~ /(.+)/ {
mutate { add_tag => ["ip"] }
}
However, the negative expression seems to return false even when the field is blank:
if [remote_ip] !~ /(.+)/ {
mutate { add_tag => ["no_ip"] }
}
Am I misunderstanding the usage?
Update - this was fuzzy thinking on my part. There were issues with my config file. If the rest of your config file is sane, the above should work.
Based on Ben Lim's example, I came up with an input that is easier to test:
input {
stdin { }
}
filter {
if [message] !~ /(.+)/ {
mutate { add_tag => ["blank_message"] }
}
if [noexist] !~ /(.+)/ {
mutate { add_tag => ["tag_does_not_exist"] }
}
}
output {
stdout {debug => true}
}
The output for a blank message is:
{
"message" => "",
"@version" => "1",
"@timestamp" => "2014-02-27T01:33:19.285Z",
"host" => "benchmark.example.com",
"tags" => [
[0] "blank_message",
[1] "tag_does_not_exist"
]
}
The output for a message with the content "test message" is:
test message
{
"message" => "test message",
"@version" => "1",
"@timestamp" => "2014-02-27T01:33:25.059Z",
"host" => "benchmark.example.com",
"tags" => [
[0] "tag_does_not_exist"
]
}
Thus, the negative match !~ /(.+)/ returns true only when the field is empty or the field does not exist.
The negative regex /(.*)/ will only return true when the field does not exist. If the field exists (whether empty or with values), the return value will be false.
Below is my configuration. The type field does not exist, so the negative expression returns true.
input {
stdin {
}
}
filter {
if [type] !~ /(.+)/ {
mutate { add_tag => ["aa"] }
}
}
output {
stdout {debug => true}
}
The regexp /(.+)/ matches any value that contains at least one character. So when the "type" field exists and is not blank, it matches the regexp and the negative expression returns false. In your example, the negative expression on remote_ip therefore returns true only when the field is missing or blank.
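If you need to tell a missing field apart from a blank one, a sketch along these lines may help. The tag names are made up for illustration, and I am assuming that a bare field test such as [remote_ip] is false when the field is missing or null, which is worth confirming on your version with a stdin input.

```
filter {
  if ![remote_ip] {
    # Field is missing (or null) on this event
    mutate { add_tag => ["remote_ip_missing"] }
  } else if [remote_ip] !~ /.+/ {
    # Field exists but contains no characters
    mutate { add_tag => ["remote_ip_blank"] }
  } else {
    # Field exists and has at least one character
    mutate { add_tag => ["ip"] }
  }
}
```

This keeps the regex test for the blank case only, so the result no longer depends on how a regex comparison treats a field that is not there.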

Resources