Logstash: check if field exists

I have log files coming into an ELK stack. I want to copy a field (foo) in order to perform various mutations on it. However, the field (foo) isn't always present.
If foo doesn't exist, then bar still gets created, but it is assigned the literal string "%{foo}".
How can I perform a mutation only if a field exists?
I'm trying to do something like this:
if ["foo"] {
mutate {
add_field => "bar" => "%{foo}
}
}

To check if field foo exists:
1) For numeric fields, use:
if ([foo]) {
  ...
}
2) For non-numeric types such as boolean or string, use:
if ("" in [foo]) {
  ...
}

"foo" is a literal string.
[foo] is a field.
# technically anything that returns 'true', so good for numbers and basic strings:
if [foo] {
}
# contains a value
if [foo] =~ /.+/ {
}

On Logstash 2.2.2, the ("" in [field]) construct does not appear to work for me.
if ![field] { }
does, for a non-numerical field.
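Applied back to the original question, here is a minimal sketch using that negated check (the foo_missing tag name is just an arbitrary marker for illustration):
filter {
  if ![foo] {
    # foo is missing (or falsy), so don't build bar from it
    mutate { add_tag => ["foo_missing"] }
  } else {
    mutate { add_field => { "bar" => "%{foo}" } }
  }
}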

It's 2020 and none of the above answers is quite correct. I've been working with Logstash since 2014, and conditional expressions in the filter section were, are, and will remain a source of confusion...
For example, you may have a boolean field whose value is false, and with the above solutions you cannot tell whether false is the actual value of the field or just the result of the expression because the field doesn't exist.
Workaround for checking if a field exists in all versions
I think all versions of Logstash support the [@metadata] field, that is, a field that is not visible to output plugins and lives only in the filtering stage. So this is the workaround I use:
filter {
  mutate {
    # we use a "temporary" field with a predefined, arbitrary, known value that
    # lives only in the filtering stage.
    add_field => { "[@metadata][testField_check]" => "unknown arbitrary value" }
    # we copy the field of interest into that temporary field.
    # If the field doesn't exist, the copy is not executed.
    copy => { "testField" => "[@metadata][testField_check]" }
  }
  # now we know that if testField didn't exist, our temporary field still has
  # the initial arbitrary value
  if [@metadata][testField_check] == "unknown arbitrary value" {
    # just for debugging purposes...
    mutate { add_field => { "FIELD_DID_NOT_EXIST" => true } }
  } else {
    # just for debugging purposes...
    mutate { add_field => { "FIELD_ALREADY_EXISTED" => true } }
  }
}
Old solution for Logstash prior to version 7.0.0
Check my issue on GitHub.
I've been struggling a lot with expressions in Logstash. My old solution worked until version 7. This was for boolean fields, for instance:
filter {
  # if the field does not exist, `convert` will create it with the string "false". If
  # the field exists, it will be the boolean value converted into a string.
  mutate { convert => { "field" => "string" } }
  # This condition breaks on Logstash > 7 (see my bug report). Before version 7,
  # this condition was true if a boolean field didn't exist.
  if ![field] {
    mutate { add_field => { "field" => false } }
  }
  # at this stage, we are sure the field exists, so make it boolean again
  mutate { convert => { "field" => "boolean" } }
}

Related

Add "type" to my custom input plugin

I'm trying to add a "type" field to my input plugin that I can then use as a tag when filtering.
For example, I can add "type" in the 'file' input plugin and filter on it later.
Code:
input {
  file {
    path => "/var/log/%{type}.%{+yyyy.MM.dd.HH}"
    type => "myType"
  }
}
filter {
  if [type] == "myType" {
    ...
  }
}
output {
  ...
}
I looked at the file input plugin implementation and did not see anything related to the "type" field.
So I just tried to add it like this:
input {
  myPlugin {
    path => "/var/log/%{type}.%{+yyyy.MM.dd.HH}"
    type => "myType"
  }
}
But the filtering did not work at all.
How can I add this "tag" (or whatever it is) to my plugin?
The functionality of type is to use it for filtering:
Types are used mainly for filter activation.
So you won't actually see a new field being created unless you create one in your filter, something like this:
input {
  file {
    path => "/var/log/%{type}.%{+yyyy.MM.dd.HH}"
    type => "myType"
  }
}
filter {
  if [type] == "myType" {
    mutate {
      # this will actually create the field
      add_field => { "somefield" => "this is a sample" }
      # this will add a tag to your es record when the condition matches
      add_tag => ["myType"]
    }
  }
}
You could also check out the docs on adding new fields and tags. Make sure your plugin supports type. Hope this helps!
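If you are writing the input plugin yourself, a common reason type never shows up on events is that the plugin's run loop never calls decorate. Here is a minimal Ruby sketch of an input's run method (read_next_line is a hypothetical helper standing in for however your plugin reads data); decorate(event) is what LogStash::Inputs::Base uses to apply type, tags and add_field from the configuration:
def run(queue)
  # emit events until Logstash signals the plugin to stop
  until stop?
    event = LogStash::Event.new("message" => read_next_line)  # read_next_line is hypothetical
    decorate(event)  # applies `type`, `tags` and `add_field` from the input config
    queue << event
  end
end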

Logstash Filter geo_point with lat long?

Part of my grok filter (working) grabs the following two fields:
%{NUMBER:XCent} %{NUMBER:YCent}
which are lat/long points.
I'm attempting to add a location pin, but I keep getting a config failure when I use the --debug flag on my configuration file.
All of my configuration works until I get to this section:
if [XCent] and [YCent] {
  mutate {
    add_field => {
      "[location][lat]" => "%{XCent}"
      "[location][lon]" => "%{YCent}"
    }
  }
  mutate {
    convert => {
      "[location][lat]" => "float"
      "[location][lon]" => "float"
    }
  }
  mutate {
    convert => {"[location]", "geo_point"}
  }
}
My thought was that this is basically what the Elasticsearch 1.4 documentation suggested:
https://www.elastic.co/guide/en/elasticsearch/reference/1.4/mapping-geo-point-type.html
Edit: found a better way to apply the configuration in the filter; updated the code above.
The third mutate filter is invalid. convert accepts a hash as its argument, and the valid conversions are integer, float, string, and boolean. You don't need this filter, so you can just remove it.
To set the location field as a geo_point type you need to modify the Elasticsearch index template you are using for your data.
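For reference, the relevant piece of an index template might look roughly like this on the Elasticsearch versions of that era (the logstash_geo template name and the logstash-* index pattern are placeholders; newer Elasticsearch releases use index_patterns and drop mapping types, so adjust accordingly):
PUT _template/logstash_geo
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "location": { "type": "geo_point" }
      }
    }
  }
}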

Eliminate the top-level field in Logstash

I am using Logstash and one of my applications is sending me fields like:
[message][UrlVisited]
[message][TotalDuration]
[message][AccountsProcessed]
I'd like to be able to collapse these fields, removing the top-level message field altogether, so the above fields would become:
[UrlVisited]
[TotalDuration]
[AccountsProcessed]
Is there a way to do this in Logstash?
Assuming the names of all such subfields are known in advance you can use the mutate filter:
filter {
  mutate {
    rename => ["[message][UrlVisited]", "UrlVisited"]
  }
  mutate {
    rename => ["[message][TotalDuration]", "TotalDuration"]
  }
  mutate {
    rename => ["[message][AccountsProcessed]", "AccountsProcessed"]
  }
  mutate {
    remove_field => ["message"]
  }
}
Alternatively, use a ruby filter (which works even if you don't know the field names):
filter {
  ruby {
    code => "
      event.get('message').each {|k, v|
        event.set(k, v)
      }
      event.remove('message')
    "
  }
}
This example works on Logstash 2.4 and later. For earlier versions use event['message'].each ... and event[k] = v instead.

Can I use gsub to recursively replace all fieldnames with another field?

After changing my mapping in ElasticSearch to more definitively type the data I am inputting into the system, I have unwittingly made my new variables a nested object. Upon thinking about it more, I actually like the idea of those fields being nested objects because that way I can explicitly know if that src_port statistic is from netflow or from the ASA logs, as an example.
I'd like to use a mutate (gsub, perhaps?) to rename all of the field names for a given type to newtype.fieldname. I see that there is gsub, which uses a regexp, and rename, which takes the literal field name, but I would like to avoid having 30 distinct gsub/rename statements when I will be replacing all of the fields in that type with the "newtype" prefix.
Is there a way to do this?
Here is an example for your reference.
input {
  stdin {
    type => 'netflow'
  }
}
filter {
  mutate {
    add_field => {"%{type}.message" => "%{message}"}
    remove_field => ["message"]
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
In this example I have changed the message field name to type.message and then deleted the original message field. I think you can use this sample to do what you want.
Hope this can help you.
I have updated my answer!
Use the ruby plugin to do what you want!
Please note that Elasticsearch uses the @timestamp field for indexing, so I recommend that you do not change that field's name.
input {
  stdin {
    type => 'netflow'
  }
}
filter {
  ruby {
    code => "
      data = event.clone.to_hash;
      type = event['type']
      data.each do |k,v|
        if k != '@timestamp'
          newFieldName = type +'.'+ k
          event[newFieldName] = v
          event.remove(k)
        end
      end
    "
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
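On Logstash 5.x and later, the Ruby filter has to go through the event API (event.get / event.set) instead of hash-style access, so a rough translation of the same idea would look like this (a sketch under that assumption, not tested on every version):
filter {
  ruby {
    code => "
      data = event.to_hash
      type = event.get('type')
      data.each do |k, v|
        if k != '@timestamp'
          event.set(type + '.' + k, v)
          event.remove(k)
        end
      end
    "
  }
}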

Logstash grok filter - name fields dynamically

I've got log lines in the following format and want to extract fields:
[field1: content1] [field2: content2] [field3: content3] ...
I neither know the field names, nor the number of fields.
I tried it with backreferences and the sprintf format but got no results:
match => [ "message", "(?:\[(\w+): %{DATA:\k<-1>}\])+" ] # not working
match => [ "message", "(?:\[%{WORD:fieldname}: %{DATA:%{fieldname}}\])+" ] # not working
This seems to work for only one field but not more:
match => [ "message", "(?:\[%{WORD:field}: %{DATA:content}\] ?)+" ]
add_field => { "%{field}" => "%{content}" }
The kv filter is also not appropriate because the content of the fields may contain whitespace.
Is there any plugin / strategy to fix this problem?
Logstash Ruby Plugin can help you. :)
Here is the configuration:
input {
  stdin {}
}
filter {
  ruby {
    code => "
      fieldArray = event['message'].split('] [')
      for field in fieldArray
        field = field.delete '['
        field = field.delete ']'
        result = field.split(': ')
        event[result[0]] = result[1]
      end
    "
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
With your logs:
[field1: content1] [field2: content2] [field3: content3]
This is the output:
{
       "message" => "[field1: content1] [field2: content2] [field3: content3]",
      "@version" => "1",
    "@timestamp" => "2014-07-07T08:49:28.543Z",
          "host" => "abc",
        "field1" => "content1",
        "field2" => "content2",
        "field3" => "content3"
}
I have tried it with 4 fields, and it also works.
Please note that the event in the ruby code is the Logstash event. You can use it to get all of your event fields, such as message, @timestamp, etc.
Enjoy it!!!
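Note that event['message'] style access only works on older Logstash releases; from 5.x onward the Ruby filter must use the event API. A sketch of the same filter rewritten with event.get / event.set:
filter {
  ruby {
    code => "
      fieldArray = event.get('message').split('] [')
      for field in fieldArray
        field = field.delete '['
        field = field.delete ']'
        result = field.split(': ')
        event.set(result[0], result[1])
      end
    "
  }
}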
I found another way using regex:
ruby {
  code => "
    fields = event['message'].scan(/(?<=\[)\w+: .*?(?=\](?: |$))/)
    for field in fields
      field = field.split(': ')
      event[field[0]] = field[1]
    end
  "
}
I know that this is an old post, but I just came across it today, so I thought I'd offer an alternate method. Please note that, as a rule, I would almost always use a ruby filter, as suggested in either of the two previous answers; still, the following may be useful as an alternative.
If there is a fixed number of fields or a maximum number of fields (i.e., there may be fewer than three fields, but there will never be more than three fields), this can be done with a combination of grok and mutate filters, as well.
# Test message is: `[fieldname: value]`
# Store values in [@metadata] so we don't have to explicitly delete them.
grok {
  match => {
    "[message]" => [
      "\[%{DATA:[@metadata][_field_name_01]}:\s+%{DATA:[@metadata][_field_value_01]}\]( \[%{DATA:[@metadata][_field_name_02]}:\s+%{DATA:[@metadata][_field_value_02]}\])?( \[%{DATA:[@metadata][_field_name_03]}:\s+%{DATA:[@metadata][_field_value_03]}\])?"
    ]
  }
}
# Rename the fieldname, value combinations. I.e., if the following data is in the message:
#
#   [foo: bar]
#
# it will be saved in the elasticsearch output as:
#
#   {"foo":"bar"}
#
mutate {
  rename => {
    "[@metadata][_field_value_01]" => "[%{[@metadata][_field_name_01]}]"
    "[@metadata][_field_value_02]" => "[%{[@metadata][_field_name_02]}]"
    "[@metadata][_field_value_03]" => "[%{[@metadata][_field_name_03]}]"
  }
  tag_on_failure => []
}
For those who may not be as familiar with regex, the captures in ()? are optional regex matches, meaning that if there is no match, the expression won't fail. The tag_on_failure => [] option in the mutate filter ensures that no error will be appended to tags if one of the renames fails because there was no data to capture and, as a result, there is no field to rename.
