I want to put a value into part of a nested hash, but name that part depending on upstream filters. This is to refactor and reduce overall code size as currently each of the 20+ incoming event types have their own section like this with 18 lines in the logstash file (but currently the %{detail_part} bit is hard-coded).
# Working code
filter {
if [result] == 0 {
# Success
mutate {
add_field => {
"[Thing][ThingDetail][OtherThing][MoreDetail]" => "true"
}
}
}
else {
# Failed
mutate {
add_field => {
"[Thing][ThingDetail][OtherThing][MoreDetail]" => "false"
}
}
}
}
Above is hard-coded to "OtherThing". Below has a variable, but doesn't work.
# Non-Working code
filter {
if [result] == 0 {
# Success
mutate {
add_field => {
"[Thing][ThingDetail][%{detail_part}][MoreDetail]" => "true"
}
}
}
else {
# Failed
mutate {
add_field => {
"[Thing][ThingDetail][%{detail_part}][MoreDetail]" => "false"
}
}
}
}
In the above (non-Working code), detail_part is set in an upstream filter to a string value like "OtherThing". This currently compiles and runs, but no XML is output from it, so I don't think anything is set into the hash as a result of these statements.
I know it can be done with embedded Ruby code, but I would like a way that is as simple as possible. The output of this process is going to XML so I am constrained to use this kind of nested hash.
Is this possible with Logstash?
It turns out that yes, Logstash supports this, just the syntax was wrong. So here is the fix:
filter {
# This field could be set conditionally in an if statement ...
mutate { add_field => { "[detail_part]" => "Result" } }
if [result] == 0 {
# Success
mutate {
add_field => {
"[Thing][ThingDetail][%{[detail_part]}][MoreDetail]" => "true"
}
}
}
else {
# Failed
mutate {
add_field => {
"[Thing][ThingDetail][%{[detail_part]}][MoreDetail]" => "false"
}
}
}
}
I just couldn't find any non-trivial examples that did things like this.
Related
I have a testing environment to test some logstash plugin before to move to production.
For now, I am using kiwi syslog generator, to generate some syslog for testing.
The field I have are as follow:
#timestamp
message
+ elastic medatadata
Starting from this basic fields, I start filtering my data.
The first thing is to add a new field based on the timestamp and message as follow:
input {
syslog {
port => 514
}
}
filter {
prune {
whitelist_names =>["timestamp","message","newfield", "message_count"]
}
mutate {
add_field => {"newfield" => "%{#timestamp}%{message}"}
}
}
The prune is just to don't process unwanted data.
And this works just fine as I am getting a new field with those 2 values.
The next step was to run some aggregation based on specific content of the message, such as if the message contains logged in or logged out
and to do this, I used the aggregation filter
grok {
match => {
"message" => [
"(?<[#metadata][event_type]>logged out)",
"(?<[#metadata][event_type]>logged in)",
"(?<[#metadata][event_type]>workstation locked)"
]
}
}
aggregate {
task_id => "%{message}"
code => "
map['message_count'] ||= 0; map['message_count'] += 1;
"
push_map_as_event_on_timeout => true
timeout_timestamp_field => "#timestamp"
timeout => 60
inactivity_timeout => 50
timeout_tags => ['_aggregatetimeout']
}
}
This worked as expected but I am having a problem here. When the aggregation times out. the only field populated for the specific aggregation, is the message_count
As you can see in the above screenshot, the newfield and message(the one on the total left, sorry it didn't fit in the screenshot) are both empty.
For the demostration and testing purpose that's is absolutely fine, but it will because unmanageable if I get hundreds of syslog per second not knowing to with message that message_count refers to.
Please, I am struggling here and I don't know how to solve this issue, can please somebody help me to understand how I can fill the newfield with the content of the message that it refers to?
This is my whole logstash configuration to make it easier.
input {
syslog {
port => 514
}
}
filter {
prune {
whitelist_names =>["timestamp","message","newfield", "message_count"]
}
mutate {
add_field => {"newfield" => "%{#timestamp}%{message}"}
}
grok {
match => {
"message" => [
"(?<[#metadata][event_type]>logged out)",
"(?<[#metadata][event_type]>logged in)",
"(?<[#metadata][event_type]>workstation locked)"
]
}
}
aggregate {
task_id => "%{message}"
code => "
map['message_count'] ||= 0; map['message_count'] += 1;
"
push_map_as_event_on_timeout => true
timeout_timestamp_field => "#timestamp"
timeout => 60
inactivity_timeout => 50
timeout_tags => ['_aggregatetimeout']
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
index => "logstash_index"
}
stdout {
codec => rubydebug
}
csv {
path => "C:\Users\adminuser\Desktop\syslog\syslogs-%{+yyyy.MM.dd}.csv"
fields => ["timestamp", "message", "message_count", "newfield"]
}
}
push_map_as_event_on_timeout => true
When you use this, and a timeout occurs, it creates a new event using the contents of the map. If you want fields from the original messages to be in the new event then you have to add them to the map. For the task_id there is a shorthand notation to do this using the timeout_task_id_field option on the filter, otherwise you have explicitly add them
map['newfield'] ||= event.get('newfield');
I have an output like so:
output {
if [target_index] == "mystream-%{+YYYY.MM.dd}"{
kinesis {
stream_name => "mystream"
region => "us-east-1"
}
}
I'd like to filter by the index pattern mystream-{date}. For some reason this conditional is not working. I'm not sure what the problem is here. Any help would be greatly appreciated.
Values compared in conditionals are not sprintf'd. You would have to use mutate to add a value that gets sprintf'd, and then compare to that. For example
filter { mutate { add_field => { "[#metadata][streamName]" => "mystream-%{+YYYY.MM.dd}" } } }
output {
if [target_index] == [#metadata][streamName] { ...
I have one custom field in Logstash event defined as expression:
{ "customIndex" => "my-service-%{+YYYY.MM}" }
And filter that calculates index name for elasticsearch output plugin:
filter {
if [customIndex] {
mutate {
add_field => { "indexName" => "custom-%{customIndex}" }
}
} else {
mutate {
add_field => { "indexName" => "common-%{+YYYY.MM.dd}" }
}
}
}
But for custom index it creates invalid name custom-my-service-%{+YYYY.MM} and does not evaluate %{+YYYY.MM} expression.
Is it possible to evaluate field and get custom-my-service-2016.11?
If you can reformat your created field to this:
{ "customIndex" => "my-service-%Y.%m" }
Then this Ruby filter will do the trick:
ruby {
init => "require 'date'"
code => "event['indexName'] = 'custom-' + Date.today.strftime(event['customIndex'])"
}
Here is a documentation on placeholders you can use.
Maybe it is me, but how come that when I use the CSV Output from LogStash it does not output in a csv format? I am using nothing special (as seen in the configuration). Can someone tell me what I am doing wrong?
input
{
stdin {
type => "stdin-type"
}
}
filter
{
mutate { add_field => { "test" => "testme" } }
mutate { add_field => { "[#metadata][test]" => "Hello" } }
mutate { add_field => { "[#metadata][test2]" => "world" } }
}
output {
# .\bin\logstash-plugin.bat install logstash-output-csv
csv {
fields => ["test", "[#metadata][test]"]
path => "./TestLogs.csv"
}
stdout { codec => rubydebug { metadata => true } }
}
It actually create an output. If I type something (Ex.: test me) in the console (stdin) it creates the file and all. But the CSV file contains the following:
2016-11-25T11:49:40.338Z MyPcName test me
And I am expecting the following:
testme,Hello
Note: I am using LogStash 5 (latest version at the moment).
This is a Logstash 5.x issue. For now, I'm using the script below:
output {
file {
path => "/app/logstash/test.csv"
message_pattern => (grok pattern)
}
I'm building out logstash and would like to build functionality to anonymize fields as specified in the message.
Given the message below, the field fta is a list of fields to anonymize. I would like to just use %{fta} and pass it through to the anonymize filter, but that doesn't seem to work.
{ "containsPII":"True", "fta":["f1","f2"], "f1":"test", "f2":"5551212" }
My config is as follows
input {
stdin { codec => json }
}
filter {
if [containsPII] {
anonymize {
algorithm => "SHA1"
key => "123456789"
fields => %{fta}
}
}
}
output {
stdout {
codec => rubydebug
}
}
The output is
{
"containsPII" => "True",
"fta" => [
[0] "f1",
[1] "f2"
],
"f1" => "test",
"f2" => "5551212",
"#version" => "1",
"#timestamp" => "2016-07-13T22:07:04.036Z",
"host" => "..."
}
Does anyone have any thoughts? I have tried several permutations at this point with no luck.
Thanks,
-D
EDIT:
After posting in the Elastic forums, I found out that this is not possible using base logstash functionality. I will try using the ruby filter instead. So, to ammend my question, How do I call another filter from within the ruby filter? I tried the following with no luck and honestly can't even figure out where to look. I'm very new to ruby.
filter {
if [containsPII] {
ruby {
code => "event['fta'].each { |item| event[item] = LogStash::Filters::Anonymize.execute(event[item],'12345','SHA1') }"
add_tag => ["Rubyrun"]
}
}
}
You can execute the filters from ruby script. Steps will be:
Create the required filter instance in the init block of inline ruby script.
For every event call the filter method of the filter instance.
Following is the example for above problem statement. It will replace my_ip field in event with its SHA1.
Same can be achieved using ruby script file.
Following is the sample config file.
input { stdin { codec => json_lines } }
filter {
ruby {
init => "
require 'logstash/filters/anonymize'
# Create instance of filter with applicable parameters
#anonymize = LogStash::Filters::Anonymize.new({'algorithm' => 'SHA1',
'key' => '123456789',
'fields' => ['my_ip']})
# Make sure to call register
#anonymize.register
"
code => "
# Invoke the filter
#anonymize.filter(event)
"
}
}
output { stdout { codec => rubydebug {metadata => true} } }
Well, I wasn't able to figure out how to call another filter from within a ruby filter, but I did get to the functional goal.
filter {
if [fta] {
ruby {
init => "require 'openssl'"
code => "event['fta'].each { |item| event[item] = OpenSSL::HMAC.hexdigest(OpenSSL::Digest::SHA256.new, '123456789', event[item] ) }"
}
}
}
If the field FTA exists, it will SHA2 encode each of the fields listed in that array.