I'm trying to add a simple translate filter to convert a port number into an application name:
filter {
  mutate {
    add_field => { "[source][application]" => "%{[source][port]}" }
  }
  translate {
    field => "[source][port]"
    destination => "[source][application]"
    dictionary => {
      "80" => "HTTP"
      "443" => "SSL"
      "5432" => "Postgresql"
    }
    fallback => "__NO_MATCH"
  }
}
The mutate part works correctly, but the translate filter is completely ignored.
In [source][application] I get the original port number, not the application name or even __NO_MATCH.
What am I doing wrong? Is it a type problem?
Thanks
If you want to write the result of a translate filter into a field that already exists in your document, you need to set override => true; if you do not set it, the filter will skip the translation. [documentation]
But in your case it is better not to use the mutate filter to add the field source.application at all; there is no need for it, since the field would only be overwritten by the translate filter.
Just use the translate filter on its own and it should work.
filter {
  translate {
    field => "[source][port]"
    destination => "[source][application]"
    dictionary => {
      "80" => "HTTP"
      "443" => "SSL"
      "5432" => "Postgresql"
    }
    fallback => "__NO_MATCH"
  }
}
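If you really want to keep the field added by mutate, the first option above also works; a minimal sketch with override set:
translate {
  field => "[source][port]"
  destination => "[source][application]"
  override => true
  dictionary => {
    "80" => "HTTP"
    "443" => "SSL"
    "5432" => "Postgresql"
  }
  fallback => "__NO_MATCH"
}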
Related
I have a testing environment to test some Logstash plugins before moving to production.
For now, I am using the Kiwi syslog generator to generate some syslog messages for testing.
The fields I have are as follows:
@timestamp
message
+ elastic metadata
Starting from these basic fields, I start filtering my data.
The first thing is to add a new field based on the timestamp and the message, as follows:
input {
  syslog {
    port => 514
  }
}
filter {
  prune {
    whitelist_names => ["timestamp", "message", "newfield", "message_count"]
  }
  mutate {
    add_field => { "newfield" => "%{@timestamp}%{message}" }
  }
}
The prune filter is just there to avoid processing unwanted data.
This works just fine: I get a new field containing those two values.
The next step was to run some aggregation based on specific content of the message, such as whether the message contains "logged in" or "logged out",
and to do this I used the aggregate filter:
grok {
  match => {
    "message" => [
      "(?<[@metadata][event_type]>logged out)",
      "(?<[@metadata][event_type]>logged in)",
      "(?<[@metadata][event_type]>workstation locked)"
    ]
  }
}
aggregate {
  task_id => "%{message}"
  code => "
    map['message_count'] ||= 0; map['message_count'] += 1;
  "
  push_map_as_event_on_timeout => true
  timeout_timestamp_field => "@timestamp"
  timeout => 60
  inactivity_timeout => 50
  timeout_tags => ['_aggregatetimeout']
}
This worked as expected, but I am having a problem here: when the aggregation times out, the only field populated on the resulting event is message_count.
As you can see in the above screenshot, newfield and message (the one on the far left; sorry, it didn't fit in the screenshot) are both empty.
For demonstration and testing purposes that is absolutely fine, but it will become unmanageable if I get hundreds of syslog messages per second and don't know which message each message_count refers to.
Please, I am struggling here and I don't know how to solve this issue. Can somebody please help me understand how I can fill newfield with the content of the message it refers to?
Here is my whole Logstash configuration, to make things easier.
input {
  syslog {
    port => 514
  }
}
filter {
  prune {
    whitelist_names => ["timestamp", "message", "newfield", "message_count"]
  }
  mutate {
    add_field => { "newfield" => "%{@timestamp}%{message}" }
  }
  grok {
    match => {
      "message" => [
        "(?<[@metadata][event_type]>logged out)",
        "(?<[@metadata][event_type]>logged in)",
        "(?<[@metadata][event_type]>workstation locked)"
      ]
    }
  }
  aggregate {
    task_id => "%{message}"
    code => "
      map['message_count'] ||= 0; map['message_count'] += 1;
    "
    push_map_as_event_on_timeout => true
    timeout_timestamp_field => "@timestamp"
    timeout => 60
    inactivity_timeout => 50
    timeout_tags => ['_aggregatetimeout']
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash_index"
  }
  stdout {
    codec => rubydebug
  }
  csv {
    path => "C:\Users\adminuser\Desktop\syslog\syslogs-%{+yyyy.MM.dd}.csv"
    fields => ["timestamp", "message", "message_count", "newfield"]
  }
}
push_map_as_event_on_timeout => true
When you use this and a timeout occurs, it creates a new event using the contents of the map. If you want fields from the original messages to be in the new event, you have to add them to the map. For the task_id there is a shorthand notation to do this using the timeout_task_id_field option on the filter; otherwise you have to add them explicitly:
map['newfield'] ||= event.get('newfield');
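Applied to your configuration, a sketch of the aggregate block with that extra map line (timeout_task_id_field is the shorthand mentioned above; it copies the task_id, here the message, back onto the timeout event; treat this as illustrative rather than tested):
aggregate {
  task_id => "%{message}"
  code => "
    map['message_count'] ||= 0; map['message_count'] += 1;
    # copy any fields you want on the timeout event into the map
    map['newfield'] ||= event.get('newfield');
  "
  push_map_as_event_on_timeout => true
  timeout_task_id_field => "message"
  timeout_timestamp_field => "@timestamp"
  timeout => 60
  inactivity_timeout => 50
  timeout_tags => ['_aggregatetimeout']
}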
This is the filter part in my logstash config file.
filter {
  mutate {
    split => ["message", "|"]
    add_field => {
      "start_time" => "%{[message][1]}"
      "end_time" => "%{[message][2]}"
      "channel" => "%{[message][5]}"
      "[range_time][gte]" => "%{[message][1]}"
      "[range_time][lte]" => "%{[message][2]}"
      # "duration" => "%{[end_time]-[start_time]}"
    }
    # remove_field => ["message"]
  }
  date {
    match => ["start_time", "yyyyMMddHHmmss"]
    target => "start_time"
  }
  date {
    match => ["end_time", "yyyyMMddHHmmss"]
    target => "end_time"
  }
  ruby {
    code => "
      event.set('start_time', event.get('start_time').to_i)
      event.set('end_time', event.get('end_time').to_i)
    "
  }
  mutate {
    remove_field => ["message", "@timestamp"]
  }
  ruby {
    init => "require 'time'"
    code => "event['duration'] = event['end_time'] - event['start_time'];"
  }
}
In the end, I want to create a new field named duration to represent the difference between end_time and start_time.
Obviously, the last ruby part is wrong. How should I write this part?
To start, make sure the field you want to put the duration in exists before you set its value.
Add the field up front.
As it will be numeric, you could do it like this:
mutate {
  add_field => {
    "duration" => 0
  }
}
After this you can calculate the value and set it using ruby:
ruby {
  code => "event.set('duration', event.get('end_time').to_i - event.get('start_time').to_i)"
}
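Putting that together with the date filters from the question, the relevant part of the filter block could look like this (a sketch, assuming both fields parse with the yyyyMMddHHmmss pattern; after the date filters they are timestamp objects, and to_i turns them into epoch seconds so the subtraction is a plain integer difference):
date {
  match => ["start_time", "yyyyMMddHHmmss"]
  target => "start_time"
}
date {
  match => ["end_time", "yyyyMMddHHmmss"]
  target => "end_time"
}
ruby {
  # duration in seconds; both fields are LogStash::Timestamp objects here
  code => "event.set('duration', event.get('end_time').to_i - event.get('start_time').to_i)"
}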
My input is a log from an IIS server with cookies included. I want my output (Elasticsearch) to have a field like this:
"cookies": {
"cookie_name": "cookie_value"
}
Also for some cookies I want their values to be replaced with some other values from a dictionary.
Basically, I think the following filter config solves my problem:
kv {
  source => "cookie"
  target => "cookies"
  trim => ";"
  include_keys => [ "cookie_name1", "cookie_name2" ]
}
translate {
  field => "cookies.cookie_name1"
  destination => "cookies.cookie_name1"
  dictionary_path => "/etc/logstash/dict.yaml"
  override => "true"
  fallback => "%{cookies.cookie_name1}"
}
The problem is that I don't know if this is the right way to do it, and whether it will work at all (especially the cookies.cookie_name part).
The correct way to do this is:
kv {
  source => "cookie"
  target => "cookies"
  field_split => ";+"
  include_keys => [ "cookie_name1", "cookie_name2" ]
}
translate {
  field => "[cookies][cookie_name1]"
  destination => "[cookies][cookie_name1]"
  dictionary_path => "/etc/logstash/dict.yaml"
  override => true
  fallback => "%{[cookies][cookie_name1]}"
}
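For reference, dictionary_path expects a YAML file mapping source values to replacements; a hypothetical /etc/logstash/dict.yaml (values invented purely for illustration) could look like:
"original_cookie_value1": "replacement_value1"
"original_cookie_value2": "replacement_value2"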
https://www.elastic.co/guide/en/logstash/current/event-dependent-configuration.html#logstash-config-field-references
https://www.elastic.co/guide/en/logstash/7.4/plugins-filters-kv.html
https://www.elastic.co/guide/en/logstash/7.4/plugins-filters-translate.html
I'm building out logstash and would like to build functionality to anonymize fields as specified in the message.
Given the message below, the field fta is a list of fields to anonymize. I would like to just use %{fta} and pass it through to the anonymize filter, but that doesn't seem to work.
{ "containsPII":"True", "fta":["f1","f2"], "f1":"test", "f2":"5551212" }
My config is as follows:
input {
  stdin { codec => json }
}
filter {
  if [containsPII] {
    anonymize {
      algorithm => "SHA1"
      key => "123456789"
      fields => %{fta}
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
The output is
{
    "containsPII" => "True",
            "fta" => [
        [0] "f1",
        [1] "f2"
    ],
             "f1" => "test",
             "f2" => "5551212",
       "@version" => "1",
     "@timestamp" => "2016-07-13T22:07:04.036Z",
           "host" => "..."
}
Does anyone have any thoughts? I have tried several permutations at this point with no luck.
Thanks,
-D
EDIT:
After posting in the Elastic forums, I found out that this is not possible using base Logstash functionality. I will try using the ruby filter instead. So, to amend my question: how do I call another filter from within the ruby filter? I tried the following with no luck and honestly can't even figure out where to look. I'm very new to Ruby.
filter {
  if [containsPII] {
    ruby {
      code => "event['fta'].each { |item| event[item] = LogStash::Filters::Anonymize.execute(event[item],'12345','SHA1') }"
      add_tag => ["Rubyrun"]
    }
  }
}
You can execute filters from a ruby script. The steps are:
Create the required filter instance in the init block of the inline ruby script.
For every event, call the filter method of that filter instance.
Following is an example for the above problem statement. It replaces the my_ip field in the event with its SHA1.
The same can be achieved using a ruby script file.
Following is the sample config file.
input { stdin { codec => json_lines } }
filter {
  ruby {
    init => "
      require 'logstash/filters/anonymize'
      # Create an instance of the filter with the applicable parameters
      @anonymize = LogStash::Filters::Anonymize.new({'algorithm' => 'SHA1',
                                                     'key' => '123456789',
                                                     'fields' => ['my_ip']})
      # Make sure to call register
      @anonymize.register
    "
    code => "
      # Invoke the filter
      @anonymize.filter(event)
    "
  }
}
output { stdout { codec => rubydebug { metadata => true } } }
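As a quick usage sketch (the config file name is a placeholder), feed it a JSON line on stdin:
echo '{"my_ip": "10.1.2.3"}' | bin/logstash -f anonymize.conf
and in the rubydebug output the my_ip value should come back replaced by its SHA1 digest.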
Well, I wasn't able to figure out how to call another filter from within a ruby filter, but I did get to the functional goal.
filter {
  if [fta] {
    ruby {
      init => "require 'openssl'"
      code => "event['fta'].each { |item| event[item] = OpenSSL::HMAC.hexdigest(OpenSSL::Digest::SHA256.new, '123456789', event[item]) }"
    }
  }
}
If the field fta exists, this hashes each of the fields listed in that array with HMAC-SHA256.
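Note that event['fta'] and event[item] use the legacy event API from Logstash 2.x; on Logstash 5.x and later the same idea needs event.get and event.set. A sketch under that assumption:
ruby {
  init => "require 'openssl'"
  code => "
    # hash every field named in the fta array with HMAC-SHA256
    event.get('fta').each { |item|
      event.set(item, OpenSSL::HMAC.hexdigest(OpenSSL::Digest::SHA256.new, '123456789', event.get(item)))
    }
  "
}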
Within some logstash configuration, I'd like to conditionally set the value of one field within a configuration block, without having to repeat the entire block.
So for example, I've got two http outputs with very similar settings:
output {
  if "foo" in [tags] {
    http {
      url => "http://example.com/x"
      http_method => "post"
      follow_redirects => true
      keepalive => false
      connect_timeout => 20
      cookies => true
    }
  } else {
    http {
      url => "http://example.com/y"
      http_method => "post"
      follow_redirects => true
      keepalive => false
      connect_timeout => 20
      cookies => true
    }
  }
}
What's the best way of avoiding repeating the content of those two blocks and changing just the single field I'm interested in? I was hoping I could either set a variable and use it within the block, or use an if within the block, but I couldn't find an example of either.
I was looking for something like the following sort of thing (which is invalid configuration):
output {
  http {
    if "foo" in [tags] {
      url => "http://example.com/x"
    } else {
      url => "http://example.com/y"
    }
    http_method => "post"
    follow_redirects => true
    keepalive => false
    connect_timeout => 20
    cookies => true
  }
}
Set a variable in your filter{} section (mutate->add_field, perhaps) and then refer to that variable in your output stanza, e.g.:
url => "%{myField}"
If you don't want to store the variable in elasticsearch, use metadata in both places, e.g.:
[@metadata][myField]
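Put together, a sketch of the metadata approach (myField is just a placeholder name):
filter {
  if "foo" in [tags] {
    mutate { add_field => { "[@metadata][myField]" => "http://example.com/x" } }
  } else {
    mutate { add_field => { "[@metadata][myField]" => "http://example.com/y" } }
  }
}
output {
  http {
    url => "%{[@metadata][myField]}"
    http_method => "post"
    follow_redirects => true
    keepalive => false
    connect_timeout => 20
    cookies => true
  }
}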