How to load a translate dictionary dynamically in Logstash based on a field value?
For example, my current configuration is:
if [host] == "1.1.1.1" {
  translate {
    field => "[netflow][input_snmp]"
    destination => "[netflow][interface_in]"
    dictionary_path => "/etc/logstash/yaml/1.1.1.1.yml"
  }
}
if [host] == "2.2.2.2" {
  translate {
    field => "[netflow][input_snmp]"
    destination => "[netflow][interface_in]"
    dictionary_path => "/etc/logstash/yaml/2.2.2.2.yml"
  }
}
Is there a generic way to achieve this?
Logstash version 2.2.4
Thanks

I guess you can use it like this:
translate {
  field => "[netflow][input_snmp]"
  destination => "[netflow][interface_in]"
  dictionary_path => "/etc/logstash/yaml/%{host}.yml"
}
See the sprintf section of the docs: https://www.elastic.co/guide/en/logstash/current/event-dependent-configuration.html#sprintf

You can't load dictionary files dynamically depending on a field value; it's not a question of syntax.
At least for the moment (the current Logstash version is 7.6.2).
All dictionary files are loaded into memory at Logstash startup (and, I suppose, after a configuration reload), before any event is processed.
Then the contents of the existing dictionary files are reloaded according to the refresh_interval option.
The dictionary paths can't be modified at run time depending on the current event.
In the Elastic discussion forums you can find further explanation (the first link even has a reference to the source code involved) and workarounds, but in the end it all revolves around the same idea shown in your config:
set a bunch of static dictionary file names and control their usage with conditionals. You may use environment variables in the dictionary_path, but they are resolved once per Logstash startup/reload, as sketched below.
https://discuss.elastic.co/t/dynamic-dictionary/138798/5
https://discuss.elastic.co/t/logstash-translate-plugin-dynamic-dictionary-path/129889
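For illustration, here is a minimal sketch of the environment-variable approach mentioned above (the EXPORTER_IP variable name is hypothetical, and this assumes a Logstash version that supports environment-variable substitution in the config):
filter {
  translate {
    field            => "[netflow][input_snmp]"
    destination      => "[netflow][interface_in]"
    # ${EXPORTER_IP} is substituted once when the pipeline is loaded, not per event.
    dictionary_path  => "/etc/logstash/yaml/${EXPORTER_IP}.yml"
    # The file contents (not the path) are re-read this often, in seconds.
    refresh_interval => 300
  }
}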

Related

How to read keys into an array?

I am trying to read the keys from a Hiera JSON file into an array.
The JSON is as follows:
{
  "network::interfaces": {
    "eth0": {
      "ip": "10.111.22.10"
    },
    "eth1": {
      "ip": "10.111.22.11"
    },
    "eth2": {
      "ip": "10.111.22.12"
    }
  }
}
In my Puppet code, I am doing this:
$network_interfaces = hiera_array('network::interfaces')
notice($network_interfaces)
Which results in the following:
Notice: Scope(Class[Role::Vagrant]): {eth0 => {ip => 10.111.22.10}, eth2 => {ip => 10.111.22.11}, eth3 => {ip => 10.111.22.12}}
But what I want are just the interfaces: [eth0, eth1, eth2]
Can someone let me know how to do this?
The difference between hiera_array() and plain hiera() has to do with what happens when the requested key (network::interfaces in your case) is present at multiple hierarchy levels. It has very little to do with what form you want the data in, and nothing to do with selecting bits and pieces of data structures. hiera_array() requests an "array-merge" lookup. The more modern lookup() function refers to this as the "unique" merge strategy.
It seems unlikely that an array-merge lookup is in fact what you want. Assuming it is not, the easiest thing to do is to read the whole hash and extract its keys:
$network_interfaces = keys(hiera('network::interfaces'))
In Puppet 4 you'll need to use the keys() function provided by the puppetlabs/stdlib module. From Puppet 5 on, that function appears in core Puppet.
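If you are on a newer Puppet, the same result can be had with the modern lookup() function mentioned above; a minimal sketch (keys() still comes from puppetlabs/stdlib on Puppet 4 and is core from Puppet 5 on):
# Read the whole network::interfaces hash and keep only its keys.
$network_interfaces = keys(lookup('network::interfaces'))
notice($network_interfaces)  # expected to print [eth0, eth1, eth2]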

Logstash Filter not working when something has a period in the name

So I need to write a filter that changes all the periods in field names to underscores. I am using mutate, and I can do some things but not others. For reference, here is my current output in Kibana.
See those fields that say "packet.event-id" and so forth? I need to rename all of those. Here is the filter that I wrote, and I do not know why it doesn't work:
filter {
  json {
    source => "message"
  }
  mutate {
    add_field => { "pooooo" => "AW CMON" }
    rename => { "offset" => "my_offset" }
    rename => { "packet.event-id" => "my_packet_event_id" }
  }
}
The problem is that I CAN add a field, and the renaming of "offset" WORKS. But when I try and do the packet one nothing changes. I feel like this should be simple and I am very confused as to why only the one with a period in it doesn't work.
I have refreshed the index in Kibana, and still nothing changes. Anyone have a solution?
When they show up in dotted notation in Kibana, it's because there is structure to the document you originally loaded in json format.
To access the document structure using logstash, you need to use [packet][event-id] in your rename filter instead of packet.event-id.
For example:
filter {
  mutate {
    rename => {
      "[packet][event-id]" => "my_packet_event_id"
    }
  }
}
You can do the JSON parsing directly in Filebeat by adding a few lines of config to your filebeat.yml.
filebeat.prospectors:
- paths:
    - /var/log/snort/snort.alert
  json.keys_under_root: true
  json.add_error_key: true
  json.message_key: log
You shouldn't need to rename the fields. If you do need to access a field in Logstash, you can reference it as [packet][length], for example. See Logstash field references for documentation on the syntax.
And by the way, there is a de_dot filter for replacing dots in field names, but it shouldn't be needed in this case.
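For reference, a minimal sketch of that de_dot filter (the field names here are hypothetical, and it is only worth the overhead if you really do end up with literal dots in field names):
filter {
  de_dot {
    # Replace "." with this character in the listed field names.
    separator => "_"
    fields    => ["packet.event-id", "packet.length"]
  }
}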

Puppet: How to gather selected node names in manifest?

I am trying to set up a set of nodes running various parts of the ELK stack. In particular, I've got a set of systems running Elasticsearch and I'd like to fill out my logstash config file so that it knows which systems have ES on them.
I can see something like this in the logstash config (obviously untested):
output {
  elasticsearch {
    hosts => [
      <%
      @es_hosts.each do |host|
        "#{host}",
      end
      -%>
    ]
  }
}
But what I can't figure out is how to collect the hostnames of the systems which are running Elasticsearch. I've got modules which apply RabbitMQ and ES, and they already export some resources, but this one looks like it just needs node names for merging into a list.
--------EDIT BELOW--------
I stumbled across datacat after examining some of the PF modules I use, and thought it might be a candidate. Here's what I've done, posted here because it's not working the way I would have expected.
On my elasticsearch nodes (there are several):
@@datacat_fragment { "${::hostname} in hosts":
  tag    => ['elasticsearch_cluster'],
  target => '/etc/logstash/conf.d/master.conf',
  data   => {
    host => ["${::hostname}"],
  },
}
Then, on the logstash node that needs to output to these ES nodes:
Datacat_fragment <| tag == 'elasticsearch_cluster' |>

datacat { '/etc/lostash/conf.d/master.conf':
  template => "${module_name}/logstash-master.conf.erb",
}
Finally, the template itself:
input { [...snip...] }
filter {}
output {
  elasticsearch {
    <% @data.keys.sort.each do |host| %>
    hosts => [
      <%= @data[host].sort.join(',') %>
    ]
    <% end %>
  }
}
Sadly, the result of all this is:
input { [...snip...] }
filter {}
output {
  elasticsearch {
  }
}
So at present, it looks like the exported resources aren't being instantiated as expected and I can't see why. If I add a datacat_fragment defined the same way but local to the logstash manifest, the data gets inserted into the .conf file just fine. It's just the ones from the ES nodes that are being ignored.
To further complicate matters, the input section needs to have a value inserted into it that's based on the system receiving the file. So there's one part that needs to behave like a traditional template, and another section that needs to have data inserted from multiple sources. Datacat looks promising, but is there another way to do this? Concat with an inline template somehow?

Getting logstash "fingerprint" filter to source every field

I'm using the fingerprint filter in Logstash to create a fingerprint field that I set to document_id in the elasticsearch output.
Configuration is as follows:
filter {
  fingerprint {
    method => "SHA1"
    key => "KEY"
  }
}
output {
  elasticsearch {
    host => localhost
    document_id => "%{fingerprint}"
  }
}
This defaults to the source being message, but how do I make it SHA1 the entire record and not just message? Note that what fields a record has depends on the message.
I think there is no built-in way to achieve this with the fingerprint plugin. Even the concatenate_sources option doesn't take all fields into account, and since your fields change you cannot list them manually as the source.
However, you might consider using the ruby filter to calculate a SHA1 hash over all of your fields. The following might do what you want.
filter {
  ruby {
    init => "require 'digest/sha1'; require 'json'"
    code => "event['fingerprint'] = Digest::SHA1.hexdigest event.to_json"
  }
}
I've just tested it and I get suitable SHA1 hashes regarding all fields.
To add to @hurb's solution: due to breaking changes in Logstash 5.x, the following seems to work:
ruby {
  init => "require 'digest/sha1'; require 'json'"
  code => "event.set('fingerprint', Digest::SHA1.hexdigest(event.to_json))"
}
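On newer versions of the fingerprint filter there also appears to be a concatenate_all_fields option; assuming your plugin version supports it, a sketch like this might replace the ruby workaround:
filter {
  fingerprint {
    method => "SHA1"
    key    => "KEY"
    # Hash a concatenation of all fields of the event rather than just the
    # message field (availability depends on the plugin version).
    concatenate_all_fields => true
    target => "fingerprint"
  }
}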

Logstash conditional to check if tag exists?

Is there any way in logstash to use a conditional to check if a specific tag exists?
For example,
grok {
  match => [
    "message", "Some expression to match|%{GREEDYDATA:NOMATCHES}"
  ]
}
If NOMATCHES exists, do something.
How do I verify whether the NOMATCHES tag exists or not?
Thanks.
Just so we're clear: the config snippet you provided is setting a field, not a tag.
Logstash events can be thought of as a dictionary of fields. A field named tags is referenced by many plugins via add_tag and remove_tag operations.
You can check if a tag is set:
if "foo" in [tags] {
  ...
}
But you seem to want to check if a field contains anything:
if [NOMATCHES] =~ /.+/ {
  ...
}
The above will check that NOMATCHES exists and isn't empty.
Reference: configuration file overview.
The following test for existence also works [tested in Logstash 1.4.2], although it may not validate that the field is non-empty:
if [NOMATCHES] {
  ...
}
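For completeness, a minimal sketch that ties the two answers together, assuming you actually want a tag rather than a field (the tag name, pattern, and file path here are hypothetical):
filter {
  grok {
    match => { "message" => "Some expression to match" }
    # When the pattern does not match, grok adds the tags listed here
    # instead of the default "_grokparsefailure".
    tag_on_failure => ["nomatch"]
  }
}
output {
  if "nomatch" in [tags] {
    # Do something with the events that did not match, e.g. dump them to a file.
    file { path => "/tmp/unmatched.log" }
  }
}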
