Math functions in Logstash

I would like to perform mathematical operations on the input received in Logstash, but I cannot find a filter that does this.
The input is as follows:
{
  "user_id": "User123",
  "date": "2016 Jun 26 12:00:12",
  "data": {
    "doc_name": "mydocs.xls",
    "doc_size": "8526587"
  }
}
The "doc_size" field will have bytes, I would like to add a new field say "doc_size_mb" which will contain the size in MB's.
So I want a simple division operation here like:
doc_size_mb = doc_size/(1024*1024)
I found a link saying that Logstash has a math filter, but it does not appear in the list of available filter plugins.

The logstash-filter-math is not a core plugin, but it is available here. You can follow these steps in order to install it:
> git clone https://github.com/robin13/logstash-filter-math.git
> cd logstash-filter-math
> gem build logstash-filter-math.gemspec
> $LS_HOME/bin/logstash-plugin install logstash-filter-math-0.2.gem
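Once installed, the filter can do the division directly. A minimal sketch, assuming the calculate syntax of later plugin versions ([operation, left operand, right operand, target field]):
filter {
  math {
    # divide the byte count by 1024*1024 and store the result in a new field
    calculate => [
      [ "divide", "[data][doc_size]", 1048576, "[data][doc_size_mb]" ]
    ]
  }
}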
If you don't want to install a 3rd party plugin just for that, you can also easily achieve the same computation with a ruby filter:
filter {
  ruby {
    code => "event['data']['doc_size_mb'] = event['data']['doc_size'].to_i / (1024 * 1024)"
  }
}
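One caveat: .to_i combined with / is integer division in Ruby, so 8526587 bytes comes out as exactly 8. If you want the fractional part, converting to a float first should work (same pre-5.x event API as above):
filter {
  ruby {
    # .to_f keeps the fractional part, e.g. 8526587 bytes -> ~8.13 MB
    code => "event['data']['doc_size_mb'] = event['data']['doc_size'].to_f / (1024 * 1024)"
  }
}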

I tried using the above approach in Logstash 7.0.1 to multiply an existing field by a factor and update the field's value with the scaled result, but it did not work as expected.
I modified it to use the Event API's set() and get() methods, which worked for me.
Initial approach (did not work) -
filter {
  ruby {
    code => "event['data']['myField'] = event['data']['myField'].to_i * 0.25"
  }
}
Working solution -
filter {
  ruby {
    code => "event.set('myField', event.get('myField') * 0.25)"
  }
}
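If the field arrives as a string (as in the JSON input above), multiplying it directly would raise a TypeError; a small sketch with an explicit conversion, using the same hypothetical field name:
filter {
  ruby {
    # to_f converts a string value before the arithmetic
    code => "event.set('myField', event.get('myField').to_f * 0.25)"
  }
}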

The math filter or a ruby filter are options for the general case of doing math in Logstash, but for this specific use case (converting byte sizes) there is also the bytes filter.

Karate test framework: retry until complex condition on arrays

I'm using the Karate framework with JUnit.
Using this feature:
Given path 'save_token'
And request
"""
{
  "token": "test_token"
}
"""
And retry until response.tokens ==
"""
[
  "test_token"
]
"""
When method POST
I'm getting this exception:
java.lang.ArrayIndexOutOfBoundsException: 1
at com.intuit.karate.core.MethodMatch.convertArgs(MethodMatch.java:60)
at com.intuit.karate.core.Engine.executeStep(Engine.java:141)
at com.intuit.karate.core.ScenarioExecutionUnit.execute(ScenarioExecutionUnit.java:171)
When response.tokens list is empty:
{
  "tokens": []
}
I don't understand why == does not work in this case (it should return false, and keep retrying).
Thanks in advance!
The retry until expression has to be pure JavaScript. The special Karate match keywords such as contains are not supported, and you can't do a "deep equals" like you are attempting, because that is not possible in plain JS either.
EDIT: in 0.9.6 onwards you can do a complex match in JS: https://stackoverflow.com/a/50350442/143475
Also note that JsonPath is not supported, which means * or .. cannot appear in the expression.
So if your response is { "tokens": [ "value1" ] }, you can do this:
And retry until response.tokens.includes('value1')
Or:
And retry until response.tokens[0] == 'value1'
To experiment, you can try expressions like this:
* def response = { "tokens": [ "value1" ] }
* assert response.tokens.includes('value1')
At run time, you can use JS to take care of conditions when the response is not yet ready while polling:
And retry until response.tokens && response.tokens.length
EDIT: actually a more elegant way to do the above is shown below, because karate.get() gracefully handles a JS or JsonPath evaluation failure and returns null:
And retry until karate.get('response.tokens.length')
Or if you are dealing with XML, you can use the karate.xmlPath() API:
And retry until karate.xmlPath(response, '//result') == 5
And if you really want to use the power of Karate's match syntax, you can use the JS API:
And retry until karate.match(response, { tokens: '##[_ > 0]' }).pass
Note that if you have more complex logic, you can always wrap it into a re-usable function:
* def isValid = function(x){ return karate.match(x, { tokens: '##[_ > 0]' }).pass }
# ...
And retry until isValid(response)
Finally if none of the above works, you can always switch to a custom polling routine: polling.feature
EDIT: also see this answer for an example of how to use karate.filter() instead of JsonPath: https://stackoverflow.com/a/60537602/143475

Using variables in Puppet array two at a time

I have an each iteration in Puppet to install Perl module extensions:
$extensions_list = ["extension1",
"extension2",
]
$extensions_list.each |$extls| {
exec { $extls:
path => '/usr/local/bin/:/usr/bin/:/bin/',
command => "wget http://search.cpan.org/CPAN/authors/id/B/BP/BPS/extension1-1.00.tar.gz",
}
}
What I would like it to do as well is to take into account the version number, as in:
$extensions_list = ["extension1", "1.00",
"extension2", "2.00",
]
$extensions_list.each |$extls| {
exec { $extls:
path => '/usr/local/bin/:/usr/bin/:/bin/',
command => "wget http://search.cpan.org/CPAN/authors/id/B/BP/BPS/extension1-1.00.tar.gz",
}
}
So I'd like it to take the first two values in the array to install the first extension, then the next two for the second, and so on as I add new extensions. That way I can just add the name and version number to my array and each will be installed in turn.
I have an each iteration in Puppet to install Perl module extensions:
Well, no, not exactly. You have an each function that declares an Exec resource for each element of your array. One potentially important distinction from what you said is that the iteration is evaluated during catalog building, so no actual installations take place at that time.
So, I'd like it to be able to take the first two variables in the array to install the first extension and then the next two and install that and so on
You could use the slice() function to split the array into an array of two-element arrays and iterate over that (see the sketch after the example below). Consider, however, how much more natural it would be to use a hash instead of an array as the underlying data structure. Example:
$extensions_hash = {"extension1" => "1.00",
                    "extension2" => "2.00",
}
$extensions_hash.each |$extls, $extv| {
  exec { $extls:
    path    => '/usr/local/bin/:/usr/bin/:/bin/',
    command => "wget http://search.cpan.org/CPAN/authors/id/B/BP/BPS/${extls}-${extv}.tar.gz",
  }
}
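For completeness, the slice() approach mentioned above might look roughly like this (an untested sketch; slice(2) hands each consecutive pair to a two-parameter block):
$extensions_list = ["extension1", "1.00",
                    "extension2", "2.00",
]
$extensions_list.slice(2) |$name, $version| {
  exec { $name:
    path    => '/usr/local/bin/:/usr/bin/:/bin/',
    command => "wget http://search.cpan.org/CPAN/authors/id/B/BP/BPS/${name}-${version}.tar.gz",
  }
}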

Logstash Filter not working when something has a period in the name

So I need to write a filter that changes all the periods in field names to underscores. I am using mutate, and I can do some things but not others. For reference, my current output in Kibana shows fields named "packet.event-id" and so forth; I need to rename all of those. Here is the filter I wrote, and I do not know why it doesn't work:
filter {
  json {
    source => "message"
  }
  mutate {
    add_field => { "pooooo" => "AW CMON" }
    rename => { "offset" => "my_offset" }
    rename => { "packet.event-id" => "my_packet_event_id" }
  }
}
The problem is that I CAN add a field, and the renaming of "offset" WORKS. But when I try to do the packet one, nothing changes. I feel like this should be simple, and I am very confused as to why only the field with a period in it doesn't work.
I have refreshed the index in Kibana, and still nothing changes. Anyone have a solution?
When they show up in dotted notation in Kibana, it's because the document you originally loaded in JSON format has nested structure.
To access the document structure using logstash, you need to use [packet][event-id] in your rename filter instead of packet.event-id.
For example:
filter {
  mutate {
    rename => {
      "[packet][event-id]" => "my_packet_event_id"
    }
  }
}
You can do the JSON parsing directly in Filebeat by adding a few lines of config to your filebeat.yml.
filebeat.prospectors:
- paths:
    - /var/log/snort/snort.alert
  json.keys_under_root: true
  json.add_error_key: true
  json.message_key: log
You shouldn't need to rename the fields. If you do need to access a field in Logstash you can reference the field as [packet][length] for example. See Logstash field references for documentation on the syntax.
And by the way, there is a de_dot filter for replacing dots in field names, but it shouldn't be needed in this case.
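For reference, a minimal de_dot sketch (it renames fields whose names literally contain dots, using "_" as the default separator):
filter {
  de_dot {
    # would rename a literal "packet.event-id" field to "packet_event-id"
    fields => ["packet.event-id"]
  }
}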

How to choose a translate dictionary dynamically in Logstash based on a field value?

How can I choose a translate dictionary dynamically in Logstash based on a field value?
For example, my current configuration is:
if [host] == "1.1.1.1" {
  translate {
    field           => "[netflow][input_snmp]"
    destination     => "[netflow][interface_in]"
    dictionary_path => "/etc/logstash/yaml/1.1.1.1.yml"
  }
}
if [host] == "2.2.2.2" {
  translate {
    field           => "[netflow][input_snmp]"
    destination     => "[netflow][interface_in]"
    dictionary_path => "/etc/logstash/yaml/2.2.2.2.yml"
  }
}
Is there a generic way to achieve this?
Logstash version 2.2.4
Thanks
I guess you can use it as:
translate {
  field           => "[netflow][input_snmp]"
  destination     => "[netflow][interface_in]"
  dictionary_path => "/etc/logstash/yaml/%{host}.yml"
}
See the sprintf documentation: https://www.elastic.co/guide/en/logstash/current/event-dependent-configuration.html#sprintf
You can't load dictionary files dynamically depending on a field value; it's not a question of syntax.
At least for the moment (current logstash version is 7.6.2)
All dictionary files are loaded in memory at logstash startup (and I suppose after a logstash configuration reload), before any event is processed.
Then the contents of the existing dictionary files are dynamically reloaded according to the refresh_interval option.
The dictionary paths can't be modified "at run time" depending on the current event.
In the Elastic discussion forums you can find further explanation (the first link even references the source code involved) and workarounds, but in the end they revolve around the same idea shown in your config:
set a bunch of static dictionary file names and control their usage with conditionals. You may use environment variables in the dictionary_path but they will be used once per logstash startup/reload.
https://discuss.elastic.co/t/dynamic-dictionary/138798/5
https://discuss.elastic.co/t/logstash-translate-plugin-dynamic-dictionary-path/129889
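For example, an environment variable in the path might look like the sketch below (DICT_FILE is a hypothetical variable name; it is expanded once at startup/reload, not per event):
translate {
  field           => "[netflow][input_snmp]"
  destination     => "[netflow][interface_in]"
  dictionary_path => "/etc/logstash/yaml/${DICT_FILE}.yml"
}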

Getting logstash "fingerprint" filter to source every field

I'm using the fingerprint filter in Logstash to create a fingerprint field that I set to document_id in the elasticsearch output.
Configuration is as follows:
filter {
  fingerprint {
    method => "SHA1"
    key    => "KEY"
  }
}
output {
  elasticsearch {
    host        => "localhost"
    document_id => "%{fingerprint}"
  }
}
This defaults to the source being message, but how do I make it SHA1 the entire record and not just message? Note that which fields a record has depends on the message.
I think there is no built-in way to achieve this with the fingerprint plugin. Even the concatenate_sources option doesn't pick up all fields, and since your fields change you cannot list them manually as the source.
However, you might consider using the ruby plugin to calculate an SHA1 hash regarding all of your fields. Following might do what you want.
filter {
  ruby {
    init => "require 'digest/sha1'; require 'json'"
    code => "event['fingerprint'] = Digest::SHA1.hexdigest event.to_json"
  }
}
I've just tested it and I get suitable SHA1 hashes over all fields.
To add to hurb's solution: with Logstash 5.x, due to breaking changes in the event API, the following seems to work:
ruby {
  init => "require 'digest/sha1'; require 'json'"
  code => "event.set('fingerprint', Digest::SHA1.hexdigest(event.to_json))"
}
