elapsed + aggregate passing custom fields in Logstash - logstash

I am using the elapsed plugin to calculate elapsed time, and then the aggregate plugin to display it.
I added custom fields to the elapsed filter.
You can see it below:
add_field => {
"status" => "Status"
"User" => "%{byUser}"
}
One is static; the other is dynamic, coming from the event.
In the Logstash output only the static value is displayed, not the dynamic one: it shows the literal %{byUser} for the dynamic field.
The task id and status fields work just fine and I get the right values.
Any idea why?
A little bit more code:
elapsed {
unique_id_field => "assetId"
start_tag => "tag1:tag2"
end_tag => "tag3:tag4"
add_field => {
"wasInStatus" => "tag3"
"User" => "%{byUser}"
}
add_tag => ["CustomTag"]
}
grok input:
grok {
match => [
"message", "%{TIMESTAMP_ISO8601:timestamp} %{NUMBER:assetId} %{WORD:event}:%{WORD:event1} User:%{USERNAME:byUser}"]
if "CustomTag" in [tags] and "elapsed" in [tags] {
aggregate {
task_id => "%{assetId}"
code => "event.to_hash.merge!(map)"
map_action => "create_or_update"
}
}
The problem is connected with this elapsed filter option:
new_event_on_match => true/false
Changing new_event_on_match to false (it was true in my pipeline) fixed the issue, but I still wonder why.

I also faced a similar issue and found a fix for it. When new_event_on_match => true is used, the elapsed event is separated from the original log and a new elapsed event is written to Elasticsearch, as below:
{
"_index": "elapsed_index_name",
"_type": "doc",
"_id": "DzO03mkBUePwPE-nv6I_",
"_version": 1,
"_score": null,
"_source": {
"execution_id": "dfiegfj3334fdsfsdweafe345435",
"elapsed_timestamp_start": "2019-03-19T15:18:34.218Z",
"tags": [
"elapsed",
"elapsed_match"
],
"#timestamp": "2019-04-02T15:39:40.142Z",
"host": "3f888b2ddeec",
"cus_code": "Custom_name", [This is a custom field]
"elapsed_time": 41.273,
"#version": "1"
},
"fields": {
"#timestamp": [
"2019-04-02T15:39:40.142Z"
],
"elapsed_timestamp_start": [
"2019-03-19T15:18:34.218Z"
]
},
"sort": [
1554219580142
]
}
For adding the "cus_code" to the elapsed event object from the original log (log from where the elapsed filter end tag is detected), I added an aggregate filter as below:
if "elapsed_end_tag" in [tags] {
aggregate {
task_id => "%{execution_id}"
code => "map['cus_code'] = event.get('custom_code_field_name')"
map_action => "create"
}
}
and added the end block of the aggregation, conditioned on the 'elapsed' tag:
if "elapsed" in [tags] {
aggregate {
task_id => "%{execution_id}"
code => "event.set('cus_code', map['cus_code'])"
map_action => "update"
end_of_task => true
timeout => 400
}
}
So, to add a custom field to the elapsed event, we need to combine the aggregate filter with the elapsed filter.
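Putting both pieces together for the original question, a minimal, untested sketch could look like the following. It only reuses the field names and tags from the question (assetId, byUser, tag1:tag2, tag3:tag4); because elapsed only closes a pair on events that carry the configured end_tag, that same tag is used here to spot the closing log line, and, as with any aggregate setup, it assumes Logstash runs with a single pipeline worker.
filter {
  grok {
    match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{NUMBER:assetId} %{WORD:event}:%{WORD:event1} User:%{USERNAME:byUser}" ]
  }
  elapsed {
    unique_id_field => "assetId"
    start_tag => "tag1:tag2"
    end_tag => "tag3:tag4"
    new_event_on_match => true
    add_tag => ["CustomTag"]
  }
  # On the original log line that closes the pair, stash the dynamic value in the map.
  if "tag3:tag4" in [tags] {
    aggregate {
      task_id => "%{assetId}"
      code => "map['User'] = event.get('byUser')"
      map_action => "create"
    }
  }
  # On the generated elapsed event, copy the stashed value onto the event.
  if "elapsed" in [tags] {
    aggregate {
      task_id => "%{assetId}"
      code => "event.set('User', map['User'])"
      map_action => "update"
      end_of_task => true
      timeout => 400
    }
  }
}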

All messages receive a "user level notice"

I'm trying to parse messages from my network devices, which send messages in a format similar to:
<30>Feb 14 11:33:59 wireless: ath0 Sending auth to xx:xx:xx:xx:xx:xx. Status: The request has been declined due to MAC ACL (52).\n
<190>Feb 14 11:01:29 CCR00 user admin logged out from xx.xx.xx.xx via winbox
<134>2023 Feb 14 11:00:33 ZTE command-log:An alarm 36609 level notification occurred at 11:00:33 02/14/2023 CET sent by MCP GponRm notify: <gpon-onu_1/1/1:1> SubType:1 Pos:1 ONU Uni lan los. restore\n on \n
using this logstash.conf file
input {
beats {
port => 5044
}
tcp {
port => 50000
}
udp {
port => 50000
}
}
## Add your filters / logstash plugins configuration here
filter {
grok {
match => {
"message" => "^(?:<%{POSINT:syslog_pri}>)?%{GREEDYDATA:message_payload}"
}
}
syslog_pri {
}
mutate {
remove_field => [ "#version" , "message" ]
}
}
output {
stdout {}
elasticsearch {
hosts => "elasticsearch:9200"
user => "logstash_internal"
password => "${LOGSTASH_INTERNAL_PASSWORD}"
}
}
which results in this output
{
"#timestamp": [
"2023-02-14T10:38:59.228Z"
],
"data_stream.dataset": [
"generic"
],
"data_stream.namespace": [
"default"
],
"data_stream.type": [
"logs"
],
"event.original": [
"<14> Feb 14 11:38:59 UBNT BOXSERV[boxs Req]: boxs.c(691) 55381193 %% Error 17 occurred reading thermal sensor 2 data\n\u0000"
],
"host.ip": [
"10.125.132.10"
],
"log.syslog.facility.code": [
1
],
"log.syslog.facility.name": [
"user-level"
],
"log.syslog.severity.code": [
5
],
"log.syslog.severity.name": [
"notice"
],
"message_payload": [
" Feb 14 11:38:59 UBNT[boxs Req]: boxs.c(691) 55381193 %% Error 17 occurred reading thermal sensor 2 data\n\u0000"
],
"syslog_pri": [
"14"
],
"_id": "UzmBT4YBAZPdbqc4m_IB",
"_index": ".ds-logs-generic-default-2023.02.04-000001",
"_score": null
}
which is mostly satisfactory, but I would expect the log.syslog.facility.name and log.syslog.severity.name fields to be processed by the syslog_pri filter, so that an input of <14> would result in secur/auth and Alert respectively.
Instead I keep getting the default user-level notice for all my messages, no matter what the priority part of the syslog message contains.
Could anyone advise and maybe fix my .conf syntax, if it's wrong? Thank you very much!
I have Logstash configured properly to receive logs and send them to Elasticsearch, but the grok/syslog_pri combination doesn't yield the expected results.
The fact that the syslog_pri filter is setting [log][syslog][facility][code] shows that it has ECS compatibility enabled. As a result, if you do not set the syslog_pri_field_name option on the syslog_pri filter, it will try to parse [log][syslog][priority]. If that field does not exist then it will parse the default value of 13, which is user-level/notice.
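To make that concrete, here is a minimal, untested sketch of the first option based on the answer above: grok the PRI value straight into [log][syslog][priority], the field the ECS-enabled filter reads by default, so no extra option is needed (the message_payload field name is taken from the question). The alternative, keeping a custom field name and pointing the filter at it with syslog_pri_field_name, is exactly what the asker applied below.
filter {
  # Capture the <PRI> number into the ECS field that syslog_pri reads by default.
  grok {
    match => { "message" => "^(?:<%{POSINT:[log][syslog][priority]}>)?%{GREEDYDATA:message_payload}" }
  }
  # With ECS compatibility enabled, this now parses the captured priority
  # instead of falling back to the default of 13 (user-level/notice).
  syslog_pri { }
}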
Thank you for the answer. I have adjusted the configuration per the given advice:
filter {
grok {
match => { "message" => "^(?:<%{POSINT:syslog_code}>)?%{GREEDYDATA:message_payload}" }
}
syslog_pri {
syslog_pri_field_name => "syslog_code"
}
mutate {
remove_field => [ "@version" , "message" ]
}
}
and now it behaves as intended
"event" => {
"original" => "<30>Feb 15 18:41:04 dnsmasq-dhcp[960]: DHCPACK(eth0) 10.0.0.165 xx:xx:xx:xx:xx CZ\n"
},
"#timestamp" => 2023-02-15T17:41:04.977038615Z,
"message_payload" => "Feb 15 18:41:04 dnsmasq-dhcp[960]: DHCPACK(eth0) 10.0.0.165 xx:xx:xx:xx:xx CZ\n",
"log" => {
"syslog" => {
"severity" => {
"code" => 6,
"name" => "informational"
},
"facility" => {
"code" => 3,
"name" => "daemon"
}
}
},
"syslog_code" => "30",
"host" => {
"ip" => "xx.xx.xx.xx"
} }
I will adjust the message a bit to fit my needs, but that is out of the scope of this question.
Thank you very much!

LogStash Conf | Drop Empty Lines

The contents of Logstash's conf file look like this:
input {
beats {
port => 5044
}
file {
path => "/usr/share/logstash/iway_logs/*"
start_position => "beginning"
sincedb_path => "/dev/null"
#ignore_older => 0
codec => multiline {
pattern => "^\[%{NOTSPACE:timestamp}\]"
negate => true
what => "previous"
max_lines => 2500
}
}
}
filter {
grok {
match => { "message" =>
['(?m)\[%{NOTSPACE:timestamp}\]%{SPACE}%{WORD:level}%{SPACE}\(%{NOTSPACE:entity}\)%{SPACE}%{GREEDYDATA:rawlog}'
]
}
}
date {
match => [ "timestamp", "yyyy-MM-dd'T'HH:mm:ss.SSS"]
target => "#timestamp"
}
grok {
match => { "entity" => ['(?:W.%{GREEDYDATA:channel}:%{GREEDYDATA:inlet}:%{GREEDYDATA:listener}\.%{GREEDYDATA:workerid}|W.%{GREEDYDATA:channel}\.%{GREEDYDATA:workerid}|%{GREEDYDATA:channel}:%{GREEDYDATA:inlet}:%{GREEDYDATA:listener}\.%{GREEDYDATA:workerid}|%{GREEDYDATA:channel}:%{GREEDYDATA:inlet}:%{GREEDYDATA:listener}|%{GREEDYDATA:channel})']
}
}
dissect {
mapping => {
"[log][file][path]" => "/usr/share/logstash/iway_logs/%{serverName}#%{configName}#%{?ignore}.log"
}
}
}
output {
elasticsearch {
hosts => "${ELASTICSEARCH_HOST_PORT}"
index => "iway_"
user => "${ELASTIC_USERNAME}"
password => "${ELASTIC_PASSWORD}"
ssl => true
ssl_certificate_verification => false
cacert => "/certs/ca.crt"
}
}
As one can make out, the idea is to parse a custom log employing multiline extraction. The extraction does its job. The log occasionally contains an empty first line. So:
[2022-11-29T12:23:15.073] DEBUG (manager) Generic XPath iFL functions use full XPath 1.0 syntax
[2022-11-29T12:23:15.074] DEBUG (manager) XPath 1.0 iFL functions use iWay's full syntax implementation
which naturally is causing Kibana to report an empty line:
In an attempt to suppress this line from being sent to ES, I added the following as a last filter item:
if ![message] {
drop { }
}
if [message] =~ /^\s*$/ {
drop { }
}
The resulting JSON payload to ES:
{
"#timestamp": [
"2022-12-09T14:09:35.616Z"
],
"#version": [
"1"
],
"#version.keyword": [
"1"
],
"event.original": [
"\r"
],
"event.original.keyword": [
"\r"
],
"host.name": [
"xxx"
],
"host.name.keyword": [
"xxx"
],
"log.file.path": [
"/usr/share/logstash/iway_logs/localhost#iCLP#iway_2022-11-29T12_23_33.log"
],
"log.file.path.keyword": [
"/usr/share/logstash/iway_logs/localhost#iCLP#iway_2022-11-29T12_23_33.log"
],
"message": [
"\r"
],
"message.keyword": [
"\r"
],
"tags": [
"_grokparsefailure"
],
"tags.keyword": [
"_grokparsefailure"
],
"_id": "oRc494QBirnaojU7W0Uf",
"_index": "iway_",
"_score": null
}
While this does drop the empty first line, it also unfortunately interferes with the multiline operation on other lines. In other words, the multiline operation does not work anymore. What am I doing incorrectly?
Use of the following variation resolved the issue:
if [message] =~ /\A\s*\Z/ {
drop { }
}
This solution is based on Badger's answer provided on the Logstash forums, where this question was raised as well.
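The difference matters because in Ruby regular expressions ^ and $ match at every embedded line break, so a multiline event that merely contains a blank line matches /^\s*$/ and gets dropped, while \A and \Z anchor the whole string. A quick illustration in plain Ruby (the sample string is made up):
# ^ and $ match at embedded newlines; \A and \Z anchor the entire string.
event_message = "line one\n\nline three"  # a multiline event containing a blank line

event_message =~ /^\s*$/    # => 9   (matches the embedded blank line, so the whole event would be dropped)
event_message =~ /\A\s*\Z/  # => nil (no match: the event as a whole is not blank, so it is kept)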

Aggregate on aggregate filter in Logstash?

I have a stock table where I have stocks for shops/products:
input {
jdbc {
statement => "SELECT ShopId, ProductCode, Quantity FROM stock ORDER BY productcode;"
}
}
then I have a simple filter to aggregate that data:
filter {
aggregate {
task_id => "%{productcode}"
code => "
map['productcode'] ||= event.get('productcode')
map['objectID'] ||= event.get('productcode')
map['stocks'] ||= []
map['stocks'] << {
'ShopId' => event.get('ShopId'),
'quantity' => event.get('quantity'),
}
event.cancel()
"
push_previous_map_as_event => true
timeout => 3
}
}
which gives me output I expect, for example:
{
"productcode": "123",
"objectID": "123",
"stocks": [
{
"ShopId": 1
"Quantity": 2
},
{
"ShopId": 2
"Quantity": 5
}
]
}
Now I can push that data to Algolia via the http output plugin.
But the issue is that there are thousands of objects, which results in thousands of calls.
That's why I am thinking of using the batch endpoint and packing the objects into packages of e.g. 1000, but to do so I need to adjust the structure to:
{
"requests": [
{
"action": "addObject",
"body": {
"productcode": "123",
"objectID": "123",
...
}
},
{
"action": "addObject",
"body": {
"productcode": "456",
"objectID": "456",
...
}
}
]
}
which looks to me like another aggregate step, but here is what I already tried:
aggregate {
task_id => "%{source}"
code => "
map['requests'] ||= []
map['requests'] << {
'action' => 'addObject',
'body' => {
'productcode' => event.get('productcode'),
'objectId' => event.get('objectID'),
'stocks' => event.get('stocks')
}
}
event.cancel()
"
push_previous_map_as_event => true
timeout => 3
}
but it does not work.
Also, with this type of aggregate I'm not able to configure how big the packages sent to the batch output should be.
I would be very grateful for any help or clues.
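One possible direction, sketched very roughly and untested: number the already-aggregated product events with a ruby filter, then run a second aggregate keyed on that batch number, so each map is pushed as soon as the next batch starts (or on timeout for the last one). The batch size of 1000 and the [@metadata][batch_id] helper field are made up for illustration, and, like any aggregate setup, this assumes pipeline.workers is set to 1.
# Runs after the first aggregate, i.e. on the per-product events it pushes.
ruby {
  init => "@counter = 0"
  code => "
    event.set('[@metadata][batch_id]', (@counter / 1000).to_s)
    @counter += 1
  "
}
aggregate {
  task_id => "%{[@metadata][batch_id]}"
  code => "
    map['requests'] ||= []
    map['requests'] << {
      'action' => 'addObject',
      'body' => {
        'productcode' => event.get('productcode'),
        'objectID' => event.get('objectID'),
        'stocks' => event.get('stocks')
      }
    }
    event.cancel()
  "
  push_previous_map_as_event => true
  timeout => 3
}
Each pushed event would then carry a requests array of up to 1000 entries that can be sent to the batch endpoint via the http output.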

Logstash make a copy a nested field with mutate.add_field

I wanted to make a copy of a nested field in a Logstash filter but I can't figure out the correct syntax.
Here is what I tried.
Incorrect syntax:
mutate {
add_field => { "received_from" => %{beat.hostname} }
}
beat.hostname is not replaced
mutate {
add_field => { "received_from" => "%{beat.hostname}" }
}
beat.hostname is not replaced
mutate {
add_field => { "received_from" => "%{[beat][hostname]}" }
}
beat.hostname is not replaced
mutate {
add_field => { "received_from" => "%[beat][hostname]" }
}
No luck. If I use a non-nested field it works as expected.
The data structure received by logstash is the following:
{
"#timestamp" => "2016-08-24T13:01:28.369Z",
"beat" => {
"hostname" => "etg-dbs-master-tmp",
"name" => "etg-dbs-master-tmp"
},
"count" => 1,
"fs" => {
"device_name" => "/dev/vdb",
"total" => 5150212096,
"used" => 99287040,
"used_p" => 0.02,
"free" => 5050925056,
"avail" => 4765712384,
"files" => 327680,
"free_files" => 326476,
"mount_point" => "/opt/ws-etg/datas"
},
"type" => "filesystem",
"#version" => "1",
"tags" => [
[0] "topbeat"
],
"received_at" => "2016-08-24T13:01:28.369Z",
"received_from" => "%[beat][hostname]"
}
EDIT:
Since you didn't show your input message I worked off your output. In your output the field you are trying to copy into already exists, which is why you need to use replace. If it does not exist, you do indeed need to use add_field. I updated my answer for both cases.
EDIT 2: I realised that your problem might be accessing the nested value, so I added that as well :)
you are using the mutate filter wrong/backwards.
First mistake:
You want to replace a field, not add one. In the docs, it gives you the "replace" option. See: https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-replace
Second mistake, you are using the syntax in reverse. It appears that you believe this is true:
"text I want to write" => "Field I want to write it in"
While this is true:
"myDestinationFieldName" => "My Value to be in the field"
With this knowledge, we can now do this:
mutate {
replace => { "[test][a]" => "%{s}"}
}
or if you want to actually add a NEW NOT EXISTING FIELD:
mutate {
add_field => {"[test][myNewField]" => "%{s}"}
}
Or add a new existing field with the value of a nested field:
mutate {
add_field => {"some" => "%{[test][a]}"}
}
Or, in more detail, here is my example:
input {
stdin {
}
}
filter {
json {
source => "message"
}
mutate {
replace => { "[test][a]" => "%{s}"}
add_field => {"[test][myNewField]" => "%{s}"}
add_field => {"some" => "%{[test][a]}"}
}
}
output {
stdout { codec => rubydebug }
}
This example takes stdin and outputs to stdout. It uses a json filter to parse the message, and then the mutate filter to replace the nested field. I also add a completely new field in the nested test object.
And finally it creates a new field "some" that has the value of test.a.
So for this message:
{"test" : { "a": "hello"}, "s" : "to_Repalce"}
We want to replace test.a (value: "Hello") with s (Value: "to_Repalce"), and add a field test.myNewField with the value of s.
On my terminal:
artur#pandaadb:~/dev/logstash$ ./logstash-2.3.2/bin/logstash -f conf2/
Settings: Default pipeline workers: 8
Pipeline main started
{"test" : { "a": "hello"}, "s" : "to_Repalce"}
{
"message" => "{\"test\" : { \"a\": \"hello\"}, \"s\" : \"to_Repalce\"}",
"#version" => "1",
"#timestamp" => "2016-08-24T14:39:52.002Z",
"host" => "pandaadb",
"test" => {
"a" => "to_Repalce",
"myNewField" => "to_Repalce"
},
"s" => "to_Repalce"
"some" => "to_Repalce"
}
The value has successfully been replaced.
A field "some" with the replaced value has been added.
A new field in the nested object has been added.
If you use add_field on an existing field, it will convert a into an array and append your value there.
Hope this solves your issue,
Artur
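As an aside that is not part of the answer above: if your Logstash ships a recent enough mutate filter, it also has a copy option, which duplicates an existing (possibly nested) field in one step, for example:
mutate {
  # Copy the nested source field into a new top-level field.
  copy => { "[beat][hostname]" => "received_from" }
}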

Kibana4 geo map Error - not showing the client_ip field

I am trying to get the Kibana 4 geo map to work with ELB logs.
When I click the Discover tab I can clearly see a field geoip.location with values of [lat, lon],
but when I click the Visualize tab -> Tile map -> new search -> Geo coordinates,
I get an error (what the error is isn't shown anywhere; I've also checked the Kibana logs, but nothing is there).
I checked with Inspect Element: also nothing.
I then select Geohash, but the field is empty (when I click on it, it's blank with a check icon).
How can I see what the error is?
How can I get this map to work?
My config is:
input {
file {
path => "/logstash_data/logs/elb/**/*"
exclude => "*.gz"
type => "elb"
start_position => "beginning"
sincedb_path => "log_sincedb"
}
}
filter {
if [type] == "elb" {
grok {
match => [
"message", '%{TIMESTAMP_ISO8601:timestamp} %{NGUSERNAME:loadbalancer} %{IP:client_ip}:%{POSINT:client_port} (%{IP:backend_ip}:%{POSINT:backend_port}|-) %{NUMBER:request_processing_time} %{NUMBER:backend_processing_time} %{NUMBER:response_processing_time} %{POSINT:elb_status_code} %{INT:backend_status_code} %{NUMBER:received_bytes} %{NUMBER:sent_bytes} \\?"%{WORD:method} https?://%{WORD:request_subdomain}.server.com:%{POSINT:request_port}%{URIPATH:request_path}(?:%{URIPARAM:query_string})? %{NOTSPACE}"'
]
}
date {
match => [ "timestamp", "ISO8601" ]
target => "#timestamp"
}
if [query_string] {
kv {
field_split => "&?"
source => "query_string"
prefix => "query_string_"
}
mutate {
remove => [ "query_string" ]
}
}
if [client_ip] {
geoip {
source => "client_ip"
add_tag => [ "geoip" ]
}
}
if [timestamp] {
ruby { code => "event['log_timestamp'] = event['@timestamp'].strftime('%Y-%m-%d')"}
}
}
}
output {
elasticsearch {
cluster => "ElasticSearch"
host => "elasticsearch.server.com"
port => 9300
protocol => "node"
manage_template => true
template => "/etc/logstash/lib/logstash/outputs/elasticsearch/elasticsearch-template.json"
index => "elb-%{log_timestamp}"
}
}
The geoip mapping did not work in my case because my index names did not start with logstash-.
If you want a custom index name to get the geo-ip mapping, you must create a template for that index name
and use it in the elasticsearch output:
elasticsearch {
manage_template => true
template => "/etc/logstash/templates/custom_template.json"
}
your template should look like this
{
"template" : "index_name-*",
"settings" : {
"index.refresh_interval" : "5s"
},
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true, "omit_norms" : true},
"dynamic_templates" : [ {
"message_field" : {
"match" : "message",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true
}
}
}, {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
}
}
}
} ],
"properties" : {
"#version": { "type": "string", "index": "not_analyzed" },
"geoip" : {
"type" : "object",
"dynamic": true,
"properties" : {
"location" : { "type" : "geo_point" }
}
}
}
}
}
}
On our maps, we specify a field geoip.location which according to the documentation is automatically created by the geoip filter.
Can you see that field in discover? If not, can you try amending your geoip filter to
if [client_ip] {
geoip {
source => "client_ip"
add_tag => [ "geoip" ]
target => "geoip"
}
}
and see if you can now see geoip.location in new entries?
The elasticsearch templates look for the "geoip" target when creating the associated geoip fields.
Once we have the geoip.location being created, we can create a new map with the following steps in Kibana 4.
Click on visualise
Choose 'Tile Map' from the list of visualisation types
Select either new search or saved - we're using a saved search that filters out Apache entries, but as long as the data contains geoip.location you should be good
Select the 'geo coordinates' bucket type - you'll have an error flagged at this point
In 'aggregation' dropdown, select 'geohash'
In 'field' dropdown, select 'geoip.location'
