Logstash splitting and tagging events

I need to ingest JSON events in the following format:
{
  "test1": {some nested json here},
  "test2": {some nested json here},
  "test3": {some nested json here},
  "test4": {some nested json here}
}
I have three problems. First, when I do the split:
json {
  source => "message"
}
split {
  field => "[message][test1]"
  target => "test1"
  add_tag => ["test1"]
}
the tag doesn't appear anywhere (I want to use it later in the output).
The second problem is with the output. Right now I can ship the split data with:
tcp {
  codec => line { format => "%{test1}" }
  host => "127.0.0.1"
  port => 7515
  id => "TCP-SPLUNK-test1"
}
I can do the same for all the split items, but I guess there is a cleverer way to do it.
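What I have in mind is routing on the tags in the output, roughly like this sketch (assuming the tags from the split actually get applied):
output {
  if "test1" in [tags] {
    tcp {
      codec => line { format => "%{test1}" }
      host => "127.0.0.1"
      port => 7515
      id => "TCP-SPLUNK-test1"
    }
  }
}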
The last one is a question about identifying events, like:
if the format is { "test1":{},"test2":{},"test3":{},"test4":{} } then do something, else do something different.
I guess this should be done with grok, but I'll play with that after I manage to fix the first two issues.

Related

Logstash Filter, add field from split array, if not empty

I am doing a split on two fields and assigning different array elements to new fields. However, when an element doesn't exist, the literal reference ends up assigned to the field, e.g. "%{variable}".
I assume I could do 5 if statements on the array elements to see if they are present before assigning them to the new fields, but this seems a very messy way of doing it. Is there a better way to only assign when populated?
split => { "HOSTALIAS" => ", " }
split => { "HOSTGROUP" => "," }
add_field => {
  "host-group" => "%{[HOSTGROUP][0]}"
  "ci_alias" => "%{[HOSTALIAS][0]}"
  "blueprint-id" => "%{[HOSTALIAS][1]}"
  "instance-id" => "%{[HOSTALIAS][2]}"
  "vm-location" => "%{[HOSTALIAS][3]}"
}
You could use the grok filter. Here we drop failing messages, but we could deal with them differently; see the alternative sketch after the example.
filter {
  grok {
    match => [ "HOSTALIAS", "%{WORD:ci_alias},%{WORD:blueprint-id},%{WORD:instance-id},%{WORD:vm-location}" ]
  }
  if "_grokparsefailure" in [tags] {
    drop { }
  }
}
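If dropping is too harsh, a minimal sketch of one alternative is to keep the event and just mark it (grok_incomplete is a hypothetical tag name chosen for illustration):
filter {
  grok {
    match => [ "HOSTALIAS", "%{WORD:ci_alias},%{WORD:blueprint-id},%{WORD:instance-id},%{WORD:vm-location}" ]
    # on failure, tag the event instead of adding the default _grokparsefailure;
    # the new fields are simply left unset rather than the event being dropped
    tag_on_failure => ["grok_incomplete"]
  }
}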

convert string to array based on pattern in logstash

My original data:
{
  message: {
    data: "["1,2","3,4","5,6"]"
  }
}
Now I want to convert the value of the data field to an array.
So it should become:
{
  message: {
    data: ["1,2", "3,4", "5,6"]
  }
}
By using
mutate {
  gsub => ["data", "[\[\]]", ""]
}
I got rid of the square brackets. After this, I tried splitting on commas, but that won't work, since my data contains commas as well.
I tried writing a dissect block, but that was not useful either.
So how should I go ahead with this?
Have you tried the json filter? If the data field always contains valid JSON, you can use the json filter like this:
json {
  source => "data"
  target => "data"
}
Using target => "data" will overwrite the data field.
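Note that in the sample above the data field is nested under message, so the field reference may need the full path; a sketch assuming that nesting:
json {
  # [message][data] is Logstash's field reference syntax for the nested field
  source => "[message][data]"
  target => "[message][data]"
}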

Logstash dissect with key=value, comma

I have a pattern of logs that contain performance and statistical data. I have configured Logstash to dissect this data as CSV in order to save the values to Elasticsearch.
<1>,www1,3,BISTATS,SCAN,330,712.6,2035,17.3,221.4,656.3
I am using the following Logstash filter and getting the desired results:
grok {
  match => { "Message" => "\A<%{POSINT:priority}>,%{DATA:pan_host},%{DATA:pan_serial_number},%{DATA:pan_type},%{GREEDYDATA:message}\z" }
  overwrite => [ "Message" ]
}
csv {
  separator => ","
  columns => ["pan_scan","pf01","pf02","pf03","kk04","uy05","xd06"]
}
This is currently working well for me as long as the order of the columns doesn't get messed up.
However, I want to make this log file more meaningful and have each column name in the original log, for example:
<1>,www1,30000,BISTATS,SCAN,pf01=330,pf02=712.6,pf03=2035,kk04=17.3,uy05=221.4,xd06=656.3
This way I can keep inserting or appending key/value pairs in the middle of the process without corrupting the data. (Using Logstash 5.3.)
By using @baudsp's recommendations, I was able to formulate the following. I deleted the csv {} block completely and replaced it with a kv {} block. The kv {} filter automatically created all the key/value pairs, leaving me to only mutate {} the fields into floats and integers (sketched below).
json {
  source => "message"
  remove_field => [ "message", "headers" ]
}
date {
  match => [ "timestamp", "YYYY-MM-dd'T'HH:mm:ss.SSS'Z'" ]
  target => "timestamp"
}
grok {
  match => { "Message" => "\A<%{POSINT:priority}>,%{DATA:pan_host},%{DATA:pan_serial_number},%{DATA:pan_type},%{GREEDYDATA:message}\z" }
  overwrite => [ "Message" ]
}
kv {
  allow_duplicate_values => false
  field_split_pattern => ","
}
Using the above block, I was able to insert the K=V pairs anywhere in the message. Thanks again for all the help. I have added a sample code block for anyone trying to accomplish this task.
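For completeness, the mutate {} step mentioned above would look roughly like this sketch (the type per field is inferred from the sample values, so adjust as needed):
mutate {
  # convert the kv-generated string values to numeric types
  convert => {
    "pf01" => "integer"
    "pf02" => "float"
    "pf03" => "integer"
    "kk04" => "float"
    "uy05" => "float"
    "xd06" => "float"
  }
}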
Note: I am using NLog for logging, which produces JSON output. From the C# code, the format looks like this:
var logger = NLog.LogManager.GetCurrentClassLogger();
logger.ExtendedInfo("<1>,www1,30000,BISTATS,SCAN,pf01=330,pf02=712.6,pf03=2035,kk04=17.3,uy05=221.4,xd06=656.3");

How to easily promote a JSON member to the main event level? [duplicate]

This question already has an answer here:
Eliminate the top-level field in Logstash
(1 answer)
Closed 5 years ago.
I'm using an http_poller to hit an API endpoint for some info I want to index with Elasticsearch. The result is in JSON and is a list of records, looking like this:
{
  "result": [
    {...},
    {...},
    ...
  ]
}
Each result object in the array is what I really want to turn into an event that gets indexed in Elasticsearch, so I tried using the split filter to turn the object into a series of events instead. It worked reasonably well, but now I have a series of events that look like this:
{
  result: { ... }
}
My current filter looks like this:
filter {
  if [type] == "history" {
    split {
      field => "result"
    }
  }
}
Each of those result objects has about 20 fields, most of which I want, so while I know I can transform them by doing something along the lines of:
filter {
  if [type] == "history" {
    split {
      field => "result"
    }
    mutate {
      add_field => { "field1" => "%{[result][field1]}" }
      # ... x15-20 more fields
      remove_field => [ "result" ]
    }
  }
}
But with so many fields, I was hoping there's a one-liner to just copy all the fields of the result value up to the top level of the event.
This can be done with a ruby filter like this:
ruby {
  code => '
    # promote each key/value inside result to the top level of the event,
    # then drop the original result field
    if (event.get("result"))
      event.get("result").each { |k,v|
        event.set(k,v)
      }
      event.remove("result")
    end
  '
}
I don't know of any way to do this with any of the built-in/publicly available filters.

How to get logstash to store all of the words from one field as an array in another field

I am struggling with the filters in Logstash.
I am trying to take a well-structured JSON stream (I am using a Twitter feed for test data) and augment the data. One of our needs is to take an existing field, such as message, and store all of the unique tokens (in this case simple space-delimited words).
In the long run we would like to be able to use Elasticsearch analyzers to break the message down into normalized chunks (using stemming, stopwords, lowercasing, etc.).
The desired goal is to take something like:
{
  "@timestamp": "2016-10-12T19:01:33.000Z",
  "message": "The quickest Brown fox",
  ...
}
and get something like:
{
  "@timestamp": "2016-10-12T19:01:33.000Z",
  "message": "The quickest Brown fox",
  "tokens": ["The", "quickest", "Brown", "fox"],
  ...
}
and ultimately like this:
{
  "@timestamp": "2016-10-12T19:01:33.000Z",
  "message": "The quickest Brown fox",
  "tokens": ["quick", "brown", "fox"],
  ...
}
I feel like I am pounding my head against a wall. Any help pointing me in the right direction would be appreciated.
Thanks
This can be done easily using the mutate filter:
mutate {
  split => { "message" => " " }
}
This tells Logstash to split the field named "message" on the separator given in the hash, in this case a single space.
Test:
artur@pandaadb:~/dev/logstash$ ./logstash-2.3.2/bin/logstash -f conf2/
Settings: Default pipeline workers: 8
Pipeline main started
The quickest Brown Fox
{
       "message" => [
        [0] "The",
        [1] "quickest",
        [2] "Brown",
        [3] "Fox"
    ],
      "@version" => "1",
    "@timestamp" => "2016-10-13T13:46:20.509Z",
          "host" => "pandaadb"
}
The result is that the message field becomes an array of elements.
Alternatively, you can copy the message field into a tokens field before splitting it. This has to be done in two separate mutate blocks for it to work:
mutate {
  add_field => { "tokens" => "%{message}" }
}
mutate {
  split => { "tokens" => " " }
}
The first mutate adds a new field called tokens with the content of the message field, while the second one splits the tokens field on spaces. Two separate blocks are needed because add_field, like the other common filter options, is applied after the mutate operations themselves, so a split in the same block would run before the tokens field exists.
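With the same test input as before, the event would then look roughly like this (a sketch extrapolated from the earlier output, not a captured run):
{
       "message" => "The quickest Brown Fox",
        "tokens" => [
        [0] "The",
        [1] "quickest",
        [2] "Brown",
        [3] "Fox"
    ],
      "@version" => "1",
    "@timestamp" => "2016-10-13T13:46:20.509Z",
          "host" => "pandaadb"
}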
Hope that helps,
Artur
