How to generate new fields in Logstash

I need to generate a new field (loglevel) with Logstash and finally display it in Kibana.
How do I extract this log and build a grok pattern for it?
How do I create the loglevel field in the Logstash configuration?

I found this page helpful when I was setting up my log4net filter. Based on what your logs look like, you'll end up with something like this (copied from that page):
filter {
  if [type] == "log4net" {
    grok {
      match => [ "message", "(?m)%{LOGLEVEL:level} %{TIMESTAMP_ISO8601:sourceTimestamp} %{DATA:logger} \[%{NUMBER:threadId}\] \[%{IPORHOST:tempHost}\] %{GREEDYDATA:tempMessage}" ]
    }
    mutate {
      replace => [ "message", "%{tempMessage}" ]
      replace => [ "host", "%{tempHost}" ]
      remove_field => [ "tempMessage" ]
      remove_field => [ "tempHost" ]
    }
  }
}
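If you want to check the pattern against a few lines before wiring it into your real pipeline, a minimal test sketch like the one below (my own addition, assuming a stdin/stdout pipeline rather than your actual inputs) lets you paste a log line and inspect the extracted fields, including level:
# hypothetical test pipeline, not part of the original answer
input { stdin { type => "log4net" } }
filter {
  grok {
    match => [ "message", "(?m)%{LOGLEVEL:level} %{TIMESTAMP_ISO8601:sourceTimestamp} %{DATA:logger} \[%{NUMBER:threadId}\] \[%{IPORHOST:tempHost}\] %{GREEDYDATA:tempMessage}" ]
  }
}
output { stdout { codec => rubydebug } }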

Yes, I found an answer now. Below is the configuration I use for creating new fields and filtering some of them.
filter {
  multiline {
    pattern => "^%{TIMESTAMP_ISO8601}"
    what => "previous"
    negate => true
  }
  # Delete trailing whitespaces
  mutate {
    strip => "message"
  }
  # Delete \n from messages
  mutate {
    gsub => ['message', "\n", " "]
  }
  # Delete \r from messages
  mutate {
    gsub => ['message', "\r", " "]
  }
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:time} \[%{NUMBER:thread}\] %{LOGLEVEL:loglevel} %{JAVACLASS:class} - %{GREEDYDATA:msg}" }
  }
  if "Exception" in [msg] {
    mutate {
      add_field => { "msg_error" => "%{msg}" }
    }
  }
}
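For reference, a hypothetical log line such as the one below would be split by the grok pattern above into time, thread, loglevel, class and msg; loglevel then shows up as its own field in Kibana:

2017-11-05 10:25:34,112 [28] ERROR com.example.MyService - Connection refused

# notional result: time => "2017-11-05 10:25:34,112", thread => "28",
#                  loglevel => "ERROR", class => "com.example.MyService",
#                  msg => "Connection refused"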

Related

Using if-else with Logstash split

I have a string field called description delimited with _.
I split it as follows:
filter {
  mutate {
    split => ["description", "_"]
    add_field => {"location" => "%{[description][3]}"}
  }
}
How can I check if the split values are empty or not?
I have attempted:
if !["%{[description][3]}"] {
# do something
}
if ![[description][3]] {
# do something
}
if ![description][3] {
# do something
}
None of them work.
The goal is to have the value of the new field location as its actual value or a generic value such as NA.
You made a really simple mistake with your mutate split.
This:
mutate {
  split => ["description", "_"]
  add_field => {"location" => "%{[description][3]}"}
}
should have been:
mutate {
  split => { "description" => "_" }   # <=== note: comma removed, => added
  add_field => {"location" => "%{[description][3]}"}
}
Here is a sample I tested with:
filter {
  mutate {
    remove_field => ["headers", "@version"]
    add_field => { "description" => "Python_Java_ruby_perl " }
  }
  mutate {
    split => {"description" => "_"}
  }
  if [description][4] {
    mutate {
      add_field => {"result" => "The 4th field exists"}
    }
  } else {
    mutate {
      add_field => {"result" => "The 4th field DOES NOT exist"}
    }
  }
}
And the result on the console (since there is no 4th element, it went to the else block):
{
  "host" => "0:0:0:0:0:0:0:1",
  "result" => "The 4th field DOES NOT exist",   <==== from else block
  "@timestamp" => 2020-01-14T19:35:41.013Z,
  "message" => "hello",
  "description" => [
    [0] "Python",
    [1] "Java",
    [2] "ruby",
    [3] "perl "
  ]
}
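To get back to the original goal of populating location with either the real value or a generic one, the same existence check can choose between the split element and "NA". A minimal sketch (my own, not part of the answer above), assuming the token you want sits at index 3:
mutate {
  split => { "description" => "_" }
}
if [description][3] {
  mutate {
    add_field => { "location" => "%{[description][3]}" }
  }
} else {
  mutate {
    # fall back to a generic value when the element is missing
    add_field => { "location" => "NA" }
  }
}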

Grok help for a custom metric

I have a log line like this:
09 Nov 2018 15:51:35 DEBUG api.MapAnythingProvider - Calling API For Client: XXX Number of ELEMENTS Requested YYY
I want to ignore all other log lines and keep only those that contain the words "Calling API For Client". Further, I am only interested in the string XXX and the number YYY.
Thanks for the help.
input {
  file {
    path => ["C:/apache-tomcat-9.0.7/logs/service/service.log"]
    sincedb_path => "nul"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => {
      "message" => "%{MONTHDAY:monthDay} %{MONTH:mon} %{YEAR:year} %{TIME:ts} %{WORD:severity} %{JAVACLASS:claz} - %{GREEDYDATA:logmessage}"
    }
  }
  grok {
    match => {
      "logmessage" => "%{WORD:keyword} %{WORD:customer} %{WORD:key2} %{NUMBER:mapAnythingCreditsConsumed:float} %{WORD:key3} %{NUMBER:elementsFromCache:int}"
    }
  }
  if "_grokparsefailure" in [tags] {
    drop {}
  }
  mutate {
    remove_field => [ "monthDay", "mon", "ts", "severity", "claz", "keyword", "key2", "path", "message", "year", "key3" ]
  }
}
output {
  if [logmessage] =~ /ExecutingJobFor/ {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "test"
      manage_template => false
    }
    stdout {
      codec => rubydebug
    }
  }
}
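Since only lines containing "Calling API For Client" are of interest, a more direct alternative to relying on _grokparsefailure is a conditional drop on the raw message. A sketch of my own, assuming the phrase appears verbatim in the message field:
filter {
  # discard every event that does not mention the API call we care about
  if "Calling API For Client" not in [message] {
    drop {}
  }
}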

Can't grok multiline logs

I have logs where each event is:
ExitNode FF33F91CC06B6CC5C3EE804E7D8DBE42CB5707F9
Published 2017-11-05 02:55:09
LastStatus 2017-11-05 04:02:27
ExitAddress 66.42.224.235 2017-11-05 04:06:26
I tried to use multiline:
input {
  file {
    path => "/path/input"
  }
}
filter {
  multiline {
    pattern => "^\b[A-Za-z]{8}\b"
    what => "next"
  }
}
filter {
  multiline {
    pattern => "^\b[A-Za-z]{8}\b"
    what => "next"
  }
}
filter {
  multiline {
    pattern => "^\b[A-Za-z]{11}\b"
    what => "previous"
  }
}
output {
  file {
    codec => rubydebug
    path => "/path/output"
  }
}
And I get something like this:
{
  "path" => "/path/input",
  "@timestamp" => 2017-11-05T10:25:34.112Z,
  "@version" => "1",
  "host" => "HOST",
  "message" => "ExitNode FE3CB742E73674F1BC2382723209ECEE44AD4AEC\nPublished 2017-11-04 20:34:55\nLastStatus 2017-11-04 21:03:26\nExitAddress 77.250.227.12 2017-11-04 21:06:45",
  "tags" => [
    [0] "multiline"
  ]
}
And I can't grok this message field because I don't know how to remove or replace \n and gsub => ["message", "\n", "Line_Break"] doesn't work properly.
Thanks
From the comment of @baudsp:
mutate {
  gsub => [ "message", "[\r\n]", "_" ]
}
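Once the line breaks have been replaced, the joined message can be grokked in one pass. This is a sketch of my own (assuming the underscore separator from the mutate above; the field names are illustrative, not from the original thread):
grok {
  match => { "message" => "ExitNode %{BASE16NUM:exit_node}_Published %{TIMESTAMP_ISO8601:published}_LastStatus %{TIMESTAMP_ISO8601:last_status}_ExitAddress %{IP:exit_address} %{TIMESTAMP_ISO8601:exit_seen}" }
}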

Use Logstash with HTML log

I'm new to Logstash and am trying to use it to parse an HTML log file.
I need to output only the log lines, i.e. ignore the preceding JS, CSS and HTML that are also included in the file.
A log line in the file looks like this:
<tr bgcolor="tomato"><td>Jan 28<br>13:52:25.692</td><td>Jan 28<br>13:52:23.950</td><td>qtp114615276-1648 [POST] [call_id:-8009072655119858507]</td><td>REST</td><td>sa</td><td>0.0.0.0</td><td>ERR</td><td>ProjectValidator.validate(36)</td><td>Project does not exist</td></tr>
I have no problem getting all the lines, but I would like to have an output which contains only the relevant ones, without HTML tags, and looks something like that:
{
  "db_timestamp": "2015-01-28 13:52:25.692",
  "server_timestamp": "2015-01-28 13:52:25.950",
  "node": "qtp114615276-1648 [POST] [call_id:-8009072655119858507]",
  "thread": "REST",
  "user": "sa",
  "ip": "0.0.0.0",
  "level": "ERR",
  "method": "ProjectValidator.validate(36)",
  "message": "Project does not exist"
}
My Logstash configuration is:
input {
  file {
    type => "request"
    path => "<some path>/*.log"
    start_position => "beginning"
  }
  file {
    type => "log"
    path => "<some path>/*.html"
    start_position => "beginning"
  }
}
filter {
  if [type] == "log" {
    grok {
      match => [ WHAT SHOULD I PUT HERE??? ]
    }
  }
}
output {
  stdout {}
  if [type] == "request" {
    http {
      http_method => "post"
      url => "http://<some url>"
      mapping => ["type", "request", "host", "%{host}", "timestamp", "%{@timestamp}", "message", "%{message}"]
    }
  }
  if [type] == "log" {
    http {
      http_method => "post"
      url => "http://<some url>"
      mapping => [ ALSO WHAT SHOULD I PUT HERE??? ]
    }
  }
}
Is there a way to do that? So far I haven't found any relevant documentation or samples.
Thanks!
Finally figured out the answer.
Not sure this is the best or most elegant solution, but it works.
I changed the http output format to "message", which let me override and format the whole message as JSON instead of using mapping. I also found out how to name parameters in the grok filter and use them in the output.
This is the new Logstash configuration file:
input {
  file {
    type => "request"
    path => "<some path>/*.log"
    start_position => "beginning"
  }
  file {
    type => "log"
    path => "<some path>/*.html"
    start_position => "beginning"
  }
}
filter {
  if [type] == "log" {
    grok {
      match => { "message" => "<tr bgcolor=.*><td>%{MONTH:db_date}%{SPACE}%{MONTHDAY:db_date}<br>%{TIME:db_date}</td><td>%{MONTH:alm_date}%{SPACE}%{MONTHDAY:alm_date}<br>%{TIME:alm_date}</td><td>%{DATA:thread}</td><td>%{DATA:req_type}</td><td>%{DATA:username}</td><td>%{IP:ip}</td><td>%{DATA:level}</td><td>%{DATA:method}</td><td>%{DATA:err_message}</td></tr>" }
    }
  }
}
output {
  stdout { codec => rubydebug }
  if [type] == "request" {
    http {
      http_method => "post"
      url => "http://<some URL>"
      mapping => ["type", "request", "host", "%{host}", "timestamp", "%{@timestamp}", "message", "%{message}"]
    }
  }
  if [type] == "log" {
    http {
      format => "message"
      content_type => "application/json"
      http_method => "post"
      url => "http://<some URL>"
      message => '{
        "db_date": "%{db_date}",
        "alm_date": "%{alm_date}",
        "thread": "%{thread}",
        "req_type": "%{req_type}",
        "username": "%{username}",
        "ip": "%{ip}",
        "level": "%{level}",
        "method": "%{method}",
        "message": "%{err_message}"
      }'
    }
  }
}
Note the single quote for the http message block and the double quotes for the parameters inside this block.
For anyone parsing HP ALM logs, the following Logstash filter will do the work:
grok {
  break_on_match => true
  match => [ "message", "<tr bgcolor=.*><td>%{MONTH:db_date_mon}%{SPACE}%{MONTHDAY:db_date_day}<br>%{TIME:db_date_time}<\/td><td>%{MONTH:alm_date_mon}%{SPACE}%{MONTHDAY:alm_date_day}<br>%{TIME:alm_date_time}<\/td><td>(?<thread_col1>.*?)<\/td><td>(?<request_type>.*?)<\/td><td>(?<login>.*?)<\/td><td>(?<ip>.*?)<\/td><td>(?<level>.*?)<\/td><td>(?<method>.*?)<\/td><td>(?m:(?<log_message>.*?))</td></tr>" ]
}
mutate {
  add_field => ["db_date", "%{db_date_mon} %{db_date_day}"]
  add_field => ["alm_date", "%{alm_date_mon} %{alm_date_day}"]
  remove_field => [ "db_date_mon", "db_date_day", "alm_date_mon", "alm_date_day" ]
  gsub => [
    "log_message", "<br>", "
"
  ]
  gsub => [
    "log_message", "<p>", " "
  ]
}
Tested and working fine with Logstash 2.4.0
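If you also want @timestamp to reflect the log's own time, a follow-up date filter can be appended after the mutate above. This is a sketch of my own (not part of the original answer), assuming the db_date and db_date_time fields produced by the HP ALM filter, a two-digit day as in the sample line, and that the missing year should default to the current one:
# hypothetical follow-up: combine the pieces and parse them into @timestamp
mutate {
  add_field => { "db_timestamp" => "%{db_date} %{db_date_time}" }
}
date {
  # no year in the log line, so the date filter falls back to the current year
  match => [ "db_timestamp", "MMM dd HH:mm:ss.SSS" ]
}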

File input add_field not adding field to every row

I am parsing several log files from different load-balanced server clusters with my Logstash config and would like to add a field "log_origin" to each file's entries to make filtering easier later on.
Here's my input -> file config in a simple example:
input {
  file {
    type => "node1"
    path => "C:/Development/node1/log/*"
    add_field => [ "log_origin", "live_logs" ]
  }
  file {
    type => "node2"
    path => "C:/Development/node2/log/*"
    add_field => [ "log_origin", "live_logs" ]
  }
  file {
    type => "node3"
    path => "C:/Development/node1/log/*"
    add_field => [ "log_origin", "live_logs" ]
  }
  file {
    type => "node4"
    path => "C:/Development/node1/log/*"
    add_field => [ "log_origin", "live_logs" ]
  }
}
filter {
  grok {
    match => [
      "message", "%{DATESTAMP:log_timestamp}%{SPACE}\[%{DATA:class}\]%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}%{GREEDYDATA:log_message}"
    ]
  }
  date {
    match => [ "log_timestamp", "dd.MM.YY HH:mm:ss", "ISO8601" ]
    target => "@timestamp"
  }
  mutate {
    lowercase => ["loglevel"]
    strip => ["loglevel"]
  }
  if "_grokparsefailure" in [tags] {
    multiline {
      pattern => ".*"
      what => "previous"
    }
  }
  if [fields.log_origin] == "live_logs" {
    if [type] == "node1" {
      mutate {
        add_tag => "realsServerName1"
      }
    }
    if [type] == "node2" {
      mutate {
        add_tag => "realsServerName2"
      }
    }
    if [type] == "node3" {
      mutate {
        add_tag => "realsServerName3"
      }
    }
    if [type] == "node4" {
      mutate {
        add_tag => "realsServerName4"
      }
    }
  }
}
output {
  stdout { }
  elasticsearch { embedded => true }
}
I would have expected Logstash to add this field with the given value to every log entry it finds, but it doesn't. Maybe I am taking the wrong approach here?
Edit: I am not able to retrieve the logs directly from the nodes, but have to copy them over to my "server". Otherwise I would be able to just use the file path to distinguish the different clusters...
Edit: It's working now. I should have cleaned my data in between; old entries without the added field cluttered up my results.
The add_field option expects a hash. It should be:
add_field => {
  "log_origin" => "live_logs"
}
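Applied to the original input section, that would look like the sketch below (one file block shown; the other nodes follow the same shape). Note that add_field on the file input creates a top-level field, so the later conditional would be if [log_origin] == "live_logs" rather than [fields.log_origin]:
file {
  type => "node1"
  path => "C:/Development/node1/log/*"
  # hash form instead of the array form used in the question
  add_field => { "log_origin" => "live_logs" }
}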
