How to transform log messages using Logstash?

I have log messages like the ones below:
2021-03-26 11:49:25.575: 2021-03-26 11:49:25.575 [INFO] 10.0.3.12 - "POST https://api.kr-seo.assistant.watson.cloud.ibm.com/instances/a33da834-a7a7-48c2-9bf6-d3207849ad71/v1/workspaces/c6e3035b-411a-468d-adac-1ae608f7bf68/message?version=2018-07-10" 200 462 ms
2021-03-26 11:49:26.514: 2021-03-26 11:49:26.514 [INFO] 10.0.3.12 + "POST http://test-bff.lotteon.com/order/v1/mylotte/getOrderList"
I want to transform them using Logstash into something like this:
"timestamp" : "2021-03-26 11:49:26.514",
"logLevel" : "INFO",
"IP" : "10.0.3.12",
"inout" : "-",
"Method" : "POST",
"url" : "https://api.kr-seo.assistant.watson.cloud.ibm.com/instances/a33da834-a7a7-48c2-9bf6-d3207849ad71/v1/workspaces/c6e3035b-411a-468d-adac-1ae608f7bf68/message?version=2018-07-10",
"status" : "200",
"duration" : "462 ms"
If the inout field is '+', the status and duration fields should be empty ('').
How can I write the Logstash grok filter for this? (grok, mutate, or any other filter is fine.)
Help me..!

filter {
  grok { match => [ "message", "%{GREEDYDATA:predata} (?<inout>[-+]) \"%{GREEDYDATA:postdata}\"" ] }
  if [inout] == "+" {
    grok { match => [ "message", "%{DATESTAMP:timestamp}: %{GREEDYDATA:data} \[%{LOGLEVEL:loglevel}\] %{IP:IP} (?<inout>[-+]) \"%{WORD:method} %{URI:url}\"" ] }
  }
  else {
    grok { match => [ "message", "%{DATESTAMP:timestamp}: %{GREEDYDATA:data} \[%{LOGLEVEL:loglevel}\] %{IP:IP} (?<inout>[-+]) \"%{WORD:method} %{URI:url}\" %{POSINT:statuscode} %{POSINT:duration}" ] }
  }
}
Now, you can remove the unnecessary fields:
filter {
  mutate {
    remove_field => [
      "message",
      "predata",
      "postdata",
      "DATE_US",
      "IPV6",
      "USER",
      "USERNAME",
      "URIHOST",
      "IPORHOST",
      "HOSTNAME",
      "URIPATHPARAM",
      "port",
      "URIPATH",
      "URIPARAM"
    ]
    remove_tag => [
      "multiline",
      "_grokparsefailure"
    ]
  }
}
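As an alternative, here is a hedged sketch (untested): a single grok pattern where the trailing status/duration part is optional, so one pattern covers both the '+' and '-' lines, plus a mutate to set the empty fields the question asks for. Field names follow the desired output above.
filter {
  grok {
    # one pattern for both line shapes; the trailing "200 462 ms" part only exists on "-" lines
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}: %{TIMESTAMP_ISO8601} \[%{LOGLEVEL:logLevel}\] %{IP:IP} (?<inout>[-+]) \"%{WORD:Method} %{URI:url}\"( %{POSINT:status} (?<duration>%{POSINT} ms))?" }
  }
  if [inout] == "+" {
    # "+" lines carry no status/duration; set them to empty strings as requested
    mutate {
      add_field => {
        "status" => ""
        "duration" => ""
      }
    }
  }
}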

Related

How can I convert a nested JSON string to JSON in Logstash?

My string after JSON decoding is:
"{\"#timestamp\":\"2022-09-27T10:14:49.082014+02:00\",\"#version\":1,\"host\":\"hieu-GF63-Thin-10SC\",\"message\":\"{\\\"command\\\":\\\"test:upload\\\",\\\"title\\\":\\\"Import success\\\",\\\"total_success\\\":10,\\\"total_fails\\\":0,\\\"log_message\\\":\\\"\\\"}\",\"type\":\"Datahub\",\"channel\":\"logstash.main\",\"level\":\"INFO\",\"monolog_level\":200,\"context\":{\"host\":{\"ip\":\"127.0.0.1\"}}}\n"
My Logstash config is:
input {
  udp {
    port => 5000
  }
}
filter {
  json { source => "message" }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "my-index"
    user => "elastic"
    password => "123456"
  }
}
My result in Elasticsearch:
{
"_index" : "my-index",
"_id" : "PskDfoMBtWToAIWATogd",
"_score" : 1.0,
"_ignored" : [
"event.original.keyword"
],
"_source" : {
"channel" : "logstash.main",
"context" : {
"host" : {
"ip" : "127.0.0.1"
}
},
"type" : "Datahub",
"monolog_level" : 200,
"message" : "{\"command\":\"test:upload\",\"title\":\"Import success\",\"total_success\":10,\"total_fails\":0,\"log_message\":\"\"}",
"host" : "hieu-GF63-Thin-10SC",
"level" : "INFO",
"#timestamp" : "2022-09-27T08:14:49.082014Z",
"#version" : 1,
"event" : {
"original" : "{\"#timestamp\":\"2022-09-27T10:14:49.082014+02:00\",\"#version\":1,\"host\":\"hieu-GF63-Thin-10SC\",\"message\":\"{\\\"command\\\":\\\"test:upload\\\",\\\"title\\\":\\\"Import success\\\",\\\"total_success\\\":10,\\\"total_fails\\\":0,\\\"log_message\\\":\\\"\\\"}\",\"type\":\"Datahub\",\"channel\":\"logstash.main\",\"level\":\"INFO\",\"monolog_level\":200,\"context\":{\"host\":{\"ip\":\"127.0.0.1\"}}}\n"
}
}
}
How can I extract the value of the message field into JSON data and append it to _source?
For example, I want the command and total_success fields appended to _source.
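A minimal sketch of one approach (untested, and assuming the inner message field always contains valid JSON) is to run a second json filter on the message field produced by the first parse:
filter {
  # first pass: parse the outer JSON string; this leaves the escaped inner
  # JSON string in the "message" field (as shown in the _source above)
  json { source => "message" }

  # second pass: parse the inner JSON; with no target set, command, title,
  # total_success, etc. are added at the top level of _source
  json { source => "message" }

  # optionally drop the raw inner string once it has been parsed
  mutate { remove_field => ["message"] }
}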

Creating a custom GROK pattern

Currently, I'm trying to create a grok pattern for this log:
2020-03-11 05:54:26,174 JMXINSTRUMENTS-Threading [{"timestamp":"1583906066","label":"Threading","ObjectName":"java.lang:type\u003dThreading","attributes":[{"name":"CurrentThreadUserTime","value":18600000000},{"name":"ThreadCount","value":152},{"name":"TotalStartedThreadCount","value":1138},{"name":"CurrentThreadCpuTime","value":20804323112},{"name":"PeakThreadCount","value":164},{"name":"DaemonThreadCount","value":136}]}]
At the moment I can match correctly up to JMXINSTRUMENTS-Threading by using this pattern:
%{TIMESTAMP_ISO8601:timestamp} (?<instrument>[^\ ]*) ?%{GREEDYDATA:log_message}
But I cannot seem to match any of the values after this. Does anybody have an idea what pattern I should use?
It worked for me after defining a different source and target in the JSON filter. Thanks for the help!
filter {
  if "atlassian-jira-perf" in [tags] {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} (?<instrument>[^\ ]*) ?%{GREEDYDATA:log_message_raw}" }
      tag_on_failure => ["no_match"]
      add_tag => ["bananas"]
    }
    if "no_match" not in [tags] {
      json {
        source => "log_message_raw"
        target => "parsed"
      }
    }
    mutate {
      remove_field => ["message"]
    }
  }
}
I tried your pattern in https://grokdebug.herokuapp.com/ (which is the official debugger for Logstash) and it does match everything after "JMXINSTRUMENTS-Threading", putting it into one big field called log_message, like this:
{
"timestamp": [
[
"2020-03-11 05:54:26,174"
]
],
"YEAR": [
[
"2020"
]
],
"MONTHNUM": [
[
"03"
]
],
"MONTHDAY": [
[
"11"
]
],
"HOUR": [
[
"05",
null
]
],
"MINUTE": [
[
"54",
null
]
],
"SECOND": [
[
"26,174"
]
],
"ISO8601_TIMEZONE": [
[
null
]
],
"instrument": [
[
"JMXINSTRUMENTS-Threading"
]
],
"log_message": [
[
"[{"timestamp":"1583906066","label":"Threading","ObjectName":"java.lang:type\\u003dThreading","attributes":[{"name":"CurrentThreadUserTime","value":18600000000},{"name":"ThreadCount","value":152},{"name":"TotalStartedThreadCount","value":1138},{"name":"CurrentThreadCpuTime","value":20804323112},{"name":"PeakThreadCount","value":164},{"name":"DaemonThreadCount","value":136}]}]"
]
]
}
If you wish to extract all the fields contained in log_message, you should use a json filter in your Logstash pipeline's filter section, right below your grok filter. For example:
grok {
  match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} (?<instrument>[^\ ]*) ?%{GREEDYDATA:log_message}" }
  tag_on_failure => ["no_match"]
}
if "no_match" not in [tags] {
  json {
    source => "log_message"
  }
}
That way your JSON will be split into key/value pairs and parsed.
EDIT:
You could try to use a kv filter instead of json; here are the docs: https://www.elastic.co/guide/en/logstash/current/plugins-filters-kv.html
grok {
  match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} (?<instrument>[^\ ]*) ?%{GREEDYDATA:log_message}" }
  tag_on_failure => ["no_match"]
}
if "no_match" not in [tags] {
  kv {
    source => "log_message"
    value_split => ":"
    include_brackets => true # remove brackets
    remove_char_key => "\""
    remove_char_value => "\""
    field_split => ","
  }
}
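If individual JMX attributes are wanted as top-level fields rather than left inside the nested attributes array, a ruby filter can flatten them. This is only a sketch: it assumes the accepted config above (json with target => "parsed") and that log_message parses to an array whose first element holds the attributes, as in the sample line; the jmx_ prefix is an arbitrary choice.
ruby {
  code => '
    # assumption: the json filter above stored the parsed array under "parsed"
    attrs = event.get("[parsed][0][attributes]") || []
    attrs.each do |attr|
      next unless attr.is_a?(Hash) && attr["name"]
      # e.g. jmx_ThreadCount => 152, jmx_PeakThreadCount => 164
      event.set("jmx_#{attr["name"]}", attr["value"])
    end
  '
}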

Logstash is sending logs twice. Repeating logs issue

I am parsing the logs of a file on my server and sending only INFO-, WARN-, and ERROR-level logs to my API, but the problem is that I am receiving each log twice. In the output I map the parsed log values onto my JSON fields and send that JSON to my API, but I receive that JSON mapping twice.
I have analyzed my Logstash log file, but each log entry appears only once in that file.
{
"log_EventMessage" => "Unable to sendViaPost to url[http://ubuntu:8280/services/TestProxy.TestProxyHttpSoap12Endpoint] Read timed ",
"message" => "TID: [-1234] [] [2017-08-11 12:03:11,545] INFO {org.apache.axis2.transport.http.HTTPSender} - Unable to sendViaPost to url[http://ubuntu:8280/services/TestProxy.TestProxyHttpSoap12Endpoint] Read time",
"type" => "carbon",
"TimeStamp" => "2017-08-11T12:03:11.545",
"tags" => [
[0] "grokked",
[1] "loglevelinfo",
[2] "_grokparsefailure"
],
"log_EventTitle" => "org.apache.axis2.transport.http.HTTPSender",
"path" => "/home/waqas/Documents/repository/logs/carbon.log",
"#timestamp" => 2017-08-11T07:03:13.668Z,
"#version" => "1",
"host" => "ubuntu",
"log_SourceSystemId" => "-1234",
"EventId" => "b81a054e-babb-426c-b0a0-268494d14a0e",
"log_EventType" => "INFO"
}
Following is my configuration.
I need help; I am unable to figure out why this is happening.
input {
  file {
    path => "LOG_FILE_PATH"
    type => "carbon"
    start_position => "end"
    codec => multiline {
      pattern => "(^\s*at .+)|^(?!TID).*$"
      negate => false
      what => "previous"
      auto_flush_interval => 1
    }
  }
}
filter {
  #***********************************************************
  # Grok Pattern to parse Single Line Log Entries
  #**********************************************************
  if [type] == "carbon" {
    grok {
      match => [ "message", "TID:%{SPACE}\[%{INT:log_SourceSystemId}\]%{SPACE}\[%{DATA:log_ProcessName}\]%{SPACE}\[%{TIMESTAMP_ISO8601:TimeStamp}\]%{SPACE}%{LOGLEVEL:log_EventType}%{SPACE}{%{JAVACLASS:log_EventTitle}}%{SPACE}-%{SPACE}%{GREEDYDATA:log_EventMessage}" ]
      add_tag => [ "grokked" ]
    }
    mutate {
      gsub => [
        "TimeStamp", "\s", "T",
        "TimeStamp", ",", "."
      ]
    }
    if "grokked" in [tags] {
      grok {
        match => ["log_EventType", "INFO"]
        add_tag => [ "loglevelinfo" ]
      }
      grok {
        match => ["log_EventType", "ERROR"]
        add_tag => [ "loglevelerror" ]
      }
      grok {
        match => ["log_EventType", "WARN"]
        add_tag => [ "loglevelwarn" ]
      }
    }
    #*****************************************************
    # Grok Pattern in Case of Failure
    #*****************************************************
    if !( "_grokparsefailure" in [tags] ) {
      grok {
        match => [ "message", "%{GREEDYDATA:log_StackTrace}" ]
        add_tag => [ "grokked" ]
      }
      date {
        match => [ "timestamp", "yyyy MMM dd HH:mm:ss:SSS" ]
        target => "TimeStamp"
        timezone => "UTC"
      }
    }
  }
  #*******************************************************************
  # Grok Pattern to handle MultiLines Exceptions and StackTraces
  #*******************************************************************
  if ( "multiline" in [tags] ) {
    grok {
      match => [ "message", "%{GREEDYDATA:log_StackTrace}" ]
      add_tag => [ "multiline" ]
      tag_on_failure => [ "multiline" ]
    }
    date {
      match => [ "timestamp", "yyyy MMM dd HH:mm:ss:SSS" ]
      target => "TimeStamp"
    }
  }
}
filter {
  uuid {
    target => "EventId"
  }
}
output {
  if [type] == "carbon" {
    if "loglevelerror" in [tags] {
      stdout { codec => rubydebug }
      #*******************************************************************
      # Sending Error Messages to API
      #*******************************************************************
      http {
        url => "https://localhost:8000/logs"
        headers => {
          "Accept" => "application/json"
        }
        connect_timeout => 60
        socket_timeout => 60
        http_method => "post"
        format => "json"
        mapping => ["EventId","%{EventId}","EventSeverity","High","TimeStamp","%{TimeStamp}","EventType","%{log_EventType}","EventTitle","%{log_EventTitle}","EventMessage","%{log_EventMessage}","SourceSystemId","%{log_SourceSystemId}","StackTrace","%{log_StackTrace}"]
      }
    }
  }
  if [type] == "carbon" {
    if "loglevelinfo" in [tags] {
      stdout { codec => rubydebug }
      #*******************************************************************
      # Sending Info Messages to API
      #*******************************************************************
      http {
        url => "https://localhost:8000/logs"
        headers => {
          "Accept" => "application/json"
        }
        connect_timeout => 60
        socket_timeout => 60
        http_method => "post"
        format => "json"
        mapping => ["EventId","%{EventId}","EventSeverity","Low","TimeStamp","%{TimeStamp}","EventType","%{log_EventType}","EventTitle","%{log_EventTitle}","EventMessage","%{log_EventMessage}","SourceSystemId","%{log_SourceSystemId}","StackTrace","%{log_StackTrace}"]
      }
    }
  }
  if [type] == "carbon" {
    if "loglevelwarn" in [tags] {
      stdout { codec => rubydebug }
      #*******************************************************************
      # Sending Warn Messages to API
      http {
        url => "https://localhost:8000/logs"
        headers => {
          "Accept" => "application/json"
        }
        connect_timeout => 60
        socket_timeout => 60
        http_method => "post"
        format => "json"
        mapping => ["EventId","%{EventId}","EventSeverity","Medium","TimeStamp","%{TimeStamp}","EventType","%{log_EventType}","EventTitle","%{log_EventTitle}","EventMessage","%{log_EventMessage}","SourceSystemId","%{log_SourceSystemId}","StackTrace","%{log_StackTrace}"]
      }
    }
  }
}
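One guess, since only this single config is shown: a packaged Logstash reads every file in its config directory (for example /etc/logstash/conf.d) into one pipeline, so a leftover second .conf file containing another copy of this output would make each event hit the API twice. Pinning the pipeline to one explicit file rules that out; a sketch for Logstash 6+ with a hypothetical carbon.conf path:
# pipelines.yml (sketch) - load exactly one config file instead of a whole directory
- pipeline.id: carbon
  path.config: "/etc/logstash/conf.d/carbon.conf"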

Grok parse failure for Django logs

This is one of my log entries:
INFO 2017-05-16 17:24:11,690 views 14463 139643033982720 https://play.google.com/store/apps/details?id=com.VoDrive&referrer=referral_code%3DP5E
This is my pattern:
DJANGOTIMESTAMP %{YEAR}-%{MONTHNUM}-%{MONTHDAY}%{SPACE}%{HOUR}:%{MINUTE}:%{SECOND}
This is my Logstash conf file:
input {
  beats {
    port => "5043"
  }
}
filter {
  if [type] in ["django"] {
    grok {
      patterns_dir => ["/opt/logstash/patterns"]
      match => [ "message" , "%{LOGLEVEL:level}%{SPACE}%{DJANGOTIMESTAMP:timestamp},%{INT:pid}%{SPACE}%{WORD:origin}%{SPACE}%{INT:uid}%{SPACE}%{INT:django-id}%{SPACE}%{GREEDYDATA:action}" ]
    }
  }
}
output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    index => "%{type}_indexer"
  }
}
In the Elasticsearch output, the fields are not created:
luvpreet@DHARI-Inspiron-3542:/usr/bin$ curl -XGET 'localhost:9200/django_indexer/_search?pretty=true&q=*:*'
{
"_index" : "django_indexer",
"_type" : "django",
"_id" : "AVwu8tE7j-Kh6vl1kUdf",
"_score" : 1.0,
"_source" : {
"#timestamp" : "2017-05-22T06:55:52.819Z",
"offset" : 144,
"#version" : "1",
"beat" : {
"hostname" : "DHARI-Inspiron-3542",
"name" : "DHARI-Inspiron-3542",
"version" : "5.4.0"
},
"input_type" : "log",
"host" : "DHARI-Inspiron-3542",
"source" : "/var/log/django/a.log",
"message" : "INFO 2017-05-16 06:33:08,673 views 40152 139731056719616 https://play.google.com/store/apps/details?id=com.VoDrive&referrer=referral_code%3DP5E",
"type" : "django",
"tags" : [
"beats_input_codec_plain_applied"
]
}
It does not say that the grok parse has failed, but why are the fields not being created?
What am I missing?
Try with this grok pattern:
%{LOGLEVEL:loglevel}%{SPACE}%{TIMESTAMP_ISO8601:timestamp},%{INT:pid}%{SPACE}%{WORD:origin}%{SPACE}%{INT:id}%{SPACE}%{INT:number}%{SPACE}%{URI:action}
Input
INFO 2017-05-16 17:24:11,690 views 14463 139643033982720 https://play.google.com/store/apps/details?id=com.VoDrive&referrer=referral_code%3DP5E
Output
number 139643033982720
timestamp 2017-05-16 17:24:11
id 14463
port
pid 690
origin views
action https://play.google.com/store/apps/details?id=com.VoDrive&referrer=referral_code%3DP5E
loglevel INFO
You can then remove the port field with a mutate in your filter block:
mutate {
  remove_field => ["port"]
}
UPDATE
OK, I tried your configuration with my Logstash.
This is what I did:
1- Configure filebeat:
filebeat.prospectors:
- paths:
    - /etc/filebeat/FilebeatInputTest.txt
  document_type: django
output.logstash:
  hosts: ["127.0.0.1:5044"]
2- Configure logstash
input {
  beats {
    port => "5044"
  }
}
filter {
  if [type] == "django" {
    grok {
      match => [ "message" , "%{LOGLEVEL:loglevel}%{SPACE}%{TIMESTAMP_ISO8601:timestamp},%{INT:pid}%{SPACE}%{WORD:origin}%{SPACE}%{INT:id}%{SPACE}%{INT:number}%{SPACE}%{GREEDYDATA:action}" ]
    }
    mutate {
      remove_field => ["@timestamp", "beat", "input_type", "offset", "source", "@version", "host", "tags", "message"]
    }
  }
}
output {
  elasticsearch {
    hosts => [ "xx.xx.xx.xx:9200" ]
    index => "%{type}_indexer"
    user => "xxxx"
    password => "xxxx"
  }
}
You can remove user and password if your elasticsearch is not secured.
Input (content of /etc/filebeat/FilebeatInputTest.txt)
INFO 2017-05-16 17:24:11,690 views 14463 139643033982720 https://play.google.com/store/apps/details?id=com.VoDrive&referrer=referral_code%3DP5E
Output (In elasticsearch)
{
"_index" : "django_indexer",
"_type" : "django",
"_id" : "AVwhFe30JYGYNG_7C7YI",
"_score" : 1.0,
"_source" : {
"origin" : "views",
"pid" : "690",
"type" : "django",
"number" : "139643033982720",
"loglevel" : "INFO",
"action" : "https://play.google.com/store/apps/details?id=com.VoDrive&referrer=referral_code%3DP5E",
"id" : "14463",
"timestamp" : "2017-05-16 17:24:11"
}
}
Hope this helps.
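If the original log time should also drive the event's @timestamp (instead of @timestamp being removed in the mutate above), a date filter could be added after the grok. This is only a sketch, matching the "2017-05-16 17:24:11" value that the pattern above captures into timestamp:
filter {
  date {
    # "timestamp" holds e.g. "2017-05-16 17:24:11" after the grok above
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
    target => "@timestamp"
  }
}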

Logstash: TestResult comes out as an array

The results generated by running the config below show the TestResult section as an array. I am trying to get rid of that array so that the data can be sent to Elasticsearch cleanly.
I have the following XML file:
<tem:SubmitTestResult xmlns:tem="http://www.example.com" xmlns:acs="http://www.example.com" xmlns:acs1="http://www.example.com">
<tem:LabId>123</tem:LabId>
<tem:userId>123</tem:userId>
<tem:TestResult>
<acs:CreatedBy>123</acs:CreatedBy>
<acs:CreatedDate>123</acs:CreatedDate>
<acs:LastUpdatedBy>123</acs:LastUpdatedBy>
<acs:LastUpdatedDate>123</acs:LastUpdatedDate>
<acs1:Capacity95FHigh>123</acs1:Capacity95FHigh>
<acs1:Capacity95FHigh_AHRI>123</acs1:Capacity95FHigh_AHRI>
<acs1:CondensateDisposal_AHRI>123</acs1:CondensateDisposal_AHRI>
<acs1:DegradationCoeffCool>123</acs1:DegradationCoeffCool>
</tem:TestResult>
</tem:SubmitTestResult>
And I am using this config:
input {
  file {
    path => "/var/log/logstash/test3.xml"
  }
}
filter {
  multiline {
    pattern => "<tem:SubmitTestResult>"
    negate => "true"
    what => "previous"
  }
  if "multiline" in [tags] {
    mutate {
      gsub => ["message", "\n", ""]
    }
    mutate {
      replace => ["message", '<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>%{message}']
    }
    xml {
      source => "message"
      target => "SubmitTestResult"
    }
    mutate {
      remove_field => ["message", "@version", "host", "@timestamp", "path", "tags", "type"]
      remove_field => ["[SubmitTestResult][xmlns:tem]", "[SubmitTestResult][xmlns:acs]", "[SubmitTestResult][xmlns:acs1]"]
    }
    mutate {
      replace => [ "[SubmitTestResult][LabId]", "%{[SubmitTestResult][LabId]}" ]
      replace => [ "[SubmitTestResult][userId]", "%{[SubmitTestResult][userId]}" ]
    }
    mutate {
      replace => [ "[SubmitTestResult][TestResult][0][CreatedBy]", "%{[SubmitTestResult][TestResult][0][CreatedBy]}" ]
      replace => [ "[SubmitTestResult][TestResult][0][CreatedDate]", "%{[SubmitTestResult][TestResult][0][CreatedDate]}" ]
      replace => [ "[SubmitTestResult][TestResult][0][LastUpdatedBy]", "%{[SubmitTestResult][TestResult][0][LastUpdatedBy]}" ]
      replace => [ "[SubmitTestResult][TestResult][0][LastUpdatedDate]", "%{[SubmitTestResult][TestResult][0][LastUpdatedDate]}" ]
      replace => [ "[SubmitTestResult][TestResult][0][Capacity95FHigh]", "%{[SubmitTestResult][TestResult][0][Capacity95FHigh]}" ]
      replace => [ "[SubmitTestResult][TestResult][0][Capacity95FHigh_AHRI]", "%{[SubmitTestResult][TestResult][0][Capacity95FHigh_AHRI]}" ]
      replace => [ "[SubmitTestResult][TestResult][0][CondensateDisposal_AHRI]", "%{[SubmitTestResult][TestResult][0][CondensateDisposal_AHRI]}" ]
      replace => [ "[SubmitTestResult][TestResult][0][DegradationCoeffCool]", "%{[SubmitTestResult][TestResult][0][DegradationCoeffCool]}" ]
    }
  }
}
output {
  stdout {
    codec => "rubydebug"
  }
}
The result is:
"SubmitTestResult" => {
"LabId" => "123",
"userId" => "123",
"TestResult" => [
[0] {
"CreatedBy" => "123",
"CreatedDate" => "123",
"LastUpdatedBy" => "123",
"LastUpdatedDate" => "123",
"Capacity95FHigh" => "123",
"Capacity95FHigh_AHRI" => "123",
"CondensateDisposal_AHRI" => "123",
"DegradationCoeffCool" => "123"
}
]
}
As you can see, TestResult has the "[0]" array wrapper in there. Is there some config change I can make so that it doesn't come out as an array? I want to send this to Elasticsearch and want the data to be correct.
I figured this out. After the last mutate block, I added one more mutate block. All I had to do was rename the field and that did the trick.
mutate {
  rename => { "[SubmitTestResult][TestResult][0]" => "[SubmitTestResult][TestResult]" }
}
The result now looks proper:
"SubmitTestResult" => {
"LabId" => "123",
"userId" => "123",
"TestResult" => {
"CreatedBy" => "123",
"CreatedDate" => "123",
"LastUpdatedBy" => "123",
"LastUpdatedDate" => "123",
"Capacity95FHigh" => "123",
"Capacity95FHigh_AHRI" => "123",
"CondensateDisposal_AHRI" => "123",
"DegradationCoeffCool" => "123"
}
}
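As an aside, recent versions of the xml filter also have a force_array option; setting it to false stops single elements from being wrapped in one-element arrays in the first place, which would make the rename workaround unnecessary. A sketch of the xml block with that option:
xml {
  source => "message"
  target => "SubmitTestResult"
  # single child elements become plain objects instead of [0] => {...} arrays
  force_array => false
}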
