The contents of Logstash's conf file look like this:
input {
beats {
port => 5044
}
file {
path => "/usr/share/logstash/iway_logs/*"
start_position => "beginning"
sincedb_path => "/dev/null"
#ignore_older => 0
codec => multiline {
pattern => "^\[%{NOTSPACE:timestamp}\]"
negate => true
what => "previous"
max_lines => 2500
}
}
}
filter {
grok {
match => { "message" =>
['(?m)\[%{NOTSPACE:timestamp}\]%{SPACE}%{WORD:level}%{SPACE}\(%{NOTSPACE:entity}\)%{SPACE}%{GREEDYDATA:rawlog}'
]
}
}
date {
match => [ "timestamp", "yyyy-MM-dd'T'HH:mm:ss.SSS"]
target => "@timestamp"
}
grok {
match => { "entity" => ['(?:W.%{GREEDYDATA:channel}:%{GREEDYDATA:inlet}:%{GREEDYDATA:listener}\.%{GREEDYDATA:workerid}|W.%{GREEDYDATA:channel}\.%{GREEDYDATA:workerid}|%{GREEDYDATA:channel}:%{GREEDYDATA:inlet}:%{GREEDYDATA:listener}\.%{GREEDYDATA:workerid}|%{GREEDYDATA:channel}:%{GREEDYDATA:inlet}:%{GREEDYDATA:listener}|%{GREEDYDATA:channel})']
}
}
dissect {
mapping => {
"[log][file][path]" => "/usr/share/logstash/iway_logs/%{serverName}#%{configName}#%{?ignore}.log"
}
}
}
output {
elasticsearch {
hosts => "${ELASTICSEARCH_HOST_PORT}"
index => "iway_"
user => "${ELASTIC_USERNAME}"
password => "${ELASTIC_PASSWORD}"
ssl => true
ssl_certificate_verification => false
cacert => "/certs/ca.crt"
}
}
As one can make out, the idea is to parse a custom log employing multiline extraction. The extraction does its job. The log occasionally contains an empty first line. So:
[2022-11-29T12:23:15.073] DEBUG (manager) Generic XPath iFL functions use full XPath 1.0 syntax
[2022-11-29T12:23:15.074] DEBUG (manager) XPath 1.0 iFL functions use iWay's full syntax implementation
which naturally causes Kibana to report an empty line.
In an attempt to suppress this line from being sent to ES, I added the following as the last filter item:
if ![message] {
drop { }
}
if [message] =~ /^\s*$/ {
drop { }
}
The resulting JSON payload to ES:
{
"@timestamp": [
"2022-12-09T14:09:35.616Z"
],
"@version": [
"1"
],
"@version.keyword": [
"1"
],
"event.original": [
"\r"
],
"event.original.keyword": [
"\r"
],
"host.name": [
"xxx"
],
"host.name.keyword": [
"xxx"
],
"log.file.path": [
"/usr/share/logstash/iway_logs/localhost#iCLP#iway_2022-11-29T12_23_33.log"
],
"log.file.path.keyword": [
"/usr/share/logstash/iway_logs/localhost#iCLP#iway_2022-11-29T12_23_33.log"
],
"message": [
"\r"
],
"message.keyword": [
"\r"
],
"tags": [
"_grokparsefailure"
],
"tags.keyword": [
"_grokparsefailure"
],
"_id": "oRc494QBirnaojU7W0Uf",
"_index": "iway_",
"_score": null
}
While this does drop the empty first line, it also unfortunately interferes with the multiline operation on other lines. In other words, the multiline operation does not work anymore. What am I doing incorrectly?
Use of the following variation resolved the issue:
if [message] =~ /\A\s*\Z/ {
drop { }
}
This solution is based on Badger's answer provided on the Logstash forums, where this question was raised as well.
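A likely reason the variation behaves differently: Logstash conditionals use Ruby regular expressions, where ^ and $ anchor at every line boundary inside the string, while \A and \Z anchor at the start and end of the whole string. A multiline event that merely contains a blank line therefore matches /^\s*$/ and gets dropped. The sketch below reproduces the effect in Python, using re.MULTILINE to imitate Ruby's line anchors; the sample event text and function names are made up for illustration.

```python
import re

# Python's ^ and $ only behave like Ruby's (per-line anchors) with re.MULTILINE.
LINE_ANCHORED = re.compile(r"^\s*$", re.MULTILINE)   # imitates /^\s*$/ in Logstash
STRING_ANCHORED = re.compile(r"\A\s*\Z")             # imitates /\A\s*\Z/


def dropped_by_line_anchors(message: str) -> bool:
    """True if any single line of the message is empty or whitespace-only."""
    return LINE_ANCHORED.search(message) is not None


def dropped_by_string_anchors(message: str) -> bool:
    """True only if the whole message is empty or whitespace-only."""
    return STRING_ANCHORED.search(message) is not None


# Hypothetical multiline event: a log record followed by a blank line and a trace.
multiline_event = (
    "[2022-11-29T12:23:15.073] ERROR (manager) Something failed\n"
    "\n"
    "    at some.hypothetical.Frame"
)
empty_event = "\r"

print(dropped_by_line_anchors(multiline_event))    # the real event would be dropped
print(dropped_by_string_anchors(multiline_event))  # the real event is kept
print(dropped_by_string_anchors(empty_event))      # the stray "\r" event is dropped
```

With the string-anchored pattern only events that consist of nothing but whitespace are dropped, which is the behaviour the question was after.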
I'm trying to parse messages from my network devices, which send messages in a format similar to
<30>Feb 14 11:33:59 wireless: ath0 Sending auth to xx:xx:xx:xx:xx:xx. Status: The request has been declined due to MAC ACL (52).\n
<190>Feb 14 11:01:29 CCR00 user admin logged out from xx.xx.xx.xx via winbox
<134>2023 Feb 14 11:00:33 ZTE command-log:An alarm 36609 level notification occurred at 11:00:33 02/14/2023 CET sent by MCP GponRm notify: <gpon-onu_1/1/1:1> SubType:1 Pos:1 ONU Uni lan los. restore\n on \n
using this logstash.conf file
input {
beats {
port => 5044
}
tcp {
port => 50000
}
udp {
port => 50000
}
}
## Add your filters / logstash plugins configuration here
filter {
grok {
match => {
"message" => "^(?:<%{POSINT:syslog_pri}>)?%{GREEDYDATA:message_payload}"
}
}
syslog_pri {
}
mutate {
remove_field => [ "@version" , "message" ]
}
}
output {
stdout {}
elasticsearch {
hosts => "elasticsearch:9200"
user => "logstash_internal"
password => "${LOGSTASH_INTERNAL_PASSWORD}"
}
}
which results in this output
{
"@timestamp": [
"2023-02-14T10:38:59.228Z"
],
"data_stream.dataset": [
"generic"
],
"data_stream.namespace": [
"default"
],
"data_stream.type": [
"logs"
],
"event.original": [
"<14> Feb 14 11:38:59 UBNT BOXSERV[boxs Req]: boxs.c(691) 55381193 %% Error 17 occurred reading thermal sensor 2 data\n\u0000"
],
"host.ip": [
"10.125.132.10"
],
"log.syslog.facility.code": [
1
],
"log.syslog.facility.name": [
"user-level"
],
"log.syslog.severity.code": [
5
],
"log.syslog.severity.name": [
"notice"
],
"message_payload": [
" Feb 14 11:38:59 UBNT[boxs Req]: boxs.c(691) 55381193 %% Error 17 occurred reading thermal sensor 2 data\n\u0000"
],
"syslog_pri": [
"14"
],
"_id": "UzmBT4YBAZPdbqc4m_IB",
"_index": ".ds-logs-generic-default-2023.02.04-000001",
"_score": null
}
which is mostly satisfactory, but I would expect the log.syslog.facility.name and log.syslog.severity.name fields, as processed by the syslog_pri filter with an input of <14>, to come out as secur/auth and Alert respectively. Instead I keep getting the default user-level/notice for all my messages, no matter what the priority part of the syslog message contains.
Could anyone advise and maybe fix my .conf syntax, if it's wrong?
Thank you very much!
I have Logstash configured properly to receive logs and send them to Elasticsearch, but the grok/syslog_pri combination doesn't yield the expected results.
The fact that the syslog_pri filter is setting [log][syslog][facility][code] shows that it has ECS compatibility enabled. As a result, if you do not set the syslog_pri_field_name option on the syslog_pri filter, it will try to parse [log][syslog][priority]. If that field does not exist then it will parse the default value of 13, which is user-level/notice.
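For reference, the syslog priority value packs two numbers: facility = PRI div 8 and severity = PRI mod 8. The sketch below works through that arithmetic; the function name and the label tables are illustrative (labels follow the usual syslog naming and may not match the filter's output strings exactly). It shows why a missing field falls back to 13 = user-level/notice, and that <14> decodes to user-level/informational rather than an auth facility with alert severity.

```python
# Illustrative label tables, indexed by numeric code (assumed naming).
FACILITIES = [
    "kernel", "user-level", "mail", "daemon", "security/authorization",
    "syslogd", "line printer", "network news", "uucp", "clock",
    "security/authorization", "ftp", "ntp", "log audit", "log alert", "clock",
    "local0", "local1", "local2", "local3", "local4", "local5", "local6", "local7",
]
SEVERITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]


def decode_pri(pri: int) -> tuple[int, int]:
    """Split a syslog PRI value into (facility_code, severity_code)."""
    facility, severity = divmod(pri, 8)
    return facility, severity


for pri in (13, 14, 30):
    facility, severity = decode_pri(pri)
    print(pri, facility, FACILITIES[facility], severity, SEVERITIES[severity])
```

So 13 gives facility 1 and severity 5, 14 gives facility 1 and severity 6, and 30 gives facility 3 and severity 6.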
Thank you for the answer. I have adjusted the config as advised:
filter {
grok {
match => { "message" => "^(?:<%{POSINT:syslog_code}>)?%{GREEDYDATA:message_payload}" }
}
syslog_pri {
syslog_pri_field_name => "syslog_code"
}
mutate {
remove_field => [ "@version" , "message" ]
}
}
and now it behaves as intended
"event" => {
"original" => "<30>Feb 15 18:41:04 dnsmasq-dhcp[960]: DHCPACK(eth0) 10.0.0.165 xx:xx:xx:xx:xx CZ\n"
},
"@timestamp" => 2023-02-15T17:41:04.977038615Z,
"message_payload" => "Feb 15 18:41:04 dnsmasq-dhcp[960]: DHCPACK(eth0) 10.0.0.165 xx:xx:xx:xx:xx CZ\n",
"log" => {
"syslog" => {
"severity" => {
"code" => 6,
"name" => "informational"
},
"facility" => {
"code" => 3,
"name" => "daemon"
}
}
},
"syslog_code" => "30",
"host" => {
"ip" => "xx.xx.xx.xx"
} }
i will adjust the message a bit to fit my needs,
but that is out of the scope of this question
thank you very much!
I cannot parse the incoming syslog message as JSON. The message field is not getting parsed. I tried the json filter using add_field and also with mutate, but no luck. I used grok to parse specific fields, but the message field contains keys and values. How can I parse the message field below into JSON?
conf file
input {
file {
path => "/opt/log/sample/*.txt"
codec => "plain" # { format => "%{message}" }
}
}
filter {
# mutate { gsub => [ "message","(\")", "" ] }
mutate { gsub => [ "message","(\\")", "" ] }
json {
source => "message"
}
}
output {
file {
path => "/opt/log/out/out.txt"
codec => json_lines
}
stdout {}
}
GROK
%{TIME:timestamp} %{HOST:host} %{GREEDYDATA:message}
GROK output
"timestamp": [
[
"18:11:58"
]
],
"host": [
[
"myhost.aco.mydomain.net"
]
],
"message": [
[
"{destinationPort:90,exception:-,totalByteUsage:0,sourcePort:160,extension:.com\\\\/,contentTypeHeader:-,callout:0,scheme:http,reportingGroup:0,requestMethod:GET,privateIp:-,sAction:Allowed,sourceIpAddress:10.10.10.10,description:-,categoryName:News,sandBoxDecoded:-,urlLogId:0,responseCode:0,sandboxResult:-,computerName:-,totalByteCount:0,audit:0,host:www.local.com,action:Allowed,useTime:0,upstreamByteUsage:0,uriPath:\\\\/,computerMacAddress:00:00:00:00:00:00,direction:0,myboss:myhost,malware:0,ipAddress:10.10.10.10,userAgent:-,publicIp:-,url:http:\\\\/\\\\/www.local.com\\\\/,logTime:2022-07-12,referrerUrl:-,mde:-,sha256Sum:-,macAddress:00:00:00:00:00:00,filename:-,uriQuery:-,filteringGroupName:Default Catch All,downstreamByteUsage:0,cncFlag:0,location:-,time:18:11:57,username:*10.10.10.10}""
]
]
}
I have a log line like this:
09 Nov 2018 15:51:35 DEBUG api.MapAnythingProvider - Calling API For Client: XXX Number of ELEMENTS Requested YYY
I want to ignore all other log lines and only want those lines that have the words "Calling API For Client" in it. Further, I am only interested in the String XXX and Number YYY.
Thanks for the help.
input {
file {
path => ["C:/apache-tomcat-9.0.7/logs/service/service.log"]
sincedb_path => "nul"
start_position => "beginning"
}
}
filter {
grok {
match => {
"message" => "%{MONTHDAY:monthDay} %{MONTH:mon} %{YEAR:year} %{TIME:ts} %{WORD:severity} %{JAVACLASS:claz} - %{GREEDYDATA:logmessage}"
}
}
grok {
match => {
"logmessage" => "%{WORD:keyword} %{WORD:customer} %{WORD:key2} %{NUMBER:mapAnythingCreditsConsumed:float} %{WORD:key3} %{NUMBER:elementsFromCache:int}"
}
}
if "_grokparsefailure" in [tags] {
drop {}
}
mutate {
remove_field => [ "monthDay", "mon", "ts", "severity", "claz", "keyword", "key2", "path", "message", "year", "key3" ]
}
}
output {
if [logmessage] =~ /ExecutingJobFor/ {
elasticsearch {
hosts => ["localhost:9200"]
index => "test"
manage_template => false
}
stdout {
codec => rubydebug
}
}
}
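To prototype the extraction outside Logstash, the same idea (keep only lines containing "Calling API For Client", capture the client string and the element count) can be sketched as a single regular expression. This is a minimal sketch under assumptions: the names client and elements are invented here, the client is assumed to contain no whitespace, and the sample values ACME and 42 stand in for XXX and YYY.

```python
import re
from typing import Optional

# Matches only the lines of interest and captures the two wanted values.
CALL_PATTERN = re.compile(
    r"Calling API For Client: (?P<client>\S+) "
    r"Number of ELEMENTS Requested (?P<elements>\d+)"
)


def extract_call(line: str) -> Optional[dict]:
    """Return the client and element count, or None for any other log line."""
    match = CALL_PATTERN.search(line)
    if match is None:
        return None  # equivalent to dropping the event
    return {"client": match.group("client"), "elements": int(match.group("elements"))}


sample = (
    "09 Nov 2018 15:51:35 DEBUG api.MapAnythingProvider - "
    "Calling API For Client: ACME Number of ELEMENTS Requested 42"
)
print(extract_call(sample))
print(extract_call("09 Nov 2018 15:51:36 DEBUG api.Other - Something else"))
```

In a grok pattern the same shape would use the literal phrases around two captures, instead of matching every word with %{WORD}.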
I need some help with Logstash. I currently have the below Logstash config which works. When the [message] tag has "Token validation failed" in it it sends an email out saying auth issue.
input {
tcp {
codec => "json"
port => 5144
tags => ["windows","nxlog"]
type => "nxlog-json"
}
} # end input
filter {
if [type] == "nxlog-json" {
date {
match => ["[EventTime]", "YYYY-MM-dd HH:mm:ss"]
timezone => "Europe/London"
}
mutate {
rename => [ "AccountName", "user" ]
rename => [ "AccountType", "[eventlog][account_type]" ]
rename => [ "ActivityId", "[eventlog][activity_id]" ]
rename => [ "Address", "ip6" ]
rename => [ "ApplicationPath", "[eventlog][application_path]" ]
rename => [ "AuthenticationPackageName", "[eventlog][authentication_package_name]" ]
rename => [ "Category", "[eventlog][category]" ]
rename => [ "Channel", "[eventlog][channel]" ]
rename => [ "Domain", "domain" ]
rename => [ "EventID", "[eventlog][event_id]" ]
rename => [ "EventType", "[eventlog][event_type]" ]
rename => [ "File", "[eventlog][file_path]" ]
rename => [ "Guid", "[eventlog][guid]" ]
rename => [ "Hostname", "hostname" ]
rename => [ "Interface", "[eventlog][interface]" ]
rename => [ "InterfaceGuid", "[eventlog][interface_guid]" ]
rename => [ "InterfaceName", "[eventlog][interface_name]" ]
rename => [ "IpAddress", "ip" ]
rename => [ "IpPort", "port" ]
rename => [ "Key", "[eventlog][key]" ]
rename => [ "LogonGuid", "[eventlog][logon_guid]" ]
rename => [ "Message", "message" ]
rename => [ "ModifyingUser", "[eventlog][modifying_user]" ]
rename => [ "NewProfile", "[eventlog][new_profile]" ]
rename => [ "OldProfile", "[eventlog][old_profile]" ]
rename => [ "Port", "port" ]
rename => [ "PrivilegeList", "[eventlog][privilege_list]" ]
rename => [ "ProcessID", "pid" ]
rename => [ "ProcessName", "[eventlog][process_name]" ]
rename => [ "ProviderGuid", "[eventlog][provider_guid]" ]
rename => [ "ReasonCode", "[eventlog][reason_code]" ]
rename => [ "RecordNumber", "[eventlog][record_number]" ]
rename => [ "ScenarioId", "[eventlog][scenario_id]" ]
rename => [ "Severity", "level" ]
rename => [ "SeverityValue", "[eventlog][severity_code]" ]
rename => [ "SourceModuleName", "nxlog_input" ]
rename => [ "SourceName", "[eventlog][program]" ]
rename => [ "SubjectDomainName", "[eventlog][subject_domain_name]" ]
rename => [ "SubjectLogonId", "[eventlog][subject_logonid]" ]
rename => [ "SubjectUserName", "[eventlog][subject_user_name]" ]
rename => [ "SubjectUserSid", "[eventlog][subject_user_sid]" ]
rename => [ "System", "[eventlog][system]" ]
rename => [ "TargetDomainName", "[eventlog][target_domain_name]" ]
rename => [ "TargetLogonId", "[eventlog][target_logonid]" ]
rename => [ "TargetUserName", "[eventlog][target_user_name]" ]
rename => [ "TargetUserSid", "[eventlog][target_user_sid]" ]
rename => [ "ThreadID", "thread" ]
}
mutate {
remove_field => [
"CurrentOrNextState",
"Description",
"EventReceivedTime",
"EventTime",
"EventTimeWritten",
"IPVersion",
"KeyLength",
"Keywords",
"LmPackageName",
"LogonProcessName",
"LogonType",
"Name",
"Opcode",
"OpcodeValue",
"PolicyProcessingMode",
"Protocol",
"ProtocolType",
"SourceModuleType",
"State",
"Task",
"TransmittedServices",
"Type",
"UserID",
"Version"
]
}
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
}
if "Token validation failed" in [message] {
email {
address => "smtp01.domain.com"
to => "example@domain.com"
from => "Sender@domain.com"
subject => "Auth Issue"
body => "Auth Issue"
port => 25
use_tls => false
via => "smtp"
}
}
} # end output
I would like to know how to send the email only if a message containing "Token validation failed" arrives 10 times within one minute. If there are 9 or fewer such entries, no email should be sent. What config do I need to set up to get this to work?
There are a few ways to achieve that.
A. You can use XPack Alerting (formerly called Watcher) or ElastAlert as described in this answer
B. You can use the aggregate Logstash filter in order to keep track and count the "Token validation failed" messages as described in this answer. You simply need to
aggregate {
task_id => "%{[eventlog][target_logonid]}"
code => "map['failed_count'] ||= 0; map['failed_count'] += 1;"
push_map_as_event_on_timeout => true
timeout => 60 # 1 minute timeout
timeout_tags => ['_aggregatetimeout']
timeout_code => "event.set('token_failed', event.get('failed_count') >= 10)"
}
Then you can send your email only if [token_failed]
C. You can use the ruby Logstash filter in order to count and cache the number of times the "Token validation failed" message has occurred. It's basically the same as B but by implementing the logic yourself in Ruby code.
D. You can use the metrics Logstash filter in order to compute the rate of events having "Token validation failed" in the message field.
metrics {
meter => [ "message" ]
rates => [ 1 ]
add_tag => "metric"
}
Then in your output you can simply use the metered info like this:
if "metric" in [tags] and [Token validation failed][count] >= 10 {
email {
...
}
}
Note that with solutions B and C you cannot launch Logstash with more than one worker (i.e. -w 1). I've filed an enhancement request to "fix" that issue, but since the Logstash team already has a huge pipeline of TODOs, we'll see what happens.
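To make the counting logic behind options B and C concrete, here is a minimal sketch of a "10 failures within 60 seconds" check. Note that it uses a sliding window, which is a different technique from the aggregate filter's fixed timeout; the class name and parameters are hypothetical, and timestamps are plain seconds supplied by the caller. Like B and C, it keeps state in one process, which is why a single worker is assumed.

```python
from collections import deque


class FailureWindow:
    """Tracks failure timestamps and reports when a threshold is reached."""

    def __init__(self, threshold: int = 10, window_seconds: float = 60.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._times: deque = deque()

    def record(self, timestamp: float) -> bool:
        """Record one failure; return True if the window now holds >= threshold."""
        self._times.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Discard failures that are a full window (or more) older than this one.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times) >= self.threshold


# Ten failures one second apart: only the tenth one triggers the alert.
burst = FailureWindow()
burst_results = [burst.record(float(t)) for t in range(10)]
print(burst_results)

# Ten failures ten seconds apart: the window never holds ten at once.
slow = FailureWindow()
slow_results = [slow.record(float(t * 10)) for t in range(10)]
print(slow_results)
```

The email would be sent only when record returns True; a real setup would probably also want to suppress repeat alerts while the count stays above the threshold.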
I'm new to Logstash, trying to use it to parse a HTML log file.
I need to output only the log lines, i.e. ignore preceding JS, CSS and HTML that are also included in the file.
A log line in the file looks like this:
<tr bgcolor="tomato"><td>Jan 28<br>13:52:25.692</td><td>Jan 28<br>13:52:23.950</td><td>qtp114615276-1648 [POST] [call_id:-8009072655119858507]</td><td>REST</td><td>sa</td><td>0.0.0.0</td><td>ERR</td><td>ProjectValidator.validate(36)</td><td>Project does not exist</td></tr>
I have no problem getting all the lines, but I would like to have an output which contains only the relevant ones, without HTML tags, and looks something like that:
{
"db_timestamp": "2015-01-28 13:52:25.692",
"server_timestamp": "2015-01-28 13:52:25.950",
"node": "qtp114615276-1648 [POST] [call_id:-8009072655119858507]",
"thread": "REST",
"user": "sa",
"ip": "0.0.0.0",
"level": "ERR",
"method": "ProjectValidator.validate(36)",
"message": "Project does not exist"
}
My Logstash configuration is:
input {
file {
type => "request"
path => "<some path>/*.log"
start_position => "beginning"
}
file {
type => "log"
path => "<some path>/*.html"
start_position => "beginning"
}
}
filter {
if [type] == "log" {
grok {
match => [ WHAT SHOULD I PUT HERE??? ]
}
}
}
output {
stdout {}
if [type] == "request" {
http {
http_method => "post"
url => "http://<some url>"
mapping => ["type", "request", "host" ,"%{host}", "timestamp", "%{@timestamp}", "message", "%{message}"]
}
}
if [type] == "log" {
http {
http_method => "post"
url => "http://<some url>"
mapping => [ ALSO WHAT SHOULD I PUT HERE??? ]
}
}
}
Is there a way to do that? So far I haven't found any relevant documentation or samples.
Thanks!
Finally figured out the answer.
Not sure this is the best or most elegant solution, but it works.
I changed the http output format to "message", which enabled me to override and format the whole message as JSON, instead of using mapping. Also, found out how to name parameters in the grok filter and use them in the output.
This is the new Logstash configuration file:
input {
file {
type => "request"
path => "<some path>/*.log"
start_position => "beginning"
}
file {
type => "log"
path => "<some path>/*.html"
start_position => "beginning"
}
}
filter {
if [type] == "log" {
grok {
match => { "message" => "<tr bgcolor=.*><td>%{MONTH:db_date}%{SPACE}%{MONTHDAY:db_date}<br>%{TIME:db_date}</td><td>%{MONTH:alm_date}%{SPACE}%{MONTHDAY:alm_date}<br>%{TIME:alm_date}</td><td>%{DATA:thread}</td><td>%{DATA:req_type}</td><td>%{DATA:username}</td><td>%{IP:ip}</td><td>%{DATA:level}</td><td>%{DATA:method}</td><td>%{DATA:err_message}</td></tr>" }
}
}
}
output { stdout { codec => rubydebug }
if [type] == "request" {
http {
http_method => "post"
url => "http://<some URL>"
mapping => ["type", "request", "host" ,"%{host}", "timestamp", "%{@timestamp}", "message", "%{message}"]
}
}
if [type] == "log" {
http {
format => "message"
content_type => "application/json"
http_method => "post"
url => "http://<some URL>"
message=> '{
"db_date":"%{db_date}",
"alm_date":"%{alm_date}",
"thread": "%{thread}",
"req_type": "%{req_type}",
"username": "%{username}",
"ip": "%{ip}",
"level": "%{level}",
"method": "%{method}",
"message": "%{err_message}"
}'
}
}
}
Note the single quote for the http message block and the double quotes for the parameters inside this block.
For anyone parsing HP ALM logs, the following Logstash filter will do the work:
grok {
break_on_match => true
match => [ "message", "<tr bgcolor=.*><td>%{MONTH:db_date_mon}%{SPACE}%{MONTHDAY:db_date_day}<br>%{TIME:db_date_time}<\/td><td>%{MONTH:alm_date_mon}%{SPACE}%{MONTHDAY:alm_date_day}<br>%{TIME:alm_date_time}<\/td><td>(?<thread_col1>.*?)<\/td><td>(?<request_type>.*?)<\/td><td>(?<login>.*?)<\/td><td>(?<ip>.*?)<\/td><td>(?<level>.*?)<\/td><td>(?<method>.*?)<\/td><td>(?m:(?<log_message>.*?))</td></tr>" ]
}
mutate {
add_field => ["db_date", "%{db_date_mon} %{db_date_day}"]
add_field => ["alm_date", "%{alm_date_mon} %{alm_date_day}"]
remove_field => [ "db_date_mon", "db_date_day", "alm_date_mon", "alm_date_day" ]
gsub => [
"log_message", "<br>", "
"
]
gsub => [
"log_message", "<p>", " "
]
}
Tested and working fine with Logstash 2.4.0