Logstash Parse Date Issue - logstash

I am using Logstash to read some logs. I have a log file whose timestamp consists only of a time field, e.g. 08:28:20,500, with no date field. I would like to map it to today's date. How should I do that with the date filter?
A line of my log file looks like this:
08:28:20,500 INFO [org.jboss.as.connector.subsystems.datasources] (ServerService Thread Pool -- 27) JBAS010403: Deploying JDBC-compliant driver class org.h2.Driver (version 1.3)>>"C:\CIGNA\jboss\jboss.log"
Can anyone help with this issue? Many thanks in advance.
EDIT
After using a ruby filter, I have managed to solve the issue. However, there is occasionally a Ruby exception. As seen below, the first message hits a Ruby exception while the second one runs fine. I wonder how this happens and whether anyone can offer some advice. Thanks.
{
"message" => "10:30:39 FATAL [org.jboss.as.server] (default task-1) JBAS015957: Server boot has failed in an unrecoverable manner; exiting. See previous messages for details.\r",
"@version" => "1",
"@timestamp" => "2016-07-26T02:43:17.379Z",
"path" => "C:/CIGNA/jboss/jboss.log",
"host" => "SIMSPad",
"type" => "txt",
"Time" => "10:30:39",
"Level" => "FATAL",
"JavaClass" => "org.jboss.as.server",
"Message" => "(default task-1) JBAS015957: Server boot has failed in an unrecoverable manner; exiting. See previous messages for details.\r",
"tags" => [
[0] "_rubyexception"
]
}
{
"message" => "10:30:39 DEBUG [org.jboss.as.quickstarts.logging.LoggingExample] (default task-1) Settings reconfigured: JBOSS EAP Resettlement\r",
"@version" => "1",
"@timestamp" => "2016-07-26T02:30:39.000Z",
"path" => "C:/CIGNA/jboss/jboss.log",
"host" => "SIMSPad",
"type" => "txt",
"Time" => "10:30:39",
"Level" => "DEBUG",
"JavaClass" => "org.jboss.as.quickstarts.logging.LoggingExample",
"Message" => "(default task-1) Settings reconfigured: JBOSS EAP Resettlement\r"
}
And the updated filter part of my Logstash .conf file is shown below:
filter {
grok {
match => { "message" => '\A%{TIME:Time}%{SPACE}%{WORD:Level}%{SPACE}\[%{PROG:JavaClass}]%{SPACE}%{JAVALOGMESSAGE:Message}'}
}
ruby {
code => "
p = Time.parse(event['message']);
event['@timestamp'] = LogStash::Timestamp.new(p);
"
}
}

You can do that via a ruby filter. Ruby can parse this out of the box. Sorry, I have not tried it with the date filter (it might work as well). Here is my example:
My configuration:
input {
stdin {
}
}
filter {
ruby {
code => "
p = Time.parse(event['message']);
event['myTime'] = p;
"
}
}
output {
stdout { codec => rubydebug }
}
Input and output:
artur@pandaadb:~/dev/logstash$ ./logstash-2.3.2/bin/logstash -f conf2/
Settings: Default pipeline workers: 8
Pipeline main started
08:28:20
{
"message" => "08:28:20",
"#version" => "1",
"#timestamp" => "2016-07-25T09:43:28.814Z",
"host" => "pandaadb",
"myTime" => 2016-07-25 08:28:20 +0100
}
I am simply passing your string; you can use the variable that you parsed, e.g. "Time", in the ruby code.
Ruby is quite smart when parsing dates and recognises that the input is a time rather than an entire date, so it uses today's date and modifies only the time.
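For illustration, a minimal sketch in plain Ruby of what Time.parse does with a bare time string (assuming it is run on 2016-07-25 in the same local zone as the output above; this snippet is mine, not from the original answer):

require 'time'

p = Time.parse("08:28:20")
# => 2016-07-25 08:28:20 +0100  (today's date and local zone, with only the time taken from the string)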
Hope that helps!
EDIT:
I tried the date filter just now and it works differently: it sets the date to the 1st of January of the current year. So it appears the ruby filter will be your solution, as the date filter does not offer any way that I know of to modify the date after it has been matched.
EDIT 2:
In the comments you asked how to write it into the @timestamp field. The @timestamp field is a predefined field that expects a Logstash Timestamp object (not a string or datetime object). So you can write directly into that field, but you must create such an object. (Alternatively this would also work using the date filter, but why double up the filters?)
Here is the necessary code:
ruby {
code => "
p = Time.parse(event['message']);
event['@timestamp'] = LogStash::Timestamp.new(p);
"
}
EDIT:
With regard to the updated question, your issue is that you are referencing the wrong field of your event.
From your log update, you can see that your grok is parsing things correctly, e.g.:
"message" => "10:30:39 FATAL [org.jboss.as.server] (default task-1) JBAS015957: Server boot has failed in an unre ...",
"Time" => "10:30:39"
In your filter, however, you reference the "message" field of the event, not the "Time" field.
So Ruby will attempt to parse the entire message string into a date. Why this works for the second log line is a mystery to me :D
You need to change your filter to:
filter {
grok {
match => { "message" => '\A%{TIME:Time}%{SPACE}%{WORD:Level}%{SPACE}\[%{PROG:JavaClass}]%{SPACE}%{JAVALOGMESSAGE:Message}'}
}
ruby {
code => "
p = Time.parse(event['Time']);
event['@timestamp'] = LogStash::Timestamp.new(p);
"
}
}
This tells the parsing to take the time from the event field "Time".
Regards,
Artur
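If the occasional _rubyexception is still a concern (for example on lines where the grok pattern does not match and no Time field exists), a defensive variant of the ruby filter could guard the parse. This is only a sketch under that assumption, not part of the original answer; the _timeparsefailure tag name is arbitrary:

ruby {
  code => "
    begin
      if event['Time']
        p = Time.parse(event['Time']);
        event['@timestamp'] = LogStash::Timestamp.new(p);
      end
    rescue ArgumentError
      # keep the event and mark it instead of letting the filter raise _rubyexception
      tags = event['tags'] || [];
      tags << '_timeparsefailure';
      event['tags'] = tags;
    end
  "
}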

Related

Using grok to create fields in sample log

FMT="1358 15:41:07W19/03/21 (A) Interlocking Link 116 Restored" STY="A" AMSEQ="LINKFAIL" AMSST="RTN" ALTID="1358" TS="20210319154107" CP="LOC A" CP="LOC X" MP="104.95" MP="104.95" EQ="MDIPRIMARYOFF" POS="TC-NORTH"
The log format is as above. I would like to capture the following fields using grok:
Time - 15:41:07
Date - 19/03/21
Message - Interlocking Link 116 Restored
Location - Loc X
Can anyone help with creating a grok pattern that I can use in my Logstash filter to parse these logs?
I would not use grok to start with. This is key/value data, so a kv filter will get you started, then you can grok the parts of the FMT field out.
kv { include_keys => [ "FMT", "CP" ] target => "[@metadata]" }
mutate { add_field => { "Location" => "%{[@metadata][CP][1]}" } }
grok { match => { "[@metadata][FMT]" => "%{NUMBER} %{TIME:Time}W%{DATE_EU:Date} \(%{WORD}\) %{GREEDYDATA:Message}" } }
will result in
"Message" => "Interlocking Link 116 Restored",
"Date" => "19/03/21",
"Time" => "15:41:07",
"Location" => "LOC X",
Although having multiple CP fields feels fragile.
The include_keys option on the kv filter tells it to ignore other keys. Using target to put the fields under [@metadata] means they are available to other filters but are not sent to the output. The remove_field option on the kv filter is only processed if the filter is able to parse the message, so if your kv data is invalid you will still have a [message] field on the event that you can look at.
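As an illustration of that last point (my own sketch, not part of the answer above), remove_field could be added to that kv filter like this:

kv {
  include_keys => [ "FMT", "CP" ]
  target => "[@metadata]"
  # only removed when the kv parse succeeds, so an unparseable line keeps [message] for debugging
  remove_field => [ "message" ]
}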

Logstash - Multiple grok pattern not working together

I am very new to using Logstash. I have two kinds of log lines:
Pattern 1 : --2019-05-09 08:53:45.057 -INFO 11736 --- [ntainer#1-0-C-1] c.s.s.service.MessageLogServiceImpl : [adc7fd862db5307a688817198046b284dbb12b9347bed9067320caa49d8efa381557392024151] Event => Message Status Change [Start Time : 09052019 08:53:44] : CUSTOM_PROCESSING_COMPLETED
Pattern 2 : --2019-05-09 06:49:05.590 -TRACE 6293 --- [ntainer#0-0-C-1] c.s.s.service.MessageLogServiceImpl : [41a6811cbc1c66eda0e942712a12a003d6bf4654b3edb6d24bf159b592afc64f1557384545548] Event => Message Failure Identified : INVALID_STRUCTURE
Though there are many other kinds of lines, I want to consider only these two types. Hence I used the filter below:
grok {
#Event : message status change
match => {
"message" => "--(?<logtime>[^\]]*) -%{LOGLEVEL:level} (?<pid>\d+) --- \[(?<thread>[^\]]+)] (?<classname>[\w.]+)\s+: \[(?<token>[^\]]+)] Event \=> Message Status Change \[Start Time : (?<start>[^\]]*)\] : (?<status>[\w]+)"
}
add_field => {
"event" => "message_status_change"
}
}
grok {
#Event : message failure
match => {
"message" => "--(?<logtime>[^\]]*) -%{LOGLEVEL:level} (?<pid>\d+) --- \[(?<thread>[^\]]+)] (?<classname>[\w.]+)\s+: \[(?<token>[^\]]+)] Event \=> Message Failure Identified : (?<code>[\w]+)"
}
add_field => {
"event" => "message_failure"
}
}
I have also noticed that both of these grok patterns work individually (if I comment out one, the other works perfectly). The Logstash server is also fine when both patterns are active, but it raises a grok parse error when both of them are enabled and a new line is added to the log file.
I also want to know: although I have configured the input to read the file from the beginning, the file is not read even after a server restart unless I add a new line to the log. Why this behaviour?
Thanks in advance.
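Two sketches that may help here, both assumptions on my part rather than answers taken from this thread. First, the grok parse error is expected with two separate grok blocks, because each log line matches only one of them and the other tags _grokparsefailure; putting both patterns into a single grok (which stops at the first matching pattern) and deriving the event field afterwards avoids that:

grok {
  match => {
    "message" => [
      "--(?<logtime>[^\]]*) -%{LOGLEVEL:level} (?<pid>\d+) --- \[(?<thread>[^\]]+)] (?<classname>[\w.]+)\s+: \[(?<token>[^\]]+)] Event \=> Message Status Change \[Start Time : (?<start>[^\]]*)\] : (?<status>[\w]+)",
      "--(?<logtime>[^\]]*) -%{LOGLEVEL:level} (?<pid>\d+) --- \[(?<thread>[^\]]+)] (?<classname>[\w.]+)\s+: \[(?<token>[^\]]+)] Event \=> Message Failure Identified : (?<code>[\w]+)"
    ]
  }
}
# still inside the filter section: set the event field based on which pattern matched
if [status] {
  mutate { add_field => { "event" => "message_status_change" } }
} else if [code] {
  mutate { add_field => { "event" => "message_failure" } }
}

Second, the file input remembers how far it has read in a sincedb file, and start_position => "beginning" only applies to files it has never seen before; while testing, pointing sincedb_path at a throwaway location forces a full re-read on restart:

file {
  path => "/path/to/your.log"        # hypothetical path
  start_position => "beginning"
  sincedb_path => "/dev/null"        # testing only; use "NUL" on Windows
}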

Hard to stash a log file with different occurrence of order for a field using Logstash

I am trying to stash a log file into Elasticsearch using Logstash and am facing a problem while doing this.
If the log file has the same kind of log lines as below,
[12/Sep/2016:18:23:07] VendorID=5037 Code=C AcctID=5317605039838520
[12/Sep/2016:18:23:22] VendorID=9108 Code=A AcctID=2194850084423218
[12/Sep/2016:18:23:49] VendorID=1285 Code=F AcctID=8560077531775179
[12/Sep/2016:18:23:59] VendorID=1153 Code=D AcctID=4433276107716482
where the order of the date, VendorID, Code and AcctID fields does not change and no new element is added, then the filter (given below) in the config file works well.
\[%{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME}\] VendorID=%{INT:VendorID} Code=%{WORD:Code} AcctID=%{INT:AcctID}
Suppose the order changes as in the example given below, or a new element is added to one of the log lines; then a grokparsefailure occurs.
[12/Sep/2016:18:23:07] VendorID=5037 Code=C AcctID=5317605039838520
[12/Sep/2016:18:23:22] VendorID=9108 Code=A AcctID=2194850084423218
[12/Sep/2016:18:23:49] VendorID=1285 Code=F AcctID=8560077531775179
[12/Sep/2016:18:23:59] VendorID=1153 Code=D AcctID=4433276107716482
[12/Sep/2016:18:24:50] AcctID=3168124750473449 VendorID=1065 Code=L
[12/Sep/2016:18:24:50] AcctID=3168124750473449 VendorID=1065 Code=L
[12/Sep/2016:18:24:50] AcctID=3168124750473449 VendorID=1065 Code=L
Here in the example, the last three log lines differ from the first four in the order in which the fields occur. Because of this, the grok pattern cannot parse the last three lines, as it is written for the first four.
How should I handle this scenario when I come across it? Please help me solve this problem, and provide a link to any documentation that explains it in detail with examples.
Thank you very much in advance.
As correctly pointed out by baudsp, this can be achieved by multiple grok filters. The KV filter seems like a nicer option, but as for grok, this is one solution:
input {
stdin {}
}
filter {
grok {
match => {
"message" => ".*test1=%{INT:test1}.*"
}
}
grok {
match => {
"message" => ".*test2=%{INT:test2}.*"
}
}
}
output {
stdout { codec => rubydebug }
}
By applying two different grok filters, we can disregard the order of the fields coming in. The patterns specified do not care about what comes before or after the strings test1/test2; each simply matches its own pattern on its own.
So, for these 2 strings:
test1=12 test2=23
test2=23 test1=12
You will get the correct output. Test:
artur@pandaadb:~/dev/logstash$ ./logstash-2.3.2/bin/logstash -f conf_grok_ordering/
Settings: Default pipeline workers: 8
Pipeline main started
test1=12 test2=23
{
"message" => "test1=12 test2=23",
"#version" => "1",
"#timestamp" => "2016-12-21T16:48:24.175Z",
"host" => "pandaadb",
"test1" => "12",
"test2" => "23"
}
test2=23 test1=12
{
"message" => "test2=23 test1=12",
"#version" => "1",
"#timestamp" => "2016-12-21T16:48:29.567Z",
"host" => "pandaadb",
"test1" => "12",
"test2" => "23"
}
Hope that helps

Grok Pattern not working in Logstash

After parsing logs, I find there are some new lines at the end of the message.
Sample message:
ts:2016-04-26 05-02-16-018
CDT|ll:TRACE|tid:10000.140|scf:xxxxxxxxxxxxxxxxxxxxxxxxxxx.pc|mn:null|fn:xxxxxxxxxxxxxxxxxxxxxxxxxxx|ln:749|auid:xxxxxxxxxxxxxxxxxxxxxxxxxxx|eid:xxx.xxx.xxx.xxx-58261618-1-1461664935955-139|cid:900009865|ml:null|mid:-99|uip:xxx.xxx.xxx.xxx|hip:xxx.xxx.xxx.xxx|pli:null|msg:
xxxxxxxxxxxxxxxxxxxxxxxxxxx|pl: xxxxxxxxxxxxxxxxxxxxxxxxxxx
TAKE 1 xxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxx
I am using the regex pattern below, as suggested in the answers:
ts:(?<date>(([0-9]+)-)+ ([0-9]+-)+ [A-Z]+)\|ll:%{WORD:ll}\|tid:%{NUMBER:tid}\|scf:%{DATA:scf}\|mn:%{WORD:mn}\|fn:%{WORD:fn}\|ln:%{WORD:ln}\|auid:%{WORD:auid}\|eid:%{DATA:eid}\|cid:%{WORD:cid}\|ml:%{WORD:ml}\|mid:%{NUMBER:mid}\|uip:%{DATA:uip}\|hip:%{DATA:hip}\|pli:%{WORD:pli}\|\smsg:%{GREEDYDATA:msg}(\|pl:(?<pl>(.|\r|\n)))
But unfortunately it does not work properly when the last part of the log is not present:
ts:2016-04-26 05-02-16-018
CDT|ll:TRACE|tid:10000.140|scf:xxxxxxxxxxxxxxxxxxxxxxxxxxx.pc|mn:null|fn:xxxxxxxxxxxxxxxxxxxxxxxxxxx|ln:749|auid:xxxxxxxxxxxxxxxxxxxxxxxxxxx|eid:xxx.xxx.xxx.xxx-58261618-1-1461664935955-139|cid:900009865|ml:null|mid:-99|uip:xxx.xxx.xxx.xxx|hip:xxx.xxx.xxx.xxx
What should be the correct pattern?
-------------------Previous Question --------------------------------------
I am trying to parse a log line such as this one:
ts:2016-04-26 05-02-16-018 CDT|ll:TRACE|tid:10000.140|scf:xxxxxxxxxxxxxxxxxxxxxxxxxxx.pc|mn:null|fn:xxxxxxxxxxxxxxxxxxxxxxxxxxx|ln:749|auid:xxxxxxxxxxxxxxxxxxxxxxxxxxx|eid:xxx.xxx.xxx.xxx-58261618-1-1461664935955-139|cid:900009865|ml:null|mid:-99|uip:xxx.xxx.xxx.xxx|hip:xxx.xxx.xxx.xxx|pli:null|msg: xxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxx
Below is my logstash filter
filter {
grok {
match => ["mesage", "ts:(?<date>(([0-9]+)-*)+ ([0-9]+-*)+ [A-Z]+)\|ll:%{WORD:ll}\|tid:%{WORD:tid}\|scf:%{WORD:scf}\|mn:%{WORD:mn}\|fn:%{WORD:fn}\|ln:%{WORD:ln}\|auid:%{WORD:auid}\|eid:%{WORD:eid}\|cid:%{WORD:cid}\|ml:%{WORD:ml}\|mid:%{WORD:mid}\|uip:%{WORD:uip}\|hip:%{WORD:hip}\|pli:%{WORD:pli}\|msg:%{WORD:msg}"]
}
date {
match => ["ts","yyyy-MM-dd HH-mm-ss-SSS ZZZ"]
target => "#timestamp"
}
}
I am getting "_grokparsefailure"
I have tested the configuration from @HAL; there were a few things to change:
In the grok filter: mesage => message
In the date filter: ts => date, so the date parsing is applied to the right field
CDT is a time zone name; it is captured by z in the date syntax.
So the right configuration would look like this:
filter{
grok {
match => ["message", "ts:(?<date>(([0-9]+)-*)+ ([0-9]+-*)+ [A-Z]+)\|ll:%{WORD:ll}\|tid:%{NUMBER:tid}\|scf:%{DATA:scf}\|mn:%{WORD:mn}\|fn:%{WORD:fn}\|ln:%{WORD:ln}\|auid:%{WORD:auid}\|eid:%{DATA:eid}\|cid:%{WORD:cid}\|ml:%{WORD:ml}\|mid:%{NUMBER:mid}\|uip:%{DATA:uip}\|hip:%{DATA:hip}\|pli:%{WORD:pli}\|\s*msg:%{GREEDYDATA:msg}"]
}
date {
match => ["date","yyyy-MM-dd HH-mm-ss-SSS z"]
target => "#timestamp"
}
}
I tried to parse your input via grokdebug with your expression, but it failed to read out any fields. I managed to get it to work by changing the expression to:
ts:(?<date>(([0-9]+)-*)+ ([0-9]+-*)+ [A-Z]+)\|ll:%{WORD:ll}\|tid:%{NUMBER:tid}\|scf:%{DATA:scf}\|mn:%{WORD:mn}\|fn:%{WORD:fn}\|ln:%{WORD:ln}\|auid:%{WORD:auid}\|eid:%{DATA:eid}\|cid:%{WORD:cid}\|ml:%{WORD:ml}\|mid:%{NUMBER:mid}\|uip:%{DATA:uip}\|hip:%{DATA:hip}\|pli:%{WORD:pli}\|\s*msg:%{GREEDYDATA:msg}
I also think that you need to change the name of the field that Logstash should parse from mesage to message.
Also, the date parsing pattern should match the format of the date in the input. There is no timezone identifier (ZZZ) in your input data (at least not in the example).
Something like this should work better (not tested though):
filter {
grok {
match => ["mesage", "ts:(?<date>(([0-9]+)-*)+ ([0-9]+-*)+ [A-Z]+)\|ll:%{WORD:ll}\|tid:%{NUMBER:tid}\|scf:%{DATA:scf}\|mn:%{WORD:mn}\|fn:%{WORD:fn}\|ln:%{WORD:ln}\|auid:%{WORD:auid}\|eid:%{DATA:eid}\|cid:%{WORD:cid}\|ml:%{WORD:ml}\|mid:%{NUMBER:mid}\|uip:%{DATA:uip}\|hip:%{DATA:hip}\|pli:%{WORD:pli}\|\s*msg:%{GREEDYDATA:msg}"]
}
date {
match => ["ts","yyyy-MM-dd HH-mm-ss-SSS"]
target => "#timestamp"
}
}

Logstash records from a server being rejected by ElasticSearch due to malformed date

I am in the process of installing ELK, including Redis, and have successfully got one server/process delivering its logs through to Elasticsearch (ES).
Most happy with this.
However, on updating an existing server/process to start using Logstash, I am seeing the logdate come through in the form yyyy-MM-dd HH:mm:ss,sss.
Note the absence of the T between date and time. ES is not happy with this.
Log4j pattern in use by both servers is:
<PatternLayout pattern="~%d{ISO8601} [%p] [%t] [%c{1.}] %m%n"/>
The Logstash config is identical, with the exception of the path to the source log file:
input{
file{
type => "log4j"
path => "/var/log/restapi/*.log"
add_field => {
"process" => "restapi"
"environment" => "DEVELOPMENT"
}
codec => multiline {
pattern => "^~%{TIMESTAMP_ISO8601} "
negate => "true"
what => "previous"
}
}
}
filter{
if [type] == "log4j"{
grok{
match => {
message => "~%{TIMESTAMP_ISO8601:logdate}%{SPACE}\[%{LOGLEVEL:level}\]%{SPACE}\[%{DATA:thread}\]%{SPACE}\[%{DATA:category}\]%{SPACE}%{GREEDYDATA:messagetext}"
}
}
}
}
output{
redis{
host => "sched01"
data_type => "list"
key => "logstash"
codec => json
}
stdout{codec => rubydebug}
}
The stdout line is there for current debugging purposes. From it, it is evident that on the correctly working server the logdate is correctly formed by the grok filter, compared to the incorrectly formed output on the other server.
The only high-level difference is when the servers were built.
I'm looking for ideas on what could be causing this, or a means to add the T into the field.
A bug raised under "DatePatternConverter ISO8601_PATTERN does not conform to ISO8601" (https://issues.apache.org/jira/browse/LOG4J2-670) led me to check the version of the log4j2 library used in the older application. It turned out to be a beta release. After updating to v2.3, the dateTime value started to populate correctly. With the value now correctly formed, Elasticsearch is happy to accept it.
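Had upgrading the library not been an option, a workaround sketch (an assumption of mine, untested, and not part of the answer above) would be to normalise the captured logdate inside Logstash before it is sent on, inserting the missing T:

filter {
  mutate {
    # logdate as captured by the grok above is assumed to look like "2016-04-26 05:02:16,018";
    # replacing its single space with a literal T yields the ISO8601-style form ES expects
    gsub => [ "logdate", " ", "T" ]
  }
}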
