How to deal with multiline messages in log files using Logstash - logstash

I am using the beats input in Logstash and want to merge multiline messages into a single event. I added a codec to my config file, but it is not working. I get the error below:
Failed to execute action
{:action=>LogStash::PipelineAction::Create/pipeline_id:main,
:exception=>"LogStash::ConfigurationError", :message=>"Expected one of
, { at line 8, column 8 (byte 169) after # The # character at the beginning of a li e indicates a comment. Use\n# cooments to describe
your configuration.\ninput {\n beats{\n\tport => \"5044\"\n
}\n \n codec ",
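For readers unsure what "merging multiline messages into a single event" involves, here is a minimal Python sketch of the idea. It is not Logstash or Filebeat code: the function name `merge_multiline`, its `pattern` and `negate` parameters, and the sample lines are all assumptions made up for illustration. Separately, the config excerpt echoed in the error above shows `codec` appearing after the closing brace of `beats`, which may be what the parser is objecting to.

```python
import re


def merge_multiline(lines, pattern=r"^\s", negate=False):
    """Merge continuation lines into the preceding event.

    A line counts as a continuation when it matches `pattern`
    (or when it does NOT match, if `negate` is True).
    """
    regex = re.compile(pattern)
    events = []
    for line in lines:
        matched = bool(regex.search(line))
        is_continuation = matched != negate
        if is_continuation and events:
            # Append to the previous event instead of starting a new one.
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events


# Made-up sample input: a stack trace followed by an unrelated line.
raw = [
    "2024-01-01 ERROR Something failed",
    "    at com.example.Foo.bar(Foo.java:10)",
    "    at com.example.Main.main(Main.java:5)",
    "2024-01-01 INFO Recovered",
]

events = merge_multiline(raw)
# Same grouping, expressed as "anything not starting with a date continues the previous event".
events_negated = merge_multiline(raw, pattern=r"^\d{4}-\d{2}-\d{2}", negate=True)

for event in events:
    print(repr(event))
```

The two calls describe the same grouping from opposite directions: either name what a continuation line looks like, or name what the first line of an event looks like and negate the match.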

Related

"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", \"input\", \"filter\", \"output\" at line 1, column 1 (byte 1)"

I'm getting this error while trying to pass a file as input through my Logstash configuration file. Can someone please help with the proper steps to fix this?
Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \t\r\n], \"#\", \"input\", \"filter\", \"output\" at line 1, column 1 (byte 1)",

How to Generate Grok Patterns automatically using LogMine

I am trying to generate Grok patterns automatically using LogMine.
Log sample:
Error IGXL error [Slot 2, Chan 16, Site 0] HSDMPI:0217 : TSC3 Fifo Edge EG0-7 Underflow. Please check the timing programming. Edge events should be fired in the sequence and the time between two edges should be more than 2 MOSC ticks.
Error IGXL error [Slot 2, Chan 18, Site 0] HSDMPI:0217 : TSC3 Fifo Edge EG0-7 Underflow. Please check the timing programming. Edge events should be fired in the sequence and the time between two edges should be more than 2 MOSC ticks.
For the above logs, I am getting the following pattern:
re.compile('^(?P<Event>.*?)\\s+(?P<Tester>.*?)\\s+(?P<State>.*?)\\s+(?P<Slot>.*?)\\s+(?P<Instrument>.*?)\\s+(?P<Content1>.*?):\\s+(?P<Content>.*?)$')
But I expect a Grok pattern (Logstash) that looks like this:
%{LOGLEVEL:level} *%{DATA:Instrument} %{LOGLEVEL:State} \[%{DATA:slot} %{DATA:slot} %{DATA:channel} %{DATA:channel} %{DATA:Site}] %{DATA:Tester} : %{DATA:Content}
Code: LogMine is imported from the following link: https://github.com/logpai/logparser/tree/master/logparser/LogMine
import os
import sys
sys.path.append('../')  # Make the LogMine package importable from the parent directory
import LogMine
# Raw strings keep the Windows backslashes from being read as escape sequences.
input_dir = r'E:\LogMine\LogMine'  # The input directory of the log file
output_dir = r'E:\LogMine\LogMine/output/'  # The output directory of parsing results
log_file = r'E:\LogMine\LogMine/log_teradyne.txt'  # The input log file name
log_format = '<Event> <Tester> <State> <Slot> <Instrument><content> <contents> <context> <desc> <junk> '  # Log format template
levels = 1  # The levels of hierarchy of patterns
max_dist = 0.001  # The maximum distance between any log message in a cluster and the cluster representative
k = 1  # The message distance weight (default: 1)
regex = []  # Regular expression list for optional preprocessing (default: [])
print(os.getcwd())
parser = LogMine.LogParser(input_dir, output_dir, log_format, rex=regex, levels=levels, max_dist=max_dist, k=k)
parser.parse(log_file)
This code returns only the parsed CSV file. I am looking to generate the Grok patterns and use them later in a Logstash application to parse the logs.
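One possible starting point, sketched under assumptions: Grok's built-in `DATA` pattern is defined as `.*?`, so each `(?P<name>.*?)` group in the regex above can be rewritten mechanically as `%{DATA:name}`. The helper name `regex_to_grok` is hypothetical and not part of LogMine or logparser. The regex is written here with single backslashes, since the doubled ones in the question come from the printed `repr`. This only changes the syntax; it does not infer richer patterns such as `LOGLEVEL`.

```python
import re

# Matches Python-style named groups of the exact form (?P<name>.*?)
NAMED_LAZY_GROUP = re.compile(r"\(\?P<(\w+)>\.\*\?\)")


def regex_to_grok(pattern: str) -> str:
    """Rewrite (?P<name>.*?) groups as %{DATA:name}; leave everything else untouched."""
    return NAMED_LAZY_GROUP.sub(r"%{DATA:\1}", pattern)


# The regex reported in the question, as a raw string.
logmine_regex = (
    r"^(?P<Event>.*?)\s+(?P<Tester>.*?)\s+(?P<State>.*?)\s+(?P<Slot>.*?)"
    r"\s+(?P<Instrument>.*?)\s+(?P<Content1>.*?):\s+(?P<Content>.*?)$"
)

grok = regex_to_grok(logmine_regex)
print(grok)
```

The remaining `\s+`, `^` and `$` are left as plain regex, which Grok expressions accept alongside `%{...}` references as far as I know. The result would still need testing against real log lines before use in a Logstash filter.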

logstash - output single event into multiple line output file

I have a jdbc input with a select statement. Each row in the result set has 3 columns: c1, c2, c3. The event emitted has the following structure:
{"c1":"v1", "c2":"v2", "c3":"v3", "file_name":"tmp.csv"}
I want to output the values in a file in the following manner:
output file:
v1
v2
v3
This is the output configuration:
file {
    path => "/tmp/%{file_name}"
    codec => plain { format => "%{c1}\n%{c2}\n%{c3}" }
    write_behavior => "overwrite"
    flush_interval => 0
}
But what is generated is:
output file:
v1\nv2\nv3
Is the plain codec plugin not the one I need? Is there another codec plugin for the file output plugin that I can use? Or is my only option to write my own plugin?
Thanks!
A bit late to the party, but maybe this helps others. Although it looks funky, you should be able to get away with simply hitting Enter within the format string (using the line codec).
file {
    path => "/tmp/%{file_name}"
    codec => line {
        format => "%{c1}
%{c2}
%{c3}"
    }
    write_behavior => "overwrite"
    flush_interval => 0
}
Not the prettiest approach, but it works. Not sure if there is a better way.
What you are looking for is the line codec plugin: https://www.elastic.co/guide/en/logstash/current/plugins-codecs-line.html
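The output shown in the question suggests that `\n` reached the file as two literal characters (a backslash and an `n`) rather than as a line break, which would explain why pressing Enter inside the format string helps. The Python sketch below only illustrates that difference. The `render` helper is hypothetical, a stand-in for the `%{field}` substitution and not Logstash's implementation.

```python
import re


def render(template, event):
    """Replace each %{field} reference with the event's value (hypothetical helper)."""
    return re.sub(r"%\{(\w+)\}", lambda m: str(event[m.group(1)]), template)


event = {"c1": "v1", "c2": "v2", "c3": "v3", "file_name": "tmp.csv"}

# Raw string: the template contains a backslash followed by 'n', not a newline.
escaped = render(r"%{c1}\n%{c2}\n%{c3}", event)

# Regular string: Python turns \n into real newline characters before rendering.
literal = render("%{c1}\n%{c2}\n%{c3}", event)

print(escaped)
print(literal)
```

The first result is a single line containing the characters `\n`, matching the unwanted output in the question. The second spans three lines, matching the desired output.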

How to use regex in logstash input file

I am trying to add multiple log files to Logstash so that all the index data is loaded into Kibana (the files are selected by my regular expression, https://regex101.com/r/njG6Qq/2).
It does not seem to work, because the index does not show up in Kibana.
This is my /etc/logstash/conf.d/apache-01.conf:
input {
    file {
        path => "/var/lib/jenkins/workspace/GetLogs/(.+\.)?themaindomain\.com-ssl_log-.+[0-9]{4}$"
        type => "apache_access"
        sincedb_path => ["/var/lib/logstash/"]
        start_position => "beginning"
    }
}
Here is an example of the log files contained in the /var/lib/jenkins/workspace/GetLogs/ folder, as used with my regex (https://regex101.com/r/njG6Qq/2):
somesubdomain.themaindomain.com-Nov-2018
somesubdomain.themaindomain.com-Oct-2018
somesubdomain.themaindomain.com-Sep-2018
somesubdomain.themaindomain.com-Sep-2018.gz.1
somesubdomain.themaindomain.com-ssl_log-Jan-2018
somesubdomain.themaindomain.com-ssl_log-Jan-2018.gz.1
somesubdomain.themaindomain.com-ssl_log-Nov-2018
somesubdomain.themaindomain.com-ssl_log-Oct-2018
somesubdomain.themaindomain.com-ssl_log-Sep-2018
somesubdomain.themaindomain.com-ssl_log-Sep-2018.gz.1
ftp.themaindomain.com-ftp_log-Mar-2018
ftp.themaindomain.com-ftp_log-Mar-2018.gz.1
ftp.themaindomain.com-ftp_log-Oct-2018
ftp.themaindomain.com-ftp_log-Sep-2018
merged.txt
somesubdomain.themaindomain.com-Oct-2018
somesubdomain.themaindomain.com-Sep-2018
somesubdomain.themaindomain.com-Sep-2018.gz.1
somesubdomain.themaindomain.com-ssl_log-Oct-2018
somesubdomain.themaindomain.com-ssl_log-Sep-2018
somesubdomain.themaindomain.com-ssl_log-Sep-2018.gz.1
OTHERsubdomain.themaindomain.com-Sep-2018
OTHERsubdomain.themaindomain.com-Sep-2018.gz.1
OTHERsubdomain.themaindomain.com-ssl_log-Sep-2018
OTHERsubdomain.themaindomain.com-ssl_log-Sep-2018.gz.1
somesubdomain.themaindomain.com-Nov-2018
somesubdomain.themaindomain.com-Oct-2018
somesubdomain.themaindomain.com-Sep-2018
somesubdomain.themaindomain.com-Sep-2018.gz.1
somesubdomain.themaindomain.com-ssl_log-Nov-2018
somesubdomain.themaindomain.com-ssl_log-Oct-2018
somesubdomain.themaindomain.com-ssl_log-Sep-2018
somesubdomain.themaindomain.com-ssl_log-Sep-2018.gz.1
OTHERsubdomain.themaindomain.com-Jun-2018
OTHERsubdomain.themaindomain.com-Jun-2018.gz.1
OTHERsubdomain.themaindomain.com-May-2018
OTHERsubdomain.themaindomain.com-May-2018.gz.1
OTHERsubdomain.themaindomain.com-ssl_log-Jun-2018
OTHERsubdomain.themaindomain.com-ssl_log-Jun-2018.gz.1
OTHERsubdomain.themaindomain.com-ssl_log-May-2018
OTHERsubdomain.themaindomain.com-ssl_log-May-2018.gz.1
somesubdomain.themaindomain.com-Nov-2018
somesubdomain.themaindomain.com-Oct-2018
somesubdomain.themaindomain.com-Sep-2018
How should I add the regex to the Logstash configuration file?
Can someone explain it to me?
Thank you very much.
You can only use filename (glob) patterns in path, not regular expressions. In your case, use /var/lib/jenkins/workspace/GetLogs/*.themaindomain.com-ssl_log-???-????
The documentation shows that only filename patterns can be used: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html#plugins-inputs-file-path
For more information on file patterns, see the link below:
https://www.ibm.com/support/knowledgecenter/en/SSMKHH_10.0.0/com.ibm.etools.mft.doc/ac55200_.htm
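To get a feel for which of the files listed in the question such a filename pattern would select, here is a small Python sketch using the standard `fnmatch` module. This is only an approximation: Logstash does its own glob matching, and Python's `fnmatch` is assumed here to treat `*` and `?` the same way for these simple names.

```python
from fnmatch import fnmatchcase

# The filename part of the pattern suggested above.
PATTERN = "*.themaindomain.com-ssl_log-???-????"

# A few file names taken from the listing in the question.
names = [
    "somesubdomain.themaindomain.com-Nov-2018",
    "somesubdomain.themaindomain.com-ssl_log-Jan-2018",
    "somesubdomain.themaindomain.com-ssl_log-Jan-2018.gz.1",
    "OTHERsubdomain.themaindomain.com-ssl_log-Sep-2018",
    "ftp.themaindomain.com-ftp_log-Mar-2018",
    "merged.txt",
]

# '*' matches any run of characters, '?' matches exactly one character.
matched = [name for name in names if fnmatchcase(name, PATTERN)]

for name in matched:
    print(name)
```

Only the two uncompressed ssl_log files are selected. The `.gz.1` variants are skipped because `????` has to cover exactly the four characters at the end of the name.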

Parsing two formats of log messages in LogStash

In a single log file, there are two formats of log messages. The first looks like this:
Apr 22, 2017 2:00:14 AM org.activebpel.rt.util.AeLoggerFactory info
INFO:
======================================================
ActiveVOS 9.* version Full license.
Licensed for All application server(s), for 8 cpus,
License expiration date: Never.
======================================================
and the second:
Apr 22, 2017 2:00:14 AM org.activebpel.rt.AeException logWarning
WARNING: The product license does not include Socrates.
The first line has the same format in both, but the following lines can be either (in pseudo-notation) loglevel: <msg>, or loglevel:<newline><many of =><newline><multiple line msg><newline><many of =>
I have the following configuration:
Query:
%{TIMESTAMP_MW_ERR:timestamp} %{DATA:logger} %{GREEDYDATA:info}%{SPACE}%{LOGLEVEL:level}:(%{SPACE}%{GREEDYDATA:msg}|%{SPACE}=+(%{GREEDYDATA:msg}%{SPACE})*=+)
Grok patterns:
AMPM (am|AM|pm|PM|Am|Pm)
TIMESTAMP_MW_ERR %{MONTH} %{MONTHDAY}, %{YEAR} %{HOUR}:%{MINUTE}:%{SECOND} %{AMPM}
Multiline filter:
%{LOGLEVEL}|%{GREEDYDATA}|=+
The problem is that every message is matched by %{SPACE}%{GREEDYDATA:msg}, so in the second case it returns <many of => as msg, and the alternative %{SPACE}=+(%{GREEDYDATA:msg}%{SPACE})*=+ is never used, probably because the first msg pattern also matches everything the second one matches.
How can I parse these two msg patterns?
I fixed it with the following:
Query:
%{TIMESTAMP_MW_ERR:timestamp} %{DATA:logger} %{DATA:info}\s%{LOGLEVEL:level}:\s((=+\s%{GDS:msg}\s=+)|%{GDS:msg})
Patterns:
AMPM (am|AM|pm|PM|Am|Pm)
TIMESTAMP_MW_ERR %{MONTH} %{MONTHDAY}, %{YEAR} %{HOUR}:%{MINUTE}:%{SECOND} %{AMPM}
GDS (.|\s)*
Multiline pattern:
%{LOGLEVEL}|%{GREEDYDATA}
Logs are correctly parsed.
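The effect of the ordering can be reproduced outside Logstash. The following Python sketch is an analogue using the `re` module, not Grok itself. The hand-written timestamp regex and the group names `block` and `line` are assumptions, and two names are needed because Python does not allow the same group name twice in one pattern. It compares putting the more specific `=+ ... =+` alternative first against putting the catch-all first.

```python
import re

# Rough Python equivalent of the common first part of both message formats.
HEADER = (
    r"(?P<timestamp>\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} (?:AM|PM)) "
    r"(?P<logger>\S+) (?P<info>\S+)\s(?P<level>[A-Z]+):\s"
)
BLOCK = r"=+\s(?P<block>.*?)\s=+"  # message framed by lines of '='
LINE = r"(?P<line>.*)"             # catch-all message

# Alternation is tried left to right, so the order matters.
specific_first = re.compile(HEADER + "(?:" + BLOCK + "|" + LINE + ")", re.DOTALL)
generic_first = re.compile(HEADER + "(?:" + LINE + "|" + BLOCK + ")", re.DOTALL)


def extract_msg(pattern, text):
    """Return the message captured by whichever alternative matched."""
    m = pattern.match(text)
    return m.group("block") if m.group("block") is not None else m.group("line")


banner = (
    "Apr 22, 2017 2:00:14 AM org.activebpel.rt.util.AeLoggerFactory info\n"
    "INFO:\n"
    "======================================================\n"
    "ActiveVOS 9.* version Full license.\n"
    "Licensed for All application server(s), for 8 cpus,\n"
    "License expiration date: Never.\n"
    "======================================================"
)
warning = (
    "Apr 22, 2017 2:00:14 AM org.activebpel.rt.AeException logWarning\n"
    "WARNING: The product license does not include Socrates."
)

print(extract_msg(specific_first, banner))
print(extract_msg(specific_first, warning))
print(extract_msg(generic_first, banner))
```

With the catch-all first, the banner message keeps its `=` lines, which mirrors the behaviour described in the question. With the framed alternative first, only the text between the `=` lines is captured, and the single-line warning still falls through to the catch-all.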
