Extracting fields from the input file path in Logstash

I want to read my log files from various directories, like Server1, Server2...
Server1 has subdirectories such as cron, auth... and inside each subdirectory is the corresponding log file.
So I am contemplating reading the files like this:
input {
  file {
    # path/to/folders/server1/cronLog/cron_log
    path => "path/to/folders/**/*_log"
  }
}
However, I am having difficulty filtering them, i.e. knowing which server (Server1) and log type (cron) an event came from so I can apply the right grok pattern.
E.g. I thought of doing something like this:
if [path] =~ "auth" {
  grok {
    match => ["message", ***pattern***]
  }
} else if [path] =~ "cron" {
  grok {
    match => ["message", ***pattern***]
  }
}
Above, cron refers to the log file (not the cronLog directory).
But I also want to filter on the server name, since every server will have cron, auth, etc. logs.
How do I filter on both?
Is there a way to grab the directory names from the path in the input? Like from here:
path => "path/to/folders/**/*_log"
How should I proceed? Any help is appreciated.

It's very straightforward, and almost exactly like in my other answer: you use grok on the path to extract the pieces you care about, and then you can do whatever you want from there.
filter {
  grok {
    match => [ "path", "path/to/here/(?<server>[^/]+)/(?<logtype>[^/]+)/(?<logname>.*)" ]
  }
  if [server] == "blah" && [logtype] =~ "cron" {
    grok {
      match => [ "message", "** pattern **" ]
    }
  }
}
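For example, given a hypothetical path like path/to/here/Server1/cron/cron_log, the first grok would add fields along the lines of:
server  => "Server1"
logtype => "cron"
logname => "cron_log"
From there the conditionals can branch on any combination of those fields.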

Related

logstash or filebeat to create multiple output files from tomcat log

I need to parse a tomcat log file and output it into several output files.
Each output file is the result of a filter that picks out the entries in the Tomcat log that match a series of regexes or other transformation rules.
Currently I am doing this with a Python script, but it is not very flexible.
Is there a configurable tool for doing this?
I have looked into Filebeat and Logstash [neither of which I am very familiar with], but it is not clear whether they can be configured to map a single input file into multiple output files, each with a different set of filter/grok expressions.
Is it possible to achieve this with filebeat/logstash?
If all the log files are on the same server, you don't need Filebeat; Logstash can do the work.
Here is an example of what your Logstash config can look like.
The input is your Tomcat log file, and there are multiple (JSON) outputs depending on the log level once the logs have been parsed.
The grok is also just an example; you must define your own grok pattern to match your log format.
input {
  file {
    path => "/var/log/tomcat.log"
  }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:loglevel} - %{POSTFIX_SESSIONID:sessionId}: %{GREEDYDATA:messageText}" }
  }
}
output {
  if [loglevel] == "info" {
    file {
      codec => "json"
      path => "/var/log/tomcat_info_parsed.log"
    }
  }
  if [loglevel] == "warning" {
    file {
      codec => "json"
      path => "/var/log/tomcat_warning_parsed.log"
    }
  }
}
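Note that POSTFIX_SESSIONID is not one of the stock grok patterns, so it has to come from a custom patterns file referenced via patterns_dir. A minimal sketch, assuming hex session IDs (the actual format depends on your logs):
# /etc/logstash/patterns/custom (hypothetical location)
POSTFIX_SESSIONID [0-9A-F]+
Then add patterns_dir => ["/etc/logstash/patterns"] to the grok filter above.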

Logstash if field contains value

I'm using Filebeat to forward logs into Logstash.
I have filenames that contain "v2" in them, for example:
C:\logs\Engine\v2.latest.log
I'd like to perform a different grok on these files.
I tried both of the following:
filter {
  if "v2" in [filename] {
    grok {
      .....
    }
  }
}
OR
filter {
  if [filename] =~ /v2/ {
    grok {
      .....
    }
  }
}
Well, my issue was that the "filename" field was being generated AFTER the filter, so my syntax was correct but it simply was not catching anything because the field didn't exist yet. However, starting from version 6.7, Filebeat adds a "log.file.path" field, which is the "filename" field I previously generated myself.
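With that field available, a sketch of the conditional (assuming Filebeat 6.7+ and Logstash's nested-field syntax) would be:
filter {
  if [log][file][path] =~ /v2/ {
    grok {
      # v2-specific pattern goes here
    }
  }
}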

Parsing log file with logs of different patterns Logstash

I am new to Logstash and, for that matter, the ELK stack. A single log file has several different processes logging data to it, and each process writes logs with a different pattern. I want to parse this log file. Each entry in the file starts with the grok pattern below:
%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:logsource} %{SYSLOGPROG}: + %{SRCFILE:srcfile}:%{NUMBER:linenumber}
where SRCFILE is defined as [a-zA-Z0-9._-]+.
Please let me know how I can parse this file so that the different types of logs from each process can be parsed.
Since you're trying to pass in log files, you might have to use the file input plugin in order to retrieve a file or x number of files from a given path. So a basic input could look something like this:
input {
  file {
    path => "/your/path/*"
    exclude => "*.gz"
    start_position => "beginning"
    ignore_older => 0
    sincedb_path => "/dev/null"
  }
}
The above is just a sample for you to reproduce. So once you get the files and start processing them line by line, you could use the grok filter in order to match the keywords from your log file. A sample filter could look something like this:
grok {
  patterns_dir => ["/pathto/patterns"]
  match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:logsource} %{SYSLOGPROG}: + %{SRCFILE:srcfile}:%{NUMBER:linenumber}" }
}
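For the custom SRCFILE pattern to resolve, it must be defined in a file inside the patterns_dir given above, for example (hypothetical file path):
# /pathto/patterns/custom
SRCFILE [a-zA-Z0-9._-]+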
You might have to use several match patterns if different types of logs are printed within a single file, OR, if each line carries comma-separated values, you could match them all in one grok. Something like:
grok {
  match => { "message" => [
    "TYPE1,%{WORD:a1},%{WORD:a2},%{WORD:a3},%{POSINT:a4}",
    "TYPE2,%{WORD:b1},%{WORD:b2},%{WORD:b3},%{WORD:b4}",
    "TYPE3,%{POSINT:c1},%{WORD:c2},%{POSINT:c3},%{WORD:c4}" ]
  }
}
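For instance, a hypothetical line like TYPE1,alpha,beta,gamma,42 would match the first pattern and yield fields roughly like:
a1 => "alpha"
a2 => "beta"
a3 => "gamma"
a4 => "42"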
And then you can work with the extracted values, since everything you need is right in the message. Hope it helps!

logstash pattern doesn't match in the expected way

I'm using Logstash to collect the server.log from several GlassFish domains. Unfortunately the log contains no domain name, but the path does.
So I tried to extract part of the file name to map it to the GlassFish domain. The problem is that the pattern I defined doesn't match the right part.
Here is the logstash.conf:
file {
  type => "GlassFish_Server"
  sincedb_path => "D:/logstash/.sincedb_GF"
  #start_position => beginning
  path => "D:/logdir/GlassFish/Logs/GF0/server.log"
}
grok {
  patterns_dir => "./patterns"
  match => [ 'path', '%{DOMAIN:Domain}' ]
}
I've created a custom pattern file and filled it with a regexp.
My custom pattern file:
DOMAIN (?:[a-zA-Z0-9_-]+[\/]){3}([a-zA-Z0-9_-]+)
And the result is:
"Domain" => "logdir/GlassFish/Logs/GF0"
I've tested my regexp on https://www.regex101.com/ and it works fine.
Using http://grokdebug.herokuapp.com/ to verify the pattern brings the same "unwanted" result.
What am I doing wrong? Does anybody have an idea how to get only the domain name "GF0", e.g. by modifying my pattern or using mutate in the logstash.conf?
I'm assuming that you're trying to strip the GF0 portion out of the path? Keep in mind that grok stores the entire text matched by %{DOMAIN:Domain} in the field, not just your capture group, which is why the leading directories end up in the result.
If that's the case and you know that the path will always be in the same format, you could just use something like this for the grok:
filter {
  grok {
    match => [ 'path', '(?i)/Logs/%{WORD:Domain}/' ]
  }
}
Not as elegant as a raw regexp, but it should work.
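Alternatively, a sketch with an inline named capture (assuming the log file is always named server.log):
filter {
  grok {
    # capture the path segment immediately before server.log
    match => [ 'path', '/(?<Domain>[^/]+)/server\.log$' ]
  }
}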

logstash : file input is not working

I am trying to run a sample example using logstash-1.4.2 on CDH 4.4. Whenever I use the file input instead of stdin, the window freezes at the following message:
Using milestone 2 plugin 'file'. This plugin should be stable but if
you see strange behavior, please let us know! For more
information.....
My code looks like this:
input {
  file {
    path => "/tmp/access_log"
    start_position => "beginning"
  }
}
filter {
  if [path] =~ "access" {
    mutate { replace => { "type" => "apache_access" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  file {
    path => "/logs/output_log"
  }
}
Command: bin/logstash -f logstash-apache.conf
I have tried deleting all my previous sincedb files in the $HOME directory and re-running logstash, but that doesn't seem to work either. Am I missing something?
If you have just one line in your input file, you should add an empty line at the end, since the file input only emits complete, newline-terminated lines.
That should work!
Edited:
And if you are on a Windows machine, you need to write the absolute path, like
"c:/dev/access-log.txt"
taking care to use just one / (not //) after the c:.
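So on Windows the input would look something like this (hypothetical file name):
input {
  file {
    path => "c:/dev/access-log.txt"
  }
}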
I got stuck because Logstash tracks which logs it has already read: https://stackoverflow.com/a/24034718/268907
Remember that the start_position option only applies in "first contact" situations, where a file is new and has not been seen before. If a file has already been seen, the option has no effect; in that case you have to set sincedb_path to /dev/null.
Setting sincedb_path to /dev/null prevents Logstash from tracking the position in the file it last read.
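A minimal sketch of the input with position tracking disabled (so the file is re-read from the beginning on every run):
input {
  file {
    path => "/tmp/access_log"
    start_position => "beginning"
    # /dev/null means the read position is never persisted
    sincedb_path => "/dev/null"
  }
}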
Are you running with root permissions? It looks like /logs/output_log needs root permission to be written to.
I tried your configuration locally with logstash 1.4.1 (and sudo) and it seems to be working fine.
Could you try the below? It worked for me:
path => "/tmp/access_log/*"
instead of
path => "/tmp/access_log"
