Is there a way to use Grok or Regex Match in File Input Plugin - logstash

Is there a way to use grok or a regex match in the file input plugin in Logstash? If so, how?
Thanks for answering.

Given the limited details, I assume you want to pick up specific files in your directories based on matching patterns. The file input handles this with glob patterns rather than grok or regex, in two ways:
The path option accepts filename patterns. For example, to include only ".log" files from a directory, you can use:
input {
  file {
    path => "/var/log/applicationDir/*.log"
  }
}
The exclude option lists exclusions (matched against the filename, not the full path). Filename patterns are valid here, too. In tail mode, you might want to exclude gzipped files:
input {
  file {
    path => "/var/log/applicationDir/*"
    exclude => "*.gz"
  }
}
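Grok itself is a filter plugin, so it runs in the filter stage after the file input has read a line, not inside the input. A minimal sketch of combining the two, assuming a hypothetical line layout of timestamp, level and message (the grok pattern and field names below are illustrative, not taken from the question):

```logstash
input {
  file {
    # Glob selection happens here: only *.log files, never gzipped ones.
    path => "/var/log/applicationDir/*.log"
    exclude => "*.gz"
  }
}

filter {
  grok {
    # Assumed line format, e.g. "2024-01-01T10:00:00 INFO something happened".
    # Replace with a pattern that fits the real log lines.
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
}
```

The input decides which files are read; the filter decides how each line is parsed.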

Related

Logstash - match filename

I have two servers. The first one hosts the Elastic Stack. Both servers have a file /var/log/commands.log, which is configured in the same way and is shipped with Filebeat to Logstash.
Using grok, I tried parsing the data into custom fields using this statement:
if [log][file][path] == "/var/log/commands.log" {
  grok {
    match => { "message" => "*some grok stuff*" }
  }
}
Problem is, even though on both servers the file is /var/log/commands.log & they're configured the same - it skips the if statement as if it's false.
I've noticed that if I ship the logs locally (without Filebeat, just input { file { path => "/var/log/commands.log" } }), it works for the local "/var/log/commands.log" file on the machine that hosts Logstash.
For reference, this is the full .conf file for logstash: https://pastebin.com/1QbnAG7G
This is how elastic sees the file path: https://i.imgur.com/5h9HXf2.png
Does anyone know why it skips the "if" statement? How can I make it filter by filename? Thanks in advance!
So, it looks like you're using =~ in your pastebin code. If that's the case, you'll be matching a regex. Is that what you meant to do?
If you're intending to use a regex, then you'd want something like this, probably:
if [log][file][path] =~ /\/var\/log\/commands\.log/
Example: https://regex101.com/r/nqGg9G/1
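For context, here is a sketch of how that condition might sit inside a complete filter block. The anchors (^ and $) and the grok pattern are assumptions added for illustration; the real pattern from the pastebin is not reproduced here.

```logstash
filter {
  # ^ and $ restrict the match to this exact path. Without them the regex
  # would also match longer paths such as /var/log/commands.log.1
  if [log][file][path] =~ /^\/var\/log\/commands\.log$/ {
    grok {
      # Placeholder pattern (assumption): substitute the real grok expression.
      match => { "message" => "%{GREEDYDATA:command_line}" }
    }
  }
}
```

If the condition is still skipped, it may be worth checking in the indexed event whether the field is really named [log][file][path] for events coming from that server.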

Creating custom grok filters

I need to find a grok pattern for files whose lines have the following format:
3 dbm.kfa 0 340220 7766754 93.9
3 swapper/3 0 340220 7766754 93.9
This is the grok pattern that I have done so far.
\s*%{INT:no1}\s*%{USERNAME:name}\s*%{INT:no2}\s*%{INT:no3}\s*%{INT:no4}\s*%{GREEDYDATA:ans}
The USERNAME pattern works for dbm.kfa but not for swapper/3, as USERNAME does not include the / character. I would like to create a custom pattern for this purpose, but have no idea how to create one.
Any help would be really appreciated. Thanks a lot !
To create a custom pattern, you need an external file in the following format, placed in a directory that will be used only for pattern files.
PATTERN_NAME [regex for your pattern]
Then you will need to change your grok config to point to the pattern files directory.
grok {
  patterns_dir => ["/path/to/patterns/dir"]
  match => { "message" => "%{PATTERN_NAME:fieldName}" }
}
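But in your specific case, changing %{USERNAME:name} to %{NOTSPACE:name} should work. %{DATA:name} is riskier here: it matches lazily, and because the fields are separated by \s* (which can match nothing), swapper/3 can be split into name "swapper/" and no2 "3". If you use DATA, separate the fields with \s+ instead.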
For a better explanation about the custom patterns you should read this part of the documentation.
You can also find all the core grok patterns that ship with Logstash in this GitHub repository; the most used are in the grok-patterns file.
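As an alternative to a separate pattern file, the grok filter also accepts inline definitions through its pattern_definitions option. A minimal sketch, assuming a hypothetical pattern name PROCNAME that extends USERNAME's character class with a forward slash, and using \s+ between fields so that adjacent fields cannot run together:

```logstash
filter {
  grok {
    # PROCNAME is a made-up name for this example: letters, digits,
    # dot, underscore, slash and hyphen, as in "dbm.kfa" or "swapper/3".
    pattern_definitions => { "PROCNAME" => "[a-zA-Z0-9._/-]+" }
    # ^ anchors the match at the start of the line; \s+ requires at least
    # one whitespace character between fields.
    match => { "message" => "^\s*%{INT:no1}\s+%{PROCNAME:name}\s+%{INT:no2}\s+%{INT:no3}\s+%{INT:no4}\s+%{GREEDYDATA:ans}" }
  }
}
```

For the line "3 swapper/3 0 340220 7766754 93.9" this should give name = swapper/3, no2 = 0, no3 = 340220, no4 = 7766754 and ans = 93.9.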

Reading from rotating log files in logstash

As per the documentation of logstash's file plugin, the section on File Rotation says the following:
To support programs that write to the rotated file for some time after
the rotation has taken place, include both the original filename and
the rotated filename (e.g. /var/log/syslog and /var/log/syslog.1) in
the filename patterns to watch (the path option).
If anyone can clarify how to specify two filenames in the path configuration, that would be a great help, as I did not find an exact example. Some examples suggest using wildcards like /var/log/syslog*; however, I am looking for an example that achieves exactly what the documentation says: two filenames in the path option.
The attribute path is an array and thus you can specify multiple files as follows:
input {
  file {
    path => [ "/var/log/syslog.log", "/var/log/syslog1.log" ]
  }
}
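Applied to the exact filenames quoted from the documentation (/var/log/syslog and /var/log/syslog.1), the same array form would look like this sketch:

```logstash
input {
  file {
    # Original file plus its first rotation, as named in the documentation excerpt.
    path => [ "/var/log/syslog", "/var/log/syslog.1" ]
  }
}
```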
You can also use * notation for name or directory as follows:
input {
  file {
    path => [ "/var/log/syslog.log", "/var/log/syslog1.log", "/var/log/*.log", "/var/*/*.log" ]
  }
}
Note that /var/*/*.log only matches files with the .log extension exactly one directory level below /var. For a recursive search across all subdirectories, use **, as in /var/**/*.log.
Reference Documentation

Logstash -- delimit event in log4net.log which may contain multiple lines

Here is a typical log file generated from log4net
So, this log file is read by the logstash file input plugin.
By default, the delimiter in configuration is \n, which means each line is an event.
But in the log file above, you can see there could be multiple lines for one event. (like ERROR or FAULT or others)
How to configure Logstash to delimit the event correctly?
I suppose I could configure multiple delimiters like \nINFO, \nDEBUG, \nERROR, \nFAULT. But the documentation says there can only be one delimiter.
The following config should delimit your events properly.
Input config:
input {
  file {
    path => "/absolute/path/here.log"
    type => "log4net"
    codec => multiline {
      pattern => "^(DEBUG|WARN|ERROR|INFO|FATAL)"
      negate => true
      what => "previous"
    }
  }
}
What you have there is a multiline event. There is a codec that will help you process that.
The basic idea is to define a pattern that identifies the beginning of a log entry (in your case, the log level), and then roll all other lines into the previous one.
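The same idea is annotated in the sketch below. The sample lines in the comments are hypothetical (the original log excerpt is not available), and the auto_flush_interval value is an assumption added so that the last entry in the file is not held back indefinitely.

```logstash
input {
  file {
    path => "/absolute/path/here.log"
    codec => multiline {
      # A line that starts with a log level begins a new event.
      pattern => "^(DEBUG|WARN|ERROR|INFO|FATAL)"
      # negate => true: act on lines that do NOT match the pattern.
      # what => "previous": append those lines to the preceding event.
      negate => true
      what => "previous"
      # Assumption: emit a pending event after 2 seconds without new lines;
      # otherwise it waits until the next log level line arrives.
      auto_flush_interval => 2
    }
  }
}

# Hypothetical input:
#   ERROR something failed
#     at Some.Method()
#   INFO next entry
# Expected grouping: two events, the first holding both the ERROR line
# and the indented "at ..." line.
```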

Logstash file input glob?

I am using logstash file input with glob to read my files:
path => "/home/Desktop/LogstashInput/**/*.log"
Directory structure format:
LogstashInput => server-name => date => abc.log
This is reading all log files within every date directory ending with ".log".
Now I want to read only some particular log files within all date directories. For example, the 2014.11.05 directory has abc.log, xyz.log, and so on, 10 such files in total, and I want to read only five particular files. What should the path setting be?
I read about exclude in logstash but it becomes a lot of files to be excluded as there are different type of files within different server-name directories and different dates
The logstash agent is written in ruby, so refer to the ruby glob rules. Based on your actual file names, you might be able to get one working.
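As a sketch of what that could look like for the file names mentioned in the question (abc.log and xyz.log), one recursive pattern can be listed per wanted file. The brace form in the comment assumes the pattern is handed to Ruby's Dir.glob unchanged, which is worth verifying against your Logstash version.

```logstash
input {
  file {
    # One recursive pattern per wanted file name, across all
    # server-name and date directories.
    path => [
      "/home/Desktop/LogstashInput/**/abc.log",
      "/home/Desktop/LogstashInput/**/xyz.log"
    ]
    # Possible single-pattern equivalent using Ruby glob brace alternation:
    # path => "/home/Desktop/LogstashInput/**/{abc,xyz}.log"
  }
}
```

Listing the wanted names keeps the configuration short when the set of files to read is smaller than the set to exclude.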
