Read logs for last month with logstash

Read logs for last month with logstash - logstash

I have configured my logstash config file to read apache access logs like this:
input {
file {
type => "apache_access"
path => "/etc/httpd/logs/access_log*"
start_position => beginning
sincedb_path => "/dev/null"
}
}
filter {
if [path] =~ "access" {
mutate { replace => { "type" => "apache_access" } }
grok {
match => { "message" => "%{IPORHOST:clientip} - %{DATA:username} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-)" }
}
kv {
source => "request"
field_split => "&?"
prefix => "requestarg_"
}
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
output {
stdout {
codec => rubydebug
}
elasticsearch {
host => "10.13.10.18"
cluster => "awstutorialseries"
}
}
files that i have in the directory /etc/httpd/logs are:
access_log
access_log-20161002
access_log-20161005
access_log-20161008
access_log-20161011
...
When accessing all files in path access_log* it can make time if we have a interesting number of archived files.
In the server we rotate logs avery 3 days, so we archive the access_log file to be access_log-{date} and logstash as the config says, it reads all access_log files in that directory even the archived ones are included.
after some month we are in front of a lot of files that logstash should read so it can make time to read them all.
Q1: Is there a way to read all the logs once, and then just access_log file?
Q2: Is there a way or a custom expression to do in config file to read just some log files deponds on date and not all of them ?
I have tried a plenty of conbinaison and filters on my config file based on official documentation but no chance.

Your pattern "access_log*" will match all the old files, too, but logstash will ignore any files older than a day. See the ignore_older param in the file{} input. When catching up on old files, you can set this to a higher value.
Once you're caught up, I would release a new config that only looked at "access_log" (no wildcard, this the latest file only).

Related

Data missed in Logstash?

Data missed a lot in logstash version 5.0,
is it a serous bug ,when a config the config file so many times ,it useless,data lost happen again and agin, how to use logstash to collect log event property ?
any reply will thankness

Logstash is all about reading logs from specific location and based on you interested information you can create index in elastic search or other output also possible.
Example of logstash conf
input {
file {
# PLEASE SET APPROPRIATE PATH WHERE LOG FILE AVAILABLE
#type => "java"
type => "json-log"
path => "d:/vox/logs/logs/vox.json"
start_position => "beginning"
codec => json
}
}
filter {
if [type] == "json-log" {
grok {
match => { "message" => "UserName:%{JAVALOGMESSAGE:UserName} -DL_JobID:%{JAVALOGMESSAGE:DL_JobID} -DL_EntityID:%{JAVALOGMESSAGE:DL_EntityID} -BatchesPerJob:%{JAVALOGMESSAGE:BatchesPerJob} -RecordsInInputFile:%{JAVALOGMESSAGE:RecordsInInputFile} -TimeTakenToProcess:%{JAVALOGMESSAGE:TimeTakenToProcess} -DocsUpdatedInSOLR:%{JAVALOGMESSAGE:DocsUpdatedInSOLR} -Failed:%{JAVALOGMESSAGE:Failed} -RecordsSavedInDSE:%{JAVALOGMESSAGE:RecordsSavedInDSE} -FileLoadStartTime:%{JAVALOGMESSAGE:FileLoadStartTime} -FileLoadEndTime:%{JAVALOGMESSAGE:FileLoadEndTime}" }
add_field => ["STATS_TYPE", "FILE_LOADED"]
}
}
}
filter {
mutate {
# here converting data type
convert => { "FileLoadStartTime" => "integer" }
convert => { "RecordsInInputFile" => "integer" }
}
}
output {
elasticsearch {
# PLEASE CONFIGURE ES IP AND PORT WHERE LOG DOCs HAS TO PUSH
document_type => "json-log"
hosts => ["localhost:9200"]
# action => "index"
# host => "localhost"
index => "locallogstashdx_new"
# workers => 1
}
stdout { codec => rubydebug }
#stdout { debug => true }
}
To know more you can go throw many available websites like
https://www.elastic.co/guide/en/logstash/current/first-event.html

Logstash not printing anything

I am using logstash for the first time and trying to setup a simple pipeline for just printing the nginx logs. Below is my config file
input {
file {
path => "/var/log/nginx/*access*"
}
}
output {
stdout { codec => rubydebug }
}
I have saved the file as /opt/logstash/nginx_simple.conf
And trying to execute the following command
sudo /opt/logstash/bin/logstash -f /opt/logstash/nginx_simple.conf
However the only output I can see is:
Logstash startup completed
Logstash shutdown completed
The file is not empty for sure. As per my understanding I should be seeing the output on my console. What am I doing wrong ?

Make sure that the character encoding of your logfile is UTF-8. If it is not, try to change it and restart the Logstash.

Please try this code as your Logstash configuration, in order to setup a simple pipeline for just printing the nginx logs.
input {
file {
path => "/var/log/nginx/*.log"
type => "nginx"
start_position => "beginning"
sincedb_path=> "/dev/null"
}
}
filter {
if [type] == "nginx" {
grok {
patterns_dir => "/home/krishna/Downloads/logstash-2.1.0/pattern"
match => {
"message" => "%{NGINX_LOGPATTERN:data}"
}
}
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
output {
elasticsearch {
hosts => [ "127.0.0.1:9200" ]
}
stdout { codec => rubydebug }
}

configuring the logstash-output-csv

I am pretty new to logstash and I have been trying to convert an existing log into a csv format using the logstash-output-csv plugin.
My input log string looks as follows which is a custom log written in our application.
'128.111.111.11/cpu0/log:5988:W/"00601654e51a15472-76":687358:<9>2015/08/18 21:06:56.05: comp/45 55% of memory in use: 2787115008 bytes (change of 0)'
I wrote a quick regex and added it to the patterns_dir using the grok plugin.
My pattern is as follows :
IP_ADDRESS [0-9,.]+
CPU [0-9]
NSFW \S+
NUMBER [0-9]
DATE [0-9,/]+\s+[0-9]+[:]+[0-9]+[:]+[0-9,.]+
TIME \S+
COMPONENT_ID \S+
LOG_MESSAGE .+
without adding any csv filters I was able to get this output.
{
"message" => "128.111.111.11/cpu0/log:5988:W/"00601654e51a15472-76":687358:<9>2015/08/18 21:06:56.05: comp/45 55% of memory in use: 2787115008 bytes (change of 0)",
"#version" => "1",
"#timestamp" => "2015-08-18T21:06:56.05Z",
"host" => "hostname",
"path" => "/usr/phd/raveesh/sample.log_20150819000609",
"tags" => [
[0] "_grokparsefailure"
]
}
This is my configuration in order to get the csv as an output
input {
file {
path => "/usr/phd/raveesh/temporary.log_20150819000609"
start_position => beginning
}
}
filter {
grok {
patterns_dir => "./patterns"
match =>["message", "%{IP_ADDRESS:ipaddress}/%{CPU:cpu}/%{NSFW:nsfw}<%{NUMBER:number}>%{DATE}:%{SPACE:space}%{COMPONENT_ID:componentId}%{SPACE:space}%{LOG_MESSAGE:logmessage}" ]
break_on_match => false
}
csv {
add_field =>{"ipaddress" => "%{ipaddress}" }
}
}
output {
# Print each event to stdout.
csv {
fields => ["ipaddress"]
path => "./logs/firmwareEvents.log"
}
stdout {
# Enabling 'rubydebug' codec on the stdout output will make logstash
# pretty-print the entire event as something similar to a JSON representation.
codec => rubydebug
}
}
The above configuration does not seem to give the output. I am trying only to print the ipaddress in a csv file but finally I need to print all the captured patterns in a csv file. so I need the output as follows :
128.111.111.111,cpu0,nsfw, ....
Could you please let me know the changes i need to make. ?
Thanks in advance
EDIT:
I fixed the regex as suggested using the tool http://grokconstructor.appspot.com/do/match#result
Now my regex filter looks as follows :
%{IP:client}\/%{WORD:cpu}\/%{NOTSPACE:nsfw}<%{NUMBER:number}>%{YEAR:year}\/%{MONTHNUM:month}\/%{MONTHDAY:day}%{SPACE:space}%{TIME:time}:%{SPACE:space2}%{NOTSPACE:comp}%{SPACE:space3}%{GREEDYDATA:messagetext}
How do I capture the individual splits and save it as a csv ?
Thanks
EDIT:
I finally resolved this using the File plugin .
output {
file{
path => "./logs/sample.log"
message_pattern =>"%{client},%{number}"
}
}

The csv tag in the filter section is for parsing the input and exploding the message to key/value pairs.
In your case you are already parsing the input with the grok, so I bet you don't need the csv filter.
But in the output we can see there is a gorkfailure
{
"message" => "128.111.111.11/cpu0/log:5988:W/"00601654e51a15472-76":687358:<9>2015/08/18 21:06:56.05: comp/45 55% of memory in use: 2787115008 bytes (change of 0)",
"#version" => "1",
"#timestamp" => "2015-08-18T21:06:56.05Z",
"host" => "hostname",
"path" => "/usr/phd/raveesh/sample.log_20150819000609",
"tags" => [
[0] "****_grokparsefailure****"
]
}
That means your grok expression cannot parse the input.
You should fix the expression according to your input and then the csv will output properly.
Checkout http://grokconstructor.appspot.com/do/match for some help
BTW, are you sure the patterns NSFW, CPU, COMPONENT_ID, ... are defined somewhere ?
HIH

How configure multiple files using logstash.conf file

I am going to configure all log files present in location(D:\Logs folder)
log files are
1.Magna_Log4Net.log.20150623_bak
2.Magna_Log4Net.log.20150624_bak
3.Magna_Log4Net.log.20150625_bak
4.Magna_Log4Net.log.20150626_bak
logstash.conf file
input {
file {
path =["C:\Test\Logs\Magna_Log4Net.log.*_bak"]
start_position => "beginning"
}
}
filter {
grok { match => [ "message", "%{HTTPDATE:[#metadata][timestamp]}" ] }
date { match => [ "[#metadata][timestamp]", "dd/MMM/yyyy:HH:mm:ss Z" ] }
}
output {
elasticsearch { host => localhost}
stdout { codec => rubydebug }
}
I am not able to load all files into elastic search , I didn't understand the problem here. can any body help to to how to parse multiple files into logstash config files ???

Logstash - grok use a field other than message

I am receiving Log4j generated log files from remote servers using Logstash forwarder. The log event has fields including a field named "file" in the format /tomcat/logs/app.log, /tomcat/logs/app.log.1, etc. Of course file path /tomcat/logs is on the remote machine and I would like Logstash to create files on the local file system using only the file name and not use the remote file path.
Locally, I would like to create a file based on file name app.log, app.log.1, etc. How can one accomplish this?
I am unable to use grok since it appears to work only with "message" field and not others.
Example Log Event:
{"message":["11 Sep 2014 16:29:04,934 INFO LOG MESSAGE DETAILS HERE "],"#version":"1","#timestamp":"2014-09-15T05:44:43.472Z","file":["/tomcat/logs/app.log.1"],"host":"aus-002157","offset":"3116","type":"app.log"}
Logstash configuration - what do I use to write the filter section?
input {
lumberjack {
port => 48080
ssl_certificate => "/tools/LogStash/logstash-1.4.2/ssl/logstash.crt"
ssl_key => "/tools/LogStash/logstash-1.4.2/ssl/logstash.key"
}
}
filter{
}
output {
file{
#message_format => "%{message}"
flush_interval => 0
path => "/tmp/%{host}.%{type}.%{filename}"
max_size => "4M"
}
}

Figured out the pattern to be as follows:
grok{
match => [ "file", "^(/.*/)(?<filename>(.*))$" ]
}
Thanks for the help!

Logstash Grok can parse all the fields in a log event, not only message field.
For example, you want to extract the file field,
you can do like this
filter {
grok {
match => [ "file", "your pattern" ]
}
}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Read logs for last month with logstash - logstash

Related

Data missed in Logstash?

Logstash not printing anything

configuring the logstash-output-csv

How configure multiple files using logstash.conf file

Logstash - grok use a field other than message

Categories

Resources