Collect some missing lines of a log file with logstash

How can Logstash collect the new lines of a log file if we stop it for a fixed period and then restart it, knowing that the file input is configured with:
start_position => 'end'

As documented, the file input's start_position parameter only controls what happens the first time Logstash encounters a file. Once the first contact is made, Logstash only uses its internal bookmark stored in the sincedb file.
In other words, it's generally fine to restart or shut down Logstash for a little while; it'll pick up everything when it starts up again. There are two caveats though:
If the logfile is rotated via a rename operation (which is the default in most cases) while Logstash is down and the filename pattern(s) don't cover the rotated file, the last lines will be lost. For this reason it's a good idea to have Logstash track the first rotated file too: if you e.g. want to track /var/log/syslog and that file is rotated to /var/log/syslog.1 every morning, it's wise to include both files.
Logstash 1.4.2 doesn't shut down gracefully if it receives a SIGTERM signal. All messages currently in the 20-message buffer will be lost. Additionally, I'm not sure the sincedb file is flushed. Sending SIGINT will ensure a graceful shutdown.
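For reference, a file input along the lines of this advice might look like the sketch below (the paths and the sincedb location are placeholders, not taken from the question):
input {
  file {
    # include the first rotated file too, so lines written while Logstash
    # was down are not lost after a rename-based rotation
    path => ["/var/log/syslog", "/var/log/syslog.1"]
    # only honoured the first time a file is seen; afterwards the sincedb
    # bookmark decides where to resume
    start_position => "end"
    sincedb_path => "/var/lib/logstash/sincedb-syslog"
  }
}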

Related

What is $InputFilePollInterval in rsyslog.conf? Will increasing this value have an impact on the level of logging?

In our rsyslog configuration, all application logs are written to /var/log/messages, but they are written at a very high rate. How can I decrease the level of logging at the application level?
Hope this is what you are looking for.
Open the file in a text editor:
/etc/rsyslog.conf
Change the following parameters to values that suit your setup:
$SystemLogRateLimitInterval 3
$SystemLogRateLimitBurst 40
restart rsyslogd
service rsyslog restart
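On newer rsyslog versions the same rate limiting can also be expressed as imuxsock module parameters; a rough equivalent (check your version's imuxsock documentation before relying on the exact names):
module(load="imuxsock" SysSock.RateLimit.Interval="3" SysSock.RateLimit.Burst="40")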
$InputFilePollInterval is the legacy equivalent of the imfile parameter "PollingInterval", which the rsyslog documentation describes as follows:
PollingInterval seconds
Default: 10
This setting specifies how often files are to be polled for new data. The time specified is in seconds. During each polling interval, all files are processed in a round-robin fashion. A short poll interval provides more rapid message forwarding, but requires more system resources. While it is possible, we strongly recommend not to set the polling interval to 0 seconds.
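For context, $InputFilePollInterval belongs to rsyslog's imfile input module. A legacy-format sketch of monitoring one file with it (the file name, tag, and state file name are made up for illustration):
$ModLoad imfile
$InputFileName /var/log/myapp/app.log
$InputFileTag myapp:
$InputFileStateFile stat-myapp
$InputRunFileMonitor
# poll the monitored files every 10 seconds (the default)
$InputFilePollInterval 10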
There are a few approaches to this, and it depends on what exactly you're looking to do, but you'll likely want to look into separating your facilities into separate output files, based on severity. This can be done using RFC5424 severity priority levels in your configuration file.
By splitting logging into separate files by facility and/or severity, and setting the stop option, messages based on severity can be output to as many or few files as you like.
Example (set in the rsyslog.conf file):
*.*;auth,authpriv,kern.none /var/log/syslog
kern.* /var/log/kern.log
kern.debug stop
*.=debug;\
auth,authpriv.none;\
news.none;mail.none /var/log/debug
This configuration:
Will not output any kern facility messages to syslog (due to kern.none)
Will output all debug level logging of kern to kern.log and "stop" there
Will output any other debug messages that are not excluded by .none to /var/log/debug
How you separate things out is up to you, but I would recommend looking over the first link I included. You also may want to look into the different local facilities that can be used as separate log pipelines.
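As a small illustration of the local-facility idea (the facility number and file name here are arbitrary choices, not prescribed), you can give an application its own pipeline by routing one local facility to a dedicated file and stopping further processing:
# give the application its own pipeline via an otherwise unused local facility
local5.*    /var/log/myapp.log
& stop
A quick test from the shell is logger -p local5.notice "test message", which should then land in /var/log/myapp.log and, thanks to & stop, nowhere below that rule.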

Is there any indication that logstash forwarder finished processing a file?

I would like to delete files after logstash-forwarder has sent them (otherwise I get a "too many open files" error).
Is there any indication that logstash forwarder is done with the file?
logstash-forwarder keeps a "registry" file called .logstash-forwarder that contains information about the file (really inode) and byte offset into that file.
You can compare that information with the actual file itself to see if LSF is finished.
I do the same to tell if LSF is falling behind in its processing.
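As a rough sketch of that comparison (assuming the registry is JSON keyed by file path with an offset field, as typical logstash-forwarder versions write it; the log path and the use of jq and GNU stat are my own additions):
# compare LSF's recorded offset with the file's current size
offset=$(jq -r '."/var/log/app.log".offset' .logstash-forwarder)
size=$(stat -c %s /var/log/app.log)
if [ "$offset" -eq "$size" ]; then echo "LSF has caught up"; else echo "LSF is $((size - offset)) bytes behind"; fi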

Dynamically remove duplicate log messages

Recently a repeating line filled up /var/log/libvirt/qemu/.log in a matter of minutes, which crashed our system because the root partition filled up (20+ GB in minutes).
"block I/O error in device 'drive-virtio-disk0': Operation not permitted (1)"
Is there a way to ensure that duplicate lines are not pushed into logs, or a way to keep that directory from filling up? logrotate's maxsize will not work for us since we run it on a daily cronjob.
It depends on which logging utility you are using (rsyslog or syslog-ng)
Rsyslog can suppress repeated messages, replacing them with a line like:
"last message repeated 3044 times".
To enable this option you should add:
$RepeatedMsgReduction on
to /etc/rsyslog.conf
I don't know if such reduction is possible with syslog-ng.
Both syslog-ng and rsyslog can completely remove lines matching some pattern:
rsyslog - take a look at this manual: http://www.rsyslog.com/discarding-unwanted-messages/
syslog-ng - take a look at filters; there is an example of how to do it here: https://serverfault.com/questions/540038/excluding-some-messages-from-syslog-ng
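For the discarding approach, a single property-based filter in rsyslog can drop the offending line before it reaches disk, assuming these messages actually pass through syslog (on older rsyslog versions the action is ~ rather than stop):
:msg, contains, "block I/O error in device" stop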

Better way to output to both console and output file than tee?

What I need to display is a log refreshing periodically. It's a block of about 10 lines of text. I'm using |tee and it works right now. However, the performance is unsatisfying: it waits a while and then outputs several blocks of text from multiple refreshes (especially when the program just starts, it takes quite a while to start displaying anything on the console; the first time I saw this, I thought the program was hanging). In addition, it breaks randomly in the middle of the last block, so it's quite ugly to present.
Is there a way to improve this? (Maybe output less each time and switch between output file and console more frequently?)
Solved by flushing stdout after printing each block. Credit to Kenneth L!
https://superuser.com/questions/889019/bash-better-way-to-output-to-both-console-and-output-file-than-tee
Assuming you can monitor the log as a file directly [update: turned out not to be the case]:
The usual way of monitoring a [log] file for new lines is to use tail -f, which - from what I can tell - prints new data added to the log file as it is being added, without buffering.
Similarly, tee passes data it receives via stdin on without buffering.
Thus, you should be able to combine the two:
tail -f logFile | tee newLogEntriesFile
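If the producing program can't easily be changed to flush after each block, forcing line-buffered output on its side is a common workaround. A sketch assuming GNU coreutils' stdbuf and a dynamically linked program that uses C stdio (the program and file names are placeholders):
# force line buffering on the program's stdout so tee sees each block promptly
stdbuf -oL ./refresh_program | tee refresh.log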

Reduce Size of .forever Log Files Without Disrupting forever Process

The log files (in /root/.forever) created by forever have reached a large size and are almost filling up the hard disk.
If the log file were to be deleted while the forever process is still running, forever logs 0 will return undefined. The only way for logging of the current forever process to resume is to stop it and start the node script again.
Is there a way to just trim the log file without disrupting logging or the forever process?
So Foreverjs will continue to write to the same file handle; ideally it would support something that lets you send it a signal and have it rotate to a different file.
Without that, which requires code change on the Forever.js package, your options look like:
A command line version:
Make a backup
Null out the file
cp forever-guid.log backup && :> forever-guid.log;
This has the slight risk that, if you're writing to the log file at a speedy pace, you'll end up writing a log line between the backup and the nulling, resulting in the loss of that line.
Use Logrotate w/copytruncate
You can set up logrotate to watch the forever log directory to copy and truncate automatically based on filesize or time.
Have your node code handle this
You can have your logging code check how many lines the log file has and then do the copy-truncate itself - this would allow you to avoid the potential data loss.
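For the logrotate option above, a minimal rule might look like the following (the size threshold and rotation count are placeholders to adapt; /root/.forever is the directory from the question):
/root/.forever/*.log {
    size 100M
    rotate 5
    copytruncate
    compress
    missingok
    notifempty
}
copytruncate copies the file and then truncates the original in place, so forever keeps writing to the same file handle without interruption.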
EDIT: I had originally thought that split and truncate could do the job. They probably can, but an implementation would look really awkward. Split doesn't have a good way of splitting the file into a short one (the original log) and a long one (the backup). Truncate (which, in addition, isn't always installed) doesn't reset the write pointer, so forever just writes at the same byte offset it would have anyway, resulting in strange data.
You can truncate the log file without losing its handle (reference).
cat /dev/null > largefile.txt
