Shell multiple logs monitoring and correlation - linux

I have been trying this for days but still struggling.
The objective of the script is to perform real time log monitoring on multiple servers (29 in particular) and correlate login failure records between servers. The servers' log will be compressed at 23:59:59 everyday, and a new log starts from 0 o'clock.
My idea was to use tail -f | grep "failed password" | tee centralized_log on every server, activated by a loop through all server names, run on background, and output the login failure records to a centralized log. But it dosn't work. And it creates a lot of daemons which will become zombies as soon as I terminates the script.
I am also considering to do tail at some minutes interval. But as the logs grow larger, the processing time will increase. How to set a pointer to where the previous tail stopped?
So could you please suggest a better and working way to do multiple logs monitoring and correlation. Additional installations are not encouraged unless totally necessary.

If your logs are going through syslog, and you're using rsyslogd, then you can configure the syslog on each machine to forward the specific messages you're interested in to one (or two) centralized log servers, using a property match like:
:msg, contains, "failed password"
See the rsyslog documentation for more details about how to set up reliable syslog forwarding.

Related

Watch for variable change

I'm trying to log all commands that users run on our servers and then ship them into one centralized logging server.
For that, I created an rsyslog logger that writes everything into one file (/var/log/commands.log) using the history command. I then use filebeat to ship the logs to the log-server. My problem is someone being able to unset HISTFILE which would stop logging. I'm not too worried about someone doing echo "" > .bash_history because I'm shipping logs immediately to another server.
It shouldn't be super fool-proof because everything can be outsmarted, but I would still like to improve it. Is it possible to create an audit watch for changes to HISTFILE? Or create some sort of listener that whenever a user unsets HISTFILE it would immediately set it back & alert me? Should I create some daemon that sets HISTFILE every 5 seconds?
Huge thanks ahead!
Just as a side note, I know it's possible to log commands with auditd but it recorded commands that the system runs too which cluttered everything.
Solved,
Created a file: /etc/profile.d/histfile.sh
and in it put :
HISTFILE="${HOME:-~}/.bash_history"
readonly HISTFILE

How to log - the 12 factor application way

I want to know the best practice behind logging my node application. I was reading the 12 factor app guidelines at https://12factor.net/logs and it states that logs should always be sent to the stdout. Cool, but then how would someone manage logs in production? Is there an application that scoops up whatever is sent to stdout? In addition, is it recommended that I only be logging to stdout and not stderr? I would appreciate a perspective on this matter.
Is there an application that scoops up whatever is sent to stdout?
The page you linked to provides some examples of log management tools, but the simplest version of this would be just redirecting the output of your application to a file. So in bash node app.js > app.out. You could also split your stdout and stderr like node app.js 2> app.err 1> app.out.
You could additionally have some sort of service that collects the logs from this file, and then puts them indexes them for searching somewhere else.
The idea behind the suggestion to only log to stdout is to let the environment control what to do with the logs because the application doesn't necessarily know the environment that it will eventually run within. Furthermore, by treating all logs as an event stream, you leave the choice of what to do with this stream up to the environment. You may want to send the log stream directly to a log aggregation service for instance, or you may want to first preprocess it, and then stream the result somewhere else. If you mandate a specific output such as logging to a file, you reduce the portability of your service.
Two of the primary goals of the 12 factor guidelines are to be "suitable for deployment on modern cloud platforms" and to offer "maximum portability between execution environments". On a cloud platform where you might have ephemeral storage on your instance, or many instances running the same service, you'd want to aggregate your logs into some central store. By providing a log stream, you leave it up to the environment to coordinate how to do this. If you put them directly into a file, then you would have to tailor your environment to wherever each application has decided to put the logs in order to then redirect them to the central store. Using stdout for logs is thus primarily a useful convention.
I think it's a mistake to categorically say "[web] applications should write logs to stdout".
Rather, I would suggest:
a) Professional-quality, robust web apps should HAVE logs
b) The application should treat the "log" as an abstract, "stream" object
c) Ideally, the logger implementation MAY be configured to write to stdout, to stderr, to a file, to a date-stamped file, to a rotating file, filter by severity level, etc. etc. as appropriate.
I would strongly argue that hard-coded writes to stdout, without any intervening "logger" abstraction, is POOR practice.
Here is a good article:
https://blog.risingstack.com/node-js-logging-tutorial/
Cool, but then how would someone manage logs in production?
The log sink is what you're looking for.
Is there an application that scoops up whatever is sent to stdout?
Yes and no. It's the log ship (or log router). It could be an application, but it's really just some process within the execution or runtime environment that your app doesn't really know about.
Another way to look at this is separation of concern. As it was stated in a different answer, it's about letting the environment own what happens to the log and only expecting the application to concern itself with emitting log events at all. I think what's missing from the 12FA documentation is that they don't try to complete the puzzle for you because there will be different opinions on where to go from stdout, so I'll help by adding in those missing pieces based on my personal experience and what I'm seeing all over the cloud space.
Logger sends log event to log stream (aka 'the log')
It goes without saying that your application should have some sort of "logger" abstraction, but that's really just an entry point for emitting a log event to stdout. That abstraction's responsibility is to get your log event onto the log stream (stdout) in the desired format and then your application's responsibility is done. In fact, the 12FA documentation ends here.
12 Factor App is about creating cloud-friendly and portable applications, so you have to assume that you don't know what the executing/runtime environment even is. So we don't know what "the environment" is and that's the whole point. So from here, it is the responsibility of the executing/runtime environment to process the stream and move it to the sink.
Log ship/router realizes log stream to log sink
So the way we solve for this now is to have some sort of listener for the stdout stream that will take the output and send it downstream to the log sink.
The "ship" (also known as the log router or scraper) might be something in the environment or the runtime, or honestly it could be something running the background of your application (a stream listener); it could be some other custom process; it could be even be Kafka -- I think GCP uses fluentd to scoop up logs from various sources and put them in stackdriver. The point is that it should be a separate "class" in your application that your application doesn't really know about. It just listens to the stream and sends it to the sink. In some solutions, this is something you need to build, in other solutions, it's handled by your platform. Put simply "how do I get the stream to the sink?"
The "sink" is the destination. This can be the console (hello it's literally a stream reader), it can be a file, it can be Splunk, Application Insights, Stack Driver, etc. There are simple solutions and there are larger more complex enterprise solutions, but the concept stays the same.
So in short, this is the answer to your question, if we're writing to stdout "how do we manage logs in production." It's the log sink or log aggregator that you're looking for. In 12FA vernacular, something like "splunk" isn't the "log". The log is the stream itself (stdout). In terms of 12FA - Your application doesn't know what the sink is and ideally, it shouldn't because that sink could change, in which case all of your applications would break, or there could be many different sinks and that could bog your application down particularly if you're writing straight to the sinks instead of stdout first. It's just another decoupling exercise if nothing else.
You can send to a single sink, multiple sinks at once, or you can send to a single sink and have some other component 'ship' your logs from that sink to another (e.g. write to a rolling file and have a router scrape that into splunk). Just depends on your needs.
You can actually see this popping up more and more in cloud providers by default. For example, on GCP, all logs to stdout automatically get picked up and sent to stackdriver. In Azure, so long as you add the instrumentation to your .NET application (the application diagnostics package), it will emit events to stdout and it'll get picked up by azure monitor. There are also more and more packages out there that are beginning to implement this pattern, so in .NET you could use Serilog to abstract most of these concepts.
Logger -> Log Event -> Log [stream] (stdout) -> Sink -> Your eyeballs
Logger: The thing you use to emit the log, typically an abstraction (e.g. Serilog, NLog, Log4net)
Log Event: The individual log itself
Log Stream (or 'the log'): stdout it's the unbuffered, time-ordered aggregation of all events and has no beginning or end.
Log Ship/Router: The transport that sends the stream to one or more sinks. (e.g. in process like log4net, out of process like fluentd)
Log Sink: The thing that you're actually looking at like a console, file, or index/search engine, or analytics/monitoring platform (e.g. splunk, datadog, appinsights, stackdriver, etc.)
There are packages and platforms that provide one or more of these pieces, but all of those pieces are always there. It makes 12FA logging make more sense when you're aware of them.

Logstash should log only grok parsed messages

Currently I have a ELK stack in which logs are shipped by filebeat and after some filters in logstash, it is forwarded to ES. As there are a lot of servers and logs, a huge logs are coming to logstash, but I have configured the filter to only process a very specific type of log message. Which it is doing fine, but the logs which are not even matching are logged in logstash.log file. As I mentioned earlier that huge logs are coming, the size of logstash.log file is soon reaching to a high value and there is space issue coming up. How to configure the logstash so that I only log the processed logs, and not all.
You could use logrotate to automatically rotate on either a daily basis or once it hits a certain threshold. You could then set the number of rotations to be 1 or 2. This would allow you time to see what is going to the file in case you need to troubleshoot, but purge before it creates space contention.

Apache James - reduce message time in spool

I'm using a local Apache James 2.3.2 install for development and automated testing. It's configured to forward all incoming messages to a single address and to not relay emails outside:
<mailet match="All" class="Forward">
<forwardto>test#localhost</forwardto>
</mailet>
Everything works correctly: emails are accepted, placed in the spool directory, then finally moved to the inbox/test directory, from which they are then picked up by my automated tests for verification.
The only problem is, it can take anywhere between 10 and 60 seconds for those emails to be moved from the spool directory to the inbox/test directory, meaning the tests need to wait that long before retrieving them and doing their checks.
Is this something that can be configured otherwise? Or should I simply move to a different email server for testing purposes?
Thanks!
Not a direct answer to this question, but I've ended up switching to JES http://www.ericdaugherty.com/java/mailserver/ . You can configure how many SMTP and POP3 threads do the work as well as the frequency at which these threads pick up messages from the spool and try to send deliver them
# The server stores incoming SMTP messages on disk before attempting to deliver them. This
# setting determines how often (in seconds) the server checks the disk for new messages to deliver. The
# smaller the number, the faster message will be processed. However, a smaller number will cause
# the server to use more of your system's resources.
smtpdelivery.interval=5
This meets my needs.

syslog question

I am looking into syslog.
I understand that it is a centralized logging facility that collects logs from various sources.
I have heard that syslog can generate alerts on conditions e.g. max file size of log file is reached.
Is this true?
Because I haven't found how this is done.
Most posts just refer to the logging.
How is the event generation done?
I.e. if I have an app that acts as a log source (redirects logging to a syslog) then is it possible my app can receive an alert, if the max file size has been reached?
How is this configured?
Thank you!
From the application perspective, the syslog function is primarily a receiver of information from the application; the application can write messages to the syslog. There are various bits of information that the application provides to the syslog daemon, including the severity of the message.
The syslog daemon can be configured to take different actions on receipt of different types of message.
No, your application cannot receive an alert when the maximum file size is reached - at least, not via syslog. You might get a SIGXFSZ signal which you can trap. You might prefer to look at your resource limits and keep tabs on your file size to avoid the problem.

Resources