Log monitoring via CloudWatch agent / Alarm / How to identify the instance - Linux

I'd like to ask you for help.
I've just configured the CloudWatch agent to monitor Linux system logs, e.g. /var/log/messages. I've also configured a CloudWatch Alarm that should be triggered every time the string "error" is detected in the system logs.
I can see that the logs are already present in the CloudWatch dashboard, and while testing I also received the SNS email notification that "error" was detected in the log file,
BUT:
I don't see any instance identification, log group identification, or log stream identification in the notification/alert email I received...
So how can I identify the instance with "errors" if I'm missing such info? Imagine I were running 20 EC2 instances...
I would really appreciate any hint that could point me in the right direction...
Screenshots:
https://petybucket.s3.eu-central-1.amazonaws.com/upload/aws1.PNG
https://petybucket.s3.eu-central-1.amazonaws.com/upload/aws2.png
https://petybucket.s3.eu-central-1.amazonaws.com/upload/aws3.PNG
Thanks in advance
Peter
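One hint: the CloudWatch agent's log_stream_name setting supports an {instance_id} placeholder ({instance_id} is the default unless overridden), so each instance writes to its own log stream. Once an alarm fires, you can query the log group for the matching events and read the stream names off the results. A rough boto3 sketch, where the log group name, filter pattern, and lookback window are assumptions to adjust:
import boto3
from datetime import datetime, timedelta, timezone

logs = boto3.client("logs")
end = datetime.now(timezone.utc)
start = end - timedelta(minutes=15)  # assumed lookback over the alarm period

resp = logs.filter_log_events(
    logGroupName="/var/log/messages",         # assumed log group name
    filterPattern="error",                    # same pattern as the metric filter
    startTime=int(start.timestamp() * 1000),  # epoch milliseconds
    endTime=int(end.timestamp() * 1000),
)
for event in resp["events"]:
    # the stream name identifies the instance when streams are named {instance_id}
    print(event["logStreamName"], event["message"])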

Related

How to fetch IIS Start log for a corresponding IIS Stop log in Azure Log Analytics outside of Alert's monitoring time period

I'm working on configuring an Azure Log Analytics alert (using KQL) to capture the IIS Stop & Start events (from the Events table) in my OMS Workspace. If the alert query finds that there is no corresponding IIS Start event log generated from a PaaS role for a particular IIS Stop event log, the user should be notified by an alert so that he can bring IIS back up.
Problem: Let's say I set up my alert to run with a time period & frequency of 15 minutes. If the alert triggered at 10:30 AM, that means it scanned the IIS logs from 10:15:01 AM to 10:29:59 AM. Now, suppose an IIS Stop event got logged around 10:28 AM; the respective IIS Start log (if any) will be logged a couple of minutes later, around 10:31 or 10:32 AM, and hence will fall outside the alert's monitoring time period. This creates a false positive failure scenario (IIS got started back up, but my alert didn't capture the Start event log), and it might thus lead to some unnecessary IIS Start/Reset operations on my PaaS roles.
Attaching a representative quick sketch to explain it figuratively.
Please let me know if there's any possible approach to achieve this. Any suggestions are welcome. Thanks in advance!
The current implementation is as follows.
Here we can see the false alert generated at 10:30.
Consider the approach below instead, where we select the last 10 minutes of data (overlapped) every 5 minutes; a sketch of the logic follows.
For the case below, you can generate the alert.
See if this helps you.
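To make the overlap concrete, here is a rough sketch of the evaluation logic in Python rather than KQL (the 5-minute grace period is an assumption; tune it to how long IIS typically takes to come back):
from datetime import timedelta

GRACE = timedelta(minutes=5)  # assumed maximum lag between a Stop and its Start

def unmatched_stops(events, window_end):
    # events: (timestamp, kind) pairs with kind in {"stop", "start"},
    # covering the overlapped 10-minute window that ends at window_end
    starts = [t for t, kind in events if kind == "start"]
    alerts = []
    for stop in (t for t, kind in events if kind == "stop"):
        # skip Stops whose grace period has not fully elapsed yet;
        # the next overlapping run re-examines them, so nothing is missed
        if stop + GRACE > window_end:
            continue
        if not any(stop < s <= stop + GRACE for s in starts):
            alerts.append(stop)  # Stop with no Start inside the grace period
    return alerts
Because each Stop is only judged once its grace period fits entirely inside a window, the 10:28 Stop from the example is decided by the 10:35 run, which can already see the 10:31 Start.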

Azure Web Job logs

I want to check my WebJob app.
I am sending a queue message to 'blobsaddressqueueu' queue.
After a few seconds the message disappears from the queue - meaning it triggered the WebJob app.
Then I see the message in 'blobsaddressqueueu-poison' - meaning something went wrong in the process.
When I go to Log Stream (after I turn it on) under ParseMatanDataWebJob20170214032408, I do not see any changes and the only message I get in Log Stream is 'No new trace in the past 1 min(s)' and so on.
What am I doing wrong?
All I want to do is check the csv file (the queue message directs the WebJob to the blob container with the csv file) and watch the process when the csv file is read by the WebJob, so I can figure out why it goes to poison.
Maybe you could change your logging level in the diagnostics logs. If your level is right and you still cannot see the logs, you can go to D:\home\LogFiles\SiteExtensions\Kudu in Kudu to check the detailed log file.
I suggest checking the running logs; you can get them in the portal as the pic shows. You can also get the log file in Kudu at data/jobs/continuous/jobName.
You can still add trace message logging in a WebJob; for the details, refer to this article.
If you still have other questions, please let me know.
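If the goal is simply to see why messages land in poison, one more option is to peek at the poison queue itself. A minimal sketch using the azure-storage-queue (v12) Python SDK; the environment variable name is an assumption:
import os
from azure.storage.queue import QueueClient

conn = os.environ["AZURE_STORAGE_CONNECTION_STRING"]  # assumed env var
poison = QueueClient.from_connection_string(conn, "blobsaddressqueueu-poison")

for msg in poison.peek_messages(max_messages=5):  # peek does not dequeue
    # the payload is the original queue message (here, the blob address of
    # the csv file), which shows exactly what the WebJob failed on
    print(msg.content)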

QRadar SIEM AIO v7.3.0 manually added Logsources are showing status N/A

After the QRadar deployment, some of the log sources were autodiscovered as expected, but those that QRadar did not discover automatically I added manually in Admin -> Log Sources using the Bulk option.
All of them were added successfully, but they are still showing their Status as N/A. Even the log sources with status N/A are appearing on the Assets tab.
I have also checked that their logs are appearing in the Log Activity tab. Is it a known issue that the status does not show Success on v7.3.0 even after QRadar receives logs?
Thanks in Advance
Check the Log Source Identifier: is it a hostname or an IP? You should enter the hostname if there is a hostname after the time information in the log; likewise, you should enter the IP if there is an IP after the time information in the log. After that, disable and re-enable the log source and wait a few minutes; it should then show Success.
For example:
Apr 10 17:35:25 127.0.0.1 [Thread-62] com.q1labs.hostcontext.health.Agent: [INFO] ...
Here you would enter 127.0.0.1 as the Log Source Identifier.
I hope this information helps you.
If you are seeing logs from the sources showing as N/A, this is a known issue. If memory serves me, this is pretty common for Cisco eStreamer protocol devices.

Logstash should log only grok parsed messages

Currently I have an ELK stack in which logs are shipped by Filebeat and, after some filters in Logstash, forwarded to ES. As there are many servers, a huge volume of logs is coming into Logstash, but I have configured the filter to process only a very specific type of log message, which it is doing fine. However, even the logs that do not match are written to the logstash.log file. With so many logs coming in, logstash.log quickly grows very large and a disk space issue comes up. How do I configure Logstash so that only the processed logs are logged, and not everything?
You could use logrotate to rotate automatically, either on a daily basis or once the file hits a certain size threshold, and set the number of rotations to 1 or 2. This would give you time to see what is going into the file in case you need to troubleshoot, but purge it before it creates space contention.
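For instance, a minimal logrotate sketch along those lines (the path to logstash.log is an assumption; adjust it to your install):
/var/log/logstash/logstash.log {
    # rotate daily and keep only 2 old copies, as suggested above
    daily
    rotate 2
    compress
    missingok
    notifempty
    # copy-then-truncate, since Logstash keeps the file handle open
    copytruncate
}
copytruncate matters here: without it, Logstash would keep writing to the renamed file through its open handle.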

syslog question

I am looking into syslog.
I understand that it is a centralized logging facility that collects logs from various sources.
I have heard that syslog can generate alerts on conditions e.g. max file size of log file is reached.
Is this true?
Because I haven't found how this is done.
Most posts just refer to the logging.
How is the event generation done?
I.e., if I have an app that acts as a log source (redirects its logging to syslog), is it possible for my app to receive an alert if the max file size has been reached?
How is this configured?
Thank you!
From the application perspective, the syslog function is primarily a receiver of information from the application; the application can write messages to the syslog. There are various bits of information that the application provides to the syslog daemon, including the severity of the message.
The syslog daemon can be configured to take different actions on receipt of different types of message.
No, your application cannot receive an alert when the maximum file size is reached - at least, not via syslog. You might get a SIGXFSZ signal which you can trap. You might prefer to look at your resource limits and keep tabs on your file size to avoid the problem.
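For the application side, a minimal Python sketch of writing messages with a severity to the local syslog daemon (the /dev/log socket path is the usual Linux default):
import logging
from logging.handlers import SysLogHandler

logger = logging.getLogger("myapp")
logger.setLevel(logging.INFO)
# /dev/log is the local syslog socket on most Linux systems
logger.addHandler(SysLogHandler(address="/dev/log"))

logger.info("application started")      # mapped to syslog severity "info"
logger.error("disk quota nearly full")  # mapped to syslog severity "err"
The daemon's configuration (e.g. rules in /etc/rsyslog.conf) then decides what action each severity triggers.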
