I have installed the ELK stack on a CentOS 7 VM. Now I want to analyze a folder of logs related to authentication failures on a supercomputer that runs Slurm, and I need help understanding how to analyze these log files.
Related
I am just looking for a UI-based tool to monitor Cassandra's system.log so that we can analyze and extract errors efficiently. If there is one, please let me know the steps to configure it.
As usual, people use the ELK stack - Elasticsearch, Logstash, Kibana - where Kibana is the UI.
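A minimal way to get system.log into that stack is to run Filebeat on the Cassandra node and ship directly to Elasticsearch, then browse the index in Kibana. This is only a sketch: the log path, multiline pattern, and Elasticsearch address are assumptions you would adjust for your own install.

    # filebeat.yml (illustrative; paths and hosts are assumptions)
    filebeat.inputs:
      - type: log
        paths:
          - /var/log/cassandra/system.log    # typical Cassandra log location; adjust if yours differs
        multiline.pattern: '^(INFO|WARN|ERROR|DEBUG|TRACE)'    # join stack traces to the line that starts them
        multiline.negate: true
        multiline.match: after

    output.elasticsearch:
      hosts: ["http://localhost:9200"]

Once documents are flowing, create an index pattern for the Filebeat index in Kibana and filter on ERROR and WARN entries there.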
My Apache Spark application handles giant RDDs and generates event logs that are exposed through the History Server.
How can I export these logs and import them to another computer to view them through History Server UI?
My cluster runs Windows 10, and for some reason, with this OS, the log files don't load if they weren't generated on the machine itself. Using another OS like Ubuntu, I was able to view the History Server's logs in the browser.
While applications are running, Spark writes events to the spark.eventLog.dir (for example HDFS: hdfs://namenode/shared/spark-logs) configured in spark-defaults.conf.
These are then read by the Spark History Server based on the spark.history.fs.logDirectory setting.
Both settings need to point to the same directory, and the Spark History Server process must have permission to read those files.
The event log directory contains one JSON file per application, and you can access those files with the appropriate filesystem commands.
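To move the logs to another machine and replay them there, copy the event log files out of the configured directory and point that machine's History Server at the copy. A sketch, assuming the HDFS path from spark-defaults.conf above and a local target directory of your choosing:

    # spark-defaults.conf on the cluster that runs the jobs
    spark.eventLog.enabled           true
    spark.eventLog.dir               hdfs://namenode/shared/spark-logs
    spark.history.fs.logDirectory    hdfs://namenode/shared/spark-logs

    # copy the per-application event log files out of HDFS
    hdfs dfs -get /shared/spark-logs ./spark-logs

    # on the viewing machine: point its History Server at the copied directory
    # (spark-defaults.conf there)
    spark.history.fs.logDirectory    file:///path/to/spark-logs

    # then start the History Server and open the UI (default port 18080)
    ./sbin/start-history-server.sh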
How do I configure Filebeat to read Apache Spark application logs? The generated logs are moved to the History Server in a non-readable format as soon as the application completes. What is the ideal way to handle this?
You can configure Spark logging via Log4J. For a discussion around some edge cases for setting up log4j configuration, see SPARK-16784, but if you simply want to collect all application logs coming off a cluster (vs logs per job) you shouldn't need to consider any of that.
On the ELK side, there was a log4j input plugin for Logstash, but it is deprecated.
Thankfully, the documentation for the deprecated plugin describes how to configure log4j to write data locally for Filebeat, and how to set up Filebeat to consume this data and send it to a Logstash instance. This is now the recommended way to ship logs from systems using log4j.
So in summary, the recommended way to get logs from Spark into ELK is:
Set the log4j configuration for your Spark cluster to write to local files
Run Filebeat to consume those files and send them to Logstash
Logstash will send the data into Elasticsearch
You can search through your indexed log data using Kibana
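Put concretely, the pipeline might look like the sketch below. The file locations and hostnames are assumptions, and the log4j.properties syntax is for the log4j 1.x used by older Spark releases; adjust accordingly if your Spark version ships with log4j2.

    # $SPARK_HOME/conf/log4j.properties - write logs to a local file
    log4j.rootCategory=INFO, file
    log4j.appender.file=org.apache.log4j.RollingFileAppender
    log4j.appender.file.File=/var/log/spark/spark.log
    log4j.appender.file.MaxFileSize=50MB
    log4j.appender.file.MaxBackupIndex=5
    log4j.appender.file.layout=org.apache.log4j.PatternLayout
    log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

    # filebeat.yml on each Spark node - tail those files and forward to Logstash
    filebeat.inputs:
      - type: log
        paths:
          - /var/log/spark/*.log

    output.logstash:
      hosts: ["logstash-host:5044"]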
I have multiple microservices written in Node.js; the microservices run in Docker containers and we are using Mesos + Marathon for clustering.
How can I aggregate the logs of all the containers (microservices) running on different instances?
We're using Docker + Mesos as well and are shipping all logs to a log analytics service (it's the service the company I work for offers, http://logz.io). There are a couple of ways to achieve that:
Have a log shipper agent within each Docker container - an agent like rsyslog, nxlog, logstash, or logstash-forwarder - that ships data to a central logging solution
Run a dedicated container with the shipper agent (rsyslog, nxlog, logstash, logstash-forwarder) that reads the logs of all containers on each machine and ships them to a central location - this is the path we're taking (see the sketch below)
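For the second approach, a modern equivalent of logstash-forwarder is Filebeat, which can read the Docker JSON log files on each host. A sketch, assuming the standard Docker log location and the official Filebeat image; the hostnames and version tag are placeholders:

    # filebeat.yml mounted into the shipper container
    filebeat.inputs:
      - type: container
        paths:
          - /var/lib/docker/containers/*/*.log

    output.logstash:
      hosts: ["logstash.example.com:5044"]

    # run one shipper container per Mesos agent/host
    docker run -d --name filebeat \
      -v /var/lib/docker/containers:/var/lib/docker/containers:ro \
      -v $(pwd)/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro \
      docker.elastic.co/beats/filebeat:8.13.4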
This is a broad question, but I suggest you set up an Elasticsearch, Logstash, Kibana (ELK) stack:
https://www.elastic.co/products/elasticsearch
https://www.elastic.co/products/logstash
https://www.elastic.co/products/kibana
Then, on each one of your containers, you can run the Logstash forwarder/shipper to send logs to your Logstash frontend.
Logs get stored in Elasticsearch, and you can then search them using Kibana or the Elasticsearch API.
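On the receiving side, the Logstash pipeline only needs an input for the shippers and an output to Elasticsearch. A minimal sketch, assuming shippers that speak the Beats protocol on port 5044 and a local Elasticsearch; the index name is just an example:

    # logstash.conf
    input {
      beats {
        port => 5044
      }
    }

    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "microservices-%{+YYYY.MM.dd}"
      }
    }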
Hope it helps.
I am also doing some Docker + Mesos + Marathon work, so I guess I am going to face the same question you have.
I don't know if there's any native solution yet. But there's a blog by the folks at elastic.io on how they went about solving this issue.
Here's the link - Log aggregation for Docker containers in Mesos / Marathon cluster
When the application is running, I am able to view the log from the RM UI. But after the application exits, I get this message when trying to view the log:
Failed while trying to construct the redirect url to the log server. Log Server url may not be configured
java.lang.Exception: Unknown container. Container either has not started or has already completed or doesn't belong to this node at all.
I looked around my HDInsight storage but I could not find any log file.
In case you are using YARN for your Spark execution, you can use its built-in log aggregation.
According to the official Spark documentation:
If log aggregation is turned on (with the yarn.log-aggregation-enable config), container logs are copied to HDFS and deleted on the local machine. These logs can be viewed from anywhere on the cluster with the “yarn logs” command.
HDInsight clusters support this type of logging. In order to access them, the command below can be used from a command line:
yarn logs -applicationId <app ID>
To identify the application ID, you might want to open the Hadoop user interface and look for the All Applications section:
Note: In order to output the entire log into a file, you might want to append > TextFile.txt to the above command.
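For completeness, a sketch of the full sequence from the command line - the application ID shown is just an illustrative placeholder; use the one reported for your job:

    # list completed applications to find the ID (same information as the All Applications page)
    yarn application -list -appStates FINISHED

    # dump the aggregated logs for one application into a file
    yarn logs -applicationId application_1234567890123_0042 > TextFile.txt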