When I use the command
set <column-family-name>[timeuuid()][utf8(name)] = utf8(value);
it gives me this error:
no appenders could be found please initialize log4j system properly.
The error/message is not related to TimeUUID. Cassandra uses Log4j as its logging library, and Log4j needs a log4j.properties file to configure which messages are logged and where they go. Put a simple log4j.properties file in the correct place (on the classpath) and you should see the actual errors/messages printed by Cassandra.
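As a rough starting point, a minimal log4j.properties might look like the sketch below. It assumes the log4j 1.x properties syntax and simply prints everything at INFO and above to the console; the appender name stdout and the output pattern are arbitrary choices for illustration, not something taken from the Cassandra distribution.

```properties
# Minimal log4j 1.x configuration (illustrative sketch, not Cassandra's shipped file)
# Root logger: level INFO, one appender called "stdout" (the name is arbitrary)
log4j.rootLogger=INFO, stdout

# Console appender writing to standard output
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout

# timestamp, level, thread, logger name, message, newline
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c: %m%n
```

With log4j 1.x the file is normally picked up from the root of the classpath, or it can be pointed to explicitly with the log4j.configuration system property.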
How can I configure Filebeat to read Apache Spark application logs? The generated logs are moved to the history server, in a non-readable format, as soon as the application completes. What is the ideal approach here?
You can configure Spark logging via Log4J. For a discussion around some edge cases for setting up log4j configuration, see SPARK-16784, but if you simply want to collect all application logs coming off a cluster (vs logs per job) you shouldn't need to consider any of that.
On the ELK side, there was a log4j input plugin for logstash, but it is deprecated.
Thankfully, the documentation for the deprecated plugin describes how to configure log4j to write data locally for FileBeat, and how to set up FileBeat to consume this data and send it to a Logstash instance. This is now the recommended way to ship logs from systems using log4j.
So in summary, the recommended way to get logs from Spark into ELK is:
Set the Log4J configuration for your Spark cluster to write to local files
Run FileBeat to consume from these files and send them to Logstash
Logstash will send data into Elasticsearch
You can search through your indexed log data using Kibana
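The first step above could look roughly like the following sketch. It assumes a Spark version that still uses the log4j 1.x properties syntax (newer Spark releases that ship Log4j 2 use a different file format), and the log file path, size limit, and backup count are hypothetical values chosen only for illustration.

```properties
# Sketch of a log4j 1.x configuration that writes Spark logs to a local file
# for FileBeat to pick up. The path and limits below are assumptions.
log4j.rootLogger=INFO, file

# Rolling file appender: rotates when the file reaches MaxFileSize
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=/var/log/spark/spark-app.log
log4j.appender.file.MaxFileSize=50MB
log4j.appender.file.MaxBackupIndex=5

# One event per line keeps parsing on the Logstash side simple
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c: %m%n
```

FileBeat would then be pointed at the same path (here /var/log/spark/spark-app.log) as its input. Stack traces span several lines, so some multiline handling is usually needed on the FileBeat or Logstash side.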
I have a Spark Streaming application, and I want to analyse the job's logs using Elasticsearch and Kibana. My job runs on a YARN cluster, so the logs are written to HDFS, as I have set yarn.log-aggregation-enable to true. But when I try to do this:
hadoop fs -cat ${yarn.nodemanager.remote-app-log-dir}/${user.name}/logs/<application ID>
I see what looks like encrypted/compressed data. What file format is this? How can I read the logs from this file? Can I use Logstash to read it?
Also, if there is a better approach to analyse Spark logs, I am open to your suggestions.
Thanks.
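The format is called a TFile, and it is a compressed file format. To read the aggregated logs as plain text, you can use the YARN command line instead of hadoop fs -cat: yarn logs -applicationId <application ID>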
Yarn however chooses to write the application logs into a TFile!! For those of you who don’t know what a TFile is (and I bet a lot of you don’t), you can learn more about it here, but for now this basic definition should suffice “A TFile is a container of key-value pairs. Both keys and values are type-less bytes”.
Splunk / Hadoop Rant
There may be a way to edit YARN's and Spark's log4j.properties to send messages to Logstash using SocketAppender.
However, that method is being deprecated.
I'm using Hazelcast as a library in my program.
I don't want Hazelcast to print all its output to the console when adding a node or calling newHazelcastInstance.
It should just add the node in the background without printing anything. How can I achieve this?
It depends on the logging framework you are using and on how you have configured Hazelcast.
For example, if you are using log4j and have selected the log4j logger in Hazelcast, then you can filter out the Hazelcast log entries.
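A fragment along these lines would do that filtering. It is only a sketch, assuming log4j 1.x and that Hazelcast has been told to log through it (for example by setting the hazelcast.logging.type property to log4j); the appender name stdout and the pattern are arbitrary.

```properties
# Assumes Hazelcast is started with -Dhazelcast.logging.type=log4j
log4j.rootLogger=INFO, stdout

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ISO8601} %-5p %c: %m%n

# Only let warnings and errors through for everything under com.hazelcast
log4j.logger.com.hazelcast=WARN
```

Setting the level to OFF instead of WARN would silence Hazelcast completely, at the cost of also hiding its error messages.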
I was trying to insert log4j logs from a Java application into Cassandra. I got the configuration for the log4j properties from http://www.datastax.com/docs/datastax_enterprise2.0/logging/log4j_logging.
I was not able to get the com.datastax.logging.appender.CassandraAppender. Can anybody let me know where to get the Cassandra appender, or is there a way we can integrate log4j and Cassandra?
The log4j appender is part of DataStax Enterprise; it's not included in Apache Cassandra by itself. If you're already using DataStax Enterprise, make sure that you've added path/to/dse/resources/log4j-appender/lib/ to the classpath for your application.
I have a distributed system whose components all use log4j to log system events. I have a Cassandra cluster in which I want to store all of my logs in log4j format.
Is there any open source program that integrates logs in log4j format to my Cassandra cluster?
A good choice would be Apache Flume. https://cwiki.apache.org/FLUME/
Flume is a distributed log collection engine. It has built-in support for log4j: http://archive.cloudera.com/cdh/3/flume/UserGuide/#_logging_via_log4j_directly
There is also a plugin for using Cassandra as the log storage mechanism instead of HDFS: https://github.com/thobbs/flume-cassandra-plugin
I started an open source project with a custom log4j appender that sends messages to an Apache Cassandra cluster. The project site is: https://github.com/rviper/cassandra-log4j