How to find the HADOOP_HOME path on Linux?

I am trying to compile the Java code below on a Hadoop server:
javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d wordcount_classes WordCount.java
but I am not able to locate ${HADOOP_HOME}. I tried hadoop classpath, but it gives the following output:
/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-0.20-mapreduce/./:/usr/lib/hadoop-0.20-mapreduce/lib/*:/usr/lib/hadoop-0.20-mapreduce/.//*
Does anyone have any idea about this?

Navigate to the path where Hadoop is installed and locate ${HADOOP_HOME}/etc/hadoop, e.g.
/usr/lib/hadoop-2.2.0/etc/hadoop
When you run ls in that folder you should see all of these files:
capacity-scheduler.xml httpfs-site.xml
configuration.xsl log4j.properties
container-executor.cfg mapred-env.cmd
core-site.xml mapred-env.sh
core-site.xml~ mapred-queues.xml.template
hadoop-env.cmd mapred-site.xml
hadoop-env.sh mapred-site.xml~
hadoop-env.sh~ mapred-site.xml.template
hadoop-metrics2.properties slaves
hadoop-metrics.properties ssl-client.xml.example
hadoop-policy.xml ssl-server.xml.example
hdfs-site.xml yarn-env.cmd
hdfs-site.xml~ yarn-env.sh
httpfs-env.sh yarn-site.xml
httpfs-log4j.properties yarn-site.xml~
httpfs-signature.secret
Core environment settings are in hadoop-env.sh.
You can see the classpath settings in this file; I have copied a sample here for your reference.
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_67
# The jsvc implementation to use. Jsvc is required to run secure datanodes.
#export JSVC_HOME=${JSVC_HOME}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR}
# Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  export HADOOP_CLASSPATH=${HADOOP_CLASSPATH+$HADOOP_CLASSPATH:}$f
done
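As an aside, the hadoop classpath command already prints the full classpath (the long output quoted in the question), so one common shortcut, assuming the WordCount.java from the question, is to compile directly against it:
javac -classpath "$(hadoop classpath)" -d wordcount_classes WordCount.java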
Hope this helps!

The hadoop-core jar file is in the ${HADOOP_HOME}/share/hadoop/common directory, not in ${HADOOP_HOME} itself.
You can set the environment variable in your .bashrc file:
vim ~/.bashrc
Then add the following line to the end of the .bashrc file:
export HADOOP_HOME=/your/hadoop/installation/directory
Just replace the path with your Hadoop installation path.
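If you are not sure where Hadoop is installed in the first place, resolving the hadoop launcher's symlink usually reveals it. A minimal sketch, assuming a packaged install (the /usr/lib/hadoop paths are only examples):
which hadoop                         # e.g. /usr/bin/hadoop
readlink -f "$(which hadoop)"        # e.g. /usr/lib/hadoop/bin/hadoop
export HADOOP_HOME=/usr/lib/hadoop   # the directory above bin/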

Related

Problems with log4j2

I have two problems with log4j2:
1. I cannot figure out how to specify the root log level on the command line. I execute a runnable jar with a log4j2.xml config file:
java -Dlog4j.configurationFile=log4j2.xml -jar my.jar
The log4j2.xml has the root logger level set to INFO, but sometimes I need to specify DEBUG.
2. -Dlog4j.configurationFile=log4j2.xml does not work on Windows, though it works fine on Mac and Linux. log4j2.xml does exist in the current folder. I get this error when executing the above command line in Windows PowerShell:
Error: Could not find or load main class .configurationFile=log4j2.xml
I tried -Dlog4j.configurationFile=file://log4j2.xml, -Dlog4j.configurationFile=./log4j2.xml, and -Dlog4j.configurationFile=file://<full_path_to_log4j2.xml>; same error.
Define a property named "rootLevel" as
${sys:rootLevel:-INFO}
then on your root logger specify
<Root level="${rootLevel}">
The root level will now default to INFO, and you can override it with -DrootLevel=DEBUG on the command line.
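Put together, a minimal log4j2.xml sketch; the Console appender and pattern are illustrative, only the Property and Root lines matter here:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
  <Properties>
    <!-- defaults to INFO unless -DrootLevel=... is passed -->
    <Property name="rootLevel">${sys:rootLevel:-INFO}</Property>
  </Properties>
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{ISO8601} %-5level %logger - %msg%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <Root level="${rootLevel}">
      <AppenderRef ref="Console"/>
    </Root>
  </Loggers>
</Configuration>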
As for the PowerShell error, the problem is the dot in the property name: PowerShell splits the argument at the dot, so on Windows you need to put the whole option in quotes.
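For example, with the jar from the question:
java "-Dlog4j.configurationFile=log4j2.xml" -jar my.jar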

flyway commandline, How to log the messages in a log file during migration

This question is about logging the messages to a log file while migrating through the Flyway command line. I went through the links below on Stack Overflow and followed the steps mentioned, but still could not get the messages to appear in a log file:
How configure logging for Flyway command line
Flyway logging with log4j?
Flyway logging with Logback
I have placed log4j-1.2.17.jar, logback-classic-1.1.7.jar, logback-core-1.1.7.jar, and slf4j-api-1.7.21.jar under the flyway/lib folder, and placed logback.xml in the conf location (I also tried moving it out of the conf location).
lib is mentioned in the classpath in the flyway and flyway.cmd files.
But I always see the debug messages on stdout, and no log file is created.
Flyway version: 4.2.0
Could someone share the list of steps to write the log messages to a log file during migrate/info?
Keep logback.xml inside the conf directory, and edit the flyway launch script for your environment.
In the flyway shell script (Linux/macOS), replace the line
CP="$INSTALLDIR/lib/*:$INSTALLDIR/drivers/*"
with
CP="$INSTALLDIR/conf:$INSTALLDIR/lib/*:$INSTALLDIR/drivers/*"
In flyway.cmd (Windows), replace the line
%JAVA_CMD% -cp "%INSTALLDIR%\lib\*;%INSTALLDIR%\drivers\*" org.flywaydb.commandline.Main %*
with
%JAVA_CMD% -cp "%INSTALLDIR%\conf;%INSTALLDIR%\lib\*;%INSTALLDIR%\drivers\*" org.flywaydb.commandline.Main %*
Explanation: the conf directory is not declared on the classpath in the execution scripts, so it has to be added there for logback.xml to be readable from the classpath. Wherever you put the logback.xml file, its directory has to be declared on the classpath.
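For completeness, a minimal logback.xml sketch that writes to a file (the flyway.log file name is just an example):
<configuration>
  <appender name="FILE" class="ch.qos.logback.core.FileAppender">
    <file>flyway.log</file>
    <encoder>
      <pattern>%d{ISO8601} %-5level %logger - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="FILE"/>
  </root>
</configuration>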

Configure log4j for an external jar

In my project (myProject) I use an external jar (external.jar). Both of them log with log4j.jar. With the help of a log4j.properties file (located in myProject) I can configure logging from myProject. How can I configure the log levels of logging from external.jar without changing that jar file?
Simply adding the package from external.jar (say, org.external) to the property file as
log4j.logger.org.external=ERROR
does not make any difference.
I have found the solution here.
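For reference, a per-package level in log4j.properties is the standard mechanism, so if it has no effect, double-check that org.external really matches the logger names the jar uses (log4j 1.x loggers are usually named after the fully qualified class). A sketch with illustrative appender and package names:
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %-5p %c - %m%n
# raise the threshold for the external jar's loggers only
log4j.logger.org.external=ERROR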

Which directory contains third party libraries for Spark

When we use
spark-submit
which directory contains the third-party libraries that will be loaded on each of the slaves? I would like to scp one or more libraries to each of the slaves instead of shipping the contents in the application uber-jar.
Note: I did try adding to
$SPARK_HOME/lib_managed/jars
but spark-submit still results in a ClassNotFoundException for classes included in the added library.
Hope these points will help you.
$SPARK_HOME/lib/ contains the jar files.
$SPARK_HOME/bin/ contains the launch scripts: spark-submit, spark-class, pyspark, compute-classpath.sh, etc.
spark-submit calls spark-class, and spark-class internally calls compute-classpath.sh before executing/launching the job. compute-classpath.sh picks up the jars available in $SPARK_HOME/lib and adds them to the CLASSPATH (run ./compute-classpath.sh to see the jars from the lib dir that it returns).
So try these options:
Option 1: placing user-specific jars in $SPARK_HOME/lib/ will work.
Option 2: tweak compute-classpath.sh so that it can also pick up your jars from a user-specific jar directory.
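Alternatively, spark-submit can ship extra jars itself via its --jars flag, which distributes them to the executors without rebuilding the uber-jar. A sketch where the class name, master URL, and jar paths are all placeholders:
spark-submit \
  --class com.example.MyApp \
  --master spark://master:7077 \
  --jars /opt/libs/extra-lib1.jar,/opt/libs/extra-lib2.jar \
  my-app.jar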

WARN No appenders could be found for logger (org.apache.accumulo.start.classloader.AccumuloClassLoader)

Does anyone know how to get rid of the following warnings when starting accumulo:
log4j:WARN No appenders could be found for logger (org.apache.accumulo.start.classloader.AccumuloClassLoader).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
I am running accumulo 1.4.0, hadoop 0.20.2, and zookeeper 3.3.3. I understand this warning happens because the class cannot find the log4j.properties file, and yes, I have read http://logging.apache.org/log4j/1.2/manual.html. My log4j.properties file has the following lines, copied from an accumulo 1.4.3 log4j file (I don't have the option to upgrade my system to 1.4.3):
# default logging properties:
# by default, log everything at INFO or higher to the console
log4j.rootLogger=INFO,A1
# hide Jetty junk
log4j.logger.org.mortbay.log=WARN,A1
# hide "Got brand-new compressor" messages
log4j.logger.org.apache.hadoop.io.compress=WARN,A1
# hide junk from TestRandomDeletes
log4j.logger.org.apache.accumulo.server.test.TestRandomDeletes=WARN,A1
# hide almost everything from zookeeper
log4j.logger.org.apache.zookeeper=ERROR,A1
# hide AUDIT messages in the shell, alternatively you could send them to a different logger
log4j.logger.org.apache.accumulo.core.util.shell.Shell.audit=WARN,A1
# Send most things to the console
log4j.appender.A1=org.apache.log4j.ConsoleAppender
log4j.appender.A1.layout.ConversionPattern=%d{ISO8601} [%-8c{2}] %-5p: %m%n
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
I have put this log4j file everywhere: in the accumulo/bin folder, in the accumulo/conf folder, and in the accumulo/lib folder, but I cannot get rid of this warning (I know it has to go on the accumulo classpath, but I don't know where that is). I also can't pass a -Dlog4j.configuration option to the java command because the accumulo executable comes pre-built (I just run it).
Thanks in advance for the help.
EDIT: Below is the result of an "accumulo classpath" command on my system:
[admin-cloud@NODE1 bin]$ echo $ACCUMULO_HOME
/accumulo/accumulo-1.4.0
[admin-cloud@NODE1 bin]$ accumulo classpath
Accumulo List of classpath items are:
file:/accumulo/accumulo-1.4.0/lib/commons-collections-3.2.jar
file:/accumulo/accumulo-1.4.0/lib/commons-configuration-1.5.jar
file:/accumulo/accumulo-1.4.0/lib/log4j-1.2.16.jar
file:/accumulo/accumulo-1.4.0/lib/libthrift-0.6.1.jar
file:/accumulo/accumulo-1.4.0/lib/commons-jci-core-1.0.jar
file:/accumulo/accumulo-1.4.0/lib/commons-lang-2.4.jar
file:/accumulo/accumulo-1.4.0/lib/commons-logging-api-1.0.4.jar
file:/accumulo/accumulo-1.4.0/lib/accumulo-server-1.4.0.jar
file:/accumulo/accumulo-1.4.0/lib/accumulo-start-1.4.0.jar
file:/accumulo/accumulo-1.4.0/lib/commons-jci-fam-1.0.jar
file:/accumulo/accumulo-1.4.0/lib/jline-0.9.94.jar
file:/accumulo/accumulo-1.4.0/lib/examples-simple-1.4.0.jar
file:/accumulo/accumulo-1.4.0/lib/cloudtrace-1.4.0.jar
file:/accumulo/accumulo-1.4.0/lib/commons-logging-1.0.4.jar
file:/accumulo/accumulo-1.4.0/lib/accumulo-core-1.4.0.jar
file:/accumulo/accumulo-1.4.0/lib/commons-io-1.4.jar
file:/zookeeper/zookeeper-3.3.6/zookeeper-3.3.6.jar
file:/hadoop/hadoop-0.20.2/conf/
file:/hadoop/hadoop-0.20.2/hadoop-0.20.2-examples.jar
file:/hadoop/hadoop-0.20.2/hadoop-0.20.2-test.jar
file:/hadoop/hadoop-0.20.2/hadoop-0.20.2-tools.jar
file:/hadoop/hadoop-0.20.2/hadoop-0.20.2-ant.jar
file:/hadoop/hadoop-0.20.2/hadoop-0.20.2-core.jar
file:/hadoop/hadoop-0.20.2/lib/log4j-1.2.15.jar
file:/hadoop/hadoop-0.20.2/lib/jasper-runtime-5.5.12.jar
file:/hadoop/hadoop-0.20.2/lib/slf4j-log4j12-1.4.3.jar
file:/hadoop/hadoop-0.20.2/lib/commons-httpclient-3.0.1.jar
file:/hadoop/hadoop-0.20.2/lib/mockito-all-1.8.0.jar
file:/hadoop/hadoop-0.20.2/lib/jetty-6.1.14.jar
file:/hadoop/hadoop-0.20.2/lib/oro-2.0.8.jar
file:/hadoop/hadoop-0.20.2/lib/servlet-api-2.5-6.1.14.jar
file:/hadoop/hadoop-0.20.2/lib/junit-3.8.1.jar
file:/hadoop/hadoop-0.20.2/lib/commons-logging-api-1.0.4.jar
file:/hadoop/hadoop-0.20.2/lib/commons-codec-1.3.jar
file:/hadoop/hadoop-0.20.2/lib/core-3.1.1.jar
file:/hadoop/hadoop-0.20.2/lib/jets3t-0.6.1.jar
file:/hadoop/hadoop-0.20.2/lib/hsqldb-1.8.0.10.jar
file:/hadoop/hadoop-0.20.2/lib/slf4j-api-1.4.3.jar
file:/hadoop/hadoop-0.20.2/lib/jasper-compiler-5.5.12.jar
file:/hadoop/hadoop-0.20.2/lib/jetty-util-6.1.14.jar
file:/hadoop/hadoop-0.20.2/lib/commons-net-1.4.1.jar
file:/hadoop/hadoop-0.20.2/lib/commons-logging-1.0.4.jar
file:/hadoop/hadoop-0.20.2/lib/commons-cli-1.2.jar
file:/hadoop/hadoop-0.20.2/lib/xmlenc-0.52.jar
file:/hadoop/hadoop-0.20.2/lib/kfs-0.2.2.jar
file:/hadoop/hadoop-0.20.2/lib/commons-el-1.0.jar
Line 84 of bin/accumulo in Apache Accumulo 1.4.0 sets the variable XML_FILES to $ACCUMULO_HOME/conf and then adds XML_FILES to the CLASSPATH variable, which is later passed to the java command.
https://svn.apache.org/repos/asf/accumulo/tags/1.4.0/bin/accumulo
Notice that no file:/accumulo/accumulo-1.4.0/conf/ entry shows up in your classpath listing above, so it sounds like you have a misconfiguration of ACCUMULO_HOME, either through your shell environment or in $ACCUMULO_HOME/conf/accumulo-env.sh.
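A quick sanity check along those lines (XML_FILES is the variable named above; the exact line number may vary):
echo $ACCUMULO_HOME                              # should print your install dir
grep -n XML_FILES "$ACCUMULO_HOME/bin/accumulo"  # confirm conf/ gets added to CLASSPATH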
I was troubleshooting an installation someone else set up that was having the same problem. My solution was simply that there actually was no log4j.properties in the conf directory! So I just copied one of the log4j.properties files from the conf/examples directory, restarted, and everything worked like it should.
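In shell terms that fix amounts to something like the following; the 512MB/standalone profile is only an example, pick whichever directory under conf/examples matches your setup:
cd $ACCUMULO_HOME/conf
ls log4j.properties                               # "No such file or directory" means this is your problem
cp examples/512MB/standalone/log4j.properties .   # copy from an example profile, then restart accumulo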
