I'm completely new to the Hadoop framework, and I've only been using Linux for the past few months. After installing Hadoop to the /usr/local directory, I tried to run the hadoop command in the CLI and it responded with "hadoop: command not found". I then figured out that the environment variables weren't set, so I set them with the following commands:
export HADOOP_HOME=/usr/local/hadoop/
export PATH=$PATH:$HADOOP_HOME/bin/
It worked. I know what an environment variable is, but my question is: how does the shell resolve the hadoop command using the HADOOP_HOME variable?
HADOOP_HOME does nothing when you type the hadoop command (or anything else in $HADOOP_HOME/bin, for that matter).
$PATH is where all the commands you type into the terminal are looked up.
Just echo $PATH and you'll see all the folders.
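For example, a quick way to see which file the shell actually runs (the output below assumes the paths from the question; yours will differ):

type hadoop
hadoop is /usr/local/hadoop/bin/hadoop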
HADOOP_HOME is looked for by the hadoop components themselves, and isn't Linux specific.
The HADOOP_HOME variable is used by shell scripts like yarn-config.sh and mapred-config.sh; that is why setting HADOOP_HOME is required, so that when those config scripts access it they can reach the main Hadoop folder.
If you do not want to define HADOOP_HOME, then you need to edit those config scripts, replacing HADOOP_HOME with the required directory path.
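As a rough illustration, those scripts rely on the variable along these lines (a minimal sketch, not the verbatim contents of yarn-config.sh):

# Sketch: fail early if HADOOP_HOME is unset, otherwise resolve
# paths relative to it.
if [ -z "$HADOOP_HOME" ]; then
  echo "HADOOP_HOME is not set" >&2
  exit 1
fi
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-$HADOOP_HOME/etc/hadoop}"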
I use Windows 10.
My Node.js path is C:\Program Files\nodejs\node.exe and I can use the node command.
But I haven't set my environment variable path.
It's not just that I haven't set it myself: I checked the user environment variables and the system environment variables, but could not find it there. (The path of npm was in the user environment variables.)
Why can I use node command without setting path?
For the node command to work in Windows from a command shell, one of the following must be true:
Your current directory in the command shell is C:\Program Files\nodejs and thus node.exe or node.bat can be found in that current directory.
C:\Program Files\nodejs is in the search path which can be either a system wide path setting or a local user path setting (what you see in the environment is a combination of those two).
There is a node.bat file somewhere in your system path or in the current directory that launches node.exe for you by directly referencing its path.
On Windows, you can type "where node" in the command shell and it will tell you where it's finding the file to run. If what it finds is not in the current directory, then you must have its directory in your path somewhere.
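For example (the output below assumes the path from the question):

where node
C:\Program Files\nodejs\node.exe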
Can someone please explain, what is ANT_HOME?
In our environment it's set like this:
-bash-3.2$ echo $ANT_HOME
/mhfidm01/apps/oracle/middleware/modules/org.apache.ant.patch_1.2.0.0_1-7-1.jar
It's the location where Apache Ant is installed.
Its value should be a directory path, not a JAR file.
From the Ant manual:
Set the ANT_HOME environment variable to the directory where you installed Ant. On some operating systems, Ant's startup scripts can guess ANT_HOME (Unix dialects and Windows NT/2000), but it is better to not rely on this behavior.
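So in this case ANT_HOME should point at the Ant install directory rather than at that JAR. A hedged example, since the actual install directory on this host is unknown:

export ANT_HOME=/path/to/apache-ant
export PATH=$PATH:$ANT_HOME/bin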
I am trying to build a hadoop cluster with four nodes.
The four machines are from my school's lab, and I found that their /usr/local directories are mounted from the same shared disk, which means their /usr/local contents are identical.
The problem is, I cannot start the DataNode on the slaves because the Hadoop files are always the same (like tmp/dfs/data).
I am planning to configure and install Hadoop in another directory, like /opt.
The problem is that almost all the installation tutorials tell us to install it in /usr/local, so I was wondering: will there be any bad consequences if I install Hadoop somewhere else, like /opt?
Btw, I am using Ubuntu 16.04
As long as HADOOP_HOME points to where you extracted the Hadoop binaries, it shouldn't matter.
You'll also want to update PATH in ~/.bashrc, for example.
export HADOOP_HOME=/path/to/hadoop_x.yy
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
For reference, I have some configuration files inside of /etc/hadoop.
(Note: Apache Ambari makes installation easier)
It is not at all necessary to install Hadoop under /usr/local. That location is generally used for single-node Hadoop clusters (although even then it is not mandatory). As long as you have the following variables specified in .bashrc, any location should work.
export HADOOP_HOME=<path-to-hadoop-install-dir>
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
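For example, a sketch of an /opt install (the Hadoop version and archive name below are assumed; substitute whatever you downloaded):

sudo tar -xzf hadoop-2.7.3.tar.gz -C /opt
export HADOOP_HOME=/opt/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Since /opt here is local to each node rather than a shared mount, each machine gets its own tmp/dfs/data, which avoids the DataNode clash described in the question.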
I am not able to find the CASSANDRA_HOME variable being set anywhere in the Cassandra installation path.
I could guess that it is my Cassandra installation directory, because the log files are created in installed_dir/logs.
Where can I find CASSANDRA_HOME being set?
You haven't provided a lot of information but I'll try and answer.
CASSANDRA_HOME is set in cassandra.in.sh, or in cassandra.bat if you are running on Windows. If CASSANDRA_HOME isn't set, the script sets it to the parent of the directory that the script is running in.
I'm assuming that you are running from a tarball installation, since you say that the log files end up under your install directory; hence your bin directory is directly under the install directory.
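For reference, the fallback logic in cassandra.in.sh looks roughly like this (a paraphrased sketch, not the verbatim script):

# If CASSANDRA_HOME wasn't set in the environment, derive it from
# this script's location (the parent of the bin directory).
if [ -z "$CASSANDRA_HOME" ]; then
    CASSANDRA_HOME="$(dirname "$0")/.."
fi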
I need to update Apache Ant on my server.
I downloaded the newest Ant, built it, and (I thought) installed it. But when I check, it says the old version is still installed.
How do I update/replace the previous version of Apache Ant on a CentOS 5.? server?
take care,
lee
As mentioned, it's probably getting picked up in your path. Post the output of echo $PATH.
To configure CentOS after installing a new version of Apache Ant, follow these steps:
Locate the directory where the new Ant is located
Set the ANT_HOME environment variable to this directory
Add $ANT_HOME/bin to your PATH
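For example (the install directory below is assumed; use wherever you unpacked the new Ant):

export ANT_HOME=/usr/local/apache-ant
export PATH=$ANT_HOME/bin:$PATH

Putting $ANT_HOME/bin before the existing $PATH ensures the new Ant shadows the old version that keeps getting picked up; run ant -version afterwards to confirm.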
P.S. To modify environment variables, you may edit the /etc/environment file, and reboot, or modify your local .bashrc. Look at your current environment variables by analyzing the output of printenv, e.g., to see the current value of PATH and then add the Ant path to it, e.g.
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/jvm/adoptopenjdk-8-hotspot-amd64/bin:/usr/local/ant/bin