I am learning Hadoop (2.7.1) and configuring it on Ubuntu (15.04). I created a separate user for Hadoop to isolate the Hadoop file system from the Linux file system, but when I try to use sudo as this hadoop user I get an error:
hadoop is not in the sudoers file. This incident will be reported.
Should this user be in the sudoers file? In which cases should I work as the hadoop user, and in which as root?
No, the hadoop user should not be (and need not be) in the sudoers file.
As you have said, to isolate Hadoop-related operations from your local operations, you should use specific users for specific purposes.
You should use your normal Linux user (or the root user) for, say, installing the Linux packages needed for Hadoop, e.g. OpenSSH, Java, etc.
You should use the hadoop user for Hadoop-related operations, e.g. starting the cluster, using HDFS, running MR programs, etc.
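For example, a typical session as the hadoop user might look like the sketch below (the sample input file local-data.txt and exact paths are illustrative assumptions):
# Switch to the dedicated hadoop user for day-to-day Hadoop work
su - hadoop
# Start HDFS and YARN (scripts live under $HADOOP_HOME/sbin in Hadoop 2.x)
$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh
# Use HDFS and run an example MapReduce job
hdfs dfs -mkdir -p /user/hadoop/input
hdfs dfs -put local-data.txt /user/hadoop/input/
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /user/hadoop/input /user/hadoop/output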
Hope this helps!
In order to test and learn Spark functions, developers need the latest version of Spark, since the APIs and methods from before version 2.0 are obsolete and no longer work in newer releases. This poses a bigger challenge: developers are forced to install Spark manually, which wastes a considerable amount of development time.
How do I use a later version of Spark on the Quickstart VM?
So that no one else wastes the setup time I did, here is the solution.
SPARK 2.2 Installation Setup on Cloudera VM
Step 1: Download a quickstart_vm from the link:
Prefer the VMware platform as it is easy to use; in any case, all of the options are viable.
The entire tar file is around 5.4 GB. You need to provide a business email address, as the download won't accept personal email addresses.
Step 2: The virtual environment requires around 8 GB of RAM; please allocate sufficient memory to avoid performance glitches.
Step 3: Please open the terminal and switch to root user as:
su root
password: cloudera
Step 4: Cloudera provides Java version 1.7.0_67, which is old and does not match our needs. To avoid Java-related exceptions, please install a newer Java with the following commands:
Downloading Java:
wget -c --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz
Switch to /usr/java/ directory with “cd /usr/java/” command.
Copy the downloaded Java tar file to the /usr/java/ directory.
Untar the archive with "tar -zxvf jdk-8u131-linux-x64.tar.gz"
Open the profile file with the command “vi ~/.bash_profile”
Export JAVA_HOME pointing to the new Java directory:
export JAVA_HOME=/usr/java/jdk1.8.0_131
Save and Exit.
For the above change to take effect, the following command needs to be executed in the shell:
source ~/.bash_profile
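Optionally (a small sketch, assuming the JDK was extracted to /usr/java/jdk1.8.0_131 as above), also put the JDK's bin directory on the PATH and confirm the new version is picked up:
export PATH=$JAVA_HOME/bin:$PATH
# Verify the new JDK is in use; it should report 1.8.0_131
java -version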
The Cloudera VM provides Spark 1.6 by default. However, the 1.6 APIs are old and do not match production environments, so we need to download and manually install Spark 2.2.
Switch to /opt/ directory with the command:
cd /opt/
Download spark with the command:
wget https://d3kbcqa49mib13.cloudfront.net/spark-2.2.0-bin-hadoop2.7.tgz
Untar the spark tar with the following command:
tar -zxvf spark-2.2.0-bin-hadoop2.7.tgz
We need to define some environment variables as default settings:
Please open a file with the following command:
vi /opt/spark-2.2.0-bin-hadoop2.7/conf/spark-env.sh
Paste the following configurations in the file:
SPARK_MASTER_IP=192.168.50.1
SPARK_EXECUTOR_MEMORY=512m
SPARK_DRIVER_MEMORY=512m
SPARK_WORKER_MEMORY=512m
SPARK_DAEMON_MEMORY=512m
Save and exit
We need to start spark with the following command:
/opt/spark-2.2.0-bin-hadoop2.7/sbin/start-all.sh
Export SPARK_HOME:
export SPARK_HOME=/opt/spark-2.2.0-bin-hadoop2.7/
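So that the spark-shell command in the last step resolves without typing the full path, you can also add Spark's bin directory to the PATH (a small sketch; add both exports to ~/.bash_profile and source it to make them permanent):
export PATH=$PATH:$SPARK_HOME/bin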
Change the permissions of the directory:
chmod 777 -R /tmp/hive
Try “spark-shell”, it should work.
I am trying to build a hadoop cluster with four nodes.
The four machines are from my school's lab, and I found that their /usr/local directories are mounted from the same shared disk, which means their /usr/local contents are identical.
The problem is that I cannot start the DataNode on the slaves because the Hadoop files (like tmp/dfs/data) are always the same.
I am planning to configure and install Hadoop in another directory such as /opt.
The problem is that almost all installation tutorials ask us to install it in /usr/local, so I was wondering whether there will be any bad consequences if I install Hadoop somewhere else like /opt.
Btw, I am using Ubuntu 16.04
As long as HADOOP_HOME points to where you extracted the Hadoop binaries, it shouldn't matter.
You'll also want to update PATH in ~/.bashrc, for example.
export HADOOP_HOME=/path/to/hadoop_x.yy
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
For reference, I have some configuration files inside of /etc/hadoop.
(Note: Apache Ambari makes installation easier)
It is not at all necessary to install Hadoop under /usr/local. That location is generally used when you install a single-node Hadoop cluster (although it is not mandatory). As long as you have the following variables specified in .bashrc, any location should work.
export HADOOP_HOME=<path-to-hadoop-install-dir>
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
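For example, with Hadoop extracted to /opt (a sketch; /opt/hadoop-2.7.1 is an assumed path, adjust to your version), the .bashrc entries and a quick check would look like:
export HADOOP_HOME=/opt/hadoop-2.7.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# Reload the shell configuration and confirm the binaries resolve
source ~/.bashrc
hadoop version
which hdfs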
I'm running into an issue where I do not have write access to the /var directory on a UNIX environment, and InstallAnywhere doesn't provide me the option of writing the .com.zerog.registry.xml to any other location for a product installation. Is there a parameter out there that allows for this file to be written to a different directory?
According to the IA docs:
If logged in as root, the global registry is located in /var.
If logged in as a user, it is located in the user’s home directory.
So, if you're running as root and can't write to /var, it sounds like a permissions problem with the /var directory, independent of IA. Check the permissions on /var.
If you're running as a non-root user, then the registry shouldn't be going to /var, but to $HOME/.com.zerog.registry.xml (FWIW, I just checked one of our test Linux boxes and found .com.zerog.registry.xml under both /var and under test-user $HOME directories. The docs appear to be correct).
I've also seen some very strange behavior if IA is low on space in $TMP. Make sure you have plenty of space there.
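A couple of quick checks for both conditions (a sketch, assuming a standard shell; $TMP falls back to /tmp if unset):
# Who owns /var and what are its permissions?
ls -ld /var
# How much free space is available for the installer's temporary files?
df -h "${TMP:-/tmp}"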
Also, have you considered running the installer with sudo, or the graphical equivalents kdesudo (KDE) and gksu (Gnome)? Those might get you where you want to go.
I need to grant the jenkins user permission to access some specific directories like /usr/lib or /usr/local/include, so that it can copy some files into those directories during the execution of some Jenkins jobs. How can I do that?
The idea that something accessed from the web can overwrite system files is very scary (and insecure), but I think you would need to grant the user under which Jenkins is running the privileges needed to write there.
Again, there are good reasons why ordinary users aren't granted permission to write to those directories. You might want to consider running the job in a chroot jail. That way, if something goes wrong, you won't destroy your system.
For a specific task, I would say use sudo.
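For example, a narrowly scoped sudoers entry could let the jenkins user run only the exact copy command a job needs (a sketch; the source path and library name are illustrative assumptions):
# Edit with: visudo -f /etc/sudoers.d/jenkins
jenkins ALL=(root) NOPASSWD: /bin/cp /var/lib/jenkins/artifacts/libfoo.so /usr/lib/
The job step would then call sudo /bin/cp /var/lib/jenkins/artifacts/libfoo.so /usr/lib/, and nothing else can be run as root.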
You mentioned the /usr/lib and /usr/local/include directories. If your goal is to install some tools and packages during job execution, you could instead install them locally into your job workspace (for example, into a .local directory) and then make your jobs work with those directories by setting environment variables like LD_LIBRARY_PATH, CFLAGS, etc., as sketched below.
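As a rough sketch of that approach (the build commands and paths are placeholders), a job shell step could install into the workspace and point later steps at it:
# Install a library into the job workspace instead of /usr/local
PREFIX="$WORKSPACE/.local"
./configure --prefix="$PREFIX" && make && make install
# Make the locally installed headers and libraries visible to the build
export CFLAGS="-I$PREFIX/include $CFLAGS"
export LDFLAGS="-L$PREFIX/lib $LDFLAGS"
export LD_LIBRARY_PATH="$PREFIX/lib:$LD_LIBRARY_PATH"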
I am developing with Xampp for Linux and Tomcat (similar to Xampp on Windows). Many programs like IDEA, Tomcat and Xampp are recommended to be installed under /opt. Now, I have heard that it is not recommended to run services as root, but on Ubuntu (which I am using) unpacking any archive into /opt implies that it belongs to the root owner and root group. This may be specific to Xampp, as per the instructions on their Linux page:
Step 2: Installation. After downloading, simply type in the following commands:
Go to a Linux shell and login as the system administrator root:
su
Extract the downloaded archive file to /opt:
tar xvfz xampp-linux-1.8.1.tar.gz -C /opt
Warning: Please use only this command to install XAMPP. DON'T use any Microsoft Windows tools to extract the archive, it won't work.
Warning 2: already installed XAMPP versions get overwritten by this command.
That's all. XAMPP is now installed below the /opt/lampp directory.
Step 3: Start. To start XAMPP, simply call this command:
/opt/lampp/lampp start
Placing it here implies that Apache must be run as root, since on Ubuntu one is only able to run it with sudo.
This may be an issue specific to Ubuntu. Is it? Because Xampp is a development tool, I posted this here, as I am more likely to find an appropriate answer from developers who use it on Ubuntu (and other Linux systems). I would appreciate any information on whether the same problem occurs on other systems. I notice my production environment has Tomcat installed in /opt too, but it belongs to tomcat:tomcat.
The question here is how to get around this for all tools under /opt: even though Xampp may not be the tool for my needs, I still want to place Tomcat under /opt to replicate my production environment, and the same thing will surely happen there unless this is just an Ubuntu issue.
Ubuntu and some other distributions differ from the general Linux setup in that the account you create when installing the OS is added to specific groups, which can be viewed with the following command:
groups username
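On a default Ubuntu install the output looks something like this (a sketch; the exact group list varies between releases):
groups username
username : username adm cdrom sudo dip plugdev lpadmin sambashare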
You will notice that root is not amongst these. It is also not possible to log in or su to the root account by default. Instead, membership of the sudo group (together with the rules in /etc/sudoers) is what grants an account permission to run commands as root via sudo.
Thus launching services from /opt does not run them as root.