Starting Hadoop without ssh'ing to localhost - linux

I've a very tricky situation in my hand. I'm installing Hadoop on few nodes which run Ubuntu 12.04 and our IT guys have created a user "hadoop" for me to use on all the nodes. The issue with this user is that it does not allow ssh on localhost because of some security constraints. So, I'm not able to start Hadoop daemons at all.
I can connect to itself using "ssh hadoop#hadoops_address" but not using loopback address. I also cannot make any changes to the /etc/hosts. Is there a way I can tell Hadoop to ssh to itself using "ssh hadoop#hadoops_address" instead of "ssh hadoop#localhost"?

Hadoop reads the hostname from "masters" and "slaves" file which is present inside conf dir,
edit the file and change the value from localhost to hadoops_address.
This should fix your problem.

Related

How to control /etc/hosts file from over writing

I am deploying a Kops Kubernetes cluster on AWS with Debian Jessie image.
Mine is a hybrid environment where my artifactory is in a physical env in our DC. Now I have been facing an issue, my worker nodes are unable to pull images from my artifactory unless I specify the artifactory FQDN and IP in the /etc/hosts file.
So this is a manual edit, it works all fine after I do this. So I went ahead and added the data in my additional userdata of the Kops worker node group, but I am seeing after some time the hosts file on worker nodes is getting overwritten and also this is evident upon node reboot.
So how can I resolve this!!
The real answer is to run your own DNS server, or at least use DNS hostnames to resolve. If your router supports it, you can set local hostnames (machine-1.local)
If that isn't possible, you could try a solution like puppet if you own the virtual machines. Also, I believe Kubernetes does have a DNS addon. Also, you could use a crontab for on boot to write to the hosts file, but that's a dirty solution.
In addition, your hosts file would get rewritten for every DHCP renew. You could use static IPs, but again, DNS is the way to go.
Another workaround for this is to put it in your /etc/rc.local file:
If the file exists add this to the end:
echo '<ip-address-of-artifactory> <fqdn-of-artifactory>' >> /etc/hosts
If the file doesn't exist, create it:
$ cat << EOF > /etc/rc.local
#!/bin/sh -e
#
echo '<ip-address-of-artifactory> <fqdn-of-artifactory>' >> /etc/hosts
EOF
$ chmod 755 /etc/rc.local
$ reboot # check that it works

Difference between connecting throug Jenkins SSH plugin and normal ssh

I have a remote server.
If I use ssh to connect with the server as the Jenkins user it works perfectly
ssh jenkins#remoteserver.com
The jenkins user is allowed to change to user jboss WITHOUT being asked for password:
sudo su jboss
This works perfectly, no need for entering a password. Everything as expected.
If I make a Jenkins build, connecting to the remote server through a SSH plugin, the connection works fine. I can also run a testscript, it works also!
But if I make the sudo su jboss through Jenkins on my remote server, it's not working.
Jenkins is not throwing any error, there is just the spinning circle
It's never stopping, only if I cancel the job.
Anyone got an idea, what's the difference between running ssh in Jenkins and conncecting through a plugin?
Is the connection lost, when changing the username? (looks like it)
The SSH plugin and the ssh command provide two completely different implementations of the SSH protocol:
Your ssh command will probably run the OpenSSH client
The SSH plugin uses the SSH protocol implementation provided by JSch
I'm not into JSch, but I'd suspect there's a either a problem in the way the plugin configures JSch terminal handling, or there's a problem related to terminal handling with JSch. Either may break the behaviour of sudo:
sudo is somewhat sensitive to terminal/tty settings; see e.g. this discussion, which also contains a few hints which may help to work around the issue.

HDP 2.5 Hortonworks ambari-admin-password-reset missing

I have downloaded the sandbox from hortonworks (Centos OS), then tried to follow the tutorial. It seems like the ambari-admin-password-reset command is not there and missing. I tried also to login with putty, the console asked me to change the password so I did.
now it seems like the command is there, but I have different passwords for the console and one for the putty for the same user.
I have tried to look for the reason why for the same user 'root' I have 2 different passwords (one for the virtual box console and one for the putty) that I can login with. I see different commands on each box. more than that when I share folder I can only see it on the virtual box console but not on the putty console) which is really frustrating.
How can I enforce that what I would see from putty would be the same as what I see from the virtual box console.
I think it somehow related to TTY but I am not sure.
EDIT:
running commands from the virtual box machine output:
grep "^passwd" /etc/nsswitch.conf
OUT: passwd: files sss
grep root /etc/passwd
OUT: rppt"x"0"0"root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
getent passwd root
OUT: root:x:0:0:root:/root:/bin/bash
EDIT:
I think this is all about docker containers. It seems like the machine 2222 port is the ssh port for the hdp 2.5 container and not for the hosting machine.
Now I get another problem. when running
docker exec sandbox ls
it is getting stuck. any help ?
Thanks for helpers
So now I had the time to analyze the sandbox vm, and write it up for other users.
As you stated correctly in your edit of the question, its the docker container setup of the sandbox, which confuses with two separate root users:
via ssh root#127.0.0.1 -p 2222 you get into the docker container called "sandbox". This is a CentOS release 6.8 (Final), containing all the HDP services, especially the ambari service. The configuration enforces a password change at first login for the root user. Inside this VM you can also execute the ambari-admin-password-reset and set there a password for the ambari admin.
via console access you reach the docker host running a Centos 7.2, here you can login with the default root password for the VM as found in the HDP docs.
Coming to your sub-question with the hanging docker exec, it seems to be a bug in that specific docker version. If you google that, you will find issues discussing this or similar problems with docker.
So I thought that it would be a good idea to just update the host via yum update. However this turned out to be a difficult path.
yum tried to update the kernel, but complained that there is not enough space on the boot partion.
So I moved the boot partion to the root partition:
edit /etc/fsab and comment out the boot entry
unmount /boot
mv /boot
cp -a /boot.org /boot
grub2-mkconfig -o /boot/grub2/grub.cfg
grub2-install /dev/sda
reboot
After that I have found out that the docker configuration is broken and docker does not start anymore. In the logs it complained about
"Error starting daemon: error initializing graphdriver:
\"/var/lib/docker\" contains other graphdrivers: devicemapper; Please
cleanup or explicitly choose storage driver (-s )"
So I edited /etc/systemd/system/multi-user.target.wants/docker.service and changed the ExecStart setting to:
ExecStart=/usr/bin/dockerd --storage-driver=overlay
After a service docker start and a docker start sandbox. The container worked again and I could could login to the container and after a ambari-server restart everything worked again.
And now - with the new docker version 1.12.2, docker exec sandbox ls works again.
So to sum up the docker exec command has a bug in that specific version of the sandbox, but you should think twice if you want to upgrade your sandbox.
I ran into the same issue.
The HDP 2.5 sandbox runs all of its components in a docker container, but commands like docker exec -it sandbox /bin/bash or docker attach sandbox got stuck.
When I ran a simple ps aux, I found several /usr/bin/docker-proxy commands which looked like :
/usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 60000 -container-ip 172.17.0.2 -container-port 60000
They probably forward the HTTP ports of the various UIs of HDP components.
I could ssh into the container ip (here 172.17.0.2) using root/hadoop to authenticate. From there, I could use all "missing" commands like ambari-admin-password-reset.
$ ssh root#172.17.0.2
... # change password
$ ambari-admin-password-reset
NB: I am new to docker, so there's probably a better way to deal with this.
I'd like to post here the instructions for 3.0.1 here.
I followed the instructions of installing hortonworks version 3.0.1 here: https://youtu.be/5TJMudSNn9c
After running the docker container, go to your browser and enter "localhost:4200", that will take you to the in browser terminal of the container, that hosts ambari. Enter "root" for login and "hadoop" for password, change the root password, and then enter "ambari-admin-password-reset" in order to reset ambari password.
In order to be able to use sandbox-hdp.hortonworks.com, you need to add the line "127.0.0.1 sandbox-hdp.hortonworks.com" at the end of the /private/etc/hosts file on your mac.
Incorrect Pass
Then right corner click on power button >> power off drop down >> Restart >> when it boots up then press Esc key to get into recovery menu
Restart
select advance option and hit enter
Advance Option
Select Recovery mode hit enter
Select Root
Root enter
Command
mount -rw -o remount/
ls /home
change pass command
passwd username
user as yours
last step
enter pass two times by pressing enter
enter image description here
Hopefully you changed password (:

Could not connect to cassandra with cqlsh

I want to connect to cassandra but got this error:
$ bin/cqlsh
Connection error: ('Unable to connect to any servers', {'192.168.1.200': error(10061, "Tried connecting to [('192.168.1.200', 9042)]. Last error: No connection could be made because the target machine actively refused it")})
Pretty simple.
The machine is actively refusing it because your system does not have cassandra running on it. Follow the following steps to completely get rid of this trouble :
Install Cassandra from DataStax (Datastax-DDC; Cassandra version 3).
Go to ~\installation\path\DataStax-DDC\apache-cassandra\bin.
Open up cmd there. (Use Alt+F+P to open it if you are on windows 8 or later).
type cassandra -f this will generate a lot of stuff on the window and you must get the last line as INFO 11:32:31 Created default superuser role 'cassandra'
Now open another cmd window in the same folder.
Type cqlsh
This should give you a prompt, without any error.
I also discovered that this error doesn't pop up if I use cassadra v2.x found here Archived version of Cassandra. I don't know why :( (If you find out please comment).
So, if the above steps do not work, you can always go back to Cassandra v2.x.
Cheers.
Check if you have started Cassandra server, then provide the host and port as the arguments.
$ bin/cqlsh 127.0.0.1 4092
I run into the same problem. This worked for me.
Go to any directory for example E:\ (doesn't have to be the same disc as the cassandra installation)
Create the following directories
E:\cassandra\storage\commitlogs
E:\cassandra\storage\data
E:\cassandra\storage\savedcaches
Then go to your cassandra installations conf path. In my case.
D:\DataStax-DDC\apache-cassandra\conf
Open cassandra.yaml. Edit the lines containing: data_file_directories, commitlog_directory, saved_caches_directory to look like the code below (change paths accordingly to where you created the folders)
data_file_directories:
- E:\cassandra\storage\data
commitlog_directory: E:\cassandra\storage\commitlog
saved_caches_directory: E:\cassandra\storage\savedcaches
Then open the cmd (I did it as administrator, but didn't check if it is necessary) to your cassandra installations bin path. In my case.
D:\DataStax-DDC\apache-cassandra\bin
run cassandra -f
Lots of stuff will be logged to your screen.
You should now be able to run cqlsh and all other stuff without problems.
Edit: The operating system was windows10 64bit
Edit2: If it stops working after a while check if the service is till running using nodetool status. If it isn't follow this instruction.
I also faced the same problem on a Win32 windows 7 machine.
Check if you have JAVA installed correctly and JAVA_HOME variable set.
Once you have checked the java installation and set JAVA_HOME, uninstall Cassandra and install it again.
Hopefully this would solve the problem. Mine was solved after applying the above two steps.
You need to mention host, user, password for cassandra cqlsh connection. Default cassandra cqlsh user is cassandra and password is cassandra.
$ bin/cqlsh <host> -u cassandra -p cassandra
I also had same problem. I applied many methods given on google and youtube but none of them worked in my case. Finally, I applied the following 3 steps and it worked in my case:-
Create a folder without any space in C or D whichever is your system drive. eg:- C:\cassandra
Install Cassandra in this folder instead of installing in"Program Files".
After installation, it will be like this- C:\cassandra\apache-cassandra-3.11.6
Copy python 2.7 installed in bin folder i.e.,C:\cassandra\apache-cassandra-3.11.6\bin
Now your program is ready for work.
There is no special method to connect cqlsh it simple as below:-
$ bin/cqlsh 127.0.0.1(host IP) 9042 or $ bin/cqlsh 127.0.0.1(host IP) 9160 (if older version of Cassandra)
Don't forget to check port connectivity if you are connecting cqlsh to remote host. Also you can use username/password if you enabled by default it is disabled.

Hadoop 2.2.0 multi-node cluster setup on ec2 - 4 ubuntu 12.04 t2.micro identical instances

I have followed this tutorial to setup Hadoop 2.2.0 multi-node cluster on Amazon EC2. I have had a number of issues with ssh and scp which i was either able to resolve or workaround with help of articles on Stackoverflow but unfortunately, i could not resolve the latest problem.
I am attaching the core configuration files core-site.xml, hdfs-site.xml etc. Also attaching a log file which is a dump output when i run the start-dfs.sh command. It is the final step for starting the cluster and it is giving a mix of errors and i don't have a clue what to do with them.
So i have 4 nodes exactly the same AMI is used. Ubuntu 12.04 64 bit t2.micro 8GB instances.
Namenode
SecondaryNode (SNN)
Slave1
Slave2
The configuration is almost the same as suggested in the tutorial mentioned above.
I have been able to connect with WinSCP and ssh from one instance to the other. Have copied all the configuration files, masters, slaves and .pem files for security purposes and the instances seem to be accessible from one another.
If someone could please look at the log, config files, .bashrc file and let me know what am i doing wrong.
Same security group HadoopEC2SecurityGroup is used for all the instances. All TCP traffic is allowed and ssh port is open. Screenshot in the zipped folder attached. I am able to ssh from Namenode to secondary namenode (SSN). Same goes for slaves as well which means that ssh is working but when i start the hdfs every thing goes down. The error log is not throwing any useful exceptions either. All the files and screenshots can be found as zipped folder here.
Excerpt from error output on console looks like
Starting namenodes on [OpenJDK 64-Bit Server VM warning: You have loaded library /usr/local/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c ', or link it with '-z noexecstack'.
ec2-54-72-106-167.eu-west-1.compute.amazonaws.com]
You: ssh: Could not resolve hostname you: Name or service not known
have: ssh: Could not resolve hostname have: Name or service not known
loaded: ssh: Could not resolve hostname loaded: Name or service not known
VM: ssh: Could not resolve hostname vm: Name or service not known
library: ssh: Could not resolve hostname library: Name or service not known
Server: ssh: Could not resolve hostname server: Name or service not known
warning:: ssh: Could not resolve hostname warning:: Name or service not known
which: ssh: Could not resolve hostname which: Name or service not known
guard.: ssh: Could not resolve hostname guard.: Name or service not known
have: ssh: Could not resolve hostname have: Name or service not known
might: ssh: Could not resolve hostname might: Name or service not known
.....
Add the following entries to .bashrc where HADOOP_HOME is your hadoop folder:
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native export
HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Hadoop 2.2.0 : "name or service not known" Warning
hadoop 2.2.0 64-bit installing but cannot start

Resources