NameNode port is getting blocked - Linux

I have installed 7 VM instances of Ubuntu 14.04 LTS servers. The first instance runs the NameNode service and the other six nodes run the DataNode service. I think my NameNode is crashing or getting blocked due to some issue.
After rebooting, if I check the jps output, my NameNode is running. In core-site.xml the fs.defaultFS property is set to hdfs://instance-1:8020.
But in the netstat -tulpn output, port 8020 is not there.
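(As a sanity check on the config itself, the value HDFS is actually using can be printed with getconf, assuming the hdfs client is on the PATH:)
hdfs getconf -confKey fs.defaultFS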
This is the jps output right after rebooting:
root@instance-1:~# jps
3017 VersionInfo
2613 NameNode
3371 VersionInfo
3313 ResourceManager
3015 Main
2524 QuorumPeerMain
2877 HeadlampServer
1556 Main
3480 Jps
2517 SecondaryNameNode
3171 JobHistoryServer
2790 EventCatcherService
2842 AlertPublisher
2600 Bootstrap
2909 Main
This is the netstat output that I checked after jps:
root@instance-1:~# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 600/rpcbind
tcp 0 0 0.0.0.0:9010 0.0.0.0:* LISTEN 2524/java
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1164/sshd
tcp 0 0 127.0.0.1:5432 0.0.0.0:* LISTEN 1158/postgres
tcp 0 0 127.0.0.1:19001 0.0.0.0:* LISTEN 1496/python
tcp 0 0 0.0.0.0:42043 0.0.0.0:* LISTEN 2524/java
tcp 0 0 10.240.71.132:9000 0.0.0.0:* LISTEN 1419/python
tcp 0 0 0.0.0.0:7432 0.0.0.0:* LISTEN 1405/postgres
tcp6 0 0 :::111 :::* LISTEN 600/rpcbind
tcp6 0 0 :::22 :::* LISTEN 1164/sshd
tcp6 0 0 :::7432 :::* LISTEN 1405/postgres
udp 0 0 0.0.0.0:68 0.0.0.0:* 684/dhclient
udp 0 0 0.0.0.0:111 0.0.0.0:* 600/rpcbind
udp 0 0 10.240.71.132:123 0.0.0.0:* 3323/ntpd
udp 0 0 127.0.0.1:123 0.0.0.0:* 3323/ntpd
udp 0 0 0.0.0.0:123 0.0.0.0:* 3323/ntpd
udp 0 0 0.0.0.0:721 0.0.0.0:* 600/rpcbind
udp 0 0 0.0.0.0:29611 0.0.0.0:* 684/dhclient
udp6 0 0 :::111 :::* 600/rpcbind
udp6 0 0 :::123 :::* 3323/ntpd
udp6 0 0 :::721 :::* 600/rpcbind
udp6 0 0 :::22577 :::* 684/dhclient
As I said, I don't see port 8020. After one minute I checked the jps output again and the NameNode was gone.
This is the jps output one minute after rebooting:
root@instance-1:~# jps
3794 Main
3313 ResourceManager
3907 EventCatcherService
4325 Jps
2530 RunJar
3082 RunJar
2524 QuorumPeerMain
2656 Bootstrap
2877 HeadlampServer
1556 Main
2517 SecondaryNameNode
3171 JobHistoryServer
2842 AlertPublisher
2600 Bootstrap
As I said, the NameNode is not there. I repeated the above process a couple of times, and every time I get the same result: port 8020 is missing and the NameNode crashes. I think it is a firewall issue; what do you think?
Thanks in advance.

Looks like your NameNode is indeed crashing. Try stopping all the Hadoop daemons, then delete all the DataNode data and format your NameNode.
To stop the Hadoop daemons, use:
stop-all.sh
Now delete all the data from the DataNodes manually from the terminal using the rm -r command.
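For example (the path here is an assumption; use whatever directory dfs.datanode.data.dir in hdfs-site.xml points to on your nodes):
# example path only; substitute your dfs.datanode.data.dir value
rm -r /var/lib/hadoop-hdfs/data/*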
To format your NameNode, use:
hadoop namenode -format
Then start all the daemons again with:
start-all.sh
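Once the daemons are back up, you can confirm the NameNode is listening with the same check as in the question:
netstat -tulpn | grep 8020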
Hope it helps.

I don't have a full answer, but I know that you can go to the Hadoop folder on the machine where the NameNode is running, open the logs folder, and look at the file that contains the NameNode log. It should have a name like hadoop-username-namenode-machineName.log,
where username is the user running Hadoop and machineName is the hostname of that machine.
Go to the end of that file and you will probably see the exact error causing the problem.
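For example (assuming the default log location under the Hadoop install directory; your logs may live elsewhere, e.g. /var/log/hadoop):
# $HADOOP_HOME and the log directory are assumptions
tail -n 100 $HADOOP_HOME/logs/hadoop-*-namenode-*.log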
Best of luck

Related

How to change the default start-up folder for JupyterHub on an Azure DSVM virtual machine

Azure DSVM has JupyterLab enabled on port 8000.
However, JupyterLab always starts in /home/<user-name>/notebooks as the start-up folder.
I want to change the JupyterLab start folder, but I don't know what to do.
What I tried:
1.
I created a Jupyter notebook configuration, following Stack Overflow, to change c.NotebookApp.notebook_dir, but after rebooting and checking the server again I confirmed it's not working (a sketch of that change is below, after the command output).
2.
I also tried to create a JupyterHub configuration, but it turned out there was no command called jupyterhub.
I thought it was running in Docker, but there was no jupyterhub process in Docker either.
3.
netstat -tnlp
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:8081 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:8001 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:44675 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:46277 0.0.0.0:* LISTEN 2048/python
tcp6 0 0 :::111 :::* LISTEN -
tcp6 0 0 :::22 :::* LISTEN -
tcp6 0 0 :::8000 :::* LISTEN -
ps aux | grep jupyter
root 1345 0.0 0.0 13316 3304 ? Ss 00:06 0:00 /bin/bash /etc/jupyterhub/start_jupyterhub.sh
root 1624 0.0 0.0 246340 64268 ? Sl 00:06 0:01 /anaconda/bin/python /anaconda/bin/jupyterhub --log-file=/var/log/jupyterhub.log
root 1949 0.0 0.0 606436 51356 ? Ssl 00:06 0:01 node /usr/local/bin/configurable-http-proxy --ip * --port 8000 --api-ip 127.0.0.1 --api-port 8001 --error-target http://127.0.0.1:8081/hub/error --ssl-key /etc/jupyterhub/srv/server.key --ssl-cert /etc/jupyterhub/srv/server.crt
rootadm+ 2048 0.2 0.0 343804 101160 ? Ssl 00:07 0:03 /anaconda/bin/python /anaconda/bin/jupyterhub-singleuser --port=46277 --notebook-dir=~/notebooks --SingleUserNotebookApp.default_url=/lab --config=/etc/jupyterhub/default_jupyter_config.py
rootadm+ 3841 0.0 0.0 14864 1040 pts/0 S+ 00:27 0:00 grep --color=auto jupyter
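For reference, attempt 1 was a change along these lines (a sketch: the config path is the standard per-user notebook config, which may not be what the DSVM's /etc/jupyterhub/start_jupyterhub.sh actually loads):
# both the config path and the target folder are assumptions
echo "c.NotebookApp.notebook_dir = '/home/<user-name>/desired-folder'" >> ~/.jupyter/jupyter_notebook_config.py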

TightVNC server and Guacamole

I have a VM in which I installed the VNC server (TightVNC) following this guide: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04
It installed successfully and I can see port 5901 listening:
/etc/tigervnc$ netstat -tulpn
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:5901 0.0.0.0:* LISTEN 16460/Xtigervnc
tcp 0 0 127.0.0.1:5902 0.0.0.0:* LISTEN 16183/Xtigervnc
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN -
tcp6 0 0 ::1:5901 :::* LISTEN 16460/Xtigervnc
tcp6 0 0 ::1:5902 :::* LISTEN 16183/Xtigervnc
tcp6 0 0 :::22 :::* LISTEN -
tcp6 0 0 ::1:631 :::* LISTEN -
udp 0 0 0.0.0.0:36618 0.0.0.0:* -
udp 29184 0 127.0.0.53:53 0.0.0.0:* -
udp 0 0 0.0.0.0:68 0.0.0.0:* -
udp 0 0 0.0.0.0:631 0.0.0.0:* -
udp 7680 0 0.0.0.0:5353 0.0.0.0:* -
udp6 0 0 :::37372 :::* -
udp6 20736 0 :::5353 :::*
Now from my local machine, I tried to forward the VM's port to my local machine (as per the same guide: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04):
ssh -L 5901:127.0.0.1:5901 -C -N -l test 172.1.1.1
On my local machine, I can see that port 5901 is bound:
/etc/guacamole$ fuser 5901/tcp
5901/tcp: 22049
Now when I try to open a VNC connection using 127.0.0.1:5901, it prompts for the VM's password and then shows only a blank page.
Could someone help me with this?
Thanks,
Hari
Edit your ~/.vnc/xstartup file like this:
#!/bin/sh
startxfce4 &
I had the same problem and this solved it.
For reference, I got it from here:
https://www.raspberrypi.org/forums/viewtopic.php?t=52557
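You may also need to make sure the file is executable before restarting the server:
chmod +x ~/.vnc/xstartup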
You can also try killing and restarting your VNC server
kill $(pgrep Xvnc)
vncserver
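The guide linked in the question uses the display-number form instead, which also works when the process is named Xtigervnc rather than Xvnc (as in the netstat output above):
vncserver -kill :1
vncserver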
Are you trying to VNC from the local machine to the local machine? I am assuming that is just for testing, correct?
If you are not getting a rejection, it is at least talking to the service.

Why "service sshd start" command can not return to command prompt?

I installed a new CentOS 7 and its sshd service works fine. Then I downloaded the source code of OpenSSH 7.5p1, built it, and installed it to the default location "/usr/local/sbin/sshd". I want to use it to replace the system's sshd.
I modified the file "/usr/lib/systemd/system/sshd.service", changing the following line:
old:
ExecStart=/usr/sbin/sshd $OPTIONS
new:
ExecStart=/usr/local/sbin/sshd $OPTIONS
After that, I typed "service sshd start", but the command does not return and seems to hang. It looks as follows:
[root@localhost ~]# service sshd start
Redirecting to /bin/systemctl start sshd.service
I pressed Ctrl+C to terminate the command. Then I used "netstat -ntlp" and found that sshd had already started; I'm not sure why "service sshd start" cannot return to the prompt.
[root@localhost ~]# netstat -ntlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 1/systemd
tcp 0 0 192.168.122.1:53 0.0.0.0:* LISTEN 2443/dnsmasq
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 63144/sshd
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN 1043/cupsd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 1815/master
tcp6 0 0 :::111 :::* LISTEN 1/systemd
tcp6 0 0 :::22 :::* LISTEN 63144/sshd
tcp6 0 0 ::1:631 :::* LISTEN 1043/cupsd
tcp6 0 0 ::1:25 :::* LISTEN 1815/master
If I start sshd manually, it works fine: sshd starts successfully (without any warning message) and the command returns immediately. The command is as follows:
[root@localhost ~]# /usr/local/sbin/sshd -f /etc/ssh/sshd_config
Any help is appreciated. Let me know if you want to know more information. Thanks.
How about tinkering with Type= in your .service file?
Have you tried setting it to idle?
Maybe systemd waits to receive a readiness message from sshd, which makes the start command seem to hang.
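For example, a drop-in override along these lines (a sketch, untested: CentOS 7's stock unit uses Type=notify, and an sshd built from plain upstream source never sends the readiness notification, so systemctl waits until the start times out):
# a sketch: Type=notify in the stock unit requires sd_notify support,
# which an sshd built from plain upstream source does not have
mkdir -p /etc/systemd/system/sshd.service.d
cat > /etc/systemd/system/sshd.service.d/override.conf <<'EOF'
[Service]
# run sshd in the foreground (-D) so Type=simple is accurate
Type=simple
ExecStart=
ExecStart=/usr/local/sbin/sshd -D $OPTIONS
EOF
systemctl daemon-reload
systemctl restart sshd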

telnet refused on specific port on AWS instances

I'm trying to telnet from one Linux env (10.205.116.141) to 10.205.117.246 on port 7199, but I keep getting a connection refused. I did chkconfig iptables off on both servers and even made sure iptables is stopped as well.
What else should I be looking at?
[root@ip-10-205-116-141 bin]# telnet 10.205.117.246 7199
Trying 10.205.117.246...
telnet: connect to address 10.205.117.246: Connection refused
Traceroute seems to be working as well...
[root@ip-10-205-116-141 bin]# traceroute 10.205.117.246 -p 7199
traceroute to 10.205.117.246 (10.205.117.246), 30 hops max, 60 byte packets
1 ip-10-205-117-246.xyz.cxcvs.com (10.205.117.246) 0.416 ms 0.440 ms 0.444 ms
Also, I'm in an AWS VPC, so we don't get public IPs provisioned for us...
I checked my security group and it looks like all ports are open as well.
EDIT:
Here is the netstat output as well; it looks the same on both nodes:
[ec2-user@ip-10-205-116-141 ~]$ netstat -an | grep LISTEN
tcp 0 0 127.0.0.1:46626 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:9160 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:36523 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:9042 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:2738 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 10.205.116.141:7000 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:8089 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:4445 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:7199 0.0.0.0:* LISTEN
Shouldn't 127.0.0.1:7199 really be 10.205.116.141:7199?
Sorry, I can't post a screenshot of the security groups...
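(On that last point: yes. A 127.0.0.1:7199 bind means JMX only accepts connections from the node itself, which would explain the refusal from the other host. In a stock cassandra-env.sh this is controlled by the LOCAL_JMX setting; a quick way to inspect it, assuming the standard package layout:)
# the config path is an assumption; adjust for your install
grep -n 'LOCAL_JMX' /etc/cassandra/conf/cassandra-env.sh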

Cassandra doesn't listen on JMX port 7199

On one of my nodes I see this in the netstat -ln output:
tcp 0 0 192.168.25.207:9160 0.0.0.0:* LISTEN
On another, for the same port:
tcp 0 0 ::ffff:192.168.25.208:9160 :::* LISTEN
And that's why, I think, JMX port 7199 is not open on the other node. On the first it's open; I can see it with the netstat -ln | grep 7199 command:
tcp 0 0 0.0.0.0:7199 0.0.0.0:* LISTEN
What's the difference in the configuration of my systems, and why is there IPv6 on one node? The machines are identical, and the Cassandra configs are identical too.
Sorry, guys, my bad: I fell asleep on my keyboard while vi was open on /etc/cassandra/conf/cassandra-env.sh, and the file got corrupted.
