Hadoop daemons can't be stopped with the proper command - linux

A running Hadoop system has several daemons such as the namenode, journalnode, etc. I will use the namenode as an example.
To start the namenode we use the command: hadoop-daemon.sh start namenode
To stop the namenode we use the command: hadoop-daemon.sh stop namenode.
But here comes the question: if I started the namenode yesterday or a couple of hours ago, the stop command works fine. But if the namenode has been running for, say, a month, the stop command shows:
no namenode to stop.
Yet I can see the NameNode daemon still running with the jps command, and I then have to use the kill command to kill the process.
Why does this happen? Is there any way to make sure the stop command always works?
Thanks

The reason hadoop-daemon.sh stops working after some time is that hadoop-env.sh defines the parameters:
export HADOOP_PID_DIR
export HADOOP_SECURE_DN_PID_DIR
which point to the directory where the pid files of those daemons are stored. The default location of that directory is /tmp. The problem is that the /tmp folder is automatically cleaned up after some time (on Red Hat Linux). Once the pid file has been deleted,
the stop command can no longer find the process id that was stored in that file.
The same reason applies to the yarn-daemon.sh command.
Modify hadoop-env.sh:
HADOOP_PID_DIR
HADOOP_SECURE_DN_PID_DIR
yarn-env.sh:
YARN_PID_DIR
mapred-env.sh:
HADOOP_MAPRED_PID_DIR
to directories other than the default /tmp folder should solve the problem.
After the modification, restart all of the affected processes.
Also, as a security concern, the folder containing the pid files should not be accessible to non-admin users.
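For example, a minimal sketch of the change (the /var/run/hadoop path and the hdfs:hadoop owner are illustrative assumptions; any directory that survives /tmp cleanup and is writable only by the daemon user will do):
# hadoop-env.sh
export HADOOP_PID_DIR=/var/run/hadoop
export HADOOP_SECURE_DN_PID_DIR=/var/run/hadoop
# yarn-env.sh
export YARN_PID_DIR=/var/run/hadoop
# mapred-env.sh
export HADOOP_MAPRED_PID_DIR=/var/run/hadoop

# create the directory, restrict access, then restart the daemons
sudo mkdir -p /var/run/hadoop
sudo chown hdfs:hadoop /var/run/hadoop   # assumed daemon user and group; adjust to your setup
sudo chmod 750 /var/run/hadoop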

Related

How to make Redis dump.rdb save in directory when daemonized

So, I can change the directory where dump.rdb is saved by using the dir option in redis.conf when I start it normally (just calling redis-server). If I want redis-server to run all of the time (I do) without needing a terminal window always open, I think I need to daemonize it. However, when daemonized it doesn't seem to ever persist to disk automatically, and whenever the redis-server process ends (I've been ending it in testing by just running redis-cli shutdown or sometimes just killing the process with kill PID) and starts back up, all database changes are lost, which seems pretty bad if a crash or unexpected shutdown were to happen in the future. In the code that runs the processing of data (either Python with redis-py or Java with jedis), I can explicitly run bgsave(), but that saves dump.rdb in the directory that the code was run in and not where the dir option in redis.conf specifies.
So, is there either another way to run redis-server, without requiring a whole terminal window to stay open, that allows what I want to do, or is there a way to get the data to persist on disk in the proper directory when it's run as redis-server --daemonize yes or similar?
You could put it in the Linux background using nohup; it does not need a terminal window to stay up and running. I don't know the daemonize option well enough to advise on it, but see if this works for you:
nohup redis-server &> redis.log&
or
Set daemonize yes in the conf file and run:
redis-server path/to/redis.conf
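For the persistence side, a minimal redis.conf sketch (the directory and save intervals are illustrative assumptions; dir should be an absolute path so snapshots land in the same place no matter where the server was started):
# redis.conf - illustrative values only
# run as a background daemon
daemonize yes
# absolute directory where dump.rdb is written (assumed path)
dir /var/lib/redis
# name of the RDB snapshot file
dbfilename dump.rdb
# snapshot if at least 1 key changed in 900 s, or 10 keys in 300 s
save 900 1
save 300 10
Start the server with this file (redis-server /path/to/redis.conf) so the settings are actually loaded; a bgsave() triggered from redis-py or jedis should then write into the configured dir.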

Tomcat starts after issuing 'stop' command

I am seeing very confusing behavior in my tomcat.
I execute:
/usr/libexec/tomcat/server stop
But instead of stopping, Tomcat restarts. I issue the command a second time, and then it actually stops. I have tried searching but have not come up with a good way to search for 'restarts after stop'; almost all results talk about scripts to restart Tomcat and about stop/start functionality.
You can stop Tomcat with this command: your/path/tomcat/bin/catalina.sh stop, and start the server with your/path/tomcat/bin/catalina.sh start.
If you are on Linux and your catalina.sh does not have the executable permission, you need to run chmod +x catalina.sh first.
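Put together, a minimal sequence (the Tomcat path is only a placeholder; substitute your actual installation directory):
cd /path/to/tomcat/bin
chmod +x catalina.sh    # only needed once, if the script is not already executable
./catalina.sh stop      # shut the server down
./catalina.sh start     # start it again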

Cassandra process killed on exit

When I run DSC Cassandra on CoreOS (tarball install) over telnet, everything comes up fine. But when I close the telnet session, it kills the process. How do I keep the Cassandra server running?
I tried sudo bin/cassandra and sudo bin/cassandra -f;
both didn't help.
I have no issues on other OSes.
Option       Description
-f           Start the cassandra process in the foreground. The default is to start as a background process.
-h           Help.
-p filename  Log the process ID in the named file. Useful for stopping Cassandra by killing its PID.
-v           Print the version and exit.
When you start Cassandra with -f it runs in the foreground, so it stops as soon as the terminal is closed. The same is true for the background process.
This will happen with any application you run in a telnet session.
You can try
sudo service cassandra start OR nohup bin/cassandra; either will keep your application running even when the terminal is closed.
You need to run Cassandra as a systemd service, as described here: https://coreos.com/os/docs/latest/getting-started-with-systemd.html
Running in the foreground with cassandra -f as your ExecStart= command will allow systemd to manage the state of the process (ideally inside a container).
While this is a bit different than what you're used to, it will lead to an overall more stable mechanism since you'll be using an init system that understands dependency chains, restart and reboot behavior, logging, etc.
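A minimal unit file sketch along those lines (the install path and the cassandra user are assumptions; adjust them to your tarball layout), saved for example as /etc/systemd/system/cassandra.service:
[Unit]
Description=Apache Cassandra
After=network.target

[Service]
# "cassandra" is an assumed service user; the path below assumes the tarball was unpacked into /opt
User=cassandra
# run in the foreground so systemd can track the process itself
ExecStart=/opt/cassandra/bin/cassandra -f
Restart=on-failure

[Install]
WantedBy=multi-user.target
Enable and start it with sudo systemctl enable cassandra.service followed by sudo systemctl start cassandra.service.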
Run the process in a screen or tmux session. Detaching from the screen session should allow the process to keep running.
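For example, with a tarball install (the session name is arbitrary):
screen -dmS cassandra bin/cassandra -f    # start detached; reattach later with: screen -r cassandra
or with tmux:
tmux new-session -d -s cassandra 'bin/cassandra -f'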

how to start all daemon process in hadoop - like start-all.sh in linux

I have just started to work on Hadoop with Cygwin on Windows 7. I need to know whether there is any method to start all the services with one command, like start-all.sh on Linux; I used this command in Cygwin and it doesn't work. And if possible, please suggest a reference for working with Hadoop on Windows 7 with Cygwin.
I need to do the following steps every time to start the five daemons:
Start the namenode in the first window by executing
cd hadoop
bin/hadoop namenode
Start the secondary namenode in the second window by executing
cd hadoop
bin/hadoop secondarynamenode
Start the job tracker in the third window by executing
cd hadoop
bin/hadoop jobtracker
Start the data node in the fourth window by executing
cd hadoop
bin/hadoop datanode
Start the task tracker in the fifth window by executing
cd hadoop
bin/hadoop tasktracker
Please, can anybody help?
Change the for condition to match the path of your Hadoop daemons:
#!/bin/bash
# start every installed HDFS and MapReduce init script in turn
for service in /etc/init.d/hadoop-hdfs-* /etc/init.d/hadoop-0.20-mapreduce-*
do
  sudo "$service" start
done
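If you are running from a plain tarball under Cygwin, there are no /etc/init.d scripts; a rough equivalent is a small script that backgrounds each daemon. This is only a sketch based on the commands in the question, and it assumes a logs directory exists next to bin:
#!/bin/bash
# start the five classic Hadoop 1.x daemons in the background (untested sketch)
cd hadoop
for daemon in namenode secondarynamenode jobtracker datanode tasktracker
do
  nohup bin/hadoop $daemon > "logs/$daemon.out" 2>&1 &
done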

Linux command to restart application

For example, I have the process id of an application that I want to restart. What command should I use to restart this application? I didn't find anything about it.
Thanks!
You can find a very similar question at Restart process script linux.
Linux doesn't have a general command for restarting; normally you kill your process and start it over. However, if your process was started as a service, i.e. it has a script in the /etc/init.d/ directory, then you can do the following:
/etc/init.d/SERVICE_NAME restart
or
service SERVICE_NAME restart
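For a process that is not a service, a rough kill-and-restart sketch by PID (purely illustrative; it assumes the application can simply be relaunched with the same, simple command line):
#!/bin/bash
# restart a non-service process given its PID (illustrative sketch)
pid=$1
cmd=$(ps -p "$pid" -o args=)   # capture the original command line before killing the process
kill "$pid"                    # ask it to terminate (SIGTERM)
sleep 2                        # give it a moment to shut down
$cmd &                         # relaunch the same command in the background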
