Who does the daemonizing? - linux

There are various tricks to daemonize a Linux process, i.e. to keep a command running after the terminal is closed.
nohup is used for this purpose, and a fork()/setsid() combination can be used in a C program to turn itself into a daemon process.
That was my understanding of Linux daemons, but today I noticed that exiting the terminal doesn't actually terminate processes started with & at the end of the command.
$ while :; do echo "hi" >> temp.log ; done &
[1] 11108
$ ps -ef | grep 11108
username 11108 11076 83 15:25 pts/0 00:00:05 /bin/sh
username 11116 11076 0 15:25 pts/0 00:00:00 grep 11108
$ exit
(after reconnecting)
$ ps -ef | grep 11108
username 11108 1 91 15:25 pts/0 00:00:17 /bin/sh
username 11130 11540 0 15:25 pts/0 00:00:00 grep 11108
So apparently the process's PPID changed to 1, meaning that it got daemonized somehow.
This contradicts my understanding that & is not enough and that one must use nohup or some other trick to make a process a 'daemon'.
Does anyone know who is doing this daemonizing?
I'm using a CentOS 6.3 host, and putty/cygwin/sshclient all produced the same result.

A process keeps running after the terminal is closed if it is not terminated by the SIGHUP signal.
When an interactive bash shell receives SIGHUP (the hangup signal), it resends SIGHUP to all of its jobs;
on a plain exit it does so only if the huponexit shell option is set. In either case bash won't wait
until the child processes have completely terminated. A child that does not die from SIGHUP, or never
receives it, becomes an orphan process: its parent PID is changed to 1 (the init process), which reaps
it when it finally exits so that it does not linger as a zombie.
In your case the loop runs as a background job and you left the shell with exit, so with the default settings it was most likely never sent SIGHUP at all, and your command keeps running after you log out of the first shell.
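To watch the reparenting described above without logging out, the following sketch starts a throwaway parent shell that backgrounds a sleep and exits at once. The 30-second sleep and the variable names are illustrative. On systems with a subreaper (for example a per-user systemd instance), the new parent may be that process rather than PID 1.

```shell
# A short-lived parent shell backgrounds a sleep, prints its PID, and exits.
# The sleep's output goes to /dev/null so the command substitution returns at once.
orphan_pid=$(sh -c 'sleep 30 >/dev/null 2>&1 & echo $!')

# The parent is gone, so the kernel has handed the sleep to a new parent.
new_ppid=$(ps -o ppid= -p "$orphan_pid" | tr -d ' ')
echo "sleep (PID $orphan_pid) now has parent PID $new_ppid"

# Clean up the orphan.
kill "$orphan_pid"
```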

Related

kill -s SIGTERM kills parent process and one level child process only

I've been experimenting with writing a command to kill a parent and all its children recursively. I have the scripts below.
parent.sh:
#!/bin/bash
/home/oracle/child.sh &
sleep infinity
child.sh:
#!/bin/bash
sleep infinity
Started command using
su oracle -c parent.sh &
I see a process tree like below
[root#source ~]# ps -ef | grep "/home/oracle"
root 14129 1171 0 12:39 pts/1 00:00:00 su oracle -c /home/oracle/parent.sh
oracle 14130 14129 0 12:39 ? 00:00:00 /bin/bash /home/oracle/parent.sh
oracle 14131 14130 0 12:39 ? 00:00:00 /bin/bash /home/oracle/child.sh
When I send SIGTERM to 14129 using kill -s SIGTERM 14129, it appears to kill 14129, and 14130 goes down immediately as well; but 14131 stays up for a very long time. The last-level child appears to have been reparented and has become an orphan.
oracle 14131 1 0 12:39 ? 00:00:00 /bin/bash /home/oracle/child.sh
If kill doesn't terminate any child processes, why did 14130 get killed when I sent SIGTERM to 14129?
If kill can kill child processes, why does it go only one level down? Is the behavior here guaranteed?
The relevant part of what pilcrow provided is this:
SIGNALS
Upon receiving either SIGINT, SIGQUIT or SIGTERM, su terminates
its child and afterwards terminates itself with the received
signal.
>> The child is terminated by SIGTERM,
>> (then) after unsuccessful attempt (to kill with SIGTERM) and
>> (after) 2 seconds of delay (,) the child is (then) killed
>> by SIGKILL [a second, harsher method].
That harsher method, SIGKILL, prevents that child process from attempting to kill its own children, which is why the grandchild is left running as an orphan (reparented to PID 1) instead of being terminated.
I haven't used it myself, but it seems that something like
killall --process-group parent.sh
would kill all processes tied to the process group associated with the "parent.sh" script. BUT ... not sure if "--wait" will serve you well, if the method used in the attempt to terminate is not being accepted.
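If the goal is to take down the whole tree, signalling the process group directly is one option. The sketch below assumes bash and uses set -m so that the background job gets its own process group. The sh -c command is only a stand-in for parent.sh, and the negative argument to kill addresses the group rather than a single PID.

```shell
set -m                          # job control: each background job gets its own process group
sh -c 'sleep 30 & sleep 30' &   # stand-in for parent.sh: a parent with one child
pgid=$!                         # the job leader's PID doubles as the process group ID
sleep 1                         # let the stand-in start its child

kill -TERM -- -"$pgid"          # negative number: signal every process in the group

status=0
wait "$pgid" || status=$?       # 128 + 15 (SIGTERM) if the leader was terminated
echo "leader exit status: $status"
```

Whether su and the scripts it starts end up in one process group depends on how they were launched, so it is worth checking the PGID column of ps -o pid,pgid,cmd before relying on this.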

In Unix, I ran the kill command with a PPID and it closed the terminal. Why? kill -9 ppid

sleep 5000
I run the command above in one terminal, and in a second terminal I'm running:
ps -ef | grep sleep
Then I kill this process from the second terminal using the PPID. This closes the first terminal, where I ran the sleep command; it does not leave the sleep command behind as an orphan.
$ ps -ef | grep sleep
trainee 4887 4864 0 17:05 pts/0 00:00:00 sleep 5000
trainee 4889 4264 0 17:05 pts/1 00:00:00 grep --color=auto sleep
kill -9 4864
Why?
Presumably the parent of the sleep is your shell. When you kill that, your login is terminated and your terminal closes.
The Wikipedia article on Orphan process reads (in part),
An orphan process is a computer process whose parent process has finished or terminated, though it remains running itself.
and
A process can be orphaned unintentionally, such as when the parent process terminates or crashes. The process group mechanism in most Unix-like operating systems can be used to help protect against accidental orphaning, where in coordination with the user's shell will try to terminate all the child processes with the SIGHUP process signal, rather than letting them continue to run as orphans.

Why does (ps -f) create no subshell but a separate process?

I need some help because there is something I don't get. From what I have read on the Internet, a subshell is created when we execute a shell script or run a command in brackets: ( )
I tried to test this with a script which contains only the following command:
ps -f
When I run it I see the following result:
UID PID PPID C STIME TTY TIME CMD
me 2213 2160 0 08:53 pts/14 00:00:00 bash
me 3832 2213 0 18:41 pts/14 00:00:00 bash
me 3833 3832 0 18:41 pts/14 00:00:00 ps -f
Which is good, because I see that my bash process has spawned another bash process for my script.
But when I do:
( ps -f )
it produces:
UID PID PPID C STIME TTY TIME CMD
me 2213 2160 0 08:53 pts/14 00:00:00 bash
me 3840 2213 0 18:46 pts/14 00:00:00 ps -f
So if brackets spawn a subshell, why is it not shown in the processes? And why is ps -f counted as another process? Does every command run as a separate process?
It seems you've caught bash in a little bit of an optimization. If a subshell contains only a single command, why really make it a subshell?
$ ( ps -f )
UID PID PPID C STIME TTY TIME CMD
jovalko 29393 24133 0 12:05 pts/10 00:00:00 bash
jovalko 29555 29393 0 12:07 pts/10 00:00:00 ps -f
However, add a second command, say : (the bash null command, which does nothing) and this is the result:
$ ( ps -f ; : )
UID PID PPID C STIME TTY TIME CMD
jovalko 29393 24133 0 12:05 pts/10 00:00:00 bash
jovalko 29565 29393 0 12:08 pts/10 00:00:00 bash
jovalko 29566 29565 0 12:08 pts/10 00:00:00 ps -f
One of the main reasons to use a subshell is that you can perform operations like I/O redirection on a group of commands instead of a single command, but if your subshell contains only a single command there's not much reason to really fork a new bash process first.
As to ps counting as a process, it varies. Many commands you use, like ls, grep, and awk, are external programs. But there are builtins like cd and kill, too.
You can determine which kind a command is in bash using the type command:
$ type ps
ps is hashed (/bin/ps)
$ type cd
cd is a shell builtin
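In a script, bash's type -t prints a single word for the kind of command, which is easier to test than the full sentence. A small sketch, assuming bash (type -t is not POSIX); the list of commands is arbitrary:

```shell
# Print how bash would resolve each name: builtin, file, alias, function, or keyword.
for cmd in cd kill ps; do
  printf '%s -> %s\n' "$cmd" "$(type -t "$cmd")"
done

cd_kind=$(type -t cd)   # cd has to run inside the shell itself, so it is a builtin
```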
The main part of the question is:
Does every command run as a separate process?
Yes! Every command that isn't built into bash (like declare and such) runs as a separate process. How does it work?
When you type ps and press Enter, bash analyzes what you typed, does the usual things such as globbing and variable expansion, and finally, when it is an external command,
bash forks itself.
Forking means that immediately after the fork you have two identical bash processes, each with a different process ID (PID), called the "parent" and the "child". The only difference between the two running bash programs is the return value of fork: the parent gets the PID of the child, while the child gets 0 (it can still look up its parent with getppid()).
After the fork (bash is written this way), the child replaces itself with the new program image (such as ps) using the exec call.
After this, the child bash of course doesn't exist anymore; only the newly executed command, e.g. ps, is running.
Of course, when bash forks itself it also does many other things, such as setting up I/O redirections, opening and closing file handles, changing signal handling for the child, and many more.
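The fork-then-exec sequence can be observed from the shell itself. This sketch assumes bash: $BASHPID reports the PID of the current bash process (so it changes in a forked subshell), and the exec builtin replaces the current process without forking, so the PID stays the same. The variable names are illustrative.

```shell
# fork: a command substitution runs in a forked copy of bash with its own PID.
parent_pid=$$
forked_pid=$(echo "$BASHPID")
echo "parent $parent_pid, forked copy $forked_pid"

# exec: the inner bash prints its PID, then replaces itself with sh,
# which prints its own PID. No fork happens in between, so both match.
pids=$(bash -c 'echo $$; exec sh -c "echo \$\$"')
before_exec=$(printf '%s\n' "$pids" | sed -n 1p)
after_exec=$(printf '%s\n' "$pids" | sed -n 2p)
echo "before exec: $before_exec, after exec: $after_exec"
```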

Why can SIGINT stop bash in a terminal but not via kill -INT?

I noticed something when I run a hanging process via a bash script like this
foo.sh:
sleep 999
If I run it from the command line and press Ctrl+C
./foo.sh
^C
the sleep is interrupted. However, when I try to kill it with SIGINT
ps aux | grep foo
kill -INT 12345 # the /bin/bash ./foo.sh process
then it looks like bash and sleep ignore the SIGINT and keep running. This surprises me. I thought Ctrl+C actually sends SIGINT to the foreground process, so why does it behave differently for Ctrl+C in the terminal and kill -INT?
Ctrl+C actually sends SIGINT to the foreground process group (which consists of a bash process and a sleep process). To do the same with a kill command, send the signal to the process group, e.g.:
kill -INT -12345
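A minimal sketch of that difference, assuming bash. It writes a throwaway copy of foo.sh to a temporary directory, starts it as a background job with job control on (so the script and its sleep share one process group, as they would in the foreground of a terminal), and then signals the group. The 30-second sleep and the file location are illustrative.

```shell
set -m                                # job control: the job gets its own process group
tmpdir=$(mktemp -d)
printf 'sleep 30\n' > "$tmpdir/foo.sh"

bash "$tmpdir/foo.sh" &
script_pid=$!                         # also the process group ID of the job
sleep 1                               # let the script start its sleep

kill -INT -- -"$script_pid"           # like Ctrl+C: SIGINT to every process in the group

int_status=0
wait "$script_pid" || int_status=$?   # 128 + 2 (SIGINT) when the script was interrupted
echo "script exit status: $int_status"
rm -rf "$tmpdir"
```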
Your script is executing "sleep 999", and when you hit Ctrl-C the terminal delivers SIGINT to the foreground processes, including sleep. However, when you tried to kill the shell script from another window with kill, you didn't target the "sleep" process; you targeted the parent shell process, which catches SIGINT and keeps waiting for sleep to finish. Instead, find the process ID of "sleep 999" and kill -2 it; it should exit.
In short, you are signalling two different processes in your test cases, and comparing apples to oranges.
root 27979 27977 0 03:33 pts/0 00:00:00 -bash <-- CTRL-C is interpreted by shell
root 28001 27999 0 03:33 pts/1 00:00:00 -bash
root 28078 27979 0 03:49 pts/0 00:00:00 /bin/bash ./foo.sh
root 28079 28078 0 03:49 pts/0 00:00:00 sleep 100 <-- this is what you should try killing

How to get the process ID to kill a nohup process?

I'm running a nohup process on the server. When I try to kill it, my PuTTY console closes instead.
This is how I try to find the process ID:
ps -ef |grep nohup
This is the command I use to kill it:
kill -9 1787 787
When you use nohup and put the task in the background, the background operator (&) will give you the PID at the command prompt. If your plan is to manually manage the process, you can save that PID and use it later to kill the process if needed, via kill PID or kill -9 PID (if you need to force kill). Alternatively, you can find the PID later on with ps -ef | grep "command name" and locate the PID from there. Note that the nohup keyword/command itself does not appear in the ps output for the command in question.
If you use a script, you could do something like this in the script:
nohup my_command > my.log 2>&1 &
echo $! > save_pid.txt
This will run my_command, saving all output into my.log (in a script, $! represents the PID of the most recent background process). The 2 is the file descriptor for standard error (stderr), and 2>&1 tells the shell to route standard error output to standard output (file descriptor 1). It requires &1 so that the shell knows it's a file descriptor in that context instead of just a file named 1. The 2>&1 is needed to capture any error messages that are normally written to standard error into our my.log file (which receives standard output). See I/O Redirection for more details on handling I/O redirection with the shell.
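The effect of 2>&1 is easy to check on a small scale. In this sketch the function noisy is a made-up stand-in for my_command that writes one line to each stream; the variable names are illustrative as well.

```shell
# Stand-in for my_command: one line on stdout, one on stderr.
noisy() { echo "to stdout"; echo "to stderr" >&2; }

only_stdout=$(noisy 2>/dev/null)   # stderr discarded: only the first line is captured
both=$(noisy 2>&1)                 # stderr routed into stdout: both lines are captured

printf 'stdout only: %s\n' "$only_stdout"
printf 'with 2>&1:\n%s\n' "$both"
```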
If the command sends output on a regular basis, you can check the output occasionally with tail my.log, or if you want to follow it "live" you can use tail -f my.log. Finally, if you need to kill the process, you can do it via:
kill -9 `cat save_pid.txt`
rm save_pid.txt
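Put together, the start and stop halves might look like the sketch below. It uses sleep 300 as a stand-in for my_command, works in a temporary directory so nothing is left behind, and sends the default SIGTERM first, keeping kill -9 as a last resort. All of those choices are assumptions made for illustration.

```shell
workdir=$(mktemp -d)

# Start: run the stand-in command under nohup and remember its PID.
nohup sleep 300 > "$workdir/my.log" 2>&1 &
echo $! > "$workdir/save_pid.txt"

# Stop: read the PID back and signal it only if it is still running.
pid=$(cat "$workdir/save_pid.txt")
if kill -0 "$pid" 2>/dev/null; then
  kill "$pid"                      # polite SIGTERM; escalate to kill -9 only if this fails
fi
wait "$pid" 2>/dev/null || true    # reap the job (it is a child of this shell)

still_running=no
if kill -0 "$pid" 2>/dev/null; then still_running=yes; fi
echo "still running: $still_running"
rm -rf "$workdir"
```

The wait only works because the job was started from the same shell; when stopping a process from a later login, the kill -0 check on its own tells you whether the PID is still alive.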
I am using Red Hat Linux on a VPS (via SSH with PuTTY); for me the following worked:
First, you list all the running processes:
ps -ef
Then in the first column you find your user name; I found it the following three times:
One was the SSH connection
The second was an FTP connection
The last one was the nohup process
Then in the second column you can find the PID of the nohup process, and you only need to type:
kill PID
(replacing the PID with the nohup process's PID of course)
And that is it!
I hope this answer will be useful for someone. I'm also very new to bash and SSH, but found 95% of the knowledge I need here :)
Suppose I am running a Ruby script in the background with the command below:
nohup ruby script.rb &
Then I can get the PID of the background process by specifying the command name. In my case the command is ruby.
ps -ef | grep ruby
output
ubuntu 25938 25742 0 05:16 pts/0 00:00:00 ruby test.rb
Now you can easily kill the process using the kill command:
kill 25938
jobs -l should give you the PIDs of the nohup processes started from the current shell.
kill (-9) them gently.
;)
You could try
kill -9 `pgrep [command name]`
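Because pgrep matches by name, it can return more PIDs than intended. A hedged sketch of narrowing the match with -u (only processes of a given user) and -x (exact name match), both procps options; sleep 300 is just a stand-in for the real command:

```shell
sleep 300 &                                   # stand-in background job
bg_pid=$!

# -u: only processes owned by the current user; -x: the name must be exactly "sleep".
found=$(pgrep -u "$(id -u)" -x sleep)
echo "candidate PIDs:"
printf '%s\n' "$found"

# Kill only the PID that is known to be ours rather than everything pgrep returned.
kill "$bg_pid"
```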
Suppose you are running a Java program with nohup; you can get the Java process ID with
ps aux | grep java
output
xxxxx 9643 0.0 0.0 14232 968 pts/2
then you can kill the process by typing
sudo kill 9643
or, let's say you need to kill all the Java processes; then just use
sudo killall java
This command kills all the Java processes. You can use it with any process; just give the process name at the end of the command:
sudo killall {processName}
If your application always uses the same port, you can kill all the processes on that port like this:
kill -9 $(lsof -t -i:8080)
This works in Ubuntu
Type this to find out the PID
ps aux | grep java
All running processes related to java will be shown.
In my case it is
johnjoe 3315 9.1 4.0 1465240 335728 ? Sl 09:42 3:19 java -jar batch.jar
Now kill it:
kill -9 3315
The zombie process finally stopped.
When you start a nohup job in the background, the shell tells you the process ID:
nohup sh test.sh &
The output will show you the process ID, like
25013
You can then kill it:
kill 25013
I started django server with the following command.
nohup manage.py runserver <localhost:port>
This works on CentOS:
:~ ns$netstat -ntlp
:~ ns$kill -9 PID
This works fine for me on Mac:
kill -9 `ps -ef | awk '/nohup/{ print \$2 }'`
I often do it this way. Try this:
ps aux | grep script_Name
Here, script_Name could be any script/file run by nohup.
This command gets you a process ID. Then use this command below to kill the script running on nohup.
kill -9 1787 787
Here, 1787 and 787 are the process IDs mentioned in the question, used as an example.
This should do what was intended in the question.
If you don't know the PID, first find it using the top command:
top -U userid
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
You will get the PID using top, then perform the kill operation.
$ kill -9 <PID>
Today I ran into the same problem. Since it was a long time ago, I had totally forgotten which command I used and when. I tried three methods:
Use the STIME shown by the ps -ef command. This shows the time you started your process, and it's very likely that you ran your nohup command just before you closed SSH (that depends on you). Unfortunately I don't think the latest command was the one I ran with nohup, so this didn't work for me.
The second is the PPID, also shown by ps -ef. It is the parent process ID, the ID of the process that created the process. On Ubuntu the PPID is 1 for a process started with nohup once its original shell has exited. You can then use ps --ppid "1" to get the list, and check TIME (the total CPU time your process has used) or CMD to find the process's PID.
Use lsof -i:port if the process occupies a port, and you will get the command. Then, as in the answer above, use ps -ef | grep command and you will get the PID.
Once you have found the PID of the process, you can use kill pid to terminate it.
About losing your PuTTY session: often the ps ... | awk/grep/perl/... process gets matched, too! So the old-school trick is this:
ps -ef | grep -i [n]ohup
That way the regex search doesn't match the regex search process!
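Why the bracket trick works can be checked on canned text instead of live ps output. The two sample lines below are made up for illustration: one is what the grep process itself would look like in the listing, the other is a process whose command line really contains nohup.

```shell
# Hypothetical ps-style lines: the searching grep itself, and a real match.
sample='user 1787  787 grep -i [n]ohup
user 2001    1 sh -c nohup ./job.sh'

# The regex [n]ohup matches the text "nohup", but the grep line contains the
# literal characters "[n]ohup", which do not include the string "nohup".
matches=$(printf '%s\n' "$sample" | grep -ci '[n]ohup')
echo "lines matched: $matches"
```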
If you are on a remote server, check memory usage with top and find your process and its ID. After that, just execute kill [your process ID].
