background jobs change to daemon without nohup/disown? - linux

Something strange happened to me.
I have a script, while.sh, with this content:
while [ 1 ];do
sleep 1
echo `date`
done
I run it as $ while.sh >& while.log & (without nohup, disown, setsid, or a double fork()).
After I exit and log in again, the process still exists; its PPID is 1 and its TTY is ?.
My system is RHEL6 with bash (RHEL5 behaves the same).
On CentOS 5.x I had to use nohup or disown, or do a double fork() in code.
What is happening on RHEL6?

Is the huponexit shell option set?
$ shopt
...
huponexit off
Bash will send a SIGHUP signal to its jobs if it receives a SIGHUP itself, but it won't signal them when it exits normally unless you enable this option.
For what it's worth this is disabled on both RHEL6 and RHEL5, at least on the systems I just tested. I tried this command:
$ sleep 1000 &
It was not killed when I logged out and logged back in unless I deliberately enabled shopt -s huponexit.
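A quick way to check this on your own system is sketched below. It assumes bash is on the PATH; the probe string and the variable names are only illustrative and not part of any standard tool.

```shell
# Probe snippet (hypothetical helper): prints "on" or "off" for huponexit.
probe='if shopt -q huponexit; then echo on; else echo off; fi'

# State in a fresh, non-interactive bash (usually "off").
default_state=$(bash -c "$probe")

# State after explicitly enabling the option in that same bash process.
enabled_state=$(bash -c "shopt -s huponexit; $probe")

echo "default: $default_state, after shopt -s: $enabled_state"
```

Keep in mind that, according to the bash manual, huponexit only applies when an interactive login shell exits, so the probe shows the setting rather than the behaviour at logout.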


nohup "does not work" MPIrun

I am trying to use the "nohup" command to avoid killing a background process when exiting the terminal on linux MATE.
The process I want to run is a MPIrun process and I use the following command:
nohup mpirun -np 8 solverName -parallel >log 2>&1
When I leave the terminal, the processes running on the different cores are killed.
Another thing I noticed in the log file: if I just run the following command
mpirun -np 8 solverName -parallel >log 2>&1
and then press CTRL+Z (to stop the process), the log file says:
Forwarding signal 20 to job
and I am unable to actually stop the mpirun command. So I guess there is something I don't understand in what I am doing.
The job run in the background is still owned by your login shell (the nohup command doesn't exit until the mpirun command terminates), so it gets signalled when you disconnect. This script (I call it bk) is what I use:
#!/bin/sh
#
# "@(#)$Id: bk.sh,v 1.9 2008/06/25 16:43:25 jleffler Exp $"
#
# Run process in background
# Immune from logoffs -- output to file log
(
echo "Date: `date`"
echo "Command: $*"
nice nohup "$@"
echo "Completed: `date`"
echo
) >>${LOGFILE:=log} 2>&1 &
(If you're into curiosities, note the careful use of $* and "$@". The nice runs the job at a lower priority when I'm not there. And version 1.1 was checked into version control, SCCS at the time, on 1987-08-10.)
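The difference between $* and "$@" only shows up when an argument contains whitespace. A minimal sketch, assuming the default IFS; count_args is a hypothetical helper introduced just for this illustration:

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

# Two arguments, the first one containing a space.
set -- "two words" single

with_at=$(count_args "$@")            # 2: each argument is preserved as-is
with_star=$(count_args $*)            # 3: unquoted $* is re-split on whitespace
with_quoted_star=$(count_args "$*")   # 1: "$*" joins everything into one string
joined="$*"                           # fine for a log line such as "Command: ..."

echo "\"\$@\": $with_at  \$*: $with_star  \"\$*\": $with_quoted_star  joined: $joined"
```

That is why a wrapper would typically log the command with $* but execute it with "$@".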
For your process, you'd run:
$ bk mpirun -np 8 solverName -parallel
$
The prompt returns almost immediately. The key differences between what is in that code and what you do direct from the command line are:
There's a sub-process for the shell script, which terminates promptly.
The script itself runs the command in a sub-shell in background.
Between them, these mean that the process is not interfered with by your login shell; it doesn't know about the grandchild process.
Running direct on the command line, you'd write:
(nohup mpirun -np 8 solverName -parallel >log 2>&1 &)
The parentheses start a subshell; the sub-shell runs nohup in the background with I/O redirection and terminates. The continuing command is a grandchild of your login shell and is not interfered with by your login shell.
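To see the effect of the parentheses without mpirun, here is a small sketch that uses sleep as a stand-in for the long-running command. The temporary files and variable names are assumptions made for the demo.

```shell
pidfile=$(mktemp)
logfile=$(mktemp)

# The subshell backgrounds the command, records its PID, and exits at once.
( nohup sleep 5 >"$logfile" 2>&1 & echo "$!" >"$pidfile" )

detached_pid=$(cat "$pidfile")
tracked_jobs=$(jobs -p)   # jobs known to *this* shell; expected to be empty

# The process is alive even though this shell has no job entry for it.
if kill -0 "$detached_pid" 2>/dev/null; then still_running=yes; else still_running=no; fi
echo "pid $detached_pid running: $still_running; jobs tracked here: '$tracked_jobs'"

# Clean up the demo process and the temporary files.
kill "$detached_pid" 2>/dev/null
rm -f "$pidfile" "$logfile"
```

Because the command never appears in the job table of the calling shell, that shell has nothing to signal when it exits.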
I'm not an expert in mpirun, never having used it, so there's a chance it does something I'm not expecting. My impression from the manual page is that it acts more or less like a regular process even though it can run multiple other processes, possibly on multiple nodes. That is, it runs the other processes but monitors and coordinates them and only exits when its children are complete. If that's correct, then what I've outlined is accurate enough.
To kill the process, first run:
$ jobs -l
This gives you the PID of the process, like this:
[1]+ 47274 Running nohup mpirun -np 8 solverName -parallel >log 2>&1
Then kill the process using that PID (47274 here):
kill -9 47274
Note that Ctrl+Z does not kill the process; it suspends it.
For the first part of the question, I recommend trying this command to see whether it works:
nohup mpirun -n 8 --your_flags ./compiled_solver_name > Output.txt &
It worked for me.
Tell us if it doesn't work for you.

Don't show the output of kill command in a Linux bash script [duplicate]

How can you suppress the Terminated message that comes up after you kill a
process in a bash script?
I tried set +bm, but that doesn't work.
I know another solution involves calling exec 2> /dev/null, but is that
reliable? How do I reset it back so that I can continue to see stderr?
In order to silence the message, you must be redirecting stderr at the time the message is generated. Because the kill command sends a signal and doesn't wait for the target process to respond, redirecting stderr of the kill command does you no good. The bash builtin wait was made specifically for this purpose.
Here is a very simple example that kills the most recent background command. ($! expands to the PID of the most recently started background command.)
kill $!
wait $! 2>/dev/null
Because both kill and wait accept multiple pids, you can also do batch kills. Here is an example that kills all background processes (of the current process/script of course).
kill $(jobs -rp)
wait $(jobs -rp) 2>/dev/null
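As a slightly fuller sketch of the same kill-then-wait idea, the following also captures the exit status that wait reports. The variable names are arbitrary, and sleep stands in for whatever background command you have.

```shell
# Start a stand-in background command and remember its PID.
sleep 30 &
pid=$!

# Send the default SIGTERM, then reap the process with stderr silenced
# so the shell's "Terminated" notice has nowhere to go.
kill "$pid"
status=0
wait "$pid" 2>/dev/null || status=$?

# A process ended by SIGTERM is reported as 128 + 15 = 143.
echo "background command ended with status $status"
```

The wait is what makes the difference: it collects the job while stderr is redirected, so nothing is left for the shell to report later.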
I was led here from bash: silently kill background function process.
The short answer is that you can't. Bash always prints the status of foreground jobs. The monitoring flag only applies for background jobs, and only for interactive shells, not scripts.
see notify_of_job_status() in jobs.c.
As you say, you can redirect so standard error is pointing to /dev/null but then you miss any other error messages. You can make it temporary by doing the redirection in a subshell which runs the script. This leaves the original environment alone.
(script 2> /dev/null)
which will lose all error messages, but just from that script, not from anything else run in that shell.
You can save and restore standard error, by redirecting a new filedescriptor to point there:
exec 3>&2 # 3 is now a copy of 2
exec 2> /dev/null # 2 now points to /dev/null
script # run script with redirected stderr
exec 2>&3 # restore stderr to saved
exec 3>&- # close saved version
But I wouldn't recommend this. The only upside compared with the first approach is that it saves a subshell invocation, while it is more complicated and may even alter the behavior of the script if the script changes file descriptors itself.
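For anyone who wants to try the save-and-restore variant in isolation, here is a self-contained sketch. The functions noisy and demo are hypothetical stand-ins for the script, and the outer capture exists only so the result can be inspected.

```shell
# Stand-in for the script: writes one line to stderr.
noisy() { echo "error from noisy" >&2; }

demo() {
    exec 3>&2            # fd 3 is now a copy of stderr
    exec 2>/dev/null     # stderr points at /dev/null
    noisy                # its message is discarded
    exec 2>&3            # restore stderr from the saved copy
    exec 3>&-            # close the saved copy
    echo "stderr restored" >&2
}

# Capture everything demo writes to stderr.
captured=$(demo 2>&1)
echo "captured: $captured"
```

Only the message written after the restore shows up in the capture, which is the behaviour the answer describes.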
EDIT:
For a more appropriate answer, see the one given by Mark Edgar.
Solution: use SIGINT (works only in non-interactive shells)
Demo:
cat > silent.sh <<"EOF"
sleep 100 &
kill -INT $!
sleep 1
EOF
sh silent.sh
http://thread.gmane.org/gmane.comp.shells.bash.bugs/15798
Maybe detach the process from the current shell process by calling disown?
The Terminated message is logged by the default signal handler of bash 3.x and 4.x. Just trap the TERM signal at the very start of the child process:
#!/bin/sh
## assume script name is test.sh
foo() {
trap 'exit 0' TERM ## here is the key
while true; do sleep 1; done
}
echo before child
ps aux | grep 'test\.s[h]\|slee[p]'
foo &
pid=$!
sleep 1 # wait trap is done
echo before kill
ps aux | grep 'test\.s[h]\|slee[p]'
kill $pid ## no need to redirect stdin/stderr
sleep 1 # wait kill is done
echo after kill
ps aux | grep 'test\.s[h]\|slee[p]'
Is this what we are all looking for?
Not wanted:
$ sleep 3 &
[1] 234
<pressing enter a few times....>
$
$
[1]+ Done sleep 3
$
Wanted:
$ (set +m; sleep 3 &)
<again, pressing enter several times....>
$
$
$
$
$
As you can see, no job end message. Works for me in bash scripts as well, also for killed background processes.
'set +m' disables job control (see 'help set') for the current shell. If you run your command in a subshell (in parentheses, as done here), you do not affect the job control settings of the current shell. The only disadvantage is that you need to get the PID of your background process back to the current shell if you want to check whether it has terminated or evaluate its return code.
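One possible way to get the PID back is to let the subshell print it and capture that with command substitution. This is only a sketch: sleep stands in for the real command, and the background command's output has to be redirected, otherwise the substitution would wait for it to finish.

```shell
# The command substitution is itself a subshell, so set +m stays local to it.
bg_pid=$(set +m; sleep 30 >/dev/null 2>&1 & echo "$!")

# The process is not a job of this shell, but it can still be probed and signalled.
if kill -0 "$bg_pid" 2>/dev/null; then was_running=yes; else was_running=no; fi
echo "background pid $bg_pid, running: $was_running"

# Stop the stand-in process again.
kill "$bg_pid" 2>/dev/null
```

Since the process is not a child of the current shell, wait cannot be used on it; kill -0 is the usual way to poll whether it is still there.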
This also works for killall (for those who prefer it):
killall -s SIGINT (yourprogram)
suppresses the message... I was running mpg123 in background mode.
It could only silently be killed by sending a ctrl-c (SIGINT) instead of a SIGTERM (default).
disown did exactly the right thing for me -- the exec 3>&2 is risky for a lot of reasons -- set +bm didn't seem to work inside a script, only at the command prompt
Had success with adding 'jobs 2>&1 >/dev/null' to the script, not certain if it will help anyone else's script, but here is a sample.
while true; do echo $RANDOM; done | while read line
do
echo Random is $line the last jobid is $(jobs -lp)
jobs 2>&1 >/dev/null
sleep 3
done
Another way to disable job notifications is to place your command to be backgrounded in a sh -c 'cmd &' construct.
#!/bin/bash
# ...
pid="`sh -c 'sleep 30 & echo ${!}' | head -1`"
kill "$pid"
# ...
# or put several cmds in sh -c '...' construct
sh -c '
sleep 30 &
pid="${!}"
sleep 5
kill "${pid}"
'
I found that putting the kill command in a function and then backgrounding the function suppresses the termination output
function killCmd() {
kill $1
}
killCmd $somePID &
Simple:
{ kill $!; } 2>/dev/null
Advantage? You can use any signal, for example:
{ kill -9 $PID; } 2>/dev/null

Kill ssh and/or remote process from bash script

I am trying to run the following command as part of a bash script. It is supposed to open an ssh channel, run a program on the remote machine, save the output to a file for 10 seconds, kill the process that was writing to the file, and then give control back to the bash script.
#!/bin/bash
ssh hostname '/root/bin/nodes-listener > /tmp/nodesListener.out </dev/null; sshpid=!$; sleep 10; kill -9 $sshpid 2>/dev/null &'
Unfortunately, what it seems to be doing is starting the program: nodes-listener remotely, but it never gets any further and it doesn't give control to the bash script. So, the only way to stop the execution is to do Ctrl+C.
Killing ssh doesn't help (or rather can't be executed) since the control is not with bash script as it waits for the command within the ssh session to complete, which of course never happens as it has to be killed to stop.
Here's the command line that you're running on the remote system:
/root/bin/nodes-listener > /tmp/nodesListener.out </dev/null
sshpid=!$
sleep 10
kill -9 $sshpid 2>/dev/null &
You should change it to this:
/root/bin/nodes-listener > /tmp/nodesListener.out </dev/null &   # <-- ampersand goes here
sshpid=$!
sleep 10
kill -9 $sshpid 2>/dev/null
You want to start nodes-listener and then kill it after ten seconds. To do this, you need to start nodes-listener as a background process, so that the shell executing this command line can move on to the next command after starting nodes-listener. The & in your command line is in the wrong place and applies only to the kill command. You need to apply it to the nodes-listener command.
I'll also note that your sshpid=!$ line was incorrect. You want sshpid=$!. $! is the process ID of the last command started in the background.
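The same sequence (start in the background, record $!, sleep, kill) can be tried locally before wrapping it in ssh. In this sketch a shell function named fake_listener stands in for nodes-listener, and the delay is shortened to two seconds; both are assumptions made for the demo.

```shell
# Hypothetical stand-in for nodes-listener: prints a line every second, forever.
fake_listener() { while :; do echo tick; sleep 1; done; }

outfile=$(mktemp)

# The ampersand belongs to the long-running command, not to kill.
fake_listener >"$outfile" </dev/null &
listener_pid=$!

sleep 2                 # let it collect some output
kill "$listener_pid"    # then stop it

# Reap it quietly and record how it ended (128 + 15 for SIGTERM).
listener_status=0
wait "$listener_pid" 2>/dev/null || listener_status=$?

line_count=$(wc -l <"$outfile" | tr -d ' ')
echo "collected $line_count line(s), listener status $listener_status"
rm -f "$outfile"
```

Once this behaves as expected, the body can be quoted and passed to ssh as a single remote command.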
You need to place the ampersand after the first command, then put the remaining commands onto the next line:
ssh hostname -- '/root/bin/nodes-listener > /tmp/nodesListener.out </dev/null &
sshpid=$!; sleep 10; kill $sshpid 2>/dev/null'
Btw, ssh returns after all commands have been executed. This means it will close the allocated pty as well. If there are still background jobs running in that shell session, they would be killed by SIGHUP. This means you can probably omit the explicit kill command (depending on whether nodes-listener handles SIGHUP and SIGTERM differently). With that, you could simplify the code to the following:
ssh hostname -- sh -c '/root/bin/nodes-listener > /tmp/nodesListener.out </dev/null &
sleep 10'
I have resolved this by pushing the shell script to the remote machine and executing it there. It is actually less tidy and relies on space being available on the remote computer.
Since my remote machine is a small physical device, the issue of the space usage is important (even for the tiny amount of space required in this case).
/root/bin/nodes-listener > /tmp/nodesListener.out </dev/null &
sshpid=$!
sleep 20
sync
# killing nodes-listener process and giving control back to the base bash
killall -9 nodes-listener 2>/dev/null && echo "nodes-listener is killed"

How to run a program and know its PID in Linux?

If I have several shells running each other, will they all have separate PIDs?
Greg's wiki to the rescue:
$! is the PID of the last backgrounded process.
kill -0 $PID checks whether $PID is still running. Only use this for processes started by the current process or its descendants, otherwise the PID could have been recycled.
wait waits for all children to exit before continuing.
Actually, just read the link - It's all there (and more).
$$ is the PID of the current shell.
And yes, each shell will have its own PID (unless it's some homebrewed shell which doesn't fork to create a "new" shell).
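The pieces mentioned above ($$, $!, kill -0 and wait) fit together roughly as in this sketch. The variable names are arbitrary and sleep stands in for the program you want to run.

```shell
shell_pid=$$                      # PID of the current shell
nested_pid=$(sh -c 'echo $$')     # a newly started shell reports its own PID

# Run a program in the background and remember its PID.
sleep 1 &
child_pid=$!

# kill -0 sends no signal; it only tests whether the PID can be signalled.
if kill -0 "$child_pid" 2>/dev/null; then before=running; else before=gone; fi

# wait blocks until the child exits and returns its exit status.
wait "$child_pid"
child_status=$?

if kill -0 "$child_pid" 2>/dev/null; then after=running; else after=gone; fi

echo "shell=$shell_pid nested=$nested_pid child=$child_pid"
echo "before wait: $before, exit status: $child_status, after wait: $after"
```

Note that $$ inside a plain subshell such as $( ... ) still reports the parent shell's PID, which is why the nested shell is started with sh -c here.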
1) There is a variable for that, often $$:
edd@max:~$ echo $$ # shell itself
20559
edd@max:~$ bash -c 'echo $$' # new shell with different PID
19284
edd@max:~$ bash -c 'echo $$' # ditto
19382
edd@max:~$
2) Yes they do, the OS / kernel does that for you.
The top command in Linux (Ubuntu) shows the memory usage of all running programs along with their PIDs. kill followed by a PID can then kill the process.

How bash handles the jobs when logout?

As far as I understand from the books and the bash manual, when a user logs out of bash, all background jobs started by that user are terminated automatically unless nohup or disown is used. But today I tested it:
Logged in to my gnome desktop and accessed gnome-terminal.
There are two tabs in the terminal and in one I created a new user called test and logged in as test
su - test
started a script.
cat test.sh
#!/bin/bash
sleep 60
printf "hello world!!"
exit 0
./test.sh &
After that I logged out of test and closed the tab.
In the other tab I executed ps aux as root and found that the job was still running.
How is this happening?
Whether running background jobs are terminated on exit depends on the shell. Bash normally does not do this, but can be configured to for login shells (shopt -s huponexit). In any case, access to the tty is impossible after the controlling process (such as a login shell) has terminated.
Situations that do always cause SIGHUP include:
Anything in foreground when the tty is closed down.
Any background job that includes stopped processes when their shell terminates (SIGCONT and SIGHUP). Shells typically warn you before letting this happen.
huponexit summary:
On: Background jobs will be terminated with SIGHUP when shell exits
$ shopt -s huponexit
$ shopt huponexit
huponexit on
Off: Background jobs will NOT be terminated with SIGHUP when shell exits.
$ shopt -u huponexit
$ shopt huponexit
huponexit off
Only interactive shells kill jobs when you close them. Other shells (for example those you get by using su - username) don't do that. And interactive shells only kill direct subprocesses.
