Why use (ps -f&) to display process information, and then display the PPID of 1 instead of the PID of the main shell (-bash)? [closed] - linux

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 3 years ago.
When I use (ps -f&) to display process information, the PPID of ps is 1. I am confused: why is it not the PID of the main shell (-bash)? I ran the same command twice, and the second run produced a strange process ([bash] <defunct>), as the following output shows:
The first test:
[root@localhost ~]# (ps -f&)
UID PID PPID C STIME TTY TIME CMD
root 2078 2076 0 01:44 pts/0 00:00:00 -bash
root 2244 1 0 03:07 pts/0 00:00:00 ps -f
The second test:
[root@localhost ~]# (ps -f&)
UID PID PPID C STIME TTY TIME CMD
root 2078 2076 0 01:44 pts/0 00:00:00 -bash
root 2245 2078 0 03:07 pts/0 00:00:00 [bash] <defunct>
root 2246 1 0 03:07 pts/0 00:00:00 ps -f
I tested it many times and found that the [bash] <defunct> process appears only occasionally, but the PPID of the ps -f process is always 1.
My question is:
Why is the PPID of ps -f 1 instead of the PID of the main shell (-bash)?
What is the strange [bash] <defunct> process? Why didn't it appear in the first test?

When you run ( ps -f & ) with the ampersand, the subshell doesn't wait for the ps process, so chances are it will exit before ps does. If it does, ps no longer has a parent that would reap its exit status with wait/waitpid/waitid. On UNIX systems such processes (so-called orphan processes) get reparented, normally to the init process (PID 1). Linux also allows for subreapers other than init.
What you're seeing in the second test is a temporary zombie. When a child process exits, it becomes a zombie (<defunct>) until its parent reaps its exit information. You must have caught the subshell at a moment when it had exited but its parent (your main shell) had not yet reaped its exit information. Unless the parent shell is somehow blocked from continuing and thereby reaping the exit information, this is only a transient state.
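The reparenting can be observed with a minimal sketch like the one below. It assumes Linux with /proc mounted; ppid_of and spawn_orphan are hypothetical helper names, not part of any tool. On a system with a subreaper (for example a systemd user session or a container runtime) the reported parent may be that subreaper rather than PID 1.

```shell
#!/usr/bin/env bash
# Print the parent PID of a process by reading /proc/PID/status
# (Linux-specific; roughly what `ps -o ppid= -p PID` reports).
ppid_of() {
    local key value
    while read -r key value; do
        if [ "$key" = "PPid:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Start `sleep` in the background of a subshell and print its PID.
# The subshell exits right away, so `sleep` loses its parent.
# Output is redirected so the caller's $(...) does not wait for sleep.
spawn_orphan() {
    ( sleep 5 >/dev/null 2>&1 & echo "$!" )
}

orphan_pid=$(spawn_orphan)
echo "this shell's PID: $$"
echo "orphan's PPID:    $(ppid_of "$orphan_pid")"
```

The second number is expected to differ from the first, because the subshell that started sleep no longer exists by the time the parent PID is read.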

Related

How to get the PID, process name, command line of the current terminal window session? [closed]

Closed 2 years ago.
I am trying to find out how to get the PID, process name, and command line of the current terminal (that is, what is running in the background and was started from that terminal).
By running:
echo $$
15925
You will get the process ID of your current session. Using this process ID, you can then run:
ps -ef | grep 15925
foo 14870 15925 0 10:32 pts/6 00:00:00 sleep 120
foo 14871 15925 0 10:32 pts/6 00:00:00 ps -ef
foo 14872 15925 0 10:32 pts/6 00:00:00 grep --color=auto 15925
foo 15925 15919 0 Nov23 pts/6 00:00:08 -bash
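The second column shows each process's PID and the third its parent's PID: the first three lines are children of the session (PPID 15925), and the last line is the session shell itself (PID 15925).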
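Note that grep 15925 matches that number anywhere in a line, not just in the PPID column. A minimal sketch of a stricter filter follows; children_of is a hypothetical helper name, and the repeated -o options are used because the procps manual recommends them when several header-less columns are requested.

```shell
#!/usr/bin/env bash
# Keep only the rows whose second column (PPID) equals the given PID.
# Expects "PID PPID COMMAND..." rows on standard input.
children_of() {
    awk -v parent="$1" '$2 == parent'
}

# List the direct children of the current shell.
ps -e -o pid= -o ppid= -o args= | children_of "$$"
```

Fed with the listing above, the filter would keep the sleep, ps and grep lines and drop the -bash line, whose PPID is 15919.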

pkill and killall command is not working on Ubuntu 18 [closed]

Closed 3 years ago.
I have the following process:
root 18538 1 0 00:03 ? 00:00:36 /data/software/anaconda2/envs/py3/bin/python /data/software/anaconda2/envs/py3/bin/gunicorn zapier2cloud.wsgi:application -c ./zapier2cloud.conf.py
root 18541 1 0 00:03 ? 00:00:32 /data/software/anaconda2/envs/py3/bin/python /data/software/anaconda2/envs/py3/bin/gunicorn zapier2cloud.wsgi:application -c ./zapier2cloud.conf.py
root 18544 1 0 00:03 ? 00:00:36 /data/software/anaconda2/envs/py3/bin/python /data/software/anaconda2/envs/py3/bin/gunicorn zapier2cloud.wsgi:application -c ./zapier2cloud.conf.py
root 18545 1 0 00:03 ? 00:00:37 /data/software/anaconda2/envs/py3/bin/python /data/software/anaconda2/envs/py3/bin/gunicorn zapier2cloud.wsgi:application -c ./zapier2cloud.conf.py
root 18546 1 0 00:03 ? 00:00:36 /data/software/anaconda2/envs/py3/bin/python /data/software/anaconda2/envs/py3/bin/gunicorn zapier2cloud.wsgi:application -c ./zapier2cloud.conf.py
root 18547 1 0 00:03 ? 00:00:40 /data/software/anaconda2/envs/py3/bin/python /data/software/anaconda2/envs/py3/bin/gunicorn zapier2cloud.wsgi:application -c ./zapier2cloud.conf.py
I ran the command: sudo pkill -f gunicorn, but after that, it still shows the same processes with the same pids.
What happened? Is there anything wrong?
Sometimes a process can simply ignore your kill command. The default signal sent is SIGTERM; if the process catches or ignores it, the process will continue to run.
To force-kill your process, send a SIGKILL signal, which cannot be caught or ignored, using '-9':
kill -9 {PID}
List of signals :
kill -l
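A common pattern is to try SIGTERM first and fall back to SIGKILL only if the process is still there after a grace period. The following is a minimal sketch of that idea; terminate_then_kill and its default of 5 seconds are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Send SIGTERM, wait up to GRACE seconds, then fall back to SIGKILL.
# Usage: terminate_then_kill PID [GRACE]
terminate_then_kill() {
    local pid=$1 grace=${2:-5} waited=0
    # If the signal cannot be delivered, the process is already gone
    # (or belongs to another user) and there is nothing more to do.
    kill -TERM "$pid" 2>/dev/null || return 0
    # kill -0 delivers no signal; it only checks that the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

For example, terminate_then_kill 18538 10 would give PID 18538 ten seconds to shut down cleanly before forcing it.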

Difference between pidof and pgrep? [closed]

Closed 4 years ago.
I'm not sure why pidof doesn’t work, but pgrep works.
$ pidof squid
returns nothing
$ pgrep squid
returns 3322
How can I get the 3322 using pidof?
pidof will return details regarding the name of an actual program, whereas pgrep will return details regarding any processes that match the provided pattern. This is clearly stated in the man pages of both tools.
pidof [-s] [-c] [-n] [-x] [-m] [-o omitpid[,omitpid..]] [-o omitpid[,omitpid..]..] program [program..]
vs.
pgrep [options] pattern
When you're looking for the executable squid, pgrep can match it because the pattern matches /usr/bin/squid*, whereas pidof cannot find a program called exactly squid, because the Squid daemon is likely called something like /usr/bin/squid-server.
For example, here I'm looking at the output of ps and looking for programs running with the name systemd within them:
$ ps -eaf | grep systemd
root 1 0 0 Sep03 ? 00:00:05 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
root 425 1 0 Sep03 ? 00:00:03 /usr/lib/systemd/systemd-journald
root 480 1 0 Sep03 ? 00:00:00 /usr/lib/systemd/systemd-udevd
dbus 630 1 0 Sep03 ? 00:00:01 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root 648 1 0 Sep03 ? 00:00:00 /usr/lib/systemd/systemd-logind
pgrep is able to find them as well:
$ pgrep -l systemd
1 systemd
425 systemd-journal
480 systemd-udevd
648 systemd-logind
But pidof only finds the first one:
$ pidof systemd
1
That's because only PID 1 has exactly the name systemd (/usr/lib/systemd/systemd in the ps output above).
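The difference between pattern matching and exact-name matching can be reproduced on a plain list of names with grep. This is only an analogy under that assumption, not how either tool is implemented; match_pattern and match_exact are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Process names taken from the `pgrep -l systemd` output above.
names='systemd
systemd-journal
systemd-udevd
systemd-logind'

# pgrep-style: the argument is a regular expression that may match
# anywhere inside the name.
match_pattern() {
    grep -E -- "$1"
}

# pidof-style: the whole name has to equal the argument.
match_exact() {
    grep -Fx -- "$1"
}

echo "pattern match:"
printf '%s\n' "$names" | match_pattern systemd
echo "exact match:"
printf '%s\n' "$names" | match_exact systemd
```

The pattern match keeps all four names, while the exact match keeps only systemd.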

Recursively kill R process with children in linux

I am looking for a general method to launch and then kill an R process, including possibly all forks or other processes that it invoked.
For example, a user runs a script like this:
library(multicore);
for(i in 1:3) parallel(foo <- "bar");
for(i in 1:3) system("sleep 300", wait=FALSE);
for(i in 1:3) system("sleep 300&");
q("no")
After the user quits the R session, the child processes are still running:
jeroen@jeroen-ubuntu:~$ ps -ef | grep R
jeroen 4469 1 0 16:38 pts/1 00:00:00 /usr/lib/R/bin/exec/R
jeroen 4470 1 0 16:38 pts/1 00:00:00 /usr/lib/R/bin/exec/R
jeroen 4471 1 0 16:38 pts/1 00:00:00 /usr/lib/R/bin/exec/R
jeroen 4502 4195 0 16:39 pts/1 00:00:00 grep --color=auto R
jeroen@jeroen-ubuntu:~$ ps -ef | grep "sleep"
jeroen 4473 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4475 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4477 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4479 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4481 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4483 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4504 4195 0 16:39 pts/1 00:00:00 grep --color=auto sleep
To make things worse, their parent process ID is 1, making it hard to identify them. Is there a method to run an R script in a way that allows me to recursively kill the process and its children at any time?
Edit: so I don't want to manually have to go in to search & kill processes. Also I don't want to kill all R processes, as there might be others that are doing fine. I need a method to kill a specific process and all of its children.
This is mainly about the multicore part. Children are waiting for you to collect the results - see ?collect. Normally, you should never use parallel without a provision to clean up, typically in on.exit. multicore cleans up in high-level functions like mclapply, but if you use lower-level functions it is your responsibility to perform the cleanup (since multicore cannot know if you left the children running intentionally or not).
Your example is really bogus, because you don't even consider collecting results. But anyway, if that is really what you want, you'll have to do the cleanup at some point. For example, if you want to terminate all children on exit, you could define .Last like this:
.Last <- function(...) {
    collect(wait=FALSE)
    all <- children()
    if (length(all)) {
        kill(all, SIGTERM)
        collect(all)
    }
}
Again, the above is not a recommended way to deal with this; it is rather a last resort. You should really assign jobs and collect results like
jobs <- lapply(1:3, function(i) parallel({Sys.sleep(i); i}))
collect(jobs)
As for the general process child question - init inherits the children only after R quits, but in .Last you can still find their pids since the parent process exists at that point so you could perform similar cleanup as in the multicore case.
Before the user quits the R session, the processes you want to kill will have parent process ID equal to the process ID of the session that started them. You could perhaps use the .Last or .Last.sys hooks (see help(q)) to kill all processes with the appropriate PPID at that point; those can be suppressed with q(runLast=FALSE), so it isn't perfect, but I think it's the best option you have.
After the user quits the R session, there is no reliable way to do what you want -- the only record the kernel keeps of process parentage is the PPID you see in ps -ef, and when a parent process exits, that information is destroyed, as you have discovered.
Note that if one of the child processes forks, the grandchild will have PPID equal to the child's PID, and that will get reset to 1 when the child exits, which it might do before the grandparent exits. Thus, there is no reliable way to catch all of a process's descendants in general, even if you do so before the process exits. (One hears that "cgroups" provide a way, but one is unfamiliar with the details; in any case, that is an optional feature which only some iterations/configurations of the Linux kernel provide, and is not available at all elsewhere.)
I believe the latter part of the question is more a consideration of the shell, rather than the kernel. (Simon Urbanek has answered the multicore part better than pretty much anyone else could, as he's the author. :))
If you're using bash, you can find the PID of the most recently launched child process in $!. You can aggregate the PIDs and then be sure to kill those off when you close R.
If you want to be really gonzo, you could store parent PID (i.e. the output of Sys.getpid()) and child PID in a file and have a cleaning daemon that checks whether or not the parent PID exists and, if not, kills the orphans. I don't think it'll be that easy to get a package called oRphanKilleR onto CRAN, though.
Here is an example of appending the child PID to a file:
system('(sleep 20) & echo $! >> ~/childPIDs.txt', wait = FALSE)
You can modify this to create your own shell command and use R's tempfile() command to create a temporary file (albeit, that will disappear when the R instance is terminated, unless you take a special effort to preserve the file via permissions).
For some other clever ideas, see this other post on SO.
You can also create a do while loop in the shell that will check for whether or not a particular PID is in existence. While it is, the loop sleeps. Once the loop terminates (because the PID is no longer in use), the script will kill another PID.
Basically, I think your solution will be in shell scripting, rather than R.
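The watchdog loop described above could look roughly like the sketch below. kill_when_gone is a hypothetical helper name. Two caveats apply: kill -0 also fails for processes owned by another user, and PIDs can be reused, so this is only a best-effort check.

```shell
#!/usr/bin/env bash
# Sleep while WATCHED is still alive, then terminate TARGET.
# Usage: kill_when_gone WATCHED_PID TARGET_PID
kill_when_gone() {
    local watched=$1 target=$2
    # kill -0 delivers no signal; it only tests whether the PID exists.
    while kill -0 "$watched" 2>/dev/null; do
        sleep 1
    done
    # The watched process is gone: clean up the process it left behind.
    kill -TERM "$target" 2>/dev/null || true
}
```

It would typically be started in the background, for example kill_when_gone "$r_pid" "$child_pid" &, where both variables hold PIDs recorded when the processes were launched.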

How is it possible that kill -9 for a process on Linux has no effect?

I'm writing a plugin to highlight text strings automatically as you visit a web site. It's like the highlight search results but automatic and for many words; it could be used for people with allergies to make words really stand out, for example, when they browse a food site.
But I have a problem. When I try to close an empty, fresh FF window, it somehow blocks the whole process. When I kill the process, all the windows vanish, but the Firefox process stays alive (parent PID is 1, doesn't listen to any signals, has lots of resources open, still eats CPU, but won't budge).
So two questions:
How is it even possible for a process not to listen to kill -9 (neither as user nor as root)?
Is there anything I can do but a reboot?
[EDIT] This is the offending process:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
digulla 16688 4.3 4.2 784476 345464 pts/14 D Mar28 75:02 /opt/firefox-3.0/firefox-bin
Same with ps -ef | grep firefox
UID PID PPID C STIME TTY TIME CMD
digulla 16688 1 4 Mar28 pts/14 01:15:02 /opt/firefox-3.0/firefox-bin
It's the only process left. As you can see, it's not a zombie, it's running! It doesn't listen to kill -9, no matter if I kill by PID or name! If I try to connect with strace, then the strace also hangs and can't be killed. There is no output, either. My guess is that FF hangs in some kernel routine but which?
[EDIT2] Based on feedback by sigjuice:
ps axopid,comm,wchan
can show you in which kernel routine a process hangs. In my case, the offending plugin was the Beagle Indexer (openSUSE 11.1). After disabling the plugin, FF was a quick and happy fox again.
As noted in comments to the OP, a process status (STAT) of D indicates that the process is in an "uninterruptible sleep" state. In real-world terms, this generally means that it's waiting on I/O and can't/won't do anything - including dying - until that I/O operation completes.
Processes in a D state will normally only be there for a fraction of a second before the operation completes and they return to R/S. In my experience, if a process gets stuck in D, it's most often trying to communicate with an unreachable NFS or other remote filesystem, trying to access a failing hard drive, or making use of some piece of hardware by way of a flaky device driver. In such cases, the only way to recover and allow the process to die is to either get the fs/drive/hardware back up and running so the I/O can complete or to give up and reboot the system. In the specific case of NFS, the mount may also eventually time out and return from the I/O operation (with a failure code), but this is dependent on the mount options and it's very common for NFS mounts to be set to wait forever.
This is distinct from a zombie process, which will have a status of Z.
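To list the processes that are currently in one of these states, the STAT column of ps can be filtered by its first letter. A minimal sketch, with in_state as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Keep only the rows whose STAT column starts with the given letter.
# Expects "PID STAT COMMAND" rows on standard input.
in_state() {
    awk -v s="$1" 'substr($2, 1, 1) == s'
}

# D = uninterruptible sleep, Z = zombie
echo "uninterruptible sleep:"
ps -e -o pid= -o stat= -o comm= | in_state D
echo "zombies:"
ps -e -o pid= -o stat= -o comm= | in_state Z
```

Only the first letter is compared because ps may append modifiers to the state, such as s for a session leader or + for a foreground process.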
Double-check that the parent ID is really 1. If not, and this is Firefox, first try sudo killall -9 firefox-bin. After that, try killing the specific process IDs individually with sudo kill -9 [process-id].
How is it even possible for a process not to listen to kill -9 (neither as user nor as root)?
If a process has become a zombie (<defunct>) with a parent of 1, you can't kill it manually; only init can reap it. Zombie processes are already dead and gone: they can no longer be killed because they are no longer processes, only a process table entry and its associated exit code, waiting to be collected. You would need to kill the parent, and you can't kill init for obvious reasons.
But see here for more general information. A reboot will kill everything, naturally.
Is it possible that this process is restarted (for example by init) just at the time you kill it?
You can check this easily. If the PID is the same after kill -9 PID then the process wasn't killed, but if it has changed the process has been restarted.
I recently fell into the double-fork pitfall and landed on this page before finally finding my answer. The symptoms are identical even if the problem is not the same:
WYKINWYT: What You Kill Is Not What You Thought
The minimal test code is shown below based on an example for an SNMP Daemon
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char* argv[])
{
    //We omit the -f option (do not Fork) to reproduce the problem
    char * options[]={"/usr/local/sbin/snmpd",/*"-f",*/"-d","--master=agentx", "-Dagentx","--agentXSocket=tcp:localhost:1706", "udp:10161", (char*) NULL};
    pid_t pid = fork();
    if ( 0 > pid ) return -1;
    switch(pid)
    {
        case 0:
        { //Child launches SNMP daemon
            execv(options[0],options);
            exit(-2); //Only reached if execv() fails
            break;
        }
        default:
        {
            sleep(10); //Simulate "long" activity
            kill(pid,SIGTERM);//kill what should be child,
            //i.e the SNMP daemon I assume
            printf("Signal sent to %d\n",(int)pid);
            sleep(10); //Simulate "long" operation before closing
            waitpid(pid, NULL, 0); //Reap the child we forked
            printf("SNMP should be now down\n");
            getchar();//Blocking (for observation only)
            break;
        }
    }
    printf("Bye!\n");
    return 0;
}
During the first phase the main process (7699) launches the SNMP daemon (7700), but we can see that this one is now defunct (a zombie). We can also see another process (7702) running with the options we specified:
[nils@localhost ~]$ ps -ef | tail
root 7439 2 0 23:00 ? 00:00:00 [kworker/1:0]
root 7494 2 0 23:03 ? 00:00:00 [kworker/0:1]
root 7544 2 0 23:08 ? 00:00:00 [kworker/0:2]
root 7605 2 0 23:10 ? 00:00:00 [kworker/1:2]
root 7698 729 0 23:11 ? 00:00:00 sleep 60
nils 7699 2832 0 23:11 pts/0 00:00:00 ./main
nils 7700 7699 0 23:11 pts/0 00:00:00 [snmpd] <defunct>
nils 7702 1 0 23:11 ? 00:00:00 /usr/local/sbin/snmpd -Lo -d --master=agentx -Dagentx --agentXSocket=tcp:localhost:1706 udp:10161
nils 7727 3706 0 23:11 pts/1 00:00:00 ps -ef
nils 7728 3706 0 23:11 pts/1 00:00:00 tail
After the simulated 10 seconds we try to kill the only process we know about (7700), and waitpid() finally reaps it. But process 7702 is still there:
[nils@localhost ~]$ ps -ef | tail
root 7431 2 0 23:00 ? 00:00:00 [kworker/u256:1]
root 7439 2 0 23:00 ? 00:00:00 [kworker/1:0]
root 7494 2 0 23:03 ? 00:00:00 [kworker/0:1]
root 7544 2 0 23:08 ? 00:00:00 [kworker/0:2]
root 7605 2 0 23:10 ? 00:00:00 [kworker/1:2]
root 7698 729 0 23:11 ? 00:00:00 sleep 60
nils 7699 2832 0 23:11 pts/0 00:00:00 ./main
nils 7702 1 0 23:11 ? 00:00:00 /usr/local/sbin/snmpd -Lo -d --master=agentx -Dagentx --agentXSocket=tcp:localhost:1706 udp:10161
nils 7751 3706 0 23:12 pts/1 00:00:00 ps -ef
nils 7752 3706 0 23:12 pts/1 00:00:00 tail
After we give a character to the getchar() function, our main process terminates, but the SNMP daemon with the PID 7702 is still there:
[nils@localhost ~]$ ps -ef | tail
postfix 7399 1511 0 22:58 ? 00:00:00 pickup -l -t unix -u
root 7431 2 0 23:00 ? 00:00:00 [kworker/u256:1]
root 7439 2 0 23:00 ? 00:00:00 [kworker/1:0]
root 7494 2 0 23:03 ? 00:00:00 [kworker/0:1]
root 7544 2 0 23:08 ? 00:00:00 [kworker/0:2]
root 7605 2 0 23:10 ? 00:00:00 [kworker/1:2]
root 7698 729 0 23:11 ? 00:00:00 sleep 60
nils 7702 1 0 23:11 ? 00:00:00 /usr/local/sbin/snmpd -Lo -d --master=agentx -Dagentx --agentXSocket=tcp:localhost:1706 udp:10161
nils 7765 3706 0 23:12 pts/1 00:00:00 ps -ef
nils 7766 3706 0 23:12 pts/1 00:00:00 tail
Conclusion
Because we overlooked the double-fork mechanism, we thought that the kill action did not succeed. In fact, we simply killed the wrong process!
By adding the -f option (Do Not (Double) Fork), everything goes as expected.
ps -ef | grep firefox;
and you will see 3 processes; kill them all:
sudo killall -9 firefox
This should work.
EDIT: [PID] changed to firefox
You can also do a pstree and kill the parent. This makes sure that you get the entire offending process tree and not just the leaf.
