Is kthreadd included in the linux processes? - linux

I have been asked to implement a simple version of pstree (the Linux command), but I am confused by the difference between what pstree shows and what I find under the /proc/[pid] directories.
After I type pstree, it shows the root of the whole process tree is systemd, just like this:
systemd─┬─ECAgent───3*[{ECAgent}]
        ├─EasyMonitor
        ├─ModemManager───2*[{ModemManager}]
        ├─NetworkManager─┬─dhclient
But when I read all the /proc/[pid]/stat files, I get the following result (lightly formatted):
pid  comm        state  ppid
1    systemd     S      0
2    kthreadd    S      0
3    rcu_gp      I      2
4    rcu_par_gp  I      2
It seems there is another process, kthreadd, that sits alongside systemd as a second root. This differs from what the pstree command shows.
After reading some manuals and web material, I know that pstree displays all running processes and that kthreadd is the root of the kernel threads. But I am still confused that pstree does not count kthreadd as a running process. Is kthreadd not a process even though it owns a pid (which is 2)? Should I include kthreadd as a running process in my version of pstree?

kthreadd is not a process started by systemd. It is a kernel thread, started by the kernel itself, that runs in the kernel address space.
pstree is mostly concerned with user-space processes and shows their parent/child hierarchy.
In my opinion you should not include kthreadd in your implementation.
One way to identify kernel threads is that /proc/$pid/cmdline is empty for them.
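A minimal sketch of such a filter in Python, in case it helps with a home-grown pstree. The names parse_stat, is_kernel_thread and user_processes are made up for illustration. It assumes kthreadd is PID 2 and that kernel threads are its children, which matches the listing in the question but will not hold inside a PID namespace such as a container. It uses the ppid rather than the empty cmdline because zombie processes also have an empty cmdline.

```python
import os

def parse_stat(line):
    """Split one /proc/[pid]/stat line into (pid, comm, state, ppid).

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so split on the LAST ')' instead of on whitespace.
    """
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    rest = line[close_paren + 1:].split()
    return pid, comm, rest[0], int(rest[1])

def is_kernel_thread(pid, ppid):
    """Heuristic (assumption): kthreadd is PID 2, kernel threads are its children."""
    return pid == 2 or ppid == 2

def user_processes(proc_root="/proc"):
    """Yield (pid, comm, state, ppid) for every task that is not a kernel thread."""
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                entry = parse_stat(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if not is_kernel_thread(entry[0], entry[3]):
            yield entry
```

Feeding the (pid, ppid) pairs from user_processes() into a parent-to-children map gives a tree rooted at PID 1, which is roughly what pstree prints by default.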

Related

Do Linux kernel processes multithread?

For Linux user space processes it seems pretty easy to determine which processes are multithreaded. You can use ps -eLf and look at the NLWP value for the number of threads, which also corresponds to the 'Threads:' value in /proc/$pid/status. Apparently back in the day of LinuxThreads the implementation was not POSIX compliant. But this stackoverflow answer says "POSIX.1 requires threads share a same process ID", which is apparently rectified in NPTL. So NPTL allows nifty displays of threads with commands like ps -eLf because the threads all share the same PID, and you can verify that under /proc/$pid/task/, where you see all the thread subfolders belonging to that "parent" process.
I can't find similar thread groupings under a "parent" process for kernel processes spawned by kthreadd, and I suspect an implementation difference since a comment under this answer says "You can not use POSIX Threads in kernel-space" and the nifty thread grouping is a POSIX feature. Thus with ps -eLf I never see multiple threads listed for kernel processes created by kthreadd, the ones shown with square brackets around them, like [ksoftirqd/0] or [nfsd], unlike user-space processes created by init.
From the man page for pthreads (which is used in the user space):
A single process can contain multiple threads, all of which are
executing the same program. These threads share the same global
memory (data and heap segments), but each thread has its own stack
(automatic variables).
This however is precisely what I do not see for kernel "threads", in terms of one process containing multiple threads.
In short, I never see any of the processes listed by 'ps' that are children of kthreadd having an NLWP (Threads) value higher than one, which makes me wonder whether any kernel process forks/parallelizes and multithreads the way user-space programs do (with pthreads). Where do the implementations differ?
Practical example:
Output from ps auxf for the NFS processes.
root         2  0.0  0.0      0     0 ?        S    Jan12   0:00 [kthreadd]
root      1546  0.0  0.0      0     0 ?        S    Jan12   0:00  \_ [lockd]
root      1547  0.0  0.0      0     0 ?        S    Jan12   0:00  \_ [nfsd4]
root      1548  0.0  0.0      0     0 ?        S    Jan12   0:00  \_ [nfsd4_callbacks]
root      1549  0.0  0.0      0     0 ?        S    Jan12   2:40  \_ [nfsd]
root      1550  0.0  0.0      0     0 ?        S    Jan12   2:39  \_ [nfsd]
root      1551  0.0  0.0      0     0 ?        S    Jan12   2:40  \_ [nfsd]
root      1552  0.0  0.0      0     0 ?        S    Jan12   2:47  \_ [nfsd]
root      1553  0.0  0.0      0     0 ?        S    Jan12   2:34  \_ [nfsd]
root      1554  0.0  0.0      0     0 ?        S    Jan12   2:39  \_ [nfsd]
root      1555  0.0  0.0      0     0 ?        S    Jan12   2:57  \_ [nfsd]
root      1556  0.0  0.0      0     0 ?        S    Jan12   2:41  \_ [nfsd]
By default when you start the rpc.nfsd service (at least with the init.d service script) it spawns 8 processes (or at least they have PIDs). If I wanted to write a multi-threaded version of NFS, which is implemented as a kernel module, with those nfsd "processes" as a frontend, why couldn't I group the default 8 different nfsd processes under one PID and have 8 threads running under it, versus (as is shown - and as is different than user space processes) eight different PIDs?
NSLCD is an example of a user space program that uses multithreading by contrast:
UID        PID  PPID   LWP  C NLWP STIME TTY          TIME CMD
nslcd     1424     1  1424  0    6 Jan12 ?        00:00:00 /usr/sbin/nslcd
nslcd     1424     1  1425  0    6 Jan12 ?        00:00:28 /usr/sbin/nslcd
nslcd     1424     1  1426  0    6 Jan12 ?        00:00:28 /usr/sbin/nslcd
nslcd     1424     1  1427  0    6 Jan12 ?        00:00:27 /usr/sbin/nslcd
nslcd     1424     1  1428  0    6 Jan12 ?        00:00:28 /usr/sbin/nslcd
nslcd     1424     1  1429  0    6 Jan12 ?        00:00:28 /usr/sbin/nslcd
The PID is the same but the LWP is unique per thread.
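The same grouping can be read straight from /proc/$pid/task/, as mentioned earlier. A small Python sketch, where thread_ids and its proc_root parameter are names invented for this example (proc_root only exists so the function can be pointed at a fake directory tree):

```python
import os

def thread_ids(pid, proc_root="/proc"):
    """Return the sorted thread IDs (LWPs) listed under /proc/<pid>/task."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())
```

On a Linux box, len(thread_ids(1424)) for the nslcd process above should agree with the NLWP column, while a kernel thread such as [nfsd] would be expected to list only its own ID.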
Update on the function of kthreadd
From this stackoverflow answer:
kthreadd is a daemon thread that runs in kernel space. The reason is
that kernel needs to some times create threads but creating thread in
kernel is very tricky. Hence kthreadd is a thread that kernel uses to
spawn newer threads if required from there. This thread can access
userspace address space also but should not do so. Its managed by
kernel...
And this one:
kthreadd() is main function (and main loop) of daemon kthreadd which
is a kernel thread daemon, the parent of all other kernel threads.
So in the code quoted, there is a creation of request to kthreadd
daemon. To fulfill this request kthreadd will read it and start a
kernel thread.
There is no concept of a process in the kernel, so your question doesn't really make sense. The Linux kernel can and does create threads that run completely in kernel context, but all of these threads run in the same address space. There's no grouping of similar threads by PID, although related threads usually have related names.
If multiple kernel threads are working on the same task or otherwise sharing data, then they need to coordinate access to that data via locking or other concurrent algorithms. Of course the pthreads API isn't available in the kernel, but one can use kernel mutexes, wait queues, etc to get the same capabilities as pthread mutexes, condition variables, etc.
Calling these contexts of execution "kernel threads" is a reasonably good name, since they are closely analogous to multiple threads in a userspace process. They all share the (kernel's) address space, but have their own execution context (stack, program counter, etc) and are each scheduled independently and run in parallel. On the other hand, the kernel is what actually implements all the nice POSIX API abstractions (with help from the C library in userspace), so internal to that implementation we don't have the full abstraction.

Recursively kill R process with children in linux

I am looking for a general method to launch and then kill an R process, including possibly all forks or other processes that it invoked.
For example, a user runs a script like this:
library(multicore);
for(i in 1:3) parallel(foo <- "bar");
for(i in 1:3) system("sleep 300", wait=FALSE);
for(i in 1:3) system("sleep 300&");
q("no")
After the user quits the R session, the child processes are still running:
jeroen@jeroen-ubuntu:~$ ps -ef | grep R
jeroen 4469 1 0 16:38 pts/1 00:00:00 /usr/lib/R/bin/exec/R
jeroen 4470 1 0 16:38 pts/1 00:00:00 /usr/lib/R/bin/exec/R
jeroen 4471 1 0 16:38 pts/1 00:00:00 /usr/lib/R/bin/exec/R
jeroen 4502 4195 0 16:39 pts/1 00:00:00 grep --color=auto R
jeroen@jeroen-ubuntu:~$ ps -ef | grep "sleep"
jeroen 4473 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4475 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4477 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4479 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4481 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4483 1 0 16:38 pts/1 00:00:00 sleep 300
jeroen 4504 4195 0 16:39 pts/1 00:00:00 grep --color=auto sleep
To make things worse, their parent process ID is 1, making it hard to identify them. Is there a method to run an R script in a way that allows me to recursively kill the process and its children at any time?
Edit: so I don't want to manually have to go in to search & kill processes. Also I don't want to kill all R processes, as there might be others that are doing fine. I need a method to kill a specific process and all of its children.
This is mainly about the multicore part. Children are waiting for you to collect the results - see ?collect. Normally, you should never use parallel without a provision to clean up, typically in on.exit. multicore cleans up in high-level functions like mclapply, but if you use lower-level functions it is your responsibility to perform the cleanup (since multicore cannot know if you left the children running intentionally or not).
Your example is really bogus, because you don't even consider collecting results. But anyway, if that is really what you want, you'll have to do the cleanup at some point. For example, if you want to terminate all children on exit, you could define .Last like this:
.Last <- function(...) {
  collect(wait=FALSE)
  all <- children()
  if (length(all)) {
    kill(all, SIGTERM)
    collect(all)
  }
}
Again, the above is not a recommended way to deal with this; it is rather a last resort. You should really assign jobs and collect results like
jobs <- lapply(1:3, function(i) parallel({Sys.sleep(i); i}))
collect(jobs)
As for the general process child question - init inherits the children only after R quits, but in .Last you can still find their pids since the parent process exists at that point so you could perform similar cleanup as in the multicore case.
Before the user quits the R session, the processes you want to kill will have parent process ID equal to the process ID of the session that started them. You could perhaps use the .Last or .Last.sys hooks (see help(q)) to kill all processes with the appropriate PPID at that point; those can be suppressed with q(runLast=FALSE), so it isn't perfect, but I think it's the best option you have.
After the user quits the R session, there is no reliable way to do what you want -- the only record the kernel keeps of process parentage is the PPID you see in ps -ef, and when a parent process exits, that information is destroyed, as you have discovered.
Note that if one of the child processes forks, the grandchild will have PPID equal to the child's PID, and that will get reset to 1 when the child exits, which it might do before the grandparent exits. Thus, there is no reliable way to catch all of a process's descendants in general, even if you do so before the process exits. (One hears that "cgroups" provide a way, but one is unfamiliar with the details; in any case, that is an optional feature which only some iterations/configurations of the Linux kernel provide, and is not available at all elsewhere.)
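For what it is worth, here is a rough Python sketch of the "walk the PPIDs while the parent is still alive" idea on Linux. The function names read_ppid_map and descendants are made up for this example. It has exactly the weakness described above: it works on a snapshot of /proc, so a grandchild whose own parent has already exited (and whose PPID is therefore 1) will not be found.

```python
import os
from collections import defaultdict

def read_ppid_map(proc_root="/proc"):
    """Return {pid: ppid} for every process currently visible in /proc."""
    ppids = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                stat = f.read()
        except OSError:
            continue  # process vanished between listdir() and open()
        # The fields after the last ')' are: state, ppid, ...
        ppids[int(name)] = int(stat[stat.rindex(")") + 1:].split()[1])
    return ppids

def descendants(root_pid, ppids):
    """Return every pid whose chain of parents leads back to root_pid."""
    children = defaultdict(list)
    for pid, ppid in ppids.items():
        children[ppid].append(pid)
    found, stack = [], [root_pid]
    while stack:
        for child in sorted(children[stack.pop()]):
            found.append(child)
            stack.append(child)
    return found
```

Something like descendants(os.getpid(), read_ppid_map()) run from the parent just before it exits would give the list of PIDs to signal.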
I believe the latter part of the question is more a consideration of the shell, rather than the kernel. (Simon Urbanek has answered the multicore part better than pretty much anyone else could, as he's the author. :))
If you're using bash, you can find the PID of the most recently launched child process in $!. You can aggregate the PIDs and then be sure to kill those off when you close R.
If you want to be really gonzo, you could store parent PID (i.e. the output of Sys.getpid()) and child PID in a file and have a cleaning daemon that checks whether or not the parent PID exists and, if not, kills the orphans. I don't think it'll be that easy to get a package called oRphanKilleR onto CRAN, though.
Here is an example of appending the child PID to a file:
system('(sleep 20) & echo $! >> ~/childPIDs.txt', wait = FALSE)
You can modify this to create your own shell command and use R's tempfile() command to create a temporary file (albeit, that will disappear when the R instance is terminated, unless you take a special effort to preserve the file via permissions).
For some other clever ideas, see this other post on SO.
You can also create a do while loop in the shell that will check for whether or not a particular PID is in existence. While it is, the loop sleeps. Once the loop terminates (because the PID is no longer in use), the script will kill another PID.
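The same watchdog loop is sketched here in Python rather than shell, purely as an illustration. pid_exists and kill_when_parent_exits are hypothetical helper names, and the usual caveat applies: PIDs get reused, so a long-running watchdog could in principle signal an unrelated process.

```python
import os
import signal
import time

def pid_exists(pid):
    """Return True if a process with this pid currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 performs the checks without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # it exists, but belongs to another user
    return True

def kill_when_parent_exits(parent_pid, child_pids, poll_seconds=1.0):
    """Sleep until parent_pid disappears, then send SIGTERM to child_pids."""
    while pid_exists(parent_pid):
        time.sleep(poll_seconds)
    for pid in child_pids:
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # already gone
```

The parent PID would be the value of Sys.getpid() in the R session, and the child PIDs the ones collected via $! as shown above.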
Basically, I think your solution will be in shell scripting, rather than R.

Process in a polling state?

Given a process ID, how can I tell if that process is currently blocked in a polling state? i.e. it has called poll() with a negative timeout, and is waiting for input to become ready.
On UNIX-like systems the command line utility 'ps' provides this information. There are many flavors of ps depending on the OS, so read the man page.
On a BSD-like system (mac):
ps -eo pid,user,cpu,state,comm
  PID USER  CPU STAT COMM
    1 root    0 Ss   /sbin/launchd
   15 root    0 Ss   /usr/libexec/kextd
90710 root    0 R+   ps
83804 joe     0 Ss   /bin/bash
89631 joe     0 S+   ssh
where STAT is the process state. S means interruptible sleep. s (lower case) means session leader. '+' means it's in the foreground process group. R means running, or runnable (on run queue). There are many more possible states.
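As a small illustration, the codes described above can be decoded like this in Python. The tables only contain the letters mentioned in this answer, and describe_stat is a made-up name; a real ps defines more codes. Note that S only says the process is sleeping, not which call it is sleeping in.

```python
# Meanings taken from the explanation above; real ps builds define more codes.
STATE = {"S": "interruptible sleep", "R": "running or runnable"}
FLAGS = {"s": "session leader", "+": "foreground process group"}

def describe_stat(stat):
    """Translate a ps STAT string such as 'Ss' or 'R+' into words."""
    first, rest = stat[0], stat[1:]
    parts = [STATE.get(first, "unknown state %r" % first)]
    parts += [FLAGS.get(ch, "unknown flag %r" % ch) for ch in rest]
    return parts
```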

Mapping thread id from top to gdb

I am using top to see per-thread CPU usage with
top -H -p `pgrep app.out`
It shows a PID for each thread, such as
4015
4016
I attached gdb to the application using gdb's attach command.
Now I want to switch to thread 4015, as shown in the top output.
How can I do that?
If I run thread 4015, gdb reports that there is no such thread, because it expects its own thread ID.
So how can I map a thread ID from top to a gdb thread ID?
You should be able to match the LWP displayed in GDB with the top information:
according to my quick tests with Firefox, you can see that in your top -H -p:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6492 kevin 20 0 1242m 386m 31m S 0.3 4.9 0:09.00 firefox
6470 kevin 20 0 1242m 386m 31m S 5.7 4.9 5:04.89 firefox
and that in GDB info threads:
22 Thread 0x7fe3d2393700 (LWP 6492) "firefox" pthread_cond_timedwait...
...
* 1 Thread 0x7fe3dd868740 (LWP 6470) "firefox" __GI___poll ()...
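If you would rather do the matching outside of gdb, a rough Python sketch that parses pasted info threads output could look like the following. It assumes the line format shown above; lwp_to_gdb_id and THREAD_LINE are names invented for this example, and other gdb versions may print the list differently.

```python
import re

# Matches lines like:  * 1 Thread 0x7fe3dd868740 (LWP 6470) "firefox" ...
THREAD_LINE = re.compile(r"^\s*\*?\s*(\d+)\s+Thread\s+\S+\s+\(LWP\s+(\d+)\)")

def lwp_to_gdb_id(info_threads_output):
    """Return {lwp: gdb_thread_number} parsed from `info threads` text."""
    mapping = {}
    for line in info_threads_output.splitlines():
        match = THREAD_LINE.match(line)
        if match:
            mapping[int(match.group(2))] = int(match.group(1))
    return mapping
```

With the output above, looking up 6492 gives 22, so thread 22 is the command to type in gdb.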
EDIT: exclusively for you, here is a brand-new command for gdb, lwp_to_id <lwp>:
import gdb

class lwp_to_id(gdb.Command):
    """Map an LWP (the thread id shown by top -H) to a gdb thread number."""

    def __init__(self):
        gdb.Command.__init__(self, "lwp_to_id", gdb.COMMAND_OBSCURE)

    def invoke(self, args, from_tty):
        lwp = int(args)
        for thr in gdb.selected_inferior().threads():
            # ptid is (pid, lwp, tid); compare the LWP part
            if thr.ptid[1] == lwp:
                print("LWP %s maps to thread #%d" % (lwp, thr.num))
                return
        else:
            print("LWP %s doesn't match any threads in the current inferior." % lwp)

# Instantiating the class registers the command with gdb
lwp_to_id()
(working at least on the trunk version of GDB, not sure about the official releases!)
Do a
ps xjf
This will give you a tree of all processes with their pid and parent pid.

How can I find out what these files or processes do (linux)

When I go onto a *nix system and look at ps -A or -e or top, I get a large number of processes that are running. For example:
init
migration/0
ksoftirqd/0
events/0
khelper
kacpid
kblockd/0
khubd
pdflush
pdflush
kswapd0
aio/0
kseriod
scsi_eh_0
kjournald
udevd
kauditd
kjournald
kjournald
kjournald
kjournald
kjournald
klogd
portmap
rpc.idmapd
sshd
xinetd
gpm
xfs
salinfod
dbus-daemon-1
cups-config-dae
hald
kjournald
agetty
minilogd
kjournald
screen
bash
sshd
bash
For some of these I know what their purpose is, but many I cannot even seem to track down on Google, or I only get oblique references, such as a forum post from 1999 complaining about the process.
Other than tracking them down one by one, is there somewhere I can go to get a better explanation?
N.B. I am not asking anyone to tell me directly what they are, but for pointers to where I can get the understanding myself.
The stuff in square brackets are kernel threads. For the others, get the full name (try adding www to the command line) and hit Google, or look at /proc/<pid>/exe and use your package manager to figure out which package the executable comes from.
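The /proc/<pid>/exe lookup is easy to script. A minimal Python sketch, with exe_path and its proc_root parameter being names made up for this example (proc_root is only there so the function can be tried against a fake tree):

```python
import os

def exe_path(pid, proc_root="/proc"):
    """Return the executable behind a pid, or None if it cannot be read."""
    try:
        return os.readlink(os.path.join(proc_root, str(pid), "exe"))
    except OSError:
        return None  # kernel thread, vanished process, or no permission
```

The resulting path can then be handed to the package manager, for example dpkg -S <path> on Debian-style systems or rpm -qf <path> on RPM-based ones.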
Some processes might have an associated manpage (the d at the end of many process names stands for daemon, so you can also try the name without the d):
man processname

Resources