Can not get correct pid in WSL2 - linux

I am learning Linux programing.
When I trying to write a simple module to get family of a process, I find I can not get current pid of a process and its parent process. How to fix it?
Here is a part of my code.
static pid_t pid = 1;
module_param(pid, int, 0644);
static int hello_init(void) {
struct task_struct *p;
struct list_head *pp;
struct task_struct *psibling;
struct pid *kpid;
kpid = find_get_pid(pid);
p = pid_task(kpid, PIDTYPE_PID);
printk("me: %d %s\n", pid, p->comm);
if (p->parent == NULL) {
printk("No Parent\n");
}
else {
printk("Parent: %d %s\n", p->parent->pid, p->parent->comm);
}
list_for_each(pp, &p->parent->children) {
psibling = list_entry(pp, struct task_struct, sibling);
printk("sibling %d %s \n", psibling->pid, psibling->comm);
}
list_for_each(pp, &p->children) {
psibling = list_entry(pp, struct task_struct, sibling);
printk("children %d %s \n", psibling->pid, psibling->comm);
}
return 0;
}
result:
sudo insmod module.ko pid=1
dmesg
[ 6396.170631] me: 237 systemd
[ 6396.170633] Parent: 235 unshare
[ 6396.170633] sibling 237 systemd
[ 6396.170633] children 286 systemd-journal
[ 6396.170634] children 306 systemd-udevd
[ 6396.170635] children 314 systemd-network
[ 6396.170635] children 501 snapfuse
[ 6396.170636] children 508 dbus-daemon
[ 6396.170636] children 509 NetworkManager
[ 6396.170637] children 632 systemd-logind
[ 6396.170637] children 639 systemd
[ 6396.170638] children 665 rtkit-daemon
[ 6396.170638] children 671 polkitd
[ 6396.170638] children 711 udisksd
[ 6396.170639] children 761 upowerd

I'm not a Linux systems development expert, but I'll take a stab at helping based on what I see you trying.
First, you don't mention it in your question, but you are clearly running some sort of Systemd enablement. As you know, Systemd isn't normally supported on WSL. At a high level, the scripts to enable Systemd on WSL all have two essential functions:
Create a new PID namespace where Systemd is running as PID1. At the most basic level, this can be done via:
sudo -b unshare --pid --fork --mount-proc /lib/systemd/systemd --system-unit=basic.target
We can see the unshare in the list of processes returned, so that's getting called, at least.
Wait for Systemd to fully start, then enter the namespace that was created above. This is typically something like:
sudo -E nsenter --all -t $(pgrep -xo systemd) $SHELL
The actual scripts are typically a bit more complicated in order to handle multiple shells, distributions, etc. They also attempt to preserve more of the WSL environment inside the namespace in order to enable the Interop features such as running Windows .exes. But the core concept is always the same.
So, taking a guess here (again, as a non-systems-dev guy), it seems that:
kpid=find_get_pid(1) is returning the systemd process inside the namespace
pid_task(kpid, PIDTYPE_PID) is returning the "true" process information from the root namespace.
It seems to me that code must be running outside the namespace, since you see the unshare as part of it. From within the namespace, the unshare doesn't exist. You can verify this (inside the namespace) with ps -ef | grep unshare.
There are at least two possible solutions:
If it's not an issue (and from the comments, it wasn't), then just run your code from the root pid namespace. I'm assuming that your Systemd script is running via your shell startup files, so you should be able to get back to the root namespace by starting up with something like wsl ~ -e bash --noprofile --norc. This will start the shell without any of the startup scripts.
Of course, other techniques for disabling the Systemd script are probably documented by whatever script you are using.
If you do want your code to work properly from within a PID namespace, then you'll probably need to find the namespace (I'd start with the source of lsns as an example).
Then find the task struct within that namespace (probably find_task_by_pid_ns?).

Related

Umount("/proc") syscall for mount namespaces "Invalid Argument" error

i'm currently trying to use different namespaces for test purposes. For this i tried to implement a MNT namespace (combined with a PID namespace) so that a program within this namespace cannot see other processes on the system.
When trying to use the umount system call like this (same goes with umount("/proc"), or with umount2 and the Force-option ):
if (umount2("/proc", 0)!= 0)
{
fprintf(stderr, "Error when unmounting /proc: %s\n",strerror(errno));
printf("\tKernel version might be incorrect\n");
exit(-1);
}
the system call execution ends with error number 22 "Invalid Argument".
This code snipped is called within a function that gets called when a child process with the namespaces is created:
pid_t child_pid = clone(child_exec, child_stack+1024*1024, Child_Flags,&args);
(the child_exec function). Flags are set as following:
int Child_Flags = CLONE_NEWIPC | CLONE_NEWUSER | CLONE_NEWUTS | CLONE_NEWNET |CLONE_NEWPID | CLONE_NEWNS |SIGCHLD ;
With the CLONE_NEWNS for a new mount namespace (http://man7.org/linux/man-pages/man7/namespaces.7.html)
Output of the program is as follows:
Testing with Isolation
Starting Container engine
In-Child-PID: 1
Error number 22
Error when unmounting /proc: Invalid argument
Can somebody point me to my error, so i can unmount the folder? Thank you in advance
You can't unmount things that were mounted in a different user namespace except by using pivot_root followed by umount to unmount /. You can overmount /proc without unmounting the old /proc.

unshare --pid /bin/bash - fork cannot allocate memory

I'm experimenting with linux namespaces. Specifically the pid namespace.
I thought I'd test something out with bash but run into this problem:
unshare -p /bin/bash
bash: fork: Cannot allocate memory
Running ls from there gave a core dump. Exit is the only thing possible.
Why is it doing that?
The error is caused by the PID 1 process exits in the new namespace.
After bash start to run, bash will fork several new sub-processes to do somethings. If you run unshare without -f, bash will have the same pid as the current "unshare" process. The current "unshare" process call the unshare systemcall, create a new pid namespace, but the current "unshare" process is not in the new pid namespace. It is the desired behavior of linux kernel: process A creates a new namespace, the process A itself won't be put into the new namespace, only the sub-processes of process A will be put into the new namespace. So when you run:
unshare -p /bin/bash
The unshare process will exec /bin/bash, and /bin/bash forks several sub-processes, the first sub-process of bash will become PID 1 of the new namespace, and the subprocess will exit after it completes its job. So the PID 1 of the new namespace exits.
The PID 1 process has a special function: it should become all the orphan processes' parent process. If PID 1 process in the root namespace exits, kernel will panic. If PID 1 process in a sub namespace exits, linux kernel will call the disable_pid_allocation function, which will clean the PIDNS_HASH_ADDING flag in that namespace. When linux kernel create a new process, kernel will call alloc_pid function to allocate a PID in a namespace, and if the PIDNS_HASH_ADDING flag is not set, alloc_pid function will return a -ENOMEM error. That's why you got the "Cannot allocate memory" error.
You can resolve this issue by use the '-f' option:
unshare -fp /bin/bash
If you run unshare with '-f' option, unshare will fork a new process after it create the new pid namespace. And run /bin/bash in the new process. The new process will be the pid 1 of the new pid namespace. Then bash will also fork several sub-processes to do some jobs. As bash itself is the pid 1 of the new pid namespace, its sub-processes can exit without any problem.
This does not explain why this happens, but shows how to correctly launch a shell in a new pid namespace:
Use the -f flag to fork off the shell from unshare, so that the new shell gets PID 1 in the newly created namespace:
unshare -fp /bin/bash
You probably also want to pass the --mount-proc option, so that your ps listing reflects your newly created PID namespace rather than the parent PID namespace:
unshare -fp --mount-proc /bin/bash
Now run ps:
# ps
PID TTY TIME CMD
1 pts/1 00:00:00 bash
11 pts/1 00:00:00 ps

why kthread have pid 2 and systemd have PID 1?

In the systemd environment, when I performed ps -auxf, I see that the kthread has PID of 2 while systemd has PID 1 assigned.
So, who assigns PID 2 to kthread and why is it getting PID 2 when kthread is what that calls systemd?
I don't think that kthreadd is starting init (in your case symlinked to systemd).
init is started by the kernel initialization. kthreadd is started just after. See this kernel threads wikipage, and, for Linux 4.2, its file init/main.c, function rest_init, near lines 397, where you see:
/*
* We need to spawn init first so that it obtains pid 1, however
* the init task will end up wanting to create kthreads, which, if
* we schedule it before we create kthreadd, will OOPS.
*/
kernel_thread(kernel_init, NULL, CLONE_FS);
numa_default_policy();
pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
So kthreadd is not starting init, but both are started in the kernel before being scheduled, so before starting their execution.
The static kernel_init function (lines 930 - 975 of init/main.c) has notably:
if (execute_command) {
ret = run_init_process(execute_command);
if (!ret)
return 0;
panic("Requested init %s failed (error %d).",
execute_command, ret);
}
if (!try_to_run_init_process("/sbin/init") ||
!try_to_run_init_process("/etc/init") ||
!try_to_run_init_process("/bin/init") ||
!try_to_run_init_process("/bin/sh"))
return 0;
so is setting up the init process (à la execve(2)....) and has hardwired /sbin/init etc...
And the init process has pid 1 for ages (since primordial Unix of the 1970s, and also in old Linux 1.x kernels without kernel threads). It is a strong Unix convention (on which probably a lot of software depends). You can use systemd as your init process, but you could also use sysvinit or simply bash (it is sometimes useful to pass init=/bin/bash to the kernel thru GRUB for repairing purposes) or something else (e.g. runit)

How to set resource limits for init on boot?

I'm trying to find a way to set the rlimit value for the init process during the boot time. Normally, rlimit is set by calling the "setrlimit" system call.
So I was wondering is there any way to call a system call in the boot time (like calling it in the shell script)? Or, is there any other way to perform equivalent operation of setrlimit?
On Linux you can use the prlimit syscall/utility to set resource limits for other processes:
prlimit — get and set process resource limits
prlimit [options] [ −−resource [=limits] ] [ −−pid PID ]
and
int prlimit(pid_t pid, int resource, const struct rlimit *new_limit,
struct rlimit *old_limit);
You can run this on boot however you want, such as in /etc/rc.local or equivalent, or #reboot in crontab if supported.

How to get child process from parent process

Is it possible to get the child process id from parent process id in shell script?
I have a file to execute using shell script, which leads to a new process process1 (parent process). This process1 has forked another process process2(child process). Using script, I'm able to get the pid of process1 using the command:
cat /path/of/file/to/be/executed
but i'm unable to fetch the pid of the child process.
Just use :
pgrep -P $your_process1_pid
I am not sure if I understand you correctly, does this help?
ps --ppid <pid of the parent>
I've written a script to get all child process pids of a parent process.
Here is the code. Hope it will help.
function getcpid() {
cpids=`pgrep -P $1|xargs`
# echo "cpids=$cpids"
for cpid in $cpids;
do
echo "$cpid"
getcpid $cpid
done
}
getcpid $1
To get the child process and thread,
pstree -p PID.
It also show the hierarchical tree
The shell process is $$ since it is a special parameter
On Linux, the proc(5) filesystem gives a lot of information about processes. Perhaps
pgrep(1) (which accesses /proc) might help too.
So try cat /proc/$$/status to get the status of the shell process.
Hence, its parent process id could be retrieved with e.g.
parpid=$(awk '/PPid:/{print $2}' /proc/$$/status)
Then use $parpid in your script to refer to the parent process pid (the parent of the shell).
But I don't think you need it!
Read some Bash Guide (or with caution advanced bash scripting guide, which has mistakes) and advanced linux programming.
Notice that some server daemon processes (wich usually need to be unique) are explicitly writing their pid into /var/run, e.g. the  sshd server daemon is writing its pid into the textual file /var/run/sshd.pid). You may want to add such a feature into your own server-like programs (coded in C, C++, Ocaml, Go, Rust or some other compiled language).
You can get the pids of all child processes of a given parent process <pid> by reading the /proc/<pid>/task/<tid>/children entry.
This file contain the pids of first level child processes.
Recursively do this for all children pids.
For more information head over to https://lwn.net/Articles/475688/
ps -axf | grep parent_pid
Above command prints respective processes generated from parent_pid, hope it helps.
+++++++++++++++++++++++++++++++++++++++++++
root#root:~/chk_prgrm/lp#
parent...18685
child... 18686
root#root:~/chk_prgrm/lp# ps axf | grep frk
18685 pts/45 R 0:11 | \_ ./frk
18686 pts/45 R 0:11 | | \_ ./frk
18688 pts/45 S+ 0:00 | \_ grep frk
You can print the PID of all child processes invoked by a parent process:
pstree -p <PARENT_PID> | grep -oP '\(\K[^\)]+'
This prints a list of pids for the main process and its children recursively
I used the following to gather the parent process and child processes of a specific process PID:
ps -p $PID --ppid $PID --forest | tail -n +2 | awk '{print$1}'
Note, this will not work to find child processes of child processes.
For the case when the process tree of interest has more than 2 levels (e.g. Chromium spawns 4-level deep process tree), pgrep isn't of much use. As others have mentioned above, procfs files contain all the information about processes and one just needs to read them. I built a CLI tool called Procpath which does exactly this. It reads all /proc/N/stat files, represents the contents as a JSON tree and expose it to JSONPath queries.
To get all descendant process' comma-separated PIDs of a non-root process (for the root it's ..stat.pid) it's:
$ procpath query -d, "..children[?(#.stat.pid == 24243)]..pid"
24243,24259,24284,24289,24260,24262,24333,24337,24439,24570,24592,24606,...
#include<stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main()
{
// Create a child process
int pid = fork();
if (pid > 0)
{
int j=getpid();
printf("in parent process %d\n",j);
}
// Note that pid is 0 in child process
// and negative if fork() fails
else if (pid == 0)
{
int i=getppid();
printf("Before sleep %d\n",i);
sleep(5);
int k=getppid();
printf("in child process %d\n",k);
}
return 0;
}

Resources