Linux process data structure - linux

According to GNU website the linux shell uses the following data structure for a process :
typedef struct process
{
struct process *next; /* next process in pipeline */
char **argv; /* for exec */
pid_t pid; /* process ID */
char completed; /* true if process has completed */
char stopped; /* true if process has stopped */
int status; /* reported status value */
} process;
Why can't the shell use the task_struct data structure for a process when it is already present in the kernel. Why use a separate data structure ?

Related

Confusion about list_head children of linux kernel

I want to traverse all the processes and try to show pid,ppid,first children pid as well as the next sibling pid of each process.
It seems that some children pids showed in my test result are incorrect.
For example,process 3,4,5,...,16,17,..are all children of process 2. But from the data it also shows that process 4,5,14,18..their first child refer to process 16!
My code attached here:
struct prinfo p;
p.state = t->state; /* get state */
p.pid = t->pid; /* get pid */
p.parent_pid = t->parent->pid; /* get parent pid */
struct list_head *chd; /* get first child_task */
chd = &(t->children);
struct task_struct *child_task;
child_task = list_entry(chd->next, struct task_struct, sibling);
p.first_child_pid = child_task->pid;
struct list_head *sbl; /* get next sibling_task */
sbl = &(t->sibling);
struct task_struct *sibling_task;
sibling_task = list_entry(sbl->next, struct task_struct, sibling);
p.next_sibling_pid = sibling_task->pid;
p.uid = t->cred->uid; /* get uid */
Can anyone help me out here?

Creating a process in Linux with a different mount namespace

I'm trying to create a process that has a different mnt namespace from his parent.
For that, I use the following code:
static int childFunc(void *arg){
if (mount("/","/myfs", "sysfs", 0, NULL) == -1)
errExit("mount");
printf("Starting new bash. Child PID is %d\n",getpid());
execle("/bin/bash",NULL);
printf("Shouldn't arrive here.\n");
return 0; /* Child terminates now */
}
#define STACK_SIZE (1024 * 1024) /* Stack size for cloned child */
int main(int argc, char *argv[]){
char *stack; /* Start of stack buffer */
char *stackTop; /* End of stack buffer */
pid_t pid;
/* Allocate stack for child */
stack = malloc(STACK_SIZE);
if (stack == NULL)
errExit("malloc");
stackTop = stack + STACK_SIZE; /* Assume stack grows downward */
/* Create child that has its own MNT namespaces*/
pid = clone(childFunc, stackTop, CLONE_NEWNS | SIGCHLD, argv[1]);
if (pid == -1)
errExit("clone");
printf("clone() returned %ld\n", (long) pid);
sleep(1);
if (waitpid(pid, NULL, 0) == -1) /* Wait for child */
errExit("waitpid");
printf("child has terminated\n");
exit(EXIT_SUCCESS);
}
When running it, I do get a bash shell, running in a different MNT namespace.
In order to verify it, I execute in another shell sudo ls -l /proc/<child_pid>/ns, and I indeed see that the child process has a different namespace from the rest of the processes in the system.
However, if I execute mount from both of the shells - I get the same FSTAB output, and the line myfs on /myfs type sysfs (rw,relatime) appears in both of them.
What is the explanation for that?
You need to mark the the existing mounts as "private" before creating the new namespace:
mount --make-rprivate /

Cleaning up of child processes

What is actually meant by cleaning up of child processes after it has ended?
In the book Advanced Linux Programming, there is a section on cleaning child processes asynchronously by handling SIGCHLD signal. This signal is sent to the parent process after the termination of child process.
sig_atomic_t child_exit_status;
void clean_up_child_process (int signal_number)
{
/* Clean up the child process. */
int status;
wait (&status);
/* Store its exit status in a global variable.
child_exit_status = status;
}
*/
int main ()
{
/* Handle SIGCHLD by calling clean_up_child_process. */
struct sigaction sigchld_action;
memset (&sigchld_action, 0, sizeof (sigchld_action));
sigchld_action.sa_handler = &clean_up_child_process;
sigaction (SIGCHLD, &sigchld_action, NULL);
/* Now do things, including forking a child process.
/* ... */
*/
return 0;
}
After catching the signal, the signal handler does nothing except storing the exit status in a global variable. So in this context what is meant by cleaning up of child processes?

"current" in Linux kernel code

As I was going through the below chunk of Linux char driver code, I found the structure pointer current in printk.
I want to know what structure the current is pointing to and its complete elements.
What purpose does this structure serve?
ssize_t sleepy_read (struct file *filp, char __user *buf, size_t count, loff_t *pos)
{
printk(KERN_DEBUG "process %i (%s) going to sleep\n",
current->pid, current->comm);
wait_event_interruptible(wq, flag != 0);
flag = 0;
printk(KERN_DEBUG "awoken %i (%s)\n", current->pid, current->comm);
return 0;
}
It is a pointer to the current process ie, the process which has issued the system call.
From the docs:
The Current Process
Although kernel modules don't execute sequentially as applications do,
most actions performed by the kernel are related to a specific
process. Kernel code can know the current process driving it by
accessing the global item current, a pointer to struct task_struct,
which as of version 2.4 of the kernel is declared in
<asm/current.h>, included by <linux/sched.h>. The current pointer
refers to the user process currently executing. During the execution
of a system call, such as open or read, the current process is the one
that invoked the call. Kernel code can use process-specific
information by using current, if it needs to do so. An example of this
technique is presented in "Access Control on a Device File", in
Chapter 5, "Enhanced Char Driver Operations".
Actually, current is not properly a global variable any more, like it
was in the first Linux kernels. The developers optimized access to the
structure describing the current process by hiding it in the stack
page. You can look at the details of current in <asm/current.h>. While
the code you'll look at might seem hairy, we must keep in mind that
Linux is an SMP-compliant system, and a global variable simply won't
work when you are dealing with multiple CPUs. The details of the
implementation remain hidden to other kernel subsystems though, and a
device driver can just include and refer to the
current process.
From a module's point of view, current is just like the external
reference printk. A module can refer to current wherever it sees fit.
For example, the following statement prints the process ID and the
command name of the current process by accessing certain fields in
struct task_struct:
printk("The process is \"%s\" (pid %i)\n",
current->comm, current->pid);
The command name stored in current->comm is the base name of the
program file that is being executed by the current process.
Here is the complete structure the "current" is pointing to
task_struct
Each task_struct data structure describes a process or task in the system.
struct task_struct {
/* these are hardcoded - don't touch */
volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
long counter;
long priority;
unsigned long signal;
unsigned long blocked; /* bitmap of masked signals */
unsigned long flags; /* per process flags, defined below */
int errno;
long debugreg[8]; /* Hardware debugging registers */
struct exec_domain *exec_domain;
/* various fields */
struct linux_binfmt *binfmt;
struct task_struct *next_task, *prev_task;
struct task_struct *next_run, *prev_run;
unsigned long saved_kernel_stack;
unsigned long kernel_stack_page;
int exit_code, exit_signal;
/* ??? */
unsigned long personality;
int dumpable:1;
int did_exec:1;
int pid;
int pgrp;
int tty_old_pgrp;
int session;
/* boolean value for session group leader */
int leader;
int groups[NGROUPS];
/*
* pointers to (original) parent process, youngest child, younger sibling,
* older sibling, respectively. (p->father can be replaced with
* p->p_pptr->pid)
*/
struct task_struct *p_opptr, *p_pptr, *p_cptr,
*p_ysptr, *p_osptr;
struct wait_queue *wait_chldexit;
unsigned short uid,euid,suid,fsuid;
unsigned short gid,egid,sgid,fsgid;
unsigned long timeout, policy, rt_priority;
unsigned long it_real_value, it_prof_value, it_virt_value;
unsigned long it_real_incr, it_prof_incr, it_virt_incr;
struct timer_list real_timer;
long utime, stime, cutime, cstime, start_time;
/* mm fault and swap info: this can arguably be seen as either
mm-specific or thread-specific */
unsigned long min_flt, maj_flt, nswap, cmin_flt, cmaj_flt, cnswap;
int swappable:1;
unsigned long swap_address;
unsigned long old_maj_flt; /* old value of maj_flt */
unsigned long dec_flt; /* page fault count of the last time */
unsigned long swap_cnt; /* number of pages to swap on next pass */
/* limits */
struct rlimit rlim[RLIM_NLIMITS];
unsigned short used_math;
char comm[16];
/* file system info */
int link_count;
struct tty_struct *tty; /* NULL if no tty */
/* ipc stuff */
struct sem_undo *semundo;
struct sem_queue *semsleeping;
/* ldt for this task - used by Wine. If NULL, default_ldt is used */
struct desc_struct *ldt;
/* tss for this task */
struct thread_struct tss;
/* filesystem information */
struct fs_struct *fs;
/* open file information */
struct files_struct *files;
/* memory management info */
struct mm_struct *mm;
/* signal handlers */
struct signal_struct *sig;
#ifdef __SMP__
int processor;
int last_processor;
int lock_depth; /* Lock depth.
We can context switch in and out
of holding a syscall kernel lock... */
#endif
};

What is the "current" in Linux kernel source?

I'm studying about Linux kernel and I have a problem.
I see many Linux kernel source files have current->files. So what is the current?
struct file *fget(unsigned int fd)
{
struct file *file;
struct files_struct *files = current->files;
rcu_read_lock();
file = fcheck_files(files, fd);
if (file) {
/* File object ref couldn't be taken */
if (file->f_mode & FMODE_PATH ||
!atomic_long_inc_not_zero(&file->f_count))
file = NULL;
}
rcu_read_unlock();
return file;
}
It's a pointer to the current process (i.e. the process that issued the system call).
On x86, it's defined in arch/x86/include/asm/current.h (similar files for other archs).
#ifndef _ASM_X86_CURRENT_H
#define _ASM_X86_CURRENT_H
#include <linux/compiler.h>
#include <asm/percpu.h>
#ifndef __ASSEMBLY__
struct task_struct;
DECLARE_PER_CPU(struct task_struct *, current_task);
static __always_inline struct task_struct *get_current(void)
{
return percpu_read_stable(current_task);
}
#define current get_current()
#endif /* __ASSEMBLY__ */
#endif /* _ASM_X86_CURRENT_H */
More information in Linux Device Drivers chapter 2:
The current pointer refers to the user process currently executing. During the execution of a system call, such as open or read, the current process is the one that invoked the call. Kernel code can use process-specific information by using current, if it needs to do so. [...]
Current is a global variable of type struct task_struct. You can find it's definition at [1].
Files is a struct files_struct and it contains information of the files used by the current process.
[1] http://students.mimuw.edu.pl/SO/LabLinux/PROCESY/ZRODLA/sched.h.html
this is ARM64 definition. in arch/arm64/include/asm/current.h, https://elixir.bootlin.com/linux/latest/source/arch/arm64/include/asm/current.h
struct task_struct;
/*
* We don't use read_sysreg() as we want the compiler to cache the value where
* possible.
*/
static __always_inline struct task_struct *get_current(void)
{
unsigned long sp_el0;
asm ("mrs %0, sp_el0" : "=r" (sp_el0));
return (struct task_struct *)sp_el0;
}
#define current get_current()
which just use the sp_el0 register. As the pointer to current process's task_struct

Resources