How is 'task_struct' accessed via 'thread_info' in the latest Linux kernel?

Background:
I am a beginner in the area of the Linux kernel. I have just started to understand the Linux kernel by reading the book 'Linux Kernel Development, Third Edition' by Robert Love. Most of the explanations in that book are based on Linux kernel 2.6.34.
Hence, I am sorry if this is a repetitive question, but I could not find any related information on Stack Overflow.
Question:
What I understood from the book is that each thread in Linux has a structure called 'thread_info', which has a pointer to its process/task.
This 'thread_info' is stored at the end of the kernel stack of each live thread,
and 'thread_info' has a pointer to the task it belongs to, as below:
struct thread_info {
    struct task_struct *task;
    ...
};
But when I checked the same structure in the latest Linux code, I see a very different thread_info structure, as below (https://elixir.bootlin.com/linux/v5.16-rc1/source/arch/x86/include/asm/thread_info.h). It no longer has a 'task_struct' pointer in it.
struct thread_info {
    unsigned long flags;        /* low level flags */
    unsigned long syscall_work; /* SYSCALL_WORK_ flags */
    u32           status;       /* thread synchronous flags */
#ifdef CONFIG_SMP
    u32           cpu;          /* current CPU */
#endif
};
My question is: if the 'thread_info' structure no longer has a pointer to its related task structure, then how does the kernel find information such as the task's address space?
Also, if you know any good book on the latest Linux kernel, please provide links.

The pointer to the current task_struct object is stored in an architecture-dependent way. On x86 it is stored in a per-CPU variable:
DECLARE_PER_CPU(struct task_struct *, current_task);
(In arch/x86/include/asm/current.h.)
To find out how the current task_struct is stored on a particular architecture and/or in a particular kernel version, just search for the implementation of the current macro: exactly that macro is responsible for returning a pointer to the task_struct of the current process.
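As an illustration, here is a minimal sketch (kernel context assumed, recent x86 kernels; inspect_current_task is a hypothetical helper) of how current is used to reach the task and, through it, the address space the question asks about: the mm pointer hangs off task_struct itself, not thread_info:

#include <linux/mm_types.h> /* struct mm_struct */
#include <linux/printk.h>   /* pr_info() */
#include <linux/sched.h>    /* struct task_struct, current */

static void inspect_current_task(void)
{
    /* On x86, 'current' expands to a read of the per-CPU current_task */
    struct task_struct *tsk = current;

    /* The address space is reached via task_struct; NULL for kernel threads */
    struct mm_struct *mm = tsk->mm;

    pr_info("pid=%d comm=%s mm=%p\n", tsk->pid, tsk->comm, mm);
}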

Related

Linux kernel: finding task struct for given file struct

I'm doing some research on Linux kernel file-descriptor maintenance. I'm trying to find out how file descriptors work from the Linux kernel-space perspective. I know that task_struct contains files_struct, which in turn contains the table of descriptors. So for a given task_struct I can list the descriptors of that task; each descriptor is of type "struct file". (A sketch of that forward traversal follows below.)
Now I'd like to know how to do the reverse mapping: given a pointer to "struct file", how can I find out to which "task_struct" it belongs?
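For reference, here is a minimal kernel-context sketch of the forward traversal described above (list_task_files is a hypothetical helper, and holding files->file_lock is one plausible way to keep the table stable while walking it):

#include <linux/fdtable.h> /* struct files_struct, files_fdtable() */
#include <linux/sched.h>   /* struct task_struct */

static void list_task_files(struct task_struct *tsk)
{
    struct files_struct *files = tsk->files;
    struct fdtable *fdt;
    unsigned int fd;

    spin_lock(&files->file_lock);
    fdt = files_fdtable(files);
    for (fd = 0; fd < fdt->max_fds; fd++) {
        struct file *f = fdt->fd[fd];

        if (f)
            pr_info("pid=%d fd=%u file=%p\n", tsk->pid, fd, f);
    }
    spin_unlock(&files->file_lock);
}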

Where does getuid refer?

I have a question about getuid() and geteuid() in Linux.
I know that getuid() returns the real user ID of the current process, and geteuid() returns the effective user ID of the current process.
My question is: where is the information about these IDs stored? Apart from the existence of /etc/passwd, I think every process should store its own ID information somewhere.
If I'm right, please tell me where that information is stored (in an area like the stack, say). If I'm wrong, how does the process get its ID?
This is something maintained by the kernel in its internal in-memory structures.
The Linux kernel uses a structure called struct task_struct:
Every process under Linux is dynamically allocated a struct task_struct structure.
In Linux kernel 4.12.10 this is defined as follows:
task_struct.h:
struct task_struct {
    ...
    /* Objective and real subjective task credentials (COW): */
    const struct cred __rcu *real_cred;

    /* Effective (overridable) subjective task credentials (COW): */
    const struct cred __rcu *cred;
    ...
};
cred.h:
struct cred {
    ...
    kuid_t uid;   /* real UID of the task */
    kgid_t gid;   /* real GID of the task */
    kuid_t suid;  /* saved UID of the task */
    kgid_t sgid;  /* saved GID of the task */
    kuid_t euid;  /* effective UID of the task */
    kgid_t egid;  /* effective GID of the task */
    kuid_t fsuid; /* UID for VFS ops */
    kgid_t fsgid; /* GID for VFS ops */
    ...
};
These structures cannot be accessed directly by a user-space process. To get this information, such processes have to use either system calls (such as getuid() and geteuid()) or the /proc file system.
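For reference, this is how getuid() itself is implemented in kernel/sys.c (recent kernels; it simply reads the credentials hanging off current):

SYSCALL_DEFINE0(getuid)
{
    /* Only we change this so SMP safe */
    return from_kuid_munged(current_user_ns(), current_uid());
}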
Read Advanced Linux Programming and perhaps Operating Systems: Three Easy Pieces (both are freely downloadable); several books are needed to answer your question.
getuid(2) is (like getpid(2) and many others) a system call provided and implemented by the Linux kernel; syscalls(2) lists them.
(Please take time to read more about system calls in general.)
As for "where the informations about id are stored":
The kernel manages data describing every process (in kernel memory; see NPE's answer for details). Each system call is a primitive atomic operation (from the user-space perspective) and returns a result (usually in some register, not in memory). Read about CPU modes.
So that information is not in the user-level virtual address space of the process; it is returned anew at every invocation of getuid(). A small demonstration follows.
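A tiny user-space example (plain POSIX libc calls, nothing assumed beyond that): the IDs come back from the kernel on each call rather than being read from the process's own memory:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    /* Each call traps into the kernel, which reads the IDs from the
       credentials attached to this task's task_struct. */
    printf("real uid: %d\n", (int)getuid());
    printf("effective uid: %d\n", (int)geteuid());
    return 0;
}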

Where can I find the address of the Linux global mem_map array?

My distro (Ubuntu 14.04.3 LTS) doesn't seem to export this symbol, so I can't resolve the address at module load time. I'm looking for another way to determine the address without a kernel recompile.
I'm running as a guest under VMware Fusion on a MacBook Pro. The kernel is 3.13.0-74-generic.
Thanks in advance.
Use kallsyms_lookup_name. It is defined in linux/kallsyms.h as
unsigned long kallsyms_lookup_name(const char *name);
Usage is trivial:
struct page *my_mem_map = (struct page*)kallsyms_lookup_name("mem_map");
kallsyms_lookup_name has been exported to modules since kernel 2.6.33 (note that it was un-exported again in kernel 5.7 and later).
For earlier kernels, or to find several symbols at once, the generic function kallsyms_on_each_symbol can be used. It iterates over all symbols and calls a user-specified callback for each; a sketch follows.
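A minimal sketch of that approach (the callback signature shown matches older kernels and has changed in recent ones, so treat this as illustrative):

#include <linux/kallsyms.h>
#include <linux/module.h>
#include <linux/string.h>

static unsigned long mem_map_addr;

/* Called once per symbol; returning non-zero stops the iteration. */
static int match_mem_map(void *data, const char *name,
                         struct module *mod, unsigned long addr)
{
    if (!mod && strcmp(name, "mem_map") == 0) {
        mem_map_addr = addr;
        return 1;
    }
    return 0;
}

/* In module init: kallsyms_on_each_symbol(match_mem_map, NULL); */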
Maybe this answer is too late for you, @owenh; I am answering for people who are looking into the same question about the mem_map array, because so far I have not found any clear answer about why mem_map cannot be found. After tracing how pte_page works, I am recording what I learned here.
If you cannot find the mem_map array, it is probably because the kernel is using a virtually contiguous mem_map, or because the sparse memory model is in use; in neither case is there a physically contiguous mem_map array. Instead, the kernel may use vmemmap or mem_section as the starting point for finding all the page structs. This is decided by the config macros (CONFIG_FLATMEM / CONFIG_DISCONTIGMEM / CONFIG_SPARSEMEM_VMEMMAP / CONFIG_SPARSEMEM) in include/asm-generic/memory_model.h. You can check your kernel's compile-time config flags with something like:
sudo cat /boot/config-`uname -r` | grep CONFIG_SPARSEMEM_VMEMMAP
In CONFIG_SPARSEMEM_VMEMMAP mode, the page struct array starts at vmemmap, which is at the fixed location 0xffffea0000000000 on x86-64:
#define VMEMMAP_START _AC(0xffffea0000000000, UL)
#define vmemmap ((struct page *)VMEMMAP_START)
In CONFIG_SPARSEMEM mode, by contrast, the page structs are managed through a two-dimensional array called mem_section.
To see how the kernel uses these, we can look at how it gets the page struct for a page given that page's pte. The macro is called pte_page (in arch/x86/include/asm/pgtable.h). It transforms the pte into its pfn; using that pfn, it locates the page struct either via vmemmap or via mem_section. The pfn-to-page mapping itself lives in include/asm-generic/memory_model.h, as sketched below.
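For reference, the two relevant definitions of __pfn_to_page in include/asm-generic/memory_model.h look roughly like this (excerpted; the exact form varies by kernel version):

/* CONFIG_SPARSEMEM_VMEMMAP: page structs form one flat virtual array */
#define __pfn_to_page(pfn)   (vmemmap + (pfn))
#define __page_to_pfn(page)  (unsigned long)((page) - vmemmap)

/* CONFIG_SPARSEMEM: look up the section first, then index into its map */
#define __pfn_to_page(pfn)                                   \
({  unsigned long __pfn = (pfn);                             \
    struct mem_section *__sec = __pfn_to_section(__pfn);     \
    __section_mem_map_addr(__sec) + __pfn;                   \
})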

How can I synchronize my LKM with the Linux kernel over a shared data structure?

I'm developing an LKM (Loadable Kernel Module) in Linux.
The LKM wants to traverse the information of all processes through the TCB (i.e., task_struct).
I'm wondering whether the TCB data structure can be updated while the LKM traverses it.
That is, while the LKM traverses the TCB data structure, the structure can be updated because a process terminates or is created.
How can I synchronize my LKM with the Linux kernel, which keeps updating the TCB data structure, on SMP or non-SMP Linux systems?
I think you can traverse the process list via the sample code below:
#include <linux/rcupdate.h>     /* rcu_read_lock()/rcu_read_unlock() */
#include <linux/sched.h>        /* struct task_struct, task_lock()/task_unlock() */
#include <linux/sched/signal.h> /* for_each_process() */

struct task_struct *task;

/* The RCU read lock keeps the task list stable against concurrent
 * process creation/termination while we walk it. */
rcu_read_lock();
for_each_process(task) {
    task_lock(task);
    /* do something with your task :) */
    task_unlock(task);
}
rcu_read_unlock();
Reference: how to iterate over PCB's to show information in a Linux Kernel Module?

What is the Linux process kernel stack state at process creation?

I can't find this information anywhere. Everywhere I look, I find things referring to how the stack looks once you hit "main" (or whatever your entry point is), which would be the program arguments and environment; but what I'm looking for is how the system sets up the stack to cooperate with the switch_to macro. The first time the task gets switched to, it would need to have EFLAGS, EBP, the registers that GCC saves, and the return address from the schedule() function on the stack pointed to by tsk->thread->esp, but what I can't figure out is how the kernel sets up this stack, since it lets GCC save the general-purpose registers (using the output parameters of inline assembly).
I am referring to x86 PCs only. I am researching the Linux scheduler/process system for a small kernel I am (attempting to) write, and I can't get my head around what I'm missing. I know I'm missing something, since the fact that Slackware is running on my computer is a testament to the fact that the scheduler works :P
EDIT: I seem to have worded this badly. I am looking for information on how the task's kernel stack is set up, not how the task's user stack is set up. More specifically, the stack which tsk->thread->esp points to, and which switch_to switches to.
The initial kernel stack for a new process is set in copy_thread(), which is an arch-specific function. The x86 version, for example, starts out like this:
int copy_thread(unsigned long clone_flags, unsigned long sp,
                unsigned long unused,
                struct task_struct *p, struct pt_regs *regs)
{
    struct pt_regs *childregs;
    struct task_struct *tsk;
    int err;

    childregs = task_pt_regs(p);
    *childregs = *regs;
    childregs->ax = 0;
    childregs->sp = sp;

    p->thread.sp = (unsigned long) childregs;
    p->thread.sp0 = (unsigned long) (childregs+1);
    p->thread.ip = (unsigned long) ret_from_fork;
p->thread.sp and p->thread.ip are the new thread's kernel stack pointer and instruction pointer, respectively.
Note that it does not place a saved %eflags, %ebp, etc. there, because when a newly created thread of execution is first switched to, it starts executing at ret_from_fork (this is where __switch_to() returns to for a new thread), which means that it doesn't execute the second half of the switch_to() routine, as the excerpt below shows.
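For context, the 32-bit switch_to macro (arch/x86/include/asm/switch_to.h in kernels of that era; abbreviated here, with the asm operand lists elided) shows the two halves. A brand-new thread enters at ret_from_fork rather than at label 1:, so it skips the popl/popfl second half:

#define switch_to(prev, next, last)                                  \
do {                                                                 \
    /* ... */                                                        \
    asm volatile("pushfl\n\t"                /* save    flags */     \
                 "pushl %%ebp\n\t"           /* save    EBP   */     \
                 "movl %%esp,%[prev_sp]\n\t" /* save    ESP   */     \
                 "movl %[next_sp],%%esp\n\t" /* restore ESP   */     \
                 "movl $1f,%[prev_ip]\n\t"   /* save    EIP   */     \
                 "pushl %[next_ip]\n\t"      /* restore EIP   */     \
                 "jmp __switch_to\n"         /* regparm call  */     \
                 "1:\t"                                              \
                 "popl %%ebp\n\t"            /* restore EBP   */     \
                 "popfl\n"                   /* restore flags */     \
                 /* output and input operands elided */);            \
} while (0)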
The state of the stack at process creation is described in the x86-64 SVR4 ABI supplement (for AMD64, i.e., 64-bit x86-64 machines). The equivalent for 32-bit Intel processors is probably the i386 ABI. I strongly recommend also reading the Assembly HOWTO. And of course, you should perhaps read the relevant Linux kernel files.
Googling for "linux stack layout process startup" gives this link: "Startup state of a Linux/i386 ELF binary", which describes the setup that the kernel performs just before transferring control to the libc startup code.
