Linux kernel: finding the task_struct for a given struct file

I'm researching how the Linux kernel maintains file descriptors, and I'm trying to find out how file descriptors work from the kernel-space perspective. I know that task_struct contains a files_struct, which in turn contains the table of descriptors. So for a given task_struct I can list the descriptors for that task. Each descriptor refers to a "struct file".
Now I'd like to know how to do the reverse mapping: given a pointer to a "struct file", how can I find out which "task_struct" it belongs to?

Related

What's the relationship between `struct file` and file descriptor?

I know that a struct file is a kernel representation for an opened file and a file descriptor is for user space to use. But I still have some questions:
Is the relationship between file descriptors and struct file many-to-one? That is, can multiple file descriptors in the same process share one struct file?
If the answer is yes, can multiple file descriptors in different processes share the same struct file?
If the answer is yes, how does the kernel keep track of the fd-specific information such as the current file offset?
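As a hedged illustration of the first and third questions, the user-space sketch below compares a descriptor obtained with dup() against one obtained with a second open() of the same file. dup(2) is documented to share the file offset between the two descriptors (both refer to the same open file description, which the kernel represents as one struct file), while an independent open() gets its own offset. The helper name second_fd_offset and the /tmp template are assumptions made for this example.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Reads 4 bytes through a first descriptor, then returns the offset seen
 * through a second descriptor for the same file, or -1 on error.
 * use_dup != 0: the second descriptor comes from dup() (shared struct file).
 * use_dup == 0: the second descriptor comes from a separate open(). */
long second_fd_offset(int use_dup)
{
    char path[] = "/tmp/fd_demo_XXXXXX";   /* hypothetical temp file template */
    char buf[4];
    long result = -1;

    int fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;

    /* Fill the file, then rewind so both descriptors start at offset 0. */
    if (write(fd1, "0123456789abcdef", 16) == 16 &&
        lseek(fd1, 0, SEEK_SET) == 0) {
        int fd2 = use_dup ? dup(fd1) : open(path, O_RDONLY);
        if (fd2 >= 0) {
            /* Advance the offset through fd1 only. */
            if (read(fd1, buf, sizeof buf) == (ssize_t)sizeof buf)
                result = (long)lseek(fd2, 0, SEEK_CUR);
            close(fd2);
        }
    }

    close(fd1);
    unlink(path);
    return result;
}
```

With use_dup set, the second descriptor should report the offset that the read through the first one produced; with a separate open() it should still be at the start of the file. In other words, the offset lives in the struct file, not in the descriptor number.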

How is 'task_struct' accessed via 'thread_info' in the latest Linux kernel?

Background:
I am a beginner in the area of the Linux kernel. I just started to learn about the Linux kernel by reading the book 'Linux Kernel Development, Third Edition' by Robert Love. Most of the explanations in this book are based on Linux kernel 2.6.34.
Hence, I am sorry if this is a repeated question, but I could not find any info related to this on Stack Overflow.
Question:
What I understood from the book is that each thread in Linux has a structure called 'thread_info', which has a pointer to its process/task.
This 'thread_info' is stored at the end of the kernel stack of each live thread,
and it has a pointer to the task it belongs to, as below.
struct thread_info {
    struct task_struct *task;
    ...
};
But when I checked the same structure in the latest linux code, I see a very different thread_info structure as below. (https://elixir.bootlin.com/linux/v5.16-rc1/source/arch/x86/include/asm/thread_info.h). It does not have 'task_struct' in it.
struct thread_info {
    unsigned long flags;        /* low level flags */
    unsigned long syscall_work; /* SYSCALL_WORK_ flags */
    u32 status;                 /* thread synchronous flags */
#ifdef CONFIG_SMP
    u32 cpu;                    /* current CPU */
#endif
};
My question is: if the 'thread_info' structure does not have a pointer to its task structure here, then how does the kernel find the information about the thread's address space?
Also, if you know of any good book on the latest Linux kernel, please point me to it.
The pointer to the current task_struct object is stored in an architecture-dependent way. On x86 it is stored in a per-CPU variable:
DECLARE_PER_CPU(struct task_struct *, current_task);
(In arch/x86/include/asm/current.h).
To find out how the current task_struct is stored on a particular architecture and/or in a particular kernel version, search for the implementation of the current macro: that macro is responsible for returning a pointer to the task_struct of the current process.

In Linux, how to create a file descriptor for a memory region

I have a program that handles data stored either in a file or in a memory buffer. I want to provide a uniform way to handle both cases.
I can either 1) mmap the file, so both are handled uniformly as a memory buffer; or 2) create a FILE* using fopen and fmemopen, so both are accessed uniformly as a FILE*.
However, I can't use either of the approaches above. I need to handle both as a file descriptor, because one of the libraries I use only takes a file descriptor, and it calls mmap on that file descriptor.
So my question is: given a memory buffer (we can assume it is aligned to 4K), can we get a file descriptor that is backed by this memory buffer? I saw popen suggested as an answer to another question, but I don't think the fd from popen can be mmap-ed.
You cannot easily create a file descriptor (other than a C standard library one, which is not helpful) from "some memory region". However, you can create a shared memory region, getting a file descriptor in return.
From shm_overview (7):
shm_open(3)
Create and open a new object, or open an existing object. This is analogous to open(2). The call returns a file descriptor for use by the other interfaces listed below.
Among the listed interfaces is mmap, which means that you can "memory map" the shared memory the same as you would memory map a regular file.
Thus, using mmap for both situations (file or memory buffer) should work seamlessly, if only you control creation of that "memory buffer".
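A minimal sketch of this approach, assuming the data can be copied into a freshly created shared memory object (it does not wrap an already existing buffer in place). The helper name fd_from_buffer and the object name are hypothetical and chosen only for this example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Copies len bytes (len > 0) from buf into a new POSIX shared memory
 * object and returns a descriptor for it, or -1 on failure.
 * The returned descriptor can be passed to code that calls mmap() on it. */
int fd_from_buffer(const void *buf, size_t len)
{
    char name[64];
    /* Hypothetical object name; the PID makes collisions less likely. */
    snprintf(name, sizeof name, "/buffer_fd_demo_%ld", (long)getpid());

    int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    /* Remove the name right away; the object lives on until the last
     * descriptor and mapping that refer to it are gone. */
    shm_unlink(name);

    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    void *dst = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (dst == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(dst, buf, len);   /* populate the object with the buffer's bytes */
    munmap(dst, len);
    return fd;
}
```

The caller owns the returned descriptor and should close() it when the library is done with it. On older glibc versions, shm_open may require linking with -lrt.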
You could write (perhaps using mmap) your data segment to a tmpfs based file (perhaps under /run/ directory), then pass the opened file descriptor to your library.

Do files in /proc/PID directory have their own proc_dir_entry instance?

Do files in the /proc/PID directory (including /proc/PID itself) have their own proc_dir_entry instance?
As far as I know, each normal file in /proc, including /proc itself, has its own proc_dir_entry instance.
(The instance address is stored in proc_inode.pde.)
After reading through the procfs source code in Linux 2.6.11, it seems that the kernel doesn't create a corresponding proc_dir_entry instance for each pid directory in /proc or for each file in a pid directory.
Is this true?
If it's not true, which file in the kernel source code shows that the kernel creates a proc_dir_entry instance for a pid directory in /proc?
I think you're right, it looks like the pid entries are handled differently. See fs/proc/base.c.
Yes, every process has its proc_dir_entry that is /proc/PID/task directory in common.

Getting inode from path in Linux Kernel

I'm currently trying to get an inode for a given pathname in a kernel function. All I have available is the full pathname. I've tried things like:
user_path_at(AT_FDCWD, buffer, LOOKUP_FOLLOW, &path);
But the dentry in the resulting path turns out not to be valid. Then I thought of trying stat() and getting the inode number from that. However, that only gives me a number, not a struct inode. I don't know of a way to convert an inode number to an inode without grabbing an existing inode and traversing the entire list of inodes. I don't even know if that would work, and I certainly don't want to do that.
Is there any simple way to get a struct inode from a char *pathname inside the kernel?
stat() will give you the inode of a file in the "st_ino" field.
Sorry, initial misunderstanding of the question.
If you want the actual inode structure within the kernel, I'm pretty certain the kernel itself wouldn't walk an array or list looking for the inode number (unless the list is very small). Since the code to the kernel is publicly available, you should be able to find out how it does it, then do the same.
There is no easy way since struct inode is part of the kernel and you are in user space. It all depends on the particular filesystem implementation. Are you sure that info in stat struct is not enough for your needs?
Anyway, this link might help.
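For reference, the user-space stat() route mentioned in the answers can be sketched as follows. This only yields the inode number (st_ino), not a kernel struct inode, so it does not solve the in-kernel case from the question. The helper name inode_number is hypothetical.

```c
#include <sys/stat.h>

/* Returns the inode number of path as reported by stat(),
 * or 0 if stat() fails (for example, when the path does not exist). */
unsigned long long inode_number(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;
    return (unsigned long long)st.st_ino;
}
```

Note that stat() follows symbolic links, so the number belongs to the link target; lstat() would report the link itself.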
