How to check the state of Linux threads? - linux

How can I check the state of a Linux thread from code rather than with a tool? I want to know whether a thread is running, blocked on a lock, or asleep for some other reason. I know the Linux tool "top" can do this, but how do I implement it in my own code? Thanks.

I think you should study the /proc file system in detail; it is also documented inside the kernel source tree.
It is how the Linux kernel exposes its state to user space.
There is also a libproc (used by ps and top), which reads the /proc/ pseudo-files.
See this question, related to yours.
Reading files under /proc/ does not do any disk I/O (because /proc/ is a pseudo file system), so it is fast.

Let's say your process ID is 100.
Go to the /proc/100/task directory; it contains one subdirectory per thread, named after the thread ID.
Inside each subdirectory, e.g. /proc/100/task/10100, there is a file named status.
The line beginning with "State:" in this file gives the state of the thread (it is the 2nd line on older kernels; newer kernels print a "Umask:" line before it).
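The steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the helper names `parse_state` and `thread_states` are made up for this example, and the code looks the "State:" line up by name rather than by position. Note that the state is coarse (for example R for running, S for sleeping, D for uninterruptible disk sleep), so a thread waiting on a lock will normally just show up as sleeping.

```python
import os
from pathlib import Path


def parse_state(status_text):
    """Return the value of the 'State:' line, e.g. 'S (sleeping)', or None."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            return line.split(":", 1)[1].strip()
    return None


def thread_states(pid):
    """Map each thread ID of `pid` to its state string (hypothetical helper)."""
    states = {}
    for task_dir in Path(f"/proc/{pid}/task").iterdir():
        try:
            text = (task_dir / "status").read_text()
        except (FileNotFoundError, ProcessLookupError):
            # The thread exited between listing the directory and reading it.
            continue
        states[int(task_dir.name)] = parse_state(text)
    return states


if __name__ == "__main__":
    # Inspect the threads of the current process as a demonstration.
    for tid, state in sorted(thread_states(os.getpid()).items()):
        print(tid, state)
```

Calling `thread_states(100)` would inspect the process from the example above, provided you have permission to read its /proc entries.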

You could also find the threads by looking at the cgroup hierarchy of the service that your process belongs to. Cgroups have a file called "tasks", which lists all the tasks of a service.
For example:
cat /sys/fs/cgroup/systemd/system.slice/hello.service/tasks
Note: cgroups must be enabled in your Linux kernel.

Related

How can a program change a directory without using chdir()?

I can find a lot of documentation on using chdir() to change a directory in a program (a command shell, for instance). I was wondering if it is possible to somehow do the same thing without the use of chdir(). Yet, I can't find any documentation or examples of code where a person is changing directories without using chdir() to some capacity. Is this possible?
In Linux, chdir() is a syscall. That means it's not something a program does in its own memory, but it's a request for the OS kernel to do something on the program's behalf.
Granted, it's one of two syscalls that can change directories -- the other one is fchdir(). Theoretically you could use the other one, though whether that's what your professor actually wants is very much open to interpretation.
In terms of why chdir() and fchdir() can't be reimplemented by an application but need to be leveraged: The current working directory is among the process state maintained by the kernel on a program's behalf; the program itself can't access kernel memory without asking the kernel to operate on its behalf.
Things are syscalls because they need to be syscalls -- if something could be done in-process, it would be done that way (crossing the boundary between userspace and kernelspace involves a context-switch penalty; it's not without performance impact). In this case, letting the kernel do accurate bookkeeping as to what a process's working directory is ensures that the working directory is maintained when a new executable is loaded (with execve()), and helps to ensure the integrity of the kernel's records (making sure a program can't pretend to have its current working directory be a directory it doesn't actually have access to).
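As a small illustration of the fchdir() route mentioned above, here is a sketch in Python, whose `os.fchdir` wraps the same syscall. The function name `change_dir_via_fd` is invented for this example. It still asks the kernel to change the directory; it just hands over an open directory descriptor instead of a path.

```python
import os
import tempfile


def change_dir_via_fd(path):
    """Change the working directory with fchdir() on an open directory fd."""
    # O_DIRECTORY makes the open fail if `path` is not a directory.
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fchdir(fd)  # the kernel updates the process's working directory
    finally:
        os.close(fd)   # the working directory stays valid after the fd is closed


if __name__ == "__main__":
    change_dir_via_fd(tempfile.gettempdir())
    print(os.getcwd())
```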

Where is the Linux kernel code that handles creation and deletion of /proc/pid/fd/# links?

Just looking for where this is handled. I have a similar need to track open fd's without scanning the /proc system.
The code that fills in the information under /proc/<PID>/fd is in the file fs/proc/fd.c.
I suggest looking into the function proc_readfd_common, which iterates over the file descriptors available to the process. This function is eventually called when the directory /proc/<PID>/fd is read.

Get notification on cgroup process change?

Basically, inotify which normally serves to notify on filesystem changes doesn't work within the cgroup virtual filesystem.
Essentially I want a way to get a notification similar to inotify when a process in a cgroup either dies or forks. I tried attaching inotify to the tasks virtual file inside the cgroup filesystem, but that does nothing when a process forks on its own, only when a userspace tool manually writes to the file to influence the cgroup.
inotify does not work on such virtual file systems, be it cgroup, proc or sys.
Note: I tried this too, it would have been very handy in some situations, but nope. :-)
This is because the files and directories do not actually exist per se (for example, they take 0 disk space); they are produced for you on the fly by the kernel as you visit them.
So the alternative would be to poll the files and directories periodically in a loop, which is so ugly that it is not a real alternative in most cases.
And this is why programs such as top and htop consume noticeable CPU: they actively walk the proc virtual file system rather than waiting for events through inotify, select or the like.
EDIT:
But there are some things that could help you though:
1/ For recent kernels (cgroups have been re-designed):
Look at:
https://www.kernel.org/doc/Documentation/cgroup-v2.txt
I quote:
2-3. [Un]populated Notification
Each non-root cgroup has a "cgroup.events" file which contains
"populated" field indicating whether the cgroup's sub-hierarchy has
live processes in it. Its value is 0 if there is no live process in
the cgroup and its descendants; otherwise, 1. poll and [id]notify
events are triggered when the value changes. [...]
2/ For older kernels:
You may want to have a look at notify_on_release and release_agent. Have a look at:
https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
notify_on_release flag: run the release agent on exit?
release_agent: the path to use for release notifications (this file exists in the top cgroup only)
And the sections "1.4 What does notify_on_release do ?" and "1.5 What does clone_children do ?"
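For the cgroup v2 case, the "cgroup.events" notification quoted above could be consumed roughly as follows. This is a sketch under assumptions: the function names are invented for this example, the path in the usage note is hypothetical, and it relies on the kernel reporting a change of the file as a POLLPRI event. As the quoted text says, this only fires when the cgroup goes from populated to empty or back, not on every fork or exit.

```python
import select


def parse_events(text):
    """Turn 'key value' lines from cgroup.events into a dict of ints."""
    events = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key:
            events[key] = int(value)
    return events


def wait_for_populated_change(events_path, timeout_ms=None):
    """Block until the file signals a change; return the new 'populated' value.

    Returns None if the timeout (in milliseconds) expires first.
    """
    with open(events_path) as f:
        f.read()  # read once so that only later changes are reported
        poller = select.poll()
        # Assumption: changes are signalled as an exceptional (POLLPRI) event.
        poller.register(f.fileno(), select.POLLPRI)
        if not poller.poll(timeout_ms):
            return None
        f.seek(0)  # re-read the file from the start to get the new values
        return parse_events(f.read()).get("populated")
```

A call such as `wait_for_populated_change("/sys/fs/cgroup/mygroup/cgroup.events")` (hypothetical cgroup name) would then return 0 once the last process in that cgroup and its descendants has exited.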

How to tell if a given process opened files with O_DIRECT?

I would like to tell if a process has opened any files using O_DIRECT, but I can only examine it after the process was launched (i.e. strace is not an option). I tried looking in /proc/$pid/fd/ to see if there was anything useful, but there wasn't. My goal is to track down if any of several hundred users on a system have opened files with O_DIRECT. Is this possible?
Since kernel 2.6.22, /proc/$pid/fdinfo/$fd contains a flags field, in octal. See http://www.kernel.org/doc/man-pages/online/pages/man5/proc.5.html
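A minimal sketch of that fdinfo check in Python, assuming you have permission to read the target process's /proc entries (root, or the same user). The function names are made up for this example; `os.O_DIRECT` is the platform's value of the flag as exposed by Python on Linux.

```python
import os


def parse_fdinfo_flags(fdinfo_text):
    """Return the integer value of the octal 'flags:' field, or None if absent."""
    for line in fdinfo_text.splitlines():
        if line.startswith("flags:"):
            return int(line.split(":", 1)[1].strip(), 8)
    return None


def fds_with_o_direct(pid):
    """List the descriptor numbers of `pid` whose open flags include O_DIRECT."""
    result = []
    fdinfo_dir = f"/proc/{pid}/fdinfo"
    for name in os.listdir(fdinfo_dir):
        try:
            with open(os.path.join(fdinfo_dir, name)) as f:
                flags = parse_fdinfo_flags(f.read())
        except OSError:
            continue  # the descriptor was closed, or we lack permission
        if flags is not None and flags & os.O_DIRECT:
            result.append(int(name))
    return sorted(result)
```

To survey many users, one could loop over the numeric entries in /proc and call `fds_with_o_direct` on each, keeping in mind that this is only a snapshot: files opened and closed between scans are missed.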
I don't think it's visible in /proc or elsewhere in user space.
With kernel code, it's possible:
1. Get the process's task_struct (use find_task_by_pid).
2. Go over files - use task->files->count and task->files->fd_array.
3. Look for file->f_flags & O_DIRECT.

reading and writing from a file in linux kernel

I'm writing a patch for the VFS FAT implementation on kernel 3.0.
I want to add POSIX attributes to FAT files that are created in Linux.
To achieve that, I must save a file that contains all the relevant information on the mounted drive.
I know that reading and writing files from kernel space is something that normally shouldn't be done, and I'm looking for another way to read/write the data.
I saw articles on the net that suggested using /proc or creating a userspace daemon that will do the I/O for me. I wanted to know if anyone has seen, or knows where I can look at, an implementation of something like that, because I didn't find any examples of it on the net.
I'm not looking for a read/write to proc example; I want to see an entire solution for this issue.
Have a look at the quota implementation; this is a mechanism (ok, presumably not available on vfat) which reads/writes files from the kernel.
Additionally, the "loop" block device is another example of a kernel facility which does file IO.
