attach preempt_notifier to user process in linux

I need to identify whether a user process was ever preempted. I understand that there are hooks in preempt.h and sched.c that let us define preempt_notifiers, which in turn call the sched_in and sched_out functions whenever a process is rescheduled or preempted.
But I still can't find out how to attach a notifier to a particular process or pid in user space and then log whether that process was ever preempted. I'm assuming I have to write a module to do so, but how would I go about attaching a pid to a particular notifier?

The notifier is inherently per-process. When you register it, you are registering it for the current process. See the code in preempt_notifier_register(): it attaches the notifier to current->preempt_notifiers.

The pseudo-file /proc/<pid>/status contains a line nonvoluntary_ctxt_switches: which seems to be the information that you're after.
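As a rough sketch of reading that counter from user space, something like the following should do (Python here just for brevity; the helper name nonvoluntary_switches is made up for this example, and it assumes a Linux /proc whose status file carries that line). If the number grows between two reads, the task was preempted at least once in between.

```python
import os


def nonvoluntary_switches(pid):
    """Return the nonvoluntary_ctxt_switches count for pid, or None if absent."""
    # Each line of /proc/<pid>/status has the form "key:<whitespace>value".
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            key, _, value = line.partition(":")
            if key == "nonvoluntary_ctxt_switches":
                return int(value)
    return None


if __name__ == "__main__":
    # Example: inspect the current process.
    print(nonvoluntary_switches(os.getpid()))
```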

Related

Listening for new Processes in Linux Kernel Module

Is it possible to get notified (via callback or similar) when a new process is executed, when one is closed, and when its state changes (e.g. stopped, paged, etc.)? In user-land, it would be easy to set up a directory listener on /proc.
Have you considered kprobes? You can use kprobes to execute a callback function when some kernel code is executed. E.g., you could add a do_fork kprobe to alert when new processes are created as in this example.
Similarly, you can add a probe for do_exit() to catch when processes exit.
For state changes, you could put a return probe on sched_switch() and catch when the state changes. Depending on your application, this may add too much overhead.
If you only wish to collect data, perform some light processing, and aren't looking to do much more with the kernel module, systemtap may be a good alternative to writing a kernel module: https://sourceware.org/systemtap/documentation.html
More details on kprobes:
https://www.kernel.org/doc/Documentation/kprobes.txt
sched_switch() systemtap example:
https://sourceware.org/systemtap/examples/profiling/sched_switch.stp

Track user thread life in kernel space

I'm trying to track the life span of a user thread in a kernel module. I want to detect when a user thread is no longer executing (exit() has been called). How would I go about doing that? I'm digging into the kernel source code as I write this, but there's a lot to take in!
I did find task_struct.vfork_done, and it looks like something I can hook into. Am I on the right track?
Before anything, let me confirm that by 'no longer executing' you mean the process has been signaled to die and will soon expire. If I were you, I would register a notification chain within a simple misc driver module.
I would then trigger the notification from within the signal handling code of the kernel when the process under question has been signalled with a fatal signal. I would specifically tinker with the function get_signal_to_deliver (kernel/signal.c). I've recently answered a similar query here

cross-process locking in linux

I am looking to make an application in Linux where only one instance of the application can run at a time. I want to make it robust, so that if an instance of the app crashes, it won't block all the other instances indefinitely. I would really appreciate some example code on how to do this (there's lots of discussion on this topic on the web, but I couldn't find anything that worked when I tried it).
You can use the file locking facilities that Linux provides. You haven't specified the language, but you will find this capability pretty much everywhere in some form or another.
Here is a simple idea for how to do that in a C program. When the program starts, it takes an exclusive non-blocking lock on the whole file using the fcntl system call. When someone tries to start another instance of the application, that instance gets an error when it tries to lock the file, which means the application is already running.
Here is a small example of how to take the full file lock using fcntl (this function provides byte range locks, but when the length is 0, the whole file is locked).
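/* Needs <fcntl.h>, <string.h> and <unistd.h>; fd must be open for writing. */
struct flock lock_struct;
int ret;
memset(&lock_struct, 0, sizeof(lock_struct)); /* l_start = 0, l_len = 0: whole file */
lock_struct.l_type = F_WRLCK;                 /* exclusive (write) lock */
lock_struct.l_whence = SEEK_SET;
lock_struct.l_pid = getpid();                 /* ignored by F_SETLK; only F_GETLK fills it in */
ret = fcntl(fd, F_SETLK, &lock_struct);       /* -1 with errno EACCES or EAGAIN if another process holds the lock */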
Please note that you need to open a file before you can place a lock on it. This means you need to have a file around to use for locking. It is a good idea to put it somewhere where it won't cause any distraction or confusion for other applications.
When the process terminates, all locks that it has taken will be released, so nothing will be blocked.
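For what it's worth, the same idea can be sketched in a few lines of Python, whose fcntl.lockf() wraps the same fcntl() record locks. The function name try_single_instance_lock and the lock file path are made up for this example; treat it as an illustration of the technique rather than a drop-in solution.

```python
import fcntl
import os


def try_single_instance_lock(path):
    """Return an fd holding an exclusive lock on path, or None if it is taken."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # With the default length of 0 the lock covers the whole file.
        # LOCK_NB makes the call fail instead of waiting when another
        # process already holds the lock.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd  # keep it open: closing the fd (or exiting) releases the lock


if __name__ == "__main__":
    # Hypothetical lock file location, chosen only for this example.
    lock_fd = try_single_instance_lock("/tmp/example-app.lock")
    print("already running" if lock_fd is None else "lock acquired")
```

The caller has to keep the returned descriptor open for as long as the application runs, since the lock goes away when it is closed.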
This is just one of the ideas. I'm pretty sure there are other ways around.
The conventional UNIX way of doing this is with PID files.
Before a process starts, it checks whether a pre-determined file, usually /var/run/<process_name>.pid, exists. If it does, that indicates a process is already running, and this process quits.
If the file does not exist, this is the first process to run. It creates the file /var/run/<process_name>.pid and writes its PID into it. The process unlinks the file on exit.
Update:
To handle cases where a daemon has crashed and left behind the pid file, additional checks can be made during startup if a pid file was found:
Do a ps and ensure that a process with that PID doesn't exist.
If it exists, ensure that it's a different process, either from the said ps output or from /proc/$PID/stat.
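A minimal sketch of that startup check, assuming Linux and going through /proc/$PID/stat rather than ps. The function name instance_running and its arguments are hypothetical, and the name comparison relies on the fact that the kernel truncates the command name in stat to 15 characters.

```python
import os


def instance_running(pid_file, expected_name):
    """Return True if pid_file names a live process called expected_name."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
        with open(f"/proc/{pid}/stat") as f:
            stat = f.read()
        # The second field of stat is the command name in parentheses;
        # take everything between the first "(" and the last ")".
        running_name = stat[stat.index("(") + 1:stat.rindex(")")]
    except (OSError, ValueError):
        # No pid file, garbled contents, or no such process: nothing running.
        return False
    return running_name == expected_name[:15]
```

If this returns False while the pid file exists, the file is stale and can be overwritten. Note that there is still a window between the check and the write, which the file locking approach does not have.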

Transient gen_server processes and updating pids

I'm currently learning Erlang at a reasonable clip but have a question about gen_server with supervisors. If a gen_server process crashes and is consequently restarted by a supervisor, it receives a new pid. Now, what if I want other processes to refer to that process by Pid? What are some good idiomatic ways to 'update' the Pid in those processes?
As an exercise with some practical application, I'm writing a lock server where a client can request a lock with an arbitrary key. Ideally, I would like to have a separate process handle the locking and releasing of each particular lock, the idea being that I can use the timeout argument in gen_server to terminate the process if no one has requested it after N amount of time, so that only currently relevant locks stay in memory. Now, I have a directory process which maps the lock name to the lock process. When the lock process terminates, it deletes the lock from the directory.
My concern is how to handle the case where a client requests a lock while the lock process is in the middle of terminating. It hasn't shut down yet, so checking that the pid is alive won't work. The lock process hasn't reached the clause that deletes it from the directory yet.
Is there a better way to handle this?
EDIT
There are two gen_servers currently: the 'directory' which maintains an ETS table from LockName -> Lock Process, and the 'lock servers' which are added dynamically to the supervision tree using start_child. Ideally I would like each lock server to handle talking with the clients directly, but am worried about the scenario of a request to acquire/release getting issued with call or cast when the process is in the middle of crashing (and thus won't respond to the message).
Starting with {local} or {global} won't work since there can be N of them.
The trick is to name the process and not refer to it by its pid. You generally have 3 viable options:
Use registered names. This is what andreypopp suggests. You refer to the server by its registered name. Locally registered names have to be atoms, which may somewhat limit you. Globally registered names do not have this limitation; you can register any term.
The Supervisor knows the Pid. Ask it. You will have to pass the Supervisor Pid to the process.
Alternatively, use the gproc application (exists on http://github.com). It allows you to create a generic process registry. You could do that yourself with ETS, but it is better to steal good code than to implement it yourself.
The pid is usable if all processes are part of the same supervision tree, so that the death of one of them means the death of the others. Thus, Pid recycling doesn't matter.
Don't refer to gen_server process by pid.
You should provide an API for your gen_server via the gen_server:call/2 or gen_server:call/3 functions. They accept ServerRef as the first argument, which can be Name | {Name,Node} | {global,GlobalName} | pid(). So your API would look like:
lock(Key) ->
    gen_server:call(?MODULE, {lock, Key}).
release(Key) ->
    gen_server:call(?MODULE, {release, Key}).
Note that this API is defined in the same module as your gen_server, and I assume you are starting your server with something like:
gen_server:start_link({local, ?MODULE}, ?MODULE, [], [])
So your API functions can look up the server not by pid, but by server name, which is equal to ?MODULE.
For more information, please see gen_server documentation.
You can completely avoid the use of your "lock_server" process by using the "erlang:monitor/demonitor" API.
When a client requests a lock, you issue the lock and do an erlang:monitor on the client. This returns a monitor reference, which you can then store along with your lock. The beauty of this is that your directory server WILL be notified when the client dies. You could implement the timeout in the client.
Here is a snippet from code I had written recently..
https://github.com/xslogic/phoebus/blob/master/src/table_manager.erl
Basically, the table_manager is a process that issues a lock on a particular table resource to a client. If the client dies, the table is returned to the pool.

How to find out when process exits in Linux?

I can't find a good way to find out when a process exits in Linux. Does anyone have a solution for that?
One that I can think of is check process list periodically, but that is not instant and pretty expensive (have to loop over all processes each time).
Is there an interface for doing that on Linux? Something like waitpid, except something that can be used from unrelated processes?
Thanks,
Boda Cydo
You cannot wait for an unrelated process, just children.
A simpler polling method than checking the process list: if you have permission, you can use the kill(2) system call and "send" signal 0.
From the kill(2) man page:
If sig is 0, then no signal is sent, but error checking is still performed; this can be used to check for the existence of a process ID or process group ID.
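A small sketch of that probe, here in Python via os.kill (the helper name process_exists is made up for this example). It assumes pid is a positive number, since 0 and negative values address process groups, and it treats a permission error as "exists", because the kernel only reports that for a process that is really there.

```python
import os


def process_exists(pid):
    """Probe pid with signal 0: nothing is delivered, only the checks run."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        # ESRCH: no process with this pid.
        return False
    except PermissionError:
        # EPERM: the process exists, but we are not allowed to signal it.
        return True
    return True
```

Keep in mind that a zombie that has not been reaped yet still counts as existing, and that pids can be reused, so this only tells you that some process currently has that pid.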
Perhaps you can start the program together with another program, the second one doing whatever it is you want to do when the first program stops, like sending a notification etc.
Consider this very simple example:
sleep 10; echo "finished"
sleep 10 is the first process, echo "finished" the second one (though echo is usually a shell builtin, I hope you get the point).
Another option is to have the process open an IPC object such as a unix domain socket; your watchdog process can detect when the process quits, because the socket is closed as soon as the process exits.
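To illustrate the socket idea, here is a sketch that uses socket.socketpair() and fork() so that it fits in one file. For truly unrelated processes you would presumably connect over a named unix socket path instead, which this example does not show. The function name wait_for_peer_exit is made up.

```python
import os
import socket


def wait_for_peer_exit(sock):
    """Block until the other end of sock is closed, i.e. its owner is gone."""
    # recv() returns b"" once the peer's end is closed; a connection
    # reset also means the peer has gone away.
    try:
        while sock.recv(4096):
            pass
    except ConnectionResetError:
        pass


if __name__ == "__main__":
    watchdog_end, worker_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        watchdog_end.close()
        os._exit(0)         # the watched process ends; its socket closes
    worker_end.close()      # the watchdog must not keep the worker's end open
    wait_for_peer_exit(watchdog_end)
    os.waitpid(pid, 0)
    print("watched process has exited")
```

The kernel closes the descriptor however the process dies, so this also catches crashes and kill -9, with no polling involved.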
If you know the PID of the process in question, you can check if /proc/$PID exists. That's a relatively cheap stat() call.
