How to find out when a process exits in Linux?

I can't find a good way to find out when a process exits in Linux. Does anyone have a solution for that?
One that I can think of is to check the process list periodically, but that is not instant and is pretty expensive (it has to loop over all processes each time).
Is there an interface for doing that on Linux? Something like waitpid, except that it can be used from unrelated processes?
Thanks,
Boda Cydo

You cannot wait for an unrelated process, only for your own children.
If you have permission to signal the process, a simpler polling method than scanning the process list is to call kill(2) and "send" signal 0.
From the kill(2) man page:
If sig is 0, then no signal is sent,
but error checking is still performed;
this can be used to check for the
existence of a process ID or process
group ID
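A minimal Python sketch of the signal 0 probe described above. The helper name pid_exists is hypothetical; os.kill is the standard-library wrapper around kill(2). The sketch assumes that a PermissionError means the process exists but belongs to another user.

```python
import os

def pid_exists(pid):
    """Probe a PID with signal 0 (hypothetical helper name)."""
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        return False     # ESRCH: no such process
    except PermissionError:
        return True      # EPERM: the process exists, but we may not signal it
    return True

if __name__ == "__main__":
    print(pid_exists(os.getpid()))
```

Keep in mind that PIDs can be reused, so a positive result only says that some process currently has that PID.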

Perhaps you can start the program together with another program, the second one doing whatever it is you want to do when the first program stops, like sending a notification etc.
Consider this very simple example:
sleep 10; echo "finished"
Here sleep 10 is the first process and echo "finished" is the second one (echo is usually a shell builtin rather than a separate process, but I hope you get the point).

Another option is to have the process open an IPC object such as a Unix domain socket. Your watchdog process can detect when the process quits, because the kernel closes the process's end of the socket as soon as it exits.
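A rough Python sketch of the socket idea, under stated assumptions: the watched program cooperates by connecting to a Unix domain socket owned by the watchdog, and the watchdog treats end-of-file on that connection as "the process is gone". The names WATCHED and watch_until_exit are made up for illustration, and the watched program here is just a short Python one-off standing in for the real one.

```python
import os
import socket
import subprocess
import sys
import tempfile

# Stand-in for the watched program: connect to the watchdog, work briefly, exit.
WATCHED = (
    "import socket, sys, time\n"
    "s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)\n"
    "s.connect(sys.argv[1])\n"
    "time.sleep(0.3)\n"
)

def watch_until_exit(sock_path):
    """Start the watched program and block until its socket is closed."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(sock_path)
        server.listen(1)  # listen before starting the peer so its connect() queues
        proc = subprocess.Popen([sys.executable, "-c", WATCHED, sock_path])
        conn, _ = server.accept()
        with conn:
            data = conn.recv(1)  # returns b"" once the peer's end is closed
        proc.wait()
        return data

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "watch.sock")
    print("peer gone:", watch_until_exit(path) == b"")
```

The blocking recv wakes up without any polling, which is the main advantage over the kill and /proc checks.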

If you know the PID of the process in question, you can check if /proc/$PID exists. That's a relatively cheap stat() call.
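A small Python sketch of this check, assuming a Linux system with /proc mounted. The function names and the polling interval are illustrative choices, not part of any API.

```python
import os
import time

def proc_entry_exists(pid):
    """Return True while /proc/<pid> exists (Linux only)."""
    return os.path.exists(f"/proc/{pid}")

def wait_for_exit(pid, interval=0.5):
    """Poll /proc/<pid> until the entry disappears."""
    while proc_entry_exists(pid):
        time.sleep(interval)

if __name__ == "__main__":
    print(proc_entry_exists(os.getpid()))
```

Note that a terminated process that has not yet been reaped by its parent (a zombie) still has a /proc entry, so this loop only ends once the parent has called wait.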

Related

Linux, where are the return codes stored of system daemons and other processes?

How do I know whether a process has completed its execution without any errors?
How do I know whether a C++ program has returned success to the OS?
If I run it via the shell, then I could use $?. However, if I am checking the status of a process started by another user, how could I check the status?
Say I started a process in the morning and it terminated at noon. I have been working on other activities until evening and, before leaving, I would like to check what the process returned to the OS. How could I achieve that programmatically? Going through syslog would help, but I am looking for alternatives.
I could run through the OS's process table and read this information; however, that sounds a bit complicated for my requirement. Is there something like syslog where all errors of processes are recorded?
Are there any other ways to retrieve errors reported by terminated processes (of other users too)?
When a process terminates its parent process must acknowledge this using the wait or waitpid function. These functions also return the exit status. After the call to wait or waitpid the process table entry is removed, and the exit status is no longer stored anywhere in the operating system. You should check if the software you use to start the process saves the exit status somewhere.
If the parent process has not acknowledged that the child has terminated you can read its exit status from the /proc file system: it is the last field in /proc/[pid]/stat. It is stored in the same format that wait returns it, so you have to divide by 256 to get the exit code. Also you probably have to be root.
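To illustrate the "divide by 256" remark, here is a hedged Python sketch that forks a child, collects the raw wait status, and decodes it. The helper names are hypothetical; the os.WIFEXITED family of functions is the standard library's equivalent of the C macros of the same names.

```python
import os

def decode_wait_status(status):
    """Split a raw wait status into (kind, value)."""
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))   # same as status // 256 here
    if os.WIFSIGNALED(status):
        return ("signaled", os.WTERMSIG(status))    # killed by this signal number
    return ("other", status)

def run_child(exit_code):
    """Fork a child that exits immediately and return its raw wait status."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)          # child: exit without running cleanup handlers
    _, status = os.waitpid(pid, 0)   # parent: this is the acknowledgement step
    return status

if __name__ == "__main__":
    status = run_child(3)
    print(status, status // 256, decode_wait_status(status))
```

After waitpid returns, the status is gone from the kernel, which is why the launcher has to store it if anyone needs it later.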

How to set process ID in Linux for a specific program

I was wondering whether there is some way to force Linux to use a specific process ID for an application before running it. I need to know the process ID in advance.
Actually, there is a way to do this. Since kernel 3.3 with CONFIG_CHECKPOINT_RESTORE set (which is set in most distros), there is /proc/sys/kernel/ns_last_pid, which contains the last PID generated by the kernel. So, if you want to set the PID for a forked program, you need to perform these actions:
Open /proc/sys/kernel/ns_last_pid and get fd
flock it with LOCK_EX
write PID-1
fork
Voilà! The child will have the PID that you wanted.
Also, don't forget to unlock (flock with LOCK_UN) and close ns_last_pid.
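The steps above can be sketched in Python roughly as follows. This is not the code from the blog mentioned in the answer. It assumes root privileges when used on the real /proc/sys/kernel/ns_last_pid; the function name fork_with_pid and the path parameter (which lets the sequence be tried against an ordinary file) are illustrative additions.

```python
import fcntl
import os

NS_LAST_PID = "/proc/sys/kernel/ns_last_pid"

def fork_with_pid(desired_pid, child_main, path=NS_LAST_PID):
    """Lock ns_last_pid, write desired_pid - 1, fork, then unlock and close."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)                 # keep cooperating writers out
        os.write(fd, str(desired_pid - 1).encode())    # next PID handed out is +1
        pid = os.fork()
        if pid == 0:
            try:
                child_main()                           # runs in the child only
            finally:
                os._exit(0)                            # never fall back into parent code
        return pid
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)

if __name__ == "__main__":
    try:
        child = fork_with_pid(1894, lambda: print("child pid", os.getpid()))
        os.waitpid(child, 0)
    except PermissionError:
        print("writing ns_last_pid needs root")
```

The lock only protects against other programs that use the same flock convention; an unrelated fork elsewhere on the system can still take the PID first.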
You can check out the C code at my blog here.
As many have already suggested, you cannot set a PID directly, but shells usually have facilities to find out the ID of the last forked process.
For example, in bash you can launch an executable in the background (by appending &) and find its PID in the variable $!.
Example:
$ lsof >/dev/null &
[1] 15458
$ echo $!
15458
On CentOS7.2 you can simply do the following:
Let's say you want to execute the sleep command with a PID of 1894.
echo 1893 | sudo tee /proc/sys/kernel/ns_last_pid >/dev/null; sleep 1000
(However, keep in mind that if by chance another process executes in the extremely brief amount of time between the echo and sleep command you could end up with a PID of 1895+. I've tested it hundreds of times and it has never happened to me. If you want to guarantee the PID you will need to lock the file after you write to it, execute sleep, then unlock the file as suggested in Ruslan's answer above.)
There's no way to force the use of a specific PID for a process. As Wikipedia says:
Process IDs are usually allocated on a sequential basis, beginning at
0 and rising to a maximum value which varies from system to system.
Once this limit is reached, allocation restarts at 300 and again
increases. In Mac OS X and HP-UX, allocation restarts at 100. However,
for this and subsequent passes any PIDs still assigned to processes
are skipped
You could just repeatedly call fork() to create new child processes until you get a child with the desired PID. Remember to call wait() often, or you will hit the per-user process limit quickly.
This method assumes that the OS assigns new PIDs sequentially, which appears to be the case on, for example, Linux 3.3.
The advantage over the ns_last_pid method is that it doesn't require root permissions.
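A hedged Python sketch of the repeated-fork approach. The function name and the max_attempts cap are assumptions added here so that the loop cannot run unbounded; each unwanted child is reaped immediately, as the answer advises.

```python
import os

def fork_until_pid(desired_pid, child_main, max_attempts=100000):
    """Fork repeatedly until a child happens to get desired_pid.

    Returns the child's PID on success, or None if max_attempts is exhausted.
    """
    for _ in range(max_attempts):
        pid = os.fork()
        if pid == 0:
            try:
                if os.getpid() == desired_pid:
                    child_main()       # only the matching child does real work
            finally:
                os._exit(0)            # every child exits here
        if pid == desired_pid:
            return pid                 # caller is responsible for waiting on it
        os.waitpid(pid, 0)             # reap the unwanted child right away
    return None

if __name__ == "__main__":
    # Our own PID is in use, so it can never be handed to a child.
    print(fork_until_pid(os.getpid(), lambda: None, max_attempts=5))
```

On systems with a large pid_max this can take a very long time, and another process may grab the PID in between, so it is best seen as a curiosity rather than a dependable technique.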
Every process on a Linux system is created by fork(), so there should be no way to force a specific PID.
From Linux 5.5 you can pass an array of PIDs to the clone3 system call to be assigned to the new process, up to one for each nested PID namespace, from the inside out. This requires CAP_SYS_ADMIN or (since Linux 5.9) CAP_CHECKPOINT_RESTORE over the PID namespace.
If you are not concerned with PID namespaces, use an array of size one.

"How many links do I have?", asks an Erlang process

A process in Erlang will either call link/1 or spawn_link to create a link with another process. In a recent application I am working on, I got curious about whether it is possible for a process to know, at a given instant, the number of other processes it is linked to. Is this possible? Is there a BIF?
Also, when a linked process dies, I guess that if it were possible to know the number of linked processes, this number would be decremented automatically by the run-time system. Such a mechanism would be ideal for dealing with parent-child relationships in Erlang concurrent programs, even in simple ones that do not involve supervisors.
So, is it possible for an Erlang process to know out of the box, perhaps via a BIF, the number of processes linked to it, such that whenever a linked process dies, this value is decremented automatically under the hood :)?
To expand on this question a little, consider a gen_server that handles thousands of messages via handle_info. Its job here is to dispatch a child process to handle each task as soon as it comes in, so that the server loop returns immediately to take up the next request. The child process handles the task asynchronously and sends the reply back to the requestor before it dies. Please refer to this question and its answer before you continue. Now, what if a link is created for every child process spawned by the gen_server, and I would like to use these links as a counter? I know, I know, everyone is going to say "why not use the gen_server State to carry, say, a counter, and then increment or decrement it accordingly?" :) Somewhere in the gen_server, I have:
handle_info({Sender,Task},State)->
    spawn_link(?MODULE,child,[Sender,Task]),
    %% At this point, the number of links to the gen_server is incremented
    %% by the run-time system
    {noreply,State};
handle_info( _ ,State) -> {noreply,State}.
The child goes on to do this:
child(Sender,Task)->
    Result = (catch execute_task(Task)),
    Sender ! Result,
    ok. %% At this point the child process exits,
        %% and i expect the link value to be decremented
Then finally, the gen_server has an exposed call like this:
get_no_of_links()-> gen_server:call(?MODULE,links).
handle_call(links, _ ,State)->
    %% BIF to get number of instantaneous links expected here
    Links = erlang:get_links(), %% This is fake, do not do it at home :)
    {reply,Links,State};
handle_call(_ , _ ,State)-> {reply,ok,State}.
Now, someone may ask themselves: really, why would anyone want to do this?
Usually, it is possible to keep an integer in the gen_server State and maintain it ourselves, or at least have the gen_server handle_info match messages of the form {'EXIT',ChildPid,_Reason} so that the server can act accordingly. My thinking is that if it were possible to know the number of links, I would use this to know (at a given moment in time) how many child processes are still busy working, which in turn may help in anticipating server load.
From manual for process_info:
{links, Pids}:
Pids is a list of pids, with processes to which the process
has a link
3> process_info(self(), links).
{links,[<0.26.0>]}
4> spawn_link(fun() -> timer:sleep(100000) end).
<0.38.0>
5> process_info(self(), links).
{links,[<0.26.0>,<0.38.0>]}
I guess it could be used to count the number of linked processes.
Your process should run process_flag(trap_exit, true) and listen for messages of the form {'EXIT', Pid, Reason} which will arrive whenever a linked process exits. If you don't trap exits, the default behaviour will be for your linked process to exit when the other side of the link exits.
As for tracking when links are added, you can use case process_info(self(), links) of {links, L} -> length(L) end or length(element(2, process_info(self(), links))), but you have to re-run this regularly, as there is no way for your process to be notified whenever a link is added.
A process following OTP guidelines never needs to know how many processes are linked to it.

In Linux, I'm looking for a way for one process to signal another, with blocking

I'm looking for a simple event notification system:
Process A blocks until it gets notified by...
Process B, which triggers Process A.
If I was doing this in Win32 I'd likely use event objects ('A' blocks, when 'B' does a SetEvent).
I need something pretty quick and dirty (prefer script rather than C code).
What sort of things would you suggest? I'm wondering about file advisory locks but it seems messy. One of the processes has to actively open the file in order to hold a lock.
Quick and dirty?
Then use a FIFO, which is a named pipe. Process A reads from the FIFO's file descriptor in blocking mode. Process B writes to it when needed.
Simple, indeed.
And here is the bash scripting implementation:
Program A:
#!/bin/bash
mkfifo /tmp/event
while read -n 1 </tmp/event; do
    echo "got message"
done
Program B:
#!/bin/bash
echo -n "G" >>/tmp/event
First start script A, then in another shell window repeatedly start script B.
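For comparison, the same FIFO handshake can be sketched in Python. The function names are illustrative; os.mkfifo and the blocking behaviour of opening a FIFO are standard. The demo uses a timer thread to play the role of Process B, which in practice would be a separate program.

```python
import os
import tempfile
import threading

def wait_for_event(fifo_path):
    """Process A: block until someone writes to the FIFO, then return the byte."""
    with open(fifo_path, "rb") as fifo:   # open() itself blocks until a writer appears
        return fifo.read(1)

def trigger(fifo_path):
    """Process B: wake up the waiter by writing a single byte."""
    with open(fifo_path, "wb") as fifo:   # blocks until a reader has the FIFO open
        fifo.write(b"G")

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "event")
    os.mkfifo(path)
    threading.Timer(0.2, trigger, args=(path,)).start()
    print("got message:", wait_for_event(path))
```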
Other than a FIFO, you can use signals and kill to essentially do interrupts: one process sleeps until it receives a signal such as SIGUSR1, which then unblocks it (for example by blocking in sigsuspend(2) or sigwait(3), so no polling is needed).
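A minimal Python sketch of the signal-based variant, assuming SIGUSR1 is free for this purpose. The waiter blocks the signal first and then sleeps in sigwait, so nothing is polled. The helper names are hypothetical, and the timer thread in the demo stands in for Process B, which would normally call kill on the waiter's PID from another program.

```python
import os
import signal
import threading

def block_sigusr1():
    """Block SIGUSR1 so it stays pending instead of terminating the process."""
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})

def wait_for_sigusr1():
    """Process A: sleep until SIGUSR1 arrives (must be blocked beforehand)."""
    return signal.sigwait({signal.SIGUSR1})

def notify(pid):
    """Process B: wake up the waiter."""
    os.kill(pid, signal.SIGUSR1)

if __name__ == "__main__":
    block_sigusr1()   # do this before starting any threads so they inherit the mask
    threading.Timer(0.2, notify, args=(os.getpid(),)).start()
    print("woken by", wait_for_sigusr1())
```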
Slow and clean?
Then use (named) semaphores: either POSIX or SysV (not recommended, but possibly slightly more portable). Process A does a sem_wait (or sem_timedwait) and Process B calls sem_post.

Linux, timing out on subprocess

OK, I need to write code that calls a script and, if the operation in the script hangs, terminates the process.
The preferred language is Python, but I'm also looking through C and bash script documentation too.
Seems like an easy problem, but I can't decide on the best solution.
From research so far:
Python: has some weird threading model where the virtual machine uses one thread at a time, so it won't work?
C: the preferred solution so far seems to be SIGALRM + fork + execl. But SIGALRM is not heap safe, so it can trash everything?
Bash: the timeout program? Not standard on all distros?
Since I'm a newbie to Linux, I'm probably unaware of 500 different gotchas with those functions, so can anyone tell me what's the safest and cleanest way?
Avoid SIGALRM because there is not much safe stuff to do inside the signal handler.
Considering the system calls that you should use, in C, after doing the fork-exec to start the subprocess, you can periodically call waitpid(2) with the WNOHANG option to inspect whether the subprocess is still running. If waitpid returns 0 (process is still running) and the desired timeout has passed, you can kill(2) the subprocess.
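The same sequence of system calls is available from Python's os module, so the approach can be sketched as below. The function name, the polling interval, and the choice of SIGKILL are assumptions made for the example.

```python
import os
import signal
import sys
import time

def run_with_timeout(argv, timeout, poll_interval=0.05):
    """fork + exec argv, poll with WNOHANG, kill the child after timeout seconds.

    Returns (raw_wait_status, timed_out).
    """
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)   # only returns (by raising) on failure
        finally:
            os._exit(127)
    deadline = time.monotonic() + timeout
    while True:
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == pid:                # child has finished by itself
            return status, False
        if time.monotonic() >= deadline:
            os.kill(pid, signal.SIGKILL)
            _, status = os.waitpid(pid, 0)   # reap it so no zombie is left
            return status, True
        time.sleep(poll_interval)

if __name__ == "__main__":
    cmd = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(cmd, timeout=0.5))
```

No signal handler is installed anywhere, which sidesteps the async-signal-safety concerns raised about SIGALRM.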
In bash you can do something similar to this:
start the script/program in background with &
get the process id of the background process
sleep for some time
then kill the process (if it has already finished, kill simply fails), or first check whether the process is still alive and only then kill it.
Example:
sh long_time_script.sh &
pid=$!
sleep 30s
kill $pid
You can even try to use trap 'script_stopped $pid' SIGCHLD; see the bash man page for more info.
UPDATE: I found another command, timeout. It does exactly what you need: it runs a command with a time limit. Example:
timeout 10s sleep 15s
will kill the sleep after 10 seconds.
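Since Python is the preferred language in the question, it may be worth noting that the standard library's subprocess.run accepts a timeout argument with much the same effect as the timeout command. A short sketch follows; the wrapper name run_limited is made up for illustration.

```python
import subprocess
import sys

def run_limited(argv, seconds):
    """Run argv; return its exit code, or None if it exceeded the time limit."""
    try:
        # On timeout, subprocess.run kills the child and waits for it
        # before raising TimeoutExpired.
        return subprocess.run(argv, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None

if __name__ == "__main__":
    print(run_limited([sys.executable, "-c", "import time; time.sleep(15)"], 1))
```

Only the direct child is killed; if the script spawns further processes, they may need separate handling.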
There is a collection of Python code that has features to do exactly this, without too much difficulty if you know the APIs.
The Pycopia collection has the scheduler module for timing out functions, and the proctools module for spawning subprocesses and sending signals to them. The kill method can be used in this case.
