Can a Linux process block external signals but accept signals from its own process? - linux

I am trying to set up a Linux process that blocks SIGTERM sent by the kill command (or by any other process) but still allows SIGTERM sent from within itself (through the kill(2) system call).
Is it possible?
Here is an example program that I wrote, but it SIG_BLOCKS both external and internal signals, so it doesn't do what I want:
#include <signal.h>
#include <unistd.h>
#include <stdio.h>
int main(int argc, char **argv)
{
    sigset_t sigs;
    sigemptyset(&sigs);
    sigaddset(&sigs, SIGTERM);
    sigprocmask(SIG_BLOCK, &sigs, 0);
    printf("Sleeping 30 secs, try killing me! (pid: %d)\n", getpid());
    sleep(30);
    printf("About to call kill\n");
    kill(getpid(), SIGTERM);
    printf("This never happens!\n");
    return 1;
}
The output is:
Sleeping 30 secs, try killing me! (pid: 29416)
About to call kill
This never happens!
But it should be:
Sleeping 30 secs, try killing me! (pid: 29416)
About to call kill
Because the process should get killed from within through kill(getpid(), SIGTERM).

Not sure if this is what you're after, but you can set up a signal handler using sigaction with the SA_SIGINFO flag, and have your SIGTERM handler call _exit only if the si_pid field of the siginfo_t argument is your own PID.

From my testing: if you don't block the signal, you see the expected behaviour.
But if you block the signal, kill returns 0, and since the program continues to execute, it prints the last line and exits. I tested on Ubuntu 12.04 LTS.

Related

Erlang: How to make connected external OS process automatically die when controlling Erlang process crashes?

I am using Erlang port to read output of Linux process. I'd like the Linux process to be automatically killed whenever my connected Erlang process dies. From the docs, it seems to me that this should automatically happen, but it does not.
Minimal example. Put this in the file test.erl:
-module(test).
-export([start/0, spawn/0]).
start() ->
    Pid = spawn_link(?MODULE, spawn, []),
    register(test, Pid).
spawn() ->
    Port = open_port({spawn, "watch date"},[stream, exit_status]),
    loop([{port, Port}]).
loop(State) ->
    receive
        die ->
            error("died");
        Any ->
            io:fwrite("Received: ~p~n", [Any]),
            loop(State)
    end.
Then, in erl shell:
1> c(test).
{ok,test}
2> test:start().
true
The process starts and prints some data received from the Linux "watch" command every 2 seconds.
Then, I make the Erlang process crash:
3> test ! die.
=ERROR REPORT==== 26-May-2021::13:24:01.057065 ===
Error in process <0.95.0> with exit value:
{"died",[{test,loop,1,[{file,"test.erl"},{line,15}]}]}
** exception exit: "died"
in function test:loop/1 (test.erl, line 15)
The Erlang process dies as expected, the data from "watch" stops appearing but the watch process still keeps running in the background as can be seen in Linux (not erl) terminal:
fuxoft@frantisek:~$ pidof watch
1880127
In my real-life scenario, I am not using the "watch" command but another process that outputs data and accepts no input. How can I make it automatically die when my connected Erlang process crashes? I can do this using an Erlang supervisor and manually issuing the "kill" command when the Erlang process crashes, but I thought this could be done more easily and cleanly.
The open_port function creates a port() and links it to the calling process. If the owning process dies, the port() closes.
In order to communicate with the externally spawned command, Erlang creates several pipes, which are by default tied to the stdin and stdout (file descriptors) of the external process. Anything that the external process writes through the stdout will arrive as a message to the owning process.
When the Port is closed, the pipes attaching it to the external process are broken, and so trying to read or write to them will give you a SIGPIPE/EPIPE.
Your external process can detect this when it reads from or writes to those file descriptors, and exit at that point.
E.g.: With your current code, you can retrieve the external process OS pid with proplists:get_value(os_pid, erlang:port_info(Port)). If you strace it, you will see:
write(1, ..., 38) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=31297, si_uid=1001} ---
SIGPIPE in ports and Erlang
It seems that although the default action for SIGPIPE is to terminate the process, Erlang sets it to ignore the signal (and thus the child processes inherit this configuration).
If you're unable to modify the external process code to detect the EPIPE, you can use this C wrapper to reset the action:
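#include <stdio.h>
#include <unistd.h>
#include <signal.h>
int main(int argc, char* argv[]) {
    /* The port program apparently inherits SIGPIPE as ignored; restore the default action (terminate). */
    if (signal(SIGPIPE, SIG_DFL) == SIG_ERR)
        return 1;
    if (argc < 2)
        return 2;
    /* Replace this wrapper with the real program. execv does not search PATH, so argv[1] must be a path. */
    execv(argv[1], &(argv[1]));
    perror("execv"); /* reached only if execv failed */
    return 3;
}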
Just compile it and pass wrapper path-to-executable [arg1 [arg2 [...]]] as the command to open_port.

Why does POSIX demand that system(3) ignores SIGINT and SIGQUIT?

The POSIX spec says
The system() function shall ignore the SIGINT and SIGQUIT signals, and shall block the SIGCHLD signal, while waiting for the command to terminate. If this might cause the application to miss a signal that would have killed it, then the application should examine the return value from system() and take whatever action is appropriate to the application if the command terminated due to receipt of a signal.
This means that a program that starts a long-running sub-process will have SIGINT and SIGQUIT ignored for a long time. Here is a test program compiled on my Ubuntu 18.10 laptop:
$ cat > test_system.c << EOF
#include <stdlib.h>
int main() {
    system("sleep 86400"); // Sleep for 24 hours
}
EOF
$ gcc test_system.c -o test_system
If I start this test program running in the background...
$ ./test_system &
[1] 7489
...then I can see that SIGINT (2) and SIGQUIT (3) are marked as ignored in the bitmask.
$ ps -H -o pid,pgrp,cmd,ignored
 PID  PGRP CMD                IGNORED
6956  6956 -bash              0000000000380004
7489  7489 ./test_system      0000000000000006
7491  7489 sh -c sleep 86400  0000000000000000
7492  7489 sleep 86400        0000000000000000
Trying to kill test_system with SIGINT has no effect:
$ kill -SIGINT 7489
But sending SIGINT to the process group does kill it. This is expected: every process in the process group receives the signal, so sleep exits and system returns.
$ kill -SIGINT -7489
[1]+ Done ./test_system
Questions
What is the purpose of having SIGINT and SIGQUIT ignored, since the process can still be killed via the process group (which is what happens when you press ^C in the terminal)?
Bonus question: Why does POSIX demand that SIGCHLD be blocked?
Update: If SIGINT and SIGQUIT are ignored to ensure we don't leave children behind, then why is there no handling for SIGTERM? It's the default signal sent by kill!
SIGINT and SIGQUIT are terminal generated signals. By default, they're sent to the foreground process group when you press Ctrl+C or Ctrl+\ respectively.
I believe the idea for ignoring them while running a child via system is that the terminal should be as if it was temporarily owned by the child and Ctrl+C or Ctrl+\ should temporarily only affect the child and its descendants, not the parent.
SIGCHLD is blocked so that the SIGCHLD caused by the termination of system's child won't trigger a SIGCHLD handler, if you have one, because such a handler might reap the child started by system before system itself reaps it.

Why can GDB mask tracee's SIGKILL when attaching to the tracee

The signal(7) man page says that SIGKILL cannot be caught, blocked, or ignored. But I just observed that after attaching to a process with GDB, I can no longer send SIGKILL to that process (similarly, other signals cannot be delivered either). But after I detach and quit GDB, SIGKILL is delivered as usual.
It seems to me that GDB has blocked that signal (on behalf of the tracee) when attaching, and unblocked it when detaching. However, the ptrace(2) man page says:
While being traced, the tracee will stop each time a signal is delivered, even if the signal is being ignored. (An exception is SIGKILL, which has its usual effect.)
So why does it behave this way? What tricks is GDB using?
Here is a trivial example for demonstration:
1. test program
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <errno.h>
#include <string.h>
/* Simple error handling functions */
#define handle_error_en(en, msg) \
    do { errno = en; perror(msg); exit(EXIT_FAILURE); } while (0)
struct sigaction act;
void sighandler(int signum, siginfo_t *info, void *ptr) {
    printf("Received signal: %d\n", signum);
    printf("signal originate from pid[%d]\n", info->si_pid);
}
int
main(int argc, char *argv[])
{
    printf("Pid of the current process: %d\n", getpid());
    memset(&act, 0, sizeof(act));
    act.sa_sigaction = sighandler;
    act.sa_flags = SA_SIGINFO;
    sigaction(SIGQUIT, &act, NULL);
    while(1) {
        ;
    }
    return 0;
}
If you try to kill this program using SIGKILL (i.e., using kill -KILL ${pid}), it will die as expected. If you try to send it SIGQUIT (i.e., using kill -QUIT ${pid}), those printf statements get executed, as expected. However, if you have attached to it with GDB before sending it a signal, nothing will happen:
$ ##### in shell 1 #####
$ gdb
(gdb) attach ${pid}
(gdb)
/* now that gdb has attached successfully, in another shell: */
$ #### in shell 2 ####
$ kill -QUIT ${pid} # nothing happens
$ kill -KILL ${pid} # again, nothing happens!
/* now detach gdb */
##### in shell 1 ####
(gdb) quit
/* the process will receive SIGKILL */
##### in shell 2 ####
$ Killed # the tracee receives **SIGKILL** eventually...
FYI, I am using CentOS-6u3, and uname -r reports 2.6.32_1-16-0-0. My GDB version is GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6) and my GCC version is gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-19.el6). An old machine...
Any ideas will be appreciated ;-)
$ ##### in shell 1 #####
$ gdb
(gdb) attach ${pid}
(gdb)
The issue is that once GDB has attached to ${pid}, the inferior (being debugged) process is no longer running -- it is stopped.
The kernel will not do anything to it until it is either continued (with the (gdb) continue command), or it is no longer being traced ((gdb) detach or quit).
If you issue continue (either before or after kill -QUIT), you'll see this:
(gdb) c
Continuing.
kill -QUIT $pid executed in another shell:
Program received signal SIGQUIT, Quit.
main (argc=1, argv=0x7ffdcc9c1518) at t.c:35
35 }
(gdb) c
Continuing.
Received signal: 3
signal originate from pid[123419]
kill -KILL executed in another window:
Program terminated with signal SIGKILL, Killed.
The program no longer exists.
(gdb)

Signal handling with qemu-user

On my machine I have an aarch64 binary, that is statically compiled. I run it using qemu-aarch64-static with the -g 6566 flag. In another terminal I start up gdb-multiarch and connect as target remote localhost:6566.
I expect the binary to raise a signal for which I have a handler defined in the binary. I set a breakpoint at the handler from inside gdb-multiarch after connecting to the remote. However, when the signal is raised, the breakpoint is not hit in gdb-multiarch. Instead, on the terminal that runs the binary, I get a message along the lines of:
[1] + 8388 suspended (signal) qemu-aarch64-static -g 6566 ./testbinary
Why does this happen? How can I set a breakpoint on the handler and debug it? I've tried SIGCHLD and SIGFPE.
This works for me with a recent QEMU:
$ cat sig.c
#include <stdlib.h>
#include <signal.h>
#include <stdio.h>
void handler(int sig) {
    printf("In signal handler, signal %d\n", sig);
    return;
}
int main(void) {
    printf("hello world\n");
    signal(SIGUSR1, handler);
    raise(SIGUSR1);
    printf("done\n");
    return 0;
}
$ aarch64-linux-gnu-gcc -g -Wall -o sig sig.c -static
$ qemu-aarch64 -g 6566 ./sig
and then in another window:
$ gdb-multiarch
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
[etc]
(gdb) set arch aarch64
The target architecture is assumed to be aarch64
(gdb) file /tmp/sigs/sig
Reading symbols from /tmp/sigs/sig...done.
(gdb) target remote :6566
Remote debugging using :6566
0x0000000000400c98 in _start ()
(gdb) break handler
Breakpoint 1 at 0x400e44: file sig.c, line 6.
(gdb) c
Continuing.
Program received signal SIGUSR1, User defined signal 1.
0x0000000000405c68 in raise ()
(gdb) c
Continuing.
Breakpoint 1, handler (sig=10) at sig.c:6
6 printf("In signal handler, signal %d\n", sig);
(gdb)
As you can see, gdb gets control both immediately the process receives the signal and then again when we hit the breakpoint for the handler function.
Incidentally, (integer) dividing by zero is not a reliable way to provoke a signal. This is undefined behaviour in C, and the implementation is free to do the most convenient thing. On x86 this typically results in a SIGFPE. On ARM you will typically find that the result is zero and execution will continue without a signal. (This is a manifestation of the different behaviour of the underlying hardware instructions for division between the two architectures.)
I did some research for your question and found the following answer:
"Internally, bad memory accesses result in the Mach exception EXC_BAD_ACCESS being sent to the program. Normally, this is translated into a SIGBUS UNIX signal. However, gdb intercepts Mach exceptions directly, before the signal translation. The solution is to give gdb the command set dont-handle-bad-access 1 before running your program. Then the normal mechanism is used, and breakpoints inside your signal handler are honored."
The quote is from the question "gdb: set a breakpoint for a SIGBUS handler".
It may help here, on the assumption that QEMU does not change how these basic operations behave.

Linux effect of ptrace TRACEME call

I have the following code. It simply calls ptrace(PTRACE_TRACEME) before going into an infinite loop.
I have two issues:
1. After executing the binary, I can't attach with gdb, even as root.
2. With ptrace(PTRACE_TRACEME), I can't terminate the process with Ctrl-C (SIGINT); it simply stops.
Can someone explain what's going on? Thank you in advance.
PS: I know that most debuggers fork a child which then calls ptrace(PTRACE_TRACEME) before execve. No need to remind me of this.
#include <sys/ptrace.h>
#include <sys/reg.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>
int main(int argc, char **argv) {
    printf("my pid : %d\n", getpid());
    ptrace(PTRACE_TRACEME);
    while(1){
        printf("euid : %d\n", geteuid());
        sleep(2);
    }
    return 0;
}
After executing this binary, I can't attach gdb even if I am root.
From man ptrace:
ERRORS
EPERM The specified process cannot be traced. This could be
because the parent has insufficient privileges (the required
capability is CAP_SYS_PTRACE); non-root processes cannot trace
processes that they cannot send signals to or those running
set-user-ID/set-group-ID programs, for obvious reasons.
Alternatively, the process may already be being traced, or be init(8) (PID 1).
With ptrace(PTRACE_TRACEME), I can't terminate the process with Ctrl-C (SIGINT); it simply stops.
From man ptrace:
DESCRIPTION
While being traced, the child will stop each time a signal is
delivered, even if the signal is being ignored. (The exception is
SIGKILL, which has its usual effect.) The parent will be notified at
its next wait(2) and may inspect and modify the child process
while it is stopped. The parent then causes the child to continue,
optionally ignoring the delivered signal (or even delivering a
different signal instead).
