Linux: signal source - linux

How can one reliably determine whether a process received a signal because of its own misbehavior or because another process sent it? In other words, how does one determine whether the si_pid field is valid?

If si_pid in the siginfo_t structure matches getpid() then the process signaled itself. Otherwise, another process did. Since process IDs are unique at any point in time, a PID you have now could not possibly have sent you the signal at a time when it had your PID (because then it would have signaled itself and not you).
Edit:
As you have discovered, the si_pid field is not always set; sometimes it contains garbage values. The first thing to check is that you have passed SA_SIGINFO in the sa_flags field of your struct sigaction when registering your handler. Without this, your handler may not receive a siginfo_t at all.
Once that's done, there are rules for when si_pid is set, described here: https://www.mkssoftware.com/docs/man5/siginfo_t.5.asp#Signal_Codes
In brief: si_pid should be set if si_code is one of:
SI_USER - includes calls to kill()
SI_QUEUE
SI_TIMER
SI_ASYNCIO
SI_MESGQ
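It is also set whenever si_signo is SIGCHLD.
Note that the list above comes from the linked page. On Linux, sigaction(2) documents si_pid as filled in for signals sent with kill() and sigqueue() (SI_USER, SI_QUEUE), for message queue notification (SI_MESGQ), and for SIGCHLD; POSIX timer signals (SI_TIMER) fill in si_timerid and si_overrun instead.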
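The check described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the names install_siginfo_handler() and last_signal_was_self_sent() are hypothetical, and the handler only treats si_pid as valid for SI_USER and SI_QUEUE, which is a conservative assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Written by the handler. sig_atomic_t is used for the PID on the
 * assumption that pid_t fits in it, which holds on Linux. */
static volatile sig_atomic_t last_pid_valid;
static volatile sig_atomic_t last_sender = -1;

static void on_signal(int signo, siginfo_t *info, void *ucontext)
{
    (void)signo;
    (void)ucontext;
    /* Only trust si_pid for signals sent by kill() or sigqueue(). */
    last_pid_valid = (info->si_code == SI_USER || info->si_code == SI_QUEUE);
    last_sender = last_pid_valid ? (sig_atomic_t)info->si_pid : -1;
}

/* Hypothetical helper: installs the handler with SA_SIGINFO.
 * Returns 0 on success, -1 on error. */
int install_siginfo_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;   /* without this flag no siginfo_t is passed */
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Hypothetical helper: 1 if the last signal came from this process. */
int last_signal_was_self_sent(void)
{
    return last_pid_valid && last_sender == (sig_atomic_t)getpid();
}
```

After install_siginfo_handler(SIGUSR1), a kill(getpid(), SIGUSR1) from the same process should make last_signal_was_self_sent() return 1. A signal raised with raise() may carry a different si_code on Linux (SI_TKILL), so this sketch would report it as not validated.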

Related

How to use SIGRTMAX and SIGRTMIN?

I know how real-time signals are defined in POSIX. However, I am curious how to use them, or rather how they are generated. For example, SIGSEGV is generated on an invalid memory reference, and SIGINT is generated on a keyboard interrupt such as Ctrl+C.
How are signals like SIGRTMAX and SIGRTMIN generated and used?
I did eventually figure out how to generate SIGRTMAX and SIGRTMIN signals.
In case someone comes across this in the future:
First, fill in a struct sigaction as required. Set sa_handler to the function that should handle the signal.
To install the handler, call sigaction() with the arguments described in the Linux manual.
Next, fill in a struct sigevent to specify which signal number to raise and how the notification is delivered.
That completes the setup. Now you need an event that generates the signal, such as a timer expiring.
Create a timer with timer_create(), which associates your sigevent with the new timer ID.
Then describe the expiration interval in a struct itimerspec and arm the timer with timer_settime().
Now the expiration is associated with the timer, the timer is associated with the sigevent, and the sigevent raises the signal that is handled by the function installed with sigaction().
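The sequence of steps above might look roughly like this in C. It is only a sketch under some assumptions: the helper names arm_rt_timer() and rt_hit_count() are made up for illustration, CLOCK_MONOTONIC is an arbitrary choice of clock, and the timer is one-shot. On older glibc versions the program may need to be linked with -lrt.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

static volatile sig_atomic_t rt_hits;   /* timer expirations seen so far */

static void on_rt_signal(int signo)
{
    (void)signo;
    rt_hits++;
}

/* Hypothetical helper: arms a one-shot timer that raises SIGRTMIN after
 * delay_ms milliseconds (must be > 0, since a zero value disarms the timer).
 * Returns 0 on success, -1 on error. */
int arm_rt_timer(long delay_ms, timer_t *out_timer)
{
    /* Step 1: install the handler with sigaction(). */
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_rt_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGRTMIN, &sa, NULL) == -1)
        return -1;

    /* Step 2: describe the notification in a struct sigevent. */
    struct sigevent sev;
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGRTMIN;

    /* Step 3: create the timer, which ties the sigevent to a timer ID. */
    if (timer_create(CLOCK_MONOTONIC, &sev, out_timer) == -1)
        return -1;

    /* Step 4: set the expiration and arm the timer. */
    struct itimerspec its;
    memset(&its, 0, sizeof its);
    its.it_value.tv_sec = delay_ms / 1000;
    its.it_value.tv_nsec = (delay_ms % 1000) * 1000000L;
    return timer_settime(*out_timer, 0, &its, NULL);
}

int rt_hit_count(void)
{
    return rt_hits;
}
```

A caller would typically arm the timer, then block in something like nanosleep() or pause() until the signal interrupts it, and finally release the timer with timer_delete().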

ERESTARTSYS from get_user_pages and pending fatal signal?

I'm testing some software and its driver on Linux, and the driver uses get_user_pages() in its internal functions.
At some point my driver receives an ERESTARTSYS error (-512) from get_user_pages(). According to the kernel code, this happens because "If we have a pending SIGKILL, don't keep faulting pages and potentially allocating memory." (a comment from the kernel's memory.c file).
How can I see who sent this SIGKILL and why? I looked in /var/log/kern.log but couldn't see anything there about any signal.
I don't believe you can for SIGKILL (others, yes), unless you are willing to patch the kernel to give you the signal information. In which case, you can inspect the si_code and si_pid values, per the docs: http://pubs.opengroup.org/onlinepubs/009696699/basedefs/signal.h.html
E.g., if your signal info is in siptr:
    /* si_code <= 0 means the signal was sent from user space */
    if ((siptr)->si_code <= 0) {
        printk(KERN_DEBUG "kill sent by process %d\n", (siptr)->si_pid);
    }
The if check isn't strictly necessary: it restricts the printk() to those signals raised by a kill(). If the kernel raised the signal, si_code would be greater than 0.
I had exactly the same problem, except that I got -ERESTARTSYS from sock_sendmsg() instead of from get_user_pages().
To debug it, I added a log message to __send_signal() in linux-3.2/kernel/signal.c.
To keep the message from flooding the kernel log, I compared t->comm against "myprogramname" with strncmp(), and only on a match logged t->comm, t->pid, current->comm and current->pid.
I also realized that this is not limited to SIGKILL: if any other signal is pending, the call returns -ERESTARTSYS as well.
So my next step was to find out who was sending my program a signal, and to add handlers for all signals (except SIGKILL, which cannot be handled).
Luckily for me, it was not SIGKILL.
Adding handlers may not help in your case, but the logging will identify who is sending the signal and why.
Handling the signals may help others who have a similar problem.

Is there an async-signal-safe way of reading a directory listing on Linux?

SUSv4 does not list opendir, readdir, closedir, etc. in its list of async-signal-safe functions.
Is there a safe way to read a directory listing from a signal handler?
e.g. is it possible to 'open' the directory and somehow slurp out the raw directory listing? If so what kind of data structure is returned by 'read'?
Or maybe on Linux there are certain system calls that are async-signal-safe even though SUSv4 / POSIX does not require it that could be used?
If you know in advance which directory you need to read, you could call opendir() outside the signal handler (opendir() calls malloc(), so you can't run it from within the handler) and keep the DIR* in a static variable somewhere. When your signal handler runs, you should be able to get away with calling readdir_r() on that handle as long as you can guarantee that only that one signal handler would use the DIR* handle at any moment. There is a lock field in the DIR that is taken by readdir() and readdir_r(), so if, say, you used the DIR* from two signal handlers, or you registered the same handler to handle multiple signals, you may end up with a deadlock due to the lock never being released by the interrupted handler.
A similar approach appears to also work to read a directory from a child process after calling fork() but before calling execve().
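The approach from the answer above could be sketched as follows. This is an illustration with caveats rather than a guaranteed-safe recipe: neither readdir() nor readdir_r() is on the POSIX list of async-signal-safe functions, so it relies on the same "only one handler touches the DIR*" assumption. The sketch uses readdir() because readdir_r() has been deprecated since glibc 2.24, and the helper names prepare_listing() and listing_entry_count() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <signal.h>
#include <string.h>

static DIR *listing_dir;                        /* opened outside the handler */
static volatile sig_atomic_t entry_count = -1;  /* -1 until the handler runs */

/* Handler: walks the already-open directory stream. It must be the only
 * code using listing_dir, otherwise the stream's internal lock could
 * deadlock. */
static void count_entries(int signo)
{
    (void)signo;
    int n = 0;
    while (readdir(listing_dir) != NULL)
        n++;
    entry_count = n;
}

/* Hypothetical helper: opens path (opendir() may call malloc(), so this
 * happens in normal context) and installs the handler for signo.
 * Returns 0 on success, -1 on error. */
int prepare_listing(const char *path, int signo)
{
    listing_dir = opendir(path);
    if (listing_dir == NULL)
        return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_entries;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

int listing_entry_count(void)
{
    return entry_count;
}
```

The handler here only counts entries. A real handler would more likely copy the d_name fields into a preallocated buffer, since allocating memory inside the handler is not an option.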

Why is POSIX::SigSet needed here?

#!/usr/bin/env perl
use POSIX;
my $sig_set = POSIX::SigSet->new(POSIX::SIGINT);
my $sig_act = POSIX::SigAction->new(sub { print "called\n"; exit 0 },$sig_set);
POSIX::sigaction(SIGINT,$sig_act);
sleep(15);
Why do I need to use POSIX::SigSet if I already tell POSIX::sigaction that I want SIGINT?
Basically, I'm trying to have my coderef respond to each of the signals I add to the SigSet. Looking at the POSIX::sigaction signature, it must accept a signal as its first parameter, which doesn't seem reasonable to me if I have already told POSIX::SigAction about my POSIX::SigSet.
I'm sure I am missing something here.
thanks,
The answer to your question
The POSIX::SigSet specifies additional signals to mask off (to block, so that they are held pending rather than delivered) during the execution of your signal handler sub. It corresponds to the sa_mask member of the underlying struct passed to the C version of sigaction.
Now, SIGINT (well, the first argument to sigaction) will be masked off by default, unless you explicitly request otherwise via the SA_NODEFER flag.
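For readers who want to see the underlying C struct, here is a small sketch of the same idea. It is an assumption-laden illustration: SIGTERM stands in for the "additional" signal in the set, and install_int_handler() is a hypothetical name. The handler simply records which signals are blocked while it runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler: 1 if the signal was blocked while it ran. */
static volatile sig_atomic_t int_blocked_in_handler = -1;
static volatile sig_atomic_t term_blocked_in_handler = -1;

static void on_int(int signo)
{
    (void)signo;
    sigset_t cur;
    /* A NULL new set means "query only": cur receives the current mask. */
    sigprocmask(SIG_BLOCK, NULL, &cur);
    int_blocked_in_handler = sigismember(&cur, SIGINT);
    term_blocked_in_handler = sigismember(&cur, SIGTERM);
}

/* Hypothetical helper: handles SIGINT and additionally blocks SIGTERM
 * for the duration of the handler via sa_mask. */
int install_int_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_int;
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGTERM);  /* the "additional" signal to block */
    /* SA_NODEFER is not set, so SIGINT itself is blocked automatically. */
    return sigaction(SIGINT, &sa, NULL);
}
```

After a SIGINT is delivered, both flags should read 1: SIGTERM because it was listed in sa_mask, and SIGINT because it is the signal being handled and SA_NODEFER was not requested.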
A better approach?
However, if all you want to do is register a signal handler whose execution won't be interrupted by the signal for which it was registered (e.g., don't allow SIGINT during your SIGINT handler), you can skip the POSIX module entirely:
$SIG{INT} = sub { print "called\n"; exit 0; }; # Won't be interrupted by SIGINT
Where it can, Perl's signal dispatching emulates the traditional UNIX semantics of blocking a signal during its handler execution. (And on Linux, it certainly can. sigprocmask() is called before executing the handler, and then a scope-guard function is registered to re-allow that signal at the end of the user-supplied sub.)

Does `epoll_wait` signify which event was triggered when both EPOLLIN and EPOLLOUT were added?

Suppose I am specifying both EPOLLIN and EPOLLOUT flags when adding descriptors to monitor with epoll_wait. From the 'epoll' manpages it is unclear what exactly each of the epoll_event structures returned as part of the array carries in its events field. Quoting:
the events member will contain the returned event bit field.
Does it mean that it is impossible to distinguish whether an event was triggered signifying 'can-write' as opposed to 'can-read'? Basically there is an event mask, and I would logically expect returned array to signify exactly what event(s) have 'happened' on a file descriptor?
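Your expectation is right. The events member will contain the event(s) that have occurred for that file descriptor. It is a bit mask, so test it with events & EPOLLIN and events & EPOLLOUT; both bits may be set in the same entry.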
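A small sketch may make this concrete. It assumes a helper named poll_ready_events() (hypothetical, not part of the epoll API) that registers one descriptor for both EPOLLIN and EPOLLOUT and returns whatever bits epoll_wait() reports.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: watches fd for EPOLLIN | EPOLLOUT and waits up to
 * timeout_ms. Returns the reported event bits, 0 on timeout, -1 on error. */
long poll_ready_events(int fd, int timeout_ms)
{
    int ep = epoll_create1(0);
    if (ep == -1)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN | EPOLLOUT, .data.fd = fd };
    long result = -1;

    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) == 0) {
        struct epoll_event out;
        int n = epoll_wait(ep, &out, 1, timeout_ms);
        if (n == 0)
            result = 0;                  /* timed out, nothing ready */
        else if (n == 1)
            result = (long)out.events;   /* bit mask of what is ready */
    }

    close(ep);
    return result;
}
```

With a freshly created socketpair(), the returned mask for one end should contain EPOLLOUT but not EPOLLIN. Once the peer has written a byte, the mask for the same end should contain both bits, which the caller can tell apart with the & operator.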

Resources