Erlang Linux signal handling - linux

Is it possible to trap Linux signals (e.g. SIGUSR1) through an handler in Erlang? (without having to resort to a driver crafted in C)

(NOT A REAL ANSWER)
In 2001 someone asked:
Does anyone have any examples of unix
signal handling in erlang. I would
like to make a loadbalancer that I
have written respond to sighup.
At that time the answer was:
There is no provision for handling
signals in Erlang "itself", i.e. you
will need to use a driver - or a port
program of course, might actually be a
better idea. Also for the driver case,
the emulator has its own handler for a
number of signals, and interfering
with that will probably have
"interesting" results - but SIGHUP
should be OK I believe.
SOURCE: http://www.erlang.org/pipermail/erlang-questions/2001-October/003752.html
As far as I know, nothing changed since then. But this is extremely interesting. If anyone has any news about this, please let us know :)

This is possible since Erlang/OTP 20.0, released in June 2017. It was done through this pull request that adds an event manager for signals called erl_signal_server. See the "OS Signal Event Handler" section in the kernel manual page.
If you're interested in SIGUSR1, note that the default handler will make the Erlang VM halt and produce a crash dump. To avoid that, it's not enough to add your own handler to erl_signal_server; you have to swap the default handler for it:
gen_event:swap_handler(erl_signal_server, {erl_signal_handler, []}, {foo, []}).

Related

Last command during poweroff

I am writing some software to shutdown some external hardware wired into my control board. The catch is that I need to wait for the VERY end of the poweroff operation to send the signal (through a gpio output). I am weighing some options right now, but I am curious as to where I can see what the kernel actually does right before poweroff.
Is there a file somewhere that I can look into?
Start at the function kernel_power_off in kernel/reboot.c and follow the code. The final power-off operations are very platform specific, so if you want to follow down to the bitter end, you'd need to figure out exactly which bits of arch-specific code you're using.
One simpler possibility for sending your signals is to register a kmsg_dump handler. The last thing kernel_power_off does before invoking the platform-specific power-off code is to execute kmsg_dump(KMSG_DUMP_POWEROFF);. (Just ignore any kmsg_dump calls other than that one.)

Open MPI Virtual Timer Expired

I'm using Open MPI 1.8 on Gentoo 3.13 to manage the data transfer from one program to another via a server/client concept. Both the server and the clients are launched via mpiexec as separate processes. After some days (this is quite a heavy computation...), I sometimes receive the error
mpiexec noticed that process rank 0 with PID 17213 on node XXX exited on signal 26 (Virtual timer expired).
Unfortunately, the error is not reproducible in a reliable way, i.e., the error does not appear always and not always at the same point in the program flow. I also experienced this error on other machines. I already tracked the issue down to the ITIMER_VIRTUAL which, upon expiration, delivers SIGVTALRM (see, e.g., http://man7.org/linux/man-pages/man2/setitimer.2.html). In the BUGS section of the man page, it says that
Under very heavy loading, an ITIMER_REAL timer may expire before the signal from a previous expiration has been delivered. The second signal in such an event will be lost.
I wonder if something similar might also hold for ITIMER_VIRTUAL? Did anyone experience similar problems and can confirm the error?
The only workaround I can think of is to invoke setitimer(...) and try to manipulate the timer myself. However, I hope there is another way since I can't always modify the clients' source code. Any suggestions?
Since this question has not been answered officially, I will do it on behalf of Hristo (#HristoIliev: I hope this is ok for you). As was pointed out in the first comment to my question, there is not a single hint in the Open MPI source code which can have caused the virtual timer expiration. Indeed, the timer problem was related to a third-party library which made the code crash after an unpredictable time (depending on the current loading of the machine).

Producer/Consumer in the kernel space - Linux

I would like to have one thread to queue some requests in a request queue and another to serve these requests. The producer should wake up the consumer when there is a new request queued.
Is there anyone who has done this already or knows how to do it?
I have tried several tutorials on the internet and none of them really worked cleanly. They either miss a request, cause a system lockup/instability, or they just do not terminate.
Note: My question in essence is similar to this one. However, I wont be specific like the one who asked that question. Anyone who can/willing to help can just throw his two cents and may be we can work something out.
Thanks!
You can use Work Queues. Work Queues are simple, once you set up up your work queue, you use something like the following:
DECLARE_WORK(name, void (*function)(void *), void *data);
Your function call will be scheduled and called later, take a look at this article.
I also highly recommend you this book: Linux Device Drivers
edit: I just saw you already linked an SO post where they use work queues. Have you tried it out? You run into some issues? I suggest you start with an really simple example, just to try out if it's working. Implement your core functionality later.
Update:
From the official Documentation:
Some users depend on the strict execution ordering of ST wq. The
combination of #max_active of 1 and WQ_UNBOUND is used to achieve this
behavior. Work items on such wq are always queued to the unbound
worker-pools and only one work item can be active at any given time
thus achieving the same ordering property as ST wq.
That way you will have a guaranteed FIFO execution of your workers. But be aware that the work may be executed on different CPUs. You have to use memory barriers to ensure visibility (eg. wmb()).
Update:
As #user2009594 mentioned, a single threaded wq can be created using the following macro defined in linux/workqueue.h:
#define create_singlethread_workqueue(name) \
alloc_workqueue("%s", WQ_UNBOUND | WQ_MEM_RECLAIM, 1, (name)))
Multicast Netlink sockets can work here greatly. Recently I did the same; only difference was that my consumer was in kernel while producers in user space: same can be used in kernel only space.

Getting a backtrace of other thread

In Linux, to get a backtrace you can use backtrace() library call, but it only returns backtrace of current thread. Is there any way to get a backtrace of some other thread, assuming I know it's TID (or pthread_t) and I can guarantee it sleeps?
It seems that libunwind (http://www.nongnu.org/libunwind/) project can help. The problem is that it is not supported by CentOS, so I prefer not to use it.
Any other ideas?
Thanks.
I implemented that myself here.
Initially, I wanted to implement something similar as suggested here, i.e. getting somehow the top frame pointer of the thread and unwinding it manually (the linked source is derived from Apples backtrace implementation, thus might be Apple-specific, but the idea is generic).
However, to have that safe (and the source above is not and may even be broken anyway), you must suspend the thread while you access its stack. I searched around for different ways to suspend a thread and found this, this and this. Basically, there is no really good way. The common hack, also used by the Hotspot JAVA VM, is to use signals and sending a custom signal to your thread via pthread_kill.
So, as I would need such signal-hack anyway, I can have it a bit simpler and just use backtrace inside the called signal handler which is executed in the target thread (as also suggested here by sandeep). This is basically what my implementation is doing.
If you are also interested in printing the backtrace, i.e. get some useful debugging information (function name, source code filename, source code line number, ...), read here about an extended backtrace_symbols based on libbfd. Or just see the source here.
Signal Handling with the help of backtrace can solve your purpose.
I mean if you have a PID of the Thread, you can raise a signal for that thread. and in the handler you can use the backtrace. since the handler would be executing in that partucular thread, the backtrace there would be the output what you are needed.
gdb provides these facilities for debugging multi-thread programs:
automatic notification of new threads
‘thread thread-id’, a command to switch among threads
‘info threads’, a command to inquire about existing threads
‘thread apply [thread-id-list] [all] args’, a command to apply a command to a list of threads
thread-specific breakpoints
‘set print thread-events’, which controls printing of messages on thread start and exit.
‘set libthread-db-search-path path’, which lets the user specify which libthread_db to use if the default choice isn't compatible with the program.
So just goto required thread in GDB by cmd: 'thread thread-id'.
Then do 'bt' in that thread context to print the thread backtrace.

How can I register a call-back on suspend in a linux driver?

I'm writing a linux driver and I would like to register a callback function to be invoked when the system goes to sleep. What is the api to do this?
Thanks.
It depends on what kind of driver you have. For example, if you have a driver that is registered with platform_device_register(), then the struct platform_driver includes a .suspend member for the device's suspend callback. With PCI devices, the struct pci_driver that you pass to pci_register_driver() similarly includes a .suspend member.
Most device classes should provide a similar mechanism.
I'm pretty sure you want acpi_install_fixed_event_handler(), found in acpi/acpi.h , the generic events found in acpi/actypes.h (which is included from acpi.h).
The second argument to acpi_install_fixed_event() wants a handler of type u32, with the last argument being a void *context. What I could not find is a list of possibilities that *context might be. However, it looks like you just something to be entered on events, which means you probably don't care about the context. Not quite a callback, but the same result.
If you register a fixed handler for (say, ACPI_EVENT_POWER_BUTTON or ACPI_EVENT_SLEEP_BUTTON), your handler should be entered on the respective event. I'm not 100% sure ACPI_EVENT_SLEEP_BUTTON is what you want, i.e. I can't quite tell if its the same event as the system going to sleep on its own. Testing and further investigation is, of course, and exercise for the reader.
An example of using it can be found in drivers/rtc/rtc-cmos.c.
Please take care to wrap any code from acpi.h in
#ifdef CONFIG_ACPI
....
#endif /* CONFIG_ACPI */
I could be completely wrong here, I have not actually needed to do this for any of the drivers that I've written. The above is the result of about 30 minutes of digging through the source of 2.6.32.8 , which may be completely different than the kernel you are working with.
Please leave a comment if I am way off base :) I think this is what you are looking for.
Additional
As for licensing, its exported:
drivers/acpi/acpica/evxface.c:ACPI_EXPORT_SYMBOL(acpi_install_fixed_event_handler)
Not
*_EXPORT_SYMBOL_GPL()
... So you should have no issues using it, whatever you happen to be doing.
Finally, this is a perfectly good question that would probably be met with a good reception on the Linux Kernel mailing list. If in doubt, ask there. Even if this 'just works', its a good idea to confirm it.
The solution I settled on was using notifier chains. On later versions of the kernel you can register with register_pm_notifier. If your kernel doesn't support that api you can use the notifier for cpu hot-plug events (this appears to be what KVM uses). On the way in and out of suspend the cpu hotplug notifier chain fires.
The ACPI Howto probably will give you a good headstart...

Resources