I would like to write a Linux kernel module that can monitor all the memory accesses made by a particular process (which I specify by name in the kernel module). I would also like to keep track of all the signals generated by the process, and to log all memory accesses that result in page faults or that cause a TRAP or a SEGV. How could I go about doing this? Could you point me towards any resources that could get me started?
Well if you have never written a kernel module before this might be a great start:
https://web.archive.org/web/20180901094541/http://www.freesoftwaremagazine.com/articles/drivers_linux?page=0%2C2
From there you basically want to grab process information and output it, perhaps by creating some kind of /proc entry.
But you should know this isn't really something you need kernel mode for. You could probably do this easily right from user space.
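As a rough illustration of the user-space route for the signal part of the question, here is a minimal sketch built on ptrace(2). It forks a child, lets the child ask to be traced, and records every signal the tracer sees before passing it on. The names trace_signals and demo_child are made up for this example, and it does not cover the page-fault logging part of the question.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run child_fn in a traced child and record each signal sent to it.
 * Returns the number of signals stored in seen[], or -1 on error.
 * (Hypothetical helper, not part of any library.) */
int trace_signals(void (*child_fn)(void), int *seen, int max_seen)
{
    assert(child_fn != NULL && seen != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: ask to be traced, then stop so the parent can attach. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGSTOP);
        child_fn();
        _exit(0);
    }

    int n = 0, synced = 0, status;
    for (;;) {
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;                      /* child is gone */

        int sig = WSTOPSIG(status);
        int deliver = sig;              /* pass the signal on by default */
        if (!synced && sig == SIGSTOP) {
            synced = 1;                 /* the child's own start-up stop */
            deliver = 0;
        } else {
            fprintf(stderr, "pid %d: signal %d\n", (int)pid, sig);
            if (n < max_seen)
                seen[n++] = sig;
        }
        if (ptrace(PTRACE_CONT, pid, NULL, (void *)(long)deliver) < 0)
            return -1;
    }
    return n;
}

/* Stand-in workload for the example: sends itself one signal. */
void demo_child(void)
{
    raise(SIGUSR1);
}
```

A SIGSEGV or SIGTRAP raised by the child would show up in the same loop, since the tracer is notified of each signal before it is delivered.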
I am running embedded Linux 3.2.6 on an ARM processor. I am using a modified version of Atmel's serial driver to control the 4 USART ports on my device. When I use the driver compiled into the kernel, all works fine. But I want to run the driver as a kernel module instead. I made all of the necessary changes and disabled the internal driver, and everything seems fine. The 4 tty devices are registered successfully and I can see that all of my probe and initialization functions work correctly.
So here's the problem:
When I try to write to any of the devices, my "start transmit" function gets called but then waits for an interrupt from the USART which never occurs. So the write just hangs, and using a logic analyzer I can see that RTS gets asserted but no bytes show up on the TX line. I know that my call to request_irq succeeds, and yet I never see any of the IRQ entries in /proc/interrupts. In the driver, I have also tried using request_irq to register a separate interrupt handler for a GPIO line, and this works fine.
I know that this is a problem that is probably hard to diagnose, but I am looking for ANY possible suggestions that could lead me in the right direction to finding a solution. Let me know if you need any clarifications. Thank you
The symptoms read like a peripheral clock that has not been enabled (or has been turned off): the device can be initialized without errors and an I/O operation can be set up, but the device doesn't do anything; it plays dead. Since no I/O ever starts, you're never going to get an interrupt indicating completion!
The other thing to check is the conditional compilation directives for the HW configuration structures in your arch/arm/mach-xxx/zzz_devices.c file.
Make sure that the serial port structures have something like:
#if defined(CONFIG_SERIAL_ATMEL) || defined(CONFIG_SERIAL_ATMEL_MODULE)
and not just
#if defined(CONFIG_SERIAL_ATMEL)
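To see why the longer form matters: when a tristate option is set to "m", the generated configuration header defines the _MODULE variant of the symbol, not the plain one. The user-space sketch below simulates that by defining the macro by hand (in a real kernel build it comes from Kconfig, not from the source file); the two function names are invented for illustration.

```c
#include <assert.h>

/* Simulate the generated config header for an option built as a module.
 * Assumption for this sketch only: in a real build Kconfig emits this. */
#define CONFIG_SERIAL_ATMEL_MODULE 1

/* Guard that only matches a built-in driver. */
int usart_data_compiled_builtin_only(void)
{
#if defined(CONFIG_SERIAL_ATMEL)
    return 1;
#else
    return 0;   /* platform data silently left out for a module build */
#endif
}

/* Guard that matches both a built-in and a modular driver. */
int usart_data_compiled_either(void)
{
#if defined(CONFIG_SERIAL_ATMEL) || defined(CONFIG_SERIAL_ATMEL_MODULE)
    return 1;
#else
    return 0;
#endif
}
```

With only the _MODULE symbol defined, the first guard drops the block and the second keeps it, which is exactly the silent failure described above. Newer kernels also offer the IS_ENABLED() macro for this check.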
Addendum
I could be wrong but the clock shouldn't have any effect on the CTS pin causing an interrupt, right?
Not right.
These digital circuits are synchronous state machines: without a clock, a change-of-state by an input cannot be processed.
Also, SoCs and modern uControllers use the peripheral clocks as on/off switches for those integrated peripherals. There is often way more functionality, i.e. peripherals, on the silicon chip than can actually be used, mostly due to insufficient quantity of pins to the board. So disabling the clocks to unused devices is employed to reduce power consumption.
You are far too focused on interrupts.
You do not have a solvable interrupt problem; those are secondary failures.
The lack of output when attempting to transmit is far more significant and revealing.
The root cause is probably a flawed configuration of the USART devices, since transmitting bits is an automatic operation for a configured & operational USART.
If the difference between not-working versus working is loadable module versus static linking, then the root cause is going to be something fundamental (and trivial) like my two suggestions.
Also your lack of acknowledgement regarding the #if defined(), e.g. you didn't respond with "Oh yeah, we already knew that", raises a gigantic red flag that says "Fix me first!"
Addendum 2
I'm tempted to delete this answer after discovering that the Atmel serial driver cannot be configured/built as a loadable module using make menuconfig (which is the premise for half of the answer). (Of course the Kconfig file could be hacked to make the config variable tristate instead of boolean to overcome the module restriction.) I've left a comment for the OP. But I also wanted to preserve the comment to Mr. Stratton pointing out how symbols in the .config file are (not) used.
So I did finally fix my problem. Thank you for the responses; none of them directly solved my problem, but they did prompt further examination of my code. After some trial and error I finally got it working. I had originally moved the platform_device structures for each USART from /mach-at91/xxx_devices.c to my loadable module. For some reason the structures weren't getting the correct data to map to the hardware, I suppose because the symbols from the kernel weren't being linked correctly (I never got an error message, though), and so some of the registration functions weren't even getting called. I ended up moving the structures and platform_device_register calls back into the devices file. I also decided to keep the driver for the console built-in, using the original atmel_serial.c driver. I had to change the platform_device name for the console in both the devices file and the built-in atmel_serial.c file so that it would not conflict with my USART ports driver. I found that changing the platform_device and platform_driver name for the USARTs to anything but "atmel_usart" resulted in USART transmission failing. I really don't understand why, but I'm just leaving it as atmel_usart so it works.
Thanks again to everybody who responded to my problem.
I have an embedded board with a kernel module of several thousand lines which freezes in random, complex use cases and after a random amount of time. What options do I have for debugging it?
I have already tried the magic SysRq key, but it does not work. I guess the explanation is that I am stuck in a loop or a deadlock in code where hardware interrupts are disabled?
Thanks,
Eva.
Typically, embedded boards have a watchdog. You should enable this timer and use the watchdog user process to kick the watchdog hardware. Use nice on the watchdog process so that higher priority tasks must relinquish the CPU. This gives clues as to the issue. If the device does not reset with a watchdog active, then it may be that only the network or serial port has stopped communicating, i.e. the kernel has not locked up; there is simply no user-visible activity. The watchdog is also useful if/when this type of issue occurs in the field.
For a kernel lockup case, the kernel's lockup watchdog features may be useful. This will work if you have an infinite loop/deadlock as speculated. However, if this is custom hardware, it is also possible that SDRAM or a peripheral device latches up and causes abnormal bus activity. This will stop the CPU from fetching proper code; obviously, it is tough for Linux to recover from this.
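A minimal sketch of such a user process follows. It assumes the standard Linux watchdog character device (usually /dev/watchdog), where any write restarts the countdown; the function names, the interval and the bounded loop are choices made for this example, and an existing watchdog daemon may serve just as well.

```c
#include <assert.h>
#include <unistd.h>

/* Restart the watchdog countdown: any write to the device counts as a kick.
 * fd is expected to come from open("/dev/watchdog", O_WRONLY). */
int watchdog_kick(int fd)
{
    assert(fd >= 0);
    return write(fd, "\0", 1) == 1 ? 0 : -1;
}

/* Kick the watchdog `count` times, sleeping interval_s seconds in between.
 * A real daemon would loop forever; the bound keeps the sketch testable. */
int watchdog_feed(int fd, unsigned interval_s, unsigned count)
{
    for (unsigned i = 0; i < count; i++) {
        if (watchdog_kick(fd) < 0)
            return -1;
        sleep(interval_s);
    }
    return 0;
}
```

Run at a low priority (for example under nice), this process stops kicking as soon as something else monopolises the CPU, and the hardware then resets the board.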
You can combine the watchdog with some fallow memory that is used as a trace buffer. memmap= and mem= can limit the memory used by the kernel. A driver/device using this memory can be written that saves trace points that survive a reboot. The fallow memory's ring buffer is dumped when a watchdog reset is detected on kernel boot.
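The trace buffer idea can be sketched in plain C as below. In a real driver the structure would live in the memory carved out with mem= or memmap= and be mapped by the driver; here an ordinary struct stands in for that region, and all names and sizes are assumptions for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TRACE_MAGIC   0x54524143u   /* arbitrary marker for "buffer is valid" */
#define TRACE_ENTRIES 64u

struct trace_buf {
    uint32_t magic;                 /* set once the buffer has been initialised */
    uint32_t next;                  /* total number of entries written so far */
    uint32_t entry[TRACE_ENTRIES];  /* trace point ids; oldest is overwritten */
};

/* Attach to the reserved region at boot. Returns 1 if it still holds a trace
 * from before the reset (dump it), 0 if it had to be initialised afresh. */
int trace_attach(struct trace_buf *tb)
{
    assert(tb != NULL);
    if (tb->magic == TRACE_MAGIC)
        return 1;
    memset(tb, 0, sizeof(*tb));
    tb->magic = TRACE_MAGIC;
    return 0;
}

/* Record one trace point. (The counter wrapping at 2^32 is not handled.) */
void trace_point(struct trace_buf *tb, uint32_t id)
{
    tb->entry[tb->next % TRACE_ENTRIES] = id;
    tb->next++;
}

/* Copy the surviving entries, oldest first, into out[]; returns the count. */
uint32_t trace_dump(const struct trace_buf *tb, uint32_t *out, uint32_t max)
{
    uint32_t count = tb->next < TRACE_ENTRIES ? tb->next : TRACE_ENTRIES;
    uint32_t n = 0;

    for (uint32_t i = tb->next - count; i != tb->next && n < max; i++)
        out[n++] = tb->entry[i % TRACE_ENTRIES];
    return n;
}
```

On the boot after a watchdog reset, trace_attach() finding the magic value intact is the cue to dump the buffer before tracing resumes.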
It is also useful to register thread notifiers that can do a printk on context switches, either when the issue is repeatable or to discover how to make the event repeatable. Once you determine a sequence of events that leads to the lockup, you can use a scope or logic analyzer to do some final diagnosis. Or it may be evident at this point which peripheral is the issue.
You may also set panic=-1 and reboot=... on the kernel command line. The kdump facilities are useful, if you only have a code problem.
Related: kernel trap (at web archive). This link may no longer be available, but it isn't important to this answer.
I would like to dynamically allocate memory from the machine_init function in my ARM Linux kernel. However, my tests indicate that calling kmalloc sometimes results in a complete failure of the system to boot.
My debugging tools are very limited so I can't give much more information regarding the failure.
Simply put, is it legal to call kmalloc from a machine_init function in ARM Linux, and, if not, is there an alternative?
I understand that in most cases it is wrong-headed to be allocating memory this early in the boot process (this kind of work should be done by the device drivers); however, I am convinced that my particular project requires it.
I can't see where machine_init is called from, but I can't help thinking you're trying to do the wrong thing.
Device drivers and other subsystems have their own init time; trying to do things very early on is usually a mistake (because something required isn't started yet). You can definitely call kmalloc during the initialisation of a device driver (at least, most of them; maybe the console driver is different).
In any case, the fact that you're on ARM suggests that it's an embedded system, so you're unlikely to have to deal with a lot of different hardware. Can't you just statically allocate an array with as many elements as could possibly be required (and give an error if it is exceeded)?
kmalloc is a kernel API on top of the slab/slob/slub memory framework. Once whichever of these frameworks the kernel is using has been initialized, kmalloc works fine. Make sure your call comes after the slab/slob/slub initialization.
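The static allocation suggested here might look like the following sketch. The structure, the limit of 8 and the function name are placeholders chosen for the example, not anything from the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BOARD_DEVICES 8     /* hypothetical worst case for this board */

struct board_device {
    int id;
    unsigned long base;
};

/* Storage is reserved at link time, so no allocator needs to be running. */
static struct board_device device_pool[MAX_BOARD_DEVICES];
static size_t device_count;

/* Hand out the next free slot, or NULL once the pool is exhausted so the
 * caller can report the error instead of overrunning the array. */
struct board_device *board_device_alloc(void)
{
    if (device_count >= MAX_BOARD_DEVICES)
        return NULL;
    return &device_pool[device_count++];
}
```

Because the array lives in the kernel image's data section, this works at any point during boot, including before the slab allocator is up.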
cheers
Is it possible to 'hibernate' a process in linux?
Just like 'hibernate' on a laptop, I would like to write all the memory used by a process to disk and free up the RAM. Then later on, I can 'resume the process', i.e. read all the data back from disk, put it back into RAM, and continue with my process?
I used to maintain CryoPID, which is a program that does exactly what you are talking about. It writes the contents of a program's address space, VDSO, file descriptor references and states to a file that can later be reconstructed. CryoPID started when there were no usable hooks in Linux itself and worked entirely from userspace (actually, it still does work, depending on your distro / kernel / security settings).
Problems were (indeed) sockets, pending RT signals, numerous X11 issues, and the glibc caching getpid() implementation, amongst many others. Randomization (especially VDSO) turned out to be insurmountable for the few of us working on it after Bernard walked away from it. However, it was fun and became the topic of several master's theses.
If you are just contemplating a program that can save its running state and restart directly into that state, it's far, far easier to just save that information from within the program itself, perhaps when servicing a signal.
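A minimal sketch of that approach: the signal handler only sets a flag, and the main loop writes the state out at a safe point. The app_state structure, the function names and the raw binary file format are assumptions made for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <stdio.h>

/* Hypothetical application state; a real program has its own. */
struct app_state {
    long iteration;
    double accumulator;
};

static volatile sig_atomic_t checkpoint_requested;

/* Keep the handler trivial: just note that a checkpoint was asked for. */
static void on_checkpoint_signal(int signo)
{
    (void)signo;
    checkpoint_requested = 1;
}

int install_checkpoint_handler(int signo)
{
    struct sigaction sa = {0};
    sa.sa_handler = on_checkpoint_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Call from the main loop. Returns 1 if a checkpoint was written,
 * 0 if none was requested, -1 on I/O error. */
int checkpoint_if_requested(const struct app_state *st, const char *path)
{
    assert(st != NULL && path != NULL);
    if (!checkpoint_requested)
        return 0;
    checkpoint_requested = 0;

    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    size_t written = fwrite(st, sizeof(*st), 1, f);
    if (fclose(f) != 0 || written != 1)
        return -1;
    return 1;
}

/* Load a previously saved state at start-up. Returns 0 on success. */
int restore_state(struct app_state *st, const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    size_t got = fread(st, sizeof(*st), 1, f);
    fclose(f);
    return got == 1 ? 0 : -1;
}
```

Doing the file I/O outside the handler avoids calling functions that are not async-signal-safe, at the cost of checkpointing only when the main loop next checks the flag.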
I'd like to put a status update here, as of 2014.
The accepted answer suggests CryoPID as a tool to perform Checkpoint/Restore, but I found the project to be unmaintained and impossible to compile with recent kernels.
Now, I found two actively maintained projects providing the application checkpointing feature.
The first, the one I suggest 'cause I have had better luck running it, is CRIU,
which performs checkpoint/restore mainly in userspace and requires the kernel option CONFIG_CHECKPOINT_RESTORE to be enabled.
Checkpoint/Restore In Userspace, or CRIU (pronounced kree-oo, IPA: /krɪʊ/, Russian: криу), is a software tool for Linux operating system. Using this tool, you can freeze a running application (or part of it) and checkpoint it to a hard drive as a collection of files. You can then use the files to restore and run the application from the point it was frozen at. The distinctive feature of the CRIU project is that it is mainly implemented in user space.
The latter is DMTCP; quoting from their main page:
DMTCP (Distributed MultiThreaded Checkpointing) is a tool to transparently checkpoint the state of multiple simultaneous applications, including multi-threaded and distributed applications. It operates directly on the user binary executable, without any Linux kernel modules or other kernel modifications.
There is also a nice Wikipedia page on the topic: Application_checkpointing
The answers mentioning ctrl-z are really talking about stopping the process with a signal, in this case SIGTSTP. You can issue a stop signal with kill:
kill -STOP <pid>
That will suspend execution of the process. It won't immediately free the memory used by it, but as memory is required for other processes the memory used by the stopped process will be gradually swapped out.
When you want to wake it up again, use
kill -CONT <pid>
The more complicated solutions, like CryoPID, are really only needed if you want the stopped process to be able to survive a system shutdown/restart - it doesn't sound like you need that.
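The same stop/continue pair can be issued from a program with kill(2). The sketch below is only an illustration: it works on a child of the calling process so that waitpid() can confirm each state change, and both function names are invented for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stop a child process, confirm it stopped, then resume it.
 * Returns 0 if both transitions were observed, -1 otherwise. */
int suspend_then_resume(pid_t child)
{
    int status;

    assert(child > 0);
    if (kill(child, SIGSTOP) < 0)                       /* kill -STOP <pid> */
        return -1;
    if (waitpid(child, &status, WUNTRACED) < 0 || !WIFSTOPPED(status))
        return -1;

    if (kill(child, SIGCONT) < 0)                       /* kill -CONT <pid> */
        return -1;
    if (waitpid(child, &status, WCONTINUED) < 0 || !WIFCONTINUED(status))
        return -1;
    return 0;
}

/* Example workload: a child that just sleeps until it is signalled. */
pid_t spawn_idle_child(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        for (;;)
            pause();
    }
    return pid;
}
```

For an unrelated process, only the two kill() calls apply; waitpid() reports state changes for children only.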
The Linux kernel has now partially implemented the checkpoint/restart features: https://ckpt.wiki.kernel.org/, the status is here.
Some useful information is on LWN (Linux Weekly News):
http://lwn.net/Articles/375855/ http://lwn.net/Articles/412749/ ......
So the answer is "YES"
The issue is restoring the streams - files and sockets - that the program has open.
When your whole OS hibernates, the local files and such can obviously be restored. Network connections can't, but code that accesses the internet typically does more error checking and survives the error conditions (or ought to).
If you did per-program hibernation (without application support), how would you handle open files? What if another process accesses those files in the interim? etc?
Maintaining state when the program is not loaded is going to be difficult.
Simply suspending the threads and letting it get swapped to disk would have much the same effect?
Or run the program in a virtual machine and let the VM handle suspension.
Short answer is "yes, but not always reliably". Check out CryoPID:
http://cryopid.berlios.de/
Open files will indeed be the most common problem. CryoPID states explicitly:
Open files and offsets are restored. Temporary files that have been unlinked and are not accessible on the filesystem are always saved in the image. Other files that do not exist on resume are not yet restored. Support for saving file contents for such situations is planned.
The same issues will also affect TCP connections, though CryoPID supports tcpcp for connection resuming.
I extended Cryopid producing a package called Cryopid2 available from SourceForge. This can
migrate a process as well as hibernating it (along with any open files and sockets - data
in sockets/pipes is sucked into the process on hibernation and spat back into these when
process is restarted).
The reason I have not been active with this project is I am not a kernel developer - both
this (and/or the original cryopid) need to get someone on board who can get them running
with the latest kernels (e.g. Linux 3.x).
The Cryopid method does work - and is probably the best solution to general purpose process
hibernation/migration in Linux I have come across.
The short answer is "yes." You might start by looking at this for some ideas: ELF executable reconstruction from a core image (http://vx.netlux.org/lib/vsc03.html)
As others have noted, it's difficult for the OS to provide this functionality, because the application needs to have some error checking builtin to handle broken streams.
However, on a side note, some programming languages and tools that use virtual machines explicitly support this functionality, such as the Self programming language.
This is sort of the ultimate goal of a clustered operating system. Matthew Dillon has put a lot of effort into implementing something like this in his DragonFly BSD project.
Adding another workaround: you can use VirtualBox. Run your applications in a regular virtual machine and simply "save the machine state" whenever you want.
I know this is not an answer, but I thought it could be useful when there are no real options.
If for any reason you don't like VirtualBox, VMware and QEMU are just as good.
Ctrl-Z increases the chances the process's pages will be swapped, but it doesn't free the process's resources completely. The problem with freeing a process's resources completely is that things like file handles, sockets are kernel resources the process gets to use, but doesn't know how to persist on its own. So Ctrl-Z is as good as it gets.
There was some research on checkpoint/restore for Linux back in the 2.2 and 2.4 days, but it never made it past prototype. It is possible (with the caveats described in the other answers) for certain values of possible: if you can write a kernel module to do it, it is possible. But for the common value of possible (can I do it from the shell on a commercial Linux distribution), it is not yet possible.
There's Ctrl+Z in Linux, but I'm not sure it offers the features you specified. I suspect you asked this question because it doesn't.