What methods do you use when your Linux kernel code goes wrong? - linux

I have not found a good method for writing and testing Linux kernel code, such as multiple kernel-level threads or other general kernel modules. What methods do you use? Thanks in advance!

printk, printk and more printk.
Use dmesg to view the output. Sometimes it helps to crash the kernel deliberately to get the crash info, which you can then decode.
dump_stack() will print a stack trace to the kernel log, where dmesg will show it.
As a last option, kgdb. But this requires a connection to another system and is always a pain to get working.

Related

How to send a simple message from kernel to user space?

I have a very simple (I think) problem.
I have a very simple kernel module that handles an interrupt coming from my hardware (it's all described in my device tree). I get the interrupt in the kernel. Now I want to send a message (just 64 bits, two uint32_t) to a program in user space. It would also be OK if I could "wake up" my program (there are several threads in it, so one thread could sleep until it is woken by the kernel module).
My problem is: what is the easiest and clearest solution? I have read about netlink and about using the proc filesystem, but:
either I cannot find any clear examples,
or the messaging only goes from user space to kernel space,
or the examples are outdated for the kernel I use (4.4).
Does anybody have a very clear example or a how-to for this?
P.S. I don't want to handle everything that follows the interrupt in kernel space. It's OK if some messages get lost.

Linux how to debug OS freeze issue

I am working on a kernel module and a user-space application to test that module.
The problem is that during testing my system hangs/freezes.
I have placed lots of debug prints in the code.
The last message printed is just before the Linux select call in my user-space application. Does select somehow freeze the system?
So how can I find out where the problem is: in the user-space application or in the kernel module?
As n.m mentioned, your userspace program can't freeze Linux, so it's an error in your kernel module. The best way to debug this is to use a kernel debugger and figure out what your module is doing wrong.
Common errors are uninitialized pointers that your module passes to the kernel, and locking issues, so take a close look at those.
A userspace program cannot, by definition, freeze Linux. There's a bug in the kernel.

pinning a pthread to a single core

I am trying to measure the performance of some library calls. My primary measurement tool is the rdtsc call. After doing some reading I realize that I need to disable preemption and interrupts in order to get the most accurate readings. Can someone help me figure out how to do these? I know that pthreads have a 'set affinity' mechanism. Is that enough to get the job done?
I also read somewhere that I can make calls into the kernel of the sort
preempt_disable()
raw_local_irq_save(...)
Is there any benefit to using one approach over the other? I tried the latter approach and got this error.
error: 'preempt_disable' was not declared in this scope
which can be fixed by including linux/preempt.h but the compiler still complains.
linux/preempt.h: No such file or directory
Obviously I have not done any kernel hacking, and I could not find this file anywhere on my system. I am really hoping I won't have to install a new Linux kernel. :)
Thanks for your input.
Pinning a pthread to a single CPU can be done using pthread_setaffinity_np
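A minimal sketch of that call, assuming Linux with glibc (the `_np` suffix means non-portable). The helper names `first_allowed_cpu`, `allowed_cpu_count` and `pin_current_thread` are made up for this example; only `pthread_getaffinity_np`, `pthread_setaffinity_np` and the `CPU_*` macros come from the library.

```c
#define _GNU_SOURCE   /* must precede all includes: enables the *_np calls and CPU_* macros */
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Lowest-numbered CPU the calling thread may currently run on, or -1 on error. */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Number of CPUs in the calling thread's affinity mask, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/* Pin the calling thread to a single CPU. Returns 0 on success, else an errno value. */
int pin_current_thread(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return EINVAL;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

Calling `pin_current_thread(first_allowed_cpu())` at the top of the measuring thread keeps it on one core (build with `-pthread`). This only stops migration between cores; it does not disable preemption or interrupts on that core.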
But what you want to achieve in the end is not so simple. I'll explain why.
preempt.h is part of the Linux kernel source. It's located here. You need to have the kernel sources available. In any case, you need to write a kernel module to use it; you cannot use it from user space. Learn how to write a kernel module here. The same goes for preempt_disable and the other interrupt-disabling kernel functions.
Now the point is, pthreads live in user space and your preemption-disabling function lives in kernel space. How do the two interact?
Either you write a new system call of your own that does the preemption and interrupt disabling, and call it from user space, or you resort to other kernel/user-space interfaces such as procfs, sysfs or ioctl.
But I am really skeptical as to how all these will help you to benchmark library functions. You may want to have a look at how performance is typically measured using rdtsc
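One common user-space pattern, sketched here as an assumption rather than as the method the linked material describes: time the call many times and keep the smallest reading, since a run disturbed by an interrupt or a context switch only ever makes the number larger. `read_ticks`, `sample_work` and `min_ticks` are hypothetical names; `__rdtsc()` is the GCC/Clang intrinsic from `<x86intrin.h>`, and the `clock_gettime` fallback for non-x86 builds is an addition of this sketch.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime in the non-x86 fallback */
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>            /* __rdtsc() on GCC and Clang */
static inline uint64_t read_ticks(void)
{
    return __rdtsc();             /* raw time-stamp counter, not serialized */
}
#else
/* Assumption of this sketch: without a TSC, count nanoseconds instead. */
static inline uint64_t read_ticks(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif

/* Hypothetical workload standing in for the library call under test. */
void sample_work(void)
{
    volatile unsigned sink = 0;

    for (unsigned i = 0; i < 1000; i++)
        sink += i;
}

/* Smallest tick count seen over `runs` calls of fn; UINT64_MAX if runs <= 0. */
uint64_t min_ticks(void (*fn)(void), int runs)
{
    uint64_t best = UINT64_MAX;

    for (int i = 0; i < runs; i++) {
        uint64_t start = read_ticks();
        fn();
        uint64_t elapsed = read_ticks() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

`min_ticks(sample_work, 1000)` then gives the best-case cost of one call, in TSC ticks on x86. Plain rdtsc is not a serializing instruction, so very short measurements should be treated with suspicion, and pinning the thread to one core first is still worthwhile.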

easy way to detect infinite loop in kernel of the linux

I've just spent two extra hours trying to find a bug in my modification of the Linux kernel. Every time I loaded the module into the kernel it was fine, but when I unloaded it my mouse stopped working. Using printk I found an infinite loop. My question is: does anybody know good techniques for detecting such bugs? Sometimes such loops are difficult to find, and Linux becomes unpredictable. So how can I avoid infinite loops in the kernel? Thanks in advance.
There is some infrastructure in the kernel that allows you to detect some lockup conditions:
CONFIG_DETECT_SOFTLOCKUP
CONFIG_DETECT_HUNG_TASK
and the various lock-checking options you can find in the "Kernel Hacking" section of the kernel config.
I've always found printk useful for that, as you did.
Other options would be running your kernel in Bochs in debugging mode. And as I recall, there's a way of running the kernel in gdb. Google can help with those options.
Oh, you said "avoid" not "debug"... hmm, the best way to avoid is do not hack the kernel :^)
Seriously, when doing kernel-level programming you have to be extra careful. Add a main() to the code that stress-tests your routines in usermode before adding to the running kernel. And read over your code, especially after you've isolated the bug to a particular section. I once found an infinite loop in LynxOS's terminal driver when some ANSI art hung the operating system. Some junior programmer, apparently, had written that part, parsing the escape sequence options as text rather than numbers. The code was so bad, I got disgusted trying to locate the exact error that forced the loop, and just rewrote most of the driver. And tested it in usermode before adding to the kernel.
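A sketch of what such a user-mode check might look like. `parse_csi_params` is a hypothetical routine (it is not the LynxOS code from the story) that parses ANSI escape parameters as numbers. It uses nothing outside `<stddef.h>`, so the same source could be compiled into a small test program and into a module. Every pass through the outer loop either returns or consumes a byte, so it cannot spin on bad input.

```c
#include <stddef.h>

/*
 * Parse the numeric parameters of an ANSI CSI sequence body such as "1;31m"
 * (the bytes that follow ESC '[').
 * Stores up to `max` values in `params` and returns how many were stored,
 * or -1 if the input is malformed or ends before the final byte.
 */
int parse_csi_params(const char *s, size_t len, int *params, int max)
{
    size_t i = 0;
    int count = 0;

    while (i < len) {
        int value = 0;

        /* Accumulate decimal digits; an empty parameter counts as 0. */
        while (i < len && s[i] >= '0' && s[i] <= '9') {
            if (value > 9999)
                return -1;              /* refuse absurdly long numbers */
            value = value * 10 + (s[i] - '0');
            i++;
        }
        if (i == len)
            return -1;                  /* ran out before the final byte */
        if (count < max)
            params[count++] = value;

        if (s[i] == ';') {              /* separator: consume it and go on */
            i++;
            continue;
        }
        if (s[i] >= '@' && s[i] <= '~')
            return count;               /* final byte ends the sequence */
        return -1;                      /* anything else is malformed */
    }
    return -1;
}
```

A throwaway main() can then feed it good input, truncated input and garbage (for example `"1;31m"`, `"1;31"` and `"1 m"`) and check the return values before the code goes anywhere near a running kernel.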
You could try to enable the NMI watchdog.

how to monitor the syslog(printk) in a LKM

Dear all,
I am a newbie at writing Linux kernel modules.
I used the printk function in the Linux kernel source code (2.4.29) for debugging and displaying messages.
Now I have to read all the messages I added via httpd.
I tried writing the messages to a file instead of using printk, so that I could read the file directly,
but it does not work very well.
So I have a stupid question:
is it possible to write an LKM that monitors the syslog and rewrites it into another file?
I mean, is it possible to make an LKM aware of the messages each time the Linux kernel executes "printk"?
Thanks a lot.
That is the wrong way to do it, because printk already does this: its messages go into the kernel log buffer, which user space can read through the file /proc/kmsg.
What you want is klogd, a user-space utility that deals with /proc/kmsg.
Another option is to use dmesg, which outputs the whole content of the kernel buffer holding the printk messages, but I suggest you first read the linked article.
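If the goal is only to get your own messages into a file that httpd can serve, a small user-space filter is enough. The sketch below is an assumption about how that could look, not how klogd is implemented: `copy_tagged_lines` is a hypothetical helper that copies every line containing a given tag from one stream to another.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy every line of `in` that contains `tag` to `out`.
 * Returns the number of lines copied, or -1 on a write error.
 * Lines longer than the buffer are handled in buffer-sized pieces.
 */
long copy_tagged_lines(FILE *in, FILE *out, const char *tag)
{
    char line[1024];
    long copied = 0;

    while (fgets(line, sizeof(line), in) != NULL) {
        if (strstr(line, tag) == NULL)
            continue;               /* not one of our messages */
        if (fputs(line, out) == EOF)
            return -1;
        fflush(out);                /* make the line visible to readers at once */
        copied++;
    }
    return copied;
}
```

It could be run as root with `in` opened on /proc/kmsg and `out` on a file of your choosing, using a tag that your printk calls put in every message. Reading /proc/kmsg consumes the messages, so it should not run alongside klogd; passing stdin and piping the output of dmesg into the program avoids that.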
You never, ever, ever want to try to open a file on a user-space-mounted block file system from within the kernel. Imagine if the FS aborted and the kernel was still trying to write to it... kaboom (among MANY other reasons why it's a bad idea) :) As shodanex said, for your purposes it's much better to use klogd.
Now, generally speaking, you have several ways to communicate meaningful data to userspace programs, such as:
Create a character device driver that causes userspace readers to block while waiting for data. Provide an ioctl() interface to it which lets other programs find out how many messages have been sent, etc.
Create a node in /proc/yourdriver to accomplish the same thing
Really, the most practical means is to just use printk()
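For the character-device option, the user-space side can be very small. In this sketch the device node (for example a hypothetical /dev/yourdriver) and the 8-byte record made of two `uint32_t` values are assumptions, as is the helper name `read_event`; the driver's read handler, not shown, is assumed to sleep until data is available.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Block until one 8-byte event (two uint32_t values) has been read from fd.
 * Returns 0 on success, -1 on a read error or if the stream ends early.
 */
int read_event(int fd, uint32_t event[2])
{
    unsigned char *dst = (unsigned char *)event;
    const size_t want = 2 * sizeof(uint32_t);
    size_t have = 0;

    while (have < want) {
        ssize_t n = read(fd, dst + have, want - have);

        if (n > 0) {
            have += (size_t)n;      /* partial read: keep going */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        return -1;                  /* error or end of stream */
    }
    return 0;
}
```

A worker thread would open the device once and call `read_event` in a loop; the thread sleeps inside read() until the driver has something to deliver.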
