How is the Linux kernel tested? - linux

How do the Linux kernel developers test their code locally and after they have it committed? Do they use some kind of unit testing and build automation? Test plans?

The Linux kernel has a heavy emphasis on community testing.
Typically, any developer will test their own code before submitting, and quite often they will be using a development version of the kernel from Linus, or one of the other unstable/development trees for a project relevant to their work. This means they are often testing both their changes and other people's changes.
There tends not to be much in the way of formal test plans, but extra testing may be asked for before features are merged into upstream trees.
As Dean pointed out, there's also some automated testing: The Linux Test Project and the kernel Autotest (good overview).
Developers will often also write automated tests targeted at their change, but I'm not sure there's a (widely used) mechanism to centrally collect these ad hoc tests.
It depends a lot on which area of the kernel is being changed of course - the testing you'd do for a new network driver is quite different to the testing you'd do when replacing the core scheduling algorithm.

Naturally, the kernel itself and its parts are tested prior to the release, but these tests cover only the basic functionality. Several systems are used to test the Linux kernel:
The Linux Test Project (LTP) delivers test suites to the open source community that validate the reliability and stability of Linux. The LTP test suite contains a collection of tools for testing the Linux kernel and related features.
Autotest—a framework for fully automated testing. It is designed primarily to test the Linux kernel, though it is useful for many other purposes such as qualifying new hardware, virtualization testing, and other general user space program testing under Linux platforms. It's an open-source project under the GPL and is used and developed by a number of organizations, including Google, IBM, Red Hat, and many others.
Also there are certification systems developed by some major GNU/Linux distribution companies. These systems usually check complete GNU/Linux distributions for compatibility with hardware. There are certification systems developed by Novell, Red Hat, Oracle, Canonical, and Google.
There are also systems for dynamic analysis of the Linux kernel:
Kmemleak is a memory leak detector included in the Linux kernel. It provides a way of detecting possible kernel memory leaks in a way similar to a tracing garbage collector with the difference that the orphan objects are not freed, but only reported via /sys/kernel/debug/kmemleak.
Kmemcheck traps every read and write to memory that was allocated dynamically (i.e., with kmalloc()). If a memory address is read that has not previously been written to, a message is printed to the kernel log. It is also a part of the Linux kernel.
Fault Injection Framework (included in the Linux kernel) allows for infusing errors and exceptions into an application's logic to achieve a higher coverage and fault tolerance of the system.
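To make the Kmemleak entry a bit more concrete, here is a minimal Python sketch that summarises a report read from /sys/kernel/debug/kmemleak. It assumes each entry starts with an "unreferenced object 0x... (size N):" header line. The names parse_kmemleak and read_report, and the sample text, are made up for illustration and are not part of the kernel.

```python
import re
from typing import List, Tuple

# Header line that starts each kmemleak report entry, e.g.
#   unreferenced object 0xffff888102584000 (size 64):
_ENTRY = re.compile(
    r"^unreferenced object (0x[0-9a-fA-F]+) \(size (\d+)\):", re.MULTILINE
)


def parse_kmemleak(report: str) -> List[Tuple[int, int]]:
    """Return (address, size) for every suspected leak in a report."""
    return [(int(addr, 16), int(size)) for addr, size in _ENTRY.findall(report)]


def read_report(path: str = "/sys/kernel/debug/kmemleak") -> str:
    """Read the live report (assumes root, debugfs mounted, kmemleak enabled)."""
    with open(path) as f:
        return f.read()


# Invented sample text, only to show the shape of the input.
SAMPLE = """\
unreferenced object 0xffff888102584000 (size 64):
  comm "swapper/0", pid 1, jiffies 4294667298
  backtrace:
    example_alloc+0x10/0x20
unreferenced object 0xffff888102585000 (size 128):
  comm "modprobe", pid 212, jiffies 4294670001
"""

if __name__ == "__main__":
    leaks = parse_kmemleak(SAMPLE)
    print(len(leaks), "suspected leaks,", sum(size for _, size in leaks), "bytes")
```

On a real machine you would pass read_report() instead of SAMPLE, after triggering a scan.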

How do the Linux kernel developers test their code locally and after they have it committed?
Do they use some kind of unit testing and build automation?
In the classic sense of words, no.
For example, Ingo Molnar is running the following workload:
1. build a new kernel with a random set of configuration options
2. boot into it
3. go to 1
Every build failure, boot failure, bug or runtime warning is dealt with. 24/7. Multiply by several boxes, and one can uncover quite a lot of problems.
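The loop above can be sketched roughly like this in Python. This is not Ingo's actual setup, just a sketch under assumptions: it uses the in-tree make randconfig target for the random configuration, and it leaves the boot check as a callable you supply yourself (for example something that starts the image in a VM and watches the console). The names one_cycle, loop and boot_ok are hypothetical.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# A runner takes a command line and returns its exit status.
Runner = Callable[[Sequence[str]], int]


def build_steps(jobs: int) -> List[List[str]]:
    """Commands for one iteration: random .config, then a full build."""
    return [["make", "randconfig"], ["make", f"-j{jobs}"]]


def one_cycle(run: Runner, boot_ok: Callable[[], bool], jobs: int = 4) -> Optional[str]:
    """Run one build-and-boot cycle; return a failure description or None."""
    for cmd in build_steps(jobs):
        if run(cmd) != 0:
            return "build failed: " + " ".join(cmd)
    if not boot_ok():
        return "boot failed"
    return None


def loop(run: Runner, boot_ok: Callable[[], bool], cycles: int, jobs: int = 4) -> List[str]:
    """Repeat the cycle and collect every problem that shows up."""
    failures = []
    for _ in range(cycles):
        problem = one_cycle(run, boot_ok, jobs)
        if problem is not None:
            failures.append(problem)
    return failures
```

From the top of a kernel tree you could call loop(subprocess.call, my_boot_check, cycles=10), where my_boot_check is whatever boot test fits your hardware or VM.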
Test plans?
No.
There may be a misunderstanding that there is a central testing facility, but there is none. Everyone does what he/she wants.

In-tree tools
A good way to find test tools in the kernel is to:
make help and read all targets
look under tools/testing
In v4.0, this leads me to:
kselftest under tools/testing/selftests. Run with make kselftest. Must be running built kernel already. See also: Documentation/kselftest.txt , https://kselftest.wiki.kernel.org/
ktest under tools/testing/ktest. See also: http://elinux.org/Ktest , http://www.slideshare.net/satorutakeuchi18/kernel-auto-testbyktest
Static analysers section of make help, which contains targets like:
checkstack: Perl: what does checkstack.pl in linux source do?
coccicheck for Coccinelle (mentioned by askb)
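As a small illustration of poking around tools/testing/selftests, here is a Python sketch that lists the selftest subdirectories that have their own Makefile and builds a make command line for running just a few of them. The TARGETS=... run_tests form is the one described in the kselftest documentation; the helper names list_selftest_dirs and run_tests_command are my own.

```python
import os
from typing import List

SELFTESTS = os.path.join("tools", "testing", "selftests")


def list_selftest_dirs(kernel_tree: str) -> List[str]:
    """Names of selftest subdirectories that ship their own Makefile."""
    base = os.path.join(kernel_tree, SELFTESTS)
    return sorted(
        name
        for name in os.listdir(base)
        if os.path.isfile(os.path.join(base, name, "Makefile"))
    )


def run_tests_command(targets: List[str]) -> List[str]:
    """make invocation (run from the top of the tree) for the chosen targets."""
    return ["make", "-C", SELFTESTS, "TARGETS=" + " ".join(targets), "run_tests"]
```

The command list can be handed to subprocess.run with cwd set to the kernel tree.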
Kernel CI
https://kernelci.org/ is a project that aims to make kernel testing more automated and visible.
It appears to do only build and boot tests (TODO: how does it automatically check that the boot worked?). Source should be at https://github.com/kernelci/.
Linaro seems to be the main maintainer of the project, with contributions from many big companies: https://kernelci.org/sponsors/
Linaro Lava
http://www.linaro.org/initiatives/lava/ looks like a CI system with focus on development board bringup and the Linux kernel.
ARM LISA
https://github.com/ARM-software/lisa
Not sure what it does in detail, but it is by ARM and Apache Licensed, so likely worth a look.
Demo: https://www.youtube.com/watch?v=yXZzzUEngiU
Step debuggers
Not really unit testing, but may help once your tests start failing:
QEMU + GDB: https://stackoverflow.com/a/42316607/895245
KGDB: https://stackoverflow.com/a/44226360/895245
My own QEMU + Buildroot + Python setup
I also started a setup focused on ease of development, but I ended up adding some simple testing capabilities to it as well: https://github.com/cirosantilli/linux-kernel-module-cheat/tree/8217e5508782827320209644dcbaf9a6b3141724#test-this-repo
I haven't analyzed all the other setups in great detail, and they likely do much more than mine, however I believe that my setup is very easy to get started with quickly because it has a lot of documentation and automation.

It’s not very easy to automate kernel testing. Most Linux developers do the testing on their own, much like adobriyan mentioned.
However, there are a few things that help with debugging the Linux Kernel:
kexec: A system call that allows you to put another kernel into memory and reboot without going back to the BIOS, and if it fails, reboot back.
dmesg: Definitely the place to look for information about what happened during the kernel boot and whether it works/doesn't work.
Kernel Instrumentation: In addition to printk's (and an option called 'CONFIG_PRINTK_TIME', which shows to microsecond accuracy when the kernel printed each message), the kernel configuration lets you turn on many tracers that help developers debug what is happening.
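For the dmesg point, here is a minimal Python sketch that splits timestamped log lines (the "[ seconds.microseconds]" prefix you get with CONFIG_PRINTK_TIME) and picks out lines that look like trouble. The list of suspicious markers is my own guess at a useful starting set, and the sample log is invented.

```python
import re
from typing import List, Tuple

# With CONFIG_PRINTK_TIME each line starts with "[    1.234567] ".
_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s(.*)$")

# Assumed markers worth flagging; extend to taste.
_SUSPICIOUS = ("WARNING:", "BUG:", "Oops", "Call Trace:")


def parse_dmesg(text: str) -> List[Tuple[float, str]]:
    """Return (timestamp_seconds, message) for every timestamped line."""
    entries = []
    for line in text.splitlines():
        m = _LINE.match(line)
        if m:
            entries.append((float(m.group(1)), m.group(2)))
    return entries


def suspicious(entries: List[Tuple[float, str]]) -> List[Tuple[float, str]]:
    """Keep only the entries whose message contains a suspicious marker."""
    return [(t, msg) for t, msg in entries if any(k in msg for k in _SUSPICIOUS)]


# Invented sample log, only to show the input shape.
SAMPLE_LOG = """\
[    0.000000] Linux version 4.0.0 (example build)
[    1.234567] WARNING: CPU: 0 PID: 1 at kernel/example.c:42 example_fn+0x10/0x20
[    2.500000] EXT4-fs (sda1): mounted filesystem
"""
```

In practice the text would come from running dmesg (or reading the serial console log of a test boot) rather than from SAMPLE_LOG.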
Then, developers usually have others review their patches. Once the patches are reviewed locally and seen not to interfere with anything else, and the patches are tested to work with the latest kernel from Linus without breaking anything, the patches are pushed upstream.
Here's a nice video detailing the process a patch goes through before it is integrated into the kernel.

In addition to the other answers, this one puts more emphasis on functionality testing, hardware certification testing and performance testing of the Linux kernel.
A lot of testing actually happens through scripts, static code analysis tools, code reviews, etc., which are very efficient at catching bugs that would otherwise break something in an application.
Sparse – An open-source tool designed to find faults in the Linux kernel.
Coccinelle is another program: a matching and transformation engine that provides the language SmPL (Semantic Patch Language) for specifying desired matches and transformations in C code.
checkpatch.pl and other scripts - coding style issues can be found in the file Documentation/CodingStyle in the kernel source tree. The important thing to remember when reading it is not that this style is somehow better than any other style, just that it is consistent. This helps developers easily find and fix coding style issues. The script scripts/checkpatch.pl in the kernel source tree has been developed for it. This script can point out problems easily, and should always be run by a developer on their changes, instead of having a reviewer waste their time by pointing out problems later on.
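If you want to wire checkpatch.pl into your own scripts, something along these lines may work. It is only a sketch: it assumes the script prints its usual "total: N errors, M warnings, K lines checked" summary line, and the helper names parse_summary and check_patch are hypothetical.

```python
import re
import subprocess
from typing import Dict, Optional

# Summary line printed at the end of a checkpatch run; the "checks" part
# only appears in some modes, so it is optional here.
_TOTAL = re.compile(
    r"total: (\d+) errors, (\d+) warnings, (?:(\d+) checks, )?(\d+) lines checked"
)


def parse_summary(output: str) -> Optional[Dict[str, int]]:
    """Extract error, warning and line counts, or None if no summary is found."""
    m = _TOTAL.search(output)
    if m is None:
        return None
    return {
        "errors": int(m.group(1)),
        "warnings": int(m.group(2)),
        "lines": int(m.group(4)),
    }


def check_patch(patch_path: str, kernel_tree: str = ".") -> Optional[Dict[str, int]]:
    """Run scripts/checkpatch.pl from kernel_tree on one patch file.

    patch_path should be absolute, since the command runs inside kernel_tree.
    """
    proc = subprocess.run(
        ["perl", "scripts/checkpatch.pl", patch_path],
        cwd=kernel_tree,
        capture_output=True,
        text=True,
    )
    return parse_summary(proc.stdout)
```

A pre-submit hook could then refuse to continue when the returned "errors" count is non-zero.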

There are also:
MMTests, which is a collection of benchmarks and scripts to analyze the results.
Trinity, which is a Linux system call fuzz tester.
Also the LTP pages at SourceForge are quite outdated and the project has moved to GitHub.

I would imagine they use virtualization to do quick tests. It could be something like QEMU, VirtualBox or Xen, and some scripts to perform configurations and automated tests.
Automated testing is probably done by trying either many random configurations or a few specific ones (if they are working with a specific issue). Linux has a lot of low-level tools (such as dmesg) to monitor and log debug data from the kernel, so I imagine that is used as well.

As far as I know, there is an automatic performance regression checking tool (named lkp/0-day) run and funded by Intel. It tests each valid patch sent to the mailing list and checks how the scores change across different microbenchmarks such as hackbench, fio, unixbench, netperf, etc.
Once there is a performance regression/improvement, a corresponding report is sent directly to the patch author, with the related maintainers on Cc.
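I don't know how lkp does the comparison internally, but the basic idea of flagging score changes between a base kernel and a patched kernel can be sketched in a few lines of Python. Everything here (the function name score_changes, the 5% threshold, the numbers in the usage note) is an assumption for illustration.

```python
from typing import Dict, List, Tuple


def score_changes(
    base: Dict[str, float],
    patched: Dict[str, float],
    threshold_pct: float = 5.0,
) -> List[Tuple[str, float]]:
    """Benchmarks whose score moved by more than threshold_pct percent.

    Returns (name, percent_change) pairs sorted by benchmark name. Whether a
    positive change is good or bad depends on the benchmark's metric.
    """
    changes = []
    for name in sorted(base.keys() & patched.keys()):
        if base[name] == 0:
            continue  # avoid dividing by zero on an empty baseline
        pct = (patched[name] - base[name]) / base[name] * 100.0
        if abs(pct) > threshold_pct:
            changes.append((name, round(pct, 1)))
    return changes
```

For example, with made-up scores base = {"hackbench": 100.0, "fio": 200.0} and patched = {"hackbench": 90.0, "fio": 202.0}, only hackbench is reported, because fio moved by just 1%.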

LTP and Memtests are generally preferred tools.

adobriyan mentioned Ingo's loop of random configuration build testing. That is pretty much now covered by the 0-day test bot (aka kbuild test bot). A nice article about the infrastructure is presented here: Kernel Build/boot testing
The idea behind this set-up is to notify the developers ASAP so that they can rectify the errors soon enough (before the patches make it into Linus' tree in some cases as the kbuild infrastructure also tests against maintainer's subsystem trees).

After contributors submit their patch files and make a merge request, the Linux gatekeepers check the patch by integrating and reviewing it. Once it succeeds, they merge the patch into the relevant branch and make a new version release.
The Linux Test Project is the main source of test scenarios (test cases) to run against the kernel after applying patches. A run may take around 2 to 4 hours, depending on the setup.
Please note which file system the selected kernel is going to be tested against.
Example: ext4 generates different results from ext3, and so on.
Kernel testing procedure:
1. Get the latest kernel source from the repository (The Linux Kernel Archives or GitHub).
2. Apply the patch file (using patch or git apply).
3. Build the new kernel.
4. Test against the test procedures in LTP (Linux Test Project).
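For the last step, LTP test programs report results with keywords such as TPASS, TFAIL, TBROK and TCONF, so a captured test log can be tallied with a short Python sketch like the one below. The function names tally and passed are hypothetical, and the rule that any TFAIL or TBROK counts as a failed run is my own assumption.

```python
import re
from collections import Counter

# Result keywords printed by LTP test programs.
_RESULT = re.compile(r"\b(TPASS|TFAIL|TBROK|TCONF|TWARN)\b")


def tally(output: str) -> Counter:
    """Count how often each LTP result keyword appears in the output."""
    return Counter(_RESULT.findall(output))


def passed(output: str) -> bool:
    """Treat the run as passed when nothing failed or broke (assumed policy)."""
    counts = tally(output)
    return counts["TFAIL"] == 0 and counts["TBROK"] == 0
```

Feeding it the saved console output of a run before and after applying a patch gives a quick way to see whether the patch changed the numbers.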

I compiled the Linux kernel and made some modifications for Android (Android 6.0 (Marshmallow) and Android 7.0 (Nougat)), which use Linux version 3. I cross-compiled it on a Linux system, debugged the errors manually, then ran its boot image on Android and checked whether it got stuck in a loop or not. If it runs perfectly, it means it was compiled correctly for the system requirements.
For MotoG kernel Compilation
Note: The Linux kernel will change according to requirements which depend on system hardware

Related

Unable to inject errors with Einj (mce-test, ras-tools)

I want to inject memory errors on my system to check whether the RAS/EDAC system really works and logs errors on my memory (during boot or at runtime). I came across many tools, but I don't know which one to actually trust. The machine I want to test is a Sandy Bridge machine running Linux kernel version 5.15.0-58-generic. Specifically, I want to test my system with the Einj tool (https://docs.kernel.org/firmware-guide/acpi/apei/einj.html). Although I followed the earlier steps in the link (the BIOS supports Einj, and the CONFIG_DEBUG_FS, CONFIG_ACPI_APEI and CONFIG_ACPI_APEI_EINJ config parameters are set on my kernel), the files mentioned in the document, /sys/kernel/debug/apei/einj etc., are not present. How can I proceed with this tool? Or is there a better way/tool to inject memory errors to check the EDAC subsystem?

Buildroot custom kernel under 1MB

I am trying to build a minimal kernel under 1 MB with Buildroot. It is intended for a small board with QSPI memory and basic functionality: Ethernet, USB, SPI, and some GPIOs. Basic terminal access via SSH and UART.
My first question is whether it is even possible to modify the kernel .config via linux-menuconfig to reach this size.
Also, whether it is possible to identify the redundant parts without deep knowledge of the kernel architecture and exclude them from compilation.
If someone can point me in a good direction for solving this problem, or even name some tools and ways to do it, that would be very helpful.
Thank you!
If you have a working Buildroot for your board, then it's better to continue working with it. The technique for disabling kernel options should be the same. In the article he reached a ~0.7 MB uImage at the cost of a lot of functionality (p40). He started with a minimal (bare) config (p27) and added blocks of config options. So instead of identifying the redundant parts, you can build the smallest possible uImage you can boot, then add more options to it: ext2, serial and so on. This work actually requires a lot of testing, and you will probably break dependencies.
Kernel configs (the working one and the new one) can be compared using diff -Naur, so you can see what changed.
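A plain diff of two .config files works, but the output gets noisy because options move around. As an alternative, here is a minimal Python sketch that compares two configs option by option. It assumes the usual .config line shapes ("CONFIG_FOO=y" and "# CONFIG_FOO is not set") and treats a missing option as "n", which is a simplification. The kernel tree also ships scripts/diffconfig for this job; the helper names below are my own.

```python
import re
from typing import Dict, Tuple

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text: str) -> Dict[str, str]:
    """Map option name to value; disabled options are recorded as "n"."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def diff_configs(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, Tuple[str, str]]:
    """Options whose value differs, as name -> (old_value, new_value)."""
    changed = {}
    for key in sorted(old.keys() | new.keys()):
        a, b = old.get(key, "n"), new.get(key, "n")
        if a != b:
            changed[key] = (a, b)
    return changed
```

Running diff_configs(parse_config(working_text), parse_config(new_text)) after each round of menuconfig changes shows exactly which options were flipped.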
Offtopic:
Looks like Yocto is officially supported by Altera. Here are instructions on how to build altera-image-minimal. If you are fine with its size, then use it and don't spend time on minimizing the uImage. If you need extra packages installed into it, then you can easily extend it.
And here are instructions for building Angstrom (a Yocto-based distribution). You can create your custom image based on console-image-minimal.
I use Angstrom in production. I must say, it was really hard to use the first time.
Whether or not you build the kernel with buildroot is not really relevant. The important thing is to configure it so it fits in 1MB. When you build the kernel from buildroot, you can do that with make linux-menuconfig, as you mention.
That said, getting a kernel under 1MB will be quite hard. Biff once did this for an x86-based platform, bifferboard. But that was without networking or USB.
You can refer to the kernel size tuning guide, which has links to some patches to reduce the size. But it's not been updated in a couple of years.

Tweaking linux kernel

I am new to Linux programming and interested in tweaking the Linux kernel (though I am not sure what to tweak; I am planning to write drivers for a particular device). To learn the internals of the kernel, I have started from the historic kernel release (the first release).
My problem is how to test whatever changes I make during development without disturbing my current OS environment (Ubuntu 12, 64-bit). Is there any way, like VirtualBox or a sandbox?
Along with these, if anybody can suggest some good approaches to learning these things, I would be really grateful.
Thank You.
If you're new to linux programming then you really don't want to be tweaking the kernel. You really want to be an advanced programmer capable of programming drivers and complex software first.
But yes, there is: you can create a virtual machine using VirtualBox or VMware. If you're really keen on tweaking the kernel, you probably want to first just try compiling and configuring the kernel and seeing if that works.
Also make sure you're well acquainted with how the kernel works and advanced OS designs in general.
Search Google for "Kernel configuration" and you will get many links on how to configure your own kernel.
And one more thing: do not use an outdated version of the kernel; always use the latest stable release, because a lot of code and APIs have changed in new versions and no book on the market is up to date, so you have to read the kernel documentation. That's the best way to learn the most up-to-date information about the Linux kernel.
Yes, you can test your changes on any of the commonly available virtual machines (VMs); that way, whatever changes you make to the VM kernel won't affect the native OS.
Personally, I prefer using CentOS 64 bit on VMWare Player. With this setup, I got away with minimal system maintenance while being able to focus on the actual job at hand. Once the VM is up & running, you can download and compile one of the latest stable releases from kernel.org. Instructions on compiling your downloaded version of the kernel can be found here and here; however, this may require a little tweaking based on your actual setup. Once the VM is running on your desired version of the kernel, using a combination of cscope and ctags will help you immensely in kernel code browsing.
Finally, if you want to become a serious kernel programmer and write your own device drivers, you need to get familiar with it in the first place. Below are a few excellent references -
Linux Device Drivers by Corbet, Rubini, Kroah-Hartman, 3rd edition
Linux Kernel Development by Robert Love, 3rd edition
Understanding the Linux Kernel by Bovet, Cesati
Linux kernel source (ideally placed into your /usr/src/$(DESIRED_KERNEL) path, symlinked to /usr/src/linux)
Going through these books is a tedious job, and chances are that you may hit a roadblock from time to time. The kernelnewbies mailing list and Stack Overflow are some of the few reliable places where people would be happy to answer your queries.
Good luck!

How to validate/test/benchmark for the set of features on EXT4 filesystem

I wanted to validate/test/benchmark a set of features I have added to ext4 (kernel_tree/fs).
I came across Spruce, a Linux file system driver verification project.
The project is hosted at https://code.google.com/p/spruce/wiki/GettingStarted
and it is for x86.
I work on an ARM target, and I have a few questions before starting off.
Has anybody worked with Spruce before?
How can the Spruce project be used on ARM? Do we need to port it to ARM?
Is cross-compilation straightforward, or do any changes need to be made?
I have gone through this paper: http://syrcose.ispras.ru/2012/files/submissions/25_syrcose2012_submission_21.pdf
There is no information on ARM and its support.
Could someone who has any work experience with or knowledge of the Spruce project please explain/help?
Spruce was intended to work as follows. It provides a set of tests that make the kernel module for a given file system execute as many paths in the code as possible. It allows the use of some external analyzers (such as the tools from the KEDR framework) to detect different kinds of errors: memory leaks, etc.
All that was primarily intended for x86.
While it might be possible to port the tests themselves to ARM, one will need to choose the analyzers that work on that platform too. KEDR tools are currently for x86 only but one may try Kmemleak, Fault injection facilities and other tools on ARM instead.
Spruce seems to be a work in progress still. I see you opened a ticket concerning ARM support in their issue tracker; I think it is the right thing to do.
I would also suggest taking a look at Phoronix Test Suite. It is currently widely used for testing and benchmarking, including the analysis of file system kernel modules. See this article for example. It seems to work on ARM although I haven't tried it there myself.
The best tool for testing/validating a file system is xfstests. I have written tools to make it easy to validate xfstests for ext4. See: http://thunk.org/gce-xfstests for more details.
There is also an alpha-test level support for using this on ARM directly: http://thread.gmane.org/gmane.comp.file-systems.ext4/53649/focus=53659
This has been used successfully to test ext4 on an Android device, although to be honest, most of the time what I do is to bludgeon an Android kernel until it will build on x86, and then use kvm-xfstests or gce-xfstests, since it's much more convenient. In particular with gce-xfstests, I can just do a "fire and forget", and then when the test completes I get a test report in my e-mail. Whereas with the Android ARM xfstests tarball, the automation isn't done yet, so you have to manually set up an external USB-attached device, hook it up via some kind of USB-C hub, or if you are going to use an OTG USB adapter, you need to make sure the Android device can receive power while it is also driving the OTG USB port, and you have to manually set up the chroot. Unless the BSP kernel has been so badly abused that you can't figure out how to make it build on x86 (getting the MSM kernel to work on x86 wasn't easy), testing on gce-xfstests may be much simpler at the end of the day.

Why do I need to re-compile the VMWare kernel module after a Linux kernel upgrade?

After a Linux kernel upgrade, my VMWare server cannot start until I use vmware-config.pl to do some re-configuration work (including building some kernel modules).
If I update my Windows VMWare host with the latest Windows Service Pack, I usually do not need to do anything to run VMWare.
Why does VMWare work differently between Linux and Windows? Does this re-compile step bring any benefits on the Linux platform over Windows?
Go read The Linux Kernel Driver Interface.
This is being written to try to explain why Linux does not have a binary kernel interface, nor does it have a stable kernel interface. Please realize that this article describes the _in kernel_ interfaces, not the kernel to userspace interfaces. The kernel to userspace interface is the one that application programs use, the syscall interface. That interface is _very_ stable over time, and will not break. I have old programs that were built on a pre 0.9something kernel that still works just fine on the latest 2.6 kernel release. This interface is the one that users and application programmers can count on being stable.
It reflects the view of a large portion of Linux kernel developers:
the freedom to change in-kernel implementation details and APIs at any time allows them to develop much faster and better.
Without the promise of keeping in-kernel interfaces identical from release to release, there is no way for a binary kernel module like VMWare's to work reliably on multiple kernels.
As an example, if some structures change on a new kernel release (for better performance or more features or whatever other reason), a binary VMWare module may cause catastrophic damage using the old structure layout. Compiling the module again from source will capture the new structure layout, and thus stand a better chance of working -- though still not 100%, in case fields have been removed or renamed or given different purposes.
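The structure layout problem can be shown with a toy example. The sketch below uses Python's ctypes to model a hypothetical kernel structure before and after a release that inserts a field in the middle; the structures and field names are invented for illustration and do not correspond to any real kernel type.

```python
import ctypes


class OldLayout(ctypes.Structure):
    """Layout the prebuilt binary module was compiled against (hypothetical)."""
    _fields_ = [("id", ctypes.c_uint32), ("flags", ctypes.c_uint32)]


class NewLayout(ctypes.Structure):
    """Same structure after a new release inserted a field in the middle."""
    _fields_ = [
        ("id", ctypes.c_uint32),
        ("refcount", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
    ]


def read_flags_with_old_layout(obj: NewLayout) -> int:
    """Interpret the new object's bytes the way stale binary code would."""
    raw = bytes(obj)
    stale = OldLayout.from_buffer_copy(raw[: ctypes.sizeof(OldLayout)])
    return stale.flags


obj = NewLayout(id=1, refcount=7, flags=0x10)

if __name__ == "__main__":
    print("flags offset, old layout:", OldLayout.flags.offset)
    print("flags offset, new layout:", NewLayout.flags.offset)
    print("real flags:", obj.flags)
    print("flags as seen by stale code:", read_flags_with_old_layout(obj))
```

The stale reader picks up the refcount value where it expects flags, which is the kind of silent misread that recompiling against the new headers avoids.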
If a function changes its argument list, or is renamed or otherwise made no longer available, not even recompiling from the same source code will work. The module will have to adapt to the new kernel. This is considered acceptable, since everybody (should) have the source and (can find somebody who) is able to modify it to fit. "Push work to the end-nodes" is a common idea in both networking and free software: since the resources [at the fringes]/[of the developers outside the Linux kernel] are larger than the limited resources [of the backbone]/[of the Linux developers], the trade-off to make the former do more of the work is accepted.
On the other hand, Microsoft has made the decision that they must preserve binary driver compatibility as much as possible -- they have no choice, as they are playing in a proprietary world. In a way, this makes it much easier for outside developers who no longer face a moving target, and for end-users who never have to change anything. On the downside, this forces Microsoft to maintain backwards-compatibility, which is (at best) time-consuming for Microsoft's developers and (at worst) is inefficient, causes bugs, and prevents forward progress.
Linux does not have a stable kernel ABI - things like the internal layout of data structures, etc. change from version to version. VMWare needs to be rebuilt to use the ABI in the new kernel.
On the other hand, Windows has a very stable kernel ABI that does not change from service pack to service pack.
To add to bdonlan's answer, ABI compatibility is a mixed bag. On one hand, it allows you to distribute binary modules and drivers which will work with newer versions of the kernel. On the other hand, it forces kernel programmers to add a lot of glue code to retain backwards compatibility. Because Linux is open-source, and because kernel developers even question whether binary modules are allowed at all, the ability to distribute binary modules isn't considered that important. On the upside, Linux kernel developers don't have to worry about ABI compatibility when altering data structures to improve the kernel. In the long run, this results in cleaner kernel code.
It's a consequence of Linux and Windows being developed in different cultural environments and expectations: http://www.joelonsoftware.com/articles/Biculturalism.html. In short: Windows is designed to be suitable for users, whereas Linux evolves to be suitable for open source developers.

Resources