Shutdown (embedded) linux from kernel-space - linux

I'm working on a modified version of the 2.6.35 kernel for Olinuxino, an ARM9 based platform. I'm trying to modify the power management driver (the architecture specific part).
The processor is a Freescale i.MX23. This processor has a "special" pin, called PSWITCH, that triggers an interrupt that is handled by the power management driver.
If the switch is pressed,the system goes to standby. This is done in the driver by calling pm_suspend(PM_SUSPEND_STANDBY).
Given my hardware setup, I'd like to, instead, shutdown the system.
So my question is:
What is the preferred way for a kernel-space process to trigger a clean system halt/poweroff?
I suppose there's a nice little function call out there, but I couldn't find it so far.
My kernel code (the file I'm working on is arch/arm/mach-mx23/pm.c) can be found here: github.com/spairal/linux-for-lobster, though my question calls for a general Linux kernel approach.

The most general way would be for your driver to invoke shutdown as a userspace helper:
static const char * const shutdown_argv[] =
{ "/sbin/shutdown", "-h", "-P", "now", NULL };
call_usermodehelper(shutdown_argv[0], shutdown_argv, NULL, UMH_NO_WAIT);
(Presuming you have a /sbin/shutdown binary installed). This will shut userspace down cleanly, unmount filesystems and then request the kernel shutdown and power off.
However, you may be able to do better than this - for example if you can guarantee that there's no disk filesystems mounted read/write, you could tell a kernel thread to invoke the kernel_power_off() function (it shouldn't be done from interrupt context).

Related

Linux kernel: Find all drivers reachable via syscalls

I am comparing a mainline Linux kernel source with a modified copy of the same source that has many drivers added. A little background: That modified source is an Android kernel source, it contains many drivers added by the vendor, SoC manufacturer, Google etc.
I am trying to identify all drivers added in the modified source that are reachable from userspace via any syscalls. I'm looking for some systematic or ideally automatic way to find all these to avoid the manual work.
For example, char device drivers are of interest, since I could perform some openat, read, write, ioctl and close syscalls on them if there is a corresponding device file. To find new character device drivers, I could first find all new files in the source tree and then grep them for struct file_operations. But besides char drivers, what else is there that I need to look for?
I know that the syscalls mentioned above do some kind of "forwarding" to the respective device driver associated with the file. But are there other syscalls that do this kind of forwarding? I think I would have to focus on all these syscalls, right?
Is there something I can grep for in source files that indicates that syscalls can lead there? How should I go about this to find all these drivers?
Update (narrowing down):
I am targeting specific devices (e.g. Huawei P20 Lite), so I know relevant architecture and hardware. But for the sake of this question, we can just assume that hardware for whatever driver is present. It doesn't really matter in my case if I invoked a driver and it reported back that no corresponding hardware is present, as long as I can invoke the driver.
I only look for the drivers directly reachable via syscalls. By directly reachable I mean drivers designed to have some syscall interface with userspace. Yes, syscalls not aimed at a certain driver may still indirectly trigger code in that driver, but these indirect effects can be neglected.
Maybe some background on my objective clarifies: I want to fuzz-test the found drivers using Syzkaller. For this, I would create descriptions of the syscalls usable to fuzz each driver that Syzkaller parses.
I'm pretty sure there is no way to do this programmatically. Any attempt to do so would hit up against a couple of problems:
The drivers that are called in a given case depend on the hardware. For example, on my laptop, the iwlwifi driver will be reachable via network syscalls, but on a server that driver won't be used.
Virtually any code loaded into the kernel is reachable from some syscall if the hardware is present. Drivers interact with hardware, which in turn either interacts with users, external devices, or networks, and all of these operations are reachable by syscalls. People don't write drivers that don't do anything.
Even drivers that aren't directly reachable by a system call can affect execution. For example, a driver for a true RNG would be able to affect execution by changing the behavior of the system PRNG, even if it weren't accessible by /dev/hwrng.
So for a generic kernel that can run on any hardware of a given architecture, it's going to be pretty hard to exclude any driver from consideration. If your hope is to trace the execution of the code by some programmatic means without actually executing it, then you're going to need to solve the halting problem.
Sorry for the bad news.

Why is `kernal_thread()` not listed as a system call of Linux?

I was wondering why kernal_thread() isn't listed as a system call in http://man7.org/linux/man-pages/man2/syscalls.2.html?
Does a Linux application programmer never have any need to create a kernel thread?
Is the function accessible to a Linux application programmer?
Thanks.
Application programmers often need to create "kernel scheduled threads", aka "OS threads" or "native threads" using the clone syscall from that list.
"Kernel threads", however, are just threads that the kernel uses to run kernel code for its own internal purposes. They are created and used by kernel context code only. Each piece of software is responsible for creating and managing its own threads to do its own job, including userspace applications and the kernel itself.
kernel_thread is a kernel function defined in kernel/fork.c, which is not exposed to userspace. It's part of the internal kernel API and not a syscall.
As you are familiar that their are two address spaces one user and kernel, normal function will run in user space but when you will make use of some function calls that are implemented in kernel space you cannot use them directly so to access them we need system calls.
So now your question is why kernal_thread() is not listed in system calls.
(As answered by "that other guy" )
kernal_thread() function are used by the kernel programmer or usual in device driver for creating thread in kernel space. So their implementation is in kernel space and only used by kernel developer or programer. (Note:- if a interface have been provided for some function for user space that will be concluded as system call, as no interface for that function for user so their is no documentation for that in man pages)
If you want to read about documents about Kernel space function download the kernel source and check the "Documentation" folder or check the source for respective function they have few comments.

Linux Suspend To RAM from idle loop

I have a question regarding STR (Suspend To RAM) in the Linux kernel.
I am working on a small embedded Linux (Kernel 3.4.22) and I want to implement a mechanism that will put the system into sleep (suspend to ram) while it has nothing to do.
This is done in order to save power.
The HW support RAM self-refresh meaning its content will stay persistence.
And I'll take care of all the rest things which should be done (e.g keeping CPU context etc…)
I want to trigger the Kernel PM (power management) subsystem from within the idle loop.
When the system has nothing to do, it should go into sleep.
The HW also supports a way to wake up the system.
Doing some research, I have found out that Linux gives an option for the user space to switch to STR by writing "echo "mem" > /sys/power/state".
This will trigger the PM subsystem and will perform the relevant callbacks.
My questions are:
Is there any other standard alternative to go into STR besides writing to the above proc?
Did anyone tried to put the system into STR from the idle loop code ?
Thanks,
Why would you need another method? Linux treats everything as a file. Is it any surprise that the contents of a psudo-file dictate the state of the system? Check for yourself. pm-utils is a popular tool set for managing the state of the system. All the commands are just calls to /sys files.
This policy is actually platform dependent. You would have to look at the cpuidle driver for your platform to understand what it is doing. For example, on atmel platforms, it is using both RAM self refresh and WFI.

Best way to read/write to another block device from kernel mode

I'm writing a simple block dev driver to overcome some limitations with porting a previously hardware based RAID array to linux's software raid (mdadm).
This driver will create it's own block device, but proxy r/w requests to 1 or more other block devices (much like mdadm does already).
What is the best way for one kernel mode driver to read and write to another kernel mode (block device) driver?
[EDIT1]:
Ok, looking through the mdadm kernel module code - it looks like we need to do as the kernel does - use generic_make_request to the other disk drivers that handle the disks in the 'volume'. This avoids any user-mode filesystem block devices (/dev/xyz) to kernel mode device driver translations and keeps the I/O completely in kernel mode.
Now... How to obtain the bio handle from the several /dev/xyz strings passed to my module....
[EDIT2]:
Looking at this the wrong way, need to give my driver Major/Minors (translate the /dev/xyz in usermode and hand the dev_t value via ioctl to the driver, from there it can reference the driver.
Well on my way here, but still open to suggestions/recommendations.
The answer was to modify the BIO and re-send it off as I've done in this post:
https://unix.stackexchange.com/questions/171800/hp-smartarray-raid5-recovery-on-linux/171801#171801

Debugging kernel hang because of IOCTL calls

I am trying to make a kernel module which is working on 2.6.32 kernel to work on 3.6 kernel. We use IOCTL calls to update structures in Linux Kernel Module. These calls are working fine in 2.6.32 kernel.
When I try the same in 3.6 kernel I am facing kernel hang whenever ioctl calls are made from user-space application. Its a socket based interface not a file based interface hence we use the ioctl under struct proto_ops.
How can I debug this scenario as there is no core dump generated. To copy data from userspace I am using copy_from_user command.
Any pointers for debugging this scenario would be very helpful
ioctl() is one of the remaining parts of the kernel which runs under the Big Kernel Lock (BKL). In the past, the usage of the BKL has made it possible for long-running ioctl() methods to create long latencies for unrelated processes.
Follows an explanation of the patch that introduced unlocked_ioctl and compat_ioctl into 2.6.11. The removal of the ioctl field happened a lot later, in 2.6.36.
Explanation: When ioctl was executed, it took the Big Kernel Lock (BKL), so nothing else could execute at the same time. This is very bad on a multiprocessor machine, so there was a big effort to get rid of the BKL. First, unlocked_ioctl was introduced. It lets each driver writer choose what lock to use instead. This can be difficult, so there was a period of transition during which old drivers still worked (using ioctl) but new drivers could use the improved interface (unlocked_ioctl). Eventually all drivers were converted and ioctl could be removed.
compat_ioctl is actually unrelated, even though it was added at the same time. Its purpose is to allow 32-bit userland programs to make ioctl calls on a 64-bit kernel. The meaning of the last argument to ioctl depends on the driver, so there is no way to do a driver-independent conversion.
Reference: The new way of ioctl() by Jonathan Corbet

Resources