Why is `kernel_thread()` not listed as a system call of Linux? - linux

I was wondering why kernel_thread() isn't listed as a system call in http://man7.org/linux/man-pages/man2/syscalls.2.html?
Does a Linux application programmer never have any need to create a kernel thread?
Is the function accessible to a Linux application programmer?
Thanks.

Application programmers often need to create "kernel scheduled threads", aka "OS threads" or "native threads" using the clone syscall from that list.
"Kernel threads", however, are just threads that the kernel uses to run kernel code for its own internal purposes. They are created and used by kernel context code only. Each piece of software is responsible for creating and managing its own threads to do its own job, including userspace applications and the kernel itself.
kernel_thread is a kernel function defined in kernel/fork.c, which is not exposed to userspace. It's part of the internal kernel API and not a syscall.
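For contrast, here is a minimal sketch of how an application creates a kernel-scheduled thread from user space. It assumes Linux with glibc, where pthread_create() is built on top of the clone syscall mentioned above; the names worker and run_worker are made up for illustration. Nothing here touches kernel_thread().

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker: doubles the int it is handed. */
static void *worker(void *arg)
{
    int *value = arg;
    *value *= 2;
    return NULL;
}

/* Starts one native thread and waits for it to finish.
 * Returns 0 on success, -1 on failure. */
int run_worker(int *value)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, worker, value) != 0)
        return -1;
    return pthread_join(tid, NULL) == 0 ? 0 : -1;
}
```

Build with -pthread. The thread is scheduled by the kernel, but all of its code runs in user space.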

As you probably know, there are two address spaces: user space and kernel space. Ordinary functions run in user space. Functions that are implemented in kernel space cannot be called directly from a user program, so access to them goes through system calls.
So now your question is why kernel_thread() is not listed among the system calls.
(As answered by "that other guy".)
kernel_thread() is used by kernel programmers, typically in device drivers, to create a thread in kernel space. Its implementation lives in kernel space and it is only used by kernel developers. (Note: if a function had an interface for user space, that interface would be a system call. Since there is no such interface for this function, it is not documented in the man pages.)
If you want to read documentation about kernel-space functions, download the kernel source and check the "Documentation" folder, or read the source of the function in question; they have a few comments.

Related

ioctl vs kernel modules in Linux

I know that kernel modules are used to write device drivers. You can add new system calls to the Linux kernel and use it to communicate with other devices.
I also read that ioctl is a system call used in linux to implement system calls which are not available in the kernel by default.
My question is, why wouldn't you just write a new kernel module for your device instead of using ioctl? Why would ioctl be useful where kernel modules exist?
You will need to write a kernel driver in either case, but you can choose between adding a new syscall and adding an ioctl.
Let's say you want to add a feature to get the tuner settings for a video capturing device.
If you implement it as a syscall:
You can't just load a module; you need to change the kernel itself.
Hundreds of drivers could each add dozens of syscalls, kludging up the table with thousands of global functions that must be kept forever.
For the driver to have any reach, you will need to convince kernel maintainers that this burden is worthwhile.
You will need to upstream the definition into glibc, and people must upgrade before they can write programs for it.
If you implement it as an ioctl:
You can build your module for an existing kernel and let users load it, without having to get kernel maintainers involved
All functions are simple per-driver constants in the applicable header file, where they can easily be added or removed
Everyone can start programming with it just by including the header
Since an ioctl is much easier, more flexible, and exactly meant for all these driver specific function calls, this is generally the preferred method.
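To make the "per-driver constants in a header" point concrete, here is a header-style sketch for the tuner example. Everything named tuner_* or TUNER_* is hypothetical, and the magic number 'T' is an arbitrary choice for this sketch; a real driver would pick one that does not collide with existing drivers. Only the _IOR/_IOW/_IOC_* macros from <linux/ioctl.h> are real.

```c
#include <linux/ioctl.h>

/* Hypothetical settings block for an imaginary tuner driver. */
struct tuner_settings {
    unsigned int frequency_khz;
    unsigned int band;
};

/* Arbitrary magic number chosen for this sketch only. */
#define TUNER_IOC_MAGIC 'T'

/* Each command encodes direction, magic, sequence number and payload size. */
#define TUNER_GET_SETTINGS _IOR(TUNER_IOC_MAGIC, 1, struct tuner_settings)
#define TUNER_SET_SETTINGS _IOW(TUNER_IOC_MAGIC, 2, struct tuner_settings)

/* Small helpers that decode a command number again. */
static inline unsigned int tuner_cmd_nr(unsigned int cmd)
{
    return _IOC_NR(cmd);
}

static inline unsigned int tuner_cmd_size(unsigned int cmd)
{
    return _IOC_SIZE(cmd);
}
```

A user program would then open the (hypothetical) device node and call ioctl(fd, TUNER_GET_SETTINGS, &settings), while the driver's ioctl handler switches on the same constants. Adding or removing a command only touches this header and the driver.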
"I also read that ioctl is a system call used in linux to implement system calls which are not available in the kernel by default."
This is incorrect.
System calls are (for Linux) listed in syscalls(2) (there are hundreds of them between user space and kernel land) and ioctl(2) is one of them. Read also the wiki pages on ioctl and on the Unix philosophy, and the Linux Assembler HowTo.
In practice, ioctl is mostly used on device files, and used for things which are not a read(2) or a write(2) of bytes.
For example, a sound is made by writing bytes to /dev/audio, but to change the volume you'll use some ioctl. See also fcntl(2) playing a similar role.
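A sound device may not exist on your machine, so as a stand-in here is a minimal sketch of an ioctl that is neither a read nor a write of bytes: FIONREAD asks the kernel how many bytes are waiting to be read. It is used on a pipe here only because a pipe is always available; the function name pending_bytes_after_write is made up for the example.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Writes len bytes into a fresh pipe, then asks the kernel (via ioctl)
 * how many bytes are pending on the read end.
 * Returns that count, or -1 on error. */
int pending_bytes_after_write(const char *data, size_t len)
{
    int fds[2];
    int pending = -1;

    if (pipe(fds) != 0)
        return -1;

    if (write(fds[1], data, len) == (ssize_t)len) {
        /* Control request: no data bytes are transferred by this call. */
        if (ioctl(fds[0], FIONREAD, &pending) != 0)
            pending = -1;
    }

    close(fds[0]);
    close(fds[1]);
    return pending;
}
```

The volume control mentioned above follows the same pattern: open the device, then pass a driver-defined request code and argument to ioctl.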
Input/output could also happen (somehow indirectly ...) through mmap(2) and related virtual address space system calls.
For much more, read Advanced Linux Programming and Operating Systems: Three Easy Pieces. Look into Osdev for more hints about coding your own OS.
A kernel module could implement new devices, or new ioctl, etc... See kernelnewbies for more. I tend to believe it might sometimes add a few new syscalls (but this was false in older linux kernels like 3.x ones)
Linux is mostly open source. Please download it and look inside the source code. See also Linux From Scratch.
IIRC, Linux kernel 1.0 did not have any kernel modules. But that was back in 1994.

Why are not all library functions system calls?

So, from my basic OS class, I understood that kernel is the one who interacts with the hardware. So, if we want to interact with hardware, we need to call system calls. open() is a system call, while strlen() is not a system call. But any instruction or command has to interact with hardware, at least to increase program counter or modify the contents of memory. So, shouldn't all functions make a system call at some point ?
I would strongly suggest reading early papers on UNIX, the how and the why. Ken Thompson was a strong advocate for the kernel consisting only of the things that could not be implemented outside of the kernel.
At that time, outside of the kernel corresponded to outside of the privileged mode of the computer. This is a less interesting concept in modern systems, yet it continues to drive architecture and design.
In short, open() is exported by the kernel because it has to be: it accesses data structures that are private to the kernel, and is thus an interface. strlen() is not exported by the kernel because it doesn't have to be: it requires neither privilege nor access to other data structures.
"Doesn't have to be" is a trump card, because nobody wants needless functionality in the kernel.
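A small sketch of that difference, assuming Linux and glibc: my_strlen only walks memory the process already owns, so it never needs the kernel, while open_readonly has to ask the kernel because the file descriptor table is kernel-private. Both function names are made up for illustration; syscall(2), SYS_openat and AT_FDCWD are the real interfaces.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Pure user space: a loop over the caller's own memory, no kernel entry. */
size_t my_strlen(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}

/* Needs the kernel: only it can create an entry in the descriptor table.
 * Returns a file descriptor, or -1 with errno set. */
int open_readonly(const char *path)
{
    return (int)syscall(SYS_openat, AT_FDCWD, path, O_RDONLY);
}
```

Running a program that calls both under strace should show an openat entry for the second function and nothing at all for the first.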
"kernel is the one who interacts with the hardware."
That is a very inaccurate statement.
Even the following program, which may run on some microcontroller with no OS (and no optimizing compiler...), interacts with the hardware:
int array[8192];
void entry_point(void) {
    /* A plain store: uses the CPU, the memory bus and memory, no OS needed. */
    array[100] = 5+3;
}
The hardware in question is the "CPU", the "memory bus", and "memory".
While system calls are primarily used to access certain hardware (disk, network etc.), system calls are not defined as "calls to access hardware", but rather as just "calls to kernel APIs".
A kernel can export whatever API it wants, including strlen(), but for an OS designer, such as the aforementioned Ken Thompson, the APIs that the kernel should export are ones that facilitate the existence of multiple programs, processes, and/or users.
The main concern here is access to resources such as disk, network, timers, memory, etc., but these APIs also include e.g.:
Scheduling / process management APIs (e.g. the fork(), exit(), nice(), and sched_yield() calls)
Multiprocess management (e.g. sigaction(), kill(), wait(), futex() and semop())
Performance & debugging (e.g. ptrace(), prlimit() and getrusage())
Security (e.g. chmod(), chown(), setuid(), seccomp(), and chroot())
Administration (e.g. init_module(), sethostname(), shutdown())
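As an illustration of the process management group, here is a minimal sketch that creates a child process and collects its exit status. It uses waitpid() and _exit(), close relatives of the wait() and exit() calls listed above; the function name child_exit_status is made up for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that immediately exits with `code`.
 * Returns the exit status the parent observed, or -1 on error. */
int child_exit_status(int code)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                /* child: terminate right away */

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

None of this touches a disk or a network card; the kernel is needed because only it can create, schedule and reap processes.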

"Switching from user mode to kernel mode" is an incorrect concept

I'm studying "Operating Systems" for the first time. In my book I found this sentence about "User Mode" and "Kernel Mode":
"Switch from user to kernel mode" instruction is executed only in kernel mode
I think that is an incorrect sentence, as in practice there is no "switch of kernel". In fact, when a user process needs to execute a privileged instruction, it simply asks the kernel to do something on its behalf. Is that correct?
"In fact, when a user process need to do a privileged instruction it simply ask the kernel to do something for itself."
But how does that happen? Details are processor (i.e. instruction set architecture) and OS specific (explained in the ABI specifications relevant to your system), but that usually involves some machine code instruction like SYSENTER or SYSCALL (or SVC on mainframes) capable of atomically changing the CPU mode (that is, switching it in a controlled manner to kernel mode). The actual parameters of the system call (including even the syscall number) are often passed in registers (but details are ABI specific).
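Here is a minimal sketch of such a mode switch, assuming Linux on x86-64 and a GCC-compatible compiler: the syscall number goes in rax, the result comes back in rax, and the SYSCALL instruction clobbers rcx and r11. On any other architecture the sketch simply falls back to the libc syscall(2) wrapper. The function name raw_getpid is made up for the example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Asks the kernel for our process ID without the getpid() wrapper. */
long raw_getpid(void)
{
#if defined(__x86_64__)
    long ret;

    /* SYSCALL switches the CPU to kernel mode at a fixed entry point;
     * the kernel reads the syscall number from rax and returns in rax. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"((long)SYS_getpid)
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Assumption: on other ISAs let libc issue the right instruction. */
    return syscall(SYS_getpid);
#endif
}
```

The program never jumps to an address of its choosing inside the kernel; it only triggers the switch, and the kernel decides what runs next.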
So I feel the concept of switching from user-mode to kernel-mode is relevant, and meaningful (so "correct").
BTW, user-mode code is forbidden (by the hardware) to execute privileged machine instructions, such as those interacting with IO hardware devices (read about protection rings). If you try, you get some hardware exception (a bit similar to interrupts). Hence your code (even if it is malicious) has to make system calls, which the kernel controls (it has lots of code related to permission checking), for e.g. all IO.
Read also Operating Systems: Three Easy Pieces - freely downloadable. See also http://osdev.org/. Read the system call wiki page and syscalls(2), and the Assembler HowTo.
In real life, things are much more complex. Read about System Management Mode and about the (scary) Intel Management Engine.

Linux System Calls & Kernel Mode

I understand that system calls exist to provide access to capabilities that are disallowed in user space, such as accessing an HDD using the read() system call. I also understand that these are abstracted by a user-mode layer in the form of library calls such as fread(), to provide compatibility across hardware.
So from the application developer's point of view, we have something like:
//library //syscall //k_driver //device_driver
fread() -> read() -> k_read() -> d_read()
My question is; what is stopping me inlining all the instructions in the fread() and read() functions directly into my program? The instructions are the same, so the CPU should behave in the same way? I have not tried it, but I assume that this does not work for some reason I am missing. Otherwise any application could get arbitrary kernel mode operation.
TL;DR: What allows system calls to 'enter' kernel mode that is not copy-able by an application?
System calls do not enter the kernel themselves. More precisely, the read function you call is still, as far as your application is concerned, a library call. What read(2) does internally is invoke the actual system call using a software interrupt or a dedicated instruction such as SYSCALL, depending on the CPU architecture and OS.
This is the only way for userland code to have privileged code executed, but it is an indirect way. The userland and kernel code execute in different contexts.
That means you cannot add the kernel source code to your userland code and expect it to do anything useful; it will just crash. In particular, kernel code has access to the physical memory addresses required to interact with the hardware, whereas userland code is limited to a virtual address space that lacks this capability. Also, the instructions userland code is allowed to execute are a subset of the ones the CPU supports. Several I/O, interrupt and virtualization related instructions are examples of prohibited code. They are known as privileged instructions and require the CPU to be in a lower ring or in supervisor mode, depending on the CPU architecture.
You could inline them. You can issue system calls directly through syscall(2), but that soon gets messy. Note that the system call overhead (context switches back and forth, in-kernel checks, ...), not to mention the time the system call itself takes, makes any gain from inlining disappear in the noise (if there is any gain at all: more code makes the cache less effective, and performance suffers). Trust the libc/kernel folks to have studied the matter and done the inlining for you behind your back (in the relevant *.h file) if it really is a measurable gain.
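A minimal sketch of what "directly through syscall(2)" looks like, assuming Linux and glibc. It sends a few bytes through a pipe using SYS_write and SYS_read instead of the fwrite()/write() and fread()/read() layers; the names raw_write and pipe_round_trip are made up for the example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Issues the write system call without going through the libc wrapper. */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}

/* Sends len bytes of msg through a pipe and reads them back into out.
 * Returns the number of bytes read back, or -1 on error. */
long pipe_round_trip(const char *msg, size_t len, char *out)
{
    int fds[2];
    long got = -1;

    if (pipe(fds) != 0)
        return -1;

    if (raw_write(fds[1], msg, len) == (long)len)
        got = syscall(SYS_read, fds[0], out, len);

    close(fds[0]);
    close(fds[1]);
    return got;
}
```

The kernel entry still happens inside syscall(); all that was skipped is a thin wrapper, which is why there is nothing to gain here.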

call a kernel module function from program at user space

I developed a kernel module with some functions in it. Now I need to develop a user-space program that calls some of the functions in the kernel module.
My user-space program also needs to access some global variables that are in the kernel module.
There is a complete overview of how Linux kernel modules and user-space programs interact at http://wiki.tldp.org/kernel_user_space_howto : "Kernel Space, User Space Interfaces" by Ariane Keller (it is from 2008-09-28 and about 2.6 kernels; the only major new way is relayfs).
No ordinary function call from user space to kernel space is listed, only syscalls (adding a new syscall is not easy) and upcalls (calls in the inverse direction).
One of the easiest interfaces is ioctl, but you can't start using ioctl before creating a procfs, sysfs or similar file.
Another is sysctl, but sysctl is better suited to reading or writing a global variable. (It is hard to pass several parameters via the sysctl interface.)
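From the user-space side, a sysctl is just a file under /proc/sys, so reading a kernel variable is ordinary file I/O. Here is a minimal sketch, assuming procfs is mounted at /proc. A module would have to register its own entry first; the example reads the built-in kernel/ostype entry as a stand-in, and the function name read_sysctl is made up.

```c
#include <fcntl.h>
#include <unistd.h>

/* Reads a sysctl value from its /proc/sys path into buf (NUL-terminated,
 * trailing newline stripped). Returns the length, or -1 on error. */
long read_sysctl(const char *path, char *buf, size_t size)
{
    int fd;
    ssize_t n;

    if (size == 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    n = read(fd, buf, size - 1);
    close(fd);
    if (n < 0)
        return -1;

    if (n > 0 && buf[n - 1] == '\n')
        n--;
    buf[n] = '\0';
    return (long)n;
}
```

For instance, read_sysctl("/proc/sys/kernel/ostype", buf, sizeof buf) reads the kernel's OS type string. Writing works the same way with write(2), provided the entry is writable and you have permission.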
You seem to be missing the point of kernel and userland separation. If your user program could modify data inside the kernel directly, that would quickly lead to disaster.
There's only one conventional way for a user program to explicitly request services from the kernel - make a system call.
There are also traps and some Linux-specific userland-kernel communication mechanisms, but those are not relevant here.
As other posters have mentioned, there is a clear distinction between kernel and user space. So no, you can't call a kernel function directly from user space. I think the easiest way to send messages between user space and kernel space is via netlink sockets. A netlink socket allows you to easily pass arbitrary data structures between user level and kernel level.
Yes, ioctl and system calls are viable alternatives, but they are not as flexible as a netlink socket for passing arbitrary information.
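The user-space half of a netlink channel can be sketched as below. This only opens and binds the socket; talking to your own module would additionally require the module to register a netlink protocol number, which is not shown. The function name open_netlink is made up, and NETLINK_ROUTE in the usage note is just a built-in protocol used as a stand-in.

```c
#include <linux/netlink.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Opens a netlink socket for the given protocol and binds it.
 * Returns the socket descriptor, or -1 on error. */
int open_netlink(int protocol)
{
    struct sockaddr_nl addr;
    int fd = socket(AF_NETLINK, SOCK_RAW, protocol);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = 0;                /* 0: let the kernel pick our port ID */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling open_netlink(NETLINK_ROUTE) should succeed on a stock kernel; messages are then exchanged with sendmsg(2) and recvmsg(2), each starting with a struct nlmsghdr.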
You'll need to install a new kernel to make use of the new call unless you already have some mechanism to update the kernel ... http://www.cyberciti.biz/tips/how-to-patch-running-linux-kernel.html

Resources