Linux kernel - add system call dynamically through module - linux

Is there any way to add a system call dynamically, such as through a module? I have found that a module can override an existing system call by changing the entry in the sys_call_table[] array, so that my function is called instead of the native one while the module is installed. Can the same be done for a new system call?

No, sys_call_table is of fixed size:
const sys_call_ptr_t sys_call_table[__NR_syscall_max+1] = { ...
The best you can do, as you probably already discovered, is to intercept existing system calls.

Intercepting an existing system call (to have something done in the kernel) is not always the right approach. For example, what if your user-space driver needs to execute something in the kernel, send something there, or read something from the kernel?
Usually for drivers, the right way is to use the ioctl() call, which is just one system call but can reach different kernel functions or driver modules depending on the parameters passed to ioctl().
The above is for user-controlled kernel code execution.
For data passing, you can use procfs or sysfs entries to talk to the kernel.
PS: intercepting a system call generally affects the entire OS, so you have to work out how to do it safely: what if another process is halfway through the system call when you modify or intercept the code?
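As a user-space illustration of the ioctl() point above, here is a minimal sketch. It uses FIONREAD, a standard request code that asks the kernel how many bytes are waiting on a descriptor; a driver of your own would define its own request codes instead. The helper names pending_bytes and pending_after_write are made up for this example.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel how many unread bytes sit behind fd.
 * Returns the byte count, or -1 if the ioctl() call fails. */
int pending_bytes(int fd)
{
    int count = 0;

    /* One system call; the request code selects the kernel-side operation. */
    if (ioctl(fd, FIONREAD, &count) == -1)
        return -1;
    return count;
}

/* Hypothetical demo: queue len bytes in a fresh pipe and report what the
 * kernel says is pending on the read end. Returns -1 on any failure. */
int pending_after_write(const char *data, size_t len)
{
    int fds[2];
    int count;

    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], data, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    count = pending_bytes(fds[0]);
    close(fds[0]);
    close(fds[1]);
    return count;
}
```

The same ioctl() entry point would serve any other request code; only the second argument and the type behind the third change.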

Related

kprobe vs uprobe system call interposition

I want to implement system call interposition using Utrace. I understand that the Utrace project has been abandoned, but part of its code is used in kprobe and uprobe.
I haven't really understood how these work, especially uprobe. Can you explain the differences between them? And can I use uprobe, without writing a module, to check the actual parameters of a system call?
Thanks
Kprobe creates and manages probe points in kernel code; use it when you want to probe a kernel function such as do_sys_open(). See Documentation/trace/kprobetrace.txt for usage.
Uprobe creates and manages probe points in user applications; use it when you want to probe a user-space function. The probe itself still runs in kernel space, on behalf of the probed process. See Documentation/trace/uprobetracer.txt for basic usage and for what it aims to do.

Kernel module to monitor syscalls?

I would like to create a kernel module from scratch that latches to a user session and monitors each system call made by processes belonging to that user.
I know what everyone is thinking - "use strace" - but I'd like to have some of my own logging and analysis with the data I collect, and strace has some issues - an application could use "mmap" to write to a file without the file contents ever appearing as the arguments of an "open" system call, or an application without any write permission may create coredumps to copy sensitive data.
I want to be able to handle these special cases and do some of my own logging. I wonder though - how can I route all syscalls through my module? Is there any way to do that without touching the kernel code?
Thanks
I don't have the exact answer to your question, but I read a paper a couple of days ago that may be useful for you:
http://www.cse.iitk.ac.in/users/moona/students/Y2157230.pdf/
I have done something similar in the past by using a kernel module to patch the system call table. Each patched function did something like the following:
long patchFunction(/* params */)
{
    long ret;

    /* pre checks: inspect or log the arguments */
    ret = origFunction(/* params */);
    /* post checks: inspect or log the result */
    return ret;
}
Note that when you start mucking around in the kernel data structures, your module becomes version dependent. The kernel module will probably have to be compiled for the specific kernel version you are installing on.
Also note, this is a technique employed by many rootkits so if you have security software installed it may try to prevent you from doing something like this.

call a kernel module function from program at user space

I developed a kernel module with some functions in it. Now I need to develop a user-space program that calls some of those functions.
I also need to access some global variables in the kernel module from my user-space program.
There is a complete overview of how Linux kernel modules and user-space programs interact: http://wiki.tldp.org/kernel_user_space_howto "Kernel Space, User Space Interfaces" by Ariane Keller (dated 2008-09-28 and written for 2.6 kernels; the only major new mechanism is relayfs).
It lists no ordinary function call from user space to kernel space, only syscalls (adding a new syscall is not easy) and upcalls (calls in the opposite direction).
One of the easiest interfaces is ioctl, but you can't use ioctl until you have created a procfs, sysfs or similar file.
Another is sysctl, but sysctl is better suited to reading and writing a global variable (it is hard to pass several parameters through the sysctl interface).
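To illustrate the sysctl route from the user-space side, here is a small sketch that reads one kernel value through procfs. The helper name read_kernel_value is made up for this example; /proc/sys/kernel/ostype is a standard sysctl entry, whereas a module of your own would expose its variable under a path of its choosing.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the first line of a procfs or sysfs file into
 * buf, dropping the trailing newline. Returns 0 on success, -1 on failure. */
int read_kernel_value(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* Cut the string at the first newline, if there is one. */
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

Calling read_kernel_value("/proc/sys/kernel/ostype", buf, sizeof buf) fills buf with the kernel's OS type string. Writing works the same way with fopen(path, "w"), provided the entry is writable and the caller has permission.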
You seem to be missing the point of kernel and userland separation. If your user program could modify data inside the kernel directly, that would quickly lead to disaster.
There's only one conventional way for a user program to explicitly request services from the kernel - make a system call.
There are also traps and some Linux-specific userland-kernel communication mechanisms, but those are not relevant here.
As other posters have mentioned, there is a clear distinction between kernel and user space. So no, you can't call a kernel function directly from user space. I think the easiest way to send messages between user space and kernel space is via netlink sockets. A netlink socket allows you to easily pass arbitrary data structures between user level and kernel level.
Yes, ioctl and system calls are viable alternatives, but they are not as flexible as a netlink socket for passing arbitrary information.
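As a rough sketch of the user-space half of the netlink approach, the function below opens and binds a netlink socket. The name open_netlink is made up for this example. It is shown with the built-in NETLINK_ROUTE protocol only because that one exists on any kernel; the assumption is that a module of your own would create its kernel-side socket with netlink_kernel_create() under a protocol number you pick, and you would pass that number here instead.

```c
#include <sys/socket.h>
#include <linux/netlink.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open a netlink socket for the given protocol and
 * bind it. Returns the descriptor, or -1 on failure. */
int open_netlink(int protocol)
{
    struct sockaddr_nl addr;
    int fd = socket(AF_NETLINK, SOCK_RAW, protocol);

    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = 0;     /* 0 lets the kernel assign a unique port id */
    addr.nl_groups = 0;  /* no multicast groups, unicast only */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Messages are then exchanged with sendmsg() and recvmsg(), each one carrying a struct nlmsghdr followed by whatever payload the two sides agree on.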
You'll need to install a new kernel to make use of the new call unless you already have some mechanism to update the kernel ... http://www.cyberciti.biz/tips/how-to-patch-running-linux-kernel.html

pinning a pthread to a single core

I am trying to measure the performance of some library calls. My primary measurement tool is the rdtsc call. After doing some reading I realize that I need to disable preemption and interrupts in order to get the most accurate readings. Can someone help me figure out how to do these? I know that pthreads have a 'set affinity' mechanism. Is that enough to get the job done?
I also read somewhere that I can make calls into the kernel of the sort
preempt_disable()
raw_local_irq_save(...)
Is there any benefit to using one approach over the other? I tried the latter approach and got this error.
error: 'preempt_disable' was not declared in this scope
which can be fixed by including linux/preempt.h but the compiler still complains.
linux/preempt.h: No such file or directory
Obviously I have not done any kernel hacking and I could not find this file anywhere on my system. I am really hoping I won't have to install a new Linux kernel. :)
Thanks for your input.
Pinning a pthread to a single CPU can be done using pthread_setaffinity_np
But what you want to achieve in the end is not so simple. I'll explain why.
preempt.h is part of the Linux kernel source (include/linux/preempt.h), so you need the kernel sources to hand. In any case, you have to write a kernel module to use it; you cannot use it from user space. The same goes for preempt_disable and the other interrupt-disabling kernel functions.
Now the point is that pthreads live in user space and your preemption-disabling function lives in kernel space. How do the two interact?
Either you write a new system call of your own that does the preemption and interrupt disabling, and call it from user space, or you resort to other kernel/user-space interfaces such as procfs, sysfs, ioctl, etc.
But I am really skeptical as to how all these will help you to benchmark library functions. You may want to have a look at how performance is typically measured using rdtsc
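For the pinning step on its own, a minimal sketch with pthread_setaffinity_np could look like this. The helper names first_allowed_cpu and pin_current_thread are made up for this example, and the code assumes glibc, since the _np calls and the CPU_* macros are GNU extensions.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: return the lowest-numbered CPU the calling thread
 * is currently allowed to run on, or -1 if that cannot be determined. */
int first_allowed_cpu(void)
{
    cpu_set_t set;
    int cpu;

    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return -1;
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Hypothetical helper: restrict the calling thread to a single CPU.
 * Returns 0 on success, otherwise an errno-style error number. */
int pin_current_thread(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

Pinning only stops the scheduler from migrating the thread to another core. The thread can still be preempted and interrupted on that core, which is the limitation described above.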

Are there any other ways to record ioctl calls besides strace?

I'm trying to see whether certain ioctl calls get made when I call a function (this is on Linux). There's no way to make the kernel write a log with this sort of data, is there?
On a recent kernel that is configured with support for tracing and dynamic tracing, ftrace can probably do what you need.
Another option is to write an ioctl wrapper, load it using LD_PRELOAD, and intercept the interesting ioctl in your wrapper.
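A rough sketch of such an LD_PRELOAD wrapper follows. It assumes glibc, whose ioctl() takes an unsigned long request, and it assumes the optional third argument is pointer-sized, which is the usual convention for wrappers like this. The counter ioctl_calls_seen and the log format are made up for this example.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>

/* Number of ioctl() calls that have passed through the wrapper. */
static unsigned long ioctl_calls_seen;

/* Same name and shape as glibc's ioctl(), so calls resolve to this
 * definition first and are forwarded to the real one from here. */
int ioctl(int fd, unsigned long request, ...)
{
    static int (*real_ioctl)(int, unsigned long, ...);
    va_list ap;
    void *arg;
    int ret, saved_errno;

    /* Find the next definition of ioctl in load order, i.e. libc's. */
    if (!real_ioctl)
        real_ioctl = (int (*)(int, unsigned long, ...))dlsym(RTLD_NEXT, "ioctl");
    if (!real_ioctl) {
        errno = ENOSYS;
        return -1;
    }

    /* Assumption: the optional third argument is pointer-sized. */
    va_start(ap, request);
    arg = va_arg(ap, void *);
    va_end(ap);

    ret = real_ioctl(fd, request, arg);

    /* Log without disturbing the errno the caller will inspect. */
    saved_errno = errno;
    ioctl_calls_seen++;
    fprintf(stderr, "ioctl(fd=%d, request=0x%lx) = %d\n", fd, request, ret);
    errno = saved_errno;
    return ret;
}
```

Built as a shared object, for example with gcc -shared -fPIC -o ioctl_log.so ioctl_log.c -ldl, it can be loaded with LD_PRELOAD=./ioctl_log.so ./your_program (both file names are placeholders). It only sees calls that go through the dynamically linked ioctl() symbol, so statically linked programs and raw syscall() invocations bypass it.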
