Kernel module to monitor syscalls?

Kernel module to monitor syscalls? - linux

I would like to create a kernel module from scratch that latches to a user session and monitors each system call made by processes belonging to that user.
I know what everyone is thinking - "use strace" - but I'd like to have some of my own logging and analysis with the data I collect, and strace has some issues - an application could use "mmap" to write to a file without the file contents ever appearing as the arguments of an "open" system call, or an application without any write permission may create coredumps to copy sensitive data.
I want to be able to handle these special cases and do some of my own logging. I wonder though - how can I route all syscalls through my module? Is there any way to do that without touching the kernel code?
Thanks

I don't have the exact answer to your question, but I red a paper a couple of days ago and it may be useful for you:
http://www.cse.iitk.ac.in/users/moona/students/Y2157230.pdf/

I have done something similar in the past by using a kernel module to patch the system call table. Each patched function did something like the following:
patchFunction(/*params*/)
{
// pre checks
ret = origFunction(/*params*/);
// post checks
return ret;
}
Note that when you start mucking around in the kernel data structures, your module becomes version dependent. The kernel module will probably have to be compiled for the specific kernel version you are installing on.
Also note, this is a technique employed by many rootkits so if you have security software installed it may try to prevent you from doing something like this.

Related

How do I open a file in a kernel module if calling process is in user space?

I am trying to create a character device driver that dumps /etc/shadow when read from as a non-privileged user. This is for purely academic purposes of course.
I was reading about how reading/writing files in kernel space opens a system to possible exploits. I am trying to implement this in practice.
Please spare me the "don't touch the filesystem in kernel mode" talk. I am precisely trying to exploit the nuances of doing so.
Problem is that the only way I have found so far that works to open a file in kernel mode is filp_open, which is currently producing EACCESS when I read from the device file as a non-privileged user. This was confounding at first as I assumed that I can do anything in kernel space.
For example, when I cat the device file I have created as a non-root user, filp_open produces EACCESS in kernel space???
Further investigation has led me to believe that filp_open checks the capabilities of the calling process. This would make sense as it is used internally by open(), but I am in kernel mode here! There must be a way!
I am very new to programming in kernel space. I have extensive application C experience, but I am finding it difficult to navigate the kernel documentation for precisely what I am looking for. Additionally, it seems that more and more symbols within the kernel are not exported for use in modules. As I am developing an exploit proof of concept, I would like it to work without recompiling the kernel. I am finding a lot of code (vfs and syscalls) that is deprecated as the symbols are no longer exported to kernel modules.
Is what I am trying to do a thing that is specifically engineered against? Loading a kernel module requires root to begin with, so I would see this more in the lens of a persistence focused attack rather than an access one.
Also, I got the proof of concept working by just reading from the file when the module is loaded, but this is no fun! Any pointers here are much appreciated.

After some rethinking and digging I have found two solutions to my problem. Thank you to Tsyvarev and stark for the pointers.
Solution 1
The first solution is to elevate the privileges of the calling process before making a call of filp_open. This is also basically making a rootkit, so not as interesting.
Here is a link to the guide that I found on the subject.
https://0x00sec.org/t/kernel-rootkits-getting-your-hands-dirty/1485
Solution 2
The module will have an init function that by nature must be run with elevated privs when the module is loaded. So you can open the file pointer there and just close it when the module is unloaded. Caveats are that you have the file pointer open the whole time, so all of the gotchas there are still present. Better to only read, writing is where things can get a bit tricky. This is the solution I chose in the interim, as I didn't want this thing to be a full rootkit.
Another direction is workqueue or to spawn a thread. Probably the most tricky but also the most inline with what my original vision of this demo was. I did not test this direction but it probably is the best solution.

Persistant storage values handling in Linux

I have a QSPI flash on my embedded board.
I have a driver + process "Q" to handle reading and writing into.
I want to store variables like SW revisions, IP, operation time, etc.
I would like to ask for suggestions how to handle the different access rights to write and read values from user space and other processes.
I was thinking to have file for each variable. Than I can assign access rights for those files and process Q can change the value in file if value has been changed. So process Q will only write into and other processes or users can only read.
But I don't know about writing. I was thinking about using message queue or zeroMQ and build the software around it but i am not sure if it is not overkill. But I am not sure how to manage access rights anyway.
What would be the best approach? I would really appreciate if you could propose even totally different approach.
Thanks!

This question will probably be downvoted / flagged due to the "Please suggest an X" nature.
That said, if a file per variable is what you're after, you might want to look at implementing a FUSE file system that wraps your SPI driver/utility "Q" (or build it into "Q" if you get to compile/control source to "Q"). I'm doing this to store settings in an EEPROM on a current work project and its turned out nicely. So I have, for example, a file, that when read, retrieves 6 bytes from EEPROM (or a cached copy) provides a MAC address in std hex/colon-separated notation.
The biggest advantage here, is that it becomes trivial to access all your configuration / settings data from shell scripts (e.g. your init process) or other scripting languages.
Another neat feature of doing it this way is that you can use inotify (which comes "free", no extra code in the fusefs) to create applications that efficiently detect when settings are changed.
A disadvantage of this approach is that it's non-trivial to do atomic transactions on multiple settings and still maintain normal file semantics.

Linux kernel : logging to a specific file

I am trying to edit the linux kernel. I want some information to be written out to a file as a part of the debugging process. I have read about the printk function. But i would like to add text to a particular file (file other from the default files that keep debug logs).
To cut it short: I would kind of like to specify the "destination" in the printk function (or at least some work-around it)
How can I achieve this? Will using fwrite/fopen work (if yes, will it work without causing much overhead compared to printk, since they are implemented differently)?
What other options do i have?

Using fopen and fwrite will certainly not work. Working with files in kernel space is generally a bad idea.
It all really depends on what you are doing in the kernel though. In some configurations, there may not even be a hard disk for you to write to. If however, you are working at a stage where you can have certain assumptions about the running kernel, you probably actually want to write a kernel module rather than edit the kernel itself. For all you care, a kernel module is just as good as any other part of the kernel, but they are inserted when the kernel is already up and running.
You may also be thinking of doing so for debugging, or have output of a kernel-level application (e.g. an application that you are forced to run at kernel level for real-time constraints etc). In that case, kio may be of interest to you, but if you want to use it, do make sure you understand why.
kio is a library I wrote just for those "kernel-level applications", which makes a kernel module see a /proc file as if it's a user of it (rather than a provider). To make it work, you should have a user-space application also opening that virtual file and redirect it to wherever you want to write your log. Something along the lines of opening the file with kopen in write mode and in user space tell cat /proc/your_file > ~/log_file.
Note: I still recommend printk unless you really know what you are doing. Since you are thinking of fopen in kernel space, I don't think you really know what you are doing.

pinning a pthread to a single core

I am trying to measure the performance of some library calls. My primary measurement tool is the rdtsc call. After doing some reading I realize that I need to disable preemption and interrupts in order to get the most accurate readings. Can someone help me figure out how to do these? I know that pthreads have a 'set affinity' mechanism. Is that enough to get the job done?
I also read somewhere that I can make calls into the kernel of the sort
preempt_disable()
raw_local_irq_save(...)
Is there any benefit to using one approach over the other? I tried the latter approach and got this error.
error: 'preempt_disable' was not declared in this scope
which can be fixed by including linux/preempt.h but the compiler still complains.
linux/preempt.h: No such file or directory
Obviously I have not done any kernel hacking and I could not find this file on my system anywhere. I am really hoping I wont have to install a new linux kernel. :)
Thanks for your input.

Pinning a pthread to a single CPU can be done using pthread_setaffinity_np
But what you want to achieve at the end is not so simple. I'll explain you why.
preempt.h is part of the Linux Kernel source. Its located here. You need to have kernel sources with you. Anyways, you need to write a kernel module to access it, you cannot use it from user space. Learn how to write a kernel module here. Same is the case with functions preempt_disable and other interrupt disabling kernel functions
Now the point is, pthreads are in user space and your preemption disabling function is in kernel space. How to interact?
Either you need to write a new system call of your own where you do your preemption and interrupt disabling and call it from user space. Or you need to resort to other Kernel-User Space Interfaces like procfs, sysfs, ioctl etc
But I am really skeptical as to how all these will help you to benchmark library functions. You may want to have a look at how performance is typically measured using rdtsc

intercepting file system system calls

I am writing an application for which I need to intercept some filesystem system calls eg. unlink. I would like to save some file say abc. If user deletes the file then I need to copy it to some other place. So I need unlink to call my code before deleting abc so that I could save it. I have gone through threads related to intercepting system calls but methods like LD_PRELOAD it wont work in my case because I want this to be secure and implemented in kernel so this method wont be useful. inotify notifies after the event so I could not be able to save it. Could you suggest any such method. I would like to implement this in a kernel module instead of modifying kernel code itself.
Another method as suggested by Graham Lee, I had thought of this method but it has some problems ,I need hardlink mirror of all the files it consumes no space but still could be problematic as I have to repeatedly mirror drive to keep my mirror up to date, also it won't work cross partition and on partition not supporting link so I want a solution through which I could attach hooks to the files/directories and then watch for changes instead of repeated scanning.
I would also like to add support for write of modified file for which I cannot use hard links.
I would like to intercept system calls by replacing system calls but I have not been able to find any method of doing that in linux > 3.0. Please suggest some method of doing that.

As far as hooking into the kernel and intercepting system calls go, this is something I do in a security module I wrote:
https://github.com/cormander/tpe-lkm
Look at hijacks.c and symbols.c for the code; how they're used is in the hijack_syscalls function inside security.c. I haven't tried this on linux > 3.0 yet, but the same basic concept should still work.
It's a bit tricky, and you may have to write a good deal of kernel code to do the file copy before the unlink, but it's possible here.

One suggestion could be Filesystems in Userspace (FUSE.) That is, write a FUSE module (which is, granted, in userspace) which intercepts filesystem-related syscalls, performs whatever tasks you want, and possibly calls the "default" syscall afterwards.
You could then mount certain directories with your FUSE filesystem and, for most of your cases, it seems like the default syscall behavior would not need to be overridden.

You can watch unlink events with inotify, though this might happen too late for your purposes (I don't know because I don't know your purposes, and you should experiment to find out). The in-kernel alternatives based on LSM (by which I mean SMACK, TOMOYO and friends) are really for Mandatory Access Control so may not be suitable for your purposes.

If you want to handle deletions only, you could keep a "shadow" directory of hardlinks (created via link) to the files being watched (via inotify, as suggested by Graham Lee).
If the original is now unlinked, you still have the shadow file to handle as you want to, without using a kernel module.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string