ioctl() calls from within Linux kernel? - linux

Is it possible from within a kernel module to perform an ioctl() call?
I ask because I have been trying for some time to figure out how to properly take down a network interface such as eth0 from a kernel module I wrote. I have had no luck: I can turn the interface off, but the kernel goes crazy afterwards, which leads me to believe I am doing it wrong.

System calls can be made from within the Linux kernel. The kernel provides wrapper routines to invoke system calls. Link: https://www.safaribooksonline.com/library/view/understanding-the-linux/0596002130/ch09s03.html
Make sure you do it from process context. Such practice is often discouraged, but I assume you know your design best. Good luck.

Related

How to trap system calls in a userspace library in Linux?

I have a requirement to write a linux device driver in userspace.
How can I write a library which, when linked to an application, can handle system calls to a particular device?
The application should be able to use open(), read(), write(), ioctl() on a device such as /dev/mydev0, but these calls should terminate in a userspace library instead of a kernel module.
Please advise whether this is possible and how I can achieve it.
Linux is a monolithic kernel, which means that in general what you're asking is not possible; you can't write arbitrary drivers in user mode.
You could (as your title alludes to), use ptrace(2) to trap on system calls, and basically redirect them to functions in your library. That is not a simple, straightforward solution however.
See also:
How to use ptrace(2) to change behaviour of syscalls?
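To give a feel for the ptrace(2) approach, here is a minimal sketch that only traps system calls and counts them; it does not redirect anything. The function name count_syscall_stops is made up for this example, and the traced child is just a forked copy of the same program that issues one getpid system call.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, trace it, and count its syscall stops
 * (each system call normally produces an entry stop and an exit stop).
 * Returns the number of stops seen, or -1 if tracing could not be set up. */
long count_syscall_stops(void)
{
    int status;
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: ask to be traced, then stop until the parent is ready. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        raise(SIGSTOP);
        syscall(SYS_getpid); /* a system call for the tracer to observe */
        _exit(0);
    }

    /* Parent: wait for the child's initial SIGSTOP. */
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* Mark syscall stops with bit 7 so they differ from ordinary signals. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) < 0) {
        kill(child, SIGKILL);
        waitpid(child, &status, 0);
        return -1;
    }

    long stops = 0;
    for (;;) {
        /* Resume the child until its next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, NULL) < 0)
            break;
        if (waitpid(child, &status, 0) < 0)
            break;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;
        if (WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80))
            stops++;
    }
    return stops;
}
```

A real redirecting tracer would additionally read and rewrite the child's registers at each stop, which is architecture-specific and is where most of the complexity mentioned above comes from.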
FUSE (Filesystem in USErspace) may be what you're looking for. It's a mechanism that allows filesystem drivers specifically to be implemented via a user-space process. This is how sshfs, for example, is implemented.
Resources:
http://fuse.sourceforge.net/

Can containers be implemented purely in userspace?

There are a bunch of container mechanisms for Linux now: LXC, Docker, lmctfy, OpenVZ, Linux-VServer, etc. All of these either involve kernel patches or recently added Linux features like cgroups and seccomp.
I'm wondering if it would be possible to implement similar (OS-level) virtualization purely in userspace.
There's already a precedent for this - User Mode Linux. However, it also requires special kernel features to be reasonably fast and secure. Also, it is literally a Linux kernel running in userspace, which makes networking setup rather difficult.
I'm thinking more along the lines of a process that would act as an intermediary between spawned programs and the Linux kernel. You would start the process with the programs to spawn as arguments; it would track system calls they made, and block or redirect attempts to access the real root filesystem, real network devices, etc. without itself relying on special kernel features.
Is such a thing possible to implement securely, and in a way that could be invoked effectively by a limited user (i.e. not privileged like chroot)?
In summary: would a pure userspace implementation of something like LXC be possible? If yes, what would the penalties be for doing it in userspace? If no, why not?
Surprisingly it turns out the answer is "yes": this is what systrace and sysjail do.
http://sysjail.bsd.lv/
And they are also inherently insecure on modern operating systems.
http://www.watson.org/~robert/2007woot/
So if you want proper sandboxing, it has to be done in kernel space.

Difference between system API and system call API

I have read that "system call APIs are for user-space access and system APIs are for system-space access". I am new to Linux OS concepts and don't have any knowledge of the System API. Can anyone explain the difference between these two?
A system call is an explicit request to the kernel, made via a software interrupt or a dedicated instruction such as syscall. It is the lowest-level interface through which a program talks to the operating system. System calls are intended to be very low-level interfaces, each exposing a specific piece of functionality that your program cannot accomplish on its own.
A System API, by contrast, is used to invoke system calls.
Read the Wikipedia pages on system calls and the Linux kernel first.
As Rahul Triparhi answered, system calls are the elementary operations, as seen from user-mode application software. Use strace(1) to find out which syscalls are made by some program.
The system calls are well documented in section 2 of the man pages (first type man man in a terminal on your Linux system). So read intro(2) and then syscalls(2).
Stricto sensu, syscalls have an interface, notably specified in ABI specifications like x86-64 ABI, defined at the lowest possible machine level - in terms of machine instructions and registers, etc... The functions in section 2 are tiny C wrappers above them. See also Linux Assembly HowTo
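As a rough illustration of the register-level interface versus the C wrapper, the sketch below issues getpid on x86-64 Linux with the syscall instruction directly, and falls back to the generic syscall(2) function on other architectures. The name raw_getpid is made up for this example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: invoke the getpid system call without the libc wrapper. */
long raw_getpid(void)
{
#if defined(__x86_64__)
    long ret;
    /* x86-64 Linux convention: syscall number in rax, result returned in rax;
     * the syscall instruction itself clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"((long)SYS_getpid)
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Other architectures: let libc's generic syscall(2) do the trap. */
    return syscall(SYS_getpid);
#endif
}
```

The value returned by raw_getpid() should match what the section 2 wrapper getpid() returns; the wrapper mainly hides the calling convention and translates error returns into errno.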
Please read also Advanced Linux Programming which explains quite well many of them.
BTW, I am not sure that "System API" has a well defined meaning, even if I guess what it could be. See also the several answers to this question.
Probably "System API" refers to the many functions standardized by POSIX, implemented in the POSIX C library such as GNU libc (but you could use some other libc on Linux, like MUSL libc, if you really wanted to). I am thinking of functions like dlopen (to dynamically load a plugin) or getaddrinfo(3) (to get information about some network things) etc... The Linux implementation (e.g. dlopen(3)) is providing a super-set of it.
More generally, the section 3 of man pages, see intro(3), is providing a lot of library functions (most of them built above system calls, so dlopen actually calls mmap(2) syscall, and getaddrinfo may use syscalls to connect to some server - see nsswitch.conf(5), etc...). But some library functions are probably not doing any syscall, like snprintf(3) or sqrt(3) or longjmp(3) .... (they are just doing internal computations without needing any additional kernel service).

Switching into (Linux) Kernel Mode

Linux n00b here. How does one switch from User Mode to Kernel Mode? I'm running Linux Ubuntu 12.10. Is there an interrupt that I can call using inline assembly code that will do this? If not, how can it be done?
I'm asking this question because I want to write an SCTP (network) protocol stack that has access to the kernel and runs constantly in the background, though the UI cannot directly access the kernel. I've never done anything like this before, so tips from pros would definitely be appreciated.
From a user program, the switch to kernel mode is made via system calls. In the case of network protocols these system calls are socket, listen, accept, ioctl, read, write, recvmsg, etc.
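As a small sketch of that point, the function below asks the kernel for an SCTP socket through the ordinary socket() call; the switch into kernel mode happens inside that call. The name try_sctp_socket is made up for this example, and whether the request succeeds depends on SCTP support being available in the running kernel.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: request a one-to-one style SCTP socket from the kernel.
 * Returns 0 if the kernel created it, or the errno value if it refused
 * (for example when SCTP support is not available). */
int try_sctp_socket(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

The user program never executes kernel code itself; it only passes arguments across the system call boundary and gets a file descriptor or an error back.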
You write a Linux kernel module. There is already an SCTP protocol stack for Linux, though. You would likely be better off modifying it to do what you want.
Once you have written and compiled your module, you can load it into the kernel with insmod and unload it with rmmod. In my experience you rarely get a chance to use rmmod, because if you made a mistake the system crashes or freezes. So use a virtual machine for your testing: it is faster to reboot, you lose less data, and it is easier to hook up a virtual serial console for debugging.
I am sure this question is a duplicate by the way. You can find a lot of questions on this topic.

New linux kernels, no lsm using lkms, no kernel hooks now what?

For security reasons, starting with version 2.6.24 the kernel ceased to export the symbols necessary for writing security modules in the form of loadable kernel modules (LKMs).
And you can't export sys_call_table, again for security reasons.
But then, how can I filter filesystem requests?
I'll state it simply: I want to hook the "open" function!
I don't want to have to compile my own version of the kernel; otherwise, what's the point of drivers? It should work for all kernels.
Please help, thought I would have more freedom than Windows with Linux, but now I see the most precious parts of my life are blocked in Linux.
I've written a kernel module that can do this called tpe-lkm. I've also mentioned it on some other questions similar to this here on StackOverflow:
access to the sys_call_table in kernel 2.6+
Reading kernel memory using a module
intercepting file system system calls
Hope one of these helps you out.
