I understand that an Operating System forces security policies on users when they use the system and filesystem via the System Calls supplied by stated OS.
Is it possible to circumvent this security by implementing your own hardware instructions instead of making use of the supplied System Call Interface of the OS? Even writing a single bit to a file where you normally have no access to would be enough.
First, for simplicity, I'm considering the OS and Kernel are the same thing.
A CPU can be in different modes when executing code.
Lets say a hypothetical CPU has just two modes of execution (Supervisor and User)
When in Supervisor mode, you are allowed to execute any instructions, and you have full access to the hardware resources.
When in User mode, there is subset of instructions you don't have access to, such has instructions to deal with hardware or change the CPU mode. Trying to execute one of those instructions will cause the OS to be notified your application is misbehaving, and it will be terminated. This notification is done through interrupts. Also, when in User mode, you will only have access to a portion of the memory, so your application can't even touch memory it is not supposed to.
Now, the trick for this to work is that while in Supervisor Mode, you can switch to User Mode, since it's a less privileged mode, but while in User Mode, you can't go back to Supervisor Mode, since the instructions for that are not permitted anymore.
The only way to go back to Supervisor mode is through system calls, or interrupts. That enables the OS to have full control of the hardware.
A possible example how everything fits together for this hypothetical CPU:
The CPU boots in Supervisor mode
Since the CPU starts in Supervisor Mode, the first thing to run has access to the full system. This is the OS.
The OS setups the hardware anyway it wants, memory protections, etc.
The OS launches any application you want after configuring permissions for that application. Launching the application switches to User Mode.
The application is running, and only has access to the resources the OS allowed when launching it. Any access to hardware resources need to go through System Calls.
I've only explained the flow for a single application.
As a bonus to help you understand how this fits together with several applications running, a simplified view of how preemptive multitasking works:
In a real-world situation. The OS will setup an hardware timer before launching any applications.
When this timer expires, it causes the CPU to interrupt whatever it was doing (e.g: Running an application), switch to Supervisor Mode and execute code at a predetermined location, which belongs to the OS and applications don't have access to.
Since we're back into Supervisor Mode and running OS code, the OS now picks the next application to run, setups any required permissions, switches to User Mode and resumes that application.
This timer interrupts are how you get the illusion of multitasking. The OS keeps changing between applications quickly.
The bottom line here is that unless there are bugs in the OS (or the hardware design), the only way an application can go from User Mode to Supervisor Mode is through the OS itself with a System Call.
This is the mechanism I use in my hobby project (a virtual computer) https://github.com/ruifig/G4DevKit.
HW devices are connected to CPU trough bus, and CPU does use to communicate with them in/out instructions to read/write values at I/O ports (not used with current HW too much, in early age of home computers this was the common way), or a part of device memory is "mapped" into CPU address space, and CPU controls the device by writing values at defined locations in that shared memory.
All of this should be not accessible at "user level" context, where common applications are executed by OS (so application trying to write to that shared device memory would crash on illegal memory access, actually that piece of memory is usually not even mapped into user space, ie. not existing from user application point of view). Direct in/out instructions are blocked too at CPU level.
The device is controlled by the driver code, which is either run is specially configured user-level context, which has the particular ports and memory mapped (micro-kernel model, where drivers are not part of kernel, like OS MINIX). This architecture is more robust (crash in driver can't take down kernel, kernel can isolate problematic driver and restart it, or just kill it completely), but the context switches between kernel and user level are a very costly operation, so the throughput of data is hurt a bit.
Or the device drivers code runs on kernel-level (monolithic kernel model like Linux), so any vulnerability in driver code can attack the kernel directly (still not trivial, but lot more easier than trying to get tunnel out of user context trough some kernel bug). But the overall performance of I/O is better (especially with devices like graphics cards or RAID disc clusters, where the data bandwidth goes into GiBs per second). For example this is the reason why early USB drivers are such huge security risk, as they tend to be bugged a lot, so a specially crafted USB device can execute some rogue code from device in kernel-level context.
So, as Hyd already answered, under ordinary circumstances, when everything works as it should, user-level application should be not able to emit single bit outside of it's user sandbox, and suspicious behaviour outside of system calls will be either ignored, or crash the app.
If you find a way to break this rule, it's security vulnerability and those get usually patched ASAP, when the OS vendor gets notified about it.
Although some of the current problems are difficult to patch. For example "row hammering" of current DRAM chips can't be fixed at SW (OS) or CPU (configuration/firmware flash) level at all! Most of the current PC HW is vulnerable to this kind of attack.
Or in mobile world the devices are using the radiochips which are based on legacy designs, with closed source firmware developed years ago, so if you have enough resources to pay for a research on these, it's very likely you would be able to seize any particular device by fake BTS station sending malicious radio signal to the target device.
Etc... it's constant war between vendors with security researchers to patch all vulnerabilities, and hackers to find ideally zero day exploit, or at least picking up users who don't patch their devices/SW fast enough with known bugs.
Not normally. If it is possible it is because of an operating system software error. If the software error is discovered it is fixed fast as it is considered to be a software vulnerability, which equals bad news.
"System" calls execute at a higher processor level than the application: generally kernel mode (but system systems have multiple system level modes).
What you see as a "system" call is actually just a wrapper that sets up registers then triggers a Change Mode Exception of some kind (the method is system specific). The system exception hander dispatches to the appropriate system server.
You cannot just write your own function and do bad things. True, sometimes people find bugs that allow circumventing the system protections. As a general principle, you cannot access devices unless you do it through the system services.
I'm working on an embedded linux platform.
When I do "echo "mem" > /sys/power/state", system will suspend.
I know that kernel and driver can know that suspend operation's coming. But would it be possible that a user space process or application can get the notification that the system will suspend? How?
For example, I have an application who writes 'A' continually into a buffer whose start address is given by a device driver. Would it be possible that this application be notified that the system will suspend so that it could replace all this buffer with 'B' so that when driver is resumed, all what driver sees are 'B'?
Thanks a lot.
Been searching for the same thing. But unfortunately, I didn't find any user space notification during suspend/resume. Applications are just refrigerated/frozen and they will never know they are suspended.
However, one possible approach would be to add a generic netlink message sending or uevent from any driver's suspend/resume function that you can modify. Still the application may never get enough time to process it before it is frozen and might lead to race conditions. Say it received the suspend message and got frozen before it could process it. And once resumed, it will be processing the suspend message.
IMO, it is better to handle the scenario in the driver. Leave the user space alone.
I'm not sure whether it's useful for you in particular, given the mention of "embedded", however systemd can notify you over DBus: https://www.freedesktop.org/wiki/Software/systemd/inhibit/
I would like to ask your advice on the following: I need to write drivers for omap3, for accessing external dsp through fpga (through gpmc interface). The dsp is required to load file to dsp, and to read/write buffer from dsp. There is already FPGA driver in kernel. The kernel is 2.6.32. So I thought of the following options:
writing dsp driver in kernel, which uses the existing fpga driver.
writing a user space driver which interfaces with the fpga kernel driver.
writing user space driver using UIO , which will not use the kernel fpga driver, but shall do the access to fpga, as part of the user space single and complete dsp driver.
What do you think is prefered option ?
What is the advantage of kernel driver over user sace and vise versa ?
Thanks, Ran
* User-space driver:
Easier to debug.
Loads of libraries to support you.
Allows you to hide the details of your IP if you want to ( people will really hate you if you did! )
A crash won't affect the whole system.
Higher latency in handling interrupts since the kernel will have to relay the interrupt to the user space somehow.
You can't control access to your device from user space.
* Kernel-space driver:
Harder to debug.
Only linux kernel frameworks are supported.
You can always provide a binary blob to hide the details of your IP but this is annoying since it has to be generated against a specific kernel.
A crash will bring down the whole system.
Less latency in handling interrupts.
You can control access to your device from kernel space because it's a global context that all processes see.
As a kernel engineer I'm more comfortable/happy hacking code in a kernel context, that's probably why I would write the whole driver in the kernel.
However, I would say that the best thing to do is to divide the functionality of your driver into units and only put the unit in the kernel when there's a reason to do so.
For example:
If your device has a shared resource ( like an MMU, hardware FIFO ) and you want multiple processes to be able to use it safely, then probably you need some buffer manager to be in the kernel and all the processes would be communicating with it through ioctl.
If your driver needs to respond to an interrupt as fast as possible ( very low-latency ), then you will need to put the part of the code that handles interrupts in the kernel interrupt handler rather than putting it in user space and notifying user space when an interrupt occurs.
I'm working on one piece of a very high performance piece of hardware that works under Linux. We'd like to cache some data but we're worried about memory consumption - so the idea is to create a user process to manage the cache. That way, the cache can be in virtual memory, not in kernel space, et cetera.
The question is: what's the best way to do this? My first instinct is to have the kernel module create a character device file, and have a user program that opens that file, then sits on a select statement waiting for commands to arrive on it. But I'm concerned that this might not be optimal. A friend mentioned he knew of a socket-based interface, but when pressed he couldn't provide any details....
Any suggestions?
I think you're looking for the netlink interface. See Why and How to Use Netlink Socket [sic] for more information. Be careful of security issues when talking between the kernel and user space; there was a recent vulnerability when udev neglected to check that messages were coming from the kernel rather than user space.
When writing embedded ARM code, it's easy to access to the built-in zero wait state memory to accelerate your application. Windows CE doesn't expose this to user-mode applications, but there is probably a way to do it. The internal SRAM is usually used for the video buffer, but there's usually some left over. Anyone know how to do it?
Thanks,
Larry B.
Unfortunately you can't access the high speed ram from usermode-processes.
The only way to get access to it on a WindowsCE-OS is to write a driver, map the fixed address of the TCM into the user-mode process address space and pass it to the user-mode process.