Equivalent of VirtualProtectEx/CreateRemoteThread in Linux? - linux

I was wondering whether there is an equivalent, either in a library or as a syscall, of the Windows APIs that let one process interact with another process's address space and thereby modify the control flow of that second process.
The goal is to inject a .so into a running process without killing it.
Thanks!

Maybe take a look here: CreateRemoteThread in Linux
I don't know of a simpler way than the one described there. On Windows you have fancy APIs like VirtualProtectEx. On Linux you would write a .so that, for example, calls pthread_create in an __attribute__((constructor)) function, and then load that .so via the LD_PRELOAD mechanism (which takes effect when the process is started).
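A minimal sketch of such a shared object, assuming GCC or Clang on Linux. The names injected_main, inject_init and injected_started are made up for illustration, and the payload only prints a message:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

/* Set once the injected thread has started; only here so the effect is observable. */
static atomic_int injected_started;

/* Hypothetical payload: whatever should run inside the target process. */
static void *injected_main(void *arg)
{
    (void)arg;
    atomic_store(&injected_started, 1);
    fprintf(stderr, "[inject] thread running in pid %d\n", (int)getpid());
    return NULL;
}

/* Runs automatically when the dynamic loader maps this object, before main(). */
__attribute__((constructor))
static void inject_init(void)
{
    pthread_t tid;

    /* Start the payload on its own thread so the host program is not blocked. */
    if (pthread_create(&tid, NULL, injected_main, NULL) == 0)
        pthread_detach(tid);
}
```

Assuming the file is saved as inject.c (a hypothetical name), it could be built with `gcc -shared -fPIC -pthread -o libinject.so inject.c` and loaded with `LD_PRELOAD=./libinject.so ./target`.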
The next best thing to CreateRemoteThread would be manipulating the main thread of the process with the ptrace API. But this would involve:
1. Holding (stopping) a thread
2. Saving its context
3. Setting up the arguments for pthread_create
4. Setting the IP to pthread_create and letting it execute
5. Restoring the old context
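The hold, save and restore parts of that sequence can be sketched as follows. This is only an outline under stated assumptions: the function name hold_save_restore is made up, the register access is compiled only on x86-64, and the middle of the sequence (placing the pthread_create arguments in registers and redirecting the instruction pointer) is left as a comment because finding the address of pthread_create inside the target is specific to that process.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stops `pid`, snapshots its registers, writes them back unchanged and
 * lets the thread continue. Returns 0 on success, -1 on failure. */
int hold_save_restore(pid_t pid)
{
    int status;
    int rc = -1;

    /* Hold: attaching sends SIGSTOP; wait until the thread has stopped. */
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        goto detach;

#if defined(__x86_64__)
    {
        struct user_regs_struct saved;

        /* Save the context. */
        if (ptrace(PTRACE_GETREGS, pid, NULL, &saved) == -1)
            goto detach;

        /* A real injector would now work on a copy of `saved`: put the
         * pthread_create arguments in rdi, rsi, rdx and rcx, point rip at
         * pthread_create in the target, PTRACE_SETREGS the copy, resume
         * with PTRACE_CONT and wait for the call to return. */

        /* Restore the old context. */
        if (ptrace(PTRACE_SETREGS, pid, NULL, &saved) == -1)
            goto detach;
    }
#endif
    rc = 0;

detach:
    /* Detaching resumes the thread. */
    if (ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1)
        rc = -1;
    return rc;
}
```

Whether attaching is permitted depends on the system's ptrace restrictions. Tracing one's own child process, as a debugger does, is the case most likely to be allowed.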
I think manipulating the memory access rights would also involve calling mprotect from within the target process's context. As already mentioned above, the simplest way to do that would be to use a precompiled shared object rather than ptrace.

On Linux, there is a standard mechanism for injecting your code into a program. You define an environment variable, LD_PRELOAD, that names a .so library to be loaded before all other .so files. Functions in that .so replace the standard versions of those functions. There is no need to manually patch the assembly code of functions to insert hooks to your own code, as on Windows.
Here is a nice tutorial: https://rafalcieslak.wordpress.com/2013/04/02/dynamic-linker-tricks-using-ld_preload-to-cheat-inject-features-and-investigate-programs/
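As a minimal sketch of the idea (the choice of rand and the value 42 are arbitrary, picked only for illustration), a preloaded library can define a function with the same name and signature as the libc one:

```c
#include <stdlib.h>

/* Same name and signature as libc's rand(). Because the preloaded object is
 * searched first, callers in the program resolve to this definition. */
int rand(void)
{
    return 42; /* arbitrary fixed value chosen for the demonstration */
}
```

Assuming the file is called fixedrand.c (a hypothetical name), it could be built with `gcc -shared -fPIC -o libfixedrand.so fixedrand.c` and used as `LD_PRELOAD=./libfixedrand.so ./some_program`. A replacement that still needs the original function can usually look it up with dlsym(RTLD_NEXT, "rand") and forward to it.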

Related

Making a clone'd thread pthread compatible

I am programming in C on Linux x86-64. I'm using a library which creates a number of threads via a raw clone system call rather than using pthread_create. These threads run low-level code internal to the library.
I would like to hook one of these threads to introspect its behavior. Hooking the code is easy enough, but I've discovered that I can't call almost anything in libc because the thread state is not configured. pthread_create normally inserts a bunch of data into the thread-local storage area indexed by fs:. Some of that data, for example, is essential to libc's function, such as the function pointer encryption key (pointer_guard) and locale pointer.
So my question is: can I upgrade a clone'd thread to a full pthread via any mechanism? If not, is there any way that I can call C functions from a clone'd thread (such as printf, toupper, etc. which require libc's thread-local data)?
"Some of that data, for example, is essential to libc's function, such as the function pointer encryption key (pointer_guard) and locale pointer."
Correct. Don't forget about errno, which is also in there.
"can I upgrade a clone'd thread to a full pthread via any mechanism?"
No.
"is there any way that I can call C functions from a clone'd thread"
No.
If you have sources to the library, it should be relatively easy to replace direct clone calls with pthread_create.
If you do not, but the library is available in archive form, you may be able to use objcopy --redefine-sym to redirect its clone calls to a replacement (e.g. my_clone), which can then create a new thread via pthread_create and invoke the target function in that thread. Whether this will succeed depends greatly on how much the library cares about the details of the clone.
It's also probably not worth the trouble.
A better alternative may be to implement the introspection without calling into libc. Since your printf and toupper probably only need to deal with ASCII and C locale, it's not hard to implement limited versions of these functions and use direct system calls to write the output.
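A minimal sketch of that approach, assuming x86-64 Linux (where write is system call number 1) and GCC or Clang inline assembly. The helper names raw_write, raw_puts and ascii_toupper are made up for illustration, and none of them touch libc's thread-local data:

```c
/* x86-64 Linux only: invoke write(2) directly, bypassing libc and errno.
 * Returns the byte count, or a negative errno value on failure. */
static long raw_write(int fd, const void *buf, unsigned long len)
{
    long ret;

    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(1L), "D"((long)fd), "S"(buf), "d"(len)
                      : "rcx", "r11", "memory");
    return ret;
}

/* ASCII-only replacement for toupper(); never consults the locale. */
static int ascii_toupper(int c)
{
    return (c >= 'a' && c <= 'z') ? c - ('a' - 'A') : c;
}

/* Writes a NUL-terminated string to fd without calling into libc. */
static long raw_puts(int fd, const char *s)
{
    unsigned long len = 0;

    while (s[len] != '\0')
        len++;
    return raw_write(fd, s, len);
}
```

From the hooked thread, something like `raw_puts(2, "hook reached\n")` would then print to stderr. Formatting numbers would need a similar hand-written helper.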

Loading multiple copies of a shared library

I would like to be able to load multiple copies of a shared library into the same address space. I want them to not share any global variables, and I want the two copies to be unaware that the other has been loaded.
The use-case is parallel execution of a thread-unsafe library.
How can I do this
on Linux?
on OS X?
on Windows?
on *BSD?
on other Unix-like systems?
"The use-case is parallel execution of a thread-unsafe library."
Even if you manage to achieve the "not share any global variables" goal (which is hard), the library may still not work, because it could be calling into thread-unsafe routines in other libraries.
The obvious case is for the library to call strtok.
On Linux and Solaris, you could use dlmopen(LM_ID_NEWLM, ...); see the dlmopen man page.
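A minimal sketch of the dlmopen route, assuming glibc on Linux. The wrapper name load_private_copy is made up for illustration, and the library to load is whatever path the caller supplies:

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

/* Loads `path` into a brand-new link-map namespace, so the copy gets its own
 * globals and its own copies of the libraries it depends on.
 * Returns a handle usable with dlsym(), or NULL on failure (see dlerror()). */
void *load_private_copy(const char *path)
{
    return dlmopen(LM_ID_NEWLM, path, RTLD_NOW | RTLD_LOCAL);
}
```

Calling the wrapper twice with the same path should yield two independent handles, and dlsym on each handle then resolves a symbol inside that particular copy. glibc limits how many namespaces can exist at once, so this suits a handful of copies rather than hundreds.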

overriding functions of running process in linux

I am curious to know how to override functions of a running process in Linux so that the process calls my functions first.
We can use LD_PRELOAD to override functions in a binary, but it does not work for an already running process. Any suggestions, please?
What LD_PRELOAD allows you to do is force loading a shared object before any other. So if a function is already provided by this custom shared object, it will not be loaded again from the "standard" shared object.
In your case you want to modify an already loaded function. I believe that is not possible.
This would clearly be a security risk.
Most operating systems implement DEP and ASLR, which would prevent modifying the code and predicting the shared object's position in memory.

Multithreading Linux vs Windows

I am porting a Linux application to Windows, and I see that many changes are needed in the multithreading part.
What is the Windows equivalent of the Linux structure "pthread_t"?
What is the Windows equivalent of the Linux structure "pthread_attr_t"?
Can you please give me some tips for porting?
Thanks...
The equivalent to pthread_t would be (as is so often the case) a HANDLE on Windows - which is what CreateThread returns.
There is no direct equivalent of pthread_attr_t. Instead, the attributes of a thread, such as the stack size and whether the thread is initially suspended, are passed to CreateThread as arguments.
In the cases I saw so far, writing a small wrapper around pthreads so that you can have an alternative implementation for Windows was surprisingly simple. The most irritating thing for me was that on Windows, a Mutex is not the same thing as on Linux: on Windows, it's a handle which can be accessed from multiple processes. The thing which the pthread library calls mutex is called "critical section" on Windows.
That being said, if you find yourself writing more than just a few dozen lines of wrapper code, you might want to have a look at the C++11 thread library or at the thread support in Boost to avoid reinventing the wheel (and possibly getting it wrong).
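A rough sketch of what such a wrapper might look like. The names thread_handle, thread_start and thread_join are made up for illustration, and the Windows branch is untested here. It shows how the stack size travels through a pthread_attr_t on POSIX but is a plain argument to CreateThread on Windows:

```c
#include <stdlib.h>
#ifdef _WIN32
#include <windows.h>
typedef HANDLE thread_handle;
#else
#include <pthread.h>
typedef pthread_t thread_handle;
#endif

typedef void (*thread_fn)(void *);
struct thread_start_info { thread_fn fn; void *arg; };

/* Adapts the common signature to what each platform expects. */
#ifdef _WIN32
static DWORD WINAPI trampoline(LPVOID p)
#else
static void *trampoline(void *p)
#endif
{
    struct thread_start_info info = *(struct thread_start_info *)p;
    free(p);
    info.fn(info.arg);
    return 0;
}

/* stack_size == 0 means "use the platform default". Returns 0 on success. */
int thread_start(thread_handle *out, size_t stack_size, thread_fn fn, void *arg)
{
    struct thread_start_info *info = malloc(sizeof *info);
    if (info == NULL)
        return -1;
    info->fn = fn;
    info->arg = arg;
#ifdef _WIN32
    /* Attributes are plain arguments: security, stack size, creation flags. */
    *out = CreateThread(NULL, stack_size, trampoline, info, 0, NULL);
    if (*out == NULL) { free(info); return -1; }
#else
    pthread_attr_t attr;
    int rc;
    if (pthread_attr_init(&attr) != 0) { free(info); return -1; }
    if (stack_size != 0)
        pthread_attr_setstacksize(&attr, stack_size); /* default kept on failure */
    rc = pthread_create(out, &attr, trampoline, info);
    pthread_attr_destroy(&attr);
    if (rc != 0) { free(info); return -1; }
#endif
    return 0;
}

int thread_join(thread_handle t)
{
#ifdef _WIN32
    DWORD r = WaitForSingleObject(t, INFINITE);
    CloseHandle(t);
    return r == WAIT_OBJECT_0 ? 0 : -1;
#else
    return pthread_join(t, NULL) == 0 ? 0 : -1;
#endif
}

/* Tiny example thread body: sets the int it is given to 1. */
static void set_flag(void *p) { *(int *)p = 1; }
```

A caller would use it as `thread_start(&t, 0, set_flag, &flag)` followed by `thread_join(t)`.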
Here is your tip - "pthread is POSIX".
MinGW has pthreads,
Cygwin has pthreads, and so on.
My advice is to stick with MinGW and try not to change anything.

Is it possible using Linux's clone() system call to run multiple applications in the same address space?

If you pass the CLONE_VM flag to clone(), then the new process shares memory with the original. Can this be used to make two distinct applications (two main()'s) run in the same process? Ideally this would be as simple as calling clone() with CLONE_VM and then calling exec(), but I realize it's probably more involved. At the very least, I assume that the spawned application would need to be compiled to be relocatable (-fPIC). I realize I could always recode applications to be libraries, and create a master app spawning the other 'apps' as threads, but I'm curious whether this approach is possible.
Well, yes, that's what threads are, minus the "two distinct main()/application" part.
In fact, the reason clone(2) is there is to implement threads.
Clone(2) more-or-less requires you to declare a separate stack (if you don't it makes one), because without it the child won't be able to return from its current call level without destroying the parent's stack.
Once you start setting up stacks for each process then you might as well just use the posix thread library.
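For illustration, a minimal sketch of clone() with CLONE_VM and a caller-provided stack, assuming glibc's clone wrapper on Linux. The names child_fn, shared_value and run_child_sharing_memory are made up, and the child only stores a number so that the sharing becomes visible:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Written by the child, read by the parent: visible because of CLONE_VM. */
static volatile int shared_value;

static int child_fn(void *arg)
{
    shared_value = *(int *)arg;
    return 0;
}

/* Runs child_fn in a new task that shares this process's memory and
 * returns what the child stored, or -1 on failure. */
int run_child_sharing_memory(int value)
{
    const size_t stack_size = 64 * 1024;
    char *stack = malloc(stack_size);
    pid_t pid;

    if (stack == NULL)
        return -1;

    /* The stack grows down on most architectures, so pass its top.
     * SIGCHLD makes the child waitable with an ordinary waitpid(). */
    pid = clone(child_fn, stack + stack_size, CLONE_VM | SIGCHLD, &value);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    waitpid(pid, NULL, 0);
    free(stack);
    return shared_value;
}
```

Without CLONE_VM the child would write to its own copy of shared_value and the parent would still see the old value.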
As for the part where two different applications are loaded, calling execve(2) would most likely not be the way to do it. These days the kernel doesn't precisely run programs anyway. It's more typical that the image is set up to run the ELF dynamic loader, and that's all that the kernel really runs. The loader then mmap(2)s the program and its libraries into the address space. Certainly that could be done to get the "two distinct applications", and the thread scheduler would be happy to run them as two processes via clone(2).
Why not compile the applications into the same executable and just start them as threads in main?
What is the problem running them as separate tasks anyway? You can still share memory if you really want to.
Short answer: it's impossible.
Well, it's possible if you're willing to write your own custom ELF loader and simulate a lot of things that the kernel normally does for a process.
It's better to compile each of the apps into a library that exposes exactly one function, main (renamed to something else). Then the main stub program should link with the two libraries and call each one's exported function.
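A minimal sketch of such a stub, with both applications started as threads as suggested earlier in the thread. The names app_one_main, app_two_main and run_both_apps are hypothetical, and the two entry points are stand-ins defined inline so the example is self-contained. In practice they would come from the two libraries.

```c
#include <pthread.h>
#include <stdio.h>

/* Stand-ins for the two applications' renamed main() functions. */
int app_one_main(int argc, char **argv)
{
    (void)argc; (void)argv;
    puts("app one running");
    return 0;
}

int app_two_main(int argc, char **argv)
{
    (void)argc; (void)argv;
    puts("app two running");
    return 0;
}

struct app {
    int (*entry)(int, char **);
    int status;
};

/* Thread body: call one application's entry point and record its result. */
static void *run_app(void *p)
{
    struct app *app = p;
    char *argv[] = { "app", NULL };

    app->status = app->entry(1, argv);
    return NULL;
}

/* Starts both applications as threads in one address space and waits for
 * them. Returns 0 if both started and both returned 0. */
int run_both_apps(void)
{
    struct app apps[2] = { { app_one_main, -1 }, { app_two_main, -1 } };
    pthread_t tids[2];
    int started = 0;
    int i;

    for (i = 0; i < 2; i++) {
        if (pthread_create(&tids[i], NULL, run_app, &apps[i]) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    if (started != 2)
        return -1;
    return (apps[0].status == 0 && apps[1].status == 0) ? 0 : -1;
}
```

The stub's own main() would then just return the result of run_both_apps(). Global state such as signal handlers and the working directory is still shared between the two applications.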
