Overriding functions of a running process in Linux

I am curious to know how to override functions of a running process in Linux so that the process calls my functions first.
We can use LD_PRELOAD to override a function in a binary, but it does not work for an already running process. Any suggestions, please?

What LD_PRELOAD allows you to do is force a shared object to be loaded before any other. If that custom shared object already provides a function, the dynamic linker resolves calls to that definition instead of the one in the "standard" shared object.
In your case you want to modify a function that is already loaded. I believe that is not possible.
This would clearly be a security risk.
Most operating systems implement DEP and ASLR, which would prevent modifying the shared object and predicting its position in memory.
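For reference, a preloaded shared object that interposes a function might look like the following minimal sketch in C. It assumes glibc (for RTLD_NEXT) and uses puts only as a convenient example; the counter interposed_calls and the file names in the usage note are hypothetical and exist only to make the effect visible. It applies to processes started with LD_PRELOAD set, not to ones already running.

```c
#define _GNU_SOURCE          /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical counter, present only so the interposition can be observed. */
int interposed_calls = 0;

/* Same signature as libc's puts(). When this object comes earlier in the
 * symbol search order, calls to puts() resolve to this definition. */
int puts(const char *s)
{
    static int (*real_puts)(const char *) = NULL;

    if (real_puts == NULL) {
        /* RTLD_NEXT asks for the next definition after this object,
         * which is normally the one in libc. */
        real_puts = (int (*)(const char *))dlsym(RTLD_NEXT, "puts");
        if (real_puts == NULL)
            return EOF;
    }

    interposed_calls++;      /* "my function runs first" */
    return real_puts(s);     /* then forward to the original */
}
```

Built with something like gcc -shared -fPIC -o libhook.so hook.c -ldl and started as LD_PRELOAD=./libhook.so ./program, the wrapper would run on every puts call the program makes through the dynamic linker.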

Related

Loading multiple copies of a shared library

I would like to be able to load multiple copies of a shared library into the same address space. I want them to not share any global variables, and I want the two copies to be unaware that the other has been loaded.
The use-case is parallel execution of a thread-unsafe library.
How can I do this
on Linux?
on OS X?
on Windows?
on *BSD?
on other Unix-like systems?
"The use-case is parallel execution of a thread-unsafe library."
Even if you manage to achieve the "not share any global variables" goal (which is hard), the library may still not work, because it could be calling into thread-unsafe routines in other libraries.
The obvious example is a library that calls strtok, which keeps hidden static state between calls.
On Linux and Solaris, you could use dlmopen(LM_ID_NEWLM, ...); see its man page.
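A minimal sketch of the dlmopen approach in C, assuming glibc on Linux. The helper names load_private_copy and symbol_in_copy are hypothetical. Each call with LM_ID_NEWLM asks the loader for a fresh link-map namespace, so the library and its dependencies are mapped again with their own globals. The number of namespaces is limited, so this does not scale to many copies.

```c
#define _GNU_SOURCE          /* needed for dlmopen() and LM_ID_NEWLM */
#include <dlfcn.h>
#include <stddef.h>

/* Load `path` into a brand-new namespace. Returns a handle or NULL. */
void *load_private_copy(const char *path)
{
    return dlmopen(LM_ID_NEWLM, path, RTLD_NOW | RTLD_LOCAL);
}

/* Look up `symbol` in one particular copy of the library. */
void *symbol_in_copy(void *handle, const char *symbol)
{
    return handle != NULL ? dlsym(handle, symbol) : NULL;
}
```

Two handles obtained this way for the same path refer to separate mappings, so a symbol looked up through each handle has a different address. As noted above, this alone does not make the library safe if it calls into shared, thread-unsafe state elsewhere.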

Equivalent of VirtualProtectEx/CreateRemoteThread in Linux?

I was wondering if there is an equivalent, either in a library or as a syscall, of the Windows APIs that allow a process to interact with another process's address space and thereby modify the flow of that second process.
This is to inject a .so in a running process without killing it.
Thanks!
Maybe take a look here: CreateRemoteThread in Linux
I don't know of a simpler way than the one described there. On Windows you have fancy APIs such as VirtualProtectEx. On Linux you would write a .so that, for example, calls pthread_create in an __attribute__((constructor)) function. Then you would load that .so via the LD_PRELOAD mechanism.
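The constructor idea can be sketched in C as follows. The names injected_thread, injected_ran and injected_main are hypothetical, and the thread body is a placeholder. The constructor runs when the dynamic linker loads the object, before the host program's main().

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical globals, exposed only so the effect can be observed. */
pthread_t injected_thread;
atomic_int injected_ran = 0;

/* Body of the extra thread; a real payload would go here. */
static void *injected_main(void *arg)
{
    (void)arg;
    atomic_store(&injected_ran, 1);
    return NULL;
}

/* Called automatically at load time because of the constructor attribute. */
__attribute__((constructor))
static void on_load(void)
{
    if (pthread_create(&injected_thread, NULL, injected_main, NULL) != 0)
        atomic_store(&injected_ran, -1);   /* record the failure */
}
```

Compiled as a shared object and named in LD_PRELOAD, this would start the extra thread inside the target process at startup. It does not help with a process that is already running.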
The next best thing to CreateRemoteThread would be manipulating the main thread of the process with the ptrace API. But this would involve:
Holding a thread
Saving its context
Setting arguments for pthread_create
Setting the IP to pthread_create and executing
Restoring the old context
I think manipulating the memory access rights would also involve calling mprotect from the context of the target process. As already mentioned above, the simplest way to do that would not be ptrace but a precompiled shared object.
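The first, second and last of those steps can be sketched in C with ptrace. This is only an outline under assumptions: the helper name hold_save_restore is hypothetical, the register set is read with PTRACE_GETREGSET and NT_PRSTATUS, and the architecture-specific part (placing arguments and pointing the instruction pointer at pthread_create) is deliberately left out. Attaching may also be refused by system policy unless the target is a child or the caller is privileged.

```c
#include <elf.h>         /* NT_PRSTATUS */
#include <signal.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>     /* struct iovec */
#include <sys/user.h>    /* struct user_regs_struct */
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 on success, -1 on failure. */
int hold_save_restore(pid_t pid)
{
    struct user_regs_struct saved;
    struct iovec iov = { .iov_base = &saved, .iov_len = sizeof saved };
    void *regset = (void *)(uintptr_t)NT_PRSTATUS;
    int status;

    /* Hold the thread: attaching stops it with SIGSTOP. */
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        goto fail;

    /* Save its context (general-purpose registers). */
    if (ptrace(PTRACE_GETREGSET, pid, regset, &iov) == -1)
        goto fail;

    /* An injector would now modify a copy of the registers, resume the
     * thread, and wait for it to stop again. That part is omitted here. */

    /* Restore the old context and let the process continue. */
    if (ptrace(PTRACE_SETREGSET, pid, regset, &iov) == -1)
        goto fail;
    return ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1 ? -1 : 0;

fail:
    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    return -1;
}
```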
On Linux, there is a standard mechanism for injecting your code into a program. You define an environment variable, LD_PRELOAD, that names a .so library to be loaded before all other .so files. Functions in that .so replace the standard versions of the same functions. There is no need to patch the assembly code of functions manually to insert hooks to your own code, as on Windows.
Here is a nice tutorial: https://rafalcieslak.wordpress.com/2013/04/02/dynamic-linker-tricks-using-ld_preload-to-cheat-inject-features-and-investigate-programs/

shared libraries (dlopen) and thread-safety of library static pointers

When I load a shared library dynamically, for example with dlopen on linux, do I have to worry about the visibility of the loaded library between processors, or will it be automatically fenced/ensured safe?
For example, say I have this function in the loaded library:
char const * get_string()
{ return "literal"; }
In the main program, using such a string-literal pointer is safe between multiple threads, as they are all guaranteed to see its initial value. However, I'm wondering how the rules of "initial values" really apply to a loaded library (as the standard doesn't deal much with it).
Say that I load the library, then immediately call the get_string function. I pass the pointer to another thread via an atomic with no memory ordering (relaxed in C++11 parlance). Can the other thread use this pointer safely without having to issue any load fence or other synchronization instruction?
My assumption is that it is safe. Perhaps because the new library will be loaded into new pages the other core cannot have them loaded yet, and thus cannot have old visibility on them?
I would like some kind of authoritative reference as part of the answer if possible. Or a technical description of how it is made thread-safe by default. Or of course a refutation if it isn't thread-safe on its own.
Your question is: will dlopen() load all my library code properly before returning? Yes, it will. Otherwise you would have the problem even with a single thread. It would be very difficult to handle if you had to sleep while dlopen completed asynchronously. It will also perform various checks and initialize whatever needs initializing before you have a chance to get the function pointer you are looking for. That means that once you have that pointer, everything is in place and you can use it directly in any thread.
Now of course, you need to pass that pointer with the usual thread safety, but I assume you know how.
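One way to read "the usual thread safety" is a release/acquire handoff. The sketch below is in C11 rather than C++, assumes glibc's libm.so.6 as a stand-in library, and uses hypothetical names (loaded_fn, state, call_from_other_thread). It is a conservative pattern, not a statement about whether a relaxed store would also be enough.

```c
#include <dlfcn.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

typedef double (*unary_fn)(double);

static unary_fn loaded_fn;        /* plain data, written before publishing */
static atomic_int state = 0;      /* 0 = waiting, 1 = ready, -1 = failed */

/* Already-running thread that waits for the pointer to be published. */
static void *consumer(void *out)
{
    int s;
    while ((s = atomic_load_explicit(&state, memory_order_acquire)) == 0)
        sched_yield();
    if (s == 1)
        *(double *)out = loaded_fn(0.0);
    return NULL;
}

/* Returns 0 on success; *result then holds cos(0.0) computed on the
 * consumer thread through the dynamically loaded function. */
int call_from_other_thread(double *result)
{
    pthread_t tid;
    void *handle;
    int ok = 0;

    atomic_store(&state, 0);
    if (pthread_create(&tid, NULL, consumer, result) != 0)
        return -1;

    handle = dlopen("libm.so.6", RTLD_NOW);
    if (handle != NULL) {
        loaded_fn = (unary_fn)dlsym(handle, "cos");
        ok = (loaded_fn != NULL);
    }
    /* The release store makes the write to loaded_fn visible to the
     * acquire load in consumer(). */
    atomic_store_explicit(&state, ok ? 1 : -1, memory_order_release);

    pthread_join(tid, NULL);
    if (handle != NULL)
        dlclose(handle);
    return ok ? 0 : -1;
}
```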
Please be aware that static initialization and modules don't play well together (see all the other questions on SO about that subject).
Your comment on cores is strange. Cores don't load memory. They prefetch it in their cache, but that's not a problem, just a bit slow.
I'll expand on what Basile said. I followed up with glibc and found out that dlopen there does indeed use mmap. All guarantees of memory visibility are assumed from the mmap system call; dlopen itself doesn't make any additional guarantees.
Users of mmap generally assume that it will map memory correctly across all processors at the point of its return such that visibility is not a concern. This does not appear to be an explicit guarantee, but the OS would probably be unusable without such a guarantee. There is also no known system where this doesn't work as expected.

Is it safe to call a dll function from multiple threads in a single application?

I am writing a server application in Delphi 2009 that implements several types of authentication. Each authentication method is stored in a separate dll. The first time an authentication method is used the appropriate dll is loaded. The dll is only released when the application closes.
Is it safe to access the dlls without any form of synchronisation between the server threads (connections)?
Short answer:
Yes, it is generally possible to call a DLL function from multiple threads, since every thread has its own stack and calling a DLL function is more or less the same as calling any other function of your own code.
Long answer:
Whether it is actually safe depends on whether the DLL functions use shared mutable state.
For example, if you do something like this:
DLL_SetUser(UserName, Password);
if DLL_IsAuthenticated then
begin
...
end;
Then it is most certainly not safe to be used from different threads. In this example you can't guarantee that between DLL_SetUser and DLL_IsAuthenticated no other thread makes a different call to DLL_SetUser.
But if the DLL functions do not depend on some kind of predefined state, i.e. all necessary parameters are available at once and all other configuration is the same for all threads, you can assume it'll work.
if DLL_IsAuthenticated(UserName, Password) then
begin
...
end;
But be careful: a DLL function might look atomic but internally use something that isn't. If, for example, the DLL creates a temporary file that always has the same name, or it accesses a database that can only handle one request at a time, that counts as shared state. (Sorry, I couldn't think of better examples.)
Summary:
If the DLL vendors say that their DLLs are thread-safe, I'd use them from multiple threads without locking. If they are not, or even if the vendors don't know, you should play it safe and use locking.
At least until you run into performance problems. In that case you could try creating multiple applications/processes which wrap your DLL calls and use them as proxies.
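The "play it safe and use locking" advice can be sketched as follows. The example is in C rather than Delphi, and lib_set_user and lib_is_authenticated are hypothetical stand-ins for a stateful, thread-unsafe DLL API like the DLL_SetUser / DLL_IsAuthenticated pair above. The hard-coded credentials exist only to make the sketch self-contained.

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical stand-ins for a stateful library API. */
static const char *current_user = "";
static const char *current_password = "";

static void lib_set_user(const char *user, const char *password)
{
    current_user = user;            /* shared mutable state */
    current_password = password;
}

static int lib_is_authenticated(void)
{
    return strcmp(current_user, "alice") == 0 &&
           strcmp(current_password, "secret") == 0;
}

/* Caller-side protection: one lock around the whole two-step sequence. */
static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;

int check_login(const char *user, const char *password)
{
    int ok;

    pthread_mutex_lock(&lib_lock);
    lib_set_user(user, password);   /* no other thread can interleave here */
    ok = lib_is_authenticated();
    pthread_mutex_unlock(&lib_lock);
    return ok;
}
```

The lock has to cover both calls together. Locking each call separately would still let another thread change the user between them.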
For your DLLs to be thread-safe you need to protect all shared data structures that multiple threads in one process could access concurrently - there is no difference here between writing code for a DLL vs. writing code for an executable.
For multiple processes there is no risk in concurrent access, as each process gets its own data segment for the DLL, so variables with the same name are actually different when seen from different processes. It is actually much more difficult to provide data in a DLL that is the same from different processes, you would basically need to implement the same things you would use for data exchange between processes.
Note that a DLL is special in that you get notifications when a process or a thread attaches to or detaches from a DLL. See the documentation for the DllMain Callback Function for an explanation, and this article for an example how to use this in a Delphi-written DLL. So if your threads are not completely independent from each other (and no shared data is write-accessed), then you will need some shared data structures with synchronized access. The various notifications may help you with properly setting up any data structures in your DLL.
If your DLLs allow for completely independent execution of the exported functions, also do check out the threadvar thread-specific variables. Note that for them initialization and finalization sections are not usable, but maybe the thread notifications can help you there as well.
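For readers more familiar with C, the closest analogue of Delphi's threadvar is a _Thread_local variable. The following sketch uses hypothetical names and only illustrates that each thread sees its own copy. It is not Delphi code and says nothing about Delphi's initialization rules.

```c
#include <pthread.h>

/* Each thread gets its own instance of this variable. */
static _Thread_local int per_thread_counter = 0;

/* Increment the thread's private counter *arg times, then report it back. */
static void *worker(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        per_thread_counter++;
    *(int *)arg = per_thread_counter;
    return NULL;
}

/* Runs two workers concurrently; returns the calling thread's own counter,
 * which the workers never touch. */
int run_two_workers(int *a, int *b)
{
    pthread_t ta, tb;

    if (pthread_create(&ta, NULL, worker, a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, worker, b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return per_thread_counter;
}
```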
Here's the thing: you cannot assume that the DLLs are thread-safe if you don't have control over the source code (or documentation stating that they are), so you should assume they are not.
If you are talking about Win32 DLLs, they're meant to be safe when called by multiple threads or applications. I don't know what your DLL does, but if it uses a lockable resource like a file or a port, then there could be trouble, depending on the implementation inside the DLL.
I am not familiar with the working of Delphi 2009 authentication DLLs. Maybe you should add that information to the headline (that you're talking specifically about the Delphi 2009 DLLs)

Is it possible using Linux's clone() system call to run multiple applications in the same address space?

If you pass the CLONE_VM flag to clone(), then the new process shares memory with the original. Can this be used to make two distinct applications (two main()'s) run in the same process? Ideally this would be as simple as calling clone() with CLONE_VM and then calling exec(), but I realize it's probably more involved. At the very least, I assume that the spawned application would need to be compiled to be relocatable (-fPIC). I realize I could always recode applications to be libraries, and create a master app spawning the other 'apps' as threads, but I'm curious if this approach is possible.
Well, yes, that's what threads are, minus the "two distinct main()/application" part.
In fact, the reason clone(2) is there is to implement threads.
clone(2) more or less requires you to provide a separate stack (the glibc wrapper fails with EINVAL if you pass none), because without it the child won't be able to return from its current call level without destroying the parent's stack.
Once you start setting up stacks for each process then you might as well just use the posix thread library.
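A minimal sketch of clone() with CLONE_VM and an explicitly allocated stack, assuming Linux with glibc. The names shared_value, child_main and run_child_in_shared_memory, and the stack size, are hypothetical choices for illustration. The child writes to a global, and the parent sees the write because both run in the same address space.

```c
#define _GNU_SOURCE          /* needed for clone() */
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (256 * 1024)

static volatile int shared_value = 0;

/* Entry point of the child; it runs on the stack passed to clone(). */
static int child_main(void *arg)
{
    shared_value = *(int *)arg;   /* visible to the parent due to CLONE_VM */
    return 0;                     /* becomes the child's exit status */
}

/* Returns the value the child wrote, or -1 on error. */
int run_child_in_shared_memory(int value)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    pid_t pid;
    int status;

    if (stack == NULL)
        return -1;

    /* The stack grows downward on common architectures, so pass its top.
     * SIGCHLD lets the parent wait for the child with waitpid(). */
    pid = clone(child_main, stack + CHILD_STACK_SIZE,
                CLONE_VM | SIGCHLD, &value);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    if (waitpid(pid, &status, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);

    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return shared_value;
}
```

This is the bookkeeping the POSIX thread library already does for you, which is the point made above.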
As for the part where two different applications are loaded, calling execve(2) would most likely not be the way to do it. These days the kernel doesn't precisely run programs anyway. It's more typical that the image is set up to run the ELF dynamic loader, and that's all that the kernel really runs. The loader then mmap(2)s the program and its libraries into the address space. Certainly that could be done to get the "two distinct applications", and the thread scheduler would be happy to run them as two processes via clone(2).
Why not compile the applications into the same executable and just start them as threads in main?
What is the problem running them as separate tasks anyway? You can still share memory if you really want to.
Short answer: it's impossible.
Well, it's possible if you're willing to write your own custom ELF loader and simulate a lot of things that the kernel normally does for a process.
It's better to compile each of the apps into a library that exposes exactly one function, main (renamed to something else). Then the main stub program should link with the two libraries and call each one's exported function.
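That suggestion can be sketched in C as follows. The functions app_one_main and app_two_main are hypothetical placeholders for the two programs' renamed main functions; in practice each would live in its own library. The stub runs both as threads in one process.

```c
#include <pthread.h>

/* Hypothetical entry points: each former program's main(), renamed. */
static int app_one_main(void) { return 1; }
static int app_two_main(void) { return 2; }

struct app {
    int (*entry)(void);   /* renamed main of the application */
    int exit_code;        /* filled in when the application returns */
};

/* Thread body: run one application and record its return value. */
static void *run_app(void *p)
{
    struct app *a = p;
    a->exit_code = a->entry();
    return NULL;
}

/* Start both applications as threads and wait for them.
 * Returns 0 on success, -1 if a thread could not be created. */
int run_both_apps(int *code_one, int *code_two)
{
    struct app one = { app_one_main, -1 };
    struct app two = { app_two_main, -1 };
    pthread_t t1, t2;

    if (pthread_create(&t1, NULL, run_app, &one) != 0)
        return -1;
    if (pthread_create(&t2, NULL, run_app, &two) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    *code_one = one.exit_code;
    *code_two = two.exit_code;
    return 0;
}
```

Both applications then share globals, file descriptors and the heap, so anything either one assumed it owned exclusively (signal handlers, the working directory, exit()) needs a second look.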
