Assume a ready-to-run linked (C++) program a.out created with g++ and ld. Is there a way to create a shared object library liba.so that contains all functions, or at least main, of the original program?
Consider writing a miniature wrapper library with a single function main that executes your a.out... That way you need not try to do any decompilation magic at all.
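A minimal sketch of that idea (the path to the original binary is a placeholder you would adjust; here the wrapper simply exec's it):

/* wrapper.c - sketch of a "main that runs a.out" wrapper.
 * A_OUT_PATH is a placeholder; point it at your real binary. */
#include <unistd.h>

#define A_OUT_PATH "./a.out"

int main(int argc, char **argv)
{
    (void)argc;
    execv(A_OUT_PATH, argv);  /* replace this process with the original program */
    return 127;               /* reached only if execv() failed */
}

Build it as a shared object with something like gcc -shared -fPIC wrapper.c -o liba.so; a program linked against liba.so can then resolve main from it and ends up running the original a.out.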
Suppose there exist three shared libraries A.so, B.so and C.so, each having a function f(). I want to switch between these f() implementations under certain circumstances, determined at runtime. Using the LD_PRELOAD trick, I will execute the program p (i.e., the user of these libraries) as follows:
LD_PRELOAD=A.so:B.so:C.so ./p
f() in A.so will be the default. In that f() instance, I can access f() of B.so using dlsym(), as follows:
#define _GNU_SOURCE   /* needed for RTLD_NEXT */
#include <dlfcn.h>

void f() // in A.so
{
    ...
    /* resolve the next definition of f() in search order, i.e. the one in B.so */
    void *f_in_B = dlsym(RTLD_NEXT, "f");
    ...
}
How can I access the f() instance in C.so?
UPDATE:
Although yugr's answer works in the simple case, it has problems under more general conditions. I will shed some light on the problem by giving more details about the current situation:
I need two different dynamic memory allocators and do not want to deal with the internal implementation details of glibc malloc(). Put simply, I have two separate memory areas, each with its own glibc. I use LD_PRELOAD to switch between the allocators based on some runtime condition, and I used yugr's answer to load and access the secondary library.
First, I call ptmalloc_init() in the secondary library to initialize its malloc() data structures. I have also managed brk() calls at the OS system-call level in such a way that each library has its own large brk() range, which avoids further conflicts.
The problem is that the solution works only at the application/glibc boundary. For example, when I use malloc() in the secondary library, it internally calls functions such as __default_morecore() in the primary library, and these calls cannot be intercepted by the LD_PRELOAD trick.
Why does this happen? I thought that the internal symbols of a library are resolved internally at library compile time, but here it seems that they resolve to the symbols in the primary LD_PRELOAD library. If the resolution is done by the dynamic linker, why are these seemingly internal symbols not captured by the LD_PRELOAD trick?
How can I fix the problem? Should I rename all the exported functions in the secondary library? Or is the only feasible approach to use a single library and delve into its implementation details?
You can dlopen each of the three libraries at startup to obtain their handles and use those to call dlsym(h, "f") to get particular implementations of f. If you don't know library names in advance you could use dl_iterate_phdr to obtain them (see this code for example).
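A minimal sketch of that approach, with the library names hard-coded for illustration and only rudimentary error handling (this assumes f is exported with C linkage, i.e. an unmangled symbol, as in the RTLD_NEXT example above):

/* p.c - sketch: load A.so, B.so and C.so explicitly and pick an f() at runtime. */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

typedef void (*f_fn)(void);

static f_fn load_f(const char *lib)
{
    void *h = dlopen(lib, RTLD_NOW | RTLD_LOCAL);
    if (!h) { fprintf(stderr, "%s\n", dlerror()); exit(1); }
    f_fn f = (f_fn)dlsym(h, "f");
    if (!f) { fprintf(stderr, "%s\n", dlerror()); exit(1); }
    return f;
}

int main(void)
{
    f_fn f_a = load_f("./A.so");
    f_fn f_b = load_f("./B.so");
    f_fn f_c = load_f("./C.so");

    /* call whichever implementation the runtime condition selects */
    f_c();
    f_b();
    f_a();
    return 0;
}

With explicit handles there is no need for the RTLD_NEXT chain at all, so the order of the libraries in LD_PRELOAD stops mattering for this particular lookup.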
I am newing a heap object in a regular DLL. I export it properly with __declspec(dllexport) and import it in the EXE with __declspec(dllimport) linkage. As long as I am in the DLL, the object is defined properly, but when executing/debugging in the EXE, the object is undefined. What am I missing? Name mangling? Would wrapping the exports in extern "C" (to avoid mangling) help?
Further explanation:
@Colin Robertson My problem stems from the prototype using extension DLLs, whose code is integrated with the EXE at compile time. I knew my app would need to access objects in the DLL directly from the EXE, which is okay with Windows extension DLLs because of that code integration. But the prototype turned out to be a memory hog, as my app creates many DLLs during execution, each of which got integrated, dynamically I might add, into the running executable. Therefore, the production code had to use a regular DLL, which has automatic reference counting (DllMain, etc.) as long as it isn't statically linked. Which brings me to my current problem: how do I access the DLL object from within the EXE?
As such, the discussion in your links regarding the passing of an allocator is not relevant. Item 60 ("Avoid allocating and deallocating memory in different modules") in Sutter and Alexandrescu's book does not apply, since the EXE is not responsible for the object's lifecycle. Also, since I am using shared libraries, the following is true: "Specifically, if you use the Dll runtime option, then a single dll - msvcrtxx.dll - manages a single freestore that is shared between all dll's, and the exe, that are linked against that dll." (see StackOverflow's "Who allocates heap to my DLL?", a thread that was closed by poke, Linus Kleen, mauris, Cody Gray, and miku for some reason). My code does not mix the allocation/deallocation responsibilities of the DLL with the usage requests of the EXE.
I think the problem lies in the fact that, in a regular DLL, using a pointer to an object allocated on another module's heap, running in a different thread with its own message pump, is disastrous and is censured by the compiler. This is as it should be.
But my problem is still legitimate.
I see two ways Windows solves such a situation. One is the Send/PostMessage call, which posts messages on other threads' queues, and the other is COM marshalling. For the former, I would have a problem with the return value: since what I am doing is basically a remote procedure call, my EXE wants results back from the DLL, and SendMessage only returns an LRESULT. As for the latter, this is exactly what COM does when it marshals a pointer in an apartment-threaded app (see "Single-Threaded Apartments" in MSDN). COM is designed to let you pass pointers between threads, or even processes. There might be a third, C++ way, which is to use the Pimpl idiom (see http://www.c2.com/cgi/wiki?PimplIdiom), but this method is a lot more work and has drawbacks. Thanks to MVP Scott McPhillips for this suggestion.
Does anyone have advice or experience on which way to proceed?
Don't do that. This is item 60 in Sutter and Alexandrescu's C++ Coding Standards book, which I highly recommend. Separate modules may use their own versions of the run time library, including the basic allocation routines. Things allocated on one module's heap may be inaccessible from another module, or have different conventions for allocating and freeing them. The name mangling conventions can be different, but that's the least of your worries. Here's another StackOverflow question that has more detailed answers for why this is a bad idea, and what to do instead: Is it bad practice to allocate memory in a DLL and give a pointer to it to a client app?
Problem Statement
I'm trying to get the address of a running thread's start_routine as passed in the pthread_create() call.
Research so far
It is apparently not in /proc/[tid]/stat or /proc/[tid]/status.
I found that start_routine is a member of struct pthread and gets set by pthread_create.[1]
If I knew the address of this struct, I could read the start_routine address.
I also found td_thr_get_info defined in the debugging library thread_db.h.[2]
It fills a struct with information about the thread, including the start function.[3] But it needs a struct td_thragent as an argument and I don't know how to create it properly.
Links
[1] http://fxr.watson.org/fxr/source/nptl/pthread_create.c?v=GLIBC27;im=excerpts#L455
[2] http://fxr.watson.org/fxr/source/nptl_db/td_thr_get_info.c?v=GLIBC27#L27
[3] See comment, because I'm not allowed to post more than 2 links.
You probably can't, and I can even imagine a rather wild scenario in which it no longer exists at the moment you query it.
Let's suppose that the initial thread start routine void *foo_start(void *) is in some dlopen-ed dynamic shared library libfoo.so.
Let's imagine that foo_start makes a tail call to bar, and that bar dlclose-s libfoo.so and later calls some routine of yours that queries that start address. It is now a wild address in some defunct segment (one that has been munmap-ed by the dlclose called from bar)!
So, even if you hack your libc to retrieve the start routine of a thread, that does not make much sense. BTW, you could look into musl libc; its src/thread/pthread_create.c file is quite readable.
NB: on some occasions, recent GCC versions (e.g. 4.8 or 4.9), when asked to optimize a lot (e.g. with -O3), are able to generate tail calls from C code.
I know that LD_PRELOAD can be used to intercept calls to functions in shared libraries (if the app is not statically linked). However, I do not know how it can be used to add additional features or background threads to applications.
For example, I think Berkeley Lab Checkpoint/Restart (BLCR) uses this method to add a background thread to an application that may be checkpointed later on.
So, now the question is: how can a thread be injected into a compiled app using LD_PRELOAD, without knowing beforehand which shared library functions are called from this app?
It's a simple enough matter - you can implement the _init function - that would be void _init(void) {} - and call pthread_create in it (assuming you linked your library with -lpthread). You should also compile your library with the other -l dependencies you need. GCC will also allow you to replace the legacy _init() entry point with another one of your own, marked with __attribute__((constructor)). At any rate, your entry point will get called by the dynamic loader.
When your library is injected, it gets injected before all others, but its own dependencies do get resolved as well, so whatever calls you make are generally ok (one notable exception being if you intercept functions you later call, for which you'll need to use the dlfcn APIs to do so safely).
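A minimal sketch of such a library, using the constructor-attribute entry point mentioned above (the thread body is just a placeholder):

/* inject.c - sketch of an LD_PRELOAD-able library that starts a background
 * thread when the target process loads it. The thread body is a placeholder. */
#include <pthread.h>
#include <unistd.h>

static void *background(void *arg)
{
    (void)arg;
    for (;;) {
        /* periodic work (monitoring, checkpointing, ...) would go here */
        sleep(1);
    }
    return NULL;
}

__attribute__((constructor))
static void inject_init(void)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, background, NULL) == 0)
        pthread_detach(tid);
}

Build with gcc -shared -fPIC inject.c -o libinject.so -lpthread and run the target as LD_PRELOAD=./libinject.so ./app; no knowledge of the app's own library calls is needed.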
Is there (in glibc-2.5 and newer) a way to define a hook for pthread_create?
There are a lot of binary applications, and I want to write a dynamic library to be loaded via LD_PRELOAD.
I can add a hook on entry to main (via __attribute__((constructor))), but how can I force my code to be executed in every thread, just before the thread's start function runs?
This answer shows how to interpose pthread_create. (Beware: it will work correctly in 64-bit, but not 32-bit programs.)
Once you interpose pthread_create, you can make it call your own function, which will do whatever you want, and then call the original function the user passed to pthread_create.
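A minimal sketch of that pattern (the wrapper struct and prologue function names are my own; error handling kept short):

/* hook.c - sketch: interpose pthread_create via LD_PRELOAD and run some
 * per-thread setup in each new thread before the user's start routine. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <pthread.h>
#include <stdlib.h>

struct wrap_args {
    void *(*user_fn)(void *);
    void  *user_arg;
};

static void thread_prologue(void)
{
    /* per-thread setup goes here */
}

static void *trampoline(void *p)
{
    struct wrap_args a = *(struct wrap_args *)p;
    free(p);
    thread_prologue();              /* runs in the new thread, before user code */
    return a.user_fn(a.user_arg);
}

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg)
{
    static int (*real_create)(pthread_t *, const pthread_attr_t *,
                              void *(*)(void *), void *);
    if (!real_create)
        real_create = (int (*)(pthread_t *, const pthread_attr_t *,
                               void *(*)(void *), void *))
                      dlsym(RTLD_NEXT, "pthread_create");

    struct wrap_args *a = malloc(sizeof *a);
    if (!a)                         /* fall back to an unwrapped call */
        return real_create(thread, attr, start_routine, arg);
    a->user_fn  = start_routine;
    a->user_arg = arg;
    return real_create(thread, attr, trampoline, a);
}

Compile with gcc -shared -fPIC hook.c -o libhook.so -ldl and preload it; every thread the program creates then runs thread_prologue() before its real start function.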