Loading multiple copies of a shared library - shared-libraries

I would like to be able to load multiple copies of a shared library into the same address space. I want them to not share any global variables, and I want the two copies to be unaware that the other has been loaded.
The use-case is parallel execution of a thread-unsafe library.
How can I do this
on Linux?
on OS X?
on Windows?
on *BSD?
on other Unix-like systems?

"The use-case is parallel execution of a thread-unsafe library."
Even if you manage to achieve the "not share any global variables" goal (which is hard), the library may still not work, because it could be calling into thread-unsafe routines in other libraries.
The obvious example is a library that calls strtok, which keeps its parsing state in a hidden static variable.
On Linux and Solaris, you could use dlmopen(LM_ID_NEWLM, ...). See the dlmopen man page.
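A minimal sketch of the dlmopen approach on Linux with glibc, under a few assumptions: the helper names load_private_copy and copies_are_distinct are made up for illustration, and the caller picks the library path and symbol. Each call with LM_ID_NEWLM asks the loader for a fresh link-map namespace, so the two handles should refer to separately mapped copies with their own globals. Note that glibc only supports a small, fixed number of namespaces.

```c
#define _GNU_SOURCE   /* dlmopen and LM_ID_NEWLM are GNU extensions */
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: load `path` into a brand-new link-map namespace.
 * Returns the handle, or NULL after printing the loader's error. */
static void *load_private_copy(const char *path)
{
    void *handle = dlmopen(LM_ID_NEWLM, path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        fprintf(stderr, "dlmopen(%s): %s\n", path, dlerror());
    return handle;
}

/* Hypothetical check: returns 1 if two private copies of `path` resolve
 * `symbol` to different addresses, 0 if they resolve to the same address,
 * and -1 if loading or symbol lookup fails. */
int copies_are_distinct(const char *path, const char *symbol)
{
    int result = -1;
    void *a = load_private_copy(path);
    void *b = load_private_copy(path);

    if (a != NULL && b != NULL) {
        void *sym_a = dlsym(a, symbol);
        void *sym_b = dlsym(b, symbol);
        if (sym_a != NULL && sym_b != NULL)
            result = (sym_a != sym_b);
    }
    if (a != NULL)
        dlclose(a);
    if (b != NULL)
        dlclose(b);
    return result;
}
```

In real use you would keep both handles open and call into each copy through function pointers obtained from dlsym. On older glibc versions the program has to be linked with -ldl.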

Related

Equivalent of VirtualProtectEx/CreateRemoteThread in Linux?

I was wondering whether there is an equivalent, either in a library or as a syscall, of the Windows APIs that allow a process to interact with another process's address space, and so to modify the flow of that second process.
This is to inject a .so in a running process without killing it.
Thanks!
Maybe take a look here: CreateRemoteThread in Linux
I don't know of a simpler way than the one described there. On Windows you have fancy APIs like VirtualProtectEx. On Linux you'd write a .so which, for example, calls pthread_create in an __attribute__((constructor)) function. Then you'd load that .so via the LD_PRELOAD mechanism.
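As a rough sketch of that idea, the following shared-object source starts a thread from a constructor function. The names inject_init, injected_thread and inject_ctor_ran are hypothetical, and the thread body is only a placeholder for whatever the injected code should do.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

/* Set once the constructor has run; only here so the effect is observable. */
atomic_int inject_ctor_ran = 0;

/* Placeholder payload: the code that should run inside the target process. */
static void *injected_thread(void *arg)
{
    (void)arg;
    fprintf(stderr, "[inject] thread running in pid %d\n", (int)getpid());
    return NULL;
}

/* Runs when the dynamic loader maps this object, before main() starts. */
__attribute__((constructor))
static void inject_init(void)
{
    pthread_t tid;

    atomic_store(&inject_ctor_ran, 1);
    if (pthread_create(&tid, NULL, injected_thread, NULL) == 0)
        pthread_detach(tid);  /* nobody will join this thread */
}
```

Assuming the file is saved as inject.c, it could be built with `cc -shared -fPIC -pthread inject.c -o inject.so` and loaded with `LD_PRELOAD=./inject.so ./target`, where target stands for the program to inject into. This only works at process start, not for a process that is already running.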
The next best thing to CreateRemoteThread would be manipulating the main thread of the process with the ptrace API. But this would involve:
Holding a thread
Saving its context
Setting up the arguments for pthread_create
Setting the IP to pthread_create and executing
Restoring the old context
I think manipulating the memory access rights would also involve calling mprotect from within the target process's context. As already mentioned above, the simplest way to do that is not ptrace but a precompiled shared object.
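The hold, save and restore steps of that list can be sketched with ptrace as follows. This is only a sketch under stated assumptions: it traces a freshly forked child, so it needs no special privileges, instead of attaching to an unrelated process with PTRACE_ATTACH. It leaves out the two middle steps (setting up arguments and redirecting the instruction pointer), which are architecture-specific. The function name save_and_restore_context is made up for illustration.

```c
#include <elf.h>          /* NT_PRSTATUS */
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>      /* struct iovec */
#include <sys/user.h>     /* struct user_regs_struct */
#include <sys/wait.h>
#include <unistd.h>

/* Stops a child, saves its general-purpose registers, writes them back
 * unchanged and lets it continue. Returns 0 on success, -1 on failure. */
int save_and_restore_context(void)
{
    struct user_regs_struct saved;
    struct iovec iov = { .iov_base = &saved, .iov_len = sizeof saved };
    int status;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: ask to be traced, then stop so the parent can inspect us. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        raise(SIGSTOP);
        _exit(0);
    }

    /* Step 1: the thread is held once we observe its stop. */
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        goto fail;

    /* Step 2: save its context. */
    if (ptrace(PTRACE_GETREGSET, pid, (void *)(long)NT_PRSTATUS, &iov) < 0)
        goto fail;

    /* Steps 3 and 4 (arguments, instruction pointer) are omitted here. */

    /* Step 5: restore the saved context and let the thread run again. */
    if (ptrace(PTRACE_SETREGSET, pid, (void *)(long)NT_PRSTATUS, &iov) < 0)
        goto fail;
    if (ptrace(PTRACE_DETACH, pid, NULL, NULL) < 0)
        goto fail;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;

fail:
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return -1;
}
```

Attaching to a process you did not start may additionally be restricted by system policy, for example the Yama ptrace_scope setting on many Linux distributions.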
On Linux, there is a standard mechanism for injecting your code into a program. You define an environment variable, LD_PRELOAD, that names a .so library to be loaded before all other .so files. Functions in that .so will replace the standard versions of those functions. There is no need to manually modify the assembly code of functions to insert hooks into your own code, as on Windows.
Here is a nice tutorial: https://rafalcieslak.wordpress.com/2013/04/02/dynamic-linker-tricks-using-ld_preload-to-cheat-inject-features-and-investigate-programs/
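A small sketch of such a replacement function, assuming glibc: it wraps rand(), counts the calls, and forwards to the original implementation found via dlsym(RTLD_NEXT, ...). The counter name rand_calls_seen is hypothetical and exists only to make the interposition visible. The lazy lookup is not thread-safe.

```c
#define _GNU_SOURCE   /* RTLD_NEXT is a GNU extension */
#include <dlfcn.h>
#include <stdlib.h>

/* Number of times the wrapper was entered (illustrative only). */
int rand_calls_seen = 0;

/* Same signature as libc's rand(). When this object comes first in the
 * lookup order, calls to rand() resolve here instead of to libc. */
int rand(void)
{
    static int (*real_rand)(void) = NULL;

    if (real_rand == NULL) {
        /* Find the next definition of rand() after this object. */
        real_rand = (int (*)(void))dlsym(RTLD_NEXT, "rand");
        if (real_rand == NULL)
            abort();
    }
    rand_calls_seen++;
    return real_rand();
}
```

Built as a shared object (for example `cc -shared -fPIC wrap_rand.c -o wrap_rand.so`, with a made-up file name) and loaded via LD_PRELOAD, the wrapper sees every rand() call the target program makes through the dynamic linker.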

overriding functions of running process in linux

I am curious how to override functions of a running process in Linux so that the process calls my functions first.
We can use LD_PRELOAD to override functions in a binary, but it does not work for a process that is already running. Any suggestions, please?
What LD_PRELOAD allows you to do is force loading a shared object before any other. So if a function is already provided by this custom shared object, it will not be loaded again from the "standard" shared object.
In your case you want to modify an already loaded function. I believe that is not possible.
This would clearly be a security risk.
Most operating systems implement DEP and ASLR, which make it hard to modify the loaded code and to predict the shared object's position in memory.

Are Win32 InterlockedIncrement and InterlockedExchange atomic across processes?

MSDN says that the interlocked functions provide a simple mechanism for synchronizing access to a variable that is shared by multiple threads.
I am not sure whether they work across threads of multiple processes if the variable is in memory shared between the processes.
Similarly, what about the GCC builtins __sync_add_and_fetch and __sync_lock_test_and_set?
This question is essentially two questions with two different answers.
For the __sync_XXX builtins in GCC, the answer is yes.
The GCC documentation describes these builtins as normally issuing a full barrier, which even prevents the processor from speculating loads across the operation. Any shared memory is safe with them, whether it is shared between threads or between processes.
I know nothing about the Windows InterlockedXXX functions. But MSDN knows, and says:
"The threads of different processes can use this mechanism if the variable is in shared memory."
So both answers are "yes".

Does Racket support multithreading?

I want to write a multithreading program in Racket that actually utilizes multiple processes with shared memory space like pthread in C. Racket provides "thread", but it only uses one process to execute multiple threads. It also provides "subprocess" for executing new programs via command line that runs on multiple processes, but those programs cannot share the same memory space.
Don't do that.
Racket does provide parallelism via futures and places, but they do not provide (unrestricted) shared memory spaces. If you want to send data from one thread to another, use a place channel.
As Greg Hendershott points out, you can send a shared vector via a place channel, which provides a shared space to use. (But that's not the same thing as sharing all the memory references, which is what someone familiar with, say, Java-style threading might expect. And the latter is what my "don't do that" refers to.)
If you really want to use pthread-like threading, Guile does provide them, but then you won't be using Racket any more. ;-)

Is it possible using Linux's clone() system call to run multiple applications in the same address space?

If you pass the CLONE_VM flag to clone(), then the new process shares memory with the original. Can this be used to make two distinct applications (two main()'s) run in the same process? Ideally this would be as simple as calling clone() with CLONE_VM and then calling exec(), but I realize it's probably more involved. At the very least, I assume that the spawned application would need to be compiled to be relocatable (-fPIC). I realize I could always recode applications to be libraries, and create a master app spawning the other 'apps' as threads, but I'm curious whether this approach is possible.
Well, yes, that's what threads are, minus the "two distinct main()/application" part.
In fact, the reason clone(2) is there is to implement threads.
clone(2) more or less requires you to supply a separate stack for the child (the glibc wrapper fails with EINVAL if the stack argument is NULL), because with a shared address space the child cannot return from its current call level without destroying the parent's stack.
Once you start setting up stacks for each process then you might as well just use the posix thread library.
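To make the stack requirement concrete, here is a small sketch using the glibc clone() wrapper on Linux. The names child_fn and run_clone_vm_child and the stack size are arbitrary choices for illustration. The child writes through a pointer into the parent's memory, which only works because CLONE_VM makes both tasks share one address space.

```c
#define _GNU_SOURCE   /* clone() and CLONE_VM are Linux/GNU extensions */
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (256 * 1024)   /* arbitrary size for this sketch */

/* Entry point of the child task; `arg` points into the parent's memory. */
static int child_fn(void *arg)
{
    *(int *)arg = 42;
    return 0;
}

/* Returns the value written by the child, or -1 on failure. */
int run_clone_vm_child(void)
{
    int shared = 0;
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows down on common architectures, so pass its top.
     * SIGCHLD in the flags lets the parent use an ordinary waitpid(). */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE,
                      CLONE_VM | SIGCHLD, &shared);
    if (pid < 0) {
        free(stack);
        return -1;
    }
    if (waitpid(pid, NULL, 0) < 0) {
        free(stack);
        return -1;
    }
    free(stack);
    return shared;
}
```

Without CLONE_VM the child would get its own copy of the address space and the parent would still read 0. A real threading library passes many more flags (CLONE_FS, CLONE_FILES, CLONE_SIGHAND, CLONE_THREAD and so on) and sets up thread-local storage, which is why using pthreads is usually the better option.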
As for the part where two different applications are loaded, calling execve(2) would most likely not be the way to do it. These days the kernel doesn't exactly run programs itself anyway. More typically the image is set up to run the ELF dynamic loader, and that is all the kernel really runs. The loader then mmap(2)s the program and its libraries into the address space. Certainly that could be done to get the "two distinct applications", and the scheduler would be happy to run them as two tasks created via clone(2).
Why not compile the applications into the same executable and just start them as threads in main?
What is the problem running them as separate tasks anyway? You can still share memory if you really want to.
Short answer: it's impossible.
Well, it's possible if you're willing to write your own custom ELF loader and simulate a lot of things that the kernel normally does for a process.
It's better to compile each of the apps into a library that exposes exactly one function, main (renamed to something else). Then the main stub program should link with the two libraries and call each one's exported function.
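A sketch of that layout, with both "applications" reduced to stand-in functions so the example fits in one file. The names app_one_main, app_two_main and run_both_apps are hypothetical. In a real build each renamed main would live in its own library and the stub would only contain the launcher.

```c
#include <pthread.h>
#include <stdio.h>

/* Stand-ins for the two applications' renamed main() functions. */
int app_one_main(int argc, char **argv)
{
    (void)argc;
    printf("%s running\n", argv[0]);
    return 0;
}

int app_two_main(int argc, char **argv)
{
    (void)argc;
    printf("%s running\n", argv[0]);
    return 0;
}

/* Everything one thread needs to start an application and report back. */
struct app_launch {
    int (*entry)(int, char **);
    int argc;
    char **argv;
    int status;
};

static void *run_app(void *p)
{
    struct app_launch *launch = p;
    launch->status = launch->entry(launch->argc, launch->argv);
    return NULL;
}

/* Runs both applications as threads of one process.
 * Returns 0 only if both threads started and both entries returned 0. */
int run_both_apps(void)
{
    char *argv_one[] = { "app_one", NULL };
    char *argv_two[] = { "app_two", NULL };
    struct app_launch launches[2] = {
        { app_one_main, 1, argv_one, -1 },
        { app_two_main, 1, argv_two, -1 },
    };
    pthread_t threads[2];
    int started = 0;

    for (int i = 0; i < 2; i++) {
        if (pthread_create(&threads[i], NULL, run_app, &launches[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    if (started != 2)
        return -1;
    return (launches[0].status == 0 && launches[1].status == 0) ? 0 : -1;
}
```

The two applications still share every global of the process, including libc state, so this only works if neither relies on being alone in its address space.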

Resources