Loading and Unloading shared libraries in Mac OSX - shared-libraries

I am sorry if this question has been repeated before in this forum. I am having a problem where, Loading and Unloading of dylibs arent working as expected in Mac(esp the unloading part.).
The question is if I have an executable and if I load a shared library say A.dylib and then use the loaded shared library to load an library say B.dylib. When I try unloading the library B.dylib at a later stage, the there is no error code returned(the return int value is 0 - as I am using a regular dlopen and dlclose functions to load and unload libraries, 0 means unloaded successfully), but when I check to make sure using the activity monitor or lsof the b.dylib is still in the memory.
Now the we are porting this code for windows, linux & mac. Windows and Linux works as expected, but only mac is giving me problems.
I was reading in the mac developer library and found out that: " There are a couple of cases in which a dynamic library will never be unloaded:
1) the main executable links against it, 2) An API that does not supoort unloading (e.g. NSAddImage())
was used to load it or some other dynamic library that depends on it, 3) the dynamic library is in
dyld's shared cache."
In my case I dont fall either of the first 2 cases. I am suspecting on case3.
Here is my question:
1. What can I do to make sure I have case 3?
2. If yes, how to fix it?
3. If not, how to fix it?
4. Why is mac so different?
Any help in this regard is appreciated!
Thanks,
Jan

When you load a shared library into an executable, all of the symbols exported by that library are candidates to resolve symbols required by the executable, causing the library to remain loaded if the DYLD linker binds to an unintended symbol. You can list the symbols in a shared library by using nm, and you can set environment variables to enable debugging output for the dynamic linker (see this man page on dyld). You need to set the DYLD_PRINT_BINDINGS environment variable.
Most likely, you need to limit the exported symbols to a specific subset that is used by the executable, so that only those symbols you intend to use are bound. This can be done by placing the required symbols in a file and passing it to the linker via the -exported_symbols_list option. Without doing so, you can end up binding a symbol in the dyloaded library, and it will not be unloaded since they are required to resolve a symbol in the executable and won't unload when dlclose() is called.

Related

Dynamic linking in a shared object .so file

I want to use an SDK/library that uses glibc2.14. My machine has glibc2.12. I installed glibc2.14 in a separate location. Used the SDK in my executable by using compile option --rpath and it works good.
Now, I want to use the SDK (that uses glibc2.14) in a shared object binary (.so). I tried --rpath and --dynamic-linker options but the shared object is not loaded and it gives me an error during runtime -
/lib64/libc.so.6: version ``GLIBC_2.14'' not found (required by /usr/local/lib/libsdk.so.1).
How do I make the shared object binary look at the glibc2.14?
Now, I want to use the SDK (that uses glibc2.14) in a shared object binary (.so).
You can't.
As this answer explains, ld-linux and libc.so must come from the same build of GLIBC.
Since the ld-linux is determined by the main executable (is hard-coded into it at static link time), it will not match your custom libc.so.6 no matter what you do to compile your .so.
In addition, specifying --dynamic-linker while building .so is pointless: it is the job of ld-linux to load the .so, so by definition ld-linux must already be loaded before any .so is loaded into the process.
P.S. If it were possible to make your .so use a different libc.so.6, the result would be an (almost) immediate crash, as libc.so.6 is not designed to work when multiple copies are loaded into the same process.
Update:
upgrade my OS which I don't think is possible because I am developing the software for a client and getting them to upgrade is not going to be easy. Second would be to ask the SDK supplier to recompile with glibc2.12.
GLIBC-2.14 was released 9 years ago. It's not unreasonable for your supplier to "only" support 2.14 (and later), and it is somewhat unreasonable for your client to run such an old OS.
I think you have 3rd possible option: have the client install newer GLIBC in parallel, and build their main executable with --rpath and --dynamic-linker flags (as you have done). Their binary will then have no problem loading your SDK.

Dynamically loaded library remains loaded despite dlclose

Today I'm looking for some enlightenment of the deep magic inside the dynamic loader. I'm debugging/troubleshooting a plugin system for a C++ application running on Linux. It loads plugins via dlopen (RTLD_NOW | RTLS_LOCAL) and releases them using dlclose. Nothing extraordinary - one would think.
However, I noticed that some plugins remain loaded even after dlclose is successfully called*. I concluded this after looking at the running process' memory map using pmap. Some libs get immediately removed from process memory and others keep lingering around apparently indefinitely.
Continuing on, the dlopen man page states:
The function dlclose() decrements the reference count on the dynamic
library handle handle. If the reference count drops to zero and no
other loaded libraries use symbols in it, then the dynamic library is
unloaded.
That means the problem boils down to these two possibilities; Either the reference counts aren't zero, or other loaded libs are using symbols from some - but not all - of the plugins.
I'm pretty sure (although not 100%) that the reference counts are zero. The application's plugin manager handles all plugins exactly the same. It also makes sure that a plugin doesn't get loaded multiple times. IMO loading & unloading should therefore behave the same for all plugins.
That leaves the second possibility: other loaded libs are using symbols from the plugins. Another typical case of 'That should never happen'. Although it's certainly possible. We're using gcc and default visibility and as far as I've seen nothing is stripped, so tons of symbols are being exported. Actually that worries me much more as these plugins should be independent.
Here then are my open questions at this point:
Are my conclusions so far correct?
Do you know a way to verifydlopen's reference count?
If there are internal symbols of my plugins (accidentally) used by other libs, is there a way to track down who is using which symbols?
My machine is:
Linux 3.13.0-43-generic #72-Ubuntu SMP Mon Dec 8 19:35:44 UTC 2014 i686 i686 i686 GNU/Linux
* I should mention that all of the loading and unloading happens in the main thread, so there should be no multi threading issue here.
other loaded libs are using symbols from the plugins
If other libs are not linked to that shared library at link time, referring to symbols of a shared library does not prevent that shared library from being unloaded.
To debug the run-time linker set environment variable LD_DEBUG to all, e.g. LD_DEBUG=all ./my_app. See man ld.so for details.

Which libraries appear in /proc/$PID/pmaps?

On Linux you can inspect /proc/$PID/pmaps to see the libraries loaded by a particular program, and a program can open /proc/self/pmaps to examine the libraries it itself has loaded.
I know pmaps will only contain dynamic libraries, and obviously the kernel can't predict which libraries we might dlopen at a later point, so I expect those aren't included in /proc/self/maps. But I'm unsure of a few other other scenarios:
Are libraries that have been linked at build time but we haven't called any function in yet included? My understanding is the Linux delays linking symbols until the first time they are used, so I'm not sure if they'll show up.
Does pmaps contain all the libraries used recursively? E.g. if I look at each library in pmaps and run ldd on it, and then run ldd on those, ad nauseum, I shouldn't find any new libraries that weren't in the original pmaps? I tried this on a couple binaries and it appears to be so but maybe I'm getting lucky.
Are libraries that have been linked at build time but we haven't called any function in yet included?
Yes: the runtime loader will mmap every library that your executable directly depends on, before your program starts running.
You can find the list of such libraries by running
readelf -d a.out | grep NEEDED
Does pmaps contain all the libraries used recursively?
Yes: if a library that you directly depend on itself depends on some other library, the runtime loader will mmap the recursive dependencies as well.
My understanding is the Linux delays linking symbols until the first time they are used
That is mosty correct for function symbols, but false for data symbols, which can't be resolved lazily.
Also, whether the symbols are resolved lazily or not depends on LD_BIND_NOW environment variable, and on an equivalent setting in the executable dynamic section, controlled by -znow linker flag.
None of that changes the mmap pciture though; if you have a DT_NEEDED entry for foo.so in your dynamic section, then foo.so will be mmaped (and will show in /proc/self/*map*) independent of lazy or non-lazy resolution.
/proc/$pid/maps is not only going to list libraries that are loaded, but also ALL other mapped memory segments.
Read this thread and the article in there:
Understanding Linux /proc/id/maps

Unloading a shared library from memory

I am trying to modify this shared library (with .so) extension on Linux. I am inserting some printf statement and fprintf statement to debug, and it has no effect. I removed the .so file and realized that the the program still runs fine. Does it mean that the program is loaded into memory?? (But I'm sure only the program I'm testing for uses that .so file though)
How do I get it to unload so I can make sure my program is loading the modified one?
No, shared libraries are not cached in memory. If you have deleted the .so file and your program still runs, then either:
the program is loading an .so of the same name from a different location, or
the program can run without loading the .so
If the .so is supposed to be loaded at program startup, then you can use ldd to find out where your OS thinks the .so actually is.
If the .so is loaded dynamically at runtime, then perhaps strace will be able to help pinpoint what is happening.
You can read /proc/1234/maps to find out the memory map of process 1234. This also shows the dynamically loaded shared objects.
You may use the LD_LIBRARY_PATH environment variable to change the path of shared libraries and ldconfig to upgrade its cache. Look also in /etc/ld.so.conf etc.
Of course, you have to restart the program loading your shared library.

.lib files and decompiling

I have a .exe which is compiled from a combination of .for (fortran), and .c source files.
It does not run on anything later than Win98, due to an error with the graphics server:
“access violation error in User 32.dll at Ox7e4467a9”
Unless there is some other way around the above error (?), I assume I have to recompile the .exe from source using a more modern graphics server. I have all the files to do this bar one .lib file!
Is it possible to pull any info on the missing lib file out of the current .exe I have?
It is possible to dis-assemble the .exe, but I don't think I gain much from this?
You probably can't "cut" the lib file from an executable. Even if you could somehow get the code from it, standard compilers and linker wouldn't know how to link against it, since it won't have the linking information needed (they are not included in the result binary).
However, if your problem is that your program works on Win98, but doesn't run on NT-based systems (XP, Vista, Win7), I think it would be easier to find out, what incompatibility is there that crashes the program. You mentioned that the access violation occurs in user32.dll. Start your program inside a debugger, take a look at which function the crash occurs. Make sure you have your PDB symbols loaded (so you can see names of internal non-public functions). Trace down which Win32 API is called and what are its parameters. Try to figure out, what should be at the memory that cannot be accessed.
Also without any other information, it's impossible to help you with that.
Once integrated into an image file (your exe), a library (your .lib) which is statically bound to an application (which is done by your linker) cannot be separated, differentiated from your own code, and thus, one cannot retrieve the code from a lib by decompiling the exe.

Resources