How to make a fix in one of the shared libraries (.so) in the project on linux? - linux

I want to make a quick fix to one of the project's .so libraries. Is it safe to just recompile the .so and replace the original? Or I have to rebuild and reinstall the whole project? Or it depends?

It depends. Shared library needs to be binary-compatible with your executable.
For example,
if you changed the behaviour of one of library's internal functions, you probably don't need to recompile.
If you changed the size of a struct (e.g. by adding a member) that's known by the application, you will need to recompile, otherwise the library and the application will think the struct is smaller than it is, and will crash when the library tries to read an extra uninitialized member that the application didn't write to.
If you change the type or the position of arguments of any functions visible from the applications, you do need to recompile, because the library will try to read more arguments off the stack than the application has put on it (this is the case with C, in C++ argument types are the part of function signature, so the app will refuse run, rather than crashing).
The rule of thumb (for production releases) is that, if you are not consciously aware that you are maintaining binary compatibility, or not sure what binary compatibility is, you should recompile.

That's certainly the intent of using dynamic libraries: if something in the library needs updating, then you just update the library, and programs that use it don't need to be changed. If the signature of the function you're changing doesn't change, and it accomplishes the same thing, then this will in general be fine.
There are of course always edge cases where a program depends on some undocumented side-effect of a function, and then changing that function's implementation might change the side-effect and break the program; but c'est la vie.

If you have not changed the ABI of the shared library, you can just rebuild and replace the library.

It depends yes.
However, I assume you have the exact same source and compiler that built the other stuff and now if you only change in a .cpp file something, it is fine.
Other things e.g. changing an interface (between the shared lib and the rest of the system) in a header file is not fine.

If you don't change your library binary interface, it's ok to recompile and redeploy only the shared library.
Good references:
How To Write Shared Libraries
The Little Manual of API Design

Related

Loading Linux libraries at runtime

I think a major design flaw in Linux is the shared object hell when it comes to distributing programs in binary instead of source code form.
Here is my specific problem: I want to publish a Linux program in ELF binary form that should run on as many distributions as possible so my mandatory dependencies are as low as it gets: The only libraries required under any circumstances are libpthread, libX11, librt and libm (and glibc of course). I'm linking dynamically against these libraries when I build my program using gcc.
Optionally, however, my program should also support ALSA (sound interface), the Xcursor, Xfixes, and Xxf86vm extensions as well as GTK. But these should only be used if they are available on the user's system, otherwise my program should still run but with limited functionality. For example, if GTK isn't there, my program will fall back to terminal mode. Because my program should still be able to run without ALSA, Xcursor, Xfixes, etc. I cannot link dynamically against these libraries because then the program won't start at all if one of the libraries isn't there.
So I need to manually check if the libraries are present and then open them one by one using dlopen() and import the necessary function symbols using dlsym(). This, however, leads to all kinds of problems:
1) Library naming conventions:
Shared objects often aren't simply called "libXcursor.so" but have some kind of version extension like "libXcursor.so.1" or even really funny things like "libXcursor.so.0.2000". These extensions seem to differ from system to system. So which one should I choose when calling dlopen()? Using a hardcoded name here seems like a very bad idea because the names differ from system to system. So the only workaround that comes to my mind is to scan the whole library path and look for filenames starting with a "libXcursor.so" prefix and then do some custom version matching. But how do I know that they are really compatible?
2) Library search paths: Where should I look for the *.so files after all? This is also different from system to system. There are some default paths like /usr/lib and /lib but *.so files could also be in lots of other paths. So I'd have to open /etc/ld.so.conf and parse this to find out all library search paths. That's not a trivial thing to do because /etc/ld.so.conf files can also use some kind of include directive which means that I have to parse even more .conf files, do some checks against possible infinite loops caused by circular include directives etc. Is there really no easier way to find out the search paths for *.so?
So, my actual question is this: Isn't there a more convenient, less hackish way of achieving what I want to do? Is it really so complicated to create a Linux program that has some optional dependencies like ALSA, GTK, libXcursor... but should also work without it! Is there some kind of standard for doing what I want to do? Or am I doomed to do it the hackish way?
Thanks for your comments/solutions!
I think a major design flaw in Linux is the shared object hell when it comes to distributing programs in binary instead of source code form.
This isn't a design flaw as far as creators of the system are concerned; it's an advantage -- it encourages you to distribute programs in source form. Oh, you wanted to sell your software? Sorry, that's not the use case Linux is optimized for.
Library naming conventions: Shared objects often aren't simply called "libXcursor.so" but have some kind of version extension like "libXcursor.so.1" or even really funny things like "libXcursor.so.0.2000".
Yes, this is called external library versioning. Read about it here. As should be clear from that description, if you compiled your binaries using headers on a system that would normally give you libXcursor.so.1 as a runtime reference, then the only shared library you are compatible with is libXcursor.so.1, and trying to dlopen libXcursor.so.0.2000 will lead to unpredictable crashes.
Any system that provides libXcursor.so but not libXcursor.so.1 is either a broken installation, or is also incompatible with your binaries.
Library search paths: Where should I look for the *.so files after all?
You shouldn't be trying to dlopen any of these libraries using their full path. Just call dlopen("libXcursor.so.1", RTLD_GLOBAL);, and the runtime loader will search for the library in system-appropriate locations.

Finding the shared library name to use with dlload

In my open-source project Artha I use libnotify for showing passive desktop notifications to the user.
Instead of statically linking libnotify, a lookup at runtime is made for the shared object (.so) file via dlload, if available on the target machine, Artha exposes the notification feature in it's GUI. On app. start, a call to dlload with filename param as libnotify.so.1 is made and if it returns a non-null pointer, then the feature is exposed.
A recurring problem with this model is that every time the version number of the library is bumped, Artha's code needs to be updated, currently libnotify.so.4 is the latest to entail such an occurance.
Is there a linux system call (irrespective of the distro the app. is running on), which can tell me if a particular library's shared object is available at runtime? I know that there exists the bruteforce option of enumerating the library by going from 1 to say 10, I find the solution ugly and inelegant.
Also, if this can be addressed via autoconf, then that solution is welcome too I.e. at build time, based on the target machine, the configure.h generated should've the right .so name that can be passed to dlload.
P.S.: I think good distros follow the style of creating links to libnotify.so.x so that a programmer can just do dlload("libnotify.so", RTLD_LAZY) and the right version numbered .so is loaded; unfortunately not all distros follow this, including Ubuntu.
The answer is: you don't.
dlopen() is not designed to deal with things like that, and trying to load whichever soversion you find on the system just because it happens to have the symbols you need is not a good way to do it.
Different sonames have different ABIs, and different ABIs means that you may be calling the same exact symbol name that is expecting a different set (or different size) of parameters, which will cause crashes or misbehaviour that are extremely difficult do debug.
You should have a read on how shared object versions work and what an ABI is.
The libfoo.so link is there for the link editor (ld) and is usually installed with the -devel packages for that reason; it might also very well not be a link but rather a text file with a linker script, often times on purpose to avoid exactly what you're trying to do.

Suppressing system calls when using gcc/g++

I have a portal in my university LAN where people can upload code to programming puzzles in C/C++. I would like to make the portal secure so that people cannot make system calls via their submitted code. There might be several workarounds but I'd like to know if I could do it simply by setting some clever gcc flags. libc by default seems to include <unistd.h>, which appears to be the basic file where system calls are declared. Is there a way I could tell gcc/g++ to 'ignore' this file at compile time so that none of the functions declared in unistd.h can be accessed?
Some particular reason why chroot("/var/jail/empty"); setuid(65534); isn't good enough (assuming 65534 has sensible limits)?
Restricting access to the header file won't prevent you from accessing libc functions: they're still available if you link against libc - you just won't have the prototypes (and macros) to hand; but you can replicate them yourself.
And not linking against libc won't help either: system calls could be made directly via inline assembler (or even tricks involving jumping into data).
I don't think this is a good approach in general. Running the uploaded code in a completely self-contained virtual sandbox (via QEMU or something like that, perhaps) would probably be a better way to go.
-D can overwrite individual function names. For example:
gcc file.c -Dchown -Dchdir
Or you can set the include guard yourself:
gcc file.c -D_UNISTD_H
However their effects can be easily reverted with #undefs by intelligent submitters :)

Static libraries dependency

I need some basic clarification on C++ static linkage. I have a file called data_client.lib. There are three independant consumers for the library file a.exe, b.exe and c.exe. There is a service called data_server.exe for which data_client.lib is the interface. Actually, I added another function to data_server.exe and corresponding interface to data_client.lib. Since just a.exe needs the extra functionality, I build a.exe only. I shipped data_server.exe, data_client.exe and a.exe as patch. Now, b.exe and c.exe randomly/inconsistently crashes throwing
mfc42u!CException::`RTTI Complete
Object Locator'+0x10
Does it make sense? If I also build b.exe and c.exe, then the crash does not happen. Is this the way it works?
Actually, I added another function to data_server.exe and corresponding interface to data_client.lib.
It's a little unclear from this exactly what was added to your library. However, if it's a new method or methods added to a class (rather than just some new standalone functions), there's a very high chance that recompiling everything will fix your problem. The vtable may or may not have been thrown out of whack by your changes.
It's also possible that your crashes have absolutely nothing to do with this and there's some other problem going on... but from your description, my money's on a vtable issue. If it were me, I'd recompile b.exe and c.exe and test again before investigating other issues.
Maybe You don't have explicit dependencies, but some of Your project headers uses, or put information implicitly into your library.
I do not know about the error, but your applications b.exe and c.exe are using an older version of the binding lib to communicate with a newer version of the data_server.exe. Some v_table indexes might be off or something if you added a function. You definately have to rebuild all the libraries.

What are the porting issues going from VC8 (VS2005) to VC9 (VS2008)?

I have inherited a very large and complex project (actually, a 'solution' consisting of 119 'projects', most of which are DLLs) that was built and tested under VC8 (VS2005), and I have the task of porting it to VC9 (VS2008).
The porting process I used was:
Copy the VC8 .sln file and rename it
to a VC9 .sln file.
Copy all of
the VC8 project files, and rename
them to VC9 project files.
Edit
all of the VC9 project files,
s/vc8/vc9.
Edit the VC9 .sln,
s/vc8/vc9/
Load the VC9 .sln with
VS2008, and let the IDE 'convert'
all of the project files.
Fix
compiler and linker errors until I
got a good build.
So far, I have run into the following issues in that last step.
1) A change in the way decorated names are calculated, causing truncation of the names.
This is more than just a warning (http://msdn.microsoft.com/en-us/library/074af4b6.aspx). Libraries built with this warning will not link with other modules. Applying the solution given in MSDN was non-trivial, but doable. I addressed this problem separately in How do I increase the allowed decorated name length in VC9 (MSVC 2008)?
2) A change that does not allow the assignment of zero to an iterator. This is per the spec, and it was fairly easy to find and fix these previously-allowed coding errors. Instead of assignment of zero to an iterator, use the value end().
3) for-loop scope is now per the ANSI standard. Another easy-to-fix problem.
4) More space required for pre-compiled headers. In some cases a LOT more space was required. I ended up using /Zm999 to provide the maximum PCH space. If PCH memory usage gets bumped up again, I assume that I will have to forgo PCH altogether, and just endure the increase in what is already a very long build time.
5) A change in requirements for copy ctors and default dtors. It appears that in template classes, under certain conditions that I haven't quite figured out yet, the compiler no longer generates a default ctor or a default dtor. I suspect this is a bug in VC9, but there may be something else that I'm doing wrong. If so, I'd sure like to know what it is.
6) The GUIDs in the sln and vcproj files were not changed. This does not appear to impact the build in any way that I can detect, but it is worrisome nevertheless.
Note that despite all of these issues, the project built, ran, and passed extensive QA testing under VC8. I have also back-ported all of the changes to the VC8 projects, where they still build and run just as happily as they did before (using VS2005/VC8). So, all of my changes required for a VC9 build at least appear to be backward-compatible, although the regression testing is still underway.
Now for the really hard problem: I have run into a difference in the startup sequence between VC8 and VC9 projects. The program uses a small-object allocator modeled after Loki, in Andrei Alexandrescu's Book Modern C++ Design. This allocator is initialized using a global variable defined in the main program module.
Under VC8, this global variable is constructed at the very beginning of the program startup, from code in a module crtexe.c. Under VC9, the first module that executes is crtdll.c, which indicates that the startup sequence has been changed. The DLLs that are starting up appear to be confusing the small-object allocator by allocating and deallocating memory before the global object can initialize the statistics, which leads to some spurious diagnostics. The operation of the program does not appear to be materially affected, but the QA folks will not allow the spurious diagnostics to get past them.
Is there some way to force the construction of a global object prior to loading DLLs?
What other porting issues am I likely to encounter?
Is there some way to force the construction of a global object prior to loading DLLs?
How about the DELAYLOAD option? So that DLLs aren't loaded until their first call?
That is a tough problem, mostly because you've inherited a design that's inherently dangerous because you're not supposed to rely on the initialization order of global variables.
It sounds like something you could try to work around by replacing the global variable with a singleton that other functions retrieve by calling a global function or method that returns a pointer to the singleton object. If the object exists at the time of the call, the function returns a pointer to it. Otherwise, it allocates a new one and returns a pointer to the newly allocated object.
The problem, of course, is that I can't think of a singleton implementation that would avoid the problem you're describing. Maybe this discussion would be useful: http://www.oneunified.net/blog/Personal/SoftwareDevelopment/CPP/Singleton.article
That's certainly an interesting problem. I don't have a solution other than perhaps to change the design so that there is no dependence on undefined behavior of the order or link/dll startup. Have you considered linking with the older linker? (or whatever the VS.NET term is)
Because the behavior of your variable and allocator relied on some (unknown at the time) arbitrary order of startup I would probably fix that so that it is not an issue in the future. I guess you are really asking if anyone knows how to do some voodoo in VC9 to make the problem disappear. I am interested in hearing it as well.
How about this,
Make your main program a DLL too, call it main.dll, linked to all the other ones, and export the main function as say, mainEntry(). Remove the global variable.
Create a new main exe which has the global variable and its initialization, but doesn't link statically to any of the other application DLLs (except for the allocator stuff).
This new main.exe then dynamically loads the main.dll using LoadLibrary(), then uses GetProcAddress to call mainEntry().
The solution to the problem turned out to be more straightforward than I originally thought. The initialization order problem was caused by the existence of several global variables of types derived from std container types (a basic design flaw that predated my position with that company). The solution was to replace all such globals with singletons. There were about 100 of them.
Once this was done, the initialization (and destruction) order was under programmer control.

Resources