rpm upgrading shared object used by other program - shared-libraries

I am generating rpm-A that has program P-A.1.1, and two libs L-A.1.1 and L-B.1.1.
L-A.1.1 changes some APIs it used to expose compared to it's previous version - L-A.1.0
Say the machine had another program P-B.1.0 that uses L-A.1.0.
Will installing rpm-A break program P-B.1.0?
Will L-A.1.1 co-exist with L-A.1.0?
A

If you are upgrading the package that had previously provided P-A.1.0 and the new version of the package no longer provides that version of the library and only provides the P-A.1.1 version of the library then RPM will not allow that upgrade to occur without being forced because it would break P-B.1.0.
You have a number of options to handle this sort of thing.
You can provide both libraries in the same package.
You can change the package name (e.g. gnupg.gnupg2 or iptables/iptables-ipv6 though those are both for slightly different reasons than this).
You can use library symbol versioning to have your library expose both APIs at the same time (I believe).

Related

ICU files needed during run time

In order to understand ICU and its APIs, I wrote a sample program and the libraries this code would link against are -licuuc and -licui18n. The libraries were available because the libicu-devel.x86_64 package was installed on the test system.
In my quest to understand how to integrate ICU library with my application that is targeted for a centOS platform, I stumbled across this page, which says:
For simple use of ICU's predefined data, this section on data management can safely be skipped. The data is built into a library that is loaded along with the rest of ICU. No specific action or setup is required of either the application program or the execution environment.
This indicates that if the application has no intention of adding its own data, the data available in the libraries can be used. On my test system where ICU is installed, these are the files:
$ sudo find . -name "*icu*"
./opt/rbt_boost/include/boost/regex/icu.hpp
./lib64/libicui18n.so.42
./lib64/libicui18n.so.42.1
./lib64/libicuuc.so.42.1
./lib64/libicuuc.so.42
./usr/lib64/libicui18n.so.42
./usr/lib64/libicule.so
./usr/lib64/libicuio.so.42
./usr/lib64/libicutu.so
./usr/lib64/libiculx.so.42.1
./usr/lib64/pkgconfig/icu.pc
./usr/lib64/libicui18n.so
./usr/lib64/libicui18n.so.42.1
./usr/lib64/libicule.so.42.1
./usr/lib64/libicuuc.so.42.1
./usr/lib64/libiculx.so
./usr/lib64/libicuuc.so.42
./usr/lib64/libicuio.so.42.1
./usr/lib64/icu
./usr/lib64/libicudata.so.42
./usr/lib64/libicule.so.42
./usr/lib64/libicutu.so.42.1
./usr/lib64/libicuio.so
./usr/lib64/libicudata.so
./usr/lib64/libicudata.so.42.1
./usr/lib64/libiculx.so.42
./usr/lib64/libicutu.so.42
./usr/lib64/libicuuc.so
./usr/bin/icu-config
./usr/share/icu
./usr/share/man/man1/icu-config.1.gz
./var/lib/yum/yumdb/l/e59bf24facac0acba1622a5180d0e2a22dda69c8-libicu-devel-4.2.1-9.1.el6_2-x86_64
./var/lib/yum/yumdb/l/7062f72703a5afbf894d617b94db3d4769fe643d-libicu-4.2.1-9.1.el6_2-x86_64
Questions:
Which of these ICU libraries (and files) should be packaged with the application for ICU data to be available at run time? As mentioned earlier, I used libicui18n and libicuuc libraries for linking, so these need to be present.
Aside from the above two libraries, libicudata, going by the name, seems to be the obvious candidate. Correct?
Is there a static version of libicui18n and libicuuc libraries available for use or does one have to build it?
In general, what is the process followed for integrating ICU with a product?
Thanks!
ICU always needs to link against its data library.
Here's a very general discussion about which libraries you ened.
ICU has to be built with the --enable-static option to allow static linkage.
Ideally you will want to use pkg-config to manage your linkage against ICU.
If you are on centOS, rather than linking statically (with its headaches), you might consider just compiling against the libicu-devel package (using pkg-config as mentioned above) and then at run time your users can just include the appropriate libicu package.

Is it possible to compile a portable executible on Linux based on yum or rpm?

Usually one rpm depends on many other packages or libs. This is not easy for massive deployment without internet access.
Since yum can automatically resolve dependencies. Is it possible to build a portable executable? So that we can copy it to other machines with the same OS.
If you want a known collection of RPMs to install, yum offers a downloadonly plugin. With that, you should be able to collect all the associated RPMs in one shot to install what you wanted on a disconnected machine.
The general way to build a binary without runtime library dependencies is to build it to be static, ie. using the -static argument to gcc, which links in static versions of the libraries required such that they're included in the resulting executable. This doesn't bundle in any data file dependencies or external executables (ie. libexec-style helpers), but simpler applications often don't need them.
For more complex needs (where data files are involved, or elements of the dependency chain can't be linked in for one reason or another), consider using AppImageKit -- which bundles an application and its dependency chain into a runnable ISO. See docs/links at PortableLinuxApps.org.
In neither of these cases does rpm or yum have anything to do with it. It's certainly possible to build an RPM that packages static executables, but that's a matter of changing the %build section of the spec file such that it passes -static to gcc, not of doing anything RPM-specific.
To be clear, by the way -- there are compelling reasons why we don't use static libraries all the time!
Using shared libraries means that applying a security update to a library only means replacing the library itself, not recompiling all applications using it.
Using shared libraries is more memory-efficient, since the single shared copy of the library in memory can be used by multiple applications.
Using shared libraries means your executables don't need to include full copies of all the libraries they use, making them much smaller.

Why do NSS modules have to end in .so.2 on Linux?

I've built a Name Service Switch Module for Red Hat Linux.
Using strace, I've determined that the OS looks for the library in various directories, but only for files with the extension .so.2 (e.g. libnss_xxx.so.2, where xxx is the service name)
Why doesn't it look for .so or .so.1 libraries? Is there any guarantee that it won't stop looking for .so.2 libraries and start looking for .so.3 libraries in future?
EDIT: http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html, says that the 2 is 'a version number that is incremented whenever the interface changes'.
So I guess that:
The version of NSS requires version 2 of the libraries.
An OS update with an updated NSS might require a different version number.
Can someone confirm whether that is true?
You're assumption is generally true with a minor edit:
The version of NSS requires a version of the libraries with interface version 2.
An OS update with an updated NSS might require a different version number.
The version of an interface does not necessarily need to change with the version of the library, i.e. a newer version of the library might still provide the same interface.
There are two types of so files: shared libraries (loaded and scanned for symbols at compile time, loaded again and linked at program startup time) and modules (loaded and linked at run time). The idea of shared libraries is that your program requires a certain version of the library. This version is determined at compile time. Once the program is compiled, it should continue to work even if a new (incompatible) version of the library is installed. This means that the new version must be a different file, so old programs can still use the old library while newer (or more recently compiled) programs use the newer version.
To properly use this system, your program must somehow make sure that the library version it needs will continue to be installed. This is one of the tasks of a distribution's packaging system. The package containing your program must have a dependency on the required version of the library package.
However, you seem to be talking about modules. Things are different there. They do not carry such a version, because ld.so (which takes care of loading shared libraries) isn't the one loading them. Your program should be bundled with those modules, so the module versions are always compatible with the program using them. This works for most programs.
But it doesn't work if your program allows third party modules. So they can come up with their own versioning system. This seems to be what nss has done (I'm not familiar with it, though). This means that they have defined a protocol version (currently 2), which specifies what a module should look like: which symbols need to be defined, what arguments do functions expect, these sort of things. If you create a module following version 2 of the protocol, you should name your module .so.2 (because that's their way of checking your supported version). If they create a new incompatible protocol 3, they will start looking for .so.3. Your module will no longer be found, and that's a good thing, because it will also not support the new protocol.

Best practice for bundling third party libraries for distribution in Python 3

I'm developing an application using Python 3. What is the best practice to use third party libraries for development process and end-user distribution? Note that I'm working within these constraints:
Developers in the team should have the exact same version of the libraries.
An ideal solution would work on both Windows and Linux.
I would like to avoid making the user install software before using our own; that is, they shouldn't have to install product A and product B before using ours.
You could use setuptools to create egg files for your libraries, assuming they aren't available in egg form already. You could then bundle the eggs alongside your software, which would need to either install them, or ensure that they were on the import path.
This has some complexities, i.e. if your libraries have C-extensions, then your eggs become platform-specific, but in my experience this is the most widely-accepted means of 'bundling' stuff in Python.
I have to say that this remains one of Python's weaknesses, though; the third-party ecosystem is certainly aimed at developers rather than end-users.
There are no best practices, but there are a few different tracks people follow. With regard to commercial product distribution there are the following:
Manage Your Own Package Server
With regard to your development process, it is typical to either have your dev boxes update from a local package server. That allows you to "freeze" the dependency list (i.e. just stop getting upstream updates) so that everyone is on the same version. You can update at particular times and have the developers update as well, keeping everyone in lockstep.
For customer installs you usually write an install script. You can collect all the packages and install your libs, as well as the other at the same time. There can be issues with trying to install a new Python, or even any standard library because the customer may already depend on a different version. Usually you can install in a sandbox to separate your packages from the systems packages. This is more of a problem on Linux than Windows.
Toolchain
The other option is to create a toolchain for each supported OS. A toolchain is all the dependencies (up to, but not including base OS libs like glibc). This toolchain gets packaged up and distributed for both developers AND customers. Best practice for a toolchain is:
change the executable to prevent confusion. (ie. python -> pkg_python)
don't install in .../bin directories to prevent accidental usage. (ie. on Linux you can install under .../libexec. /opt is also used although personally I detest it.)
install your libs in the correct location under lib/python/site-packages so you don't have to use PYTHONPATH.
Distribute the source .py files for the executables so the install script can relocate them appropriately.
The package format should be an OS native package (RedHat -> RPM, Debian -> DEB, Win -> MSI)
For developers use PIP with requirements file.
For end users, specify requirements in setup.py.

library- vs. application-version

If you have a project, that releases a library and an application, how you handle version-numbers between the two.
Example: Your project delivers a library, that convert different file-formats into each other. The library is released for inclusion into other applications. But you also release a command-line-application, that uses this library and implements an interface to the functionality.
New releases of the library lead to new releases of the application (to make use of all new features), but new releases of the application may not trigger new releases of the library. Now how are the versions numbers handled: Completely independent or should library- and application-version be dependent in some way?
Completely independent version numbers, but the command line (or any other dependent) app should say which version of the library it was compiled against in the help section or a banner.
That way you will be able to tell which functionality will the apps have and reduce potential confusion, especially given that somebody could compile a newer app version against an old library for any reason. Also, you decouple them and can add features on the library without depending on release of a new app version and so on.
If you are sure you will always want all the apps and library to go in lockstep then you could use same numbers, but that's adding a constraint for not a strong reason.
I'd say use separate version numbers, and of course document what minimum library version is required for each release of the app. If they always have the same version number, and you only ever test the app against the equal-numbered library version, then they aren't really separate components, so don't say they are. Release the whole lot as one lump.
If you make them separate, you can still give them the same version number when it's appropriate - for example after a major compatibility break you might release Version 2.0 of both simultaneously.
The following example illustrates: xsltproc (a command-line app) is released as part of libxslt (a library), so doesn't have its own version number. But libxslt depends on two other libraries, and the version numbers of those are independent.
$ xsltproc --version
Using libxml 20628, libxslt 10120 and libexslt 813
xsltproc was compiled against libxml 20628, libxslt 10120 and libexslt 813
libxslt 10120 was compiled against libxml 20628
libexslt 813 was compiled against libxml 20628
We built an application that uses a framework. We keep separate version numbers for both.
This works well, especially that now the framework and application have grown large enough to be developed by different teams.
So my opinion... keep the version numbers separate.

Resources