Using GNU Standard Directory Variables inside executable

Often one needs the location of one of the standard GNU directories inside the executable. Unfortunately, GNU Autoconf does not provide a standard way to do this; it suggests several workarounds, each with different disadvantages. A common way to access the installed location is to add a preprocessor define for it in CPPFLAGS:
AM_CPPFLAGS = -DDATADIR='"$(datadir)"'
However, the GNU Autoconf manual's section for defining directories contains the following sentence:
Note that all the previous solutions hard wire the absolute name of these directories in the executables, which is not a good property. You may try to compute the names relative to prefix, and try to find prefix at runtime, this way your package is relocatable.
Is there a library or any standard way to compute the GNU directories inside an executable as suggested in the quoted paragraph? Would that have other disadvantages compared to the preprocessor define mentioned above?

I think the docs are rather clear about this: the standard way is to not make any assumptions about the absolute path, and to use relative paths instead. In particular, you should not make any assumptions about ${prefix}.
So if your application needs to access shared data, access it via ../share/foo/foodata.txt rather than using /usr/local/share/foo/foodata.txt; this way you can easily re-locate your application.
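As a rough illustration of the relative-path approach, here is a minimal C sketch. It assumes a Linux system where /proc/self/exe points at the running binary; the helper names path_from_exe and path_from_self are made up for this example and do not come from any library.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "<directory of exe_path>/<rel>" into out.
 * Returns 0 on success, -1 if exe_path has no directory part
 * or the result does not fit into out. */
static int path_from_exe(const char *exe_path, const char *rel,
                         char *out, size_t outsz)
{
    const char *slash = strrchr(exe_path, '/');
    if (slash == NULL)
        return -1;
    int dirlen = (int)(slash - exe_path);
    int n = snprintf(out, outsz, "%.*s/%s", dirlen, exe_path, rel);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}

/* Linux-specific assumption: /proc/self/exe is a symlink to the
 * running binary. Resolves rel against that binary's directory. */
static int path_from_self(const char *rel, char *out, size_t outsz)
{
    char exe[4096];
    ssize_t len = readlink("/proc/self/exe", exe, sizeof exe - 1);
    if (len < 0)
        return -1;
    exe[len] = '\0';            /* readlink does not NUL-terminate */
    return path_from_exe(exe, rel, out, outsz);
}
```

With the binary installed as /opt/foo/bin/foo, calling path_from_self("../share/foo/foodata.txt", buf, sizeof buf) would yield /opt/foo/bin/../share/foo/foodata.txt, which keeps working if the whole tree is moved.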
Afaik, there is no external library that computes the standard paths for you, based on your calling binary.
This is probably for two reasons:
If the binary indeed uses the standard paths, then it's trivial to calculate those paths yourself (using relative paths). What would a library do better?
If the binary does not use the standard paths (e.g. because the builder used something like the admittedly hypothetical example below), then resolving these paths is virtually impossible, so a library won't help you either.
Example:
./configure --sbindir=/home/me/sbin --bindir=/opt/foo/bin
make pkglibdir=/usr/lib/goo/
make install libdir=/usr/local/foo/lib/
A helper library (or your application) might record all those paths in some auxiliary file (for additional lookups if the standard paths fail), but I think the biggest problem is that there is no defined place to store that file:
libdir or datadir are obviously not good (the data is supposed to help resolve these paths, so it cannot rely on them).
Putting the data into the same directory as the application binary breaks the assumption that bindir only contains executables.
Putting the data into the application binary might require that binary to be modified during make install, which sounds very dirty as well.

Related

Loading Linux libraries at runtime

I think a major design flaw in Linux is the shared object hell when it comes to distributing programs in binary instead of source code form.
Here is my specific problem: I want to publish a Linux program in ELF binary form that should run on as many distributions as possible so my mandatory dependencies are as low as it gets: The only libraries required under any circumstances are libpthread, libX11, librt and libm (and glibc of course). I'm linking dynamically against these libraries when I build my program using gcc.
Optionally, however, my program should also support ALSA (sound interface), the Xcursor, Xfixes, and Xxf86vm extensions as well as GTK. But these should only be used if they are available on the user's system, otherwise my program should still run but with limited functionality. For example, if GTK isn't there, my program will fall back to terminal mode. Because my program should still be able to run without ALSA, Xcursor, Xfixes, etc. I cannot link dynamically against these libraries because then the program won't start at all if one of the libraries isn't there.
So I need to manually check if the libraries are present and then open them one by one using dlopen() and import the necessary function symbols using dlsym(). This, however, leads to all kinds of problems:
1) Library naming conventions:
Shared objects often aren't simply called "libXcursor.so" but have some kind of version extension like "libXcursor.so.1" or even really funny things like "libXcursor.so.0.2000". These extensions seem to differ from system to system. So which one should I choose when calling dlopen()? Using a hardcoded name here seems like a very bad idea because the names differ from system to system. So the only workaround that comes to my mind is to scan the whole library path and look for filenames starting with a "libXcursor.so" prefix and then do some custom version matching. But how do I know that they are really compatible?
2) Library search paths: Where should I look for the *.so files after all? This is also different from system to system. There are some default paths like /usr/lib and /lib but *.so files could also be in lots of other paths. So I'd have to open /etc/ld.so.conf and parse this to find out all library search paths. That's not a trivial thing to do because /etc/ld.so.conf files can also use some kind of include directive which means that I have to parse even more .conf files, do some checks against possible infinite loops caused by circular include directives etc. Is there really no easier way to find out the search paths for *.so?
So, my actual question is this: Isn't there a more convenient, less hackish way of achieving what I want to do? Is it really so complicated to create a Linux program that has some optional dependencies like ALSA, GTK, libXcursor... but should also work without it! Is there some kind of standard for doing what I want to do? Or am I doomed to do it the hackish way?
Thanks for your comments/solutions!

I think a major design flaw in Linux is the shared object hell when it comes to distributing programs in binary instead of source code form.
This isn't a design flaw as far as creators of the system are concerned; it's an advantage -- it encourages you to distribute programs in source form. Oh, you wanted to sell your software? Sorry, that's not the use case Linux is optimized for.
Library naming conventions: Shared objects often aren't simply called "libXcursor.so" but have some kind of version extension like "libXcursor.so.1" or even really funny things like "libXcursor.so.0.2000".
Yes, this is called external library versioning. Read about it here. As should be clear from that description, if you compiled your binaries using headers on a system that would normally give you libXcursor.so.1 as a runtime reference, then the only shared library you are compatible with is libXcursor.so.1, and trying to dlopen libXcursor.so.0.2000 will lead to unpredictable crashes.
Any system that provides libXcursor.so but not libXcursor.so.1 is either a broken installation, or is also incompatible with your binaries.
Library search paths: Where should I look for the *.so files after all?
You shouldn't be trying to dlopen any of these libraries using their full path. Just call dlopen("libXcursor.so.1", RTLD_LAZY | RTLD_GLOBAL); (the flags must include either RTLD_LAZY or RTLD_NOW), and the runtime loader will search for the library in system-appropriate locations.
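A minimal sketch of that pattern in C, for optional dependencies. The helper names load_optional and optional_symbol are hypothetical; only dlopen, dlsym and dlerror are real calls. On glibc older than 2.34 this needs linking with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Try to load an optional library by its soname, letting the runtime
 * loader search the system locations. Returns NULL if it is missing. */
static void *load_optional(const char *soname)
{
    return dlopen(soname, RTLD_LAZY | RTLD_LOCAL);
}

/* Look up a symbol in an optional library.
 * Returns NULL if the library or the symbol is unavailable. */
static void *optional_symbol(void *handle, const char *name)
{
    if (handle == NULL)
        return NULL;
    dlerror();                          /* clear any stale error state */
    void *sym = dlsym(handle, name);
    return dlerror() == NULL ? sym : NULL;
}
```

The program would call load_optional("libXcursor.so.1") once at startup and switch to its reduced-functionality path when the result is NULL, instead of scanning directories or parsing /etc/ld.so.conf itself.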

How to get a list of paths in /etc/ld.so.conf on Linux

What is the most portable and robust way to get the list of paths, configured by /etc/ld.so.conf and files included from it? Parsing the file manually seems to be not a good idea — the format is likely to change in the future revisions.
To allow better understanding of the question, I will give you specific details below. Note that, despite these details, this is a general programming question, applicable to other situations.
There is a program, called LuaRocks. It is a package manager for Lua programming language (somewhat like Ruby gems or Python eggs). LuaRocks packages are called "rocks".
As a convenience feature, LuaRocks allows a rock author to specify a list of external dependencies for a rock, formulated as a list of C header files and / or dynamic library files. (.so on Linux.) If the specified file does not exist, the rock can't be installed.
Currently, on Linux, LuaRocks by default checks for the existence of a .so file by searching for it in two hardcoded paths, /usr/lib and /usr/local/lib.
I believe that this is incorrect behaviour, and that it has been broken by recent changes in Ubuntu and other Debian-based distributions.
Update: the paths are not hardcoded per se, but are user-configurable in the config file. Still, IMO, that is not the best solution.
Instead (as I understand it), LuaRocks should look up the file in the paths specified by /etc/ld.so.conf and the files included from it.
(Now please re-read the question above ;-) )

You shouldn't need to parse /etc/ld.so.conf or any of the config files - if you run 'ldconfig', it will scan the configured directories and generate a cache file.
Then, subsequently when you attempt a dlopen it'll automatically find the files by iterating through the cached library directories. Same thing with compiling and giving -lSomeLib, you shouldn't need to specify -L/my/other/path if you've got it configured in ld.so.conf(.d)
autoconf accomplishes this by attempting to compile a test program that links to the shared library, but that's just a functional wrapper around the dlopen() call.
So, while other methods may not necessarily be 'wrong', at the root of it attempting to link to the library or doing a dlopen() are the 'most right' ways of doing it.
Consider this: if you attempt to link to a library in a directory that ISN'T cached in /etc/ld.so.cache, then when you try to run the program it will fail, because it won't be able to dlopen() the library!
Hence, any 'good' shared library will be in /etc/ld.so.cache and be linkable/dlopen()able. This means that gcc can use it to link, and that the resulting library or executable will be able to open it when it executes.
You can circumvent this by expressly setting the environment variable LD_LIBRARY_PATH, or LD_PRELOAD, but each of these has its own caveats and should be avoided if possible for 'standard' use.
A good write-up that covers some of these issues, and a good read for anyone working on programmatically consuming shared libraries, is Ulrich Drepper's How to write shared libraries.

According to the FHS, the following are valid locations for dynamic libraries:
/lib*/
/opt/*/lib*/
/usr/lib*/
/usr/local/lib*/
(And most likely ~/lib*/ as well.)
All entries in my /etc/ld.so.conf.d/* conform to this. Some entries reference subdirectories below the FHS dirs, which probably means that you can use the libraries in there without path information.
Now I don't know enough about LuaRocks. If you're limited to Lua-path-style globs (only ?), you cannot match these and have to parse the configs. Otherwise, you could just try to find them anywhere in these directories.
This would break on non-FHS-conforming systems (only option: parse config) and if a directory is not included in the config, the installer might see libraries that the linker cannot find.
These two seem acceptable to me, therefore I'd simply ignore the config and look at these dirs.
(Another possibility could be trying to link the library, this should automagically use the right path. However, this is platform-specific and maybe dangerous.)

Recommended FHS compliant application test/install workflow under Linux?

I'm in the process of switching to Linux for development, and I'm puzzled about how to maintain good FHS compliance in my programs.
For example, under Windows, I know that all the resources (bitmaps, audio data, etc.) that my program needs can be found with paths relative to the executable, so it's the same whether I'm running the program from my development directory or from an installation (under "Program Files", for example): the program will be able to locate all its files.
Now, under Linux, I see that usually the executable goes under /usr/local/bin and its resources on /usr/local/share. (And the truth is that I'm not even sure of this)
For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under a same path, say, for example, project/src for the source and project/data for resource files.
Is there any standard or recommended way to let me just rebuild the binary for testing and use the files on the project/data directory, while also being able to locate the files when they are under /usr/local/share?
I thought, for example, of setting a symlink under /usr/local/share pointing to my resources dir and then just hardcoding that path inside my program, but I feel it's quite hackish and not very portable.
I also thought of running an install script that copies all the resources to /usr/local/share every time I change or add resources, but I feel that's not a good way to do it either.
Could anyone tell me or point me to where it tells how this issue is usually resolved?
Thanks!

For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under a same path, say, for example, project/src for the source and project/data for resource files.
You can organize your source tree as you wish — it need not bear any resemblance to the FHS layout desired of installed software.
I see that usually the executable goes under /usr/local/bin and its resources on /usr/local/share. (And the truth is that I'm not even sure of this)
Distribution packages normally use the prefix /usr. /usr/local is for, well, "local installations", as the FHS spec reiterates.
Is there any standard or recommended way to let me just rebuild the binary for testing and use the files on the project/data directory
Definitely. For example, running ./configure --datadir=$PWD/share (substitute the proper path) points your build at the data files in the source tree. Then use something like -DDATADIR='"$(datadir)"' in AM_CPPFLAGS to make the value known to the (presumably C) code as a string literal. (All of that assumes you are using autoconf/automake. Similar options may be available in other build systems.)
This sort of hardcoding is what is used in practice, and it suffices. For a development build within your own working copy, having a hardcoded path should not be a problem, and final builds (those done by a packager) will simply use the standard FHS paths.
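On the C side, consuming such a define could look roughly like the sketch below. The fallback value and the helper name datadir_file are assumptions made for illustration; the only thing taken from the build is the DATADIR macro passed on the compiler command line.

```c
#include <stdio.h>

/* DATADIR is expected to come from the build system, e.g.
 * -DDATADIR='"$(datadir)"'. The fallback below is only an
 * assumption so that the file also compiles on its own. */
#ifndef DATADIR
#define DATADIR "/usr/local/share"
#endif

/* Build "<DATADIR>/<package>/<name>" into out.
 * Returns 0 on success, -1 if the result does not fit. */
static int datadir_file(const char *package, const char *name,
                        char *out, size_t outsz)
{
    int n = snprintf(out, outsz, "%s/%s/%s", DATADIR, package, name);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

A development build configured with --datadir pointing into the working copy and a packaged build using the FHS location then share the same source code; only the compiled-in string differs.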

You could just test a few locations. For example, first check if you have a data directory within the directory you're currently running the program from. If so, just go ahead and use it. If not, try /usr/local/share/yourproject/data, and so on.
For developing/testing, you can use the data directory within your project folder, and for deploying, use the stuff in /usr/local/share/. Of course, you can test for even more locations (e.g. /usr/share).
Basically the requirement for this method is that you have a function that builds the correct paths for all filesystem accesses. Instead of fopen("data/blabla.conf", "w") use something like fopen(path("blabla.conf"), "w"). path() will construct the correct path from the path determined using the directory tests when the program started. E.g. if the path was /usr/local/share/yourproject/data/, the string returned by path("blabla.conf") would be "/usr/local/share/yourproject/data/blabla.conf" - and there is your nice absolute path.
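A minimal sketch of this probing approach in C. The function names first_data_dir and data_path are made up for the example (data_path plays the role of path() above, but writes into a caller-supplied buffer), and the candidate list is whatever the application chooses.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Return 1 if path names an existing directory, 0 otherwise. */
static int is_dir(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Return the first candidate that is an existing directory, or NULL.
 * Intended to be called once at program start. */
static const char *first_data_dir(const char *const *candidates, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (is_dir(candidates[i]))
            return candidates[i];
    return NULL;
}

/* Join base and name with exactly one '/' between them.
 * Returns 0 on success, -1 if the result does not fit into out. */
static int data_path(const char *base, const char *name,
                     char *out, size_t outsz)
{
    size_t len = strlen(base);
    const char *sep = (len > 0 && base[len - 1] == '/') ? "" : "/";
    int n = snprintf(out, outsz, "%s%s%s", base, sep, name);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

At startup the program would pass candidates such as "data", "/usr/local/share/yourproject/data" and "/usr/share/yourproject/data" to first_data_dir, remember the result, and route every later file access through data_path.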
That's how I'd do it. HTH.

My preferred solution in cases like this is to use a configuration file, along with a command-line option that overrides its location.
For example, a configuration file for a fully deployed application named myapp could reside in /etc/myapp/settings.conf and a part of it could look like this:
...
confdir=/etc/myapp/
bindir=/usr/bin/
datadir=/usr/share/myapp/
docdir=/usr/share/doc/myapp/
...
Your application (or a launcher script) can parse this file to determine where to find the rest of the needed files.
I believe that you can reasonably assume in your code that the location of the configuration file is fixed under /etc/myapp - or any other location specified at compile time. Then you provide a command line option to allow that location to be overridden:
myapp --configfile=/opt/myapp/etc/settings.conf ...
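To make this concrete, here is a small C sketch of the lookup, assuming the simple key=value format shown above with no comments or whitespace handling. The function names config_path and config_lookup, and the DEFAULT_CONFIG macro, are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Compile-time default location, as in the example above. */
#define DEFAULT_CONFIG "/etc/myapp/settings.conf"

/* Return the path given via --configfile=..., or the default. */
static const char *config_path(int argc, char **argv)
{
    static const char opt[] = "--configfile=";
    for (int i = 1; i < argc; i++)
        if (strncmp(argv[i], opt, sizeof opt - 1) == 0)
            return argv[i] + (sizeof opt - 1);
    return DEFAULT_CONFIG;
}

/* Find the first "key=value" line in fp and copy the value into out.
 * Returns 0 on success, -1 if the key is missing or the value is too long. */
static int config_lookup(FILE *fp, const char *key, char *out, size_t outsz)
{
    char line[512];
    size_t klen = strlen(key);

    rewind(fp);
    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';     /* strip the line ending */
        if (strncmp(line, key, klen) == 0 && line[klen] == '=') {
            int n = snprintf(out, outsz, "%s", line + klen + 1);
            return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
        }
    }
    return -1;
}
```

The application would open the file returned by config_path, abort with a clear message if that fails, and then call config_lookup for datadir, docdir and so on.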
It might also make sense to have options for some of the directory paths as well, so that the user can easily override any of the configuration file settings. This approach has a couple of advantages:
Your users can relocate the application very easily - just by moving the files, modifying the paths in the configuration file and then using e.g. a wrapper script to call the main application with the proper --configfile option.
You can easily support FHS, as well as any other scheme you need to.
While developing, you can have your testsuite use a specially crafted configuration file with the paths being wherever you need them to be.
Some people advocate probing the system at runtime to resolve issues like this. I usually suggest avoiding such solutions for at least the following reasons:
It makes your program non-deterministic. You can never tell at a first glance which configuration file it picks up - especially if you have multiple versions of the application on your system.
At any installation mix-up, the application will remain fat and happy - and so will the user. In my opinion, the application should look at one specific and well-documented location and abort with an informative message if it cannot find what it is looking for.
It's highly unlikely that you will always get everything right. There will always be unexpected rare environments or corner cases that the application will not handle.
Such behaviour is against the Unix philosophy. Even command shells probe multiple locations only because every one of those locations can hold a file that should be parsed.
EDIT:
This method is not mandated by any formal standard that I know of, but it is the prevalent solution in the Unix world. Most major daemons (e.g. BIND, sendmail, postfix, INN, Apache) will look for a configuration file at a certain location, but will allow you to override that location and - through the file - any other path.
This is mostly to allow the system administrator to implement whatever scheme they want or to set up multiple concurrent installations, but it does help during testing as well. This flexibility is what makes it a Best Practice if not a proper standard.

Specifying different platform specific package at compile time in Ada (GNAT)

I'm still new to the Ada programming world so forgive me if this question is obvious.
I am looking at developing an application (in Ada, using the features in the 2005 revision) that reads from the serial port and basically performs manipulation of the strings and numbers it receives from an external device.
Now my intention was to likely use Florist and the POSIX terminal interfaces to do all the serial work on Linux first....I'll get to Windows/MacOS/etc... some other time but I want to leave that option open.
I would like to follow Ada best practices in whatever I do with this. So instead of a hack like conditional compilation under C (which I know Ada does not have anyway), I would like to find out how you are supposed to specify a change in package files from the command line (gnatmake, for example).
The only thing I can think of right now is that I could name all platform packages exactly the same (i.e. package name Serial.Connector with the same filenames), place them in different folders in the project archive, and then at compile time specify the directories/libraries to search with the -I argument, changing the directory names for different platforms.
This is the way I was shown for GCC using C/C++. Is this still the best way with Ada using GNAT?
Thanks,
-Josh

That's a perfectly acceptable way of handling this kind of situation. If at all possible you should have a common package specification (or specifications if more than one is appropriate), with all the platform-specific stuff strictly confined to the corresponding package body variations.
(If you did want to go down the preprocessor path, there's a GNAT preprocessor called gnatprep that can be used, but I don't like conditional compilation either, so I'd recommend staying with the separate subdirectories approach.)

You could use the GNAT Project file package Naming: an extract from a real example, where I wanted to choose between two versions of a package in the same directory, one with debug additions, is
...
type Debug_Code is ("no", "yes");
Debug : Debug_Code := External ("DEBUG", "no");
...
package Naming is
   case Debug is
      when "yes" =>
         for Spec ("BC.Support.Managed_Storage")
           use "bc-support-managed_storage.ads-debug";
         for Body ("BC.Support.Managed_Storage")
           use "bc-support-managed_storage.adb-debug";
      when "no" =>
         null;
   end case;
end Naming;
To select the special naming, either set the environment variable DEBUG to yes or build with gnatmake -XDEBUG=yes.

Yes, the generally accepted way to handle this in Ada is to do it with different files, selected by your build system. GNU make is about as multiplatform as it gets, and lets you build different files (with different names and/or directories and everything) under different configurations.
As a matter of fact, I find this a superior way (over #ifdefs) to do it in C as well.

Gurus say that LD_LIBRARY_PATH is bad - what's the alternative?

I read some articles about problems in using the LD_LIBRARY_PATH, even as a part of a wrapper script:
http://linuxmafia.com/faq/Admin/ld-lib-path.html
http://blogs.oracle.com/ali/entry/avoiding_ld_library_path_the
In this case - what are the recommended alternatives?
Thanks.

You can try adding:
-Wl,-rpath,path/to/lib
to the linker options. This will save you the need to worry about the LD_LIBRARY_PATH environment variable, and you can decide at compile time to point to a specific library.
For a path relative to the binary, you can use $ORIGIN, eg
-Wl,-rpath,'$ORIGIN/../lib'
($ORIGIN may not work when statically linking to shared libraries with ld, use -Wl,--allow-shlib-undefined to fix this)

I've always set LD_LIBRARY_PATH, and I've never had a problem.
To quote your first link:
When should I set LD_LIBRARY_PATH? The short answer is never. Why? Some users seem to set this environment variable because of bad advice from other users or badly linked code that they do not know how to fix.
That is NOT what I call a definitive problem statement. In fact it brings to mind "I don't like it" [YouTube, but SFW].
That second blog entry (http://blogs.oracle.com/ali/entry/avoiding_ld_library_path_the) is much more forthcoming on the nature of the problem, which appears to be, in a nutshell, library version clashes: ThisProgram requires Foo1.2, but ThatProgram requires Foo1.3, hence you can't run both programs (easily). Note that most of these problems are negated by a simple wrapper script which sets LD_LIBRARY_PATH for just the executing shell, which is (almost always) a separate child process of the interactive shell.
Note also that the alternatives are pretty well explained in the post.
I'm just confused as to why you would post a question containing links to articles which apparently answer your question... Do you have a specific question which wasn't covered (clearly enough) in either of those articles?

the answer is in the first article you quoted.
In UNIX the location of a library can be specified with the -L dir option to the compiler.
....
As an alternative to using the -L and -R options, you can set the environment variable LD_RUN_PATH before compiling the code.

I find that the existing answers do not actually answer the question in a straightforward way:
1. LD_RUN_PATH is used by the linker (see ld) at the time you link your software. It is used only if you have no -rpath ... on the command line (-Wl,-rpath,... on the gcc command line). The path(s) defined in that variable are added to the RPATH entry in your ELF binary file. (You can see that RPATH using objdump -x binary-filename. In most cases it is not there, though! It appears in my development binaries, but once the final version gets installed RPATH gets removed.)
2. LD_LIBRARY_PATH is used at runtime, when you want to specify a directory that the dynamic linker (see ldd) needs to search for libraries. Specifying the wrong path could lead to loading the wrong libraries. This is used in addition to the RPATH value defined in your binary (as in 1.)
LD_RUN_PATH really causes no security threat unless you are a programmer and don't know how to use it. As I am using CMake to build my software, the -rpath is used all the time. That way I do not have to install everything to run my software. ldd can find all the .so files automatically. (the automake environment was supposed to do that too, but it was not very good at it, in comparison.)
LD_LIBRARY_PATH is a runtime variable and thus you have to be careful with it. That being said, many shared objects would be really difficult to deal with if we did not have that special feature. Is it a security threat? Probably not. If a hacker takes hold of your computer, LD_LIBRARY_PATH is accessible to that hacker anyway. What could happen is that you use the wrong path(s) in that variable: your binary may not load, and if it does load you may end up with a crashing binary, or at least a binary that does not work quite right. One concern is that over time you get new versions of the library and are likely to forget to remove the LD_LIBRARY_PATH setting, which means you may be using an insecure version of the library.
The one other possibility for security is if the hacker installs a fake library of the same name as what the binary is searching, library that includes all the same functions, but that has some of those functions replaced with sneaky code. He can get that library loaded by changing the LD_LIBRARY_PATH variable. Then it will eventually get executed by the hacker. Again, if the hacker can add such a library to your system, he's already in and probably does not need to do anything like that in the first place (since he's in he has full control of your system anyway.) Because in reality, if the hacker can only place the library in his account he won't do anything much (unless your Unix box is not safe overall...) If the hacker can replace one of your /usr/lib/... libraries, he already has full access to your system. So LD_LIBRARY_PATH is not needed.
