I'm in the process of switching to Linux for development, and I'm puzzled about how to maintain good FHS compliance in my programs.
For example, under Windows, I know that all the resources (bitmaps, audio data, etc.) that my program needs can be found via paths relative to the executable, so it's the same whether I'm running the program from my development directory or from an installation (under "Program Files", for example): the program will be able to locate all its files.
Now, under Linux, I see that usually the executable goes under /usr/local/bin and its resources under /usr/local/share. (And the truth is that I'm not even sure of this.)
For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under the same path, say, for example, project/src for the source and project/data for resource files.
Is there any standard or recommended way to let me just rebuild the binary for testing and use the files in the project/data directory, while also being able to locate the files when they are under /usr/local/share?
I thought, for example, of setting up a symlink under /usr/local/share pointing to my resources directory and then just hardcoding that path inside my program, but that feels quite hackish and not very portable.
I also thought of running an install script that copies all the resources to /usr/local/share every time I change or add resources, but that doesn't feel like a good way to do it either.
Could anyone tell me, or point me to where I can read, how this issue is usually resolved?
Thanks!
For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under the same path, say, for example, project/src for the source and project/data for resource files.
You can organize your source tree as you wish — it need not bear any resemblance to the FHS layout desired of installed software.
I see that usually the executable goes under /usr/local/bin and its resources under /usr/local/share. (And the truth is that I'm not even sure of this)
The standard prefix is /usr. /usr/local is for, well, "local installations" as the FHS spec reiterates.
Is there any standard or recommended way to let me just rebuild the binary for testing and use the files on the project/data directory
Definitely. Running, for example, ./configure --datadir=$PWD/share is the way to point your build at the data files from the source tree (substitute the proper path), and using something like -DDATADIR='"${datadir}"' in AM_CFLAGS makes the value known to the (presumably C) code. (All of that, provided you are using autoconf/automake. Similar options may be available in other build systems.)
This sort of hardcoding is what is used in practice, and it suffices. For a development build within your own working copy, having a hardcoded path should not be a problem, and final builds (those done by a packager) will simply use the standard FHS paths.
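For illustration, here is a minimal sketch of how such a define is typically consumed on the C side (the resource file name is made up; DATADIR comes from the build system as described above):

#include <stdio.h>
#include <stdlib.h>

/* DATADIR is normally supplied by the build system, e.g.
 *   AM_CFLAGS = ... -DDATADIR='"${datadir}"'
 * Fall back to the source tree for a plain development build. */
#ifndef DATADIR
#define DATADIR "data"
#endif

int main(void)
{
    /* Adjacent string literals are concatenated, so this expands to
     * e.g. "/usr/local/share/sprites.png" or "data/sprites.png". */
    FILE *f = fopen(DATADIR "/sprites.png", "rb");
    if (f == NULL) {
        fprintf(stderr, "cannot open %s\n", DATADIR "/sprites.png");
        return EXIT_FAILURE;
    }
    /* ... load the resource ... */
    fclose(f);
    return EXIT_SUCCESS;
}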
You could just test a few locations. For example, first check if you have a data directory within the directory you're currently running the program from. If so, just go ahead and use it. If not, try /usr/local/share/yourproject/data, and so on.
For developing/testing, you can use the data directory within your project folder, and for deploying, use the stuff in /usr/local/share/. Of course, you can test for even more locations (e.g. /usr/share).
Basically, the requirement for this method is that you have a function that builds the correct path for all filesystem accesses. Instead of fopen("data/blabla.conf", "w"), use something like fopen(path("blabla.conf"), "w"). path() constructs the correct path from the directory determined by the tests performed when the program started. E.g., if that directory was /usr/local/share/yourproject/data/, the string returned by path("blabla.conf") would be "/usr/local/share/yourproject/data/blabla.conf" - and there is your nice absolute path.
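A minimal sketch of such a helper in C (the candidate directories and the fixed-size buffers are simplifications; adjust to taste):

#include <stdio.h>
#include <sys/stat.h>

static char data_root[512];  /* filled in once at program start */

/* Try the development location first, then the installed locations. */
void init_data_root(void)
{
    const char *candidates[] = {
        "data",
        "/usr/local/share/yourproject/data",
        "/usr/share/yourproject/data",
    };
    struct stat st;
    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        if (stat(candidates[i], &st) == 0 && S_ISDIR(st.st_mode)) {
            snprintf(data_root, sizeof data_root, "%s", candidates[i]);
            return;
        }
    }
    snprintf(data_root, sizeof data_root, "data");  /* last resort */
}

/* Returns "<data_root>/<name>" in a static buffer (not thread-safe). */
const char *path(const char *name)
{
    static char buf[1024];
    snprintf(buf, sizeof buf, "%s/%s", data_root, name);
    return buf;
}

With that in place, fopen(path("blabla.conf"), "w") resolves against whichever directory was found at startup.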
That's how I'd do it. HTH.
My preferred solution in cases like this is to use a configuration file, along with a command-line option that overrides its location.
For example, a configuration file for a fully deployed application named myapp could reside in /etc/myapp/settings.conf and a part of it could look like this:
...
confdir=/etc/myapp/
bindir=/usr/bin/
datadir=/usr/share/myapp/
docdir=/usr/share/doc/myapp/
...
Your application (or a launcher script) can parse this file to determine where to find the rest of the needed files.
I believe that you can reasonably assume in your code that the location of the configuration file is fixed under /etc/myapp - or any other location specified at compile time. Then you provide a command line option to allow that location to be overridden:
myapp --configfile=/opt/myapp/etc/settings.conf ...
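A rough sketch of how the application side might look in C - the option parsing and key=value reading are deliberately minimal, and only datadir is shown (the other keys would be handled the same way):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_CONF "/etc/myapp/settings.conf"   /* compile-time default */

static char datadir[1024] = "/usr/share/myapp/";  /* fallback value */

/* Tiny key=value reader; real code would skip comments, trim blanks, etc. */
static void load_config(const char *file)
{
    char line[1024];
    FILE *f = fopen(file, "r");
    if (!f) {
        fprintf(stderr, "cannot open config file %s\n", file);
        exit(EXIT_FAILURE);                       /* fail loudly */
    }
    while (fgets(line, sizeof line, f)) {
        char *eq = strchr(line, '=');
        if (!eq)
            continue;
        *eq = '\0';
        char *val = eq + 1;
        val[strcspn(val, "\n")] = '\0';
        if (strcmp(line, "datadir") == 0)
            snprintf(datadir, sizeof datadir, "%s", val);
        /* ... likewise for confdir, bindir, docdir ... */
    }
    fclose(f);
}

int main(int argc, char **argv)
{
    const char *conf = DEFAULT_CONF;
    for (int i = 1; i < argc; i++)
        if (strncmp(argv[i], "--configfile=", 13) == 0)
            conf = argv[i] + 13;
    load_config(conf);
    printf("using data from %s\n", datadir);
    return 0;
}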
It might also make sense to have options for some of the directory paths as well, so that the user can easily override any of the configuration file settings. This approach has a couple of advantages:
Your users can relocate the application very easily - just by moving the files, modifying the paths in the configuration file and then using e.g. a wrapper script to call the main application with the proper --configfile option.
You can easily support FHS, as well as any other scheme you need to.
While developing, you can have your testsuite use a specially crafted configuration file with the paths being wherever you need them to be.
Some people advocate probing the system at runtime to resolve issues like this. I usually suggest avoiding such solutions for at least the following reasons:
It makes your program non-deterministic. You can never tell at a first glance which configuration file it picks up - especially if you have multiple versions of the application on your system.
In case of any installation mix-up, the application will remain fat and happy - and so will the user. In my opinion, the application should look in one specific and well-documented location and abort with an informative message if it cannot find what it is looking for.
It's highly unlikely that you will always get everything right. There will always be unexpected rare environments or corner cases that the application will not handle.
Such behaviour is against the Unix philosophy. Even command shells that probe multiple locations do so because every one of those locations can hold a file that should be parsed.
EDIT:
This method is not mandated by any formal standard that I know of, but it is the prevalent solution in the Unix world. Most major daemons (e.g. BIND, sendmail, postfix, INN, Apache) will look for a configuration file at a certain location, but will allow you to override that location and - through the file - any other path.
This is mostly to allow the system administrator to implement whatever scheme they want or to set up multiple concurrent installations, but it does help during testing as well. This flexibility is what makes it a best practice, if not a proper standard.
Related
I'm building a node application that has config files that are to be edited by users of the application, and they have file paths in them.
These config files will be used in Windows, Linux and MacOSX. For example, a project using this application might be developed in both Windows and MacOSX, but the config files are the same.
What is the best practice for the format of the paths in this case?
I have a couple of possibilities:
Force POSIX paths in the config files, and then when handling them, convert them to the current platform. The problem here is that Windows users might not like having paths in a format different from the one they are used to.
Allow both formats in the config files and convert them to the current platform when parsing. I'm afraid this might lead to parsing issues. Also, it might lead to weird, mixed config files.
I think there's a lot of software out there that had the same dilemma, so I'm wondering if there's some best practice out there.
Thank you for your help!
Edit: paths are only going to be relative, no absolute paths. If someone puts an absolute path, then that config can't be used in different OSs anyway, so it's ok.
Converting between / and \ on the fly should not be a big issue, dealing with the root of the path is where things get problematic.
I would suggest that you add support for some kind of string expansion so that users can specify paths relative to known locations. Something like this perhaps:
imagedir=%config%/myimages
fallbackimagedir=%appdir%/images
You should also try to deal with full paths.
I would suggest that you convert Windows drive letter paths like c:\foo to /c/foo when reading the config on POSIX systems. This is easy since the string length is the same but it does assume that /c has been set up to be something useful.
Reading a POSIX path on Windows is trickier, when given a path that begins with / you have to prefix it with something. What that something is, is up to you, perhaps the drive letter root of the drive where your application is installed.
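The drive-letter conversion really is just a character swap of equal length; a sketch (in C for concreteness, though the idea is language-agnostic):

#include <ctype.h>

/* Convert "c:\foo\bar" to "/c/foo/bar" in place; the string length
 * stays the same. Assumes a drive-letter path is actually present. */
void win_to_posix(char *path)
{
    if (isalpha((unsigned char)path[0]) && path[1] == ':') {
        path[1] = (char)tolower((unsigned char)path[0]);  /* "c:" -> "/c" */
        path[0] = '/';
    }
    for (char *p = path; *p; p++)
        if (*p == '\\')
            *p = '/';
}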
Often one needs the location of one of the standard GNU directories inside the executable. Unfortunately, GNU Autoconf does not provide a standard way to do this, but it suggests several workarounds, each having different disadvantages. A common way to access the installed location is to add a preprocessor define for it to CPPFLAGS:
AM_CPPFLAGS = -DDATADIR='"$(datadir)"'
However, the GNU Autoconf manual's section for defining directories contains the following sentence:
Note that all the previous solutions hard wire the absolute name of these directories in the executables, which is not a good property. You may try to compute the names relative to prefix, and try to find prefix at runtime, this way your package is relocatable.
Is there a library or any standard way to compute the GNU directories inside an executable as suggested in the quoted paragraph? Would that have other disadvantages compared to the preprocessor define mentioned above?
I think the docs are rather clear about this: the standard way is to not make any assumptions about the absolute path and to use relative paths instead. In particular, you should not make any assumptions about ${prefix}.
So if your application needs to access shared data, access it via ../share/foo/foodata.txt rather than using /usr/local/share/foo/foodata.txt; this way you can easily re-locate your application.
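On Linux, one way to do that is to resolve the location of the running binary through /proc/self/exe and build the data path from it; a rough, Linux-specific sketch (the foo names are just the ones used above):

#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build "<dir-of-executable>/../share/foo/foodata.txt" at runtime, so the
 * whole installation tree can be relocated without rebuilding. */
int get_data_file(char *out, size_t outsz)
{
    char exe[PATH_MAX];
    ssize_t n = readlink("/proc/self/exe", exe, sizeof exe - 1);
    if (n < 0)
        return -1;
    exe[n] = '\0';

    char *slash = strrchr(exe, '/');
    if (slash)
        *slash = '\0';                /* strip the binary name -> bindir */

    snprintf(out, outsz, "%s/../share/foo/foodata.txt", exe);
    return 0;
}

(Other platforms need different tricks to locate the executable, which is one reason there is no portable one-liner for this.)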
Afaik, there is no external library that computes the standard paths for you, based on your calling binary.
This is probably for two reasons:
if the binary indeed uses the standard paths, then it's trivial to calculate those paths yourself (using relative paths). What would a library do better?
if the binary does not use the standard paths (e.g. because the builder used something like the following (admittedly hypothetical) example), then the task of resolving these paths is virtually impossible; so a library won't help you either
ex:
./configure --sbindir=/home/me/sbin --bindir=/opt/foo/bin
make pkglibdir=/usr/lib/goo/
make install libdir=/usr/local/foo/lib/
A helper library (or your application) might record all those paths in some auxiliary file (for additional lookups if the standard paths fail), but I think the biggest problem is that there is no defined place to store that file:
libdir or datadir are obviously not good (as that data is supposed to help resolve these paths, so it cannot rely on them)
putting the data into the same directory as the application binary breaks the assumption of bindir only containing executables.
putting the data into the application binary might require that binary to be modified during make install, which sounds very dirty as well.
I need to obfuscate my source code as well as possible, so I decided to use uglifyjs2. Now, my project structure has nested directories; how can I run it through uglifyjs2 to process the whole project instead of giving it all the input files individually?
I wouldn't mind if it minified the whole project into a single file or something
I've done something very similar to this in a project I worked on. You have two options:
Leave the files in their directory structure.
This is by far the easier option, but provides a much lower level of obfuscation since someone interested enough in your code basically has a copy of the logical organization of files.
An attacker can simply pretty-print all the files and rename the obfuscated variable names in each file until they have an understanding of what is going on.
To do this, use fs.readdir and fs.stat to recursively go through folders, read in every .js file and output the mangled code.
Compile everything into a single JS file.
This is much more difficult for you to implement, but does make life harder on an attacker since they no longer have the benefit of your project's organization.
Your main problem is reconciling your require calls with files that no longer exist (since everything is now in the same file).
I did this by using Uglify to perform static analysis of my source code by analyzing the AST for calls to require. I then loaded the source code of the required file and repeated.
Once all code was loaded, I replaced the require calls with calls to a custom function, wrapped each file's source code in a function that emulates how node's module system works, and then mangled everything and compiled it into a single file.
My custom require function does most of what node's require does except that rather than searching the disk for a module, it searches the wrapper functions.
Unfortunately, I can't really share any code for #2 since it was part of a proprietary project, but the gist is:
Parse the source text into an AST using UglifyJS.parse.
Use the TreeWalker to visit every node of the AST and check if
node instanceof UglifyJS.AST_Call && node.start.value == 'require'
As I have just completed a huge pure Node.js project spanning 80+ files, I had the same problem as the OP. I needed at least minimal protection for my hard work, but it seems this very basic need has not been covered by the npm open-source community. To add salt to the injury, the JXCore package encryption system was cracked last week in a few hours, so it's back to obfuscation...
So I created a complete solution that handles file merging and uglifying. You also have the option of leaving specified files/folders out of the merge; these files are then copied to the new output location of the merged file, and references to them are rewritten automatically.
NPMjs link of node-uglifier
GitHub repo of node-uglifier
PS: I would be glad if people would contribute to make it even better. This is a war between thieves and hard-working coders like yourself. Let's join forces and increase the pain of reverse engineering!
This isn't supported natively by uglifyjs2.
Consider using webpack to package up your entire app into a single minified .js file, excluding node_modules:
http://jlongster.com/Backend-Apps-with-Webpack--Part-I
I had the same need - for which I created node-optimize and grunt-node-optimize.
https://www.npmjs.com/package/grunt-node-optimize
As the title says, I want to have a build tool that quite much stays out of my way.
I would rather specify rules than steps in the build process. I want to say that I want a binary with a given name placed in the root directory of my project, .o files should go in an obj/tmp dir, and the source is in the Source directory.
I do NOT want to tell it that it is this'n'that file, as I keep adding new files rather quickly; it should just scan the source directory (and its subdirectories) looking for Ragel (.rl) and C++ (.cxx) code and do whatever is necessary to turn it all into an executable.
I have looked into many tools, like auto{make,conf,header} (I did not really like that I placed the files it wanted in a subdir of the project root; Eclipse did not like that either) and CMake (it seems like I have to add all source files myself, and to my eyes it is pretty much a variation of the autotools). I have also read about Ant and Maven (I am also allergic to XML; it's a good format for serializing data for applications, not so much for humans - I would prefer YAML) and others on Wikipedia. And I have seen tools that seem good but require being set up as a web server, which is kind of overkill.
Also, I really need the ability to work offline, without an internet connection!
Right now it seems like the best option is to make a little script that finds all .cxx files, writes a Unity.cxx, and builds that one with G++, which is probably quite fast but too much of an ugly hack, I guess.
Bonus Points:
Fast builds
Ability to type build test-1 or something and it will build and directly run test-1
Multi-core builds (i.e. faster builds)
Really does not interrupt my train of thought
CMake is great. It's free, cross-platform, and reasonably well documented. It supports "out of source builds", meaning none of the build files are placed in the source directory. That makes source control a bit easier. It can be set up to find new files (globbing). Fast?...It generates make files...after that it's up to your compiler. Multicore...again, more a function of the compiler. I've used CMake on Windows, Linux, and Mac...it just works.
Another that I haven't tried but have read about and plan to test is premake... http://industriousone.com/sample-script
cake from CoffeeScript is quite good, and I'm writing a similar tool using Lua myself.
CMake and premake aren't build/make tools, they are build/make-descriptor generators, which may fit a large number of projects that aren't changing too much - but not projects where rapid prototyping is key.
Right now, I'm doing a project where the browser updates when you hit the save button in your text editor; you do not need to go to the browser and hit F5 (which would cause a small delay while the browser loads everything again, and you would most likely lose the state of the page - say you have a menu open and wish to tweak its look; you would be forced to navigate there again in your RIA).
What is the most portable and robust way to get the list of paths configured by /etc/ld.so.conf and the files included from it? Parsing the file manually does not seem like a good idea - the format is likely to change in future revisions.
To allow better understanding of the question, I will give you specific details below. Note that, despite these details, this is a general programming question, applicable to other situations.
There is a program, called LuaRocks. It is a package manager for Lua programming language (somewhat like Ruby gems or Python eggs). LuaRocks packages are called "rocks".
As a convenience feature, LuaRocks allows a rock author to specify a list of external dependencies for a rock, formulated as a list of C header files and / or dynamic library files. (.so on Linux.) If the specified file does not exist, the rock can't be installed.
Currently, on Linux, LuaRocks by default checks for a .so file's existence by searching for the file in two hardcoded paths, /usr/lib and /usr/local/lib.
I believe that this is incorrect behaviour, and it is broken by recent changes in Ubuntu and other Debian-based distributions.
Update: the paths are not hardcoded per se, but are user-configurable in the config file. Still, IMO, this is not the best solution.
Instead (as I understand it), LuaRocks should look the file up in the paths specified by /etc/ld.so.conf and the files included from it.
(Now please re-read the question above ;-) )
You shouldn't need to parse /etc/ld.so.conf or any of the config files - if you run 'ldconfig', it will scan the configured directories and generate a cache file.
Then, when you attempt a dlopen it will automatically find the files by iterating through the cached library directories. The same goes for compiling and giving -lSomeLib: you shouldn't need to specify -L/my/other/path if you've got it configured in ld.so.conf(.d).
autoconf accomplishes this by attempting to compile a test program that links to the shared library, but that's just a functional wrapper around the dlopen() call.
So, while other methods may not necessarily be 'wrong', at the root of it attempting to link to the library or doing a dlopen() are the 'most right' ways of doing it.
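A minimal sketch of such a dlopen() existence check (the library name is only an example; on glibc you may need to link with -ldl):

#include <dlfcn.h>
#include <stdio.h>

/* Returns 1 if the dynamic loader can find the library through its normal
 * search order (ld.so.cache, rpath, LD_LIBRARY_PATH, ...), 0 otherwise. */
int have_library(const char *soname)
{
    void *handle = dlopen(soname, RTLD_LAZY | RTLD_LOCAL);
    if (!handle) {
        fprintf(stderr, "%s\n", dlerror());
        return 0;
    }
    dlclose(handle);
    return 1;
}

int main(void)
{
    return have_library("libz.so.1") ? 0 : 1;
}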
Consider this, if you attempt to link to a library in a directory that ISN'T cached in /etc/ld.so.cache, when you try to run the program it will fail because it won't be able to dlopen() the library!
Hence, any 'good' shared library will be in /etc/ld.so.cache and be linkable/dlopen()able, this means that gcc can use it to link and that the user-generated library or executable will be able to open it when it executes.
You can circumvent this by explicitly setting the environment variable LD_LIBRARY_PATH, or by using LD_PRELOAD - but each of these has its own caveats and should be avoided if possible for 'standard' use.
A good write-up on writing shared libraries covers some of these issues and is a good read for anyone working on programmatically consuming other shared libraries: Ulrich Drepper's How to Write Shared Libraries.
According to the FHS, the following are valid locations for dynamic libraries:
/lib*/
/opt/*/lib*/
/usr/lib*/
/usr/local/lib*/
(And most likely ~/lib*/ as well.)
All entries in my /etc/ld.so.conf.d/* conform to this. Some entries reference subdirectories below the FHS dirs, which probably means that you can use the libraries in there without path information.
Now I don't know enough about LuaRocks. If you're limited to Lua-path-style globs (only ?), you cannot match these and have to parse the configs. Otherwise, you could just try to find them anywhere in these directories.
This would break on non-FHS-conforming systems (only option: parse config) and if a directory is not included in the config, the installer might see libraries that the linker cannot find.
These two seem acceptable to me, therefore I'd simply ignore the config and look at these dirs.
(Another possibility could be trying to link the library, this should automagically use the right path. However, this is platform-specific and maybe dangerous.)
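If you do go the ignore-the-config route, the probing itself is simple; a sketch in C (the directory list and library name are assumptions, /opt/*/lib* would additionally need globbing, and subdirectories are not descended into):

#include <stdio.h>
#include <sys/stat.h>

/* Look for e.g. "libfoo.so" directly under the usual FHS library dirs;
 * returns the first match written into buf, or NULL if none exists. */
const char *find_so(const char *name, char *buf, size_t bufsz)
{
    const char *dirs[] = {
        "/lib", "/lib64",
        "/usr/lib", "/usr/lib64",
        "/usr/local/lib", "/usr/local/lib64",
    };
    struct stat st;
    for (size_t i = 0; i < sizeof dirs / sizeof dirs[0]; i++) {
        snprintf(buf, bufsz, "%s/%s", dirs[i], name);
        if (stat(buf, &st) == 0)
            return buf;
    }
    return NULL;
}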