How to make Unix binary self-contained?

How to make Unix binary self-contained? - linux

I have a Linux binary, without sources, that works on one machine, and I'd like to make a self-contained package that would run on a different machine of the same architecture. What is a way of achieving this?
In my case, both machines have the same architecture, same Ubuntu kernel, but target machine doesn't have make and has wrong version of files under /lib and /usr
One idea I had was to use chroot and recreate a subset of the filesystem that the binary uses, possibly using strace to figure out what it needs. Is there a tool that does this already?
For posterity, here's how I figure out which files a process opens
#!/usr/bin/python
# source of trace_fileopen.py
# Runs command and prints all files that have been successfully opened with mode O_RDONLY
# example: trace_fileopen.py ls -l
import re, sys, subprocess, os
if __name__=='__main__':
strace_fn = '/tmp/strace.out'
strace_re = re.compile(r'([^(]+?)\((.*)\)\s*=\s*(\S+?)\s+(.*)$')
cmd = sys.argv[1]
nowhere = open('/dev/null','w')#
p = subprocess.Popen(['strace','-o', strace_fn]+sys.argv[1:], stdout=nowhere, stderr=nowhere)
sts = os.waitpid(p.pid, 0)[1]
output = []
for line in open(strace_fn):
# ignore lines like --- SIGCHLD (Child exited) # 0 (0) ---
if not strace_re.match(line):
continue
(function,args,returnval,msg) = strace_re.findall(line)[0]
if function=='open' and returnval!='-1':
(fname,mode)=args.split(',',1)
if mode.strip()=='O_RDONLY':
if fname.startswith('"') and fname.endswith('"') and len(fname)>=2:
fname = fname[1:-1]
output.append(fname)
prev_line = ""
for line in sorted(output):
if line==prev_line:
continue
print line
prev_line = line
Update
The problem with LD_LIBRARY_PATH solutions is that /lib is hardcoded into interpreter and takes precedence over LD_LIBRARY_PATH, so native versions will get loaded first. The interpreter is hardcoded into the binary. One approach might be to patch the interpreter and run the binary as patched_interpreter mycommandline Problem is that when mycommandline is starts with java, this doesn't work because Java sets-up LD_LIBRARY_PATH and restarts itself, which resorts to the old interpreter. A solution that worked for me was to open the binary in the text editor, find the interpreter (/lib/ld-linux-x86-64.so.2), and replace it with same-length path to the patched interpreter

As others have mentioned, static linking is one option. Except static linking with glibc gets a little more broken with every release (sorry, no reference; just my experience).
Your chroot idea is probably overkill.
The solution most commercial products use, as far as I can tell, is to make their "application" a shell script that sets LD_LIBRARY_PATH and then runs the actual executable. Something along these lines:
#!/bin/sh
here=`dirname "$0"`
export LD_LIBRARY_PATH="$here"/lib
exec "$here"/bin/my_app "$#"
Then you just dump a copy of all the relevant .so files under lib/, put your executable under bin/, put the script in ., and ship the whole tree.
(To be production-worthy, properly prepend "$here"/lib to LD_LIBRARY_PATH if it is non-empty, etc.)
[edit, to go with your update]
I think you may be confused about what is hard-coded and what is not. ld-linux-x86-64.so.2 is the dynamic linker itself; and you are correct that its path is hard-coded into the ELF header. But the other libraries are not hard-coded; they are searched for by the dynamic linker, which will honor LD_LIBRARY_PATH.
If you really need a different ld-linux.so, instead of patching the ELF header, simply run the dynamic linker itself:
/path/to/my-ld-linux.so my_program <args>
This will use your linker instead of the one listed in the ELF header.
Patching the executable itself is evil. Please consider the poor person who has to maintain your stuff after you move on... Nobody is going to expect you to have hacked the ELF header by hand. Anybody can read what a shell script is doing.
Just my $0.02.

There's CDE a bit of software designed to do exactly what you want. Here's a google tech talk about it
http://www.youtube.com/watch?v=6XdwHo1BWwY

There are almost certainly better answers, but you can find out what libraries the binary needs with the ldd command (example for the ls binary):
$ ldd /bin/ls
linux-vdso.so.1 => (0x00007ffffff18000)
librt.so.1 => /lib/librt.so.1 (0x00007f5ae565c000)
libselinux.so.1 => /lib/libselinux.so.1 (0x00007f5ae543e000)
libacl.so.1 => /lib/libacl.so.1 (0x00007f5ae5235000)
libc.so.6 => /lib/libc.so.6 (0x00007f5ae4eb2000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00007f5ae4c95000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5ae588b000)
libdl.so.2 => /lib/libdl.so.2 (0x00007f5ae4a90000)
libattr.so.1 => /lib/libattr.so.1 (0x00007f5ae488b000)
Once you have this, you could make copies and put them in the proper locations on the target machine.

Related

Pattern syntax %.3: man/libfoo.man in Automake with different base name

I wrote a library libfoo providing functions bar and baz.
I want the user to be able to find the same man-page (from mans/libfoo.man) when they call man libfoo, man bar and man baz (Similar to man fprintf, man sprintf all pointing to the same page.)
My current setup has the files mans/libfoo.man and Makefile.am
To 'tell' automake that I want to end up with the three man-pages I specified the dist_man3_MANS variable.
Makefile.am:
dist_man3_MANS = mans/libfoo.3 mans/bar.3 mans/baz.3
Coming from GNU make, I thought I could just write
%.3: mans/libfoo.man
ln -S libfoo.man $#
to create links temporarily and then let Automake install those accordingly, but Automake errors out with Makefile.am:115: warning: '%'-style pattern rules are a GNU make extension. I want to do it properly and take this warning seriously by not relying on GNU Make to be as portable as possible.
The Automake manual suggests to add a target
.man.3:
$(LN_S) $^ $#
but that just tells Automake that xx.man can be compiled to xx.3, requiring the base name to be the same. I don't want to carry around those xx.man files, so this approach does not work.
I could hack it in with putting a rule
dist_man3_MANS = mans/libfoo.3 mans/bar.3 mans/baz.3
$(dist_man3_MANS): mans/libfoo.man
$(LN_S) libfoo.man $#
but that seems like a dirty hack, because I am not giving it a recipe to compile .man to .3, but rather say: "Hey, you can create those files with this rule", which for this case may work coincidental.

I would follow the example from the Automake info page section Extending Automake Rules and do something along the lines of
LIBFOO_MAN_ALIASES = bar baz
install-data-hook:
set -e; \
cd $(DESTDIR)$(man3dir) && \
for manalias in $(LIBFOO_MAN_ALIASES); do \
$(LN_S) libfoo.3 $${manalias}.3; \
done
uninstall-hook:
cd $(DESTDIR)$(man3dir) && \
for manalias in $(LIBFOO_MAN_ALIASES); do \
rm -f $${manalias}.3; \
done
relying on AC_PROG_LN_S to make sure that $(LN_S) does something reasonable for the system (symlink, hardlink, copy) to create a file name which can be open(2)ed and read.
FTR, I have just taken a look at three different systems' man pages and found them using three different methods to make the fprintf(3) man page show the same man page as printf(3) does:
Debian 10 uses symlinks
Fedora 35 uses a /usr/share/man/man3/fprintf.3 file containing .so man3/printf.3 (while some other man pages use symlinks to achieve the same effect)
FreeBSD 13 uses hardlinks, and find /usr/share/man -type l does not find any symlinks on my relatively clean system. However, manually testing both symlinks and the .so man3/printf.3 method suggests that FreeBSD man(1) does not treat symlinks in any special way and therefore opens the symlinked man page, and it also interprets the .so command just like Fedora 35's man(1) does.
I do not know how portable each of those methods is. Each of these three methods could set up on make install by using an appropriate install-data-hook, but any man file which can be opened using open(2) appears to be work, and therefore $(LN_S) looks like a good bet.

Setting RPATH order in QMake

I have a Linux Qt program. I'd like it to preferentially use the (dynamic) Qt libraries in the executable's directory if they exist, otherwise use the system's Qt libs. RPATH to the rescue.
I add this line to the qmake's .pro file:
QMAKE_LFLAGS += '-Wl,-rpath,\'\$$ORIGIN\''
and looking at the resulting executable with readelf I see:
0x000000000000000f (RPATH) Library rpath: [$ORIGIN:/usr/local/Trolltech/Qt-5.2.0/lib]
0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN:/usr/local/Trolltech/Qt-5.2.0/lib]
Seems right, but ldd shows it's using the system version:
libQt5Core.so.5 => /usr/local/Trolltech/Qt-5.2.0/lib/libQt5Core.so.5 (0x00007f2d2fe09000)
If I manually edit qmake's resulting Makefile to swap the order of the two rpaths, so $ORIGIN comes after /usr/local/..., I get the right behavior:
0x000000000000000f (RPATH) Library rpath: [/usr/local/Trolltech/Qt-5.2.0/lib:$ORIGIN]
0x000000000000001d (RUNPATH) Library runpath: [/usr/local/Trolltech/Qt-5.2.0/lib:$ORIGIN]
libQt5Core.so.5 => ./libQt5Core.so.5 (0x00007fb92aba9000)
My problem is with how qmake constructs the final LFLAGS variable. I can't figure out how to make it put my addition ($ORIGIN) after the system library. Any ideas?

You can add the following to your .pro file to force the dynamic linker to look in the same directory as your Qt application at runtime in Linux :
unix:{
# suppress the default RPATH if you wish
QMAKE_LFLAGS_RPATH=
# add your own with quoting gyrations to make sure $ORIGIN gets to the command line unexpanded
QMAKE_LFLAGS += "-Wl,-rpath,\'\$$ORIGIN\'"
}
If you want it to look in a subdirectory of the executable path, you can use :
QMAKE_LFLAGS += "-Wl,-rpath,\'\$$ORIGIN/libs\'"
Note that you should have the .so files with the exact same name in your application directory. For example you should copy libQt5Core.so.5.2.0 to your application directory with the name libQt5Core.so.5. Now the ldd shows the directory of the application.
You can also have libQt5Core.so.5.2.0 and a link to it with the name libQt5Core.so.5 in the application directory.

As far as my research can say, you can only add RPATH at the beginning of the list with QMake.
But if you are on Linux and can install chrpath, you can hack your way around that.
Add this block at the end of your .pro file
# Add spacing since chrpath cannot expand RPATH length
QMAKE_RPATHDIR = \
/XYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXY1\
/XYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXY2\
/XYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXY3\
/XYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXYXY4
QMAKE_POST_LINK += 'chrpath -r \'/my/qt/installation:\$$ORIGIN\' $$OUT_PWD/mybinaryname;'

I'm taking a bit of a guess at what's happening, but it's based on knowing some of the odd behaviours of ld.
check for the presence of an LD_LIBRARY_PATH variable that will come into effect before the processing of a RUNPATH variable. Because of the presence of both RPATH and RUNPATH, the LD_LIBRARY_PATH rule comes into effect, so if it's set then unset it.
Secondly, I'd never expect to see:
libQt5Core.so.5 => ./libQt5Core.so.5 (0x00007fb92aba9000)
in the output of ldd, I would always see the expansion of $ORIGIN to the directory of the binary (maybe you shortened it?), so I would have expected:
libQt5Core.so.5 => /path/to/bin/./libQt5Core.so.5 (0x00007fb92aba9000)
Which means it sounds like the LD_LIBRARY_PATH expansion is .:/usr/local/Trolltech/Qt-5.2.0/lib, which to me sounds like you've got environmental overrides happening.

qmake would always append the QMAKE_RPATHDIR with the QT_INSTALL_LIBS internally defined in $(QT_DIR)/mkspecs/features/qt.prf file:
170: relative_qt_rpath:!isEmpty(QMAKE_REL_RPATH_BASE):contains(INSTALLS, target):\
173: QMAKE_RPATHDIR += $$relative_path($$[QT_INSTALL_LIBS], $$qtRelativeRPathBase())
175: QMAKE_RPATHDIR += $$[QT_INSTALL_LIBS/dev]
179:!isEmpty(QMAKE_LFLAGS_RPATHLINK):!contains(QT_CONFIG, static) {
189: QMAKE_RPATHLINKDIR *= $$unique(rpaths)
So to avoid your application using the QT library from system path, comment out the lines above which append the QMAKE_RPATHDIR and add QMAKE_RPATHDIR=$ORIGIN into your .pro file.

Compressing the core files during core generation

Is there way to compress the core files during core dump generation?
If the storage space is limited in the system, is there a way of conserving it in case of need for core dump generation with immediate compression?
Ideally the method would work on older versions of linux such as 2.6.x.

The Linux kernel /proc/sys/kernel/core_pattern file will do what you want: http://www.mjmwired.net/kernel/Documentation/sysctl/kernel.txt#191
Set the filename to something like |/bin/gzip -1 > /var/crash/core-%t-%p-%u.gz and your core files should be saved compressed for you.

For an embedded Linux systems, following script change perfectly works to generate compressed core files in 2 steps
step 1: create a script
touch /bin/gen_compress_core.sh
chmod +x /bin/gen_compress_core.sh
cat > /bin/gen_compress_core.sh #!/bin/sh exec /bin/gzip -f - >"/var/core/core-$1.$2.gz"
ctrl +d
step 2: update the core pattern file
cat > /proc/sys/kernel/core_pattern |/bin/gen_compress_core.sh %e %p ctrl+d

As suggested by other answer, the Linux kernel /proc/sys/kernel/core_pattern file is good place to start: http://www.mjmwired.net/kernel/Documentation/sysctl/kernel.txt#141
As documentation says you can specify the special character "|" which will tell kernel to output the file to script. As suggested you could use |/bin/gzip -1 > /var/crash/core-%t-%p-%u.gz as name, however it doesn't seem to work for me. I expect that the reason is that on my system kernel doesn't treat the > character as a output, rather it probably passes it as a parameter to gzip.
In order to avoid this problem, like other suggested you can create your file in some location I am using /home//crash/core.sh, create it using the following command, replacing with your user. Alternatively you can also obviously change the entire path.
echo -e '#!/bin/bash\nexec /bin/gzip -f - >"/home/<username>/crashes/core-$1-$2-$3-$4-$5.gz"' > ~/crashes/core.sh
Now this script will take 5 input parameters and concatenate them and add to core-path. The full paths must be specified in the ~/crashes/core.sh. Also the location of this script can be specified. Now lets tell kernel to use tour executable with parameters when generating file:
sudo sysctl -w kernel.core_pattern="|/home/<username>/crashes/core.sh %e %p %h %t"
Again should be replaced (or entire path to match location and name of core.sh script). Next step is to crash some program, lets create example crashing cpp file:
int main (){
int * a = nullptr;
int b = *a;
}
After compiling and running there are 2 options, either we will see:
Segmentation fault (core dumped)
Or
Segmentation fault
In case we see the latter, there are few possible reasons.
ulimit is not set, ulimit -c should specify what is limit for cores
apport or your distro core dump collector is not running, this should be investigated further
there is an error in script we wrote, I suggest than checking some basic dump path to check if the other things aren't reason the below should create /tmp/core.dump:
sudo sysctl -w kernel.core_pattern="/tmp/core.dump"
I know there is already an answer for this question however it wasn't obvious for me why it isn't working "out of the box" so I wanted to summarize my findings, hope it helps someone.

Where are include files stored - Ubuntu Linux, GCC

So, when we do the following:
#include <stdio.h>
versus
#include "myFile.h"
the compiler, GCC in my case, knows where that stdio.h (and even the object file) are located on my hard drive. It just utilizes the files with no interaction from me.
I think that on my Ubuntu Linux machine the files are stored at /usr/include/. How does the compiler know where to look for these files? Is this configurable or is this just the expected default? Where would I look for this configuration?
Since I'm asking a question on these include files, what are the source of the files? I know this might be fuzzy in the Linux community but who manages these? Who would provide and manage the same files for a Windows compiler.
I was always under the impression that they come with the compiler but that was an assumption...

See here: Search Path
Summary:
#include <stdio.h>
When the include file is in brackets the preprocessor first searches in paths specified via the -I flag. Then it searches the standard include paths (see the above link, and use the -v flag to test on your system).
#include "myFile.h"
When the include file is in quotes the preprocessor first searches in the current directory, then paths specified by -iquote, then -I paths, then the standard paths.
-nostdinc can be used to prevent the preprocessor from searching the standard paths at all.
Environment variables can also be used to add search paths.
When compiling if you use the -v flag you can see the search paths used.

gcc is a rich and complex "orchestrating" program that calls many other programs to perform its duties. For the specific purpose of seeing where #include "goo" and #include <zap> will search on your system, I recommend:
$ touch a.c
$ gcc -v -E a.c
...
#include "..." search starts here:
#include <...> search starts here:
/usr/local/include
/usr/lib/gcc/i686-apple-darwin9/4.0.1/include
/usr/include
/System/Library/Frameworks (framework directory)
/Library/Frameworks (framework directory)
End of search list.
# 1 "a.c"
This is one way to see the search lists for included files, including (if any) directories into which #include "..." will look but #include <...> won't. This specific list I'm showing is actually on Mac OS X (aka Darwin) but the commands I recommend will show you the search lists (as well as interesting configuration details that I've replaced with ... here;-) on any system on which gcc runs properly.

Karl answered your search-path question, but as far as the "source of the files" goes, one thing to be aware of is that if you install the libfoo package and want to do some development with it (i.e., use its headers), you will also need to install libfoo-dev. The standard library header files are already in /usr/include, as you saw.
Note that some libraries with a lot of headers will install them to a subdirectory, e.g., /usr/include/openssl. To include one of those, just provide the path without the /usr/include part, for example:
#include <openssl/aes.h>

The \#include files of gcc are stored in /usr/include .
The standard include files of g++ are stored in /usr/include/c++.

g++ searches /lib/../lib/, then /lib/

According to g++ -print-search-dirs my C++ compiler is searching for libraries in many directories, including ...
/lib/../lib/:
/usr/lib/../lib/:
/lib/:
/usr/lib/
Naively, /lib/../lib/ would appear to be the same directory as /lib/ — lib's parent will have a child named lib, "that man's father's son is my father's son's son" and all that. The same holds for /usr/lib/../lib/ and /usr/lib/
Is there some reason, perhaps having to do with symbolic links, that g++ ought to be configured to search both /lib/../lib/ and /lib/?
If this is unnecessary redundancy, how would one go about fixing it?
If it matters, this was observed on an unmodified install of Ubuntu 9.04.
Edit: More information.
The results are from executing g++ -print-search-dirs with no other switches, from a bash shell.
Neither LIBRARY_PATH nor LPATH are output from printenv, and both echo $LPATH and echo LIBRARY_PATH return blank lines.

An attempt at an answer (which I gathered from a few minutes of looking at the gcc.c driver source and the Makefile environment).
These paths are constructed in runtime from:
GCC exec prefix (see GCC documentation on GCC_EXEC_PREFIX)
The $LIBRARY_PATH environment variable
The $LPATH environment variable (which is treated like $LIBRARY_PATH)
Any values passed to -B command-line switch
Standard executable prefixes (as specified during compilation time)
Tooldir prefix
The last one (tooldir prefix) is usually defined to be a relative path:
From gcc's Makefile.in
# Directory in which the compiler finds libraries etc.
libsubdir = $(libdir)/gcc/$(target_noncanonical)/$(version)
# Directory in which the compiler finds executables
libexecsubdir = $(libexecdir)/gcc/$(target_noncanonical)/$(version)
# Used to produce a relative $(gcc_tooldir) in gcc.o
unlibsubdir = ../../..
....
# These go as compilation flags, so they define the tooldir base prefix
# as ../../../../, and the one of the library search prefixes as ../../../
# These get PREFIX appended, and then machine for which gcc is built
# i.e i484-linux-gnu, to get something like:
# /usr/lib/gcc/i486-linux-gnu/4.2.3/../../../../i486-linux-gnu/lib/../lib/
DRIVER_DEFINES = \
-DSTANDARD_STARTFILE_PREFIX=\"$(unlibsubdir)/\" \
-DTOOLDIR_BASE_PREFIX=\"$(unlibsubdir)/../\" \
However, these are for compiler-version specific paths. Your examples are likely affected by the environment variables that I've listed above (LIBRARY_PATH, LPATH)

Well, theoretically, if /lib was a symlink to /drive2/foo, then /lib/../lib would point to /drive2/lib if I'm not mistaken. Theoretically...
Edit: I just tested and it's not the case - it comes back to /lib. Hrm :(

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How to make Unix binary self-contained? - linux

There's CDE a bit of software designed to do exactly what you want. Here's a google tech talk about it http://www.youtube.com/watch?v=6XdwHo1BWwY

Related

Pattern syntax %.3: man/libfoo.man in Automake with different base name

Setting RPATH order in QMake

Compressing the core files during core generation

Where are include files stored - Ubuntu Linux, GCC

g++ searches /lib/../lib/, then /lib/

Categories

Resources