How to build glibc with reduced size?

I'm trying to download the glibc 2.23 sources and build them on my Ubuntu system.
I need to build that specific version from source to get a modified glibc customized for my research. It will be used only by my research applications, via the loader environment variables (e.g., LD_PRELOAD or LD_LIBRARY_PATH).
However, when I build it as follows, the output is huge (libc.so weighs about 11MB):
download the sources to some local dir (let's say /tmp/glibc/)
create new directory for build results (/tmp/glibc/build)
run configure from build dir:
< build-dir >$ ../configure --prefix=< build-dir >
As a result, the build process produces a libc.so file under build-dir with a size of 11MB.
Is there any way to reduce the size of the built libc.so?
p.s.
Here are my system details:
Linux version 4.4.0-93-generic (buildd@lgw01-03) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4) ) #116-Ubuntu SMP Fri Aug 11 21:17:51 UTC 2017
Thanks :)

Building glibc from source could be a bad idea. See this and some comments there. The current version is GNU libc 2.26. Consider instead upgrading your entire Ubuntu distribution (Ubuntu 17.10 should be released in a few weeks, at the end of October 2017).
../configure --prefix= build-dir
is a misunderstanding of the role of --prefix in autoconf-ed software. It specifies where the software will be installed, not where it is built.
(I don't know exactly what your --prefix should be, since libc is so essential to your system; perhaps it should be --prefix=/, but you should check carefully.)
Is there any way to reduce the size of the built libc.so?
You might use strip(1) (very carefully), but you risk breaking your system.
And you might not need to care about the size of libc, since it is used (and shared) by almost every program on your Linux system!
BTW, consider also musl-libc. It can coexist nicely with GNU glibc, and in practice is used only by programs built with musl-gcc (which it provides).
If you are doing research, it would be reasonable to work in a chroot(2)-ed environment. See also schroot. You could install with make install DESTDIR=/tmp/instmylibc and then copy /tmp/instmylibc into place appropriately. Read more about autoconf.
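To make the difference between --prefix and DESTDIR concrete, here is a minimal Python sketch of the usual autoconf convention: the prefix is the final location recorded at configure time, and DESTDIR is only a staging root prepended at install time. The function name installed_path and the prefix /opt/glibc-2.23 are illustrative assumptions, not something taken from the glibc build system.

```python
import posixpath


def installed_path(prefix: str, relative: str, destdir: str = "") -> str:
    """Return where `make install` would place a file under the usual convention.

    prefix   -- value passed as ../configure --prefix=...  (final location)
    relative -- path below the prefix, e.g. "lib/libc.so.6"
    destdir  -- optional staging root passed as make install DESTDIR=...
    """
    final = posixpath.join(prefix, relative)
    if not destdir:
        return posixpath.normpath(final)
    # DESTDIR is simply prepended to the final absolute path.
    return posixpath.normpath(destdir + "/" + final)


# Hypothetical prefix, chosen only for illustration.
final_location = installed_path("/opt/glibc-2.23", "lib/libc.so.6")
staged_location = installed_path("/opt/glibc-2.23", "lib/libc.so.6",
                                 destdir="/tmp/instmylibc")
```

The build directory never appears in either path, which is the point made above: --prefix describes where the library will live, not where it is compiled.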
PS. Be sure to at least back up your important data before such dangerous experiments. I don't think that the size of your libc.so should be a significant concern. But you do need to use chroot, perhaps with the help of debootstrap when installing the chrooted environment.

Related

Build and bind against older libc version

My code has dependencies that require libc. When I build (cargo build --release) on Ubuntu 20.04 (glibc 2.31), the resulting executable doesn't run on CentOS 7 (glibc 2.17). It fails with an error saying it requires GLIBC 2.18.
When I build the same code on CentOS 7, the resulting executable runs on both CentOS 7 and Ubuntu 20.04.
Is there a way to control which GLIBC version is required, so that I can build this on Ubuntu 20.04 too?

If your project does not depend on any native libraries, then probably the easiest way is to use the x86_64-unknown-linux-musl target.
This target statically links against musl libc rather than dynamically linking against the system's libc. As a result, it produces completely static binaries, which should run on a wide range of systems.
To install this target:
rustup target add x86_64-unknown-linux-musl
To build your project using this target:
cargo build --target x86_64-unknown-linux-musl
See the edition guide for more details.
If you are using any non-Rust libraries it becomes more difficult, because they may be dynamically linked and may in turn depend on the system libc. In that case you would need either to statically link the external libraries (assuming that is even possible, and that the libraries you are using work with musl libc), or to make a separate build for each platform you want to target.
If you end up having to make a separate build for each platform, a Docker container would be the easiest way to achieve that.

Try cross.
Install it globally:
cargo install cross
Then build your project with it:
cross build --target x86_64-unknown-linux-gnu --release
cross takes the same arguments as cargo, but you have to specify a target explicitly. Also, the build directory is always target/{TARGET}/(debug|release), not target/(debug|release).
cross uses Docker images prebuilt for different target architectures, but nothing stops you from "cross-compiling" for the host architecture. The glibc version in these Docker images should be conservative enough. If it isn't, you can always configure cross to use a custom image.

In general, you need to build binaries for a given OS on that OS, or at the very least build on the oldest OS you intend to support.
glibc uses symbol versioning to preserve the behavior of older programs while adding support for new functionality. For example, a newer version of pthread_mutex_lock may support lock elision, while the old one would not. You're seeing this error because when you link against libc, you link against the default version of the symbol if a version isn't explicitly specified, and in at least one case, the version you linked against is from glibc 2.18. Changing this would require recompiling libstd (and the libc crate, if you're using it) with custom changes to pick the old versioned symbols, which is a lot of work for little gain.
If your only dependency is glibc, then it might be sufficient to just compile on CentOS 7. However, if you depend on other libraries, like OpenSSL, then those just aren't compatible across OS versions because their SONAMEs differ, and there's no way around that. So that's why generally you want to build different binaries per OS.
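One way to see which symbol versions a binary actually asks for is to inspect its dynamic symbol table, for example with objdump -T, and look at the GLIBC_x.y tags. The sketch below only parses such text and reports the highest version mentioned. The helper names and the sample lines are illustrative assumptions, and the wrapper assumes that binutils' objdump is on PATH.

```python
import re
import subprocess

# Matches version tags such as GLIBC_2.2.5 or GLIBC_2.18 (but not GLIBC_PRIVATE).
_GLIBC_TAG = re.compile(r"GLIBC_(\d+(?:\.\d+)+)")


def highest_glibc_version(symbol_dump: str):
    """Return the highest GLIBC_x.y[.z] tag in the text as a tuple of ints, or None."""
    versions = [
        tuple(int(part) for part in match.group(1).split("."))
        for match in _GLIBC_TAG.finditer(symbol_dump)
    ]
    return max(versions) if versions else None


def required_glibc(binary_path: str):
    """Run `objdump -T` on a binary and report the highest glibc version it references."""
    dump = subprocess.run(
        ["objdump", "-T", binary_path],
        capture_output=True, text=True, check=True,
    ).stdout
    return highest_glibc_version(dump)


# Made-up sample resembling objdump -T output, for illustration only.
SAMPLE = """\
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memcpy
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.18  __cxa_thread_atexit_impl
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.3.4 __snprintf_chk
"""
newest = highest_glibc_version(SAMPLE)
```

Versions are compared as integer tuples, so 2.18 correctly ranks above 2.3.4. If the result is newer than the glibc on the oldest system you want to support, the binary will fail there with a "version not found" error like the one in the question.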

glibc version for aarch64

I'm cross-compiling an application for aarch64 on my x86 Ubuntu Bionic system, and I have problems with glibc version mismatch. My cross-compile toolchain was using v2.27, while the system that is to run the application has v2.24. I thought that it might be due to my toolchain having a too high version, so I decided to downgrade.
After removing all previous cross-compilation installs, I installed gcc-4.8-aarch64-linux-gnu (as I had successfully cross-compiled the application with this version on a different host system), thinking that it would install an older aarch64 version of glibc to /usr/aarch64-linux-gnu/lib/. However, again, v2.27 was installed (I verified that this directory didn't exist before installing the new cross-compilation toolchain).
So my question is twofold:
What determines which aarch64 version of glibc is installed on my system when installing gcc-4.8-aarch64-linux-gnu? Is it directly tied to my own system's x86 version of glibc?
Is there a correct way to install the aarch64 version of glibc v2.24 (or lower) on my system?

I concur with your hypothesis. After battling similar symptoms for 40 hours straight, I've discovered this confirmation:
https://packages.ubuntu.com/impish/gcc-10-aarch64-linux-gnu
https://packages.debian.org/bullseye/gcc-aarch64-linux-gnu
Note that Ubuntu 21.10 (Impish) and Debian 11 (Bullseye) both have packages for a gcc 10 cross compiler. Be wary of the very confusing fact that Ubuntu's default package is actually gcc 11, while Debian 11's default is gcc 10. The matching version numbers of Debian and gcc are a coincidence. Also ignore for now the fact that Ubuntu's package is gcc 10.3.0 and Debian's is gcc 10.2.1.
Focus instead on the recommendations and dependencies of each package. Ultimately the Ubuntu package calls up libc >= 2.34, while the Debian package calls up libc >= 2.28.
Sure enough, when I cross-compile from Impish on x86 for Bullseye on aarch64 (despite having a complete SYSROOT for the target), I get this at runtime:
/lib/aarch64-linux-gnu/libc.so.6: version 'GLIBC_2.34' not found
But your question remains, is there any tie between the host libc and that used by the cross-compiler? The answer is a definite maybe.
See this excellent answer and links for an overview of a cross-compiler. The take-away:
You don't just cross-compile glibc, you need to cross-compile an entire toolchain. Toolchain components are ALWAYS: ld + gcc + libc + gdb.
So the C library is an integral part of the cross-compiler.
What shenanigans then, are going on when you install gcc-aarch64-linux-gnu? It's just a compiler - only one of the four parts of a toolchain.
Well apparently there's some flexibility. Technically, a cross-compiler can be naked. That's typically only useful when you're compiling an operating system, rather than an executable that runs on an operating system. So you can construct special toolchains for special purposes.
But for the standard purpose (cross compiling for Linux on another architecture) you want a typical toolchain. Which is where the package's dependencies and recommendations come in. A gcc is always in want of an ld which is always in want of a libc, and the ménage à trois is intimate. In fact, gcc is built with libc using ld in a complex do-si-do. See this example from a great guide by Preshing on Programming:
It's possible to force separation and link to other libraries, but it's not easy.
For example, the linker you use has a set of default search directories that are baked in. From the fine manual:
The default set of paths searched (without being specified with -L) depends on which emulation mode ld is using, and in some cases also on how it was configured.
And it gets more intertwined. By default, gcc will call on a dynamic linker whose location is hard-coded. For a cross-compiler, it might be something like /lib/ld-linux-aarch64.so.1. Not only that, the executable may also end up with that hard-coded path as its program interpreter.
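The program interpreter mentioned here is stored in the executable's PT_INTERP program header, so it can be read back directly. Below is a minimal sketch that extracts it from an ELF image held in memory. It is an illustration under stated assumptions: the function name elf_interpreter is made up, and only 64-bit little-endian ELF (the common aarch64 and x86_64 case) is handled.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def elf_interpreter(data: bytes):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image, or None."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:  # EI_CLASS == ELFCLASS64, EI_DATA == ELFDATA2LSB
        raise ValueError("this sketch only handles 64-bit little-endian ELF")

    # ELF64 header fields: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)

    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        # ELF64 program header: p_type, p_flags, p_offset, ..., p_filesz at +0x20.
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", data, base)
        if p_type == PT_INTERP:
            (p_filesz,) = struct.unpack_from("<Q", data, base + 0x20)
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\x00").decode()
    return None  # statically linked binaries have no PT_INTERP entry
```

On a real file you would pass in open(path, "rb").read(). The same information is shown by readelf -l as "Requesting program interpreter".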
Again, if you're careful you can tear apart the toolchain and override things. But not only is it tricky to enforce, particularly if you have a complex build, the multitude of combinations of options and paths means there are also often bugs. So your host environment can easily leak into your cross-compiling toolchain.
So in summary, cross-compiling requires a toolchain. While pulling a cross-compiler from a package manager seems like an easy and legitimate thing to do, it comes with a lot of implicit baggage. You can either carefully follow the package dependencies to check what version you're getting, or use one of the many dedicated toolchain environments, such as crosstool-NG.

Using the latest GCC (4.8.1) on older Linux servers (Red Hat EL 5.7)

We code and build on one Linux machine, and deploy to our cluster with hundreds of cores. For now both types of machine run Red Hat EL 5.7, with the default GCC 4.1.2 installed.
Recently we realized that the latest GCC releases (e.g. 4.8.1) have extensive optimizations for arithmetic calculations, including the use of MPFR/MPC etc. Because our programs are very floating-point intensive, we hope to rebuild them with the latest GCC to get that boost.
Here are the current linking details for a typical program built by us:
linux-gate.so.1 => (0x007e0000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x009bb000)
libc.so.6 => /lib/libc.so.6 (0x00581000)
/lib/ld-linux.so.2 (0x0034c000)
It is unlikely that we can upgrade the OS or install new software on the cluster (it is tightly controlled), so our questions are:
1, For development, is it possible to install the latest GCC on the existing machine? (We tried and found that lots of dependencies are needed.) And is it possible to link against the older libs?
2, For deployment, is it possible to deploy to our cluster without installing new software? For MPFR etc., can we just deploy the .so files instead of installing RPMs on the target cluster nodes?
Thanks a lot for any help.

You need to install the required dependencies (in their required versions) to build GCC 4.8. Notice that MPFR, CLooG etc. are needed only by the compiler (at compilation time, not at run time of your compiled program), so you don't need to install them to deploy your compiled program. Don't link the compiler against older versions of its required dependencies.
The gcc-4.8 source tarball has a contrib/download_prerequisites script which could be helpful.
If building GCC 4.8 from the source tarball, don't forget to build outside of the source tree, and to follow the installation instructions.
You may want to link your program with the -static-libgcc option, or even to ..../configure the compiler with the --disable-shared and --program-suffix=-4.8 options
(with that program-suffix option, you'll run your new GCC as gcc-4.8, and it will be installed in /usr/local/bin/ by default unless you configure some --prefix; this won't interfere with the system gcc; if you don't have [root] write access to /usr/local, you should configure your own --prefix).
BTW, you might also consider customizing your GCC 4.8, e.g. through plugins or, better yet, using MELT.

Compiling program for old kernel

I statically compiled and linked a program on an up-to-date Linux machine, and ran it on another Linux machine that is 9 years old. It gave me the error "FATAL: kernel too old" and quit. Specifically, the new one is Fedora 18 (gcc 4.7.2, glibc 2.16, kernel 3.7.2) and the old one is RHEL 4.8 (glibc 2.3.4, kernel 2.6.9). Since it's statically linked, the glibc version on the old machine shouldn't matter. I guess the problem is that the program uses system calls that are not in the old kernel.
If development on the old system is not an option, how can I build the program on the new system and run it on the older one (or, even better, on both)? I was looking for a way to run gcc in a compatibility mode that only uses old system calls. No luck yet.

The easiest option is to always build on the older system.
Alternatively, copy the glibc headers and static libraries from the old system to the new and link against those.
If that doesn't work, you'll have to rebuild glibc with --enable-kernel=2.6.9 or something like that.
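The "kernel too old" check boils down to comparing the running kernel release against the minimum release the C library was configured for (the value given to --enable-kernel). Here is a minimal sketch of that comparison. The function names are hypothetical and the minimum of 2.6.32 is an assumed value chosen only for illustration, not a claim about any particular distribution's glibc.

```python
import re


def parse_kernel_release(release: str):
    """Turn a release string such as '4.4.0-93-generic' into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing patch level (e.g. '3.7') is treated as 0.
    return tuple(int(group) if group is not None else 0 for group in match.groups())


def kernel_is_new_enough(running: str, minimum: str) -> bool:
    """True if the running kernel meets the minimum the C library expects."""
    return parse_kernel_release(running) >= parse_kernel_release(minimum)


# Assumed minimum, for illustration only.
ASSUMED_MINIMUM = "2.6.32"
old_machine_ok = kernel_is_new_enough("2.6.9", ASSUMED_MINIMUM)
new_machine_ok = kernel_is_new_enough("3.7.2", ASSUMED_MINIMUM)
```

The comparison is done on integer tuples because a plain string comparison would wrongly rank "2.6.9" above "2.6.32". On a live system the running release could be taken from platform.release().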

Compiled gcc4.4.6 on one machine, how to let another machine use it?

I built gcc 4.4.6 (to use CUDA) on a fast server, where it takes about 10 minutes. On my own desktop, however, it takes forever to compile.
Both machines run 64-bit Linux, although one is Ubuntu and the other is Arch Linux. The Arch Linux machine has a newer kernel version.
On the server, I installed the built gcc-4.4.6 to /opt, and then I just copied /opt/gcc-4.4.6 to /opt/gcc-4.4.6 on my PC.
That doesn't quite seem to work. When I tried
./x86_64-unknown-linux-gnu-gcc ~/Development/c/hello/hello.c
it shows
x86_64-unknown-linux-gnu-gcc: error trying to exec 'cc1': execvp: No such file or directory
So what can I do now?
Thanks,
Alfred

If the systems are similar enough, you could configure and build GCC on the big machine (don't forget that GCC needs to be configured and built in a directory outside of its source tree): run make -j3 all, then make install DESTDIR=/tmp/gccinst/, then copy that /tmp/gccinst directory to your small machine, and finally copy its contents into the root filesystem (on the small machine).
However, GCC 4.4.6 is quite old today. If you are compiling GCC yourself, try to compile GCC 4.6.2 (or at least 4.6.1).
And (shameless plug for my work) if you compile a GCC 4.6, please enable plugins in it; then you might try the GCC MELT [meta-]plugin (MELT is a high-level domain-specific language to ease the development of GCC extensions).
