When does cross compiling a compiler make sense? - linux

I just had one of the "wait what, mmm ... " moments.
Assume you want to produce a compiler A for the target architecture a. The outcome of the configuration phase usually depends on the value you pass to --target=, meaning that your tools need to be able to produce and deal with compiled objects for a.
Now, under a common GNU/Linux distribution with gcc as the main compiler, the first thing you want is to get the binutils and build them for your target. But you don't have a compiler compatible with that target, because producing one is exactly what you are trying to do in the first place: creating a toolchain for a. So here starts the conundrum: how do you break this loop?
Now assume my previous example runs on a machine with architecture b, clearly different from a, since we are always talking about the cross-compilation case. Even if you get lucky and the hardware manufacturer releases gcc for a running on machines with the a architecture, you still have to solve the riddle of how to build for a on b and break the previous loop. In other words, even if you get support for your compiler on the target architecture, that doesn't help when you want to cross compile.
So what's the logic behind this and how to break the loop ?

The "gcc" compiler is also "self-hosting". So, you usually build a "stage1" compiler on the non-target platform and then move to the target system and re-build the compiler with the "stage1" (running through "stage3").
You first need to understand the "Target Triplets" (you listed one, "--target"), but you also have "--host" and "--build". From the link,
--build=build-type
the type of system on which the package is being configured and compiled. It defaults to the result of running config.guess.
--host=host-type
the type of system on which the package runs. By default it is the same as the build machine. Specifying it enables the cross-compilation mode.
--target=target-type
the type of system for which any compiler tools in the package produce code (rarely needed). By default, it is the same as host.
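To make the distinction concrete, here is roughly how the three options combine (the triplets are placeholders):

# ordinary cross compiler: configured, built and run on b, generates code for a
../gcc/configure --build=b-pc-linux-gnu --host=b-pc-linux-gnu --target=a-unknown-linux-gnu
# cross-native / "Canadian" case: configured on b, but the resulting compiler
# runs on a and generates code for a; this needs an existing b -> a cross
# compiler on PATH to build it with
../gcc/configure --build=b-pc-linux-gnu --host=a-unknown-linux-gnu --target=a-unknown-linux-gnu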
See also, Cross Linux From Scratch and the astonishing work at NetBSD.

Related

Handling autoconf with Android after NDK16

I'm trying to update an existing configuration we have; we are cross compiling for a number of targets - the question here is specifically about Android. More specifically, we are building code using cmake and the hunter package manager. However, we are building ICU through a step that uses autoconf/configure, called from cmake. Not sure that is specifically important, except that we have less control over the use of configure than is generally the case.
OK: we have a version that builds against an old NDK, but I am updating and have hit a problem identified by https://android.googlesource.com/platform/ndk/+/master/docs/UnifiedHeaders.md: with NDK16 and later, the value of the sysroot parameter needs to vary between compilation and linkage. As it stands, the configure script tries to build a small program, conftest.c - and the program fails to link. Manually I can compile the code in two stages using -c and then link the resulting .o, but that is not what configure is trying to do.
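For reference, the manual two-stage build mentioned above looks roughly like this with unified headers (the API level, triple and paths are examples; $CC is the NDK compiler for the target):

# compile against the unified headers...
$CC --sysroot=$NDK/sysroot \
    -isystem $NDK/sysroot/usr/include/arm-linux-androideabi \
    -c conftest.c -o conftest.o
# ...but link against the per-API platform libraries
$CC --sysroot=$NDK/platforms/android-21/arch-arm conftest.o -o conftest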
Now the reality is that when I build this code, I don't actually need to link the code - I am generating a library which is used elsewhere. However that is not currently the way that configure sees it.
I may look to redo the configuration script to just check that the code can be compiled when cross compiling. However I am curious to know if anybody has managed to handle this sort of thing by keeping the existing config files and just changing the parameters by which the scripts are called.
When r19 releases to stable this problem will go away on its own (https://github.com/android-ndk/ndk/issues/780), but since that's still in beta it's not a good solution just yet.
Prior to r19 (this isn't really unique to r16+, this has always been the case and it was just asymptomatic previously), autoconf builds should be done using a standalone toolchain.
However, you should not use a standalone toolchain for CMake, so odds are something about your configuration will need to change until r19 is released. Depending on the effort involved, it may make sense to stay on r15 until r19 is available.
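A rough sketch of the standalone-toolchain route for the autoconf part (the API level and install directory are examples):

# create a self-contained toolchain whose compiler bakes in consistent sysroots
$NDK/build/tools/make_standalone_toolchain.py \
    --arch arm --api 21 --install-dir $HOME/android-arm-toolchain
# then point configure at it
export PATH=$HOME/android-arm-toolchain/bin:$PATH
export CC=arm-linux-androideabi-clang
./configure --host=arm-linux-androideabi
make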

Does recompiling a compiler have an effect on the compiled code?

I have to install without root access some software (the gromacs simulation package) on a cluster server, on which jobs can be sent through slurm. I only have direct access to the front-end machine, and the home directory is shared among all the servers and front-end. I had to manually build and install locally:
gcc 4.8
automake, autoconf, cmake
openmpi
lapack libs
gromacs
Right now, I have installed all of this only on the front-end, which is an older Intel Xeon machine. The production servers have newer AMD processors instead. This is my question: in order to achieve optimal performance, which parts of the aforementioned stack should be recompiled on the production servers? I guess it would make sense to rebuild the final software (gromacs) and maybe the lapack libs, because of the different instruction sets and processor architecture, but I'm not exactly sure whether it would make any sense to rebuild the compiler or other parts of the system. Hence the question: does using a compiler (and the associated libraries) which have been built on a different machine result in higher execution times for the generated binaries?
In general, I'd expect a compiler to produce the same binaries if given the same input, so the answer would be no; but what about the libraries (such as libstdc++) which have been compiled together with the compiler on the other machine?
thank you
In order to optimize gromacs (a parallel molecular dynamics code), you can forget about recompiling the compiler or the compilation tools: that's useless.
You should instead go after optimizations. For Intel CPUs, using the Intel C compiler makes a difference, and it's possible you'll observe some gains with AMDs as well.
Another alternative is to use the Portland Group compiler.
Regarding MPI, you need to be sure it's tuned for your interconnect (for example, if you have InfiniBand, avoid using the standard TCP transport).
Regarding the LAPACK libraries, you need to install an optimized LAPACK (ACML for AMD, MKL for Intel). You can also get very good performance from GotoBLAS or ATLAS BLAS - they are included in many Linux distros.
You have not mentioned FFTs: they are indeed important for the electrostatics (Ewald summations) in the simulations, and FFTW is a good choice here. You need to install the correct version for the processor, or compile it on the target processor, because it performs a sort of "auto-tuning" during the build process.
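If the cluster allows interactive or batch jobs, one way to do that (a sketch; the slurm invocation, version and flags are examples) is to build single-precision FFTW directly on an AMD compute node:

srun --pty bash                    # or put the following in an sbatch script
cd fftw-3.3.10
./configure --prefix=$HOME/opt/fftw --enable-float --enable-sse2 --enable-avx
make -j8 && make install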
Going lower than this (tools, compilers) makes no difference to the produced executables.
Building the GCC compiler already involves a three-stage bootstrap process, one of whose purposes is to QA the compiler by ensuring the last two stages produce the same output. So there is no reason to believe that a fourth stage will have any effect at all.
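Concretely, a native build already does the comparison for you; a rough sketch (paths and version are examples):

mkdir build && cd build
../gcc-13.2.0/configure --prefix=$HOME/opt/gcc
make bootstrap    # builds stages 1..3 and compares stage 2 and 3 object files
make install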

Where did the first make binary come from?

I'm having to build gnu make from source for reasons too complicated to explain here.
I noticed that to build it I need the make command itself, in the traditional fashion:
./configure
make install
So what if I didn't have the make binary already? Where did the first ever make binary come from?
From the same place the first gcc binary came from.
The first make was created probably using a shell script to do the build. After that, make would "make" itself.
It's a notable achievement in systems development when the platform becomes "self-hosting". That is, the platform can build itself.
Things like "make make" and "gcc gcc.c".
Many language writers will create their language in another language (say, C) and when they have moved it far enough along, they will use that original bootstrap compiler to write a new compiler in the original language. Finally, they discard the original.
Back in the day, a friend was working on a debugger for OS/2, notable for being a multi-tasking operating system at the time. And he would regale about the times when they would be debugging the debugger, and find a bug. So, they would debug the debugger debugging the debugger. It's a novel concept and goes to the heart of computing and abstraction.
Inevitably, it all boils back to when someone keyed in something through a hardwire key pad or some other switches to get an initial program loaded. Then they leveraged that program to do other work, and it all just grows from there.
Stuart Feldman, then at AT&T, wrote the source code for make around the time of 7th Edition UNIX™, and used manual compilation (or maybe a shell script) until make was working well enough to be used to build itself. You can find the UNIX Programmer's Manual for 7th Edition online, and in particular, the original paper describing the original version of make, dated August 1978.
make is just one convenience tool. It is still possible to invoke cc, ld, etc. manually or via other scripting tools.
If you're building GNU make, have a look at build.sh in the source tree after running configure:
# Shell script to build GNU Make in the absence of any `make' program.
# build.sh. Generated from build.sh.in by configure.
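So, assuming only a POSIX shell and a working C compiler, bootstrapping GNU Make without make looks roughly like:

./configure        # generates build.sh from build.sh.in
sh build.sh        # compiles a make binary using just the shell and cc
./make install     # from here on, make can rebuild (and install) itself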
Compiling C programs is not the only way to produce an executable file. The first make executable (or more notably the C compiler itself) could for example be an assembly program, or it could be hand coded in machine code. It could also be cross compiled on a completely different system.
The essence of make is that it is a simplified way of running some commands.
To make the first make, the author had to manually act as make, and run gcc or whatever toolset was available, rather than having it run automatically.
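In other words, the very first build was just someone typing by hand what a makefile would later automate; something like this (file names purely illustrative):

cc -c main.c                # compile each translation unit...
cc -c util.c
cc -o prog main.o util.o    # ...and link, exactly what a trivial makefile encodes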

How to change the host type for a 'Canadian cross' compilation of GCC with crosstool-NG

I've installed crosstool-NG and built GCC on a host+build x86 machine that targets arm-unknown-linux-gnueabi. I've then used arm-unknown-linux-gnueabi-gcc to compile a program that ran well on my ARM board.
I now want to build GCC targeting ARM, hosted on ARM. I believe the lingo is
build=i486-pc-linux-gnu
target=arm-unknown-linux-gnueabi
host=arm-unknown-linux-gnueabi
How do I do this? Do I run ./configure for crosstool-NG, passing --host=arm-unknown-linux-gnueabi?
or do I change the environment variables for CC/etc?
You do this with a .config file. I think samples with a comma in the name are good ones to look at. The main difference is that you must run ct-ng multiple times to create several cross compilers.
ct-ng has undergone some changes around Canadian crosses lately. However, you will probably need to re-use your original cross compiler that runs on the PC. The reason is that a compiler will include libraries compiled for the ARM, and you need to generate these libraries on your PC. Generally, ensure that the iX86-host+ARM-target compiler is on your path. Then you must set the host tuple or prefix for this toolchain in the toolchain menu. You need to set the build tuple to the same compiler.
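A rough sketch of the workflow (the sample name and menu labels may differ between ct-ng versions):

# 1. keep the ordinary PC -> ARM cross compiler on PATH
export PATH=$HOME/x-tools/arm-unknown-linux-gnueabi/bin:$PATH
# 2. configure a second toolchain with ARM as the host
ct-ng arm-unknown-linux-gnueabi    # start from a sample, or reuse your .config
ct-ng menuconfig                   # Toolchain options:
                                   #   build system tuple: i486-pc-linux-gnu
                                   #   host system tuple:  arm-unknown-linux-gnueabi
ct-ng build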
ct-ng help | grep variables
This gives a directory with a bunch of text files that you can grep for hints.
See 6 - Toolchain types.txt for example. Cross-native or Canadian-cross really doesn't matter, in terms of complexity of building. You need only one intermediate for Cross native, but you need two intermediate compilers for a Canadian cross.
Edit: ct-ng's How a compiler is constructed has some information on everything that is happening.

Can autotools create multi-platform makefiles

I have a plugin project I've been developing for a few years where the plugin works with numerous combinations of [primary application version, 3rd party library version, 32-bit vs. 64-bit]. Is there a (clean) way to use autotools to create a single makefile that builds all versions of the plugin?
As far as I can tell from skimming through the autotools documentation, the closest approximation to what I'd like is to have N independent copies of the project, each with its own makefile. This seems a little suboptimal for testing and development as (a) I'd need to continually propagate code changes across all the different copies and (b) there is a lot of wasted space in duplicating the project so many times. Is there a better way?
EDIT:
I've been rolling my own solution for a while where I have a fancy makefile and some perl scripts to hunt down various 3rd party library versions, etc. As such, I'm open to other non-autotools solutions. For other build tools, I'd want them to be very easy for end users to install. The tools also need to be smart enough to hunt down various 3rd party libraries and headers without a huge amount of trouble. I'm mostly looking for a linux solution, but one that also works for Windows and/or the Mac would be a bonus.
If your question is:
Can I use the autotools on some machine A to create a single universal makefile that will work on all other machines?
then the answer is "No". The autotools do not even make a pretense at trying to do that. They are designed to contain portable code that will determine how to create a workable makefile on the target machine.
If your question is:
Can I use the autotools to configure software that needs to run on different machines, with different versions of the primary software which my plugin works with, plus various 3rd party libraries, not to mention 32-bit vs 64-bit issues?
then the answer is "Yes". The autotools are designed to be able to do that. Further, they work on Unix, Linux, MacOS X, BSD.
I have a program, SQLCMD (which pre-dates the Microsoft program of the same name by a decade and more), which works with the IBM Informix databases. It detects whether the client software (called IBM Informix ESQL/C, part of the IBM Informix ClientSDK or CSDK) is installed, and whether it is 32-bit or 64-bit. It also detects which version of the software is installed and adapts its functionality to what is available in the supporting product. It supports versions that have been released over a period of about 17 years. It is autoconfigured - I had to write some autoconf macros for the Informix functionality, and for a couple of other gizmos (high-resolution timing, presence of /dev/stdin, etc.). But it is doable.
On the other hand, I don't try and release a single makefile that fits all customer machines and environments; there are just too many possibilities for that to be sensible. But autotools takes care of the details for me (and my users). All they do is:
./configure
That's easier than working out how to edit the makefile. (Oh, for the first 10 years, the program was configured by hand. It was hard for people to do, even though I had pretty good defaults set up. That was why I moved to auto-configuration: it makes it much easier for people to install.)
Mr Fooz commented:
I want something in between. Customers will use multiple versions and bitnesses of the same base application on the same machine in my case. I'm not worried about cross-compilation such as building Windows binaries on Linux.
Do you need a separate build of your plugin for the 32-bit and 64-bit versions? (I'd assume yes - but you could surprise me.) So you need to provide a mechanism for the user to say
./configure --use-tppkg=/opt/tp/pkg32-1.0.3
(where tppkg is a code for your third-party package, and the location is specifiable by the user.) However, keep in mind usability: the fewer such options the user has to provide, the better; against that, do not hard-code things that should be optional, such as install locations. By all means look in default locations - that's good. And default to the bitness of the stuff you find. Maybe if you find both 32-bit and 64-bit versions, then you should build both -- that would require careful construction, though. You can always echo "Checking for TP-Package ..." and indicate what you found and where you found it. Then the installer can change the options. Make sure you document in './configure --help' what the options are; this is standard autotools practice.
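For instance, the default-location probing can be a small shell fragment in the configure machinery, along these lines (tppkg, the variable name and the paths are hypothetical):

# fall back to a default install location unless the user gave --use-tppkg=...
if test -z "$with_tppkg"; then
  for d in /opt/tp/pkg64-* /opt/tp/pkg32-*; do
    test -d "$d" && with_tppkg=$d && break
  done
fi
echo "Checking for TP-Package ... ${with_tppkg:-not found}"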
Do not do anything interactive though; the configure script should run, reporting what it does. The Perl Configure script (note the capital letter - it is a wholly separate automatic configuration system) is one of the few intensively interactive configuration systems left (and that is probably mainly because of its heritage; if starting anew, it would most likely be non-interactive). Such systems are more of a nuisance to configure than the non-interactive ones.
Cross-compilation is tough. I've never needed to do it, thank goodness.
Mr Fooz also commented:
Thanks for the extra comments. I'm looking for something like:
./configure --use-tppkg=/opt/tp/pkg32-1.0.3 --use-tppkg=/opt/tp/pkg64-1.1.2
where it would create both the 32-bit and 64-bit targets in one makefile for the current platform.
Well, I'm sure it could be done; I'm not so sure that it is worth doing by comparison with two separate configuration runs with a complete rebuild in between. You'd probably want to use:
./configure --use-tppkg32=/opt/tp/pkg32-1.0.3 --use-tppkg64=/opt/tp/pkg64-1.1.2
This indicates the two separate directories. You'd have to decide how you're going to do the build, but presumably you'd have two sub-directories, such as 'obj-32' and 'obj-64' for storing the separate sets of object files. You'd also arrange your makefile along the lines of:
FLAGS_32 = ...32-bit compiler options...
FLAGS_64 = ...64-bit compiler options...
TPPKG32DIR = #TPPKG32DIR#
TPPKG64DIR = #TPPKG64DIR#
OBJ32DIR = obj-32
OBJ64DIR = obj-64
BUILD_32 = #BUILD_32#
BUILD_64 = #BUILD_64#
TPPKGDIR =
OBJDIR =
FLAGS =
all: ${BUILD_32} ${BUILD_64}

build_32:
	${MAKE} TPPKGDIR=${TPPKG32DIR} OBJDIR=${OBJ32DIR} FLAGS=${FLAGS_32} build

build_64:
	${MAKE} TPPKGDIR=${TPPKG64DIR} OBJDIR=${OBJ64DIR} FLAGS=${FLAGS_64} build

build: ${OBJDIR}/plugin.so
This assumes that the plugin would be a shared object. The idea here is that the autotool would detect the 32-bit or 64-bit installs for the Third Party Package, and then make substitutions. The BUILD_32 macro would be set to build_32 if the 32-bit package was required and left empty otherwise; the BUILD_64 macro would be handled similarly.
When the user runs 'make all', it will build the build_32 target first and the build_64 target next. To build the build_32 target, it will re-run make and configure the flags for a 32-bit build. Similarly, to build the build_64 target, it will re-run make and configure the flags for a 64-bit build. It is important that all the flags affected by 32-bit vs 64-bit builds are set on the recursive invocation of make, and that the rules for building objects and libraries are written carefully - for example, the rule for compiling source to object must be careful to place the object file in the correct object directory - using GCC, for example, you would specify (in a .c.o rule):
${CC} ${CFLAGS} -o ${OBJDIR}/$*.o -c $*.c
The macro CFLAGS would include the ${FLAGS} value which deals with the bitness (for example, FLAGS_32 = -m32 and FLAGS_64 = -m64), and so when building the 32-bit version, FLAGS = -m32 would be included in the CFLAGS macro.
The residual issue in the autotools is working out how to determine the 32-bit and 64-bit flags. If the worst comes to the worst, you'll have to write macros for that yourself. However, I'd expect (without having researched it) that you can do it using standard facilities from the autotools suite.
Unless you create yourself a carefully (even ruthlessly) symmetric makefile, it won't work reliably.
As far as I know, you can't do that. However, are you stuck with autotools? Are neither CMake nor SCons an option?
We tried it and it doesn't work! So we now use SCons.
Some articles on this topic: 1 and 2
Edit:
A small example of why I love SCons:
env.ParseConfig('pkg-config --cflags --libs glib-2.0')
With this line of code you add GLib to the compile environment (env). And don't forget the User Guide, which is just great for learning SCons (you really don't have to know Python!). For the end user you could try SCons with PyInstaller or something like that.
And in comparison to make you use Python, i.e. a complete programming language! With this in mind you can do just about anything (more or less).
Have you ever considered using a single project with multiple build directories?
If your automake project is implemented in a proper way (i.e. NOT like gcc),
the following is possible:
mkdir build1 build2 build3
cd build1
../configure $(YOUR_OPTIONS)
cd ../build2
../configure $(YOUR_OPTIONS2)
[...]
You are able to pass different configuration parameters, like include directories and compilers (e.g. cross compilers).
You can then even build all of them from a single command line by running
for d in build1 build2 build3; do make -C "$d"; done
