Estimating compiler (GCC or LLVM-clang) version from a stripped ELF binary - linux

ELF binaries shipped by Linux distributions are stripped and don't include the ".comment" section.
Therefore, I cannot tell which compiler built the ELF binary.
I guess most of them are created by GCC, and I want to know which GCC version was used.
I know that if the ".comment" section is included in the ELF binary, I can get the compiler information using "readelf" or "objdump".
The method has been posted on Stack Overflow before:
How to retrieve the GCC version used to compile a given ELF executable?
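For example, when the section is present, something like this shows the compiler string (the binary path and version below are only illustrative):
> readelf -p .comment /usr/bin/example
String dump of section '.comment':
  [     0]  GCC: (GNU) 7.3.0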
I guess I can use decompilation tools (e.g., the Hex-Rays Decompiler https://www.hex-rays.com/products/decompiler/ ) to estimate the compiler version.
I also want to know the compiler name if the binary was not created by GCC, for example LLVM-clang.
Do you know a tool that estimates the name and version of the compiler which created an ELF binary?
I prefer to use a free tool.

ELF binaries shipped by Linux distributions are stripped and don't include the ".comment" section. Therefore, I cannot tell which compiler built the ELF binary.
Most binaries also come with a separate debuginfo package, which does have .comment sections, and a full source package, which allows you to configure and build an (almost) identical binary.
Examining either the debuginfo or the source package is a much easier and more reliable way to answer your question than guessing from the binary will ever be.
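For example, on a Fedora-style system, fetching and inspecting the debuginfo looks roughly like this (the package name and debug file path are only illustrative; other distributions use different commands and locations):
> dnf debuginfo-install coreutils
> readelf -p .comment /usr/lib/debug/usr/bin/ls.debug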
Do you know a tool that estimates the name and version of the compiler which created an ELF binary?
I doubt such a tool exists: writing such a tool would be a mostly pointless exercise.

Related

Can a library (.so) dynamically load another library built with a different compiler

Summary:
I am having trouble with one library dynamically loading another, and I'm wondering whether a difference in compilers is the root cause.
Problem Details:
My application links against libgbm.so, which dynamically loads libpvrGBMWSEGL.so and then requests the gbm_backend function.
/* inside libgbm.so */
module = dlopen("/usr/lib/libpvrGBMWSEGL.so", RTLD_NOW | RTLD_GLOBAL);
dlsym(module, entrypoint);
When I try to use the symbol provided, it throws a segmentation fault.
Analysis:
libpvrGBMWSEGL.so is provided as a proprietary binary blob. A quick analysis shows that it was built with Linaro GCC 5.3-2016.02:
> strings libpvrGBMWSEGL.so | grep GCC
GCC: (Linaro GCC 5.3-2016.02) 5.3.1 20160113
Meanwhile, the library libgbm which dynamically loads it was built with Buildroot GCC 6.4.0:
> strings libgbm.so | grep GCC
GCC: (Buildroot 2017.11-git-00884-g7af8140-dirty) 6.4.0
Question:
Should I expect these two libraries to be compatible in the manner in which I am using them?
For many platforms, there is a published ABI document to which compilers are expected to adhere. For C++ and on top of those platform ABIs, there is the Itanium C++ ABI (which has nothing to do with Itanium anymore and will be Itanium's lasting contribution to computing, I assume).
This does not extend to libraries, though. There are many libcs for Linux, and something compiled and linked against glibc will not run against Bionic libc (Android) and vice versa, even if the architectures match. Essentially the same is true for the C++ standard library (and even the implementation that comes with GCC offers slightly different ABIs as options).
With ARM, there is also a considerable amount of sub-architecture variation.
The summary is: when everyone makes an effort, what you are trying to do will work; if not, probably not. Getting this right for C++ is more difficult than for C.

How to compile ARM32 only binary (no thumb)

Is there a GCC configuration which will produce an executable only containing ARM32 code?
I know the -marm switch tells the compiler not to produce Thumb code, but it applies only to the user code of the program, while initialization routines (e.g. _start, frame_dummy, ...) still contain Thumb instructions.
I am using the Linaro cross compiler tool-chain (arm-linux-gnueabihf-) on a Linux x86-64 system.
EDIT:
While recompiling the tool-chain I found the (probable) solution myself. The initialization routines encoded as Thumb are part of glibc and can be found in the object files crt1.o, crti.o and crtbegin.o. I haven't tried recompiling it, but there may be a configuration value which forces the whole libc to be encoded as ARM32.
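One quick way to check how much Thumb code remains in the result is to look for the ARM "$t" mapping symbols that the toolchain emits for Thumb regions (the file names here are just examples, and the binary must not be stripped for the mapping symbols to be visible):
> arm-linux-gnueabihf-gcc -marm -o hello hello.c
> arm-linux-gnueabihf-readelf -s hello | grep '\$t'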
Is there a GCC configuration which will produce an executable only containing ARM32 code? I know the -marm switch ...
Your main problem is that this code (e.g. _start) is not produced by the compiler; it is already present pre-compiled (as Thumb code).
If you want these functions to be non-Thumb code, you'll have to "replace" the existing files (Thumb) with your own (non-Thumb).
(You don't have to overwrite the existing files; you can instead instruct the linker to search for these files in a different directory.)
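For instance, GCC's -B option prepends a directory to the paths searched for startup files such as crt1.o, so a sketch like the following would pick up replacement ARM-only objects from a directory you prepared yourself (the directory name is hypothetical):
> arm-linux-gnueabihf-gcc -marm -B/opt/arm-only-crt/ -o hello hello.c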
If you don't find pre-built non-Thumb files, you'll have to create them yourself (which may be a lot of work).

Execution of binaries created by specific compiler

I want to restrict the execution of binaries, in Linux, to only those compiled by myself. Let's say my system has GCC version 4.8.4; I want to allow execution only of ELF binaries compiled by the GCC installed on my system. Any ELF compiled elsewhere, even by the same version 4.8.4, should not execute on my system.
The .comment section contains the version and name of the compiler used to compile the ELF. Can we use this information, and if so, how?
Any ideas and suggestions are much appreciated.
I want to restrict the execution of binaries, in Linux, to only those compiled by myself.
Suppose you succeed in this. You do realize that your shell, your gcc, your ls will all immediately stop working (unless you've built them all yourself prior to turning on the restriction).
But suppose you have built the entire system, including the compiler, assembler and linker. In this case, you'll want to modify your linker and your kernel such that the linker signs the binaries it links, and your kernel verifies that a valid signature is present, and refuses to run the binary if the signature is invalid. You can read more about code signing here.
The .comment section contains the version and name of the compiler used to compile the ELF.
Normally, it does. However, it is trivial to add a .comment section with arbitrary contents to an already-linked executable, so you can't base your restriction on the contents of .comment (unless you want your restriction to be trivial to bypass).
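As a sketch of how easy the forgery is (the version string and file names are made up; --update-section needs a reasonably recent binutils, otherwise --remove-section followed by --add-section does the same):
> echo 'GCC: (GNU) 9.9.9' > fake-comment
> objcopy --update-section .comment=fake-comment ./some-binary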
Can objcopy ease my life without changing the linker?
Yes.
I believe you are thinking about your problem in a wrong way (thinking about mechanics before thinking about the substance of your solution).
Assuming you do want to do code signing, you'll need to generate a signature for every binary, and then:
Have some means for the kernel to verify that the signature is valid, and
Attach the signature to the binary. This could be as easy as having foo.signature in the same directory as foo for every binary you wish to run. Or you could use objcopy to make the signature part of the binary itself (more convenient if you move your binaries around).
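A minimal sketch of the objcopy variant, assuming an RSA key pair and a kernel-side check that you still have to implement yourself (all names are illustrative; note that the verification step must exclude the .signature section itself when hashing the file):
> openssl dgst -sha256 -sign private.pem -out foo.sig foo
> objcopy --add-section .signature=foo.sig foo foo.signed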

arm-linux-gnueabi toolchain vs arm-linux-androideabi toolchain.

Can I compile files (e.g. C or C++ source code) for my Android device using the arm-linux-gnueabi-* toolchain?
My question might seem a bit silly, but will I get the same result as compiling with the arm-linux-androideabi-* toolchain?
Compilation can mean more than just converting source code to a binary. A compiler like GCC also provides certain support libraries, in this case libgcc, for handling what the hardware can't handle. When a compiler becomes a toolchain, it also provides the runtime libraries standardised by the programming language, similar to the ones provided on the target system. In arm-linux-gnueabi-'s case that is glibc, and for arm-linux-androideabi- it's Bionic.
You can produce compatible object files to be used by different compilers; that's what ELF is for.
You can produce static executables, which can be mighty in size, and they should work on any matching hardware/kernel, because that is what toolchains aim for in the static case.
But if you produce dynamic executables, they can only run on systems that provide their dependencies. Because of that, a simple "hello world" application built non-statically by arm-linux-gnueabi- won't work on an Android system, since Android provides Bionic, not glibc.
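For example, a statically linked build from the GNU toolchain can usually be pushed to a device and run regardless of Bionic (file names and paths are illustrative):
> arm-linux-gnueabi-gcc -static -o hello hello.c
> adb push hello /data/local/tmp/hello
> adb shell /data/local/tmp/hello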

Compiling a fortran program on linux and moving the executable to another linux machine

I have a code that I wrote in Fortran during my PhD, and now I am collaborating with some researchers who use Linux, and they need my model, which is basically a single executable file. In the future I will probably make it open source, but for now they just want the executable, partly because they are not programmers and have never compiled a program in their lives. So the question is: is it possible to compile it on my Linux machine and then send it to them to use on another Linux machine? Or does the Linux version and distribution matter?
thank you very much
A.
If you do not use many libraries, you can do that. One option is statically linking the executable (-static or a similar compiler option). You need the static versions of all needed libraries for that; they have the .a suffix. They are often not installed by default in Linux distributions, and often they are not supplied in the repositories at all.
In my distribution (openSUSE) they are in packages like glibc-devel-static, lapack-devel-static and similar.
The other option would be to compile the executable on a distribution compatible with the one the users will have (the GLIBC version is important) and supply all the dynamically linked .so libraries they will need together with your executable.
All of this assumes you use the same platform, like i586, amd64 or ARM, as wallyk comments. I mostly assumed you are on a PC. You can force most compilers to produce a 32-bit or 64-bit executable with the -m32 or -m64 option. You need the right version of the development libraries for that.
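A minimal sketch of the static option with gfortran (file names are illustrative); ldd reporting "not a dynamic executable" confirms there are no shared-library dependencies to ship:
> gfortran -static -O2 -o model model.f90
> ldd model
        not a dynamic executable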
