I'd like to copy an executable ELF file via:
$ objcopy -O binary myfile.elf myfile.bin
Unfortunately:
$ chmod +x myfile.bin
$ ./myfile.bin
results in: cannot execute binary file
Is there any way to retain the file's executability?
To be executable by the execve(2) syscall, a file usually has to be some elf(5) file (or some script, or some old a.out format). But see also binfmt_misc.
Your objcopy(1) command is losing the essential meta-data in the ELF file. Maybe you want strip(1).
Recall that ELF is quite a complex and versatile format: it specifies the starting address, the interpreter (the ld-linux(8) dynamic linker), the several segments of the program, etc. All this meta-data is needed by the kernel for execve(2) and is lost with objcopy -O binary ...
When you use objcopy -O binary, you are copying only the binary data:
objcopy can be used to generate a raw binary file by using an output target of `binary' (e.g., use -O binary). When objcopy generates a raw binary file, it will essentially produce a memory dump of the contents of the input object file. All symbols and relocation information will be discarded. The memory dump will start at the load address of the lowest section copied into the output file.
In particular you lose the entry point and the segments list given in the original ELF header. The kernel cannot guess them.
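To make the point about the lost header concrete, here is a minimal Python sketch of the kind of check a loader has to pass before it can do anything else. The function names (make_sample_header, describe_elf_header) are hypothetical, the sample header is synthetic, and only 64-bit little-endian ELF is handled; the kernel's real loader does far more than this.

```python
import struct

ELF_MAGIC = b"\x7fELF"
# ELF64 header fields that follow the 16-byte e_ident array (little-endian).
EHDR_REST = "<HHIQQQIHHHHHH"


def make_sample_header(entry: int) -> bytes:
    """Build a synthetic 64-byte ELF64 header (illustration only, not a loadable program)."""
    ident = ELF_MAGIC + bytes([2, 1, 1, 0]) + bytes(8)  # 64-bit, little-endian, version 1
    rest = struct.pack(EHDR_REST, 2, 62, 1, entry, 64, 0, 0, 64, 56, 1, 64, 0, 0)
    return ident + rest


def describe_elf_header(data: bytes):
    """Return entry point and program-header info, or None if this is not ELF64 LE."""
    if len(data) < 64 or data[:4] != ELF_MAGIC:
        return None  # a raw memory dump typically fails here: no magic, no header
    if data[4] != 2 or data[5] != 1:
        return None  # this sketch only handles ELFCLASS64 + ELFDATA2LSB
    fields = struct.unpack_from(EHDR_REST, data, 16)
    # fields[3] = e_entry, fields[4] = e_phoff, fields[9] = e_phnum
    return {"entry": fields[3], "phoff": fields[4], "phnum": fields[9]}
```

Calling describe_elf_header(open("myfile.elf", "rb").read()) on a 64-bit little-endian ELF file should return a dictionary, while the output of objcopy -O binary would normally yield None, since the header that carried the entry point and segment table is gone.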
I don't understand why you expect the result of objcopy -O binary to be executable by Linux using execve(2). The main purpose of objcopy -O binary is to make firmware or kernel-like stand-alone (freestanding) binaries. For those you need to understand exactly what they should look like (e.g. what their starting point is, and how they are loaded and started), and you probably also use some very specific linker script. Of course such a binary cannot make any syscall to the Linux kernel (in particular, it cannot do any kind of input or output the way a plain Linux executable does).
Read also more about ABIs, the x86-64 ABI, the Linux Assembly HowTo, the Advanced Linux Programming book.
You should probably read a good OS textbook like Operating Systems: Three Easy Pieces.
Related
I want to find the libc.so file that's being used in a Rust build so that I can query it with --version. (Some libcs expose their version information via C macros, so an alternative for them would be to use the cc crate in a build script. But others like musl don't.)
I can figure out which libstd-*.so file a rust binary or library will be linked against. When this libstd.so is linked against the host's libc, then running ldd on it shows that libc.so. But when the host system is using glibc and the targeted environment is musl, this doesn't work ("Invalid ELF header"). Instead of ldd, I could instead use readelf -d or objdump -p on the libstd.so. But these only show the filename of the libc.so file it uses, not its full path. And that libc.so isn't at any of the directories in LD_LIBRARY_PATH. (I do know where it is on my own systems, but I'm trying to find it programmatically on arbitrary systems.)
Running ldconfig -p only gives me information about the libc for the host system.
It would be great if there were a rustc equivalent of gcc's and clang's -print-file-name=libc.so, so that I could do something like rustc --target=$TARGET --print-file-name=libc.so.
Other ideas about how I could get this information?
You can pass linker arguments to rustc like so:
rustc -C link-args=...
To find out which libc.so is used, I believe the following command should suffice:
rustc -C link-args=-Wl,-t ...
From man ld:
-t
--trace
Print the names of the input files as ld processes them. ...
Update:
This didn't work: rustc "eats up" the output from the linker.
I was able to get the desired output indirectly:
echo 'fn main() { println!("")}' | rustc -C link-args=-Wl,-Map=map.out -o foo -
grep 'libc\.so' map.out
libc.so.6 /usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-f25e49a311b0f577.rlib(std-f25e49a311b0f577.std.cy8lhng1-cgu.2.rcgu.o) (setuid@@GLIBC_2.2.5)
LOAD /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libc.so
LOAD /lib/x86_64-linux-gnu/libc.so.6
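If the map file needs to be processed programmatically rather than with grep, something along these lines might do. It is only a sketch: libc_load_paths is a hypothetical helper, and it assumes each LOAD line has exactly two whitespace-separated fields, as in the output above (paths containing spaces would be missed).

```python
def libc_load_paths(map_text: str):
    """Return the paths on LOAD lines of a linker map that mention libc.so."""
    paths = []
    for line in map_text.splitlines():
        parts = line.split()
        # Keep only lines of the form "LOAD <path>" whose path names libc.so.
        if len(parts) == 2 and parts[0] == "LOAD" and "libc.so" in parts[1]:
            paths.append(parts[1])
    return paths
```

For example, libc_load_paths(open("map.out").read()) would return the LOAD entries only, leaving out the symbol cross-reference line that grep also matches.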
I'm trying to get ELF data (Section Header Table, SHT) for Linux system utilities.
I've noticed that I can get this info for my own programs when they are compiled so that an object (.o) file is created. I also managed to get the SHT for bash via readelf, by typing:
readelf -l /bin/bash
However, it doesn't work for some utilities, like gunzip. For instance, I want to type-in something like:
readelf -S tar -czvf large_file.tar.gz large_file.dat
and get a set of execution attributes, like the ones in the below picture:
Does anyone know how to achieve this?
Thanks in advance,
Alexander.
The manpage for readelf shows that it expects to receive the path to an ELF binary file as its argument, not an entire command like tar -czvf large_file.tar.gz large_file.dat.
To find ELF data for tar, do this:
which tar
readelf -S <path to tar binary>
On my system, the path to tar is simply /bin/tar.
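The same two steps can be scripted. The sketch below only builds the readelf command line; readelf_sections_argv is a hypothetical name, and it assumes readelf itself is installed and on PATH when the command is eventually run.

```python
import shutil


def readelf_sections_argv(tool: str):
    """Resolve a utility name to its path and build a `readelf -S` command for it."""
    path = shutil.which(tool)  # same lookup idea as `which tar`
    if path is None:
        return None  # the utility is not on PATH
    return ["readelf", "-S", path]
```

The result can be passed to subprocess.run(argv, check=True). Note that some utilities (gunzip is often one) are shell scripts rather than ELF binaries, in which case readelf will report that the file is not an ELF file.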
On Arch Linux, when I link an object file with ld into a dynamically linked ELF executable, it uses /lib/ld64.so.1 as the default dynamic linker. However, my dynamic linker is /lib/ld-2.26.so from glibc.
I know that I can specify the dynamic linker to ld with the --dynamic-linker option, but how can I ensure that the correct dynamic linker is found when compiling for other Linux distributions? In other words: how can I find the correct name of the dynamic linker on Linux?
Link with gcc -nostartfiles or gcc -nostdlib instead of using ld directly.
(With -no-pie or -pie if you want to make that choice explicit).
The system gcc knows the right path for the ELF interpreter.
Or, to find out what the right path is, run file /bin/ls and parse the output. On my Arch Linux system, it includes ... dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, ...
Other options include readelf -l /bin/ls, which prints the requested program interpreter; ldd /bin/ls also includes the right path.
(But note that ldd includes the right path even if you use it on an ELF executable that has the wrong path; /bin/ldd is a shell script that works by running the ELF interpreter on an executable with special args, so the shell script contains paths to try, and the runtime dynamic linker doesn't look for itself because it's already running. You can use file or readelf -a to inspect executables to check for the right path, but not ldd.)
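For checking the interpreter path without parsing the output of file or readelf, the PT_INTERP program header can be read directly. This is a rough sketch under stated assumptions: elf64_interpreter and make_sample_image are hypothetical names, only 64-bit little-endian ELF is handled, and the sample image is a synthetic header plus one program header, not a runnable program.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def make_sample_image(interp: str) -> bytes:
    """Build a tiny synthetic ELF64 image with a single PT_INTERP entry."""
    path = interp.encode() + b"\0"
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    ehdr = ident + struct.pack("<HHIQQQIHHHHHH", 3, 62, 1, 0, 64, 0, 0, 64, 56, 1, 64, 0, 0)
    # p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align
    phdr = struct.pack("<IIQQQQQQ", PT_INTERP, 4, 64 + 56, 0, 0, len(path), len(path), 1)
    return ehdr + phdr + path


def elf64_interpreter(data: bytes):
    """Return the PT_INTERP path of an ELF64 little-endian image, or None."""
    if len(data) < 64 or data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        return None
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from("<IIQQQQ", data, off)
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.split(b"\0", 1)[0].decode()
    return None  # statically linked, or no interpreter requested
```

On a real system, elf64_interpreter(open("/bin/ls", "rb").read()) should report the path recorded in the executable itself, which is the value ldd cannot be trusted to verify.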
Is there any efficient way (maybe by abusing the gcc preprocessor?) to get a set of stripped kernel sources where all code not needed according to .config is left out?
Well, I got a few steps toward a solution.
First, one can obtain the used compiler commands by
make KBUILD_VERBOSE=1 | tee build.log
grep '^ gcc' build.log
For now, I select only one gcc command line for further steps. For example the build of kernel/kmod.c, it looks like:
gcc <LIST OF MANY OPTIONS> -c -o kernel/kmod.o kernel/kmod.c
I now remove the options -c and -o ... and add -E, thus disabling compilation and writing the preprocessor output to the screen. Further, I add -fdirectives-only to prevent macro expansion and -undef to remove the GNU-defined macro definitions. -nostdinc, which removes the standard C headers, is already added by the kernel makefile.
Now includes are still included and thus expanded on the preprocessor output. Thus I pipe the input file through grep removing them: grep -v '#include' kernel/kmod.c. Now only one include is left: autoconf.h is included by the Makefile's command line. This is great as it actually defines the macros used by #ifdef CONFIG_... to select the active kernel code.
The only thing left is to filter out the preprocessor comments and the remaining #defines from autoconf.h by means of grep -v '^#'.
The whole pipe looks like:
grep -v '#include' kernel/kmod.c | gcc -E -fdirectives-only -undef <ORIGINAL KERNEL BUILD GCC OPTIONS WITHOUT -c AND -o ...> - |grep -v '^#'
and the result is a filtered version of kernel/kmod.c containing the code that is actually built into kmod.o.
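The rewriting of the compiler command line described above could be automated roughly as follows. This is a sketch, not part of the original recipe: to_preprocess_argv is a hypothetical helper, and it assumes -o and its file name appear as two separate arguments (the joined form -ofile is not handled).

```python
def to_preprocess_argv(compile_argv, source):
    """Turn a kernel compile command into the preprocess-only command of the pipeline."""
    kept = []
    args = iter(compile_argv)
    for arg in args:
        if arg == "-c" or arg == source:
            continue  # no compilation; the source is fed on stdin instead
        if arg == "-o":
            next(args, None)  # also drop the output file name that follows -o
            continue
        kept.append(arg)
    # Preprocess only, keep macros unexpanded, drop predefined macros, read stdin.
    return kept + ["-E", "-fdirectives-only", "-undef", "-"]
```

Applied to each gcc line taken from build.log, this yields the middle stage of the pipe; the two grep -v filters stay as they are.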
Questions remain: How to do that for the whole source tree? Are there files that are actually built but never used and stripped at link time?
Kernel Minimization Script :
A project inspired by this question and providing an easy answer...
It contains a Python script that generates a minimized source tree at build time. The new minimized source tree contains only the sources that were actually used. (project page)
Info :
The script has been tested with kernel v4.14.x. Building the kernel again from the generated minimized sources requires copying the makefiles, Kconfig files, etc. as well, but at least the used sources can easily be isolated for investigation and development.
Usage :
cd /kernel/sources
make
wget https://github.com/Hitachi-India-Pvt-Ltd-RD/minimization/raw/master/minimize.py
export PATH=$PATH:`pwd`
make C=2 CHECK=minimize.py CF="-mindir ../path-to-minimized-source-tree/"
Note & Reminder :
If we are building on and for the target machine itself, there is also the make localmodconfig command, which shrinks the current config file down to the modules currently in use. Running it before the minimization produces an even more stripped-down source tree.
Compile everything and use atime to find out which files were not used. It might not be very accurate but it's probably worth a try.
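A minimal sketch of the atime idea, assuming the time at which the build started is recorded beforehand; files_not_accessed_since is a hypothetical helper name. On file systems mounted with noatime or relatime the access times may not be updated reliably, which is one reason the result can be inaccurate.

```python
import os


def files_not_accessed_since(root, since_epoch, suffixes=(".c", ".h", ".S")):
    """List source files under root whose access time predates since_epoch."""
    unused = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                path = os.path.join(dirpath, name)
                # st_atime is the last access time in seconds since the epoch.
                if os.stat(path).st_atime < since_epoch:
                    unused.append(path)
    return sorted(unused)
```

One possible usage: store time.time() in a variable just before running make, then call the function with that value once the build has finished.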
What is the usage of the -I and -L flags in a makefile?
These are typically part of the linker command line, and are either supplied directly in a target action or, more commonly, assigned to a make variable that is expanded to form the link command. In that case:
-L adds a directory to the search path for libraries.
-l is the name of the library you want to link to.
For instance, if you want to link to the library ~/libs/libabc.a you'd add:
-L$(HOME)/libs -labc
To take advantage of the default implicit rule for linking, add the -L flag to the variable LDFLAGS and the -l flag to LDLIBS (the implicit rule places LDLIBS after the object files), as in
LDFLAGS+=-L$(HOME)/libs
LDLIBS+=-labc
It's a good habit to separate LDFLAGS and LIBS, for example
# LDFLAGS contains flags passed to the compiler for use during linking
LDFLAGS = -Wl,--hash-style=both
# LIBS contains libraries to link with
LIBS = -L$(HOME)/libs -labc
program: a.o b.o c.o
	$(CC) $(LDFLAGS) $^ $(LIBS) -o $@
# or if you really want to call ld directly,
# $(LD) $(LDFLAGS:-Wl,%=%) $^ $(LIBS) -o $@
Even if it may work otherwise, the -l... directives are supposed to go after the objects that reference those symbols. Some optimizations (-Wl,--as-needed is the most obvious) will fail if linking is done in the wrong order.
To really grok a makefile, you need to also have a good understanding of the command lines for all of the components of your project's toolchain. Options like -I and -L are not understood by make itself. Rather, make is attempting to create a command line that will execute a tool to transform a prerequisite file into a target file.
Often, that is a C or C++ source file being compiled to an object file, and eventually linked to get an executable file.
In that case, you need to see the manual for your compiler, and especially the bits related to the command line options it understands.
All that said in generic terms, those specific options are pretty standard among compilers and linkers. -I adds a directory to the list of places searched by the compiler for a file named on a #include line, and -L adds a directory to the list of places searched by the linker for a library named with the -l option.
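How -L and -l interact can be pictured with a small simulation. This is only an approximation of a typical linker's behaviour (GNU ld, for instance, prefers libNAME.so over libNAME.a within each directory when shared libraries are enabled); resolve_library is a hypothetical helper, not a real tool.

```python
import os


def resolve_library(name, search_dirs):
    """Mimic how -lNAME is looked up in the -L directories, in order."""
    for directory in search_dirs:
        # Within one directory, try the shared library before the static archive.
        for filename in (f"lib{name}.so", f"lib{name}.a"):
            candidate = os.path.join(directory, filename)
            if os.path.isfile(candidate):
                return candidate
    return None  # the real linker would go on to its default directories
```

With search_dirs set to [os.path.expanduser("~/libs")] and name "abc", this returns ~/libs/libabc.a if that file exists, which corresponds to -L$(HOME)/libs -labc. The -I option works analogously for #include files.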
The bottom line is that the "language" of a makefile is a combination of the syntax of the makefile itself, your shell as known to make (usually /bin/sh or something similar), common shell commands (such as rm, cp, install, etc.), and the commands specific to your compiler and linker (e.g. typing gcc -v --help at your shell prompt will give you a nearly complete (and extremely long) list of the options understood by gcc as one starting point).
One thing to note is that these are the options passed to the compiler/linker.
So you should be looking at the compiler man pages/documentation to know their role.