I had implemented an elf packer for executable for aarch64.
Now, I would like to pack a shared library.
For executable binaries, the entry point is replaced with the address
of the loader. While the job of the loader is to decrypt the .text section
and finally jump to the decrypted .text section.
However, there is no entry point for shared libraries.
Where should my loader be placed ?
The elf loader can be triggered by patching the entry of .rela.dyn with the address of the loader.
https://github.com/0xN3utr0n/Noteme/blob/master/injection.c
This trick has been tested on x86_64 and x86_86.
But failed on aarch64.
Related
How can I easily find out the direct shared object dependencies of a Linux binary in ELF format?
I'm aware of the ldd tool, but that appears to output all dependencies of a binary, including the dependencies of any shared objects that binary is dependent on.
You can use readelf to explore the ELF headers. readelf -d will list the direct dependencies as NEEDED sections.
$ readelf -d elfbin
Dynamic section at offset 0xe30 contains 22 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libssl.so.1.0.0]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x000000000000000c (INIT) 0x400520
0x000000000000000d (FINI) 0x400758
...
If you want to find dependencies recursively (including dependencies of dependencies, dependencies of dependencies of dependencies and so on)…
You may use ldd command.
ldd - print shared library dependencies
The objdump tool can tell you this information. If you invoke objdump with the -x option, to get it to output all headers then you'll find the shared object dependencies right at the start in the "Dynamic Section".
For example running objdump -x /usr/lib/libXpm.so.4 on my system gives the following information in the "Dynamic Section":
Dynamic Section:
NEEDED libX11.so.6
NEEDED libc.so.6
SONAME libXpm.so.4
INIT 0x0000000000002450
FINI 0x000000000000e0e8
GNU_HASH 0x00000000000001f0
STRTAB 0x00000000000011a8
SYMTAB 0x0000000000000470
STRSZ 0x0000000000000813
SYMENT 0x0000000000000018
PLTGOT 0x000000000020ffe8
PLTRELSZ 0x00000000000005e8
PLTREL 0x0000000000000007
JMPREL 0x0000000000001e68
RELA 0x0000000000001b38
RELASZ 0x0000000000000330
RELAENT 0x0000000000000018
VERNEED 0x0000000000001ad8
VERNEEDNUM 0x0000000000000001
VERSYM 0x00000000000019bc
RELACOUNT 0x000000000000001b
The direct shared object dependencies are listing as 'NEEDED' values. So in the example above, libXpm.so.4 on my system just needs libX11.so.6 and libc.so.6.
It's important to note that this doesn't mean that all the symbols needed by the binary being passed to objdump will be present in the libraries, but it does at least show what libraries the loader will try to load when loading the binary.
ldd -v prints the dependency tree under "Version information:' section. The first block in that section are the direct dependencies of the binary.
See Hierarchical ldd(1)
ELF in Linux allows specifying a custom loader in the .interp header. Usually, this is ld. What are examples of non-ld loaders? What is used to load the loader itself? Assuming I specify a non-ld interp, what is the interface it has to the executable it is supposed to load?
ELF in Linux allows specifying a custom loader in the .interp header. Usually, this is ld.
The loader usually depends on the libc being used, and is never ld.
On Solaris, it's usually ld.so.
On Linux it's usually /lib/ld-linux.so.2 (on i*x86 GLIBC systems), /lib64/ld-linux-x86-64.so.2 (on x86_64 GLIBC systems), etc.
What are examples of non-ld loaders?
Musl-based systems use $(syslibdir)/ld-musl-$(ARCH).so.1
What is used to load the loader itself?
The kernel does it. The loader must relocate itself, since there is nothing else which can do that.
Assuming I specify a non-ld interp, what is the interface it has to the executable it is supposed to load?
That interface is defined by whatever libc you are using (which is why the loader and libc always come from the same package).
ELF in Linux allows specifying a custom loader in the .interp header. Usually, this is ld.
On Linux, that is normally ld.so, camouflaged as ld-linux.so:
$ readelf -p .interp /bin/gcc
String dump of section '.interp':
[ 0] /lib64/ld-linux-x86-64.so.2
$ file /lib64/ld-linux-x86-64.so.2
/lib64/ld-linux-x86-64.so.2: symbolic link to /lib/x86_64-linux-gnu/ld-2.31.so
I need to build a complete linux development framework for a Cortex-M MCU, specifically a STM32F7 Cortex-M7. First I've to explain some background info so please bear with me.
I've downloaded and built a gcc toolchain with croostool-ng 1.24 specifying an armv7e-m architecture with thumb-only instructions and linux 4.20 as the OS and that I want the output to be FLAT executables (I assumed it will mean bFLT).
Then I proceeded to compile the linux kernel (version 4.20) using configs/stm32_defconf, and then a statically compiled busybox rootfs, all using my new toolchain.
Kernel booted just fine but throw me an error and kernel painc with the following message:
Starting init: /sbin/init exists but couldn't execute it (error -8)
and
request_module: modprobe binfmt-464c cannot be processed, kmod busy with 50 threads
The interesting part is the last message. My busybox excutable turned out to be an .ELF! Cortex-M has no MMU, so it's imposible to build a linux kernel on a MMU-less architecture with .ELF support, that's why an (464c)"LF" binary loader can't be found, there is none.
So at last, my question is:
how could I build bFLT executables to run on MMU-less Linux architectures? My toolchain has elf2flt, but in crosstool-ng I've already specified a MMU-less architecture and FLAT binary and I was expecting direct bFLT output, not a "useless" executable. Is that even possible?
Or better: is there anywhere a documented standard procedure to build a complete, working Linux system based on Cortex-M?
Follow-up:
I gave up on building FLAT binaries and tried FDPIC executables. Another dead end. AFAIK:
Linux has long been supporting ELF FDPIC, but the ABI for ARM is pretty new.
It seems that still at this day and age, GCC has not a standard way to enable FDPIC. On some architectures you can use -mfdpic. Not on arm, don't know why. I even don't know if ARM FDPIC is supported at all by mainline GCC. Info is extremely scarce if inexistent.
It seems crosstool-ng 1.24 is BROKEN at building ARM ELF FDPIC support. Resulting gcc has not -mfdpic, and -fPIC generates ARM executables, not ARM FDPIC.
Any insight will be very appreciated.
you can generate FDPIC ELF files just with a prebuilt arm-linux-gnueabi-gcc compiler.
Specifications of an FDPIC ELF file:
Position independent executable/code (i.e. -fPIE and fPIC)
Should be compiled as a shared executable (ET_DYN ELF) to be position independent
use these flags to compile your programs:
arm-linux-gnueabi-gcc -shared -fPIE -fPIC <YOUR PROGRAM.C> -o <OUTPUT FILE>
I've compiled busybox successfully for STM32H7 with this method.
As I know, unfortunately FDPIC ELFs should be compiled with - shared flag so, they use shared libraries and cannot be compiled as -static ELF.
For more information take a look at this file:
https://github.com/torvalds/linux/blob/master/fs/binfmt_elf_fdpic.c
Track the crosstool-ng BFLAT issue from here:
https://github.com/crosstool-ng/crosstool-ng/issues/1399
I am wondering if it is legal to return with ret from a program's entry point.
Example with NASM:
section .text
global _start
_start:
ret
; Linux: nasm -f elf64 foo.asm -o foo.o && ld foo.o
; OS X: nasm -f macho64 foo.asm -o foo.o && ld foo.o -lc -macosx_version_min 10.12.0 -e _start -o foo
ret pops a return address from the stack and jumps to it.
But are the top bytes of the stack a valid return address at the program entry point, or do I have to call exit?
Also, the program above does not segfault on OS X. Where does it return to?
MacOS Dynamic Executables
When you are using MacOS and link with:
ld foo.o -lc -macosx_version_min 10.12.0 -e _start -o foo
you are getting a dynamically loaded version of your code. _start isn't the true entry point, the dynamic loader is. The dynamic loader as one of its last steps does C/C++/Objective-C runtime initialization, and then calls your specified entry point specified with the -e option. The Apple documentation about Forking and Executing the Process has these paragraphs:
A Mach-O executable file contains a header consisting of a set of load commands. For programs that use shared libraries or frameworks, one of these commands specifies the location of the linker to be used to load the program. If you use Xcode, this is always /usr/lib/dyld, the standard OS X dynamic linker.
When you call the execve routine, the kernel first loads the specified program file and examines the mach_header structure at the start of the file. The kernel verifies that the file appear to be a valid Mach-O file and interprets the load commands stored in the header. The kernel then loads the dynamic linker specified by the load commands into memory and executes the dynamic linker on the program file.
The dynamic linker loads all the shared libraries that the main program links against (the dependent libraries) and binds enough of the symbols to start the program. It then calls the entry point function. At build time, the static linker adds the standard entry point function to the main executable file from the object file /usr/lib/crt1.o. This function sets up the runtime environment state for the kernel and calls static initializers for C++ objects, initializes the Objective-C runtime, and then calls the program’s main function
In your case that is _start. In this environment where you are creating a dynamically linked executable you can do a ret and have it return back to the code that called _start which does an exit system call for you. This is why it doesn't crash. If you review the generated object file with gobjdump -Dx foo you should get:
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000001 0000000000001fff 0000000000001fff 00000fff 2**0
CONTENTS, ALLOC, LOAD, CODE
SYMBOL TABLE:
0000000000001000 g 03 ABS 01 0010 __mh_execute_header
0000000000001fff g 0f SECT 01 0000 [.text] _start
0000000000000000 g 01 UND 00 0100 dyld_stub_binder
Disassembly of section .text:
0000000000001fff <_start>:
1fff: c3 retq
Notice that start address is 0. And the code at 0 is dyld_stub_binder. This is the dynamic loader stub that eventually sets up a C runtime environment and then calls your entry point _start. If you don't override the entry point it defaults to main.
MacOS Static Executables
If however you build as a static executable, there is no code executed before your entry point and ret should crash since there is no valid return address on the stack. In the documentation quoted above is this:
For programs that use shared libraries or frameworks, one of these commands specifies the location of the linker to be used to load the program.
A statically built executable doesn't use the dynamic loader dyld with crt1.o embedded in it. CRT = C runtime library which covers C++/Objective-C as well on MacOS. The processes of dealing with dynamic loading are not done, C/C++/Objective-C initialization code is not executed, and control is transferred directly to your entry point.
To build statically drop the -lc (or -lSystem) from the linker command and add -static option:
ld foo.o -macosx_version_min 10.12.0 -e _start -o foo -static
If you run this version it should produce a segmentation fault. gobjdump -Dx foo produces
start address 0x0000000000001fff
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000001 0000000000001fff 0000000000001fff 00000fff 2**0
CONTENTS, ALLOC, LOAD, CODE
1 LC_THREAD.x86_THREAD_STATE64.0 000000a8 0000000000000000 0000000000000000 00000198 2**0
CONTENTS
SYMBOL TABLE:
0000000000001000 g 03 ABS 01 0010 __mh_execute_header
0000000000001fff g 0f SECT 01 0000 [.text] _start
Disassembly of section .text:
0000000000001fff <_start>:
1fff: c3 retq
You should notice start_address is now 0x1fff. 0x1fff is the entry point you specified (_start). There is no dynamic loader stub as an intermediary.
Linux
Under Linux when you specify your own entry point it will segmentation fault whether you are building as a static or shared executable. There is good information on how ELF executables are run on Linux in this article and the dynamic linker documentation. The key point that should be observed is that the Linux one makes no mention of doing C/C++/Objective-C runtime initialisation unlike the MacOS dynamic linker documentation.
The key difference between the Linux dynamic loader (ld.so) and the MacOS one (dynld) is that the MacOS dynamic loader performs C/C++/Objective-C startup initialization by including the entry point from crt1.o. The code in crt1.o then transfers control to the entry point you specified with -e (default is main). In Linux the dynamic loader makes no assumption about the type of code that will be run. After the shared objects are processed and initialized control is transferred directly to the entry point.
Stack Layout at Process Creation
FreeBSD (on which MacOS is based) and Linux share one thing in common. When loading 64-bit executables the layout of the user stack when a process is created is the same. The stack for 32-bit processes is similar but pointers and data are 4 bytes wide, not 8.
Although there isn't a return address on the stack, there is other data representing the number of arguments, the arguments, environment variables, and other information. This layout is not the same as what the main function in C/C++ expects. It is part of the C startup code to convert the stack at process creation to something compatible with the C calling convention and the expectations of the function main (argc, argv, envp).
I wrote more information on this subject in this Stackoverflow answer that shows how a statically linked MacOS executable can traverse through the program arguments passed by the kernel at process creation.
Suplementing what Michael Petch already answered:
from runnable Mach-o executable perspective program launch happens either due to load command LC_MAIN (most modern executables since 10.7) which utilises DYLD in the process or backward compatible load command LC_UNIXTHREAD. The former is the variant where your ret is allowed and in fact preferable because you return control to DYLD __mh_execute_header. This will be followed by a buffer flush.
Alternatively to ret you can use system exit call either through undocumented syscall kernel API (64bit, int 0x80 for 32bit) or DYLD wrapper C lib doing it(documented).
If your executable is not utilising LC_MAIN you're left with legacy LC_UNIXTHREAD where you have no alternative to system exit call , ret will cause a segmentation fault.
I have a weird Linux system where most of the software is compiled against Glibc and some others against uClibc.
Since the Linux is a standard distro when I launch and executable the standard dynamic linker is invoked (/lib/ld.so.1) from glibc.
I'm looking for a way to specify the dynamic loader before launching any executable so when I want to run software which was compiled against uClibc I can define the launching mechanism to use uClibc dynamic loader (/lib/ld-uClibc.so.0).
Any ideas?
I'm looking for a way to specify the dynamic loader before launching any executable so when I want to run software which was compiled against uClibc
You should be specifying the correct dynamic loader while building against uClibc, using the linker --dynamic-linker argument. E.g.
gcc -nostdlib -Wl,--dynamic-linker=/lib/ld-uClibc.so.0 \
/lib/uClibc-crt1.o main.o -L/path/to/uClibc -lc
Just put the full path to the dynamic linker before calling the executable, for example:
/home/x20/tools/codescape-2016.05-3-mips-mti-linux-gnu/2016.05-03/sysroot/mipsel-r2-hard/lib64/ld-2.20.so out.gn/mipsel/d8
d8 is the binary we want to execute and ld-2.20.so is the dynamic linker
Looks to me as if you need to set PT_INTERP to point to an alternative interpreter that in turn prefers your prefered ld.so device. See the man page for elf(5). See readelf to dump what you have and see; you are trying to change ld-linux-xxx.so.x to whatever you come up with.
Actually, it looks to me as if you just want to point to your alternative ld.so as the INTERP.