arm hello world not starting - command aligned not page-aligned - linux

I have create a hello world application for my armv6 ... but it is not starting
./hello.out
> Killed
ldd ./hello.out
> $ not a dynamic executable
/lib/ld-linux.so.3 --list ./hello.out
> ./hello.out: error while loading shared libraries: ./hello.out: ELF load command alignment not page-aligned
What does "command aligned not page-aligned" mean?

It means one of the segments in your ELF file is not page aligned. Pages are usually aligned at 4096 (0x1000) byte boundaries.
You can check your ELF file with the readelf command, see for example the program headers for bash:
$ readelf -l /bin/bash
Elf file type is EXEC (Executable file)
Entry point 0x419b80
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001f8 0x00000000000001f8 R E 8
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x000000000000001c 0x000000000000001c R 1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000bca0c 0x00000000000bca0c R E 200000
LOAD 0x00000000000bcde0 0x00000000006bcde0 0x00000000006bcde0
0x0000000000003d1c 0x000000000000d4e8 RW 200000
DYNAMIC 0x00000000000bcdf8 0x00000000006bcdf8 0x00000000006bcdf8
0x0000000000000200 0x0000000000000200 RW 8
NOTE 0x0000000000000254 0x0000000000400254 0x0000000000400254
0x0000000000000044 0x0000000000000044 R 4
GNU_EH_FRAME 0x00000000000a8a00 0x00000000004a8a00 0x00000000004a8a00
0x0000000000002ea4 0x0000000000002ea4 R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
GNU_RELRO 0x00000000000bcde0 0x00000000006bcde0 0x00000000006bcde0
0x0000000000000220 0x0000000000000220 R 1
The two LOAD sections have an alignment of 200000. This is a proper multiple of 0x1000 (4096).

Related

Loaded glibc base address different for each function

I'm trying to calculate the base address of the library of a binary file.
I have the address of printf, puts ecc and then I subtract it's offset to get the base address of the library.
I was doing this for printf, puts and signal, but every time I got a different base address.
I also tried to do the things in this post, but I couldn't get the right result either.
ASLR is disabled.
this is where I take the address of the library function:
gdb-peda$ x/20wx 0x804b018
0x804b018 <signal#got.plt>: 0xf7e05720 0xf7e97010 0x080484e6 0x080484f6
0x804b028 <puts#got.plt>: 0xf7e3fb40 0x08048516 0x08048526 0xf7df0d90
0x804b038 <memset#got.plt>: 0xf7f18730 0x08048556 0x08048566 0x00000000
then I have:
gdb-peda$ info proc mapping
process 114562
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x8048000 0x804a000 0x2000 0x0 /home/ofey/CTF/Pwnable.tw/applestore/applestore
0x804a000 0x804b000 0x1000 0x1000 /home/ofey/CTF/Pwnable.tw/applestore/applestore
0x804b000 0x804c000 0x1000 0x2000 /home/ofey/CTF/Pwnable.tw/applestore/applestore
0x804c000 0x806e000 0x22000 0x0 [heap]
0xf7dd8000 0xf7fad000 0x1d5000 0x0 /lib/i386-linux-gnu/libc-2.27.so
0xf7fad000 0xf7fae000 0x1000 0x1d5000 /lib/i386-linux-gnu/libc-2.27.so
0xf7fae000 0xf7fb0000 0x2000 0x1d5000 /lib/i386-linux-gnu/libc-2.27.so
0xf7fb0000 0xf7fb1000 0x1000 0x1d7000 /lib/i386-linux-gnu/libc-2.27.so
0xf7fb1000 0xf7fb4000 0x3000 0x0
0xf7fd0000 0xf7fd2000 0x2000 0x0
0xf7fd2000 0xf7fd5000 0x3000 0x0 [vvar]
0xf7fd5000 0xf7fd6000 0x1000 0x0 [vdso]
0xf7fd6000 0xf7ffc000 0x26000 0x0 /lib/i386-linux-gnu/ld-2.27.so
0xf7ffc000 0xf7ffd000 0x1000 0x25000 /lib/i386-linux-gnu/ld-2.27.so
0xf7ffd000 0xf7ffe000 0x1000 0x26000 /lib/i386-linux-gnu/ld-2.27.so
0xfffdd000 0xffffe000 0x21000 0x0 [stack]
and :
gdb-peda$ info sharedlibrary
From To Syms Read Shared Object Library
0xf7fd6ab0 0xf7ff17fb Yes /lib/ld-linux.so.2
0xf7df0610 0xf7f3d386 Yes /lib/i386-linux-gnu/libc.so.6
I then found the offset of signal and puts to calculate the base libc address.
base_with_signal_offset = 0xf7e05720 - 0x3eda0 = 0xf7dc6980
base_with_puts_offset = 0xf7e3fb40 - 0x809c0 = 0xf7dbf180
I was expecting base_with_signal_offset = base_with_puts_offset = 0xf7dd8000, but that's not the case.
What I'm doing wrong?
EDIT(To let you understand where I got those offset):
readelf -s /lib/x86_64-linux-gnu/libc-2.27.so | grep puts
I get :
191: 00000000000809c0 512 FUNC GLOBAL DEFAULT 13 _IO_puts##GLIBC_2.2.5
422: 00000000000809c0 512 FUNC WEAK DEFAULT 13 puts##GLIBC_2.2.5
496: 00000000001266c0 1240 FUNC GLOBAL DEFAULT 13 putspent##GLIBC_2.2.5
678: 00000000001285d0 750 FUNC GLOBAL DEFAULT 13 putsgent##GLIBC_2.10
1141: 000000000007f1f0 396 FUNC WEAK DEFAULT 13 fputs##GLIBC_2.2.5
1677: 000000000007f1f0 396 FUNC GLOBAL DEFAULT 13 _IO_fputs##GLIBC_2.2.5
2310: 000000000008a640 143 FUNC WEAK DEFAULT 13 fputs_unlocked##GLIBC_2.2.5
I was expecting base_with_signal_offset = base_with_puts_offset = 0xf7dd8000
There are 3 numbers in your calculation:
&puts_at_runtime - symbol_value_from_readelf == &first_executable_pt_load_segment_libc.
The readelf output shows that you got one of these almost correct: the value of puts in 64-bit /lib/x86_64-linux-gnu/libc-2.27.so is indeed 0x809c0, but that is not the library you are actually using. You need to repeat the same on the actually used 32-bit library: /lib/i386-linux-gnu/libc-2.27.so.
For the first number -- &puts_at_runtime, you are using value from the puts#got.plt import stub. That value is only guaranteed to have been resolved (point to actual puts in libc.so) IFF you have LD_BIND_NOW=1 set in the environment, or you linked your executable with -z now linker flag, or you actually called puts already.
It may be better to print &puts in GDB.
The last number -- &first_executable_pt_load_segment_libc is correct (because info shared shows that libc.so.6 .text section starts at 0xf7df0610, which is between 0xf7dd8000 and 0xf7fad000.
So putting it all together, the only error was that you used the wrong version of libc.so to extract the symbol_value_from_readelf.
On my system:
#include <signal.h>
#include <stdio.h>
int main() {
puts("Hello");
signal(SIGINT, SIG_IGN);
return 0;
}
gcc -m32 t.c -fno-pie -no-pie
gdb -q a.out
... set breakpoint on exit from main
Breakpoint 1, 0x080491ae in main ()
(gdb) p &puts
$1 = (<text variable, no debug info> *) 0xf7e31300 <puts>
(gdb) p &signal
$2 = (<text variable, no debug info> *) 0xf7df7d20 <ssignal>
(gdb) info proc map
process 114065
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x8048000 0x8049000 0x1000 0x0 /tmp/a.out
...
0x804d000 0x806f000 0x22000 0x0 [heap]
0xf7dc5000 0xf7de2000 0x1d000 0x0 /lib/i386-linux-gnu/libc-2.29.so
...
(gdb) info shared
From To Syms Read Shared Object Library
0xf7fd5090 0xf7ff0553 Yes (*) /lib/ld-linux.so.2
0xf7de20e0 0xf7f2b8d6 Yes (*) /lib/i386-linux-gnu/libc.so.6
Given above, we expect readelf -s to give us 0xf7e31300 - 0xf7dc5000 ==
0x6c300 for puts and 0xf7df7d20 - 0xf7dc5000 == 0x32d20 for signal respectively.
readelf -Ws /lib/i386-linux-gnu/libc-2.29.so | egrep ' (puts|signal)\W'
452: 00032d20 68 FUNC WEAK DEFAULT 14 signal##GLIBC_2.0
458: 0006c300 400 FUNC WEAK DEFAULT 14 puts##GLIBC_2.0
QED.

Why is the stack segment executable on Raspberry Pi?

I have a Raspberry Pi 3 with the Raspbian GNU/Linux 8 (Jessie) OS.
I wrote this simple program. I compiled it with gcc -o hello hello.c.
#include <stdio.h>
void main(){
printf("hello!\n");
}
From from the readelf output everything seems OK:
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
EXIDX 0x0004cc 0x000104cc 0x000104cc 0x00008 0x00008 R 0x4
PHDR 0x000034 0x00010034 0x00010034 0x00120 0x00120 R E 0x4
INTERP 0x000154 0x00010154 0x00010154 0x00019 0x00019 R 0x1
[Requesting program interpreter: /lib/ld-linux-armhf.so.3]
LOAD 0x000000 0x00010000 0x00010000 0x004d8 0x004d8 R E 0x10000
LOAD 0x000f0c 0x00020f0c 0x00020f0c 0x0011c 0x00120 RW 0x10000
DYNAMIC 0x000f18 0x00020f18 0x00020f18 0x000e8 0x000e8 RW 0x4
NOTE 0x000170 0x00010170 0x00010170 0x00044 0x00044 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x10
GNU_RELRO 0x000f0c 0x00020f0c 0x00020f0c 0x000f4 0x000f4 R 0x1
But when I run the program the stack is executable:
0x7efdf000 0x7f000000 0x00000000 rwx [stack]
I try to compile also with the option -z noexecstack, but nothing change.
I try also to download the version of libarmmem.so that have this code:
#if defined(__linux__) && defined(__ELF__)
.section .note.GNU-stack,"",%progbits
#endif
But nothing change.
Why is the stack segment executable on Raspberry Pi?
Edit I add the output of the LD_DEBUG=files ./hello command
23110:
23110: file=/usr/lib/arm-linux-gnueabihf/libarmmem.so [0]; needed by ./hello [0]
23110: file=/usr/lib/arm-linux-gnueabihf/libarmmem.so [0]; generating link map
23110: dynamic: 0x76f273fc base: 0x76f13000 size: 0x00014524
23110: entry: 0x76f13568 phdr: 0x76f13034 phnum: 6
23110:
23110:
23110: file=libc.so.6 [0]; needed by ./hello [0]
23110: file=libc.so.6 [0]; generating link map
23110: dynamic: 0x76f0ef20 base: 0x76dd4000 size: 0x0013e550
23110: entry: 0x76dea840 phdr: 0x76dd4034 phnum: 10
23110:
23110:
23110: calling init: /lib/arm-linux-gnueabihf/libc.so.6
23110:
23110:
23110: calling init: /usr/lib/arm-linux-gnueabihf/libarmmem.so
23110:
23110:
23110: initialize program: ./hello
23110:
23110:
23110: transferring control: ./hello
23110:
hello!
23110:
23110: calling fini: ./hello [0]
23110:
23110:
23110: calling fini: /usr/lib/arm-linux-gnueabihf/libarmmem.so [0]
23110:
Add more info:
I edit the file architecture.S, and after the make I received:
gcc -std=gnu99 -O2 -c -o trampoline.o trampoline.c
gcc -shared -o libarmmem.so architecture.o memcmp.o memcpymove.o memcpymove-a7.o memset.o trampoline.o
`architecture' referenced in section `.text' of trampoline.o: defined in discarded section `.note.GNU-stack' of architecture.o
`architecture' referenced in section `.text' of trampoline.o: defined in discarded section `.note.GNU-stack' of architecture.o
`architecture' referenced in section `.text' of trampoline.o: defined in discarded section `.note.GNU-stack' of architecture.o
`architecture' referenced in section `.text' of trampoline.o: defined in discarded section `.note.GNU-stack' of architecture.o
collect2: error: ld returned 1 exit status
Makefile:13: recipe for target 'libarmmem.so' failed
make: *** [libarmmem.so] Error 1
It is likely that /usr/lib/arm-linux-gnueabihf/libarmmem.so is causing this. I found this source file:
https://github.com/bavison/arm-mem/blob/master/architecture.S
It lacks the non-executable stack annotation, so glibc conservatively makes the stack executable when the DSO is preloaded. The other source files have this:
/* Prevent the stack from becoming executable */
#if defined(__linux__) && defined(__ELF__)
.section .note.GNU-stack,"",%progbits
#endif
So you just need to copy this into architecture.S (at the end of the file) and rebuild.
You can verify with eu-readelf -l /usr/lib/arm-linux-gnueabihf/libarmmem.so if this DSO is indeed the culprit. It should show either no GNU_STACK program header at all, or a GNU_STACK program header which is marked RWE in the penultimate column.

Computing offset of a function in memory

I am reading documentation for a uprobe tracer and there is a instruction how to compute offset of a function in memory. I am quoting it here.
Following example shows how to dump the instruction pointer and %ax
register at the probed text address. Probe zfree function in /bin/zsh:
# cd /sys/kernel/debug/tracing/
# cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh
# objdump -T /bin/zsh | grep -w zfree
0000000000446420 g DF .text 0000000000000012 Base zfree
0x46420 is the offset of zfree in object /bin/zsh that is loaded at
0x00400000.
I do not know why, but they took output 0x446420 and subtracted 0x400000 to get 0x46420. It seamed as an error to me. Why 0x400000?
I have tried to do the same on my Fedora 23 with 4.5.6-200 kernel.
First I turned off memory address randomization
echo 0 > /proc/sys/kernel/randomize_va_space
Then I figured out where binary is in memory
$ cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
555555554000-55555560f000 r-xp 00000000 fd:00 2387155 /usr/bin/zsh
Took the offset
marko#fedora:~ $ objdump -T /bin/zsh | grep -w zfree
000000000005dc90 g DF .text 0000000000000012 Base zfree
And figured out where zfree is via gdb
$ gdb -p 21067 --batch -ex 'p zfree'
$1 = {<text variable, no debug info>} 0x5555555b1c90 <zfree>
marko#fedora:~ $ python
Python 2.7.11 (default, Mar 31 2016, 20:46:51)
[GCC 5.3.1 20151207 (Red Hat 5.3.1-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> hex(0x5555555b1c90-0x555555554000)
'0x5dc90'
You see, I've got the same result as in objdump without subtracting anything.
But then I tried the same on another machine with SLES and there it's the same as in uprobe documentation.
Why is there such a difference? How do I compute correct offset then?
As far as I see the difference may be caused only by the way how examined binary was built. Saying more precisely - if ELF has fixed load address or not. Lets do simple experiment. We have simple test code:
int main(void) { return 0; }
Then, build it in two ways:
$ gcc -o t1 t.c # create image with fixed load address
$ gcc -o t2 t.c -pie # create load-base independent image
Now, lets check load base addresses for these two images:
$ readelf -l --wide t1 | grep LOAD
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00067c 0x00067c R E 0x200000
LOAD 0x000680 0x0000000000600680 0x0000000000600680 0x000228 0x000230 RW 0x200000
$ readelf -l --wide t2 | grep LOAD
LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x0008cc 0x0008cc R E 0x200000
LOAD 0x0008d0 0x00000000002008d0 0x00000000002008d0 0x000250 0x000258 RW 0x2000
Here you can see that first image requires fixed load address - 0x400000, and the second one has no address requirements at all.
And now we can compare addresses that objdump tells about main:
$ objdump -t t1 | grep ' main'
00000000004004b6 g F .text 000000000000000b main
$ objdump -t t2 | grep ' main'
0000000000000710 g F .text 000000000000000b main
As we see, the address is a complete virtual address that first byte of main will occupy if image is loaded at address, stored in program header. And of course the second image never won't be loaded at 0x0 but instead at another, randomly chosen location, that will offset real function position.

Base Address of ELF

I am trying to find the base address of ELF files. I know that you can use readelf to find the Program Entry Point and different section details (base address, size, flags and so on).
For example, programs for x86 architecture are based at 0x8048000 by linker. using readelf I can see the program entry point but no specific field in the output tells the base address.
$ readelf -e test
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x8048390
Start of program headers: 52 (bytes into file)
Start of section headers: 4436 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 9
Size of section headers: 40 (bytes)
Number of section headers: 30
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 08048154 000154 000013 00 A 0 0 1
[ 2] .note.ABI-tag NOTE 08048168 000168 000020 00 A 0 0 4
[ 3] .note.gnu.build-i NOTE 08048188 000188 000024 00 A 0 0 4
[ 4] .gnu.hash GNU_HASH 080481ac 0001ac 000024 04 A 5 0 4
[ 5] .dynsym DYNSYM 080481d0 0001d0 000070 10 A 6 1 4
In the section details, I can see that the Offset is calculated with respect to the base address of the ELF.
So, .dynsym section starts at address, 0x080481d0 and offset is 0x1d0. This would mean the base address is, 0x08048000. Is this correct?
similarly, for programs compiled on different architectures like PPC, ARM, MIPS, I cannot see their base address but only the OEP, Section Headers.
You need to check the segment table aka program headers (readelf -l).
Elf file type is EXEC (Executable file)
Entry point 0x804a7a0
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
INTERP 0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x08048000 0x08048000 0x10fc8 0x10fc8 R E 0x1000
LOAD 0x011000 0x08059000 0x08059000 0x0038c 0x01700 RW 0x1000
DYNAMIC 0x01102c 0x0805902c 0x0805902c 0x000f8 0x000f8 RW 0x4
NOTE 0x000168 0x08048168 0x08048168 0x00020 0x00020 R 0x4
TLS 0x011000 0x08059000 0x08059000 0x00000 0x0005c R 0x4
GNU_EH_FRAME 0x00d3c0 0x080553c0 0x080553c0 0x00c5c 0x00c5c R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
The first (lowest) LOAD segment's virtual address is the default load base of the file. You can see it's 0x08048000 for this file.
The ELF mapping base Address of the .text section is defined by the ld(1) loader script in the binutils project under script template elf.sc on Linux.
The script define the following variables used by the loader ld(1):
# TEXT_START_ADDR - the first byte of the text segment, after any
# headers.
# TEXT_BASE_ADDRESS - the first byte of the text segment.
# TEXT_START_SYMBOLS - symbols that appear at the start of the
# .text section.
You can inspect the current values using the command:
~$ ld --verbose |grep SEGMENT_START
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
. = SEGMENT_START("ldata-segment", .);
The text-segment mapping values are:
0x08048000 on 32 Bits
0x400000 on 64 Bits
Also the interpreter base address of an ELF program is defined in the Auxiliary vector array at the index AT_BASE. The Auxiliary vector array is an array of the Elf_auxv_t structure and located after the envp in the process stack. It's configured while loading the ELF binary in the function create_elf_tables() of Linux kernel fs/binfmt_elf.c. The following code snippet show how to read the value:
$ cat at_base.c
#include <stdio.h>
#include <elf.h>
int
main(int argc, char* argv[], char* envp[])
{
Elf64_auxv_t *auxp;
while(*envp++ != NULL);
for (auxp = (Elf64_auxv_t *)envp; auxp->a_type != 0; auxp++) {
if (auxp->a_type == 7) {
printf("AT_BASE: 0x%lx\n", auxp->a_un.a_val);
}
}
}
$ clang -o at_base at_base.c
$ ./at_base
AT_BASE: 0x7fcfd4025000
Linux Auxiliary Vector definition
Auxiliary Vector Reference
It used to be a fixed address on x86 32 bits architecture, but with ASLR now, it's randomized. You can use setarch i386 -R to disable randomization if you want.
It's defined in the linker script. You can dump the default linker script with ld --verbose. Example output:
GNU ld (GNU Binutils) 2.23.1
Supported emulations:
elf_x86_64
elf32_x86_64
elf_i386
i386linux
elf_l1om
elf_k1om
using internal linker script:
==================================================
/* Script for -z combreloc: combine and sort reloc sections */
OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64",
"elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)
ENTRY(_start)
SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/x86_64-unknown-linux-gnu/lib64"); SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/lib64"); SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/x86_64-unknown-linux-gnu/lib"); SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/lib");
SECTIONS
{
/* Read-only sections, merged into text segment: */
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
.interp : { *(.interp) }
.note.gnu.build-id : { *(.note.gnu.build-id) }
.hash : { *(.hash) }
.gnu.hash : { *(.gnu.hash) }
.dynsym : { *(.dynsym) }
.dynstr : { *(.dynstr) }
.gnu.version : { *(.gnu.version) }
.gnu.version_d : { *(.gnu.version_d) }
.gnu.version_r : { *(.gnu.version_r) }
(snip)
In case you missed it: __executable_start = SEGMENT_START("text-segment", 0x400000)).
And for me, sure enough, when I link a simple .o file into a binary, the entry point address is very close to 0x400000.
The entry point address in the ELF metadata is this value, plus the offset from the beginning of the .text section to the _start symbol. Note also that the _start symbol can be configured. Again from my default linker script example: ENTRY(_start).

checking if a binary compiled with "-static"

I have a binary file in linux. How can I check whether it has been compiled with "-static" or not?
ldd /path/to/binary should not list any shared libraries if the binary is statically compiled.
You can also use the file command (and objdump could also be useful).
Check if it has a program header of type INTERP
At the lower level, an executable is static if it does not have a program header with type:
Elf32_Phd.p_type == PT_INTERP
This is mentioned in the System V ABI spec.
Remember that program headers determine the ELF segments, including those of type PT_LOAD that will get loaded in to memory and be run.
If that header is present, its contents are exactly the path of the dynamic loader.
readelf check
We can observe this with readelf. First compile a C hello world dynamically:
gcc -o main.out main.c
and then:
readelf --program-headers --wide main.out
outputs:
Elf file type is DYN (Shared object file)
Entry point 0x1050
There are 11 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x000268 0x000268 R 0x8
INTERP 0x0002a8 0x00000000000002a8 0x00000000000002a8 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000560 0x000560 R 0x1000
LOAD 0x001000 0x0000000000001000 0x0000000000001000 0x0001bd 0x0001bd R E 0x1000
LOAD 0x002000 0x0000000000002000 0x0000000000002000 0x000150 0x000150 R 0x1000
LOAD 0x002db8 0x0000000000003db8 0x0000000000003db8 0x000258 0x000260 RW 0x1000
DYNAMIC 0x002dc8 0x0000000000003dc8 0x0000000000003dc8 0x0001f0 0x0001f0 RW 0x8
NOTE 0x0002c4 0x00000000000002c4 0x00000000000002c4 0x000044 0x000044 R 0x4
GNU_EH_FRAME 0x00200c 0x000000000000200c 0x000000000000200c 0x00003c 0x00003c R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x002db8 0x0000000000003db8 0x0000000000003db8 0x000248 0x000248 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
03 .init .plt .plt.got .text .fini
04 .rodata .eh_frame_hdr .eh_frame
05 .init_array .fini_array .dynamic .got .data .bss
06 .dynamic
07 .note.ABI-tag .note.gnu.build-id
08 .eh_frame_hdr
09
10 .init_array .fini_array .dynamic .got
so note the INTERP header is there, and it is so important that readelf even gave a quick preview of its short 28 (0x1c) byte contents: /lib64/ld-linux-x86-64.so.2, which is the path to the dynamic loader (27 bytes long + 1 for \0).
Note how this resides side by side with the other segments, including e.g. those that actually get loaded into memory such as: .text.
We can then more directly extract those bytes without the preview with:
readelf -x .interp main.out
which gives:
Hex dump of section '.interp':
0x000002a8 2f6c6962 36342f6c 642d6c69 6e75782d /lib64/ld-linux-
0x000002b8 7838362d 36342e73 6f2e3200 x86-64.so.2.
as explained at: How can I examine contents of a data section of an ELF file on Linux?
file source code
file 5.36 source code comments at src/readelf.c claim that it also checks for PT_INTERP:
/*
* Look through the program headers of an executable image, searching
* for a PT_INTERP section; if one is found, it's dynamically linked,
* otherwise it's statically linked.
*/
private int
dophn_exec(struct magic_set *ms, int clazz, int swap, int fd, off_t off,
int num, size_t size, off_t fsize, int sh_num, int *flags,
uint16_t *notecount)
{
Elf32_Phdr ph32;
Elf64_Phdr ph64;
const char *linking_style = "statically";
found with git grep statically from the message main.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped.
However, this comment seems to be outdated compared to the code, which instead checks for PT_DYNAMIC:
case PT_DYNAMIC:
linking_style = "dynamically";
doread = 1;
break;
I'm not sure why this is done, and I'm lazy to dig into git log now. In particular, this confused me a bit when I tried to make a statically linked PIE executable with --no-dynamic-linker as shown at: How to create a statically linked position independent executable ELF in Linux? which does not have PT_INTERP but does have PT_DYNAMIC, and which I do not expect to use the dynamic loader.
I ended up doing a deeper source analysis for -fPIE at: Why does GCC create a shared object instead of an executable binary according to file? the answer is likely there as well.
Linux kernel source code
The Linux kernel 5.0 reads the ELF file during the exec system call at fs/binfmt_elf.c as explained at: How does kernel get an executable binary file running under linux?
The kernel loops over the program headers at load_elf_binary
for (i = 0; i < loc->elf_ex.e_phnum; i++) {
if (elf_ppnt->p_type == PT_INTERP) {
/* This is the program interpreter used for
* shared libraries - for now assume that this
* is an a.out format binary
*/
I haven't read the code fully, but I would expect then that it only uses the dynamic loader if INTERP is found, otherwise which path should it use?
PT_DYNAMIC is not used in that file.
Bonus: check if -pie was used
I've explained that in detail at: Why does GCC create a shared object instead of an executable binary according to file?

Resources