How is the address of the text section of a PIE executable determined in Linux? - linux

First I tried to reverse engineer it a bit:
printf '
#include <stdio.h>
int main() {
puts("hello world");
}
' > main.c
gcc -std=c99 -pie -fpie -ggdb3 -o pie main.c
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
readelf -s ./pie | grep -E 'main$'
gdb -batch -nh \
-ex 'set disable-randomization off' \
-ex 'start' -ex 'info line' \
-ex 'start' -ex 'info line' \
-ex 'set disable-randomization on' \
-ex 'start' -ex 'info line' \
-ex 'start' -ex 'info line' \
./pie \
;
Output:
64: 000000000000063a 23 FUNC GLOBAL DEFAULT 14 main
Temporary breakpoint 1, main () at main.c:4
4 puts("hello world");
Line 4 of "main.c" starts at address 0x5575f5fd263e <main+4> and ends at 0x5575f5fd264f <main+21>.
Temporary breakpoint 2 at 0x5575f5fd263e: file main.c, line 4.
Temporary breakpoint 2, main () at main.c:4
4 puts("hello world");
Line 4 of "main.c" starts at address 0x55e3fbc9363e <main+4> and ends at 0x55e3fbc9364f <main+21>.
Temporary breakpoint 3 at 0x55e3fbc9363e: file main.c, line 4.
Temporary breakpoint 3, main () at main.c:4
4 puts("hello world");
Line 4 of "main.c" starts at address 0x55555555463e <main+4> and ends at 0x55555555464f <main+21>.
Temporary breakpoint 4 at 0x55555555463e: file main.c, line 4.
Temporary breakpoint 4, main () at main.c:4
4 puts("hello world");
Line 4 of "main.c" starts at address 0x55555555463e <main+4> and ends at 0x55555555464f <main+21>.
which indicates that it is 0x555555554000 + random offset + 63e.
But then I tried to grep the Linux kernel and glibc source code for 555555554 and there were no hits.
Which part of which code calculates that address?
I came across this while answering: What is the -fPIE option for position-independent executables in gcc and ld?

Some Internet search of 0x555555554000 gives hints: there were problems with ThreadSanitizer https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual
Q: When I run the program, it says: FATAL: ThreadSanitizer can not mmap the shadow memory (something is mapped at 0x555555554000 < 0x7cf000000000). What to do? You need to enable ASLR:
$ echo 2 >/proc/sys/kernel/randomize_va_space
This may be fixed in future kernels, see https://bugzilla.kernel.org/show_bug.cgi?id=66721
...
$ gdb -ex 'set disable-randomization off' --args ./a.out
and https://lwn.net/Articles/730120/ "Stable kernel updates." Posted Aug 7, 2017 20:40 UTC (Mon) by hmh (subscriber) https://marc.info/?t=150213704600001&r=1&w=2
(https://patchwork.kernel.org/patch/9886105/, commit c715b72c1ba4)
Moving the x86_64 and arm64 PIE base from 0x555555554000 to 0x000100000000
broke AddressSanitizer. This is a partial revert of:
commit eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE") (https://patchwork.kernel.org/patch/9807325/ https://lkml.org/lkml/2017/6/21/560)
commit 02445990a96e ("arm64: move ELF_ET_DYN_BASE to 4GB / 4MB") (https://patchwork.kernel.org/patch/9807319/)
Reverted code was:
b/arch/arm64/include/asm/elf.h
/*
* This is the base location for PIE (ET_DYN with INTERP) loads. On
- * 64-bit, this is raised to 4GB to leave the entire 32-bit address
+ * 64-bit, this is above 4GB to leave the entire 32-bit address * space open for things that want to use the area for 32-bit pointers. */
-#define ELF_ET_DYN_BASE 0x100000000UL
+#define ELF_ET_DYN_BASE (2 * TASK_SIZE_64 / 3)
+++ b/arch/x86/include/asm/elf.h
/*
* This is the base location for PIE (ET_DYN with INTERP) loads. On
- * 64-bit, this is raised to 4GB to leave the entire 32-bit address
+ * 64-bit, this is above 4GB to leave the entire 32-bit address
* space open for things that want to use the area for 32-bit pointers.
*/
#define ELF_ET_DYN_BASE (mmap_is_ia32() ? 0x000400000UL : \
- 0x100000000UL)
+ (TASK_SIZE / 3 * 2))
So, 0x555555554000 is related with ELF_ET_DYN_BASE macro (referenced in fs/binfmt_elf.c for ET_DYN as not randomized load_bias) and for x86_64 and arm64 it is like 2/3 of TASK_SIZE. When there is no CONFIG_X86_32, x86_64 has TASK_SIZE of 2^47 - one page in arch/x86/include/asm/processor.h
/*
* User space process size. 47bits minus one guard page. The guard
* page is necessary on Intel CPUs: if a SYSCALL instruction is at
* the highest possible canonical userspace address, then that
* syscall will enter the kernel with a non-canonical return
* address, and SYSRET will explode dangerously. We avoid this
* particular problem by preventing anything from being mapped
* at the maximum canonical address.
*/
#define TASK_SIZE_MAX ((1UL << 47) - PAGE_SIZE)
Older versions:
/*
* User space process size. 47bits minus one guard page.
*/
#define TASK_SIZE_MAX ((1UL << 47) - PAGE_SIZE)
Newer versions also have support of 5level with __VIRTUAL_MASK_SHIFT of 56 bit - v4.17/source/arch/x86/include/asm/processor.h (but don't want to use it before enabled by user + commit b569bab78d8d ".. Not all user space is ready to handle wide addresses")).
So, 0x555555554000 is rounded down (by load_bias = ELF_PAGESTART(load_bias - vaddr);, vaddr is zero) from the formula (2^47-1page)*(2/3) (or 2^56 for larger systems):
$ echo 'obase=16; (2^47-4096)/3*2'| bc -q
555555554AAA
$ echo 'obase=16; (2^56-4096)/3*2'| bc -q
AAAAAAAAAAA000
Some history of 2/3 * TASK_SIZE:
commit 9b1bbf6ea9b2 "use ELF_ET_DYN_BASE only for PIE" has usefull comments: "The ELF_ET_DYN_BASE position was originally intended to keep loaders away from ET_EXEC binaries ..."
Don't overflow 32bits with 2*TASK_SIZE "[uml-user] [PATCH] x86, UML: fix integer overflow in ELF_ET_DYN_BASE", 2015 and "ARM: 8320/1: fix integer overflow in ELF_ET_DYN_BASE", 2015:
Almost all arches define ELF_ET_DYN_BASE as 2/3 of TASK_SIZE. Though
it seems that some architectures do this in a wrong way. The problem
is that 2*TASK_SIZE may overflow 32-bits so the real ELF_ET_DYN_BASE
becomes wrong. Fix this overflow by dividing TASK_SIZE prior to
multiplying: (TASK_SIZE / 3 * 2)
Same in 4.x, 3.y, 2.6.z, (where is davej-history git repo? archive and at or.cz) 2.4.z, ... added in 2.1.54 of 06-Sep-1997
diff --git a/include/asm-i386/elf.h b/include/asm-i386/elf.h
+/* This is the location that an ET_DYN program is loaded if exec'ed. Typical
+ use of this is to invoke "./ld.so someprog" to test out a new version of
+ the loader. We need to make sure that it is out of the way of the program
+ that it will "exec", and that there is sufficient room for the brk. */
+
+#define ELF_ET_DYN_BASE (2 * TASK_SIZE / 3)

Related

creating Linux i386 a.out executable shorter than 4097 bytes

I'm trying to create a Linux i386 a.out executable shorter than 4097 bytes, but all my efforts have failed so far.
I'm compiling it with:
$ nasm -O0 -f bin -o prog prog.nasm && chmod +x prog
I'm testing it in a Ubuntu 10.04 i386 VM running Linux 2.6.32 with:
$ sudo modprobe binfmt_aout
$ sudo sysctl vm.mmap_min_addr=4096
$ ./prog; echo $?
Hello, World!
0
This is the source code of the 4097-byte executable which works:
; prog.nasm
bits 32
cpu 386
org 0x1000 ; Linux i386 a.out QMAGIC file format has this.
SECTION_text:
a_out_header:
dw 0xcc ; magic=QMAGIC; Demand-paged executable with the header in the text. The first page (0x1000 bytes) is unmapped to help trap NULL pointer references.
db 0x64 ; type=M_386
db 0 ; flags=0
dd SECTION_data - SECTION_text ; a_text=0x1000 (byte size of .text; mapped as r-x)
dd SECTION_end - SECTION_data ; a_data=0x1000 (byte size of .data; mapped as rwx, not just rw-)
dd 0 ; a_bss=0 (byte size of .bss)
dd 0 ; a_syms=0 (byte size of symbol table data)
dd _start ; a_entry=0x1020 (in-memory address of _start == file offset of _start + 0x1000)
dd 0 ; a_trsize=0 (byte size of relocation info or .text)
dd 0 ; a_drsize=0 (byte size of relocation info or .data)
_start: mov eax, 4 ; __NR_write
mov ebx, 1 ; argument: STDOUT_FILENO
mov ecx, msg ; argument: address of string to output
mov edx, msg_end - msg ; argument: number of bytes
int 0x80 ; syscall
mov eax, 1 ; __NR_exit
xor ebx, ebx ; argument: EXIT_SUCCESS == 0.
int 0x80 ; syscall
msg: db 'Hello, World!', 10
msg_end:
times ($$ - $) & 0xfff db 0 ; padding to multiple of 0x1000 ; !! is this needed?
SECTION_data: db 0
; times ($$ - $) & 0xfff db 0 ; padding to multiple of 0x1000 ; !! is this needed?
SECTION_end:
How can I make the executable file smaller? (Clarification: I still want a Linux i386 a.out executable. I know that that it's possible to create a smaller Linux i386 ELF executable.) There is several thousands bytes of padding at the end of the file, which seems to be required.
So far I've discovered the following rules:
If a_text or a_data is 0, Linux doesn't run the program. (See relevant Linux source block 1 and 2.)
If a_text is not a multiple of 0x1000 (4096), Linux doesn't run the program. (See relevant Linux source block 1 and 2.)
If the file is shorter than a_text + a_data bytes, Linux doesn't run the program. (See relevant Linux source code location.)
Thus file_size >= a_text + a_data >= 0x1000 + 1 == 4097 bytes.
The combinations nasm -f aout + ld -s -m i386linux and nasm -f elf + ld -s -m i386linux and as -32 + ld -s -m i386linux produce an executable of 4100 bytes, which doesn't even work (because its a_data is 0), and by adding a single byte to section .data makes the executable file 8196 bytes long, and it will work. Thus this path doesn't lead to less than 4097 bytes.
Did I miss something?
TL;DR It doesn't work.
It is impossible to make a Linux i386 a.out QMAGIC executable shorter than 4097 bytes work on Linux 2.6.32, based on evidence in the Linux kernel source code of the binfmt_aout module.
Details:
If a_text is 0, Linux doesn't run the program. (Evidence for this check: a_text is passed as the length argument to mmap(2) here.)
If a_data is 0, Linux doesn't run the program. (Evidence for this check: a_data is passed as the length argument to mmap(2) here.)
If a_text is not a multiple of 0x1000 (4096), Linux doesn't run the program. (Evidence for this check: fd_offset + ex.a_text is passed as the offset argument to mmap(2) here. For QMAGIC, fd_offset is 0.)
If the file is shorter than a_text + a_data bytes, Linux doesn't run the program. (Evidence for this check: file sizes is compared to a_text + a_data + a_syms + ... here.)
Thus file_size >= a_text + a_data >= 0x1000 + 1 == 4097 bytes.
I've also tried OMAGIC, ZMAGIC and NMAGIC, but none of them worked. Details:
For OMAGIC, read(2) is used instead of mmap(2) within here, thus it can work. However, Linux tries to load the code to virtual memory address 0 (N_TXTADDR is 0), and this causes SIGKILL (if non-root and vm.mmap_min_addr is larger than 0) or SIGILL (otherwise), thus it doesn't work. Maybe the reason for SIGILL is that the page allocated by set_brk is not executable (but that should be indicated by SIGSEGV), this could be investigated further.
For ZMAGIC and NMAGIC, read(2) instead of mmap(2) within here if fd_offset is not a multiple of the page size (0x1000). fd_offset is 32 for NMAGIC, and 1024 for ZMAGIC, so good. However, it doesn't work for the same reason (load to virtual memory address 0).
I wonder if it's possible to run OMAGIC, ZMAGIC or NMAGIC executables at all on Linux 2.6.32 or later.

Why does my RIP value change after overwriting via an overflow?

I've been working on a buffer overflow on a 64 bit Linux machine for the past few days. The code I'm attacking takes in a file. This original homework ran on a 32-bit system, so a lot is differing. I thought I'd run with it and try to learn something new along the way. I set sudo sysctl -w kernel.randomize_va_space=0 and compiled the program below gcc -o stack -z execstack -fno-stack-protector stack.c && chmod 4755 stack
/* This program has a buffer overflow vulnerability. */
/* Our task is to exploit this vulnerability */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
int bof(char *str)
{
char buffer[12];
/* The following statement has a buffer overflow problem */
strcpy(buffer, str);
return 1;
}
int main(int argc, char **argv)
{
char str[517];
FILE *badfile;
badfile = fopen("badfile", "r");
fread(str, sizeof(char), 517, badfile);
bof(str);
printf("Returned Properly\n");
return 1;
}
I could get it to crash by making a file with 20 "A"s. I made a small script to help.
#!/usr/bin/bash
python -c 'print("A"*20)' | tr -d "\n" > badfile
So now, if I add 6 "B"s to the mix after hitting the SIGSEGV in gbd I get.
RIP: 0x424242424242 ('BBBBBB')
0x0000424242424242 in ?? ()
Perfect! Now we can play with the RIP and put our address in for it to jump to! This is where I'm getting a little confused.
I added some C's to the script to try to find a good address to jump to
python -c 'print("A"*20 + "B"*6 + C*32)' | tr -d "\n" > badfile
after creating the file and getting the SEGFAULT my RIP address is no longer our B's
RIP: 0x55555555518a (<bof+37>: ret)
Stopped reason: SIGSEGV
0x000055555555518a in bof (
str=0x7fffffffe900 'C' <repeats 14 times>)
at stack.c:16
I thought this might be due to using B's, so I went ahead and tried to find an address. I then ran x/100 $rsp in gbd, but it looks completely different than before without the Cs
# Before Cs
0x7fffffffe8f0: 0x00007fffffffec08 0x0000000100000000
0x7fffffffe900: 0x4141414141414141 0x4141414141414141
0x7fffffffe910: 0x4242424241414141 0x0000000000004242
# After Cs
0x7fffffffe8e8: "BBBBBB", 'C' <repeats 32 times>
0x7fffffffe90f: "AAAAABBBBBB", 'C' <repeats 32 times>
0x7fffffffe93b: ""
I've been trying to understand why this is happening. I know after this I can supply an address plus code to get a shell like so
python -c 'print("noOPs"*20 + "address" + "shellcode")' | tr -d "\n" > badfile
The only thing that has come to mind is the buffer size? I'm not too sure, though. Any help would be great. Doing this alone without help has made me learn a lot. I'm just dying to create a working exploit!

why does memory address change in Ubuntu and not in Redhat

I have this program:
double t;
main() {
}
On Ubuntu, I run:
% gdb a.out
(gdb) p &t
$1 = (double *) 0x4010 <t>
(gdb) run
Starting program: /home/phan/a.out
[Inferior 1 (process 95930) exited normally]
(gdb) p &t
$2 = (double *) 0x555555558010 <t>
Why did the address change from 0x4010 to 0x555555558010. Is there someway to prevent this?
On Redhat, it doesn't do that:
% gdb a.out
(gdb) p &t
$1 = (double *) 0x601038 <t>
(gdb) r
Starting program: /home/phan/a.out
[Inferior 1 (process 23337) exited normally]
(gdb) p &t
$2 = (double *) 0x601038 <t>
BTW, this only occurs in Ubuntu 18.04. In Ubuntu 16.04, it works exactly as Redhat, for example the address is the same before and after.
You are presumably seeing pre and post-relocation addresses for the .bss segment.
You can avoid this by disabling position independent executables, thus making gcc choose the final address of the .bss register up front:
gcc -no-pie foo.c
-static would have the effect.
I don't know why there'd be a difference between Ubuntu and Redhat though.

How are function sizes calculated by readelf

I am trying to understand how readelf utility calculates function size. I wrote a simple program
#include <stdio.h>
int main() {
printf("Test!\n");
}
Now to check function size I used this (is this OK ? ):
readelf -sw a.out|sort -n -k 3,3|grep FUNC
which yielded:
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND puts#GLIBC_2.2.5 (2)
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main#GLIBC_2.2.5 (2)
29: 0000000000400470 0 FUNC LOCAL DEFAULT 13 deregister_tm_clones
30: 00000000004004a0 0 FUNC LOCAL DEFAULT 13 register_tm_clones
31: 00000000004004e0 0 FUNC LOCAL DEFAULT 13 __do_global_dtors_aux
34: 0000000000400500 0 FUNC LOCAL DEFAULT 13 frame_dummy
48: 0000000000000000 0 FUNC GLOBAL DEFAULT UND puts##GLIBC_2.2.5
50: 00000000004005b4 0 FUNC GLOBAL DEFAULT 14 _fini
51: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main##GLIBC_
58: 0000000000400440 0 FUNC GLOBAL DEFAULT 13 _start
64: 00000000004003e0 0 FUNC GLOBAL DEFAULT 11 _init
45: 00000000004005b0 2 FUNC GLOBAL DEFAULT 13 __libc_csu_fini
60: 000000000040052d 16 FUNC GLOBAL DEFAULT 13 main
56: 0000000000400540 101 FUNC GLOBAL DEFAULT 13 __libc_csu_init
Now if I check the main function's size, it shows 16. How did it arrive at that? Is that the stack size ?
Compiler used gcc version 4.8.5 (Ubuntu 4.8.5-2ubuntu1~14.04.1)
GNU readelf (GNU Binutils for Ubuntu) 2.24
ELF symbols have an attribute st_size which specifies their size (see <elf.h>):
typedef struct
{
...
Elf32_Word st_size; /* Symbol size */
...
} Elf32_Sym;
This attribute is generated by the toolchain generating the binary; e.g. when looking at the assembly code generated by the C compiler:
gcc -c -S test.c
cat test.s
you will see something like
.globl main
.type main, #function
main:
...
.LFE0:
.size main, .-main
where .size is a special as pseudo op.
update:
.size is the size of the code.
Here, .size gets assigned the result of . - main, where "." is the actual address and main the address where main() starts.
In the 'cat test.s' output provided by ensc, the "secret sauce" is the "..." between 'main:' and '.LFE0:'; those are the assembly instructions generated by the compiler that implement the call to printf(). The corresponding machine code for each assembler instruction occupies some number of bytes; "." is incremented by the number of bytes used by each instruction, so at the end of main, ". - main" is the total number of bytes occupied by the machine instructions for main().
It's the compiler that determines which sequences of assembly instructions are needed to execute the printf() call, and that varies from target to target, and from optimization level to optimization level. In your case, your compiler generated machine code occupying 16 bytes. The '.size main, . - main' caused the assembler to create the ELF symbol for main() with its st_size field set to 16.
readelf read the ELF symbol for main, saw that its st_size field was 16, and dutifully reported 16 as the size for main(). readelf doesn't 'calculate' a function's size -- it just reports the st_size field for the function's ELF symbol. The calculation is done by the assembler, when it interprets the '.size main, . - main' directive.

Interactively toggle output on and off in Linux?

For controlling output on Linux there is control-s and control-t, which provides a method for temporarily halting terminal output and then resuming it. On VMS in addition there was control-O, which would toggle all output on and off. This didn't pause output, it discarded it.
Is there an equivalent keyboard shortcut in Linux?
This comes up most often for me in gdb, when debugging programs which output millions of status lines. It would be very convenient to be able to temporarily send most of that to /dev/null rather than the screen, and then pick up with the output stream further on, having dispensed with a couple of million lines in between.
(Edited: The termios(3) man page mentions VDISCARD - and then says that it isn't going to work in POSIX or Linux. So it looks like this is out of the question for general command line use on linux. gdb might still be able to discard output though, through one of its own commands. Can it?)
Thanks.
On VMS in addition there was control-O ...
This functionality doesn't appear to exist on any UNIX system I've ever dealt with (or maybe I just never knew it existed; it's documented in e.g. FreeBSD man page, and is referenced by Solaris and HP-UX docs as well).
gdb might still be able to discard output though, through one of its own commands. Can it?
I don't believe so: GDB doesn't actually intercept the output from the inferior (being debugged) process, it simply makes it run (between breakpoints) with the inferior output going to wherever it's going.
That said, you could do it yourself:
#include <stdio.h>
int main()
{
int i;
for (i = 0; i < 1000; ++i) {
printf("%d\n", i);
}
}
gcc -g foo.c
gdb -q ./a.out
(gdb) break 6
Breakpoint 1 at 0x40053e: file foo.c, line 6.
(gdb) run 20>/dev/null # run the program, file descriptor 20 goes to /dev/null
Starting program: /tmp/a.out 20>/dev/null
Breakpoint 1, main () at foo.c:6
6 printf("%d\n", i);
(gdb) c
Continuing.
0
Breakpoint 1, main () at foo.c:6
6 printf("%d\n", i);
We've now run two iterations. Let's prevent further output for 100 iterations:
(gdb) call dup2(20, 1)
$1 = 1
(gdb) ign 1 100
Will ignore next 100 crossings of breakpoint 1.
(gdb) c
Continuing.
Breakpoint 1, main () at foo.c:6
6 printf("%d\n", i);
(gdb) p i
$2 = 102
No output, as desired. Now let's restore output:
(gdb) call dup2(2, 1)
$3 = 1
(gdb) ign 1 10
Will ignore next 10 crossings of breakpoint 1.
(gdb) c
Continuing.
102
103
104
105
106
107
108
109
110
111
112
Breakpoint 1, main () at foo.c:6
6 printf("%d\n", i);
Output restored!

Resources