Are function locations altered when running a program through GDB? - linux

I'm trying to run through a buffer overflow exercise, here is the code:
#include <stdio.h>
int badfunction() {
char buffer[8];
gets(buffer);
puts(buffer);
}
int cantrun() {
printf("This function cant run because it is never called");
}
int main() {
badfunction();
}
This is a simple piece of code. The objective is to overflow the buffer in badfunction()and override the return address having it point to the memory address of the function cantrun().
Step 1: Find the offset of the return address (in this case it's 12bytes, 8 for the buffer and 4 for the base pointer).
Step 2: Find the memory location of cantrun(), gdb say it's 0x0804849a.
When I run the program printf "%012x\x9a\x84\x04\x08" | ./vuln, I get the error "illegal instruction". This suggests to me that I have correctly overwritten the EIP, but that the memory location of cantrun() is incorrect.
I am using Kali Linux, Kernel 3.14, I have ASLR turned off and I am using execstack to allow an executable stack. Am I doing something wrong?
UPDATE:
As a shot in the dark I tried to find the correct instruction by moving the address around and 0x0804849b does the trick. Why is this different than what GDB shows. When running GDB, 0x0804849a is the location of the prelude instruction push ebp and 0x0804849b is the prelude instruction mov ebp,esp.

gdb doesn't do anything to change the locations of functions in the programs it executes. ASLR may matter, but by default gdb turns this off to enable simpler debugging.
It's hard to say why you are seeing the results you are. What does disassembling the function in gdb show?

Related

How to find the current location of the program break

I tried adding this inside the brk system call function :
void *addr = sbrk(0);
printk("current-add-is-%p-\n", addr);
But it returned error during kernel compilation that implicit declaration of sbrk function. And I could not find where sbrk is defined!!
All I need to measure that whenever some user process tries to extended its program break address, I would know its current program break address, so that I can measure how much memory processes are requesting.
Thank you.
Looks like you are trying to do something wrong.
There is no 'sbrk' syscall, there is 'brk'. Except then it would be named sys_brk, but you have no reasons to call it. So if you want to find out how to learn the current break address, read brk's sources.
However, where exactly did you put this in if you did not happen to find brk's sources?
Add this line of code:
printf("Address of program break is %p\n", (void *)sbrk(0));
It will return a message to terminal with hex address of the program break.(e.g., 0x#### #### ####.)
If you want the address in other than hex, then use %u or similar. The use of sbrk(0) is documented in man pages (linux programmers manual).
To see documentation, type in command line: man sbrk and documentation will pop up.

Why a segfault instead of privilege instruction error?

I am trying to execute the privileged instruction rdmsr in user mode, and I expect to get some kind of privilege error, but I get a segfault instead. I have checked the asm and I am loading 0x186 into ecx, which is supposed to be PERFEVTSEL0, based on the manual, page 1171.
What is the cause of the segfault, and how can I modify the code below to fix it?
I want to resolve this before hacking a kernel module, because I don't want this segfault to blow up my kernel.
Update: I am running on Intel(R) Xeon(R) CPU X3470.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <sched.h>
#include <assert.h>
uint64_t
read_msr(int ecx)
{
unsigned int a, d;
__asm __volatile("rdmsr" : "=a"(a), "=d"(d) : "c"(ecx));
return ((uint64_t)a) | (((uint64_t)d) << 32);
}
int main(int ac, char **av)
{
uint64_t start, end;
cpu_set_t cpuset;
unsigned int c = 0x186;
int i = 0;
CPU_ZERO(&cpuset);
CPU_SET(i, &cpuset);
assert(sched_setaffinity(0, sizeof(cpuset), &cpuset) == 0);
printf("%lu\n", read_msr(c));
return 0;
}
The question I will try to answer: Why does the above code cause SIGSEGV instead of SIGILL, though the code has no memory error, but an illegal instruction (a privileged instruction called from non-privileged user pace)?
I would expect to get a SIGILL with si_code ILL_PRVOPC instead of a segfault, too. Your question is currently 3 years old and today, I stumbled upon the same behavior. I am disappointed too :-(
What is the cause of the segfault
The cause seems to be that the Linux kernel code decides to send SIGSEGV. Here is the responsible function:
http://elixir.free-electrons.com/linux/v4.9/source/arch/x86/kernel/traps.c#L487
Have a look at the last line of the function.
In your follow up question, you got a list of other assembly instructions which get propagated as SIGSEGV to userspace though they are actually general protection faults. I found your question because I triggered the behavior with cli.
and how can I modify the code below to fix it?
As of Linux kernel 4.9, I'm not aware of any reliable way to distinguish between a memory error (what I would expect to be a SIGSEGV) and a privileged instruction error from userspace.
There may be very hacky and unportable way to distibguish these cases. When a privileged instruction causes a SIGSEGV, the siginfo_t si_code is set to a value which is not directly listed in the SIGSEGV section of man 2 sigaction. The documented values are SEGV_MAPERR, SEGV_ACCERR, SEGV_PKUERR, but I get SI_KERNEL (0x80) on my system. According to the man page, SI_KERNEL is a code "which can be placed in si_code for any signal". In strace, you see SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0}. The responsible kernel code is here.
It would also be possible to grep dmesg for this string.
Please, never ever use those two methods to distinguish between GPF and memory error on a production system.
Specific solution for your code: Just don't run rdmsr from user space. But this answer is really unsatisfying if you are looking for a generic way to figure out why a program received a SIGSEGV.

Gdb dump memory in specific region, save formatted output into a file

I have a buggy (memory leaked) software.
As an evidence, I have 1GB of core.dump file. Heap size is 900MB, so obviously, something allocates, but does not free the memory.
So, I have a memory region to examine like this.
(gdb) x/50000s 0x200000000
However, this is hard to guess only with naked eyes, which object or struct is not freed.
My idea to trace is, "Save gdb formatted output into a file, and run a pattern match to see which magic string comes up the most." So, here is my question:
How can I save output of following command into a textfile, so that I can write an analyzer?
(gdb) x/10000000s 0x20000000 <-- I need this output into a file
You could use the "dump" function of gdb, see: https://sourceware.org/gdb/onlinedocs/gdb/Dump_002fRestore-Files.html
For your example:
dump binary memory result.bin 0x200000000 0x20000c350
This will give you a plain binary dump int file result.bin. You can also use the following to dump it in hex format:
dump ihex memory result.bin 0x200000000 0x20000c350
Using the dump command is much clearer than using the gdb logging hack (which even did not work for me somehow).
How can I save output of following command into a textfile, so that I can write an analyzer?
(gdb) x/10000000s 0x20000000
That's actually quite easy:
(gdb) set height 0 # prevent GDB from stopping every screenfull
(gdb) set logging on # GDB output is now also copied into gdb.txt
(gdb) x/10000000s 0x20000000
(gdb) quit
Voila, enjoy your output in gdb.txt.
I have a buggy (memory leaked) software. ... "Save gdb formatted output into a file, and run a pattern match to see which magic string comes up the most."
That idea is quite unlikely to yield satisfactory results. Consider:
void some_function() {
std::vector<string> *v = new std::vector<string>();
// code to insert and use 1000s of strings into "v".
return; // Oops: forgot to delete "v".
}
Even if you could effectively "see magic string that comes up the most", you'll discover that you are leaking all the strings; but they are not the problem, leaking "v" is the problem.
So what you really want is to build a graph of which allocated regions point to other allocated regions, and find a "root" of that graph. This is nearly impossible to do by hand.
So what is more likely to help you find the memory leak(s)? Fortunately, there are lots of tools that can solve this problem for you:
Valgrind,
Google heap leak checker,
jemalloc,
... etc. etc.
you can write simple lkm will do that
lkm:
#include <linux/kernel.h>
#include <linux/module.h>
int *ptr=(int*)0Xc18251c0; //the address you want to read from kernel space
int module_i(void)
{
printk("%d\n",*ptr);
}
module_init(module_i);
and the data will show up in log
so write
enter code here
dmesg

Simple heap overflow exploit with toy example on old glibc

Consider this example of a heap buffer overflow vulnerable program in Linux, taken directly from the "Buffer Overflow Attacks" (p. 248) book:
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv)
{
char *A, *B;
A = malloc(128);
B = malloc(32);
strcpy(A, argv[1]);
free(A);
free(B);
return 0;
}
Since unlink() has been changed to prevent the most simple form of exploit using the FD and BK pointers with a sanity check, I'm using a very old system I have with an old version of glibc (version 2.3.2). I'm also setting MALLOC_CHECK_=0 for this testing.
My goal of this toy example is to simply see if I can write 4 bytes to some arbitrary address I specify. The most simple test I can think of is to try write something to 0x41414141, which is an illegal address and should let the program crash to just confirm to me that it is indeed trying to write to this address (something I should be able to observe in GDB).
So I try executing with the argument perl -e 'print "A"x128 . "\xf8\xff\xff\xff" . "\xf8\xff\xff\xff" . "\x41\x41\x41\x41" . "\x41\x41\x41\x41" '
So I have:
Buffer A: 128 bytes of 0x41.
prev_size: 0xfffffff8
size: 0xfffffff8
FD: 0x41414141
BK: 0x41414141
I'm using 0xfffffff8 instead of 0xfffffffc because there is a note that with glibc 2.3 the third lowest bit NON_MAIN_AREA is used for management purposes for the arenas and has to be 0.
This should attempt to write 0x41414141 to 0x41414141 (+ 12 to be more correct, but still an illegal address), correct? However, when I execute this, the program simply terminates normally.
What am I missing here? This seems simple enough that it shouldn't be that hard to get to work.
I've tried various things such as using 0xfffffffc instead for prev_size and size, using legal addresses for FD (some address on the heap). I've tried swapping the order A and B are free()'d, I've tried to step into free() to see what happens in GDB but I got lost. Note that there shouldn't be any other security features on this system as it is very old and wouldn't have NX-bit, ASLR, etc (not that it should matter for the purpose of just writing 4 bytes to an illegal address).
Any ideas for how to make this work?
I could add that if using MALLOC_CHECK_=3 I get this:
malloc: using debugging hooks
malloc: using debugging hooks
free(): invalid pointer 0x8049688!
Program received signal SIGABRT, Aborted.
0x4004a1b1 in kill () from /lib/libc.so.6

injecting shellcode

i have a .cmd file on a webserver with a variable user="...", vulnerable against buffer overflows. I can execute the .cmd file via ssh or via web.
Now i have this shellcode:
#include <stdio.h>
char sc[] =
"
...
";
void main(void)
{
void(*s)(void);
printf("size: %d\n", sizeof(sc));
s = sc;
s();
}
my problem is, i don't know how this all plays together. I know what the Assembler and the C code does, but how do i inject the code into the running cmd file?
cat "shellcode " | nc host cmd
You generally have to insert enough data so that your write ends up in part of the program's memory that gets executed. How to do that exactly depends entirely on the structure of the program with the overflow.
But, imagine if the program were specified by this ASM listing:
[SECTION .text]
global _start
_start:
;;...
jmpagain:
jmp next
uname db "username"
next:
mov eax,uname
;;...
jmp jmpagain
the string "username" is in memory immediately adjacent to an address that the instruction pointer visits. If the program writes some data to that area of memory without checking it's bounds, it will overwrite the code, and anytime the instruction pointer revisits the next function, the machine is going to execute whatever data overflowed in that part of memory. Supposing the write you are exploiting starts at the beginning of the string, and stops writing on some condition that provides enough room for you to inject your shellcode, you would prepend a byte string of the same length as "username" to your shellcode in the input. Then the beginning of your shellcode would be at the address of the next label.
But this is just a simple example demonstration of the basic principle. Actually getting your data to an area of memory that the instruction pointer visits is likely going to be a lot trickier. If you have access to the command file in question, you can debug it and dump the memory and trace how the area of memory is written to to see how you need to overflow the buffer to get to IP reachable memory.
It's important to reiterate that your shellcode not only needs to make it to memory that the instruction pointer passes over, it needs to be correctly aligned in that memory to execute in the way you expect it to. If the instruction pointer lands somewhere in the middle of your code rather than at the beginning of the first instruction, it obviously isn't going to do what you expect it to.

Resources