Stack Overflow (Shellcoder's Handbook) - security

I'm currently following the erratas for the Shellcoder's Handbook (2nd edition).
The book is a little outdated but pretty good still. My problem right now is that I can't guess how long my payload needs to be I tried to follow every step (and run gdb with the same arguments) and I tried to guess where the buffer starts, but I don't know exactly. I'm kind of new to this too so it makes sense.
I have a vulnerable program with strcpy() and a buffer[512]. I want to make the stack overflow, so I run some A's with the program (as the Erratas for the Shellcoders Handbook). I want to find how long the payload needs to be (no ASLR) so in theory I just need to find where the buffer is.
Since I'm new I can't post an image, but the preferred output from the book has a full 4 row of 'A's (0x41414141), and mine is like this:
(gdb) x/20xw $esp - 532
0xbffff968 : 0x0000000 0xbfffffa0e 0x41414141 0x41414141
0xbffff968 0x41414141 0x41414141 0x00004141 0x0804834
What address is that? How I know where this buffer starts? I want to do this so I can keep working with the book. I realize that the buffer is somewhere in there because of the A's that I ran. But if I want to find how long the payload needs to be I need the point where it starts.

I'm not sure that you copied the output of gdb correctly. You used the command x/20xw, this says you'd like to examine 20 32-bit words of memory, displayed as hex. As such, each item of data displayed should consist of 0x followed by 8 characters. You have some some with only 7, and some with 9. I'll assume that you copied out the text by hand and made a few mistakes.
The address is the first item displayed on the line, so, for the first line the address is 0xbffff968, this is the address of the first byte on the line. From there you can figure out the address of every other byte on the line by counting.
Your second line looks a little messed up, you have the same address, and also you're missing the : character, again, I'll assume this is just a result of the copy. I would expect the address of the second line to be 0xbffff978.
If the buffer starts with the first word of 0x41414141 then this is at address 0xbffff970, though an easier way to figure out the address of a variable is just to ask gdb for the address of the variable, so, in your case, once gdb is stopped at a place where buffer is in scope:
(gdb) p &buffer
$1 = (char (*)[512]) 0xbffff970

Metasploit has a nice tool to assist with calculating the offset. It will generate a string that contains unique patterns. Using this pattern (and the value of EIP or whatever other location after using the pattern), you can see how big the buffer should be to write exactly into EIP or whatever other location.
Open the tools folder in the metasploit framework3 folder (I’m using a linux version of metasploit 3). You should find a tool called pattern_create.rb. Create a pattern of 5000 characters and write it into a file:
root#bt:/pentest/exploits/framework3/tools# ./pattern_create.rb
Usage: pattern_create.rb length [set a] [set b] [set c]
root#bt:/pentest/exploits/framework3/tools# ./pattern_create.rb 5000
Then just replace the A's with the output of the tool.
Run the application and wait until the application dies again, and take note of the contents of EIP or whatever other location.
Then use a second metasploit tool to calculate the exact length of the buffer before writing into EIP or whatever other location, feed it with the value of EIP or whatever other location(based on the pattern file) and length of the buffer :
root#bt:/pentest/exploits/framework3/tools# ./pattern_offset.rb 0x356b4234 5000
1094
root#bt:/pentest/exploits/framework3/tools#
That’s the buffer length needed to overwrite EIP or whatever other location.

Related

How to deal with a bad char in a shellcode buffer overflow?

So I got recently interested in buffer overflow and many tutorials and recourses online have this CTF like attack where you need to read the content of a flag file (using cat for example).
So I started looking online for assembly examples of how to do this and I came accross sites like this or shell-storm where there are plenty of examples on how to do this.
So I generated my exploit and got this machine code (it basically executes a shell doing cat flag):
shellcode = b'\x31\xc0\x50\x68\x2f\x63\x61\x74\x68\x2f\x62\x69\x6e\x89\xe3\x50\x68\x66\x6c\x61\x67\x89\xe1\x50\x51\x53\x89\xe1\x31\xc0\x83\xc0\x0b\xcd\x80'
The problem is that, thanks to stepping in with GDB to debug the problem, I noticed that my buffer doesn't get copied starting with \x0b towards the end of the shell code. I know the problem is there because if I change it to say \x3b then it works (with the rest of my exploits not copied here) even if it obviously crashes when it reaches the wrong value there but at least the whole buffer gets copied. Now doing some research it seems like \x0b is a "bad char" which can cause issues and should be avoided. Having said this I don't understand how:
All those online and even university tutorials use that shell code
for this exact task.
How to potentially fix this. Is it even possible without completely
change the assembly code?
I will add that I am on Ubuntu and trying to make this work on 64 bits.
One thing that's special about byte 0x0b is it's ASCII Vertical Tab, which is considered a whitespace character.
So I'm going to make a wild guess that the code you're exploiting looks something like
// Dangerous code, DO NOT USE
char buf[TOO_SMALL];
scanf("%s", buf);
since scanf("%s") is a commonly (mis)used input mechanism that stops when it hits whitespace. If so, then if your shellcode contains 0x0b or any other whitespace character, it will get truncated.
To your first question, as to "why do other tutorials use shellcode like this", they may be thinking instead of exploiting code like
// Dangerous code, DO NOT USE
char buf[TOO_SMALL];
gets(buf);
where gets() will not stop reading at 0x0b but only at newline 0x0a. Or maybe they are thinking of a buffer filled by strcpy() which will only stop at 0x00, or maybe a buffer filled by read() with a user-controlled size which will read the full amount of data no matter what bytes it contains. So the question of which characters are "bad" depends on what the vulnerable code actually does.
As to how to handle it, well, you need to modify your shellcode to use only instructions that don't contain any whitespace bytes. This sort of thing is more an art than a science; you have to know your instruction set well, and be creative in thinking about alternative instruction sequences to achieve the desired result. Sometimes you may be able to do it with minor tweaks; other times a wholesale rewrite may be needed. It really varies.
In this case, luckily the 0x0b is the only whitespace character in the whole code, and it appears in the instruction
83C00B add eax, 0x0b
Since eax was previously zeroed, the goal is to load it with the value 0xb which is the system call number of execve. When the "bad byte" appears as part of immediate data, it is usually not too hard to find another way to get that data to where it needs to go. (Life is harder when the bad byte is part of the opcode itself.) In this case, a simple solution is to take advantage of two's complement, and write instead
83E8F5 sub eax, -0x0b
The single byte -0x0b = 0xf5 gets sign-extended to 32 bits and used as the value to subtract, which leaves 0x0b in eax as desired. Of course there are lots of other ways, some of which may have smaller code size; I'll leave this to your ingenuity.
To find out the "bad char" for the shellcode is an important step to exploit an overflow vulneribility.
first, you have to figure out how many bits the target can be overflow (this field is also for the shellcode). if this zone is big enough and you can use all the "char"(google bad char from \x01 to \xff. \x00 is bad char) to act as shellcode send to target.
Then you can get find the Register to see what the char left.(if the zone is not big enough for all the chars you can send just some chars one time and repeat)
you can follow this https://netsec.ws/?p=180.

Buffer overflow exploitation 101

I've heard in a lot of places that buffer overflows, illegal indexing in C like languages may compromise the security of a system. But in my experience all it does is crash the program I'm running. Can anyone explain how buffer overflows could cause security problems? An example would be nice.
I'm looking for a conceptual explanation of how something like this could work. I don't have any experience with ethical hacking.
First, buffer overflow (BOF) are only one of the method of gaining code execution. When they occur, the impact is that the attacker basically gain control of the process. This mean that the attacker will be able to trigger the process in executing any code with the current process privileges (depending if the process is running with a high or low privileged user on the system will respectively increase or reduce the impact of exploiting a BOF on that application). This is why it is always strongly recommended to run applications with the least needed privileges.
Basically, to understand how BOF works, you have to understand how the code you have build gets compiled into machine code (ASM) and how data managed by your software is stored in memory.
I will try to give you a basic example of a subcategory of BOF called Stack based buffer overflows :
Imagine you have an application asking the user to provide a username.
This data will be read from user input and then stored in a variable called USERNAME. This variable length has been allocated as a 20 byte array of chars.
For this scenario to work, we will consider the program's do not check for the user input length.
At some point, during the data processing, the user input is copied to the USERNAME variable (20bytes) but since the user input is longer (let's say 500 bytes) data around this variable will be overwritten in memory :
Imagine such memory layout :
size in bytes 20 4 4 4
data [USERNAME][variable2][variable3][RETURN ADDRESS]
If you define the 3 local variables USERNAME, variable2 and variable3 the may be store in memory the way it is shown above.
Notice the RETURN ADDRESS, this 4 byte memory region will store the address of the function that has called your current function (thanks to this, when you call a function in your program and readh the end of that function, the program flow naturally go back to the next instruction just after the initial call to that function.
If your attacker provide a username with 24 x 'A' char, the memory layout would become something like this :
size in bytes 20 4 4 4
data [USERNAME][variable2][variable3][RETURN ADDRESS]
new data [AAA...AA][ AAAA ][variable3][RETURN ADDRESS]
Now, if an attacker send 50 * the 'A' char as a USERNAME, the memory layout would looks like this :
size in bytes 20 4 4 4
data [USERNAME][variable2][variable3][RETURN ADDRESS]
new data [AAA...AA][ AAAA ][ AAAA ][[ AAAA ][OTHER AAA...]
In this situation, at the end of the execution of the function, the program would crash because it will try to reach the address an invalid address 0x41414141 (char 'A' = 0x41) because the overwritten RETURN ADDRESS doesn't match a correct code address.
If you replace the multiple 'A' with well thought bytes, you may be able to :
overwrite RETURN ADDRESS to an interesting location.
place "executable code" in the first 20 + 4 + 4 bytes
You could for instance set RETURN ADDRESS to the address of the first byte of the USERNAME variable (this method is mostly no usable anymore thanks to many protections that have been added both to OS and to compiled programs).
I know it is quite complex to understand at first, and this explanation is a very basic one. If you want more detail please just ask.
I suggest you to have a look at great tutorials like this one which are quite advanced but more realistic

How to tell compiler to pad a specific amount of bytes on every C function?

I'm trying to practice some live instrumentation and I saw there was a linker option -call-nop=prefix-nop, but it has some restriction as it only works with GOT function (I don't know how to force compiler to generate GOT function, and not sure if it's good idea for performance reason.) Also, -call-nop=* cannot pad more than 1 byte.
Ideally, I'd like to see a compiler option to pad any specific amount of bytes, and compiler will still perform all the normal function alignment.
Once I have this pad area, I can at run time to reuse these padding area to store some values or redirect the control flow.
P.S. I believe Linux kernel use similar trick to dynamically enable some software tracepoint.
-pg is intended for profile-guided optimization. The correct option for this is -fpatchable-function-entry
-fpatchable-function-entry=N[,M]
Generate N NOPs right at the beginning of each function, with the function entry point before the Mth NOP. If M is omitted, it defaults to 0 so the function entry points to the address just at the first NOP. The NOP instructions reserve extra space which can be used to patch in any desired instrumentation at run time, provided that the code segment is writable. The amount of space is controllable indirectly via the number of NOPs; the NOP instruction used corresponds to the instruction emitted by the internal GCC back-end interface gen_nop. This behavior is target-specific and may also depend on the architecture variant and/or other compilation options.
It'll insert N single-byte 0x90 NOPs and doesn't make use of multi-byte NOPs thus performance isn't as good as it should, but you probably don't care about that in this case so the option should work fine
I achieved this goal by implement my own mcount function in an assembly file and compile the code with -pg.

Windows console application with gets() ROP exploit

I'm trying (for learning purposes) to take advantage of gets() function vulnerability using return-oriented programming (ROP) technique. The target program is a Windows console application that in some point asks for some input, and then uses gets() to store the input in the local 80 characters long array.
I created a file that contains 80 'a' characters in the beginning + some extra characters + 0x5da06c48 address for overwriting the old EIP pointer.
I'm opening the file in text editor and copy-pasting the content into the console as input. I've used IDA Pro (or OllyDbg) to set a breakpoint right after the return from the gets() function and noticed that the address was corrupted - it was set to 0x3fa03f48 (two 3f substitutions).
I've tried other addresses as well - part of them works well, but most of the times the address is being corrupted (sometimes characters missing or substituted, sometimes truncated).
How to get over this problem? Any suggestion will be highly appreciated!
Copy-Pasting binary data is hit-and-miss. Have you tried feeding the input into your test program directly from the file using input redirection?
First of all keep track of the Endianness of your platform. If you think your bits are in the right order but you are still getting malformed input, it might be that your shell/text editor isn't binary safe. You are better off writing an exploit for this flaw in a scripting language such as Python, using the Subprocess library which allows you to write data directly to an arbitrary process's stdin pipe.

gdb break when program opens specific file

Back story: While running a program under strace I notice that '/dev/urandom' is being open'ed. I would like to know where this call is coming from (it is not part of the program itself, it is part of the system).
So, using gdb, I am trying to break (using catch syscall open) program execution when the open call is issued, so I can see a backtrace. The problem is that open is being called alot, like several hundred times so I can't narrow down the specific call that is opening /dev/urandom. How should I go about narrowing down the specific call? Is there a way to filter by arguments, and if so how do I do it for a syscall?
Any advice would be helpful -- maybe I am going about this all wrong.
GDB is a pretty powerful tool, but has a bit of a learning curve.
Basically, you want to set up a conditional breakpoint.
First use the -i flag to strace or objdump -d to find the address of the open function or more realistically something in the chain of getting there, such as in the plt.
set a breakpoint at that address (if you have debug symbols, you can use those instead, omitting the *, but I'm assuming you don't - though you may well have them for library functions if nothing else.
break * 0x080482c8
Next you need to make it conditional
(Ideally you could compare a string argument to a desired string. I wasn't getting this to work within the first few minutes of trying)
Let's hope we can assume the string is a constant somewhere in the program or one of the libraries it loads. You could look in /proc/pid/maps to get an idea of what is loaded and where, then use grep to verify the string is actually in a file, objdump -s to find it's address, and gdb to verify that you've actually found it in memory by combining the high part of the address from maps with the low part from the file. (EDIT: it's probably easier to use ldd on the executable than look in /proc/pid/maps)
Next you will need to know something about the abi of the platform you are working on, specifically how arguments are passed. I've been working on arm's lately, and that's very nice as the first few arguments just go in registers r0, r1, r2... etc. x86 is a bit less convenient - it seems they go on the stack, ie, *($esp+4), *($esp+8), *($esp+12).
So let's assume we are on an x86, and we want to check that the first argument in esp+4 equals the address we found for the constant we are trying to catch it passing. Only, esp+4 is a pointer to a char pointer. So we need to dereference it for comparison.
cond 1 *(char **)($esp+4)==0x8048514
Then you can type run and hope for the best
If you catch your breakpoint condition, and looking around with info registers and the x command to examine memory seems right, then you can use the return command to percolate back up the call stack until you find something you recognize.
(Adapted from a question edit)
Following Chris's answer, here is the process that eventually got me what I was looking for:
(I am trying to find what functions are calling the open syscall on "/dev/urandom")
use ldd on executable to find loaded libraries
grep through each lib (shell command) looking for 'urandom'
open library file in hex editor and find address of string
find out how parameters are passed in syscalls (for open, file is first parameter. on x86_64 it is passed in rdi -- your mileage may vary
now we can set the conditional breakpoint: break open if $rdi == _addr_
run program and wait for break to hit
run bt to see backtrace
After all this I find that glib's g_random_int() and g_rand_new() use urandom. Gtk+ and ORBit were calling these functions -- if anybody was curious.
Like Andre Puel said:
break open if strcmp($rdi,"/dev/urandom") == 0
Might do the job.

Resources