I have a .cmd file on a web server with a variable user="..." that is vulnerable to buffer overflows. I can execute the .cmd file via SSH or via the web.
Now I have this shellcode:
#include <stdio.h>
char sc[] =
"
...
";
void main(void)
{
void(*s)(void);
printf("size: %d\n", sizeof(sc));
s = sc;
s();
}
My problem is that I don't know how this all plays together. I know what the assembler and the C code do, but how do I inject the code into the running cmd file?
cat "shellcode " | nc host cmd
You generally have to insert enough data so that your write ends up in part of the program's memory that gets executed. How to do that exactly depends entirely on the structure of the program with the overflow.
But, imagine if the program were specified by this ASM listing:
[SECTION .text]
global _start
_start:
;;...
jmpagain:
jmp next
uname db "username"
next:
mov eax,uname
;;...
jmp jmpagain
The string "username" is in memory immediately adjacent to an address that the instruction pointer visits. If the program writes data to that area of memory without checking its bounds, it will overwrite the code, and any time the instruction pointer revisits the next label, the machine will execute whatever data overflowed into that part of memory. Supposing the write you are exploiting starts at the beginning of the string and stops on some condition that leaves enough room for your shellcode, you would prepend a byte string of the same length as "username" to your shellcode in the input. Then the beginning of your shellcode would be at the address of the next label.
But this is just a simple demonstration of the basic principle. Actually getting your data into an area of memory that the instruction pointer visits is likely to be a lot trickier. If you have access to the command file in question, you can debug it, dump its memory, and trace how that area of memory is written to, to see how you need to overflow the buffer to reach IP-reachable memory.
It's important to reiterate that your shellcode not only needs to make it to memory that the instruction pointer passes over, it also needs to be correctly aligned in that memory to execute the way you expect. If the instruction pointer lands somewhere in the middle of your code rather than at the beginning of the first instruction, it isn't going to do what you expect.
I made my own string-declaring macro in GNU Assembler on an x64 machine.
.macro declareString identifier, value
.pushsection .data
\identifier: .ascii "\value"
"lengthof.\identifier"= . - \identifier
.popsection
.endm
This is how to use it:
.data
ANOTHER_VARIABEL: .ascii "why"
.text
_start:
declareString myString, "good"
lea rax, myString
mov rbx, lengthof.myString
So basically my macro is just a label maker.
So in the debugger, rax will hold the address of myString, because myString is basically just a label.
And rbx will hold the length of myString (4).
So if I want to change myString at runtime from good, which has 4 chars, to rocker, which has 6 chars, I'm afraid it will overwrite another variable because of the size collision.
I've heard about heap memory but I'm not sure how it works. I only know about the stack, which I'm used to using as temporary backup storage.
So how do I declare a mutable string in assembly?
Same as in C; if you want to use static storage (in .data) for the characters themselves (like static char myString[4] = "good"; note not including a terminating 0 byte), you actually need to reserve enough space for the largest you ever want this string to be. Like static char myString[128] = "good".
In your case, adding .space 124 after your .ascii would give a total of 128 bytes reserved after the label myString.
You're correct that you shouldn't write past the size you reserve, although it would come after ANOTHER_VARIABEL, not before, since that part of your .data section is earlier in your source than where you use the macro containing .pushsection.
If you wanted to do dynamic allocation, you might want to make an mmap(..., PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, ...) system call to allocate 1 or more pages of memory, then copy bytes into it. (You're writing a _start so I'm assuming you're not linking libc.)
You could make your global variable a pointer, but if you are going to free the old pointer and allocate a new one (or mremap to the new size), its initial value should also come from a runtime allocation that you initialize at startup. You don't want to munmap a page of .data.
Or instead of mmap/munmap, Linux mremap(MREMAP_MAYMOVE) can ensure that the allocation is big enough to hold your whole string without copying it. MAYMOVE lets it pick a new virtual address if there aren't enough free virtual pages following the currently pointed-to page, but without copying the actual data in the already-allocated pages.
There are tons of Q&As about making system calls in assembly so I'm not going into details about that; you can think about how you're managing memory in terms of system calls to make, separately from implementing it in asm. Thinking about it in C is pretty much equally valid; asm doesn't add much in terms of being able to declare static data or make system calls. Other than the fact that you could in asm make sure that a page of .data was exclusively dedicated to this string, with no other things using it, so you could potentially let mremap unmap that page from your .data and map it at a different virtual address.
#include<stdio.h>
int main()
{
char *name = "Vikram";
printf("%s",name);
name[1]='s';
printf("%s",name);
return 0;
}
No output is printed on the terminal; I just get a segmentation fault. But when I run it in GDB, I get the following:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400525 in main () at seg2.c:7
7 name[1]='s';
(gdb)
This means the program receives a segfault on line 7 (obviously I can't write to a constant char array). Then why is the printf() on line 6 not executed?
This is due to stream buffering of stdout. Unless you call fflush(stdout) or print a newline "\n" (which flushes a line-buffered stream such as a terminal), the output may still be sitting in the buffer.
In this case, it's segfaulting before the buffer is flushed and printed.
You can try this instead:
printf("%s",name);
fflush(stdout); // Flush the stream.
name[1]='s'; // Segfault here (undefined behavior)
or:
printf("%s\n",name); // Flush the stream with '\n'
name[1]='s'; // Segfault here (undefined behavior)
First, you should end your printfs with "\n" (or at least the last one). But that is not related to the segfault.
When the compiler compiles your code, it splits the binary into several sections. Some are read-only, while others are writable.
Writing to a read-only section may cause a segfault.
String literals are usually placed in a read-only section (gcc should put them in ".rodata").
The pointer name points into that read-only section. Therefore you should declare it as
const char *name = "Vikram";
In my response I've used a few "may"s and "should"s. The behaviour depends on your OS, compiler, and compilation settings (the linker script defines the sections).
Adding
-Wa,-ahlms=myfile.lst
to gcc's command line produces a file called myfile.lst with the generated assembler code.
At the top you can see
.section .rodata
.LC0:
.string "Vikram"
Which shows that the string "Vikram" is in the .rodata section.
The same code using the following declaration (it must be at global scope, otherwise gcc may store it on the stack; notice that it is an array and not a pointer)
char name[] = "Vikram";
produces
.data
.type name, #object
.size name, 7
name:
.string "Vikram"
The syntax is a bit different but see how it is in .data section now, which is read-write.
By the way this example works.
The reason you are getting a segmentation fault is that modifying a C string literal is undefined behavior according to the C standard (in practice the literal usually lives in read-only memory), and you are attempting to write 's' over the second element of the literal array "Vikram".
The reason you are getting no output is because your program is buffering its output and crashes before it has a chance to flush its buffer. The purpose of the stdio library, in addition to providing friendly formatting functions like printf(3), is to reduce the overhead of i/o operations by buffering data in in-memory buffers and only flushing output when necessary, and only performing input occasionally instead of constantly. Actual input and output will not, in the general case, occur at the moment when you call the stdio function, but only when the output buffer is full (or the input buffer is empty).
Things are slightly different if a FILE object has been set so it flushes constantly (like stderr), but in general, that's the gist.
If you're debugging, it is best to fprintf to stderr to ensure that your debug printouts get flushed before a crash.
By default when stdout is connected to a terminal, the stream is line-buffered. In practice, in your example the absence of '\n' (or of an explicit stream flush) is why you don't get the characters printed.
But in theory undefined behavior is not bounded (from the Standard "behavior [...] for which this International Standard imposes no requirements") and the segfault can happen even before the undefined behavior occurs, for example before the first printf call!
I've started studying software security, and I'm having trouble understanding what a buffer overflow attack and a ROP attack are.
From what I understand:
Buffer overflow attack:
When a buffer has a certain size, fill the buffer and add additional code so that the attacker can execute another function in the code or his/her own shellcode.
ROP attack:
Give a certain input which can overwrite the return address, so that the attacker can control the flow.
But what's the exact difference between the two?
I feel like both just give an excessive input to overwrite an area which is not supposed to be accessed.
For example, if I have this code:
#include <stdio.h>

void check(){
printf("overflow occurs!\n");
}

int main(int argc, char* argv[]){
char buffer[256];
gets(buffer);
printf("%s\n", buffer);
return 0;
}
and try to execute the function check() by giving a certain input to gets() function.
Is this a ROP attack or a buffer overflow attack?
A ROP attack is one kind of payload you can deliver via a buffer-overflow vulnerability, for buffers on the stack. (Overflowing other buffers could let you overwrite other data, e.g. in a struct or nearby other globals, but not take control of the program-counter.)
A buffer overflow is when incorrect bounds checking or handling of implicit-length data (e.g. strcpy or strcat) lets malicious input write memory past the end of an array. This gets interesting when the array was allocated on the call-stack, so one of the things following it is the return address of this function.
(In theory overwriting a static variable past the end of a static array could be useful as an exploit, and that would also be a buffer overflow. But usually a buffer overflow implies a buffer on the stack, allowing the attacker to control the return address. And thus to gain control of the instruction pointer.)
As well as a new return address, your malicious data will include more data which will be in memory below and above that return address. Part of this is the payload. Controlling the return address alone is usually not sufficient: in most processes there isn't anywhere you can jump to that (without other inputs) will execve a shell listening on a TCP port, for example.
Traditionally your payload would be machine-code ("shellcode"), and the return address would be the stack address where you knew that payload would land. (+- a NOP slide so you didn't have to get it exactly right).
Stack ASLR and non-executable stacks have made the traditional shellcode injection method of exploiting a buffer overflow impossible in normal modern programs. "Buffer overflow attack" used to (I think) imply shellcode injection, because there was no need to look for more complicated attacks. But that's no longer true.
A ROP attack is when the payload is a sequence of return addresses and data to be popped by pop instructions, and/or some strings like "/bin/sh". The first return address in the payload sends execution to some already-existing bytes at a known address in an executable page.
and try to execute the function check() by giving a certain input to gets() function.
The code for check() already exists in the target program, so the simplest attack would be a ROP attack.
This is the absolute simplest form of ROP attack, where code to do exactly what you want exists at a single known address, without needing any "function args". So it makes a good example to introduce the topic.
Is this a ROP attack or a buffer overflow attack?
It's both. It's a buffer overflow to inject a ROP payload.
If the program was compiled with -z execstack -no-pie, you could also choose to inject e.g. x86 shellcode that did mov eax, imm32 / jmp eax to jump to the known absolute address of check. In that case it would be a buffer overflow but not a ROP attack; it would be a code-injection attack.
(You might not call it "shellcode" because the purpose isn't to run a shell replacing the program, but rather to do something using the existing code of the program. But terminology is often used sloppily, so I think many people would call any injectable machine code "shellcode" regardless of what it does.)
Buffer overflow attack:
When a buffer has a certain size, fill the buffer and add additional code so that the attacker can execute another function in the code or his/her own shellcode.
The "in the code" option would be a ROP attack. You point the return address at code which is already in memory.
The "or his/her own shellcode" option would be a code-injection attack. You point the return address at the buffer you just overflowed. (Either directly or via a ret2reg ROP attack to defeat stack ASLR, by looking for a jmp esp gadget on x86 for example.)
This "Buffer Overflow" definition is still slightly too narrow: it excludes overwriting some other critical variable (like bool user_authenticated) without overwriting a return address.
But yes, code injection and ROP attacks are the 2 main ways, with code injection normally made impossible by non-executable stack memory.
I'm trying to run through a buffer overflow exercise, here is the code:
#include <stdio.h>
int badfunction() {
char buffer[8];
gets(buffer);
puts(buffer);
}
int cantrun() {
printf("This function cant run because it is never called");
}
int main() {
badfunction();
}
This is a simple piece of code. The objective is to overflow the buffer in badfunction() and overwrite the return address so that it points to the memory address of the function cantrun().
Step 1: Find the offset of the return address (in this case it's 12 bytes: 8 for the buffer and 4 for the saved base pointer).
Step 2: Find the memory location of cantrun(); gdb says it's 0x0804849a.
When I run the program printf "%012x\x9a\x84\x04\x08" | ./vuln, I get the error "illegal instruction". This suggests to me that I have correctly overwritten the EIP, but that the memory location of cantrun() is incorrect.
I am using Kali Linux, Kernel 3.14, I have ASLR turned off and I am using execstack to allow an executable stack. Am I doing something wrong?
UPDATE:
As a shot in the dark, I tried to find the correct instruction by moving the address around, and 0x0804849b does the trick. Why is this different from what GDB shows? In GDB, 0x0804849a is the location of the prologue instruction push ebp and 0x0804849b is the prologue instruction mov ebp,esp.
gdb doesn't do anything to change the locations of functions in the programs it executes. ASLR may matter, but by default gdb turns this off to enable simpler debugging.
It's hard to say why you are seeing the results you are. What does disassembling the function in gdb show?
I'm currently playing with ARM assembly on Linux as a learning exercise. I'm using 'bare' assembly, i.e. no libcrt or libgcc. Can anybody point me to information about what state the stack pointer and other registers will be in at the start of the program, before the first instruction is executed? Obviously pc/r15 points at _start, and the rest appear to be initialised to 0, with two exceptions: sp/r13 points to an address far outside my program, and r1 points to a slightly higher address.
So to some solid questions:
What is the value in r1?
Is the value in sp a legitimate stack allocated by the kernel?
If not, what is the preferred method of allocating a stack: using brk, or allocating a static .bss section?
Any pointers would be appreciated.
Since this is Linux, you can look at how it is implemented by the kernel.
The registers seem to be set by the call to start_thread at the end of load_elf_binary (if you are using a modern Linux system, it will almost always be using the ELF format). For ARM, the registers seem to be set as follows:
r0 = first word in the stack
r1 = second word in the stack
r2 = third word in the stack
sp = address of the stack
pc = binary entry point
cpsr = endianness, thumb mode, and address limit set as needed
Clearly you have a valid stack. I think the values of r0-r2 are junk, and you should instead read everything from the stack (you will see why I think this later). Now, let's look at what is on the stack. What you will read from the stack is filled by create_elf_tables.
One interesting thing to notice here is that this function is architecture-independent, so the same things (mostly) will be put on the stack on every ELF-based Linux architecture. The following is on the stack, in the order you would read it:
The number of parameters (this is argc in main()).
One pointer to a C string for each parameter, followed by a zero (this is the contents of argv in main(); argv would point to the first of these pointers).
One pointer to a C string for each environment variable, followed by a zero (this is the contents of the rarely-seen envp third parameter of main(); envp would point to the first of these pointers).
The "auxiliary vector", which is a sequence of pairs (a type followed by a value), terminated by a pair with a zero (AT_NULL) in the first element. This auxiliary vector has some interesting and useful information, which you can see (if you are using glibc) by running any dynamically-linked program with the LD_SHOW_AUXV environment variable set to 1 (for instance LD_SHOW_AUXV=1 /bin/true). This is also where things can vary a bit depending on the architecture.
Since this structure is the same for every architecture, you can look for instance at the drawing on page 54 of the SYSV 386 ABI to get a better idea of how things fit together (note, however, that the auxiliary vector type constants on that document are different from what Linux uses, so you should look at the Linux headers for them).
Now you can see why the contents of r0-r2 are garbage. The first word in the stack is argc, the second is a pointer to the program name (argv[0]), and the third probably was zero for you because you called the program with no arguments (it would be argv[1]). I guess they are set up this way for the older a.out binary format, which as you can see at create_aout_tables puts argc, argv, and envp in the stack (so they would end up in r0-r2 in the order expected for a call to main()).
Finally, why was r0 zero for you instead of one (argc should be one if you called the program with no arguments)? I am guessing something deep in the syscall machinery overwrote it with the return value of the system call (which would be zero since the exec succeeded). You can see in kernel_execve (which does not use the syscall machinery, since it is what the kernel calls when it wants to exec from kernel mode) that it deliberately overwrites r0 with the return value of do_execve.
Here's what I use to get a Linux/ARM program started with my compiler:
/** The initial entry point.
*/
asm(
" .text\n"
" .globl _start\n"
" .align 2\n"
"_start:\n"
" sub lr, lr, lr\n" // Clear the link register.
" ldr r0, [sp]\n" // Get argc...
" add r1, sp, #4\n" // ... and argv ...
" add r2, r1, r0, LSL #2\n" // ... and compute environ.
" bl _estart\n" // Let's go!
" b .\n" // Never gets here.
" .size _start, .-_start\n"
);
As you can see, I just get the argc, argv, and environ stuff from the stack at [sp].
A little clarification: The stack pointer points to a valid area in the process's memory. r0, r1, and r2 hold the first three parameters to the function being called. I populate them with argc, argv, and environ, respectively.
Here's the uClibc crt. It seems to suggest that all registers are undefined except r0 (which contains a function pointer to be registered with atexit()) and sp which contains a valid stack address.
So, the value you see in r1 is probably not something you can rely on.
Some data are placed on the stack for you.
I've never used ARM Linux, but I suggest you either look at the source for the libcrt and see what they do, or use gdb to step into an existing executable. You shouldn't need the source code; just step through the assembly code.
Everything you need to find out should happen within the very first code executed by any binary executable.
Hope this helps.
Tony