Why doesn't the printf function show the string after a ROP attack? [duplicate]

#include <stdio.h>

int main()
{
    char *name = "Vikram";
    printf("%s", name);
    name[1] = 's';
    printf("%s", name);
    return 0;
}
There is no output printed on the terminal; I just get a segmentation fault. But when I run it in GDB, I get the following:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400525 in main () at seg2.c:7
7 name[1]='s';
(gdb)
This means the program receives the segfault on line 7 (obviously I can't write to a constant char array). Then why is the printf() on line 6 not executed?

This is due to stream buffering of stdout. Unless you call fflush(stdout) or print a newline "\n", the output may be buffered.
In this case, it's segfaulting before the buffer is flushed and printed.
You can try this instead:
printf("%s",name);
fflush(stdout); // Flush the stream.
name[1]='s'; // Segfault here (undefined behavior)
or:
printf("%s\n",name); // Flush the stream with '\n'
name[1]='s'; // Segfault here (undefined behavior)
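A third option (my own addition, not from the answer above) is to turn off stdout buffering entirely with the standard setvbuf call, so that every printf reaches the terminal immediately:
setvbuf(stdout, NULL, _IONBF, 0); // make stdout unbuffered; call this before any output
printf("%s", name);               // now written to the terminal right away
name[1] = 's';                    // still segfaults, but the output is already visible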

First, you should end your printfs with "\n" (or at least the last one). But that is not related to the segfault.
When the compiler compiles your code, it splits the binary into several sections. Some are read-only, while others are writeable.
Writing to a read-only section may cause a segfault.
String literals are usually placed in a read-only section (gcc should put it in ".rodata").
The pointer name points to that read-only section. Therefore you must use
const char *name = "Vikram";
In my answer I've used "may" and "should" a few times. The behaviour depends on your OS, compiler and compilation settings (the linker script defines the sections).
Adding
-Wa,-ahlms=myfile.lst
to gcc's command line produces a file called myfile.lst with the generated assembler code.
At the top you can see
.section .rodata
.LC0:
.string "Vikram"
Which shows that the string "Vikram" is stored in the read-only .rodata section.
The same code using the following (it must be at global scope, or gcc may store it on the stack; notice it is an array and not a pointer)
char name[] = "Vikram";
produces
.data
.type name, #object
.size name, 7
name:
.string "Vikram"
The syntax is a bit different, but see how it is in the .data section now, which is read-write.
By the way, this version works.
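For completeness, here is a minimal sketch (my own example, not taken from the answer above) of the working variant as a whole program:
#include <stdio.h>

char name[] = "Vikram"; /* array at global scope: stored in the writable .data section */

int main(void)
{
    printf("%s\n", name);
    name[1] = 's';        /* legal: the array is ours to modify */
    printf("%s\n", name); /* prints "Vsikram" */
    return 0;
}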

The reason you are getting a segmentation fault is that C string literals are read only according to the C standard, and you are attempting to write 's' over the second element of the literal array "Vikram".
The reason you are getting no output is because your program is buffering its output and crashes before it has a chance to flush its buffer. The purpose of the stdio library, in addition to providing friendly formatting functions like printf(3), is to reduce the overhead of i/o operations by buffering data in in-memory buffers and only flushing output when necessary, and only performing input occasionally instead of constantly. Actual input and output will not, in the general case, occur at the moment when you call the stdio function, but only when the output buffer is full (or the input buffer is empty).
Things are slightly different if a FILE object has been set so it flushes constantly (like stderr), but in general, that's the gist.
If you're debugging, it is best to fprintf to stderr to assure that your debug printouts will get flushed before a crash.
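As a small sketch of that pattern (my own illustration), applied to the original program:
fprintf(stderr, "about to modify the literal, name=%p\n", (void *)name);
name[1] = 's'; /* still crashes, but the stderr message has already been written out */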

By default when stdout is connected to a terminal, the stream is line-buffered. In practice, in your example the absence of '\n' (or of an explicit stream flush) is why you don't get the characters printed.
But in theory undefined behavior is not bounded (from the Standard "behavior [...] for which this International Standard imposes no requirements") and the segfault can happen even before the undefined behavior occurs, for example before the first printf call!

Related

Why does the sys_read system call end when it detects a new line?

I'm a beginner in assembly (using nasm). I'm learning assembly through a college course.
I'm trying to understand the behavior of the sys_read Linux system call when it's invoked. Specifically, sys_read stops reading when it encounters a newline (line feed). According to what I've been taught, this is true. This online tutorial article also affirms the claim:
When sys_read detects a linefeed, control returns to the program and the user's input is located at the memory address you passed in ECX.
I checked the Linux programmer's manual for the read call (via "man 2 read"). It does not mention this behavior at all, but shouldn't it, if this is really how read() works?
read() attempts to read up to count bytes from file descriptor fd
into the buffer starting at buf.
On files that support seeking, the read operation commences at the
file offset, and the file offset is incremented by the number of bytes
read. If the file offset is at or past the end of file, no bytes are
read, and read() returns zero.
If count is zero, read() may detect the errors described below. In
the absence of any errors, or if read() does not check for errors, a
read() with a count of 0 returns zero and has no other effects.
If count is greater than SSIZE_MAX, the result is unspecified.
So my question really is, why does this behavior happen? Is it specified somewhere in the Linux kernel that this should happen, or is it a consequence of something else?
It's because you're reading from a POSIX tty in canonical mode (where backspace works before you press return to "submit" the line; that's all handled by the kernel's tty driver). Look up POSIX tty semantics / stty / ioctl. If you ran ./a.out < input.txt, you wouldn't see this behaviour.
Note that read() on a TTY will return without a newline if you hit control-d (the EOF tty control-sequence).
Assuming that read() reads whole lines is ok for a toy program, but don't start assuming that in anything that needs to be robust, even if you've checked that you're reading from a TTY. I forget what happens if the user pastes multiple lines of text into a terminal emulator. Quite probably they all end up in a single read() buffer.
See also my answer on a question about small read()s leaving unread data on the terminal: if you type more characters on one line than the read() buffer size, you'll need at least one more read system call to clear out the input.
As you noted, the read(2) libc function is just a thin wrapper around sys_read. The answer to this question really has nothing to do with assembly language, and is the same for systems programming in C (or any other language).
Further reading:
stty(1) man page: where you can change which control character does what.
The TTY demystified: some history, and some diagrams showing how xterm, the kernel, and the process reading from the tty all interact. And stuff about session management, and signals.
https://en.wikipedia.org/wiki/POSIX_terminal_interface#Canonical_mode_processing and related parts of that article.
This is not an attribute of the read() system call, but rather a property of termios, the terminal driver. In the default configuration, termios buffers incoming characters (i.e. what you type) until you press Enter, after which the entire line is sent to the program reading from the terminal. This is for convenience so you can edit the line before sending it off.
As Peter Cordes already said, this behaviour is not present when reading from other kinds of files (like regular files) and can be turned off by configuring termios.
What the tutorial says is garbage, please disregard it.
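If you want to see the difference yourself, here is a minimal sketch (my own illustration, not part of the answers above) that switches the terminal out of canonical mode with termios; read() then returns as soon as bytes are available instead of waiting for Enter:
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

int main(void)
{
    struct termios old, raw;
    char c;

    tcgetattr(STDIN_FILENO, &old);          /* save the current tty settings */
    raw = old;
    raw.c_lflag &= ~(ICANON | ECHO);        /* leave canonical mode, stop echoing */
    raw.c_cc[VMIN] = 1;                     /* read() returns after a single byte... */
    raw.c_cc[VTIME] = 0;                    /* ...with no timeout */
    tcsetattr(STDIN_FILENO, TCSANOW, &raw);

    while (read(STDIN_FILENO, &c, 1) == 1 && c != 'q')
        printf("got byte %d without waiting for Enter\n", c);

    tcsetattr(STDIN_FILENO, TCSANOW, &old); /* restore the terminal */
    return 0;
}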

(open + write) vs. (fopen + fwrite) to kernel /proc/

I have a very strange bug. If I do:
int fd = open("/proc/...", O_WRONLY);
write(fd, argv[1], strlen(argv[1]));
close(fd);
everything works, including for a very long string whose length is > 1024.
If I do:
FILE *fd = fopen("/proc/...", "wb");
fwrite(argv[1], 1, strlen(argv[1]), fd);
fclose(fd);
the string is cut off at around 1024 characters.
I'm running an ARM embedded device with a 3.4 kernel. I have debugged in the kernel and I can see that the string is already truncated when it reaches vfs_write, very early in the write path (I spotted this function by adding a WARN_ON to get the stack).
The problem is the same with fputs vs. puts.
I can use fwrite for a very long string (> 1024) when writing to a regular rootfs file, so the problem really is linked to how the kernel handles /proc.
Any idea what's going on?
Probably the problem is with buffers.
The issue is that special files, such as those under /proc, are, well... special: they are not always simple streams of bytes, and may have to be written to (or read from) with specific sizes and/or offsets. You do not say which file you are writing to, so it is impossible to be sure.
The call to fwrite() assumes that the output is a simple stream of bytes, so it does clever things behind the scenes, such as buffering, splitting and copying the given data. On a regular file that will just work, but on a special file, funny things may happen.
Just to be sure, try to run strace with both versions of your program and compare the outputs. If you wish, post them for additional comments.
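One quick experiment (a sketch on my part, assuming the truncation really comes from stdio splitting the write at its default buffer size) is to make the stream unbuffered, so that fwrite hands the whole string to write() in one go, and then compare the strace output again:
FILE *fp = fopen("/proc/...", "wb");       /* same elided /proc path as above */
setvbuf(fp, NULL, _IONBF, 0);              /* unbuffered: fwrite passes the data straight to write() */
fwrite(argv[1], 1, strlen(argv[1]), fp);
fclose(fp);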

Gdb dump memory in specific region, save formatted output into a file

I have buggy (memory-leaking) software.
As evidence, I have a 1 GB core dump file. The heap size is 900 MB, so obviously something allocates memory but never frees it.
So, I have a memory region to examine, like this:
(gdb) x/50000s 0x200000000
However, it is hard to tell just with the naked eye which object or struct is not being freed.
My idea to trace is, "Save gdb formatted output into a file, and run a pattern match to see which magic string comes up the most." So, here is my question:
How can I save output of following command into a textfile, so that I can write an analyzer?
(gdb) x/10000000s 0x20000000 <-- I need this output in a file
You could use the "dump" function of gdb, see: https://sourceware.org/gdb/onlinedocs/gdb/Dump_002fRestore-Files.html
For your example:
dump binary memory result.bin 0x200000000 0x20000c350
This will give you a plain binary dump into the file result.bin. You can also use the following to dump it in Intel HEX format:
dump ihex memory result.bin 0x200000000 0x20000c350
Using the dump command is much clearer than using the gdb logging hack (which somehow didn't even work for me).
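Once you have the raw dump, the pattern-matching idea from the question can be done with standard tools; for example (a sketch, assuming the strings in the dump are NUL-terminated):
strings result.bin | sort | uniq -c | sort -rn | head -20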
How can I save output of following command into a textfile, so that I can write an analyzer?
(gdb) x/10000000s 0x20000000
That's actually quite easy:
(gdb) set height 0 # prevent GDB from stopping every screenfull
(gdb) set logging on # GDB output is now also copied into gdb.txt
(gdb) x/10000000s 0x20000000
(gdb) quit
Voila, enjoy your output in gdb.txt.
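If you would rather script it than sit in an interactive session, the same dump can be produced non-interactively (a sketch; the binary, core file and output names here are placeholders):
gdb -batch -ex "set height 0" -ex "x/10000000s 0x20000000" ./myprog core.dump > strings.txt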
I have a buggy (memory leaked) software. ... "Save gdb formatted output into a file, and run a pattern match to see which magic string comes up the most."
That idea is quite unlikely to yield satisfactory results. Consider:
#include <string>
#include <vector>

void some_function() {
    std::vector<std::string> *v = new std::vector<std::string>();
    // code that inserts and uses 1000s of strings in "v".
    return; // Oops: forgot to delete "v".
}
Even if you could effectively "see the magic string that comes up the most", you'll discover that you are leaking all the strings; but they are not the problem, leaking "v" is the problem.
So what you really want is to build a graph of which allocated regions point to other allocated regions, and find a "root" of that graph. This is nearly impossible to do by hand.
So what is more likely to help you find the memory leak(s)? Fortunately, there are lots of tools that can solve this problem for you:
Valgrind,
Google heap leak checker,
jemalloc,
... etc. etc.
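For example, a typical Valgrind run (assuming you can re-run the program rather than only inspect the existing core; ./myprog is a placeholder) looks like:
valgrind --leak-check=full ./myprog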
You can write a simple LKM (loadable kernel module) that will do that.
lkm:
#include <linux/kernel.h>
#include <linux/module.h>

static int *ptr = (int *)0xc18251c0; /* the address you want to read from kernel space */

static int __init module_i(void)
{
    printk("%d\n", *ptr);
    return 0;
}

module_init(module_i);
MODULE_LICENSE("GPL");
and the data will show up in the kernel log, so run
dmesg

Simple heap overflow exploit with toy example on old glibc

Consider this example of a heap buffer overflow vulnerable program in Linux, taken directly from the "Buffer Overflow Attacks" (p. 248) book:
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    char *A, *B;

    A = malloc(128);
    B = malloc(32);
    strcpy(A, argv[1]);
    free(A);
    free(B);
    return 0;
}
Since unlink() has been changed to prevent the most simple form of exploit using the FD and BK pointers with a sanity check, I'm using a very old system I have with an old version of glibc (version 2.3.2). I'm also setting MALLOC_CHECK_=0 for this testing.
My goal with this toy example is simply to see if I can write 4 bytes to some arbitrary address I specify. The simplest test I can think of is to try writing something to 0x41414141, which is an illegal address and should make the program crash, just to confirm that it is indeed trying to write to this address (something I should be able to observe in GDB).
So I try executing with the argument perl -e 'print "A"x128 . "\xf8\xff\xff\xff" . "\xf8\xff\xff\xff" . "\x41\x41\x41\x41" . "\x41\x41\x41\x41" '
So I have:
Buffer A: 128 bytes of 0x41.
prev_size: 0xfffffff8
size: 0xfffffff8
FD: 0x41414141
BK: 0x41414141
I'm using 0xfffffff8 instead of 0xfffffffc because there is a note that with glibc 2.3 the third-lowest bit (NON_MAIN_ARENA) is used for management purposes for the arenas and has to be 0.
This should attempt to write 0x41414141 to 0x41414141 (+ 12 to be more correct, but still an illegal address), correct? However, when I execute this, the program simply terminates normally.
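(For context: the classic, pre-hardening unlink step in dlmalloc-style allocators is roughly the sketch below, which is where the "FD + 12" write comes from on a 32-bit system.)
/* simplified old-style unlink, shown for illustration only: */
#define unlink(P, BK, FD) {                              \
    FD = (P)->fd;                                        \
    BK = (P)->bk;                                        \
    FD->bk = BK;   /* writes BK to FD + 12 on 32-bit */  \
    BK->fd = FD;   /* writes FD to BK + 8 on 32-bit */   \
}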
What am I missing here? This seems simple enough that it shouldn't be that hard to get to work.
I've tried various things, such as using 0xfffffffc instead for prev_size and size, and using legal addresses for FD (some address on the heap). I've tried swapping the order in which A and B are free()'d, and I've tried stepping into free() in GDB to see what happens, but I got lost. Note that there shouldn't be any other security features on this system, as it is very old and doesn't have the NX bit, ASLR, etc. (not that it should matter for the purpose of just writing 4 bytes to an illegal address).
Any ideas for how to make this work?
I could add that if using MALLOC_CHECK_=3 I get this:
malloc: using debugging hooks
malloc: using debugging hooks
free(): invalid pointer 0x8049688!
Program received signal SIGABRT, Aborted.
0x4004a1b1 in kill () from /lib/libc.so.6

injecting shellcode

I have a .cmd file on a webserver with a variable user="...", which is vulnerable to buffer overflows. I can execute the .cmd file via ssh or via the web.
Now i have this shellcode:
#include <stdio.h>

char sc[] =
    "..."; /* shellcode bytes elided */

int main(void)
{
    void (*s)(void);

    printf("size: %zu\n", sizeof(sc));
    s = (void (*)(void))sc;
    s();
    return 0;
}
My problem is, I don't know how this all plays together. I know what the assembler and the C code do, but how do I inject the code into the running cmd file?
cat "shellcode " | nc host cmd
You generally have to insert enough data so that your write ends up in part of the program's memory that gets executed. How to do that exactly depends entirely on the structure of the program with the overflow.
But, imagine if the program were specified by this ASM listing:
[SECTION .text]
global _start

_start:
        ;; ...
jmpagain:
        jmp next
uname   db "username"
next:
        mov eax, uname
        ;; ...
        jmp jmpagain
the string "username" is in memory immediately adjacent to an address that the instruction pointer visits. If the program writes some data to that area of memory without checking it's bounds, it will overwrite the code, and anytime the instruction pointer revisits the next function, the machine is going to execute whatever data overflowed in that part of memory. Supposing the write you are exploiting starts at the beginning of the string, and stops writing on some condition that provides enough room for you to inject your shellcode, you would prepend a byte string of the same length as "username" to your shellcode in the input. Then the beginning of your shellcode would be at the address of the next label.
But this is just a simple example demonstrating the basic principle. Actually getting your data into an area of memory that the instruction pointer visits is likely going to be a lot trickier. If you have access to the command file in question, you can debug it, dump the memory, and trace how that area of memory is written to, in order to see how you need to overflow the buffer and reach IP-reachable memory.
It's important to reiterate that your shellcode not only needs to make it to memory that the instruction pointer passes over, it needs to be correctly aligned in that memory to execute in the way you expect it to. If the instruction pointer lands somewhere in the middle of your code rather than at the beginning of the first instruction, it obviously isn't going to do what you expect it to.
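One common way to relax that alignment requirement (my addition, not part of the answer above) is to prepend a NOP sled: a run of 0x90 bytes, so that wherever the instruction pointer lands inside the run, execution slides forward into the start of the shellcode. For example, on the sending side (the file names here are hypothetical):
perl -e 'print "\x90" x 64' > payload    # 64 x86 NOP bytes: the sled
cat shellcode.bin >> payload             # followed by the actual shellcode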
