why is the value of LD_PRELOAD on the stack - linux

I'm studying buffer overflows and solving some wargames.
In one problem, all of the stack memory above the buffer is set to 0 except the return address of main, so the layout looks like this:
buffer
[0000000...][RET][000000...]
and I can overwrite that RET.
I found a hint for solving this problem: use LD_PRELOAD.
Some people said that LD_PRELOAD's value is stored somewhere else on the stack, not only in the environment variable area.
So I set LD_PRELOAD, searched for it with gdb, and found it.
$ export LD_PRELOAD=/home/coffee/test.so
$ gdb -q abcde
(gdb) b main
Breakpoint 1 at 0x8048476
(gdb) r
Starting program: /home/coffee/abcde
Breakpoint 1, 0x8048476 in main ()
(gdb) x/s 0xbffff6df
0xbffff6df: "#èC\001#/home/coffee/test.so"
(gdb) x/s 0xbffffc59
0xbffffc59: "LD_PRELOAD=/home/coffee/test.so"
(gdb) q
The program is running. Exit anyway? (y or n) y
$
So it is there!
Now I know that LD_PRELOAD's value is on the stack below the buffer, so I can exploit the program.
But I wonder why the value of LD_PRELOAD is stored at that memory address.
It is also in the environment variable area of the stack.
What is the purpose of the second copy?
Thanks.

Code to explore the stack layout:
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>
// POSIX 2008 declares environ in <unistd.h> (Mac OS X doesn't)
extern char **environ;
// Print each pointer in a NULL-terminated list, then the terminator itself.
static void dump_list(const char *tag, char **list)
{
    char **ptr = list;
    while (*ptr)
    {
        // ptr - list is a ptrdiff_t, so print it with %td rather than %d
        printf("%s[%td] 0x%.16" PRIXPTR ": %s\n",
               tag, (ptr - list), (uintptr_t)*ptr, *ptr);
        ptr++;
    }
    printf("%s[%td] 0x%.16" PRIXPTR "\n",
           tag, (ptr - list), (uintptr_t)*ptr);
}
int main(int argc, char **argv, char **envp)
{
    printf("%d\n", argc);
    printf("argv 0x%.16" PRIXPTR "\n", (uintptr_t)argv);
    printf("argv[argc+1] 0x%.16" PRIXPTR "\n", (uintptr_t)(argv+argc+1));
    printf("envp 0x%.16" PRIXPTR "\n", (uintptr_t)envp);
    printf("environ 0x%.16" PRIXPTR "\n", (uintptr_t)environ);
    dump_list("argv", argv);
    dump_list("envp", envp);
    return(0);
}
With the program compiled as x, I ran it with a sanitized environment:
$ env -i HOME=$HOME PATH=$HOME/bin:/bin:/usr/bin LANG=$LANG TERM=$TERM ./x a bb ccc
4
argv 0x00007FFF62074EC0
argv[argc+1] 0x00007FFF62074EE8
envp 0x00007FFF62074EE8
environ 0x00007FFF62074EE8
argv[0] 0x00007FFF62074F38: ./x
argv[1] 0x00007FFF62074F3C: a
argv[2] 0x00007FFF62074F3E: bb
argv[3] 0x00007FFF62074F41: ccc
argv[4] 0x0000000000000000
envp[0] 0x00007FFF62074F45: HOME=/Users/jleffler
envp[1] 0x00007FFF62074F5A: PATH=/Users/jleffler/bin:/bin:/usr/bin
envp[2] 0x00007FFF62074F81: LANG=en_US.UTF-8
envp[3] 0x00007FFF62074F92: TERM=xterm-color
envp[4] 0x0000000000000000
$
If you study that carefully, you'll see three things. The argv argument to main() is the start of an array of pointers to strings further up the stack. The envp argument (an optional third argument to main() on POSIX machines) has the same value as the global variable environ and as the address of argv[argc+1], and is likewise the start of an array of pointers to strings further up the stack. Finally, the strings pointed at by the argv and envp pointers follow the two arrays.
This is the layout on Mac OS X (10.7.5 if it matters, which it probably doesn't), but I'm tolerably sure you'd find the same layout on other Unix-like systems.
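
The relationships described above can also be checked in code. The following is a minimal sketch, not part of the original answer: the helper names envp_follows_argv and strings_are_packed are hypothetical, and the layout they test is an observation about typical Unix-like systems, not something the C standard guarantees.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if envp starts right after argv's terminating NULL pointer,
 * i.e. envp == &argv[argc + 1]. */
int envp_follows_argv(int argc, char **argv, char **envp)
{
    return envp == argv + argc + 1;
}

/* Returns 1 if every string in the NULL-terminated list begins
 * immediately after the previous string's terminating NUL byte. */
int strings_are_packed(char **list)
{
    for (size_t i = 0; list[i] != NULL && list[i + 1] != NULL; i++) {
        if (list[i + 1] != list[i] + strlen(list[i]) + 1)
            return 0;
    }
    return 1;
}
```

Called at the top of main(int argc, char **argv, char **envp), both helpers would be expected to return 1 on a system with the layout shown in the output above. The check applies to envp as passed to main; after setenv() or putenv(), environ may point to a reallocated array.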

Related

Reading and modifying data of syscall with ptrace

I am trying to do a simple thing (just for learning):
I want to intercept clock_gettime on 64-bit Linux, read its output, and modify it so that a false date/time is returned to the tracee (/bin/date).
What I do is:
ptrace(PTRACE_GETREGS, pid, NULL, &regs);
if (regs.orig_rax == 228) { // this is the clock_gettime syscall number on 64-bit Linux
    unsigned long p1 = ptrace(PTRACE_PEEKDATA, pid, regs.rcx, NULL); // rcx is ARG1
    printf("ARG1: 0x%lx\n", p1);
}
Now, if I understood correctly (clearly I did not), regs.rcx should point to a timespec structure, so
I should be able to read the first long int of that structure, which is the time in seconds (Unix time).
But I read 0.
Also, the printf is invoked twice: once on entering the syscall and once on exiting it.
It is normal for the value to be 0 on entry, but it should not be 0 on exit.
In fact, strace shows it correctly:
strace 2>&1 date|grep CLOCK
clock_gettime(CLOCK_REALTIME, {tv_sec=1583960872, tv_nsec=403163000}) = 0
How can I do the same?
I found the error: I was using the wrong register. The pointer is in RSI:
ptrace(PTRACE_GETREGS, pid, NULL, &regs);
if (regs.orig_rax == 228) { // this is the clock_gettime syscall number on 64-bit Linux
    unsigned long p1 = ptrace(PTRACE_PEEKDATA, pid, regs.rsi, NULL); // rsi is ARG2
    printf("ARG2: 0x%lx\n", p1);
}
For x86_64, the syscall registers map as follows:
#define SYSCALL_ENTRY ((long)RET == -ENOSYS)
#define REGS_STRUCT struct user_regs_struct
#define SYSCALL (regs.orig_rax)
#define ARG1 (regs.rdi)
#define ARG2 (regs.rsi)
#define ARG3 (regs.rdx)
#define ARG4 (regs.r10)
#define ARG5 (regs.r8)
#define ARG6 (regs.r9)
#define RET (regs.rax)
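
The two ideas in the macros above (telling entry from exit, and picking the right argument register) can be sketched as plain functions. This assumes Linux on x86_64 with glibc's struct user_regs_struct; the function names are hypothetical, and the -ENOSYS test is a heuristic, since a syscall that genuinely fails with ENOSYS looks the same at its exit stop.

```c
#include <errno.h>
#include <sys/syscall.h>
#include <sys/user.h>

/* Heuristic from the macros above: at a syscall-entry stop the kernel has
 * preloaded rax with -ENOSYS; at the exit stop rax holds the real result. */
int is_syscall_entry(const struct user_regs_struct *regs)
{
    return (long)regs->rax == -ENOSYS;
}

/* Returns the tracee address of the struct timespec passed to
 * clock_gettime (its second argument, held in rsi), or 0 for any
 * other syscall. Assumes the x86_64 register layout. */
unsigned long long clock_gettime_timespec_addr(const struct user_regs_struct *regs)
{
    if (regs->orig_rax != SYS_clock_gettime)
        return 0;
    return regs->rsi;
}
```

A tracer would fill regs with PTRACE_GETREGS at each stop and, at the exit stop, read tv_sec with PTRACE_PEEKDATA at the returned address and tv_nsec at that address plus 8. Because PTRACE_PEEKDATA returns the word itself, errno should be cleared before the call and checked afterwards to distinguish an error from a legitimate value of -1.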

C program stores function parameters from $rbp+4 in memory? My check failed

I was trying to learn how to use rbp/ebp to access function parameters and local variables on 64-bit Ubuntu 16.04. I've got a simple C file:
#include<stdio.h>
int main(int argc,char*argv[])
{
printf("hello\n");
return argc;
}
I compiled it with:
gcc -g my.c
Then debug it with argument parameters:
gdb --args my 01 02
Here I know the "argc" should be 3, so I tried to check:
(gdb) b main
Breakpoint 1 at 0x400535: file ret.c, line 5.
(gdb) r
Starting program: /home/a/cpp/my 01 02
Breakpoint 1, main (argc=3, argv=0x7fffffffde98) at ret.c:5
5 printf("hello\n");
(gdb) x $rbp+4
0x7fffffffddb4: 0x00000000
(gdb) x $rbp+8
0x7fffffffddb8: 0xf7a2e830
(gdb) x/1xw $rbp+8
0x7fffffffddb8: 0xf7a2e830
(gdb) x/1xw $rbp+4
0x7fffffffddb4: 0x00000000
(gdb) x/1xw $rbp
0x7fffffffddb0: 0x00400550
I can't find any sign that a dword with the value 3 is stored at any offset from $rbp. Is there something wrong with my understanding or my commands?
Thanks!
I was trying to learn how to use rbp/ebp to visit function parameters and local variables
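The x86_64 System V ABI does not use the stack to pass the first six integer or pointer parameters; they are passed in registers. Because of that, you won't find them at a positive offset from $rbp (this is different from the 32-bit x86 calling convention, where the first argument is at $ebp+8). In an unoptimized build, GCC typically copies them from the registers into local slots below $rbp, at negative offsets.
To find the parameters, you'll need to look at the $rdi and $rsi registers: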
Breakpoint 1, main (argc=3, argv=0x7fffffffe3a8) at my.c:4
4 printf("hello\n");
(gdb) p/x $rdi
$1 = 0x3 # matches argc
(gdb) p/x $rsi
$2 = 0x7fffffffe3a8 # matches argv
x $rbp+4
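You almost certainly won't find anything useful at $rbp+4, because stack slots on x86_64 are 8 bytes wide: pushes and pops move the stack pointer by 8 so that an entire 64-bit value can be stored. $rbp+4 therefore lands in the middle of the 8-byte value saved at $rbp.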

Get the location of environment variable when trying ret2libc exploit

Recently I've been experimenting with ret2libc exploits. I found that we can use an environment variable to store the payload, and that the following program, getenv.c, can help us get the location of the environment variable:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]) {
    char *ptr;
    if (argc < 3) {
        printf("Usage: %s <environment var> <target program>\n", argv[0]);
        exit(0);
    }
    ptr = getenv(argv[1]); /* Get env var location. */
    if (ptr == NULL) { /* Variable is not set; nothing to adjust. */
        printf("%s is not set\n", argv[1]);
        exit(1);
    }
    ptr += (strlen(argv[0]) - strlen(argv[2])); /* Adjust for program name. */
    printf("%s will be at %p\n", argv[1], (void *)ptr);
    return 0;
}
we can use the program this way:
$ ~/getenv FAV ./program
FAV will be at 0xbfffff22
What confuses me is that the ptr value is not used directly, but is adjusted by (strlen(argv[0]) - strlen(argv[2])). Why?
The getenv binary can only guess the address that the environment variable will have in the target program (./program in the example).
The program's path is stored on the stack along with the environment strings, so if the target program's name is longer or shorter than getenv's, the addresses of the environment variables shift by the difference.
This is why you add the length of the getenv program's name to the address and subtract the length of the target program's name.
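
As a worked example of that adjustment, here is a minimal sketch that applies the same arithmetic as getenv.c to plain numbers. The function name predict_env_addr and the sample paths in the usage note are hypothetical; the formula is only the one in the code as posted, and real layouts can differ (for example when the shell also stores the program name in the environment).

```c
#include <stdint.h>
#include <string.h>

/* Given the address observed inside the helper program, estimate the
 * address the same variable would have in the target program by
 * correcting for the difference in program-name length. */
uintptr_t predict_env_addr(uintptr_t observed,
                           const char *helper_name,
                           const char *target_name)
{
    return observed + strlen(helper_name) - strlen(target_name);
}
```

For instance, with an observed address of 0xbfffff30, a helper invoked as /home/u/getenv (14 characters) and a target invoked as ./program (9 characters), the estimate is 0xbfffff30 + 14 - 9 = 0xbfffff35: the shorter target name leaves the variable 5 bytes higher.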

Difficulty in using execve

I am trying to run the word count command (wc) on a file given by an absolute path, "/home/aaa/xxzz.txt". I have closed stdin so that input is taken from the file, but the program doesn't give any output.
Also, if I add a statement after the execve call, it gets executed too. Shouldn't the program exit after execve?
int main()
{
    char *envp[]={NULL };
    int fd=open("/home/aaa/xxzz.txt",O_RDONLY);
    close(0);
    dup(fd);
    char *param[]={ "/bin/wc",NULL } ;
    execve("/bin/wc",param,envp);
}
Probably wc does not live in /bin (except on systems that symlink /bin to /usr/bin); it normally lives in /usr/bin. That would also explain why a statement after execve runs: execve returns only when it fails. If I change the path in your example to /usr/bin/wc, it works for me:
#include <unistd.h>
#include <fcntl.h>
int
main()
{
    char *envp[] = {NULL};
    int fd = open("/home/aaa/xxzz.txt", O_RDONLY);
    close(0);
    dup(fd);
    char *program = "/usr/bin/wc";
    char *param[] = {program,NULL};
    execve(program, param, envp);
}
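
Since execve returns only on failure, checking for that case makes a wrong path visible immediately. The following is a hedged sketch, not the original poster's code: run_with_stdin_from is a hypothetical helper that forks, redirects stdin with dup2 instead of close plus dup, and reports an exec failure through exit status 127 (a common shell convention, chosen here as an assumption).

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `program` with stdin redirected from `input_path`.
 * Returns the child's exit status, 127 if execve failed, or -1 if the
 * file could not be opened or the child could not be created/awaited. */
int run_with_stdin_from(const char *input_path, const char *program)
{
    int fd = open(input_path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        close(fd);
        return -1;
    }

    if (pid == 0) {
        /* Child: make fd the new stdin, then replace the process image. */
        if (dup2(fd, STDIN_FILENO) < 0) {
            perror("dup2");
            _exit(126);
        }
        if (fd != STDIN_FILENO)
            close(fd);

        char *argv[] = { (char *)program, NULL };
        char *envp[] = { NULL };
        execve(program, argv, envp);

        /* Only reached when execve failed, e.g. ENOENT for a wrong path. */
        perror("execve");
        _exit(127);
    }

    /* Parent: the descriptor is no longer needed here. */
    close(fd);
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this helper, a call with a path where wc does not exist prints an "execve: No such file or directory" style message and returns 127 instead of silently falling through to the next statement.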

Computing memory address of the environment within a process

I got the following code from the lecture-slides of a security course.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
extern char shellcode;
#define VULN "./vuln"
int main(int argc, char **argv) {
    void *addr = (char *) 0xc0000000 - 4
                 - (strlen(VULN) + 1)
                 - (strlen(&shellcode) + 1);
    // %p already prints the 0x prefix on glibc
    fprintf(stderr, "Using address: %p\n", addr);
    // some other stuff
    char *params[] = { VULN, buf, NULL };
    char *env[] = { &shellcode, NULL };
    execve(VULN, params, env);
    perror("execve");
}
This code calls a vulnerable program with the shellcode in its environment. The shellcode is some assembly code in an external file that opens a shell and VULN defines the name of the vulnerable program.
My question: how is the shellcode address computed?
The addr variable holds the address of the shellcode (which is part of the environment). Can anyone explain to me how this address is determined? So:
Where does the 0xc0000000 - 4 come from?
Why are the lengths of the shellcode and the program name subtracted from it?
Note that both this code and the vulnerable program are compiled like this:
$ CFLAGS="-m32 -fno-stack-protector -z execstack -mpreferred-stack-boundary=2"
$ cc $CFLAGS -o vuln vuln.c
$ cc $CFLAGS -o exploit exploit.c shellcode.s
$ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
0
So address space randomization is turned off.
I understood that the stack is the first thing inside the process (at the highest memory address), and that it contains, in this order:
The environment data.
argv
argc
the return address of main
the framepointer
local variables in main
...etc...
Constants and global data are not stored on the stack, which is why I also don't understand why the length of the VULN constant influences the address at which the shellcode is placed.
I hope you can clear this up for me :-)
Note that we're working with a Unix system on an Intel x86 architecture.
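
One way to read the formula in the slide code is as a walk down from the top of the stack. The sketch below assumes the classic 32-bit Linux layout with ASLR disabled: user space ends at 0xc0000000, the top 4 bytes hold a null word, the path passed to execve is stored just below that as a NUL-terminated string, and the last environment string sits directly below the path. Those layout details are assumptions not stated in the question, and last_env_string_addr is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Estimate where the last environment string starts, walking down from
 * the top of the user stack under the assumed layout:
 *   [ ... env string \0 ][ exec path \0 ][ 4-byte null word ] <- stack_top
 */
uintptr_t last_env_string_addr(uintptr_t stack_top,
                               const char *exec_path,
                               const char *env_string)
{
    return stack_top
         - 4                          /* assumed trailing null word */
         - (strlen(exec_path) + 1)    /* path string plus its NUL */
         - (strlen(env_string) + 1);  /* env string plus its NUL */
}
```

Under these assumptions, with a stack top of 0xc0000000, the path ./vuln (6 + 1 bytes) and a 4-character environment string (4 + 1 bytes), the result is 0xc0000000 - 4 - 7 - 5 = 0xbffffff0. This would explain why the length of VULN matters: the path string is copied onto the new process's stack by execve, independently of where the constant lives in the exploit program.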

Resources