Stack overflow - odd return address - linux

I am working my way through an example in "The Shellcoder's Handbook". However, it is not going all that well. I am running a Debian 2.6.32-5-686 kernel, i386.
The following walkthrough is to guide the reader through the guts of what is happening when a buffer overflow occurs.
The program:
#include <stdio.h>
#include <string.h>
void return_input(void)
{
char array[30];
gets (array);
printf("%s\n", array);
}
int main ()
{
return_input();
return 0;
}
The aim of the game is to pass "AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDDDD" to the array which, in turn, will overwrite the return address with the excess 'D's.
I compiled like so:
gcc -ggdb -m32 -o test -fno-stack-protector -mpreferred-stack-boundary=2 test.c
I run gdb test and start investigating:
(gdb) disas return_input
Dump of assembler code for function return_input:
0x080483f4 <return_input+0>: push %ebp
0x080483f5 <return_input+1>: mov %esp,%ebp
0x080483f7 <return_input+3>: sub $0x24,%esp
0x080483fa <return_input+6>: lea -0x1e(%ebp),%eax
0x080483fd <return_input+9>: mov %eax,(%esp)
0x08048400 <return_input+12>: call 0x804830c <gets@plt>
0x08048405 <return_input+17>: lea -0x1e(%ebp),%eax
0x08048408 <return_input+20>: mov %eax,(%esp)
0x0804840b <return_input+23>: call 0x804832c <puts@plt>
0x08048410 <return_input+28>: leave
0x08048411 <return_input+29>: ret
End of assembler dump.
(gdb) break *0x08048400
Breakpoint 1 at 0x8048400: file test.c, line 7.
(gdb) break *0x08048411
Breakpoint 2 at 0x8048411: file test.c, line 9.
At this point we have set two breakpoints: one just before the call to gets, and another just before the function returns. Now we run it:
(gdb) run
Starting program: ./test
Breakpoint 1, 0x08048400 in return_input () at test.c:7
7 gets (array);
(gdb) x/20x $esp
0xbffff3ac: 0xbffff3b2 0xb7fca304 0xb7fc9ff4 0x08048440
0xbffff3bc: 0xbffff3d8 0xb7eb75a5 0xb7ff1040 0x0804844b
0xbffff3cc: 0xb7fc9ff4 0xbffff3d8 *0x0804841a* 0xbffff458
0xbffff3dc: 0xb7e9ec76 0x00000001 0xbffff484 0xbffff48c
0xbffff3ec: 0xb7fe18c8 0xbffff440 0xffffffff 0xb7ffeff4
This is what the stack looks like just before the call to gets. I have marked the return address with asterisks (0x0804841a). Let's overwrite this:
(gdb) continue
Continuing.
AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDDDD
AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDDDD
Breakpoint 2, 0x08048411 in return_input () at test.c:9
9 }
(gdb) x/20x 0xbffff3ac
0xbffff3ac: 0xbffff3b2 0x4141a304 0x41414141 0x41414141
0xbffff3bc: 0x42424242 0x42424242 0x43434242 0x43434343
0xbffff3cc: 0x43434343 0x44444444 *0x44444444* 0xbf004444
0xbffff3dc: 0xb7e9ec76 0x00000001 0xbffff484 0xbffff48c
0xbffff3ec: 0xb7fe18c8 0xbffff440 0xffffffff 0xb7ffeff4
The above is what the stack looks like just before returning from the function. As you can see, we've overwritten the return address with those excess 'D's. Result. Let's finish up:
(gdb) x/1i $eip
0x8048411 <return_input+29>: ret
(gdb) stepi
Cannot access memory at address 0x44444448
Um, eh? This 0x44444448 has come from the arse-end of nowhere. Somehow gcc has modified the return address just before we return. Thanks.
Any ideas? Am I correct in assuming gcc has done its own internal checking whether the return address is valid. And if not, it's stuck some crap in it to prevent us from crafting a nasty return address?
Any way around this? I've tried everything here - http://www.madhur.co.in/blog/2011/08/06/protbufferoverflow.html. Same result.

This is the expected result: a page fault. Your program is stopped by the operating system because it is accessing a virtual address that is not mapped to any physical memory.
The message you see is just the debugger notifying you of that fact.

Related

RIP register doesn't understand valid memory address [duplicate]

I want a simple C method to be able to run hex bytecode on a Linux 64 bit machine. Here's the C program that I have:
char code[] = "\x48\x31\xc0";
#include <stdio.h>
int main(int argc, char **argv)
{
int (*func) ();
func = (int (*)()) code;
(int)(*func)();
printf("%s\n","DONE");
}
The code that I am trying to run ("\x48\x31\xc0") I obtained by writing this simple assembly program (it's not supposed to really do anything)
.text
.globl _start
_start:
xorq %rax, %rax
and then compiling and objdump-ing it to obtain the bytecode.
However, when I run my C program I get a segmentation fault. Any ideas?
Machine code has to be in an executable page. Your char code[] is in the read+write data section, without exec permission, so the code cannot be executed from there.
Here is a simple example of allocating an executable page with mmap:
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
int main ()
{
char code[] = {
0x8D, 0x04, 0x37, // lea eax,[rdi+rsi]
0xC3 // ret
};
int (*sum) (int, int) = NULL;
// allocate executable buffer
sum = mmap (0, sizeof(code), PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
// copy code to buffer
memcpy (sum, code, sizeof(code));
// doesn't actually flush cache on x86, but ensure memcpy isn't
// optimized away as a dead store.
__builtin___clear_cache ((char*)sum, (char*)sum + sizeof(code)); // GNU C; covers exactly the bytes copied
// run code
int a = 2;
int b = 3;
int c = sum (a, b);
printf ("%d + %d = %d\n", a, b, c);
}
See another answer on this question for details about __builtin___clear_cache.
Until recent Linux kernel versions (sometime before 5.4), you could simply compile with gcc -z execstack - that would make all pages executable, including read-only data (.rodata), and read-write data (.data) where char code[] = "..." goes.
Now -z execstack only applies to the actual stack, so it currently works only for non-const local arrays. i.e. move char code[] = ... into main.
See Linux default behavior against `.data` section for the kernel change, and Unexpected exec permission from mmap when assembly files included in the project for the old behaviour: enabling Linux's READ_IMPLIES_EXEC process for that program. (In Linux 5.4, that Q&A shows you'd only get READ_IMPLIES_EXEC for a missing PT_GNU_STACK, like a really old binary; modern GCC -z execstack would set PT_GNU_STACK = RWX metadata in the executable, which Linux 5.4 would handle as making only the stack itself executable. At some point before that, PT_GNU_STACK = RWX did result in READ_IMPLIES_EXEC.)
The other option is to make system calls at runtime to copy into an executable page, or change permissions on the page it's in. That's still more complicated than using a local array to get GCC to copy code into executable stack memory.
(I don't know if there's an easy way to enable READ_IMPLIES_EXEC under modern kernels. Having no GNU-stack attribute at all in an ELF binary does that for 32-bit code, but not 64-bit.)
Yet another option is __attribute__((section(".text"))) const char code[] = ...;
Working example: https://godbolt.org/z/draGeh.
If you need the array to be writeable, e.g. for shellcode that inserts some zeros into strings, you could maybe link with ld -N. But probably best to use -z execstack and a local array.
Two problems in the question:
1. Exec permission on the page, because you used an array that will go in the noexec read+write .data section.
2. Your machine code doesn't end with a ret instruction, so even if it did run, execution would fall into whatever was next in memory instead of returning.
And BTW, the REX prefix is totally redundant. "\x31\xc0" xor eax,eax has exactly the same effect as xor rax,rax.
You need the page containing the machine code to have execute permission. x86-64 page tables have an execute-permission bit that is separate from read permission, unlike legacy 386 page tables.
The easiest way to get static arrays to be in read+exec memory was to compile with gcc -z execstack. (Used to make the stack and other sections executable, now only the stack).
Until recently (2018 or 2019), the standard toolchain (binutils ld) would put section .rodata into the same ELF segment as .text, so they'd both have read+exec permission. Thus using const char code[] = "..."; was sufficient for executing manually-specified bytes as data, without execstack.
But on my Arch Linux system with GNU ld (GNU Binutils) 2.31.1, that's no longer the case. readelf -a shows that the .rodata section went into an ELF segment with .eh_frame_hdr and .eh_frame, and it only has Read permission. .text goes in a segment with Read + Exec, and .data goes in a segment with Read + Write (along with the .got and .got.plt). (What's the difference of section and segment in ELF file format)
I assume this change is to make ROP and Spectre attacks harder by not having read-only data in executable pages where sequences of useful bytes could be used as "gadgets" that end with the bytes for a ret or jmp reg instruction.
// TODO: use char code[] = {...} inside main, with -z execstack, for current Linux
// Broken on recent Linux, used to work without execstack.
#include <stdio.h>
// can be non-const if you use gcc -z execstack. static is also optional
static const char code[] = {
0x8D, 0x04, 0x37, // lea eax,[rdi+rsi] // retval = a+b;
0xC3 // ret
};
static const char ret0_code[] = "\x31\xc0\xc3"; // xor eax,eax ; ret
// the compiler will append a 0 byte to terminate the C string,
// but that's fine. It's after the ret.
int main () {
// void* cast is easier to type than a cast to function pointer,
// and in C can be assigned to any other pointer type. (not C++)
int (*sum) (int, int) = (void*)code;
int (*ret0)(void) = (void*)ret0_code;
// run code
int c = sum (2, 3);
return ret0();
}
On older Linux systems: gcc -O3 shellcode.c && ./a.out (Works because of const on global/static arrays)
On Linux before 5.5 (or so) gcc -O3 -z execstack shellcode.c && ./a.out (works because of -zexecstack regardless of where your machine code is stored). Fun fact: gcc allows -zexecstack with no space, but clang only accepts clang -z execstack.
These also work on Windows, where read-only data goes in .rdata instead of .rodata.
The compiler-generated main looks like this (from objdump -drwC -Mintel). You can run it inside gdb and set breakpoints on code and ret0_code
(I actually used gcc -no-pie -O3 -zexecstack shellcode.c, hence the addresses near 401000.)
0000000000401020 <main>:
401020: 48 83 ec 08 sub rsp,0x8 # stack aligned by 16 before a call
401024: be 03 00 00 00 mov esi,0x3
401029: bf 02 00 00 00 mov edi,0x2 # 2 args
40102e: e8 d5 0f 00 00 call 402008 <code> # note the target address in the next page
401033: 48 83 c4 08 add rsp,0x8
401037: e9 c8 0f 00 00 jmp 402004 <ret0_code> # optimized tailcall
Or use system calls to modify page permissions
Instead of compiling with gcc -zexecstack, you can instead use mmap(PROT_EXEC) to allocate new executable pages, or mprotect(PROT_EXEC) to change existing pages to executable. (Including pages holding static data.) You also typically want at least PROT_READ and sometimes PROT_WRITE, of course.
Using mprotect on a static array means you're still executing the code from a known location, maybe making it easier to set a breakpoint on it.
On Windows you can use VirtualAlloc or VirtualProtect.
Telling the compiler that data is executed as code
Normally compilers like GCC assume that data and code are separate. This is like type-based strict aliasing, but even using char* doesn't make it well-defined to store into a buffer and then call that buffer as a function pointer.
In GNU C, you also need to use __builtin___clear_cache(buf, buf + len) after writing machine code bytes to a buffer, because the optimizer doesn't treat dereferencing a function pointer as reading bytes from that address. Dead-store elimination can remove the stores of machine code bytes into a buffer, if the compiler proves that the store isn't read as data by anything. https://codegolf.stackexchange.com/questions/160100/the-repetitive-byte-counter/160236#160236 and https://godbolt.org/g/pGXn3B has an example where gcc really does do this optimization, because gcc "knows about" malloc.
(And on non-x86 architectures where I-cache isn't coherent with D-cache, it actually will do any necessary cache syncing. On x86 it's purely a compile-time optimization blocker and doesn't expand to any instructions itself.)
Re: the weird name with three underscores: It's the usual __builtin_name pattern, but name is __clear_cache.
My edit on @AntoineMathys's answer added this.
In practice GCC/clang don't "know about" mmap(MAP_ANONYMOUS) the way they know about malloc. So in practice the optimizer will assume that the memcpy into the buffer might be read as data by the non-inline function call through the function pointer, even without __builtin___clear_cache(). (Unless you declared the function type as __attribute__((const)).)
On x86, where I-cache is coherent with data caches, having the stores happen in asm before the call is sufficient for correctness. On other ISAs, __builtin___clear_cache() will actually emit special instructions as well as ensuring the right compile-time ordering.
It's good practice to include it when copying code into a buffer because it doesn't cost performance, and stops hypothetical future compilers from breaking your code. (e.g. if they do understand that mmap(MAP_ANONYMOUS) gives newly-allocated anonymous memory that nothing else has a pointer to, just like malloc.)
With current GCC, I was able to provoke GCC into really doing an optimization we don't want by using __attribute__((const)) to tell the optimizer sum() is a pure function (that only reads its args, not global memory). GCC then knows sum() can't read the result of the memcpy as data.
With another memcpy into the same buffer after the call, GCC does dead-store elimination into just the 2nd store after the call. This results in no store before the first call so it executes the 00 00 add [rax], al bytes, segfaulting.
// demo of a problem on x86 when not using __builtin___clear_cache
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
int main ()
{
char code[] = {
0x8D, 0x04, 0x37, // lea eax,[rdi+rsi]
0xC3 // ret
};
__attribute__((const)) int (*sum) (int, int) = NULL;
// copy code to executable buffer
sum = mmap (0,sizeof(code),PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANON,-1,0);
memcpy (sum, code, sizeof(code));
//__builtin___clear_cache(sum, sum + sizeof(code));
int c = sum (2, 3);
//printf ("%d + %d = %d\n", a, b, c);
memcpy(sum, (char[]){0x31, 0xc0, 0xc3, 0}, 4); // xor-zero eax, ret, padding for a dword store
//__builtin___clear_cache(sum, sum + 4);
return sum(2,3);
}
Compiled on the Godbolt compiler explorer with GCC9.2 -O3
main:
push rbx
xor r9d, r9d
mov r8d, -1
mov ecx, 34
mov edx, 7
mov esi, 4
xor edi, edi
sub rsp, 16
call mmap
mov esi, 3
mov edi, 2
mov rbx, rax
call rax # call before store
mov DWORD PTR [rbx], 12828721 # 0xC3C031 = xor-zero eax, ret
add rsp, 16
pop rbx
ret # no 2nd call, CSEd away because const and same args
Passing different args would have gotten another call reg, but even with __builtin___clear_cache the two sum(2,3) calls can CSE. __attribute__((const)) doesn't respect changes to the machine code of a function. Don't do it. It's safe if you're going to JIT the function once and then call many times, though.
Uncommenting the first __clear_cache results in
mov DWORD PTR [rax], -1019804531 # lea; ret
call rax
mov DWORD PTR [rbx], 12828721 # xor-zero; ret
... still CSE and use the RAX return value
The first store is there because of __clear_cache and the sum(2,3) call. (Removing the first sum(2,3) call does let dead-store elimination happen across the __clear_cache.)
The second store is there because the side-effect on the buffer returned by mmap is assumed to be important, and that's the final value main leaves.
Godbolt's ./a.out option to run the program still seems to always fail (exit status of 255); maybe it sandboxes JITing? It works on my desktop with __clear_cache and crashes without.
mprotect on a page holding existing C variables.
You can also give a single existing page read+write+exec permission. This is an alternative to compiling with -z execstack
You don't need __clear_cache on a page holding read-only C variables because there's no store to optimize away. You would still need it for initializing a local buffer (on the stack). Otherwise GCC will optimize away the initializer for this private buffer that a non-inline function call definitely doesn't have a pointer to. (Escape analysis). It doesn't consider the possibility that the buffer might hold the machine code for the function unless you tell it that via __builtin___clear_cache.
#include <stdio.h>
#include <sys/mman.h>
#include <stdint.h>
// can be non-const if you want, we're using mprotect
static const char code[] = {
0x8D, 0x04, 0x37, // lea eax,[rdi+rsi] // retval = a+b;
0xC3 // ret
};
static const char ret0_code[] = "\x31\xc0\xc3";
int main () {
// void* cast is easier to type than a cast to function pointer,
// and in C can be assigned to any other pointer type. (not C++)
int (*sum) (int, int) = (void*)code;
int (*ret0)(void) = (void*)ret0_code;
// hard-coding x86's 4k page size for simplicity.
// also assume that `code` doesn't span a page boundary and that ret0_code is in the same page.
uintptr_t page = (uintptr_t)code & -4096ULL; // round down to the start of the 4k page
mprotect((void*)page, 4096, PROT_READ|PROT_EXEC|PROT_WRITE); // +write in case the page holds any writeable C vars that would crash later code.
// run code
int c = sum (2, 3);
return ret0();
}
I used PROT_READ|PROT_EXEC|PROT_WRITE in this example so it works regardless of where your variable is. If it was a local on the stack and you left out PROT_WRITE, call would fail after making the stack read only when it tried to push a return address.
Also, PROT_WRITE lets you test shellcode that self-modifies, e.g. to edit zeros into its own machine code, or other bytes it was avoiding.
$ gcc -O3 shellcode.c # without -z execstack
$ ./a.out
$ echo $?
0
$ strace ./a.out
...
mprotect(0x55605aa3f000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC) = 0
exit_group(0) = ?
+++ exited with 0 +++
If I comment out the mprotect, it does segfault with recent versions of GNU Binutils ld which no longer put read-only constant data into the same ELF segment as the .text section.
If I did something like ret0_code[2] = 0xc3;, I would need __builtin___clear_cache(ret0_code+2, ret0_code+3) after that to make sure the store wasn't optimized away (the end pointer is exclusive), but if I don't modify the static arrays then it's not needed after mprotect. It is needed after mmap+memcpy or manual stores, because we want to execute bytes that have been written in C (with memcpy).
You need to include the assembly in-line via a special compiler directive so that it'll properly end up in a code segment. See this guide, for example: http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html
Your machine code may be all right, but your CPU objects.
Modern CPUs manage memory in segments. In normal operation, the operating system loads a new program into a program-text segment and sets up a stack in a data segment. The operating system tells the CPU never to run code in a data segment. Your code is in code[], in a data segment. Thus the segfault.
This will take some effort.
Your code variable is stored in the .data section of your executable:
$ readelf -p .data exploit
String dump of section '.data':
[ 10] H1À
H1À is the value of your variable.
The .data section is not executable:
$ readelf -S exploit
There are 30 section headers, starting at offset 0x1150:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[...]
[24] .data PROGBITS 0000000000601010 00001010
0000000000000014 0000000000000000 WA 0 0 8
All 64-bit processors I'm familiar with support non-executable pages natively in the pagetables. Most newer 32-bit processors (the ones that support PAE) provide enough extra space in their pagetables for the operating system to emulate hardware non-executable pages. You'll need to run either an ancient OS or an ancient processor to get a .data section marked executable.
Because these are just flags in the executable, you ought to be able to set the X flag through some other mechanism, but I don't know how to do so. And your OS might not even let you have pages that are both writable and executable.
You may need to set the page executable before you may call it.
On MS Windows, see the VirtualProtect function.
URL: http://msdn.microsoft.com/en-us/library/windows/desktop/aa366898%28v=vs.85%29.aspx
Sorry, I couldn't follow above examples which are complicated.
So, I created an elegant solution for executing hex code from C.
Basically, you could use asm and .word keywords to place your instructions in hex format.
See below example:
asm volatile(".rept 1024\n"
CNOP
".endr\n");
where CNOP is defined as below:
#define CNOP ".word 0x00010001 \n"
The c.nop instruction was not supported by my current assembler, so I defined CNOP as the hex encoding of c.nop, written with the .word directive (which I knew the assembler accepted), and used it inside asm.
.rept <NUM> ... .endr repeats the enclosed instructions NUM times.
This solution is working and verified.

fp equal to sp at startup but when copied onto enlarged stack changes to zero - why?

I'm learning 32-bit ARM assembly on a Raspberry Pi with Raspbian. I wrote
the following code:
@ Define my Raspberry Pi
.cpu cortex-a53
.fpu neon-fp-armv8
.syntax unified @ modern syntax
.text
.align 2
.global main
.type main, %function
main:
mov r0, 1 @ line added only for breakpoint purposes
sub sp, sp, 8 @ space for fp, lr
str fp, [sp, 0] @ save fp
str lr, [sp, 4] @ and lr
add fp, sp, 4 @ set our frame pointer
Build with gcc:
gcc -g test.s -o test
Use gdb to check values of fp and sp in lines 13 and 16 and
dereference them:
$ gdb ./test
(gdb) break 13
Breakpoint 3 at 0x103d4: file test.s, line 13.
(gdb) break 16
Breakpoint 4 at 0x103e0: file test.s, line 16.
(gdb) run
Starting program: /home/pi/assembly/nine/bob/test
Breakpoint 3, main () at test.s:13
13 sub sp, sp, 8 @ space for fp, lr
(gdb) print {$sp, $fp}
$1 = {0x7efffae8, 0x7efffae8}
(gdb) x $sp
0x7efffae8: 0x76f9e000
(gdb) x $fp
0x7efffae8: 0x76f9e000
(gdb) continue
Continuing.
Breakpoint 4, main () at test.s:16
16 add fp, sp, 4 @ set our frame pointer
(gdb) print {$sp, $fp}
$2 = {0x7efffae0, 0x7efffae0}
(gdb) x $sp
0x7efffae0: 0x00000000
(gdb) x $fp
0x7efffae0: 0x00000000
As you see fp is equal to sp at startup and non-zero:
(gdb) print {$sp, $fp}
$1 = {0x7efffae8, 0x7efffae8}
(gdb) x $sp
0x7efffae8: 0x76f9e000
(gdb) x $fp
0x7efffae8: 0x76f9e000
but when copied onto the enlarged stack it changes to zero
(gdb) x $fp
0x7efffae0: 0x00000000
Why does it change to zero? Why does it change value at all? Is
underlying implementation somehow linking values of fp and sp so
that when sp is moved down to the initialized memory that might be
all zeroes fp is changed as well? I only found
this:
fp
Is the frame pointer register. In the obsolete APCS variants that
use fp, this register contains either zero, or a pointer to the
most recently created stack backtrace data structure. As with the
stack pointer, the frame pointer must be preserved, but in
handwritten code it does not need to be available at every
instant. However, it must be valid whenever any strictly
conforming function is called. fp must always be preserved.
This
comment
says that lr is stored as the first element on the stack but it's
definitely not - it stays the same and is not zero:
(gdb) print {$sp, $fp, $lr}
$1 = {0x7efffae8, 0x7efffae8, 0x76e6b718 <__libc_start_main+268>}
and after sp changes:
(gdb) x/2xw $sp
0x7efffae0: 0x00000000 0x76e6b718
Ok, I'm answering myself - this happens in gdb/arm-tdep.c in GDB source code:
/* The frame size is just the distance from the frame register
to the original stack pointer. */
if (pv_is_register (regs[ARM_FP_REGNUM], ARM_SP_REGNUM))
{
/* Frame pointer is fp. */
framereg = ARM_FP_REGNUM;
framesize = -regs[ARM_FP_REGNUM].k;
}
else
{
/* Try the stack pointer... this is a bit desperate. */
framereg = ARM_SP_REGNUM;
framesize = -regs[ARM_SP_REGNUM].k;
}

x86 Linux ELF Loader Troubles

I'm trying to write an ELF executable loader for x86-64 Linux, similar to this, which was implemented on ARM. Chris Rossbach's advanced OS class includes a lab that does basically what I want to do. My goal is to load a simple (statically-linked) "hello world" type binary into my process's memory and run it without execveing. I have successfully mmap'd the ELF file, set up the stack, and jumped to the ELF's entry point (_start).
// put ELF file into memory. This is just one line of a complex
// for() loop that loads the binary from a file.
mmap((void*)program_header.p_vaddr, program_header.p_memsz, map, MAP_PRIVATE|MAP_FIXED, elffd, program_header.p_offset);
newstack = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0); // Map a page for the stack
if((long)newstack < 0) {
fprintf(stderr, "ERROR: mmap returned error when allocating stack, %s\n", strerror(errno));
exit(1);
}
topstack = (unsigned long*)((unsigned char*)newstack+4096); // Top of new stack
*((unsigned long*)topstack-1) = 0; // Set up the stack
*((unsigned long*)topstack-2) = 0; // with argc, argv[], etc.
*((unsigned long*)topstack-3) = 0;
*((unsigned long*)topstack-4) = argv[1];
*((unsigned long*)topstack-5) = 1;
asm("mov %0,%%rsp\n" // Install new stack pointer
"xor %%rax, %%rax\n" // Zero registers
"xor %%rbx, %%rbx\n"
"xor %%rcx, %%rcx\n"
"xor %%rdx, %%rdx\n"
"xor %%rsi, %%rsi\n"
"xor %%rdi, %%rdi\n"
"xor %%r8, %%r8\n"
"xor %%r9, %%r9\n"
"xor %%r10, %%r10\n"
"xor %%r11, %%r11\n"
"xor %%r12, %%r12\n"
"xor %%r13, %%r13\n"
"xor %%r14, %%r14\n"
:
: "r"(topstack-5)
:"rax", "rbx", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11", "r12", "r13", "r14");
asm("push %%rax\n"
"pop %%rax\n"
:
:
: "rax");
asm("mov %0,%%rax\n" // Jump to the entry point of the loaded ELF file
"jmp *%%rax\n"
:
: "r"(jump_target)
: );
I then step through this code in gdb. I've pasted the first few instructions of the startup code below. Everything works great until the first push instruction (starred). The push causes a segfault.
0x60026000 xor %ebp,%ebp
0x60026002 mov %rdx,%r9
0x60026005 pop %rsi
0x60026006 mov %rsp,%rdx
0x60026009 and $0xfffffffffffffff0,%rsp
0x6002600d * push %rax
0x6002600e push %rsp
0x6002600f mov $0x605f4990,%r8
I have tried:
1. Using the stack from the original process.
2. mmaping a new stack (as in the above code). (1) and (2) both cause segfaults.
3. pushing and popping to/from the stack before jmping to the loaded ELF file. This does not cause a segfault.
4. Changing the protection flags for the stack in the second mmap to PROT_READ | PROT_WRITE | PROT_EXEC. This doesn't make a difference.
I suspect this maybe has something to do with the segment descriptors (maybe?). It seems like the code from the ELF file that I'm loading does not have write access to the stack segment, no matter where it is located. I have not tried to modify the segment descriptor for the newly loaded binary or change the architectural segment registers. Is this necessary? Does anybody know how to fix this?
It turned out that when I was stepping through the loaded code in gdb, the debugger would consistently blow by the first push instruction when I typed nexti and instead continue execution. It was not in fact the push instruction that was causing the segfault but a much later instruction in the C library start code. The problem was caused by a failed call to mmap in the initial binary load that I didn't error check.
Regarding gdb randomly deciding to continue execution instead of stepping: this can be fixed by loading the symbols from the target executable after jumping to the newly loaded executable.

linux syscall using spinlock returning value to userspace

I'm currently struggling with the correct implementation of a kernel spinlock in combination with a return statement which should return a value to userspace. I implemented a kernel syscall 'sys_kernel_entropy_is_recording' which should return the value of a kernel variable 'is_kernel_entropy_recording':
asmlinkage bool sys_kernel_entropy_is_recording(void)
{
spin_lock(&entropy_analysis_lock);
return is_kernel_entropy_recording;
spin_unlock(&entropy_analysis_lock);
}
At this point arise two questions:
Q1: Is this implementation correct at all, meaning will the correct value of 'is_kernel_entropy_recording' be returned to userspace and afterwards the spinlock be released?
My concerns are:
a) is it allowed to return a value from kernelspace to userspace this way at all?
b) the return statement is located before the spin_unlock statement, hence will spin_unlock be even called?
Q2: To answer these questions myself I disassembled the compiled .o file, but it looks to me as if the spin_lock/spin_unlock calls are completely ignored by the compiler: it just moves the value of 'is_kernel_entropy_recording' into eax and calls ret (I'm not sure about the line 'callq 0xa5'):
(gdb) disassemble /m sys_kernel_entropy_is_recording
Dump of assembler code for function sys_kernel_entropy_is_recording:
49 {
0x00000000000000a0 <+0>: callq 0xa5 <sys_kernel_entropy_is_recording+5>
0x00000000000000a5 <+5>: push %rbp
0x00000000000000ad <+13>: mov %rsp,%rbp
50 spin_lock(&entropy_analysis_lock);
51 return is_kernel_entropy_recording;
52 spin_unlock(&entropy_analysis_lock);
53 }
0x00000000000000b5 <+21>: movzbl 0x0(%rip),%eax # 0xbc <sys_kernel_entropy_is_recording+28>
0x00000000000000bc <+28>: pop %rbp
0x00000000000000bd <+29>: retq
Hence I guess my use of the spinlock is not correct. Could someone please suggest an appropriate approach?
Thanks a lot in advance!
It is prohibited to return from a syscall with a spinlock held. And, as usual in C code, no statement after the return statement is executed.
Common practice is to save value obtained under lock into local variable, and return value of this variable after unlock:
bool ret;
spin_lock(&entropy_analysis_lock);
ret = is_kernel_entropy_recording;
spin_unlock(&entropy_analysis_lock);
return ret;

Arglist differs from frame address

My program failed with segfault attempting to write "1" to a string.
(gdb) info frame
Stack level 0, frame at 0xb6b3c040:
eip = 0xb7877cdf; saved eip 0xb7858eae
called by frame at 0xb6b3cc50
Arglist at 0x91a1649, args:
Locals at 0x91a1649, Previous frame's sp is 0xb6b3c040
Saved registers:
ebx at 0xb6b3c02c, ebp at 0xb6b3c038, esi at 0xb6b3c030, edi at 0xb6b3c034, eip at 0xb6b3c03c
(gdb) bt
#0 0xb7877cdf in ?? () from /lib/i386-linux-gnu/libc.so.6
#1 0xb7858eae in vfprintf () from /lib/i386-linux-gnu/libc.so.6
#2 0xb787d91b in vsnprintf () from /lib/i386-linux-gnu/libc.so.6
#3 0x08ea7d7e in __gnu_cxx::__to_xstring<std::string, char> (__convf=0x85a2a50 <vsnprintf@plt>, __n=16, __fmt=0x91a1649 "%u") at /usr/include/c++/4.7/ext/string_conversions.h:95
#4 0x08ea6452 in std::to_string (__val=1) at /usr/include/c++/4.7/bits/basic_string.h:2871
...
I noticed that, according to gdb, the Arglist is not on the stack. How could this happen? As far as I know, there is one calling convention in *nix: arguments are pushed onto the stack and the caller cleans up the stack. I went up and down through the backtrace, and everywhere else the arglist was on the stack.
You could be crashing in an assembly language routine that does not follow standard calling conventions and/or have symbolic information available.
Likely, the core issue is higher up than frame 0 anyway.
