Disable randomization of memory addresses - linux

I'm trying to debug a binary that uses a lot of pointers. To spot errors quickly, I sometimes print the addresses of objects along with their values. However, the object addresses are randomized between runs, which defeats the purpose of this quick check.
Is there a way to disable this, temporarily or permanently, so that I get the same addresses every time I run the program?
Oops. OS is Linux fsttcs1 2.6.32-28-generic #55-Ubuntu SMP Mon Jan 10 23:42:43 UTC 2011 x86_64 GNU/Linux

On Ubuntu, it can be disabled system-wide (as root) with...
echo 0 > /proc/sys/kernel/randomize_va_space
On Windows, this post might be of some help...
http://blog.didierstevens.com/2007/11/20/quickpost-another-funny-vista-trick-with-aslr/

To temporarily disable ASLR for a particular program you can always issue the following (no need for sudo)
setarch `uname -m` -R ./yourProgram

You can also do this programmatically from C source before a UNIX exec.
If you take a look at the sources for setarch (here's one source):
http://code.metager.de/source/xref/linux/utils/util-linux/sys-utils/setarch.c
You can see it boils down to a system call (syscall) or a function call (depending on what your system defines). From setarch.c:
#ifndef HAVE_PERSONALITY
# include <syscall.h>
# define personality(pers) ((long)syscall(SYS_personality, pers))
#endif
On my CentOS 6 64-bit system, it looks like it uses a function (which probably calls the self-same syscall above). Take a look at this snippet from the include file /usr/include/sys/personality.h (referenced as <sys/personality.h> in the setarch source code):
/* Set different ABIs (personalities). */
extern int personality (unsigned long int __persona) __THROW;
What it boils down to, is that you can, from C code, call and set the personality to use ADDR_NO_RANDOMIZE and then exec (just like setarch does).
#include <sys/personality.h>
#include <unistd.h>
#ifndef HAVE_PERSONALITY
# include <syscall.h>
# define personality(pers) ((long)syscall(SYS_personality, pers))
#endif
...
void mycode(char **argv)
{
    // If requested, turn off the address rand feature right before execing
    if (MyGlobalVar_Turn_Address_Randomization_Off) {
        personality(ADDR_NO_RANDOMIZE);
    }
    execvp(argv[0], argv); // ... from setarch.
}
It's pretty obvious you can't turn address randomization off for mappings that already exist in the process you are in (grin: unless maybe dynamic loading), so this only affects forks and execs later. The personality flags are inherited by child processes and kept across exec, which is exactly what setarch relies on.
Anyway, that's how you can programmatically turn off address randomization in C source code. This may be your only solution if you don't want to force a user to intervene manually and start up with setarch or one of the other solutions listed earlier.
Before you complain about security issues in turning this off, some shared memory libraries/tools (such as PickingTools shared memory and some IBM databases) need to be able to turn off randomization of memory addresses.
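To check whether the flag actually took effect in the exec'd program, a minimal sketch along these lines may help. It assumes Linux with glibc's <sys/personality.h>; the function names are made up for illustration. It relies on personality(0xffffffff), which reads the current persona without changing it.

```c
#include <stdio.h>
#include <sys/personality.h>

/* Returns 1 if the given persona value has ADDR_NO_RANDOMIZE set, else 0. */
int persona_has_no_randomize(unsigned long persona)
{
    return (persona & ADDR_NO_RANDOMIZE) != 0;
}

/* Reads the calling process's persona without modifying it.
   Returns 1 if randomization is disabled via the persona, 0 if not,
   and -1 if the query itself failed. */
int aslr_disabled_for_self(void)
{
    int persona = personality(0xffffffff);
    if (persona == -1) {
        perror("personality");
        return -1;
    }
    return persona_has_no_randomize((unsigned long)persona);
}
```

A program that calls aslr_disabled_for_self() from main should see 1 when started through setarch `uname -m` -R, and 0 when started normally. Note that this only inspects the per-process flag; it says nothing about the system-wide randomize_va_space setting.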

Related

Cannot Make Code Segment Execute-Only (Not Readable)

I'm trying to make the Code Segment Execute-Only (Not Readable).
But I FAILED after I tried everything the Manual told me to. Here is what I did to make the code segment unreadable.
>uname -a
Linux Emmet-VM 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:18:00 UTC 2015 i686 i686 i686 GNU/Linux
>lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty
First, I've found this in "Intel(R)64 and IA-32 Architectures Software Developer's Manual(Combined Volumes 1,2A,2B,2C,2D,3A,3B,3C and 3D)":
"Set read-enable bit to enable read" and "Segment Types". (Sorry, I'm still not allowed to embed pictures in my posts, so links instead.)
So, I guess if I change %CS, and let it point to a Segment Descriptor which has read-enable bit set as 0, I should make the Code Segment not readable.
Then, I use the code below to insert a new Segment into LDT.entry[2], and I do set the code segment type to 8, aka 1000B, which means "Execute-Only" according to "Segment Types" link posted above:
typedef struct user_desc UserDesc;
UserDesc *seg = (UserDesc*)malloc(sizeof(UserDesc));
seg->entry_number = 0x2;
seg->base_addr = 0x00000000;
seg->limit = 0xffffffff;
seg->seg_32bit = 0x1;
seg->contents = 0x02;
seg->read_exec_only = 0x1;
seg->limit_in_pages = 0x1;
seg->seg_not_present = 0x0;
seg->useable = 0x0;
int ret = modify_ldt(1, (void*)seg, sizeof(UserDesc));
After that, I change %CS to 0x17(00010111B, meaning the entry 2 in LDT) with ljmp.
asm("ljmp $0x17, $reload_cs\n"
"reload_cs:");
But, even with this, I still can read the byte code in code segment:
void foo() {printf("foo\n");}
void test(){
char* a = (char*)foo;
printf("0x%x\n", (unsigned int)a[0]);// This prints 0x55
}
If the code segment is unreadable, the code above should raise a segmentation fault. But it prints 0x55 successfully.
So, I wonder, is there any mistake I've made during my test?
Or is this just a mistake in Intel's Manual?
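As a sanity check on the numbers used in the question (selector 0x17 and descriptor type 8), here is a small sketch that decodes them according to the layouts in the Intel manual. The helper names are invented for this example.

```c
#include <stdint.h>

/* Fields of an x86 segment selector. */
unsigned selector_index(uint16_t sel)  { return sel >> 3; }        /* bits 15..3: table index */
unsigned selector_is_ldt(uint16_t sel) { return (sel >> 2) & 1; }  /* bit 2: TI, 1 = LDT */
unsigned selector_rpl(uint16_t sel)    { return sel & 3; }         /* bits 1..0: requested privilege level */

/* 4-bit type field of a code/data segment descriptor. */
int type_is_code(unsigned type)          { return (type & 0x8) != 0; }  /* bit 3: 1 = code segment */
int code_type_is_readable(unsigned type) { return (type & 0x2) != 0; }  /* bit 1: R flag for code segments */
```

With these, 0x17 decodes to index 2, LDT, RPL 3, and type 8 decodes to a code segment with the readable bit clear, so the values themselves match what the question intended.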
You are still accessing the code through DS when doing (unsigned int)a[0].
Write-only segments don't exist (and if they did, it would be a bad idea to make DS write-only).
If you did everything correctly, mov eax, [cs:...] (NASM syntax) will fail (but mov eax, [ds:...] won't).
After a quick glance at the Intel manual, execute-only pages do not seem to exist (at least not directly), so using mprotect with PROT_EXEC may be of limited use (the code would still be readable).
Worth a shot, though.
There are three ways around this.
None of which can be implemented without the aid of the OS though, so they are more theoretical than practical.
Protection keys
If the CPU supports them (See section 4.6.2 of the Intel manual 3), they introduce an asymmetry in how code and data are read.
Reading data is subject to the key protection.
Fetching however is not:
How a linear address’s protection key controls access to the address depends on the mode of a linear address:
A linear address’s protection controls only data accesses to the address. It does not in any way affect instructions fetches from the address.
So it's possible to set a protection key on the code pages that your application doesn't have access rights for in its PKRU register.
You would still be allowed to execute the code but not to read it.
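To make the PKRU part concrete: the register holds two bits per protection key, an access-disable (AD) bit at position 2*i and a write-disable (WD) bit at position 2*i+1. The sketch below only does the bit arithmetic with hypothetical helper names; it does not touch the real register (on Linux that would go through the OS, e.g. pkey_alloc and pkey_mprotect).

```c
#include <stdint.h>

/* Bit positions inside PKRU for protection key `key` (0..15). */
uint32_t pkru_access_disable_bit(unsigned key) { return UINT32_C(1) << (2 * key); }
uint32_t pkru_write_disable_bit(unsigned key)  { return UINT32_C(1) << (2 * key + 1); }

/* Returns a PKRU value in which data accesses to pages tagged with `key`
   are blocked. Instruction fetches are not affected by PKRU at all. */
uint32_t pkru_deny_data_access(uint32_t pkru, unsigned key)
{
    return pkru | pkru_access_disable_bit(key);
}

/* Returns 1 if data reads through `key` are permitted by this PKRU value. */
int pkru_data_access_allowed(uint32_t pkru, unsigned key)
{
    return (pkru & pkru_access_disable_bit(key)) == 0;
}
```

Tagging the code pages with, say, key 1 and setting its AD bit is what gives the "executable but not readable" effect described above.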
Desync the TLBs
If your application has never touched the code pages for reading, they will occupy some entries in the ITLB but not in the DTLB.
If the OS then maps them as supervisor-only without flushing the TLBs, accessing them as data is prevented (since no DTLB entries for those pages are present, forcing a page walk), but thanks to the ITLB the code can still be fetched.
This is more involved in practice, as code spans multiple pages and is actually read as data by the OS.
EPT
The Extended Page Tables are used during virtualization to translate guest physical addresses to host physical addresses.
Though they seem like just another level of indirection, they have separate Read, Write and Execute control bits.
A paper has been written about preventing the leakage of the kernel code (to counteract dynamic Return Oriented Programming).

How can I compile C code to get a bare-metal skeleton of a minimal RISC-V assembly program?

I have the following simple C code:
void main(){
int A = 333;
int B=244;
int sum;
sum = A + B;
}
When I compile this with
$riscv64-unknown-elf-gcc code.c -o code.o
If I want to see the assembly code I use
$riscv64-unknown-elf-objdump -d code.o
But when I look at the disassembly, I see a lot of generated code, which I assume is there for proxy kernel support (I am new to RISC-V). I do not want proxy kernel support in this code, because the idea is to run only this simple C code on an FPGA.
I read that RISC-V provides three types of compilation: bare-metal mode, newlib with the proxy kernel, and RISC-V Linux. From what I have found so far, bare-metal mode is the one I need, because I want minimal assembly without support for an operating system or the proxy kernel. System-call support is not required.
However, I have not yet found out how to compile C code to get a skeleton of a minimal RISC-V assembly program. How can I compile the C code above in bare-metal mode, or otherwise get a skeleton of minimal RISC-V assembly code?
Warning: this answer is somewhat out-of-date as of the latest RISC-V Privileged Spec v1.9, which includes the removal of the tohost Control/Status Register (CSR), which was a part of the non-standard Host-Target Interface (HTIF) which has since been removed. The current (as of 2016 Sep) riscv-tests instead perform a memory-mapped store to a tohost memory location, which in a tethered environment is monitored by the front-end server.
If you really and truly need/want to run RISC-V code bare-metal, then here are the instructions to do so. You lose a bunch of useful stuff, like printf or FP-trap software emulation, which the riscv-pk (proxy kernel) provides.
First things first - Spike boots up at 0x200. As Spike is the golden ISA simulator model, your core should also boot up at 0x200.
(cough, as of 2015 Jul 13, the "master" branch of riscv-tools (https://github.com/riscv/riscv-tools) is using an older pre-v1.7 Privileged ISA, and thus starts at 0x2000. This post will assume you are using v1.7+, which may require using the "new_privileged_isa" branch of riscv-tools).
So when you disassemble your bare-metal program, it better start at 0x200!!! If you want to run it on top of the proxy-kernel, it better start at 0x10000 (and if Linux, it’s something even larger…).
Now, if you want to run bare metal, you’re forcing yourself to write up the processor boot code. Yuck. But let’s punt on that and pretend that’s not necessary.
(You can also look into riscv-tests/env/p, for the “virtual machine” description for a physically addressed machine. You’ll find the linker script you need and some macros.h to describe some initial setup code. Or better yet, in riscv-tests/benchmarks/common.crt.S).
Anyways, armed with the above (confusing) knowledge, let’s throw that all away and start from scratch ourselves...
hello.s:
.align 6
.globl _start
_start:
# screw boot code, we're going minimalist
# mtohost is the CSR in machine mode
csrw mtohost, 1;
1:
j 1b
and link.ld:
OUTPUT_ARCH( "riscv" )
ENTRY( _start )
SECTIONS
{
/* text: test code section */
. = 0x200;
.text :
{
*(.text)
}
/* data: Initialized data segment */
.data :
{
*(.data)
}
/* End of uninitialized data segment */
_end = .;
}
Now to compile this…
riscv64-unknown-elf-gcc -nostdlib -nostartfiles -Tlink.ld -o hello hello.s
This compiles to (riscv64-unknown-elf-objdump -d hello):
hello: file format elf64-littleriscv
Disassembly of section .text:
0000000000000200 <_start>:
200: 7810d073 csrwi tohost,1
204: 0000006f j 204 <_start+0x4>
And to run it:
spike hello
It’s a thing of beauty.
The link script places our code at 0x200. Spike will start at 0x200, and then write a #1 to the control/status register “tohost”, which tells Spike “stop running”. And then we spin on an address (1: j 1b) until the front-end server has gotten the message and kills us.
It may be possible to ditch the linker script if you can figure out how to tell the compiler to move <_start> to 0x200 on its own.
For other examples, you can peruse the following repositories:
The riscv-tests repository holds the RISC-V ISA tests that are very minimal (https://github.com/riscv/riscv-tests).
This Makefile has the compiler options:
https://github.com/riscv/riscv-tests/blob/master/isa/Makefile
And many of the “virtual machine” description macros and linker scripts can be found in riscv-tests/env (https://github.com/riscv/riscv-test-env).
You can take a look at the “simplest” test at (riscv-tests/isa/rv64ui-p-simple.dump).
And you can check out riscv-tests/benchmarks/common for start-up and support code for running bare-metal.
The "extra" code is put there by gcc and is the sort of stuff required for any program. The proxy kernel is designed to be the bare minimum amount of support required to run such things. Once your processor is working, I would recommend running things on top of pk rather than bare-metal.
In the meantime, if you want to look at simple assembly, I would recommend skipping the linking phase with '-c':
riscv64-unknown-elf-gcc code.c -c -o code.o
riscv64-unknown-elf-objdump -d code.o
For examples of running code without pk or linux, I would look at riscv-tests.
I'm surprised no one mentioned gcc -S which skips assembly and linking altogether and outputs assembly code, albeit with a bunch of boilerplate, but it may be convenient just to poke around.
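For poking around with -c or -S, it can help to put the arithmetic in a function that returns its result, so the compiler has no reason to throw the computation away once optimization is turned on. A minimal sketch follows; the function name is made up, and the file uses no headers or library calls, so nothing in it depends on the proxy kernel.

```c
/* Freestanding: no headers and no libc calls. */
int add_constants(void)
{
    int a = 333;
    int b = 244;
    return a + b;   /* returning the sum keeps it from being optimized out */
}
```

Compiling this with something like riscv64-unknown-elf-gcc -O1 -S code.c -o code.s should leave you with a short assembly listing for add_constants only, without any startup code, since no linking takes place.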

Pthreads & Multicore compiler

I'm working with an SMP-capable kernel: Snapgear 2.6.21.
I have created 4 threads in my c application, and I am trying to set thread 1 to run on CPU1, thread2 on CPU 2, etc.
However, the compiler sparc-linux-gcc does not recognize these functions:
CPU_SET (int cpu, cpu_set_t * set);
CPU_ZERO (cpu_set_t * set);
and this type: cpu_set_t
It always gives me these errors:
implicit declaration of function 'CPU_ZERO'
implicit declaration of function 'CPU_SET'
'cpu_set_t' undeclared (first use in this function)
Here is my code to bind active thread to processor 0:
cpu_set_t mask;
CPU_ZERO (& mask);
CPU_SET (0, & mask); // bind processor 0
sched_setaffinity (0, sizeof(mask), & mask);
I have included and defined at the top:
#define _GNU_SOURCE
#include <sched.h>
But I always get the same errors. Can you help me please?
You should read sched_setaffinity(2) carefully and test its result (and display errno on failure, e.g. with perror).
Actually, I believe you should use pthread_setaffinity_np(3) instead (and of course test its failure, etc...)
Even more, I believe that you should not bother to explicitly set the affinity. Recent Linux kernels are often quite good at dispatching running threads on different CPUs.
So simply use pthreads and don't bother about affinity, unless you see actual issues when benchmarking.
BTW, passing the -H flag to your GCC (cross-)compiler could be helpful. It shows you the included files. Perhaps also look into the preprocessed form obtained with gcc -C -E ; it looks like some header files are missing or not found (maybe some missing -I include-directory at compilation time, or some missing headers on your development system)
BTW, your kernel version looks ancient. Can't you upgrade your kernel to something newer (3.15.x or some 3.y)?
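For reference, here is a minimal sketch of the sched_setaffinity route with the error checking suggested above. It assumes a glibc whose <sched.h> provides the CPU_* macros, which only happens when _GNU_SOURCE is defined before the first #include in the file. The function name is hypothetical. Rather than hard-coding CPU 0, it picks the first CPU the thread is currently allowed to run on.

```c
#define _GNU_SOURCE   /* must precede every #include to expose cpu_set_t and CPU_SET */
#include <sched.h>
#include <stdio.h>

/* Pins the calling thread to the lowest-numbered CPU it is currently
   allowed to run on. Returns that CPU number, or -1 on failure. */
int pin_self_to_first_allowed_cpu(void)
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) {
        perror("sched_getaffinity");
        return -1;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed)) {
            cpu_set_t mask;
            CPU_ZERO(&mask);
            CPU_SET(cpu, &mask);
            if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
                perror("sched_setaffinity");
                return -1;
            }
            return cpu;
        }
    }
    return -1;   /* no allowed CPU found */
}
```

If the "implicit declaration" errors persist even with the define in the right place, the toolchain's C library most likely does not ship these macros, which is what the -H and -E checks above would reveal.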

How does linux capability.h use 32-bit mask for 34 elements?

The file in /usr/include/linux/capability.h #defines 34 possible capabilities.
It goes like:
#define CAP_CHOWN 0
#define CAP_DAC_OVERRIDE 1
.....
#define CAP_MAC_ADMIN 33
#define CAP_LAST_CAP CAP_MAC_ADMIN
each process has capabilities defined thusly
typedef struct __user_cap_data_struct {
__u32 effective;
__u32 permitted;
__u32 inheritable;
} * cap_user_data_t;
I'm confused - a process can have 32-bits of effective capabilities, yet the total amount of capabilities defined in capability.h is 34. How is it possible to encode 34 positions in a 32-bit mask?
Because you haven't read all of the manual.
The capget manual starts by convincing you to not use it :
These two functions are the raw kernel interface for getting and set‐
ting thread capabilities. Not only are these system calls specific to
Linux, but the kernel API is likely to change and use of these func‐
tions (in particular the format of the cap_user_*_t types) is subject
to extension with each kernel revision, but old programs will keep
working.
The portable interfaces are cap_set_proc(3) and cap_get_proc(3); if
possible you should use those interfaces in applications. If you wish
to use the Linux extensions in applications, you should use the easier-
to-use interfaces capsetp(3) and capgetp(3).
Current details
Now that you have been warned, some current kernel details. The struc‐
tures are defined as follows.
#define _LINUX_CAPABILITY_VERSION_1 0x19980330
#define _LINUX_CAPABILITY_U32S_1 1
#define _LINUX_CAPABILITY_VERSION_2 0x20071026
#define _LINUX_CAPABILITY_U32S_2 2
[...]
effective, permitted, inheritable are bitmasks of the capabilities
defined in capability(7). Note the CAP_* values are bit indexes and
need to be bit-shifted before ORing into the bit fields.
[...]
Kernels prior to 2.6.25 prefer 32-bit capabilities with version
_LINUX_CAPABILITY_VERSION_1, and kernels 2.6.25+ prefer 64-bit capabil‐
ities with version _LINUX_CAPABILITY_VERSION_2. Note, 64-bit capabili‐
ties use datap[0] and datap[1], whereas 32-bit capabilities only use
datap[0].
where datap is defined earlier as a pointer to a __user_cap_data_struct. So you just represent each 64-bit value with two __u32, one in each element of an array of two __user_cap_data_struct.
This, alone, tells me to not ever use this API, so I didn't read the rest of the manual.
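The mapping from a capability number to a position in the two-element array can be sketched as below. The helper names are made up for this example (the kernel header has its own CAP_TO_INDEX and CAP_TO_MASK macros for the same purpose), and a plain uint32_t array stands in for the effective fields of datap[0] and datap[1].

```c
#include <stdint.h>

/* The capability number is a bit index: bits 0..31 live in word 0,
   bits 32..63 in word 1. */
unsigned cap_word(unsigned cap) { return cap >> 5; }                   /* which 32-bit word */
uint32_t cap_bit(unsigned cap)  { return UINT32_C(1) << (cap & 31); }  /* mask inside that word */

/* Raises capability `cap` in a two-word bitmask. */
void cap_raise(uint32_t words[2], unsigned cap)
{
    words[cap_word(cap)] |= cap_bit(cap);
}
```

So CAP_MAC_ADMIN (33) ends up as bit 1 of the second word, which is how 34 capabilities fit even though each individual field is only 32 bits wide.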
The CAP_* values aren't bit-masks, they're just constants. E.g. the value of CAP_MAC_ADMIN has more than one bit set: in binary, 33 is 100001.

Can a program assign the memory directly?

Is there any really low-level programming language that can access a variable in memory directly? For example, suppose my program has a variable i. Can someone else access the memory and change my program's variable i to another value?
As an example of how to change the variable in a program from “the outside”, consider the use of a debugger. Example program:
$ cat print_i.c
#include <stdio.h>
#include <unistd.h>
int main (void) {
int i = 42;
for (;;) { (void) printf("i = %d\n", i); (void) sleep(3); }
return 0;
}
$ gcc -g -o print_i print_i.c
$ ./print_i
i = 42
i = 42
i = 42
…
(The program prints the value of i every 3 seconds.)
In another terminal, find the process id of the running program and attach the gdb debugger to it:
$ ps | grep print_i
1779 p1 S+ 0:00.01 ./print_i
$ gdb print_i 1779
…
(gdb) bt
#0 0x90040df8 in mach_wait_until ()
#1 0x90040bc4 in nanosleep ()
#2 0x900409f0 in sleep ()
#3 0x00002b8c in main () at print_i.c:6
(gdb) up 3
#3 0x00002b8c in main () at print_i.c:6
6 for (;;) { (void) printf("i = %d\n", i); (void) sleep(3); }
(gdb) set variable i = 666
(gdb) continue
Now the output of the program changes:
…
i = 42
i = 42
i = 666
So, yes, it's possible to change the variable of a program from the “outside” if you have access to its memory. There are plenty of caveats here, e.g. one needs to locate where and how the variable is stored. Here it was easy because I compiled the program with debugging symbols. For an arbitrary program in an arbitrary language it's much more difficult, but still theoretically possible. Of course, if I weren't the owner of the running process, then a well-behaved operating system would not let me access its memory (without “hacking”), but that's a whole other question.
Sure, unless of course the operating system protects that memory on your behalf. Machine language (the lowest level programming language) always "accesses memory directly", and it's pretty easy to achieve in C (by casting some kind of integer to pointer, for example). Point is, unless this code's running in your process (or the kernel), whatever language it's written in, the OS would normally be protecting your process from such interference (by mapping the memory in various ways for different processes, for example).
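The integer-to-pointer cast mentioned above looks roughly like this inside a single process. It is only a sketch with made-up function names, and it is only valid when the number really is the address of a live int in the same process.

```c
#include <stdint.h>

/* Writes `value` to the int stored at the raw numeric address `addr`. */
void poke_int(uintptr_t addr, int value)
{
    int *p = (int *)addr;   /* integer-to-pointer cast */
    *p = value;
}

/* Changes a local variable through its numeric address and returns it. */
int demo_poke(void)
{
    int i = 42;
    poke_int((uintptr_t)&i, 666);
    return i;
}
```

The same cast with an address belonging to some other process would simply refer to whatever is mapped at that address in your own address space (or fault), which is the protection described above.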
If another process has sufficient permissions, then it can change your process's memory. On Linux, it's as simple as reading and writing the pseudo-file /proc/{pid}/mem. This is how many exploits work, though they do rely on some vulnerability that allows them to run with very high privileges (root on Unix).
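As a harmless illustration of the /proc interface, a process can modify its own memory through /proc/self/mem. This sketch assumes 64-bit Linux with /proc mounted and uses invented function names. Doing the same to another process means opening /proc/{pid}/mem instead, which additionally requires the permissions discussed above.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Overwrites an int in this process by writing through /proc/self/mem
   instead of through a pointer. Returns 0 on success, -1 on failure. */
int poke_via_proc_mem(int *target, int value)
{
    int fd = open("/proc/self/mem", O_RDWR);
    if (fd < 0)
        return -1;
    /* The file offset is the virtual address to access. */
    ssize_t n = pwrite(fd, &value, sizeof value, (off_t)(uintptr_t)target);
    close(fd);
    return n == (ssize_t)sizeof value ? 0 : -1;
}

/* Returns the new value of a local variable, or -1 if the write failed. */
int demo_proc_mem(void)
{
    int i = 42;
    if (poke_via_proc_mem(&i, 666) != 0)
        return -1;
    return i;
}
```

The variable is never assigned through a C pointer here; the kernel performs the write on the process's behalf.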
Short answer: yes. Long answer: it depends on a whole lot of factors including your hardware (memory management?), your OS (protected virtual address spaces? features to circumvent these protections?) and the detailed knowledge your opponent may or may not have of both your language's architecture and your application structure.
It depends. In general, one of the jobs of an operating system is memory protection, which means keeping programs out of each other's memory. If I write a program that tries to access memory it has no right to, the OS should kill my program with what is called a segmentation fault.
But there are situations where I can get around that. For example, if I have root privileges on the system, I may be able to access your memory. Or worse -- I can run your program inside a virtual machine, then sit outside that VM and do whatever I want to its memory.
So in general, you should assume that a malicious person can reach in and fiddle with your program's memory if they try hard enough.

Resources