Does PIE do anything if ASLR is turned off on the system, or is PIE dependent on ASLR? (Linux)

I am also wondering what PIE and ASLR each affect in memory. As I understand it, ASLR randomizes the addresses of the libc base, the stack, and the heap, and PIE randomizes the ELF base and, with it, .text, .data, .bss, .rodata, and so on.
Is that correct or am I getting it wrong?

PIE requires position-independent code, costing a small amount of performance (or a large amount, like 15%, on ISAs like 32-bit x86 that don't easily support PC-relative addressing). See "32-bit absolute addresses no longer allowed in x86-64 Linux?".
With ASLR disabled, e.g. when running under GDB or with it disabled system-wide, Linux chooses a base address of 0x0000555555554000 to map the executable, so objdump -d addresses relative to the file start like 0x4000 end up at a high virtual address (that base plus the offset).
A PIE executable is an ELF shared object (like a .so), as opposed to an ELF "executable". An ELF executable has a base address in the ELF headers, set by the linker, so absolute addresses can be hard-coded into the machine code and data without needing relocations for fixups. (That's why there's no way to ASLR a proper ELF executable, only a PIE).
The mechanism that supports PIE was originally just a fun hack of putting an entry point in a library. Later people realized it would be useful for ASLR for static code/data in executables to be possible, so this existing support became official. (Or something like that; I haven't read up on the history.)
But anyway, PIE is what makes ASLR of the executable itself possible, and a PIE is still a PIE (loaded at a default base address) even with ASLR disabled, if you want the most general non-technical description.
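To check which kind of ELF file a given binary is, one can look at the e_type field in its header. The following is a minimal Python sketch, not a replacement for readelf -h; the function name elf_kind is made up for illustration, and the numeric values ET_EXEC = 2 and ET_DYN = 3 are the ones defined by the ELF specification.

```python
import struct

# e_type values from the ELF specification.
ET_EXEC = 2  # traditional executable with a fixed, link-time base address
ET_DYN = 3   # shared object; PIE executables use this type as well


def elf_kind(header: bytes) -> str:
    """Classify an ELF image from its first 18 bytes (e_ident + e_type)."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] is byte 5: 1 means little-endian, 2 means big-endian.
    fmt = "<H" if header[5] == 1 else ">H"
    (e_type,) = struct.unpack_from(fmt, header, 16)
    if e_type == ET_EXEC:
        return "EXEC (non-PIE, fixed base)"
    if e_type == ET_DYN:
        return "DYN (PIE or shared library)"
    return f"other ({e_type})"
```

Calling elf_kind(open(path, "rb").read(18)) on a binary built with -no-pie should report EXEC, while a default build on most current distros should report DYN.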

Related

Why can't non-PIC code be fully ASLR'd using run-time fixups?

I understand that PIC makes ASLR more efficient and easier, since the code can be placed anywhere in memory without being changed. But if I understand Wikipedia's article on relocation correctly, the dynamic linker can apply "fixups" at run time, so a symbol can be located even when the code is not position-independent. Yet according to many answers I have seen here, non-PIC code can't have ASLR applied to anything except the stack (so the program entry point can't be randomized). If that is correct, what are run-time fixups used for, and why can't we just fix up every location in the code at run time, before the program starts, to randomize the entry point?
TL:DR: Not all uses of absolute addresses have relocation info in a non-PIE executable (ELF type EXEC, not DYN). Therefore the kernel's program loader can't find them all to apply fixups.
Thus there's no way to retroactively enable ASLR for executables built as non-PIE. There's no way for a traditional executable to flag itself as having relocation metadata for every use of an absolute address, and no point in adding such a feature either since if you want text ASLR you'd just build a PIE.
Because ELF-type EXEC Linux executables are guaranteed to be loaded / mapped at the fixed base address chosen by the linker at link time, it would be a waste of space in the executable to make symbol-table entries for internal symbols. So toolchains didn't do that, and there's no reason to start. That's simply how traditional ELF executables were designed; Linux switched from a.out to ELF back in the mid 90s before stack ASLR was a thing, so it wasn't on people's radar.
e.g. the absolute address of static char buf[100] is probably embedded somewhere in the machine code that uses it (if we're talking about 32-bit code, or 64-bit code that puts the address in a register), but there's no way to know where or how many times.
Also, for x86-64 specifically, the default code model for non-PIE executables guarantees that static addresses (text / data / bss) will all be in the low 2GiB of virtual address space, so 32-bit absolute signed or unsigned addresses can work, and rel32 displacements can reach anything from anything. That's why non-PIE compiler output uses mov $symbol, %edi (5 bytes) to put an address in a register, instead of lea symbol(%rip), %rdi (7 bytes). https://godbolt.org/z/89PeK1
So even if you did know where every absolute address was, you could only ASLR it in the low 2GiB, limiting the number of bits of entropy you could introduce. (I think Windows has a mode for this: LargeAddressAware = no. But Linux doesn't; see "32-bit absolute addresses no longer allowed in x86-64 Linux?". Again, PIE is a better way to allow text ASLR, so people (distros) should just compile for that if they want its benefits.)
Unlike Windows, Linux doesn't spend huge effort on things that can be handled better and more efficiently by recompiling binaries from source.
That being said, GNU/Linux does support fixup relocations for 64-bit absolute addresses even in PIC / PIE ELF shared objects. That's why beginner code like NASM mov rdi, BUFFER can work even in a shared library: use objdump -drwC -Mintel to see the relocation info on that use of the symbol in a mov reg, imm64 instruction. An lea rdi, [rel BUFFER] wouldn't need any relocation entry if BUFFER wasn't a global symbol. (Equivalent of C static.)
You might be wondering why metadata is essential:
There's no reliable way to search text/data for possible absolute addresses; false positives would be possible. e.g. /usr/bin/ld probably contains 0x401000 as the default start address for an x86-64 executable. You don't want ASLR of ld's code+data to also change its defaults. Or that integer value could have come up in any number of ways in many programs, e.g. as a bitmap. And of course x86-64 machine code is variable length so there's no reliable way to even distinguish opcodes from immediate operands in the most general case.
And also potentially false negatives. Not super likely that an x86 program would construct an absolute address in a register with multiple instructions, but it's certainly possible. However in non-x86 code, that would be common.
RISC machines with fixed-length instructions can't put a 32-bit address into a 32-bit instruction; there'd be no room left for anything else. So to load from static storage, the absolute addresses would have to be split across multiple instructions, like MIPS lui $t0, %hi(0x612300) / lw $t1, %lo(0x612300)($t0) to load from a static variable at absolute address 0x612300. (There would normally be a symbol name in the asm source, but it wouldn't appear in the final linked binary unless it was .globl, so I used numbers as a reminder.) Instructions like that don't have to come in pairs; the same high-half of the address could be reused by other accesses into the same array or struct in later instructions.
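The arithmetic behind that lui / lw split can be sketched in Python. The helper names split_hi_lo and rebuild are invented for this illustration; the only assumption is the usual MIPS convention that the 16-bit offset in lw is sign-extended, so the high half is adjusted whenever the low half is 0x8000 or more.

```python
def split_hi_lo(addr: int) -> tuple:
    """Split a 32-bit address into the two halves used by %hi / %lo."""
    lo = addr & 0xFFFF
    if lo >= 0x8000:
        # lw sign-extends its 16-bit offset, so treat it as negative.
        lo -= 0x10000
    hi = ((addr - lo) >> 16) & 0xFFFF
    return hi, lo


def rebuild(hi: int, lo: int) -> int:
    """What lui followed by lw computes: (hi << 16) + sign-extended lo."""
    return ((hi << 16) + lo) & 0xFFFFFFFF


hi, lo = split_hi_lo(0x612300)
print(hex(hi), hex(lo))        # 0x61 0x2300
print(hex(rebuild(hi, lo)))    # 0x612300
```

Neither instruction contains the value 0x612300 on its own, which is why a loader scanning for that bit pattern would miss it.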
Let's first have a look at Windows before having a look at Linux:
Windows' .EXE files (programs) typically have a so-called "base relocation table" and they have an "image base".
The "image base" is the "desired" start address of the program; if Windows loads the program to that address, no relocation needs to be done.
The "base relocation table" contains a list of all values in a program which represent addresses. If the program is loaded to a different address than the "image base", Windows must add the difference to all values listed in that table.
If the .EXE file does not contain a "base relocation table" (as far as I know some 32-bit GCC versions generate such files), it is not possible to load the file to another address.
This is because the following C code statements will result in exactly the same machine code (binary code) if the variable someVariable is located at the address 12340000, and it is not possible to distinguish between them:
long myVariable = 12340000;
And:
int * myVariable = &someVariable;
In the first case, the value 12340000 must not be changed in any situation; in the second case, the address (which is 12340000) must be changed to the real address if the program is loaded to another address.
If the "base relocation table" is missing, there is no information if the value 12340000 is an integer value (which must not be changed) or an address (which must be changed).
So the program must be loaded to some fixed address.
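The role of the table can be illustrated with a toy model in Python. This is only a sketch of the idea, not the real PE format: the 8-byte "image", the offsets list, and the function name apply_base_relocs are all hypothetical.

```python
import struct


def apply_base_relocs(image: bytes, reloc_offsets, delta: int) -> bytes:
    """Add delta to every 32-bit little-endian value listed in the table."""
    out = bytearray(image)
    for off in reloc_offsets:
        (value,) = struct.unpack_from("<I", out, off)
        struct.pack_into("<I", out, off, (value + delta) & 0xFFFFFFFF)
    return bytes(out)


# Hypothetical data section: an integer constant followed by a pointer.
# Both hold the same bit pattern (12340000), as in the example above.
image = struct.pack("<II", 12340000, 12340000)
relocs = [4]  # the table says: only the word at offset 4 is an address

moved = apply_base_relocs(image, relocs, delta=0x10000)
print(struct.unpack("<II", moved))  # (12340000, 12405536)
```

Without the relocs list, nothing in image itself tells the loader which of the two identical words to adjust.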
I'm not sure about the latest 32-bit Linux releases, but at least in older 32-bit Linux versions there was nothing like a "base relocation table" and programs did not use PIC. This means that these programs had to be loaded to their "favorite" address.
I don't know about 64-bit Linux programs, but if a program is compiled the same way as the (older) 32-bit programs, they also must be loaded to a certain address and ASLR is not possible.

How to force vdso memory to be in fixed address?

It appears that there is no information in the executable about where vdso should be. Assuming that I can control how the program is compiled, linked and written, how can I force it to be at the address I want it to be?

How to reserve a particular range of virtual memory in a Linux process

On a 32-bit x86 system:
Is there a way to reserve a particular range of virtual address space in a process memory map to stop ld.so (the dynamic linker) from loading any shared objects into that range?
I want to use at least two 1G ranges of virtual memory to map two 1G huge pages. However, ld.so loads the shared libraries in the middle, so I can't map the 1G huge pages.
The compiler can't do this job, and neither can linker scripts. ld.so is loaded into the executable by the loader, then ld.so loads the other shared libraries. However, ld.so itself is even in the middle of the mapped space.
The entry points of ld.so and libc.so are at a higher address, which can't be changed for our application.
Entry point address: 0x46c38810
Thanks,
Jiangtao
ld.so is loaded into the executable by the loader,
No: ld.so is the loader, and it is loaded into the process by the kernel.
You do have a few choices:
The easiest solution is to link the binary fully statically. Note that on Linux such a binary could still dlopen other shared libraries, although this is not a well-supported or well-tested thing to do.
A harder solution is to build your own patched ld.so, and make your application use that ld.so (using the -Wl,--dynamic-linker=... flag).
If you don't want to do that, rtldi may help (it will run before ld.so).
The entry point addresses in the shared libraries are edited by prelink.
prelink avoids conflicts between the load addresses of shared libraries, to optimize and speed up the run-time loader. It is on by default in our system.
prelink is a program that modifies ELF shared libraries and ELF dynamically linked binaries and assigns a unique virtual address space slot to each library, so that the time the dynamic linker needs to perform relocations at startup decreases significantly. Because there are fewer relocations, run-time memory consumption decreases as well.
/usr/sbin/prelink -avmR
prelinks all binaries found in directories specified in /etc/prelink.conf and all their dependent libraries, assigning libraries unique virtual address space slots
With prelink disabled, the entry point is no longer in the middle of the address space, so we can mmap another 1G of memory.

Is compiling ELF files with the MSB flag possible in Linux?

Is it possible to compile binary files with MSB endianness in GCC? If so, would they work correctly when executed?
Transferring comments into something resembling an answer.
In part, it depends on the CPU architecture. On SPARC or Power, you'd probably find MSB is the default. On Intel, which is definitely LSB by default, you probably can't.
On Intel: if I somehow altered every ELF header entry to MSB in a little-endian binary, would that execute properly?
No; you'd have to make lots of other changes to the code to give it the slightest chance of working. Basically, every number would have to be reworked from little-endian to big-endian.
Would that include instruction addresses, immediate parameters etc.? I would assume that regardless of the ELF flag, these should remain little-endian.
With just a modicum of further thought, I think there's a very high chance that the kernel for Intel will simply decline to execute the MSB ELF executable. It was compiled to expect LSB and knows it doesn't know how to deal with the alternative. To fix that, you'd have to rebuild the kernel and the dynamic loader, ld.so.1. And that's probably just the start of your problems.
On the whole, I would regard this as an exercise in futility. In the absence of information to the contrary, I don't think you need to worry about handling MSB ELF headers for Intel binaries; they won't exist in practice.
I do not think it is explicitly stated in the System V ABI, but AFAIK the ELF file is expected to be in native endianness (and e_ident[EI_DATA] describes the endianness used):
Byte e_ident[EI_DATA] specifies the data encoding of the
processor-specific data in the object file. The following encodings
are currently defined.
You might expect the processor-specific data to be in the processor endianness. For example, the content of the .got is processor-specific data and you definitely want it to be in native endianness.
On Intel computers, you have to use ELFDATA2LSB.
From the System V ABI ~ Intel386 Supplement 4th edition:
For file identification in e_ident, the Intel386 architecture
requires the following values.
e_ident[EI_CLASS] = ELFCLASS32
e_ident[EI_DATA] = ELFDATA2LSB
From the System V ABI ~ AMD64 supplement Draft 0.99.6:
For file identification in e_ident, the AMD64 architecture
requires the following values.
e_ident[EI_CLASS] = ELFCLASS64
e_ident[EI_DATA] = ELFDATA2LSB
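To see which values a particular file declares, the two e_ident bytes can be read directly. A minimal Python sketch follows; the function name ident_fields is invented for this example, while the byte positions (EI_CLASS = 4, EI_DATA = 5) and the constant values are those of the generic ELF specification.

```python
# Values of e_ident[EI_CLASS] and e_ident[EI_DATA] from the ELF specification.
ELFCLASS = {1: "ELFCLASS32", 2: "ELFCLASS64"}
ELFDATA = {1: "ELFDATA2LSB", 2: "ELFDATA2MSB"}


def ident_fields(e_ident: bytes) -> tuple:
    """Return the (class, data encoding) names declared in e_ident."""
    if len(e_ident) < 6 or e_ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_CLASS is byte 4 and EI_DATA is byte 5 of e_ident.
    return (
        ELFCLASS.get(e_ident[4], "unknown"),
        ELFDATA.get(e_ident[5], "unknown"),
    )
```

For an x86-64 binary, ident_fields(open(path, "rb").read(16)) should give ("ELFCLASS64", "ELFDATA2LSB"), matching the AMD64 supplement quoted above.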

How does the Linux kernel determine ld.so's load address?

I know that the dynamic linker uses mmap() to load libraries. I guess it is the kernel who loads both the executable and its .interpreter into the same address space, but how does it determine where? I noticed that ld.so's load address with ASLR disabled is 0x555555554000 (on x86_64) — where does this address come from? I tried following do_execve()'s code path, but it is too ramified for me not to be confused as hell.
Read more about ELF, in particular elf(5), and about the execve(2) syscall.
An ELF file may contain an interpreter. elf(5) mentions:
PT_INTERP The array element specifies the location and
size of a null-terminated pathname to invoke
as an interpreter. This segment type is
meaningful only for executable files (though
it may occur for shared objects). However it
may not occur more than once in a file. If
it is present, it must precede any loadable
segment entry.
That interpreter is almost always ld-linux(8) (e.g. with GNU glibc), more precisely (on my Debian/Sid) /lib64/ld-linux-x86-64.so.2. If you compile musl-libc and then build some software with it, you'll get a different interpreter, /lib/ld-musl-x86_64.so.1. That ELF interpreter is the dynamic linker.
The execve(2) syscall is using that interpreter:
If the executable is a dynamically linked ELF executable, the
interpreter named in the PT_INTERP segment is used to load the needed
shared libraries. This interpreter is typically /lib/ld-linux.so.2
for binaries linked with glibc.
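The PT_INTERP lookup can be sketched in a few lines of Python. This is a simplified illustration that only handles little-endian ELF64 files and does no bounds checking beyond the header size; the function name read_interp is made up, and the field offsets are those of the Elf64_Ehdr and Elf64_Phdr structures described in elf(5).

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def read_interp(elf: bytes):
    """Return the PT_INTERP path of a little-endian ELF64 image, or None."""
    if len(elf) < 64 or elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("this sketch handles only little-endian ELF64")
    (e_phoff,) = struct.unpack_from("<Q", elf, 32)       # program header table
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", elf, base)
        if p_type == PT_INTERP:
            (p_offset,) = struct.unpack_from("<Q", elf, base + 8)
            (p_filesz,) = struct.unpack_from("<Q", elf, base + 32)
            raw = elf[p_offset:p_offset + p_filesz]
            # The path is stored as a null-terminated string.
            return raw.split(b"\0", 1)[0].decode()
    return None  # e.g. a statically linked executable
```

On a typical x86-64 glibc system, read_interp(open("/bin/ls", "rb").read()) would be expected to return the /lib64/ld-linux-x86-64.so.2 path mentioned above, though the exact result depends on the distribution.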
See also Levine's book on Linkers and loaders, and Drepper's paper: How To Write Shared Libraries
Notice that execve is also handling the shebang (i.e. first line starting with #!); see the Interpreter scripts section of execve(2). BTW, for ELF binaries, execve is doing the equivalent of mmap(2) on some segments.
Read also about vdso(7), proc(5) & ASLR. Type cat /proc/self/maps in your shell.
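The same listing can be inspected from a program. Here is a small Python sketch that parses the text format described in proc(5); the function name parse_maps is invented for this example, and it keeps only the address range, permissions, and path of each mapping.

```python
def parse_maps(text: str) -> list:
    """Return (start, end, perms, path) for each line of a maps listing."""
    rows = []
    for line in text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        rows.append((start, end, fields[1], path))
    return rows
```

For example, parse_maps(open("/proc/self/maps").read()) lists the mappings of the Python process itself, including its dynamic linker and the [vdso] entry.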
(I guess, but I am not sure, that the 0x555555554000 address is in the ELF program header of your executable, or perhaps of ld-linux.so; it might also come from the kernel, since 0x55555555 seems to appear in the kernel source code)
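As far as I can tell, the kernel-side explanation is the likely one. The arithmetic below is a sketch under an assumption I have not verified against every kernel version: that on x86-64 the kernel starts position-independent images at ELF_ET_DYN_BASE, defined as two thirds of DEFAULT_MAP_WINDOW (2^47 minus one page), and then rounds down to a page boundary. The Python names simply mirror those kernel macros.

```python
PAGE_SIZE = 4096

# Assumption: x86-64 definitions as found in arch/x86/include/asm/elf.h
# and related headers; other architectures and older kernels may differ.
DEFAULT_MAP_WINDOW = (1 << 47) - PAGE_SIZE
ELF_ET_DYN_BASE = DEFAULT_MAP_WINDOW // 3 * 2


def page_start(addr: int) -> int:
    """Round an address down to the start of its page."""
    return addr & ~(PAGE_SIZE - 1)


print(hex(ELF_ET_DYN_BASE))              # 0x555555554aaa
print(hex(page_start(ELF_ET_DYN_BASE)))  # 0x555555554000
```

With ASLR enabled, a random offset would be added to this base before the image is mapped, which is why the address is only this stable when randomization is off.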

Resources