Linux minimum Load Address with LD - linux

In the process of understanding ELF program loading in Linux I was trying to experiment with the load address of a segment.
Using ld with the following linker script:
SECTIONS
{
. = 0x2000;
.text :
{
*(.text)
}
}
With the following link command:
ld -o elftest -z max-page-size=4096 -Telftest.l elftest.o
Please assume that the elftest.o is the compilation result of a trivial non important asm code.
The program links correctly with the entrypoint at 0x2000. What happens is that the program gets killed by the kernel with the output KILLED in the shell.
I want to set one thing straight:
it isn't a segfault or any other exception, the program doesn't even reach entry point
The readelf output for --segments is
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000001000 0x0000000000002000 0x0000000000002000
0x0000000000000009 0x0000000000000009 R E 1000
What I understand is the following:
Using max-page-size=4096 allows me to break the 2MiB aligment requirment for ld https://lkml.org/lkml/2012/7/9/46
The segment must be in aligned to a memory page, i.e. 4KiB so setting to 4096 is correct
The base address is arbitrary, usually for x86_64 is 0x400000
What I don't understand is:
When aligment requirment is met, why does an small address doesn't work (gets killed)?
If there is a minimun required address where is documented?
Why does it work with a higher base address i.e. 0x16000?
I'm doing this test in a 64bit computer with Arch Linux with kernel 3.17.1 and ld 2.24

You're running into kernel limitations on where pages can be mapped by userspace applications; these restrictions are intended to prevent certain kernel exploits from working. The minimum mappable address is set by the vm.mmap_min_addr sysctl value, and is usually at least 4096 (i.e, 0x1000).
For details, see: https://wiki.debian.org/mmap_min_addr (The situation is not unique to Debian; their documentation is just the first I found.)

Related

Calculate the entry point of an ELF file as a physical address (offset from 0)

I am building a RISC-V emulator which basically loads a whole ELF file into memory.
Up to now, I used the pre-compiled test binaries that the risc-v foundation provided which conveniently had an entry point exactly at the start of the .text section.
For example:
> riscv32-unknown-elf-objdump ../riscv32i-emulator/tests/simple -d
../riscv32i-emulator/tests/simple: file format elf32-littleriscv
Disassembly of section .text.init:
80000000 <_start>:
80000000: 0480006f j 80000048 <reset_vector>
...
Going into this project I didn't know much about ELF files so I just assumed that every ELF's entry point is exactly the same as the start of the .text section.
The problem arose when I compiled my own binaries, I found out that the actual entry point is not always the same as the start of the .text section, but it might be anywhere inside it, like here:
> riscv32-unknown-elf-objdump a.out -d
a.out: file format elf32-littleriscv
Disassembly of section .text:
00010074 <register_fini>:
10074: 00000793 li a5,0
10078: 00078863 beqz a5,10088 <register_fini+0x14>
1007c: 00010537 lui a0,0x10
10080: 43850513 addi a0,a0,1080 # 10438 <__libc_fini_array>
10084: 3a00006f j 10424 <atexit>
10088: 00008067 ret
0001008c <_start>:
1008c: 00002197 auipc gp,0x2
10090: cec18193 addi gp,gp,-788 # 11d78 <__global_pointer$>
...
So, after reading more about ELF files, I found out that the actual entry point address is provided by the Entry entry on the ELF's header:
> riscv32-unknown-elf-readelf a.out -h | grep Entry
Entry point address: 0x1008c
The problem now becomes that this address is not the actual address on the file (offset from 0) but is a virtual address, so obviously if I set the program counter of my emulator to this address, the emulator would crash.
Reading a bit more, I heard people talk about calculations regarding offsets from program headers and whatnot, but no one had a concrete answer.
My question is: what is the actual "formula" of how exactly you get the entry point address of the _start procedure as an offset from byte 0?
Just to be clear my emulator doesn't support virtual memory and the binary is the only thing that is loaded into my emulator's memory, so I have no use for the abstraction of virtual memory. I just want every memory address as physical address on disk.
My question is: what is the actual "formula" of how exactly you get the entry point address of the _start procedure as an offset from byte 0?
First, forget about sections. Only segments matter at runtime.
Second, use readelf -Wl to look at segments. They tell you exactly which chunk of file ([.p_offset, .p_offset + .p_filesz)) goes into which in-memory region ([.p_vaddr, .p_vaddr + .p_memsz)).
The exact calculation of "at which offset in the file does _start reside" is:
Find Elf32_Phdr which "covers" the address contained in Elf32_Ehdr.e_entry.
Using that phdr, file offset of _start is: ehdr->e_entry - phdr->p_vaddr + phdr->p_offset.
Update:
So, am I always looking for the 1st program header?
No.
Also by "covers" you mean that the 1st phdr->p_vaddr is always equal to e_entry?
No.
You are looking for a the program header (describing relationship between in-memory and on-file data) which overlaps the ehdr->e_entry in memory. That is, you are looking for the segment for which phdr->p_vaddr <= ehdr->e_entry && ehdr->e_entry < phdr->p_vaddr + phdr->p_memsz. This segment is often the first, but that is in no way guaranteed. See also this answer.

In elf binary, Is there any simple method to get memory address from offset?

In elf binary, assuming that I know the offset of binary.
In that case, how can I know the virtual address of that region of offset?
In more detail, here is binary my_binary
...and I found the data "the_key_string" in the offset of 0x204 in binary.
In this case, 0x204 is mapped in 0x0804204 when it loaded at memory.
Question:
What is the simplest way I get the address info 0x0804204 from 0x204?
Could you recommend me any useful shortcut in tools(010editor or hxd..)
...or can I do this with combination of objdump command?
ELF programs have a program header, which lists PT_LOAD segments (struct Elf32_Phdr or struct Elf64_Phdr). These have both a file offset and length (p_offset and p_filesz members) and a virtual address and length (p_vaddr and p_memsz). The point is that the the region identified by the the file offset and length becomes available at run time at the specified virtual address. The virtual address is relative to the base address of the object in memory.
You can view the program headers using readelf -l:
Elf file type is DYN (Shared object file)
Entry point 0x1670
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040
0x00000000000001f8 0x00000000000001f8 R E 0x8
INTERP 0x0000000000000238 0x0000000000000238 0x0000000000000238
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x000000000000627c 0x000000000000627c R E 0x200000
LOAD 0x0000000000006d68 0x0000000000206d68 0x0000000000206d68
0x00000000000004b8 0x0000000000000658 RW 0x200000
…
In this case, there are two load segments, one readable and executable (the program code), and one readable and writable (data and relocations).
Not all parts of the binary are covered by PT_LOAD segments and thus mapped by the loader at run time. If the data is in an unallocated section, it will just not be in memory (unless you read it from disk by other means).
But if the data is allocated, then it will fall into one of the load segments, and once you have the base address, you can use the information in the load segment to compute the virtual address from the file offset.

Required alignment of .text versus .data

I've been toying around with the ELFIO library. One of the examples, in particular, allows one to create an ELF file from scratch – defining sections, segments, entry point, and providing binary content to the relevant sections.
I noticed that a program created this way segfaults when the code segment alignment is chosen less than the page size (0x1000):
// Create a loadable segment
segment* text_seg = writer.segments.add();
text_seg->set_type( PT_LOAD );
text_seg->set_virtual_address( 0x08048000 );
text_seg->set_physical_address( 0x08048000 );
text_seg->set_flags( PF_X | PF_R );
text_seg->set_align( 0x1000 ); // can't change this
NB that the .text section is only aligned to multiples of 0x10 in the same example:
section* text_sec = writer.sections.add( ".text" );
text_sec->set_type( SHT_PROGBITS );
text_sec->set_flags( SHF_ALLOC | SHF_EXECINSTR );
text_sec->set_addr_align( 0x10 );
However, the data segment, although loaded separately through the same mechanism, does not have this problem:
segment* data_seg = writer.segments.add();
data_seg->set_type( PT_LOAD );
data_seg->set_virtual_address( 0x08048020 );
data_seg->set_physical_address( 0x08048020 );
data_seg->set_flags( PF_W | PF_R );
data_seg->set_align( 0x10 ); // note here!
Now in this specific case the data fits by design within the page that's already allocated. Not sure if this makes any difference, but I changed its virtual address to 0x8148020 and the result still works fine.
Here's the output of readelf:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000001000 0x0000000008048000 0x0000000008048000
0x000000000000001d 0x000000000000001d R E 1000
LOAD 0x0000000000001020 0x0000000008148020 0x0000000008148020
0x000000000000000e 0x000000000000000e RW 10
Why does the program fail to execute when the alignment of the executable segment is not a multiple of 0x1000 but for data 0x10 is no problem?
Update: Somehow on a second try text_seg->set_align( 0x100 ); works too, text_seg->set_align( 0x10 ); fails. The page size is 0x1000 and interestingly, the working program's VirtAddr does not adhere to it in either segment:
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000100 0x08048100 0x08048100 0x0001d 0x0001d R E 0x100
LOAD 0x000120 0x08148120 0x08148120 0x0000e 0x0000e RW 0x10
The SIGSEGV'ing one:
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000080 0x08048100 0x08048100 0x0001d 0x0001d R E 0x10
LOAD 0x0000a0 0x08148120 0x08148120 0x0000e 0x0000e RW 0x10
Resulting ELFs are here.
The ELF ABI does not require that VirtAddr or PhysAddr be page-aligned. (I believe) it only requires that
({Virt,Phys}Addr - Offset) % PageSize == 0
That is true for both working binaries, and false for the non-working one.
Update:
I don't see how this fails for the latter.
We have: VirtAddr == 0x08048100 and Offset == 0x80 (and PageSize == 4096 == 0x1000).
(0x08048100 - 0x80) % 0x1000 == 0x80 != 0
has to agree when align == 0x10, doesn't it?
No: it has to agree with the page size (as I said earlier), or the kernel will not be able to mmap the segment.
(sorry, more an extended comment than an answer)
There are some specifications about what an ELF executable should be. Read in particular elf(5) and most importantly the relevant ABI specification (see also this question), e.g. on https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI
AFAIU, these specifications require both code and data segments to be page-aligned, but you need to check that, notably in chapter 5 (program loading and dynamic linking) of the ABI spec.
Current tools generating ELF executables (notably binutils) are working hard to respect these specifications. If you code some ELF generator, you should also try hard to respect these specifications (so testing that the generated ELF apparently works is not enough).
The kernel is implementing execve(2), and dynamic loading is done also with ld-linux(8) using mmap(2). For some reasons (performance probably) it does not check that an executable obeys all the specifications.
(of course, kernel folks want commonly produced ELF executables to be successfully execve-d)
In some corner cases (like the one you observe) the kernel is not failing execve and doing something with ill-constructed ELF files.
But IMHO there is no guarantee about that. Future kernels, and future x86-64 processors, might fail on such ill-constructed ELF files.
My feeling is that you are in some grey area, a bit some "undefined behavior" of execve. If it happens to work, it is by bad luck.
Why does the program fail to execute when the alignment of the executable segment is not a multiple of 0x1000 but for data 0x10 is no problem?
To understand precisely why, you need to dive into the source code (related to execve) of your particular kernel. And I believe that it might perhaps change in the future (future versions of the kernel).
The kernel community is more or less promising past-compatibility, but this is with the specification. It could happen that some ill-formed ELF executable could be execve-d by Linux 3.10 but not by Linux 4.13 or some future Linux 5.
(I did read that some past kernels have been able to execve-d some ill formed ELF executables, but I forgot the details, perhaps something related to the 16 bytes alignment of stack pointers)

ELF program header virtual address and file offset

I know the relationship between the two:
virtual address mod page alignment == file offset mod page alignment
But can someone tell me in which direction are these two numbers computed?
Is virtual address computed from file offset according to the relationship above, or vice versa?
Update
Here is some more detail: when the linker writes the ELF file header, it sets the virtual address and file offset of the program headers.(segments)
For example there's the output of readelf -l someELFfile:
Elf file type is EXEC (Executable file)
Entry point 0x8048094
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x08048000 0x08048000 0x00154 0x00154 R E 0x1000
LOAD 0x000154 0x08049154 0x08049154 0x00004 0x00004 RW 0x1000
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x10
We can see 2 LOAD segments.
The virtual address of the first LOAD ends at 0x8048154, while the second LOAD starts at 0x8049154.
In the ELF file, the second LOAD is right behind the first LOAD with file offset 0x00154, however when this ELF is loaded into memory it starts at 0x1000 bytes after the end of the first LOAD segment.
But, why? If we have to consider memory page alignment, why doesn't the second LOAD segment starts at 0x80489000? Why does it start at 0x1000 bytes AFTER THE END of the first LOAD segment?
I know the virtual address of the second LOAD satisfies the relationship:
virtual address mod page alignment == file offset mod page alignment
But I don't know why this relationship must be satisfied.
Why does it start at 0x1000 bytes AFTER THE END of the first LOAD segment?
If it didn't, it would have to start at 0x08048154, but it can't: the two LOAD segments have different flags specified for their mapping (the first is mapped with PROT_READ|PROT_EXEC, the second with PROT_READ|PROTO_WRITE. Protections (being part of the page table) can only apply to whole pages, not parts of a page. Therefore, the mappings with different protections must belong to different pages.
virtual address mod page alignment == file offset mod page alignment
But I don't know why this relationship must be satisfied.
The LOAD segments are directly mmaped from file. The actual mapping of the second LOAD segment performed for your example will look something like this (you can run your program under strace and see that it does):
mmap(0x08049000, 0x158, PROT_READ|PROT_WRITE, MAP_PRIVATE, $fd, 0)
If you try to make the virtual address or the offset non-page-aligned, mmap will fail with EINVAL. The only way to make file data to appear in virtual memory at desired address it to make VirtAddr congruent to Offset modulo Align, and that is exactly what the static linker does.
Note that for such a small first LOAD segment, the entire first segment also appears at the beginning of the second mapping (with the wrong protections). But the program is not supposed to access anything in the [0x08049000,0x08049154) range. In general, it is almost always the case that there is some "junk" before the start of actual data in the second LOAD segment (unless you get really lucky and the first LOAD segment ends on a page boundary).
See also mmap man page.
virtual address mod page alignment == file offset mod page alignment
But can someone tell me in which direction are these two numbers computed?
I believe the virtual address is deliberately setup this way to follow the file offset. Files themselves should be compact and can therefore save disk space, so all segment are stored right next to each other, with their boundary recorded in the ELF header.
virtual address mod page alignment == file offset mod page alignment
But I don't know why this relationship must be satisfied.
It needs not to, the second segment here can be mapped to 0x08049000 without any problem. As long as the segments with different flags be mapped to different virtual pages, everything is fine. But the OS has to allocate yet another physical page (4 KB usually) for the mapping and copy the 4 byte at file offset 0x154 to the start of the page, when loading the resulted ELF executable, which is kind of wasteful.
If the relationship is satisfied, however, the OS can allocate just one single physical page and copy the whole 0x158 (0x154 + 0x4) byte of the file to the page, and map the physical page both to 0x08048000 and 0x08049000 with different flags. This saves physical memory, and makes virtual memory techniques like demand-paging to be applied more easily.

Base address at which the linux kernel is loaded

I have a couple of doubts about how the kernel is loaded into memory. Upon inspecting /proc/kallsyms I'm able to find the address of various symbols in the kernel.
$ cat /proc/kallsyms | head -n 10
00000000 t __vectors_start
80008240 T asm_do_IRQ
80008240 T _stext
80008240 T __exception_text_start
80008244 T do_undefinstr
80008408 T do_IPI
8000840c T do_DataAbort
800084a8 T do_PrefetchAbort
80008544 t gic_handle_irq
800085a0 T secondary_startup
Is there any way I can find the base address at which the kernel is loaded?
In userspace, suppose I use a libc with say the puts function at an offset of 0x200. When loaded into memory at say the address 0x8048000, I would be able to find the resolved puts at 0x8048000 + 0x200. Would the same hold for the kernel? i.e. is the kernel image loaded up into memory as 1 contiguous .text section?
for MIPS architecture
file Platform contain the field/variable "load-..." assigned with the location in physical address space.
example:
openwrt/build_dir/target-mips_mips32_musl-1.1.16/linux-brcm63xx_smp/linux-4.4.14/arch/mips/bcm63xx/Platform
#
# Broadcom BCM63XX boards
#
platform-$(CONFIG_BCM63XX) += bcm63xx/
cflags-$(CONFIG_BCM63XX) += \
-I$(srctree)/arch/mips/include/asm/mach-bcm63xx/
load-$(CONFIG_BCM63XX) := 0xffffffff80010000
for ARM architecture
file Makefile.boot contain the field/variable "zreladdr-y" assigned with the location in physical address space.
example:
openwrt/build_dir/target-mips_mips32_musl-1.1.16/linux-brcm63xx_smp/linux-4.4.14/arch/arm/mach-omap1/Makefile.boot
zreladdr-y += 0x10008000
params_phys-y := 0x10000100
initrd_phys-y := 0x10800000
for Microblaze architecture
file Makefile contain the field/variable "UIMAGE_LOADADDR" assigned with the location in physical address space (exported from Xilinx ISE).
example:
openwrt/build_dir/target-mips_mips32_musl-1.1.16/linux-brcm63xx_smp/linux-4.4.14/arch/microblaze/boot/Makefile
UIMAGE_LOADADDR = $(CONFIG_KERNEL_BASE_ADDR)
Kernel is loaded at physical address of 1MiB which is mapped on PAGE_OFFSET + 0x00100000 (virtual address). usually 8MiB of virtual space is reserved for kernel image starting from PAGE_OFFSET + 0x00100000
As other answer states that Kernel base address is fixed for particular architecture. But due to many security issues kernel development community decided to make it random. It is called ASLR (Address Space Layout Randomization).
By reading your question (or because I am reading it in 2017), you may be trying to find offset used in ASLR (or KASLR for kernel).
KASLR offset = address of symbol loaded in memory - address of symbol present in binary.
As your question states you already know address of symbol in memory from /proc/kallsyms.
We can find address of symbol in binary using nm utility and vmlinux file.
nm vmlinux | grep do_IPI
This will print address of symbol do_IPI in vmlinux file. Subtracting these two address will provide you KASLR offset.
If you are using u-boot then at boot time bootloader usually print the kernel load address and entry point.
Erase Group Size: 512 Bytes
reading uImage
4670784 bytes read in 469 ms (9.5 MiB/s)
reading devicetree.dtb
20597 bytes read in 17 ms (1.2 MiB/s)
Booting Linux kernel with ramdisk and devicetree
## Booting kernel from Legacy Image at 02004000 ...
Image Name: Linux-4.9.0-xilinx
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 4670720 Bytes = 4.5 MiB
Load Address: 10000000
Entry Point: 10000000
Verifying Checksum ... OK
## Flattened Device Tree blob at 04000000
Booting using the fdt blob at 0x4000000
Loading Kernel Image ... OK
Loading Device Tree to 1cb3d000, end 1cb45074 ... OK
Starting kernel ...
In the case of this ARM kernel the load address was at 0x80008000. Also, the kernel is loaded in a contiguous manner.

Resources