elf file thinks its small, but its large! can not generate .bin and .hex files - linux

I am working on migrating a project from Kail to Gcc.
Makefile http://www.copypastecode.com/73860/
.ld file http://www.copypastecode.com/73856/
I have a Makefile and a platform.ld script and some .c and .h files.
When i make, everything compiles and links and it looks good.
arm-none-eabi-size -B Output/stm32_gps_test.elf
text data bss dec hex filename
0 0 2048 2048 800 Output/stm32_gps_test.elf
but when i check the generated files i see this:
ls Output/
7327274 2011-07-02 04:28 stm32_gps_test.elf
0 2011-07-02 04:28 stm32_gps_test.bin
34 2011-07-02 04:28 stm32_gps_test.hex
and:
tail Output/stm32_gps_test.hex
:0400000508000000EF
:00000001FF
Some info on the elf file:
arm-none-eabi-readelf -h Output/stm32_gps_test.elf
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0x8000000
Start of program headers: 52 (bytes into file)
Start of section headers: 7323752 (bytes into file)
Flags: 0x5000002, has entry point, Version5 EABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 1
Size of section headers: 40 (bytes)
Number of section headers: 18
Section header string table index: 15
What is wrong? i have tried to run objcopy to create a binfile and hexfile but the result is always the same.

When you disassemble it what do you see? (objdump -D) If you have for example a rom image at 0x80000000 and ram at 0x20000000 the .bin file from objcopy will be at a minimum 0x60000000 bytes plus the size of the image in rom. The intel hex file or srec should work though.

Does the option --set-section-flags .bss=alloc,load,contents works? With this option, the .bss section will be included in stm32_gps_test.bin.

Related

how to extract AT_EXECFN from a coredump file

I need to extract a value for AT_EXECFN from a coredump file. For those who don't know this value is part of the auxiliary vector. It is really simple to get this value while in your running process, just use getauxval, however I couldn't find any information on how to do it when you have an ELF coredump file. I can find the NT_AUXV section of this file, but I can't find out how to find an exact string where AT_EXECFN is pointing to. Say I found AT_EXECFN in the coredump. According to this I'm going to get a struct with ann address where the actual value is stored. My question is how do I find this address in the coredump file?
Here is an example:
int main() { abort(); }
gcc -w -O2 t.c && ulimit -c unlimited && ./a.out
Aborted (core dumped)
First we can look with GDB:
gdb -q ./a.out core
Reading symbols from ./a.out...
(No debugging symbols found in ./a.out)
[New LWP 86]
Core was generated by `./a.out'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig#entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) info auxv
33 AT_SYSINFO_EHDR System-supplied DSO's ELF header 0x7ffd4037a000
16 AT_HWCAP Machine-dependent CPU capability hints 0x178bfbff
...
31 AT_EXECFN File name of executable 0x7ffd40374ff0 "./a.out"
15 AT_PLATFORM String identifying platform 0x7ffd403739c9 "x86_64"
0 AT_NULL End of vector 0x0
This proves that the info is in fact present in the core, and also shows what values to expect.
The steps therefore are:
read / decode NT_AUXV note until you find an entry with .a_type == AT_EXECFN ( 31). Find the pointer to string ($addr, here you would find 0x7ffd40374ff0).
Using eu-readelf -n core is helpful to verify that you are reading expected values. Here is the output:
CORE 320 AUXV
SYSINFO_EHDR: 0x7ffd4037a000
HWCAP: 0x178bfbff <fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht>
...
EXECFN: 0x7ffd40374ff0
PLATFORM: 0x7ffd403739c9
NULL
iterate over all program headers with .p_type == PT_LOAD until you find one with .p_vaddr <= $addr and $addr < .p_vaddr + .p_memsz (that is, a LOAD segment which "covers" desired address). In above case, that's this entry:
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
...
LOAD 0x016000 0x00007ffd40354000 0x0000000000000000 0x021000 0x021000 RW 0x1000
...
finally you can find location of the string in the core file at .p_offset + $addr - .p_vaddr. Using above numbers, we expect the string to be at offset 0x016000 + 0x7ffd40374ff0 - 0x00007ffd40354000 = 225264 bytes into the file.
And we do find it there:
dd status=none bs=1 skip=225264 count=10 if=core | xxd -g1
00000000: 2e 2f 61 2e 6f 75 74 00 00 00 ./a.out...

build-id data offset in the ELF file

I need to modify the build-id of the ELF notes section. I found out that it is possible here. Also found out that I can do it by modifying this code. What I can't figure out is data location. Here is what I'm talking about.
$ eu-readelf -S myelffile
Section Headers:
[Nr] Name Type Addr Off Size ES Flags Lk Inf Al
...
[ 2] .note.ABI-tag NOTE 000000000000028c 0000028c 00000020 0 A 0 0 4
[ 3] .note.gnu.build-id NOTE 00000000000002ac 000002ac 00000024 0 A 0 0 4
...
$ eu-readelf -n myelffile
Note section [ 2] '.note.ABI-tag' of 32 bytes at offset 0x28c:
Owner Data size Type
GNU 16 GNU_ABI_TAG
OS: Linux, ABI: 3.14.0
Note section [ 3] '.note.gnu.build-id' of 36 bytes at offset 0x2ac:
Owner Data size Type
GNU 20 GNU_BUILD_ID
Build ID: d75a086c288c582036b0562908304bc3a8033235
.note.gnu.build-id section is 36 bytes. The build id is 20 bytes. What are the other 16 bytes?
I played with the code a bit and read 36 bytes of myelffile at offset 0x2ac. Got the following 040000001400000003000000474e5500d75a086c288c582036b0562908304bc3a8033235.
Then I decided to use Elf64_Shdr definition, so I read data at address 0x2ac + sizeof(Elf64_Shdr.sh_name) + sizeof(Elf64_Shdr.sh_type) + sizeof(Elf64_Shdr.sh_flags) and I got my build id, d75a086c288c582036b0562908304bc3a8033235. It does makes sense why I got it, sizeof(Elf64_Shdr.sh_name) + sizeof(Elf64_Shdr.sh_type) + sizeof(Elf64_Shdr.sh_flags) = 16 bytes, but according to Elf64_Shdr definition I should be pointing to Elf64_Addr sh_addr, i.e. section virtual address.
So what is not clear to me is what are the other 16 bytes of the section? What do they represent? I can't reconcile the Elf64_Shdr definition and the results I'm getting from my experiments.
.note.gnu.build-id section is 36 bytes. The build id is 20 bytes. What are the other 16 bytes?
Each .note.* section starts with Elf64_Nhdr (12 bytes), followed by (4-byte aligned) note name of variable size (GNU\0 here), followed by (4-byte aligned) actual note data. Documentation.
Looking at /bin/date on my system:
eu-readelf -Wn /bin/date
Note section [ 2] '.note.ABI-tag' of 32 bytes at offset 0x2c4:
Owner Data size Type
GNU 16 GNU_ABI_TAG
OS: Linux, ABI: 3.2.0
Note section [ 3] '.note.gnu.build-id' of 36 bytes at offset 0x2e4:
Owner Data size Type
GNU 20 GNU_BUILD_ID
Build ID: 979ae4616ae71af565b123da2f994f4261748cc9
What are the bytes at offset 0x2e4?
dd bs=1 skip=$((0x2e4)) count=36 < /bin/date | xxd
00000000: 0400 0000 1400 0000 0300 0000 474e 5500 ............GNU.
00000010: 979a e461 6ae7 1af5 65b1 23da 2f99 4f42 ...aj...e.#./.OB
00000020: 6174 8cc9 at..
So we have: .n_namesz == 4, .n_descsz == 20, .n_type == 3 == NT_GNU_BUILD_ID, followed by 4-byte GNU\0 note name, followed by 20 bytes of actual build-id bytes 0x97, 0x9a, etc.

Where is segment %fs for static elf images setup?

I'm trying to figure out how the %fs register is initialized
when creating a elf image by hand.
The simple snippet I'd like to run is:
.text
nop
movq %fs:0x28, %rax;
1: jmp 1b
Which should read at offset 0x28 in the %fs segment. Normally this is where the stack canary is stored. Because I create the elf image by hand the %fs segment is not setup at all by my code this fails expectedly(?) .
Here is how I create the elf image:
0000000000000000 <.text>:
0: 90 nop
1: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
8: 00 00
a: eb fe jmp 0xa
I create the .text segment via
echo 9064488b042528000000ebfe | xxd -r -p > r2.bin
Then I convert to elf:
ld -b binary -r -o raw.elf r2.bin
objcopy --rename-section .data=.text --set-section-flags .data=alloc,code,load raw.elf
At that point raw.elf contains my instructions. I then link with
ld -T raw.ld -o out.elf -M --verbose where raw.ld is:
OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", "elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)
ENTRY(_entry)
PHDRS {
phdr4000000 PT_LOAD;
}
SECTIONS
{
_entry = 0x4000000;
.text 0x4000000 : { raw.elf (.text) } :phdr4000000
}
I can now start out.elf with gdb:
gdb --args out.elf
and set a breakpoint at 0x4000000:
(gdb)break *0x4000000
(gdb)run
The first nop can be stepped via stepi, however the stack canary read mov %fs:0x28,%rax segfaults.
I suppose that is expected given that maybe the OS is not setting up %fs.
For a simple m.c: int main() { return 0; } program compiled with gcc --static m.c -o m I can read from %fs. Adding:
long can()
{
long v = 0;
__asm__("movq %%fs:0x28, %0;"
: "=r"(val)::);
return v;
}
lets me read from %fs - even though I doubt that %fs:28 is setup because ld.so is not run (it is a static image).
Question:
Can anyone point out where %fs is setup in the c runtime for static images?
You need to call arch_prctl with an ARCH_SET_FS argument before you can use the %fs segment prefix. You will have to allocate the backing store somewhere (brk, mmap, or an otherwise unused part of the stack).
glibc does this in __libc_setup_tls in csu/libc-tls.c for statically linked binaries, hidden behind the TLS_INIT_TP macro.

Base Address of ELF

I am trying to find the base address of ELF files. I know that you can use readelf to find the Program Entry Point and different section details (base address, size, flags and so on).
For example, programs for x86 architecture are based at 0x8048000 by linker. using readelf I can see the program entry point but no specific field in the output tells the base address.
$ readelf -e test
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x8048390
Start of program headers: 52 (bytes into file)
Start of section headers: 4436 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 9
Size of section headers: 40 (bytes)
Number of section headers: 30
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 08048154 000154 000013 00 A 0 0 1
[ 2] .note.ABI-tag NOTE 08048168 000168 000020 00 A 0 0 4
[ 3] .note.gnu.build-i NOTE 08048188 000188 000024 00 A 0 0 4
[ 4] .gnu.hash GNU_HASH 080481ac 0001ac 000024 04 A 5 0 4
[ 5] .dynsym DYNSYM 080481d0 0001d0 000070 10 A 6 1 4
In the section details, I can see that the Offset is calculated with respect to the base address of the ELF.
So, .dynsym section starts at address, 0x080481d0 and offset is 0x1d0. This would mean the base address is, 0x08048000. Is this correct?
similarly, for programs compiled on different architectures like PPC, ARM, MIPS, I cannot see their base address but only the OEP, Section Headers.
You need to check the segment table aka program headers (readelf -l).
Elf file type is EXEC (Executable file)
Entry point 0x804a7a0
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
INTERP 0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x08048000 0x08048000 0x10fc8 0x10fc8 R E 0x1000
LOAD 0x011000 0x08059000 0x08059000 0x0038c 0x01700 RW 0x1000
DYNAMIC 0x01102c 0x0805902c 0x0805902c 0x000f8 0x000f8 RW 0x4
NOTE 0x000168 0x08048168 0x08048168 0x00020 0x00020 R 0x4
TLS 0x011000 0x08059000 0x08059000 0x00000 0x0005c R 0x4
GNU_EH_FRAME 0x00d3c0 0x080553c0 0x080553c0 0x00c5c 0x00c5c R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
The first (lowest) LOAD segment's virtual address is the default load base of the file. You can see it's 0x08048000 for this file.
The ELF mapping base Address of the .text section is defined by the ld(1) loader script in the binutils project under script template elf.sc on Linux.
The script define the following variables used by the loader ld(1):
# TEXT_START_ADDR - the first byte of the text segment, after any
# headers.
# TEXT_BASE_ADDRESS - the first byte of the text segment.
# TEXT_START_SYMBOLS - symbols that appear at the start of the
# .text section.
You can inspect the current values using the command:
~$ ld --verbose |grep SEGMENT_START
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
. = SEGMENT_START("ldata-segment", .);
The text-segment mapping values are:
0x08048000 on 32 Bits
0x400000 on 64 Bits
Also the interpreter base address of an ELF program is defined in the Auxiliary vector array at the index AT_BASE. The Auxiliary vector array is an array of the Elf_auxv_t structure and located after the envp in the process stack. It's configured while loading the ELF binary in the function create_elf_tables() of Linux kernel fs/binfmt_elf.c. The following code snippet show how to read the value:
$ cat at_base.c
#include <stdio.h>
#include <elf.h>
int
main(int argc, char* argv[], char* envp[])
{
Elf64_auxv_t *auxp;
while(*envp++ != NULL);
for (auxp = (Elf64_auxv_t *)envp; auxp->a_type != 0; auxp++) {
if (auxp->a_type == 7) {
printf("AT_BASE: 0x%lx\n", auxp->a_un.a_val);
}
}
}
$ clang -o at_base at_base.c
$ ./at_base
AT_BASE: 0x7fcfd4025000
Linux Auxiliary Vector definition
Auxiliary Vector Reference
It used to be a fixed address on x86 32 bits architecture, but with ASLR now, it's randomized. You can use setarch i386 -R to disable randomization if you want.
It's defined in the linker script. You can dump the default linker script with ld --verbose. Example output:
GNU ld (GNU Binutils) 2.23.1
Supported emulations:
elf_x86_64
elf32_x86_64
elf_i386
i386linux
elf_l1om
elf_k1om
using internal linker script:
==================================================
/* Script for -z combreloc: combine and sort reloc sections */
OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64",
"elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)
ENTRY(_start)
SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/x86_64-unknown-linux-gnu/lib64"); SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/lib64"); SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/x86_64-unknown-linux-gnu/lib"); SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/lib");
SECTIONS
{
/* Read-only sections, merged into text segment: */
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
.interp : { *(.interp) }
.note.gnu.build-id : { *(.note.gnu.build-id) }
.hash : { *(.hash) }
.gnu.hash : { *(.gnu.hash) }
.dynsym : { *(.dynsym) }
.dynstr : { *(.dynstr) }
.gnu.version : { *(.gnu.version) }
.gnu.version_d : { *(.gnu.version_d) }
.gnu.version_r : { *(.gnu.version_r) }
(snip)
In case you missed it: __executable_start = SEGMENT_START("text-segment", 0x400000)).
And for me, sure enough, when I link a simple .o file into a binary, the entry point address is very close to 0x400000.
The entry point address in the ELF metadata is this value, plus the offset from the beginning of the .text section to the _start symbol. Note also that the _start symbol can be configured. Again from my default linker script example: ENTRY(_start).

avr-gcc link error with independent obj

a little project of avr, when i directly compile main.c(all other c are included in main.c), all ok.
avr-gcc -Wall -mmcu=atmega8 -g -O1 $1 -o $1.out
avr-objdump -dS $1.out>$1.asm
with asm, i can see all vector here.
main.c.out: file format elf32-avr
Disassembly of section .text:
00000000 <__vectors>:
0: 1d cd rjmp .-1478 ; 0xfffffa3c <__eeprom_end+0xff7efa3c>
2: 37 cd rjmp .-1426 ; 0xfffffa72 <__eeprom_end+0xff7efa72>
when i compile each c into obj (just add some header files, codes almost same), then link them, the result fail. of course MCU become mad.
for i in src/*.c; do j=`basename $i`; j=obj/${j%%.c}.o; avr-gcc -c $i -o $j -mmcu=atmega8 -g -O1 -Wall; done;
avr-ld obj/*.o -o a.out;
avr-objdump -dS a.out >a.asm;
here is code in asm, not vector jump here, but just my rom datas.
Disassembly of section .text:
00000000 <tm_tone>:
0: 00 00 e0 1d 9e 1a b5 17 62 16 ee 13 c1 11 d0 0f ........b.......
00000010 <tiger>:
10: 31 32 33 31 31 32 33 31 33 34 35 30 33 34 35 30 1231123134503450
any advise? thanks. if i need special each obj file manually when use avr-ld?
only this can be ok, after compile to obj. link use avr-gcc.
avr-gcc -Wall -mmcu=atmega8 -g -O1 -o main.o src/main.c obj/usart.o obj/irda.o obj/everybody.o obj/audio.o

Resources