Is it safe for ld to interpret executables linked by gold? - linux

Take a simple hello world program and compile it as follows:
> g++ --version
g++ 6.3.0
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> g++ -fuse-ld=gold test.cpp -o test
Inspecting the binary produced:
> readelf -l ./test
Elf file type is EXEC (Executable file)
Entry point 0x400750
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001f8 0x00000000000001f8 R 8
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x000000000000001c 0x000000000000001c R 1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000000ac8 0x0000000000000ac8 R E 1000
LOAD 0x0000000000000dc0 0x0000000000401dc0 0x0000000000401dc0
0x0000000000000288 0x00000000000003d0 RW 1000
DYNAMIC 0x0000000000000de0 0x0000000000401de0 0x0000000000401de0
0x0000000000000200 0x0000000000000200 RW 8
NOTE 0x0000000000000254 0x0000000000400254 0x0000000000400254
0x0000000000000044 0x0000000000000044 R 4
GNU_EH_FRAME 0x0000000000000a8c 0x0000000000400a8c 0x0000000000400a8c
0x000000000000003c 0x000000000000003c R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0
GNU_RELRO 0x0000000000000dc0 0x0000000000401dc0 0x0000000000401dc0
0x0000000000000240 0x0000000000000240 RW 8
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .dynsym .dynstr .gnu.hash .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame .eh_frame_hdr
03 .jcr .fini_array .init_array .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .jcr .fini_array .init_array .dynamic .got
Notice that the interpreter used is ld. Whilst the program happens to work, I've not been able to find any information on whether this is safe. For all I know, gold interprets the ELF specification in a subtly different and incompatible way that requires a different interpreter.
I've done my best to research this, but have been unable to find anything that answers my question. The closest I've found is that gold struggles to link the Linux kernel (or struggled, since time has past and it may have been fixed).

You're falling into the naming trap — gold is a link editor, while ld.so is a dynamic loader. Although at different times, they are called linkers (the latter often referred to also as runtime linker.)
Their scope and usage is very different, the first one generates the final executable you'll eventually run, while the latter takes the generated file, finds its dependencies, and resolve (links) the undefined symbols between those.
Indeed, gold and ld (precisely, bfd-ld), the link editors, are provided by binutils (or alternative toolchain packages such as clang and so on), while ld.so is provided by the C library package, usually glibc on Linux distributions, but alternatively uclibc or musl.
Combining this with Martin Rosenau's comment...
Looking at the content of /usr/bin/gold, you can see that the string /lib64/ld-linux-x86-64.so.2 is stored inside the gold executable. This means that the gold linker itself "decides" using that runtime interpreter. For this reason I doubt that there are incompatibilities.
... ld.so should be compatible with the gold linker.

Related

what is segment 00 in my Linux executable program (64 bits)

Here is a very simple assembly program, just return 12 after executed.
$ cat a.asm
global _start
section .text
_start: mov rax, 60 ; system call for exit
mov rdi, 12 ; exit code 12
syscall
It can be built and executed correctly:
$ nasm -f elf64 a.asm && ld a.o && ./a.out || echo $?
12
But the size of a.out is big, it is more than 4k:
$ wc -c a.out
4664 a.out
I try to understand it by reading elf content:
$ readelf -l a.out
Elf file type is EXEC (Executable file)
Entry point 0x401000
There are 2 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000000b0 0x00000000000000b0 R 0x1000
LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000
0x000000000000000c 0x000000000000000c R E 0x1000
Section to Segment mapping:
Segment Sections...
00
01 .text
it is strange, segment 00 is aligned by 0x1000, I think it means such segment at least will occupy 4096 bytes.
My question is what is this segment 00?
(nasm version 2.14.02, ld version 2.34, os is Ubuntu 20.04.1)
Since it starts at file offset zero, it is probably a "padding" segment introduced to make the loading of the ELF more efficient.
The .text segment will, in fact, be already aligned in the file as it should be in memory.
You can force ld not to align sections both in memory and in the file with -n. You can also strip the symbols with -s.
This will reduce the size to about 352 bytes.
Now the ELF contains:
The ELF header (Needed)
The program header table (Needed)
The code (Needed)
The string table (Possibly unneeded)
The section table (Possibly unneeded)
The string table can be removed, but apparently strips can't do that.
I've removed the .shstrtab section data and all the section headers manually to shrink the size down to 144 bytes.
Consider that 64 bytes come from the ELF header, 60 from the single program header and 12 from your code; for a total of 136 bytes.
The extra 8 bytes are padding, 4 bytes at the end of the code section (easy to remove), and one at the end of the program header (which requires a bit of patching).

What's the difference between "statically linked" and "not a dynamic executable" from Linux ldd?

Consider this AMD64 assembly program:
.globl _start
_start:
xorl %edi, %edi
movl $60, %eax
syscall
If I compile that with gcc -nostdlib and run ldd a.out, I get this:
statically linked
If I instead compile that with gcc -static -nostdlib and run ldd a.out, I get this:
not a dynamic executable
What's the difference between statically linked and not a dynamic executable? And if my binary was already statically linked, why does adding -static affect anything?
There are two separate things here:
Requesting an ELF interpreter (ld.so) or not.
Like #!/bin/sh but for binaries, runs before your _start.
This is the difference between a static vs. dynamic executable.
The list of dynamically linked libraries for ld.so to load happens to be empty.
This is apparently what ldd calls "statically linked", i.e. that any libraries you might have linked at build time were static libraries.
Other tools like file and readelf give more information and use terminology that matches what you'd expect.
Your GCC is configured so -pie is the default, and gcc doesn't make a static-pie for the special case of no dynamic libraries.
gcc -nostdlib just makes a PIE that happens not to link to any libraries but is otherwise identical to a normal PIE, specifying an ELF interpreter.
ldd confusingly calls this "statically linked".
file : ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2 ...
gcc -nostdlib -static overrides the -pie default and makes a true static executable.
file : ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked ...
gcc -nostdlib -no-pie also chooses to make a static executable as an optimization for the case where there are no dynamic libraries at all. Since a non-PIE executable couldn't have been ASLRed anyway, this makes sense. Byte-for-byte identical to the -static case.
gcc -nostdlib -static-pie makes an ASLRable executable that doesn't need an ELF interpreter. GCC doesn't do this by default for gcc -pie -nostdlib, unlike the no-pie case where it chooses to sidestep ld.so when no dynamically-linked libraries are involved.
file : ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), statically linked ...
-static-pie is obscure, rarely used, and older file doesn't identify it as statically linked.
-nostdlib doesn't imply -no-pie or -static, and -static-pie has to be explicitly specified to get that.
gcc -static-pie invokes ld -static -pie, so ld has to know what that means. Unlike with the non-PIE case where you don't have to ask for a dynamic executable explicitly, you just get one if you pass ld any .so libraries. I think that's why you happen to get a static executable from gcc -nostdlib -no-pie - GCC doesn't have to do anything special, it's just ld doing that optimization.
But ld doesn't enable -static implicitly when -pie is specified, even when there are no shared libraries to link.
Details
Examples generated with gcc --version gcc (Arch Linux 9.3.0-1) 9.3.0
ld --version GNU ld (GNU Binutils) 2.34 (also readelf is binutils)
ldd --version ldd (GNU libc) 2.31
file --version file-5.38 - note that static-pie detection has changed in recent patches, with Ubuntu cherry-picking an unreleased patch. (Thanks #Joseph for the detective work) - this in 2019 detected dynamic = having a PT_INTERP to handle static-pie, but it was reverted to detect based on PT_DYNAMIC so shared libraries count as dynamic. debian bug #948269. static-pie is an obscure rarely-used feature.
GCC ends up running ld -pie exit.o with a dynamic linker path specified, and no libraries. (And a boatload of other options to support possible LTO link-time optimization, but the keys here are -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie. collect2 is just a wrapper around ld.)
$ gcc -nostdlib exit.s -v # output manually line wrapped with \ for readability
...
COLLECT_GCC_OPTIONS='-nostdlib' '-v' '-mtune=generic' '-march=x86-64'
/usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/collect2 \
-plugin /usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/liblto_plugin.so \
-plugin-opt=/usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/lto-wrapper \
-plugin-opt=-fresolution=/tmp/ccoNx1IR.res \
--build-id --eh-frame-hdr --hash-style=gnu \
-m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie \
-L/usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0 \
-L/usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/../../../../lib -L/lib/../lib \
-L/usr/lib/../lib \
-L/usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/../../.. \
/tmp/cctm2fSS.o
You get a dynamic PIE with no dependencies on other libraries. Running it still invokes the "ELF interpreter" /lib64/ld-linux-x86-64.so.2 on it which runs before jumping to your _start. (Although the kernel has already mapped the executable's ELF segments to ASLRed virtual addresses, along with ld.so's text / data / bss).
file and readelf are more descriptive.
PIE non-static executable from gcc -nostdlib
$ gcc -nostdlib exit.s -o exit-default
$ ls -l exit-default
-rwxr-xr-x 1 peter peter 13536 May 2 02:15 exit-default
$ ldd exit-default
statically linked
$ file exit-default
exit-default: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=05a4d1bdbc94d6f91cca1c9c26314e1aa227a3a5, not stripped
$ readelf -a exit-default
...
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x1000
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000000040 0x0000000000000040
0x00000000000001f8 0x00000000000001f8 R 0x8
INTERP 0x0000000000000238 0x0000000000000238 0x0000000000000238
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000000002b1 0x00000000000002b1 R 0x1000
LOAD 0x0000000000001000 0x0000000000001000 0x0000000000001000
0x0000000000000009 0x0000000000000009 R E 0x1000
... (the Read+Exec segment to be mapped at virt addr 0x1000 is where your text section was linked.)
If you strace it you can also see the differences:
$ gcc -nostdlib exit.s -o exit-default
$ strace ./exit-default
execve("./exit-default", ["./exit-default"], 0x7ffe1f526040 /* 51 vars */) = 0
brk(NULL) = 0x5617eb1e4000
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffcea703380) = -1 EINVAL (Invalid argument)
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9ff5b3e000
arch_prctl(ARCH_SET_FS, 0x7f9ff5b3ea80) = 0
mprotect(0x5617eabac000, 4096, PROT_READ) = 0
exit(0) = ?
+++ exited with 0 +++
vs. -static and -static-pie the first instruction executed in user-space is your _start (which you can also check with GDB using starti).
$ strace ./exit-static-pie
execve("./exit-static-pie", ["./exit-static-pie"], 0x7ffcdac96dd0 /* 51 vars */) = 0
exit(0) = ?
+++ exited with 0 +++
gcc -nostdlib -static-pie
$ gcc -nostdlib -static-pie exit.s -o exit-static-pie
$ ls -l exit-static-pie
-rwxr-xr-x 1 peter peter 13440 May 2 02:18 exit-static-pie
peter#volta:/tmp$ ldd exit-static-pie
statically linked
peter#volta:/tmp$ file exit-static-pie
exit-static-pie: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=daeb4a8f11bec1bb1aaa13cd48d24b5795af638e, not stripped
$ readelf -a exit-static-pie
...
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x1000
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000229 0x0000000000000229 R 0x1000
LOAD 0x0000000000001000 0x0000000000001000 0x0000000000001000
0x0000000000000009 0x0000000000000009 R E 0x1000
... (no Interp header, but still a read+exec text segment)
Notice that the addresses are still relative to the image base, leaving ASLR up to the kernel.
Surprisingly, ldd doesn't say that it's not a dynamic executable. That might be a bug, or a side effect of some implementation detail.
gcc -nostdlib -static traditional non-PIE old-school static executable
$ gcc -nostdlib -static exit.s -o exit-static
$ ls -l exit-static
-rwxr-xr-x 1 peter peter 4744 May 2 02:26 exit-static
peter#volta:/tmp$ ldd exit-static
not a dynamic executable
peter#volta:/tmp$ file exit-static
exit-static: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=1b03e3d05709b7288fe3006b4696fd0c11fb1cb2, not stripped
peter#volta:/tmp$ readelf -a exit-static
ELF Header:
...
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x401000
... (Note the absolute entry-point address nailed down at link time)
(And that the ELF type is EXEC, not DYN)
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x000000000000010c 0x000000000000010c R 0x1000
LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000
0x0000000000000009 0x0000000000000009 R E 0x1000
NOTE 0x00000000000000e8 0x00000000004000e8 0x00000000004000e8
0x0000000000000024 0x0000000000000024 R 0x4
Section to Segment mapping:
Segment Sections...
00 .note.gnu.build-id
01 .text
02 .note.gnu.build-id
...
Those are all the program headers; unlike pie / static-pie I'm not leaving any out, just other whole parts of the readelf -a output.
Also note the absolute virtual addresses in the program headers that don't give the kernel a choice where in virtual address space to map the file. This is the difference between EXEC and DYN types of ELF objects. PIE executables are shared objects with an entry point, allowing us to get ASLR for the main executable. Actual EXEC executables have a link-time-chosen memory layout.
ldd apparently only reports "not a dynamic executable" when both:
no ELF interpreter (dynamic linker) path
ELF type = EXEC

Linker issue while cross compiling for cortex-a8 device [duplicate]

This question already has an answer here:
Error during Cross-compiling C code with Dynamic libraries
(1 answer)
Closed 6 years ago.
I am using VAR-SOM-AM33 board and want to run sample code like hello world run on device and it gives -sh:./test:not found error
Toolchain used for compile code is gcc-linaro-arm-linux-gnueabihf-4.7-2013.03-20130313_linux.
Code is as below
#include <stdio.h>
int main(){
printf("hello world");
return 0;
}
for crosscompile file
arm-linux-gnueabihf-gcc test.c -march=armv7-a -marm -mthumb-interwork -mfloat-abi=hard -mfpu=neon -mtune=cortex-a8 -o test
output binary is shows as follows
test: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.31, BuildID[sha1]=2ce1c5b3d97dac2093fe2cd2d340cdaa9989923f, not stripped
After copy that file into hardware and run it shows following error
root#am335x-evm:~# ./test
-sh: ./test: not found
File permission also change by
root#am335x-evm:~# chmod +x test
but result shows same not found error.
Demo file which is running on hardware,its architecture as follows
root#am335x-evm:~# readelf -A /usr/bin/hello
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "7-A"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Application
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-2
Tag_VFP_arch: VFPv3
Tag_Advanced_SIMD_arch: NEONv1
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Needed
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align8_needed: Yes
Tag_ABI_align8_preserved: Yes, except leaf SP
Tag_ABI_enum_size: int
Tag_ABI_HardFP_use: SP and DP
and file which is cross compiled,its architecture as follows
root#am335x-evm:~# readelf -A test
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "7-A"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Application
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-2
Tag_VFP_arch: VFPv3
Tag_Advanced_SIMD_arch: NEONv1
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Needed
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align8_needed: Yes
Tag_ABI_align8_preserved: Yes, except leaf SP
Tag_ABI_enum_size: int
Tag_ABI_HardFP_use: SP and DP
Tag_ABI_VFP_args: VFP registers
Tag_CPU_unaligned_access: v6
Tag_unknown_44: 1 (0x1)
Also hardware cpuinfo is as follows
root#am335x-evm:~# cat /proc/cpuinfo
Processor : ARMv7 Processor rev 2 (v7l)
BogoMIPS : 718.02
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x3
CPU part : 0xc08
CPU revision : 2
Hardware : Variscite VAR-SOM-AM33
Revision : 0000
Serial : 0000000000000000
I have tried running ldd command on target device.
root#am335x-evm:~# ldd
-sh: ldd: not found
So I suspect that the issue is related to the linker.
If I simply compile the file, without linking it.
arm-linux-gnueabihf-gcc -mtune=cortex-a8 -march=armv7 -O -c test.c -o test
Now if I run this file I get this error.
root#am335x-evm:~# chmod +x test
root#am335x-evm:~# ./test
./test: line 1: syntax error: word unexpected (expecting ")")
please suggest how to resolve this.
Thanks for solution, I found error with linker.
existing file has linker ld-linux.so.3 and crosscompiled file has linker ld-linux-armhf.so.3
root#am335x-evm:/usr/bin# readelf -l hello
Elf file type is EXEC (Executable file)
Entry point 0x82fc
There are 8 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
EXIDX 0x00044c 0x0000844c 0x0000844c 0x00008 0x00008 R 0x4
PHDR 0x000034 0x00008034 0x00008034 0x00100 0x00100 R E 0x4
INTERP 0x000134 0x00008134 0x00008134 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.3]
crosscompiled file program header
root#am335x-evm:~# readelf -l test
Elf file type is EXEC (Executable file)
Entry point 0x82f9
There are 8 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
EXIDX 0x000450 0x00008450 0x00008450 0x00008 0x00008 R 0x4
PHDR 0x000034 0x00008034 0x00008034 0x00100 0x00100 R E 0x4
INTERP 0x000134 0x00008134 0x00008134 0x00019 0x00019 R 0x1
[Requesting program interpreter: /lib/ld-linux-armhf.so.3]
after chang that linker program runs on target device.
root#am335x-evm:~# cd /lib/
root#am335x-evm:/lib# ls -l ld-linux.so.3
lrwxrwxrwx 1 1000 1000 12 Aug 7 2012 ld-linux.so.3 -> ld-2.12.2.so
root#am335x-evm:/lib# ln -s /li ld-2.12.2.so ld-linux-armhf.so.3
/lib/ /linuxrc
root#am335x-evm:/lib# ln -s /lib/ld-2.12.2.so ld-linux-armhf.so.3
root#am335x-evm:/lib# ldconfig
root#am335x-evm:/lib# cd
root#am335x-evm:~# ./test
hello worldroot#am335x-evm:~#

objdump and udis86 produce different output when disassembling /proc/kcore

I need to disassemble /proc/kcore file in Linux and I need to obtain virtual addresses of some special instructions to put kprobes later on it. According to this document /proc/kcore is an image of physical memory, but in this question someone answered that it is kernel's virtual memory (exactly what I am looking for).
When I use objdump tool to disassemble it, it starts with address something like f7c0b000, but udis86 starts with 0x0 (and totally different instruction). When I try to grep some specific instruction, let's say mov 0xf7c1d60c,%edx, I got:
objdump
f7c0b022 mov 0xf7c1d60c,%edx
udis86
290ec02a mov 0xf7c1d60c,%edx
It looks like the offset between udis86 and objdump is always 0xbffff000. Why so strange offset? How can I obtain virtual address of specific instruction? Somewhere I've read, that kernel is statically mapped at virtual address 0xc0000000 + 0x100000. If /proc/kcore is really physical image, is it correct only to add 0x100000 to addresses returned by objdump and I will get virtual address?
objdump understands ELF format files (such as /proc/kcore). It is able to extract the executable sections of the file while ignoring non-executable content (such as .note sections).
You can see the structure of an ELF exectuable using the -h flag, for example:
# objdump -h /proc/kcore
/proc/kcore: file format elf64-x86-64
Sections:
Idx Name Size VMA LMA File off Algn
0 note0 00001944 0000000000000000 0000000000000000 000002a8 2**0
CONTENTS, READONLY
1 .reg/0 000000d8 0000000000000000 0000000000000000 0000032c 2**2
CONTENTS
2 .reg 000000d8 0000000000000000 0000000000000000 0000032c 2**2
CONTENTS
3 load1 00800000 ffffffffff600000 0000000000000000 7fffff602000 2**12
CONTENTS, ALLOC, LOAD, CODE
(...)
It looks like the udcli tool from udis86 probably starts disassembling things from the beginning of the file, which suggests that your output will probably start with a bunch of irrelevant output and it's up to you to figure out where execution starts.
UPDATE
Here's the verification. We use this answer to extract the first load section from /proc/kcore, like this:
# dd if=/proc/kcore of=mysection bs=1 skip=$[0x7fffff602000] count=$[0x00800000]
And now if we view that with udcli:
# udcli mysection
0000000000000000 48 dec eax
0000000000000001 c7c060000000 mov eax, 0x60
0000000000000007 0f05 syscall
0000000000000009 c3 ret
000000000000000a cc int3
000000000000000b cc int3
We see that it looks almost identical to the output of objdump -d /proc/kcore:
# objdump -d /proc/kcore
/proc/kcore: file format elf64-x86-64
Disassembly of section load1:
ffffffffff600000 <load1>:
ffffffffff600000: 48 c7 c0 60 00 00 00 mov $0x60,%rax
ffffffffff600007: 0f 05 syscall
ffffffffff600009: c3 retq
ffffffffff60000a: cc int3
ffffffffff60000b: cc int3

How to make profilers (valgrind, perf, pprof) pick up / use local version of library with debugging symbols when using mpirun?

Edit: added important note that it is about debugging MPI application
System installed shared library doesn't have debugging symbols:
$ readelf -S /usr/lib64/libfftw3.so | grep debug
$
I have therefore compiled and instaled in my home directory my owne version, with debugging enabled (--with-debug CFLAGS=-g):
$ $ readelf -S ~/lib64/libfftw3.so | grep debug
[26] .debug_aranges PROGBITS 0000000000000000 001d3902
[27] .debug_pubnames PROGBITS 0000000000000000 001d8552
[28] .debug_info PROGBITS 0000000000000000 001ddebd
[29] .debug_abbrev PROGBITS 0000000000000000 003e221c
[30] .debug_line PROGBITS 0000000000000000 00414306
[31] .debug_str PROGBITS 0000000000000000 0044aa23
[32] .debug_loc PROGBITS 0000000000000000 004514de
[33] .debug_ranges PROGBITS 0000000000000000 0046bc82
I have set both LD_LIBRARY_PATH and LD_RUN_PATH to include ~/lib64 first, and ldd program confirms that local version of library should be used:
$ ldd a.out | grep fftw
libfftw3.so.3 => /home/narebski/lib64/libfftw3.so.3 (0x00007f2ed9a98000)
The program in question is parallel numerical application using MPI (Message Passing Interface). Therefore to run this application one must use mpirun wrapper (e.g. mpirun -np 1 valgrind --tool=callgrind ./a.out). I use OpenMPI implementation.
Nevertheless, various profilers: callgrind tool in Valgrind, CPU profiling google-perfutils and perf doesn't find those debugging symbols, resulting in more or less useless output:
calgrind:
$ callgrind_annotate --include=~/prog/src --inclusive=no --tree=none
[...]
--------------------------------------------------------------------------------
Ir file:function
--------------------------------------------------------------------------------
32,765,904,336 ???:0x000000000014e500 [/usr/lib64/libfftw3.so.3.2.4]
31,342,886,912 /home/narebski/prog/src/nonlinearity.F90:__nonlinearity_MOD_calc_nonlinearity_kxky [/home/narebski/prog/bin/a.out]
30,288,261,120 /home/narebski/gene11/src/axpy.F90:__axpy_MOD_axpy_ij [/home/narebski/prog/bin/a.out]
23,429,390,736 ???:0x00000000000fc5e0 [/usr/lib64/libfftw3.so.3.2.4]
17,851,018,186 ???:0x00000000000fdb80 [/usr/lib64/libmpi.so.1.0.1]
google-perftools:
$ pprof --text a.out prog.prof
Total: 8401 samples
842 10.0% 10.0% 842 10.0% 00007f200522d5f0
619 7.4% 17.4% 5025 59.8% calc_nonlinearity_kxky
517 6.2% 23.5% 517 6.2% axpy_ij
427 5.1% 28.6% 3156 37.6% nl_to_direct_xy
307 3.7% 32.3% 1234 14.7% nl_to_fourier_xy_1d
perf events:
$ perf report --sort comm,dso,symbol
# Events: 80K cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... .................... ............................................
#
32.42% a.out libfftw3.so.3.2.4 [.] fdc4c
16.25% a.out 7fddcd97bb22 [.] 7fddcd97bb22
7.51% a.out libatlas.so.0.0.0 [.] ATL_dcopy_xp1yp1aXbX
6.98% a.out a.out [.] __nonlinearity_MOD_calc_nonlinearity_kxky
5.82% a.out a.out [.] __axpy_MOD_axpy_ij
Edit Added 11-07-2011:
I don't know if it is important, but:
$ file /usr/lib64/libfftw3.so.3.2.4
/usr/lib64/libfftw3.so.3.2.4: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, stripped
and
$ file ~/lib64/libfftw3.so.3.2.4
/home/narebski/lib64/libfftw3.so.3.2.4: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, not stripped
If /usr/lib64/libfftw3.so.3.2.4 is listed in callgrind output, then your LD_LIBRARY_PATH=~/lib64 had no effect.
Try again with export LD_LIBRARY_PATH=$HOME/lib64. Also watch out for any shell scripts you invoke, which might reset your environment.
You and Employed Russian are almost certainly right; the mpirun script is messing things up here. Two options:
Most x86 MPI implementations, as a practical matter, treat just running the executable
./a.out
the same as
mpirun -np 1 ./a.out.
They don't have to do this, but OpenMPI certainly does, as does MPICH2 and IntelMPI. So if you can do the debug serially, you should just be able to
valgrind --tool=callgrind ./a.out.
However, if you do want to run with mpirun, the issue is probably that your ~/.bashrc
(or whatever) is being sourced, undoing your changes to LD_LIBRARY_PATH etc. Easiest is just to temporarily put your changed environment variables in your ~/.bashrc for the duration of the run.
The way recent profiling tools typically handle this situation is to consult an external, matching non-stripped version of the library.
On debian-based Linux distros this is typically done by installing the -dbg suffixed version of a package; on Redhat-based they are named -debuginfo.
In the case of the tools you mentioned above; they will typically Just Work (tm) and find the debug symbols for a library if the debug info package has been installed in the standard location.

Resources