Linux default behavior of executable .data section changed between 5.4 and 5.9? - linux

Story
Case 1
I accidentally wrote my Assembly code in the .data section. I compiled it and executed it. The program ran normally under Linux 5.4.0-53-generic even though I didn't specify a flag like execstack.
Case 2:
After that, I executed the program under Linux 5.9.0-050900rc5-generic. The program got SIGSEGV. I inspected the virtual memory permission by reading /proc/$pid/maps. It turned out that the section is not executable.
I think there is a configuration in Linux that manages that permission, but I don't know where to find it.
Code
[Linux 5.4.0-53-generic]
Run (normal)
ammarfaizi2@integral:/tmp$ uname -r
5.4.0-53-generic
ammarfaizi2@integral:/tmp$ cat test.asm
[section .data]
global _start
_start:
mov eax, 60
xor edi, edi
syscall
ammarfaizi2@integral:/tmp$ nasm --version
NASM version 2.14.02
ammarfaizi2@integral:/tmp$ nasm -felf64 test.asm -o test.o
ammarfaizi2@integral:/tmp$ ld test.o -o test
ammarfaizi2@integral:/tmp$ ./test
ammarfaizi2@integral:/tmp$ echo $?
0
ammarfaizi2@integral:/tmp$ md5sum test
7ffff5fd44e6ff0a278e881732fba525 test
ammarfaizi2@integral:/tmp$
Check Permission (00400000-00402000 rwxp), so it is executable.
## Debug
gef➤ shell cat /proc/`pgrep test`/maps
00400000-00402000 rwxp 00000000 08:03 7471589 /tmp/test
7ffff7ffb000-7ffff7ffe000 r--p 00000000 00:00 0 [vvar]
7ffff7ffe000-7ffff7fff000 r-xp 00000000 00:00 0 [vdso]
7ffffffde000-7ffffffff000 rwxp 00000000 00:00 0 [stack]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
gef➤
[Linux 5.9.0-050900rc5-generic]
Run (Segfault)
root@esteh:/tmp# uname -r
5.9.0-050900rc5-generic
root@esteh:/tmp# cat test.asm
[section .data]
global _start
_start:
mov eax, 60
xor edi, edi
syscall
root@esteh:/tmp# nasm --version
NASM version 2.14.02
root@esteh:/tmp# nasm -felf64 test.asm -o test.o
root@esteh:/tmp# ld test.o -o test
root@esteh:/tmp# ./test
Segmentation fault (core dumped)
root@esteh:/tmp# echo $?
139
root@esteh:/tmp# md5sum test
7ffff5fd44e6ff0a278e881732fba525 test
root@esteh:/tmp#
Check Permission (00400000-00402000 rw-p), so it is NOT executable.
## Debug
gef➤ shell cat /proc/`pgrep test`/maps
00400000-00402000 rw-p 00000000 fc:01 2412 /tmp/test
7ffff7ff9000-7ffff7ffd000 r--p 00000000 00:00 0 [vvar]
7ffff7ffd000-7ffff7fff000 r-xp 00000000 00:00 0 [vdso]
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
gef➤
objdump -p
root@esteh:/tmp# objdump -p test
test: file format elf64-x86-64
Program Header:
LOAD off 0x0000000000000000 vaddr 0x0000000000400000 paddr 0x0000000000400000 align 2**12
filesz 0x0000000000001009 memsz 0x0000000000001009 flags rw-
Questions
Where is the configuration on Linux that manages default ELF sections permission?
Are my observations on permissions correct?
Summary
Default permission for .data section on Linux 5.4.0-53-generic is executable.
Default permission for .data section on Linux 5.9.0-050900rc5-generic is NOT executable.

Your binary is missing PT_GNU_STACK. As such, this change appears to have been caused by commit 9fccc5c0c99f238aa1b0460fccbdb30a887e7036:
From 9fccc5c0c99f238aa1b0460fccbdb30a887e7036 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook@chromium.org>
Date: Thu, 26 Mar 2020 23:48:17 -0700
Subject: x86/elf: Disable automatic READ_IMPLIES_EXEC on 64-bit
With modern x86 64-bit environments, there should never be a need for
automatic READ_IMPLIES_EXEC, as the architecture is intended to always
be execute-bit aware (as in, the default memory protection should be NX
unless a region explicitly requests to be executable).
There were very old x86_64 systems that lacked the NX bit, but for those,
the NX bit is, obviously, unenforceable, so these changes should have
no impact on them.
Suggested-by: Hector Marco-Gisbert <hecmargi@upv.es>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Link: https://lkml.kernel.org/r/20200327064820.12602-4-keescook@chromium.org
---
arch/x86/include/asm/elf.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 397a1c74433ec..452beed7892bb 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -287,7 +287,7 @@ extern u32 elf_hwcap2;
* CPU: | lacks NX* | has NX, ia32 | has NX, x86_64 |
* ELF: | | | |
* ---------------------|------------|------------------|----------------|
- * missing PT_GNU_STACK | exec-all | exec-all | exec-all |
+ * missing PT_GNU_STACK | exec-all | exec-all | exec-none |
* PT_GNU_STACK == RWX | exec-stack | exec-stack | exec-stack |
* PT_GNU_STACK == RW | exec-none | exec-none | exec-none |
*
@@ -303,7 +303,7 @@ extern u32 elf_hwcap2;
*
*/
#define elf_read_implies_exec(ex, executable_stack) \
- (executable_stack == EXSTACK_DEFAULT)
+ (mmap_is_ia32() && executable_stack == EXSTACK_DEFAULT)
struct task_struct;
--
cgit 1.2.3-1.el7
This was first present in the 5.8 series. See also Unexpected exec permission from mmap when assembly files included in the project.
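To make the table in the commit easier to read, here is a small Python model of the macro before and after the change. It is only a sketch of the decision logic quoted in the diff, not kernel code: the enum and the `is_ia32` parameter (standing in for `mmap_is_ia32()`) are illustrative names.

```python
from enum import Enum

class ExStack(Enum):
    # Illustrative stand-ins for the kernel's EXSTACK_* constants.
    DEFAULT = "missing PT_GNU_STACK"
    DISABLE_X = "PT_GNU_STACK == RW"
    ENABLE_X = "PT_GNU_STACK == RWX"

def read_implies_exec_before(executable_stack, is_ia32):
    # Macro on the '-' side of the diff: the process type is ignored.
    return executable_stack is ExStack.DEFAULT

def read_implies_exec_after(executable_stack, is_ia32):
    # Macro on the '+' side of the diff: only ia32 processes qualify.
    return is_ia32 and executable_stack is ExStack.DEFAULT

for state in ExStack:
    print(state.value,
          "| before:", read_implies_exec_before(state, is_ia32=False),
          "| after:", read_implies_exec_after(state, is_ia32=False))
```

For a 64-bit binary without PT_GNU_STACK, the first function returns True (exec-all) and the second returns False (exec-none), which matches the rwxp and rw-p mappings shown in the question.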

This is only a guess: I think the culprit is the READ_IMPLIES_EXEC personality that was being set automatically in the absence of a PT_GNU_STACK segment.
In the 5.4 kernel source we can find this piece of code:
SET_PERSONALITY2(loc->elf_ex, &arch_state);
if (elf_read_implies_exec(loc->elf_ex, executable_stack))
current->personality |= READ_IMPLIES_EXEC;
That's the only thing that can transform an RW section into an RWX one. No other use of PROT_EXEC seemed to me to have changed or to be relevant to this question.
The executable_stack is set here:
for (i = 0; i < loc->elf_ex.e_phnum; i++, elf_ppnt++)
switch (elf_ppnt->p_type) {
case PT_GNU_STACK:
if (elf_ppnt->p_flags & PF_X)
executable_stack = EXSTACK_ENABLE_X;
else
executable_stack = EXSTACK_DISABLE_X;
break;
But if the PT_GNU_STACK segment is not present, that variable retains its default value:
int executable_stack = EXSTACK_DEFAULT;
This workflow is identical in both 5.4 and the latest kernel source. What changed is the definition of elf_read_implies_exec:
Linux 5.4:
/*
* An executable for which elf_read_implies_exec() returns TRUE will
* have the READ_IMPLIES_EXEC personality flag set automatically.
*/
#define elf_read_implies_exec(ex, executable_stack) \
(executable_stack != EXSTACK_DISABLE_X)
Latest Linux:
/*
* An executable for which elf_read_implies_exec() returns TRUE will
* have the READ_IMPLIES_EXEC personality flag set automatically.
*
* The decision process for determining the results are:
*
* CPU: | lacks NX* | has NX, ia32 | has NX, x86_64 |
* ELF: | | | |
* ---------------------|------------|------------------|----------------|
* missing PT_GNU_STACK | exec-all | exec-all | exec-none |
* PT_GNU_STACK == RWX | exec-stack | exec-stack | exec-stack |
* PT_GNU_STACK == RW | exec-none | exec-none | exec-none |
*
* exec-all : all PROT_READ user mappings are executable, except when
* backed by files on a noexec-filesystem.
* exec-none : only PROT_EXEC user mappings are executable.
* exec-stack: only the stack and PROT_EXEC user mappings are executable.
*
* *this column has no architectural effect: NX markings are ignored by
* hardware, but may have behavioral effects when "wants X" collides with
* "cannot be X" constraints in memory permission flags, as in
* https://lkml.kernel.org/r/20190418055759.GA3155@mellanox.com
*
*/
#define elf_read_implies_exec(ex, executable_stack) \
(mmap_is_ia32() && executable_stack == EXSTACK_DEFAULT)
Note how in the 5.4 version elf_read_implies_exec returned true whenever the stack was not explicitly marked as non-executable (via the PT_GNU_STACK segment).
In the latest source, the check is more restrictive: elf_read_implies_exec is true only for 32-bit executables, and only when no PT_GNU_STACK segment was found in the ELF binary.
I assembled your program, linked it, and found no PT_GNU_STACK segment, so this may be the reason.
If this is indeed the issue, and if I followed the code correctly, then marking the stack as non-executable in the binary should stop its data section from being mapped executable (even on Linux 5.4).
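To check a binary for this without readelf, a minimal Python sketch can walk the program headers directly. It assumes a little-endian ELF64 file, and the function name `gnu_stack_flags` is made up for this example.

```python
import struct

PT_GNU_STACK = 0x6474E551  # p_type value of the GNU stack segment
PF_X = 0x1                 # execute bit in p_flags

def gnu_stack_flags(image):
    """Return p_flags of PT_GNU_STACK in an ELF64 image, or None if absent."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x36)
    for i in range(e_phnum):
        # p_type and p_flags are the first two 32-bit fields of an ELF64 phdr.
        p_type, p_flags = struct.unpack_from("<II", image, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return p_flags
    return None
```

Calling it as `gnu_stack_flags(open("test", "rb").read())` should return None for the binary in the question, since its only program header is the LOAD segment shown by objdump -p. A non-None result with `flags & PF_X == 0` corresponds to the "PT_GNU_STACK == RW" row of the table.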

Related

creating Linux i386 a.out executable shorter than 4097 bytes

I'm trying to create a Linux i386 a.out executable shorter than 4097 bytes, but all my efforts have failed so far.
I'm compiling it with:
$ nasm -O0 -f bin -o prog prog.nasm && chmod +x prog
I'm testing it in a Ubuntu 10.04 i386 VM running Linux 2.6.32 with:
$ sudo modprobe binfmt_aout
$ sudo sysctl vm.mmap_min_addr=4096
$ ./prog; echo $?
Hello, World!
0
This is the source code of the 4097-byte executable which works:
; prog.nasm
bits 32
cpu 386
org 0x1000 ; Linux i386 a.out QMAGIC file format has this.
SECTION_text:
a_out_header:
dw 0xcc ; magic=QMAGIC; Demand-paged executable with the header in the text. The first page (0x1000 bytes) is unmapped to help trap NULL pointer references.
db 0x64 ; type=M_386
db 0 ; flags=0
dd SECTION_data - SECTION_text ; a_text=0x1000 (byte size of .text; mapped as r-x)
dd SECTION_end - SECTION_data ; a_data=0x1000 (byte size of .data; mapped as rwx, not just rw-)
dd 0 ; a_bss=0 (byte size of .bss)
dd 0 ; a_syms=0 (byte size of symbol table data)
dd _start ; a_entry=0x1020 (in-memory address of _start == file offset of _start + 0x1000)
dd 0 ; a_trsize=0 (byte size of relocation info for .text)
dd 0 ; a_drsize=0 (byte size of relocation info for .data)
_start: mov eax, 4 ; __NR_write
mov ebx, 1 ; argument: STDOUT_FILENO
mov ecx, msg ; argument: address of string to output
mov edx, msg_end - msg ; argument: number of bytes
int 0x80 ; syscall
mov eax, 1 ; __NR_exit
xor ebx, ebx ; argument: EXIT_SUCCESS == 0.
int 0x80 ; syscall
msg: db 'Hello, World!', 10
msg_end:
times ($$ - $) & 0xfff db 0 ; padding to multiple of 0x1000 ; !! is this needed?
SECTION_data: db 0
; times ($$ - $) & 0xfff db 0 ; padding to multiple of 0x1000 ; !! is this needed?
SECTION_end:
How can I make the executable file smaller? (Clarification: I still want a Linux i386 a.out executable. I know that it's possible to create a smaller Linux i386 ELF executable.) There are several thousand bytes of padding at the end of the file, which seem to be required.
So far I've discovered the following rules:
If a_text or a_data is 0, Linux doesn't run the program. (See relevant Linux source block 1 and 2.)
If a_text is not a multiple of 0x1000 (4096), Linux doesn't run the program. (See relevant Linux source block 1 and 2.)
If the file is shorter than a_text + a_data bytes, Linux doesn't run the program. (See relevant Linux source code location.)
Thus file_size >= a_text + a_data >= 0x1000 + 1 == 4097 bytes.
The combinations nasm -f aout + ld -s -m i386linux, nasm -f elf + ld -s -m i386linux, and as -32 + ld -s -m i386linux all produce an executable of 4100 bytes, which doesn't even work (because its a_data is 0). Adding a single byte to section .data makes the executable work, but the file is then 8196 bytes long. Thus this path doesn't lead to less than 4097 bytes.
Did I miss something?
TL;DR It doesn't work.
It is impossible to make a Linux i386 a.out QMAGIC executable shorter than 4097 bytes work on Linux 2.6.32, based on evidence in the Linux kernel source code of the binfmt_aout module.
Details:
If a_text is 0, Linux doesn't run the program. (Evidence for this check: a_text is passed as the length argument to mmap(2) here.)
If a_data is 0, Linux doesn't run the program. (Evidence for this check: a_data is passed as the length argument to mmap(2) here.)
If a_text is not a multiple of 0x1000 (4096), Linux doesn't run the program. (Evidence for this check: fd_offset + ex.a_text is passed as the offset argument to mmap(2) here. For QMAGIC, fd_offset is 0.)
If the file is shorter than a_text + a_data bytes, Linux doesn't run the program. (Evidence for this check: the file size is compared to a_text + a_data + a_syms + ... here.)
Thus file_size >= a_text + a_data >= 0x1000 + 1 == 4097 bytes.
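The rules above can be sketched as a small Python checker. This only models the four constraints listed in this answer for QMAGIC files; it is not a port of binfmt_aout, and the function name `qmagic_problems` is made up for the example. The header layout follows the 32-byte header in the question's NASM source.

```python
import struct

PAGE = 0x1000
QMAGIC = 0xCC

def qmagic_problems(image):
    """Return the rules from this answer that a QMAGIC a.out image violates."""
    # 16-bit magic, machine type, flags, then seven 32-bit fields (32 bytes).
    (magic, _mach, _flags, a_text, a_data,
     _bss, _syms, _entry, _trsize, _drsize) = struct.unpack_from("<HBB7I", image, 0)
    problems = []
    if magic != QMAGIC:
        problems.append("magic is not QMAGIC")
    if a_text == 0:
        problems.append("a_text is 0")
    if a_data == 0:
        problems.append("a_data is 0")
    if a_text % PAGE != 0:
        problems.append("a_text is not a multiple of 0x1000")
    if len(image) < a_text + a_data:
        problems.append("file is shorter than a_text + a_data")
    return problems
```

With a_text = 0x1000 and a_data = 1, a 4097-byte image passes all checks, while truncating it to 4096 bytes trips the last rule. That is the same lower bound derived above.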
I've also tried OMAGIC, ZMAGIC and NMAGIC, but none of them worked. Details:
For OMAGIC, read(2) is used instead of mmap(2) within here, thus it can work. However, Linux tries to load the code to virtual memory address 0 (N_TXTADDR is 0), and this causes SIGKILL (if non-root and vm.mmap_min_addr is larger than 0) or SIGILL (otherwise), thus it doesn't work. Maybe the reason for SIGILL is that the page allocated by set_brk is not executable (but that should be indicated by SIGSEGV). This could be investigated further.
For ZMAGIC and NMAGIC, read(2) is used instead of mmap(2) within here if fd_offset is not a multiple of the page size (0x1000). fd_offset is 32 for NMAGIC and 1024 for ZMAGIC, so that condition is met. However, it doesn't work for the same reason (load to virtual memory address 0).
I wonder if it's possible to run OMAGIC, ZMAGIC or NMAGIC executables at all on Linux 2.6.32 or later.

Loaded glibc base address different for each function

I'm trying to calculate the base address of the library of a binary file.
I have the addresses of printf, puts, etc., and from each one I subtract its offset to get the base address of the library.
I was doing this for printf, puts and signal, but every time I got a different base address.
I also tried to do the things in this post, but I couldn't get the right result either.
ASLR is disabled.
This is where I take the addresses of the library functions:
gdb-peda$ x/20wx 0x804b018
0x804b018 <signal@got.plt>: 0xf7e05720 0xf7e97010 0x080484e6 0x080484f6
0x804b028 <puts@got.plt>: 0xf7e3fb40 0x08048516 0x08048526 0xf7df0d90
0x804b038 <memset@got.plt>: 0xf7f18730 0x08048556 0x08048566 0x00000000
then I have:
gdb-peda$ info proc mapping
process 114562
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x8048000 0x804a000 0x2000 0x0 /home/ofey/CTF/Pwnable.tw/applestore/applestore
0x804a000 0x804b000 0x1000 0x1000 /home/ofey/CTF/Pwnable.tw/applestore/applestore
0x804b000 0x804c000 0x1000 0x2000 /home/ofey/CTF/Pwnable.tw/applestore/applestore
0x804c000 0x806e000 0x22000 0x0 [heap]
0xf7dd8000 0xf7fad000 0x1d5000 0x0 /lib/i386-linux-gnu/libc-2.27.so
0xf7fad000 0xf7fae000 0x1000 0x1d5000 /lib/i386-linux-gnu/libc-2.27.so
0xf7fae000 0xf7fb0000 0x2000 0x1d5000 /lib/i386-linux-gnu/libc-2.27.so
0xf7fb0000 0xf7fb1000 0x1000 0x1d7000 /lib/i386-linux-gnu/libc-2.27.so
0xf7fb1000 0xf7fb4000 0x3000 0x0
0xf7fd0000 0xf7fd2000 0x2000 0x0
0xf7fd2000 0xf7fd5000 0x3000 0x0 [vvar]
0xf7fd5000 0xf7fd6000 0x1000 0x0 [vdso]
0xf7fd6000 0xf7ffc000 0x26000 0x0 /lib/i386-linux-gnu/ld-2.27.so
0xf7ffc000 0xf7ffd000 0x1000 0x25000 /lib/i386-linux-gnu/ld-2.27.so
0xf7ffd000 0xf7ffe000 0x1000 0x26000 /lib/i386-linux-gnu/ld-2.27.so
0xfffdd000 0xffffe000 0x21000 0x0 [stack]
and :
gdb-peda$ info sharedlibrary
From To Syms Read Shared Object Library
0xf7fd6ab0 0xf7ff17fb Yes /lib/ld-linux.so.2
0xf7df0610 0xf7f3d386 Yes /lib/i386-linux-gnu/libc.so.6
I then found the offset of signal and puts to calculate the base libc address.
base_with_signal_offset = 0xf7e05720 - 0x3eda0 = 0xf7dc6980
base_with_puts_offset = 0xf7e3fb40 - 0x809c0 = 0xf7dbf180
I was expecting base_with_signal_offset = base_with_puts_offset = 0xf7dd8000, but that's not the case.
What am I doing wrong?
EDIT (to show where I got those offsets):
readelf -s /lib/x86_64-linux-gnu/libc-2.27.so | grep puts
I get :
191: 00000000000809c0 512 FUNC GLOBAL DEFAULT 13 _IO_puts@@GLIBC_2.2.5
422: 00000000000809c0 512 FUNC WEAK DEFAULT 13 puts@@GLIBC_2.2.5
496: 00000000001266c0 1240 FUNC GLOBAL DEFAULT 13 putspent@@GLIBC_2.2.5
678: 00000000001285d0 750 FUNC GLOBAL DEFAULT 13 putsgent@@GLIBC_2.10
1141: 000000000007f1f0 396 FUNC WEAK DEFAULT 13 fputs@@GLIBC_2.2.5
1677: 000000000007f1f0 396 FUNC GLOBAL DEFAULT 13 _IO_fputs@@GLIBC_2.2.5
2310: 000000000008a640 143 FUNC WEAK DEFAULT 13 fputs_unlocked@@GLIBC_2.2.5
I was expecting base_with_signal_offset = base_with_puts_offset = 0xf7dd8000
There are 3 numbers in your calculation:
&puts_at_runtime - symbol_value_from_readelf == &first_executable_pt_load_segment_libc.
The readelf output shows that you got one of these almost correct: the value of puts in 64-bit /lib/x86_64-linux-gnu/libc-2.27.so is indeed 0x809c0, but that is not the library you are actually using. You need to repeat the same on the actually used 32-bit library: /lib/i386-linux-gnu/libc-2.27.so.
For the first number -- &puts_at_runtime, you are using value from the puts@got.plt import stub. That value is only guaranteed to have been resolved (point to actual puts in libc.so) IFF you have LD_BIND_NOW=1 set in the environment, or you linked your executable with -z now linker flag, or you actually called puts already.
It may be better to print &puts in GDB.
The last number -- &first_executable_pt_load_segment_libc is correct (because info shared shows that libc.so.6 .text section starts at 0xf7df0610, which is between 0xf7dd8000 and 0xf7fad000).
So putting it all together, the only error was that you used the wrong version of libc.so to extract the symbol_value_from_readelf.
On my system:
#include <signal.h>
#include <stdio.h>
int main() {
puts("Hello");
signal(SIGINT, SIG_IGN);
return 0;
}
gcc -m32 t.c -fno-pie -no-pie
gdb -q a.out
... set breakpoint on exit from main
Breakpoint 1, 0x080491ae in main ()
(gdb) p &puts
$1 = (<text variable, no debug info> *) 0xf7e31300 <puts>
(gdb) p &signal
$2 = (<text variable, no debug info> *) 0xf7df7d20 <ssignal>
(gdb) info proc map
process 114065
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x8048000 0x8049000 0x1000 0x0 /tmp/a.out
...
0x804d000 0x806f000 0x22000 0x0 [heap]
0xf7dc5000 0xf7de2000 0x1d000 0x0 /lib/i386-linux-gnu/libc-2.29.so
...
(gdb) info shared
From To Syms Read Shared Object Library
0xf7fd5090 0xf7ff0553 Yes (*) /lib/ld-linux.so.2
0xf7de20e0 0xf7f2b8d6 Yes (*) /lib/i386-linux-gnu/libc.so.6
Given the above, we expect readelf -s to give us 0xf7e31300 - 0xf7dc5000 == 0x6c300 for puts and 0xf7df7d20 - 0xf7dc5000 == 0x32d20 for signal.
readelf -Ws /lib/i386-linux-gnu/libc-2.29.so | egrep ' (puts|signal)\W'
452: 00032d20 68 FUNC WEAK DEFAULT 14 signal@GLIBC_2.0
458: 0006c300 400 FUNC WEAK DEFAULT 14 puts@@GLIBC_2.0
QED.
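As a quick sanity check for this kind of calculation, here is a Python sketch that computes a candidate base per symbol and tests whether the candidates agree and are page-aligned. The helper names are made up for the example, and the 4 KiB page size is an assumption. The numbers are the ones from the question and from the session above.

```python
PAGE_SIZE = 0x1000  # assumption: 4 KiB pages

def candidate_bases(runtime_addrs, symbol_values):
    """Map each symbol to (runtime address - symbol value from readelf)."""
    return {name: runtime_addrs[name] - symbol_values[name] for name in runtime_addrs}

def is_plausible_base(bases):
    """True if every symbol yields the same, page-aligned base address."""
    distinct = set(bases.values())
    return len(distinct) == 1 and all(b % PAGE_SIZE == 0 for b in distinct)

# Question: 32-bit runtime addresses, symbol values from the 64-bit libc.
mixed = candidate_bases(
    {"signal": 0xF7E05720, "puts": 0xF7E3FB40},
    {"signal": 0x3EDA0, "puts": 0x809C0},
)
# Answer: runtime addresses and symbol values from the same 32-bit libc.
matched = candidate_bases(
    {"signal": 0xF7DF7D20, "puts": 0xF7E31300},
    {"signal": 0x32D20, "puts": 0x6C300},
)
print(is_plausible_base(mixed), is_plausible_base(matched))  # False True
```

Bases that disagree with each other or are not page-aligned are a strong hint that the symbol values came from a different library file than the one that is actually mapped.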

Computing offset of a function in memory

I am reading the documentation for the uprobe tracer, and it contains instructions on how to compute the offset of a function in memory. I am quoting them here.
Following example shows how to dump the instruction pointer and %ax
register at the probed text address. Probe zfree function in /bin/zsh:
# cd /sys/kernel/debug/tracing/
# cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh
# objdump -T /bin/zsh | grep -w zfree
0000000000446420 g DF .text 0000000000000012 Base zfree
0x46420 is the offset of zfree in object /bin/zsh that is loaded at
0x00400000.
I do not know why, but they took the output 0x446420 and subtracted 0x400000 to get 0x46420. It seemed like an error to me. Why 0x400000?
I have tried to do the same on my Fedora 23 with 4.5.6-200 kernel.
First I turned off memory address randomization
echo 0 > /proc/sys/kernel/randomize_va_space
Then I figured out where binary is in memory
$ cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
555555554000-55555560f000 r-xp 00000000 fd:00 2387155 /usr/bin/zsh
Took the offset
marko@fedora:~ $ objdump -T /bin/zsh | grep -w zfree
000000000005dc90 g DF .text 0000000000000012 Base zfree
And figured out where zfree is via gdb
$ gdb -p 21067 --batch -ex 'p zfree'
$1 = {<text variable, no debug info>} 0x5555555b1c90 <zfree>
marko@fedora:~ $ python
Python 2.7.11 (default, Mar 31 2016, 20:46:51)
[GCC 5.3.1 20151207 (Red Hat 5.3.1-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> hex(0x5555555b1c90-0x555555554000)
'0x5dc90'
You see, I got the same result as in objdump without subtracting anything.
But then I tried the same on another machine with SLES, and there it's the same as in the uprobe documentation.
Why is there such a difference? How do I compute the correct offset then?
As far as I can see, the difference can only be caused by how the examined binary was built. More precisely, it depends on whether the ELF has a fixed load address or not. Let's do a simple experiment. We have this simple test code:
int main(void) { return 0; }
Then, build it in two ways:
$ gcc -o t1 t.c # create image with fixed load address
$ gcc -o t2 t.c -pie # create load-base independent image
Now, let's check the load base addresses for these two images:
$ readelf -l --wide t1 | grep LOAD
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x00067c 0x00067c R E 0x200000
LOAD 0x000680 0x0000000000600680 0x0000000000600680 0x000228 0x000230 RW 0x200000
$ readelf -l --wide t2 | grep LOAD
LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x0008cc 0x0008cc R E 0x200000
LOAD 0x0008d0 0x00000000002008d0 0x00000000002008d0 0x000250 0x000258 RW 0x2000
Here you can see that the first image requires a fixed load address, 0x400000, and the second one has no address requirements at all.
And now we can compare the addresses that objdump reports for main:
$ objdump -t t1 | grep ' main'
00000000004004b6 g F .text 000000000000000b main
$ objdump -t t2 | grep ' main'
0000000000000710 g F .text 000000000000000b main
As we see, the reported address is the complete virtual address that the first byte of main will occupy if the image is loaded at the address stored in the program header. The first image is loaded at 0x400000, so that base has to be subtracted to get the offset. The second image will never be loaded at 0x0; it is placed at another, randomly chosen location, so its symbol value is already the offset from the load base.
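Both cases reduce to one formula: take the symbol's virtual address, subtract the p_vaddr of the LOAD segment that contains it, and add that segment's file offset. Here is a minimal Python sketch using the numbers from this thread. The function name `probe_offset` is made up for the example, and it assumes you pass in the values of the segment that actually contains the symbol.

```python
def probe_offset(sym_vaddr, seg_vaddr, seg_file_offset=0):
    """File offset of a symbol, given the LOAD segment that contains it."""
    return sym_vaddr - seg_vaddr + seg_file_offset

# Fixed-address zsh from the uprobe documentation (text mapped at 0x400000).
print(hex(probe_offset(0x446420, 0x400000)))  # 0x46420
# PIE zsh on Fedora: the first LOAD segment has vaddr 0, so nothing changes.
print(hex(probe_offset(0x5DC90, 0x0)))        # 0x5dc90
# main in t1 (fixed load address) and in t2 (-pie) from the experiment above.
print(hex(probe_offset(0x4004B6, 0x400000)))  # 0x4b6
print(hex(probe_offset(0x710, 0x0)))          # 0x710
```

The segment values come from readelf -l: the Offset and VirtAddr columns of the LOAD line with the R E flags.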

The address where filename has been loaded is missing [GDB]

I have the following sample code
#include<stdio.h>
int main()
{
int num1, num2;
printf("Enter two numbers\n");
scanf("%d",&num1);
scanf("%d",&num2);
int i;
for(i = 0; i < num2; i++)
num1 = num1 + num1;
printf("Result is %d \n",num1);
return 0;
}
I compiled this code with -g option to gcc.
gcc -g file.c
Generate separate symbol file
objcopy --only-keep-debug a.out a.out.sym
Strip the symbols from a.out
strip -s a.out
Load this a.out in gdb
gdb a.out
gdb says "no debug information found" fine.
Then I use add-symbol-file command in gdb
(gdb) add-symbol-file a.out.debug [Enter]
The address where a.out.debug has been loaded is missing
I want to know how to find this address?
Is there any command or trick to find it?
This address is representing WHAT?
I know gdb has another command, symbol-file, but it overwrites the previously loaded symbols.
So I have to use this command to add many symbol files in gdb.
my system is 64bit running ubuntu LTS 12.04
gdb version is 7.4-2012.04
gcc version is 4.6.3
objcopy --only-keep-debug a.out a.out.sym
If you want GDB to load the a.out.sym automatically, follow the steps outlined here (note in particular that you need to do the "add .gnu_debuglink" step).
This address is representing WHAT
The address GDB wants is the location of .text section of the binary. To find it, use readelf -WS a.out. E.g.
$ readelf -WS /bin/date
There are 28 section headers, starting at offset 0xe350:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 0000000000400238 000238 00001c 00 A 0 0 1
...
[13] .text PROGBITS 0000000000401900 001900 0077f8 00 AX 0 0 16
Here, you want to give GDB 0x401900 as the load address.
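If you need to do this for many files, the lookup can be scripted. The following Python sketch pulls the .text address out of readelf -WS output and builds the GDB command. The function name and the regular expression are illustrative and only handle the column layout shown above.

```python
import re

# Matches "[Nr] Name Type Address ..." rows of `readelf -WS` output.
SECTION_RE = re.compile(r"^\s*\[\s*\d+\]\s+(\S+)\s+\S+\s+([0-9a-fA-F]+)\s")

def section_address(readelf_ws_output, name=".text"):
    """Return the Address column of the named section, or None if not found."""
    for line in readelf_ws_output.splitlines():
        m = SECTION_RE.match(line)
        if m and m.group(1) == name:
            return int(m.group(2), 16)
    return None

sample = """\
  [ 1] .interp PROGBITS 0000000000400238 000238 00001c 00 A 0 0 1
  [13] .text PROGBITS 0000000000401900 001900 0077f8 00 AX 0 0 16
"""
addr = section_address(sample)
gdb_command = f"add-symbol-file a.out.sym {addr:#x}"
print(gdb_command)  # add-symbol-file a.out.sym 0x401900
```

In practice the text would come from running readelf -WS on the stripped binary, for example via subprocess.run, rather than from a hard-coded sample.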

How is /usr/lib64/libc.so generated?

[root@xx test]# cat /usr/lib64/libc.so
/* GNU ld script
Use the shared library, but some functions are only in
the static library, so try that secondarily. */
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /lib64/libc.so.6 /usr/lib64/libc_nonshared.a AS_NEEDED ( /lib64/ld-linux-x86-64.so.2 ) )
Does anyone know how this kind of file is generated?
This is generated when glibc is built with the Make utility.
There is a rule (run by make install) in glibc's Makefile that simply echoes the needed lines into a temporary file, $@.new:
(echo '/* GNU ld script';\
echo ' Use the shared library, but some functions are only in';\
echo ' the static library, so try that secondarily. */';\
cat $<; \
echo 'GROUP ( $(slibdir)/libc.so$(libc.so-version)' \
'$(libdir)/$(patsubst %,$(libtype.oS),$(libprefix)$(libc-name))'\
' AS_NEEDED (' $(slibdir)/$(rtld-installed-name) ') )' \
) > $@.new
And then this file is renamed to libc.so
mv -f $@.new $@
Here is a comment from Makefile, which explains a bit:
# What we install as libc.so for programs to link against is in fact a
# link script. It contains references for the various libraries we need.
# The libc.so object is not complete since some functions are only defined
# in libc_nonshared.a.
# We need to use absolute paths since otherwise local copies (if they exist)
# of the files are taken by the linker.
I understand this as follows: libc.so.6 is not complete and needs something that can't be stored in a shared library. So the glibc developers moved that part into a static piece of glibc, libc_nonshared.a. To force both libc.so.6 and libc_nonshared.a to always be linked, they created a special linker script that instructs ld to use both when it is asked for -lc (libc).
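To see what the linker is actually told to use, the GROUP line can be picked apart with a few lines of Python. This is a rough sketch that only understands the GROUP ( ... AS_NEEDED ( ... ) ) shape shown in the question, not the full ld script grammar. The function name `group_inputs` is made up.

```python
import re

def group_inputs(script):
    """Return (path, inside_as_needed) pairs from the first GROUP command."""
    text = re.sub(r"/\*.*?\*/", "", script, flags=re.S)  # drop /* ... */ comments
    start = text.find("GROUP")
    if start < 0:
        return []
    tokens = re.findall(r"[()]|[^\s()]+", text[start + len("GROUP"):])
    inputs, depth, as_needed_depth, pending = [], 0, None, False
    for tok in tokens:
        if tok == "(":
            depth += 1
            if pending:  # this parenthesis opens an AS_NEEDED list
                as_needed_depth, pending = depth, False
        elif tok == ")":
            if as_needed_depth == depth:
                as_needed_depth = None
            depth -= 1
            if depth == 0:  # end of the GROUP command
                break
        elif tok == "AS_NEEDED":
            pending = True
        else:
            inputs.append((tok, as_needed_depth is not None))
    return inputs

script = """/* GNU ld script */
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /lib64/libc.so.6 /usr/lib64/libc_nonshared.a AS_NEEDED ( /lib64/ld-linux-x86-64.so.2 ) )
"""
for path, as_needed in group_inputs(script):
    print(path, "(as needed)" if as_needed else "")
```

For the script in the question this lists libc.so.6 and libc_nonshared.a as unconditional inputs and the dynamic loader as an AS_NEEDED input.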
What is in the nonshared part? Let's check:
$ objdump -t /usr/lib/libc_nonshared.a |grep " F "|grep -v __
00000000 g F .text 00000058 .hidden atexit
00000000 w F .text 00000050 .hidden stat
00000000 w F .text 00000050 .hidden fstat
00000000 w F .text 00000050 .hidden lstat
00000000 g F .text 00000050 .hidden stat64
00000000 g F .text 00000050 .hidden fstat64
00000000 g F .text 00000050 .hidden lstat64
00000000 g F .text 00000050 .hidden fstatat
00000000 g F .text 00000050 .hidden fstatat64
00000000 w F .text 00000058 .hidden mknod
00000000 g F .text 00000050 .hidden mknodat
00000000 l F .text 00000001 nop
There are the atexit(), *stat*() and mknod() functions. Why? I don't really know, but that is how glibc does it.
Here is a long explanation, http://giraffe-data.com/~bryanh/giraffehome/d/note/proglib, and I quote the beginning of it:
The stat() family of functions and mknod() are special. Their
interfaces are tied so tightly to the underlying operating system that
they change occasionally.
On managed systems you may need to install glibc-devel and/or glibc-devel.i686.

Resources