How to point a shared object to debug information in GDB without altering the files?

I have two files - a shared object file and debug information file.
How can I tell GDB to use the debug information file for that shared object without altering the files, file names or creating links?
Is it even possible?
I just want to tell GDB about it, not to change anything.
EDIT: Here is what I am trying to do (on Ubuntu 16.04, x86_64)
I take the libc and libc debug information files from my system and copy them to a new directory. Then I preload the copied libc into a process and attach to it with GDB.
sudo apt install libc6-dbg
cp /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.23.so debug_file
cp /lib/x86_64-linux-gnu/libc.so.6 .
cat << EOF > traceme.c
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
int main(void) {
printf("trace me:\nsudo gdb -p %d\n", getpid());
sleep(20);
return 0;
}
EOF
gcc -o traceme traceme.c
LD_PRELOAD=./libc.so.6 ./traceme &
sudo gdb -p 28163
Now my GDB session looks like this:
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x00007ff8e77c68b0 0x00007ff8e7919ac4 Yes (*) ./libc.so.6
0x00007ff8e7b71ac0 0x00007ff8e7b8f810 Yes /lib64/ld-linux-x86-64.so.2
(*): Shared library is missing debugging information.
(gdb) add-symbol-file debug_file 0x00007ff8e77c68b0
add symbol table from file "debug_file" at
.text_addr = 0x7ff8e77c68b0
(y or n) y
Reading symbols from debug_file...done.
(gdb) p &main_arena
$1 = (struct malloc_state *) 0x3c4b20 <main_arena>
(gdb) p main_arena
Cannot access memory at address 0x3c4b20
(gdb) info proc mappings
process 28163
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x400000 0x401000 0x1000 0x0 /home/ubuntu/tmp/z/traceme
0x600000 0x601000 0x1000 0x0 /home/ubuntu/tmp/z/traceme
0x601000 0x602000 0x1000 0x1000 /home/ubuntu/tmp/z/traceme
0xff8000 0x1019000 0x21000 0x0 [heap]
0x7ff8e77a7000 0x7ff8e7967000 0x1c0000 0x0 /home/ubuntu/tmp/z/libc.so.6
0x7ff8e7967000 0x7ff8e7b67000 0x200000 0x1c0000 /home/ubuntu/tmp/z/libc.so.6
0x7ff8e7b67000 0x7ff8e7b6b000 0x4000 0x1c0000 /home/ubuntu/tmp/z/libc.so.6
0x7ff8e7b6b000 0x7ff8e7b6d000 0x2000 0x1c4000 /home/ubuntu/tmp/z/libc.so.6
0x7ff8e7b6d000 0x7ff8e7b71000 0x4000 0x0
0x7ff8e7b71000 0x7ff8e7b97000 0x26000 0x0 /lib/x86_64-linux-gnu/ld-2.23.so
0x7ff8e7d91000 0x7ff8e7d96000 0x5000 0x0
0x7ff8e7d96000 0x7ff8e7d97000 0x1000 0x25000 /lib/x86_64-linux-gnu/ld-2.23.so
0x7ff8e7d97000 0x7ff8e7d98000 0x1000 0x26000 /lib/x86_64-linux-gnu/ld-2.23.so
0x7ff8e7d98000 0x7ff8e7d99000 0x1000 0x0
0x7ffe53a5a000 0x7ffe53a7b000 0x21000 0x0 [stack]
0x7ffe53b3a000 0x7ffe53b3c000 0x2000 0x0 [vvar]
0x7ffe53b3c000 0x7ffe53b3e000 0x2000 0x0 [vdso]
0xffffffffff600000 0xffffffffff601000 0x1000 0x0 [vsyscall]
For some reason, the main_arena symbol is not within the mapping of the libc.

How can I tell GDB to use the symbols file for that shared object without altering the files, file names or creating links?
(gdb) info shared
Will tell you at what address your foo.so is loaded. Say it's $addr.
(gdb) add-symbol-file /path/to/foo.so.debug $addr
will tell GDB to add debug symbols for foo.so from foo.so.debug
Update:
(gdb) p main_arena
Cannot access memory at address 0x3c4b20
I am pretty sure this is a bug in GDB. You are correct: it's not relocating the .data section when it should.
Fortunately, there is a workaround:
(gdb) add-symbol-file debug_file 0x00007ff8e77c68b0 -s .data 0x7ff8e77a7000
(The first address is from info shared. The second address is from info proc map for the (first) address where libc.so.6 is loaded.)
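The address lookup described above can be scripted. The following is a minimal sketch, assuming the mapping table has been captured as text (either /proc/<pid>/maps or the output of info proc mappings); the helper names library_base and add_symbol_file_command are hypothetical and not part of GDB.

```python
def library_base(maps_text, name):
    """Return the lowest start address among mappings whose path ends with `name`."""
    starts = []
    for line in maps_text.splitlines():
        fields = line.split()
        # Works for /proc/<pid>/maps rows ("start-end perms ... path") and for
        # GDB's "info proc mappings" rows ("0xstart 0xend size offset path").
        if len(fields) < 2 or not fields[-1].endswith(name):
            continue
        first = fields[0].split("-")[0]
        try:
            starts.append(int(first, 16))
        except ValueError:
            continue  # header or malformed row
    if not starts:
        raise LookupError(name)
    return min(starts)


def add_symbol_file_command(debug_file, text_addr, data_addr):
    """Format the workaround command; text_addr is the 'From' column of info shared."""
    return "add-symbol-file {} {:#x} -s .data {:#x}".format(
        debug_file, text_addr, data_addr)
```

With the mappings from the question, library_base(mappings, "libc.so.6") yields 0x7ff8e77a7000, which is the second address used in the workaround.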

Related

what is segment 00 in my Linux executable program (64 bits)

Here is a very simple assembly program that just exits with code 12.
$ cat a.asm
global _start
section .text
_start: mov rax, 60 ; system call for exit
mov rdi, 12 ; exit code 12
syscall
It can be built and executed correctly:
$ nasm -f elf64 a.asm && ld a.o && ./a.out || echo $?
12
But a.out is big; it is more than 4k:
$ wc -c a.out
4664 a.out
I tried to understand it by reading the ELF content:
$ readelf -l a.out
Elf file type is EXEC (Executable file)
Entry point 0x401000
There are 2 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000000b0 0x00000000000000b0 R 0x1000
LOAD 0x0000000000001000 0x0000000000401000 0x0000000000401000
0x000000000000000c 0x000000000000000c R E 0x1000
Section to Segment mapping:
Segment Sections...
00
01 .text
It is strange: segment 00 is aligned to 0x1000, which I think means that this segment occupies at least 4096 bytes.
My question is: what is this segment 00?
(nasm version 2.14.02, ld version 2.34, os is Ubuntu 20.04.1)
Since it starts at file offset zero, it is probably a "padding" segment introduced to make the loading of the ELF more efficient.
The .text segment will, in fact, be already aligned in the file as it should be in memory.
You can force ld not to align sections both in memory and in the file with -n. You can also strip the symbols with -s.
This will reduce the size to about 352 bytes.
Now the ELF contains:
The ELF header (Needed)
The program header table (Needed)
The code (Needed)
The string table (Possibly unneeded)
The section table (Possibly unneeded)
The string table can be removed, but apparently strip can't do that.
I've removed the .shstrtab section data and all the section headers manually to shrink the size down to 144 bytes.
Consider that 64 bytes come from the ELF header, 56 from the single program header and 12 from your code, for a total of 132 bytes.
The extra 12 bytes are padding: 4 bytes at the end of the code section (easy to remove) and 8 at the end of the program header (which requires a bit of patching).
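The size accounting can be double-checked against the ELF64 structure layouts. The sketch below uses Python's struct module with format strings transcribed from the ELF64 field list (the variable names are my own). It gives 64 bytes for the ELF header and 56 bytes per program header. One header plus two program headers is 0xb0 bytes, the FileSiz of segment 00 in the readelf output above, which suggests that segment simply covers the headers.

```python
import struct

# Little-endian ELF64 layouts, packed without alignment padding.
ELF64_EHDR = "<16sHHIQQQIHHHHHH"  # e_ident, e_type ... e_shstrndx
ELF64_PHDR = "<IIQQQQQQ"          # p_type, p_flags, p_offset, p_vaddr,
                                  # p_paddr, p_filesz, p_memsz, p_align

ehdr_size = struct.calcsize(ELF64_EHDR)
phdr_size = struct.calcsize(ELF64_PHDR)
code_size = 0xC  # FileSiz of the .text segment in the readelf output

# Smallest possible file with one program header and the code.
minimal_size = ehdr_size + phdr_size + code_size
# Size of the ELF header plus the two program headers of the original a.out.
headers_size = ehdr_size + 2 * phdr_size

print(ehdr_size, phdr_size, minimal_size, hex(headers_size))
```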

Get machine code of the process by PID without attaching a debugger

I want to get the machine code of a running process by its PID in order to analyse it for malicious instructions using heuristic data-analysis methods.
All I need is the list of current machine instructions and the values of the registers (EIP, EAX, EBX...).
I can use gdb to reach this goal, but that raises several problems:
I don't know how to interact with gdb from my application;
malicious code can use debugger-detection techniques like these: http://www.ouah.org/linux-anti-debugging.txt
https://www.youtube.com/watch?v=UTVp4jpJoyc&list=LLw7XNcx80oj63tRYAg7hrsA
(for Windows);
getting the information from console output makes my application slower.
Is there any way to get this information by PID on Linux? Or maybe on Windows?
You may have a look at gcore:
$ gcore
usage: gcore [-o filename] pid
so you can dump process core using its pid:
$ gcore 792
warning: Could not load vsyscall page because no executable was specified
0x00007f5f73998410 in ?? ()
Saved corefile core.792
and then open it in gdb:
$ gdb -c core.792
GNU gdb (GDB) Fedora 8.0.1-30.fc26
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
[...]
[New LWP 792]
Missing separate debuginfo for the main executable file
Try: dnf --enablerepo='*debug*' install /usr/lib/debug/.build-id/09/b9d38bb6291b6282de4a2692e45448828d50da
Core was generated by `./a.out'.
#0 0x00007f5f73998410 in ?? ()
(gdb) info registers
rax 0xfffffffffffffe00 -512
rbx 0x0 0
rcx 0x7f5f73998410 140047938061328
rdx 0x1 1
rsi 0x7ffd30683d73 140725415591283
rdi 0x3 3
rbp 0x7ffd30683d90 0x7ffd30683d90
rsp 0x7ffd30683d68 0x7ffd30683d68
r8 0x1d 29
r9 0x0 0
r10 0x3 3
r11 0x246 582
r12 0x4006d0 4196048
r13 0x7ffd30683e70 140725415591536
r14 0x0 0
r15 0x0 0
rip 0x7f5f73998410 0x7f5f73998410
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
or even using the binary image from /proc to get some symbols:
gdb -c core.792 /proc/792/exe
You may know that you can pass scripts to gdb, which saves you from having to interact with it from your own binary (see man gdb for more details).
If you don't want to use gdb at all, you may try using ptrace() directly, but that is certainly more work.
As for the anti-debugging techniques, well... they work, and as far as I know there is no easy way to handle them directly; each one has to be worked around manually (patching the binary, disassembling from unaligned addresses by setting them in objdump, etc.).
I'm not an expert in this domain; I hope this helps you a bit.
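As a rough illustration of driving gdb from an application instead of typing commands, here is a minimal sketch that runs gdb in batch mode on a core file produced by gcore. The function names gdb_batch_argv and snapshot are hypothetical, and the default command list is only an example; the sketch assumes gdb is installed and on the PATH.

```python
import subprocess


def gdb_batch_argv(core_path, commands=("info registers", "x/8i $pc")):
    """Build a gdb command line that runs `commands` on a core file and exits."""
    argv = ["gdb", "--batch", "-c", core_path]
    for command in commands:
        argv += ["-ex", command]  # each -ex runs one gdb command
    return argv


def snapshot(core_path):
    """Run gdb non-interactively and return its textual output."""
    result = subprocess.run(gdb_batch_argv(core_path),
                            capture_output=True, text=True)
    return result.stdout
```

For example, snapshot("core.792") would return the register dump and a short disassembly at the program counter as text. Parsing that text is still needed, so this does not remove the console-output overhead mentioned in the question.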

How to single step ARM assembly in GDB on QEMU?

I'm trying to learn about ARM assembler programming using the GNU assembler. I've set up my PC with QEMU and have a Debian ARM-HF chroot environment.
If I assemble and link my test program:
.text
.global _start
_start:
mov r0, #6
bx lr
with:
as test.s -o test.o
ld test.o -o test
Then load the file into gdb and set a breakpoint on _start:
root#Latitude-E6420:/root# gdb test
GNU gdb (GDB) 7.6.1 (Debian 7.6.1-1)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
For bug reporting instructions, please see:
...
Reading symbols from /root/test...(no debugging symbols found)...done.
(gdb) break _start
Breakpoint 1 at 0x8054
(gdb)
How do I single step the code, display the assembler source code and monitor the registers?
I tried some basic commands and they did not work:
(gdb) break _start
Breakpoint 1 at 0x8054
(gdb) info regi
The program has no registers now.
(gdb) stepi
The program is not being run.
(gdb) disas
No frame selected.
(gdb) r
Starting program: /root/test
qemu: Unsupported syscall: 26
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
qemu: Unsupported syscall: 26
During startup program terminated with signal SIGSEGV, Segmentation fault.
(gdb)
Your problem here is that you're trying to run an ARM gdb under QEMU's user-mode emulation. QEMU doesn't support the ptrace syscall (that's what syscall number 26 is), so this is never going to work.
What you need to do is run your test binary under QEMU with the QEMU options to enable QEMU's own builtin gdb stub which will listen on a TCP port. Then you can run a gdb compiled to run on your host system but with support for ARM targets, and tell that to connect to the TCP port.
(Emulating ptrace within QEMU is technically very tricky, and it would not provide much extra functionality that you can't already achieve via the QEMU builtin gdbstub. It's very unlikely it'll ever be implemented.)
Minimal working QEMU user mode example
I was missing the -fno-pie -no-pie options:
sudo apt-get install gdb-multiarch gcc-arm-linux-gnueabihf qemu-user
printf '
#include <stdio.h>
#include <stdlib.h>
int main() {
puts("hello world");
return EXIT_SUCCESS;
}
' > hello_world.c
arm-linux-gnueabihf-gcc -fno-pie -ggdb3 -no-pie -o hello_world hello_world.c
qemu-arm -L /usr/arm-linux-gnueabihf -g 1234 ./hello_world
On another terminal:
gdb-multiarch -q --nh \
-ex 'set architecture arm' \
-ex 'set sysroot /usr/arm-linux-gnueabihf' \
-ex 'file hello_world' \
-ex 'target remote localhost:1234' \
-ex 'break main' \
-ex continue \
-ex 'layout split'
This leaves us at main, in a split code / disassembly view due to layout split. You will also be interested in:
layout regs
which shows the registers.
At the end of the day however, GDB Dashboard is more flexible and reliable: gdb split view with code
-fno-pie -no-pie is required because the packaged Ubuntu GCC uses -fpie -pie by default, and those fail due to a QEMU bug: How to GDB step debug a dynamically linked executable in QEMU user mode?
There was no gdbserver --multi-like functionality for the QEMU GDB stub on QEMU 2.11: How to restart QEMU user mode programs from the GDB stub as in gdbserver --multi?
For those learning ARM assembly, I am starting some runnable examples with assertions and using the C standard library for IO at: https://github.com/cirosantilli/arm-assembly-cheat
Tested on Ubuntu 18.04, gdb-multiarch 8.1, gcc-arm-linux-gnueabihf 7.3.0, qemu-user 2.11.
Freestanding QEMU user mode example
This analogous procedure also works on an ARM freestanding (no standard library) example:
printf '
.data
msg:
.ascii "hello world\\n"
len = . - msg
.text
.global _start
_start:
/* write syscall */
mov r0, #1 /* stdout */
ldr r1, =msg /* buffer */
ldr r2, =len /* len */
mov r7, #4 /* Syscall ID. */
swi #0
/* exit syscall */
mov r0, #0 /* Status. */
mov r7, #1 /* Syscall ID. */
swi #0
' > hello_world.S
arm-linux-gnueabihf-gcc -ggdb3 -nostdlib -o hello_world -static hello_world.S
qemu-arm -g 1234 ./hello_world
On another terminal:
gdb-multiarch -q --nh \
-ex 'set architecture arm' \
-ex 'file hello_world' \
-ex 'target remote localhost:1234' \
-ex 'layout split' \
;
We are now left at the first instruction of the program.
QEMU full system examples
Linux kernel: How to debug the Linux kernel with GDB and QEMU?
Bare metal: https://github.com/cirosantilli/newlib-examples/tree/f70f8a33f8b727422bd6f0b2975c4455d0b33efa#gdb
Single step of an assembly instruction is done with stepi. disas will disassemble around the current PC. info regi will display the current register state. There are some examples for various processors on my blog for my ELLCC cross development tool chain project.
You should also pass the -g option when assembling; otherwise the line information is not included.
That crash probably comes from running some garbage after the code lines.
Maybe you should add the exit system call:
mov r0, #0 /* return value */
mov r7, #1 /* exit */
swi #0 /* system call */

Is changing default virtual address in elf header to 0 possible?

Can I change the default virtual address (ph_vaddr) in the ELF to 0x0? Will this allow access to a null pointer, or does the kernel not allow loading at address 0?
I just want to know whether Linux allows it if I change the p_vaddr of some section, say .text, to 0x0. Is there a constraint that virtual addresses can only start above some value? Whenever I tried to set the .text vaddr with ld --section-start to anything between 0 and 9999, the program was killed. What is going on?
Can I change the default virtual address(ph_vaddr) in the elf to 0x0.
Yes, that is in fact how PIE (position independent) executables are usually linked.
echo "int main() { return 0; }" | gcc -xc - -fPIE -pie -o a.out
readelf -l a.out | grep LOAD | head -1
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
Note: the above makes an executable of type ET_DYN.
will this allow access to null pointer?
No. When the kernel discovers that the .e_type == ET_DYN for the executable, it will relocate all of its segments elsewhere.
You can also make an executable of type ET_EXEC with .p_vaddr == 0, like so:
echo "int main() { return 0; }" | gcc -xc - -o a.out -Wl,-Ttext=0
readelf -l a.out | grep LOAD | head -1
LOAD 0x0000000000200000 0x0000000000000000 0x0000000000000000
The kernel will refuse to run it:
./a.out
Killed
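The answer above does not name the reason for the refusal. One likely candidate, and this is my assumption, is the vm.mmap_min_addr sysctl, which sets the lowest address an unprivileged process may map. The sketch below reads that value on a Linux system and compares a requested address against it; the function names are hypothetical, and the 65536 used in the usage note is only a common default, not something taken from the question.

```python
def read_mmap_min_addr(path="/proc/sys/vm/mmap_min_addr"):
    """Return the lowest mappable address for unprivileged processes (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


def can_map_at(vaddr, min_addr):
    """True if a segment starting at vaddr lies at or above the minimum address."""
    return vaddr >= min_addr
```

With a minimum of 65536, can_map_at(0, 65536) is False while can_map_at(0x400000, 65536) is True, which would be consistent with addresses between 0 and 9999 being rejected.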
You could mmap(2) with MAP_FIXED a segment starting at (void*)0 but I don't think you should.
I have no idea if changing the virtual address in elf(5) would do the equivalent. Are you speaking of p_vaddr for some segment?
Actually, you should really not use the NULL address in application code on Linux, especially if some of that code is coded in C, because the NULL pointer has a very special meaning, including to the compiler. In particular, some optimizations are done based on the fact that NULL is not dereferencable.
It is well known that GCC does optimize, for instance,
x = *p;
if (!p) goto wasnull;
into just x = *p; because if p has been dereferenced it cannot be NULL. GCC is right to do that optimization for application code (though not for freestanding code).
Also the kernel is usually doing Address Space Layout Randomization.

why does a stack program segment have executable attribute

Here is a dump from a.out
STACK off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**2
filesz 0x00000000 memsz 0x00000000 flags rwx
Why does a stack segment have executable attribute?
Why isn't there a heap segment with rw- attribute?
//On ubuntu 32bit machine. Program is a simple hello world.
Command:
ld test.o startup.s; objdump -dhSxt -M intel-mnemonic a.out
//startup.s has a small assembly code with _start symbol which calls main and exits after main returns.
Command: gcc test.c
Try gcc test.c -Wl,-z,noexecstack.
That should be the default on any reasonably modern distribution.
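The STACK line in the objdump output corresponds to the PT_GNU_STACK program header, so the effect of the linker option can be checked by reading that header's flags. The following is a minimal sketch of such a check (the function name stack_is_executable is hypothetical); it handles 32-bit and 64-bit ELF files and returns None when the header is absent.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type shown as "STACK" by objdump
PF_X = 0x1                 # execute permission bit in p_flags


def stack_is_executable(elf_bytes):
    """Return True/False from the PT_GNU_STACK flags, or None if there is no such header."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = elf_bytes[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if elf_bytes[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    if is_64:
        e_phoff, = struct.unpack_from(endian + "Q", elf_bytes, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", elf_bytes, 0x36)
    else:
        e_phoff, = struct.unpack_from(endian + "I", elf_bytes, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", elf_bytes, 0x2A)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        p_type, = struct.unpack_from(endian + "I", elf_bytes, off)
        if p_type != PT_GNU_STACK:
            continue
        # p_flags directly follows p_type in ELF64 but sits at offset 24 in ELF32.
        flags_off = off + (4 if is_64 else 24)
        p_flags, = struct.unpack_from(endian + "I", elf_bytes, flags_off)
        return bool(p_flags & PF_X)
    return None
```

A typical use would be stack_is_executable(open("a.out", "rb").read()) before and after relinking with -z noexecstack.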
