Debugging a NASM shared object with gdb, I get a segmentation fault when a thread accesses its thread-local variable. When I print the value of the thread-local variable with gdb, it responds:
(gdb) p MQ_FDes_Core
The inferior has not yet allocated storage for thread-local variables in the shared library `/opt/ThTest/TLS_Test.so' for Thread 0x7fffeaf26700 (LWP 4317).
where MQ_FDes_Core is the thread local variable. If I execute the line "mov [MQ_FDes_Core],rax" gdb generates a segmentation fault because the tls has not yet been initialized.
The code section where this occurs is:
lea rdi,[MQ_Name]
mov rsi,rax
call mq_open wrt ..plt
mov [MQ_FDes_Core],rax
The disassembly of that section (using Agner Fog's objconv):
Open_message_queue:
lea rdi, [rel MQ_Name] ; 1367 _ 48: 8D. 3D, 0020230A(rel)
mov rsi, rax ; 136E _ 48: 89. C6
call ?_014 ; 1371 _ E8, FFFFFB9A(rel)
mov qword [rel MQ_FDes_Core], rax ; 1376 _ 48: 89. 05, 002017E3(rel)
The tdata is allocated in the .tdata section:
section .tdata align=16
MQ_FDes_Core: dq 0
The NASM 2.13.02 manual section "7.9.4 Thread Local Storage in ELF: elf Special Symbols and WRT" says to write it like this for ELF64:
mov rax,[rel MQ_FDes_Core wrt ..gottpoff]
mov rcx,[fs:rax]
But with that the NASM assembler says, "error: parser: expecting ]" at the first line.
I link with ld against -ldl, -lpthread and -lrt.
I've researched a number of web resources including Ulrich Drepper's "ELF Handling For Thread-Local Storage" but I don't yet know how to correct this. Much of it focuses on C or C++ but this is NASM.
Thanks for any ideas on initializing thread local storage in a dynamically loaded shared object.
UPDATE -- A MINIMAL EXAMPLE shared object (build into a PIE executable):
default rel
section .tdata align=16
MQ_FDes_Core: dq 0
section .text
global main
main:
mov qword [MQ_FDes_Core], 87
ret
Build with nasm -f elf64 tls.asm && gcc -pie tls.o
(gdb) start
Temporary breakpoint 1 at 0x1120
Starting program: /tmp/a.out
Temporary breakpoint 1, 0x0000555555555120 in main ()
(gdb) p MQ_FDes_Core
Cannot find thread-local storage for process 642417, executable file /tmp/a.out:
Cannot find thread-local variables on this target
Related
I'm trying to debug this simple C program:
#include <stdio.h>
int main(int argc, char *argv[]) {
printf("Hello\n");
}
But when I disassemble the main function I get this:
(gdb) disas main
Dump of assembler code for function main:
0x000000000000063a <+0>: push rbp
0x000000000000063b <+1>: mov rbp,rsp
0x000000000000063e <+4>: sub rsp,0x10
0x0000000000000642 <+8>: mov DWORD PTR [rbp-0x4],edi
0x0000000000000645 <+11>: mov QWORD PTR [rbp-0x10],rsi
0x0000000000000649 <+15>: lea rdi,[rip+0x94] # 0x6e4
0x0000000000000650 <+22>: call 0x510 <puts#plt>
0x0000000000000655 <+27>: mov eax,0x0
0x000000000000065a <+32>: leave
0x000000000000065b <+33>: ret
End of assembler dump.
And this is already pretty strange because addresses starts with a prefix of 4... for 32 bit executables and 8... for 64 bit executables I think.
But going on I then put a breakpoint:
(gdb) b *0x0000000000000650
Breakpoint 1 at 0x650
I run it and I get this error message:
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x650
Your code was most probably compiled as Position-Independent Executable (PIE) to allow Address Space Layout Randomization (ASLR). On some systems, gcc is configured to create PIEs by default (that implies the options -pie -fPIE being passed to gcc).
When you start GDB to debug a PIE, it starts reading addresses from 0, since your executable was not started yet, and therefore not relocated (in PIEs, all addresses including the .text section are relocatable and they start at 0, similar to a dynamic shared object). This is a sample output:
$ gcc -o prog main.c -pie -fPIE
$ gdb -q prog
Reading symbols from prog...(no debugging symbols found)...done.
gdb-peda$ disassemble main
Dump of assembler code for function main:
0x000000000000071a <+0>: push rbp
0x000000000000071b <+1>: mov rbp,rsp
0x000000000000071e <+4>: sub rsp,0x10
0x0000000000000722 <+8>: mov DWORD PTR [rbp-0x4],edi
0x0000000000000725 <+11>: mov QWORD PTR [rbp-0x10],rsi
0x0000000000000729 <+15>: lea rdi,[rip+0x94] # 0x7c4
0x0000000000000730 <+22>: call 0x5d0 <puts#plt>
0x0000000000000735 <+27>: mov eax,0x0
0x000000000000073a <+32>: leave
0x000000000000073b <+33>: ret
End of assembler dump.
As you can see, this shows a similar output to yours, with .text adresses starting at low values.
Relocation takes place once you start your executable, so after that, your code will be placed at some random address in your process memory:
gdb-peda$ start
...
gdb-peda$ disassemble main
Dump of assembler code for function main:
0x00002b1c8f17271a <+0>: push rbp
0x00002b1c8f17271b <+1>: mov rbp,rsp
=> 0x00002b1c8f17271e <+4>: sub rsp,0x10
0x00002b1c8f172722 <+8>: mov DWORD PTR [rbp-0x4],edi
0x00002b1c8f172725 <+11>: mov QWORD PTR [rbp-0x10],rsi
0x00002b1c8f172729 <+15>: lea rdi,[rip+0x94] # 0x2b1c8f1727c4
0x00002b1c8f172730 <+22>: call 0x2b1c8f1725d0 <puts#plt>
0x00002b1c8f172735 <+27>: mov eax,0x0
0x00002b1c8f17273a <+32>: leave
0x00002b1c8f17273b <+33>: ret
End of assembler dump.
As you can see, the addresses now take "real" values that you can set breakpoints to. Note that usually you will still not see the effect of ASLR in GDB though, since it disables randomization by default (debugging a program with randomized location would be cumbersome). You can check this with show disable-randomization. If you really want to see the effects of ASLR in your PIE, set disable-randomization off. Then every run will relocate your code to random addresses.
So the bottom line is: When debugging PIE code, start your program in GDB first and then figure out the addresses.
Alternatively, you can explicitly disable the creation of PIE code and compile your application using gcc filename.c -o filename -no-pie -fno-PIE.
My system does not enforce PIE creation by default, so unfortunately I don't know about the implications of disabling PIE on such a system (would be glad to see comments on that).
For a more comprehensive explanation of position-independent code (PIC) in general (which is of utmost importance for shared libraries), have a look at Ulrich Drepper's paper "How to Write Shared Libraries".
I'm trying to understand why as behaves differently than nasm when doing syscalls on the assembly level. Because I'm a glutton for punishment, I'm using Intel syntax. Here's my program:
.intel_syntax noprefix
.section .rodata
.LC0:
.string "Hello world!\n"
.text
.globl _start
.type _start, #function
_start:
mov edx, 13
mov ecx, OFFSET FLAT:.LC0
mov eax, 4
int 0x80
ret
I assemble with as -o prog.o prog.s and link with ld -s -o prog prog.o.
But when I run it, I get:
$ ./prog
Hello world!
Segmentation fault (core dumped)
GDB is not particularly helpful here. When I stepi on ret, it says Cannot access memory at address 0x1. Which is puzzling, because the value of ESP is:
(gdb) info registers esp
info registers esp
esp 0xbffff660 0xbffff660
Why does this program segfault?
Because it never exits properly. _start doesn't have a parent stack frame, so returning from it will cause a crash.
You can return from main to have the standard library's _start implementation call exit for you, but if you're writing your own _start, you need to call exit yourself, as there's no parent stack frame to return to.
I tried to put code not in the main function, but directly into _start:
segment .text
global _start
_start:
push rbp
mov rbp, rsp
; ... program logic ...
leave
ret
Compile:
yasm -f elf64 main.s
ld -o main main.o
Run:
./main
Segmentation fault(core dumped)
I read, leave is
mov esp,ebp
pop ebp
But why is it that such an epilogue to the pop stack frame and the set base frame pointer to a previous frame's base results in a segmentation fault?
Indeed, making an exit system call exits gracefully.
As per ABI1 the stack at the entry on _start is
There is no "return address".
The only way to exit a process is through SYS_EXIT
xorl %edi, %edi ;Error code
movl $60, %eax ;SYS_EXIT
syscall
1 Section 3.4.1 Initial Stack and Register State.
The LEAVE instruction is defined to not cause any exceptions, so it cannot be the source of your fault. You should be using GDB. Debuggers are invaluable in solving these sorts of problems.
This is what happens:
$ gdb ./main
[...]
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000001 in ?? ()
(gdb) x /gx $rsp-8
0x7fffffffe650: 0x0000000000000001
So, most likely your program ran to completion, but the first thing on the stack that 0x0000000000000001. RET popped that into the RIP register, and then it segfaulted because that address is not mapped.
I don't write a lot of code on Linux, but I would bet that _start is required to use the exit system call. The only way you could possibly return to a useful address is if the kernel put a function somewhere that would do this for you.
Why doesn't the following code give a segmentation fault?
global _start
section .data
_start:
mov ecx, 3
xor byte[_start+1], 0x02
mov eax, 1
mov ebx, 2
int 80h
I expected it to segfault at the same place (line marked with a comment) as when the same code is run in the .text section:
global _start
section .text ; changed from data to text
_start:
mov ecx, 3
xor byte[_start+1], 0x02 ; ******get segmentation fault here
mov eax, 1
mov ebx, 2
int 80h
Now, I know that section .data is for read-write, and section .text is for read only.
But why would it matter when I try to access illegal memory address?
For the example here, I expected to get segmentation fault also at section .data, in the same place that I got it in section .text.
[_start+1] is clearly not an illegal address. It's part of the 5 bytes encoding mov ecx, 3. (look at objdump -Mintel -drw a.out to see disassembly with the hex machine code).
IDK why you think there would be a problem writing to an address in .data where you've defined the contents. It's more common to use pseudo-instructions like db to assemble bytes into the data section, but assemblers will happily assemble instructions or db into bytes anywhere you put them.
The crash you'd expect from the .data version is from _start being mapped without execute permission but thanks to surprising defaults in the toolchain, programs with asm source files often end up with read-implies-exec (like gcc -zexecstack) unless you take precautions to avoid that:
Why data and stack segments are executable?
Unexpected exec permission from mmap when assembly files included in the project
If you applied that section .note.GNU-stack noalloc noexec nowrite progbits change, code fetch from RIP=_start would fault.
The version that tries to write to the .text section of course segfaults because it's mapped read-only.
section .text
global _start
_start:
nop
main:
mov eax, 1
mov ebx, 2
xor eax, eax
ret
I compile with these commands:
nasm -f elf main.asm
ld -melf_i386 -o main main.o
When I run the code, Linux throw a segmentation fault error
(I am using Linux Mint Nadia 64 bits). Why this error is produced?
Because ret is NOT the proper way to exit a program in Linux, Windows, or Mac!!!!
_start is not a function, there is no return address on the stack because there is no user-space caller to return to. Execution in user-space started here (in a static executable), at the process entry point. (Or with dynamic linking, it jumped here after the dynamic linker finished, but same result).
On Linux / OS X, the stack pointer is pointing at argc on entry to _start (see the i386 or x86-64 System V ABI doc for more details on the process startup environment); the kernel puts command line args into user-space stack memory before starting user-space. (So if you do try to ret, EIP/RIP = argc = a small integer, not a valid address. If your debugger shows a fault at address 0x00000001 or something, that's why.)
For Windows it is ExitProcess and Linux is is system call -
int 80H using sys_exit, for x86 or using syscall using 60 for 64-bit or a call to exit from the C Library if you are linking to it.
32-bit Linux (i386)
%define SYS_exit 1 ; call number __NR_exit from <asm/unistd_32.h>
mov eax, SYS_exit ; use the NASM macro we defined earlier
xor ebx, ebx ; ebx = 0 exit status
int 80H ; _exit(0)
64-bit Linux (amd64)
mov rax, 60 ; SYS_exit aka __NR_exit from asm/unistd_64.h
xor rdi, rdi ; edi = 0 first arg to 64-bit system calls
syscall ; _exit(0)
(In GAS you can actually #include <sys/syscall.h> or <asm/unistd.h> to get the right numbers for the mode you're assembling a .S for, but NASM can't easily use the C preprocessor.
See Polygot include file for nasm/yasm and C for hints.)
32-bit Windows (x86)
push 0
call ExitProcess
Or Windows/Linux linking against the C Library
; pass an int exit_status as appropriate for the calling convention
; push 0 / xor edi,edi / xor ecx,ecx
call exit
(Or for 32-bit x86 Windows, call _exit, because C names get prepended with an underscore, unlike in x86-64 Windows. The POSIX _exit function would be call __exit, if Windows had one.)
Windows x64's calling convention includes shadow space which the caller has to reserve, but exit isn't going to return so it's ok to let it step on that space above its return address. Also, 16-byte stack alignment is required by the calling convention before call exit except for 32-bit Windows, but often won't actually crash for a simple function like exit().
call exit (unlike a raw exit system call or libc _exit) will flush stdio buffers first. If you used printf from _start, use exit to make sure all output is printed before you exit, even if stdout is redirected to a file (making stdout full-buffered, not line-buffered).
It's generally recommended that if you use libc functions, you write a main function and link with gcc so it's called by the normal CRT start functions which you can ret to.
See also
Syscall implementation of exit()
How come _exit(0) (exiting by syscall) prevents me from receiving any stdout content?
Defining main as something that _start falls through into doesn't make it special, it's just confusing to use a main label if it's not like a C main function called by a _start that's prepared to exit after main returns.