I'm trying to understand why as behaves differently than nasm when doing syscalls on the assembly level. Because I'm a glutton for punishment, I'm using Intel syntax. Here's my program:
.intel_syntax noprefix
.section .rodata
.LC0:
.string "Hello world!\n"
.text
.globl _start
.type _start, #function
_start:
mov edx, 13
mov ecx, OFFSET FLAT:.LC0
mov eax, 4
int 0x80
ret
I assemble with as -o prog.o prog.s and link with ld -s -o prog prog.o.
But when I run it, I get:
$ ./prog
Hello world!
Segmentation fault (core dumped)
GDB is not particularly helpful here. When I stepi on ret, it says Cannot access memory at address 0x1. Which is puzzling, because the value of ESP is:
(gdb) info registers esp
info registers esp
esp 0xbffff660 0xbffff660
Why does this program segfault?
Because it never exits properly. When the kernel starts a process, the top of the stack holds argc, not a return address, so ret in _start pops argc (1 here) into the instruction pointer and jumps to address 0x1, which is exactly what GDB reports.
You can return from main and let the standard library's _start implementation call exit for you, but if you write your own _start, you need to make the exit system call yourself.
Debugging a NASM shared object with gdb, I get a segmentation fault when a thread accesses its thread-local variable. When I print the value of the thread-local variable with gdb, it responds:
(gdb) p MQ_FDes_Core
The inferior has not yet allocated storage for thread-local variables in the shared library `/opt/ThTest/TLS_Test.so' for Thread 0x7fffeaf26700 (LWP 4317).
where MQ_FDes_Core is the thread-local variable. If I execute the line "mov [MQ_FDes_Core],rax", the program segfaults because the TLS has not yet been initialized.
The code section where this occurs is:
lea rdi,[MQ_Name]
mov rsi,rax
call mq_open wrt ..plt
mov [MQ_FDes_Core],rax
The disassembly of that section (using Agner Fog's objconv):
Open_message_queue:
lea rdi, [rel MQ_Name] ; 1367 _ 48: 8D. 3D, 0020230A(rel)
mov rsi, rax ; 136E _ 48: 89. C6
call ?_014 ; 1371 _ E8, FFFFFB9A(rel)
mov qword [rel MQ_FDes_Core], rax ; 1376 _ 48: 89. 05, 002017E3(rel)
The tdata is allocated in the .tdata section:
section .tdata align=16
MQ_FDes_Core: dq 0
The NASM 2.13.02 manual section "7.9.4 Thread Local Storage in ELF: elf Special Symbols and WRT" says to write it like this for ELF64:
mov rax,[rel MQ_FDes_Core wrt ..gottpoff]
mov rcx,[fs:rax]
But with that the NASM assembler says, "error: parser: expecting ]" at the first line.
I link with ld against -ldl, -lpthread and -lrt.
I've researched a number of web resources including Ulrich Drepper's "ELF Handling For Thread-Local Storage" but I don't yet know how to correct this. Much of it focuses on C or C++ but this is NASM.
Thanks for any ideas on initializing thread local storage in a dynamically loaded shared object.
UPDATE -- A MINIMAL EXAMPLE shared object (build into a PIE executable):
default rel
section .tdata align=16
MQ_FDes_Core: dq 0
section .text
global main
main:
mov qword [MQ_FDes_Core], 87
ret
Build with nasm -f elf64 tls.asm && gcc -pie tls.o
(gdb) start
Temporary breakpoint 1 at 0x1120
Starting program: /tmp/a.out
Temporary breakpoint 1, 0x0000555555555120 in main ()
(gdb) p MQ_FDes_Core
Cannot find thread-local storage for process 642417, executable file /tmp/a.out:
Cannot find thread-local variables on this target
When I assemble and run the following code, I get the error Segmentation fault (core dumped):
section .text
global _start
_start:
mov eax, 8
My Makefile is as follows
all:
nasm -f elf64 -o asm.o asm.s
ld asm.o -o asm
rm asm.o
I don't know what the issue is.
I am running 64-bit Ubuntu.
The CPU executes the program, finds the mov eax, 8 instruction, executes it... and what now? There are no more instructions in the object file, but nobody told the CPU! It executes whatever bytes come next, probably not a valid instruction, which results in a segmentation fault, just like @MichaelPetch said.
The easiest solution IMO is to use a wrapper, which takes care of initializing and cleaning up your program, e.g., GCC. Just put the mov eax, 8 into the main function, which you may be familiar with from C.
Modify the source file as follows:
section .text
global main
main:
mov eax, 8
ret
(main is a function, so you need the ret instruction to return from it.)
and use the following script:
nasm -f elf64 -o asm.o asm.s
gcc asm.o -o asm
rm asm.o
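I made this quick example.asm using 64-bit registers:
section .data
msg db "Hello world", 10 ; 12 bytes, including the trailing newline
section .text
global _start
_start:
call _myfunk
call _exit
_myfunk:
mov rax,1 ; sys_write
mov rdi,1 ; fd 1 = stdout
mov rdx,12 ; number of bytes to write
mov rsi,msg
syscall
ret
_exit:
mov rax, 60 ; sys_exit
mov rdi,0 ; exit status 0
syscall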
To assemble and link this code you can use the nasm and ld commands:
nasm -f elf64 example.asm -o example.o
ld example.o -o example.elf
Now run the program with ./example.elf
I have only just started with assembly, but I can help. This question is now 4 years old and you may not need the answer anymore, but maybe others will.
In this:
section .text
global _start
_start:
mov eax, 8
you forgot to stop the program after the code finishes,
so after the _start label you can add 3 lines, as below:
(I don't know why you need 8 in the eax register, so I am moving it to ebx.)
section .text
global _start
_start:
mov eax, 8
mov ebx, eax
mov eax, 1 ; this is for system call exit
int 0x80 ; system call
Here the value of ebx is used as the exit status, so you can get the value (8) by typing this in your terminal right after running the program:
echo $?
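The same idea can be checked from C. This is only a sketch, assuming Linux; the helper name exit_status_via_syscall is made up for illustration. A child process ends itself with the raw exit system call, and the parent reads back the status that the shell would show in $?.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that ends itself with the raw exit
 * system call, and return the exit status the parent observes (-1 on error). */
int exit_status_via_syscall(int code)
{
    /* Only the low 8 bits of the status reach the parent. */
    assert(code >= 0 && code <= 255);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        syscall(SYS_exit, code); /* does not return */
        _exit(127);              /* not reached; keeps the child from falling through */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling exit_status_via_syscall(8) from a main function should give back 8, the same number echo $? prints after the assembly program above.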
good luck :)
I tried to put code not in the main function, but directly into _start:
segment .text
global _start
_start:
push rbp
mov rbp, rsp
; ... program logic ...
leave
ret
Compile:
yasm -f elf64 main.s
ld -o main main.o
Run:
./main
Segmentation fault(core dumped)
I read that leave is equivalent to:
mov esp,ebp
pop ebp
But why does such an epilogue, which pops the stack frame and restores the base pointer to the previous frame's base, result in a segmentation fault?
Indeed, making an exit system call exits gracefully.
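As per the ABI [1], the stack at entry to _start holds argc at the top, followed by the argv pointers, a null pointer, and then the environment pointers.
There is no "return address".
The process has to end itself with the exit system call (SYS_exit):
xorl %edi, %edi # exit status
movl $60, %eax # SYS_exit
syscall
[1] Section 3.4.1 Initial Stack and Register State.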
The LEAVE instruction only faults if the stack memory it reads is inaccessible, which is not the case here, so it is not the source of your fault. You should be using GDB. Debuggers are invaluable in solving these sorts of problems.
This is what happens:
$ gdb ./main
[...]
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000001 in ?? ()
(gdb) x /gx $rsp-8
0x7fffffffe650: 0x0000000000000001
So, most likely your program ran to completion, but the first thing on the stack was 0x0000000000000001. RET popped that into the RIP register, and then it segfaulted because that address is not mapped.
I don't write a lot of code on Linux, but I would bet that _start is required to use the exit system call. The only way you could possibly return to a useful address is if the kernel put a function somewhere that would do this for you.
Why doesn't the following code give a segmentation fault?
global _start
section .data
_start:
mov ecx, 3
xor byte[_start+1], 0x02
mov eax, 1
mov ebx, 2
int 80h
I expected it to segfault at the same place (line marked with a comment) as when the same code is run in the .text section:
global _start
section .text ; changed from data to text
_start:
mov ecx, 3
xor byte[_start+1], 0x02 ; ******get segmentation fault here
mov eax, 1
mov ebx, 2
int 80h
Now, I know that section .data is for read-write, and section .text is for read only.
But why would it matter when I try to access illegal memory address?
For the example here, I expected to get segmentation fault also at section .data, in the same place that I got it in section .text.
[_start+1] is clearly not an illegal address. It's part of the 5 bytes encoding mov ecx, 3. (Look at objdump -Mintel -drw a.out to see the disassembly with the hex machine code.)
IDK why you think there would be a problem writing to an address in .data where you've defined the contents. It's more common to use pseudo-instructions like db to assemble bytes into the data section, but assemblers will happily assemble instructions or db into bytes anywhere you put them.
The crash you'd expect from the .data version is from _start being mapped without execute permission, but thanks to surprising defaults in the toolchain, programs with asm source files often end up with read-implies-exec (like gcc -z execstack) unless you take precautions to avoid that:
Why data and stack segments are executable?
Unexpected exec permission from mmap when assembly files included in the project
If you applied that section .note.GNU-stack noalloc noexec nowrite progbits change, code fetch from RIP=_start would fault.
The version that tries to write to the .text section of course segfaults because it's mapped read-only.
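The effect of page permissions on a write can be reproduced without any assembly. The following is a sketch under assumptions (Linux, and a made-up helper name signal_from_write): a child process applies the same xor to a freshly mapped page, and the parent reports which signal, if any, killed it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: map one page with the given protection, let a child
 * modify its first byte, and return the signal that killed the child
 * (0 if it exited normally, -1 on setup errors). */
int signal_from_write(int prot)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    unsigned char *mem = mmap(NULL, (size_t)page, prot,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, (size_t)page);
        return -1;
    }

    if (pid == 0) {
        /* Same read-modify-write as "xor byte [_start+1], 0x02". */
        *(volatile unsigned char *)mem ^= 0x02;
        _exit(0);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(mem, (size_t)page);
    if (waited < 0)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With PROT_READ | PROT_WRITE the child exits normally, which corresponds to the .data case; with PROT_READ alone it is killed by SIGSEGV, which corresponds to the read-only .text case.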
I have written a simple Linux assembly shellcode which prints "Hello, world!" to stdout.
xor eax,eax
xor ebx,ebx
xor ecx,ecx
xor edx,edx
jmp short string
code:
pop ecx
mov bl,1
mov al,13
mov al,4
int 0x80
dec bl
mov al,1
int 0x80
string:
call code
db 'hellow, world!'
The program name is hello.S. Now, compiling the code:
$ nasm -o hello hello.S
$ ./s-proc -p hello
/* The following shellcode is 47 bytes long: */
char shellcode[] =
"\x66\x31\xc0\x66\x31\xdb\x66\x31\xc9\x66\x31\xd2\xeb\x10\x66"
"\x59\xb3\x01\xb0\x0d\xb0\x04\xcd\x80\xfe\xcb\xb0\x01\xcd\x80"
"\xe8\xed\xff\x68\x65\x6c\x6c\x6f\x77\x2c\x20\x77\x6f\x72\x6c"
"\x64\x21";
$ ./s-proc -e hello
Calling code ...
Segmentation fault
$
I believe the program is correct, but it gives an error.
About the s-proc:
s-proc is a C program which is used to execute the shellcode. Linking with the ld command makes the shellcode large, therefore I used s-proc.
The source code of s-proc.c can be found here and here.
The wrapper code simply uses malloc to get a chunk of memory and reads the file into it. However, heap memory is not executable nowadays, hence the segfault. You could use mprotect to mark the required page(s) executable. If you decide to put the shellcode on the stack, you need an executable stack turned on (the -z execstack option to gcc).
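A minimal sketch of the mprotect approach follows. It is not the actual s-proc code; the function name load_executable is made up, and it assumes a POSIX system with MAP_ANONYMOUS. It uses mmap rather than malloc so that the buffer is page-aligned, which mprotect requires.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copy len bytes of machine code into a fresh
 * page-aligned mapping, then switch that mapping from read/write to
 * read/execute. Returns NULL on failure. */
void *load_executable(const unsigned char *code, size_t len)
{
    assert(code != NULL && len > 0);

    /* Start writable so the code can be copied in. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    memcpy(mem, code, len);

    /* Drop write permission and add execute permission. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return NULL;
    }
    return mem;
}
```

The returned pointer can then be cast to a function pointer and called, as the wrapper does with its buffer. Keeping the mapping either writable or executable, but never both at once, is a deliberate choice here; a wrapper that needs self-modifying shellcode would have to request PROT_WRITE | PROT_EXEC together, which some hardened systems refuse.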