Difference in behaviour between code executed by a pthread and the main thread in x64-assembly - linux

When writing some x64 assembly, I stumbled upon something weird. A function call works fine when executed on a main thread, but causes a segmentation fault when executed as a pthread. At first I thought I was invalidating the stack, as it only segfaults on the second call, but this does not match with the fact that it works properly on the main thread yet crashes on a newly-spawned thread.
From gdb:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Value: 1337
Value: 1337
[New Thread 0x7ffff77f6700 (LWP 8717)]
Return value: 0
Value: 1337
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff77f6700 (LWP 8717)]
__printf (format=0x600570 <fmt> "Value: %d\n") at printf.c:28
28 printf.c: No such file or directory.
Does anyone have an idea about what could be going on here?
extern printf
extern pthread_create
extern pthread_join
extern pthread_exit
section .data
align 4
fmt db "Value: %d", 0x0A, 0
fmt_rval db "Return value: %d", 0x0A, 0
tID dw 0
section .text
global _start
_start:
mov rdi, 1337
call show_value
call show_value ; <- this call works fine
; CREATE THREAD
mov ecx, 0 ; function argument
mov edx, thread_1 ; function pointer
mov esi, 0 ; attributes
mov rdi, tID ; pointer to threadID
call pthread_create
mov rdi, rax
call show_rval
mov rsi, 0 ; return value
mov rdi, [tID] ; id to wait on
call pthread_join
mov rdi, rax
call show_rval
call exit
thread_1:
mov rdi, 1337
call show_value
call show_value ; <- this additional call causes a segfault
ret
show_value:
push rdi
mov rsi, rdi
mov rdi, fmt
call printf
pop rdi
ret
show_rval:
push rdi
mov rsi, rdi
mov rdi, fmt_rval
call printf
pop rdi
ret
exit:
mov rax, 60
mov rdi, 0
syscall
The binary was generated on Ubuntu 14.04 (64-bit of course), with:
nasm -felf64 -g -o $1.o $1.asm
ld -I/lib64/ld-linux-x86-64.so.2 -o $1.out $1.o -lc -lpthread

Functions that take a variable number of parameters like printf require the RAX register to be set properly. You need to set it to the number of vector registers used, which in your case is 0. From Section 3.2.3 Parameter Passing in the System V 64-bit ABI:
RAX
temporary register;
with variable arguments passes information about the number of vector registers used;
1st return register
Section 3.5.7 contains more detailed information about the parameter passing mechanism of functions taking a variable number of arguments. That section says:
When a function taking variable-arguments is called, %rax must be set to the total number of floating point parameters passed to the function in vector registers.
Modify your code to set RAX to zero in your call to printf:
show_value:
push rdi
xor rax, rax ; rax = 0
mov rsi, rdi
mov rdi, fmt
call printf
pop rdi
ret
You have the same issue in show_rval.
One other observation: you could simplify linking your executable by using GCC instead of LD.
I would recommend renaming _start to main and simply using GCC to link the final executable. GCC's C runtime code provides a _start label that properly initializes the C runtime, which may be required in some scenarios. When the C runtime code has finished initializing, it transfers control (via a CALL) to the label main. You could then produce your executable with:
nasm -felf64 -g -o $1.o $1.asm
gcc -o $1.out $1.o -lpthread
I don't think this is related to your problem, but was meant more as an FYI.
By not properly setting RAX for the printf call, unwanted behavior may occur in some cases. In this case, the value of RAX not being set properly for the printf call in an environment with threads causes a segmentation fault. The code without threads happened to work because you were lucky.
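To see why the callee cares about that count, here is a minimal C sketch of a variadic function that reads floating-point arguments. The name sum_doubles is a hypothetical helper, not something from the question. On x86-64 System V the caller passes those doubles in vector registers and reports how many it used in AL; a C compiler emits that for you, whereas in hand-written assembly you have to set it yourself before every call to a variadic function.

```c
#include <stdarg.h>

/* Hypothetical helper: sum `count` doubles passed as variable arguments.
 * va_arg(ap, double) is the callee-side counterpart of the vector-register
 * count that the caller places in AL. */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);
    va_end(ap);

    return total;
}
```

A call such as sum_doubles(3, 1.0, 2.0, 3.5) uses three vector registers, so a compiler sets AL to 3 before the call. printf with only integer arguments uses none, hence the zeroed RAX above.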

Related

Why is the RDI register missing in this "Hello world" assembly program?

I found this "Hello" (shellcode) assembly program:
SECTION .data
SECTION .text
global main
main:
mov rax, 1
mov rsi, 0x6f6c6c6548 ; "Hello" is stored in reverse order "olleH"
push rsi
mov rsi, rsp
mov rdx, 5
syscall
mov rax, 60
syscall
And I found that mov rdi, 1 is missing. In other "hello world" programs that instruction appears so I would like to understand why this happens.
I was going to say it's an intentional trick or hack to save code bytes, using argc as the file descriptor. (1 if you run it from the shell without extra command line args). main(int argc, char**argv) gets its args in EDI and RSI respectively, in the x86-64 SysV calling convention used on Linux.
But given the other choices, like mov rax, 1 instead of mov eax, edi, it's probably just a bug that got overlooked because the code happened to work.
It would not work in real shellcode for a code-injection attack, where execution would probably reach this code with garbage other than 0, 1, or 2 in EDI. The shellcode test program on the tutorial you linked calls a const char[] of machine code as the only thing in main, which will normally compile to asm that doesn't touch RDI.
This code wouldn't work for code-injection attacks based on strcpy or other C-string overflows either, since the machine code contains 00 bytes as part of mov eax, 1, mov edx, 5, and the end of that character string.
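A small C sketch of the zero-byte problem, assuming the usual encoding of mov eax, 1 (B8 01 00 00 00) followed by syscall (0F 05). The names payload and bytes_copied_by_strcpy are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Machine code for: mov eax, 1 ; syscall */
static const unsigned char payload[] = {
    0xb8, 0x01, 0x00, 0x00, 0x00,   /* mov eax, 1  (imm32 contains zero bytes) */
    0x0f, 0x05                      /* syscall */
};

/* Return how many payload bytes strcpy actually copies,
 * not counting the terminator it stops on. */
size_t bytes_copied_by_strcpy(void)
{
    char dest[sizeof payload + 1];

    memset(dest, 0xcc, sizeof dest);        /* fill so leftovers are visible */
    strcpy(dest, (const char *)payload);    /* stops at the first 0x00 */
    return strlen(dest);
}
```

Only the first two of the seven bytes get copied, so the injected code would be cut off in the middle of the first instruction.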
Also, modern linkers don't link .rodata into an executable segment, and -zexecstack only affects the actual stack, not all readable memory. So that shellcode test won't work, although I expect it did when written. See How to get c code to execute hex machine code? for working ways, like using a local array and compiling with -zexecstack.
That tutorial is overall not great, probably something this guy wrote while learning. (But not as bad as I expected based on this bug and the use of Kali; it's at least decently written, just missing some tricks.)
Since you're using NASM, you don't need to manually waste time looking up ASCII codes and getting the byte order correct. Unlike some assemblers, mov rsi, "Hello" / push rsi results in those bytes being in memory in source order.
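To illustrate the byte order, here is a short C sketch, assuming a little-endian host such as x86-64. The helper name qword_to_bytes is hypothetical. It stores the constant from the question into memory and reads it back as bytes.

```c
#include <stdint.h>
#include <string.h>

/* Copy the 8 bytes of `value` into `out` exactly as they sit in memory,
 * then NUL-terminate. `out` must have room for 9 bytes. */
void qword_to_bytes(uint64_t value, char out[9])
{
    memcpy(out, &value, sizeof value);
    out[8] = '\0';
}
```

On x86-64, qword_to_bytes(0x6f6c6c6548, buf) leaves "Hello" in buf: the least significant byte 0x48 ('H') is stored first, which is why the hex constant looks reversed when written by hand.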
You also don't need an empty .data section, especially when making shellcode which is just a self-contained snippet of machine code which can't reference anything outside itself.
Writing a 32-bit register implicitly zero-extends to 64-bit. NASM optimizes mov rax,1 into mov eax,1 for you (as you can see in the objdump -d AT&T disassembly; use objdump -drwC -Mintel to get Intel-syntax disassembly similar to NASM).
The following should work:
global main
main:
mov rax, `Hello\n  ` ; two spaces of non-zero padding to fill 8 bytes
push rax
mov rsi, rsp
push 1 ; push imm8
pop rax ; __NR_write
mov edi, eax ; STDOUT_FD is also 1
lea edx, [rax-1 + 6] ; EDX = 6; using 3 bytes with no zeros
syscall
mov al, 60 ; assuming write success, RAX = 6, zero outside the low byte
;lea eax, [rdi-1 + 60] ; the safe way that works even with ./hello >&- to return -EBADF
syscall
This is fewer bytes of machine code than the original, and avoids \x00 bytes which strcpy would stop on. I changed the string to end with a newline, using NASM backticks to support C-style escape sequences like \n as 0x0a byte.
Running normally (I linked it into a static executable without CRT, despite it being called main instead of _start. ld foo.o -o foo):
$ strace ./foo > /dev/null
execve("./foo", ["./foo"], 0x7ffecdc70a20 /* 54 vars */) = 0
write(1, "Hello\n", 6) = 6
exit(1) = ?
Running with stdout closed to break the mov al, 60 __NR_exit hack:
$ strace ./foo >&-
execve("./foo", ["./foo"], 0x7ffe3d24a240 /* 54 vars */) = 0
write(1, "Hello\n", 6) = -1 EBADF (Bad file descriptor)
syscall_0xffffffffffffff3c(0x1, 0x7ffd0b37a988, 0x6, 0, 0, 0) = -1 ENOSYS (Function not implemented)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xffffffffffffffda} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
To still exit cleanly, use lea eax, [rdi-1 + 60] (3 bytes) instead of mov al, 60 (2 bytes) to set RAX according to the unmodified EDI, instead of depending on the upper bytes of RAX being zero which they aren't after an error return.
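The arithmetic behind that failure can be modelled in a few lines of C. This is only a sketch of the register effects, and mov_al and lea_eax_from_rdi are hypothetical names. It assumes EBADF is 9, as on Linux, so a failed write leaves -9 in RAX.

```c
#include <stdint.h>

/* Model `mov al, imm8`: only the low 8 bits of RAX are replaced. */
uint64_t mov_al(uint64_t rax, uint8_t imm8)
{
    return (rax & ~UINT64_C(0xff)) | imm8;
}

/* Model `lea eax, [rdi-1 + 60]`: the 32-bit result is zero-extended
 * into RAX, so the old contents of RAX do not matter. */
uint64_t lea_eax_from_rdi(uint64_t rdi)
{
    return (uint32_t)(rdi - 1 + 60);
}
```

With RAX = -9 after the failed write, mov_al gives 0xffffffffffffff3c, which is exactly the bogus syscall number in the strace output above. With RDI still 1, lea_eax_from_rdi gives 60 regardless of what write returned.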
See also https://codegolf.stackexchange.com/questions/132981/tips-for-golfing-in-x86-x64-machine-code

Passing arguments to execve(2) via registers

I have written a short program to execute bash using execve(2).
my_prog.c
int main(){
execve(Argument_1,Argument_2,NULL);
}
Here is the disassembly of execve(2) [neglecting the main assembly for simplicity]
(gdb) disassemble execve
Dump of assembler code for function execve:
0xf7ea77e0 <+0>: push ebx
0xf7ea77e1 <+1>: mov edx,DWORD PTR [esp+0x10]
0xf7ea77e5 <+5>: mov ecx,DWORD PTR [esp+0xc]
0xf7ea77e9 <+9>: mov ebx,DWORD PTR [esp+0x8]
0xf7ea77ed <+13>: mov eax,0xb
0xf7ea77f2 <+18>: call DWORD PTR gs:0x10
0xf7ea77f9 <+25>: pop ebx
0xf7ea77fa <+26>: cmp eax,0xfffff001
0xf7ea77ff <+31>: jae 0xf7e0f730
0xf7ea7805 <+37>: ret
I found the arguments to execve(2) in the following registers
eax --> index of execve(2) in syscall table
ebx --> Argument_2
ecx --> Argument_3
and Argument_1 on the top of the stack
(gdb) x/xw $esp
0xffffce00: 0x080484ea
(gdb) x/s 0x080484ea
0x80484ea: "/bin/bash" <--- Argument_1
The edx contains 0x080484ab
(gdb) x/xw 0x080484ab
0x80484ab <__libc_csu_init+75>: 0x8301c783
(gdb) x/xw 0x8301c783
0x8301c783: Cannot access memory at address 0x8301c783
I am on a linux-intel (x86) system, so I assume that all parameters to execve(2) should be passed via registers, but I couldn't find Argument_1 in any register even though it is present on the stack.
I'm stopped at the first instruction of execve module.
execve is not a "module", it's a libc function, which loads arguments into registers, and then performs the actual system call.
I couldn't find Argument_1 in any register
If you stopped on instruction at address 0xf7ea77f2, you would.
But you are stopped on entry into a C function execve, so the arguments are where a C function expects them. On i*86, the arguments are passed on the stack, so that's where you'll find them: x/3wx $esp is what you want (at that point in the program).

Why does my shellcode segfault when executed from C, but not as a stand-alone executable?

I'm trying to execute a shell with shellcode. I've made this code in a 64-bits machine:
section .text
global _start
_start:
xor rax, rax
push rax
mov rbx, "/bin//sh"
push rbx
mov rdi, rsp
mov al, 59
syscall
mov al, 60
xor rdi, rdi
syscall
After assembling with nasm and linking with ld, the file works fine when I execute it. The problem comes when I extract the shellcode from it and try to execute it with this program:
int main(){
char *shellcode = "\x48\x31\xc0\x50\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x89\xe7\xb0\x3b\x0f\x05\xb0\x3c\x48\x31\xff\x0f\x05";
(*(void(*)()) shellcode)();
}
It gives me a segmentation fault error. I can't see what's wrong here. Any help would be appreciated.
EDIT: Already tried the gcc -z execstack to make the stack executable, still gives a segmentation fault error
That is expected: your shellcode does not set the registers rsi and rdx, so when your C program runs the shellcode those registers hold garbage. The execve syscall takes three arguments:
int execve(const char *filename, char *const argv[], char *const envp[]);
As extra information, the segmentation fault happens because the failed execve syscall returns an error code in rax. You then move 60 into only the low 8 bits of rax, so the next syscall instruction is issued with a syscall number that doesn't exist, and execution continues past the end of your shellcode.
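A rough C sketch of the failure mode, assuming /bin/sh exists on the system. The function name execve_with_garbage_argv is made up, and the pointer value 1 simply stands in for whatever garbage happens to be in rsi. It makes the raw syscall with a valid path but an invalid argv pointer.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke execve with a valid filename but a garbage argv pointer.
 * The kernel cannot read argv, so the call fails and returns.
 * Returns the resulting errno value, or 0 if no error was reported. */
int execve_with_garbage_argv(void)
{
    errno = 0;
    long ret = syscall(SYS_execve, "/bin/sh",
                       (void *)(uintptr_t)1,   /* stand-in for garbage in rsi */
                       (void *)0);             /* envp = NULL */
    return ret == -1 ? errno : 0;
}
```

On Linux I would expect this to report EFAULT, meaning the process was not replaced and the code after the syscall keeps running. In raw shellcode the same error shows up as a negative value in rax.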

Why does this NASM assembly program print nothing

I have the following assembly, which I assemble with NASM and then link with gcc:
section .text
extern printf
global main
main:
sub rsp, 8 ;align stack pointer
mov rax, 0 ;no vector arguments
mov rdi, intro_message ;First argument
call printf
mov rax, 60 ;exit
syscall
section .data
intro_message:
db 'Hello world',0
When I run ./a.out, nothing is printed, and the instruction pointer seems to have moved to an invalid location. What have I done wrong?

Linux x86-64 Hello World and register usage for parameters

I found this page which has a Hello World example for x86-64 on Linux:
http://blog.markloiseau.com/2012/05/64-bit-hello-world-in-linux-assembly-nasm/
; 64-bit "Hello World!" in Linux NASM
global _start ; global entry point export for ld
section .text
_start:
; sys_write(stdout, message, length)
mov rax, 1 ; sys_write
mov rdi, 1 ; stdout
mov rsi, message ; message address
mov rdx, length ; message string length
syscall
; sys_exit(return_code)
mov rax, 60 ; sys_exit
mov rdi, 0 ; return 0 (success)
syscall
section .data
message: db 'Hello, world!',0x0A ; message and newline
length: equ $-message ; NASM definition pseudo-instruction
The author says:
An integer value representing the system_write call is placed in the
first register, followed by its arguments. When the system call and
its arguments are all in their proper registers, the system is called
and the message is displayed.
What does he mean by "proper" registers/What would be an im"proper" register?
What happens if I have a function with more arguments than I have registers?
Does rax always point to the function call (this would always be a system call?)? Is that its only purpose?
By "the proper registers", the author means the registers specified by the x86-64 ABI, in the Linux Kernel Calling Conventions section. The system call number goes in rax, and arguments go in rdi, rsi, rdx, r10, r8 and r9, in that order.
This calling convention (especially the use of syscall!) is only used for system calls, which can have at most six arguments. Ordinary function calls use a different but similar convention: the first six integer arguments go in rdi, rsi, rdx, rcx, r8 and r9 (rcx instead of r10), and any further arguments are passed on the stack.
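For comparison, here is a minimal C sketch of the same write system call made through the libc syscall() wrapper. raw_write is a hypothetical helper name. The wrapper takes the system call number first and then the arguments in the same order as the registers listed above, and moves them into rax, rdi, rsi and rdx for you.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issue the write system call directly:
 * number -> rax, fd -> rdi, buf -> rsi, len -> rdx.
 * Returns the number of bytes written, or -1 with errno set. */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Calling raw_write(1, "Hello, world!\n", 14) does the same thing as the first syscall in the assembly example.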
