Writing a putchar in Assembly for x86_64 with 64 bit Linux? - linux

I am trying to use the write syscall in order to reproduce the putchar function behavior which prints a single character. My code is as follows,
asm_putchar:
push rbp
mov rbp, rsp
mov r8, rdi
call:
mov rax, 1
mov rdi, 1
mov rsi, r8
mov rdx, 1
syscall
return:
mov rsp, rbp
pop rbp
ret

From man 2 write, you can see the signature of write is,
ssize_t write(int fd, const void *buf, size_t count);
It takes a pointer (const void *buf) to a buffer in memory. You can't pass it a char by value, so you have to store it to memory and pass a pointer.
(Don't print one char at a time unless you only have one to print, that's really inefficient. Construct a buffer in memory and print that. e.g. this x86-64 Linux NASM function: How do I print an integer in Assembly Level Programming without printf from the c library?)
A NASM version of GCC: putchar(char) in inline assembly:
; x86-64 System V calling convention: input = byte in DIL
; clobbers: RDI, RSI, RDX, RCX, R11 (last 2 by syscall itself)
; returns: RAX = write return value: 1 for success, -1..-4095 for error
writechar:
mov byte [rsp-4], dil ; store the char from RDI
mov edi, 1 ; EDI = fd=1 = stdout
lea rsi, [rsp-4] ; RSI = buf
mov edx, edi ; RDX = len = 1
syscall ; rax = write(1, buf, 1)
ret
If you do pass an invalid pointer in RSI, such as '2' (integer 50), the system call will return -EFAULT (-14) in RAX. (The kernel returns error codes on bad pointers to system calls, instead of delivering a SIGSEGV like it would if you deref in user-space).
See also What are the return values of system calls in Assembly?
Instead of writing code to check return values, in toy programs / experiments you should just run them under strace ./a.out, especially if you're writing your own _start without libc there won't be any other system calls during startup that you don't make yourself, so it's very easy to read the output. How should strace be used?

Related

how to properly display a value from the stack using a system call [duplicate]

I am trying to use the write syscall in order to reproduce the putchar function behavior which prints a single character. My code is as follows,
asm_putchar:
push rbp
mov rbp, rsp
mov r8, rdi
call:
mov rax, 1
mov rdi, 1
mov rsi, r8
mov rdx, 1
syscall
return:
mov rsp, rbp
pop rbp
ret
From man 2 write, you can see the signature of write is,
ssize_t write(int fd, const void *buf, size_t count);
It takes a pointer (const void *buf) to a buffer in memory. You can't pass it a char by value, so you have to store it to memory and pass a pointer.
(Don't print one char at a time unless you only have one to print, that's really inefficient. Construct a buffer in memory and print that. e.g. this x86-64 Linux NASM function: How do I print an integer in Assembly Level Programming without printf from the c library?)
A NASM version of GCC: putchar(char) in inline assembly:
; x86-64 System V calling convention: input = byte in DIL
; clobbers: RDI, RSI, RDX, RCX, R11 (last 2 by syscall itself)
; returns: RAX = write return value: 1 for success, -1..-4095 for error
writechar:
mov byte [rsp-4], dil ; store the char from RDI
mov edi, 1 ; EDI = fd=1 = stdout
lea rsi, [rsp-4] ; RSI = buf
mov edx, edi ; RDX = len = 1
syscall ; rax = write(1, buf, 1)
ret
If you do pass an invalid pointer in RSI, such as '2' (integer 50), the system call will return -EFAULT (-14) in RAX. (The kernel returns error codes on bad pointers to system calls, instead of delivering a SIGSEGV like it would if you deref in user-space).
See also What are the return values of system calls in Assembly?
Instead of writing code to check return values, in toy programs / experiments you should just run them under strace ./a.out, especially if you're writing your own _start without libc there won't be any other system calls during startup that you don't make yourself, so it's very easy to read the output. How should strace be used?

Reading a single-key input on Linux (without waiting for return) using x86_64 sys_call

I want to make Linux just take 1 keystroke from keyboard using sys_read, but sys_read just wait until i pressed enter. How to read 1 keystroke ? this is my code:
Mov EAX,3
Mov EBX,0
Mov ECX,Nada
Mov EDX,1
Int 80h
Cmp ECX,49
Je Do_C
Jmp Error
I already tried using BIOS interrupt but it's failed (Segmentation fault), I want capture number 1 to 8 input from keyboard.
Syscalls in 64-bit linux
The tables from man syscall provide a good overview here:
arch/ABI instruction syscall # retval Notes
──────────────────────────────────────────────────────────────────
i386 int $0x80 eax eax
x86_64 syscall rax rax See below
arch/ABI arg1 arg2 arg3 arg4 arg5 arg6 arg7 Notes
──────────────────────────────────────────────────────────────────
i386 ebx ecx edx esi edi ebp -
x86_64 rdi rsi rdx r10 r8 r9 -
I have omitted the lines that are not relevant here. In 32-bit mode, the parameters were transferred in ebx, ecx, etc and the syscall number is in eax. In 64-bit mode it is a little different: All registers are now 64-bit wide and therefore have a different name. The syscall number is still in eax, which now becomes rax. But the parameters are now passed in rdi, rsi, etc. In addition, the instruction syscall is used here instead of int 0x80 to trigger a syscall.
The order of the parameters can also be read in the man pages, here man 2 ioctl and man 2 read:
int ioctl(int fd, unsigned long request, ...);
ssize_t read(int fd, void *buf, size_t count);
So here the value of int fd is in rdi, the second parameter in rsi etc.
How to get rid of waiting for a newline
Firstly create a termios structure in memory (in .bss section):
termios:
c_iflag resd 1 ; input mode flags
c_oflag resd 1 ; output mode flags
c_cflag resd 1 ; control mode flags
c_lflag resd 1 ; local mode flags
c_line resb 1 ; line discipline
c_cc resb 19 ; control characters
Then get the current terminal settings and disable canonical mode:
; Get current settings
mov eax, 16 ; syscall number: SYS_ioctl
mov edi, 0 ; fd: STDIN_FILENO
mov esi, 0x5401 ; request: TCGETS
mov rdx, termios ; request data
syscall
; Modify flags
and byte [c_lflag], 0FDh ; Clear ICANON to disable canonical mode
; Write termios structure back
mov eax, 16 ; syscall number: SYS_ioctl
mov edi, 0 ; fd: STDIN_FILENO
mov esi, 0x5402 ; request: TCSETS
mov rdx, termios ; request data
syscall
Now you can use sys_read to read in the keystroke:
mov eax, 0 ; syscall number: SYS_read
mov edi, 0 ; int fd: STDIN_FILENO
mov rsi, buf ; void* buf
mov rdx, len ; size_t count
syscall
Afterwards check the return value in rax: It contains the number of characters read.
(Or a -errno code on error, e.g. if you closed stdin by running ./a.out <&- in bash. Use strace to print a decoded trace of the system calls your program makes, so you don't need to actually write error handling in toy experiments.)
References:
What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?
Why does the sys_read system call end when it detects a new line?
How do i read single character input from keyboard using nasm (assembly) under ubuntu?
Using the raw keyboard mode under Linux (external site with example in 32-bit assembly)

Why won't this stack string print in x64 NASM on macOS?

I've been able to successfully print a string using the sys_write to stdout on macOS. However, I cannot get this stack string to print using execve syscall with echo:
global _main
default rel
section .text
_main:
mov rbp, rsp
sub rsp, 32
mov rax, 'this a t'
mov [rbp-16], rax
mov rax, 'est'
mov [rbp-8], rax
mov rax, '/bin/ech'
mov [rbp-32], rax
xor rax, rax
mov al, 'o'
mov [rbp-24], rax
push 0
mov rax, 0
mov [rbp], rax
exit_program:
;rdi filename
;rsi argv
;rdx envp
lea rdi, [rbp-32]
lea rsi, [rbp-32]
mov rdx, 0
mov rax, 0x200003b
syscall
Currently, my return is EFAULT status code from execve.
The memory layout as shown in the screenshot is the string "This is a test" followed by null bytes for termination.
UPDATE: Trace output: execve("/bin/echo", [0x6863652f6e69622f, 0x6f, 0x7420612073696874, 0x747365], NULL) = -1 EFAULT (Bad address)
execve takes 3 args: a char* and two char *[] arrays, each terminated by a NULL pointer.
Your first arg is fine. It points to a zero-terminated array of ASCII characters which are a valid path.
Your argv is a char[], not char *[], because you passed the same value as your first arg! So when the system call interprets the data as an array of pointers to copy into the new process's arg array, it finds an invalid pointer 0x6863652f6e69622f as the first one. (The bytes of that pointer are ASCII codes.)
The trace output makes that pretty clear.
Your 3rd is NULL, not a pointer to NULL. Linux supports this, treating a NULL as an empty array. I don't know if MacOS does or not; if you still get EFAULT after passing a valid argv[] set RDX to a pointer to a qword 0 somewhere on the stack.
Keeping your existing setup code, you could change the last part to
lea rdi, [rbp-32] ; pointer to "/bin/echo"
push 0 ; NULL terminator
mov rdx, rsp ; envp = empty array
push some_reg ; holding a pointer to "this is a test"
push rdi ; pointer to "/bin/echo" = argv[0]
mov rsi, rsp ; argv
syscall
Note that envp[] and argv[] are terminated by the same NULL pointer. If you wanted a non-empty envp you couldn't do that.
If this is supposed to be shellcode, you're going to need to replace the push 0 with pushing an xor-zeroed register, and it looks like you could simplify some of the other stuff. But get it working first.

nasm segmentation fault while using arg

extern puts
global main
section .text
main:
mov rax, rdi
label:
test rax, rax
je exit
push rsi
mov rdi, [rsi]
call puts
pop rsi
dec rax
add rsi, 8
jmp label
exit:
pop rsi
ret
I wrote nasm code like that. However segmentation fault occur in last. I can't understand why segmentation fault is occur.
rax is not guaranteed to be preserved across function calls, as it is used to return integer results from functions (in the case of puts "a nonnegative number on success, or EOF on error") You need to save the value of rax before calling puts, like you're doing with rsi, and restore it afterwards.
Obviously you want to get the command line parameters in a GCC environment on a 64-bit Linux, where they are passed according to the GCC calling convention which follows the Linux calling convention "System V AMD64 ABI".
Let's translate the program logic to C:
#include <stdio.h>
int main ( int argc, char** argv )
{
if (argc != 0)
{
do
{
puts (*argv);
argc--;
argv++;
} while (argc);
}
return;
}
The asm program doesn't return an exit code. That exit code should be in RAX when the function returns. BTW: argc is always >0 since the first string of argv holds the program name.
The main function is both "caller" (calls puts) and "callee" (returns to the GCC environment). As caller it has to preserve RAX and RSI before the call to puts and restore them when it needs them. A callee-saved register is not used. Don't forget to align the stack by 16.
This works:
extern puts
global main
section .text
main: ; RDI: argc, RSI: argv, stack is unaligned by 8
mov rax, rdi
label:
test rax, rax
je exit
push rbx ; Push 8 bytes to align the stack before the call
push rax ; Save it (caller-saved)
push rsi ; Save it (caller-saved)
mov rdi, [rsi] ; Argument for puts
call puts
pop rsi ; Restore it
pop rax ; Restore it
pop rbx ; "Unalign" the stack
dec rax
add rsi, 8
jmp label
exit:
; pop rsi ; Once too much
xor eax, eax ; RAX = 0 (return 0)
ret ; RAX: return value

x64 bit assembly

I started assembly (nasm) programming not too long ago. Now I made a C function with assembly implementation which prints an integer. I got it working using the extended registers, but when I want to write it with the x64 registers (rax, rbx, ..) my implementation fails. Does any of you see what I missed?
main.c:
#include <stdio.h>
extern void printnum(int i);
int main(void)
{
printnum(8);
printnum(256);
return 0;
}
32 bit version:
; main.c: http://pastebin.com/f6wEvwTq
; nasm -f elf32 -o printnum.o printnum.asm
; gcc -o printnum printnum.o main.c -m32
section .data
_nl db 0x0A
nlLen equ $ - _nl
section .text
global printnum
printnum:
enter 0,0
mov eax, [ebp+8]
xor ebx, ebx
xor ecx, ecx
xor edx, edx
push ebx
mov ebx, 10
startLoop:
idiv ebx
add edx, 0x30
push dx ; With an odd number of digits this will screw up the stack, but that's ok
; because we'll reset the stack at the end of this function anyway.
; Needs fixing though.
inc ecx
xor edx, edx
cmp eax, 0
jne startLoop
push ecx
imul ecx, 2
mov edx, ecx
mov eax, 4 ; Prints the string (from stack) to screen
mov ebx, 1
mov ecx, esp
add ecx, 4
int 80h
mov eax, 4 ; Prints a new line
mov ebx, 1
mov ecx, _nl
mov edx, nlLen
int 80h
pop eax ; returns the ammount of used characters
leave
ret
x64 version:
; main.c : http://pastebin.com/f6wEvwTq
; nasm -f elf64 -o object/printnum.o printnum.asm
; gcc -o bin/printnum object/printnum.o main.c -m64
section .data
_nl db 0x0A
nlLen equ $ - _nl
section .text
global printnum
printnum:
enter 0, 0
mov rax, [rbp + 8] ; Get the function args from the stac
xor rbx, rbx
xor rcx, rcx
xor rdx, rdx
push rbx ; The 0 byte of the string
mov rbx, 10 ; Dividor
startLoop:
idiv rbx ; modulo is in rdx
add rdx, 0x30
push dx
inc rcx ; increase the loop variable
xor rdx, rdx ; resetting the modulo
cmp rax, 0
jne startLoop
push rcx ; push the counter on the stack
imul rcx, 2
mov rdx, rcx ; string length
mov rax, 4
mov rbx, 1
mov rcx, rsp ; the string
add rcx, 4
int 0x80
mov rax, 4
mov rbx, 1
mov rcx, _nl
mov rdx, nlLen
int 0x80
pop rax
leave
ret ; return to the C routine
Thanks in advance!
I think your problem is that you're trying to use the 32-bit calling conventions in 64-bit mode. That won't fly, not if you're calling these assembly routines from C. The 64-bit calling convention is documented here: http://www.x86-64.org/documentation/abi.pdf
Also, don't open-code system calls. Call the wrappers in the C library. That way errno gets set properly, you take advantage of sysenter/syscall, you don't have to deal with the differences between the normal calling convention and the system-call argument convention, and you're insulated from certain low-level ABI issues. (Another of your problems is that write is system call number 1, not 4, for Linux/x86-64.)
Editorial aside: There are two, and only two, reasons to write anything in assembly nowadays:
You are writing one of the very few remaining bits of deep magic that cannot be written in C alone (a good example is the guts of libffi)
You are hand-optimizing an inner-loop subroutine that has been measured to be performance-critical and the C compiler doesn't do a good enough job on.
Otherwise just write whatever it is in C. Your successors will thank you.
EDIT: checked system call numbers.
I'm not sure if this answer is related to the problem you're seeing (since you didn't specify anything about what the failure is), but 64-bit code has a different calling convention than 32-bit code does. Both of the major 64-bit Intel ABIs (Windows & Linux/BSD/Mac OS) pass function parameters in registers and not on the stack. Your program appears to still be expecting them on the stack, which isn't the normal way to go about it.
Edit: Now that I see there is a C main() routine that calls your functions, my answer is exactly about the problem you're having.

Resources