how to properly display a value from the stack using a system call [duplicate] - linux

I am trying to use the write syscall in order to reproduce the putchar function behavior which prints a single character. My code is as follows,
asm_putchar:
push rbp
mov rbp, rsp
mov r8, rdi
call:
mov rax, 1
mov rdi, 1
mov rsi, r8
mov rdx, 1
syscall
return:
mov rsp, rbp
pop rbp
ret

From man 2 write, you can see the signature of write is,
ssize_t write(int fd, const void *buf, size_t count);
It takes a pointer (const void *buf) to a buffer in memory. You can't pass it a char by value, so you have to store it to memory and pass a pointer.
(Don't print one char at a time unless you only have one to print, that's really inefficient. Construct a buffer in memory and print that. e.g. this x86-64 Linux NASM function: How do I print an integer in Assembly Level Programming without printf from the c library?)
A NASM version of GCC: putchar(char) in inline assembly:
; x86-64 System V calling convention: input = byte in DIL
; clobbers: RDI, RSI, RDX, RCX, R11 (last 2 by syscall itself)
; returns: RAX = write return value: 1 for success, -1..-4095 for error
writechar:
mov byte [rsp-4], dil ; store the char from RDI
mov edi, 1 ; EDI = fd=1 = stdout
lea rsi, [rsp-4] ; RSI = buf
mov edx, edi ; RDX = len = 1
syscall ; rax = write(1, buf, 1)
ret
If you do pass an invalid pointer in RSI, such as '2' (integer 50), the system call will return -EFAULT (-14) in RAX. (The kernel returns error codes on bad pointers to system calls, instead of delivering a SIGSEGV like it would if you deref in user-space).
See also What are the return values of system calls in Assembly?
Instead of writing code to check return values, in toy programs / experiments you should just run them under strace ./a.out, especially if you're writing your own _start without libc there won't be any other system calls during startup that you don't make yourself, so it's very easy to read the output. How should strace be used?

Related

Converting C code to x86-64 bit Assembly?

I am trying to create a random letter generator for my string and I've been given a C code and will have to convert it into Assembly language for my program. I'm doing this in x86-64 bit NASM assembly language. I'm supposed to be using only system calls and not C/C++ function calls.
Here's the C/C++ code that I've got to convert:
int genran(int x,int y)
{
int a = 0;
a = a + x * 1103515245 + 12345;
return (unsigned int)(a / 65536) % (y + 1);
}
I am new to Assembly and any help is appreciated, here's what I've got so far. I know its somewhat wrong but I will be working on improving it:
section .data
string db "The random string is generated below: "
len_string equ $-string
a dd 0
x dd ?
y dd ?
rem dd 0
section .bss
string_buff resb 21 ;Our string's length is 20 characters
section .txt
global main
main:
mov rax, 1
mov rdi, 1
mov rsi, string
mov rdx, len_string
syscall
mov rax, a
mov rbx, x ;we have to come up with a value of x?
mul rbx, 1103515245
add rbx, 12345
mov rax, rbx
div rax, 65536
mov rdx, y ;we have to come up with a value of y?
add rdx, 1
;mod rax, rdx
;ret
exit:
mov rax, 60
xor rdi, rdi
syscall
mov rax, rbx
div rax, 65536
Please read the instructions; this doesn't exist. div is an opcode that only takes one argument. The code should look like
mov rax, rbx
xor rdx, rdx ; Div takes rdx:rax as an implicit 128 bit argument.
mov rcx, 65536
div rcx
At this skill level I don't recommend trying to use read or write system calls. I very much instead recommend calling the C standard library until you get the hang of it. read and write have too many gotchas. You will want to write them once and include them in a mini-library any more assembly work you do.
We can skip read by taking argument off the command line as follows:
mov rbx, [rsp + 16] ; argv[1]
Unless I'm very much mistaken, the first thing on the stack is argc, and then argv[0], then argv[1], ...

Reading a single-key input on Linux (without waiting for return) using x86_64 sys_call

I want to make Linux just take 1 keystroke from keyboard using sys_read, but sys_read just wait until i pressed enter. How to read 1 keystroke ? this is my code:
Mov EAX,3
Mov EBX,0
Mov ECX,Nada
Mov EDX,1
Int 80h
Cmp ECX,49
Je Do_C
Jmp Error
I already tried using BIOS interrupt but it's failed (Segmentation fault), I want capture number 1 to 8 input from keyboard.
Syscalls in 64-bit linux
The tables from man syscall provide a good overview here:
arch/ABI instruction syscall # retval Notes
──────────────────────────────────────────────────────────────────
i386 int $0x80 eax eax
x86_64 syscall rax rax See below
arch/ABI arg1 arg2 arg3 arg4 arg5 arg6 arg7 Notes
──────────────────────────────────────────────────────────────────
i386 ebx ecx edx esi edi ebp -
x86_64 rdi rsi rdx r10 r8 r9 -
I have omitted the lines that are not relevant here. In 32-bit mode, the parameters were transferred in ebx, ecx, etc and the syscall number is in eax. In 64-bit mode it is a little different: All registers are now 64-bit wide and therefore have a different name. The syscall number is still in eax, which now becomes rax. But the parameters are now passed in rdi, rsi, etc. In addition, the instruction syscall is used here instead of int 0x80 to trigger a syscall.
The order of the parameters can also be read in the man pages, here man 2 ioctl and man 2 read:
int ioctl(int fd, unsigned long request, ...);
ssize_t read(int fd, void *buf, size_t count);
So here the value of int fd is in rdi, the second parameter in rsi etc.
How to get rid of waiting for a newline
Firstly create a termios structure in memory (in .bss section):
termios:
c_iflag resd 1 ; input mode flags
c_oflag resd 1 ; output mode flags
c_cflag resd 1 ; control mode flags
c_lflag resd 1 ; local mode flags
c_line resb 1 ; line discipline
c_cc resb 19 ; control characters
Then get the current terminal settings and disable canonical mode:
; Get current settings
mov eax, 16 ; syscall number: SYS_ioctl
mov edi, 0 ; fd: STDIN_FILENO
mov esi, 0x5401 ; request: TCGETS
mov rdx, termios ; request data
syscall
; Modify flags
and byte [c_lflag], 0FDh ; Clear ICANON to disable canonical mode
; Write termios structure back
mov eax, 16 ; syscall number: SYS_ioctl
mov edi, 0 ; fd: STDIN_FILENO
mov esi, 0x5402 ; request: TCSETS
mov rdx, termios ; request data
syscall
Now you can use sys_read to read in the keystroke:
mov eax, 0 ; syscall number: SYS_read
mov edi, 0 ; int fd: STDIN_FILENO
mov rsi, buf ; void* buf
mov rdx, len ; size_t count
syscall
Afterwards check the return value in rax: It contains the number of characters read.
(Or a -errno code on error, e.g. if you closed stdin by running ./a.out <&- in bash. Use strace to print a decoded trace of the system calls your program makes, so you don't need to actually write error handling in toy experiments.)
References:
What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?
Why does the sys_read system call end when it detects a new line?
How do i read single character input from keyboard using nasm (assembly) under ubuntu?
Using the raw keyboard mode under Linux (external site with example in 32-bit assembly)

Why won't this stack string print in x64 NASM on macOS?

I've been able to successfully print a string using the sys_write to stdout on macOS. However, I cannot get this stack string to print using execve syscall with echo:
global _main
default rel
section .text
_main:
mov rbp, rsp
sub rsp, 32
mov rax, 'this a t'
mov [rbp-16], rax
mov rax, 'est'
mov [rbp-8], rax
mov rax, '/bin/ech'
mov [rbp-32], rax
xor rax, rax
mov al, 'o'
mov [rbp-24], rax
push 0
mov rax, 0
mov [rbp], rax
exit_program:
;rdi filename
;rsi argv
;rdx envp
lea rdi, [rbp-32]
lea rsi, [rbp-32]
mov rdx, 0
mov rax, 0x200003b
syscall
Currently, my return is EFAULT status code from execve.
The memory layout as shown in the screenshot is the string "This is a test" followed by null bytes for termination.
UPDATE: Trace output: execve("/bin/echo", [0x6863652f6e69622f, 0x6f, 0x7420612073696874, 0x747365], NULL) = -1 EFAULT (Bad address)
execve takes 3 args: a char* and two char *[] arrays, each terminated by a NULL pointer.
Your first arg is fine. It points to a zero-terminated array of ASCII characters which are a valid path.
Your argv is a char[], not char *[], because you passed the same value as your first arg! So when the system call interprets the data as an array of pointers to copy into the new process's arg array, it finds an invalid pointer 0x6863652f6e69622f as the first one. (The bytes of that pointer are ASCII codes.)
The trace output makes that pretty clear.
Your 3rd is NULL, not a pointer to NULL. Linux supports this, treating a NULL as an empty array. I don't know if MacOS does or not; if you still get EFAULT after passing a valid argv[] set RDX to a pointer to a qword 0 somewhere on the stack.
Keeping your existing setup code, you could change the last part to
lea rdi, [rbp-32] ; pointer to "/bin/echo"
push 0 ; NULL terminator
mov rdx, rsp ; envp = empty array
push some_reg ; holding a pointer to "this is a test"
push rdi ; pointer to "/bin/echo" = argv[0]
mov rsi, rsp ; argv
syscall
Note that envp[] and argv[] are terminated by the same NULL pointer. If you wanted a non-empty envp you couldn't do that.
If this is supposed to be shellcode, you're going to need to replace the push 0 with pushing an xor-zeroed register, and it looks like you could simplify some of the other stuff. But get it working first.

Writing a putchar in Assembly for x86_64 with 64 bit Linux?

I am trying to use the write syscall in order to reproduce the putchar function behavior which prints a single character. My code is as follows,
asm_putchar:
push rbp
mov rbp, rsp
mov r8, rdi
call:
mov rax, 1
mov rdi, 1
mov rsi, r8
mov rdx, 1
syscall
return:
mov rsp, rbp
pop rbp
ret
From man 2 write, you can see the signature of write is,
ssize_t write(int fd, const void *buf, size_t count);
It takes a pointer (const void *buf) to a buffer in memory. You can't pass it a char by value, so you have to store it to memory and pass a pointer.
(Don't print one char at a time unless you only have one to print, that's really inefficient. Construct a buffer in memory and print that. e.g. this x86-64 Linux NASM function: How do I print an integer in Assembly Level Programming without printf from the c library?)
A NASM version of GCC: putchar(char) in inline assembly:
; x86-64 System V calling convention: input = byte in DIL
; clobbers: RDI, RSI, RDX, RCX, R11 (last 2 by syscall itself)
; returns: RAX = write return value: 1 for success, -1..-4095 for error
writechar:
mov byte [rsp-4], dil ; store the char from RDI
mov edi, 1 ; EDI = fd=1 = stdout
lea rsi, [rsp-4] ; RSI = buf
mov edx, edi ; RDX = len = 1
syscall ; rax = write(1, buf, 1)
ret
If you do pass an invalid pointer in RSI, such as '2' (integer 50), the system call will return -EFAULT (-14) in RAX. (The kernel returns error codes on bad pointers to system calls, instead of delivering a SIGSEGV like it would if you deref in user-space).
See also What are the return values of system calls in Assembly?
Instead of writing code to check return values, in toy programs / experiments you should just run them under strace ./a.out, especially if you're writing your own _start without libc there won't be any other system calls during startup that you don't make yourself, so it's very easy to read the output. How should strace be used?

nasm segmentation fault while using arg

extern puts
global main
section .text
main:
mov rax, rdi
label:
test rax, rax
je exit
push rsi
mov rdi, [rsi]
call puts
pop rsi
dec rax
add rsi, 8
jmp label
exit:
pop rsi
ret
I wrote nasm code like that. However segmentation fault occur in last. I can't understand why segmentation fault is occur.
rax is not guaranteed to be preserved across function calls, as it is used to return integer results from functions (in the case of puts "a nonnegative number on success, or EOF on error") You need to save the value of rax before calling puts, like you're doing with rsi, and restore it afterwards.
Obviously you want to get the command line parameters in a GCC environment on a 64-bit Linux, where they are passed according to the GCC calling convention which follows the Linux calling convention "System V AMD64 ABI".
Let's translate the program logic to C:
#include <stdio.h>
int main ( int argc, char** argv )
{
if (argc != 0)
{
do
{
puts (*argv);
argc--;
argv++;
} while (argc);
}
return;
}
The asm program doesn't return an exit code. That exit code should be in RAX when the function returns. BTW: argc is always >0 since the first string of argv holds the program name.
The main function is both "caller" (calls puts) and "callee" (returns to the GCC environment). As caller it has to preserve RAX and RSI before the call to puts and restore them when it needs them. A callee-saved register is not used. Don't forget to align the stack by 16.
This works:
extern puts
global main
section .text
main: ; RDI: argc, RSI: argv, stack is unaligned by 8
mov rax, rdi
label:
test rax, rax
je exit
push rbx ; Push 8 bytes to align the stack before the call
push rax ; Save it (caller-saved)
push rsi ; Save it (caller-saved)
mov rdi, [rsi] ; Argument for puts
call puts
pop rsi ; Restore it
pop rax ; Restore it
pop rbx ; "Unalign" the stack
dec rax
add rsi, 8
jmp label
exit:
; pop rsi ; Once too much
xor eax, eax ; RAX = 0 (return 0)
ret ; RAX: return value

Resources