Intel x86_64 assembly function calling conventions (Linux, Windows), stack arguments, stack manipulation - NASM

I call an assembly function from C (it lives in a separate asm file) and pass 6 pointer arguments to it.
On Linux (using NASM) the calling convention is: rdi, rsi, rdx, rcx, r8, r9, and the rest on the stack.
On Windows (using MASM) the calling convention is: rcx, rdx, r8, r9, and the rest on the stack.
How can I address the arguments that are passed on the stack on both platforms?
I read the 64-bit ABI and Microsoft's MSDN documentation, but I can't find a clear answer on how to use these arguments; all my attempts ended in a segmentation fault.
Another question: is it necessary in MASM to do this:
push rbp
mov rbp, rsp
at the beginning of my assembly function (and, of course, reverse it at the end)?
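As a rough sketch of where the stack arguments live (the function name sum_stack_args is made up for illustration, and this only exercises the Linux case): under the System V ABI six pointer arguments all fit in registers, so only a 7th and later argument end up on the stack. On entry the return address is at [rsp], so the 7th argument is at [rsp+8]; after push rbp / mov rbp, rsp it is at [rbp+16]. On Windows x64 the caller additionally reserves 32 bytes of shadow space above the return address, so the 5th argument is at [rsp+40] on entry, or [rbp+48] after the same prologue. The example embeds the assembly in a C file using GCC/Clang top-level asm (GAS, Intel syntax), assuming x86-64 Linux, so it builds in one step.

```c
#include <assert.h>

/* Hypothetical example function: returns a7 + a8, the two arguments
   that the System V ABI passes on the stack. */
long sum_stack_args(long a1, long a2, long a3, long a4,
                    long a5, long a6, long a7, long a8);

__asm__(
    ".pushsection .text\n"
    ".intel_syntax noprefix\n"
    ".globl sum_stack_args\n"
    ".type sum_stack_args, @function\n"
    "sum_stack_args:\n"
    "    push rbp\n"               /* [rbp]    = saved rbp       */
    "    mov  rbp, rsp\n"          /* [rbp+8]  = return address  */
    "    mov  rax, [rbp + 16]\n"   /* [rbp+16] = 7th argument    */
    "    add  rax, [rbp + 24]\n"   /* [rbp+24] = 8th argument    */
    "    pop  rbp\n"
    "    ret\n"
    ".size sum_stack_args, . - sum_stack_args\n"
    ".att_syntax prefix\n"
    ".popsection\n"
);

/* Usage check: a1..a6 travel in rdi, rsi, rdx, rcx, r8, r9 and are ignored. */
void check_stack_args(void) {
    assert(sum_stack_args(1, 2, 3, 4, 5, 6, 70, 800) == 870);
}
```

The push rbp / mov rbp, rsp pair is a convenience here, not an ABI requirement. Without it the same two arguments are reachable at [rsp+8] and [rsp+16], as long as rsp has not been changed since entry.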

Related

How to read a file using shellcode without explicitly mentioning syscall (0x0f05) with write permissions disabled?

I'm working on a CTF-like challenge that filters my shellcode to make sure it doesn't contain the hex encodings of the syscall, sysenter and int instructions (0x0f05, 0x0f34 and 0x80cd respectively). It has also disabled write permissions. I have shellcode that can open a file and print it with the sendfile system call, but it includes the syscall instruction. The previous challenge was the same but with write permissions enabled, and I successfully used self-modifying shellcode in that challenge to get the flag.
This is the assembly code (with syscall) I used to read the file "flag" (GAS, Intel syntax):
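.globl _start
_start:
.intel_syntax noprefix
    mov rbx, 0x67616c66     # "flag" in little-endian ASCII, upper bytes are zero
    push rbx                # the NUL-terminated string now lives on the stack
    mov rax, 2              # sys_open
    mov rdi, rsp            # pathname
    mov rsi, 0              # flags = O_RDONLY
    syscall
    mov rdi, 1              # out_fd = stdout
    mov rsi, rax            # in_fd = fd returned by open
    mov rdx, 0              # offset = NULL
    mov r10, 1000           # count
    mov rax, 40             # sys_sendfile
    syscall
    mov rax, 60             # sys_exit
    syscall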
I have been searching for an alternative way to do a system call in Linux for the past day but it seems impossible (I'm a newbie to assembly).
I read about an alternative way to do system calls by Call Gates method, but it seems to rely on the Global Descriptor Table and I don't think I can access that due to ASLR (Correct me if I'm wrong).
I'm not necessarily looking for an exact answer, just some help understanding how I can make a system call under these conditions.

Should %rsp be aligned to 16-byte boundary before calling a function in NASM?

I saw the following rule in NASM's documentation:
The stack pointer %rsp must be aligned to a 16-byte boundary before making a call. Fine, but the process of making a call pushes the return address (8 bytes) on the stack, so when a function gets control, %rsp is not aligned. You have to make that extra space yourself, by pushing something or subtracting 8 from %rsp.
I have a snippet of NASM assembly code, shown below.
%rsp is only 8-byte aligned when I call the function "inc" from "_start", which violates the rule quoted above. But everything actually works fine. How should I understand this?
I built this under Ubuntu 20.04 LTS (x86_64).
global _start
section .data
init:
    db 0x2
section .rodata
codes:
    db '0123456789abcdef'
section .text
inc:
    mov rax, [rsp+8]        ; read param from the stack;
    add rax, 0x1
    ret
print:
    lea rsi, [codes + rax]
    mov rax, 1
    mov rdi, 1
    mov rdx, 1
    syscall
    ret
_start:
    ; enable AC check;
    pushf
    or dword [rsp], 1<<18
    popf
    mov rdi, [init]         ; move the first 8 bytes of init to %rdi;
    push rdi                ; %rsp -> 8 bytes;
    call inc
    pop r11                 ; clean stack by the caller;
    call print
    mov rax, 60
    xor rdi, rdi
    syscall
The ABI is a set of rules for how functions should behave to be interoperable with each other. Each rule on one side is paired with an allowed assumption on the other. In this case, the rule about stack alignment for the caller is an allowed assumption about stack alignment for the callee. Since your inc function doesn't depend on 16-byte stack alignment, it's fine to call that particular function with a stack that's only 8-byte aligned.
If you're wondering why it didn't break when you enabled AC, that's because you're only loading 8-byte values from the stack, and the stack is still 8-byte aligned. If you did sub rsp, 4 or something to break 8-byte alignment too, then you would get a bus error.
Where the ABI becomes important is when the situation isn't one function you wrote yourself in assembly calling another function you wrote yourself in assembly. A function in someone else's library (including the C standard library), or one that you compiled from C instead of writing in assembly, is within its rights to do movaps [rsp - 24], xmm0 or something, which would break if you didn't properly align the stack before calling it.
Side note: the ABI also says how you're supposed to pass parameters (the calling convention), but you're just kind of passing them wherever. Again, fine from your own assembly, but they'll definitely break if you try to call them from C.
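To see the ABI's alignment assumption from the callee's side, here is a small sketch (assuming x86-64 Linux with GCC or Clang; entry_rsp_mod16 is a hypothetical name). The assembly routine just reports rsp modulo 16 at its first instruction. When compiled C code calls it, the compiler has aligned rsp to 16 bytes before the call, and the pushed return address then leaves it 8 bytes off, which is the situation the quoted rule describes.

```c
#include <assert.h>

/* Hypothetical helper: returns rsp % 16 as seen at the callee's
   first instruction, before anything else touches the stack. */
unsigned long entry_rsp_mod16(void);

__asm__(
    ".pushsection .text\n"
    ".intel_syntax noprefix\n"
    ".globl entry_rsp_mod16\n"
    ".type entry_rsp_mod16, @function\n"
    "entry_rsp_mod16:\n"
    "    mov  rax, rsp\n"    /* rsp right after the call pushed 8 bytes */
    "    and  eax, 15\n"     /* keep the low 4 bits; writing eax zeroes the rest of rax */
    "    ret\n"
    ".size entry_rsp_mod16, . - entry_rsp_mod16\n"
    ".att_syntax prefix\n"
    ".popsection\n"
);

/* An ABI-conforming caller aligns rsp to 16 before the call,
   so the callee starts with rsp % 16 == 8. */
void check_entry_alignment(void) {
    assert(entry_rsp_mod16() == 8);
}
```

A hand-written function that was itself called this way and wants to call into C or libc would restore the alignment first, for example with sub rsp, 8 before the call and add rsp, 8 after it.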

Why is RCX not used for passing parameters to system calls, being replaced with R10? [duplicate]

According to the System V x86-64 ABI, function calls in applications use the following sequence of registers to pass integer arguments:
rdi, rsi, rdx, rcx, r8, r9
But system call arguments (other than syscall number) are passed in another sequence of registers:
rdi, rsi, rdx, r10, r8, r9
Why does the kernel use r10 instead of rcx for the fourth argument? Is it somehow related to the fact that rcx is not preserved while r10 is?
x86-64 system calls use the syscall instruction. This instruction saves the return address in rcx and then loads rip from the IA32_LSTAR MSR. In other words, rcx is destroyed immediately by syscall, which is why rcx had to be replaced in the system call ABI.
The same syscall instruction also saves rflags into r11 and then masks rflags using the IA32_FMASK MSR. This is why r11 isn't preserved by the kernel.
So these changes reflect how the syscall mechanism works: the kernel has to declare rcx and r11 as not preserved, and it can't use them for parameter passing at all.
Reference: Intel's Instruction Set Reference, look for SYSCALL.
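A minimal sketch of what this means when issuing a raw system call from C with GCC/Clang inline assembly on x86-64 Linux (raw_syscall4 and query_signal_mask are made-up helpers, not libc functions): the fourth argument goes in r10, and rcx and r11 are listed as clobbered because the syscall instruction overwrites them. rt_sigprocmask is used only because it is a harmless four-argument call whose fourth argument, the signal set size, must be exactly 8 on x86-64 or the kernel returns -EINVAL.

```c
#include <assert.h>
#include <sys/syscall.h>

/* Hypothetical helper: issue a 4-argument system call directly. */
long raw_syscall4(long nr, long a1, long a2, long a3, long a4) {
    long ret;
    register long r10 __asm__("r10") = a4;   /* 4th argument: r10, not rcx */
    __asm__ volatile(
        "syscall"
        : "=a"(ret)                           /* result comes back in rax */
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3), "r"(r10)
        : "rcx", "r11", "memory");            /* syscall overwrites rcx and r11 */
    return ret;
}

/* Reads the current signal mask without changing it.
   how = 0 (SIG_BLOCK) is ignored by the kernel because the new set is NULL. */
long query_signal_mask(unsigned long *old_mask) {
    return raw_syscall4(SYS_rt_sigprocmask, 0, 0, (long)old_mask, 8);
}

void check_raw_syscall(void) {
    unsigned long mask = 0;
    assert(query_signal_mask(&mask) == 0);
}
```

If the size were passed in rcx instead, the kernel would read whatever happened to be in r10 and the call would most likely fail with -EINVAL.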

x86_64 Linux syscall arguments

I'm learning x86_64 assembly on Linux and I've run into some conflicting information that I was hoping to get cleared up. On one hand, I've read that for syscall arguments you use registers in the order rdi, rsi, rdx, and so on. On the other hand, I've read that you use the registers rbx, rcx, rdx, and so on. One person told me that the reason for this is the ABI, but I don't fully understand what that means.
Why are there two formats for this and which would be the proper one to use?
According to this Wikibooks page, it depends on which instruction you are using to perform the syscall.
If you are using int $0x80 (the 32-bit ABI with call numbers from asm/unistd_32.h), then you should use eax for the syscall number, and ebx, ecx, edx, esi, edi, and ebp for the parameters (in that order).
If you are using the syscall instruction (the 64-bit ABI with native call numbers from asm/unistd.h), you should use rax for the syscall number and rdi, rsi, rdx, r10, r8, and r9 for the parameters.
See also: What are the calling conventions for UNIX & Linux system calls on i386 and x86-64
In 64-bit mode syscall is preferred because it's faster, available on all x86-64 Linux kernels, and supports 64-bit pointers and integers. The int 0x80 ABI truly takes input in ecx, not rcx, for example, making it unusable for write or read of a buffer on the stack in 64-bit code.
See also: What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?

Assembly : Converting x86 code to x64 for a simple example

While learning x64 assembly, I'm trying to write an add function in assembly that adds two integers and returns the result.
I had working x86 code and tried to convert it to x64.
I simply changed the register names to their x64 equivalents. The object file is generated without errors, but when I call the function from C I always get sum = 0.
I think the problem is with the location of the arguments, and I haven't found good documentation on this.
section .text
global addi
addi:
    push rbp
    mov rbp, rsp
    mov rax,[rbp+12]
    mov rdx,[rbp+8]
    add rax, rdx
    pop rbp
    ret
On Linux, x86-64 has a calling convention defined by the System V AMD64 ABI, which every toolchain for the platform should follow (Windows uses a different one). Integer and pointer arguments are passed like this:
1st argument -> rdi
2nd -> rsi
3rd -> rdx
4th -> rcx
5th -> r8
6th -> r9
so it should look like:
section .text
global addi
addi:
    mov rax, rsi
    add rax, rdi
    ret
The x64 ABI mandates that the first few arguments are passed in registers, not on the stack, so reading them from [rbp+8] and [rbp+12] picks up unrelated stack contents.
See Stack frame layout on x86-64 for a nice explanation.

Resources