NASM (64-bit system call)

NASM (64-bit system call) - nasm

how do i store 2 digit data using extra segment(es) and appropriate offset....i tired this way..
SECTION .data
offset : db 0100H
data : db 56H
SECTION .bss
.......(initializations not required in this program)
SECTION .text
global main
main :
mov ax,1000H
mov es,ax
mov si,[offset]
mov al,[data]
mov es:[si],al
mov rdi,0
mov rax,60
syscall
What is wrong in this? I'm completely new in nasm programming...

Directly accessing a certain memory address using the segent pointers only work in 16 bit real mode. In 64 bit protected mode the physical memory addresses are mapped to application visible addresses by the MMU. This is typically managed by the OS while user space applications are simply not allowed to temper with it. If you want to access certain physical memory you need to tell the OS map it into your memory space somehow.

Related

Is it possible for a program to read itself?

Theoretical question. But let's say I have written an assembly program. I have "labelx:" I want the program to read at this memory address and only this size and print to stdout.
Would it be something like
jmp labelx
And then would i then use the Write syscall , making sure to read from the next instruction from labelx:
mov rsi,rip
mov rdi,0x01
mov rdx,?
mov rax,0x01
syscall
to then output to stdout.
However how would I obtain the size to read itself? Especially if there is a
label after the code i want to read or code after. Would I have to manually
count the lines?
mov rdx,rip+(bytes*lines)
And then syscall with populated registers for the syscall to write to from rsi to rdi. Being stdout.
Is this Even possible? Would i have to use the read syscall first, as the write system call requires rsi to be allocated memory buffer. However I assumed .text is already allocated memory and is read only. Would I have to allocate onto the stack or heap or a static buffer first before write, if it's even possible in the first place?
I'm using NASM syntax btw. And pretty new to assembly. And just a question.

Yes, the .text section is just bytes in memory, no different from section .rodata where you might normally put msg: db "hello", 10. x86 is a Von Neumann architecture (not Harvard), so there's no distinction between code pointers and data pointers, other than what you choose to do with them. Use objdump -drwC -Mintel on a linked executable to see the machine-code bytes, or GDB's x command in a running process, to see bytes anywhere.
You can get the assembler to calculate the size by putting labels at the start/end of the part you want, and using mov edx, prog_end - prog_start in the code at the point where you want that size in RDX.
See How does $ work in NASM, exactly? for more about subtracting two labels (in the same section) to get a size. (Where $ is an implicit label at the start of the current line, although $ isn't likely what you want here.)
To get the current address into a register, you need a RIP-relative LEA, not mov, because RIP isn't a general-purpose register and there's no special form of mov that reads it.
here:
lea rsi, [rel here] ; with DEFAULT REL you could just use [here]
mov edi, 1 ; stdout fileno
mov edx, .end - here ; assemble-time constant size calculation
mov eax, 1 ; __NR_write
syscall
.end:
This is fully position-independent, unlike if you used mov esi, here. (How to load address of function or label into register)
The LEA could use lea rsi, [rel $] to assemble to the same machine-code bytes, but you want a label there so you can subtract them.
I optimized your MOV instructions to use 32-bit operand-size, implicitly zero-extending into the full 64-bit RDX and RAX. (And RDI, but write(int fd, void *buf, size_t len) only looks at EDI anyway for the file descriptor).
Note that you can write any bytes of any section; there's nothing special about having a block of code write itself. In the above example, put the start/end labels anywhere. (e.g. foo: and .end:, and mov edx, foo.end - foo taking advantage of how NASM local labels work, by appending to the previous non-local label, so you can reference them from somewhere else. Or just give them both non-dot names.)

Why do x86-64 Linux system calls modify RCX, and what does the value mean?

I'm trying to allocate some memory in linux with sys_brk syscall. Here is what I tried:
BYTES_TO_ALLOCATE equ 0x08
section .text
global _start
_start:
mov rax, 12
mov rdi, BYTES_TO_ALLOCATE
syscall
mov rax, 60
syscall
The thing is as per linux calling convention I expected the return value to be in rax register (pointer to the allocated memory). I ran this in gdb and after making sys_brk syscall I noticed the following register contents
Before syscall
rax 0xc 12
rbx 0x0 0
rcx 0x0 0
rdx 0x0 0
rsi 0x0 0
rdi 0x8 8
After syscall
rax 0x401000 4198400
rbx 0x0 0
rcx 0x40008c 4194444 ; <---- What does this value mean?
rdx 0x0 0
rsi 0x0 0
rdi 0x8 8
I do not quite understand the value in the rcx register in this case. Which one to use as a pointer to the beginning of 8 bytes I allocated with sys_brk?

The system call return value is in rax, as always. See What are the calling conventions for UNIX & Linux system calls on i386 and x86-64.
Note that sys_brk has a slightly different interface than the brk / sbrk POSIX functions; see the C library/kernel differences section of the Linux brk(2) man page. Specifically, Linux sys_brk sets the program break; the arg and return value are both pointers. See Assembly x86 brk() call use. That answer needs upvotes because it's the only good one on that question.
The other interesting part of your question is:
I do not quite understand the value in the rcx register in this case
You're seeing the mechanics of how the syscall / sysret instructions are designed to allow the kernel to resume user-space execution but still be fast.
syscall doesn't do any loads or stores, it only modifies registers. Instead of using special registers to save a return address, it simply uses regular integer registers.
It's not a coincidence that RCX=RIP and R11=RFLAGS after the kernel returns to your user-space code. The only way for this not to be the case is if a ptrace system call modified the process's saved rcx or r11 value while it was inside the kernel. (ptrace is the system call gdb uses). In that case, Linux would use iret instead of sysret to return to user space, because the slower general-case iret can do that. (See What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code? for some walk-through of Linux's system-call entry points. Mostly the entry points from 32-bit processes, not from syscall in a 64-bit process, though.)
Instead of pushing a return address onto the kernel stack (like int 0x80 does), syscall:
sets RCX=RIP, R11=RFLAGS (so it's impossible for the kernel to even see the original values of those regs before you executed syscall).
masks RFLAGS with a pre-configured mask from a config register (the IA32_FMASK MSR). This lets the kernel disable interrupts (IF) until it's done swapgs and setting rsp to point to the kernel stack. Even with cli as the first instruction at the entry point, there'd be a window of vulnerability. You also get cld for free by masking off DF so rep movs / stos go upward even if user-space had used std.
Fun fact: AMD's first proposed syscall / swapgs design didn't mask RFLAGS, but they changed it after feedback from kernel developers on the amd64 mailing list (in ~2000, a couple years before the first silicon).
jumps to the configured syscall entry point (setting CS:RIP = IA32_LSTAR). The old CS value isn't saved anywhere, I think.
It doesn't do anything else, the kernel has to use swapgs to get access to an info block where it saved the kernel stack pointer, because rsp still has its value from user-space.
So the design of syscall requires a system-call ABI that clobbers registers, and that's why the values are what they are.

Can't understand assembly mov instruction between register and a variable

I am using NASM assembler on linux 64 bit.
There is something with variables and registers I can't understand.
I create a variable named "msg":
msg db "hello, world"
Now when I want to write to the stdout I move the msg to rsi register, however I don't understand the mov instruction bitwise ... the rsi register consists of 64 bit , while the msg variable has 12 symbols which is 8 bits each , which means the msg variable has a size of 12 * 8 bits , which is greater than 64 bits obviously.
So how is this even possible to make an instruction like:
mov rsi, msg , without overflowing the memory allocated for rsi.
Or does the rsi register contain the memory location of the first symbol of the string and after writing 1 symbol it changes to the memory location of the next symbol?
Sorry if I wrote complete nonsense, I'm new to assembly and i just can't get the grasp of it for a while.

In NASM syntax (unlike MASM syntax) mov rsi, symbol puts the address of the symbol into RSI. (Inefficiently with a 64-bit absolute immediate; use a RIP-relative LEA or mov esi, symbol instead. How to load address of function or label into register in GNU Assembler)
mov rsi, [symbol] would load 8 bytes starting at symbol. It's up to you to choose a useful place to load 8 bytes from when you write an instruction like that.
mov rsi, msg ; rsi = address of msg. Use lea rsi, [rel msg] instead
movzx eax, byte [rsi+1] ; rax = 'e' (upper 7 bytes zeroed)
mov edx, [msg+6] ; rdx = ' wor' (upper 4 bytes zeroed)
Note that you can use mov esi, msg because symbol addresses always fit in 32 bits (in the default "small" code model, where all static code/data goes in the low 2GB of virtual address space). NASM makes this optimization for you with assemble-time constants (like mov rax, 1), but probably it can't with link-time constants. Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?
and after writing 1 symbol it changes to the memory location of the next symbol?
No, if you want that you have to inc rsi. There is no magic. Pointers are just integers that you manipulate like any other integers, and strings are just bytes in memory.
Accessing registers doesn't magically modify them.
There are instructions like lodsb and pop that load from memory and increment a pointer (rsi or rsp respectively), but x86 doesn't have any pre/post-increment/decrement addressing modes, so you can't get that behaviour with mov even if you want it. Use add/sub or inc/dec.

Disclaimer: I'm not familiar with the flavor of assembly that you're dealing with, so the following is more general. The particular flavor may have more features than what I'm used to. In general, assembly deals with single byte/word entities where the size depends on the processor. I've done quite a bit of work on 8 and 16-bit processors, so that is where my answer is coming from.
General statements about Assembly:
Assembly is just like a high level language, except you have to handle a lot more of the details. So if you're used to some operation in say C, you can start there and then break the operation down even further.
For instance, if you have declared two variables that you want to add, that's pretty easy in C:
x = a + b;
In assembly, you have to break that down further:
mov R1, a * get value from a into register R1
mov R2, b * get value from b into register R2
add R1,R2 * perform the addition (typically goes into a particular location I'll call it the accumulator
mov x, acc * store the result of the addition from the accumulator into x
Depending on the flavor of assembly and the processor, you may be able to directly refer to variables in the addition instruction, but like I said I would have to look at the specific flavor you're working with.
Comments on your specific question:
If you have a string of characters, then you would have to move each one individually using a loop of some sort. I would set up a register to contain the starting address of your string, and then increment that register after each character is moved. It acts like a pointer in C. You will need to have some sort of indication for the termination of the string or another value that tells the size of the string, so you know when to stop.

How to limit the address space of 32bit application on 64bit Linux to 3GB?

Is it possible to make 64bit Linux loader to limit the address space of the loaded 32bit program to some upper limit?
Or to set some holes in the address space that to not be allocated by the kernel?
I mean for specific executable, not globally for all processes, neither through kernel configuration. Some code or ELF executable flags are examples of appropriate solution.
The limit should be forced for all loaded shared libraries as well.
Clarification:
The problem I want to fix is that my code uses the numbers above 0xc0000000 as a handle values and I want to clearly distinct between handle values and memory addresses, even when the memory addresses are allocated and returned by some third party library function.
As long as the address space in 64bit Linux is very close to 4G limit, there is no enough addressing space left for the handle values.
On the other hand 3GB or even less is far enough for all my needs.

OK, I found the answer of this question elsewhere.
The solution is to change the "personality" of your program to PER_LINUX32_3GB, using the Linux system call sys_personality.
But there is a problem. After switching to PER_LINUX32_3GB Linux kernel will not allocate space in the upper 1GB, but the already allocated space, for example the application stack, remains there.
The solution is to "restart" your program through sys_execve system call.
Here is the code where I packed everything in one:
proc ___SwitchLinuxTo3GB
begin
cmp esp, $c0000000
jb .finish ; the system is native 32bit
; check the current personality.
mov eax, sys_personality
mov ebx, -1
int $80
; and exit if it is what intended
test eax, ADDR_LIMIT_3GB
jnz .finish ; everything is OK.
; set the needed personality
mov eax, sys_personality
mov ebx, PER_LINUX32_3GB
int $80
; and restart the process
mov eax, [esp+4] ; argument count
mov ebx, [esp+8] ; the filename of the executable.
lea ecx, [esp+8] ; the arguments list.
lea edx, [ecx+4*eax+4] ; the environment list.
mov eax, sys_execve
int $80
; if something gone wrong, it comes here and stops!
int3
.finish:
return
endp

x86 memory access segmentation fault

I am learning x86 assembly out of curiosity. I'm currently using a Linux based OS with the NASM assembler. I am having a difficult time understanding why
SECTION .text
global _start
_start:
nop
mov ebx, 25
mov [0xFFF], ebx
;Exit the program
mov eax, 1
mov ebx, 0
int 0x80
Would lead to a segmentation fault (when moving the contents of the ebx register to memory location 0xFFF). I was thinking that building a program in pure asm would give me unrestricted access to my process' virtual address space. Is this not the case?
How would you implement something like a heap in assembly?

On Linux(x86) - although you have a virtual address range of 4gb in your process, not all of it is accessible. The upper 1gb is where the kernel resides, and there are areas of low memory that can't be used. Virtual memory address 0xfff can't be written to or read from (by default) so your program crashes with a segfault.
In a followup comment you suggested you were intending to create a heap in assembler. That can be done, and one method is to use the sys_brk system call. It is accessed via int 0x80 and EAX=45 . It takes a pointer in EBX representing the new top of the heap. Generally the bottom of the heap area is initialized to the area just beyond your programs data segment(above your program in memory). To get the address of the initial heap location you can call sys_break with EBX set to 0. After the system call EAX will be the current base pointer of the heap. You can save that away when you need to access your heap memory or allocate more heap space.
This code provides an example for purposes of clarity (not performance), but might be a starting point to understanding how you can manipulate the heap area:
SECTION .data
heap_base: dd 0 ; Memory address for base of our heap
SECTION .text
global _start
_start:
; Use `brk` syscall to get current memory address
; For the bottom of our heap This can be achieved
; by calling brk with an address (EBX) of 0
mov eax, 45 ; brk system call
xor ebx, ebx ; don't request additional space, we just want to
; get the memory address for the base of our processes heap area.
int 0x80
mov [heap_base], eax ; Save the heap base
;Now allocate some space (8192 bytes)
mov eax, 45 ; brk system call
mov ebx, [heap_base] ; ebx = address for base of heap
add ebx, 0x2000 ; increase heap by 8192 bytes
int 0x80
; Example usage
mov eax, [heap_base] ; Get pointer to the heap's base
mov dword [eax+0xFFF], 25 ; mov value 25 to DWORD at heapbase+0xFFF
;Exit the program
mov eax, 1
xor ebx, ebx
int 0x80

You don't have unrestricted RAM. Furthermore, you don't have unrestricted access to the part of your address space which is backed by RAM. Code pages are mapped read-only. And as a ring-3 program, you can't change that yourself.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string