How can I allocate memory on the heap in x64 assembly? I want to store the value written by the sidt instruction, but I can't seem to find a way to do so.
I use Visual Studio 2012.
You have two options (assuming you're running in user space on top of an operating system):
1. Use whatever your operating system provides to map some writable memory (on UNIX: brk/sbrk/mmap).
2. Call the malloc function from the C standard library (which will do (1) under the hood for you).
I'd go for number 2 as it's much simpler and kind of portable.
Something similar to the following should do the trick:
movq $0x10, %rdi
callq malloc
# %rax will now contain the pointer to the memory
Assuming the AMD64 (System V AMD64 ABI) calling convention, that will call malloc(16), which should return a pointer to a 16-byte memory block. The address will be in the %rax register after the call returns (or NULL if the allocation fails).
EDIT: Wikipedia's article on the x86-64 calling conventions says that Microsoft uses a different convention (the first integer argument goes in RCX, not RDI). So on Windows you'd change movq $0x10, %rdi to movq $0x10, %rcx (and note that the Microsoft x64 ABI also expects the caller to reserve 32 bytes of shadow space on the stack before the call).
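For completeness, option 1 (asking the OS for memory directly) would look roughly like this in C on Linux; this is just a minimal sketch:

#define _DEFAULT_SOURCE        /* for MAP_ANONYMOUS on glibc */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* Ask the kernel for one page of anonymous, writable memory. */
    void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* 4096 writable bytes at p -- plenty of room for a 10-byte sidt result. */
    munmap(p, 4096);
    return 0;
}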
Judging by your environment, I'm guessing that you're writing assembly code on Windows. You'll need to use the Windows equivalent of an sbrk system call. You may find this MSDN reference useful!
Write the code to call malloc in C, then have the compiler produce an assembly listing; that will show you the name used for malloc (probably _malloc with Microsoft's 32-bit compilers, plain malloc in 64-bit builds) and how to call it.
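As a sketch of that workflow (listing flags: /FA for Microsoft's cl, -S for gcc; adjust for your toolchain), a tiny translation unit like this is enough to see the name and the call sequence:

/* alloc16.c -- "cl /c /FA alloc16.c" or "gcc -S alloc16.c" produces an
   assembly listing showing how the compiler names and calls malloc. */
#include <stdlib.h>
#include <string.h>

void *make_block(void)
{
    void *p = malloc(16);     /* enough room for the 10-byte IDTR value */
    if (p != NULL)
        memset(p, 0, 16);     /* touch the block so nothing gets optimized away */
    return p;
}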
Another option would be to allocate space on the stack by subtracting from rsp an amount equal to the size of a structure that will hold the sidt result.
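Staying in C, the stack version could look roughly like the sketch below. I'm assuming MSVC's __sidt intrinsic from <intrin.h> here; other compilers would need inline assembly instead.

#include <stdint.h>
#include <intrin.h>           /* MSVC: declares __sidt (assumed available here) */

#pragma pack(push, 1)
typedef struct {
    uint16_t limit;           /* sidt stores a 2-byte limit ... */
    uint64_t base;            /* ... followed by an 8-byte base in 64-bit mode */
} idtr_t;
#pragma pack(pop)

uint64_t read_idt_base(void)
{
    idtr_t idtr;              /* lives on the stack; no heap allocation needed */
    __sidt(&idtr);            /* the compiler's version of "sub rsp, N / sidt [rsp]" */
    return idtr.base;
}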
I'm learning x86 assembly and am using the code below for testing. In the gdb console I see that the rsp register, which points to the top of the stack, starts at 0x7FFFFFFFDFD0. If I understand correctly, the code doesn't use push or pop (which would modify rsp), so 0x7FFFFFFFDFD0 is a default value. Does this imply that every process starts with the same number of bytes on the stack? I'm using Linux, where the stack size is 8 MB.
section .text
global _start
_start:
mov rcx, 2
add rcx, 8
mov rax, 0x1
mov rbx, 0xff
int 0x80
For 64-bit 80x86, typically (see note 1) only 48 bits of a virtual address can be used. To make it easier to increase the number of usable bits in future processors without breaking older software, AMD decided that the unused highest 16 bits of a 64-bit virtual address must be copies of bit 47 (i.e. the address is sign-extended from 48 bits). Addresses that comply with this are called "canonical addresses", and addresses that don't are called "non-canonical addresses". Normally (see note 2) any attempt to access anything at a non-canonical address causes an exception (a general protection fault).
This gives a virtual address space like:
0x0000000000000000 to 0x00007FFFFFFFFFFF = canonical (often "user space")
0x0000800000000000 to 0xFFFF7FFFFFFFFFFF = non-canonical hole
0xFFFF800000000000 to 0xFFFFFFFFFFFFFFFF = canonical (often "kernel space")
This makes it reasonably obvious that, without Address Space Layout Randomization, a process' initial thread's stack (see note 3) begins at an address slightly lower than the highest address that a process can use.
The difference between the highest address a process can use and the address you're seeing (0x7FFFFFFFDFD0) is only about 8 KiB (0x2030 bytes), which (as mentioned in Fuz's comment) is taken up by things like ELF aux vectors, command line arguments and environment variables that consume part of the stack before your code is started.
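If you want to check the sign-extension rule programmatically, a small sketch like this (plain C; the constants are just the ranges listed above) classifies an address as canonical or not for 48-bit implementations:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Canonical (48-bit) means bits 63:48 are copies of bit 47, i.e.
   sign-extending the low 48 bits reproduces the original address. */
static bool is_canonical48(uint64_t addr)
{
    int64_t sext = (int64_t)(addr << 16) >> 16;   /* arithmetic shift sign-extends */
    return (uint64_t)sext == addr;
}

int main(void)
{
    printf("%d\n", is_canonical48(0x00007FFFFFFFDFD0ULL)); /* 1: user-space stack */
    printf("%d\n", is_canonical48(0x0000800000000000ULL)); /* 0: start of the hole */
    printf("%d\n", is_canonical48(0xFFFF800000000000ULL)); /* 1: "kernel space" half */
    return 0;
}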
Note 1: Intel recently (about 2 years ago?) created an extension (5-level paging, also known as LA57) that, if supported by CPU and OS, makes 57 bits of a virtual address usable. In this case the "non-canonical hole" shrinks, and the highest virtual address a process can use increases to 0x00FFFFFFFFFFFFFF.
Note 2: More recently (about 6 months ago?) Intel created an extension (Linear Address Masking) that, if supported by CPU and OS and enabled for a process, can make the unused higher bits of an address ignored by the CPU, so that software can pack other information into those bits (e.g. maybe a "data type") without doing explicit masking before use.
Note 3: Because operating systems typically provide no isolation between threads (e.g. any thread can corrupt any other thread's stack or any other thread's "thread local data"), if you create more threads they can't all use the same "top of stack" address.
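To see note 3 in practice, here's a minimal sketch (Linux/glibc pthreads assumed; build with -pthread) that prints the address of a stack local in the initial thread and in a second thread; the second thread's stack lives in a completely different region (typically an mmap'ed area well below the initial stack):

#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    int local;
    (void)arg;
    printf("second thread's stack is near  %p\n", (void *)&local);
    return NULL;
}

int main(void)
{
    int local;
    pthread_t t;
    printf("initial thread's stack is near %p\n", (void *)&local);
    pthread_create(&t, NULL, worker, NULL);
    pthread_join(t, NULL);
    return 0;
}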
Intel Core i5, Ubuntu 16.04.
I'm reading about memory paging here and am now trying to experiment with it. I wrote a simple assembly program to get a segmentation fault and ran it in gdb. Here it is:
section .text
global _start
_start:
mov rax, 0xFFFFFFFFFFFF0A31
mov [val], eax
mov eax, 4
mov ebx, 1
mov ecx, val
mov edx, 2
int 0x80
mov eax, 1
int 0x80
segment .bss
dummy resb 0xFFA
val resb 1
I assemble and link this into a 64-bit ELF static executable.
As far as I've read, each process has its own page table, which the cr3 register points to. Now I would like to look at the page table myself. Is it possible to find info about a process's page table in Linux?
You would need code running as a kernel module to read the page tables. I am sure there are projects out there that do that.
Take a look at https://github.com/jethrogb/ptdump, which seems to describe what you want.
You can see all the mappings your process has in /proc/PID/smaps. This tells you what you can access without getting a SIGSEGV.
This is not the same thing as your cr3 page table, because the kernel doesn't always "wire" all your mappings. i.e. a hardware page fault isn't always a SIGSEGV: the kernel page-fault handler checks whether your process logically has that memory mapped and corrects the situation, or whether you really did violate the memory protections.
After an mmap() system call, or on process startup to map the text / data / BSS segments, you logically have memory mapped, but Linux might have decided to be lazy and not provide any physical pages yet. (e.g. maybe the pages aren't in the pagecache yet, so there's no need to block until you try to actually touch that memory and get a page fault).
Or for BSS memory, multiple logical pages might start out copy-on-write mapped to the same physical page of zeros. Even though according to Unix semantics your memory is read-write, the page tables would actually have read-only mappings. Writing a page will page-fault, and the kernel will point that entry at a new physical page of zeros before returning to your process at the instruction which faulted (which will then be re-run and succeed).
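As a rough illustration of that laziness (Linux assumed; error handling trimmed), you can mmap a few anonymous pages and use mincore() to see which of them are resident before and after touching one:

#define _DEFAULT_SOURCE       /* for mincore() and MAP_ANONYMOUS */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char vec[4];

    char *p = mmap(NULL, 4 * page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 1;

    mincore(p, 4 * page, vec);            /* which of the 4 pages are resident? */
    printf("before touching: %d %d %d %d\n", vec[0] & 1, vec[1] & 1, vec[2] & 1, vec[3] & 1);

    p[0] = 1;                             /* fault in only the first page */
    mincore(p, 4 * page, vec);
    printf("after touching:  %d %d %d %d\n", vec[0] & 1, vec[1] & 1, vec[2] & 1, vec[3] & 1);

    munmap(p, 4 * page);
    return 0;
}

On a typical system this prints all zeroes before the write and 1 0 0 0 after it, matching the "no physical page until you touch it" behaviour described above.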
Anyway, this doesn't directly answer your question, but might be part of what you actually want. If you want to look under the hood, then sure have fun looking at the actual page tables, but you generally don't need to do that. smaps can tell you how much of a mapping is resident in memory.
See also what does pss mean in /proc/pid/smaps for details on what the fields mean.
BTW, see Why in 64bit the virtual address are 4 bits short (48bit long) compared with the physical address (52 bit long)? for a nice diagram of the 4-level page table format (and how 2M / 1G hugepages fit in).
I wrote a simple assembly program for getting a segmentation fault and ran it in gdb.... As far as I've read, each process has its own page table, which the cr3 register points to. Now I would like to look at the page table myself. Is it possible to find info about a process's page table in Linux?
The operating system maintains the page tables. They are protected from user-mode access, which is what your code is attempting.
To understand how protection works you are going to need to understand the difference between processor modes (e.g., Kernel and User) and how the processor shifts between these modes.
In short, however, trying to write code to examine page tables as you are doing is a dead end. You are better off learning about page table structure from books rather than trying to write code. I suggest looking at the Intel manuals.
https://software.intel.com/en-us/articles/intel-sdm
Sadly, this is rather dry and Intel writes the worst processor manuals I have seen. I recommend looking exclusively at 64-bit mode. Intel's 32-bit is overly complicated. If there is talk about segments, you are reading 32-bit and can ignore it. Intel's documentation never specifies whether addresses are physical or logical. So you may need to look at on-line lectures for clarification.
To supplement this reading, you can look at the Linux Source code. https://github.com/torvalds/linux
To conclude, it appears you need two prerequisites to get where you want to go: (1) processor modes; and (2) page table structure.
In 32 bit Intel architecture, the mmap2 system call has 6 parameters. The sixth parameter is stored in the ebp register. However, right before entering the kernel via sysenter, this happens (in linux-gate.so.1, the page of code mapped into user processes by the kernel):
push %ebp
movl %esp, %ebp
sysenter
This means that ebp should now have the stack pointer's contents in it instead of the sixth parameter. How does Linux get the parameter right?
That blog post you linked in comments has a link to Linus's post, which gave me the clue to the answer:
Which means that now the kernel can happily trash %ebp as part of the
sixth argument setup, since system call restarting will re-initialize
it to point to the user-level stack that we need in %ebp because
otherwise it gets totally lost.
I'm a disgusting pig, and proud of it to boot.
-- Linus Torvalds
It turns out sysenter is designed to require user-space to cooperate with the kernel in saving the return address and user-space stack pointer. (Upon entering the kernel, %esp will be the kernel stack.) It does way less stuff than int 0x80, which is why it's way faster.
After entry into the kernel, the kernel has user-space's %esp value in %ebp, which it needs anyway. It accesses the 6th param from the user-space stack memory, along with the return address for SYSEXIT. Immediately after entry, (%ebp) holds the 6th syscall param, because the stub pushed the register that holds it in the standard int 0x80 ABI (where user-space puts the 6th parameter directly in %ebp).
From Michael's comment: "Here's the 32-bit sysenter_target code: look at the part starting at line 417"
From Intel's instruction reference manual entry for SYSENTER (links in the x86 wiki):
The SYSENTER and SYSEXIT instructions are companion instructions, but
they do not constitute a call/return pair. When executing a SYSENTER
instruction, the processor does not save state information for the
user code (e.g., the instruction pointer), and neither the SYSENTER
nor the SYSEXIT instruction supports passing parameters on the stack.
To use the SYSENTER and SYSEXIT instructions as companion instructions
for transitions between privilege level 3 code and privilege level 0
operating system procedures, the following conventions must be
followed:
- The segment descriptors for the privilege level 0 code and stack segments and for the privilege level 3 code and stack segments must be contiguous in a descriptor table. This convention allows the processor to compute the segment selectors from the value entered in the SYSENTER_CS_MSR MSR.
- The fast system call “stub” routines executed by user code (typically in shared libraries or DLLs) must save the required return IP and processor state information if a return to the calling procedure is required. Likewise, the operating system or executive procedures called with SYSENTER instructions must have access to and use this saved return and state information when returning to the user code.
I'm using a memory-mapped file and I need to use an atomic store in Go. I would use StoreUint64() if I were working with regularly allocated memory. However, I'm not sure how atomic operations work on memory-mapped files.
Is it safe to use StoreUint64() on memory mapped files?
It's safe. For example, on amd64, it uses the XCHG instruction.
func StoreUint64
func StoreUint64(addr *uint64, val uint64)
StoreUint64 atomically stores val into *addr.
src/sync/atomic/asm_amd64.s:
TEXT ·StoreUint64(SB),NOSPLIT,$0-16
MOVQ addr+0(FP), BP
MOVQ val+8(FP), AX
XCHGQ AX, 0(BP)
RET
Intel 64 and IA-32 Architectures Software Developer's Manual
XCHG—Exchange Register/Memory with Register
Description
Exchanges the contents of the destination (first) and
source (second) operands. The operands can be two general-purpose
registers or a register and a memory location. If a memory operand is
referenced, the processor’s locking protocol is automatically
implemented for the duration of the exchange operation, regardless of
the presence or absence of the LOCK prefix or of the value of the
IOPL.
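For comparison only (a C11 sketch, not the Go runtime's code; the file name below is made up for the example): the same pattern with <stdatomic.h> on a MAP_SHARED file-backed mapping compiles to an xchg (or mov plus mfence) on x86-64, and is exactly as safe on mapped memory as on ordinary memory.

#include <fcntl.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("counter.bin", O_RDWR | O_CREAT, 0644);  /* hypothetical file */
    if (fd < 0 || ftruncate(fd, sizeof(uint64_t)) != 0)
        return 1;

    /* A shared, file-backed mapping: other processes mapping the same file
       observe the stores; atomicity works as for any other aligned memory. */
    _Atomic uint64_t *counter = mmap(NULL, sizeof(uint64_t),
                                     PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (counter == MAP_FAILED)
        return 1;

    atomic_store(counter, 42);            /* seq_cst store: xchg (or mov+mfence) */
    uint64_t v = atomic_load(counter);    /* plain aligned load on x86-64 */

    munmap((void *)counter, sizeof(uint64_t));
    close(fd);
    return v == 42 ? 0 : 1;
}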
I'm studying the Linux kernel and found out that on the x86_64 architecture the interrupt int 0x80 doesn't work for calling system calls (see footnote 1).
For the i386 architecture (32-bit x86 user-space), which is preferable: syscall or int 0x80, and why?
I use Linux kernel version 3.4.
Footnote 1: int 0x80 does work in some cases in 64-bit code, but is never recommended; see What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?
syscall is the default way of entering kernel mode on x86-64. This instruction is not available in 32-bit modes of operation on Intel processors.
sysenter is the instruction most frequently used to invoke system calls in 32-bit modes of operation. It is similar to syscall, though a bit more difficult to use, but that is the kernel's concern.
int 0x80 is a legacy way to invoke a system call and should be avoided.
The preferred way to invoke a system call is to use the vDSO, a part of memory mapped into each process's address space that allows system calls to be made more efficiently (for example, by not entering kernel mode at all in some cases). The vDSO also takes care of the handling of the syscall or sysenter instructions, which is more involved than the legacy int 0x80 method.
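If you want to confirm the vDSO really is mapped into your process, a tiny sketch (glibc's getauxval assumed) prints where the kernel put it; compare the value with the [vdso] line in /proc/self/maps:

#include <elf.h>              /* AT_SYSINFO_EHDR */
#include <stdio.h>
#include <sys/auxv.h>         /* getauxval (glibc >= 2.16) */

int main(void)
{
    /* AT_SYSINFO_EHDR is the load address of the vDSO's ELF header. */
    unsigned long vdso = getauxval(AT_SYSINFO_EHDR);
    printf("vDSO mapped at %#lx\n", vdso);
    return 0;
}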
Also, see this and this.
My answer here covers your question.
In practice, recent kernels implement a vDSO, notably to dynamically optimize system calls (the kernel sets the vDSO to whatever code is best for the current processor). So you should use the vDSO, and for existing syscalls you're better off using the interface provided by the C library.
Notice that, AFAIK, a significant part of the cost of simple syscalls is the transition from user space to the kernel and back. Hence, for some syscalls (probably gettimeofday, getpid, ...) the vDSO might avoid even that (and technically might avoid doing a real system call at all). For most syscalls (like open, read, send, mmap, ...) the in-kernel cost of the syscall is large enough to make any improvement in the user-space-to-kernel transition (e.g. using the SYSENTER or SYSCALL machine instructions instead of INT) insignificant.
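As a rough way to feel that difference (a sketch; Linux and glibc assumed, numbers will vary), time gettimeofday() through the libc/vDSO path against the same call forced through syscall(), which always enters the kernel:

#include <stdio.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>

#define N 1000000

static long elapsed_us(struct timeval a, struct timeval b)
{
    return (b.tv_sec - a.tv_sec) * 1000000L + (b.tv_usec - a.tv_usec);
}

int main(void)
{
    struct timeval tv, t0, t1;

    gettimeofday(&t0, NULL);
    for (int i = 0; i < N; i++)
        gettimeofday(&tv, NULL);              /* libc -> vDSO, usually no kernel entry */
    gettimeofday(&t1, NULL);
    long vdso_us = elapsed_us(t0, t1);

    gettimeofday(&t0, NULL);
    for (int i = 0; i < N; i++)
        syscall(SYS_gettimeofday, &tv, NULL); /* a real system call every iteration */
    gettimeofday(&t1, NULL);
    long raw_us = elapsed_us(t0, t1);

    printf("vDSO path: %ld us, raw syscall path: %ld us (%d calls each)\n",
           vdso_us, raw_us, N);
    return 0;
}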
Beware of this before changing: system call numbers differ between int 0x80 and syscall, e.g. sys_write is 4 with int 0x80 and 1 with syscall (see the sketch after the links below).
http://docs.cs.up.ac.za/programming/asm/derick_tut/syscalls.html for 32 bits or 0x80
http://blog.rchapman.org/post/36801038863/linux-system-call-table-for-x86-64 for syscall
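For instance, this small sketch prints glibc's SYS_write constant: built as a 64-bit program it prints 1 (the syscall table), built with -m32 it prints 4 (the int 0x80 table).

#include <stdio.h>
#include <sys/syscall.h>      /* SYS_write expands to the ABI's __NR_write */

int main(void)
{
    printf("SYS_write = %d\n", SYS_write);
    return 0;
}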