I'm trying to write an x86 assembly program for Linux that erases a file from a directory. Any tips?
Maybe something like this:
.section .data
fpath:
.asciz "/home/user/filename" # path to file to delete
.section .text
.globl _start
_start:
movl $10, %eax # unlink syscall
movl $fpath, %ebx # path to file to delete
int $0x80
movl %eax, %ebx # put syscall ret value in ebx
movl $1, %eax # exit syscall
int $0x80
Then check the exit status at the command line, 0 being success:
$>echo $?
When I tried this code, it would not unlink a file when invoked from the same
directory as the file to delete, but it did unlink files in other directories.
This is 32 bit code so on a 64 bit os if your source file was un-link.s,
you would need to create the executable with:
$> as --32 -gdwarf2 un-link.s -o un-link.o
leave out the -gdwarf2 if you don't need to run it with gdb debugger,
then link with:
$> ld -m elf_i386 un-link.o -o un-link
hope it works for you
I see a lot of people using int 0x80 for this. It is the legacy system-call entry and, for most things, you should not use it because it is considerably slower than sysenter. The only cases where you should use int 0x80 are when saving space is much more important than speed, or when sysenter is not supported on the target platform (pre-Pentium 4 CPUs).
In x86 Linux syscalls work by having the syscall code in eax and any arguments for it in the successive registers, ebx, ecx etc. The return value of the syscall will be placed in eax.
mov eax, 0xa ;0xa is the 'unlink' syscall, which removes a file
mov ebx, <location in memory of a null terminated string containing a path>
push <LABEL TO JUMP TO AFTER SYSENTER IS COMPLETE>
push ecx
push edx
push ebp
mov ebp, esp
sysenter
If you don't care about your registers being filled with garbage after the syscall finishes (such as in cases where you'll be overwriting the contents of those registers immediately anyway) you can use this trick to save some space and speed things up a little:
mov eax, 0xa ;0xa is the 'unlink' syscall, which removes a file
mov ebx, <location in memory of a null terminated string containing a path>
push <LABEL TO JUMP TO AFTER SYSENTER IS COMPLETE>
lea ebp, [esp-12]
sysenter
If you want to write assembly code that erases a file from a directory, there are several methods, but one of the best is the Linux system call 'unlink'. To use the proper system call, you first have to figure out whether you have an x86 system, an x86-64 system, or some other kind of system. I will cover only x86 and x86-64.
System calls work as follows in Linux:
1. You put the number of the system call into the return register (%eax or %rax, depending on the system). For example, $1 is sys_exit and $29 is sys_pause on x86. You have to make sure you get the right number in %eax or it won't work. (I feel the need to emphasize this: the number is SYSTEM DEPENDENT, so an x86 syscall number in %eax will not do the same thing on an x86-64 system.) Other OSes use their own numbering as well; I will only talk about Linux, and I can't speak for any other OS.
2. Then you move the arguments into the registers specified BY THE SYSTEM. Again, find a reference for your specific system and you will know which registers to put the arguments in. (Some syscalls, such as sys_pause, take no arguments, so this step is then unnecessary.)
3. Finally, you execute the system-call instruction for your system (int $0x80 on x86, syscall on x86-64) to have the kernel run the function you selected by putting its number into %eax/%rax.
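The three steps above can be sketched in C with glibc's generic syscall() wrapper, which loads the call number and arguments into the registers the platform expects and then traps into the kernel. This is only an illustration and not part of the original answer: the name raw_unlink is made up, and the fallback to SYS_unlinkat is an assumption for targets that have no plain unlink call.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>        /* AT_FDCWD */
#include <sys/syscall.h>  /* SYS_unlink, SYS_unlinkat: numbers for the build target */
#include <unistd.h>       /* syscall() */

/* Hypothetical helper: remove a file through the generic syscall() wrapper.
 * Returns 0 on success, or the errno value reported by the kernel. */
static int raw_unlink(const char *path)
{
#ifdef SYS_unlink
    /* Step 1: call number; step 2: one argument; step 3: trap into the kernel. */
    long ret = syscall(SYS_unlink, path);
#else
    /* Assumption: targets without unlink (e.g. AArch64) provide unlinkat. */
    long ret = syscall(SYS_unlinkat, AT_FDCWD, path, 0);
#endif
    return ret == 0 ? 0 : errno;
}
```

Taking the number from <sys/syscall.h> instead of hard-coding it avoids the mistake the first step warns about: SYS_unlink is 10 when building for 32-bit x86 and 87 when building for x86-64.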
Let's look at an example of some assembly code that deletes a file on two different systems, x86 and x86-64 (Windows has its own method, which I will not cover here). Suppose I want to delete/unlink the file whose path string is stored at address 0x7fff50 (let's say you don't want to define fpath and you know the address):
In x86:
movl $10, %eax # defines which systemcall we are using (10th)
movl $0x7fff50, %ebx # moves the address of file we want to delete into %ebx (this is where the argument of sys_unlink(x86) is stored)
int $0x80 # equivalent of syscall that starts the appropriate function
In x86-64:
movq $87, %rax # defines which systemcall we are using (87th)
movq $0x7fff50, %rdi # moves the address of file we want to delete into %rdi (this is where the argument of sys_unlink(x86-64) is stored)
syscall # starts the appropriate function
If you are more of a normal person and don't know the absolute address of the path string, there is an easy way to refer to it by a label, like so (copied from the other answer):
.section .data
fpath:
.asciz "/home/user/filename" # path to file to delete
The assembler and linker then work out the actual address of the path string (something like 7fff34fb) and fill it into the instruction when the program is built. It might look like this:
In x86-64:
.section .data
fpath:
.asciz "/home/user/filename" # path to file to delete
.section .text
.globl _start
_start:
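movq $87, %rax # defines which systemcall we are using (87th)
movq $fpath, %rdi # moves the address of file we want to delete into %rdi
# (this is where the argument of sys_unlink(x86-64) is stored)
syscall # starts the appropriate function
movq %rax, %rdi # pass the unlink result on as the exit status
movq $60, %rax # sys_exit (60th), so execution does not run past the end of the code
syscall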
You can access the result of the call by looking at the %rax register after the syscall instruction: it holds 0 on success, or a negative errno code on failure (for example -2, ENOENT, if the file does not exist).
You can put this text into a text file 'program.s' (I would suggest an editor such as Notepad++; '.s' is the usual extension for assembly source) and assemble it using gcc with $ gcc -c program.s to get the object code. You can then disassemble the object code and see the machine-code bytes for each instruction using $ objdump -d program.o. To get a runnable executable, link the object file with $ ld program.o -o program
For more information about which syscall has which number and which registers the arguments go in, go here:
For x86-64: https://filippo.io/linux-syscall-table/
So I need to recursively delete files in a directory using x86_64 assembly.
Here is my code, and I know it is bad. My problem is that every syscall works individually (I can individually delete directories or documents), but as soon as I merge them together like this, it doesn't work. #edit: as pointed out by #fuz, the question was not descriptive enough. I want it to open the directory that contains a file called "test.txt", delete that file, and then delete the directory that contained the file. But the program just exits when I assemble it with nasm and run it. I am using Linux Mint.
global _start
section .text
_start:
;open directory
mov rax, 2 ;sys_open
mov rdi, dir ;pointer to the directory
mov rsi, 0 ;read only
syscall
;delete document
mov rax, 87 ;sys_unlink
mov rdi, doc ;points to the document
syscall
;delete directory
mov rax, 84 ;sys_rmdir
mov rdi, dir
syscall
_exit:
mov rax, 60
mov rdi, 80
syscall
section .data
dir: db 'test',0
doc: db 'test.txt',0
Your problem is not with assembly, it's with understanding basic POSIX / Unix file manipulation system calls. open("dir") does not make later unlink / rmdir system calls relative to that directory, they're still relative to your current working directory. You can change that with chdir().
(And that's easier to do with C than with asm. For most system calls, the glibc wrapper is trivial and passes on all the args unchanged to the kernel. When that's not the case, the NOTES section of the Linux man page documents that.)
There are system calls that do things relative to an open directory file descriptor (instead of the CWD), whose names add an ...at suffix to the traditional system-call names. The documentation uses C syntax to describe the system calls, but given the ABI this tells you how to call them in asm.
unlinkat(int dirfd, const char *pathname, int flags)
fd-relative rmdir is actually done with unlinkat(fd, path, AT_REMOVEDIR). (Otherwise, with flags=0, unlinkat behaves like regular unlink).
linkat, symlinkat, readlinkat, fstatat, mkdirat, execveat, ...
various others, including renameat, and a fun renameat2 that takes flags allowing you to atomically swap two pathnames on the same filesystem.
The NOTES section of the openat man page explains why these ...at system calls exist in the first place, under "Rationale for openat()": they avoid race conditions between readdir and open if someone else renames a directory component of the path. And since the working directory set by chdir() is per-process, not per-thread, they let different threads do relative lookups in different directories at the same time.
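As a rough sketch of the fd-relative calls described above, here is how the question's "delete test.txt, then delete its directory" could look in C. The helper names are hypothetical, make_dir_with_file exists only to set up something to delete, and error handling is reduced to returning -1.

```c
#define _GNU_SOURCE
#include <fcntl.h>     /* open, openat, O_DIRECTORY, AT_FDCWD, AT_REMOVEDIR */
#include <sys/stat.h>  /* mkdir */
#include <unistd.h>    /* unlinkat, close */

/* Hypothetical setup helper: create dir/ containing one empty file. */
static int make_dir_with_file(const char *dir, const char *file)
{
    if (mkdir(dir, 0700) != 0)
        return -1;
    int dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0)
        return -1;
    int fd = openat(dirfd, file, O_WRONLY | O_CREAT | O_EXCL, 0600);
    close(dirfd);
    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/* Delete `file` relative to an open directory fd, then the directory itself. */
static int remove_file_then_dir(const char *dir, const char *file)
{
    int dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0)
        return -1;
    int rc = unlinkat(dirfd, file, 0);             /* like unlink(), but relative to dirfd */
    close(dirfd);
    if (rc != 0)
        return -1;
    return unlinkat(AT_FDCWD, dir, AT_REMOVEDIR);  /* like rmdir(dir), relative to the CWD */
}
```

In asm the same sequence is three system calls (open, unlinkat, unlinkat), with the fd returned by open passed as the first argument of the first unlinkat.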
Like Jester said, use strace find test -name 'test.txt' -delete or something like that to see how to actually recurse through directories with open(O_DIRECTORY), getdents, and unlinkat.
getdents is the raw system call that the POSIX readdir interface is built on top of, on Linux. The man pages document this. In asm, you can either use libc function calls to readdir, or you'd have to use getdents yourself.
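For the libc route, a recursive delete might be sketched as follows. It uses fdopendir/readdir (which sit on top of getdents on Linux) together with openat, fstatat and unlinkat. The name remove_tree_at is hypothetical, and the sketch assumes nothing else modifies the tree while it runs.

```c
#define _GNU_SOURCE
#include <dirent.h>    /* fdopendir, readdir, dirfd, closedir */
#include <fcntl.h>     /* openat, O_DIRECTORY, AT_REMOVEDIR, AT_SYMLINK_NOFOLLOW */
#include <string.h>    /* strcmp */
#include <sys/stat.h>  /* fstatat, S_ISDIR */
#include <unistd.h>    /* unlinkat, close */

/* Hypothetical helper: remove `name` (looked up relative to parentfd) and
 * everything below it. Returns 0 on success, -1 on the first failure. */
static int remove_tree_at(int parentfd, const char *name)
{
    struct stat st;
    if (fstatat(parentfd, name, &st, AT_SYMLINK_NOFOLLOW) != 0)
        return -1;
    if (!S_ISDIR(st.st_mode))
        return unlinkat(parentfd, name, 0);   /* file or symlink: plain unlink */

    int fd = openat(parentfd, name, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
    if (fd < 0)
        return -1;
    DIR *d = fdopendir(fd);                   /* on success, d owns fd */
    if (d == NULL) {
        close(fd);
        return -1;
    }

    int rc = 0;
    struct dirent *e;
    while (rc == 0 && (e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        rc = remove_tree_at(dirfd(d), e->d_name);  /* child is relative to this dir */
    }
    closedir(d);
    if (rc != 0)
        return -1;
    return unlinkat(parentfd, name, AT_REMOVEDIR); /* directory is now empty */
}
```

Calling remove_tree_at(AT_FDCWD, "test") would remove the question's test directory and its contents, resolved relative to the current working directory.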
Or since you aren't actually recursing, just hard-coding some relative paths, you could just make unlink("test/test.txt") and rmdir("test") system calls.
I'm trying to learn some assembly, and I'm starting out by outputting text to the screen. I'm starting to think it might be my environment and/or compilation: by now, I'm so frustrated that I've literally copy-pasted assembly code but it just won't call the system calls. Here is the source code (mainly adapted from https://en.wikibooks.org/wiki/X86_Assembly/Interfacing_with_Linux)
.section .data
msg: .ascii "Hello World\n"
.section .text
.global main
main:
movq $1, %rdi # write to stdout
movq $msg, %rsi # use string "Hello World"
movq $12, %rdx # write 12 characters
syscall # make syscall
movq $60, %rax # use the _exit syscall
movq $0, %rdi # error code 0
syscall # make syscall
I'm on a 64-bit machine running Kali Linux, and am compiling with GCC. Like so:
gcc -c test.s
gcc test.o -no-pie
I've debugged the program with GDB and the syscall instruction always sets the rax register to 0xffffffffffffffda (-38), which does not seem right...
Can anyone give an insight?
Syscalls usually return a negative value in case of error, the absolute value being the errno value itself.
In your case, 38 is ENOSYS: Function not implemented.
But which syscall function are you calling? Let's see: the function number is stored in rax (eax in 32-bit code) before issuing the syscall, and your program loads... nothing!
It looks like you lost one line in your copy/paste:
movq $1, %rax # use the write syscall
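The decoding of the raw return value described above can be sketched in C. The function names are hypothetical. The range check reflects the Linux convention that raw return values from -4095 to -1 are error codes, so 0xffffffffffffffda, read as a signed number, is -38.

```c
#include <errno.h>
#include <string.h>

/* Raw Linux syscall returns in [-4095, -1] are negated errno values;
 * anything else is a successful result. Returns the errno value, or 0. */
static int raw_ret_to_errno(long ret)
{
    if (ret < 0 && ret >= -4095)
        return (int)-ret;
    return 0;
}

/* Hypothetical convenience wrapper: human-readable text for a raw return value. */
static const char *raw_ret_describe(long ret)
{
    int err = raw_ret_to_errno(ret);
    return err ? strerror(err) : "success";
}
```

On x86-64 Linux, passing the value seen in GDB (-38) to raw_ret_to_errno gives ENOSYS, which matches the diagnosis of an invalid system-call number.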
Your code is missing the first instruction from the sample code:
movq $1, %rax # use the write syscall
Without this code, it ends up executing an unexpected (and probably invalid) system call, based on whatever happened to be in %rax when main was called.
The program writes an executable stored in its second segment to disk, decrypts it (into /tmp/decbd), and executes it (as planned).
The file decbd appears on disk and can be executed from a shell, but the last execve call returns eax=-14. After that, execution runs on into data and the program segfaults.
http://pastebin.com/KywXTB0X
In the second segment, after compilation, I manually placed an echo binary encrypted via openssl, using hexdump and dd. When I stopped execution right before the last int 0x80 instruction, I was already able to run my "echo" in decbd from another terminal.
1. You should have narrowed it down to a minimal example. See MCVE.
2. You should comment your code if you want other people to help.
3. You should learn to use the debugger and/or other tools.
For point #1, you could have gone down to:
section .text
global _start ;must be declared for linker (ld)
_start:
mov eax,11 ; execve syscall
mov ebx,program ; name of program
mov ecx,[esp+4] ; pointer to argument array
mov ebp,[esp] ; number of arguments
lea edx,[esp+4*ebp+2] ; pointer to environ array
int 0x80
section .data
program db '/bin/echo',0
For point #3, using the debugger you could have seen that:
ebx is okay
ebp is okay
ecx is wrong
edx is wrong
It's an easy fix. ecx should be loaded with the address, not the value and edx should be skipping 2 pointers which are 4 bytes each, so the offset should be 8 not 2. The fixed code could look like this:
section .text
global _start ;must be declared for linker (ld)
_start:
mov eax,11 ; execve syscall
mov ebx,program ; name of program
lea ecx,[esp+4] ; pointer to argument array
mov ebp,[esp] ; number of arguments
lea edx,[esp+4*ebp+8] ; pointer to environ array (skip argc and NULL)
int 0x80
section .data
program db '/bin/echo',0
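The corrected offset can be checked with a little arithmetic. The sketch below is in C, with the hypothetical name envp_offset, and counts the words between the initial stack pointer and the first environment pointer, assuming the startup layout argc, argv[0..argc-1], NULL, envp[0], and so on.

```c
#include <stddef.h>

/* Byte offset from the initial stack pointer to envp[0] on a target whose
 * pointers are word_size bytes wide: skip argc (1 word), the argc argv
 * pointers, and the NULL that terminates argv (1 word). */
static size_t envp_offset(size_t argc, size_t word_size)
{
    return word_size * (1 + argc + 1);
}
```

For 32-bit code this is 4*argc + 8, which is exactly the [esp+4*ebp+8] operand in the fixed lea when ebp holds argc.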
man execve says this in the "ERRORS" section with regard to return code -14 (-EFAULT):
EFAULT filename points outside your accessible address space.
You passed a bad pointer to execve().
section .text
global _start
_start:
nop
main:
mov eax, 1
mov ebx, 2
xor eax, eax
ret
I compile with these commands:
nasm -f elf main.asm
ld -melf_i386 -o main main.o
When I run the code, Linux throws a segmentation fault error
(I am using Linux Mint Nadia 64 bits). Why is this error produced?
Because ret is NOT the proper way to exit a program in Linux, Windows, or Mac!!!!
_start is not a function, there is no return address on the stack because there is no user-space caller to return to. Execution in user-space started here (in a static executable), at the process entry point. (Or with dynamic linking, it jumped here after the dynamic linker finished, but same result).
On Linux / OS X, the stack pointer is pointing at argc on entry to _start (see the i386 or x86-64 System V ABI doc for more details on the process startup environment); the kernel puts command line args into user-space stack memory before starting user-space. (So if you do try to ret, EIP/RIP = argc = a small integer, not a valid address. If your debugger shows a fault at address 0x00000001 or something, that's why.)
For Windows it is ExitProcess, and for Linux it is a system call: int 80H using sys_exit for x86, or syscall using 60 for 64-bit, or a call to exit from the C Library if you are linking to it.
32-bit Linux (i386)
%define SYS_exit 1 ; call number __NR_exit from <asm/unistd_32.h>
mov eax, SYS_exit ; use the NASM macro we defined earlier
xor ebx, ebx ; ebx = 0 exit status
int 80H ; _exit(0)
64-bit Linux (amd64)
mov rax, 60 ; SYS_exit aka __NR_exit from asm/unistd_64.h
xor rdi, rdi ; edi = 0 first arg to 64-bit system calls
syscall ; _exit(0)
(In GAS you can actually #include <sys/syscall.h> or <asm/unistd.h> to get the right numbers for the mode you're assembling a .S for, but NASM can't easily use the C preprocessor.
See Polygot include file for nasm/yasm and C for hints.)
32-bit Windows (x86)
push 0
call ExitProcess
Or Windows/Linux linking against the C Library
; pass an int exit_status as appropriate for the calling convention
; push 0 / xor edi,edi / xor ecx,ecx
call exit
(Or for 32-bit x86 Windows, call _exit, because C names get prepended with an underscore, unlike in x86-64 Windows. The POSIX _exit function would be call __exit, if Windows had one.)
Windows x64's calling convention includes shadow space which the caller has to reserve, but exit isn't going to return so it's ok to let it step on that space above its return address. Also, 16-byte stack alignment is required by the calling convention before call exit except for 32-bit Windows, but often won't actually crash for a simple function like exit().
call exit (unlike a raw exit system call or libc _exit) will flush stdio buffers first. If you used printf from _start, use exit to make sure all output is printed before you exit, even if stdout is redirected to a file (making stdout full-buffered, not line-buffered).
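The flushing difference can be observed with a small C experiment. This is only a sketch under assumptions: child_output_size is a hypothetical helper, and it uses a forked child writing to a temporary file so that the stream is fully buffered, as in the redirected-stdout case.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>    /* mkstemp, exit */
#include <sys/stat.h>  /* fstat */
#include <sys/wait.h>  /* waitpid */
#include <unistd.h>    /* fork, _exit, close, unlink */

/* Hypothetical experiment: a child writes "hello\n" through stdio into a
 * temporary file and terminates with exit() or _exit(). Returns the number
 * of bytes that reached the file, or -1 on setup failure. */
static long child_output_size(int use_libc_exit)
{
    char path[] = "/tmp/exit_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        FILE *out = fdopen(fd, "w");  /* regular file: fully buffered */
        if (out == NULL)
            _exit(1);
        fputs("hello\n", out);        /* stays in the stdio buffer for now */
        if (use_libc_exit)
            exit(0);                  /* flushes open stdio streams first */
        _exit(0);                     /* exits immediately, buffer is lost */
    }

    int status;
    waitpid(pid, &status, 0);
    struct stat st;
    long size = (fstat(fd, &st) == 0) ? (long)st.st_size : -1;
    close(fd);
    unlink(path);
    return size;
}
```

With exit() the six bytes arrive in the file; with _exit() the file stays empty, which is the same effect as losing printf output after a raw exit system call.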
It's generally recommended that if you use libc functions, you write a main function and link with gcc so it's called by the normal CRT start functions which you can ret to.
See also
Syscall implementation of exit()
How come _exit(0) (exiting by syscall) prevents me from receiving any stdout content?
Defining main as something that _start falls through into doesn't make it special, it's just confusing to use a main label if it's not like a C main function called by a _start that's prepared to exit after main returns.
This description is valid for Linux 32 bit:
When a Linux program begins, all pointers to command-line arguments are stored on the stack. The number of arguments is stored at 0(%ebp), the name of the program is stored at 4(%ebp), and the arguments are stored from 8(%ebp).
I need the same information for 64 bit.
Edit:
I have working code sample which shows how to use argc, argv[0] and argv[1]: http://cubbi.com/fibonacci/asm.html
.globl _start
_start:
popq %rcx # this is argc, must be 2 for one argument
cmpq $2,%rcx
jne usage_exit
addq $8,%rsp # skip argv[0]
popq %rsi # get argv[1]
call ...
...
}
It looks like the parameters are on the stack. Since this code is not clear, I am asking this question. My guess is that I can keep rsp in rbp and then access these parameters using 0(%rbp), 8(%rbp), 16(%rbp), etc. Is this correct?
Despite the accepted answer being more than sufficient, I would like to give an explicit answer, as there are some other answers which might confuse.
Most important (for more information see examples below): in x86-64 the command line arguments are passed via stack:
(%rsp) -> number of arguments
8(%rsp) -> address of the name of the executable
16(%rsp) -> address of the first command line argument (if exists)
... so on ...
It is different from the function parameter passing in x86-64, which uses %rdi, %rsi and so on.
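The layout above can be modelled in C as an array of pointer-sized words. The sketch below is illustrative only: struct startup_args and read_initial_stack are hypothetical names, and the function is meant to be fed either a simulated stack or the stack pointer value saved at _start.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the words found at the initial stack pointer. */
struct startup_args {
    long   argc;
    char **argv;   /* argv[0] is the program name, argv[argc] is NULL */
};

/* `words` points at the first word of the startup stack, i.e. (%rsp). */
static struct startup_args read_initial_stack(char **words)
{
    struct startup_args a;
    a.argc = (long)(intptr_t)words[0];  /* (%rsp): argument count, stored as a word */
    a.argv = words + 1;                 /* 8(%rsp): address of the program name */
    return a;
}
```

With this view, a.argv[1] corresponds to 16(%rsp), the first command line argument if one was given.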
One more thing: one should not deduce the behavior from reverse engineering of the C main-function. C runtime provides the entry point _start, wraps the command line arguments and calls main as a common function. To see it, let's consider the following example.
No C runtime/GCC with -nostdlib
Let's check this simple x86-64 assembler program, which does nothing but exit with status 42:
.section .text
.globl _start
_start:
movq $60, %rax #60 -> exit
movq $42, %rdi #return 42
syscall #run kernel
We build it with:
as --64 exit64.s -o exit64.o
ld -m elf_x86_64 exit64.o -o exit64
or with
gcc -nostdlib exit64.s -o exit64
run in gdb with
./exit64 first second third
and stop at the breakpoint at _start. Let's check the registers:
(gdb) info registers
...
rsi 0x0 0
rdi 0x0 0
...
Nothing there. What about the stack?
(gdb) x/5g $sp
0x7fffffffde40: 4 140737488347650
0x7fffffffde50: 140737488347711 140737488347717
0x7fffffffde60: 140737488347724
So the first element on the stack is 4 - the expected argc. The next 4 values look a lot like pointers. Let's look at the second pointer:
(gdb) print (char[5])*(140737488347711)
$1 = "first"
As expected it is the first command line argument.
So there is experimental evidence that the command line arguments are passed via the stack in x86-64. However, only by reading the ABI (as the accepted answer suggested) can we be sure that this is really the case.
With C runtime
We have to change the program slightly, renaming _start into main, because the entry point _start is provided by the C runtime.
.section .text
.globl main
main:
movq $60, %rax #60 -> exit
movq $42, %rdi #return 42
syscall #run kernel
We build it with (C runtime is used per default):
gcc exit64gcc.s -o exit64gcc
run in gdb with
./exit64gcc first second third
and stop at the breakpoint at main. What is at the stack?
(gdb) x/5g $sp
0x7fffffffdd58: 0x00007ffff7a36f45 0x0000000000000000
0x7fffffffdd68: 0x00007fffffffde38 0x0000000400000000
0x7fffffffdd78: 0x00000000004004ed
It does not look familiar. And registers?
(gdb) info registers
...
rsi 0x7fffffffde38 140737488346680
rdi 0x4 4
...
We can see that rdi contains the argc value. But if we now inspect the pointer in rsi strange things happen:
(gdb) print (char[5])*($rsi)
$1 = "\211\307???"
But wait, the second argument of the main function in C is not char * but char **:
(gdb) print (unsigned long long [4])*($rsi)
$8 = {140737488347644, 140737488347708, 140737488347714, 140737488347721}
(gdb) print (char[5])*(140737488347708)
$9 = "first"
And now we found our arguments, which are passed via registers as it would be for a normal function in x86-64.
Conclusion:
As we can see, there is a difference in how command line arguments are passed between code that uses the C runtime and code that doesn't.
It looks like section 3.4 Process Initialization, and specifically figure 3.9, in the already mentioned System V AMD64 ABI describes precisely what you want to know.
I do believe what you need to do is check out the x86-64 ABI. Specifically, I think you need to look at section 3.2.3 Parameter Passing.