I've just started learning/using Assembly x64 on Linux, and am trying to call getcwd() with call. After attempting to call the getcwd() function, I am also trying to output the result which is not working and I do not understand why. Any pointers/help would be really appreciated. Sorry if its a stupid question. I've looked online for examples but haven't found any that help me specifically. Many thanks. Here is the code:
section .text
global _start
extern getcwd
_start:
mov rdi,rbx
mov rsi,128
call getcwd wrt ..plt
mov rax,1
mov rdi,1
mov rsi,rbx
mov rdx,128
syscall
mov rax,60
mov rdi,0
syscall
I compile with:
nasm -f elf64 -o file.o file.asm
gcc -nostdlib -v -o file file.o -lc
./file
And nothing is shown
Here is a possible implementation that allocates space on the stack. I also switched to main and puts:
global main
extern getcwd
extern puts
main:
sub rsp, 128+8 ; buffer + alignment
mov rdi, rsp
mov rsi, 128
call getcwd wrt ..plt
call puts wrt ..plt
add rsp, 128+8
ret
This code
mov rdi,rbx
mov rsi,128
call getcwd wrt ..plt
is equivalent to C code getcwd(__undefined__, 128);, where __undefined__ is some value that happens to be in ebx on entry to _start. On my system it appeared to be NULL, which to getcwd signalizes that no buffer is passed (regardless of its size being 128).
The getcwd function then returns pointer to the newly-allocated buffer (as an extension to POSIX, see man 3 getcwd). Your subsequent mov rax,1 overwrites this address with the system call, and subsequent instructions pass the value in ebx as the buffer to the write syscall. Since it was NULL before, by the calling convention it remains such, and you call write(1,NULL,128);, which returns EFAULT and writes nothing.
Related
I've been looking at a tutorial for assembly, and I'm trying to get a hello world program to run. I am using Bash on Ubuntu on Windows.
Here is the assembly:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
I am using these commands to create the executable:
nasm -f elf64 hello.asm -o hello.o
ld -o hello hello.o -m elf_x86_64
And I run it using:
./hello
The program then seems to run without a segmentation fault or error, but it produces no output.
I can't figure out why the code won't produce an output, but I wonder if using Bash on Ubuntu on Windows has anything to do with it? Why doesn't it produce output and how can I fix it?
Related: WSL2 does allow 32-bit user-space programs, WSL1 doesn't. See Does WSL 2 really support 32 bit program? re: making sure you're actually using WSL2. The rest of this answer was written before WLS2 existed.
The issue is with Ubuntu for Windows (Windows Subsystem for Linux version 1). It only supports the 64-bit syscall interface and not the 32-bit x86 int 0x80 system call mechanism.
Besides not being able to use int 0x80 (32-bit compatibility) in 64-bit binaries, Ubuntu on Windows (WSL1) doesn't support running 32-bit executables either. (Same as if you'd built a real Linux kernel without CONFIG_IA32_EMULATION, like some Gentoo users do.)
You need to convert from using int 0x80 to syscall. It's not difficult. A different set of registers are used for a syscall and the system call numbers are different from their 32-bit counterparts. Ryan Chapman's blog has information on the syscall interface, the system calls, and their parameters. Sys_write and Sys_exit are defined this way:
%rax System call %rdi %rsi %rdx %r10 %r8 %r9
----------------------------------------------------------------------------------
0 sys_read unsigned int fd char *buf size_t count
1 sys_write unsigned int fd const char *buf size_t count
60 sys_exit int error_code
Using syscall also clobbers RCX and the R11 registers. They are considered volatile. Don't rely on them being the same value after the syscall.
Your code could be modified to be:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov edx,len ;message length
mov rsi,msg ;message to write
mov edi,1 ;file descriptor (stdout)
mov eax,edi ;system call number (sys_write)
syscall ;call kernel
xor edi, edi ;Return value = 0
mov eax,60 ;system call number (sys_exit)
syscall ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
Note: in 64-bit code if the destination register of an instruction is 32-bit (like EAX, EBX, EDI, ESI etc) the processor zero extends the result into the upper 32-bits of the 64-bit register. mov edi,1 has the same effect as mov rdi,1.
This answer isn't a primer on writing 64-bit code, only about using the syscall interface. If you are interested in the nuances of writing code that calls the C library, and conforms to the 64-bit System V ABI there are reasonable tutorials to get you started like Ray Toal's NASM tutorial. He discusses stack alignment, the red zone, register usage, and a basic overview of the 64-bit System V calling convention.
As already pointed out in comments by Ross Ridge, don't use 32-bit calling of kernel functions when you compile 64bit.
Either compile for 32bit or "translate" the code into 64 bit syscalls.
Here is what that could look like:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov rdx,len ;message length
mov rsi,msg ;message to write
mov rdi,1 ;file descriptor (stdout)
mov rax,1 ;system call number (sys_write)
syscall ;call kernel
mov rax,60 ;system call number (sys_exit)
mov rdi,0 ;add this to output error code 0(to indicate program terminated without errors)
syscall ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
I've been looking at a tutorial for assembly, and I'm trying to get a hello world program to run. I am using Bash on Ubuntu on Windows.
Here is the assembly:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
I am using these commands to create the executable:
nasm -f elf64 hello.asm -o hello.o
ld -o hello hello.o -m elf_x86_64
And I run it using:
./hello
The program then seems to run without a segmentation fault or error, but it produces no output.
I can't figure out why the code won't produce an output, but I wonder if using Bash on Ubuntu on Windows has anything to do with it? Why doesn't it produce output and how can I fix it?
Related: WSL2 does allow 32-bit user-space programs, WSL1 doesn't. See Does WSL 2 really support 32 bit program? re: making sure you're actually using WSL2. The rest of this answer was written before WLS2 existed.
The issue is with Ubuntu for Windows (Windows Subsystem for Linux version 1). It only supports the 64-bit syscall interface and not the 32-bit x86 int 0x80 system call mechanism.
Besides not being able to use int 0x80 (32-bit compatibility) in 64-bit binaries, Ubuntu on Windows (WSL1) doesn't support running 32-bit executables either. (Same as if you'd built a real Linux kernel without CONFIG_IA32_EMULATION, like some Gentoo users do.)
You need to convert from using int 0x80 to syscall. It's not difficult. A different set of registers are used for a syscall and the system call numbers are different from their 32-bit counterparts. Ryan Chapman's blog has information on the syscall interface, the system calls, and their parameters. Sys_write and Sys_exit are defined this way:
%rax System call %rdi %rsi %rdx %r10 %r8 %r9
----------------------------------------------------------------------------------
0 sys_read unsigned int fd char *buf size_t count
1 sys_write unsigned int fd const char *buf size_t count
60 sys_exit int error_code
Using syscall also clobbers RCX and the R11 registers. They are considered volatile. Don't rely on them being the same value after the syscall.
Your code could be modified to be:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov edx,len ;message length
mov rsi,msg ;message to write
mov edi,1 ;file descriptor (stdout)
mov eax,edi ;system call number (sys_write)
syscall ;call kernel
xor edi, edi ;Return value = 0
mov eax,60 ;system call number (sys_exit)
syscall ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
Note: in 64-bit code if the destination register of an instruction is 32-bit (like EAX, EBX, EDI, ESI etc) the processor zero extends the result into the upper 32-bits of the 64-bit register. mov edi,1 has the same effect as mov rdi,1.
This answer isn't a primer on writing 64-bit code, only about using the syscall interface. If you are interested in the nuances of writing code that calls the C library, and conforms to the 64-bit System V ABI there are reasonable tutorials to get you started like Ray Toal's NASM tutorial. He discusses stack alignment, the red zone, register usage, and a basic overview of the 64-bit System V calling convention.
As already pointed out in comments by Ross Ridge, don't use 32-bit calling of kernel functions when you compile 64bit.
Either compile for 32bit or "translate" the code into 64 bit syscalls.
Here is what that could look like:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov rdx,len ;message length
mov rsi,msg ;message to write
mov rdi,1 ;file descriptor (stdout)
mov rax,1 ;system call number (sys_write)
syscall ;call kernel
mov rax,60 ;system call number (sys_exit)
mov rdi,0 ;add this to output error code 0(to indicate program terminated without errors)
syscall ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
when I assembly the following assembly code I get the error Segmentation fault (core dumped)
section .text
global _start
_start:
mov eax, 8
My Makefile is as follows
all:
nasm -f elf64 -o asm.o asm.s
ld asm.o -o asm
rm asm.o
I don't know what the issue is.
I am running 64-bit Ubuntu.
The CPU execute the program, findd the mov eax, 8 instruction, executed it... and what now? There are no more instructions in the object file, but nobody told the CPU! It executes whatever is next, probably no valid instruction, which results in a segmentation fault, just like #MichaelPetch said.
The easiest solution IMO is to use a wrapper, which takes care of initializing and cleaning up your program, e.g., GCC. Just put the mov eax, 8 into the main function, which you may be familiar with from C.
Modify the source file as follows:
section .text
global main
main:
mov eax, 8
ret
(main is a function, so you need the ret instruction to return from it.)
and use the following script:
nasm -f elf64 -o asm.o asm.s
gcc asm.o -o asm
rm asm.o
I made this fast example.asm with 64 bits registers
section .data
msg db "Hello world"
section .text
global _start:
_start:
call _myfunk
call _exit
_myfunk:
mov rax,1
mov rdi,1
mov rdx,12
mov rsi,msg
syscall
ret
_exit:
mov rax, 60
mov rdi,0
syscall
To compile this assembly code you can use nasm and ld commands
nasm -f elf64 example.asm -o example.o
ld example.o -o example.elf
and now run the program ./example.elf
I am only started with assembly, but I can help - now it's 4 years old and you may not need it, but maybe others.
In this:
section .text
global _start
_start:
mov eax, 8
you forgot to stop the program after finishing the code,
so inside the _start label, you can add 3 lines, just like below:
(i don't know why you need 8 in eax reg. so i am moving it to ebx)
section .text
global _start
_start:
mov eax, 8
mov ebx, eax
mov eax, 1 ; this is for system call exit
int 0x80 ; system call
Here the value of ebx will be treated as return value , so you can get the value (8) by typing in your terminal
echo $?
good luck :)
I have the following code:
section .text
global _start ;must be declared for using gcc
_start: ;tell linker entry point
mov ecx, 2 ;read-write perms
mov ebx, name ;name of file
mov eax, 8 ;system call number (sys_creat)
int 0x80 ;call kernel
mov eax, 1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
name db 'C:\\test.txt',0xa
It is meant to create a file (test.txt) in the C drive however doesn't work, what is the correct way to do this?
First, syscall=8 is sys_creat, not write.
But the easiest way to find out what is happening is looking at the strace output of the program. There you can see if the syscall succeeded, and if not, what is the error value. (errno)
Afaik creat(2) is not used anymore, and Open(2) with O_CREAT in the second argument is used nowadays.
Program writes executable placed in it's second segment on disk, decrypts it(into /tmp/decbd), and executes(as it was planned)
file decbd appears on disk, and can be executed via shell, last execve call return eax=-14, and after end of the program, execution flows on data and gets segfault.
http://pastebin.com/KywXTB0X
In second segment after compilation using hexdump and dd I manually placed echo binary encrypted via openssl, and when I stopped execution right before last int 0x80 command, I've already been able to run my "echo" in decbd, using another terminal.
You should have narrowed it down to a minimal example. See MCVE.
You should comment your code if you want other people to help.
You should learn to use the debugger and/or other tools.
For point #1, you could have gone down to:
section .text
global _start ;must be declared for linker (ld)
_start:
mov eax,11 ; execve syscall
mov ebx,program ; name of program
mov ecx,[esp+4] ; pointer to argument array
mov ebp,[esp] ; number of arguments
lea edx,[esp+4*ebp+2] ; pointer to environ array
int 0x80
section .data
program db '/bin/echo',0
For point #3, using the debugger you could have seen that:
ebx is okay
ebp is okay
ecx is wrong
edx is wrong
It's an easy fix. ecx should be loaded with the address, not the value and edx should be skipping 2 pointers which are 4 bytes each, so the offset should be 8 not 2. The fixed code could look like this:
section .text
global _start ;must be declared for linker (ld)
_start:
mov eax,11 ; execve syscall
mov ebx,program ; name of program
lea ecx,[esp+4] ; pointer to argument array
mov ebp,[esp] ; number of arguments
lea edx,[esp+4*ebp+8] ; pointer to environ array (skip argc and NULL)
int 0x80
section .data
program db '/bin/echo',0
man execve says this in the "ERRORS" section with regard to return code -14 (-EFAULT):
EFAULT filename points outside your accessible address space.
You passed a bad pointer to execve().