Assembler code doesn't work on Linux

I'm trying to run the following assembler code on Linux using the JWasm assembler, but for every command it says "command not found". Why? It also reports an error on the lines that start with ";". Are those comment lines? Can I remove these lines? Thanks.
;--- "hello world" for 64-bit Linux, using SYSCALL.
;--- assemble: JWasm -elf64 -Fo=Lin64_1.o Lin64_1.asm
;--- link: gcc Lin64_1.o -o Lin64_1
stdout equ 1
SYS_WRITE equ 1
SYS_EXIT equ 60
.data
string db 10,"Hello, world!",10
.code
_start:
mov edx, sizeof string
mov rsi, offset string
mov edi, stdout
mov eax, SYS_WRITE
syscall
mov eax, SYS_EXIT
syscall
end _start

I am unfamiliar with JWasm, but generally un-indented entries are treated as assembler directives rather than instructions.
Indent (with a space or tab) any actual instructions (things the CPU would run), and leave assembler directives (things the assembler uses to help you out) un-indented.

; usually denotes a comment in most kinds of assembly, so it is strange that JWasm doesn't recognize those lines as such. Try removing them.

Related

Confused about 64-bit registers - ASM

I'm currently learning assembly, using Intel syntax with NASM on 64-bit Ubuntu.
So I found two websites that reference the syscalls numbers:
This one for 32 bit registers (eax, ebx, ...): https://syscalls.kernelgrok.com
This one for 64 bits registers (rax, rbx, ...): https://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64
The thing is that my code doesn't work when I use the 64-bit syscall numbers, but it works when I replace the 'e' of the 32-bit registers with an 'r'. For instance, in sys_write I use rbx to store the fd instead of rdi, and it works.
I'm quite lost right now. This code doesn't work:
message db 'Hello, World', 10
section .text
global _start
_start: mov rax,4
mov rdi, 1
mov rsi, message
mov rdx, 13
syscall
mov rax, 1
mov rdi, 0
syscall

Run strace ./my_program: you make a bogus stat system call, then a write that succeeds, then fall off the end and segfault.
$ strace ./foo
execve("./foo", ["./foo"], 0x7ffe6b91aa00 /* 51 vars */) = 0
stat(0x1, 0x401000) = -1 EFAULT (Bad address)
write(0, "Hello, World\n", 13Hello, World
) = 13
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xd} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
It's not register names that are your problem, it's call numbers. You're using 32-bit call numbers but calling the 64-bit syscall ABI.
Call numbers and calling convention both differ.
int 0x80 system calls only ever look at the low 32 bits of registers which is why you shouldn't use them in 64-bit code.
If the code you posted in a comment works with mov rcx, message, it would work just as well with mov ecx, message, and so on. See What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?.
Note that writing a 32-bit register zero-extends into the full 64-bit register so you should always use mov edi, 1 instead of mov rdi, 1. (Although NASM will do this optimization for you to save code-size; they're so equivalent that some assemblers will silently do it for you.)
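As a rough sketch of the fix described in this answer, the question's program could use the x86-64 call numbers (1 for write, 60 for exit) together with the x86-64 argument registers. Moving message into section .data and loading its address with a RIP-relative lea are assumptions made for this sketch, not something the original post did.

```nasm
section .data
message db 'Hello, World', 10     ; 13 bytes including the newline

section .text
global _start
_start:
    mov eax, 1                    ; write is call number 1 in the x86-64 ABI
    mov edi, 1                    ; fd 1 = stdout
    lea rsi, [rel message]        ; buffer address (RIP-relative)
    mov edx, 13                   ; byte count
    syscall
    mov eax, 60                   ; exit is call number 60 in the x86-64 ABI
    xor edi, edi                  ; exit status 0
    syscall
```

Ending with an exit call also keeps execution from falling off the end of _start, which is what caused the SIGSEGV in the strace output.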

Segmentation fault (core dumped) when I run my assembly code [duplicate]

I've been looking at a tutorial for assembly, and I'm trying to get a hello world program to run. I am using Bash on Ubuntu on Windows.
Here is the assembly:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
I am using these commands to create the executable:
nasm -f elf64 hello.asm -o hello.o
ld -o hello hello.o -m elf_x86_64
And I run it using:
./hello
The program then seems to run without a segmentation fault or error, but it produces no output.
I can't figure out why the code won't produce an output, but I wonder if using Bash on Ubuntu on Windows has anything to do with it? Why doesn't it produce output and how can I fix it?

Related: WSL2 does allow 32-bit user-space programs, WSL1 doesn't. See Does WSL 2 really support 32 bit program? re: making sure you're actually using WSL2. The rest of this answer was written before WSL2 existed.
The issue is with Ubuntu for Windows (Windows Subsystem for Linux version 1). It only supports the 64-bit syscall interface and not the 32-bit x86 int 0x80 system call mechanism.
Besides not being able to use int 0x80 (32-bit compatibility) in 64-bit binaries, Ubuntu on Windows (WSL1) doesn't support running 32-bit executables either. (Same as if you'd built a real Linux kernel without CONFIG_IA32_EMULATION, like some Gentoo users do.)
You need to convert from using int 0x80 to syscall. It's not difficult. A different set of registers is used for a syscall, and the system call numbers are different from their 32-bit counterparts. Ryan Chapman's blog has information on the syscall interface, the system calls, and their parameters. sys_read, sys_write and sys_exit are defined this way:
%rax  System call  %rdi             %rsi             %rdx          %r10  %r8  %r9
----------------------------------------------------------------------------------
0     sys_read     unsigned int fd  char *buf        size_t count
1     sys_write    unsigned int fd  const char *buf  size_t count
60    sys_exit     int error_code
Using syscall also clobbers the RCX and R11 registers. They are considered volatile. Don't rely on them holding the same values after the syscall.
Your code could be modified to be:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov edx,len ;message length
mov rsi,msg ;message to write
mov edi,1 ;file descriptor (stdout)
mov eax,edi ;system call number (sys_write)
syscall ;call kernel
xor edi, edi ;Return value = 0
mov eax,60 ;system call number (sys_exit)
syscall ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
Note: in 64-bit code, if the destination register of an instruction is 32-bit (like EAX, EBX, EDI, ESI etc.), the processor zero-extends the result into the upper 32 bits of the 64-bit register. mov edi,1 has the same effect as mov rdi,1.
This answer isn't a primer on writing 64-bit code, only about using the syscall interface. If you are interested in the nuances of writing code that calls the C library, and conforms to the 64-bit System V ABI there are reasonable tutorials to get you started like Ray Toal's NASM tutorial. He discusses stack alignment, the red zone, register usage, and a basic overview of the 64-bit System V calling convention.

As already pointed out in comments by Ross Ridge, don't use 32-bit calling of kernel functions when you compile 64bit.
Either compile for 32bit or "translate" the code into 64 bit syscalls.
Here is what that could look like:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov rdx,len ;message length
mov rsi,msg ;message to write
mov rdi,1 ;file descriptor (stdout)
mov rax,1 ;system call number (sys_write)
syscall ;call kernel
mov rax,60 ;system call number (sys_exit)
mov rdi,0 ;add this to output error code 0(to indicate program terminated without errors)
syscall ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string

Segmentation Fault at the end of a simple _start that doesn't do anything

When I assemble and run the following assembly code, I get the error Segmentation fault (core dumped):
section .text
global _start
_start:
mov eax, 8
My Makefile is as follows
all:
nasm -f elf64 -o asm.o asm.s
ld asm.o -o asm
rm asm.o
I don't know what the issue is.
I am running 64-bit Ubuntu.

The CPU executes the program, finds the mov eax, 8 instruction, executes it... and what now? There are no more instructions in the object file, but nobody told the CPU! It executes whatever comes next in memory, probably not a valid instruction, which results in a segmentation fault, just like @MichaelPetch said.
The easiest solution IMO is to use a wrapper, which takes care of initializing and cleaning up your program, e.g., GCC. Just put the mov eax, 8 into the main function, which you may be familiar with from C.
Modify the source file as follows:
section .text
global main
main:
mov eax, 8
ret
(main is a function, so you need the ret instruction to return from it.)
and use the following script:
nasm -f elf64 -o asm.o asm.s
gcc asm.o -o asm
rm asm.o

I made this quick example.asm using 64-bit registers:
section .data
msg db "Hello world", 10 ;12 bytes, matching the length passed in rdx
section .text
global _start
_start:
call _myfunk
call _exit
_myfunk:
mov rax,1
mov rdi,1
mov rdx,12
mov rsi,msg
syscall
ret
_exit:
mov rax, 60
mov rdi,0
syscall
To assemble and link this code, you can use the nasm and ld commands:
nasm -f elf64 example.asm -o example.o
ld example.o -o example.elf
and now run the program ./example.elf

I have only just started with assembly, but I can help. The question is four years old now, so you may not need this anymore, but maybe others will.
In this:
section .text
global _start
_start:
mov eax, 8
you forgot to exit the program once the code has finished,
so after the _start label you can add three lines, as shown below.
(I don't know why you need 8 in the eax register, so I am moving it to ebx.)
section .text
global _start
_start:
mov eax, 8
mov ebx, eax
mov eax, 1 ; this is for system call exit
int 0x80 ; system call
Here the value of ebx is treated as the exit status, so you can see the value (8) by typing this in your terminal:
echo $?
good luck :)
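Since the question assembles with -f elf64, the same exit can also be sketched with the native 64-bit syscall interface that the other answers on this page recommend. This is only an illustration: it assumes the x86-64 call number 60 for exit and passes the status in edi instead of ebx.

```nasm
section .text
global _start
_start:
    mov edi, 8      ; exit status goes in edi for the 64-bit ABI
    mov eax, 60     ; 60 = exit in the x86-64 call table
    syscall         ; does not return
```

Built with the same nasm and ld commands as in the question, running it and then typing echo $? should again report 8.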

Why doesn't the 'syscall' instruction work under Linux?

I have a very basic assembly program that runs in Linux userland:
section .text
global _start
_start:
mov edx, 14
mov ecx, msg
mov ebx, 1
mov eax, 4
syscall
mov eax, 1
syscall
section .data
msg db "Hello, World!", 0xA
However, this doesn't work as written; it only works if I replace the syscalls with int 0x80. Don't these do the same thing? I know that syscall was designed to be lower-latency, but other than that, I didn't think there was a difference. Why doesn't it work?

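syscall works only on x86-64 operating systems, and the system call number goes in the rax register (writing eax is enough, since that zero-extends into rax). The call numbers and the argument registers also differ from those of the int 0x80 interface.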
See this website for more information.
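To make the difference concrete, here is one way the question's program might be translated to the syscall interface. It is a sketch that assumes a 64-bit build (nasm -f elf64); the comments show which int 0x80 register or call number each line replaces, and the explicit exit status of 0 is an addition, since the original left ebx at 1.

```nasm
section .text
global _start
_start:
    mov edx, 14             ; length: stays in edx/rdx
    lea rsi, [rel msg]      ; buffer: was ecx, now rsi
    mov edi, 1              ; fd 1 (stdout): was ebx, now edi/rdi
    mov eax, 1              ; write: was 4 under int 0x80, now 1
    syscall
    xor edi, edi            ; exit status 0 (added for this sketch)
    mov eax, 60             ; exit: was 1 under int 0x80, now 60
    syscall

section .data
msg db "Hello, World!", 0xA ; 14 bytes including the newline
```

With the original register setup, eax = 4 selects a different call in the x86-64 table, which is why simply swapping int 0x80 for syscall does not print anything.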

The syscall instruction doesn't save the return RSP anywhere, and instead of pushing the return RIP on the stack it stores it in RCX (RFLAGS is saved in R11).
On Linux this is why RCX can't carry a parameter with syscall: the fourth argument goes in R10 instead, and the other parameters end up in different registers than with int 0x80.
