section .text
global _start
_start:
nop
main:
mov eax, 1
mov ebx, 2
xor eax, eax
ret
I compile with these commands:
nasm -f elf main.asm
ld -melf_i386 -o main main.o
When I run the code, Linux throws a segmentation fault
(I am using Linux Mint Nadia, 64-bit). Why is this error produced?
Because ret is NOT the proper way to exit a program in Linux, Windows, or Mac!!!!
_start is not a function; there is no return address on the stack because there is no user-space caller to return to. Execution in user-space started here (in a static executable), at the process entry point. (Or with dynamic linking, it jumped here after the dynamic linker finished, but the result is the same.)
On Linux / OS X, the stack pointer is pointing at argc on entry to _start (see the i386 or x86-64 System V ABI doc for more details on the process startup environment); the kernel puts command line args into user-space stack memory before starting user-space. (So if you do try to ret, EIP/RIP = argc = a small integer, not a valid address. If your debugger shows a fault at address 0x00000001 or something, that's why.)
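To make the entry-point layout concrete, here is a minimal C sketch that models the pointer-sized slots the kernel leaves at the stack pointer. The helper names (startup_argc, startup_argv, startup_envp) and the idea of passing the stack as a char ** array are assumptions made for illustration; they are not part of the ABI or of any library.

```c
#include <stdint.h>

/* sp models the stack pointer at _start, one pointer-sized slot per element:
 *   sp[0]             argc (an integer stored in a pointer-sized slot)
 *   sp[1 .. argc]     argv[0] .. argv[argc-1]
 *   sp[argc + 1]      NULL terminating argv
 *   sp[argc + 2 ...]  envp[], also NULL-terminated
 * All three helper names are hypothetical. */
intptr_t startup_argc(char **sp)
{
    return (intptr_t)sp[0];   /* the value a stray `ret` would pop into EIP/RIP */
}

char **startup_argv(char **sp)
{
    return sp + 1;            /* skip the argc slot */
}

char **startup_envp(char **sp)
{
    /* skip argc, the argc entries of argv[], and argv's NULL terminator */
    return sp + 1 + startup_argc(sp) + 1;
}
```

With a program started as ./prog hello, startup_argc would yield 2, which is exactly the "small integer" that ends up in the instruction pointer if _start executes ret.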
On Windows, the proper way is ExitProcess. On Linux, it is a system call:
int 80H with sys_exit for 32-bit x86, or syscall with call number 60 for 64-bit, or a call to exit from the C library if you are linking to it.
32-bit Linux (i386)
%define SYS_exit 1 ; call number __NR_exit from <asm/unistd_32.h>
mov eax, SYS_exit ; use the NASM macro we defined earlier
xor ebx, ebx ; ebx = 0 exit status
int 80H ; _exit(0)
64-bit Linux (amd64)
mov rax, 60 ; SYS_exit aka __NR_exit from asm/unistd_64.h
xor edi, edi ; edi = 0 first arg to 64-bit system calls (writing edi zeroes all of rdi)
syscall ; _exit(0)
(In GAS you can actually #include <sys/syscall.h> or <asm/unistd.h> to get the right numbers for the mode you're assembling a .S for, but NASM can't easily use the C preprocessor.
See Polygot include file for nasm/yasm and C for hints.)
32-bit Windows (x86)
push 0
call ExitProcess
Or Windows/Linux linking against the C Library
; pass an int exit_status as appropriate for the calling convention
; push 0 / xor edi,edi / xor ecx,ecx
call exit
(Or for 32-bit x86 Windows, call _exit, because C names get prepended with an underscore, unlike in x86-64 Windows. The POSIX _exit function would be call __exit, if Windows had one.)
Windows x64's calling convention includes shadow space which the caller has to reserve, but exit isn't going to return so it's ok to let it step on that space above its return address. Also, 16-byte stack alignment is required by the calling convention before call exit (except on 32-bit Windows), but a misaligned call often won't actually crash for a simple function like exit().
call exit (unlike a raw exit system call or libc _exit) will flush stdio buffers first. If you used printf from _start, use exit to make sure all output is printed before you exit, even if stdout is redirected to a file (making stdout full-buffered, not line-buffered).
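The difference in flushing can be observed from C. The following is a minimal sketch, assuming a POSIX system; the helper name bytes_received is hypothetical. It forks a child whose stdout is a pipe, lets the child printf five bytes without a newline, and then has it leave through either exit() or _exit() while the parent counts what arrives.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns how many bytes the parent reads from a child
 * that printf()s "hello" and then terminates via exit() or _exit(). */
long bytes_received(int use_libc_exit)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    fflush(NULL);                      /* don't let the child inherit pending output */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                    /* child */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);   /* stdout now refers to the pipe */
        close(fds[1]);
        printf("hello");               /* no newline: stays in the stdio buffer */
        if (use_libc_exit)
            exit(0);                   /* flushes stdio buffers, then exits */
        _exit(0);                      /* exits at once; buffered data is lost */
    }

    close(fds[1]);                     /* parent keeps only the read end */
    char buf[64];
    long total = 0;
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

Calling bytes_received(1) should report the five bytes of "hello", while bytes_received(0) should report none, which is the same effect as using a raw exit system call after printf in assembly.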
It's generally recommended that if you use libc functions, you write a main function and link with gcc so it's called by the normal CRT start functions which you can ret to.
See also
Syscall implementation of exit()
How come _exit(0) (exiting by syscall) prevents me from receiving any stdout content?
Defining main as something that _start falls through into doesn't make it special, it's just confusing to use a main label if it's not like a C main function called by a _start that's prepared to exit after main returns.
I'm currently learning assembly, I'm using Intel syntax on a 64bit ubuntu, using nasm.
So I found two websites that reference the syscalls numbers:
This one for 32 bit registers (eax, ebx, ...): https://syscalls.kernelgrok.com
This one for 64 bits registers (rax, rbx, ...): https://blog.rchapman.org/posts/Linux_System_Call_Table_for_x86_64
The thing is that my code doesn't work when I'm using the 64-bit syscall numbers, but it works when I replace the 'e' of the 32-bit registers with an 'r'. For instance, in sys_write I use rbx to store the fd instead of rdi, and it works.
I'm quite lost right now. This code doesn't work:
message db 'Hello, World', 10
section .text
global _start
_start: mov rax,4
mov rdi, 1
mov rsi, message
mov rdx, 13
syscall
mov rax, 1
mov rdi, 0
syscall
Run strace ./my_program: you make a bogus stat system call (call number 4 is stat in the 64-bit table), then a write that succeeds (call number 1 is write, not exit), then fall off the end and segfault.
$ strace ./foo
execve("./foo", ["./foo"], 0x7ffe6b91aa00 /* 51 vars */) = 0
stat(0x1, 0x401000) = -1 EFAULT (Bad address)
write(0, "Hello, World\n", 13Hello, World
) = 13
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xd} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
It's not register names that are your problem, it's call numbers. You're using 32-bit call numbers but calling the 64-bit syscall ABI.
Call numbers and calling convention both differ.
int 0x80 system calls only ever look at the low 32 bits of registers which is why you shouldn't use them in 64-bit code.
The code you posted in a comment with mov rcx, message would work fine with mov ecx, message and so on, if it works with mov rcx, message. See What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?.
Note that writing a 32-bit register zero-extends into the full 64-bit register, so you can always use mov edi, 1 instead of mov rdi, 1. (NASM will do this optimization for you to save code size; the two are so equivalent that some assemblers silently make the substitution.)
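One way to avoid mixing up the two call-number tables is to let the C headers pick the number for the target being compiled. The sketch below assumes Linux with glibc; raw_write is a hypothetical helper name. SYS_write expands to 1 when compiling for x86-64 and to 4 when compiling for i386, and syscall() uses the native system call mechanism for that target.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* makes <unistd.h> declare syscall() */
#endif
#include <stddef.h>
#include <sys/syscall.h>     /* SYS_write for the ABI being compiled */
#include <unistd.h>

/* Hypothetical helper: write(2) through the raw system call interface.
 * Returns the number of bytes written, or -1 with errno set on failure. */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

For example, raw_write(1, "Hello, World\n", 13) writes to stdout regardless of whether the file is built with -m32 or -m64, because the header supplies the matching number.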
I want to call the printf function from assembly language in Linux.
I want to know the method for both 64-bit and 32-bit assembly language programs.
1) Please tell me, for the two cases, how to pass a 32-bit argument and a 64-bit argument to printf along with a string.
2) For the 32-bit x86 architecture, how do I do the same thing as in point 1?
Please show me the code, and let me know whether I need to adjust the stack in both cases or whether I just need to pass the arguments in registers.
Thanks a lot
There are 2 ways to print a string with assembly language in Linux.
1) Use syscall for x64, or int 0x80 for x86. It's not printf, it's kernel routines. You can find more here (x86) and here (x64).
2) Use printf from glibc. I assume you are familiar with the structure of NASM program, so here is a nice x86 example from acm.mipt.ru:
global main
;Declare used libc functions
extern exit
extern puts
extern scanf
extern printf
section .text
main:
;Arguments are passed in reversed order via stack (for x86)
;For x64 first six arguments are passed in straight order
; via RDI, RSI, RDX, RCX, R8, R9 and other are passed via stack
;The result comes back in EAX/RAX
push dword msg
call puts
;After passing arguments via stack, you have to clear it to
; prevent segfault with add esp, 4 * (number of arguments)
add esp, 4
push dword a
push dword b
push dword msg1
call scanf
add esp, 12
;For x64 this scanf call will look like:
; mov rdi, msg1
; mov rsi, b
; mov rdx, a
; call scanf
mov eax, dword [a]
add eax, dword [b]
push eax
push dword msg2
call printf
add esp, 8
push dword 0
call exit
add esp, 4
ret
section .data
msg : db "An example of interfacing with GLIBC.",0xA,0
msg1 : db "%d%d",0
msg2 : db "%d", 0xA, 0
section .bss
a resd 1
b resd 1
You can assemble it with nasm -f elf32 -o foo.o foo.asm and link with gcc -m32 -o foo foo.o for x86. For x64, replace elf32 with elf64 and -m32 with -m64, and pass the arguments in registers as described in the comments instead of pushing them. Note that you need gcc-multilib to build x86 programs on an x64 system using gcc.
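On the question of 32-bit versus 64-bit arguments: printf learns the width of each argument only from the format string, so the conversion specifier has to match what was pushed or loaded. On Linux, %d reads a 32-bit int, %ld reads a long (64-bit on x86-64, 32-bit on i386), and %lld reads a 64-bit long long in both modes. The C sketch below shows the same pairing using the portable macros from inttypes.h; format_pair is a hypothetical helper name.

```c
#include <inttypes.h>   /* PRId32, PRId64 */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format one 32-bit and one 64-bit integer.
 * Returns the number of characters produced (excluding the terminator). */
int format_pair(char *out, size_t size, int32_t small, int64_t big)
{
    /* PRId32 and PRId64 expand to the specifiers that match each width. */
    return snprintf(out, size, "%" PRId32 " %" PRId64 "\n", small, big);
}
```

In 32-bit assembly the 64-bit value occupies two 4-byte stack slots (push the high half first, then the low half), while in 64-bit code it fits in a single argument register.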
I'm trying to teach myself assembly. I've found a good website; however, everything is written for x86 and I use a 64-bit machine.
I know what the problem is, but I don't know how to fix it. If I run the program with strace, here are the results:
execve("./file", ["./file", "hello"], [/* 94 vars */]) = 0
creat(NULL, 0) = -1 EINVAL (Invalid argument)
write(0, NULL, 0 <unfinished ...>
+++ exited with 234 +++
So, I know that when I call creat, that the file name "hello" is not being passed and as a result I don't have a file descriptor.
Here is the code in question:
section .text
global _start
_start:
pop rbx ; argc
pop rbx ; prog name
pop rbx ; the file name
mov eax,85 ; syscall number for creat()
mov ecx,00644Q ; rw,r,r
int 80h ; call the kernel
I know that I can use the syscall instruction; however, I want to use the interrupt.
Any ideas or suggestions would be helpful. Also, I'm using NASM as the assembler.
You attempted to use the 32 bit mechanism. If you have a 32 bit tutorial, you can of course create 32 bit programs and those will work as-is in compatibility mode.
If you want to write 64 bit code however, you will need to use the 64 bit conventions and interfaces. Here, that means the syscall instruction with the appropriate registers:
global _start
_start:
mov eax,85 ; syscall number for creat()
mov rdi,[rsp+16] ; argv[1], the file name
mov esi,00644Q ; rw,r,r
syscall ; call the kernel
xor edi, edi ; exit code 0
mov eax, 60 ; syscall number for exit()
syscall
See also the x86-64 sysv abi on wikipedia or the abi pdf for more details.
The program writes the executable stored in its second segment to disk, decrypts it (into /tmp/decbd), and executes it (as planned).
The file decbd appears on disk and can be executed from a shell, but the last execve call returns eax=-14, and after the end of the program, execution flows into data and segfaults.
http://pastebin.com/KywXTB0X
In the second segment, after compilation, I used hexdump and dd to manually place an echo binary encrypted with openssl. When I stopped execution right before the last int 0x80 instruction, I was already able to run my "echo" in decbd from another terminal.
You should have narrowed it down to a minimal example. See MCVE.
You should comment your code if you want other people to help.
You should learn to use the debugger and/or other tools.
For point #1, you could have gone down to:
section .text
global _start ;must be declared for linker (ld)
_start:
mov eax,11 ; execve syscall
mov ebx,program ; name of program
mov ecx,[esp+4] ; pointer to argument array
mov ebp,[esp] ; number of arguments
lea edx,[esp+4*ebp+2] ; pointer to environ array
int 0x80
section .data
program db '/bin/echo',0
For point #3, using the debugger you could have seen that:
ebx is okay
ebp is okay
ecx is wrong
edx is wrong
It's an easy fix. ecx should be loaded with the address of the argument array, not the value stored there, and edx should skip two 4-byte stack slots (argc and the NULL that terminates argv), so the offset should be 8, not 2. The fixed code could look like this:
section .text
global _start ;must be declared for linker (ld)
_start:
mov eax,11 ; execve syscall
mov ebx,program ; name of program
lea ecx,[esp+4] ; pointer to argument array
mov ebp,[esp] ; number of arguments
lea edx,[esp+4*ebp+8] ; pointer to environ array (skip argc and NULL)
int 0x80
section .data
program db '/bin/echo',0
man execve says this in the "ERRORS" section with regard to return code -14 (-EFAULT):
EFAULT filename points outside your accessible address space.
You passed a bad pointer to execve().
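The same failure can be reproduced from C, which is a quick way to confirm what the kernel does with a bad pointer. This is a minimal sketch assuming Linux; try_execve is a hypothetical helper name, and the bogus address used in the usage note is chosen only for illustration.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: attempt execve() and report why it failed.
 * It only returns if execve() fails; on success the process image
 * is replaced and control never comes back here. */
int try_execve(const char *path, char *const argv[], char *const envp[])
{
    errno = 0;
    execve(path, argv, envp);
    return errno;
}
```

Passing an unmapped address such as (const char *)1 as the path makes try_execve return EFAULT, which is 14 on Linux and corresponds to the eax=-14 seen in the question. The current execve man page lists the same error for bad pointers in argv or envp.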
I'm trying to define some subroutines that have calls to printf in them.
A very trivial example is as follows:
extern printf
LINUX equ 80H
EXIT equ 60
section .data
intfmt: db "%ld", 10, 0
segment .text
global main
main:
call os_return ; return to operating system
os_return:
mov rax, EXIT ; Linux system call 60 i.e. exit ()
mov rdi, 0 ; Error code 0 i.e. no errors
int LINUX ; Interrupt Linux kernel
test:
push rdi
push rsi
mov rsi, 10
mov rdi, intfmt
xor rax, rax
call printf
pop rdi
pop rsi
ret
Here test just has a call to printf that outputs the number 10 to the screen. I would not expect this to get called as I have no call to it.
However when compiling and running:
nasm -f elf64 test.asm
gcc -m64 -o test test.o
I get the output:
10
10
I'm totally baffled and wondered if someone could explain why this is happening?
int 80H invokes the 32-bit system call interface, which a) uses the 32-bit system call numbers and b) is intended for use by 32-bit code, not 64-bit code. Your code is actually performing a umask system call with random parameters.
For a 64-bit system call, use the syscall instruction instead:
...
os_return:
mov rax, EXIT ; Linux system call 60 i.e. exit ()
mov rdi, 0 ; Error code 0 i.e. no errors
syscall ; Interrupt Linux kernel
...
I would say that your call to exit is failing, so when it returns, it falls through to the test function, that prints the first 10.
Then when you return with ret you go back to the instruction just after the call os_return, that is, well os_return. The call to exit fails again and falls through to the test function again. But this time the ret returns from the main function and the program ends.
As for why the exit call is failing, I cannot tell, as I don't have a 64-bit system available. But you could disassemble the exit function from libc and see how it is done there. My guess is that the int LINUX interface is 32-bit only, as it exists only for historic compatibility, and 64-bit Linux is not so old.