Related
I'm getting a undefined reference to _printf when building an assembly program that defines its own _start instead of main, using NASM on x86-64 Ubuntu
Build commands:
nasm -f elf64 hello.asm
ld -s -o hello hello.o
hello.o: In function `_start':
hello.asm:(.text+0x1a): undefined reference to `_printf'
MakeFile:4: recipe for target 'compile' failed
make: *** [compile] Error 1
nasm source:
extern _printf
section .text
global _start
_start:
mov rdi, format ; argument #1
mov rsi, message ; argument #2
mov rax, 0
call _printf ; call printf
mov rax, 0
ret ; return 0
section .data
message: db "Hello, world!", 0
format: db "%s", 0xa, 0
Hello, World! should be the output
3 problems:
GNU/Linux using ELF object files does not decorate / mangle C names with a leading underscore. Use call printf, not _printf (Unlike MacOS X, which does decorate symbols with an _; keep that in mind if you're looking at tutorials for other OSes. Windows also uses a different calling convention, but only 32-bit Windows mangles names with _ or other decorations that encode the choice of calling convention.)
You didn't tell ld to link libc, and you didn't define printf yourself, so you didn't give the linker any input files that contain a definition for that symbol. printf is a library function defined in libc.so, and unlike the GCC front-end, ld doesn't include it automatically.
_start is not a function, you can't ret from it. RSP points to argc, not a return address. Define main instead if you want it to be a normal function.
Link with gcc -no-pie -nostartfiles hello.o -o hello if you want a dynamic executable that provides its own _start instead of main, but still uses libc.
This is safe for dynamic executables on GNU/Linux, because glibc can run its init functions via dynamic linker hooks. It's not safe on Cygwin, where its libc is only initialized by calls from its CRT start file (which do that before calling main).
Use call exit to exit, instead of making an _exit system call directly if you use printf; that lets libc flush any buffered output. (If you redirect output to a file, stdout will be full-buffered, vs. line buffered on a terminal.)
-static would not be safe; in a static executable no dynamic-linker code runs before your _start, so there's no way for libc to get itself initialized unless you call the functions manually. That's possible, but generally not recommended.
There are other libc implementations that don't need any init functions called before printf / malloc / other functions work. In glibc, stuff like the stdio buffers are allocated at runtime. (This used to be the case for MUSL libc, but that's apparently not the case anymore, according to Florian's comment on this answer.)
Normally if you want to use libc functions, it's a good idea to define a main function instead of your own _start entry point. Then you can just link with gcc normally, with no special options.
See What parts of this HelloWorld assembly code are essential if I were to write the program in assembly? for that and a version that uses Linux system calls directly, without libc.
If you wanted your code to work in a PIE executable like gcc makes by default (without --no-pie) on recent distros, you'd need call printf wrt ..plt.
Either way, you should use lea rsi, [rel message] because RIP-relative LEA is more efficient than mov r64, imm64 with a 64-bit absolute address. (In position-dependent code, the best option for putting a static address in a 64-bit register is 5-byte mov esi, message, because static addresses in non-PIE executables are known to be in the low 2GiB of virtual address space, and thus work as 32-bit sign- or zero-extended executables.
But RIP-relative LEA is not much worse and works everywhere.)
;;; Defining your own _start but using libc
;;; works on Linux for non-PIE executables
default rel ; Use RIP-relative for [symbol] addressing modes
extern printf
extern exit ; unlike _exit, exit flushes stdio buffers
section .text
global _start
_start:
;; RSP is already aligned by 16 on entry at _start, unlike in functions
lea rdi, [format] ; argument #1 or better mov edi, format
lea rsi, [message] ; argument #2
xor eax, eax ; no FP args to the variadic function
call printf ; for a PIE executable: call printf wrt ..plt
xor edi, edi ; arg #1 = 0
call exit ; exit(0)
; exit definitely does not return
section .rodata ;; read-only data can go in .rodata instead of read-write .data
message: db "Hello, world!", 0
format: db "%s", 0xa, 0
Assemble normally, link with gcc -no-pie -nostartfiles hello.o. This omits the CRT startup files that would normally define a _start that does some stuff before calling main. Libc init functions are called from dynamic linker hooks so printf is usable.
This would not be the case with gcc -static -nostartfiles hello.o. I included examples of what happens if you use the wrong options:
peter#volta:/tmp$ nasm -felf64 nopie-start.asm
peter#volta:/tmp$ gcc -no-pie -nostartfiles nopie-start.o
peter#volta:/tmp$ ./a.out
Hello, world!
peter#volta:/tmp$ file a.out
a.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=0cd1cd111ba0c6926d5d69f9191bdf136e098e62, not stripped
# link error without -no-pie because it doesn't automatically make PLT stubs
peter#volta:/tmp$ gcc -nostartfiles nopie-start.o
/usr/bin/ld: nopie-start.o: relocation R_X86_64_PC32 against symbol `printf##GLIBC_2.2.5' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
# runtime error with -static
peter#volta:/tmp$ gcc -static -no-pie -nostartfiles nopie-start.o -o static_start-hello
peter#volta:/tmp$ ./static_start-hello
Segmentation fault (core dumped)
Alternative version, defining main instead of _start
(And simplifying by using puts instead of printf.)
default rel ; Use RIP-relative for [symbol] addressing modes
extern puts
section .text
global main
main:
sub rsp, 8 ;; RSP was 16-byte aligned *before* a call pushed a return address
;; RSP is now 16-byte aligned, ready for another call
mov edi, message ; argument #1, optimized to use non-PIE-only move imm32
call puts
add rsp, 8 ; restore the stack
xor eax, eax ; return 0
ret
section .rodata
message: db "Hello, world!", 0 ; puts appends a newline
puts pretty much exactly implements printf("%s\n", string); C compilers will make this optimization for you, but in asm you should do it yourself.
Link with gcc -no-pie hello.o, or even statically link using gcc -no-pie -static hello.o. The CRT startup code will call glibc init functions.
peter#volta:/tmp$ nasm -felf64 nopie-main.asm
peter#volta:/tmp$ gcc -no-pie nopie-main.o
peter#volta:/tmp$ ./a.out
Hello, world!
# link error if you leave out -no-pie because of the imm32 absolute address
peter#volta:/tmp$ gcc nopie-main.o
/usr/bin/ld: nopie-main.o: relocation R_X86_64_32 against `.rodata' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: nonrepresentable section on output
collect2: error: ld returned 1 exit status
main is a function, so you need to re-align the stack before making another function call. A dummy push is also a valid way to align the stack on function entry, but add/sub rsp, 8 is clearer.
An alternative is jmp puts to tailcall it, so main's return value will be whatever puts returns. In this case, you must not modify rsp first: you just jump to puts with your return address still on the stack, exactly like if your caller had called puts.
PIE-compatible code defining a main
(You can make a PIE that defines its own _start. That's left as an exercise for the reader.)
default rel ; Use RIP-relative for [symbol] addressing modes
extern puts
section .text
global main
main:
sub rsp, 8 ;; RSP was 16-byte aligned *before* a call pushed a return address
lea rdi, [message] ; argument #1
call puts wrt ..plt
add rsp, 8
xor eax, eax ; return 0
ret
section .rodata
message: db "Hello, world!", 0 ; puts appends a newline
peter#volta:/tmp$ nasm -felf64 pie.asm
peter#volta:/tmp$ gcc pie.o
peter#volta:/tmp$ ./a.out
Hello, world!
peter#volta:/tmp$ file a.out
a.out: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=b27e6032f955d628a542f6391b50805c68541fb9, not stripped
I'm interested in building a static ELF program without (g)libc, using unistd.h provided by the Linux headers.
I've read through these articles/question which give a rough idea of what I'm trying to do, but not quite:
http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
Compiling without libc
https://blogs.oracle.com/ksplice/entry/hello_from_a_libc_free
I have basic code which depends only on unistd.h, of which, my understanding is that each of those functions are provided by the kernel, and that libc should not be needed. Here's the path I've taken that seems the most promising:
$ gcc -I /usr/include/asm/ -nostdlib grabbytes.c -o grabbytesstatic
/usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000400144
/tmp/ccn1mSkn.o: In function `main':
grabbytes.c:(.text+0x38): undefined reference to `open'
grabbytes.c:(.text+0x64): undefined reference to `lseek'
grabbytes.c:(.text+0x8f): undefined reference to `lseek'
grabbytes.c:(.text+0xaa): undefined reference to `read'
grabbytes.c:(.text+0xc5): undefined reference to `write'
grabbytes.c:(.text+0xe0): undefined reference to `read'
collect2: error: ld returned 1 exit status
Before this, I had to manually define SEEK_END and SEEK_SET according to the values found in the kernel headers. Else it would error saying that those were not defined, which makes sense.
I imagine that I need to link into an unstripped vmlinux to provide the symbols to utilize. However, I read through the symbols and while there were plenty of llseeks, they were not llseek verbatim.
So my question can go in a few directions:
How can I specify an ELF file to utilize symbols from? And I'm guessing if/how that's possible, the symbols won't match up. If this is correct, is there an existing header file which will redefine llseek and default_llseek or whatever is exactly in the kernel?
Is there a better way to write Posix code in C without a libc?
My goal is to write or port fairly standard C code using (perhaps solely) unistd.h and invoke it without libc. I'm probably okay without a few unistd functions, and am not sure which ones exist "purely" as kernel calls or not. I love assembly, but that's not my goal here. Hoping to stay as strictly C as possible (I'm fine with a few external assembly files if I have to), to allow for a libc-less static system at some point.
Thank you for reading!
If you're looking to write POSIX code in C, the abandonment of libc is not going to be helpful. Although you could implement a syscall function in assembler, and copy structures and defines from the kernel header, you would essentially be writing your own libc, which almost certainly would not be POSIX compliant. With all the great libc implementations out there, there's almost no reason to begin implementing your own.
dietlibc and musl libc are both frugal libc implementations which yield impressively small binaries The linker is generally smart; as long as a library is written to avoid the accidentally pulling in numerous dependencies, only the functions you use will actually be linked into your program.
Here is a simple hello world program:
#include<unistd.h>
int main(){
char str[] = "Hello, World!\n";
write(1, str, sizeof str - 1);
return 0;
}
Compiling it with musl below yeilds a binary of a less than 3K
$ musl-gcc -Os -static hello.c
$ strip a.out
$ wc -c a.out
2800 a.out
dietlibc produces an even smaller binary, less than 1.5K:
$ diet -Os gcc hello.c
$ strip a.out
$ wc -c a.out
1360 a.out
This is far from ideal, but a little bit of (x86_64) assembler has me down to just under 5KB (but most of that is "other things than code" - the actual code is under 1KB [771 bytes to be precise], but the file size is much larger, I think because the code size is rounded to 4KB, and then some header/footer/extra stuff is added to that]
Here's what I did:
gcc -g -static -nostdlib -o glibc start.s glibc.c -Os -lc
glibc.c contains:
#include <unistd.h>
int main()
{
const char str[] = "Hello, World!\n";
write(1, str, sizeof(str));
_exit(0);
}
start.s contains:
.globl _start
_start:
xor %ebp, %ebp
mov %rdx, %r9
mov %rsp, %rdx
and $~16, %rsp
push $0
push %rsp
call main
hlt
.globl _exit
_exit:
// We known %RDI already has the exit code...
mov $0x3c, %eax
syscall
hlt
That main point of this is not to show that it's not the system call part of glibc that takes up a lot of space, but the "prepar things" - and beware that if you were to call for example printf, possibly even (v)sprintf, or exit(), or any other "standard library" function, you are in the land of "nobody knows what will happen".
Edit: Updated "start.s" to put argc/argv in the right places:
_start:
xor %ebp, %ebp
mov %rdx, %r9
pop %rdi
mov %rsp, %rsi
and $~16, %rsp
push %rax
push %rsp
// %rdi = argc, %rsi=argv
call main
Note that I've changed which register contains what thing, so that it matches main - I had them slightly wrong order in the previous code.
This description is valid for Linux 32 bit:
When a Linux program begins, all pointers to command-line arguments are stored on the stack. The number of arguments is stored at 0(%ebp), the name of the program is stored at 4(%ebp), and the arguments are stored from 8(%ebp).
I need the same information for 64 bit.
Edit:
I have working code sample which shows how to use argc, argv[0] and argv[1]: http://cubbi.com/fibonacci/asm.html
.globl _start
_start:
popq %rcx # this is argc, must be 2 for one argument
cmpq $2,%rcx
jne usage_exit
addq $8,%rsp # skip argv[0]
popq %rsi # get argv[1]
call ...
...
}
It looks like parameters are on the stack. Since this code is not clear, I ask this question. My guess that I can keep rsp in rbp, and then access these parameters using 0(%rbp), 8(%rbp), 16(%rbp) etc. It this correct?
Despite the accepted answer being more than sufficient, I would like to give an explicit answer, as there are some other answers which might confuse.
Most important (for more information see examples below): in x86-64 the command line arguments are passed via stack:
(%rsp) -> number of arguments
8(%rsp) -> address of the name of the executable
16(%rsp) -> address of the first command line argument (if exists)
... so on ...
It is different from the function parameter passing in x86-64, which uses %rdi, %rsi and so on.
One more thing: one should not deduce the behavior from reverse engineering of the C main-function. C runtime provides the entry point _start, wraps the command line arguments and calls main as a common function. To see it, let's consider the following example.
No C runtime/GCC with -nostdlib
Let's check this simple x86-64 assembler program, which do nothing but returns 42:
.section .text
.globl _start
_start:
movq $60, %rax #60 -> exit
movq $42, %rdi #return 42
syscall #run kernel
We build it with:
as --64 exit64.s -o exit64.o
ld -m elf_x86_64 exit64.o -o exit64
or with
gcc -nostdlib exit64.s -o exit64
run in gdb with
./exit64 first second third
and stop at the breakpoint at _start. Let's check the registers:
(gdb) info registers
...
rsi 0x0 0
rdi 0x0 0
...
Nothing there. What about the stack?
(gdb) x/5g $sp
0x7fffffffde40: 4 140737488347650
0x7fffffffde50: 140737488347711 140737488347717
0x7fffffffde60: 140737488347724
So the first element on the stack is 4 - the expected argc. The next 4 values look a lot like pointers. Let's look at the second pointer:
(gdb) print (char[5])*(140737488347711)
$1 = "first"
As expected it is the first command line argument.
So there is experimental evidence, that the command line arguments are passed via stack in x86-64. However only by reading the ABI (as the accepted answer suggested) we can be sure, that this is really the case.
With C runtime
We have to change the program slightly, renaming _start into main, because the entry point _start is provided by the C runtime.
.section .text
.globl main
main:
movq $60, %rax #60 -> exit
movq $42, %rdi #return 42
syscall #run kernel
We build it with (C runtime is used per default):
gcc exit64gcc.s -o exit64gcc
run in gdb with
./exit64gcc first second third
and stop at the breakpoint at main. What is at the stack?
(gdb) x/5g $sp
0x7fffffffdd58: 0x00007ffff7a36f45 0x0000000000000000
0x7fffffffdd68: 0x00007fffffffde38 0x0000000400000000
0x7fffffffdd78: 0x00000000004004ed
It does not look familiar. And registers?
(gdb) info registers
...
rsi 0x7fffffffde38 140737488346680
rdi 0x4 4
...
We can see that rdi contains the argc value. But if we now inspect the pointer in rsi strange things happen:
(gdb) print (char[5])*($rsi)
$1 = "\211\307???"
But wait, the second argument of the main function in C is not char *, but char ** also:
(gdb) print (unsigned long long [4])*($rsi)
$8 = {140737488347644, 140737488347708, 140737488347714, 140737488347721}
(gdb) print (char[5])*(140737488347708)
$9 = "first"
And now we found our arguments, which are passed via registers as it would be for a normal function in x86-64.
Conclusion:
As we can see, the is a difference concerning passing of command line arguments between code using C runtime and code which doesn't.
It looks like section 3.4 Process Initialization, and specifically figure 3.9, in the already mentioned System V AMD64 ABI describes precisely what you want to know.
I do believe what you need to do is check out the x86-64 ABI. Specifically, I think you need to look at section 3.2.3 Parameter Passing.
I'm fairly new to Linux (Ubuntu 10.04) and a total novice to assembler. I was following some tutorials and I couldn't find anything specific to Linux.
So, my question is, what is a good package to compile/run assembler and what are the command line commands to compile/run for that package?
The GNU assembler is probably already installed on your system. Try man as to see full usage information. You can use as to compile individual files and ld to link if you really, really want to.
However, GCC makes a great front-end. It can assemble .s files for you. For example:
$ cat >hello.s <<"EOF"
.section .rodata # read-only static data
.globl hello
hello:
.string "Hello, world!" # zero-terminated C string
.text
.global main
main:
push %rbp
mov %rsp, %rbp # create a stack frame
mov $hello, %edi # put the address of hello into RDI
call puts # as the first arg for puts
mov $0, %eax # return value = 0. Normally xor %eax,%eax
leave # tear down the stack frame
ret # pop the return address off the stack into RIP
EOF
$ gcc hello.s -no-pie -o hello
$ ./hello
Hello, world!
The code above is x86-64. If you want to make a position-independent executable (PIE), you'd need lea hello(%rip), %rdi, and call puts#plt.
A non-PIE executable (position-dependent) can use 32-bit absolute addressing for static data, but a PIE should use RIP-relative LEA. (See also Difference between movq and movabsq in x86-64 neither movq nor movabsq are a good choice.)
If you wanted to write 32-bit code, the calling convention is different, and RIP-relative addressing isn't available. (So you'd push $hello before the call, and pop the stack args after.)
You can also compile C/C++ code directly to assembly if you're curious how something works:
$ cat >hello.c <<EOF
#include <stdio.h>
int main(void) {
printf("Hello, world!\n");
return 0;
}
EOF
$ gcc -S hello.c -o hello.s
See also How to remove "noise" from GCC/clang assembly output? for more about looking at compiler output, and writing useful small functions that will compile to interesting output.
The GNU assembler (gas) and NASM are both good choices. However, they have some differences, the big one being the order you put operations and their operands.
gas uses AT&T syntax (guide: https://stackoverflow.com/tags/att/info):
mnemonic source, destination
nasm uses Intel style (guide: https://stackoverflow.com/tags/intel-syntax/info):
mnemonic destination, source
Either one will probably do what you need. GAS also has an Intel-syntax mode, which is a lot like MASM, not NASM.
Try out this tutorial: http://asm.sourceforge.net/intro/Assembly-Intro.html
See also more links to guides and docs in Stack Overflow's x86 tag wiki
If you are using NASM, the command-line is just
nasm -felf32 -g -Fdwarf file.asm -o file.o
where 'file.asm' is your assembly file (code) and 'file.o' is an object file you can link with gcc -m32 or ld -melf_i386. (Assembling with nasm -felf64 will make a 64-bit object file, but the hello world example below uses 32-bit system calls, and won't work in a PIE executable.)
Here is some more info:
http://www.nasm.us/doc/nasmdoc2.html#section-2.1
You can install NASM in Ubuntu with the following command:
apt-get install nasm
Here is a basic Hello World in Linux assembly to whet your appetite:
http://web.archive.org/web/20120822144129/http://www.cin.ufpe.br/~if817/arquivos/asmtut/index.html
I hope this is what you were asking...
There is also FASM for Linux.
format ELF executable
segment readable executable
start:
mov eax, 4
mov ebx, 1
mov ecx, hello_msg
mov edx, hello_size
int 80h
mov eax, 1
mov ebx, 0
int 80h
segment readable writeable
hello_msg db "Hello World!",10,0
hello_size = $-hello_msg
It comiles with
fasm hello.asm hello
My suggestion would be to get the book Programming From Ground Up:
http://nongnu.askapache.com/pgubook/ProgrammingGroundUp-1-0-booksize.pdf
That is a very good starting point for getting into assembler programming under linux and it explains a lot of the basics you need to understand to get started.
The assembler(GNU) is as(1)
3 syntax (nasm, tasm, gas ) in 1 assembler, yasm.
http://www.tortall.net/projects/yasm/
For Ubuntu 18.04 installnasm . Open the terminal and type:
sudo apt install as31 nasm
nasm docs
For compiling and running:
nasm -f elf64 example.asm # assemble the program
ld -s -o example example.o # link the object file nasm produced into an executable file
./example # example is an executable file
I'm in an interesting problem.I forgot I'm using 64bit machine & OS and wrote a 32 bit assembly code. I don't know how to write 64 bit code.
This is the x86 32-bit assembly code for Gnu Assembler (AT&T syntax) on Linux.
//hello.S
#include <asm/unistd.h>
#include <syscall.h>
#define STDOUT 1
.data
hellostr:
.ascii "hello wolrd\n";
helloend:
.text
.globl _start
_start:
movl $(SYS_write) , %eax //ssize_t write(int fd, const void *buf, size_t count);
movl $(STDOUT) , %ebx
movl $hellostr , %ecx
movl $(helloend-hellostr) , %edx
int $0x80
movl $(SYS_exit), %eax //void _exit(int status);
xorl %ebx, %ebx
int $0x80
ret
Now, This code should run fine on a 32bit processor & 32 bit OS right? As we know 64 bit processors are backward compatible with 32 bit processors. So, that also wouldn't be a problem. The problem arises because of differences in system calls & call mechanism in 64-bit OS & 32-bit OS. I don't know why but they changed the system call numbers between 32-bit linux & 64-bit linux.
asm/unistd_32.h defines:
#define __NR_write 4
#define __NR_exit 1
asm/unistd_64.h defines:
#define __NR_write 1
#define __NR_exit 60
Anyway using Macros instead of direct numbers is paid off. Its ensuring correct system call numbers.
when I assemble & link & run the program.
$cpp hello.S hello.s //pre-processor
$as hello.s -o hello.o //assemble
$ld hello.o // linker : converting relocatable to executable
Its not printing helloworld.
In gdb its showing:
Program exited with code 01.
I don't know how to debug in gdb. using tutorial I tried to debug it and execute instruction by instruction checking registers at each step. its always showing me "program exited with 01". It would be great if some on could show me how to debug this.
(gdb) break _start
Note: breakpoint -10 also set at pc 0x4000b0.
Breakpoint 8 at 0x4000b0
(gdb) start
Function "main" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Temporary breakpoint 9 (main) pending.
Starting program: /home/claws/helloworld
Program exited with code 01.
(gdb) info breakpoints
Num Type Disp Enb Address What
8 breakpoint keep y 0x00000000004000b0 <_start>
9 breakpoint del y <PENDING> main
I tried running strace. This is its output:
execve("./helloworld", ["./helloworld"], [/* 39 vars */]) = 0
write(0, NULL, 12 <unfinished ... exit status 1>
Explain the parameters of write(0, NULL, 12) system call in the output of strace?
What exactly is happening? I want to know the reason why exactly its exiting with exitstatus=1?
Can some one please show me how to debug this program using gdb?
Why did they change the system call numbers?
Kindly change this program appropriately so that it can run correctly on this machine.
EDIT:
After reading Paul R's answer. I checked my files
claws#claws-desktop:~$ file ./hello.o
./hello.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
claws#claws-desktop:~$ file ./hello
./hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped
I agree with him that these should be ELF 32-bit relocatable & executable. But that doesn't answer my my questions. All of my questions still questions. What exactly is happening in this case? Can someone please answer my questions and provide an x86-64 version of this code?
Remember that everything by default on a 64-bit OS tends to assume 64-bit. You need to make sure that you are (a) using the 32-bit versions of your #includes where appropriate (b) linking with 32-bit libraries and (c) building a 32-bit executable. It would probably help if you showed the contents of your makefile if you have one, or else the commands that you are using to build this example.
FWIW I changed your code slightly (_start -> main):
#include <asm/unistd.h>
#include <syscall.h>
#define STDOUT 1
.data
hellostr:
.ascii "hello wolrd\n" ;
helloend:
.text
.globl main
main:
movl $(SYS_write) , %eax //ssize_t write(int fd, const void *buf, size_t count);
movl $(STDOUT) , %ebx
movl $hellostr , %ecx
movl $(helloend-hellostr) , %edx
int $0x80
movl $(SYS_exit), %eax //void _exit(int status);
xorl %ebx, %ebx
int $0x80
ret
and built it like this:
$ gcc -Wall test.S -m32 -o test
verfied that we have a 32-bit executable:
$ file test
test: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.6.4, dynamically linked (uses shared libs), not stripped
and it appears to run OK:
$ ./test
hello wolrd
As noted by Paul, if you want to build 32-bit binaries on a 64-bit system, you need to use the -m32 flag, which may not be available by default on your installation (some 64-bit Linux distros don't include 32-bit compiler/linker/lib support by default).
On the other hand, you could instead build your code as 64-bit, in which case you need to use the 64-bit calling conventions. In that case, the system call number goes in %rax, and the arguments go in %rdi, %rsi, and %rdx
Edit
Best place I've found for this is www.x86-64.org, specifically abi.pdf
64-bit CPUs can run 32-bit code, but they have to use a special mode to do it. Those instructions are all valid in 64-bit mode, so nothing stopped you from building a 64-bit executable.
Your code builds and runs correctly with gcc -m32 -nostdlib hello.S. That's because -m32 defines __i386, so /usr/include/asm/unistd.h includes <asm/unistd_32.h>, which has the right constants for the int $0x80 ABI.
See also Assembling 32-bit binaries on a 64-bit system (GNU toolchain) for more about _start vs. main with/without libc and static vs. dynamic executables.
$ file a.out
a.out: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, BuildID[sha1]=973fd6a0b7fa15b2d95420c7a96e454641c31b24, not stripped
$ strace ./a.out > /dev/null
execve("./a.out", ["./a.out"], 0x7ffd43582110 /* 64 vars */) = 0
strace: [ Process PID=2773 runs in 32 bit mode. ]
write(1, "hello wolrd\n", 12) = 12
exit(0) = ?
+++ exited with 0 +++
Technically, if you'd used the right call numbers, your code would happen to work from 64-bit mode as well: What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code? But int 0x80 is not recommended in 64-bit code. (Actually, it's never recommended. For efficiency, 32-bit code should call through the kernel's exported VDSO page so it can use sysenter for fast system calls on CPUs that support it).
But that doesn't answer my my questions. What exactly is happening in this case?
Good question.
On Linux, int $0x80 with eax=1 is sys_exit(ebx), regardless of what mode the calling process was in. The 32-bit ABI is available in 64-bit mode (unless your kernel was compiled without i386 ABI support), but don't use it. Your exit status is from movl $(STDOUT), %ebx.
(BTW, there's a STDOUT_FILENO macro defined in unistd.h, but you can't #include <unistd.h> from a .S because it also contains C prototypes which aren't valid asm syntax.)
Notice that __NR_exit from unistd_32.h and __NR_write from unistd_64.h are both 1, so your first int $0x80 exits your process. You're using the wrong system call numbers for the ABI you're invoking.
strace is decoding it incorrectly, as if you'd invoked syscall (because that's the ABI a 64-bit process is expected to use). What are the calling conventions for UNIX & Linux system calls on x86-64
eax=1 / syscall means write(rd=edi, buf=rsi, len=rdx), and this is how strace is incorrectly decoding your int $0x80.
rdi and rsi are 0 (aka NULL) on entry to _start, and your code sets rdx=12 with movl $(helloend-hellostr) , %edx.
Linux initializes registers to zero in a fresh process after execve. (The ABI says undefined, Linux chooses zero to avoid info leaks). In your statically-linked executable, _start is the first user-space code that runs. (In a dynamic executable, the dynamic linker runs before _start, and does leave garbage in registers).
See also the x86 tag wiki for more asm links.