This question already has answers here:
Assembling 32-bit binaries on a 64-bit system (GNU toolchain)
(2 answers)
Closed 6 years ago.
I tried to write simple program without main
segment .data
fmt db "test", 0xa, 0
segment .text
global _start
extern printf
_start:
lea rdi, [fmt] ; print simple string
xor eax, eax
call printf
mov eax, 60 ; exit successfully
xor edi, edi
syscall
Compile:
yasm -f elf64 main.s; ld -o main main.o
Got
main.o: In function `_start':
main.s:(.text+0xb): undefined reference to `printf'
How should one fix this?
Cause of Your Error
The reason you get undefined reference to printf while linking is because you didn't link against the C library. As well, when using your own _start label as an entry point there are other issues that must be overcome as discussed below.
Using LIBC without C Runtime Startup Code
To make sure you are using the appropriate dynamic linker at runtime and link against the C library you can use GCC as a frontend for LD. To supply your own _start label and avoid the C runtime startup code use the -nostartfiles option. Although I don't recommend this method, you can link with:
gcc -m64 -nostartfiles main.o -o main
If you wish to use LD you can do it by using a specific dynamic linker and link against the C library with -lc. To link x86_64 code you can use:
ld -melf_x86_64 --dynamic-linker=/lib64/ld-linux-x86-64.so.2 main.o -lc -o main
If you had been compiling and linking for 32-bit, then the linking could have been done with:
ld -melf_i386 --dynamic-linker=/lib/ld-linux.so.2 main.o -lc -o main
Preferred Method Using C Runtime Startup Code
Since you are using the C library you should consider linking against the C runtime. This method also works if you were to create a static executable. The best way to use the C library is to rename your _start label to main and then use GCC to link:
gcc -m64 main.o -o main
Doing it this way allows you to exit your program by returning (using RET) from main the same way a C program ends. This also has the advantage of flushing standard output when the program exits.
Flushing output buffers / exit
Using syscall with EAX=60 to exit won't flush the output buffers. Because of this you may find yourself having to add a newline character on your output to see output before exiting. This is still not a guarantee you will see what you output with printf. Rather than the sys_exit SYSCALL you might consider calling the C library function exit. the MAN PAGE for exit says:
All open stdio(3) streams are flushed and closed. Files created by tmpfile(3) are removed.
I'd recommend replacing the sys_exit SYSCALL with:
extern exit
xor edi, edi ; In 64-bit code RDI is 1st argument = return value
call exit
The printf function is provided by the C standard library, libc. To use it, you need to pass -lc to the linker:
ld -o main main.o -lc
I recommend you to use the C compiler as a frontend for the linker to get possible other flags right, too:
cc -o main -nostartfiles main.o
Note that since your program doesn't initialize the C standard library, e parts of it may not work as expected. If you want to use the C standard library in your assembly program, I recommend you to make your assembly program start from main and to link it by invoking the C compiler:
yasm -f elf64 main.s
cc -o main main.o
Related
I'm trying to invoke sys_write via the syscall instruction under Linux environment on an x86_64 machine with inline assembly in C files. To achieve this, I wrote the following code:
[tjm#ArchPad poc]$ cat poc.c
const char S[] = "abcd\n";
int main() {
__asm __volatile(" \
movq $1, %%rax; \
movq $1, %%rdi; \
movq %0, %%rsi; \
movq $5, %%rdx; \
syscall;"
:
: "i"(S)
);
return 0;
}
While my compiling it with GCC, the following error promoted:
[tjm#ArchPad poc]$ gcc -O0 poc.c -o poc
/usr/bin/ld: /tmp/ccWYxmSb.o: relocation R_X86_64_32S against symbol `S' can not be used when making a PIE object; recompile with -fPIE
collect2: error: ld returned 1 exit status
Then I added -fPIE to the compiler command. Unfortunately, nothing changed:
[tjm#ArchPad poc]$ gcc -fPIE -O0 poc.c -o poc
/usr/bin/ld: /tmp/ccdK36lx.o: relocation R_X86_64_32S against symbol `S' can not be used when making a PIE object; recompile with -fPIE
collect2: error: ld returned 1 exit status
This is so weird that it made me curious to compile and link it manually:
[tjm#ArchPad poc]$ gcc -fPIE -O0 poc.c -S
[tjm#ArchPad poc]$ as -O0 poc.s -o poc.o
[tjm#ArchPad poc]$ ld -O0 -melf_x86_64 --dynamic-linker /usr/lib/ld-linux-x86-64.so.2 /usr/lib/crt{1,i,n}.o -lc poc.o -o poc
Surprisingly, no error occurred during the attempt above. I then tested the executable's functionality, which seems working:
[tjm#ArchPad poc]$ ./poc
abcd
Any ideas of why this happens and how to prevent GCC from promoting the error above?
The immediate move instruction movq $S, %rsi, despite taking a 64-bit register, takes only a 32-bit signed immediate. When building a position-independent executable which is to run with ASLR, as is the default on most Linux systems, your program is typically not located in the low or high 31 bits of virtual memory, so the address of a global or static object like S won't fit in a signed 32-bit immediate.
One fix is to use movabs instead, which does take a full 64-bit immediate. Replacing movq %0, %%rsi with movabs %0, %%rsi lets the code build. However, the better approach is to use RIP-relative addressing lea S(%rip), %rsi, which is a shorter instruction and avoids the need for a relocation at load time. This is a little awkward to do inside inline asm, so you can let the compiler load the address into the register for you, with an input operand like "S" (S) (confusingly the constraint for the rsi register is S, which happens to coincide with the name you chose for your array variable).
When you linked manually, you built a non-position-independent executable, which meant that treating the address as a constant worked. You can get the same effect by passing -no-pie to gcc.
There is another serious problem with your code, in that it doesn't declare the registers it clobbers - not only those which you explicitly mov into, but also rcx and r11 which syscall modifies implicitly. You should take a look at How to invoke a system call via syscall or sysenter in inline assembly? which has a correct example. It would be wise to study those examples carefully - GCC inline asm is powerful, but also very easy to get wrong in subtle ways that the compiler will not help you detect, and which may appear to work in a simple program but fail unpredictably in a more complex one.
GCC supports multiple dialects for the ASM instructions and the default can vary. Read more here:
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Extended-Asm
... and of course, the dialects are not compatible with each other.
I have a working position independent Linux freestanding x86_64 hello world:
main.S
.text
.global _start
_start:
asm_main_after_prologue:
/* Write */
mov $1, %rax /* syscall number */
mov $1, %rdi /* stdout */
lea msg(%rip), %rsi /* buffer */
mov $len, %rdx /* len */
syscall
/* Exit */
mov $60, %rax /* syscall number */
mov $0, %rdi /* exit status */
syscall
msg:
.ascii "hello\n"
len = . - msg
which I can assemble and run with:
as -o main.o main.S
ld -o main.out main.o
./main.out
Since it is position independent due to the RIP relative load, now I wanted to link it as a PIE and see it get loaded at random addresses every time to have some fun.
First I tried:
ld -pie -o main.out main.o
but then running it fails with:
-bash: ./main.out: No such file or directory
and readelf -Wa says that a weird interpreter /lib/ld64.so.1 was used instead of the regular one /lib64/ld-linux-x86-64.so.2 for some reason.
I then learnt that his is actually the recommended System V AMD64
ABI interpreter name at 5.2.1 "Program Interpreter".
In any case, I then try to force matters with:
ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie -o main.out main.o
and now it works: I get hello and the executable gets loaded to a different address every time according to GDB.
Finally, as a final step, I wanted to also make that executable be statically linked to make things even more minimal, and possibly get rid of the explicit -dynamic-linker.
That's what I could not do, and this is why I'm asking here.
If I try either of:
ld -static -pie -o main.out main.o
ld -static -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie -o main.out main.o
-static does not seem to make any difference: I still get dynamic executables.
After quickly glancing at the kernel 5.0 source code in fs/binfmt_elf.c I saw this interesting comment:
* There are effectively two types of ET_DYN
* binaries: programs (i.e. PIE: ET_DYN with INTERP)
* and loaders (ET_DYN without INTERP, since they
* _are_ the ELF interpreter). The loaders must
so I guess when I achieve what I want, I will have a valid interpreter, and I'm so going to use my own minimal hello world as the interpreter of another program.
One thing I might try later on is see how some libc implementation compiles its loader and copy it.
Related question: Compile position-independent executable with statically linked library on 64 bit machine but that mentions an external library, so hopefully this is more minimal and answerable.
Tested in Ubuntu 18.10.
You want to add --no-dynamic-linker to your link command:
$ ld main.o -o main.out -pie --no-dynamic-linker
$ file main.out
main.out: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, not stripped
$ ./main.out
hello
so I guess when I achieve what I want, I will have a valid interpreter, and I'm so going to use my own minimal hello world as the interpreter of another program.
I am not sure I understood what you are saying correctly. If you meant that main.out would have itself as its interpreter, that's wrong.
P.S. GLIBC-2.27 added support for -static-pie, so you no longer have to resort to assembly to get a statically linked PIE binary. But you'll have to use very recent GCC and GLIBC.
I'm getting a undefined reference to _printf when building an assembly program that defines its own _start instead of main, using NASM on x86-64 Ubuntu
Build commands:
nasm -f elf64 hello.asm
ld -s -o hello hello.o
hello.o: In function `_start':
hello.asm:(.text+0x1a): undefined reference to `_printf'
MakeFile:4: recipe for target 'compile' failed
make: *** [compile] Error 1
nasm source:
extern _printf
section .text
global _start
_start:
mov rdi, format ; argument #1
mov rsi, message ; argument #2
mov rax, 0
call _printf ; call printf
mov rax, 0
ret ; return 0
section .data
message: db "Hello, world!", 0
format: db "%s", 0xa, 0
Hello, World! should be the output
3 problems:
GNU/Linux using ELF object files does not decorate / mangle C names with a leading underscore. Use call printf, not _printf (Unlike MacOS X, which does decorate symbols with an _; keep that in mind if you're looking at tutorials for other OSes. Windows also uses a different calling convention, but only 32-bit Windows mangles names with _ or other decorations that encode the choice of calling convention.)
You didn't tell ld to link libc, and you didn't define printf yourself, so you didn't give the linker any input files that contain a definition for that symbol. printf is a library function defined in libc.so, and unlike the GCC front-end, ld doesn't include it automatically.
_start is not a function, you can't ret from it. RSP points to argc, not a return address. Define main instead if you want it to be a normal function.
Link with gcc -no-pie -nostartfiles hello.o -o hello if you want a dynamic executable that provides its own _start instead of main, but still uses libc.
This is safe for dynamic executables on GNU/Linux, because glibc can run its init functions via dynamic linker hooks. It's not safe on Cygwin, where its libc is only initialized by calls from its CRT start file (which do that before calling main).
Use call exit to exit, instead of making an _exit system call directly if you use printf; that lets libc flush any buffered output. (If you redirect output to a file, stdout will be full-buffered, vs. line buffered on a terminal.)
-static would not be safe; in a static executable no dynamic-linker code runs before your _start, so there's no way for libc to get itself initialized unless you call the functions manually. That's possible, but generally not recommended.
There are other libc implementations that don't need any init functions called before printf / malloc / other functions work. In glibc, stuff like the stdio buffers are allocated at runtime. (This used to be the case for MUSL libc, but that's apparently not the case anymore, according to Florian's comment on this answer.)
Normally if you want to use libc functions, it's a good idea to define a main function instead of your own _start entry point. Then you can just link with gcc normally, with no special options.
See What parts of this HelloWorld assembly code are essential if I were to write the program in assembly? for that and a version that uses Linux system calls directly, without libc.
If you wanted your code to work in a PIE executable like gcc makes by default (without --no-pie) on recent distros, you'd need call printf wrt ..plt.
Either way, you should use lea rsi, [rel message] because RIP-relative LEA is more efficient than mov r64, imm64 with a 64-bit absolute address. (In position-dependent code, the best option for putting a static address in a 64-bit register is 5-byte mov esi, message, because static addresses in non-PIE executables are known to be in the low 2GiB of virtual address space, and thus work as 32-bit sign- or zero-extended executables.
But RIP-relative LEA is not much worse and works everywhere.)
;;; Defining your own _start but using libc
;;; works on Linux for non-PIE executables
default rel ; Use RIP-relative for [symbol] addressing modes
extern printf
extern exit ; unlike _exit, exit flushes stdio buffers
section .text
global _start
_start:
;; RSP is already aligned by 16 on entry at _start, unlike in functions
lea rdi, [format] ; argument #1 or better mov edi, format
lea rsi, [message] ; argument #2
xor eax, eax ; no FP args to the variadic function
call printf ; for a PIE executable: call printf wrt ..plt
xor edi, edi ; arg #1 = 0
call exit ; exit(0)
; exit definitely does not return
section .rodata ;; read-only data can go in .rodata instead of read-write .data
message: db "Hello, world!", 0
format: db "%s", 0xa, 0
Assemble normally, link with gcc -no-pie -nostartfiles hello.o. This omits the CRT startup files that would normally define a _start that does some stuff before calling main. Libc init functions are called from dynamic linker hooks so printf is usable.
This would not be the case with gcc -static -nostartfiles hello.o. I included examples of what happens if you use the wrong options:
peter#volta:/tmp$ nasm -felf64 nopie-start.asm
peter#volta:/tmp$ gcc -no-pie -nostartfiles nopie-start.o
peter#volta:/tmp$ ./a.out
Hello, world!
peter#volta:/tmp$ file a.out
a.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=0cd1cd111ba0c6926d5d69f9191bdf136e098e62, not stripped
# link error without -no-pie because it doesn't automatically make PLT stubs
peter#volta:/tmp$ gcc -nostartfiles nopie-start.o
/usr/bin/ld: nopie-start.o: relocation R_X86_64_PC32 against symbol `printf##GLIBC_2.2.5' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
# runtime error with -static
peter#volta:/tmp$ gcc -static -no-pie -nostartfiles nopie-start.o -o static_start-hello
peter#volta:/tmp$ ./static_start-hello
Segmentation fault (core dumped)
Alternative version, defining main instead of _start
(And simplifying by using puts instead of printf.)
default rel ; Use RIP-relative for [symbol] addressing modes
extern puts
section .text
global main
main:
sub rsp, 8 ;; RSP was 16-byte aligned *before* a call pushed a return address
;; RSP is now 16-byte aligned, ready for another call
mov edi, message ; argument #1, optimized to use non-PIE-only move imm32
call puts
add rsp, 8 ; restore the stack
xor eax, eax ; return 0
ret
section .rodata
message: db "Hello, world!", 0 ; puts appends a newline
puts pretty much exactly implements printf("%s\n", string); C compilers will make this optimization for you, but in asm you should do it yourself.
Link with gcc -no-pie hello.o, or even statically link using gcc -no-pie -static hello.o. The CRT startup code will call glibc init functions.
peter#volta:/tmp$ nasm -felf64 nopie-main.asm
peter#volta:/tmp$ gcc -no-pie nopie-main.o
peter#volta:/tmp$ ./a.out
Hello, world!
# link error if you leave out -no-pie because of the imm32 absolute address
peter#volta:/tmp$ gcc nopie-main.o
/usr/bin/ld: nopie-main.o: relocation R_X86_64_32 against `.rodata' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: nonrepresentable section on output
collect2: error: ld returned 1 exit status
main is a function, so you need to re-align the stack before making another function call. A dummy push is also a valid way to align the stack on function entry, but add/sub rsp, 8 is clearer.
An alternative is jmp puts to tailcall it, so main's return value will be whatever puts returns. In this case, you must not modify rsp first: you just jump to puts with your return address still on the stack, exactly like if your caller had called puts.
PIE-compatible code defining a main
(You can make a PIE that defines its own _start. That's left as an exercise for the reader.)
default rel ; Use RIP-relative for [symbol] addressing modes
extern puts
section .text
global main
main:
sub rsp, 8 ;; RSP was 16-byte aligned *before* a call pushed a return address
lea rdi, [message] ; argument #1
call puts wrt ..plt
add rsp, 8
xor eax, eax ; return 0
ret
section .rodata
message: db "Hello, world!", 0 ; puts appends a newline
peter#volta:/tmp$ nasm -felf64 pie.asm
peter#volta:/tmp$ gcc pie.o
peter#volta:/tmp$ ./a.out
Hello, world!
peter#volta:/tmp$ file a.out
a.out: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=b27e6032f955d628a542f6391b50805c68541fb9, not stripped
I'm learning assembly with NASM for a class I have in college. I would like to link the C Runtime Library with ld, but I just can't seem to wrap my head around it. I have a 64 bit machine with Linux Mint installed.
The reason I'm confused is that -- to my knowledge -- instead of linking the C runtime, gcc copies the things that you need into your program. I might be wrong though, so don't hesitate to correct me on this, please.
What I did up to this point is, to link it using gcc. That produces a mess of a machine code that I'm unable to follow though, even for a small program like swapping rax with rbx, which isn't that great for learning purposes. (Please note that the program works.)
I'm not sure if it's relevant, but these are the commands that I'm using to compile and link:
# compilation
nasm -f elf64 swap.asm
# gcc
gcc -o swap swap.o
# ld, no c runtime
ld -s -o swap swap.o
Thank you in advance!
Conclusion:
Now that I have a proper answer to the question, here are a few things that I would like to mention. Linking glibc dynamically can be done like in Z boson's answer (for 64 bit systems). If you would like to do it statically, do follow this link (that I'm re-posting from Z boson's answer).
Here's an article that Jester posted, about how programs start in linux.
To see what gcc does to link your .o-s, try this command out: gcc -v -o swap swap.o. Note that 'v' stands for 'verbose'.
Also, you should read this if you are interested in 64 bit assembly.
Thank you for your answers and helpful insight! End of speech.
Here is an example which uses libc without using GCC.
extern printf
extern _exit
section .data
hello: db 'Hello world!',10
section .text
global _start
_start:
xor eax, eax
mov edi, hello
call printf
mov rax, 0
jmp _exit
Compile and link like this:
nasm -f elf64 hello.asm
ld hello.o -dynamic-linker /lib64/ld-linux-x86-64.so.2 -lc -m elf_x86_64
This has worked fine so far for me but for static linkage it's complicated.
If you want to call simple library functions like atoi, but still avoid using the C runtime, you can do that. (i.e. you write _start, rather than just writing a main that gets called after a bunch of boiler-plate code runs.)
gcc -o swap -nostartfiles swap.o
As people say in comments, some parts of glibc depend on constructors/destructors run from the standard startup files. Probably this is the case for stdio (puts/printf/scanf/getchar), and maybe malloc. A lot of functions are "pure" functions that just process the input they're given, though. sprintf/sscanf might be ok to use.
For example:
$ cat >exit64.asm <<EOF
section .text
extern exit
global _start
_start:
xor edi, edi
jmp exit ; doesn't return, so optimize like a tail-call
;; or make the syscall directly, if the jmp is commented
mov eax, 231 ; exit(0)
syscall
; movl eax, 1 ; 32bit call
; int 0x80
EOF
$ yasm -felf64 exit64.asm && gcc -nostartfiles exit64.o -o exit64-dynamic
$ nm exit64-dynamic
0000000000601020 D __bss_start
0000000000600ec0 d _DYNAMIC
0000000000601020 D _edata
0000000000601020 D _end
U exit##GLIBC_2.2.5
0000000000601000 d _GLOBAL_OFFSET_TABLE_
00000000004002d0 T _start
$ ltrace ./exit64-dynamic
enable_breakpoint pid=11334, addr=0x1, symbol=(null): Input/output error
exit(0 <no return ...>
+++ exited (status 0) +++
$ strace ... # shows the usual system calls by the runtime dynamic linker
I'm interested in building a static ELF program without (g)libc, using unistd.h provided by the Linux headers.
I've read through these articles/question which give a rough idea of what I'm trying to do, but not quite:
http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
Compiling without libc
https://blogs.oracle.com/ksplice/entry/hello_from_a_libc_free
I have basic code which depends only on unistd.h, of which, my understanding is that each of those functions are provided by the kernel, and that libc should not be needed. Here's the path I've taken that seems the most promising:
$ gcc -I /usr/include/asm/ -nostdlib grabbytes.c -o grabbytesstatic
/usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 0000000000400144
/tmp/ccn1mSkn.o: In function `main':
grabbytes.c:(.text+0x38): undefined reference to `open'
grabbytes.c:(.text+0x64): undefined reference to `lseek'
grabbytes.c:(.text+0x8f): undefined reference to `lseek'
grabbytes.c:(.text+0xaa): undefined reference to `read'
grabbytes.c:(.text+0xc5): undefined reference to `write'
grabbytes.c:(.text+0xe0): undefined reference to `read'
collect2: error: ld returned 1 exit status
Before this, I had to manually define SEEK_END and SEEK_SET according to the values found in the kernel headers. Else it would error saying that those were not defined, which makes sense.
I imagine that I need to link into an unstripped vmlinux to provide the symbols to utilize. However, I read through the symbols and while there were plenty of llseeks, they were not llseek verbatim.
So my question can go in a few directions:
How can I specify an ELF file to utilize symbols from? And I'm guessing if/how that's possible, the symbols won't match up. If this is correct, is there an existing header file which will redefine llseek and default_llseek or whatever is exactly in the kernel?
Is there a better way to write Posix code in C without a libc?
My goal is to write or port fairly standard C code using (perhaps solely) unistd.h and invoke it without libc. I'm probably okay without a few unistd functions, and am not sure which ones exist "purely" as kernel calls or not. I love assembly, but that's not my goal here. Hoping to stay as strictly C as possible (I'm fine with a few external assembly files if I have to), to allow for a libc-less static system at some point.
Thank you for reading!
If you're looking to write POSIX code in C, the abandonment of libc is not going to be helpful. Although you could implement a syscall function in assembler, and copy structures and defines from the kernel header, you would essentially be writing your own libc, which almost certainly would not be POSIX compliant. With all the great libc implementations out there, there's almost no reason to begin implementing your own.
dietlibc and musl libc are both frugal libc implementations which yield impressively small binaries The linker is generally smart; as long as a library is written to avoid the accidentally pulling in numerous dependencies, only the functions you use will actually be linked into your program.
Here is a simple hello world program:
#include<unistd.h>
int main(){
char str[] = "Hello, World!\n";
write(1, str, sizeof str - 1);
return 0;
}
Compiling it with musl below yeilds a binary of a less than 3K
$ musl-gcc -Os -static hello.c
$ strip a.out
$ wc -c a.out
2800 a.out
dietlibc produces an even smaller binary, less than 1.5K:
$ diet -Os gcc hello.c
$ strip a.out
$ wc -c a.out
1360 a.out
This is far from ideal, but a little bit of (x86_64) assembler has me down to just under 5KB (but most of that is "other things than code" - the actual code is under 1KB [771 bytes to be precise], but the file size is much larger, I think because the code size is rounded to 4KB, and then some header/footer/extra stuff is added to that]
Here's what I did:
gcc -g -static -nostdlib -o glibc start.s glibc.c -Os -lc
glibc.c contains:
#include <unistd.h>
int main()
{
const char str[] = "Hello, World!\n";
write(1, str, sizeof(str));
_exit(0);
}
start.s contains:
.globl _start
_start:
xor %ebp, %ebp
mov %rdx, %r9
mov %rsp, %rdx
and $~16, %rsp
push $0
push %rsp
call main
hlt
.globl _exit
_exit:
// We known %RDI already has the exit code...
mov $0x3c, %eax
syscall
hlt
That main point of this is not to show that it's not the system call part of glibc that takes up a lot of space, but the "prepar things" - and beware that if you were to call for example printf, possibly even (v)sprintf, or exit(), or any other "standard library" function, you are in the land of "nobody knows what will happen".
Edit: Updated "start.s" to put argc/argv in the right places:
_start:
xor %ebp, %ebp
mov %rdx, %r9
pop %rdi
mov %rsp, %rsi
and $~16, %rsp
push %rax
push %rsp
// %rdi = argc, %rsi=argv
call main
Note that I've changed which register contains what thing, so that it matches main - I had them slightly wrong order in the previous code.