I have written a simple hello_world program in NASM.
segment .text ; code section
global _start ; must be declared for linker
_start: ; tell linker entry point
mov edx, len ; message len (DX is for data register)
mov ecx, msg ; message to write (CX is for count register)
mov ebx, 1 ; (stdout) file descriptor
mov eax, 1 ; system call number for write system call
int 0x80 ; call kernel
mov eax, 1 ; system call number for exit system call
int 0x80 ; call kernel
section .data
msg db "Hello, World", 0xa
len equ $ - msg
When I assemble this program with the elf64 flag
$ nasm -f elf64 hello_world.nasm
and then link it with ld
$ ld hello_world.o
and finally run a.out, it does not write anything to stdout.
I opened the file unistd_64.h to see which system call number 1 refers to.
// /usr/include/asm/unistd_64.h
#ifndef _ASM_X86_UNISTD_64_H
#define _ASM_X86_UNISTD_64_H 1
#define __NR_read 0
#define __NR_write 1
#define __NR_open 2
#define __NR_close 3
#define __NR_stat 4
As you can see, the system call number for write is 1. The program works if I put 4 instead of 1, but 4 is what unistd_32.h specifies, and I assembled with the elf64 flag, so why does it not work for 64-bit?
For reference, unistd_32.h:
#ifndef _ASM_X86_UNISTD_32_H
#define _ASM_X86_UNISTD_32_H 1
#define __NR_restart_syscall 0
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
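The problem is that you are assembling 64-bit code but still using the 32-bit system call interface. int 0x80 always goes through the 32-bit system call table, even in a 64-bit process, so eax = 1 there means exit, not write: your program exits before it prints anything, which is also why it works when you put 4. In 64-bit code, use the syscall instruction rather than the older, slower int 0x80 software interrupt to trap into the kernel, together with the numbers from unistd_64.h. The System V x86-64 ABI passes system call arguments in rdi, rsi, rdx, r10, r8 and r9 (ordinary function calls use rcx in place of r10), with the call number in rax. In addition, the exit system call number is decimal 60, not 1. Writing segment .text is fine, by the way: NASM accepts segment as a synonym for section.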
section .text ; code section
global _start ; must be declared for linker
_start: ; tell linker entry point
mov rdx, len ; message length (third argument)
mov rsi, msg ; address of the message to write (second argument)
mov rdi, 1 ; (stdout) file descriptor (first argument)
mov rax, 1 ; system call number for write system call
syscall ; call kernel
mov rdi, 0 ; exit status (without this, rdi would still hold 1)
mov rax, 60 ; system call number for exit system call
syscall ; call kernel
section .data
msg db "Hello, World", 0xa
len equ $ - msg
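For comparison, the int 0x80 interface from the question can be kept if the whole program is built as 32-bit code instead. The following is only a minimal sketch: it assumes the kernel has 32-bit (IA-32) emulation enabled, and hello32 is a made-up file name.

```nasm
; Assumed build commands (hello32 is a hypothetical file name):
;   nasm -f elf32 hello32.asm
;   ld -m elf_i386 hello32.o -o hello32
section .text
global _start
_start:
    mov edx, len        ; third argument: message length
    mov ecx, msg        ; second argument: address of the message
    mov ebx, 1          ; first argument: file descriptor 1 (stdout)
    mov eax, 4          ; __NR_write from unistd_32.h
    int 0x80            ; 32-bit system call entry
    xor ebx, ebx        ; exit status 0
    mov eax, 1          ; __NR_exit from unistd_32.h
    int 0x80

section .data
msg db "Hello, World", 0xa
len equ $ - msg
```

Here the numbers come from unistd_32.h and the arguments go in ebx, ecx and edx, which matches what the question's code was already doing apart from the write number.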
I am new to assembly and I have a problem with my code. I am trying to create a simple program using scanf.
This is the code:
global main
extern printf
extern scanf
section .text
section .data
message: db "The result is = %d", 10, 0
request: db "Enter the number: ", 0
integer1: times 4 db 0 ; 32-bits integer = 4 bytes
formatin: db "%d", 0
main:
; Ask for an integer
push request
call printf
add esp, 4 ; remove the parameter
push integer1 ; address of integer1, where the input is going to be stored
push formatin ; arguments are right to left (first parameter)
call scanf
add esp, 8 ; remove the parameters
; Move the value under the address integer1 to EAX
mov eax, [integer1]
; Print out the content of eax register
push rax
push message
call printf
add esp, 8
; Linux terminate the app
MOV AL, 1
MOV EBX, 0
INT 80h
I assemble it with:
nasm -f elf64 -o program.o program.asm
and link it with:
ld -o program program.o
but ld gives me this error:
program.o: In function `main':
program.asm:(.data+0x34): undefined reference to `printf'
program.asm:(.data+0x46): undefined reference to `scanf'
program.asm:(.data+0x5b): undefined reference to `printf'
I am working on 64-bit linux.
Thanks for help.
You're not linking with any libraries with your ld command. scanf and printf are defined in the C library, so you can link with that:
ld -o program program.o -lc
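or you can use some other library that defines those functions, if you have it available. Note that with plain ld you will usually also have to name the dynamic linker (for example -dynamic-linker /lib64/ld-linux-x86-64.so.2) and supply a _start entry point, so linking through gcc (gcc -o program program.o), which adds the C startup code that calls main, is generally simpler.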
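Linking is only part of the story on 64-bit Linux: the code in the question pushes arguments on the stack, which is the 32-bit convention, and places main after section .data. A rough sketch of how the same program might look with the System V x86-64 calling convention follows. It assumes linking with gcc and the -no-pie option; the labels are taken from the question.

```nasm
; Assumed build commands:
;   nasm -f elf64 -o program.o program.asm
;   gcc -no-pie -o program program.o
global main
extern printf
extern scanf

section .data
request:  db "Enter the number: ", 0
formatin: db "%d", 0
message:  db "The result is = %d", 10, 0

section .bss
integer1: resd 1                ; 32-bit integer filled in by scanf

section .text
main:
    sub rsp, 8                  ; keep the stack 16-byte aligned for the calls

    lea rdi, [rel request]      ; first argument: format string
    xor eax, eax                ; variadic call: no vector registers used
    call printf

    lea rdi, [rel formatin]     ; first argument: "%d"
    lea rsi, [rel integer1]     ; second argument: where to store the input
    xor eax, eax
    call scanf

    lea rdi, [rel message]      ; first argument: format string
    mov esi, [rel integer1]     ; second argument: the value that was read
    xor eax, eax
    call printf

    add rsp, 8
    xor eax, eax                ; return 0 from main
    ret
```

Returning from main instead of calling the exit system call directly lets the C library flush its output buffers on the way out.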
I need to make a program that outputs a text file with an extension of .dna. I don't know if I can really do that, or whether the text file will even be compatible with what I need to compare it against afterwards. Anyway, I'm not really sure how to do this. I tried to look for some examples for NASM, but I didn't find much. I have an idea of what I'd need to do, but I just don't know what to call to generate a file.
Afterwards I'd need to write stuff into it, and I'm not really sure how to go about that. Could anyone point me to some examples or something? I just need to see what is required to write my own thing.
Here's an example using system calls. Basically, you just open the file, write some data to it, then close and exit:
; nasm -f elf file.asm
; ld -m elf_i386 file.o
BITS 32
section .data
; don't forget the 0 terminator if it takes a C string!
filename: db 'test.txt', 0
; an error message to be printed with write(). The function doesn't
; use a C string so no need for a 0 here, but we do need length.
error_message: db 'Something went wrong.', 10 ; 10 == \n
; this next line means current location minus the error_message location
; which works out the message length.
; many of the system calls use pointer+length pairs instead of
; 0 terminated strings.
error_message_length: equ $ - error_message
; a message we'll write to our file, same as the error message
hello: db 'Hello, file!', 10 ; the 10 is a newline at the end
hello_length: equ $ - hello
fd: dd 0 ; this is like a global int variable in C
; global variables are generally a bad idea and there's other
; ways to do it, but for simplicity I'm using one here as the
; other ways are a bit more work in asm
section .text
global _start
_start:
; first, open or create the file. in C it would be:
; // $ man 2 creat
; int fd = creat("test.txt", 0644); // the second argument is permission
; we get the syscall numbers from /usr/include/asm/unistd_32.h
mov eax, 8 ; creat
mov ebx, filename ; first argument
mov ecx, 644O ; the suffix O means Octal in nasm, like the leading 0 in C. see: http://www.nasm.us/doc/nasmdoc3.html
int 80h ; calls the kernel
cmp eax, 0 ; a raw system call returns a negative error code on failure (the -1 comes from the C wrapper)
jl error
mov [fd], eax ; the return value is in eax - the file descriptor
; now, we'll write something to the file
; // man 2 write
; write(fd, hello_pointer, hello_length)
mov eax, 4 ; write
mov ebx, [fd]
mov ecx, hello
mov edx, hello_length
int 80h
cmp eax, 0 ; a negative return value means the write failed
; it should also close the file in a normal program upon write error
; since it is open, but meh, since we just terminate the kernel
; will clean up after us
jl error
; and now we close the file
; // man 2 close
; close(fd);
mov eax, 6 ; close
mov ebx, [fd]
int 80h
; and now close the program by calling exit(0);
mov eax, 1 ; exit
mov ebx, 0 ; return value
int 80h
error:
mov eax, 4 ; write
mov ebx, 1 ; write to stdout - file #1
mov ecx, error_message ; pointer to the string
mov edx, error_message_length ; length of the string
int 80h ; print it
mov eax, 1 ; exit
mov ebx, 1 ; return value
int 80h
The executable will be called a.out if you copied my link command above. The -o option to ld changes that.
We can also call C functions, which helps if you need to write out things like numbers.
; nasm -f elf file.asm
; gcc -m32 file.o -nostdlib -lc # notice that we're using gcc to link, makes things a bit easier
; # the options are: -m32, 32 bit, -nostdlib, don't try to use the C lib cuz it will look for main()
; # and finally, -lc to add back some of the C standard library we want
BITS 32
; docs here: http://www.nasm.us/doc/nasmdoc6.html
; we declare the C functions as external symbols. on Linux/ELF the names have no leading underscore (some other platforms add one).
extern fopen
extern fprintf
extern fclose
section .data
; don't forget the 0 terminator if it takes a C string!
filename: db 'test.txt', 0
filemode: db 'wt', 0 ; the mode for fopen in C
format_string: db 'Hello with a number! %d is it.', 10, 0 ; new line and 0 terminator
; an error message to be printed with write(). The function doesn't
; use a C string so no need for a 0 here, but we do need length.
error_message: db 'Something went wrong.', 10 ; 10 == \n
; this next line means current location minus the error_message location
; which works out the message length.
; many of the system calls use pointer+length pairs instead of
; 0 terminated strings.
error_message_length: equ $ - error_message
fp: dd 0 ; this is like a global int variable in C
; global variables are generally a bad idea and there's other
; ways to do it, but for simplicity I'm using one here as the
; other ways are a bit more work in asm
section .text
global _start
_start:
; first, open or create the file. in C it would be:
; FILE* fp = fopen("test.txt", "wt");
; arguments for C functions are pushed on to the stack, right to left.
push filemode ; "wt"
push filename ; "test.txt"
call fopen
add esp, 8 ; we need to clean up our own stack. Since we pushed two four-byte items, we need to pop the 8 bytes back off. Alternatively, we could have called pop twice, but a single add instruction keeps our registers cleaner.
; the return value is in eax, store it in our fp variable after checking for errors
; in C: if(fp == NULL) goto error;
cmp eax, 0 ; check for null
je error
mov [fp], eax
; call fprintf(fp, "format string with %d", 55);
; the 55 is just a random number to print
mov eax, 55
push eax ; all arguments are pushed, right to left. We want a 4 byte int equal to 55, so eax is it
push format_string
mov eax, [fp] ; again using eax as an intermediate to store our 4 bytes as we push to the stack
push eax
call fprintf
add esp, 12 ; 3 words this time to clean up
; fclose(fp);
mov eax, [fp] ; again using eax as an intermediate to store our 4 bytes as we push to the stack
push eax
call fclose
add esp, 4 ; one word to clean up
; the rest is unchanged from the above example
; and now close the program by calling exit(0);
mov eax, 1 ; exit
mov ebx, 0 ; return value
int 80h
error:
mov eax, 4 ; write
mov ebx, 1 ; write to stdout - file #1
mov ecx, error_message ; pointer to the string
mov edx, error_message_length ; length of the string
int 80h ; print it
mov eax, 1 ; exit
mov ebx, 1 ; return value
int 80h
There's a lot more that can be done here, like a few techniques to eliminate those global variables, or better error checking, or even writing a C style main() in assembly. But this should get you started in writing out a text file. Tip: Files are the same as writing to the screen, you just need to open/create them first!
BTW don't mix the system calls and the C library functions on the same file. The C library (fprintf etc) buffers data, the system calls don't. If you mix them, the data might end up written to the file in a surprising order.
In 64-bit code the pattern is the same, but the system call numbers, the registers used for the arguments, and the instruction that enters the kernel (syscall instead of int 80h) are different.
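As a rough illustration of the 64-bit variant, here is a sketch of the first (system call only) example. The numbers 85, 1, 3 and 60 are the x86-64 values for creat, write, close and exit from unistd_64.h. Keeping the descriptor in r12 rather than in a global is a choice made for this sketch, and file64 is a made-up file name.

```nasm
; Assumed build commands (file64 is a hypothetical file name):
;   nasm -f elf64 file64.asm
;   ld file64.o -o file64
section .data
filename:     db 'test.txt', 0
hello:        db 'Hello, file!', 10
hello_length: equ $ - hello

section .text
global _start
_start:
    mov eax, 85             ; creat
    mov rdi, filename       ; first argument: path
    mov esi, 644o           ; second argument: permissions (octal)
    syscall
    test rax, rax           ; negative return value means failure
    js error
    mov r12, rax            ; keep the file descriptor in a register

    mov eax, 1              ; write
    mov rdi, r12            ; first argument: file descriptor
    mov rsi, hello          ; second argument: buffer
    mov edx, hello_length   ; third argument: length
    syscall
    test rax, rax
    js error

    mov eax, 3              ; close
    mov rdi, r12
    syscall

    mov eax, 60             ; exit(0)
    xor edi, edi
    syscall

error:
    mov eax, 60             ; exit(1)
    mov edi, 1
    syscall
```

The syscall instruction itself overwrites rcx and r11, which is one reason the descriptor is parked in r12 here.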
Finally, this same pattern can be used to translate almost any C code to asm - the C calling convention is the same with different functions, and the linux system call convention with the argument placement etc. follows a consistent pattern too.
Further reading:
http://en.wikipedia.org/wiki/X86_calling_conventions#cdecl on the C calling convention
http://docs.cs.up.ac.za/programming/asm/derick_tut/syscalls.html on linux system calls
"What is the purpose of EBP in the following code?" is another SO answer I wrote up a while ago about local variables in asm - this will have hints as to one way to get rid of that global and describes how the C compiler does it. (The other way to get rid of that global is to keep the fd/fp in a register, pushing it onto the stack and popping it back when you need to free up the register for something else.)
And the man pages referenced in the code for each function. From your linux prompt, do things like man 2 write or man 3 fprintf to see more. (System calls are in manual section 2 and C functions are in manual section 3).
According to this paper and a few stackoverflow posts, argc is at the top of the stack and argv is below it.
I've tried about 3-4 different ways of doing it:
Popping it into an initialized variable (.data) - output done by calling printf.
Popping it into uninitialized space (.bss) - output done by calling sys_write()
A mixture of the above + tweaks.
I've been told that argc and argv aren't in the stack by someone on a forum, which I don't understand; how are other people doing it with similar code?
Here's an example of what I've attempted (3 days worth of knowledge - try not to giggle):
section .bss
argc: resd 1 ; alloc 4 bytes for popped value
section .text
global _start
_start:
pop dword[argc] ; pop argc, place in var
mov ebx,0x01 ; file descriptor = STDOUT
mov ecx,argc ; var (addr) - points to buffer
mov edx,1 ; length of buffer (single digit)
mov eax,0x04 ; syscall number for sys_write()
int 0x80 ; request the kernel to make syscall
exit:
mov ebx,0x00 ; arg for sys_exit() - sys_exit(0)
mov eax,0x01 ; syscall number for sys_exit()
int 0x80 ; request the kernel to make syscall
Solution:
section .data
msg db "Value: %d", 10, 0
section .text
global main
extern printf
main:
push dword[esp+4]
push msg
call printf
add esp,8
mov eax,0
ret
The process of getting argc looks ok to me (for a 32-bit Linux machine), although if your entry point is main rather than _start you're 4 bytes off, since the top of the stack then contains the return address to the startup code that called main.
Also, the sys_write system call expects a pointer to a string in ecx. What you're giving it is a pointer to an integer, which isn't the same thing. If you want to print the value of argc you'll have to convert it to a string first (or use the printf function).
Here's some example code (I'm using the GNU assembler since I don't have NASM on this machine):
.data
format: .asciz "%d\n"
.text
.globl main
.type main, @function
main:
pushl 4(%esp) # push argc
pushl $format # push the format string
call printf
addl $8,%esp # pop the arguments
movl $0, %eax # return value
ret
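The other option mentioned above, converting argc to a string and printing it with sys_write, could look roughly like this in NASM. It is a deliberately limited sketch: it only handles argc values from 0 to 9, it reads argc from the top of the stack as it is laid out at _start, and the names digit and argc.asm are made up for the example.

```nasm
; Assumed build commands (argc.asm is a hypothetical file name):
;   nasm -f elf32 argc.asm
;   ld -m elf_i386 argc.o -o argc
section .bss
digit: resb 2               ; one ASCII digit plus a newline

section .text
global _start
_start:
    mov eax, [esp]          ; at _start, argc sits at the top of the stack
    add al, '0'             ; turn 0..9 into the character '0'..'9'
    mov [digit], al
    mov byte [digit+1], 10  ; newline

    mov eax, 4              ; sys_write
    mov ebx, 1              ; file descriptor 1 (stdout)
    mov ecx, digit          ; pointer to the two-byte string
    mov edx, 2              ; length
    int 0x80

    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; exit status 0
    int 0x80
```

Running it as ./argc a b would print 3, because the program name itself counts as the first argument. A general version would need a loop that divides by 10 to produce each digit.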
Is there a way to define a string pointer in the .text part of the assembly code like this?
SECTION .text
global main
main:
fmt: dd "%s", 10, 0
Or maybe have the string constructed and have a register pointing to it, but so that all of this could be done in the .text section?
Assemblers are pretty dumb and you have to write all things explicitly, like this:
SECTION .text
global main
main:
; Some code here, you don't want to execute data.
mov ebx, fmt ; ebx points to fmt[0] ('%')
mov eax, dword [pfmt] ; eax also points to fmt[0] ('%')
; Some more code here, ending in ret or jmp so that execution never falls through into the data below.
pfmt dd fmt ; pfmt is a constant pointer to fmt[0] ('%')
fmt db "%s", 10, 0 ; fmt is a constant string
You may be able to use macros to simplify coding:
%macro LoadRegWithStrAddr 2+
jmp %%endstr
%%str: db %2
%%endstr:
mov %1, %%str
%endmacro
SECTION .text
global main
main:
LoadRegWithStrAddr ebx, "%s", 10, 0 ; ebx points to "%s\n"
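LoadRegWithStrAddr ebx, "%s", 10, 0 expands into the following (in the real expansion NASM replaces each %% label with a unique name of the form ..@N.str):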
jmp %%endstr
%%str: db "%s", 10, 0
%%endstr:
mov ebx, %%str
See NASM documentation.
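To see the macro at work in a complete program, here is a small sketch that keeps the string in .text and prints it with the 32-bit write system call. The string "hi" and the file name textstr are made up for the example; the macro is the one shown above.

```nasm
; Assumed build commands (textstr is a hypothetical file name):
;   nasm -f elf32 textstr.asm
;   ld -m elf_i386 textstr.o -o textstr
%macro LoadRegWithStrAddr 2+
    jmp %%endstr            ; skip over the bytes so they are not executed
%%str: db %2
%%endstr:
    mov %1, %%str           ; load the address of the string
%endmacro

section .text
global _start
_start:
    LoadRegWithStrAddr ecx, "hi", 10   ; ecx points to the 3-byte string in .text
    mov eax, 4              ; sys_write
    mov ebx, 1              ; file descriptor 1 (stdout)
    mov edx, 3              ; length: 'h', 'i', newline
    int 0x80

    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; exit status 0
    int 0x80
```

Reading the string works because .text is mapped readable as well as executable. Writing to it would fault, so this approach only suits constant data.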