Hello all.
So I'm learning assembly.And as per my usual learning steps with any new language I pick up I've arrived at networking with assembly.
Which, sadly isn't going that well as I've pretty much failed at step 0, which would be getting a socket through which communication can begin.
The assembly code should be roughly equal to the following C code:
#include <stdio.h>
#include <sys/socket.h>
int main(){
int sock;
sock = socket(AF_INET, SOCK_STREAM, 0);
}
(Let's ignore the fact that it's not closing the socket for now.)
So here's what I did thus far:
Checked the manual. Which would imply that I need to make a socketcall() this is all good and well. The problem starts with that it would need an int that describes what sort of socketcall it should make. The calls manpage isn't helping much with this either as it only describes that:
On a some architectures—for example, x86-64 and ARM—there is no
socketcall() system call; instead socket(2), accept(2), bind(2), and
so on really are implemented as separate system calls.
Yet there are no such calls in the original list of syscalls - and as far as I know the socket(), accept(), bind(), listen(), etc. are calls from libnet and not from the kernel. This got me utterly confused so I've decided to compile the above C code and check up on it with strace. This yielded the following:
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
While that didn't got me any closer to knowing what socket() is it did explain it's arguments. For witch I don't seem to find the proper documentation (again). I thought that PF_INET, SOCK_STREAM, IPPROTO_IP would be defined in <sys/socket.h> but my grep-ing for them didn't seem to find anything of use. So I decided to just wing it by using gdb in tandem with disass main to find the values. This gave the following output:
Dump of assembler code for function main:
0x00000000004004fd <+0>: push rbp
0x00000000004004fe <+1>: mov rbp,rsp
0x0000000000400501 <+4>: sub rsp,0x10
0x0000000000400505 <+8>: mov edx,0x0
0x000000000040050a <+13>: mov esi,0x1
0x000000000040050f <+18>: mov edi,0x2
0x0000000000400514 <+23>: call 0x400400
0x0000000000400519 <+28>: mov DWORD PTR [rbp-0x4],eax
0x000000000040051c <+31>: leave
0x000000000040051d <+32>: ret
End of assembler dump.
In my experience this would imply that socket() gets it's parameters from EDX (PF_INET), ESI (SOCK_STREAM), and EDI (IPPROTO_IP). Which would be odd for a syscall (as the convention with linux syscalls would be to use EAX/RAX for the call number and other registers for the parameters in increasing order, eg. RBX, RCX, RDX ...). The fact that this is beaing CALL-ed and not INT 0x80'd would also imply that this is not in fact a system call but rather something thats being called from a shared object. Or something.
But then again. Passing arguments in registers is very odd for something that's CALL-ed. Normally as far as I know argument's for called things should be PUSH-ed onto the stack, as the compiler can't know what registers they would try to use.
This behavior becomes even more curious when checking the produced binary with ldd:
linux-vdso.so.1 (0x00007fff4a7fc000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f56b0c61000)
/lib64/ld-linux-x86-64.so.2 (0x00007f56b1037000)
There appears to be no networking library's linked.
And that's the point where I've ran out of ideas.
So I'm asking for the following:
A documentation that describes the x86-64 linux kernel's actual syscalls and their associated numbers. (Preferably as a header file for C.)
The header files that define PF_INET, SOCK_STREAM, IPPROTO_IP as it really bugs me that I wasn't able to find them on my own system.
Maybe a tutorial for networking in assembly on x86-64 linux. (For x86-32 it's easy to find material but for some reason I came up empty with the 64 bits stuff.)
Thanks!
The 64 bit calling convention does use registers to pass arguments, both in user space and to system calls. As you have seen, the user space convention is rdi,rsi, rdx, rcx, r8, r9. For system calls, r10 is used instead of rcx which is clobbered by the syscall instruction. See wikipedia or the ABI documentation for more details.
The definitions of the various constants are hidden in header files, which are nevertheless easily found via a file system search assuming you have the necessary development packages installed. You should look in /usr/include/x86_64-linux-gnu/bits/socket.h and /usr/include/linux/in.h.
As for a system call list, it's trivial to google one, such as this. You can also always look in the kernel source of course.
socket.asm
; Socket
; Compile with: nasm -f elf socket.asm
; Link with (64 bit systems require elf_i386 option): ld -m elf_i386 socket.o -o socket
; Run with: ./socket
%include 'functions.asm'
SECTION .text
global _start
_start:
xor eax, eax ; init eax 0
xor ebx, ebx ; init ebx 0
xor edi, edi ; init edi 0
xor esi, esi ; init esi 0
_socket:
push byte 6 ; push 6 onto the stack (IPPROTO_TCP)
push byte 1 ; push 1 onto the stack (SOCK_STREAM)
push byte 2 ; push 2 onto the stack (PF_INET)
mov ecx, esp ; move address of arguments into ecx
mov ebx, 1 ; invoke subroutine SOCKET (1)
mov eax, 102 ; invoke SYS_SOCKETCALL (kernel opcode 102)
int 80h ; call the kernel
call iprintLF ; call our integer printing function (print the file descriptor in EAX or -1 on error)
_exit:
call quit ; call our quit function
more docs...
this is for x86 system. if you want use for x86_64 system change x86 register to x86_64. for example change 'eax' to 'rax' or 'esp' to 'rsp'. and change syscall value in eax(rax), see https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md
[bits 32]
global _start
section .data
msg: db "Socket Failed To Create!",0xa,0
len: equ $-msg
msg1: db "Socket Created",0xa,0
len1: equ $-msg1
msg2: db "Recv Or Send Failed",0xa,0
len2: equ $-msg2
msg3: db "Shutdown Socket Failed",0xa,0
len3: equ $-msg3
DATASIZE: equ 5
SOCK_STREAM: equ 1
AF_INET: equ 2
AF_INET: equ 2
INADDR_ANY: equ 0
MSG_WAITALL: equ 0x100
MSG_DONTWAIT: equ 0x40
SHUT_RDWR: equ 2
SYS_SOCKET: equ 1 ; sys_socket(2)
SYS_BIND: equ 2 ; sys_bind(2)
SYS_CONNECT: equ 3 ; sys_connect(2)
SYS_LISTEN: equ 4 ; sys_listen(2)
SYS_ACCEPT: equ 5 ; sys_accept(2)
SYS_GETSOCKNAME:equ 6 ; sys_getsockname(2)
SYS_GETPEERNAME:equ 7 ; sys_getpeername(2)
SYS_SOCKETPAIR: equ 8 ; sys_socketpair(2)
SYS_SEND: equ 9 ; sys_send(2)
SYS_RECV: equ 10 ; sys_recv(2)
SYS_SENDTO: equ 11 ; sys_sendto(2)
SYS_RECVFROM: equ 12 ; sys_recvfrom(2)
SYS_SHUTDOWN: equ 13 ; sys_shutdown(2)
SYS_SETSOCKOPT: equ 14 ; sys_setsockopt(2)
SYS_GETSOCKOPT: equ 15 ; sys_getsockopt(2)
SYS_SENDMSG: equ 16 ; sys_sendmsg(2)
SYS_RECVMSG: equ 17 ; sys_recvmsg(2)
SYS_ACCEPT4: equ 18 ; sys_accept4(2)
SYS_RECVMMSG: equ 19 ; sys_recvmmsg(2)
SYS_SENDMMSG: equ 20 ; sys_sendmmsg(2)
struc sockaddr_in, -0x30
.sin_family: resb 2 ;2bytes
.sin_port: resb 2 ;2bytes
.sin_addr: resb 4 ;4bytes
.sin_zero: resb 8 ;8bytes
endstruc
struc socket, -0x40
.socketfd resb 4
.connectionfd resb 4
.count resb 4
.data resb DATASIZE
endstruc
section .text
_start:
push ebp
mov ebp, esp
sub esp, 0x400 ;1024byte
xor edx, edx ;or use cdq
;
; int socket(int domain, int type, int protocol);
; domain: The domain argument specifies a communication domain
;
push edx ; Push protocol
push dword SOCK_STREAM ; Push type
push dword AF_INET ; Push domain
mov ecx, esp ; ECX points to args
mov ebx, SYS_SOCKET ;
mov eax, 0x66 ; socketcall()
int 0x80
cmp eax, 0
jl .socket_failed
mov [ebp + socket.socketfd], eax
;
; fill struct sockaddr_in serv_addr;
;
mov word [ebp + sockaddr_in.sin_family], AF_INET
mov word [ebp + sockaddr_in.sin_port], 0x3905
mov dword [ebp + sockaddr_in.sin_addr], INADDR_ANY
push dword [ebp + sockaddr_in.sin_addr]
push word [ebp + sockaddr_in.sin_port]
push word [ebp + sockaddr_in.sin_family]
mov ecx, esp
;
; int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
;
push byte 0x10 ; sizeof(struct sockaddr)
push ecx ; pointer struct sockaddr
push dword [ebp + socket.socketfd]
mov ecx, esp ; ECX points to args
mov ebx, SYS_BIND ;
mov eax, 0x66
int 0x80
cmp eax, 0
jne .socket_failed
;
; int listen(int sockfd, int backlog);
;
push dword 0x10
push dword [ebp + socket.socketfd]
mov ecx, esp
mov ebx, SYS_LISTEN
mov eax, 0x66
int 0x80
cmp eax, 0
jne .socket_failed
;
; int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
;
xor ebx, ebx
push ebx
push ebx
push dword [ebp + socket.socketfd]
mov ecx, esp
mov ebx, SYS_ACCEPT
mov eax, 0x66
int 0x80
cmp eax, -1
je .socket_failed
mov [ebp + socket.connectionfd], eax
mov dword [ebp + socket.count], 0
.again:
lea edi, [ebp + socket.data]
mov ecx, DATASIZE
mov eax, 0
rep stosd
lea eax, [ebp + socket.data]
;
; ssize_t recv(int sockfd, const void *buf, size_t len, int flags);
;
push dword MSG_WAITALL
push dword DATASIZE
push eax
push dword [ebp + socket.connectionfd]
mov ecx, esp
mov ebx, SYS_RECV
mov eax, 0x66
int 0x80
cmp eax, 0
jle .recv_or_send_failed
mov edx, eax
lea ecx, [ebp + socket.data]
call printk
inc dword [ebp + socket.count]
cmp dword [ebp + socket.count], 5
jle .again
.break:
;
; int shutdown(int sockfd, int how);
;
push dword SHUT_RDWR
push dword [ebp + socket.socketfd]
mov ecx, esp
mov ebx, SYS_SHUTDOWN
mov eax, 0x66
int 0x80
cmp eax, 0
jne .shutdown_failed
;
; int close(int fd)
;
mov ebx, [ebp + socket.connectionfd]
mov eax, 0x06
int 0x80
cmp eax, 0
jne .shutdown_failed
jmp .success
.shutdown_failed:
mov edx, len3
mov ecx, msg3
call printk
jmp .end
.recv_or_send_failed:
mov edx, len2
mov ecx, msg2
call printk
jmp .end
.socket_failed:
mov edx, len
mov ecx, msg
call printk
jmp .end
.success:
mov edx, len1
mov ecx, msg1
call printk
jmp .end
.end:
leave
mov ebx,0 ;first syscall argument: exit code
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
ret
; EDX: message length
; ECX: pointer to message to write
printk:
pusha
mov ebx,1 ;first argument: file handle (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
popa
ret
Related
I'm trying to get information about file permissions. I am using the sys_access system call. Here is my code snippet:
mov eax, 33
mov ebx, fileName
mov ecx, 1
int 80h
cmp eax, 0
jl .error
If eax is -1 there is an error, and I am not getting one, but I need to check all the permissions of the file (owner, group, others). How do I do that?
You can use the kernel function sys_newstat (No. 106 - look at this table) to get the file permissions. The structure stat is a never ending horror, but the following example works at least on my Debian Wheezy 64 bit (NASM, 32-bit and 64-bit modes):
SECTION .data
filename db '/root' ; Just an example, can be replaced with any name
filename_len equ $ - filename ; Length of filename
db 0 ; Terminator for `Int 80h / EAX = 106`
perm_out db 'Permissions: '
perm db 'drwxrwxrwx'
perm_len equ $ - perm ; Index of last character in `perm`
lf db 10
perm_out_len equ $ - perm_out ; Length of `Permissions: ...\n`
SECTION .bss
stat resb 256 ; Way too much, but size is variable depending on OS
SECTION .text
global _start
_start:
mov eax,4 ; sys-out
mov edx,filename_len ; length of string to print
mov ecx,filename ; Pointer to string
mov ebx,1 ; StdOut
int 0x80 ; Call kernel
mov eax,4 ; sys-out
mov edx,1 ; Length of string to print
mov ecx, lf ; Pointer to string
mov ebx,1 ; StdOut
int 0x80 ; Call kernel
mov eax, 106 ; sys_newstat
mov ebx, filename ; Pointer to ASCIIZ file-name
mov ecx, stat ; Pointer to structure stat
int 80h
test eax, eax
jz .noerr
mov eax,1 ; sys_exit
mov ebx,1 ; Exit code, 1=not normal
int 0x80 ; Call kernel
.noerr:
movzx eax, word [stat + 8] ; st_mode (/usr/include/asm/stat.h)
mov ebx, perm_len
; rwx bits
mov ecx, 9
.L1:
sub ebx, 1
shr eax, 1
jc .J1
mov byte [perm + ebx], '-'
.J1:
loop .L1
; directory bit
sub ebx, 1
shr eax, 6
jc .J2
mov byte [perm + ebx], '-'
.J2:
mov eax,4 ; sys-out
mov edx,perm_out_len ; Length of string to print
mov ecx,perm_out ; Pointer to string
mov ebx,1 ; StdOut
int 0x80 ; Call kernel
mov eax,1 ; sys_exit
mov ebx,0 ; Exit code, 0=normal
int 0x80 ; Call kernel
I'm trying to port my stdrepl library to FASM for learning purposes. I know that the GNU readline library already does what I'm trying to do, but I want to learn how to write non-trivial programs in assembly.
In node.js I can easily create a tty by writing:
var stdin = process.stdin;
stdin.setEncoding("utf8");
stdin.setRawMode(true);
stdin.resume();
How do I achieve the same results in pure assembly. I tried reading one byte from stdin at a time in a loop as follows, but it doesn't return the byte right after I hit a key:
oct db ?
mov eax, 3
xor ebx, ebx
mov ecx, oct
mov edx, 1
Note that the data definition oct is not a part of the loop, so please don't smite me for that. I know how to structure an assembly program.
Sorry for the delay (I really should "register" here - that'll get me "notifications", right?). As I said, it's rudimentary and imperfect. Some of the "usual" stuff may be defined elsewhere, but I think you can figure out how to get it to assemble. Just call it - no parameters - and the key is returned in al. Hope it's some use to you!
;-----------------------------
; ioctl subfunctions
%define TCGETS 0x5401 ; tty-"magic"
%define TCSETS 0x5402
; flags for 'em
%define ICANON 2 ;.Do erase and kill processing.
%define ECHO 8 ;.Enable echo.
struc termios
alignb 4
.c_iflag: resd 1 ; input mode flags
.c_oflag: resd 1 ; output mode flags
.c_cflag: resd 1 ; control mode flags
.c_lflag: resd 1 ; local mode flags
.c_line: resb 1 ; line discipline
.c_cc: resb 19 ; control characters
endstruc
;---------------------------------
getc:
push ebp
mov ebp, esp
sub esp, termios_size ; make a place for current kbd mode
push edx
push ecx
push ebx
mov eax, __NR_ioctl ; get current mode
mov ebx, STDIN
mov ecx, TCGETS
lea edx, [ebp - termios_size]
int 80h
; monkey with it
and dword [ebp - termios_size + termios.c_lflag], ~(ICANON | ECHO)
mov eax, __NR_ioctl
mov ebx, STDIN
mov ecx, TCSETS
lea edx, [ebp - termios_size]
int 80h
xor eax, eax
push eax ; this is the buffer to read into
mov eax, __NR_read
mov ebx, STDIN
mov ecx, esp ; character goes on the stack
mov edx, 1 ; just one
int 80h ; do it
; restore normal kbd mode
or dword [ebp - termios_size + termios.c_lflag], ICANON | ECHO
mov eax, __NR_ioctl
mov ebx, STDIN
mov ecx, TCSETS
lea edx, [ebp - termios_size]
int 80h
pop eax ; get character into al
pop ebx ; restore caller's regs
pop ecx
pop edx
mov esp, ebp ; leave
pop ebp
ret
;-------------------------
How come this program is not printing out to the screen, am I missing something on the INT 80 command?
section .bss
section .data
hello: db "Hello World",0xa ;10 is EOL
section .text
global _start
_start:
mov ecx, 0; ; int i = 0;
loop:
mov dl, byte [hello + ecx] ; while(data[i] != EOF) {
cmp dl, 0xa ;
je exit ;
mov ebx, ecx ; store conetents of i (ecx)
; Print single character
mov eax, 4 ; set sys_write syscall
mov ecx, byte [hello + ebx] ; ...
mov edx, 1 ; move one byte at a time
int 0x80 ;
inc ebx ; i++
mov ecx, ebx ; move ebx back to ecx
jmp loop ;
exit:
mov eax, 0x01 ; 0x01 = syscall for exit
int 0x80 ;
ADDITION
My Makefile:
sandbox: sandbox.o
ld -o sandbox sandbox.o
sandbox.o: sandbox.asm
nasm -f elf -g -F stabs sandbox.asm -l sandbox.lst
Modified Code:
section .bss
section .data
hello: db "Hello World",0xa ;10 is EOL
section .text
global _start
_start:
mov ecx, 0; ; int i = 0;
while:
mov dl, byte [hello + ecx] ; while(data[i] != EOF) {
cmp dl, 0xa ;
je exit ;
mov ebx, ecx ; store conetents of i (ecx)
; Print single character
mov eax, 4 ; set sys_write syscall
mov cl, byte [hello + ebx] ; ...
mov edx, 1 ; move one byte at a time
int 0x80 ;
inc ebx ; i++
mov ecx, ebx ; move ebx back to ecx
jmp while ;
exit:
mov eax, 0x01 ; 0x01 = syscall for exit
int 0x80 ;
One of the reasons it's not printing is because ebx is supposed to hold the value 1 to specify stdin, and another is because sys_write takes a pointer (the address of your string) as an argument, not an actual character value.
Anyway, let me show you a simpler way of structuring your program:
section .data
SYS_EXIT equ 1
SYS_WRITE equ 4
STDOUT equ 1
TRAP equ 0x80
NUL equ 0
hello: db "Hello World",0xA,NUL ; 0xA is linefeed, terminate with NUL
section .text
global _start
_start:
nop ; for good old gdb
mov ecx, hello ; ecx is the char* to be passed to sys_write
read:
cmp byte[ecx], NUL ; NUL indicates the end of the string
je exit ; if reached the NUL terminator, exit
; setup the registers for a sys_write call
mov eax, SYS_WRITE ; syscall number for sys_write
mov ebx, STDOUT ; print to stdout
mov edx, 1 ; write 1 char at a time
int TRAP; ; execute the syscall
inc ecx ; increment the pointer to the next char
jmp read ; loop back to read
exit:
mov eax, SYS_EXIT ; load the syscall number for sys_exit
mov ebx, 0 ; return a code of 0
int TRAP ; execute the syscall
It can be simpler to NUL terminate your string as I did, or you could also do $-hello to get it's length at compile time. I also set the registers up for sys_write at each iteration in the loop (as you do), since sys_write doesn't preserve all the registers.
I don't know how you got your code to assemble, but it doesn't assemble over here for a couple of very good reasons.
You cannot use loop as a label name because that name is reserved for the loop instruction.
Your line 20's instruction mov ecx, byte [hello + ebx] doesn't assemble either because the source and destination operands' sizes don't match (byte vs dword). Possible changes:
mov cl, byte [hello + ebx]
mov ecx, dword [hello + ebx]
movzx ecx, byte [hello + ebx]
Was the above not the actual code you had?
I have the following code. It works ok except one thing which limits its usage in other programs. When I run it in the debugger, Linux read system call returns value always bigger than the specified buffer size. Why is it and how to fix it, because it doesn't let the program to loop through the buffer array without a segmentation fault?
SECTION .data
address dd "log.txt", 0
badf dd "Bad file!",0
buffsize dd 1024
size dd 1024
filedesc dd 0
section .bss
buf resb 1024
SECTION .text
global main
main:
mov ebx, address
mov eax, 5 ; open(
mov ecx, 0 ; read-only mode
int 80h ; );
mov [filedesc], eax
read_loop:
mov ebx, [filedesc] ; file_descriptor,
mov eax, 3 ; read(
mov ecx, buf ; *buf,
mov edx, buffsize ; *bufsize
int 80h ; );
test eax, eax
jz done
js badfile
mov eax, 4 ; write(
mov ebx, 1 ; STDOUT,
mov edx, buffsize
mov ecx, buf ; *buf
int 80h
jmp read_loop
badfile:
mov eax, 4 ; write(
mov ebx, 1 ; STDOUT,
mov edx, 10
mov ecx, badf ; *buf
int 80h
done:
mov eax, 6
mov ebx, [filedesc]
int 0x80
mov ebx,0
mov eax,1
int 0x80
mov edx, buffsize ; *bufsize
Is wrong since buffsize is declared as follows:
buffsize dd 1024
the above code will move the address of buffsize to edx. What you want is:
mov edx, [buffsize]
which will move the value stored at buffsize to edx.
You have a few of those type of errors in there.
Could it be a negative error return code?
I don't see any test in your code for negative values.
I am trying to print a single digit integer in nasm assembly on linux. What I currently have compiles fine, but nothing is being written to the screen. Can anyone explain to me what I am doing wrong here?
section .text
global _start
_start:
mov ecx, 1 ; stores 1 in rcx
add edx, ecx ; stores ecx in edx
add edx, 30h ; gets the ascii value in edx
mov ecx, edx ; ascii value is now in ecx
jmp write ; jumps to write
write:
mov eax, ecx ; moves ecx to eax for writing
mov eax, 4 ; sys call for write
mov ebx, 1 ; stdout
int 80h ; call kernel
mov eax,1 ; system exit
mov ebx,0 ; exit 0
int 80h ; call the kernel again
This is adding, not storing:
add edx, ecx ; stores ecx in edx
This copies ecx to eax and then overwrites it with 4:
mov eax, ecx ; moves ecx to eax for writing
mov eax, 4 ; sys call for write
EDIT:
For a 'write' system call:
eax = 4
ebx = file descriptor (1 = screen)
ecx = address of string
edx = length of string
After reviewing the other two answers this is what I finally came up with.
sys_exit equ 1
sys_write equ 4
stdout equ 1
section .bss
outputBuffer resb 4
section .text
global _start
_start:
mov ecx, 1 ; Number 1
add ecx, 0x30 ; Add 30 hex for ascii
mov [outputBuffer], ecx ; Save number in buffer
mov ecx, outputBuffer ; Store address of outputBuffer in ecx
mov eax, sys_write ; sys_write
mov ebx, stdout ; to STDOUT
mov edx, 1 ; length = one byte
int 0x80 ; Call the kernel
mov eax, sys_exit ; system exit
mov ebx, 0 ; exit 0
int 0x80 ; call the kernel again
From man 2 write
ssize_t write(int fd, const void *buf, size_t count);
In addition to the other errors that have been pointed out, write() takes a pointer to the data and a length, not an actual byte itself in a register as you are trying to provide.
So you will have to store your data from a register to memory and use that address (or if it's constant as it currently is, don't load the data into a register but load its address instead).