Converting C code to x86-64 bit Assembly? - linux

I am trying to create a random letter generator for my string and I've been given a C code and will have to convert it into Assembly language for my program. I'm doing this in x86-64 bit NASM assembly language. I'm supposed to be using only system calls and not C/C++ function calls.
Here's the C/C++ code that I've got to convert:
int genran(int x,int y)
{
int a = 0;
a = a + x * 1103515245 + 12345;
return (unsigned int)(a / 65536) % (y + 1);
}
I am new to assembly and any help is appreciated. Here's what I've got so far; I know it's somewhat wrong, but I will be working on improving it:
section .data
string db "The random string is generated below: "
len_string equ $-string
a dd 0
x dd ?
y dd ?
rem dd 0
section .bss
string_buff resb 21 ;Our string's length is 20 characters
section .txt
global main
main:
mov rax, 1
mov rdi, 1
mov rsi, string
mov rdx, len_string
syscall
mov rax, a
mov rbx, x ;we have to come up with a value of x?
mul rbx, 1103515245
add rbx, 12345
mov rax, rbx
div rax, 65536
mov rdx, y ;we have to come up with a value of y?
add rdx, 1
;mod rax, rdx
;ret
exit:
mov rax, 60
xor rdi, rdi
syscall

mov rax, rbx
div rax, 65536
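Please read the instruction reference; this form doesn't exist. div takes only one explicit operand, the divisor, and it must be a register or memory location, not an immediate. The dividend is implicitly rdx:rax; the quotient comes back in rax and the remainder in rdx. The code should look like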
mov rax, rbx
xor rdx, rdx ; Div takes rdx:rax as an implicit 128 bit argument.
mov rcx, 65536
div rcx
At this skill level I don't recommend trying to use the read or write system calls directly. I strongly recommend calling the C standard library until you get the hang of it; read and write have too many gotchas. When you do write them yourself, write them once and keep them in a mini-library for any further assembly work you do.
We can skip read by taking argument off the command line as follows:
mov rbx, [rsp + 16] ; argv[1]
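Unless I'm very much mistaken, at the process entry point (_start) the first thing on the stack is argc, and then argv[0], then argv[1], ... Note that argv[1] is a pointer to the argument's text, so it still has to be converted to a number. Inside a main that is called by the C runtime, argc and argv arrive in rdi and rsi instead.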

Related

Displaying a number in x86-64 assembly with only Linux system calls [duplicate]

This question already has answers here:
Why should EDX be 0 before using the DIV instruction?
(2 answers)
How do I print an integer in Assembly Level Programming without printf from the c library? (itoa, integer to decimal ASCII string)
(5 answers)
Closed 2 years ago.
Previous version of this question (that question originally had a different problem, even though the code there is now the same as the code in this question)
I am trying to make code to display a number on console in Linux 64 bit NASM, without the use of c/c++ functions (pure assembly). The code compiles and links fine but it will not give output...
It displays just newline for some time and then displays '7' forever. I am new to Assembly so I don't know what is wrong. Please help... Here is the code:
section .data
num: dq 102 ;my default number to get the reverse of (for now)
nl: db 0x0a
nlsize: equ $-nl
ten: dq 10
section .bss
rem: resq 1
remsize: equ $-rem
section .text
global _start
_start:
cmp qword [num], 0
jng _exit ;jump to _exit if num is not greater than 0
mov rax, [num] ;move the number to rax
mov rbx, [num] ;move the number to rbx as well so that i have original number in register to subtract and get the remainder
mov rcx, [ten] ;move 10 to rcx to be the divisor
div rcx ;divide number in rax by 10
mov [num], rax ;get the quotient to get the remaining number for quotient
mul rcx ;multiply number in rax by 10
sub rbx, rax ;subtract rbx - rax and store the value in rax (right?)
mov [rem], rbx ;get the remainder from rax. this must be done right after div (WHY??????????)
call _disprem ;call _disprem to display the remainder... call returns the flow back to the caller right?
jmp _start ;get to the loop again
_exit:
mov rax, 60
mov rdi, 0
syscall
_newl:
mov rax, 1
mov rdi, 1
mov rsi, nl
mov rdx, nlsize
syscall
ret
_disprem:
mov rax, 1
mov rdi, 1
add qword [rem], 0x0000000000000030 ;since the rem variable is quadword (64 bit)
mov rsi, rem ;for getting ascii value (48 is ascii 0 in decimal) to convert the rem to character
mov rdx, remsize
syscall
sub qword[rem], 0x0000000000000030 ;get me my original number back plz thanks
call _newl
ret

Displaying a number - assembly code not working Linux, x64 (NASM) [duplicate]

This question already has answers here:
Displaying a number in x86-64 assembly with only Linux system calls [duplicate]
How do I print an integer in Assembly Level Programming without printf from the c library? (itoa, integer to decimal ASCII string)
(5 answers)
Closed 2 years ago.
I am learning assembly on Linux (NASM) x64 machine (I don't have access to 32 or 16 bit machine), and I am trying to display number on screen (reverse of number according to code but that's a start).
Number is predefined in section .data -> num.
I am quite a newbie at assembly programming, and due to the lack of material on x64 assembly (really, I can't find much, and what I was able to find was quite confusing) I am unable to resolve the issue.
The issue is that the code compiles and links with no errors or warnings, but it just displays some spaces (not even a newline). If I remove the call _newl code from _disprem, those spaces are also gone. There is not even a segmentation fault or anything.
By the way, algorithm to get the remainder (to get the digits in a number) is num - (num / 10) * 10
section .data
num: dq 102 ;my default number to get the reverse of (for now)
nl: db 0x0a
nlsize: equ $-nl
ten: dq 10
section .bss
rem: resq 1
remsize: equ $-rem
section .text
global _start
_start:
cmp qword [num], 0
jng _exit ;jump to _exit if num is not greater than 0
mov rax, [num] ;move the number to rax
mov rbx, [num] ;move the number to rbx as well so that i have original number in register to subtract and get the remainder
mov rcx, [ten] ;move 10 to rcx to be the divisor
div rcx ;divide number in rax by 10
mov [num], rax ;get the quotient to get the remaining number for quotient
mul rcx ;multiply number in rax by 10
sub rbx, rax ;subtract rbx - rax and store the value in rax (right?)
mov [rem], rbx ;get the remainder from rax. this must be done right after div (WHY??????????)
call _disprem ;call _disprem to display the remainder... call returns the flow back to the caller right?
jmp _start ;get to the loop again
_exit:
mov rax, 60
mov rdi, 0
syscall
_newl:
mov rax, 1
mov rdi, 1
mov rsi, nl
mov rdx, nlsize
syscall
ret
_disprem:
mov rax, 1
mov rdi, 1
add qword [rem], 0x0000000000000030 ;since the rem variable is quadword (64 bit)
mov rsi, rem
mov rdx, remsize
syscall
sub qword [rem], 0x0000000000000030 ;get me my original number back plz thanks
call _newl
ret

x86-64 Bit Assembly Linux Input

I'm trying to read input into my program... All it does is run through and print a '0' to the screen. I'm pretty sure that the PRINTDECI function works; I made it a while ago and it works. Do I just have to loop over the input code and only exit when I enter a certain value? I'm not sure how I would do that... unless it's by ASCII values, which might suck... Anyway, here's my code (Yasm, a NASM clone, Intel syntax):
GLOBAL _start
SECTION .text
PRINTDECI:
LEA R9,[NUMBER + 18] ; last character of buffer
MOV R10,R9 ; copy the last character address
MOV RBX,10 ; base10 divisor
DIV_BY_10:
XOR RDX,RDX ; zero rdx for div
DIV RBX ; rax:rdx = rax / rbx
ADD RDX,0x30 ; convert binary digit to ascii
TEST RAX,RAX ; if rax == 0 exit DIV_BY_10
JZ CHECK_BUFFER
MOV byte [R9],DL ; save remainder
SUB R9,1 ; decrement the buffer address
JMP DIV_BY_10
CHECK_BUFFER:
MOV byte [R9],DL
SUB R9,1
CMP R9,R10 ; if the buffer has data print it
JNE PRINT_BUFFER
MOV byte [R9],'0' ; place the default zero into the empty buffer
SUB R9,1
PRINT_BUFFER:
ADD R9,1 ; address of last digit saved to buffer
SUB R10,R9 ; end address minus start address
ADD R10,1 ; R10 = length of number
MOV RAX,1 ; NR_write
MOV RDI,1 ; stdout
MOV RSI,R9 ; number buffer address
MOV RDX,R10 ; string length
SYSCALL
RET
_start:
MOV RCX, SCORE ;Input into Score
MOV RDX, SCORELEN
MOV RAX, 3
MOV RBX, 0
SYSCALL
MOV RAX, [SCORE]
PUSH RAX ;Print Score
CALL PRINTDECI
POP RAX
MOV RAX,60 ;Kill the Code
MOV RDI,0
SYSCALL
SECTION .bss
SCORE: RESQ 1
SCORELEN EQU $-SCORE
Thanks for any help!
- Kyle
As a side note, the pointer in RCX goes to an insanely large number according to DDD... So I'm thinking I have to get it to pause and wait for me to type, but I have no idea how to do that...
The 'setup' to call syscall 0 (READ) on x86_64 system is:
#xenon:~$ syscalls_lookup read
read:
rax = 0 (0x0)
rdi = unsigned int fd
rsi = char *buf
rdx = size_t count
So your _start code should be something like:
_start:
mov rax, 0 ; READ
mov rdi, 0 ; stdin
mov rsi, SCORE ; buffer
mov rdx, SCORELEN ; length
syscall
The register conventions and syscall numbers for x86_64 are COMPLETELY different than those for i386.
Some conceptual issues you seem to have:
READ does not do ANY interpretation of what you type. You seem to be expecting it to let you type a number (say, 57) and have it return the value 57. Nope. It'll give you the bytes '5', '7' and the newline from ENTER, and leave the rest of the buffer untouched. Your SCORELEN is 8 (the size of resq 1), so you'll read AT MOST 8 bytes, or characters, if you wish to call them that. On a terminal in its default line-buffered mode, READ returns as soon as you press ENTER (or type the EOF char, ^D) with however many bytes were typed, up to that limit.
You have to convert the characters you receive into a value... You can do it the easy way and link with atoi() from the C library, or write your own parser to convert the characters into a value by addition and multiplication (it's not hard, see code below).
Used below, here as a reference:
#xenon:~$ syscalls_lookup write
write:
rax = 1 (0x1)
rdi = unsigned int fd
rsi = const char *buf
rdx = size_t count
Ugh.... So many... I'll just rewrite bits:
global _start
section .text
PRINTDECI:
; input is in RAX
lea r9, [NUMBER + NUMBERLEN - 1 ] ; + space for \n
mov r10, r9 ; save end position for later
mov byte [r9], 10 ; store \n (LF) at end; NASM needs the size, and '\n' in plain quotes is not an escape
dec r9
mov rbx, 10 ; base10 divisor
DIV_BY_10:
xor rdx, rdx ; zero rdx for div
div rbx ; rax = rdx:rax / rbx, rdx = remainder
or dl, 0x30 ; make REMAINDER a digit
mov [r9], dl
dec r9
or rax, rax
jnz DIV_BY_10
PRINT_BUFFER:
sub r10, r9 ; get length (r10 - r9)
inc r9 ; make r9 point to initial character
mov rax, 1 ; WRITE (1)
mov rdi, 1 ; stdout
mov rsi, r9 ; first character in buffer
mov rdx, r10 ; length
syscall
ret
MAKEVALUE:
; RAX points to buffer
mov r9, rax ; save pointer
xor rcx, rcx ; zero value storage
MAKELOOP:
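mov al, [r9] ; get a character
inc r9 ; advance to the next character
sub al, '0' ; convert ASCII digit to its value
cmp al, 9 ; anything outside '0'..'9' (zero byte, newline) ends the number
ja MAKEDONE ; not a digit? we're done!
movzx rax, al ; zero the rest of RAX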
add rcx, rcx ; value = value * 2
mov rdx, rcx ; save it
add rcx, rcx ; value = value * 4
add rcx, rcx ; value = value * 8
add rcx, rdx ; value = value * 8 + value * 2 (== value * 10)
add rcx, rax ; add new digit
jmp MAKELOOP ; do it again
MAKEDONE:
mov rax, rcx ; put value in RAX to return
ret
_start:
mov rax, 0 ; READ (0)
mov rdi, 0 ; stdin
mov rsi, SCORE ; buffer
mov rdx, SCORELEN - 1 ; length (leave room for the zero terminator)
syscall
; RAX contains HOW MANY CHARS we read,
; -OR- a negative errno value on error. You really
; should check for that, but that's for
; you to do later... right? (if RAX is negative,
; the store below will segfault, just so you know!)
add rax, SCORE ; get position just past the last byte read
mov byte [rax], 0 ; force a terminator at end
mov rax, SCORE ; point to beginning of buffer
call MAKEVALUE ; convert from ASCII to a value
; RAX now should have the VALUE of the string of characters
; we input above. (well, hopefully, right?)
mov [VALUE], rax ; store it, because we can!
; it's stored... pretend it's later... we need value of VALUE!
mov rax, [VALUE] ; get the VALUE
call PRINTDECI ; convert and display value
; all done!
mov rax, 60 ; EXIT (60/0x3C)
mov rdi, 0 ; exit code = 0
syscall
section .bss
SCORE: resb 11 ; 10 chars + zero terminator
SCORELEN equ $-SCORE
NUMBER: resb 21 ; up to 20 digits + LF terminator
NUMBERLEN equ $-NUMBER
VALUE: resq 1 ; storage for the parsed value
I'm going to say that this should work first time, it's off-the-cuff for me, haven't tested it, but it should be good. We read up to 10 chars, terminate it with a zero, convert to a value, then convert to ascii and write it out.
To be more proper, you should save registers to the stack in each subroutine, well, certain ones, and really, only if you're going to interface with libraries... doing things yourself lets you have all the freedom you want to play with the registers, you just have to remember what you put where!
Yes, someone is going to say "why didn't you just multiply by 10 instead of the weird adding?" ... uh... because it's easy on the registers and I don't have to set anything up in rdx:rax the way one-operand mul needs (imul rcx, rcx, 10 would avoid that too). Besides, it's just as readable and understandable, especially with the comments. Roll with it! This isn't a competition, it's learning!
Machine code is fun! Gotta juggle all the eggs in your head though... no help from the compiler here!
Technically, you should check return result (RAX) of the syscalls for READ and WRITE, handle errors appropriately, yadda yadda yadda.... learn to use your debugger (gdb or whatever).
Hope this helps.

Why can't I sys_write from a pointer to stack memory, using int 0x80? [duplicate]

This question already has an answer here:
What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?
(1 answer)
Closed 4 years ago.
; NASM
push 30 ; '0'
mov rax, 4 ; write
mov rbx, 1 ; stdout
mov rcx, rsp ; ptr to character on stack
mov rdx, 1 ; length of string = 1
int 80h
The code above does not print anything to stdout. It works when i give it a ptr to a character in section .data. What am i doing wrong?
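amd64 uses a different method for system calls than int 0x80. int 0x80 still works in a 64-bit process when the kernel has 32-bit emulation enabled, but it goes through the 32-bit ABI: only the low 32 bits of each register are used, so a 64-bit stack address in rcx gets truncated and write fails with -EFAULT instead of printing. An address in section .data of a non-PIE executable typically fits in 32 bits, which is why that case worked. Whereas on x86 one would do: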
mov eax, SYSCALL_NUMBER
mov ebx, param1
mov ecx, param2
mov edx, param3
int 0x80
on amd64 one would instead do this:
mov rax, SYSCALL_NUMBER_64 ; different from the x86 equivalent, usually
mov rdi, param1
mov rsi, param2
mov rdx, param3
syscall
For what you want to do, consider the following example:
bits 64
global _start
section .text
_start:
push 0x0a424242
mov rdx, 04h
lea rsi, [rsp]
call write
call exit
exit:
mov rax, 60 ; exit()
xor rdi, rdi ; exit status 0
syscall
write:
mov rax, 1 ; write()
mov rdi, 1 ; stdout
syscall
ret
30 decimal is the code of the ASCII "record separator". Whatever that is, it's probably not a printable character.
30 hexadecimal (30h or 0x30 in NASM parlance), on the other hand, is the code of the ASCII "0".
Also, you need to use the 64-bit ABI.

x64 bit assembly

I started assembly (nasm) programming not too long ago. Now I made a C function with assembly implementation which prints an integer. I got it working using the extended registers, but when I want to write it with the x64 registers (rax, rbx, ..) my implementation fails. Does any of you see what I missed?
main.c:
#include <stdio.h>
extern void printnum(int i);
int main(void)
{
printnum(8);
printnum(256);
return 0;
}
32 bit version:
; main.c: http://pastebin.com/f6wEvwTq
; nasm -f elf32 -o printnum.o printnum.asm
; gcc -o printnum printnum.o main.c -m32
section .data
_nl db 0x0A
nlLen equ $ - _nl
section .text
global printnum
printnum:
enter 0,0
mov eax, [ebp+8]
xor ebx, ebx
xor ecx, ecx
xor edx, edx
push ebx
mov ebx, 10
startLoop:
idiv ebx
add edx, 0x30
push dx ; With an odd number of digits this will screw up the stack, but that's ok
; because we'll reset the stack at the end of this function anyway.
; Needs fixing though.
inc ecx
xor edx, edx
cmp eax, 0
jne startLoop
push ecx
imul ecx, 2
mov edx, ecx
mov eax, 4 ; Prints the string (from stack) to screen
mov ebx, 1
mov ecx, esp
add ecx, 4
int 80h
mov eax, 4 ; Prints a new line
mov ebx, 1
mov ecx, _nl
mov edx, nlLen
int 80h
pop eax ; returns the ammount of used characters
leave
ret
x64 version:
; main.c : http://pastebin.com/f6wEvwTq
; nasm -f elf64 -o object/printnum.o printnum.asm
; gcc -o bin/printnum object/printnum.o main.c -m64
section .data
_nl db 0x0A
nlLen equ $ - _nl
section .text
global printnum
printnum:
enter 0, 0
mov rax, [rbp + 8] ; Get the function args from the stack
xor rbx, rbx
xor rcx, rcx
xor rdx, rdx
push rbx ; The 0 byte of the string
mov rbx, 10 ; Dividor
startLoop:
idiv rbx ; modulo is in rdx
add rdx, 0x30
push dx
inc rcx ; increase the loop variable
xor rdx, rdx ; resetting the modulo
cmp rax, 0
jne startLoop
push rcx ; push the counter on the stack
imul rcx, 2
mov rdx, rcx ; string length
mov rax, 4
mov rbx, 1
mov rcx, rsp ; the string
add rcx, 4
int 0x80
mov rax, 4
mov rbx, 1
mov rcx, _nl
mov rdx, nlLen
int 0x80
pop rax
leave
ret ; return to the C routine
Thanks in advance!
I think your problem is that you're trying to use the 32-bit calling conventions in 64-bit mode. That won't fly, not if you're calling these assembly routines from C. The 64-bit calling convention is documented here: http://www.x86-64.org/documentation/abi.pdf
Also, don't open-code system calls. Call the wrappers in the C library. That way errno gets set properly, you take advantage of sysenter/syscall, you don't have to deal with the differences between the normal calling convention and the system-call argument convention, and you're insulated from certain low-level ABI issues. (Another of your problems is that write is system call number 1, not 4, for Linux/x86-64.)
Editorial aside: There are two, and only two, reasons to write anything in assembly nowadays:
1. You are writing one of the very few remaining bits of deep magic that cannot be written in C alone (a good example is the guts of libffi).
2. You are hand-optimizing an inner-loop subroutine that has been measured to be performance-critical and that the C compiler doesn't do a good enough job on.
Otherwise just write whatever it is in C. Your successors will thank you.
EDIT: checked system call numbers.
I'm not sure if this answer is related to the problem you're seeing (since you didn't specify anything about what the failure is), but 64-bit code has a different calling convention than 32-bit code does. Both of the major 64-bit ABIs (Windows and the System V one used by Linux/BSD/Mac OS) pass the first several integer arguments in registers and not on the stack: rdi, rsi, rdx, rcx, r8, r9 on System V, and rcx, rdx, r8, r9 on Windows. Your program still expects its argument on the stack at [rbp + 8], which in 64-bit code holds the return address; the int i actually arrives in edi.
Edit: Now that I see there is a C main() routine that calls your functions, my answer is exactly about the problem you're having.
