Learning NASM Assembly in 32-bit Ubuntu. I am somewhat confused:
In .bss, I reserve a byte for a variable:
num resb 1
Later I decided to give it a value of 5:
mov byte [num],5
And at some point print it out:
mov EAX,4
mov EBX,0
mov ECX,num
add ECX,'0' ; From decimal to ASCII
mov EDX,1
int 0x80
But it isn't printing anything.
I'm guessing that the problem is when I give num its value of 5. I originally wanted to do this:
mov byte num,5
As I thought that num refers to a position in memory, and so mov would copy 5 to such position. But I got an error saying
invalid combination of opcode and operands
So basically, why is the program not printing 5? And also, why was my suggestion above invalid?
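To print with int 0x80 and call number 4 (sys_write), ECX must hold the address of the byte to print. You added '0' to the address of num in ECX, not to the value stored there, so by the time of the call ECX pointed at some unrelated location in memory. As for mov byte num,5: without brackets, num is just a number (the address), and an immediate cannot be the destination of mov; the brackets are what make it a memory operand.
You may want something like this. I created a separate area, numout, to hold the ASCII version of num: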
numout resb 1
....
mov EAX,4
mov EBX,1 ; stdout (file descriptor 0 is stdin)
mov CL,[num]
add CL,'0'
mov [numout],CL
mov ECX,numout
mov EDX,1
int 0x80
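For comparison, here is a rough C sketch of the same idea: convert the value first, store the converted byte somewhere, and hand write the address of that byte. The helper name print_digit is made up for illustration, and it assumes the value is a single digit (0-9).

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: write one decimal digit (0-9) to fd as ASCII.
 * Returns what write(2) returns: 1 on success, -1 on error. */
ssize_t print_digit(int fd, unsigned char num)
{
    assert(num <= 9);                                   /* only 0..9 map to '0'..'9' */
    unsigned char numout = (unsigned char)(num + '0');  /* convert the VALUE, like add CL,'0' */
    return write(fd, &numout, 1);                       /* pass the ADDRESS, like mov ECX,numout */
}
```

Calling print_digit(1, 5) would send the character 5 to standard output.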
Related
So I want to print the ASCII character for a value stored at an address, using a system call.
Let's say I want to print the letter Z, which is 0x5a in hex. In my preamble I have a pointer ptr that points to an element of an array, and I load the value at that address into cl:
mov edx, ptr
mov cl, [edx]
Now, with the value 0x5a in cl, how do I print it to STDOUT? I have tried (among other things):
mov eax, 4 ; SYS_WRITE
mov ebx, 1 ; STDOUT
;value to be printed is in cl so it is in ecx
mov edx, 4 ; size of the item (tried 4 and 1)
int 80h ;interrupt
This prints absolutely nothing, what am I doing wrong? Do I need to declare something in .data / .bss?
I want it to print Z
Thanks
EDIT:
So I can print Z by declaring in .data:
num db 90
and then using mov ecx, num in the system call. But how can I take the value in the register and print it the same way?
Hi, I need help displaying the contents of a register. My code is below. I have been able to display values of the data register, but I want to display flag states, e.g. 1 or 0. It would also be helpful to display the contents of other registers such as esi and ebp.
My code is not printing the states of the flags. What am I missing?
section .text
global _start ;must be declared for using gcc
_start : ;tell linker entry point
mov eax,msg ; moves message "rubi" to eax register
mov [reg],eax ; moves message from eax to reg variable
mov edx, 8 ;message length
mov ecx, [reg];message to write
mov ebx, 1 ;file descriptor (stdout)
mov eax, 4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax, 100
mov ebx, 100
cmp ebx,eax
pushf
pop dword eax
mov [save_flags],eax
mov edx, 8 ;message length
mov ecx,[save_flags] ;message to write
mov ebx, 1 ;file descriptor (stdout)
mov eax, 4 ;system call number (sys_write)
int 0x80
mov eax, 1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db "rubi",10
section .bss
reg resb 100
save_flags resw 100
I'm not going for anything fancy here, since this appears to be a homework assignment (two people have asked the same question today). This code should really be turned into a function, and its performance could be improved. Since I don't get an honorary degree or an A in the class, it doesn't make sense to me to offer the best solution, just one you can work from:
BITS_TO_DISPLAY equ 32 ; Number of least significant bits to display (1-32)
section .text
global _start ; must be declared for using gcc
_start : ; tell linker entry point
mov edx, msg_len ; message length
mov ecx, msg ; message to write
mov ebx, 1 ; file descriptor (stdout)
mov eax, 4 ; system call number (sys_write)
int 0x80 ; call kernel
mov eax, 100
mov ebx, 100
cmp ebx,eax
pushf
pop dword eax
; Convert binary to string by shifting the right most bit off EAX into
; the carry flag (CF) and convert the bit into a '0' or '1' and place
; in the save_flags buffer in reverse order. Nul terminate the string
; in the event you ever wish to use printf to print it
mov ecx, BITS_TO_DISPLAY ; Number of bits of EAX register to display
mov byte [save_flags+ecx], 0 ; Nul terminate binary string in case we use printf
bin2ascii:
xor bl, bl ; BL = 0
shr eax, 1 ; Shift right most bit into carry flag
adc bl, '0' ; bl = bl + '0' + Carry Flag
mov [save_flags-1+ecx], bl ; Place '0'/'1' into string buffer in reverse order
dec ecx
jnz bin2ascii ; Loop until all bits processed
mov edx, BITS_TO_DISPLAY ; message length
mov ecx, save_flags ; address of binary string to write
mov ebx, 1 ; file descriptor (stdout)
mov eax, 4 ; system call number (sys_write)
int 0x80
mov eax, 1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db "rubi",10
msg_len equ $ - msg
section .bss
save_flags resb BITS_TO_DISPLAY+1 ; Add one byte for nul terminator in case we use printf
The idea behind this code is that we repeatedly shift the bits in the EAX register to the right, one bit at a time, using the SHR instruction. The bit shifted out of the register lands in the carry flag (CF). ADC then adds the value of the carry flag (0 or 1) to ASCII '0', giving ASCII '0' or '1'. We place these bytes into the destination buffer in reverse order, since we are moving from right to left through the bits.
BITS_TO_DISPLAY can be set between 1 and 32 (since this is 32-bit code). If you are interested in the bottom 8 bits of a register set it to 8. If you want to display all the bits of a 32-bit register, specify 32.
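The same shift-and-store loop can be sketched in C, which may make the reverse-order fill easier to follow. The function name bits_to_string is hypothetical; the bits parameter plays the role of BITS_TO_DISPLAY.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the bin2ascii loop: writes the low `bits`
 * bits of `value` into `out` as '0'/'1' characters, most significant
 * first, and NUL-terminates. `out` must hold at least bits + 1 bytes. */
void bits_to_string(uint32_t value, unsigned bits, char *out)
{
    assert(bits >= 1 && bits <= 32);
    out[bits] = '\0';
    for (unsigned i = bits; i > 0; i--) {
        out[i - 1] = (char)('0' + (value & 1u)); /* lowest bit goes in the rightmost free slot */
        value >>= 1;                              /* like shr eax, 1 */
    }
}
```

When reading the result for a flags value, the zero flag is bit 6, so it is the seventh character from the right.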
Note that you can pop directly into memory.
And if you want to dump register and flag data in binary with write(2), your system call needs to pass a pointer to the buffer, not the data itself. Use a mov-immediate to get the address into the register, rather than doing a load. Or, in 64-bit code, use lea with a RIP-relative addressing mode. Or pass a pointer to where it's sitting on the stack, instead of copying it to a global!
mov edx, 8 ;message length
mov ecx,[save_flags] ;message to write ;;;;;;; <<<--- problem
mov ebx, 1 ;file descriptor (stdout)
mov eax, 4 ;system call number (sys_write)
int 0x80
Passing a bad address to write(2) won't cause your program to receive a SIGSEGV, as it would if you dereferenced that address in user space. Instead, the system call fails with EFAULT (the raw int 0x80 call returns -EFAULT in eax). And you're not checking the return status of your system calls, so your code doesn't notice.
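A small C sketch can show the failure mode. It treats a data value as if it were a buffer address, which is the same mistake as loading ecx from [save_flags]. The function name is made up, and the sketch writes into a pipe rather than stdout so the outcome does not depend on where stdout happens to be redirected. It assumes the value passed in is not a mapped address, which holds for small values on a typical Linux setup.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical demo: pass a data value to write(2) as the buffer pointer.
 * Returns the errno that write reports, or 0 if the write succeeded. */
int write_value_as_pointer(uintptr_t value)
{
    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);                 /* sketch only: no real error handling */

    errno = 0;
    ssize_t n = write(fds[1], (const void *)value, 8);
    int err = (n == -1) ? errno : 0; /* no signal is raised; the call just fails */

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

With a value such as 0x246 the call is expected to fail with EFAULT, and nothing is printed, which matches the silent failure in the question.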
mov eax,msg ; moves message "rubi" to eax register
mov [reg],eax ; moves message from eax to reg variable
mov ecx, [reg];
This is silly. You should just mov ecx, msg to get the address of msg into ecx, rather than bouncing it through memory.
Are you building for 64-bit? I see you're using 8 bytes for a message length. If so, you should be using the 64-bit system-call ABI (with syscall, not int 0x80). The system-call numbers are different. See the table in one of the links at x86. The 32-bit ABI can only accept 32-bit pointers. You will have a problem if you try to pass a pointer that has any of the high 32 bits set.
You're probably also going to want to format the number into a string, unless you want to pipe your program's output into hexdump.
This is the code I have and it works fine:
section .bss
bufflen equ 1024
buff: resb bufflen
whatread: resb 4
section .data
section .text
global main
main:
nop
read:
mov eax,3 ; Specify sys_read
mov ebx,0 ; Specify standard input
mov ecx,buff ; Where to read to...
mov edx,bufflen ; How long to read
int 80h ; Tell linux to do its magic
; Eax currently has the return value from linux system call..
add eax, 30h ; Convert number to ASCII digit
mov [whatread],eax ; Store how many bytes has been read to memory at loc **whatread**
mov eax,4 ; Specify sys_write
mov ebx,1 ; Specify standard output
mov ecx,whatread ; Get the address of whatread to ecx
mov edx,4 ; number of bytes to be written
int 80h ; Tell linux to do its work
mov eax, 1;
mov ebx, 0;
int 80h
Here is a simple run and output:
koray#koray-VirtualBox:~/asm/buffasm$ nasm -f elf -g -F dwarf buff.asm
koray#koray-VirtualBox:~/asm/buffasm$ gcc -o buff buff.o
koray#koray-VirtualBox:~/asm/buffasm$ ./buff
p
2koray#koray-VirtualBox:~/asm/buffasm$ ./buff
ppp
4koray#koray-VirtualBox:~/asm/buffasm$
My question is: What is with these 2 instructions:
mov [whatread],eax ; Store how many byte reads info to memory at loc whatread
mov ecx,whatread ; Get the address of whatread in ecx
Why does the first one work with [] but the other one without?
When I try replacing the second line above with:
mov ecx,[whatread] ; Get the address of whatread in ecx
the executable does not run properly: it shows nothing in the console.
Using brackets and not using brackets are two different things:
Brackets mean the value stored in memory at the given address.
An expression without brackets means the address (or constant) itself.
Examples:
mov ecx, 1234
Means: Write the value 1234 to the register ecx
mov ecx, [1234]
Means: Write the value that is stored in memory at address 1234 to the register ecx
mov [1234], ecx
Means: Write the value stored in ecx to the memory at address 1234
mov 1234, ecx
... makes no sense (in this syntax) because 1234 is a constant, and a constant cannot be a destination.
The Linux "write" syscall (INT 80h, EAX=4) requires the address of the data to be written, not the data itself!
This is why you do not use brackets when loading ECX for it!
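In C terms, the two instructions from the question correspond to assigning through a variable and taking its address. The sketch below is only an analogy: emit_count is a made-up name, and the check on the first output byte assumes a little-endian machine such as x86.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

static uint32_t whatread;   /* plays the role of `whatread: resb 4` */

/* Hypothetical helper: store count + '0' and write those 4 bytes to fd. */
ssize_t emit_count(int fd, uint32_t count)
{
    assert(count <= 9);                 /* only single-digit counts print as one ASCII digit */
    whatread = count + 0x30;            /* mov [whatread],eax : store a VALUE at the address */
    return write(fd, &whatread, 4);     /* mov ecx,whatread   : pass the ADDRESS itself      */
    /* write(fd, (void *)whatread, 4) would be mov ecx,[whatread]:
     * the stored value misused as an address. */
}
```

On a little-endian machine the four bytes written are the digit followed by three zero bytes, which is why the sample run shows just one visible character.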
I'm trying to learn assembly with NASM on 64 bit Linux.
I managed to make a program that reads two numbers and adds them. The first thing I realized was that the program will only work with one-digit numbers (and results):
; Calculator
SECTION .data
msg1 db "Enter the first number: "
msg1len equ $-msg1
msg2 db "Enter the second number: "
msg2len equ $-msg2
msg3 db "The result is: "
msg3len equ $-msg3
SECTION .bss
num1 resb 1
num2 resb 1
result resb 1
SECTION .text
global main
main:
; Ask for the first number
mov EAX,4
mov EBX,1
mov ECX,msg1
mov EDX,msg1len
int 0x80
; Read the first number
mov EAX,3
mov EBX,1
mov ECX,num1
mov EDX,2
int 0x80
; Ask for the second number
mov EAX,4
mov EBX,1
mov ECX,msg2
mov EDX,msg2len
int 0x80
; Read the second number
mov EAX,3
mov EBX,1
mov ECX,num2
mov EDX,2
int 0x80
; Prepare to announce the result
mov EAX,4
mov EBX,1
mov ECX,msg3
mov EDX,msg3len
int 0x80
; Do the sum
; Store read values to EAX and EBX
mov EAX,[num1]
mov EBX,[num2]
; From ASCII to decimal
sub EAX,'0'
sub EBX,'0'
; Add
add EAX,EBX
; Convert back to ASCII
add EAX,'0'
; Save the result back to the variable
mov [result],EAX
; Print result
mov EAX,4
mov EBX,1
mov ECX,result
mov EDX,1
int 0x80
As you can see, I reserve one byte for the first number, another for the second, and one more for the result. This isn't very flexible. I would like to make additions with numbers of any size.
How should I approach this?
First of all you are generating a 32-bit program, not a 64-bit program. This is no problem as Linux 64-bit can run 32-bit programs if they are either statically linked (this is the case for you) or the 32-bit shared libraries are installed.
Your program contains a real bug: you are loading the 4-byte EAX register from, and storing it to, a 1-byte field in RAM:
mov EAX, [num1]
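The load normally works on little-endian machines such as x86, because the byte you want ends up in AL (the upper three bytes of EAX hold whatever follows num1 in memory). However, if the byte happens to sit at the very end of the last mapped page of your program, the load reaches into unmapped memory and the program crashes (on Linux, typically with a segmentation fault).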
Even more critical is the write command:
mov [result], EAX
This command will overwrite 3 bytes of memory following the "result" variable. If you extend your program by additional bytes:
num1 resb 1
num2 resb 1
result resb 1
newVariable1 resb 1
You'll overwrite these variables! To correct your program you must use the AL (and BL) register instead of the complete EAX register:
mov AL, [num1]
mov BL, [num2]
...
mov [result], AL
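The overwrite can be reproduced in a few lines of C. This is only a sketch: the array stands in for the one-byte variables laid out back to back, the extra NEWVAR2 and NEWVAR3 slots are added for illustration, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout mirrors the .bss example: one byte per variable, back to back. */
enum { NUM1, NUM2, RESULT, NEWVAR1, NEWVAR2, NEWVAR3, NVARS };

/* Like mov [result],EAX: copies all 4 bytes of eax starting at RESULT. */
void store_dword_at_result(unsigned char vars[NVARS], uint32_t eax)
{
    assert(RESULT + sizeof eax <= NVARS);   /* stay inside the array */
    memcpy(&vars[RESULT], &eax, sizeof eax);
}

/* Like mov [result],AL: stores only the low byte. */
void store_byte_at_result(unsigned char vars[NVARS], uint32_t eax)
{
    vars[RESULT] = (unsigned char)(eax & 0xFF);
}
```

After store_dword_at_result the bytes following RESULT no longer hold their old values, while store_byte_at_result leaves them alone.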
Another problem in your program: you are reading from file descriptor 1, which is standard output. Your program should read from file descriptor 0 (standard input):
mov EAX, 3 ; read
mov EBX, 0 ; standard input
...
int 0x80
But now the answer to the actual question:
The C library functions (e.g. fgets()) use buffered input. Doing it that way would be a bit too complicated for a beginner, so reading one byte at a time is a reasonable option.
It helps to think: "How would I solve this problem in a high-level language like C?" If you don't use libraries in your assembly program, you can only use system calls (the section 2 man pages) as functions (e.g. you cannot use fgets(), only read()).
In your case, a C program reading a number from standard input could look like this:
int num1;
char c;
...
num1 = 0;
while(1)
{
    if(read(0,&c,1)!=1) break;
    if(c=='\r' || c=='\n') break;
    num1 = 10*num1 + c - '0';
}
Now you can think about the assembly code (I typically use the GNU assembler, which has a different syntax, so this code may contain some bugs):
c resb 1
num1 resb 4
...
; Set "num1" to 0
mov EAX, 0
mov [num1], EAX
; Here our while-loop starts
next_digit:
; Read one character
mov EAX, 3
mov EBX, 0
mov ECX, c
mov EDX, 1
int 0x80
; Check for the end-of-input
cmp EAX, 1
jnz end_of_loop
; This will cause EBX to be 0.
; When modifying the BL register the
; low 8 bits of EBX are modified.
; The high 24 bits remain 0.
; So clearing the EBX register before
; reading an 8-bit number into BL is
; a method for converting an 8-bit
; number to a 32-bit number!
xor EBX, EBX
; Load the character read into BL
; Check for "\r" or "\n" as input
mov BL, [c]
cmp BL, 10
jz end_of_loop
cmp BL, 13
jz end_of_loop
; read "num1" into EAX
mov EAX, [num1]
; Multiply "num1" with 10
mov ECX, 10
mul ECX
; Add one digit
sub EBX, '0'
add EAX, EBX
; write "num1" back
mov [num1], EAX
; Do the while loop again
jmp next_digit
; The end of the loop...
end_of_loop:
; Done
Writing decimal numbers with more digits is more difficult!
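One common way to do it, sketched in C in the same spirit as the reading loop, is to divide by 10 repeatedly and collect the remainders. The digits come out last-first, so they are reversed at the end. The name u32_to_decimal is hypothetical, and this is one possible approach rather than the only one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: format `value` as decimal ASCII into `out`
 * (no NUL terminator) and return the number of characters written.
 * `out` must have room for 10 bytes, the most a 32-bit value needs. */
size_t u32_to_decimal(uint32_t value, char *out)
{
    char tmp[10];
    size_t n = 0;

    /* Peel off digits from the right: value % 10 is the last digit. */
    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    assert(n <= sizeof tmp);

    /* Digits came out least significant first; reverse them into `out`. */
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    return n;
}
```

The returned length is what would go into EDX for the write call. In assembly, a single div by 10 yields both pieces at once: the quotient in EAX and the remainder in EDX.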
I am trying to write a program that will allow me to print multiple characters (strings of characters or integers). The problem that I am having is that my code only prints one of the characters, and then newlines and stays in an infinite loop. Here is my code:
SECTION .data
len EQU 32
SECTION .bss
num resb len
output resb len
SECTION .text
GLOBAL _start
_start:
Read:
mov eax, 3
mov ebx, 1
mov ecx, num
mov edx, len
int 80h
Point:
mov ecx, num
Print:
mov al, [ecx]
inc ecx
mov [output], al
mov eax, 4
mov ebx, 1
mov ecx, output
mov edx, len
int 80h
cmp al, 0
jz Exit
Clear:
mov eax, 0
mov [output], eax
jmp Print
Exit:
mov eax, 1
mov ebx, 0
int 80h
Could someone point out what I am doing wrong?
Thanks,
Rileyh
The first time you enter the Print section, ecx points to the start of the string, and you use it to copy a single character to the start of the output string. But a few instructions later you overwrite ecx with the pointer to the output string and never restore it, so you never manage to copy and print the rest of the string. Note also that after int 80h, eax holds the return value of the write call, so cmp al, 0 tests that return value, not the character you loaded.
Also, why are you calling write() with a single character string with the aim to loop over it to print the entire string? Why not just pass num directly in instead of copying a single character to output and passing that?
In your last question, you showed message as a zero-terminated string, so cmp al, 0 would indicate the end of the string. sys_read does NOT create a zero-terminated string! (We can stuff a zero in there if we need it, e.g. for a filename passed to sys_open.) sys_read reads a maximum of edx characters. When stdin is a terminal, sys_read normally returns only once the "enter" key is hit. If fewer than edx characters were entered, the string ends with a linefeed character (10 decimal, 0xA or 0Ah hex), and you could look for that. But if the pesky user types more than edx characters, only edx characters go into your buffer, and the "excess" remains in the OS's buffer (and can cause trouble later!). In that case your string is NOT terminated with a linefeed, so looking for one will fail. sys_read returns in eax the number of characters actually read, up to edx and including the linefeed. If you don't want to include the linefeed in the length, you can decrement eax.
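The length handling described here can be written down as a small C sketch. The helper name line_length and its truncated flag are made up for illustration; nread stands for the value that the read call returned.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: given the count returned by a read into buf,
 * return how many bytes precede a trailing linefeed, if there is one.
 * Sets *truncated when the input did not end in '\n', meaning more of
 * the line may still be waiting in the kernel's buffer. */
size_t line_length(const char *buf, ssize_t nread, int *truncated)
{
    assert(truncated != NULL);
    if (nread <= 0) {             /* error (-1) or end of input (0) */
        *truncated = 0;
        return 0;
    }
    if (buf[nread - 1] == '\n') {
        *truncated = 0;
        return (size_t)nread - 1; /* drop the linefeed, like dec eax */
    }
    *truncated = 1;
    return (size_t)nread;
}
```

The returned value is the count to pass on to a later write, instead of the full buffer size.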
As an experiment, do a sys_read with some small number (say 4) in edx, then exit the program. Type "abcdls"(enter) and watch the "ls" be executed. If some joker typed "abcdrm -rf ."... well, don't!!!
Safest thing is to flush the OS's input buffer.
mov ecx, num
mov edx, len
mov ebx, 0 ; stdin
mov eax, 3
int 80h
cmp byte [ecx + eax - 1], 10 ; got linefeed?
push eax ; save read length - doesn't alter flags
je good
flush:
mov ecx, dummy_buf
mov edx, 1
mov ebx, 0 ; stdin
mov eax, 3
int 80h
cmp byte [ecx], 10
jne flush
good:
pop eax ; restore length from first sys_read
Instead of defining dummy_buf in .bss (or .data), we could put it on the stack - trying to keep it simple here. This is imperfect - we don't know if our string is linefeed-terminated or not, and we don't check for error (unlikely reading from stdin). You'll find you're writing much more code dealing with errors and "idiot user" input than "doing the work". Inevitable! (it's a low-level language - we've gotta tell the CPU Every Single Thing!)
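For reference, the read-then-flush pattern looks roughly like this in C. The function name read_line_and_flush is hypothetical. Unlike the assembly loop, this sketch also stops draining on end of input or a read error. It assumes input arrives one line per read, as it does from a terminal in its default mode.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: read at most `cap` bytes of one input line from fd
 * into buf, then discard the rest of that line. Returns the count from
 * the first read (may include the linefeed), 0 at end of input, -1 on error. */
ssize_t read_line_and_flush(int fd, char *buf, size_t cap)
{
    assert(cap > 0);
    ssize_t n = read(fd, buf, cap);
    if (n <= 0 || buf[n - 1] == '\n')
        return n;                      /* error, EOF, or whole line consumed */

    char dummy;
    /* Drain one byte at a time until the linefeed; also stop on EOF or
     * error so the loop cannot spin forever. */
    while (read(fd, &dummy, 1) == 1 && dummy != '\n')
        ;
    return n;
}
```

With a 4-byte buffer and the input abcdls followed by Enter, the buffer receives abcd and the leftover ls plus linefeed is thrown away instead of reaching the shell.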
sys_write doesn't know about zero-terminated strings, either! It'll print edx characters, regardless of how much garbage that might be. You want to figure out how many characters you actually want to print, and put that in edx (that's why I saved/restored the original length above).
You mention "integers" and use num as a variable name. Neither of these functions knows about "numbers" except as ASCII codes. You're reading and writing characters. Converting a single-digit number to and from a character is easy - add or subtract '0' (48 decimal or 30h). Multiple digits are more complicated - look around for an example, if that's what you need.
Best,
Frank