I am trying to teach myself how to write Assembly in X86 on NASM. I'm attempting to write a program that takes a single integer value and prints it back to standard output before exiting.
My code:
section .data
prompt: db "Enter a number: ", 0
str_length: equ $ - prompt
section .bss
the_number: resw 1
section .text
global _start
_start:
mov eax, 4 ; pass sys_write
mov ebx, 1 ; pass stdout
mov edx, str_length ; pass number of bytes for prompt
mov ecx, prompt ; pass prompt string
int 80h
mov eax, 3 ; sys_read
mov ebx, 0 ; stdin
mov edx, 1 ; number of bytes
mov ecx, [the_number] ; pass input of the_number
int 80h
mov eax, 4
mov ebx, 1
mov edx, 1
mov ecx, [the_number]
int 80h
mov eax, 1 ; exit
mov ebx, 0 ; status 0
int 80h
From there I do the assembling nasm -felf -o input.o input.asm and linking ld -m elf_i386 -o input input.o.
I run a test and enter an integer; when I press Enter, the program exits and Bash tries to execute the number I typed as a command. I even echoed the exit status, and it was 0.
So this is an odd behavior.
The call to read fails and doesn’t read any input. When your program exits, that input is still waiting to be read on the TTY (which was this program's stdin), at which point bash reads it.
You should check the return status of your system calls. If EAX is a negative number when the system call returns, it is the error code. For example, in this case, EAX contains -14, which is EFAULT (“Bad address”).
The reason read fails is that you are passing an invalid pointer as the buffer address. mov ecx, [the_number] loads the contents of the_number, not its address. Use mov ecx, the_number instead. The sys_write call that follows has the same problem and needs the same fix.
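As a minimal sketch of that fix, assuming the same the_number buffer and the 32-bit int 80h ABI used in the question (the read_failed and do_exit labels and the exit status 1 are hypothetical, added only to show the error check):

```nasm
section .bss
the_number: resb 1          ; one byte is enough for a single character

section .text
global _start
_start:
    mov eax, 3              ; sys_read
    mov ebx, 0              ; stdin
    mov ecx, the_number     ; ADDRESS of the buffer, no brackets
    mov edx, 1              ; number of bytes
    int 80h
    test eax, eax           ; negative: -errno, zero: EOF
    jle read_failed

    mov edx, eax            ; write back exactly as many bytes as were read
    mov eax, 4              ; sys_write
    mov ebx, 1              ; stdout
    mov ecx, the_number     ; again the address, not the contents
    int 80h

    xor ebx, ebx            ; exit status 0
    jmp do_exit
read_failed:
    mov ebx, 1              ; exit status 1 (arbitrary choice for this sketch)
do_exit:
    mov eax, 1              ; sys_exit
    int 80h
```

Because only one byte is read, the newline typed after the digit is still left on the terminal for the shell to read once the program exits.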
I have the following 'hello world' code written in NASM x86_64 assembly:
section .data
msg db "Hello World", 0xa
msg_L equ $-msg
section .text
global _start
_start:
mov eax, 4 ; sys_write call
mov ebx, 1 ; stdout
mov ecx, msg
mov edx, msg_L
int 0x80 ; call kernel
mov eax, 1 ; sys_exit call
int 0x80 ; call kernel
In the first 'function' under the _start: section, mov ebx, 1 is used to specify the standard output for printing. Later, after the first kernel call, mov eax, 1 is used to specify the sys_exit system call. I don't understand how specifying the same system call number yields 2 different results when the kernel is called. This NASM tutorial specifies 1 as the system call number for sys_exit, yet the program does not exit after the first use of that number, and uses it for stdout instead. Can someone explain to me why this is?
You are not specifying the same system call number.
eax, not ebx, is used to specify system call numbers.
mov ebx, 1 sets the value of ebx and doesn't set the value of eax.
The system call number is set to 4 via mov eax, 4 when using the standard output set by mov ebx, 1.
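To make the register roles explicit, here is the same program annotated as a sketch under the 32-bit int 0x80 convention (EAX selects the call; EBX, ECX and EDX carry its first three arguments). The xor ebx, ebx before the exit is an addition not in the question: without it, EBX still holds 1 from the write, so the process exits with status 1.

```nasm
section .data
msg     db "Hello World", 0xa
msg_L   equ $-msg

section .text
global _start
_start:
    mov eax, 4          ; EAX = 4 selects sys_write
    mov ebx, 1          ; EBX = first argument: fd 1 (stdout)
    mov ecx, msg        ; ECX = second argument: buffer address
    mov edx, msg_L      ; EDX = third argument: byte count
    int 0x80            ; write(1, msg, msg_L)

    mov eax, 1          ; EAX = 1 selects sys_exit
    xor ebx, ebx        ; EBX = first argument: exit status 0
    int 0x80            ; _exit(0)
```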
I'm new at learning assembly x86. I have written a program that asks the user to enter a number and then checks if it's even or odd and then print a message to display this information.
The code works fine but it has one problem. It only works for 1 digit numbers:
; Ask the user to enter a number from the keyboard
; Check if this number is odd or even and display a message to say this
section .text
global _start ;must be declared for linker (gcc)
_start: ;tell linker entry point
;Display 'Please enter a number'
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg1 ; message to be print
mov edx, len1 ; message length
int 80h ; perform system call
;Enter the number from the keyboard
mov eax, 3 ; sys_read
mov ebx, 2 ; file descriptor: stdin
mov ecx, myvariable ; destination (memory address)
mov edx, 4 ; size of the memory location in bytes
int 80h ; perform system call
;Convert the variable to a number and check if even or odd
mov eax, [myvariable]
sub eax, '0' ;eax now has the number value
and eax, 01H
jz isEven
;Display 'The entered number is odd'
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg2 ; message to be print
mov edx, len2 ; message length
int 80h
jmp outProg
isEven:
;Display 'The entered number is even'
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg3 ; message to be print
mov edx, len3 ; message length
int 80h
outProg:
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg1 db "Please enter a number: ", 0xA,0xD
len1 equ $- msg1
msg2 db "The entered number is odd", 0xA,0xD
len2 equ $- msg2
msg3 db "The entered number is even", 0xA,0xD
len3 equ $- msg3
segment .bss
myvariable resb 4
It does not work properly for numbers with more than 1 digit because it only takes into account the first byte (the first digit) of the entered number, so that is all it checks. So I need a way to find out how many digits (bytes) there are in the value the user enters, so I could do something like this:
;Convert the variable to a number and check if even or odd
mov eax, [myvariable+(number_of_digits-1)]
And only check eax which contains the last digit to see if it's even or odd.
The problem is I have no idea how to check how many bytes are in my number after the user has entered it.
I'm sure it's something very easy yet I have not been able to figure it out, nor have I found any solutions on how to do this on google. Please help me with this. Thank you!
You actually want movzx eax, byte [myvariable+(number_of_digits-1)] to only load 1 byte, not a dword. Or just directly test memory with test byte [...], 1. You can skip the sub because '0' is an even number; subtracting to convert from ASCII code to integer digit doesn't change the low bit.
But yes, you need the least significant digit, which is the last one (highest address) in printing / reading order.
A read system call returns the number of bytes read in EAX. (Or negative error code). This will include a newline if the user hit return, but not if the user redirected from a file that didn't end with a newline. (Or if they submitted input on a terminal using control-d after typing some digits). The most simple and robust way would be to simply loop looking for the first non-digit in the buffer.
But the "clever" / fun way would be to check if [mybuf + eax - 1] is a digit, and if so use it. Otherwise check the previous byte. (Or just assume there's a newline and always check [mybuf + eax - 2], the 2nd-last byte of what was read. That reads off the start of the buffer if the user just pressed return.)
(To efficiently check for an ASCII digit: sub al, '0' / cmp al, 9 / ja non_digit. See double condition checking in assembly / What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa?)
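The simple loop described above could look roughly like this. It is a sketch, not the answer's own code: the .scan, .no_digit and .quit labels are made up for illustration, and reporting the result through the exit status (0 = even, 1 = odd, 2 = no leading digit) is an assumption chosen to keep the example short.

```nasm
section .bss
mybuf   resb 128
.len    equ $ - mybuf

section .text
global _start
_start:
    mov eax, 3              ; sys_read
    xor ebx, ebx            ; fd 0: stdin
    mov ecx, mybuf
    mov edx, mybuf.len
    int 80h                 ; EAX = byte count, 0 for EOF, or negative error
    test eax, eax
    jle .no_digit

    xor esi, esi            ; ESI = index into the buffer
.scan:
    cmp esi, eax            ; stop at the end of what read() returned
    jae .scan_done
    movzx edx, byte [mybuf + esi]
    sub edx, '0'
    cmp edx, 9              ; unsigned compare: anything outside '0'..'9' is above 9
    ja .scan_done
    inc esi
    jmp .scan
.scan_done:
    test esi, esi           ; no leading digits at all?
    jz .no_digit
    movzx ebx, byte [mybuf + esi - 1]   ; last digit of the run
    and ebx, 1              ; exit status: 1 = odd, 0 = even
    jmp .quit
.no_digit:
    mov ebx, 2              ; hypothetical status for "no digit found"
.quit:
    mov eax, 1              ; sys_exit
    int 80h
```

Unlike indexing from the end of the input, this never reads outside the buffer, and it does not care whether the input ends with a newline.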
Just for fun, here's a more compact version that always just checks the 2nd-last byte of the read() input. (It doesn't check for being a digit, and it reads outside the buffer for input lengths of 0 or 1, e.g. pressing control-D or return. The same happens on read errors, which you can provoke with strace ./oddeven <&- to close its stdin.)
Note the interesting part:
; check if the low digit is even or odd
mov ecx, msg_even
mov edx, msg_odd ; these don't set flags and actually could be done after TEST
test byte [mybuf + eax - 2], 1 ; check the low bit of 2nd-last byte of the read input
cmovnz ecx, edx
;Display selected message
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov edx, msg_odd.len
int 80h ; write(1, digit&1 ? msg_odd : msg_even, msg_odd.len)
I used cmov, but a simple branch over a mov ecx, msg_odd would work. You don't need to duplicate the whole setup for the system call, just run it with the right pointer and length. (ECX and EDX values, and I padded the odd message with a space so I could use the same length for both.)
And this is a homebrewed static_assert(msg_odd.len == msg_even.len), using NASM's conditional directives (https://nasm.us/doc/nasmdoc4.html). It's not just a separate preprocessor like C has, it can use NASM numeric equ expressions.
%if msg_odd.len != msg_even.len
; homebrew assert with NASM preprocessor, since I chose to skip doing a 2nd cmov for the length
%warning we assume both messages have the same length
%endif
The full thing: outside of the part shown above, I just tweaked comments, simplifying where I thought they were too redundant, and used meaningful label names.
Also, I put .rodata and .bss at the top because NASM complained about referencing msg_odd.len before it was defined. (You previously had your strings in .data, but read-only data should generally go in .rodata, so the OS can share those pages between runs of the same program because they stay clean.)
Other fixes:
Linux/Unix uses 0xa line endings, \n not \n\r.
stdin is fd 0. 2 is stderr. (2 happens to work because terminal emulators normally run the shell with all 3 file descriptors referring to the same read+write open file description for the tty).
; Ask the user to enter a number from the keyboard
; Check if this number is odd or even and display a message to say this
section .rodata
msg_prompt db "Please enter a number: ", 0xA
.len equ $- msg_prompt
msg_odd db "The entered number is odd ", 0xA ; padded with a space for same length as even
.len equ $- msg_odd
msg_even db "The entered number is even", 0xA
.len equ $- msg_even
section .bss
mybuf resb 128
.len equ $ - mybuf
section .text
global _start
_start: ; ld defaults to starting at the top of the .text section, but exporting a symbol silences the warning and can make GDB work more easily.
; Display prompt
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg_prompt
mov edx, msg_prompt.len
int 80h ; perform system call
mov eax, 3 ; sys_read
xor ebx, ebx ; file descriptor: stdin
mov ecx, mybuf
mov edx, mybuf.len
int 80h ; read(0, mybuf, len)
; return value in EAX: negative for error, 0 for EOF, or positive byte count
; for this toy program, let's assume valid input ending with digit\n
; the newline will be at [mybuf + eax - 1]. The digit before that, at [mybuf + eax - 2].
; If the user just presses return, we'll access before the start of mybuf, and may segfault if it's at the start of a page.
; check if the low digit is even or odd
mov ecx, msg_even
mov edx, msg_odd ; these don't set flags and actually could be done after TEST
test byte [mybuf + eax - 2], 1 ; check the low bit of 2nd-last byte of the read input
cmovnz ecx, edx
;Display selected message
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov edx, msg_odd.len
int 80h ; write(1, digit&1 ? msg_odd : msg_even, msg_odd.len)
%if msg_odd.len != msg_even.len
; homebrew assert with NASM preprocessor, since I chose to skip doing a 2nd cmov for the length
%warning we assume both messages have the same length
%endif
mov eax, 1 ;system call number (sys_exit)
xor ebx, ebx
int 0x80 ; _exit(0)
assemble + link with nasm -felf32 oddeven.asm && ld -melf_i386 -o oddeven oddeven.o
I want to drain terminal input before prompting the user, or otherwise to close the terminal input until I want to prompt the user for input.
I am writing a program that uses sys_read to prompt the user for input through the terminal. If a user types characters before being prompted, those characters get included in the input. I can easily drain every unread character after the prompt using sys_read again, but it relies on a return character being at the end of the stream of input to prevent prompting the user more than once (I am assuming the user ends every prompt with the return key). I can't rely on the presence of a return character before the prompt, so I can't drain the input the same way.
I have also tried closing stdin with sys_close, but I can't figure out how to open terminal input again, so the program is left frozen when a prompt comes up. Even if I could open the terminal input again, I'm not sure whether or not characters typed while it's closed still get saved for the next time it's read, which would render this approach completely useless.
If there is some sort of obscure termios flag that could disable user input, that would also be an excellent solution.
Here's a program, for example:
global _start
_start:
mov edx,ldot
mov ecx,mdot
mov ebx,1
mov eax,4 ; sys_write
int 0x80
mov dword[sleep_sec],1
mov dword[sleep_usec],0
mov ecx,0
mov ebx,sleep
mov eax,162 ; sys_nanosleep
int 0x80
mov edx,ldesire
mov ecx,mdesire
mov ebx,1
mov eax,4 ; sys_write
int 0x80
call input ; here's the call...
mov edx,lreceive
mov ecx,mreceive
mov ebx,1
mov eax,4 ; sys_write
int 0x80
mov dword[sleep_sec],1
mov dword[sleep_usec],0
mov ecx,0
mov ebx,sleep
mov eax,162 ; sys_nanosleep
int 0x80
mov edx,99
mov ecx,inputed
mov ebx,1
mov eax,4 ; sys_write
int 0x80
mov eax,1 ; sys_exit
int 0x80 ; the end!
input: ; here is all that I am concerned about
mov edx,99
mov ecx,inputed
mov ebx,0 ; stdin
mov eax,3 ; sys_read
int 0x80
cmp eax,99 ; check for valid input
jl inputdone
cmp byte[inputed+98],0xa
je inputdone
inputclear: ; drain
mov eax,3
int 0x80
cmp eax,99
jl inputerror
cmp byte[inputed+98],0xa
je inputerror
jmp inputclear
inputerror:
mov eax,-1
ret
inputdone:
mov eax,1
ret
section .bss
inputed resb 99
section .data
sleep:
sleep_sec dd 0
sleep_usec dd 0
mdot db '...',0xA
ldot equ $ - mdot
mdesire db 'tell me your truest desire:'
ldesire equ $ - mdesire
mreceive db 'you shall receive... '
lreceive equ $ - mreceive
A user who does things when they are told will find no trouble:
...
tell me your truest desire:money
you shall receive... money
But a user who likes to press buttons as they please will struggle:
...
a gtell me your truest desire:love
you shall receive... a glove
And the program won't wait for anyone who makes a mess with the return key:
...
who
tell me your truest desire:you shall receive... who
Recently, I wrote a bit of assembly code that asks for the password and if the user enters the correct password as stored internally, it prints out "Correct!". Else, it prints out "Incorrect!".
Here is the code:
section .text
global _start
_start:
mov edx, len_whatis
mov ecx, whatis
mov ebx, 1
mov eax, 4
int 80h ; outputs: "What is the password?"
mov edx, 5 ; expect 5 bytes of input(so 4 numbers)
mov ecx, pass
mov ebx, 0
mov eax, 3
int 80h ; accepts input and stores it in pass
mov eax, [pass] ; move the pass variable into eax
sub eax, '0' ; change the ascii number in eax to a numerical number
mov ebx, [thepass] ; move the thepass variable into ebx
sub ebx, '0' ; change the ascii number in ebx to a numerical number
cmp eax, ebx ; compare the 2 numbers
je correct ; if they are equal, jump to correct
jmp incorrect ; if not, jump to incorrect
correct:
mov edx, len_corr
mov ecx, corr
mov ebx, 1
mov eax, 4
int 80h ; outputs: "Correct!"
mov ebx, 0
mov eax, 1
int 80h ; exits with status 0
incorrect:
mov edx, len_incor
mov ecx, incor
mov ebx, 1
mov eax, 4
int 80h ; outputs: "Incorrect!"
mov eax, 1
int 80h ; exits with status: 1
section .data
whatis db "What is the password?", 0xA
len_whatis equ $ - whatis
thepass db "12345"
corr db "Correct!", 0xA
len_corr equ $ - corr
incor db "Incorrect!", 0xA
len_incor equ $ - incor
section .bss
pass resb 5
Assemble: nasm -f elf password.s
Link: ld -m elf_i386 -s -o password password.o
(If you did try to assemble link and run this, you may notice that it checks the password incorrectly - ignore this. It is "off topic")
Then, I ran a test:
I ran the code with ./password
When I was prompted for the password, I typed in 123456, one more byte than the code expects
After I hit enter and the code exits, the terminal immediately tries to run a command 6
What is causing this behavior? Is it something to do with the assembler, or how my computer is reading the code?
EDIT:
And, when I run the code with 12345, the terminal prompts for a command twice when the program closes, as if someone just hit the enter button without entering a command.
You're only reading five bytes from standard input, so when you type 123456↵, your application ends up reading 12345 and leaving 6↵ in the buffer. That gets passed on to the shell.
If you want to read the whole line, use a larger buffer.
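One way to do that, sketched under the assumption that input arrives a line at a time ending in a newline: read into a larger buffer, and if the read did not end with 0xA, keep reading into a throwaway buffer until it does. The names linebuf, scratch, .check and .done are hypothetical.

```nasm
section .bss
linebuf resb 128            ; larger than any input we expect
.len    equ $ - linebuf
scratch resb 64             ; throwaway space for draining the rest of a long line
.len    equ $ - scratch

section .text
global _start
_start:
    mov eax, 3              ; sys_read
    xor ebx, ebx            ; fd 0: stdin
    mov ecx, linebuf
    mov edx, linebuf.len
    int 80h
    test eax, eax
    jle .done               ; EOF or error: nothing left to drain
.check:
    cmp byte [ecx + eax - 1], 0xA   ; did the last read end with the newline?
    je .done
    mov eax, 3              ; no: read and discard more of the line
    mov ecx, scratch        ; EBX is still 0, the kernel preserves it
    mov edx, scratch.len
    int 80h
    test eax, eax
    jg .check
.done:
    ; linebuf still holds the first 128 bytes (at most) of the line here
    xor ebx, ebx
    mov eax, 1              ; sys_exit
    int 80h
```

Draining into a separate buffer keeps the start of the line intact, so it can still be compared against the stored password afterwards.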
I'm trying to learn the basics of asm on Linux and I can't find a very good reference. The NASM docs seem to assume you already know MASM... I found no examples of cmp in the documentation (outside the Intel instruction reference).
I'd written a program that reads a single byte from stdin and writes it to stdout. Below is my modification to try to detect EOF on stdin and exit when EOF is reached. The issue is that it never exits. It just keeps printing the last char read from stdin. I think the issue is in my EOF detection (cmp ecx, EOF) and/or my jump to the _exit label (je _exit).
What am I doing wrong?
%define EOF -1
section .bss
char: resb 1
section .text
global _start
_exit:
mov eax, 1 ; exit
mov ebx, 0 ; exit status
int 80h
_start:
mov eax, 3 ; sys_read
mov ebx, 0 ; stdin
mov ecx, char ; buffer
cmp ecx, EOF ; EOF?
je _exit
mov edx, 1 ; read byte count
int 80h
mov eax, 4 ; sys_write
mov ebx, 1 ; stdout
mov ecx, char ; buffer
mov edx, 1 ; write byte count
int 80h
jmp _start
For the sake of sanity, I verified EOF is -1 with this C:
#include <stdio.h>
int main() { printf("%d\n", EOF); }
You are comparing the address of the buffer to EOF (-1) instead of the character stored in the buffer.
Having said that, the read system call does not return the value of EOF when end of file is reached, but it returns zero and doesn't stick anything in the buffer (see man 2 read). To identify end of file, just check the value of eax after the call to read:
section .bss
buf: resb 1
section .text
global _start
_exit:
mov eax, 1 ; exit
mov ebx, 0 ; exit status
int 80h
_start:
mov eax, 3 ; sys_read
mov ebx, 0 ; stdin
mov ecx, buf ; buffer
mov edx, 1 ; read byte count
int 80h
cmp eax, 0
je _exit
mov eax, 4 ; sys_write
mov ebx, 1 ; stdout
mov ecx, buf ; buffer
mov edx, 1 ; write byte count
int 80h
jmp _start
If you did want to properly compare the character to some value, use:
cmp byte [buf], VALUE
Also, I renamed char to buf. char is a basic C data type and a bad choice for a variable name.
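Putting both checks together, a sketch of the echo loop that stops on end of file, on a read error, or when a particular character is seen. The 'q' stop character and the .next and .quit labels are assumptions for illustration only; they are not part of the question.

```nasm
section .bss
buf:    resb 1

section .text
global _start
_start:
.next:
    mov eax, 3              ; sys_read
    mov ebx, 0              ; stdin
    mov ecx, buf            ; buffer address
    mov edx, 1              ; read byte count
    int 80h
    cmp eax, 0
    jle .quit               ; 0 = EOF, negative = error
    cmp byte [buf], 'q'     ; compare the byte IN the buffer, not its address
    je .quit
    mov eax, 4              ; sys_write
    mov ebx, 1              ; stdout
    mov ecx, buf
    mov edx, 1              ; write byte count
    int 80h
    jmp .next
.quit:
    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; exit status 0
    int 80h
```

Using jle instead of je also covers the case where read returns a negative error code, which would otherwise make the loop spin forever.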