linux nasm print multiple characters - linux

I am trying to write a program that will allow me to print multiple characters (strings of characters or integers). The problem that I am having is that my code only prints one of the characters, and then newlines and stays in an infinite loop. Here is my code:
SECTION .data
len EQU 32
SECTION .bss
num resb len
output resb len
SECTION .text
GLOBAL _start
_start:
Read:
mov eax, 3
mov ebx, 1
mov ecx, num
mov edx, len
int 80h
Point:
mov ecx, num
Print:
mov al, [ecx]
inc ecx
mov [output], al
mov eax, 4
mov ebx, 1
mov ecx, output
mov edx, len
int 80h
cmp al, 0
jz Exit
Clear:
mov eax, 0
mov [output], eax
jmp Print
Exit:
mov eax, 1
mov ebx, 0
int 80h
Could someone point out what I am doing wrong?
Thanks,
Rileyh

The first time you enter the Print section, ecx points to the start of the string, and you use it to copy a single character to the start of the output buffer. A few instructions later, however, you overwrite ecx with the pointer to output and never restore it, so you never copy and print the rest of the string. In addition, mov eax, 4 overwrites al before cmp al, 0, so that comparison tests the return value of sys_write rather than the character you loaded.
Also, why call write() on a one-character string in a loop in order to print the entire string? Why not pass num directly instead of copying one character at a time to output and passing that?
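A minimal sketch of that second suggestion, assuming 32-bit Linux and the int 80h interface used in the question: read once, then hand num to sys_write together with the byte count that sys_read returned in eax. The file name echo.asm and the label exit are made up for illustration.

```nasm
; echo.asm - read one chunk of input and write it back with a single sys_write
; build (assumed): nasm -felf32 echo.asm && ld -melf_i386 -o echo echo.o
len     equ 32

section .bss
num     resb len

section .text
global _start
_start:
    mov eax, 3          ; sys_read
    xor ebx, ebx        ; fd 0 = stdin
    mov ecx, num
    mov edx, len
    int 80h             ; eax = bytes read, 0 at end of input, negative on error

    test eax, eax
    jle exit            ; nothing was read: nothing to print

    mov edx, eax        ; print exactly as many bytes as were read
    mov eax, 4          ; sys_write
    mov ebx, 1          ; fd 1 = stdout
    mov ecx, num        ; pass the input buffer directly
    int 80h
exit:
    mov eax, 1          ; sys_exit
    xor ebx, ebx
    int 80h
```

No per-character copy and no terminator test are needed here, because the length comes from sys_read itself.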

In your last question, you showed message as a zero-terminated string, so cmp al, 0 would indicate the end of the string. sys_read does NOT create a zero-terminated string! (We can stuff a zero in there if we need it, e.g. as a filename for sys_open.) sys_read will read a maximum of edx characters. When stdin is a terminal in its default line-buffered mode, sys_read returns when the "enter" key is hit (or at end of input, ctrl-D). If fewer than edx characters were entered, the string is terminated with a linefeed character (10 decimal or 0xA or 0Ah hex) - you could look for that... But, if the pesky user types more than edx characters, only edx characters go into your buffer, and the "excess" remains in the OS's buffer (and can cause trouble later!). In this case your string is NOT terminated with a linefeed, so looking for it will fail. sys_read returns the number of characters actually read - up to edx - including the linefeed - in eax. If you don't want to include the linefeed in the length, you can decrement eax.
As an experiment, do a sys_read with some small number (say 4) in edx, then exit the program. Type "abcdls"(enter) and watch the "ls" be executed. If some joker typed "abcdrm -rf ."... well, don't!!!
Safest thing is to flush the OS's input buffer.
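mov ecx, num
mov edx, len
mov ebx, 0 ; fd 0 = stdin (1 is stdout)
mov eax, 3 ; sys_read
int 80h
cmp byte [ecx + eax - 1], 10 ; got linefeed?
push eax ; save read length - doesn't alter flags
je good
flush:
mov ecx, dummy_buf
mov edx, 1
mov ebx, 0 ; fd 0 = stdin
mov eax, 3 ; sys_read, one byte at a time
int 80h
test eax, eax ; 0 = end of input, negative = error
jle good ; nothing left to flush
cmp byte [ecx], 10
jne flush
good:
pop eax ; restore length from first sys_read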
Instead of defining dummy_buf in .bss (or .data), we could put it on the stack - trying to keep it simple here. This is imperfect - we don't know if our string is linefeed-terminated or not, and we don't check for error (unlikely reading from stdin). You'll find you're writing much more code dealing with errors and "idiot user" input than "doing the work". Inevitable! (it's a low-level language - we've gotta tell the CPU Every Single Thing!)
sys_write doesn't know about zero-terminated strings, either! It'll print edx characters, regardless of how much garbage that might be. You want to figure out how many characters you actually want to print, and put that in edx (that's why I saved/restored the original length above).
You mention "integers" and use num as a variable name. Neither of these functions knows about "numbers" except as ASCII codes. You're reading and writing characters. Converting a single-digit number to and from a character is easy - add or subtract '0' (48 decimal or 30h). Multiple digits are more complicated - look around for an example, if that's what you need.
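For the multi-digit case, here is a hedged sketch of one common approach (not taken from the answer above): walk the bytes that sys_read returned, stop at the first non-digit, and accumulate value = value*10 + digit. To keep it short, the result is handed back as the process exit status instead of being printed. The file name atoi.asm and the labels are illustrative, and overflow is not checked.

```nasm
; atoi.asm - parse an unsigned decimal number from stdin, return it as the exit status
; build (assumed): nasm -felf32 atoi.asm && ld -melf_i386 -o atoi atoi.o
len     equ 32

section .bss
num     resb len

section .text
global _start
_start:
    mov eax, 3              ; sys_read
    xor ebx, ebx            ; fd 0 = stdin
    mov ecx, num
    mov edx, len
    int 80h                 ; eax = number of bytes read

    xor ebx, ebx            ; ebx = running value (and later the exit status)
    test eax, eax
    jle exit                ; end of input or error: value stays 0

    mov esi, num            ; esi = current character
    lea edi, [num + eax]    ; edi = one past the last byte read
next_digit:
    cmp esi, edi
    jae exit                ; consumed everything that was read
    movzx eax, byte [esi]
    sub eax, '0'            ; ASCII code -> digit value
    cmp eax, 9
    ja exit                 ; unsigned compare: linefeed or other non-digit ends the number
    lea ebx, [ebx + ebx*4]  ; ebx = ebx * 5
    lea ebx, [eax + ebx*2]  ; ebx = ebx * 10 + digit
    inc esi
    jmp next_digit
exit:
    mov eax, 1              ; sys_exit; the shell sees only the low 8 bits of ebx
    int 80h
```

Running echo 123 | ./atoi; echo $? prints 123. Values above 255 wrap, because an exit status is only one byte.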
Best,
Frank

Related

Finding the number of bytes of entered string at runtime

I'm new at learning assembly x86. I have written a program that asks the user to enter a number and then checks if it's even or odd and then print a message to display this information.
The code works fine but it has one problem. It only works for 1 digit numbers:
; Ask the user to enter a number from the keyboard
; Check if this number is odd or even and display a message to say this
section .text
global _start ;must be declared for linker (gcc)
_start: ;tell linker entry point
;Display 'Please enter a number'
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg1 ; message to be print
mov edx, len1 ; message length
int 80h ; perform system call
;Enter the number from the keyboard
mov eax, 3 ; sys_read
mov ebx, 2 ; file descriptor: stdin
mov ecx, myvariable ; destination (memory address)
mov edx, 4 ; size of the the memory location in bytes
int 80h ; perform system call
;Convert the variable to a number and check if even or odd
mov eax, [myvariable]
sub eax, '0' ;eax now has the number value
and eax, 01H
jz isEven
;Display 'The entered number is odd'
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg2 ; message to be print
mov edx, len2 ; message length
int 80h
jmp outProg
isEven:
;Display 'The entered number is even'
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg3 ; message to be print
mov edx, len3 ; message length
int 80h
outProg:
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg1 db "Please enter a number: ", 0xA,0xD
len1 equ $- msg1
msg2 db "The entered number is odd", 0xA,0xD
len2 equ $- msg2
msg3 db "The entered number is even", 0xA,0xD
len3 equ $- msg3
segment .bss
myvariable resb 4
It does not work properly for numbers with more than 1 digit because it only takes into account the first byte (first digit) of the entered number, so it only checks that. So I would need a way to find out how many digits (bytes) there are in the value that the user enters, so I could do something like this:
;Convert the variable to a number and check if even or odd
mov eax, [myvariable+(number_of_digits-1)]
And only check eax which contains the last digit to see if it's even or odd.
The problem is that I have no idea how I could check how many bytes are in my number after the user has entered it.
I'm sure it's something very easy yet I have not been able to figure it out, nor have I found any solutions on how to do this on google. Please help me with this. Thank you!
You actually want movzx eax, byte [myvariable+(number_of_digits-1)] to only load 1 byte, not a dword. Or just directly test memory with test byte [...], 1. You can skip the sub because '0' is an even number; subtracting to convert from ASCII code to integer digit doesn't change the low bit.
But yes, you need the least significant digit, which is the last one (highest address) in printing / reading order.
A read system call returns the number of bytes read in EAX. (Or negative error code). This will include a newline if the user hit return, but not if the user redirected from a file that didn't end with a newline. (Or if they submitted input on a terminal using control-d after typing some digits). The most simple and robust way would be to simply loop looking for the first non-digit in the buffer.
But the "clever" / fun way would be to check if [mybuf + eax - 1] is a digit, and if so use it. Otherwise check the previous byte. (Or just assume there's a newline and always check [mybuf + eax - 2], the 2nd-last byte of what was read, which is off the start of the buffer if the user just pressed return.)
(To efficiently check for an ASCII digit: sub al, '0' / cmp al, 9 / ja non_digit. See "double condition checking in assembly" and "What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa?")
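The "simple and robust" variant can be sketched as follows, assuming 32-bit Linux: scan forward from the start of the buffer until the first non-digit (or the end of what was read), then test the byte just before that position. The message texts, the msg_none case and the label names are illustrative choices, not part of the original program.

```nasm
; oddeven_scan.asm - find the last digit by scanning for the first non-digit
; build (assumed): nasm -felf32 oddeven_scan.asm && ld -melf_i386 -o oddeven_scan oddeven_scan.o
section .rodata
msg_odd:    db "The entered number is odd", 0xA
.len        equ $ - msg_odd
msg_even:   db "The entered number is even", 0xA
.len        equ $ - msg_even
msg_none:   db "No digits entered", 0xA
.len        equ $ - msg_none

section .bss
mybuf:      resb 128
.len        equ $ - mybuf

section .text
global _start
_start:
    mov eax, 3              ; sys_read
    xor ebx, ebx            ; fd 0 = stdin
    mov ecx, mybuf
    mov edx, mybuf.len
    int 80h                 ; eax = byte count, 0 for EOF, negative for error

    mov ecx, msg_none       ; default message if no digit is found
    mov edx, msg_none.len
    test eax, eax
    jle print

    mov esi, mybuf          ; esi = scan position
    lea edi, [mybuf + eax]  ; edi = one past the last byte read
scan:
    cmp esi, edi
    jae scanned
    mov al, [esi]
    sub al, '0'
    cmp al, 9
    ja scanned              ; unsigned compare rejects everything outside '0'..'9'
    inc esi
    jmp scan
scanned:
    cmp esi, mybuf
    je print                ; the very first byte was not a digit
    mov ecx, msg_even
    mov edx, msg_even.len
    test byte [esi - 1], 1  ; low bit of the last digit's ASCII code
    jz print
    mov ecx, msg_odd
    mov edx, msg_odd.len
print:
    mov eax, 4              ; sys_write
    mov ebx, 1              ; fd 1 = stdout
    int 80h
    mov eax, 1              ; sys_exit
    xor ebx, ebx
    int 80h
```

Because the scan is bounded by the byte count from read, this version never touches memory outside mybuf, whatever the user types.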
Just for fun, here's a more compact version that always just checks the 2nd-last byte of the read() input. (It doesn't check for being a digit, and it reads outside the buffer for input lengths of 0 or 1, e.g. pressing control-D or return. The same happens on read errors, where EAX holds a negative error code, e.g. when you run strace ./oddeven <&- to close its stdin.)
Note the interesting part:
; check if the low digit is even or odd
mov ecx, msg_even
mov edx, msg_odd ; these don't set flags and actually could be done after TEST
test byte [mybuf + eax - 2], 1 ; check the low bit of 2nd-last byte of the read input
cmovnz ecx, edx
;Display selected message
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov edx, msg_odd.len
int 80h ; write(1, digit&1 ? msg_odd : msg_even, msg_odd.len)
I used cmov, but a simple branch over a mov ecx, msg_odd would work. You don't need to duplicate the whole setup for the system call, just run it with the right pointer and length. (ECX and EDX values, and I padded the odd message with a space so I could use the same length for both.)
And this is a homebrewed static_assert(msg_odd.len == msg_even.len), using NASM's conditional directives (https://nasm.us/doc/nasmdoc4.html). It's not just a separate preprocessor like C has, it can use NASM numeric equ expressions.
%if msg_odd.len != msg_even.len
; homebrew assert with NASM preprocessor, since I chose to skip doing a 2nd cmov for the length
%warning we assume both messages have the same length
%endif
The full thing. Outside of the part shown above, I just tweaked comments to simplify them where I thought they were too redundant, and used meaningful label names.
Also, I put .rodata and .bss at the top because NASM complained about referencing msg_odd.len before it was defined. (You previously had your strings in .data, but read-only data should generally go in .rodata, so the OS can share those pages between runs of the same program because they stay clean.)
Other fixes:
Linux/Unix uses 0xa line endings, \n not \n\r.
stdin is fd 0. 2 is stderr. (2 happens to work because terminal emulators normally run the shell with all 3 file descriptors referring to the same read+write open file description for the tty).
; Ask the user to enter a number from the keyboard
; Check if this number is odd or even and display a message to say this
section .rodata
msg_prompt db "Please enter a number: ", 0xA
.len equ $- msg_prompt
msg_odd db "The entered number is odd ", 0xA ; padded with a space for same length as even
.len equ $- msg_odd
msg_even db "The entered number is even", 0xA
.len equ $- msg_even
section .bss
mybuf resb 128
.len equ $ - mybuf
section .text
global _start
_start: ; ld defaults to starting at the top of the .text section, but exporting a symbol silences the warning and can make GDB work more easily.
; Display prompt
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg_prompt
mov edx, msg_prompt.len
int 80h ; perform system call
mov eax, 3 ; sys_read
xor ebx, ebx ; file descriptor: stdin
mov ecx, mybuf
mov edx, mybuf.len
int 80h ; read(0, mybuf, len)
; return value in EAX: negative for error, 0 for EOF, or positive byte count
; for this toy program, let's assume valid input ending with digit\n
; the newline will be at [mybuf + eax - 1]. The digit before that, at [mybuf + eax - 2].
; If the user just presses return, we'll access before the start of mybuf, and may segfault if it's at the start of a page.
; check if the low digit is even or odd
mov ecx, msg_even
mov edx, msg_odd ; these don't set flags and actually could be done after TEST
test byte [mybuf + eax - 2], 1 ; check the low bit of 2nd-last byte of the read input
cmovnz ecx, edx
;Display selected message
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov edx, msg_odd.len
int 80h ; write(1, digit&1 ? msg_odd : msg_even, msg_odd.len)
%if msg_odd.len != msg_even.len
; homebrew assert with NASM preprocessor, since I chose to skip doing a 2nd cmov for the length
%warning we assume both messages have the same length
%endif
mov eax, 1 ;system call number (sys_exit)
xor ebx, ebx
int 0x80 ; _exit(0)
assemble + link with nasm -felf32 oddeven.asm && ld -melf_i386 -o oddeven oddeven.o

Why is this register value in x86 assembly from user input different than expected?

Whenever the user inputs s, the expected value in the rax register that the buffer is moved to would be 73, but instead it is a73. Why is this? I need these two values to be equal in order to perform the jumps I need for the user input menu.
On any user input, the information in the register is always preceded by an a, while the register that I use to check for the value is not. This makes it impossible to compare them for a jump.
Any suggestions?
section .data
prompt: db 'Enter a command: '
section .bss
buffer: resb 100; "reserve" 32 bytes
section .text ; code
global _start
_start:
mov rax, 4 ; write
mov rbx, 1 ; stdout
mov rcx, prompt ; where characters start
mov rdx, 0x10 ; 16 characters
int 0x80
mov rax, 3 ; read
mov rbx, 0 ; from stdin
mov rcx, buffer ; start of storage
mov rdx, 0x10; no more than 64 (?) chars
int 0x80
mov rax, [buffer]
mov rbx, "s"
cmp rax, rbx
je _s
; return to Linux
mov rax, 1
mov rbx, 0
int 0x80
_s:
add r8, [buffer]
; dump buffer that was read
mov rdx, rax ; # chars read
mov rax, 4 ; write
mov rbx, 1 ; to stdout
mov rcx, buffer; starting point
int 0x80
jmp _start
If the user types s, followed by <enter>, the memory starting at the address of buffer will contain bytes ['s', '\n', '\0', '\0', ...] (where the newline byte '\n' is from pressing <enter> and the null bytes '\0' are from the .bss section being initialized to 0). As integers, represented in hex, the corresponding values in memory are [0x73, 0x0A, 0x00, 0x00, ...].
The mov rax, [buffer] instruction will copy 8 bytes of memory starting at the address of buffer to the rax register. The byte ordering is little endian on x86, so the byte at the lowest address ('s', 0x73) becomes the least significant byte of rax, the next byte (0x0A) becomes the second-least significant, and so on, resulting in rax having 0x0000000000000A73.
Workarounds
This workaround is based on Peter Cordes's comment below. The idea is to compare 1) the first byte starting at the address of buffer with 2) the byte 's'. This would replace the three lines in your question that 1) move [buffer] to rax, 2) move 's' to rbx, and 3) cmp rax, rbx.
cmp byte [buffer], 's'
je _s
This would check that the first character entered is 's', even if followed by other characters. If your intent is to check that only a single character 's' is entered (optionally followed by '\n' in the case that <enter> was pressed to end the input, as opposed to <ctrl-d>), a more thorough approach could utilize the value returned by the read system call, which indicates how many bytes were read.
Without checking how many characters are read, you might want to clear the buffer on each iteration. As is, a user could enter 's' on one iteration, followed by <ctrl-d> on the next iteration, and the buffer would still start with an 's'.
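A sketch of the "more thorough approach", under one stated swap: it uses the native 64-bit syscall instruction (read = 0, write = 1, exit = 60) instead of the int 0x80 interface from the question. It accepts the input only if read returned exactly "s" or "s" plus a newline. The messages and labels (ok_msg, is_s, quit) are hypothetical.

```nasm
; menu.asm - accept exactly "s", optionally followed by a newline
; build (assumed): nasm -felf64 menu.asm && ld -o menu menu.o
section .rodata
prompt:     db "Enter a command: "
.len        equ $ - prompt
ok_msg:     db "got s", 0xA
.len        equ $ - ok_msg

section .bss
buffer:     resb 100
.len        equ $ - buffer

section .text
global _start
_start:
    mov eax, 1                  ; write
    mov edi, 1                  ; stdout
    lea rsi, [rel prompt]
    mov edx, prompt.len
    syscall

    xor eax, eax                ; read
    xor edi, edi                ; stdin
    lea rsi, [rel buffer]
    mov edx, buffer.len
    syscall                     ; rax = bytes read, 0 for EOF, negative for error

    test rax, rax
    jz quit                     ; EOF: nothing was typed
    cmp rax, 2
    ja quit                     ; unsigned: more than 2 bytes, or a negative error code
    cmp byte [rel buffer], 's'
    jne quit
    cmp rax, 1
    je is_s                     ; just "s", input ended with ctrl-d
    cmp byte [rel buffer + 1], 10
    jne quit                    ; second byte must be the newline
is_s:
    mov eax, 1                  ; write
    mov edi, 1
    lea rsi, [rel ok_msg]
    mov edx, ok_msg.len
    syscall
quit:
    mov eax, 60                 ; exit
    xor edi, edi
    syscall
```

Since the decision depends only on the bytes covered by the returned count, stale data left in buffer from an earlier iteration cannot produce a false match.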
Band-aid Workarounds
(I had originally proposed the following two ideas as workarounds, but they have their own problems, which Peter Cordes identifies in the comments below.)
To work around the issue, one option could be to add the newline to your target for comparison.
mov rax, [buffer]
mov rbx, `s\n` ; the second operand was formerly "s"
cmp rax, rbx
je _s
Alternatively, specifying that the read system call only consume 1 byte could be another approach to address the issue.
mov rax, 3 ; read
mov rbx, 0 ; from stdin
mov rcx, buffer ; start of storage
mov rdx, 0x01 ; the second operand was formerly 0x10
int 0x80

Converting User Input Hexadecimal to Decimal in Assembly

I am trying to create an assembly program that takes a user input hexadecimal number no greater than 4 digits and outputs the same number in base 10. This is being done using NASM on a Linux install. Using some tutorials I've found and my very limited understanding of this language, I have come up with this.
section .data
inp_buf: times 6 db 0
numPrompt: db "Enter a hexadecimal value no greater than 4 digits: "
len1 equ $ - numPrompt
answerText: db "The decimal value is: "
len2 equ $ - answerText
section .bss
dec: resb 5
section .text
global _start
_start:
mov edx, len1
mov ecx, numPrompt
mov ebx, 1
mov eax, 4
int 0x80
mov eax, 3
mov ebx, 0
mov ecx, inp_buf
mov edx, 6
int 0x80
mov esi, dec+11
mov byte[esi], 0xa
mov eax, ecx ;I feel like the problem must be here
mov ebx, 0xa
mov ecx, 1
next:
inc ecx
xor edx, edx
div ebx
add edx, 0x30
dec esi
mov [esi], dl
cmp eax, 0
jnz next
mov edx, len2
mov ecx, answerText
mov ebx, 1
mov eax, 4
int 0x80
mov edx, ecx
mov ecx, esi
mov ebx, 1
mov eax, 4
int 0x80
mov ebx, 0
mov eax, 1
int 0x80
It should be noted that if the user input is ignored and you just put in a variable with the hex in the data section, the program can convert that with no problem. For example, if you have the line hexNum: equ 0xFFFF and you replace the commented line above with mov eax, hexNum then the code is capable of converting that to base 10 correctly. The error must be with the format of the user input, but I don't know how to verify that.
I appreciate any insight on what my issue is here. This is my first assembly program other than "hello world" and a lot of these new concepts are still confusing and hard to understand for me. If there is a better or simpler way for me to go about this entire program then I would love to hear about that as well. Thanks!
The values in inp_buf are going to be ASCII character codes, not a number. For example,
1234
will be stored as
0x31 0x32 0x33 0x34
in 4 consecutive bytes (followed by the newline, 0x0A). Note also that mov eax, ecx copies the address of inp_buf into eax, not its contents.
The input must first be converted from ASCII hex digits into a binary integer, which is the exact opposite of the procedure you use for the reverse conversion.
Once the number is in binary form, your procedure for conversion to decimal (repeated division by 10, what might be called a decimal dabble) is correct.
I recommend completing the conversion to a binary integer first, and only then converting that integer digit by digit into ASCII. It is always better to have the final value in hand before formatting it, especially in assembly language programming, where debugging is difficult.
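The two stages can be sketched like this, assuming 32-bit Linux and the int 0x80 interface from the question. Stage one folds each ASCII hex digit into a binary value (value = value*16 + digit); stage two is the divide-by-10 loop. The names outbuf, parse, have_digit and to_dec are illustrative, the prompts are left out, and input validation is minimal (parsing simply stops at the first non-hex character).

```nasm
; hex2dec.asm - read up to 6 bytes of hex text, print the value in decimal
; build (assumed): nasm -felf32 hex2dec.asm && ld -melf_i386 -o hex2dec hex2dec.o
section .bss
inp_buf:    resb 6
outbuf:     resb 12             ; up to 10 decimal digits plus a newline

section .text
global _start
_start:
    mov eax, 3                  ; sys_read
    xor ebx, ebx                ; fd 0 = stdin
    mov ecx, inp_buf
    mov edx, 6
    int 0x80
    test eax, eax
    jle exit                    ; EOF or error

    ; stage 1: ASCII hex digits -> binary integer in eax
    mov esi, inp_buf
    lea edi, [inp_buf + eax]    ; one past the last byte read
    xor eax, eax
parse:
    cmp esi, edi
    jae parsed
    movzx edx, byte [esi]
    lea ecx, [edx - '0']
    cmp ecx, 9
    jbe have_digit              ; '0'..'9' -> 0..9
    or edx, 0x20                ; fold 'A'..'F' onto 'a'..'f'
    lea ecx, [edx - 'a']
    cmp ecx, 5
    ja parsed                   ; not a hex digit (e.g. the newline): stop
    add ecx, 10                 ; 'a'..'f' -> 10..15
have_digit:
    shl eax, 4                  ; value *= 16
    or eax, ecx                 ; value += digit
    inc esi
    jmp parse
parsed:
    ; stage 2: binary integer -> decimal ASCII, written backwards from the end of outbuf
    mov edi, outbuf + 11
    mov byte [edi], 0xA         ; trailing newline
    mov ebx, 10
to_dec:
    xor edx, edx
    div ebx                     ; eax = quotient, edx = remainder
    add dl, '0'
    dec edi
    mov [edi], dl
    test eax, eax
    jnz to_dec

    mov eax, 4                  ; sys_write
    mov ebx, 1                  ; fd 1 = stdout
    mov ecx, edi                ; first digit
    mov edx, outbuf + 12
    sub edx, edi                ; length = end of buffer - first digit
    int 0x80
exit:
    mov eax, 1                  ; sys_exit
    xor ebx, ebx
    int 0x80
```

With this sketch, echo ff | ./hex2dec prints 255 and echo FFFF | ./hex2dec prints 65535.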

Joining two strings together in NASM

I looked all over google for ways to do this, I found some but I really found them to be overly complex for what I need. For starters, I need this to be done through a loop, the place where I'm putting my strings is also initially empty, so I'm sure that is bound to create some issues.
Anyway this is my code:
%include "io.mac"
.DATA
filename_msg db 'Enter the file name: ', 0
number_prompt_msg db 'Enter the number of bases: ',0 ;asks for the number of bases to be used
finish_msg db 'Operation completed, DNA file generated',0 ;tells the user when the file is complete
error_msg db 'Operation failed, please try again', 10
base_A db 'A',0
base_C db 'C',0
base_G db 'G',0
base_T db 'T',0
base_length equ $ - base_A
;----------------------------------------------------------------------------------------------------
.UDATA
number_of_bases rest 1 ;defined by the user
random_number resb 1
filename: resd 20 ;defined by the user
base rest 1
file_descriptor resd 1 ;used to generate the file
characters_to_write rest 1
;-------------------------------------------------------------------
;start of code, and message prompts for the user
.CODE
.STARTUP
;asks user for filename
ask_details:
PutStr filename_msg
GetStr filename, 300
;asks user for the number of bases
PutStr number_prompt_msg
GetLInt [number_of_bases]
;------------------------------------------------------------
;file creation
mov EAX, 8 ;creates the file
mov EBX, filename
mov ECX, 644O ;octal instruction
int 80h ;kernel interrupt
cmp EAX,0 ;throws error if something is amiss
jbe error
mov [file_descriptor],EAX
mov ECX,[number_of_bases]
;-------------------------------------------------------------
;randomization of base numbers
writing_loop:
rdtsc
mov EAX, EDX
mov EDX, 0
div ECX
mov EDX, 0
mov EBX, 4
div EBX
mov [random_number], EDX
mov EDX, 0
mov EAX,[random_number]
cmp EAX,0
je assignment_A
cmp EAX,1
je assignment_C
cmp EAX,2
je assignment_G
cmp EAX,3
je assignment_T
join_char:
mov [base + EBX],EBX
loop writing_loop
PutStr base
.EXIT
;------------------------------------------------------------
;file generation error message
error:
PutStr error_msg
jmp ask_details
;------------------------------------------------------------
;assignments
assignment_A:
mov EBX, [base_A]
jmp join_char
assignment_C:
mov EBX, [base_C]
jmp join_char
assignment_T:
mov EBX, [base_T]
jmp join_char
assignment_G:
mov EBX, [base_G]
jmp join_char
First it compares some random numbers I obtained with rdtsc, depending on what comes up, it will assign a letter to EBX this letter(base_A,base_C,base_T or base_G) is then supposed to go into base. I tried using
mov [base + EBX],EBX but that just printed an empty space, I used this because it seemed to work in the examples I looked at, but I'm not really sure how concatenating works in NASM. I don't know if anyone knows any simple methods to concatenate those characters together, if it is possible. This is really minor, so I'm hoping I don't have to add a lot of code, the only thing I need this string for is to write it in a file later. I would do that without the string but I need all my registers to write to the file so I can't do it letter by letter.
EDIT: What I need to know how to do is how to join each letter once it has been picked. Base is empty for example, then after a letter is picked, it gets thrown in there, however after the loop runs again, another letter will be picked, and I need to add it to base after all that's done.
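One possible approach, offered as a hedged sketch rather than an answer from the original thread: keep a separate index register that counts how many bytes have been appended so far, store one byte (not a dword) at [base + index], and increment the index. In the code above, EBX holds the letter itself and is also used as the offset, so the two roles collide. The sketch below uses plain sys_write instead of the io.mac macros, and it replaces the rdtsc-based pick with index & 3 so that the output is predictable; count, letters and append are made-up names.

```nasm
; append.asm - build a string one byte at a time, then write it once
; build (assumed): nasm -felf32 append.asm && ld -melf_i386 -o append append.o
count   equ 12

section .rodata
letters: db "ACGT"

section .bss
base:   resb count + 1          ; room for the letters plus a newline

section .text
global _start
_start:
    xor edi, edi                ; edi = number of bytes appended so far
append:
    mov eax, edi
    and eax, 3                  ; stand-in for the random pick: 0..3
    mov al, [letters + eax]     ; al = the chosen letter
    mov [base + edi], al        ; store ONE byte at the current end of the string
    inc edi                     ; the end moves forward by one
    cmp edi, count
    jb append

    mov byte [base + edi], 0xA  ; finish with a newline
    inc edi

    mov eax, 4                  ; sys_write
    mov ebx, 1                  ; fd 1 = stdout (a file descriptor from sys_creat would go here)
    mov ecx, base
    mov edx, edi                ; length = number of bytes appended
    int 80h

    mov eax, 1                  ; sys_exit
    xor ebx, ebx
    int 80h
```

This prints ACGTACGTACGT. Because the whole string is built first, only one write call is needed at the end, which sidesteps the register pressure mentioned in the question.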

linux nasm code displays nothing

I am making a program where the user enters a number, and it prints out all the numbers from zero up to the number. It compiles fine, links fine, and returns no errors when it runs, and yet it prints out absolutely nothing. Here is the code:
SECTION .data
len EQU 32
SECTION .bss
other resd len
data resd len
SECTION .text
GLOBAL _start
_start:
nop
input: ; This section gets the integer from the user
mov eax, 3 ; }
mov ebx, 1 ; }
mov ecx, data ; } System_read call
mov edx, len ; }
int 80h ; }
mov ebp, 1
setup: ; This section sets up the registers ready for looping
mov [other], ebp
loop: ; This section loops, printing out from zero to the number given
mov eax, 4
mov ebx, 1
mov ecx, [other]
mov edx, len
int 80h
exit: ; Exits the program
mov eax, 1 ; }
mov ebx, 0 ; } System_exit call
int 80h ; }
When I step through it on KDBG, it returns a few errors; it receives an interrupt and a segmentation fault, although I can't tell where. I'm not sure why though, because when I run it in Geany, it returns a 0 value at the end and runs without error. Why does it not work?
Thanks in advance
NOTE: This code does not loop. It is not finished yet. All it should do here is print out the number 1.
When you go to print, you use mov ecx, [other]. This loads the value stored at other (here, 1) into ecx. The problem is that sys_write expects an address in ecx, not a value.
If you use mov ecx, other instead, then ecx will hold the address of other, and sys_write will be able to go to that address and print what's there.
You have another problem here: sys_write prints bytes as characters, so the number stored in other is interpreted as an ASCII code. For example, when you try to print a 1, instead of printing the digit 1, it will print ASCII code 1 (which happens to be the start-of-heading control character; nothing you want to print). Add '0' (the character '0') to a single-digit number if you want to print it as a digit.
EDIT: One more thing, when you read, you are passing 1 into ebx. 1 is STDOUT. What you want is STDIN which is 0.
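Putting those points together for the "print the number 1" case, a minimal sketch assuming 32-bit Linux (the read call is omitted, since this version does not use the input yet): convert the value to its ASCII digit, store it, and pass the address and the real length to sys_write.

```nasm
; one.asm - print the single-digit number held in ebp
; build (assumed): nasm -felf32 one.asm && ld -melf_i386 -o one one.o
section .bss
other:  resb 2                  ; one digit plus a newline

section .text
global _start
_start:
    mov ebp, 1                  ; the number to print (must be 0..9 for this sketch)

    mov eax, ebp
    add al, '0'                 ; 1 -> '1' (ASCII 0x31)
    mov [other], al
    mov byte [other + 1], 0xA   ; newline

    mov eax, 4                  ; sys_write
    mov ebx, 1                  ; fd 1 = stdout
    mov ecx, other              ; the ADDRESS of the buffer, not [other]
    mov edx, 2                  ; only the two bytes we stored, not len
    int 80h

    mov eax, 1                  ; sys_exit
    mov ebx, 0
    int 80h
```

Numbers with more than one digit need a divide-by-10 loop to produce each digit in turn.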
