Strange Characters when printing a string in x86 assembly language? - string

My code is required to have a string that will be printed to the console, alongside a string length counting program that will count it instead of manually putting length of string in edx register. But i am getting strange characters printed right after the string is printed.
global _start
section .text
_start:
mov edi, message
call _strlen
mov edx, eax
mov eax, 4
mov ebx, 1
mov ecx, message
int 80h
mov eax, 1
mov ebx, 5
int 80h
section .data
message: db "My name is Stanley Hudson", 0Ah
_strlen:
push ebx
push ecx
mov ebx, edi
xor al, al
mov ecx, 0xffffffff
repne scasb ; REPeat while Not Equal [edi] != al
sub edi, ebx ; length = offset of (edi – ebx)
mov eax, edi
pop ebx
pop ecx
ret
Here is the output

strlen searches for a 0 byte terminating the string, but your string doesn't have one, so it goes until it does find a zero byte and returns a value that's too large.
You want to write
message: db "My name is Stanley Hudson", 0Ah, 0
; ^^^
Another bug is that your _strlen function is apparently in the .data section, because you didn't go back to section .text after your string. x86-32 doesn't have the NX bit so the .data section is executable and everything still works, but it's surely not what you intend.

To get rid of the special characters write the strlen function before the start process and create a new register for the newline character

Related

To display characters in reverse order using nasm [infinite loop running]

THE PROGRAM IS USED TO ACCEPT CHARACTERS AND DISPLAY THEM IN REVERSE ORDER
The code is included here:
section .bss
num resb 1
section .text
global _start
_start:
call inputkey
call outputkey
;Output the number entered
mov eax, 1
mov ebx, 0
int 80h
inputkey:
;Read and store the user input
mov eax, 3
mov ebx, 2
mov ecx, num
mov edx, 1
int 80h
cmp ecx, 1Ch
je .sub2
push ecx
jmp inputkey
.sub2:
push ecx
ret
outputkey:
pop ecx
;Output the message
mov eax, 4
mov ebx, 1
;mov ecx, num
mov edx, 1
int 80h
cmp ecx, 1Ch
je .sub1
jmp outputkey
.sub1:
ret
The code to compile and run the program
logic.asm
is given here:
nasm -f elf logic.asm
ld -m elf_i386 -s -o logic logic.o
./logic
There are a few problems with the code. Firstly, for the sys_read syscall (eax = 3) you supplied 2 as the file descriptor, however 2 refers to stderr, but in this case you'd want stdin, which is 0 (I like to remember it as the non-zero numbers 1 and 2 being the output).
Next, an important thing to realize about the ret instruction is that it pops the value off the top of the stack and returns to it (treating it as an address). Meaning that even if you got to the .sub2 label, you'd likely get a segfault. With this in mind, the stack also tends to not be permanent storage, as in it is not preserved throughout procedures, so I'd recommend just making your buffer larger to e.g. 256 bytes and increment a value to point to an index in the buffer. (Using a fixed-size buffer will keep you from getting into the complications of memory allocation early, though if you want to go down that route you could do an external malloc call or just an mmap syscall.)
To demonstrate what I mean by an index into the reserved buffer:
section .bss
buf resb 256
; ...
inputkey:
xor esi, esi ; clear esi register, we'll use it as the index
mov eax, 3
mov ebx, 0 ; stdin file descriptor
mov edx, 1 ; read one byte
.l1: ; loop can start here instead of earlier, since the values eax, ebx and edx remain unchanged
lea ecx, [buf+esi] ; load the address of buf + esi
int 80h
cmp [buf+esi], 0x0a ; check for a \n character, meaning the user hit enter
je .e1
inc esi
jmp .l1
.e1:
ret
In this case, we also get to preserve esi up until the output, meaning that to reverse the input, we just print in descending order.
outputkey:
mov eax, 4
mov ebx, 1 ; stdout
mov edx, 1
.l2:
lea ecx, [buf+esi]
int 80h
test esi, esi ; if esi is zero it will set the ZF flag
jz .e2:
jmp .l2
.e2:
ret
Note: I haven't tested this code, so if there are any issues with it let me know.

nasm zero byte omitted at the end of the string

I am studying Assembly language using this nasm tutorial. Here is the code that prints a string:
SECTION .data
msg db 'Hello!', 0Ah
SECTION .text
global _start
_start:
mov ebx, msg
mov eax, ebx
; calculate number of bytes in string
nextchar:
cmp byte [eax], 0
jz finished
inc eax
jmp nextchar
finished:
sub eax, ebx ; number of bytes in eax now
mov edx, eax ; number of bytes to write - one for each letter plus 0Ah (line feed character)
mov ecx, ebx ; move the memory address of our message string into ecx
mov ebx, 1 ; write to the STDOUT file
mov eax, 4 ; invoke sys_write (kernel opcode 4)
int 80h
mov ebx, 0 ; no errors
mov eax, 1 ; invoke sys_exit (kernel opcode 1)
int 80h
It works and successfully prints "Hello!\n" to STDOUT. One thing I don't understand: it searches for \0 byte in msg, but we didn't define it. Ideally, the correct message definition should be
msg db 'Hello!', 0Ah, 0h
How does it successfully get the zero byte at the end of the string?
The similar case is in exercise 7:
; String printing with line feed function
sprintLF:
call sprint
push eax ; push eax onto the stack to preserve it while we use the eax register in this function
mov eax, 0Ah ; move 0Ah into eax - 0Ah is the ascii character for a linefeed
push eax ; push the linefeed onto the stack so we can get the address
mov eax, esp ; move the address of the current stack pointer into eax for sprint
call sprint ; call our sprint function
pop eax ; remove our linefeed character from the stack
pop eax ; restore the original value of eax before our function was called
ret ; return to our program
It puts just 1 byte: 0Ah into eax without terminating 0h, but the string length is calculated correctly inside sprint. What is the cause?

String Reverse in FASM x86 architecture

I am making a program that reverses a given string from the user.
The problem that has appeared is that the program works well if the string is 5 bytes long but if the string is lower then the result doesn't appear when I execute it. The other problem is that if the string is more than 5 bytes long it reverses only the first five bytes.
Please keep in mind that I am new to assembly and this question may be basic but I would be grateful is someone tells me where the problem is.
Thank you to everyone, have a great day :)
P.S The file "training. inc" is a file that has "print_str, read_line" methods implemented.
entry start
include "win32a.inc"
MAX_USER_STR = 5h
section '.data' data readable writeable
enter_string db "Enter a string : ", 0
newline db 13,10,0
user_str db MAX_USER_STR dup(?), 0
section ".text" code readable executable
start:
mov esi, enter_string
call print_str
mov edi, user_str
call read_line
call str_len
mov edx, MAX_USER_STR
mov ebx, 0
mov ecx, 0
mov esi, user_str
call print_str
mov esi, newline
call print_str
mov esi, user_str
for_loop :
push eax
mov al, byte[esi]
inc esi
inc ebx
call print_eax
cmp edx, ebx
jb clear_register
jmp for_loop
for_loop2 :
call print_eax
mov byte[esi], al
inc esi
inc ecx
pop eax
cmp ecx, edx
ja break_loop
jmp for_loop2
break_loop:
;mov edi, 0
mov esi, user_str
call print_str
push 0
call [ExitProcess]
clear_register :
mov esi, user_str
jmp for_loop2
str_len :
push ecx
sub ecx, ecx
mov ecx, -1
sub al, al
cld
repne scasb
neg ecx
sub ecx, 1
mov eax, ecx
pop ecx
ret
include 'training.inc'
MAX_USER_STR = 5h
The name MAX_ already says it, but a buffer is to be defined according to the worst case scenario. If you want to be able to deal with strings that could be longer than 5 characters, then raise this value.
MAX_USER_STR = 256 ; A decent buffer
... if the string is lower then the result doesn't appear when I execute it.
The other problem is that if the string is more than 5 bytes long it reverses only the first five bytes.
That's because your code does not actually use the length of the string but rather the size of the smaller buffer. I hope you see that this should never happen, overflowing the buffer. Your code didn't complain too much since this buffer was the last item in the data section.
Your loops could use the true length if you write:
call str_len ; -> EAX
mov edx, eax
for_loop :
push eax
mov al, byte[esi]
If it's characters that you want to push, then I would expect the push eax to follow the load from the string!
Note that in a string-reversal, you never want to move the string terminator(s) to the front of the string.
This is your basic string reversal via the stack:
mov ecx, edx ; EDX has StrLen
mov esi, user_str
loop1:
movzx eax, byte [esi]
inc esi
push eax
dec ecx
jnz loop1
mov esi, user_str
loop2:
pop eax
mov [esi], al
inc esi
dec edx
jnz loop2

Debug code regarding parsing a string character by character in NASM assembly for IA32

I am a novice in assembly programming.I stumbled across a program in which i am required to write a code to take a string and a number from the user and increment each character of the string by the given number.
I have done the following:-
section .bss
s2 resb 20 ;output string
s1 resb 20 ;input string
num resb 2 ;input number
count resb 1 ;length of the input string
section .data
section .text
global _start
_start:
mov eax,3 ;taking input string from the user
mov ebx,0
mov ecx,s1
mov edx,20
int 0x80
mov eax,3 ;taking input number from user
mov ebx,0
mov ecx,num
mov edx,2
int 0x80
mov al,'1' ;initializing count to 1
sub al,'0'
mov [count],al
mov ecx,20 ;no of times the loop can execute
mov esi,s1 ;to use movsb on s1 and s2
mov edi,s2
mov bl,[num] ;converting string num to integer
sub bl,'0'
loop1: ;parse the string character by character
lodsb
cmp al,00 ;exit out when encounter end_of_file
je _exit
add al,bl
stosb
inc byte [count] ;increament count for every possible character except end_of file
loop loop1
_exit:
cld
rep movsb
mov edx,count
mov ecx,s2
mov ebx,1
mov eax,4
int 0x80
mov eax,1
int 0x80
When i run the code,it produces the expected output and some gibberish characters.
I am not able to understand the problem with my code.
Near the end:
mov edx,count
This loads the edx register with the address of count, which is something like 0x804912a. You don't want to write 0x804912a bytes.
You want edx loaded with the contents of count. Note that count is a byte but edx is a 32-bit register, so you'll want to zero-extend it. You probably want to replace that instruction with
movzx edx, byte [count]
After the change, your program works as expected.

Linked assembly subroutine doesn't work as expected

I'm writing a simple subroutine in FASM to print 32-bit unsigned integers to STDOUT. This is what I came up with:
format elf
public uprint
section ".text" executable
uprint:
push ebx
push ecx
push edx
push esi
mov ebx, 10
mov ecx, buf + 11
xor esi, esi
do:
dec ecx
xor edx, edx
div ebx
add dl, 0x30
mov [ecx], dl
inc esi
test eax, 0
jnz do
mov eax, 4
mov ebx, 1
mov edx, esi
int 0x80
pop esi
pop edx
pop ecx
pop ebx
ret
section ".data" writeable
buf rb 11
Then I wrote another program to test whether the above subroutine works properly:
format elf
extrn uprint
public _start
section ".text" executable
_start:
mov eax, 1337
call uprint
mov eax, 4
mov ebx, 1
mov ecx, newline
mov edx, 1
int 0x80
mov eax, 1
xor ebx, ebx
int 0x80
section ".data"
newline db 0x0A
I compiled both these programs to their corresponding object files and linked them to create the executable.
On executing the program however it only displayed 7 instead of 1337. As it turns out only the last digit of the number is display regardless of the number itself.
This is strange because my uprint subroutine is correct. In fact if I combine both these programs into a single program then it displays 1337 correctly.
What am I doing wrong?
I gain the distinct impression that your LINK operation is building the uprint before the _start and you're in fact entering UPRINT, not at _start as you expect.
I found out my mistake. I'm using test eax, 0 which always sets the zero flag. Hence only the first digit is processed. Intead I need to use either test eax, eax or cmp eax, 0.

Resources