String Reverse in FASM x86 architecture

I am writing a program that reverses a string entered by the user.
The program works well if the string is exactly 5 bytes long, but if the string is shorter, the result doesn't appear when I execute it. The other problem is that if the string is longer than 5 bytes, only the first five bytes are reversed.
Please keep in mind that I am new to assembly and this question may be basic, but I would be grateful if someone could tell me where the problem is.
Thank you to everyone, have a great day :)
P.S. The file "training.inc" provides the "print_str" and "read_line" routines.
entry start
include "win32a.inc"
MAX_USER_STR = 5h
section '.data' data readable writeable
enter_string db "Enter a string : ", 0
newline db 13,10,0
user_str db MAX_USER_STR dup(?), 0
section ".text" code readable executable
start:
mov esi, enter_string
call print_str
mov edi, user_str
call read_line
call str_len
mov edx, MAX_USER_STR
mov ebx, 0
mov ecx, 0
mov esi, user_str
call print_str
mov esi, newline
call print_str
mov esi, user_str
for_loop :
push eax
mov al, byte[esi]
inc esi
inc ebx
call print_eax
cmp edx, ebx
jb clear_register
jmp for_loop
for_loop2 :
call print_eax
mov byte[esi], al
inc esi
inc ecx
pop eax
cmp ecx, edx
ja break_loop
jmp for_loop2
break_loop:
;mov edi, 0
mov esi, user_str
call print_str
push 0
call [ExitProcess]
clear_register :
mov esi, user_str
jmp for_loop2
str_len :
push ecx
sub ecx, ecx
mov ecx, -1
sub al, al
cld
repne scasb
neg ecx
sub ecx, 1
mov eax, ecx
pop ecx
ret
include 'training.inc'

MAX_USER_STR = 5h
As the MAX_ prefix suggests, a buffer has to be sized for the worst-case scenario. If you want to be able to deal with strings that could be longer than 5 characters, then raise this value.
MAX_USER_STR = 256 ; A decent buffer
... if the string is lower then the result doesn't appear when I execute it.
The other problem is that if the string is more than 5 bytes long it reverses only the first five bytes.
That's because your code does not actually use the length of the string but rather the size of the (too small) buffer. Overflowing the buffer like this should never happen. Your program didn't complain too much only because this buffer was the last item in the data section.
Your loops could use the true length if you write:
call str_len ; -> EAX
mov edx, eax
for_loop :
push eax
mov al, byte[esi]
If it's characters that you want to push, then I would expect the push eax to follow the load from the string!
Note that in a string-reversal, you never want to move the string terminator(s) to the front of the string.
This is your basic string reversal via the stack:
mov ecx, edx ; EDX has StrLen
mov esi, user_str
loop1:
movzx eax, byte [esi]
inc esi
push eax
dec ecx
jnz loop1
mov esi, user_str
loop2:
pop eax
mov [esi], al
inc esi
dec edx
jnz loop2
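For readers who find the control flow easier to follow in a high-level language, here is a minimal C sketch of the same push-then-pop idea. The function name `reverse_via_stack` and the 256-byte capacity (chosen to match the suggested MAX_USER_STR = 256) are assumptions made for illustration, not part of the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reverse s in place by pushing every character onto a stack and popping
   them back, mirroring loop1/loop2 above. The terminator is never moved. */
void reverse_via_stack(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);          /* true length, the role of str_len */
    char stack[256];                 /* assumed capacity (hypothetical) */
    assert(len <= sizeof stack);

    size_t top = 0;
    for (size_t i = 0; i < len; i++) /* loop1: push each character */
        stack[top++] = s[i];
    for (size_t i = 0; i < len; i++) /* loop2: pop in reverse order */
        s[i] = stack[--top];
}
```

Unlike the assembly loops, which decrement a counter before testing it, this sketch also copes with an empty string; in the assembly version a zero length would need a separate check before entering loop1.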

Related

nasm zero byte omitted at the end of the string

I am studying Assembly language using this nasm tutorial. Here is the code that prints a string:
SECTION .data
msg db 'Hello!', 0Ah
SECTION .text
global _start
_start:
mov ebx, msg
mov eax, ebx
; calculate number of bytes in string
nextchar:
cmp byte [eax], 0
jz finished
inc eax
jmp nextchar
finished:
sub eax, ebx ; number of bytes in eax now
mov edx, eax ; number of bytes to write - one for each letter plus 0Ah (line feed character)
mov ecx, ebx ; move the memory address of our message string into ecx
mov ebx, 1 ; write to the STDOUT file
mov eax, 4 ; invoke sys_write (kernel opcode 4)
int 80h
mov ebx, 0 ; no errors
mov eax, 1 ; invoke sys_exit (kernel opcode 1)
int 80h
It works and successfully prints "Hello!\n" to STDOUT. One thing I don't understand: it searches for \0 byte in msg, but we didn't define it. Ideally, the correct message definition should be
msg db 'Hello!', 0Ah, 0h
How does it successfully get the zero byte at the end of the string?
A similar case appears in exercise 7:
; String printing with line feed function
sprintLF:
call sprint
push eax ; push eax onto the stack to preserve it while we use the eax register in this function
mov eax, 0Ah ; move 0Ah into eax - 0Ah is the ascii character for a linefeed
push eax ; push the linefeed onto the stack so we can get the address
mov eax, esp ; move the address of the current stack pointer into eax for sprint
call sprint ; call our sprint function
pop eax ; remove our linefeed character from the stack
pop eax ; restore the original value of eax before our function was called
ret ; return to our program
It puts just 1 byte: 0Ah into eax without terminating 0h, but the string length is calculated correctly inside sprint. What is the cause?
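For the sprintLF case, one explanation follows from how push works: push eax always stores all 32 bits of eax, so after mov eax, 0Ah the stack slot holds the bytes 0Ah, 0, 0, 0, and a scan for a zero byte stops right after the linefeed. The C sketch below models that layout; the helper names `store_le32` and `count_until_zero` are hypothetical. For the first case, msg itself defines no zero byte, so the scan presumably stops at zero padding that happens to follow the data, which is not something to rely on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value the way a little-endian x86 push lays it out. */
void store_le32(uint8_t out[4], uint32_t value)
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(value >> (8 * i));  /* lowest byte first */
}

/* Count bytes up to the first zero byte, like the nextchar loop. */
size_t count_until_zero(const uint8_t *p)
{
    assert(p != NULL);
    size_t n = 0;
    while (p[n] != 0)
        n++;
    return n;
}
```

With the value 0Ah, the slot starts with the linefeed followed by three zero bytes, so the computed length is 1.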

Disregard a strings' space characters in ASM

I am trying to find the number of characters in a string while disregarding all the " " space characters.
I have a C++ portion that passes the strings to asm, and here is my asm.
It works fine; the only thing is that the space characters are being counted as well.
stringLength PROC PUBLIC
PUSH ebp ; save caller base pointer
MOV ebp, esp ; set our base pointer
SUB esp, (1 * 4) ; allocate uint32_t local vars
PUSH edi
PUSH esi
; end prologue
MOV esi, [ebp+8] ;gets the string
xor ebx, ebx
COMPARE:
MOV al, [esi + ebx]
CMP al, 0 ;compare character of string with 0
JE FINALE ;if = to 0 go to end
INC ebx ;counter
CMP al, ' ' ;compare with space
JE SPACE ;go get rid of the space and keep going
INC al ;otherwise inc al to next character and repeat
JMP COMPARE
SPACE:
DEC ebx ;get rid of the extra space
INC al
JMP COMPARE ;goes back to compare
FINALE:
MOV eax,ebx ; bring back the counter
ADD esp, (2 * 4) ; clear the stack
POP esi
POP edi
MOV esp, ebp ; deallocate locals
POP ebp ; restore caller base pointer
RET
stringLength ENDP ; end the procedure
END stringLength
You are doing a lot of useless stuff and not doing anything to count ignoring the spaces.
You don't really need to setup a new stack frame, for such a simple routine you can do everything in clobbered registers, or at most save a few registers on the stack;
That inc al is pointless - you are incrementing the character value, just to discard it at the next loop iteration.
push fmt and then you clean the stack immediately? What sense does it make?
mov ebx, 0: nobody does that; the idiomatic way to zero a register is xor ebx,ebx (the instruction encoding is more compact);
cmp al, 0 given that you are only interested in equality, you can just do test al, al (more compact);
you read [ebp+12] but never actually use it - is that supposed to be an unused parameter?
As for the algorithm itself, you'll just have to keep a separate counter to count non-space characters; actually, you can just keep ebx for that, and increment directly esi to iterate over characters. For example:
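xor ebx, ebx ; counter of non-space characters
COMPARE:
mov al, [esi] ; read the current character
test al, al
jz FINALE ; stop at the NUL terminator
inc esi ; advance to the next character
cmp al, ' '
je COMPARE ; spaces are not counted
inc ebx ; count every other character
jmp COMPARE
FINALE: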
Now, this can be streamlined further exploiting the fact that the eax is going to be the return value, and that you can clobber freely ecx and edx, so:
stringLength PROC PUBLIC
mov ecx,[esp+4] ; get the string
xor eax,eax ; zero the counter
compare:
mov dl,[ecx] ; read character
test dl,dl
jz finish ; go to end if we reached the NUL
inc ecx ; next character
cmp dl,' '
je compare ; skip spaces
inc eax ; increase counter for a non-space character
jmp compare
finish: ; "end" is a reserved word in MASM, so use another label name
ret ; straight return, nothing else to do
stringLength ENDP ; end the procedure
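As a reference for the behaviour the question asks for (the length of the string with spaces disregarded), here is a minimal C sketch. The name `count_non_space` is hypothetical; it only models what the routine is meant to return.

```c
#include <assert.h>
#include <stddef.h>

/* Return the number of characters in s that are not a space (' '). */
size_t count_non_space(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; s++)   /* walk up to the NUL terminator */
        if (*s != ' ')        /* only non-space characters are counted */
            count++;
    return count;
}
```

The important point is the same as in the assembly: the pointer that walks the string and the counter are two separate variables, so skipping a space never disturbs the iteration.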
edit: about the updated version
COMPARE:
MOV al, [esi + ebx]
CMP al, 0
JE FINALE
INC ebx
CMP al, " " ; I don't know what assembler you are using,
; but typically character values are in single quotes
JE SPACE
INC al ; this makes no sense! you are incrementing
; the _character value_, not the position!
; it's going to be overwritten at the next iteration
JMP COMPARE
SPACE:
INC eax ; you cannot use eax as a counter, you are already
; using it (al) as temporary store for the current
; character!
JMP COMPARE
I think we can use the whole eax register to compare four characters at a time, like this:
; input data:
; eax - pointer to first byte of string
; edx - count of bytes in string
; output data:
; ecx - result (number of non-space chars)
push esi
mov ecx, 0
mov esi, eax
##compare: cmp edx, 4
jl ##finalpass
mov eax, [esi]
xor eax, 20202020h ; 20h - space char
cmp al, 0
jz ##nextbyte0
inc ecx
##nextbyte0: cmp ah, 0
jz ##nextbyte1
inc ecx
##nextbyte1: shr eax, 16
cmp al, 0
jz ##nextbyte2
inc ecx
##nextbyte2: cmp ah, 0
jz ##nextbyte3
inc ecx
##nextbyte3: add esi, 4
sub edx, 4
jmp ##compare
##finalpass: and edx, edx
jz ##fine
mov al, [esi]
cmp al, 20h
jz ##nextbyte4
inc ecx
##nextbyte4: inc esi
dec edx
jmp ##finalpass
##fine: pop esi
; save the result data and restore stack
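The four-bytes-at-a-time idea above can be sketched in C as follows, assuming the caller already knows the byte count, as in the register contract of that routine. The name `count_non_space_x4` is hypothetical. XOR with 20202020h turns every space byte into zero, so each of the four bytes is then simply tested for being non-zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the non-space bytes among the first len bytes of s. */
size_t count_non_space_x4(const char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t count = 0;
    while (len >= 4) {
        uint32_t word;
        memcpy(&word, s, 4);          /* load 4 characters at once */
        word ^= 0x20202020u;          /* a space byte becomes 0 */
        for (int i = 0; i < 4; i++) { /* test each of the 4 bytes */
            if ((word & 0xFFu) != 0)
                count++;
            word >>= 8;
        }
        s += 4;
        len -= 4;
    }
    for (; len > 0; s++, len--)       /* final pass: 0 to 3 leftover bytes */
        if (*s != ' ')
            count++;
    return count;
}
```

Because all four bytes are examined, the result does not depend on byte order. Whether this is actually faster than the byte-wise loop is not claimed here and would need measuring.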

String Reverse in Assembly language x86

I'm new to assembly language and I have this code that is supposed to reverse a string. I know I'm close, but the program keeps crashing for some reason. The problem is in the STRREV PROC. What am I doing wrong in this code?
INCLUDE Irvine32.inc
.data
prompt BYTE "Enter String: ", 0
response BYTE 50 DUP(0)
message BYTE " Message entered. ",0
.code
swap MACRO a,b
xor a,b
xor b,a
xor a,b
endM
STRQRY PROC
push ebp
mov ebp, esp
push edx
push ecx
mov edx, [ebp+8]
call writestring
mov ecx, SIZEOF response
mov edx, OFFSET response
call readstring
pop ecx
pop edx
pop ebp
ret 4
STRQRY ENDP
STRLEN PROC
push ebp
mov ebp, esp
push ebx
push ecx
mov edx,[ebp+16]
mov eax, 0
counter:
mov cl,[edx+eax]
cmp cl, 0
JE done
inc eax
jmp counter
done:
pop ecx
pop ebx
pop ebp
ret 4
STRLEN ENDP
STRREV proc
push ebp
mov ebp, esp
push OFFSET response
call STRLEN
mov edx, [ebp+8]
mov esi, 0
dec eax
reverseloop:
mov ah, [edx+esi]
mov al, [edx+eax]
swap ah, al
mov [edx+esi],ah
mov [edx+eax],al
inc esi
dec eax
cmp esi, eax
jb reverseloop
ja finish
finish:
pop ebp
ret 4
STRREV endp
main PROC
push OFFSET prompt
call STRQRY
call writedec
mov edx,OFFSET message
call WriteString
push eax
call STRREV
mov edx, OFFSET response
call WriteString
exit
main ENDP
END main
The main problem in your function is that you change the AL and AH registers and then use EAX as an index. I wrote a new function based on your code; read it carefully and step through your own code in a debugger.
STRREV proc
;opening the function
push ebp
mov ebp, esp
push OFFSET response
call STRLEN
mov edx, [ebp+8] ;edx = offset string to reverse
mov esi, 0
dec eax
mov ebx,edx ;ebx stores the pointer to the first character
add ebx,eax ;now ebx stores the pointer to the last character before the terminating 0
reverseloop:
mov ah, [edx] ;ah stores the value at string[loop count]
mov al, [ebx] ;al stores the value at string[len-loop count-1]
;"swap ah,al" is logically unnecessary
;better solution:
mov [ebx],ah ; string[len-loop count-1] = string[loop count]
mov [edx],al ; string[loop count] = string[len-loop count-1]
inc edx ;increment of the left-most pointer
dec ebx ;decrement of the right-most pointer
cmp edx, ebx ;compares the left-most pointer to the right-most
jb reverseloop
jmp finish ;"ja", there is no need to check a condition twice
finish:
pop ebp
ret 4
STRREV endp
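The two-pointer technique used here (one pointer walking forward from the first character, one walking backward from the last, swapping until they meet) can be sketched in C like this. The name `reverse_in_place` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reverse the NUL-terminated string s in place with two pointers. */
void reverse_in_place(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len < 2)
        return;                 /* nothing to swap for 0 or 1 characters */
    char *left = s;             /* first character */
    char *right = s + len - 1;  /* last character before the terminator */
    while (left < right) {      /* stop when the pointers meet or cross */
        char tmp = *left;
        *left++ = *right;
        *right-- = tmp;
    }
}
```

Note the early return for short strings: a loop that swaps first and compares afterwards would touch the byte before the buffer when the string is empty.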

write number to file using NASM

How do I write a variable to a file using NASM?
For example, if I execute some mathematical operation, how do I write the result of the operation to a file?
My file results have remained empty.
My code:
%include "io.inc"
section .bss
result db 2
section .data
filename db "Downloads/output.txt", 0
section .text
global CMAIN
CMAIN:
mov eax,5
add eax,17
mov [result],eax
PRINT_DEC 2,[result]
jmp write
write:
mov EAX, 8
mov EBX, filename
mov ECX, 0700
int 0x80
mov EBX, EAX
mov EAX, 4
mov ECX, [result]
int 0x80
mov EAX, 6
int 0x80
mov eax, 1
int 0x80
jmp exit
exit:
xor eax, eax
ret
You have to implement itoa (integer to ASCII) and then a len routine for this. This code was tested and works properly on Ubuntu.
section .bss
answer resb 64
section .data
filename db "./output.txt", 0
section .text
global main
main:
mov eax,5
add eax,44412
push eax ; Push the new calculated number onto the stack
call itoa
mov EAX, 8
mov EBX, filename
mov ECX, 0o700 ; file mode in octal (rwx for the owner); NASM needs the 0o prefix for octal
int 0x80
push answer
call len
mov EBX, EAX
mov EAX, 4
mov ECX, answer
movzx EDX, di ; move with extended zero edi. length of the string
int 0x80
mov EAX, 6
int 0x80
mov eax, 1
int 0x80
jmp exit
exit:
xor eax, eax
ret
itoa:
; Recursive function. This is going to convert the integer to the character.
push ebp ; Setup a new stack frame
mov ebp, esp
push eax ; Save the registers
push ebx
push ecx
push edx
mov eax, [ebp + 8] ; eax is going to contain the integer
mov ebx, dword 10 ; This is our "stop" value as well as our value to divide with
mov ecx, answer ; Put a pointer to answer into ecx
push ebx ; Push ebx on the field for our "stop" value
itoa_loop:
cmp eax, ebx ; Compare eax, and ebx
jl itoa_unroll ; Jump if eax is less than ebx (which is 10)
xor edx, edx ; Clear edx
div ebx ; Divide by ebx (10)
push edx ; Push the remainder onto the stack
jmp itoa_loop ; Jump back to the top of the loop
itoa_unroll:
add al, 0x30 ; Add 0x30 to the bottom part of eax to make it an ASCII char
mov [ecx], byte al ; Move the ASCII char into the memory references by ecx
inc ecx ; Increment ecx
pop eax ; Pop the next variable from the stack
cmp eax, ebx ; Compare if eax is ebx
jne itoa_unroll ; If they are not equal, we jump back to the unroll loop
; else we are done, and we execute the next few commands
mov [ecx], byte 0xa ; Add a newline character to the end of the character array
inc ecx ; Increment ecx
mov [ecx], byte 0 ; Add a null byte to ecx, so that when we pass it to our
; len function it will properly give us a length
pop edx ; Restore registers
pop ecx
pop ebx
pop eax
mov esp, ebp
pop ebp
ret
len:
; Returns the length of a string. The string has to be null terminated. Otherwise this function
; will fail miserably.
; Upon return. edi will contain the length of the string.
push ebp ; Save the previous stack pointer. We restore it on return
mov ebp, esp ; We setup a new stack frame
push eax ; Save registers we are going to use. edi returns the length of the string
push ecx
mov ecx, [ebp + 8] ; Move the string pointer to ecx; [ebp + 8] skips the saved ebp and the return address
mov edi, 0 ; Set the counter to 0. We are going to increment this each loop
len_loop: ; Just a quick label to jump to
movzx eax, byte [ecx + edi] ; Move the character to eax.
movsx eax, al ; Move al to eax. al is part of eax.
inc di ; Increase di.
cmp eax, 0 ; Compare eax to 0.
jnz len_loop ; If it is not zero, we jump back to len_loop and repeat.
dec di ; Remove one from the count
pop ecx ; Restore registers
pop eax
mov esp, ebp ; Set esp back to what ebp used to be.
pop ebp ; Restore the stack frame
ret ; Return to caller
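The key point of this answer, that a number must be converted to text before it is written, can be sketched in C as follows. The names `uint_to_ascii_line` and `write_number` are hypothetical, and the 12-byte buffer size assumes a 32-bit unsigned int (at most 10 digits, plus newline and terminator). The digits are produced least significant first and then emitted in reverse, which is the role the stack plays in itoa above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Convert value to decimal ASCII followed by '\n' and a terminating 0.
   out must hold at least 12 bytes. Returns the length without the 0. */
size_t uint_to_ascii_line(unsigned int value, char *out)
{
    assert(out != NULL);
    char digits[10];                 /* assumes a 32-bit value: max 10 digits */
    size_t n = 0;
    do {                             /* least significant digit first */
        digits[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    size_t len = 0;
    while (n > 0)                    /* emit digits in reverse order */
        out[len++] = digits[--n];
    out[len++] = '\n';
    out[len] = '\0';
    return len;
}

/* Write the text form of value to path; returns 0 on success, -1 on failure. */
int write_number(const char *path, unsigned int value)
{
    assert(path != NULL);
    char text[12];
    size_t len = uint_to_ascii_line(value, text);
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    size_t written = fwrite(text, 1, len, f);
    int close_status = fclose(f);
    return (written == len && close_status == 0) ? 0 : -1;
}
```

A call such as `write_number("./output.txt", 5 + 44412)` would correspond to what the assembly program does with its result.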

Loop/Input Logic Flow Issue (NASM x86 Assembly)

I have a program below that tries to take input from the user and repeat that same string until the user enters it again. (It's a personal learning project)
However, I am having some severe difficulty getting it to perform correctly. In a past thread here, you can see the input, pun intended, that other users have provided on this problem.
%include "system.inc"
section .data
greet: db 'Hello!', 0Ah, 'Please enter a word or character:', 0Ah
greetL: equ $-greet ;length of string
inform: db 'I will now repeat this until you type it back to me.', 0Ah
informL: equ $-inform
finish: db 'Good bye!', 0Ah
finishL: equ $-finish
newline: db 0Ah
newlineL: equ $-newline
section .bss
input: resb 40 ;first input buffer
check: resb 40 ;second input buffer
section .text
global _start
_start:
greeting:
mov eax, 4
mov ebx, 1
mov ecx, greet
mov edx, greetL
sys.write
getword:
mov eax, 3
mov ebx, 0
mov ecx, input
mov edx, 40
sys.read
sub eax, 1 ;remove the newline
push eax ;store length for later
instruct:
mov eax, 4
mov ebx, 1
mov ecx, inform
mov edx, informL
sys.write
pop edx ;pop length into edx
mov ecx, edx ;copy into ecx
push ecx ;store ecx again (needed multiple times)
mov eax, 4
mov ebx, 1
mov ecx, input
sys.write
mov eax, 4 ;print newline
mov ebx, 1
mov ecx, newline
mov edx, newlineL
sys.write
mov eax, 3 ;get the user's word
mov ebx, 0
mov ecx, check
mov edx, 40
sys.read
sub eax, 1
push eax
xor eax, eax
checker:
pop ecx ;length of check
pop ebx ;length of input
mov edx, ebx ;copy
cmp ebx, ecx ;see if input was the same as before
jne loop ;if not the same go to input again
mov ebx, check
mov ecx, input
secondcheck:
mov dl, [ebx]
cmp dl, [ecx]
jne loop
inc ebx
inc ecx
dec eax
jnz secondcheck
jmp done
loop:
pop edx
mov ecx, edx
push ecx
mov eax, 4
mov ebx, 1
mov ecx, check
sys.write ;repeat the word
mov eax, 4
mov ebx, 1
mov ecx, newline
mov edx, newlineL
sys.write
mov eax, 3 ;replace new input with old
mov ebx, 0
mov ecx, check
mov edx, 40
sys.read
jmp checker
done:
mov eax, 1
mov ebx, 0
sys.exit
Example output would yield:
Hello!
Please enter a word or character:
INPUT: Nick
I will now repeat this until you type it back to me.
Nick
INPUT: Nick
N
INPUT: Nick
INPUT: Nick
And that goes on forever until I ^C it to death. Any ideas on the problem?
Thanks.
instruct leaves two items on the stack, which are consumed by checker the first time round the loop. But they are not replaced for the case where you go round the loop again. This is the most fundamental problem in your code (there may be others).
You could find this by running with a debugger and watching the stack pointer esp; but it can be seen just by looking at the code -- if you take everything out except for the stack manipulation and branches, you can clearly see that the checker -> loop -> back to checker path pops three items but only pushes one:
greeting:
...
getword:
...
push eax ;store length for later
instruct:
...
pop edx ;pop length into edx
...
push ecx ;store ecx again (needed multiple times)
...
push eax
checker:
pop ecx ;length of check
pop ebx ;length of input
...
jne loop ;if not the same go to input again
...
secondcheck:
...
jne loop
...
jnz secondcheck
jmp done
loop:
pop edx
...
push ecx
...
jmp checker
done:
...
There are better ways to keep long-lived variables than trying to shuffle them around on the stack like this with push and pop.
Keep them in a data section (the .bss you already have would be suitable) instead of on the stack.
Allocate some space on the stack, and load/store them there directly. e.g. sub esp, 8 to reserve two 32-bit words, then access [esp] and [esp+4]. (The stack should be aligned to a 32-bit boundary, so always reserve a multiple of 4 bytes.) Remember to add esp, 8 when you've finished using it.
(These are essentially the equivalent of what a C compiler would do for global (or static) variables, and local variables, respectively.)
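To make the C analogy concrete, here is a minimal sketch in which the length of the first input lives in a static variable (the counterpart of a .bss slot) and is copied into a local (the counterpart of a slot at [esp]) before the comparison. All names are hypothetical; the sketch models only the "same length, then same bytes" check, not the I/O.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static size_t input_len;   /* option 1: static storage, like a .bss variable */

/* Record the length of the first word once, right after reading it. */
void remember_input_length(size_t len)
{
    input_len = len;
}

/* Return 1 when check (check_len bytes) equals the remembered input. */
int matches_input(const char *input, const char *check, size_t check_len)
{
    assert(input != NULL && check != NULL);
    size_t expected = input_len;   /* option 2: a local, like a stack slot */
    if (check_len != expected)     /* different lengths: cannot match */
        return 0;
    return memcmp(input, check, expected) == 0;
}
```

Because the stored length is only read and never popped, it is still available on every pass through the loop, which is exactly what the push/pop shuffling fails to guarantee.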

Resources