Why does the placement of a string declaration matter in 8086 assembly?

I wanted to write a program that iterates through a string and counts how many times the letter "C" occurs (I increment BX every time there is a letter C).
Now, the code works correctly like this:
LEA DI, STRING
MOV CX, 6h
FOR:
CMP CX, 0
JE END
cmp [DI], "C"
jnz siPasDeC
inc BX
siPasDeC:
INC DI
dec cx
jmp FOR
END:
ret
hlt
STRING DB "CCKCCD"
But when I put STRING DB "CCKCCD" on the first line, the program goes into what looks like an infinite loop. Can you tell me why this happens?
PS: Is it best practice to write "ret" and "hlt" every time? I don't see people writing them online, but in college we are forced to.
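A likely explanation, assuming the program is assembled as a flat .COM-style image (as emu8086 does by default) where execution starts at the very first byte: the CPU cannot tell data from code, so a string placed first gets executed. The bytes of "CCKCCD" happen to decode as INC BX, INC BX, DEC BX, INC BX, INC BX, INC SP. The count in BX therefore starts off wrong, and the INC SP leaves the stack pointer off by one, so the final ret probably pops a garbage return address and execution runs away. The sketch below keeps the string at the top but jumps over it; the labels start, next and skip are hypothetical names.

```asm
        org  100h              ; .COM layout: execution begins at the first byte
        jmp  start             ; step over the data so it is never executed
STRING  db   "CCKCCD"
start:
        xor  bx, bx            ; BX = number of 'C' characters found so far
        lea  di, STRING
        mov  cx, 6             ; number of characters to examine
next:
        cmp  byte ptr [di], "C"
        jne  skip
        inc  bx
skip:
        inc  di
        dec  cx
        jnz  next
        ret                    ; in a .COM program this returns to DOS
```

As for the PS: in a .COM program a plain ret works because DOS pushes a zero word on the stack before starting the program, so the ret lands on the INT 20h stored at the start of the PSP. An hlt placed after the ret is never reached.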

Related

How to skip spaces at the beginning of the string in assembly x86

What loop should I write if I want to skip all the spaces at the beginning of a string and only start processing the string when my code reaches its first non-space character? If I have a string like
a=' s o m e word'
the code should start when 's' is reached. It should be some kind of loop but I still don't know how to write it correctly.
My try:
mov si, offset buff+2 ;buffer
mov ah, [si]
loop_skip_space:
cmp ah,20h ;check if ah is space
jnz increase ;if yes increase si
jmp done ;if no end of loop
increase:
inc si
loop loop_skip_space
done:
Since this 16-bit code fetches its string at offset buff+2, I believe it's a safe bet that the string is input obtained from the DOS.BufferedInput function 0Ah.
The code snippets below are based on this assumption. The following observations about the OP's code remain valid anyway.
The mov ah, [si] must be part of the loop. You need to check a different character from the string on every iteration, so each successive byte has to be loaded inside the loop.
Your code should exit the loop upon finding a non-space character. Currently you exit upon the first space.
The loop instruction requires setting up the CX register with the length of the string. You omitted that.
mov si, offset buff+2
mov cl, [si-1]
mov ch, 0
jcxz done ; String could start off empty
loop_skip_space:
cmp byte ptr [si], ' '
jne done
inc si ; Skip the current space
loop loop_skip_space
done:
Here SI points at the first non-space character in the string with CX characters remaining. CX could be zero!
You can write better code if you stop using the loop instruction, because it's an instruction that is said to be slow. See Why is the loop instruction slow? Couldn't Intel have implemented it efficiently?.
Avoiding loop and no longer being tied to CX (the count fits in a byte register such as CL)
mov si, offset buff+2
mov cl, [si-1]
test cl, cl
jz done ; String could start off empty
loop_skip_space:
cmp byte ptr [si], ' '
jne done
inc si ; Skip the current space
dec cl
jnz loop_skip_space
done:
Avoiding loop and using the delimiter 13 (carriage return)
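mov si, offset buff+1
loop_skip_space:
inc si
cmp byte ptr [si], ' '
je loop_skip_space ; The terminating 13 is not a space, so this always ends
; Here SI points at the first non-space character (possibly the 13 itself)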
Look at this example (it uses the 32-bit registers):
STRLEN EQU 9
STRING DB 'Assembler'
CLD ;clear direction flag
MOV AL,' ' ;character to skip
MOV ECX,STRLEN ;length of string
LEA EDI,STRING ;string pointer
REPE SCASB ;scan while the current character is a space
JE K20 ;ZF=1: the string contained only spaces
DEC EDI ;SCASB stepped one past the mismatch, so back up
; EDI points to the first non-space character
K20: RET

A better form to find a Substring in a string in assembly code (MASM)?

So I wrote this code with the knowledge I gathered from various sites. I think there is a more optimized way to do it without pushing and popping the registers on the stack, but I don't know how to do it.
Here is my code:
comparing proc
MOV CX, SIZEOF vec2 ;The size of the substring
DEC CX
MOV DX, SIZEOF vec1 ; The size of the String
LEA SI, vec1 ; The String
LEA DI, vec2 ; The substring
FIND_FIRST:
MOV AL, [SI]; storing the ascii value of current character of mainstring
MOV AH, [DI]; storing the ascii value of current character of substring
CMP AL,AH; comparing both character
JE FITTING; if we find it we try to find the whole substring
JNE NEXT
NEXT:
INC SI; We go to the next char
DEC DX; And the size of the string decreased
JE N_FINDED
JMP FIND_FIRST
FITTING:
CLD
PUSH CX ; I push this register because in the instruction REPE CMPSB
PUSH SI ; They change.
PUSH DI
REPE CMPSB
JNE N_FITTING
JE FINDED
N_FITTING:
POP DI
POP SI
POP CX
JMP NEXT ; if the complete substring doesn't fit we go to the next char
FINDED:
POP DI
POP SI
POP CX
MOV AL, 0001H; substring found
JMP RETURN
N_FINDED:
MOV AL, 0000H; substring not found
RETURN:
ret
comparing endp
If the substring happens to have more than one character, which is very likely, then your code will start comparing bytes that are beyond the string to search through.
With the string referred to by DI and its length in DX, and the substring referred to by SI and its length in CX, you first need to make sure that neither string is empty, and then you need to limit the number of possible finds. The next 4 lines of code do just that:
jcxz NotFound ; Substring is empty
sub dx, cx
jb NotFound ; Substring is longer than String
; Happens also when String is empty
inc dx
As an example consider the string "overflow" (DX=8) and the substring "basic" (CX=5):
sub dx, cx ; 8 - 5 = 3
inc dx ; 3 + 1 = 4 Number of possible finds is 4
overflow
basic      possible find number 1
 basic     possible find number 2
  basic    possible find number 3
   basic   possible find number 4
You can write your proc without having to preserve those registers on the stack (or elsewhere) all the time. Just introduce another register so you don't have to clobber the CX, SI, and DI registers:
jcxz NotFound
sub dx, cx
jb NotFound
inc dx
mov al, [si] ; Permanent load of first char of the Substring
FirstCharLoop:
cmp [di], al
je FirstCharMatch
NextFirstChar:
inc di
dec dx ; More tries?
jnz FirstCharLoop ; Yes
NotFound:
xor ax, ax
ret
FirstCharMatch:
mov bx, cx
dec bx
jz Found ; Substring had only 1 character
OtherCharsLoop:
mov ah, [si+bx]
cmp [di+bx], ah
jne NextFirstChar
dec bx
jnz OtherCharsLoop
Found:
mov ax, 1
ret
Do note that this code now does not compare the first character again like the repe cmpsb in your program did.
AX being the result, the only registers that are clobbered (and that you maybe could want to preserve) are BX, DX, and DI.
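For completeness, here is how the routine above might be called. This is only a sketch: FindSubstring and SubstringMissing are hypothetical names, vec1 and vec2 are the OP's buffers, and SIZEOF is assumed to give the exact character counts with no terminator byte included.

```asm
        lea  di, vec1            ; DI -> the string to search through
        mov  dx, SIZEOF vec1     ; DX = its length
        lea  si, vec2            ; SI -> the substring to look for
        mov  cx, SIZEOF vec2     ; CX = its length
        call FindSubstring       ; hypothetical name for the routine above
        test ax, ax              ; AX = 1 if found, 0 if not
        jz   SubstringMissing    ; hypothetical label for the "not found" path
```

If the caller needs BX, DX or DI afterwards, it has to save them itself, since the routine clobbers them.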

How to find the Hamming distance for strings that are not necessarily equal length?

I have an assignment asking me to find the Hamming distance of two user-input strings that are not necessarily equal in length.
So, I made the following algorithm:
1. Read both strings
2. Check the length of each string
3. Compare the lengths of the strings
   if (str1 is shorter)
       set counter to the length of str1
   END IF
   if (str1 is longer)
       set counter to the length of str2
   END IF
   if (length of str1 == length of str2)
       set counter to the length of str1
   END IF
4. Loop through each digit of the strings
   if (str1[digitNo] XOR str2[digitNo] == 1)
       inc al
   END IF
5. The final al value is the Hamming distance of the strings; print it.
But I'm stuck at step 3 and can't seem to get it working. Any help?
I tried playing around with the registers that hold the values, but none of that worked.
; THIS IS THE CODE I GOT
.model small
.data
str1 db 255
db ?
db 255 dup(?)
msg1 db 13,10,"Enter first string:$"
str2 db 255
db ?
db 255 dup(?)
msg2 db 13,10,"Enter second string:$"
one db "1"
count db ?
.code
.startup
mov ax,@data
mov ds,ax
; printing first message
mov ah, 9
mov dx, offset msg1
int 21h
; reading first string
mov ah, 10
mov dx, offset str1
int 21h
; printing second message
mov ah, 9
mov dx, offset msg2
int 21h
; reading second string
mov ah, 10
mov dx, offset str2
int 21h
; setting the values of the registers to zero
mov si, 0
mov di, 0
mov cx, 0
mov bx, 0
; checking the length of the first string
mov bl, str1+1
add bl, 30h
mov ah, 02h
mov dl, bl
int 21h
; checking the length of the second string
mov bl, str2+1
add bl, 30h
mov ah, 02h
mov dh, bl
int 21h
; comparing the length of the strings
cmp dl,dh
je equal
jg str1Greater
jl str1NotGreater
; if the strings are equal we jump here
equal:
mov cl, dl
call theLoop
; if the first string is greater than the second, we jump here and set counter of str1
str1Greater:
; if the second string is greater than the first, we jump here and set counter to length of str2
Str1NotGreater:
; this is the loop that finds and prints the hamming distance
;we find it by looping over the strings and taking the xor for each 2, then incrementing counter of ones for each xor == 1
theLoop:
end
So, the code I provided is supposed to print the length of each string (it prints the lengths next to each other), but it seems to always keep printing the length of the first string, twice. I store the length of the first string in dl and the length of the second in dh. If I change dh back to dl, the correct length is printed, but I want to compare the lengths, and I don't think that's possible if I store both in dl.
but it seems to always keep printing the length of the first string, twice.
When outputting a character with the DOS function 02h you don't get to choose which register to use to supply the character! It's always DL.
Since after printing both lengths you still want to work with these lengths it will be better to not destroy them in the first place. Put the 1st length in BL and the second length in BH. For outputting you copy these in turn to DL where you do the conversion to a character. This of course can only work for strings of at most 9 characters.
; checking the length of the first string
mov BL, str1+1
mov dl, BL
add dl, 30h
mov ah, 02h
int 21h
; checking the length of the second string
mov BH, str2+1
mov dl, BH
add dl, 30h
mov ah, 02h
int 21h
; comparing the length of the strings
cmp BL, BH
ja str1LONGER
jb str1SHORTER
; if the strings are equal we ** FALL THROUGH ** here
equal:
mov cl, BL
mov ch, 0
call theLoop
; !!!! You need some way out at this point. Don't fall through here !!!!
; if the first string is greater than the second, we set counter of str1
str1LONGER:
; if the second string is greater than the first, we set counter to length of str2
Str1SHORTER:
; this is the loop that finds and prints the hamming distance
;we find it by looping over the strings and taking the xor for each 2, then incrementing counter of ones for each xor == 1
theLoop:
Additional notes
Lengths are unsigned numbers. Use the unsigned conditions "above" and "below" (ja, jb) rather than the signed "greater" and "less" (jg, jl).
Talking about longer and shorter makes more sense for strings.
Don't use 3 jumps if a mere fall through in the code can do the job.
Your code in theLoop will probably use CX as a counter. Don't forget to zero CH. Either use 2 instructions like I did above, or use movzx cx, BL if you're allowed instructions beyond the original 8086 (movzx needs an 80386 or later).
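Bonus
mov si, offset str1+2 ; first character of str1
mov di, offset str2+2 ; first character of str2
mov al, 0 ; AL counts the differing positions
MORE: ; CX = number of characters to compare (must not be 0)
mov dl, [si]
cmp dl, [di]
je SAME
inc al
SAME: ; (EQ is a reserved operator in MASM, so it can't serve as a label)
inc si
inc di
loop MORE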
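The missing step 3 could be filled in along these lines. This is a sketch that assumes BL and BH still hold the two lengths as set up earlier in this answer; the labels HaveCount, Compare, NoDiff and Finished are hypothetical. It takes the shorter length as the counter and guards against a count of zero, because loop decrements CX before testing it and would otherwise run 65536 times.

```asm
        mov  cl, bl              ; start with the length of str1
        cmp  bl, bh
        jbe  HaveCount           ; str1 is shorter or equal: keep BL
        mov  cl, bh              ; otherwise str2 is the shorter one
HaveCount:
        mov  ch, 0               ; CX = min(BL, BH)
        mov  al, 0               ; AL = number of differing positions
        jcxz Finished            ; nothing to compare if a string is empty
        mov  si, offset str1+2   ; first character of str1
        mov  di, offset str2+2   ; first character of str2
Compare:
        mov  dl, [si]
        cmp  dl, [di]
        je   NoDiff
        inc  al                  ; characters differ at this position
NoDiff:
        inc  si
        inc  di
        loop Compare
Finished:
        ; AL now holds the count over the common length
```

Whether the extra characters of the longer string should also count towards the distance depends on the assignment; the OP's algorithm ignores them, and so does this sketch.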

display different attributes of character in a string in assembly

I would like to know if it's possible to change the attributes of each character in a string.
For example, in the string "hello" the character 'h' would have one color, 'e' another, and so on.
I use AH=06h (INT 21h) to output every character in the string, then AH=09h with INT 10h to change the attribute of each character, but it's not working.
I want to know how AL (for AH=09h) can get the character from DL (AH=06h) and change the attribute of every character.
Is this possible?
Thanks for the help.
Here's my code:
.DATA
hello DB "hello$"
.CODE
START:
MOV AX, @DATA
MOV DS, AX
LEA SI, hello
MOV CX, 0005H
E: MOV AH, 06H
MOV DL, [SI]
INC SI
;INT 21H
LOOP E
MOV CX, 0005H
MOV AH, 09H
MOV AL, [SI]
INC SI
MOV BL, 0001H
H: INT 10H
INC BL
LOOP H
MOV AX, 4C00H
INT 21H
END START
First off this code is not Windows, it's 16-bit DOS code that calls the BIOS video routines.
The main body calls INT 10H, the documentation for that call is here: http://en.wikipedia.org/wiki/INT_10H
For int 10H,9 this is the relevant line:
Write character and attribute at cursor position AH=09h AL = Character, BH = Page Number, BL = Color, CX = Number of times to print character
This means there are a couple of errors you're making:
You cannot use CX as a loop counter, because it's a parameter for the call.
The color goes into bl so don't hardcode that.
bh is the page number, but you're not setting bh anywhere.
Increasing bl and then later resetting it back to 1 will obviously fix it at 1.
You've already increased si through the whole length of the string in the first loop, so in the second loop you're reading past the end of the string (a classic buffer overrun). At the start of the second loop you need to repeat the lea.
Ever since the 80486 using loop is a bad idea because it's much slower than the equivalent sub reg,1; jnz label; besides loop is tied to the cx register which is awkward.
If you're using BIOS int calls, speed is hardly a requirement, but that's not the point.
If you want to learn x86 assembly, you should also learn not to use the old CISC instructions on new processors.
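Putting those points together, a minimal sketch might look like the following. It assumes MASM/TASM syntax, a color text mode, display page 0, and a BIOS that preserves BX, SI and DI across the calls. HELLO_LEN and NEXT_CHAR are hypothetical names, and using DI as the loop counter is just one way to keep CX free for the call. Because AH=09h does not move the cursor, the sketch reads the cursor position (AH=03h) and advances it by one column (AH=02h) after each character; it does not handle wrapping at the end of a line.

```asm
.MODEL SMALL
.STACK 100h
.DATA
hello     DB "hello"
HELLO_LEN EQU $ - hello        ; number of characters in the string
.CODE
START:
        MOV  AX, @DATA
        MOV  DS, AX
        LEA  SI, hello         ; SI -> current character
        MOV  DI, HELLO_LEN     ; loop counter, kept out of CX
        MOV  BH, 0             ; display page 0
        MOV  BL, 01h           ; attribute for the first character
NEXT_CHAR:
        MOV  AL, [SI]          ; character to write
        MOV  AH, 09h           ; write character and attribute at cursor
        MOV  CX, 1             ; write it once
        INT  10h
        MOV  AH, 03h           ; read cursor position: DH = row, DL = column
        INT  10h               ; (also returns the cursor shape in CX)
        INC  DL                ; move one column to the right
        MOV  AH, 02h           ; set cursor position from BH, DH, DL
        INT  10h
        INC  SI                ; next character
        INC  BL                ; next attribute
        DEC  DI
        JNZ  NEXT_CHAR
        MOV  AX, 4C00h         ; exit to DOS
        INT  21h
END START
```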

Printing a string in assembly using no predefined function

I have to define a function in assembly that will allow me to loop through a string of declared bytes and print them using a BIOS interrupt. I'm in 16 bit real mode. This is an exercise on writing a little bootloader from a textbook, but it seems that it was only a draft and it's missing some stuff.
I have been given the following code:
org 0x7c00
mov bx, HELLO_MSG
call print_string
mov bx, GOODBYE_MSG
call print_string
jmp $ ;hang so we can see the message
%include "print_string.asm"
HELLO_MSG:
db 'Hello, World!', 0
GOODBYE_MSG:
db 'Goodbye!', 0
times 510 - ($ - $$) db 0
dw 0xaa55
My print_string.asm looks like this:
print_string:
pusha
mov ah, 0x0e
loop:
mov al, bl
cmp al, 0
je return
int 0x10
inc bx
jmp loop
return:
popa
ret
I have some idea of what I'm doing, but the book doesn't explain how to iterate through something. I know how to do it in C, but this is my first time using assembly for something other than debugging C code. What happens when I boot through the emulator is that it prints out a couple of lines of gibberish and eventually hangs there for me to see my failure in all its glory. Hahaha.
Well, it looks like it loads the address of the string into the BX register before calling the function.
The actual function looks like it is trying to loop through the string, using BX as a pointer and incrementing it (inc bx) until it hits the ASCII NUL at the end of the string (cmp al, 0; je return)...
...but something is wrong. The "mov al, bl" instruction does not look correct, because that would move the low 8 bits of the address into al to be compared for an ASCII NUL, which does not make a lot of sense. I think it should be something more like "mov al, [bx]"; i.e. move the byte referenced by the BX address into the AL register -- although it has been a long time since I've worked with assembly so I might not have the syntax correct.
Because of that bug, the 10h interrupt would also be printing random characters based on the address of the string rather than the contents of the string. That would explain the gibberish you're seeing.
I think the issue is you cannot count on the int preserving any of your registers, so you need to protect them. Plus, what Steven pointed out regarding loading of your string address:
; Print the string whose address is in `bx`, segment `ds`
; String is zero terminated
;
print_string:
pusha
loop:
mov al, [bx] ; load what `bx` points to
cmp al, 0
je return
push bx ; save bx
mov ah, 0x0e ; load this every time through the loop
; you don't know if `int` preserves it
int 0x10
pop bx ; restore bx
inc bx
jmp loop
return:
popa
ret
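One more thing to be aware of: INT 10h function 0Eh also reads BH as the display page number, so walking the string with BX leaves part of the address in BH during the call. A variant that avoids this, sketched here under the assumption of NASM syntax, copies the pointer to SI and uses lodsb. The local labels .next_char and .done are hypothetical names, and the routine still expects the string address in BX and relies on DS being set up for the org 0x7c00 addresses.

```asm
; Print the zero-terminated string whose address is in bx (segment ds)
print_string:
        pusha
        mov  si, bx            ; lodsb reads through ds:si
        cld                    ; make lodsb move forward
.next_char:
        lodsb                  ; al = [ds:si], then si = si + 1
        test al, al            ; reached the terminating zero?
        jz   .done
        mov  ah, 0x0e          ; BIOS teletype output
        xor  bh, bh            ; display page 0
        int  0x10
        jmp  .next_char
.done:
        popa
        ret
```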
