Understanding NASM assembly for outputting characters in teletype mode

I am reading this wonderful script on operating system programming:
http://www.cs.bham.ac.uk/~exr/lectures/opsys/10_11/lectures/os-dev.pdf
On page 12 there is a simple bootloader.
If I understand correctly, the code shown is what you must write in NASM to get the BIOS to print out characters.
What I don't get is this:
It says that
we need interrupt 0x10 and to set ah to 0x0e (to indicate tele-type mode)
and al to the ASCII code of the character we wish to print.
But the first instruction is:
mov ah , 0x0e ;int 10/ ah = 0eh -> scrolling teletype BIOS routine
I don't understand the comment on that line. Why doesn't the first line of code say:
mov ah, 0x0e
int 0x10
if that's what you need to do?
Thanks for the help!

Although Chrono gave you an answer I'm not quite sure it answers your question. You seem to be asking why the comment says one thing and the code seemingly does another.
Base Prefixes and Suffixes
Decades ago a lot of reference material and some disassemblers used a slightly different default notation to represent decimal, hexadecimal, octal, and binary bases than you might see today. They specified the base as the last character (suffix) of the value. Common suffixes are:
b = binary    10101010b  (decimal 170)  base 2
d = decimal   170d       (decimal 170)  base 10
t = decimal   170t       (decimal 170)  base 10 (d and t both mean decimal)
h = hex       0AAh       (decimal 170)  base 16
o = octal     252o       (decimal 170)  base 8
If a number contains no alphabetic characters then it is assumed to be base 10 decimal. So this also applies:
(no suffix)   170        (decimal 170)  base 10
Most assemblers will accept most of these suffixes, but they also will support the base being defined as a prefix. If a value doesn't end with an alphabetic character but starts with a 0 followed by a letter then the letter denotes the base. Common prefix bases are:
b = binary    0b10101010  (decimal 170)  base 2
d = decimal   0d170       (decimal 170)  base 10
t = decimal   0t170       (decimal 170)  base 10 (d and t both mean decimal)
x = hex       0xAA        (decimal 170)  base 16 (NASM also accepts 0hAA)
o = octal     0o252       (decimal 170)  base 8
Most modern assemblers will support the forms specified as a prefix or suffix. Some assemblers may not support some of the prefixes and suffixes like t.
If you specify numbers with a prefix base then stick with prefixes throughout the whole file. If you specify numbers with a suffix base then stick with suffixes throughout the whole file. You can mix them up, but it is best to be consistent in a file.
Interpreting int 10/ ah = 0eh
What does this mean:
int 10/ ah = 0eh -> scrolling teletype BIOS routine
int 10 contains no letters so it is decimal 10 (or hexadecimal a).
0eh ends with a letter, so the suffix rule applies and h is the suffix (the e after the leading 0 is a hex digit, not a base prefix). h means hexadecimal. So 0eh is hexadecimal 0e (or decimal 14).
If you were to put that into assembler code for the BIOS it would look like (using hexadecimal suffix):
mov ah, 0eh ; Decimal 14
int 0ah ; Decimal 10. The 0 in front makes sure the assembler knows we don't mean register ah!
Using prefixes (hexadecimal in this example):
mov ah, 0xe ; Decimal 14
int 0xa ; Decimal 10
Or if you want to use decimal (no prefix or suffix):
mov ah, 14 ; Decimal 14
int 10 ; Decimal 10
But you may now be saying: "Hey, wait! That is wrong, because the BIOS video interrupt is 0x10 (or 16 decimal)." You are correct! We have just learned that the comment is wrong, or at best VERY ambiguous. The comment should have said:
int 10h / ah = 0eh -> scrolling teletype BIOS routine
You may wish to contact the author of the comment / code and let them know that their comment is inaccurate. The code they wrote is correct.
If the assembler supports them, I prefer prefixes like 0x, 0b, 0o instead of the suffixes h, b, o, because it is possible for some suffixed values to form register names or other identifiers and symbols. When using suffixes, if you have a value that would start with a letter (i.e. A to F in hexadecimal), add a 0 to the beginning to let the assembler know you are representing a value. As an example, AAh would have to be written as 0AAh, and Bh would have to be written as 0Bh.

The comment is just for context, stating that AH=0x0e because it denotes the scrolling teletype BIOS routine when invoking INT 0x10.
You could think of the int XXX instruction as an "execute function XXX" instruction, for simplicity. In this particular case, if you don't first load the AL register with a byte of your choosing, whatever byte happens to be in that register will be printed each time INT 0x10 executes. That's why AH is initially loaded with 0x0e for the scrolling teletype routine, and AL is then loaded each time with a byte to display before executing the INT 0x10 instruction.
As simple commented NASM code:
; AH=0x0e selects the scrolling teletype BIOS routine when used with int 10h.
mov ah, 0x0e
; AL is the byte to display.
mov al, 'H'
; Execute the scrolling teletype BIOS routine (AH=0x0e), displaying 'H' (AL='H').
int 0x10

Related

Character parsing for new variable

I got the character with the value:
('2'; 0x32 excluding the brackets).
I'd like to give another variable the value between the ' ', in this case the 2.
Let's say char j = '2'; /* 0x32 */
int i ;
I've tried:
i=j[1];
I want to take the second variable (2) within the character j, but it doesn't seem to work.
It depends on how you got this value, because in the ASCII / Unicode table, 0x32 (or 50 in decimal) and character 2 are the same.
Perhaps you have errors in how you get this value. If not, then int i = j; will help.

I don't know how to check if a string is symmetric or not in MIPS

I'm new to Stack Overflow. I've run into trouble and I need your help.
I'm a student and I need to write a MIPS program that checks whether a string is symmetric.
Sample symmetric strings: ana, asddsa, fillif, and so on.
This is my first piece of code, where I read the string into an array, but I got stuck at the symmetry part.
.data
array: .space 50 # char a[50];
.text
readText:
li $v0,8 # syscall 8: read string
la $a0,array # load the address of the buffer
li $a1,20 # maximum number of characters to read
move $t0,$a0 # save the string address in $t0
syscall
symmetry:
Please advise me on how I should start the symmetry part.
Thanks
Array references are done with pointer arithmetic. First we have to know the location of the variables string1 and i. Let's assume string1 is in $a0 and i is in $t0. We will need to add these two variables together. Whenever we do an arithmetic operation we have to send the result somewhere, and here the idea is to send the result to a new, as-yet-unused register, say $t1. ($a0 and $t0 in this scenario would be a bad place to send the result, since those registers hold values we'll need later on in the current or next iteration of the loop.)
add $t1, $a0, $t0
Next dereference that temporary pointer using lbu:
lbu $t2, 0($t1)
again targeting an otherwise unused register.
The C version using three address code would look like this:
char *p;
char ch;
p = string1 + i;
ch = *p;
Comparison is done with either the beq or bne instruction, both of which take two registers to compare (for equality or inequality, respectively) and a branch target in the form of a label.
We use conditional branches to skip ahead for if-then. The idea is to reason when to skip the then part — and when to skip is the opposite of the if condition as we would write it in C. In assembly: skip this if that, whereas in C: do this if that. Thus, the opposite condition is tested in assembly in order to skip around the then-part.

Assembler Intelx86: Comparing if I'm at the end of a string isn't working

I was doing a program in Assembler for Intelx86 (32 Bits, writing in Windows) where I have to cipher a string that I receive with _gets. I cipher by blocks of two chars, and if those two are the same, I need to add them to the Result String without having to change them. To iterate through the string I use EBX, like this:
cmp byte[userstring + ebx], 0
je EmptyBlock
For some reason, it works when it's only two characters, but when the string has 4 characters (All being exactly the same, ex "AAAA") it stops working. It enters an endless loop with the EBX increasing by 2 nonstop. Am I skipping something? Am I doing the wrong comparison? Sorry, I'm new to Assembler.

User input string and print in Assembly

I am required to get the user to input a string. It does not need to prompt the user in any way; it just expects them to type it in.
Here's what I have so far:
mov ah, 3fh ;3fh Reads the string and moves it to ah
int 21H ;Calls MS-DOS to input string
mov ah,9 ;Store interrupt code in ah to display string stored in dx
int 21h ;interrupt code
This is the output
first line user entered hello, second line it repeats what user entered, then random symbols
I'm not sure why all of those symbols appear after it.
This is required for a school worksheet, though I don't really understand what I am doing.
Random symbols are printed because int 21h with AH=09h prints a string up to the terminating $ character.
So you need to store a $ at [DS:DX + AX], where the input string ends.
(Why [DS:DX + AX]? Because int 21h with AH=3Fh returns the number of bytes read in AX.)

what is the syntax to define a string constant in assembly?

I am learning assembly, and I see two examples of defining a string:
msg db 'Hello, world!',0xa
what does the 0xa mean here?
message DB 'I am loving it!', 0
Why do we have a 0 here?
Is it a trailing null character?
Why do we have 0xa in the first example but 0 here? (They don't seem to be related to the string length.)
If the above examples are two ways of defining an assembly string, how can the program differentiate them?
Thanks ahead for any help :)
Different assemblers have different syntax, but in the case of the db directive they are pretty consistent.
db is an assembly directive that defines bytes with the given values at the place where the directive is located in the source. Optionally, a label can be assigned to the directive.
The common syntax is:
[label] db n1, n2, n3, ..., nk
where n1..nk are byte-sized numbers (0 to 0xff) or string constants.
Since an ASCII string consists of bytes, the directive simply places these bytes in memory, exactly like the other numbers in the directive.
Example:
db 1, 2, 3, 4
will allocate 4 bytes and will fill them with the numbers 1, 2, 3 and 4
string db 'Assembly', 0, 1, 2, 3
will be compiled to:
string: 41h, 73h, 73h, 65h, 6Dh, 62h, 6Ch, 79h, 00h, 01h, 02h, 03h
The character with ASCII code 0Ah (0xa) is the character LF (line feed) that is used in Linux as a new line command for the console.
The character with ASCII code 00h (0) is the NULL character that is used as an end-of-string mark in C-like languages (and probably in the OS API calls, because most OSes are written in C).
Appendix 1: There are several other assembly directives similar to DB in that they define data in memory, but with a different size. The most common are DW (define word), DD (define double word) and DQ (define quad word) for 16, 32 and 64 bit data. Depending on the assembler, they may accept only numbers, not strings (NASM, for instance, does accept character constants in DW and DD).
0 is a trailing null, yes. 0xa is a newline. They don’t define the same string, so that’s how you would differentiate them.
0xa stands for the hexadecimal value "A" which is 10 in decimal. The Linefeed control character has ASCII code 10 (Return has D hexadecimal or 13 decimal).
Strings are commonly terminated by a nul character to indicate their end.

Resources