I'm new to Stack Overflow. I've run into trouble and need your help.
I'm a student, and I need to write a MIPS program that checks whether a string is symmetric.
Sample symmetric strings: ana, asddsa, fillif, and so on.
This is my first piece of code, where I read a string into an array, but I'm stuck on the symmetry part.
.data
array: .space 50 # char a[50];
.text
readText:
li $v0,8 # input
la $a0,array # load the address of the buffer
li $a1,20 # maximum number of characters to read
move $t0,$a0 # save the string's address in $t0
syscall
symmetry:
Please give me some advice on how I should start the symmetry part.
Thanks
Array references are done with pointer arithmetic. First we have to know the location of the variables string1 and i. Let's assume string1 is in $a0 and i is in $t0. We will need to add these two variables together. Whenever we do an arithmetic operation we have to send the result somewhere, and here the idea is to send the result to a new, as-yet-unused register, say $t1. ($a0 and $t0 would be bad places to send the result, since those registers hold values we'll need later in the current or next iteration of the loop.)
add $t1, $a0, $t0
Next dereference that temporary pointer using lbu:
lbu $t2, 0($t1)
again targeting an otherwise unused register.
The C version using three address code would look like this:
char *p;
char ch;
p = string1 + i;
ch = *p;
Comparison is done with either the beq or bne instruction, both of which take two registers to compare (for equality or inequality, respectively) and a branch target in the form of a label.
We use conditional branches to skip ahead for if-then. The idea is to reason when to skip the then part — and when to skip is the opposite of the if condition as we would write it in C. In assembly: skip this if that, whereas in C: do this if that. Thus, the opposite condition is tested in assembly in order to skip around the then-part.
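Putting the pieces above together, here is a minimal C sketch of the whole symmetry check, written so that each line maps onto one or two MIPS instructions. The function name is_symmetric is hypothetical, and the sketch assumes that any trailing newline left by the read syscall has already been stripped from the string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true when s reads the same forwards and backwards. */
bool is_symmetric(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len == 0)
        return true;             /* an empty string is trivially symmetric */

    size_t i = 0;                /* index walking from the front */
    size_t j = len - 1;          /* index walking from the back  */
    while (i < j) {
        const char *p = s + i;   /* like: add $t1, $a0, $t0 */
        const char *q = s + j;
        char front = *p;         /* like: lbu $t2, 0($t1)   */
        char back = *q;
        if (front != back)       /* like: bne, branch away on the first mismatch */
            return false;
        i++;
        j--;
    }
    return true;
}
```

In assembly the loop test would be inverted in the same way as described above: branch out of the loop when i >= j, otherwise fall through into the body.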
I am reading this wonderful script on operating system programming:
http://www.cs.bham.ac.uk/~exr/lectures/opsys/10_11/lectures/os-dev.pdf
On Page 12 there is a simple bootloader.
If I understand correctly, the code shown is what you must write in NASM to get the BIOS to print out characters.
What I don't get is this:
It says that
we need interrupt 0x10 and to set ah to 0x0e (to indicate tele-type mode)
and al to the ASCII code of the character we wish to print.
But the first instruction is:
mov ah , 0x0e ;int 10/ ah = 0eh -> scrolling teletype BIOS routine
I don't understand the comment on that line. Why doesn't the first line of code say:
mov ah, 0xeh
int 0x10
if that's what you need to do?
Thanks for help!
Although Chrono gave you an answer I'm not quite sure it answers your question. You seem to be asking why the comment says one thing and the code seemingly does another.
Base Prefixes and Suffixes
Decades ago a lot of reference material and some disassemblers used a slightly different default notation to represent decimal, hexadecimal, octal, and binary bases than you might see today. They specified the base as the last character (suffix) of the value. Common suffixes are:
b = binary 10101010b (decimal 170) base 2
d = decimal 170d (decimal 170) \ both d and t mean base 10
t = decimal 170t (decimal 170) /
h = hex 0AAh (decimal 170) base 16
o = octal 252o (decimal 170) base 8
If a number contains no alphabetic characters then it is assumed to be base 10 decimal. So this also applies:
no alphabetic character 170 decimal 170
Most assemblers will accept most of these suffixes, but they also will support the base being defined as a prefix. If a value doesn't end with an alphabetic character but starts with a 0 followed by a letter then the letter denotes the base. Common prefix bases are:
b = binary 0b10101010 (decimal 170) base 2
d = decimal 0d170 (decimal 170) \ both d and t mean base 10
t = decimal 0t170 (decimal 170) /
x = hex 0xAA (decimal 170) base 16
o = octal 0o252 (decimal 170) base 8
Most modern assemblers will support the forms specified as a prefix or suffix. Some assemblers may not support some of the prefixes and suffixes like t.
If you specify numbers with a prefix base then stick with prefixes throughout the whole file. If you specify numbers with a suffix base then stick with suffixes throughout the whole file. You can mix them up, but it is best to be consistent in a file.
Interpreting int 10/ ah = 0eh
What does this mean:
int 10/ ah = 0eh -> scrolling teletype BIOS routine
int 10 contains no letters so it is decimal 10 (or hexadecimal a).
0eh ends with a letter and doesn't start with 0 and a base letter, so h is the suffix. h means hexadecimal. So 0eh is hexadecimal 0e (or decimal 14).
If you were to put that into assembler code for the BIOS it would look like (using hexadecimal suffix):
mov ah, 0eh ; Decimal 14
int 0ah ; Decimal 10. The 0 in front makes sure the assembler knows we don't mean register ah!
Using prefixes (hexadecimal in this example):
mov ah, 0xe ; Decimal 14
int 0xa ; Decimal 10
Or if you want to use decimal (no prefix or suffix):
mov ah, 14 ; Decimal 14
int 10 ; Decimal 10
But you may now be saying, "Hey, wait! That is wrong, because the BIOS video interrupt is 0x10 (or 16 decimal)." You are correct! We have just learned that the comment is wrong, or at best VERY ambiguous. The comment should have said:
int 10h / ah = 0eh -> scrolling teletype BIOS routine
You may wish to contact the author of the comment / code and let them know that their comment is inaccurate. The code they wrote is correct.
If the assembler supports them, I prefer prefixes like 0x, 0b, 0o to the suffixes h, b, o, because some suffixed values can form register names or other identifiers and symbols. When using suffixes, if you have a value that would start with a letter (i.e. A to F in hexadecimal), add a 0 to the beginning to let the assembler know you are representing a value. As an example, AAh would have to be written as 0AAh, and Bh would have to be written as 0Bh.
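To see why the base matters, here is a small C sketch that interprets the same digits in different bases using the standard strtol function. The wrapper name parse_in_base is hypothetical, and it takes bare digits without an assembler-style suffix.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: interpret a run of digits in the given base. */
long parse_in_base(const char *digits, int base)
{
    assert(digits != NULL);
    return strtol(digits, NULL, base);
}
```

With this, parse_in_base("10", 10) and parse_in_base("10", 16) give different values for the same digits, which is exactly the ambiguity in the comment's "int 10", while the digits 10101010 (base 2), 252 (base 8), 170 (base 10) and AA (base 16) all come out as the same number.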
The comment is just for context, stating that AH=0x0e because it denotes the scrolling teletype BIOS routine when invoking INT 0x10.
You could think of the int XXX instruction as an "execute function XXX" instruction for simplicity purposes. In this particular case, if you don't first load the AL register with a byte of your choosing, whatever byte is in that register will be printed each time INT 0x10 appears. That's why AH is initially loaded with 0x0e for the scrolling teletype routine, and AL is then loaded each time with a byte to display before executing the INT 0x10 instruction.
In some simplistic commented pseudocode:
#AH=0x0e is the scrolling teletype BIOS routine when used with int 10h.
AH := 0x0e
#AL is the byte to display.
AL := 'H'
#Execute the scrolling teletype BIOS routine (AH=0x0e), displaying 'H' (AL='H').
INT 0x10
I am trying to write a program in LC-3 assembly that counts the length of a string under the following conditions:
All data is already stored somewhere in memory.
There is a variable in which the address of the first element of the string is stored. (I apologize in advance for my lack of assembly knowledge in case this thing is not called "variable".)
The output (length of the string) must be stored at R0.
I made an attempt, but the results are disappointing. Here is my code:
.ORIG X3000
AND R0,R0,#0 ;R0 holds the output (length)
LEA R1,ZERO ;R1 always holds the address of an element of the string
LOOP LDR R2,R1,#0 ;R2 holds the contents of that address
BRZ FINISH ;if R2=0, then we have found the end of the string
ADD R0,R0,#1 ;if not, increase the length by 1
ADD R1,R1,#1 ;increase the address by one
BRznp LOOP
FINISH
HALT
ZERO .FILL x5000 ;I chose a random location. I don't even know how to store a string in memory to run this program.
.END ;do I need any ASCII-decimal transformation or something similar?
Actually, I guess that my program is a piece of garbage. This is the new version of my program. I assume that x0000 marks the end of the string. I am a total beginner at LC-3 assembly. How can I count that length?
To define a string you can use the .STRINGZ directive, which also places the terminating zero after it. You should use BRNZP because the assembler apparently doesn't like BRZNP. Other than that, your code works fine.
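For comparison, here is a rough C sketch of what the loop does, with the corresponding LC-3 instructions in the comments. The function name count_length is hypothetical, and the string is assumed to end with a zero, as .STRINGZ guarantees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the LC-3 loop: count until the zero terminator. */
size_t count_length(const char *s)
{
    assert(s != NULL);
    size_t length = 0;       /* R0: running count              */
    const char *p = s;       /* R1: address of current element */
    while (*p != '\0') {     /* LDR R2,R1,#0 then BRz FINISH   */
        length++;            /* ADD R0,R0,#1                   */
        p++;                 /* ADD R1,R1,#1                   */
    }
    return length;
}
```

No ASCII conversion is needed for this: the loop only tests whether each element is zero, not what character it represents.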
I am working on a project for school to find a substring in an array of characters. The assembly program is called from C and is passed three arguments: the array (a1), the starting index (a2), and the ending index (a3).
The current solution I have is as follows:
.global mysubstring
mysubstring:
stmfd sp!, {v1-v6, lr}
mov v1, a1 #copy array into a register
mov v2, a2 #copy starting index into a register
sub a1, a3, a2
add a1, a1, #1 #finds how many bytes to request from malloc
bl malloc #call c library routine to create a pointer to allocated space in a1
loop:
ldrbt v5, [v1], v2 #load byte located in v1, indexed by v2, into register v5
strbt a1, [v5] #store byte located in v5 into a1
add v2, v2, #1 #increment index
cmp v2, a3 #check if we've reached the end
ble loop
bl printf
ldmfd sp!, {v1-v6,pc}
.end
I'll admit that my understanding of assembly is very poor and I may have gotten the meaning of some things wrong.
Whenever I run the C program, the assembly program returns (null). I have a feeling it is because I'm not properly adding items into a1 (should I be?). I believe malloc is allocating the required space for my substring, so I suspect the error is in the loop.
If anyone could please give me some insight on where to go from here it would be most appreciated!
Thank you!
You appear to be using the wrong addressing mode:
ldrbt v5, [v1], v2 #load byte located in v1, indexed by v2, into register v5
That's the "register post-indexed" mode, which effectively does:
address = v1
v1 += v2
v5 = *address
Note that v1 will be modified after each load, since post-increment addressing implicitly writes back the indexed address.
What you want is register-offset addressing, which computes v1 + v2 without modifying v1. Note that ldrbt is only available in post-indexed form, so use plain ldrb:
ldrb v5, [v1, v2]
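As a point of reference, here is a minimal C sketch of what the routine is presumably meant to do. The name my_substring is hypothetical, and the extra byte for a terminating zero is my assumption (the assembly requests only end - start + 1 bytes); printing the result with %s would need that terminator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical C counterpart: copy text[start..end] (inclusive) into new storage.
   The caller is responsible for calling free() on the result. */
char *my_substring(const char *text, int start, int end)
{
    assert(text != NULL && start >= 0 && end >= start);
    size_t count = (size_t)(end - start) + 1;
    char *result = malloc(count + 1);   /* assumption: one extra byte for '\0' */
    if (result == NULL)
        return NULL;
    for (size_t k = 0; k < count; k++) {
        /* base plus index for the load; the base pointer itself never changes */
        result[k] = text[(size_t)start + k];
    }
    result[count] = '\0';
    return result;
}
```

Note that the load address is formed from an unchanged base plus an index on every iteration, and the byte is stored into the malloc'ed buffer rather than at the address held in the loaded value.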
Which is more efficient for the compiler and the best practice for checking whether a string is blank?
Checking whether the length of the string == 0
Checking whether the string is empty (strVar == "")
Also, does the answer depend on language?
Yes, it depends on language, since string storage differs between languages.
Pascal-type strings: Length = 0.
C-style strings: [0] == 0.
.NET: .IsNullOrEmpty.
Etc.
In languages that use C-style (null-terminated) strings, comparing to "" will be faster. That's an O(1) operation, while taking the length of a C-style string is O(n).
In languages that store length as part of the string object (C#, Java, ...) checking the length is also O(1). In this case, directly checking the length is faster, because it avoids the overhead of constructing the new empty string.
In languages that use C-style (null-terminated) strings, comparing to "" will be faster
Actually, it may be better to check if the first char in the string is '\0':
char *mystring;
/* do something with the string */
if ((mystring != NULL) && (mystring[0] == '\0')) {
/* the string is empty */
}
In Perl there's a third option, that the string is undefined. This is a bit different from a NULL pointer in C, if only because you don't get a segmentation fault for accessing an undefined string.
In .Net:
string.IsNullOrEmpty( mystr );
Strings can be null, so .Length sometimes throws a NullReferenceException.
For C strings,
if (s[0] == 0)
will be faster than either
if (strlen(s) == 0)
or
if (strcmp(s, "") == 0)
because you will avoid the overhead of a function call.
String.IsNullOrEmpty() only works on .NET 2.0 and above. For .NET 1.0/1.1, I tend to use:
if (inputString == null || inputString == String.Empty)
{
// String is null or empty, do something clever here. Or just explode.
}
I use String.Empty as opposed to "" because "" will create an object, whereas String.Empty won't. I know it's something small and trivial, but I'd still rather not create objects when I don't need them! (Source)
Assuming your question is .NET:
If you want to validate your string against null as well, use IsNullOrEmpty. If you already know that your string is not null, for example when checking TextBox.Text, you don't need IsNullOrEmpty, and that is where your question comes in.
In my opinion, checking String.Length costs less than a string comparison.
I even tested it (I also tested with C#, same result):
Module Module1
Sub Main()
Dim myString = ""
Dim a, b, c, d As Long
Console.WriteLine("Way 1...")
a = Now.Ticks
For index = 0 To 10000000
Dim isEmpty = myString = ""
Next
b = Now.Ticks
Console.WriteLine("Way 2...")
c = Now.Ticks
For index = 0 To 10000000
Dim isEmpty = myString.Length = 0
Next
d = Now.Ticks
Dim way1 = b - a, way2 = d - c
Console.WriteLine("way 1 took {0} ticks", way1)
Console.WriteLine("way 2 took {0} ticks", way2)
Console.WriteLine("way 1 took {0} ticks more than way 2", way1 - way2)
Console.Read()
End Sub
End Module
Result:
Way 1...
Way 2...
way 1 took 624001 ticks
way 2 took 468001 ticks
way 1 took 156000 ticks more than way 2
Which means the comparison takes noticeably longer than the string length check.
After I read this thread, I conducted a little experiment, which yielded two distinct, and interesting, findings.
Consider the following.
strInstallString "1" string
The above is copied from the locals window of the Visual Studio debugger. The same value is used in all three of the following examples.
if ( strInstallString == "" ) === if ( strInstallString == string.Empty )
Following is the code displayed in the disassembly window of the Visual Studio 2013 debugger for these two fundamentally identical cases.
if ( strInstallString == "" )
003126FB mov edx,dword ptr ds:[31B2184h]
00312701 mov ecx,dword ptr [ebp-50h]
00312704 call 59DEC0B0 ; On return, EAX = 0x00000000.
00312709 mov dword ptr [ebp-9Ch],eax
0031270F cmp dword ptr [ebp-9Ch],0
00312716 sete al
00312719 movzx eax,al
0031271C mov dword ptr [ebp-64h],eax
0031271F cmp dword ptr [ebp-64h],0
00312723 jne 00312750
if ( strInstallString == string.Empty )
00452443 mov edx,dword ptr ds:[3282184h]
00452449 mov ecx,dword ptr [ebp-50h]
0045244C call 59DEC0B0 ; On return, EAX = 0x00000000.
00452451 mov dword ptr [ebp-9Ch],eax
00452457 cmp dword ptr [ebp-9Ch],0
0045245E sete al
00452461 movzx eax,al
00452464 mov dword ptr [ebp-64h],eax
00452467 cmp dword ptr [ebp-64h],0
0045246B jne 00452498
if ( strInstallString == string.Empty ) Isn't Significantly Different
if ( strInstallString.Length == 0 )
003E284B mov ecx,dword ptr [ebp-50h]
003E284E cmp dword ptr [ecx],ecx
003E2850 call 5ACBC87E ; On return, EAX = 0x00000001.
003E2855 mov dword ptr [ebp-9Ch],eax
003E285B cmp dword ptr [ebp-9Ch],0
003E2862 setne al
003E2865 movzx eax,al
003E2868 mov dword ptr [ebp-64h],eax
003E286B cmp dword ptr [ebp-64h],0
003E286F jne 003E289C
From the above machine code listings, generated by the NGEN module of the .NET Framework, version 4.5, I draw the following conclusions.
Testing for equality against the empty string literal and against the static string.Empty property on the System.String class is, for all practical purposes, identical. The only difference between the two code snippets is the source of the first move instruction, and both are offsets relative to ds, implying that both refer to baked-in constants.
Testing for equality against the empty string, as either a literal or the string.Empty property, sets up a two-argument function call, which indicates inequality by returning zero. I base this conclusion on other tests that I performed a couple of months ago, in which I followed some of my own code across the managed/unmanaged divide and back. In all cases, any call that required two or more arguments put the first argument in register ECX, and the second in register EDX. I don't recall how subsequent arguments were passed. Nevertheless, the call setup looked more like __fastcall than __stdcall. Likewise, the expected return values always showed up in register EAX, which is almost universal.
Testing the length of the string sets up a one-argument function call, which returns 1 (in register EAX), which happens to be the length of the string being tested.
Given that the immediately visible machine code is almost identical, the only reason that I can imagine that would account for the better performance of the string equality over the string length reported by Shinny is that the two-argument function that performs the comparison is significantly better optimized than the one-argument function that reads the length off the string instance.
Conclusion
As a matter of principle, I avoid comparing against the empty string as a literal, because the empty string literal can appear ambiguous in source code. To that end, my .NET helper classes have long defined the empty string as a constant. Though I use string.Empty for direct, inline comparisons, the constant earns its keep for defining other constants whose value is the empty string, because a constant cannot be assigned string.Empty as its value.
This exercise settles, once and for all, any concern I might have about the cost, if any, of comparing against either string.Empty or the constant defined by my helper classes.
However, it also raises a puzzling question to replace it: why is comparing against string.Empty more efficient than testing the length of the string? Or is the test used by Shinny invalidated by the way the loop is implemented? (I find that hard to believe, but, then again, I've been fooled before, as I'm sure you have, too!)
I have long assumed that System.String objects were counted strings, fundamentally similar to the long-established Basic String (BSTR) that we know from COM.
In Java 1.6, the String class has a new method, isEmpty.
There is also the Jakarta Commons library, which has the isBlank method. Blank is defined as a string that contains only whitespace.
Actually, IMO the best way to determine is the IsNullOrEmpty() method of the string class.
http://msdn.microsoft.com/en-us/library/system.string.isnullorempty.
Update: I assumed .Net, in other languages, this might be different.
In this case, directly checking the length is faster, because it avoids the overhead of constructing the new empty string.
@DerekPark: That's not always true. "" is a string literal so, in Java, it will almost certainly already be interned.
@Nathan
Actually, it may be better to check if the first char in the string is '\0':
I almost mentioned that, but ended up leaving it out, since calling strcmp() with the empty string and directly checking the first character in the string are both O(1). You basically just pay for an extra function call, which is pretty cheap. If you really need the absolute best speed, though, definitely go with a direct first-char-to-0 comparison.
Honestly, I always use strlen() == 0, because I have never written a program where this was actually a measurable performance issue, and I think that's the most readable way to express the check.
Again, without knowing the language, it's impossible to tell.
However, I recommend that you choose the technique that makes the most sense to the maintenance programmer that follows and will have to maintain your work.
I'd recommend writing a function that explicitly does what you want, such as
#define IS_EMPTY(s) ((s)[0]==0)
or comparable. Now there's no doubt about what you're checking.
If strings in a language have an internal length property, checking the length is faster because it is an integer comparison of the property value with zero. However, when strings have no such property, the length must be determined the moment you want to test for it, and this is very fast for a string that actually has length zero, but can take a very long time if the string is huge, which you cannot know in advance, because if you knew it had size zero, why do you need to check that at all?
If strings don't store their lengths internally, comparing against an empty string is usually faster: even if the two strings are compared character by character, the loop is certain to terminate after the very first iteration, so the time is constant (O(1)) and does not depend on string length. However, even if strings do have an internal length property, comparing them to an empty string may be as fast as checking that property, because most implementations do just that. They first compare the lengths, and if those differ they skip comparing characters entirely, since strings of different lengths cannot be equal to begin with. If the lengths do match, they usually check for the special case 0 and again skip the character comparison loop.
And if a language does offer an explicit way to check whether a string is empty, always use that, since whichever way is faster is presumably the way that check works internally. E.g. to check for an empty string in a shell script, you could use
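As an illustration of the length-first comparison described above, here is a minimal C sketch of a counted string. The struct and function names are hypothetical and do not correspond to any particular language runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: the length is stored next to the characters. */
struct counted_string {
    size_t length;
    const char *chars;   /* not required to be NUL-terminated */
};

/* Emptiness check via the stored length: one integer comparison. */
bool counted_is_empty(const struct counted_string *s)
{
    assert(s != NULL);
    return s->length == 0;
}

/* Equality that looks at the lengths before touching any characters. */
bool counted_equals(const struct counted_string *a,
                    const struct counted_string *b)
{
    assert(a != NULL && b != NULL);
    if (a->length != b->length)
        return false;            /* different lengths can never be equal */
    if (a->length == 0)
        return true;             /* both empty: skip the character loop  */
    return memcmp(a->chars, b->chars, a->length) == 0;
}
```

Comparing a non-empty string against an empty one with counted_equals returns after the first length test, so in this design it costs about the same as counted_is_empty plus a function call.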
[ "$var" = "" ]
which would be string comparison. Or you could use
[ ${#var} -eq 0 ]
which uses string length comparison. Yet the most efficient way is in fact
[ -z "$var" ]
as the -z operation only exists for exactly this purpose.
C is special in that regard, as the string internals are exposed (strings are not encapsulated objects in C). While C strings don't have a length property and their length must be determined each time it is needed, it is very easy to check whether a C string is empty by just checking whether its first character is NUL, as NUL is always the last character in a C string. So nothing can beat:
char * string = ...;
if (!*string) { /* empty */ }
(Note that in C, *string is the same as string[0], !x is the same as x == 0, and 0 is the same as '\0', so you could have written string[0] == '\0', but for the compiler that's exactly the same as what I've written above.)