MIPS, Number of occurrences in a string located in the stack

I have an exercise to solve in MIPS assembly. Some parts are clear to me, but I am having trouble writing its code. The exercise asks:
Write a program that reads a string from the keyboard, finds the character with the highest number of occurrences, and displays it.
How can I check all 26 letters and find which one has the highest number of occurrences?
Example:
Give me a string: Hello world!
The character with the higher occurrences is: l
Thanks a lot in advance for any answers.
P.S.
This is the first part of my program:
#First message
li $v0, 4
la $a0, mess
syscall
#Stack space allocated
addi $sp, $sp, -257
#Read the string
move $a0, $sp
li $a1, 257
li $v0, 8
syscall

Since this is your assignment I'll leave the MIPS assembly implementation to you. I'll just show you the logic for the code in a higher-level language:
// You'd keep these variables in some MIPS registers of your choice
int c, i, count, max_count = 0;
char max_char = 0;
// Iterate over all ASCII character codes
for (c = 0; c < 128; c += 1) {
    count = 0;
    // Count the number of occurrences of this character in the string
    for (i = 0; string[i] != 0; i += 1) {
        if (string[i] == c) count++;
    }
    // Was it greater than the current max?
    if (count > max_count) {
        max_count = count;
        max_char = c;
    }
}
// max_char now holds the ASCII code of the character with the highest number
// of occurrences, and max_count holds the number of times that character was
// found in the string.

@Michael, I saw that you answered before I posted; I just want to expand on it with a more detailed answer. If you edit your own to add some more explanations, then I will delete mine. I did not edit yours directly, because I was already half-way there when you posted. Anyway:
@Marco:
You can create a temporary array of 26 counters (initialized to 0).
Each counter corresponds to one letter (i.e. it holds the number of times that letter occurs). For example counter[0] holds the number of occurrences of letter 'a', counter[1] those of letter 'b', etc...
Then iterate over each character in the input character-sequence and for each character do:
a) Obtain the index of the character in the counter array.
b) Increase counter["obtained index"] by 1.
To obtain the index of the character you can do the following:
a) First make sure the character is lowercase, i.e. in 'a' to 'z' rather than 'A' to 'Z'. If it is uppercase, convert it.
b) Subtract the letter 'a' from the character. This way 'a'-'a' gives 0, 'b'-'a' gives 1, 'c'-'a' gives 2, etc...
I will demonstrate in C, because the exercise itself is meant to be solved in MIPS (the goal is for you to learn MIPS assembly):
#include <stdio.h>
int main()
{
    //Maximum length of string:
    int stringMaxLength = 100;
    //Create string in stack. Size of string is length+1 to
    //allow the '\0' character to mark the end of the string.
    char str[stringMaxLength + 1];
    //Read a line of at most stringMaxLength characters
    //(fgets also stores the trailing newline, which the loop below ignores):
    puts("Enter string:");
    if (fgets(str, stringMaxLength + 1, stdin) == NULL)
        return 1; //Nothing could be read.
    //Create array of counters in stack:
    int counter[26];
    //Initialize the counters to 0:
    int i;
    for (i=0; i<26; ++i)
        counter[i] = 0;
    //Main counting loop:
    for (i=0; str[i] != '\0'; ++i)
    {
        char tmp = str[i]; //Storing of str[i] in tmp, to write tmp if needed,
        //instead of writing str[i] itself. Optional operation in this particular case.
        if (tmp >= 'A' && tmp <= 'Z') //If the current character is upper:
            tmp = tmp + 32; //Convert the character to lower.
        if (tmp >= 'a' && tmp <= 'z') //If the character is a lower letter:
        {
            //Obtain the index of the letter in the array:
            int index = tmp - 'a';
            //Increment its counter by 1:
            counter[index] = counter[index] + 1;
        }
        //Else if the character is not a lower letter by now, we ignore it,
        //or we could inform the user, for example, or we could ignore the
        //whole string itself as invalid..
    }
    //Now find the maximum occurrences of a letter:
    int indexOfMaxCount = 0;
    int maxCount = counter[0];
    for (i=1; i<26; ++i)
        if (counter[i] > maxCount)
        {
            maxCount = counter[i];
            indexOfMaxCount = i;
        }
    //Convert the indexOfMaxCount back to the character it corresponds to:
    char maxChar = 'a' + indexOfMaxCount;
    //Inform the user of the letter with maximum occurrences:
    printf("Maximum %d occurrences for letter '%c'.\n", maxCount, maxChar);
    return 0;
}
If you don't understand why I convert the upper letter to lower by adding 32, then read on:
Each character is stored in memory as an integer value, and when you do arithmetic on characters, you are really doing it on their corresponding numbers in the encoding table.
An encoding is just a table which matches letters with numbers.
For example 'a' corresponds to number 97 in the ASCII table.
For example 'b' corresponds to number 98 in the ASCII table.
So 'a'+1 gives 97+1=98, which is the character 'b'. They are all numbers in memory, and the difference is how you represent (decode) them. The same encoding table is also used for decoding, of course.
Examples:
printf("%c", 'a'); //Prints 'a'.
printf("%d", (int) 'a'); //Prints '97'.
printf("%c", (char) 97); //Prints 'a'.
printf("%d", 97); //Prints '97'.
printf("%d", (int) 'b'); //Prints '98'.
printf("%c", (char) (97 + 1)); //Prints 'b'.
printf("%c", (char) ( ((int) 'a') + 1 ) ); //Prints 'b'.
//Etc...
//All the casting in the above examples is just for demonstration,
//it would work without them also, in this case.

Related

Maximize number of substring such that no substring has characters from other substring

So I was asked an interesting question recently related to strings and substrings. I am still trying to find the most optimal answer to it. I'd prefer an answer in Java, though pseudo-code or any other language is fine as well.
The question is:
I am given a string S. I have to divide it into the maximum number of substrings (not subsequences) such that no substring contains a character which is present in another substring.
Examples:
1.
S = "aaaabbbcd"
Substrings = ["aaaa","bbb","c","d"]
2.
S = "ababcccdde"
Substrings = ["abab","ccc","dd","e"]
3.
S = "aaabbcccddda"
Substrings = ["aaabbcccddda"]
Will be really glad if I can get a solution which is better than O(n^2)
Thanks for the help.
It can be done in O(n) time.
The idea behind it is to predict where each substring will end. We know that if we read a char, then the last occurrence of this char must be in the same substring (otherwise the same char would appear in two distinct substrings).
Let's use abbacacd as example. Suppose we know the first and the last occurrences of every char in the string.
01234567
abbacacd (reading a at index 0)
- we know that our substring must be at least abbaca (last occurrence of a);
- the end of our substring will be the maximum of the last occurrences of
all the chars inside the substring itself;
- we iterate through the substring:
012345 (we found b at index 1)
abbaca substring_end = maximum(5, last occurrence of b = 2)
substring_end = 5.
012345 (we found b at index 2)
abbaca substring_end = maximum(5, last occurrence of b = 2)
substring_end = 5.
012345 (we found a at index 3)
abbaca substring_end = maximum(5, last occurrence of a = 5)
substring_end = 5.
012345 (we found c at index 4)
abbaca substring_end = maximum(5, last occurrence of c = 6)
substring_end = 6.
0123456 (we found a at index 5)
abbacac substring_end = maximum(6, last occurrence of a = 5)
substring_end = 6.
0123456 (we found c at index 6)
abbacac substring_end = maximum(6, last occurrence of c = 6)
substring_end = 6.
---END OF FIRST SUBSTRING---
01234567
abbacacd [reading d]
- the first and last occurrence of d is the same index.
- d is an atomic substring.
The O(n) solution is:
#include <algorithm>
#include <cstring>
#include <iostream>
#include <string>
using namespace std;
int main(){
    // pos[c][0] = first index of char c, pos[c][1] = last index of char c
    int pos[26][2];
    int index;
    memset(pos, -1, sizeof(pos));
    string s = "aaabbcccddda";
    for(int i = 0; i < s.size(); i++){
        index = s[i] - 'a';
        if(pos[index][0] == -1) pos[index][0] = i;
        pos[index][1] = i;
    }
    int substr_end;
    for(int i = 0; i < s.size(); i++){
        index = s[i] - 'a';
        if(pos[index][0] == pos[index][1]) cout<<s[i]<<endl;
        else{
            substr_end = pos[index][1];
            // Extend the end while chars inside the substring occur later on
            for(int j = i + 1; j < substr_end; j++){
                substr_end = max(substr_end, pos[s[j] - 'a'][1]);
            }
            cout<<s.substr(i, substr_end - i + 1)<<endl;
            i = substr_end;
        }
    }
}
You can do it with two passes. On the 1st you determine the max index of each character in the string. On the 2nd you keep track of the max index of each encountered character. If the max equals the current index you've reached the end of a unique substring.
Here's some Java code to illustrate:
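String s = "aaaabbbcd";
char[] c = s.toCharArray();
int[] max = new int[26];
// 1st pass: record the last index of each character
for(int i=0; i<c.length; i++) max[c[i]-'a'] = i;
// 2nd pass: m is the furthest last index seen so far, lm the start of the current substring
for(int i=0, m=0, lm=0; i<c.length;)
    if((m = Math.max(m, max[c[i]-'a'])) == i++)
        System.out.format("%s ", s.substring(lm, lm = i));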
Output:
aaaa bbb c d
And for the other 2 strings:
abab ccc dd e
aaabbcccddda
The accepted answer includes some unnecessary complexity in its implementation of the algorithm. It is very straightforward to divide a string (such as the examples posted by the OP in the question) into the maximum number of substrings such that no substring contains a character which is present in another substring.
Algorithm:
(Assumption: the input is a non-null string of one or more characters, all within 'a' to 'z' inclusive.)
1. Record the last position of each character of the input string.
2. Assume the first substring's end position is 0.
3. Iterate through the string and, for every character:
a) If the current character's last position is greater than the substring end position, then update the substring end position to that last position.
b) Add (or print) the current character as part of the current substring.
c) If the substring end position is equal to the position of the current character, this is the end of a unique substring, and a new substring starts from the next character.
4. Repeat step 3 until the end of the input string.
Implementation:
#include <stdio.h>
#include <string.h>
void unique_substr(const char * pst) {
    size_t ch_last_pos[26] = {0};
    size_t subst_end_pos = 0;
    size_t len = strlen(pst);
    printf ("%s -> ", pst);
    for (size_t i = 0; i < len; i++) {
        ch_last_pos[pst[i] - 'a'] = i;
    }
    for (size_t i = 0; i < len; i++) {
        size_t pos = ch_last_pos[pst[i] - 'a'];
        if (pos > subst_end_pos) {
            subst_end_pos = pos;
        }
        printf ("%c", pst[i]);
        if (subst_end_pos == i) {
            printf (" ");
        }
    }
    printf ("\n");
}
//Driver program
int main(void) {
    //base cases
    unique_substr ("b");
    unique_substr ("ab");
    //strings posted by OP in question
    unique_substr ("aaaabbbcd");
    unique_substr ("ababcccdde");
    unique_substr ("aaabbcccddda");
    return 0;
}
Output:
# ./a.out
b -> b
ab -> a b
aaaabbbcd -> aaaa bbb c d
ababcccdde -> abab ccc dd e
aaabbcccddda -> aaabbcccddda

why is my string being read this way?

I am trying to recreate atoi and I'm wondering why my function works. I ended up changing it to str[i] for the three top statements because it made sense to me, but it passed everything I threw at it.
i = 0;
result = 0;
negative = 1;
if (str[0] == '-')
{
    negative = -1;
    i++;
}
if (str[0] == '+')
    i++;
while (str[0] <= ' ')
    i++;
while (str[i] != '\0')
    if (str[i] >= '0' && str[i] <= '9')
    {
        result = result * 10 + str[i] - '0';
        ++i;
    }
return (result * negative);
The first if statement simply checks whether the number should be negative.
The second if statement checks whether the string starts with the optional plus sign.
The first while loop will run forever if the first character's ASCII value is less than or equal to that of a space (32), because it keeps testing str[0] while only i changes.
The second while loop walks through the string and converts the digits to an int. Note that it never terminates if it meets a non-digit character, because i is only incremented inside the if.
Changing the first three str[0] to str[i] shouldn't make much of a difference, since i has been initialised to 0. The exceptions are a string that starts with "-+", which shouldn't be a valid integer, and a string with whitespace between the sign and the digits.

Append Char To StringBuilder C++/CLI

I am trying to use StringBuilder to create the output that is being sent over the serial port for a log file. The output is stored in a byte array, and I am looping through it.
ref class UART_G {
public:
    static array<System::Byte>^ message = nullptr;
    static uint8_t message_length = 0;
};
static void logSend ()
{
    StringBuilder^ outputsb = gcnew StringBuilder();
    outputsb->Append("Sent ");
    for (uint8_t i = 0; i < UART_G::message_length; i ++)
    {
        unsigned char mychar = UART_G::message[i];
        if (
            (mychar >= ' ' && mychar <= 'Z') || //Includes 0-9, A-Z.
            (mychar >= '^' && mychar <= '~') || //Includes a-z.
            (mychar >= 128 && mychar <= 254)) //I think these are okay.
        {
            outputsb->Append(L""+mychar);
        }
        else
        {
            outputsb->Append("[");
            outputsb->Append(mychar);
            outputsb->Append("]");
        }
    }
    log_line(outputsb->ToString());
}
I want all plain text characters (eg A, :) to be sent as text, while functional characters (eg BEL, NEWLINE) will be sent like [7][13].
What is happening is that the StringBuilder, in all cases, is outputting the character as a number. For example, A is being sent out as 65.
For example, if I have the string 'APPLE' and a newline in my byte array, I want to see:
Sent APPLE[13]
Instead, I see:
Sent 6580807669[13]
I have tried every way imaginable to get it to display the character properly, including type-casting, concatenating it to a string, changing the variable type, etc... I would really appreciate if anyone knows how to do this. My log files are largely unreadable without this function.
You're getting the ASCII values because the compiler is choosing one of the Append overloads that takes an integer of some sort. To fix this, you could do an explicit cast to System::Char, to force the correct overload.
However, that won't necessarily give the proper results for 128-255. You could cast a value in that range from Byte to Char, and it'll give something, but not necessarily what you expect. First off, 0x80 through 0x9F are control characters, and wherever you're getting the bytes from might not intend the same representation for 0xA0 through 0xFF as Unicode has.
In my opinion, the best solution would be to use the "[value]" syntax that you're using for the other control characters for 0x80 through 0xFF as well. However, if you do want to convert those to characters, I'd use Encoding::Default, not Encoding::ASCII. ASCII only defines 0x00 through 0x7F; 0x80 and higher will come out as "?". Encoding::Default is whatever code page is defined for the language you have selected in Windows.
Combine all that, and here's what you'd end up with:
for (uint8_t i = 0; i < UART_G::message_length; i ++)
{
    unsigned char mychar = UART_G::message[i];
    if (mychar >= ' ' && mychar <= '~' && mychar != '[' && mychar != ']')
    {
        // Use the character directly for all ASCII printable characters,
        // except '[' and ']', because those have a special meaning, below.
        outputsb->Append((System::Char)(mychar));
    }
    else if (mychar >= 128)
    {
        // Non-ASCII characters, use the default encoding to convert to Unicode.
        outputsb->Append(Encoding::Default->GetChars(UART_G::message, i, 1));
    }
    else
    {
        // Unprintable characters, use the byte value in brackets.
        // Also do this for bracket characters, so there's no ambiguity
        // what a bracket means in the logs.
        outputsb->Append("[");
        outputsb->Append((unsigned int)mychar);
        outputsb->Append("]");
    }
}
You are receiving the ASCII values of the characters in the string.
See the ASCII chart:
65 = A
80 = P
80 = P
76 = L
69 = E
Just write a function that converts an ASCII value to a string.
Here is the code I came up with which resolved the issue:
static void logSend ()
{
    StringBuilder^ outputsb = gcnew StringBuilder();
    ASCIIEncoding^ ascii = gcnew ASCIIEncoding;
    outputsb->Append("Sent ");
    for (uint8_t i = 0; i < UART_G::message_length; i ++)
    {
        unsigned char mychar = UART_G::message[i];
        if (
            (mychar >= ' ' && mychar <= 'Z') || //Includes 0-9, A-Z.
            (mychar >= '^' && mychar <= '~') || //Includes a-z.
            (mychar >= 128 && mychar <= 254)) //I think these are okay.
        {
            outputsb->Append(ascii->GetString(UART_G::message, i, 1));
        }
        else
        {
            outputsb->Append("[");
            outputsb->Append(mychar);
            outputsb->Append("]");
        }
    }
    log_line(outputsb->ToString());
}
I still appreciate any alternatives which are more efficient or simpler to read.

Modifying strings in emu8086 assembly

I am currently working on an intro assignment for a computer architecture course, and I was asked to accomplish some string modifications. My question is not how to do it, but what I should be researching to be able to do it. Are there any functions that will make this easier, for example something like .reverse() in Java?
What I need to accomplish is getting string input from the user, reversing the letters (while reversing, numbers stay where they are), adding spaces whenever there is a vowel, and alternating the caps.
Example:
Input: AbC_DeF12
Output: f E d _ c B a 2 1
This is code I ripped from the lecture: http://pastebin.com/2E1UtGdD I put it in pastebin to avoid clutter. Anything used in this is fair game. (This code does have limitations though: it only supports ~9 characters and the looping doesn't work at the end of strings.)
I would look at it like this.
Generate a function on paper of how you want to achieve this. These are notes and only a starting point.
Loop from 0 to string length.
if(byte >= 'A' && byte <= 'Z') then byte -= 'A' - 'a'; /* convert to lower case */
else if(byte >= 'a' && byte <= 'z') then byte += 'A' - 'a'; /* convert to upper case */
/* Switch the letters only. */
a = 0; b = string length
Loop i from a to b. if((input >= 'A' && input <='Z') || (input >= 'a' && input <='z')) p = i
Loop j from b to a. if((input >= 'A' && input <='Z') || (input >= 'a' && input <='z')) q = j
c = input[i]; input[i] = input[j]; input[j] = c;
/* Regenerate the string and add spaces. */
loop i, 0 to string length
if(input[i] == 'A' 'a' 'E' 'e' ...) string2[j] = ' '; j++; string2[j] = input[i]; j++;
i++
After that if you don't know 8086 I would look at examples online of how to do each individual part. The most important bit is generating the code in your head and on paper on how it is going to work.

Finding maximum substring that is cyclic equivalent

This is a problem from a programming contest that was held recently.
Two strings a[0..n-1] and b[0..n-1] are called cyclic equivalent if and only if there exists an offset d, such that for all 0 <= i < n, a[i] = b[(i + d) mod n].
You are given two strings s[0..L-1] and t[0..L-1] with the same length L. You need to find the maximum p such that s[0..p-1] and t[0..p-1] are cyclic equivalent. Print 0 if no such valid p exists.
Input
The first line contains an integer T indicating the number of test cases.
For each test case, there are two lines in total. The first line contains s. The second line contains t.
All strings contain only lower case alphabets.
Output
Output T lines in total. Each line should start with "Case #: " and followed by the maximum p. Here "#" is the number of the test case starting from 1.
Constraints
1 ≤ T ≤ 10
1 ≤ L ≤ 1000000
Example
Input:
2
abab
baba
abab
baac
Output:
Case 1: 4
Case 2: 3
Explanation
Case 1, d can be 1.
Case 2, d can be 2.
My approach:
For each length i, take the prefixes of S and T of length i, concatenate the prefix of S with itself, and check whether the prefix of T is a substring of that concatenation. If it is a substring, then update the maximum P to i.
#include <algorithm>
#include <cstdio>
#include <iostream>
#include <string>
using namespace std;
bool isCyclic( string s, string t ){
    string str = s;
    str.append(s);
    if( str.find(t) != string::npos )
        return true;
    return false;
}
int main(){
    string s, t;
    int t1,l, o=1;
    scanf("%d", &t1);
    while( t1-- ){
        cin>>s>>t;
        l = min( s.length(), t.length());
        int i, maxP = 0;
        for( i=1; i<=l; i++ ){
            if( isCyclic(s.substr(0,i), t.substr(0,i)) ){
                maxP = i;
            }
        }
        printf("Case %d: %d\n", o++, maxP);
    }
    return 0;
}
I know that this is not the most optimized approach for this problem, since I got Time Limit Exceeded. I have learned that the prefix function can be used to get an O(n) algorithm, but I don't know about the prefix function. Could someone explain the O(n) approach?
Contest link http://www.codechef.com/ACMKGP14/problems/ACM14KP3
