I have seen the following question come up in many interviews (not my own).
Given a string, you need to replace each space with two spaces.
You may assume that the string has enough room for the added spaces.
You need to do it in place; memory allocation is not allowed.
I don't understand how to implement this without overwriting letters.
There's not a lot of context in your question. Let's assume it's a programming interview and you are dealing with a low level language like C or assembler. Let's also assume that the string has a count and/or ends in a null, like 'this is a string\0\0\0\0'
I would scan the string from beginning to end and count the spaces; let's call that count C. Then I would work backward through the string one character at a time, moving each character forward by C positions. Each time a space is encountered, copy the space forward by C positions, subtract one from C, and then copy the space forward again by the new C positions. Stop when C is 0.
Here, nulls/unused positions are represented by a period.
this is a string.... C=3
this is a string..g.
this is a string.ng.
this is a stringing.
this is a strinring.
this is a stritring.
this is a strstring.
this is a st string.
this is a s  string. C=2
this is a a  string.
this is   a  string.
this iss  a  string. C=1
this iis  a  string.
this  is  a  string. C=0
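The backward pass traced above can be sketched as follows. This is only an illustration under some assumptions: the buffer is a Python bytearray standing in for a C char array, and the function name double_spaces_in_place and the explicit length parameter are made up for this example.

```python
SPACE = ord(" ")

def double_spaces_in_place(buf: bytearray, length: int) -> int:
    """Double every space in buf[0:length] without allocating a second buffer.

    Returns the new logical length. Assumes buf has at least one spare
    byte per space after the first `length` bytes.
    """
    # First pass: count the spaces (this is C in the description).
    c = 0
    for i in range(length):
        if buf[i] == SPACE:
            c += 1

    new_length = length + c
    if new_length > len(buf):
        raise ValueError("buffer too small for the doubled spaces")

    # Second pass: walk backward, moving each character forward by c.
    i = length - 1
    while c > 0:
        buf[i + c] = buf[i]
        if buf[i] == SPACE:
            # One fewer space remains to the left, so the shift shrinks;
            # write the duplicate space at the new offset.
            c -= 1
            buf[i + c] = SPACE
        i -= 1
    return new_length


buf = bytearray(b"this is a string\0\0\0\0")
n = double_spaces_in_place(buf, 16)
print(buf[:n].decode("ascii"))  # this  is  a  string
```

Because the write position i + c is never to the left of the read position i, no unread character is overwritten, and the loop can stop as soon as c reaches 0.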
After each space that is found, the rest of the string has to be shifted by one letter. To reduce the time spent on that shifting, I would use this approach:
Count the number of spaces. I will call this count c.
Shift the whole string c positions to the right (I'm assuming a left-to-right reading direction here).
Initialize a counter s for the spaces already duplicated, starting at 0.
Start a loop at offset c and run it until the end of the shifted string:
Copy the letter at the current position to position - c + s.
If the letter was a space, increment s and write a second space to position - c + s.
I'm not sure all the offsets are calculated correctly in my head, so correct them if needed. But because this is just an interview question, the idea is only to sketch a correct algorithm.
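For comparison, here is a rough sketch of the shift-first variant. It assumes that the second space belongs at position - c + s after s has been incremented, and it again uses a bytearray in place of a C buffer; the name double_spaces_shift_first is hypothetical.

```python
SPACE = ord(" ")

def double_spaces_shift_first(buf: bytearray, length: int) -> int:
    """Shift the text right by the number of spaces, then copy it back
    left to right, writing every space twice. Returns the new length."""
    c = 0
    for i in range(length):
        if buf[i] == SPACE:
            c += 1
    if length + c > len(buf):
        raise ValueError("buffer too small for the doubled spaces")

    # Shift the whole string c positions to the right (back to front,
    # so no unread byte is overwritten).
    for i in range(length - 1, -1, -1):
        buf[i + c] = buf[i]

    s = 0  # spaces duplicated so far
    for pos in range(c, length + c):
        ch = buf[pos]
        buf[pos - c + s] = ch
        if ch == SPACE:
            s += 1
            buf[pos - c + s] = SPACE  # the duplicate space
    return length + c
```

The write index pos - c + s never passes the read index pos, because s can never exceed c.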
A string of length N (N can be up to 10^5) is given, consisting only of 0s and 1s. We have to remove two substrings of length exactly K from the original string so as to maximize the number of consecutive 1s.
For example, suppose the string is 1100110001 and K=1.
So we can remove two substrings of length 1. The best option here is to remove the 0s at the 3rd and 4th positions and get 4 as the output (the new string will be 11110001).
If I try brute force, it will time out for sure. I don't know whether a sliding window will work or not. Can anyone give me a hint on how to proceed? I am not asking for the full answer, obviously; just some hints will do. Thanks in advance :)
This has a pretty straightforward dynamic programming solution.
For each index i, calculate:
(1) The length of the sequence of 1s that immediately precedes it, if nothing has been removed;
(2) The longest sequence of 1s that could immediately precede it, if exactly one substring is removed before it; and
(3) The longest sequence of 1s that could immediately precede it, if exactly two substrings are removed before it.
For each index, these three values are easily calculated in constant time from the values for earlier indexes, so you can do this in a single pass in O(N) time.
For example, let BEST(i,r) be the best length immediately preceding position i after removing r substrings. If i >= K, then you can remove a substring ending at i and have BEST(i,r) = BEST(i-K,r-1) for r > 0. If string[i-1] = '1' then you could extend the sequence from the previous position and have BEST(i,r) = BEST(i-1,r)+1. Choose the best possibility for each i,r.
The largest value you find in step (3) is the answer.
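One way the BEST(i,r) recurrence could be written out is sketched below. The function name, the use of -1 to mark states that cannot be reached, and the choice to keep the whole table in memory are assumptions of this sketch, not part of the answer above.

```python
def longest_ones_after_two_removals(s: str, k: int) -> int:
    """Longest run of 1s after removing exactly two length-k substrings.

    best[i][r] is the longest run of 1s that ends immediately before
    position i when exactly r substrings have been removed before i.
    Returns -1 if the string is too short to remove two substrings.
    """
    n = len(s)
    unreachable = -1
    best = [[unreachable] * 3 for _ in range(n + 1)]
    best[0][0] = 0

    for i in range(1, n + 1):
        for r in range(3):
            cand = unreachable
            # Option 1: keep character i-1. A '1' extends the run,
            # a '0' resets it to zero.
            prev = best[i - 1][r]
            if prev != unreachable:
                cand = prev + 1 if s[i - 1] == "1" else 0
            # Option 2: remove the k characters that end just before i.
            if r > 0 and i >= k and best[i - k][r - 1] != unreachable:
                cand = max(cand, best[i - k][r - 1])
            best[i][r] = cand

    # The answer is the largest value found with two removals.
    return max(row[2] for row in best)


print(longest_ones_after_two_removals("1100110001", 1))  # 4
```

Each table entry is filled in constant time, so the whole computation is O(N) in time and memory.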
Suppose I have a string, and I lengthen it by adding k more occurrences of each letter that appears in it (for example, if the original string is aabbbcc and k=1, then the new string will be aaabbbbccc). Can this change the Huffman tree of the string?
I have tried to find an example of a string for which this modification changes the tree, but so far I have failed.
If by "increase" you mean multiply by k, then no, the relative frequencies of the symbols don't change and the resulting Huffman code will not change. If by "increase" you mean add k, then if the original string did not have equal frequencies for the symbols, the relative frequencies would change, and it is likely that the Huffman code would change. (Not certain, since you could have been close to having equal frequencies.)
Update:
From the comments, you mean adding k occurrences. So yes, the Huffman code can change, if you're not already close to a flat distribution. It's easy to see that as k gets larger, you approach a flat distribution, as the original frequencies become insignificant compared to k.
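A small experiment can make this concrete. The sketch below computes Huffman code lengths from a frequency table with a heap; the frequencies 1, 2, 4, 8 and k = 10 are made-up values chosen so that no ties occur, and huffman_code_lengths is a hypothetical helper name.

```python
import heapq
from itertools import count

def huffman_code_lengths(freqs: dict) -> dict:
    """Return the Huffman code length of each symbol in freqs."""
    tie = count()  # unique tie-breaker so symbol lists are never compared
    heap = [(f, next(tie), [sym]) for sym, f in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node gets one bit deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (f1 + f2, next(tie), syms1 + syms2))
    return lengths


original = {"a": 1, "b": 2, "c": 4, "d": 8}
scaled = {sym: f * 10 for sym, f in original.items()}
shifted = {sym: f + 10 for sym, f in original.items()}

print(huffman_code_lengths(original))  # {'a': 3, 'b': 3, 'c': 2, 'd': 1}
print(huffman_code_lengths(scaled))    # {'a': 3, 'b': 3, 'c': 2, 'd': 1}
print(huffman_code_lengths(shifted))   # {'a': 2, 'b': 2, 'c': 2, 'd': 2}
```

Multiplying leaves the code lengths unchanged, while adding 10 to each count flattens the distribution enough that every symbol ends up with a 2-bit code. For aabbbcc with k=1 the lengths stay the same, which is consistent with the asker not finding a counterexample there.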
I learned that Swift strings cannot be indexed by integer values. I remember the rule and I follow it, but I've never fully understood the mechanics behind it.
The explanation from the official documentation is as follows:
"Different characters can require different amounts of memory to store, so in order to determine which Character is at a particular position, you must iterate over each Unicode scalar from the start or end of that String. For this reason, Swift strings cannot be indexed by integer values"
I've read it several times, but I still don't quite get the point. Can someone explain to me in a bit more detail why a Swift String cannot be indexed by integer values?
Many Thanks
A string is stored in memory as an array of bytes.
A given character can require 1 to 4 bytes for the basic code point, plus any number of combining diacritical marks.
For example, é requires 2 bytes in UTF-8.
Now, if you have the strings efgh and éfgh, to access the second character (f), for the first string, the character is in the byte array at index 1, for the second string, it is at index 2.
In order to know that, you need to inspect the first character. For accessing any character based on its index, you need to go through all the previous characters to know how many bytes each takes.
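The same effect can be shown outside Swift. The sketch below is in Python, not Swift, and makes two simplifying assumptions: the bytes are valid UTF-8, and a "character" means a single code point (a Swift Character is a grapheme cluster, which can span several code points). The function name byte_offset_of_char is made up for this example.

```python
def byte_offset_of_char(data: bytes, char_index: int) -> int:
    """Walk UTF-8 bytes from the start to find where the code point
    number char_index begins. Assumes data is valid UTF-8."""
    offset = 0
    for _ in range(char_index):
        lead = data[offset]
        # The lead byte tells how many bytes this code point occupies.
        if lead < 0x80:
            width = 1
        elif lead >> 5 == 0b110:
            width = 2
        elif lead >> 4 == 0b1110:
            width = 3
        else:
            width = 4
        offset += width
    return offset


plain = "efgh".encode("utf-8")
accented = "\u00e9fgh".encode("utf-8")  # precomposed é followed by fgh

print(byte_offset_of_char(plain, 1))     # 1
print(byte_offset_of_char(accented, 1))  # 2
```

Finding the n-th character needs a loop over everything before it, which is why the lookup costs O(n) rather than O(1).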
Hi there, I'm currently trying to write a MIPS program that takes a string entered by the user and bubble sorts it, with A at the front and Z last.
Right now I'm kind of confused on how I can compare each individual character in the string. So for example:
String: Stackoverflow
Compare S and T, the first two letters. Since S belongs in front, it stays and no swap happens.
How would I go about moving onto the next set of characters to compare so T and A would be the next set to compare.
I think I would use the lb (load byte) instruction, but I'm not entirely sure how to use the offset correctly.
Thanks for the help.
Just as a reminder: in the loop, you must check whether the current pointer has reached the last character (base pointer + length of the string - 1), or alternatively check whether the value at (current pointer + 1) is 0, the NUL string-terminating character.
Make sure you keep a copy of the base pointer somewhere (in register or memory).
In each loop, you will read the character currently pointed to by the current pointer by load byte at current pointer with offset 0, and read the next character by load byte at current pointer with offset 1. Then you can do the comparison and swapping. After that, you increase the current pointer by 1 (since a character in ASCII is 1 byte, you will increase the address by 1 byte only).
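The pointer walk described above can be modelled in a higher-level language before writing the MIPS version. The sketch below is Python, not MIPS: an index into a bytearray plays the role of the current pointer, and buf[cur] and buf[cur + 1] correspond to lb with offsets 0 and 1. It assumes a NUL-terminated ASCII string in a single letter case; the name bubble_sort_bytes is hypothetical.

```python
def bubble_sort_bytes(buf: bytearray) -> None:
    """Bubble sort a NUL-terminated ASCII string in place."""
    base = 0  # keep a copy of the base pointer
    swapped = True
    while swapped:
        swapped = False
        cur = base  # restart from the beginning on every pass
        # Stop when the current or the next byte is the NUL terminator.
        while buf[cur] != 0 and buf[cur + 1] != 0:
            a = buf[cur]      # like: lb $t0, 0($cur)
            b = buf[cur + 1]  # like: lb $t1, 1($cur)
            if a > b:
                buf[cur], buf[cur + 1] = b, a  # like two sb stores
                swapped = True
            cur += 1  # one ASCII character is one byte


text = bytearray(b"STACKOVERFLOW\0")
bubble_sort_bytes(text)
print(text[:-1].decode("ascii"))  # ACEFKLOORSTVW
```

With mixed-case input, a plain byte comparison places all uppercase letters before lowercase ones, so a real program may want to normalize the case first.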
I have a list of images stored in a directory. They are all named. My GUI reads all the images and saves their names in a cell array. Now I have added an editable box in which the user can type a name, and the program will show that image. The problem is that I want the program to account for typos and misspellings and find the file name most similar to the word the user typed. Can you please help me?
Many Thanks,
Hamid
You should read the Wikipedia article "Approximate string matching" and look at "Calculation of distance between strings" on the MATLAB File Exchange.
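To illustrate the idea of a string distance, here is a minimal sketch in Python rather than MATLAB. It uses the Levenshtein edit distance, which is one common choice and not necessarily what the File Exchange submission computes; the function names and the sample file names are made up.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


def closest_name(user_input: str, names: list) -> str:
    """Pick the name with the smallest case-insensitive distance."""
    target = user_input.lower()
    return min(names, key=lambda name: edit_distance(target, name.lower()))


print(closest_name("Sunest", ["beach", "forest", "sunset"]))  # sunset
```

In practice it may be worth rejecting matches whose distance is above some threshold, so that a completely unrelated input does not silently open an arbitrary image.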
I think you should use the longest common subsequence algorithm to approximately compare strings.
Here is a matlab implementation:
http://www.mathworks.com/matlabcentral/fileexchange/24559-longest-common-subsequence
Afterwards, just do something like this:
[~,ind]=min(cellfun(@(x) LCS(lower(userInput),lower(x)), allFileNames));
chosenFile=allFileNames{ind};
(the function LCS is the longest common subsequence algorithm, and the function lower converts to lower case; if LCS returns a similarity score rather than a distance, use max instead of min)
Not exactly what you are looking for, but you can compare the first few characters of the strings ignoring case to find a close match. See the command strncmpi:
strncmpi Compare first N characters of strings ignoring case.
TF = strncmpi(S,C,N) performs a case-insensitive comparison between the
first N characters of string S and the first N characters in each element
of cell array C. Input S is a character vector (or 1-by-1 cell array), and
input C is a cell array of strings. The function returns TF, a logical
array that is the same size as C and contains logical 1 (true) for those
elements of C that are a match, except for letter case, and logical 0
(false) for those elements that are not. The order of the two input
arguments is not important.