How do I find unique characters in a user's input?

This algorithm creates a shortened string by taking each unique character in the message, in the order the characters first appear, and putting the number of times it appears in the original message, followed by that letter, into the shortened string. The algorithm should ignore any spaces in the message, and any characters which it has already put into the shortened string. For example, the string "I will arrive in Mississippi really soon" becomes "8i1w4l2a3r1v2e2n1m5s2p1y2o".
Here's my code for determining how many unique characters there are. I'm having trouble creating the nested loop to scan the whole string. Any help would be appreciated!
boolean used = false;
for (int j = 0; j<i; j++){
    if (input.substring(j,j+1).equals(ltr)){
        used = true;
    }
}
if (!used){
    num++;
    int count = 0;
    for(int k=i; k<input.length(); k++){
        if(input.substring(k,k+1).equals(ltr))
            count++;
    }
}

I'm not sure, but your nested loop may not be set up correctly.
Are you actually using a nested loop?
Your code is structured like this: for(){} for(){}
and not like this: for(){ for(){ }}

Your program only scans the current character and the next character in position to decide whether it is unique or not; that's the problem.
The problem is exactly here:
if (input.substring(j,j+1).equals(ltr)){
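
To make the nesting concrete, here is a minimal sketch of the whole algorithm with both scans placed inside one outer loop over the message. The class name MessageShortener and the method shorten are made up for illustration, and folding the message to lower case is an assumption inferred from the example output in the question.

```java
class MessageShortener {
    // Builds the shortened string: for each character in order of first
    // appearance, append its total count followed by the character itself.
    static String shorten(String message) {
        // Assumption: case is ignored, as the question's example output suggests.
        String input = message.toLowerCase();
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char ltr = input.charAt(i);
            if (ltr == ' ') {
                continue; // spaces are ignored
            }
            // First inner loop: has this character appeared before position i?
            boolean used = false;
            for (int j = 0; j < i; j++) {
                if (input.charAt(j) == ltr) {
                    used = true;
                    break;
                }
            }
            if (!used) {
                // Second inner loop: count occurrences from position i onward.
                int count = 0;
                for (int k = i; k < input.length(); k++) {
                    if (input.charAt(k) == ltr) {
                        count++;
                    }
                }
                result.append(count).append(ltr);
            }
        }
        return result.toString();
    }
}
```

Calling MessageShortener.shorten("I will arrive in Mississippi really soon") should give the string from the question. The two inner loops are the same as in the asker's fragment; the only structural change is the outer loop that supplies i and ltr.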

Related

Blueprism unable to match two data items that are the same

I have an object which is trying to determine if a value it reads from on screen is the same as that passed to the object. This is a validation step and it doesn't appear to recognize them when they are the same. I have also tried trimming and lowering both values. I have also tried Test Regex Match.
Is there any way that I can get the object to recognize that they are the same, or is there a way for me to find out why they are not matching?

That's strange. If the direct comparison fails even after trimming, and the regex match fails too, there is probably something wrong with some of the characters. My guess would be the spaces. Have you seen this behaviour on values without spaces as well?
Anyway, I would probably build a C# code stage like this, that accepts txt (string) and outputs col (collection):
col = new DataTable();
col.Columns.Add("Pos", typeof(decimal));
col.Columns.Add("Char", typeof(string));
col.Columns.Add("CharNum", typeof(decimal));
char[] arr = txt.ToCharArray();
for (int i = 0; i < arr.Length; i++)
{
    DataRow row = col.NewRow();
    row["Pos"] = i;
    row["Char"] = arr[i];
    row["CharNum"] = (int)arr[i];
    col.Rows.Add(row);
}
The result is a collection with one row per character: its position (Pos), the character itself (Char), and its numeric code (CharNum).
Run the code stage on both of your values and see if there is a visible discrepancy.
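
Outside Blue Prism, the same diagnostic idea can be sketched in plain Java: list the numeric code of every character and find the first position where two values differ. The class CharDump and its methods are hypothetical names used only for this illustration; it is not a Blue Prism code stage.

```java
class CharDump {
    // Returns the numeric code of each character, in order.
    static int[] charCodes(String txt) {
        int[] codes = new int[txt.length()];
        for (int i = 0; i < txt.length(); i++) {
            codes[i] = (int) txt.charAt(i);
        }
        return codes;
    }

    // Returns the index of the first differing character, the shorter
    // length if one value is a prefix of the other, or -1 if both are equal.
    static int firstDifference(String a, String b) {
        int common = Math.min(a.length(), b.length());
        for (int i = 0; i < common; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        return a.length() == b.length() ? -1 : common;
    }
}
```

For instance, a regular space has code 32 while a non-breaking space has code 160, so two values that look identical on screen can still differ at that position.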

The solution was to use a Remove Non Word Characters Action in Utility Strings.

Check if a substring exists at the beginning, middle and end of a string while allowing intersections

It sounds easy: you can simply iterate and check. The problem here is optimization: don't do any needless checks, create needless objects, or perform needless operations.
The algorithm will be tested against a huge set of test cases to verify its efficiency.
Examples:
"aaaa" contains "aa" at the beginning, middle and end.
"baabaabaaaabbaab" contains "baab" at the beginning, middle and end. Note the intersection.
And one more thing I forgot to say:
You are not given the substring to check for; you need to find out whether such a substring exists. If it doesn't, return false; if it does, return true.
Find the longest substring satisfying those conditions and return it, or print it (your choice).
A simple Boolean function, right?
Update:
The substring needs to be at least 2 characters shorter than the main string.
Sorry, it was my mistake in the "aaa" example, I fixed it.

You can solve it with KMP, a string matching algorithm. Use it to generate an array fail[]:
fail[i] = max {k | k < i and S[1:k] == S[i-k+1:i]}
Then you can enumerate all possible values along the chain from fail[n] (fail[n], fail[fail[n]], fail[fail[fail[n]]], ...) and check whether each one also occurs in the middle.
The complexity is O(n).
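
The idea above can be sketched in Java as follows. This is one possible reading of the answer, not its author's code: the class and method names are invented, indices are 0-based rather than 1-based, and "in the middle" is taken to mean an occurrence that neither starts at the first character nor ends at the last one.

```java
class BorderInMiddle {
    // KMP failure function: fail[i] is the length of the longest proper
    // prefix of s[0..i] that is also a suffix of s[0..i].
    static int[] failureFunction(String s) {
        int n = s.length();
        int[] fail = new int[n];
        for (int i = 1; i < n; i++) {
            int k = fail[i - 1];
            while (k > 0 && s.charAt(i) != s.charAt(k)) {
                k = fail[k - 1];
            }
            if (s.charAt(i) == s.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    // Returns the longest string that is a prefix, a suffix, and also
    // occurs strictly inside s; returns "" if there is none.
    static String longestPrefixMiddleSuffix(String s) {
        int n = s.length();
        if (n < 3) {
            return "";
        }
        int[] fail = failureFunction(s);
        int k = fail[n - 1]; // longest prefix that is also a suffix
        if (k == 0) {
            return "";
        }
        // If some earlier position has the same value, the prefix of
        // length k also ends there, i.e. it occurs in the middle.
        for (int i = 1; i <= n - 2; i++) {
            if (fail[i] == k) {
                return s.substring(0, k);
            }
        }
        // Otherwise step down the chain once: the next shorter candidate
        // ends at position k-1, which is always inside the string.
        return s.substring(0, fail[k - 1]);
    }
}
```

Both passes are linear in the length of the input, and any result is automatically at least 2 characters shorter than the input.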

Let's jump the shark:
function the_best_match_at_the_beginning_the_middle_and_the_end( s ){
    print( s );
    return true;
}

This is one of those "you might do significantly better in terms of theoretical complexity, but in reality a linear operation is always faster" answers:
Assume in is your input string, pattern is what you're looking for, and you're able to read or look up C-standard-library-style functions like strncmp. Let l_in be the number of characters in the input and l_pattern the number of characters in the pattern.
Simply check the start explicitly (strncmp(in, pattern, l_pattern)); then use a bog-standard linear search from the second letter on (strstr(in+1, pattern)):
If strstr didn't find anything, there's neither a middle match nor an end match.
If the match is at the end (the result of strstr is in+l_in-l_pattern), you've got no middle match.
If the match is not at the end, you've got a middle match. Manually check (strncmp(in+l_in-l_pattern, pattern, l_pattern)) for the end match.
Why is this faster? Because modern computers are heavily optimized for searching through data linearly; see Bjarne "C++" Stroustrup's "Why you should avoid linked lists". Simply put, letting your CPU run over a contiguous block of memory prefetched into a CPU cache is much, much faster than being "clever" about avoiding a few duplicate checks.
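
The same three checks can be sketched in Java with startsWith, indexOf and endsWith standing in for strncmp and strstr. Like the answer, this assumes the pattern is already known; the class and method names are made up for the example.

```java
class StartMiddleEnd {
    // True if pattern occurs at the start, at the end, and at least once
    // in between (overlapping the start or end occurrence is allowed).
    static boolean atStartMiddleAndEnd(String in, String pattern) {
        if (pattern.isEmpty() || !in.startsWith(pattern) || !in.endsWith(pattern)) {
            return false;
        }
        // Linear search from the second character on.
        int first = in.indexOf(pattern, 1);
        // A hit that is not the end occurrence itself is a middle match.
        return first != -1 && first < in.length() - pattern.length();
    }
}
```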

One clean way to approach this is to check every prefix of the input, starting from the shortest. For each prefix, check that it also appears at the end, and then check whether it appears in the middle. For the middle check, you can search the input string with its first and last characters removed.
public boolean subStrings(String input) {
    if (input == null || input.equals("")) {
        return false;
    }
    if (input.length() == 1) {
        System.out.println(input + " is a match!");
        return true;
    }
    boolean foundIt = false;
    String longestMatch = "";
    // Try every proper prefix; a later (longer) match overwrites an earlier one.
    for (int i = 1; i < input.length(); ++i) {
        String substring = input.substring(0, i);
        boolean endMatch = input.substring(input.length() - i, input.length()).equals(substring);
        // Middle check: search the input without its first and last characters.
        boolean midMatch = input.substring(1, input.length() - 1).contains(substring);
        if (endMatch && midMatch) {
            longestMatch = substring;
            foundIt = true;
        }
    }
    if (foundIt) {
        System.out.println(longestMatch + " is a match!");
        return true;
    }
    else {
        return false;
    }
}
subStrings("baabaabaaaabbaab");
Output:
baab is a match!

Can we skip indexes when we search a string in another string?

Consider searching a string for an exact match of another string. Is it safe to continue the search at the position where a partial match stopped matching, without getting wrong results?
In code:
int indexOf(string target, string search){
    for(int i=0; i + search.length <= target.length; i++){
        int f=0;
        for(; f < search.length && search[f] == target[i + f]; f++); //empty loop
        if(f == search.length) return i;
        i += f; //is it safe to do this without worrying about a missing match?
    }
    return -1; //not found
}
The thing to worry about is missing an exact match that starts inside the partial match (somewhere between i and i + f in the code above). But I couldn't think of any example that confirms the worry. Can you?

There are various string search algorithms.
I think the one you want is known as KMP (Knuth-Morris-Pratt).

Yes, you need to worry about it. An example of why is searching for the substring "ananas" in the string "anananas": the partial match starting at index 0 fails on its sixth character, and skipping past it misses the real match starting at index 2.
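
The counterexample can be checked with a small Java port of the loop from the question. The port keeps the i += f skip; the class and method names are invented, and the loop bound is assumed to be i + search.length() <= target.length().

```java
class SkipSearchDemo {
    // Naive search that jumps past a failed partial match, as in the question.
    static int indexOfWithSkip(String target, String search) {
        for (int i = 0; i + search.length() <= target.length(); i++) {
            int f = 0;
            while (f < search.length() && search.charAt(f) == target.charAt(i + f)) {
                f++;
            }
            if (f == search.length()) {
                return i;
            }
            i += f; // the skip under discussion
        }
        return -1;
    }
}
```

For "anananas" and "ananas" this returns -1, while "anananas".indexOf("ananas") returns 2. The skip is only safe when the length of the jump is derived from the pattern's own structure, which is what KMP's failure table provides.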

Word Break time complexity

I came across the word break problem which goes something like this:
Given an input string and a dictionary of words, segment the input string into a space-separated sequence of dictionary words if possible.
For example, if the input string is "applepie" and the dictionary contains a standard set of English words, then we would return the string "apple pie" as output.
I came up with a quadratic-time solution myself, and I have come across various other quadratic-time solutions using DP.
However, a user on Quora posted a linear-time solution to this problem.
I can't figure out how it comes out to be linear. Is there some mistake in the time complexity calculation? What is the best possible worst-case time complexity for this problem? I am posting the most common DP solution here:
String SegmentString(String input, Set<String> dict) {
    int len = input.length();
    for (int i = 1; i < len; i++) {
        String prefix = input.substring(0, i);
        if (dict.contains(prefix)) {
            String suffix = input.substring(i, len);
            if (dict.contains(suffix)) {
                return prefix + " " + suffix;
            }
        }
    }
    return null;
}

The 'linear' time algorithm that you linked works as follows:
If the string is sharperneedle and the dictionary is sharp, sharper, needle,
it first pushes sharp onto the list.
Then it sees that er is not in the dictionary, but that combining it with the last word added gives sharper, which is. Hence it pops the last element and pushes sharper instead.
IMO the above logic fails for the string eaterror and the dictionary eat, eater, error.
Here er pops eat from the list and pushes eater. The remaining string ror is not recognized and is discarded.
As for the code you posted: as mentioned in the comments, it works only for two words with a single partition point.
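
For comparison, here is a minimal sketch of a DP that handles any number of words, including the eaterror case. It is not the Quora algorithm and not the asker's code; the names WordBreak, segment and prev are made up. It performs a quadratic number of dictionary lookups, and each lookup builds a substring, so it makes no claim to linear time.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Set;

class WordBreak {
    // Returns the input split into dictionary words, or null if impossible.
    static String segment(String input, Set<String> dict) {
        int n = input.length();
        // prev[end] = start of the last word in some valid segmentation of
        // input[0..end), or -1 if that prefix cannot be segmented.
        int[] prev = new int[n + 1];
        Arrays.fill(prev, -1);
        prev[0] = 0; // the empty prefix is trivially segmentable
        for (int end = 1; end <= n; end++) {
            for (int start = 0; start < end; start++) {
                if (prev[start] != -1 && dict.contains(input.substring(start, end))) {
                    prev[end] = start;
                    break;
                }
            }
        }
        if (prev[n] == -1) {
            return null;
        }
        // Walk the back-pointers from the end to recover the words in order.
        Deque<String> words = new ArrayDeque<>();
        for (int end = n; end > 0; end = prev[end]) {
            words.addFirst(input.substring(prev[end], end));
        }
        return String.join(" ", words);
    }
}
```

With the dictionary eat, eater, error, the input eaterror is split as "eat error", because the table keeps both eat and eater as reachable prefixes instead of committing to one of them.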

Looping in a String to find Unicode characters is taking too much time

I am creating a custom field where I want to replace some Unicode characters with pictures, like emoticons on a BlackBerry device. My problem is with looping over the characters in the edit field and replacing the Unicode characters with images: when the text becomes too long, the loop takes too much time.
My code is as follows:
String aabb = "";
char[] chara = this.getText().toCharArray();
for (int i = loc; i < chara.length; i ++) {
    Character cc = new Character(chara[i]);
    aabb += cc.toString();
    if (unicodeCaracter) {
        //Get the location
        //draw the image in the appropriate X and Y
    }
}
This works fine, and the images end up in the right place. The problem is that when the text becomes large, the loop takes too much time and typing on the device becomes unfriendly.
How can I find the Unicode characters in a text without having to loop over all of it each time? Is there another way that I missed?
I need help with this issue. Thanks in advance.

Well you're creating a new Character and a new String in each iteration of the loop, and converting the string to a character array to start with. You're also using string concatenation in a loop rather than using a StringBuffer. All of these will be hurting performance.
It's not obvious what you mean by "Unicode characters" here - all characters in Java are Unicode characters. I suspect you really want something like:
String text = this.getText();
StringBuffer buffer = new StringBuffer(text.length());
for (int i = 0; i < text.length(); i++) {
    char c = text.charAt(i);
    buffer.append(c);
    if (c > 127) { // Or whatever
        // Take some action
    }
}
I'm assuming the "take some action" will be changing the buffer in some respect, otherwise the buffer is pointless of course... but fundamentally that's likely to be the sort of change you want.
The string concatenation in a loop is a particularly bad idea - see my article on it for more details.

What takes time is the string concatenation.
Strings are immutable in Java. Each time you do
aabb += cc.toString();
you create a new String object containing all the chars of the previous one, which must be garbage collected, plus the new ones. Use a StringBuilder to build your string:
StringBuilder builder = new StringBuilder(this.getText().length() + 100); // size estimation
char[] chara = this.getText().toCharArray();
for (int i = loc; i < chara.length; i++) {
    builder.append(chara[i]);
    if (unicodeCaracter) {
        //Get the location
        //draw the image in the appropriate X and Y
    }
}
String aabb = builder.toString();

Well, besides speeding up your loop, you could also try to minimize the workload.
If the user is appending text, you could store the last position you scanned previously and start from there.
On inserts and deletes you'd need to get the caret position and scan the inserted or deleted part, and maybe the surrounding characters (if character groups rather than single characters get replaced).
However, fixing the loop performance is likely to give you a bigger improvement in your case, as I doubt your strings will be long enough to make that algorithmic change worthwhile.
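
As a rough sketch of the "remember the last scanned position" idea, assuming text is only ever appended: the class IncrementalScanner and its members are hypothetical, and the c > 127 test is simply borrowed from the earlier answer as a placeholder for whatever counts as a special character.

```java
import java.util.ArrayList;
import java.util.List;

class IncrementalScanner {
    private int scannedUpTo = 0; // index of the first character not yet scanned
    private final List<Integer> specialPositions = new ArrayList<>();

    // Scans only the characters added since the previous call and returns
    // the positions of all special characters found so far.
    List<Integer> scanAppended(String text) {
        if (text.length() < scannedUpTo) {
            // The text shrank, so the cached state is stale: start over.
            scannedUpTo = 0;
            specialPositions.clear();
        }
        for (int i = scannedUpTo; i < text.length(); i++) {
            if (text.charAt(i) > 127) { // placeholder test, an assumption
                specialPositions.add(i);
            }
        }
        scannedUpTo = text.length();
        return specialPositions;
    }
}
```

Edits in the middle of the text are not handled here; they would need the caret-based rescan described above.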

The most important performance enhancements have already been stated but looping backwards will also help in BlackBerry apps.
Programming Tips: General Coding Tips
