Compare two strings and save the difference as an integer in Python - string

If I have a password, "rusty", and I input the sequence: "rusty123", "Rusty" and "rush" (which in turn is saved to a list newList), when I print out newList, how do I display a result that says:
rusty123, wrong by 3 characters
Rusty, wrong by 1 characters
rush, wrong by 2 characters
?
What I need to add is a function like (countDifference) that allows me to compare the right password 'rusty' with the wrong passwords entered. So if I enter 'rusty123', it should compare 'rusty' to 'rusty123' and save the result as an integer (3, because the password is off by 3 characters, i.e. 123 is wrong). I then convert this integer to a string and record it to the file newFile.
I think something that takes (password ='rusty') as a constant, then reads every line of new password input and compares it to 'rusty' will do the trick, but I just don't know how.
password = "rusty"
user_input = raw_input("Please enter the password")
so the user inputs: "Rusty" and the function reads that the password is wrong by 1 character, namely "R" - (should have been lower case)
SOLVED: If you have the same problem, follow the link that @Chris Beck provides at the end of his explanation. It solved this problem perfectly.

Is there a function that can help me determine by how many characters (in integers) the wrong password (entered as a string) was from the right password?
So, this could mean a few different things. There are a few different notions of "by how many characters does this string differ from that string" that people use. Some of them are easier to program than others; if you are new to programming, you might not want to start with the most sophisticated versions.
"Distance" from string A to string B
Simplest:
At how many indices i does A[i] != B[i]? (If one string is longer than the other then count all those indices as mismatching also.)
That's the easiest one to implement. However it doesn't always give the most intuitive results. If I have
A = "abracadabra"
B = "abrarcadabra"
the distance of these strings is going to be 8, even though they are only off "by one letter".
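The simplest notion above could be sketched as follows. This is only a minimal illustration: the name count_difference is hypothetical (modelled on the countDifference mentioned in the question), and it assumes that every extra or missing trailing character counts as one mismatch.

```python
def count_difference(expected, attempt):
    """Count the indices at which the two strings disagree.

    Positions that exist in only one of the strings (because one string
    is longer than the other) are all counted as mismatches.
    """
    mismatches = sum(1 for a, b in zip(expected, attempt) if a != b)
    return mismatches + abs(len(expected) - len(attempt))


password = "rusty"
for attempt in ["rusty123", "Rusty", "rush"]:
    print("%s, wrong by %d characters" % (attempt, count_difference(password, attempt)))
```

For the passwords in the question this gives 3, 1 and 2, and for the abracadabra pair it gives 8.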
Harder: Edit Distance
Under the edit distance, A and B above would have distance 1. Edit distance means: how many single-character edits (insertions and deletions, plus substitutions in the common Levenshtein variant) would have to be performed to change A into B? Under some variations, a swap of two adjacent characters also counts as distance 1.
The usual way to compute edit distance is using dynamic programming. You can see examples of this here: Edit Distance in Python
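For reference, here is a minimal dynamic programming sketch of one common variant, the Levenshtein distance (insertions, deletions and substitutions, without the adjacent swap). The function name edit_distance is an assumption made for this example and is not taken from the linked answer.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b, computed row by row."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(edit_distance("abracadabra", "abrarcadabra"))
```

Only two rows of the table are kept at a time, so memory stays proportional to the length of b.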

Related

Find lexicographically smallest string with given hash value [Competitive Coding]

I encountered the following problem for which I couldn't quite find the appropriate solution.
The problem says for a given string having a specific hash value, find the lowest string (which is not the same as the given one) of the
same length and same hash value (if one exists). E.g. For the
following value mapping of alphabets: {a:0, b:1, c:2,...,z:25}
If the given string is: ady with hash value - 27. The
lexicographically smallest one (from all possible ones excluding the
given one) would be: acz
Solution approach I could think of:
I reduced the problem to Coin-Change problem and resorted to finding all possible combinations for the given sum. Out of all the obtained solutions, I sort them up and find the lowest (or the next smallest if the given string is smallest).
The problem however lies with finding all possible solutions (even in a DP approach) which might be inefficient for larger inputs.
My doubt is:
What solution strategy (possibly even Greedy) could give a better time complexity than above?
I cannot guarantee that this will give you a lower complexity, but a couple of things: 1. you don't need to check the whole space, just the space of strings with lexicographic value less than or equal to the given string. 2. you can formulate it as an integer programming problem:
Assuming your character space is the letters, and each letter is given its number index[0-25] so a corresponds to 0, b to 1 and so forth. let x_i be the number of letters in your string corresponding to index i. You can formulate your problem as:
min sum_i(wi*xi)
s.t. sum_i(ai*xi) = M
sum_i(xi) = n
sum_i(wi*xi) <= N
xi >= 0
xi integer
where wi = 26^i, ai is equal to hash(letter(i)), n is the number of letters of the original string, M is the hash value of the original string, and N is the value of sum_i(wi*xi) for the original string. This is an integer programming problem, so you can try plugging it into a solver. The original problem is very similar to the subset sum problem with a fixed subset size (where the hash values are the elements you are summing over, and the subset size is the length of the string), so you might also want to take a look at that, although as you will see from the answer it is a complicated problem.
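As an alternative to a solver, a greedy construction may be enough. The sketch below is not the integer program described above; it assumes the hash is simply the sum of the letter values (a=0 ... z=25), as in the ady = 0 + 3 + 24 = 27 example. Both function names are hypothetical.

```python
def smallest_with_sum(length, total):
    """Smallest string of `length` letters whose values sum to `total`, or None."""
    if not 0 <= total <= 25 * length:
        return None
    letters = []
    for pos in range(length):
        slots_left = length - pos - 1
        # Smallest value here that the remaining slots can still compensate for.
        value = max(0, total - 25 * slots_left)
        letters.append(chr(ord("a") + value))
        total -= value
    return "".join(letters)


def smallest_other(given):
    """Smallest string with the same length and letter sum as `given`, excluding `given`."""
    values = [ord(c) - ord("a") for c in given]
    best = smallest_with_sum(len(values), sum(values))
    if best != given:
        return best
    # `given` is already the smallest, so look for its lexicographic successor:
    # bump the rightmost letter that can grow by one, paid for by the suffix.
    suffix_sum = 0
    for i in range(len(values) - 1, -1, -1):
        if values[i] < 25 and suffix_sum > 0:
            tail = smallest_with_sum(len(values) - i - 1, suffix_sum - 1)
            return given[:i] + chr(ord(given[i]) + 1) + tail
        suffix_sum += values[i]
    return None  # no other string has this length and sum


print(smallest_other("ady"))
```

Each position is visited a constant number of times, so the running time is linear in the length of the string under these assumptions.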

String parsing in optimal way

Suppose I have a string as onehhhtwominusthreehhkkseveneightjnine
Now I want to parse this string to get the numbers out of it. For Example this string should return an array, [one,two,minusthree,seven,eight,nine].
The order of the Integers should be maintained.
Can anyone Please suggest an optimal way to do this parsing? Thanks.
(You haven't mentioned a programming language?)
I would probably search for "minus" and check the number(s) that follow it. Then search for "one", then "two", noting their indexes. This would provide enough information to map and output the results, and order, that you need.
Another option is to look at each character in order, comparing each to the 10 choices. I couldn't tell you which is the most efficient - I think it depends on the possible total string length. I'd probably write both and profile them.
If the string to search is not of inordinate length then I suspect that the second approach might be more efficient. This is because, as soon as you have a match, you can eliminate searching the following (known) length of characters.
That is, if you have "abceightd", once you discover the "e" and its "eight" you can skip four characters. You can also skip the a, b, and c anyway, as they are not the beginning character for any of the 10 choices.
I am assuming your choices are:
one, two, three, four, five, six, seven, eight, nine, minus
Assuming that a) you have access to regular expressions in your choice of programming language and b) your possible choices are as Andy G has assumed... then this regular expression can pick out the numbers grouped with their associated minus, if present:
/((?:minus)*(?:one|two|three|four|five|six|seven|eight|nine))/g
Applied to your example string using JavaScript's RegExp.exec(), for example, this extracts:
one
two
minusthree
seven
eight
nine
You could easily place a space after any minus matched if required. Does this help at all?
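The same pattern could be tried in Python roughly like this. It is only a sketch: the names NUMBER_WORDS and extract_numbers are made up for the example, and the word list is the one assumed above (one to nine, plus minus).

```python
import re

# Optional run of "minus" followed by exactly one number word.
NUMBER_WORDS = re.compile(
    r"(?:minus)*(?:one|two|three|four|five|six|seven|eight|nine)"
)


def extract_numbers(text):
    """Return the number words found in text, in their original order."""
    # findall scans left to right and returns non-overlapping matches.
    return NUMBER_WORDS.findall(text)


print(extract_numbers("onehhhtwominusthreehhkkseveneightjnine"))
# ['one', 'two', 'minusthree', 'seven', 'eight', 'nine']
```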

How can i use a string to determine the location of an object within a list?

Let's say i have a list of the alphabet
myList=["a","b","c"..."z"]
Now lets say we have a variable within a loop that takes out a random letter from the list. Obviously random is imported.
while True:
    ans=myList[random.randint(1,26)]
I want the user to be asked to take a guess at a letter so within the loop i add
guess=input('Take a guess at a letter from the alphabet')
The user will receive a clue on the whereabouts of the answer
print('The letter locates between x and x.')
Question. How can i determine the position of ans in myList so i can give two random values and perhaps assign them to variables, one below ans and one value over ans.
The range would always be random between these two values so ans is not always the median of the two values.
p.s. I would put the script together to give a better view of what it looks like, but unfortunately i find the formatting help very confusing, and highlighting pieces of code and pressing Ctrl+K does not work as simply as i expected.
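The position is the output of the random call, right?
You can save that to a variable before indexing into myList:
index = random.randint(0, 25)
ans = myList[index]
Note that randint includes both endpoints and list indices start at 0, so the valid range for 26 letters is 0 to 25. With randint(1,26), "a" is never picked and an index of 26 raises an IndexError.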
Use
myList.index(ans)
For this to work, ans must be in myList; otherwise index() raises a ValueError.
BTW this question is similar to Finding the index of an item given a list containing it in Python
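Putting the two suggestions together, the clue could be generated along these lines. This is a sketch under assumptions: pick_with_clue is a hypothetical helper, and the bounds are inclusive (the lower bound may equal the answer), because nothing lies strictly below "a" or above "z".

```python
import random
import string


def pick_with_clue(letters):
    """Pick a random letter plus a random lower and upper bound around it."""
    index = random.randrange(len(letters))          # valid positions: 0 .. len-1
    low = random.randint(0, index)                  # somewhere at or below the answer
    high = random.randint(index, len(letters) - 1)  # somewhere at or above the answer
    return letters[index], letters[low], letters[high]


my_list = list(string.ascii_lowercase)
ans, low_letter, high_letter = pick_with_clue(my_list)
print("The letter is located between %s and %s." % (low_letter, high_letter))
```

Because both bounds are drawn independently, the answer is usually not the midpoint of the range.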

compare a string to a cell array of strings in matlab and find the most similar

I have a list of images stored in a directory. They are all named. My GUI reads all the images and saves their names in a cell array. Now I have added an editable box where the user can type in a name and the program will show that image. The problem is that I want the program to take into account typos and misspellings by the user and find the file name most similar to the word the user typed. Can you please help me?
Many Thanks,
Hamid
You should read the Wikipedia article on approximate string matching and look at "Calculation of distance between strings" on the MATLAB File Exchange.
I think you should use the longest common subsequence algorithm to approximately compare strings.
Here is a matlab implementation:
http://www.mathworks.com/matlabcentral/fileexchange/24559-longest-common-subsequence
Afterwards, just do something like this:
[~,ind]=min(cellfun( @(x) LCS(lower(userInput),lower(x)), allFileNames));
chosenFile=allFileNames{ind};
(the function LCS is the longest common subsequence algorithm, and the function lower converts to lower case)
Not exactly what you are looking for, but you can compare the first few characters of the strings ignoring case to find a close match. See the command strncmpi:
strncmpi Compare first N characters of strings ignoring case.
TF = strncmpi(S,C,N) performs a case-insensitive comparison between the
first N characters of string S and the first N characters in each element
of cell array C. Input S is a character vector (or 1-by-1 cell array), and
input C is a cell array of strings. The function returns TF, a logical
array that is the same size as C and contains logical 1 (true) for those
elements of C that are a match, except for letter case, and logical 0
(false) for those elements that are not. The order of the two input
arguments is not important.

How will you sort strings in the following example?

So I have a list of strings:
{test,testertest,testing,tester,testingtest}
I want to sort it in descending order. How do you sort strings in general? Is it based on the length, or is it done character by character?
What would the result be for the example above?
Practically every language has a built-in sort function that orders strings lexicographically, which returns
['test','tester','testertest','testing','testingtest']
for your example. If I wanted this reversed, I would just say reversed(sorted(myList)) in Python and be done with it. If you look to your right you can see plenty of related questions that require a more specialized ordering method (for numbers, dates, etc.), but lexicographic order works on strings containing any kind of data.
Here’s how it works:
compare(string A, string B):
    if A and B are both non-empty:
        if A[0] == B[0]:
            // First letters are the same; compare by the rest
            return compare(A[1:], B[1:])
        else:
            // Compare the first letters by Unicode code point
            return compare(A[0], B[0])
    else:
        // They were equal till now; the shorter one shall be sorted first
        return compare(length of A, length of B)
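A runnable Python version of that pseudocode might look like the sketch below. It uses a loop instead of recursion and returns a negative, zero or positive number, which is an assumption about the comparator convention rather than something the pseudocode specifies.

```python
from functools import cmp_to_key


def compare(a, b):
    """Negative if a sorts before b, positive if after, zero if equal."""
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            # First differing character decides, by Unicode code point.
            return ord(ch_a) - ord(ch_b)
    # One string is a prefix of the other: the shorter one sorts first.
    return len(a) - len(b)


words = ["test", "testertest", "testing", "tester", "testingtest"]
print(sorted(words, key=cmp_to_key(compare), reverse=True))
# ['testingtest', 'testing', 'testertest', 'tester', 'test']
```

In practice the built-in sorted(words, reverse=True) gives the same order; the explicit comparator is only there to show the rule.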
I would sort it like this:
testingtest
testing
testertest
tester
test
Assuming C#
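using System;

string[] myStrings = {"test","testertest","testing","tester","testingtest"};
Array.Sort(myStrings);      // ascending, using the default string comparison
Array.Reverse(myStrings);   // flip the result to get descending order
foreach(string s in myStrings)
{
    Console.WriteLine(s);
}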
Not always an ideal way to do it - you could implement a custom comparer instead - but for the trivial example you asked about this is probably the most logical approach.
In computer science, strings are usually sorted character by character, with the preferred sort order being (for a standard English character set):
Null characters first
Followed by whitespace
Followed by symbols
Followed by numeric characters in obvious numerical order
Followed by alphabetic characters in obvious alphabetical order
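Whether lowercase characters come before uppercase ones depends on the comparison used: culture-aware (dictionary-style) comparisons generally put lowercase first, while ordinal comparisons by character code put uppercase first, since "A" (65) precedes "a" (97) in ASCII.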
So for example if we were to sort / compare:
test i ng
test e r
Then "tester" would come before "testing" - the first different character in the string is the 5th one, and "e" comes before "i".
Similarly, if we were to compare:
test
testing
Then in this case "test" would come first. Once again the strings are identical until the 5th character, where the string "test" ends (i.e. no character), and "no character" comes before any alphanumeric character.
Note that this can produce some counter-intuitive results when dealing with numbers. For example, try sorting the strings "50" and "100": you will find that "100" comes before "50". Why? Because the strings differ at character 1, and "5" comes after "1".
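A quick way to see this effect, sketched here in Python with made-up sample values:

```python
values = ["50", "100", "9"]

# Plain string sort compares character by character: "1" < "5" < "9".
print(sorted(values))           # ['100', '50', '9']

# Converting each string to an integer for the comparison gives numeric order.
print(sorted(values, key=int))  # ['9', '50', '100']
```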
In nearly all languages there is a function which will do all of the above for you!
You should use that function instead of trying to sort strings yourself! For example:
// C#
string[] myStrings = {"test","testertest","testing","tester","testingtest"};
Array.Sort(myStrings);
in Java you can use natural ordering with
java.util.Collections.sort(list);
then, to make it descending:
java.util.Collections.reverse(list);
or create your own Comparator to do the reverse sorting.
When comparing two strings to see which sorts first, the comparison is typically done on a character by character basis. If the characters in the first position (e.g., t in your example) are identical, you move to the next character. When two characters differ, that "may" define which string is considered "greater".
However, depending on the locale used and a number of other factors, it is possible for later characters in the two strings being compared to override a difference in an earlier character. For example, in some collations, the diacritics on letters are considered to be of secondary weight. So a primary difference in a later character can override the secondary difference.
When two strings are otherwise identical but one is longer, the longer one is typically considered to be "greater". When sorting in descending order, the "greater" of two strings is sorted first.
Do you want to know if test should appear after tester in a descending order? Or are you particularly interested in sorting strings with similar prefixes?
If it's the latter, I'd suggest a Trie if the input tends to grow very large.
