Overlapping values of strings in Python - python-3.x

I am building a puzzle word game in Python. I have the correct puzzle word, and the guessed puzzle word. I want to build a third string which shows the correct letters in the guessed puzzle in the correct puzzle word, and _ at the position of the incorrect letters.
For example, say the correct word is APPLE and the guessed word is APTLE
then i want to have a third string: AP_L_
The guessed word and correct word are guaranteed to be 3 to 5 characters long, but the guessed word is not guaranteed to be the same length as the correct word
For example, correct word is TEA and the guessed word is TEAKO, then the third string should be TEA__ because the players guessed the last two letters incorrectly.
Another example, correct word is APPLE and guessed word is POP, the third string should be:
_ _ P_ _ (without space separation)
I can successfully get the matched indexes of the correct and guessed word; however, I am having problems building the third string. I just learned that strings in Python are immutable and that i cannot assign something like str1[index] = str2[index]
I have tried many things, including using lists, but i am not getting the correct answer. The attached code is my most recent attempt, would you please help me solve this?
Thank you
find the match between puzzle_word and guess
def matcher(str_a, str_b):
#find indexes where letters overlap
matched_indexes = [i for i, (a, b) in enumerate(zip(str_a, str_b)) if a == b]
result = []
for i in str_a:
result.append('_')
for value in matched_indexes:
result[value].replace('_', str_a[value])
print(result)
matcher("apple", "allke")
the output result right now is list of five "_"
cases:
correct word is APPLE and the guessed word is APTLE third
string: AP_L_
correct word is TEA and the guessed word is TEAKO,
third string should be TEA__
correct word is APPLE and guessed
word is POP, third string should be _ _ P_ _

You can use itertools.zip_longest here to always make sure you pad out to the longest word provided and then create a new string by joining the matching characters or otherwise a _. eg:
from itertools import zip_longest
correct_and_guess = [
('APPLE', 'APTLE'),
('TEA', 'TEAKO'),
('APPLE', 'POP')
]
for correct, guess in correct_and_guess:
# If characters in same positions match - show character otherwise `_`
new_word = ''.join(c if c == g else '_' for c, g in zip_longest(correct, guess, fillvalue='_'))
print(correct, guess, new_word)
Will print the following:
APPLE APTLE AP_LE
TEA TEAKO TEA__
APPLE POP __P__

Couple of things here.
str.replace() does not replace inline; as you noted strings are immutable, so you have to assign the result of replace:
result[value] = result[value].replace('_', str_a[value])
However, there's no point doing this since you can just assign to the list element:
result[value] = str_a[value]
And finally you can assign a list of the length of str_a without the for loop, which might be more readable:
result = ['_'] * len(str_a)

Related

Doubts about string

So, I'm doing an exercise using python, and I tried to use the terminal to do step by step to understand what's happening but I didn't.
I want to understand mainly why the conditional return just the index 0.
Looking 'casino' in [Casinoville].lower() isn't the same thing?
Exercise:
Takes a list of documents (each document is a string) and a keyword.
Returns list of the index values into the original list for all documents containing the keyword.
Exercise solution
def word_search(documents, keyword):
indices = []
for i, doc in enumerate(documents):
tokens = doc.split()
normalized = [token.rstrip('.,').lower() for token in tokens]
if keyword.lower() in normalized:
indices.append(i)
return indices
My solution
def word_search(documents, keyword):
return [i for i, word in enumerate(doc_list) if keyword.lower() in word.rstrip('.,').lower()]
Run
>>> doc_list = ["The Learn Python Challenge Casino.", "They bought a car", "Casinoville"]
Expected output
>>> word_search(doc_list, 'casino')
>>> [0]
Actual output
>>> word_search(doc_list, 'casino')
>>> [0, 2]
Let's try to understand the difference.
The "result" function can be written with list-comprehension:
def word_search(documents, keyword):
return [i for i, word in enumerate(documents)
if keyword.lower() in
[token.rstrip('.,').lower() for token in word.split()]]
The problem happens with the string : "Casinoville" at index 2.
See the output:
print([token.rstrip('.,').lower() for token in doc_list[2].split()])
# ['casinoville']
And here is the matter: you try to ckeck if a word is in the list. The answer is True only if all the string matches (this is the expected output).
However, in your solution, you only check if a word contains a substring. In this case, the condition in is on the string itself and not the list.
See it:
# On the list :
print('casino' in [token.rstrip('.,').lower() for token in doc_list[2].split()])
# False
# On the string:
print('casino' in [token.rstrip('.,').lower() for token in doc_list[2].split()][0])
# True
As result, in the first case, "Casinoville" isn't included while it is in the second one.
Hope that helps !
The question is "Returns list of the index values into the original list for all documents containing the keyword".
you need to consider word only.
In "Casinoville" case, word "casino" is not in, since this case only have word "Casinoville".
When you use the in operator, the result depends on the type of object on the right hand side. When it's a list (or most other kinds of containers), you get an exact membership test. So 'casino' in ['casino'] is True, but 'casino' in ['casinoville'] is False because the strings are not equal.
When the right hand side of is is a string though, it does something different. Rather than looking for an exact match against a single character (which is what strings contain if you think of them as sequences), it does a substring match. So 'casino' in 'casinoville' is True, as would be casino in 'montecasino' or 'casino' in 'foocasinobar' (it's not just prefixes that are checked).
For your problem, you want exact matches to whole words only. The reference solution uses str.split to separate words (the with no argument it splits on any kind of whitespace). It then cleans up the words a bit (stripping off punctuation marks), then does an in match against the list of strings.
Your code never splits the strings you are passed. So when you do an in test, you're doing a substring match on the whole document, and you'll get false positives when you match part of a larger word.

String Operations Confusion? ELI5

I'm extremely new to python and I have no idea why this code gives me this output. I tried searching around for an answer but couldn't find anything because I'm not sure what to search for.
An explain-like-I'm-5 explanation would be greatly appreciated
astring = "hello world"
print(astring[3:7:2])
This gives me : "l"
Also
astring = "hello world"
print(astring[3:7:3])
gives me : "lw"
I can't wrap my head around why.
This is string slicing in python.
Slicing is similar to regular string indexing, but it can return a just a section of a string.
Using two parameters in a slice, such as [a:b] will return a string of characters, starting at index a up to, but not including, index b.
For example:
"abcdefg"[2:6] would return "cdef"
Using three parameters performs a similar function, but the slice will only return the character after a chosen gap. For example [2:6:2] will return every second character beginning at index 2, up to index 5.
ie "abcdefg"[2:6:2] will return ce, as it only counts every second character.
In your case, astring[3:7:3], the slice begins at index 3 (the second l) and moves forward the specified 3 characters (the third parameter) to w. It then stops at index 7, returning lw.
In fact when using only two parameters, the third defaults to 1, so astring[2:5] is the same as astring[2:5:1].
Python Central has some more detailed explanations of cutting and slicing strings in python.
I have a feeling you are over complicating this slightly.
Since the string astring is set statically you could more easily do the following:
# Sets the characters for the letters in the consistency of the word
letter-one = "h"
letter-two = "e"
letter-three = "l"
letter-four = "l"
letter-six = "o"
letter-7 = " "
letter-8 = "w"
letter-9 = "o"
letter-10 = "r"
letter11 = "l"
lettertwelve = "d"
# Tells the python which of the character letters that you want to have on the print screen
print(letter-three + letter-7 + letter-three)
This way its much more easily readable to human users and it should mitigate your error.

Finding position of first letter in subtring in list of strings (Python 3)

I have a list of strings, and I'm trying to find the position of the first letter of the substring I am searching for in the list of strings. I'm using the find() method to do this, however when I try to print the position of the first letter Python returns the correct position but then throws a -1 after it, like it couldn't find the substring, but only after it could find it. I want to know how to return the position of the first letter of he substring without returning a -1 after the correct value.
Here is my code:
mylist = ["blasdactiverehu", "sdfsfgiuyremdn"]
word = "active"
if any(word in x for x in mylist) == True:
for x in mylist:
position = x.find(word)
print(position)
The output is:
5
-1
I expected the output to just be:
5
I think it may be related to the fact the loop is searching for the substring for every string in the list and after it's found the position it still searches for more but of course returns an error as there is only one occurrence of the substring "active", however I'm not sure how to stop searching after successfully finding one substring. Any help is appreciated, thank you.
Indeed your code will not work as you want it to, since given that any of the words contain the substring, it will do the check for each and every one of them.
A good way to avoid that is using a generator. More specifically, next()
default_val = '-1'
position = next((x.find(word) for x in mylist if word in x), default_val)
print(position)
It will simply give you the position of the substring "word" for the first string "x" that will qualify for the condition if word in x, in the list 'mylist'.
By the way, no need to check for == True when using any(), it already returns True/False, so you can simply do if any(): ...

Print First Letter of Each Word in a String in Python (Keep Punctuation Marks)

First post to site so I apologize if I do something wrong. I have looked for an appropriate answer, but could not find one.
I am new to python and have been playing around trying to take a long string (passage in a book,) and printing all but the first letter of each word while keeping the punctuation marks (Though not apostrophe marks.) and have been unsuccessful so far.
Example:
input = "Hello, I'm writing a sentence. (Though not a good one.)"
Code....
output = H, I W A S. (T N A G O.)
--Note the ",", ".", "()", but not the " ' ".
Any tips? Thank you all so much for taking the time to look
To help you on your adventure, I'll give you like a step-by-step logic of it
In python first use the .split() to seperate it by spaces
Go through each string in the list
Go through every char in the string
Print any punctuation marks that you specify and the first alphabetical character you find

Are two strings anagrams or not?

I want to find if two strings are anagrams or not..
I thought to sort them,and then check one by one but is there any algorithms for sorting stings? or another idea to make it? (simple ideas or code because i am a beginner )thanks
Strings are lists of characters in Haskell, so the standard sort simply works.
> import Data.List
> sort "hello"
"ehllo"
Your idea of sorting and then comparing sounds fine for checking anagrams.
I can give you and idea-(as I am not that much acquainted with haskell).
Take an array having 26 spaces.
Now for each character in the first string you increase certaing position in array.
If array A[26]={0,0,...0}
Now if you find 'a' then put A[1]=A[1]+1;
if 'b' then A[2]=A[2]+1;
Now in case of 2nd string for each character you decrease the values for each character found in the same array.(if you find 'a' decrease A[1] like A[1]=A[1]-1)
At last check if all the array elements are 0 or not. If 0 then definitely they are anagram else not an anagram.
Note: You may extend this for Capital letters similarly.
It is not necessary to count the crowd each letter.
Simply, you can sort your string and then check each element of two lists.
For example, you have this
"cinema" and "maneci"
It would be helpful to make your string into a list of characters.
['c','i','n','e','m','a'] and ['m','a','n','e','c','i']
Then , you can sort these list and you will check each character.
Note that you will have these cases :
example [] [] = True
example [] a = False
example a [] = False
example (h1:t1)(h2:t2) = if h1==h2 then _retroactively_ else False
In the Joy of Haskell "Finding Success and Failure", pp.11-14, the authors offer the following code which works:
import Data.List
isAnagram :: String -> String -> Bool
isAnagram word1 word2 = (sort word1) == (sort word2)
After importing your module (I imported practice.hs into Clash), you can enter two strings which, if they are anagrams, will return true:
*Practice> isAnagram "julie" "eiluj"
True

Resources