How can I replace each letter in the sentence to sentence without breaking it? - python-3.x

Here's my problem.
sentence = "This car is awsome."
and what I want to do is
sentence.replace("a","<emoji:a>")
sentence.replace("b","<emoji:b>")
sentence.replace("c","<emoji:c>")
and so on...
But of course, if I do it that way, the letters inside "<emoji:>" will also be replaced as I go along. How can I do it another way?

As Carlos Gonzalez suggested:
create a mapping dict and apply it to each character in sequence:
sentence = "This car is awsome."
# mapping
up = {"a":"<emoji:a>",
"b":"<emoji:b>",
"c":"<emoji:c>",}
# apply mapping to create a new text (use up[k] if present else default to k)
text = ''.join( (up.get(k,k) for k in sentence) )
print(text)
Output:
This <emoji:c><emoji:a>r is <emoji:a>wsome.
The advantage of the generator expression inside the ''.join( ... generator ...) is that it takes each single character of sentence and either keeps it or replaces it. It only ever touches each char once, so there is no danger of multiple substitutions and it takes only one pass of sentence to convert the whole thing.
Docs: dict.get(key, default); see also "Why dict.get(key) instead of dict[key]?"
If you used
sentence = sentence.replace("a","o")
sentence = sentence.replace("o","k")
you would first turn every a into o and then turn every o into k, including those that were originally a, and you would have to touch each character twice to make it happen.
Using
up = { "a":"o", "o":"k" }
text = ''.join( (up.get(k,k) for k in sentence) )
avoids this.
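For single-character keys, the same one-pass behaviour can also be had from str.translate. The following is only a sketch under that assumption; the helper name replace_chars is made up for illustration and is not part of the answer above.

```python
def replace_chars(text, mapping):
    """Replace each character of text at most once, using mapping."""
    # str.maketrans accepts a dict whose keys are single characters;
    # the values may be strings of any length.
    table = str.maketrans(mapping)
    return text.translate(table)


up = {"a": "<emoji:a>", "b": "<emoji:b>", "c": "<emoji:c>"}
print(replace_chars("This car is awsome.", up))

# No chained substitution: "a" becomes "o", and only an original "o" becomes "k".
print(replace_chars("a o", {"a": "o", "o": "k"}))
```

Because translate walks the string once and never re-reads its own output, the letters inside the inserted "<emoji:...>" text are left alone.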

If you want to replace more than one character at a time, it is easier to do this with a regex. Inspired by "Passing a function to re.sub in Python":
import re
sentence = "This car is awsome."
up = {"is":"Yippi",
"ws":"WhatNot",}
# build one alternation group from the dict's keys
text2 = re.sub( "("+'|'.join(up)+")", lambda x: up[x.group()], sentence)
print(text2)
Output:
ThYippi car Yippi aWhatNotome.
Docs: re.sub(pattern, repl, string, count=0, flags=0)
You would have to take extra care with your keys if they contain characters that have a special meaning in a regex pattern, e.g. . + * ? ( ) [ ] ^ $. Passing each key through re.escape avoids this.
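A hedged sketch of how keys containing regex metacharacters could be handled: each key is passed through re.escape, and keys are sorted longest first so that a longer key is preferred over a shorter one it starts with. The function name replace_many and the extra "." key are assumptions added for illustration, and the mapping is assumed to be non-empty.

```python
import re


def replace_many(text, mapping):
    """Replace every key of mapping found in text, in a single pass."""
    # Longest keys first, so "ab" wins over "a" when both are keys.
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape neutralises characters such as . + * ? ( ) [ ] ^ $
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group()], text)


up = {"is": "Yippi", "ws": "WhatNot", ".": "!"}
print(replace_many("This car is awsome.", up))
```

Without re.escape, the "." key would match any character instead of a literal full stop.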

Related

how to remove word that is split into characters from list of strings

I have a list of sentences, where some of them contain only one word but it is split into characters. How can I either merge the characters to make it one word or drop the whole row?
list = ['What a rollercoaster', 'y i k e s', 'I love democracy']
I try to avoid writing regular expressions as much as I can, but from what you told me, this one could work :
import re
a = ['What a rollercoaster', 'y i k e s', 'I love democracy']
regex = re.compile(r'^(\w ){2,}.')
result = list(filter(regex.search, a))
This matches strings that start with at least two "character plus space" groups, followed by any other character, so result holds the split-up words (here ['y i k e s']). This assumes you wouldn't have a sentence beginning with something like 'a a foo'.
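The question asks for either merging the characters or dropping the row. A minimal sketch of both, assuming a "split word" is a string made up entirely of single characters separated by single spaces; the names SPLIT_WORD, merge_split_words and drop_split_words are hypothetical.

```python
import re

# One or more "character + space" pairs followed by a final character,
# e.g. "y i k e s". fullmatch requires the whole string to have this shape.
SPLIT_WORD = re.compile(r"(?:\w )+\w")


def merge_split_words(sentences):
    """Join the characters of split-up words; leave other strings alone."""
    return [s.replace(" ", "") if SPLIT_WORD.fullmatch(s) else s
            for s in sentences]


def drop_split_words(sentences):
    """Remove strings that consist only of a split-up word."""
    return [s for s in sentences if not SPLIT_WORD.fullmatch(s)]


a = ['What a rollercoaster', 'y i k e s', 'I love democracy']
print(merge_split_words(a))
print(drop_split_words(a))
```

Note that a genuine two-word string of single letters, such as 'I a', would also be treated as a split word under this assumption.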

Using re.sub to replace a

I have a text. I want to remove certain words and phrases.
One sentence is: We lived there in the l[b]ate[/b] 1990s.
I search it to find ate. (= words[0])
newline = re.sub('ate', newselectionString, line)
But I only want it to find ate, on its own, not as part of another word.
Is it possible to tell re just to find these 3 letters?
Later in the text is: The best thing was when we ate ice cream.
for line in lines:
    for i in range(0, len(words)):
        if words[i] in line:
            print('Found ' + words[i])
            newselectionString = selectionString.replace('GX', 'G' + str(startInt))
            newline = re.sub(words[i], newselectionString, line)
            newLines.append(newline)
            startInt += 1
Here are two ways to do it:
Regular Expression
The regex you want is \bate\b, which requires ate to appear between two word boundaries. It will match "We ate." and "I ate it.", but not "We're late.".
Splitting the String
Fairly similar to just a normal regex, but you might want control over the other words in the sentence.
word_fragments = re.split(r"\b", your_string)  # raw string: a plain "\b" is a backspace character
print(''.join([word for word in word_fragments if word != 'ate']))  # fragments already include the spaces
Use word boundaries \b with str.format.
Ex:
re.sub(r"\b{}\b".format(words[i]), "Hello World", Text)
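As a sketch of the word-boundary approach wrapped in a function: the name replace_whole_word is made up for illustration, and re.escape is added on the assumption that the search word might contain regex metacharacters.

```python
import re


def replace_whole_word(text, word, replacement):
    """Replace word only where it stands alone, not inside another word."""
    pattern = r"\b{}\b".format(re.escape(word))
    # A function as repl inserts the replacement literally, so any
    # backslashes in it are not interpreted as group references.
    return re.sub(pattern, lambda m: replacement, text)


print(replace_whole_word("We lived there in the late 1990s.", "ate", "G1"))
print(replace_whole_word("The best thing was when we ate ice cream.", "ate", "G1"))
```

The first sentence is returned unchanged because "ate" only occurs inside "late". This relies on the search word starting and ending with a letter, digit or underscore, since \b is defined relative to word characters.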

Caesar Cipher in Python - how to replace characters

I'm trying to re-arrange long sentence from a puzzle that is encoded using a Caesar Cipher.
Here is my code.
sentence="g fmnc wms bgblr rpylqjyrc gr zw fylb. rfyrq ufyr amknsrcpq ypc dmp. bmgle gr gl zw fylb gq glcddgagclr ylb rfyr'q ufw rfgq rcvr gq qm jmle. sqgle qrpgle.kyicrpylq() gq pcamkkclbcb. lmu ynnjw ml rfc spj."
import string
a = string.ascii_lowercase
b = a[2:] + a[:2]
for i in range(26):
    sentence.replace(sentence[sentence.find(a[i])], b[i])
Am I missing anything in the replace function?
When I tried sentence.replace(sentence[sentence.find(a[0])],b[0])
it worked, so why can't I loop through?
Thanks.
sentence.replace returns a new string, which you are immediately throwing away. Note that replacing each character repeatedly will cause duplicate replacements in your cipher. See @RemcoGerlich's answer for a more detailed explanation of what is wrong. As for the solution, what about
import string
letters = string.ascii_lowercase
shifted = {l: letters[(i + 2) % len(letters)] for i, l in enumerate(letters)}
sentence = ''.join(shifted.get(c, c) for c in sentence.lower())
or if you really want the tabled way:
from string import ascii_lowercase
rotated_lowercase = ascii_lowercase[2:] + ascii_lowercase[:2]
translation_table = str.maketrans(ascii_lowercase, rotated_lowercase)
sentence = sentence.translate(translation_table)
There are a few problems:
One, sentence[sentence.find(a[i])] is strange. It looks up where in the sentence the character a[i] occurs, and then looks up which character is there. Well, you already know: a[i]. Unless that character doesn't occur in the string; then .find returns -1, and sentence[-1] is the last character in the sentence. Probably not what you meant. So instead you meant sentence.replace(a[i], b[i]).
But you don't save the result anywhere. You meant sentence = sentence.replace(a[i], b[i]).
But that still doesn't work! What if a should be changed into b, and then b into c? Then the original a's are also changed into c! That's a fundamental problem with your approach.
Better solutions are given by modesitt. Mine would have been something like
lookupdict = {a_char: b_char for (a_char, b_char) in zip(a, b)}
sentence_translated = [lookupdict.get(s, s) for s in sentence]  # keep spaces and punctuation unchanged
sentence = ''.join(sentence_translated)
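To make the chained-replacement problem concrete, here is a small sketch that compares the looped replace (with the result saved each time) against a one-pass translation table. The function names chained_replace and one_pass are made up for illustration.

```python
from string import ascii_lowercase

# Alphabet rotated by two positions: a -> c, b -> d, ..., y -> a, z -> b
rotated = ascii_lowercase[2:] + ascii_lowercase[:2]


def chained_replace(text):
    """Looped str.replace: characters replaced earlier get replaced again."""
    for old, new in zip(ascii_lowercase, rotated):
        text = text.replace(old, new)
    return text


def one_pass(text):
    """Translation table: every character is looked up exactly once."""
    return text.translate(str.maketrans(ascii_lowercase, rotated))


print(chained_replace("abc"))
print(one_pass("abc"))
```

The looped version keeps shifting letters it has already shifted, so "abc" ends up as "aba", while the translation table gives the intended "cde".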

Overlapping values of strings in Python

I am building a puzzle word game in Python. I have the correct puzzle word, and the guessed puzzle word. I want to build a third string which shows the correct letters in the guessed puzzle in the correct puzzle word, and _ at the position of the incorrect letters.
For example, say the correct word is APPLE and the guessed word is APTLE
then I want to have a third string: AP_L_
The guessed word and correct word are guaranteed to be 3 to 5 characters long, but the guessed word is not guaranteed to be the same length as the correct word
For example, correct word is TEA and the guessed word is TEAKO, then the third string should be TEA__ because the players guessed the last two letters incorrectly.
Another example, correct word is APPLE and guessed word is POP, the third string should be:
__P__
I can successfully get the matched indexes of the correct and guessed word; however, I am having problems building the third string. I just learned that strings in Python are immutable and that I cannot assign something like str1[index] = str2[index].
I have tried many things, including using lists, but I am not getting the correct answer. The attached code is my most recent attempt. Would you please help me solve this?
Thank you
# find the match between puzzle_word and guess
def matcher(str_a, str_b):
    # find indexes where letters overlap
    matched_indexes = [i for i, (a, b) in enumerate(zip(str_a, str_b)) if a == b]
    result = []
    for i in str_a:
        result.append('_')
    for value in matched_indexes:
        result[value].replace('_', str_a[value])
    print(result)
matcher("apple", "allke")
The output right now is a list of five "_".
cases:
correct word is APPLE and the guessed word is APTLE, third string: AP_L_
correct word is TEA and the guessed word is TEAKO, third string should be TEA__
correct word is APPLE and guessed word is POP, third string should be __P__
You can use itertools.zip_longest here to pad out to the longest word provided, and then create a new string by joining each matching character or otherwise a _. For example:
from itertools import zip_longest
correct_and_guess = [
('APPLE', 'APTLE'),
('TEA', 'TEAKO'),
('APPLE', 'POP')
]
for correct, guess in correct_and_guess:
    # If characters in same positions match - show character otherwise `_`
    new_word = ''.join(c if c == g else '_' for c, g in zip_longest(correct, guess, fillvalue='_'))
    print(correct, guess, new_word)
Will print the following:
APPLE APTLE AP_LE
TEA TEAKO TEA__
APPLE POP __P__
Couple of things here.
str.replace() does not replace in place; as you noted, strings are immutable, so you have to assign the result of replace:
result[value] = result[value].replace('_', str_a[value])
However, there's no point doing this since you can just assign to the list element:
result[value] = str_a[value]
And finally you can assign a list of the length of str_a without the for loop, which might be more readable:
result = ['_'] * len(str_a)
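Putting these points together, a corrected matcher might look like the sketch below. It returns a string rather than printing a list, and it pads to the longer of the two words, which is an assumption taken from the TEA / TEAKO example in the question.

```python
def matcher(correct, guess):
    """Show letters guessed in the right position, '_' everywhere else."""
    # Start with all underscores, as long as the longer of the two words.
    result = ['_'] * max(len(correct), len(guess))
    # zip stops at the shorter word; the remaining positions stay '_'.
    for i, (c, g) in enumerate(zip(correct, guess)):
        if c == g:
            result[i] = c  # list elements can be assigned directly
    return ''.join(result)


for correct, guess in [('APPLE', 'APTLE'), ('TEA', 'TEAKO'), ('APPLE', 'POP')]:
    print(correct, guess, matcher(correct, guess))
```

Building the result as a list and joining it at the end sidesteps string immutability without any call to replace.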

Dictionary thesaurus replacing substrings

So I have a dictionary that contains words and their synonyms. The purpose is to replace substrings in a string with a random synonym. Here's my code.
import random
thesaurus = {
    "happy": ["glad", "blissful", "ecstatic", "at ease"],
    "sad": ["bleak", "blue", "depressed"]
}
phrase = input("Enter a phrase: ")
for x in phrase.split():
    if x in thesaurus:
        ran = len(thesaurus[x])
        print(len(thesaurus[x]))
        ranlis = random.randint(0, ran - 1)
        phrase = phrase.replace(x, str.upper(thesaurus[x][ranlis]))
print(phrase)
If I input "happy happy happy"
The output is:
ECSTATIC ECSTATIC ECSTATIC
I want it to print a different synonym each time (or at least be able to; I understand that it is random).
So:
ECSTATIC BLISSFUL AT EASE
I understand the error in my logic but am unsure how to fix it.
The key is to replace only one occurrence at a time. str.replace() takes an optional third argument, count, which limits how many occurrences are replaced.
So:
phrase = phrase.replace(x,str.upper(thesaurus[x][ranlis]),1)
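An alternative sketch, not taken from the answer above: re.sub accepts a function as the replacement, and that function is called once per match, so each occurrence can draw its own synonym. The function name replace_with_synonyms is hypothetical, and whole-word matching via \b is an added assumption.

```python
import random
import re

thesaurus = {
    "happy": ["glad", "blissful", "ecstatic", "at ease"],
    "sad": ["bleak", "blue", "depressed"],
}


def replace_with_synonyms(phrase, thesaurus):
    """Replace each thesaurus word with an independently chosen synonym."""
    words = "|".join(re.escape(w) for w in thesaurus)
    pattern = re.compile(r"\b(?:" + words + r")\b")
    # The lambda runs once per match, so every occurrence gets its own draw.
    return pattern.sub(
        lambda m: random.choice(thesaurus[m.group()]).upper(), phrase
    )


print(replace_with_synonyms("happy happy happy", thesaurus))
```

Since the choice is random, the same synonym can still come up more than once; the point is only that each occurrence is chosen separately.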
