Matching key value pair from a python dictionary gives absurd results - python-3.x

I have created a python dictionary for expanding acronyms. For example, the dictionary has the following entry:
Acronym_dict = {
    "cont":"continued"
}
The code for the dictionary lookup is as follows:
def code_dictionary(text, dict1=Acronym_dict):
    for word in text.split():
        for key in Acronym_dict:
            if key in text:
                text = text.replace(key, Acronym_dict[key],1)
    return text
The problem is that the code replaces every string that contains the substring 'cont' with 'continued'. For example, 'continental' becomes 'continuedinental'. This is not what I want. I know I could add a space before and after each key in the dictionary, but that would be time-consuming because the dictionary is quite long. Is there an alternative?

A few solutions:
Use regular expressions to find isolated words using \b (word boundary):
import re
Acronym_dict = {
    r'\bcont\b':'continued'
}
def code_dictionary(text, dict1=Acronym_dict):
    for key,value in dict1.items():
        text = re.sub(key,value,text)
    return text
s = 'to be cont in continental'
print(code_dictionary(s))
to be continued in continental
If you don't want to change your dictionary, build the regular expression in the function. Note that re.escape escapes any characters in the key that have a special meaning in a regular expression, so the key is matched literally:
import re
Acronym_dict = {
    'cont':'continued'
}
def code_dictionary(text, dict1=Acronym_dict):
    for key,value in dict1.items():
        regex = r'\b' + re.escape(key) + r'\b'
        text = re.sub(regex,value,text)
    return text
s = 'to be cont in continental'
print(code_dictionary(s))
to be continued in continental
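To illustrate why the key is escaped, here is a small sketch comparing an escaped and an unescaped pattern. The helper name word_pattern and the sample key 'a.k.a' are made up for this illustration; they are not part of the answer above.

```python
import re

def word_pattern(key, escape=True):
    """Wrap a key in word boundaries, optionally escaping it first."""
    body = re.escape(key) if escape else key
    return r'\b' + body + r'\b'

text = 'a.k.a is not ankla'

# Escaped: each '.' only matches a literal dot.
escaped = re.sub(word_pattern('a.k.a'), 'also known as', text)

# Unescaped: each '.' matches any character, so 'ankla' is replaced too.
unescaped = re.sub(word_pattern('a.k.a', escape=False), 'also known as', text)

print(escaped)
print(unescaped)
```

Without escaping, the dots in the key act as wildcards and an unrelated word gets rewritten.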
Fanciest version, does all the acronym replacement in one call to re.sub:
import re
Acronym_dict = {'a':'aaa',
                'b':'bbb',
                'c':'ccc',
                'd':'ddd'}
def code_dictionary(text, dict1=Acronym_dict):
    # ORs all the keys together, longest match first.
    # E.g. generates r'\b(abc|ab|b)\b'.
    # Captures the value it matches.
    regex = r'\b(' + '|'.join([re.escape(key)
                               for key in
                               sorted(dict1,key=len,reverse=True)]) + r')\b'
    # Replace everything in the text in one regex.
    # Uses a callback to look up the value of the acronym.
    return re.sub(regex,lambda m: dict1[m.group(1)],text)
s = 'a abcd b abcd c abcd d'
print(code_dictionary(s))
aaa abcd bbb abcd ccc abcd ddd
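Since the dictionary in the question is described as quite long, one possible variation is to build the combined pattern once and reuse it across calls. The following is a minimal sketch under that assumption; make_expander is a hypothetical helper name and the 'dept' entry is invented sample data.

```python
import re

def make_expander(acronyms):
    """Return a function that expands whole-word acronyms in a string.

    Assumes a non-empty dictionary of plain-string keys.
    """
    # Longest keys first so that a longer key wins over a shorter prefix.
    keys = sorted(acronyms, key=len, reverse=True)
    pattern = re.compile(r'\b(' + '|'.join(re.escape(k) for k in keys) + r')\b')

    def expand(text):
        # The callback looks up the replacement for whichever key matched.
        return pattern.sub(lambda m: acronyms[m.group(1)], text)

    return expand

expand = make_expander({'cont': 'continued', 'dept': 'department'})
print(expand('to be cont in the continental dept'))
```

The pattern is compiled a single time in make_expander, so repeated calls to expand only pay for the substitution itself.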

Try this:
import re
Acronym_dict = {
    "cont":"continued"
}
def code_dictionary(text, dict1=Acronym_dict):
    for key in dict1:
        # \b matches only at word boundaries; re.escape keeps the key literal
        text = re.sub(r'\b' + re.escape(key) + r'\b', dict1[key], text)
    return text
if __name__ == "__main__":
    text = '''
abcd cont ajflkasdfla cont.
cont continental afakjsklfjakl jfalfj asl cont fjdlaskfjal fjal
cont
'''
    print(text)
    print('--------------------')
    print(code_dictionary(text))

Related

remove string which contains special character by python regular expression

My code:
s = '$ascv abs is good'
re.sub(p.search(s).group(),'',s)
Output:
'$ascv abs is good'
The output I want:
'abs is good'
I want to remove any word that contains a special character, using a Python regular expression. I thought my code was right, but the output is wrong.
How can I fix my code to produce the right output?
invalid_chars = ['$', '#'] # Characters you don't want in your text
# Determine whether a word is free of unwanted characters
def if_clean(word):
    for letter in word:
        if letter in invalid_chars:
            return False
    return True
def clean_text(text):
    text = text.split(' ') # Convert text to a list of words
    text_clean = ''
    for word in text:
        if if_clean(word):
            text_clean = text_clean+' '+word
    return text_clean[1:]
# This will print 'abs is good'
print(clean_text('$ascv abs is good'))
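Because the question asks for a regular expression, the same idea can also be sketched with re.sub. This assumes that a "special character" means anything that is neither a word character nor whitespace, which the question does not spell out; drop_special_words is a hypothetical name.

```python
import re

# A run of non-space characters containing at least one character that is
# neither a word character nor whitespace, plus any trailing whitespace.
SPECIAL_WORD = re.compile(r'\S*[^\w\s]\S*\s*')

def drop_special_words(text):
    """Remove every whitespace-delimited word that contains a special character."""
    return SPECIAL_WORD.sub('', text).strip()

print(drop_special_words('$ascv abs is good'))
```

Note that under this assumption ordinary punctuation such as a trailing period also counts as special, so the character class may need narrowing for real text.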

How to write High order function in Python?

I am trying to solve this kata on Codewars:
This kata is the first of a sequence of four about "Squared Strings".
You are given a string of n lines, each substring being n characters long: For example:
s = "abcd\nefgh\nijkl\nmnop"
We will study some transformations of this square of strings.
Vertical mirror: vert_mirror (or vertMirror or vert-mirror)
vert_mirror(s) => "dcba\nhgfe\nlkji\nponm"
Horizontal mirror: hor_mirror (or horMirror or hor-mirror)
hor_mirror(s) => "mnop\nijkl\nefgh\nabcd"
or printed:
vertical mirror   |horizontal mirror
abcd --> dcba     |abcd --> mnop
efgh     hgfe     |efgh     ijkl
ijkl     lkji     |ijkl     efgh
mnop     ponm     |mnop     abcd
My Task:
--> Write these two functions
and
--> high-order function oper(fct, s) where
--> fct is the function of one variable f to apply to the string s (fct will be one of vertMirror, horMirror)
Examples:
s = "abcd\nefgh\nijkl\nmnop"
oper(vert_mirror, s) => "dcba\nhgfe\nlkji\nponm"
oper(hor_mirror, s) => "mnop\nijkl\nefgh\nabcd"
Note:
The form of the parameter fct in oper changes according to the language. You can see each form according to the language in "Sample Tests".
Bash Note:
The input strings are separated by , instead of \n. The output strings should be separated by \r instead of \n.
Here's the code below:
def vert_mirror(strng):
    # your code
    pass
def hor_mirror(strng):
    # your code
    pass
def oper(fct, s):
    # your code
    pass
I have tried using reverse [::-1] but it doesn't work.
The if statement at the bottom is for testing, remove it if you want to use the code somewhere else.
def vert_mirror(string):
    rv = []
    separator = '\n'
    words = string.split(separator)
    for word in words:
        rv.append(word[::-1])
    rv = separator.join(rv)
    #return the representation of rv, bc \n will be displayed as a newline
    return repr(rv)
def hor_mirror(string):
    rv = []
    separator = '\n'
    words = string.split(separator)
    rv = words[::-1]
    rv = separator.join(rv)
    #return the representation of rv, bc \n will be displayed as a newline
    return repr(rv)
def oper(fct, s):
    return fct(s)
if __name__ == '__main__':
    s = "abcd\nefgh\nijkl\nmnop"
    print(oper(vert_mirror, s))
    print(oper(hor_mirror, s))
EDIT: I've just seen the note "The input strings are separated by , instead of \n. The output strings should be separated by \r instead of \n.", if you need to change separators, just change the value of "separator" accordingly.
Or remove the repr(), if you want the raw string.
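For comparison, here is a compact sketch of the same three functions that returns the raw strings, without repr(), so the results can be compared directly against the examples in the kata. It assumes the '\n' separator from the Python version of the task.

```python
def vert_mirror(s):
    """Reverse each line of the square."""
    return "\n".join(line[::-1] for line in s.split("\n"))

def hor_mirror(s):
    """Reverse the order of the lines."""
    return "\n".join(s.split("\n")[::-1])

def oper(fct, s):
    """Higher-order function: apply the given transformation to s."""
    return fct(s)

s = "abcd\nefgh\nijkl\nmnop"
print(oper(vert_mirror, s))
print(oper(hor_mirror, s))
```

oper receives the function object itself (no parentheses after its name) and calls it, which is all that "higher-order" means here.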

hello friends i cant execute my else condition

The program must accept a string S as the input. The program must replace every vowel in the string S by the next consonant (alphabetical order) and replace every consonant in the string S by the next vowel (alphabetical order). Finally, the program must print the modified string as the output.
s=input()
z=[let for let in s]
alpa="abcdefghijklmnopqrstuvwxyz"
a=[let for let in alpa]
v="aeiou"
vow=[let for let in v]
for let in z:
    if(let=="a"or let=="e" or let=="i" or let=="o" or let=="u"):
        index=a.index(let)+1
        if index!="a"or index!="e"or index!="i"or index!="o"or index!="u":
            print(a[index],end="")
    else:
        for let in alpa:
            ind=alpa.index(let)
            i=ind+1
            if(i=="a"or i=="e" or i=="i"or i=="o"or i=="u"):
                print(i,end="")
the output is :
i/p orange
pbf
the required output is:
i/p orange
puboif
I would do it like this:
import string
def dumb_encrypt(text, vowels='aeiou'):
    result = ''
    for char in text:
        i = string.ascii_letters.index(char)
        if char.lower() in vowels:
            # Vowel: the next letter is always a consonant.
            result += string.ascii_letters[(i + 1) % len(string.ascii_letters)]
        else:
            # Consonant: search for the first vowel after it.
            for c in vowels:
                if string.ascii_letters.index(c) > i:
                    break
            else:
                c = 'a'  # no later vowel found, fall back to 'a'
            result += c
    return result
print(dumb_encrypt('orange'))
# puboif
Basically, I would use string.ascii_letters instead of defining the alphabet anew. Also, I would not convert everything to lists, as that is not necessary for looping. The vowel case you got right. For consonants, I would just search forward for the next vowel; if the search fails, the result falls back to the default value 'a'.
Here I use groupby to split the alphabet into runs of vowels and consonants. I then create a mapping of letters to the next letter of the other type (ignoring the final consonants in the alphabet). I then use str.maketrans to build a translation table I can pass to str.translate to convert the string.
from itertools import groupby
from string import ascii_lowercase as letters
vowels = "aeiou"
is_vowel = vowels.__contains__
partitions = [list(g) for k, g in groupby(letters, is_vowel)]
mapping = {}
for curr_letters, next_letters in zip(partitions, partitions[1:]):
    for letter in curr_letters:
        mapping[letter] = next_letters[0]
table = str.maketrans(mapping)
"orange".translate(table)
# 'puboif'
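The mapping above leaves the consonants after 'u' untouched. If they should wrap around to 'a' instead, which is an assumption since the task statement does not say, a translation table can be sketched like this (next_of_other_kind is a hypothetical helper name):

```python
from string import ascii_lowercase as letters

VOWELS = "aeiou"

def next_of_other_kind(letter):
    """Next consonant for a vowel, next vowel for a consonant, wrapping at 'z'."""
    want_vowel = letter not in VOWELS
    start = letters.index(letter)
    for step in range(1, len(letters) + 1):
        candidate = letters[(start + step) % len(letters)]
        if (candidate in VOWELS) == want_vowel:
            return candidate

# Build the table once; str.translate then converts a whole string in one call.
table = str.maketrans({c: next_of_other_kind(c) for c in letters})

print("orange".translate(table))
```

Characters outside a-z are not in the table, so str.translate passes them through unchanged.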

Counting Frequencies

I am trying to figure out how to count how often the word tags I-GENE and O appear in a file.
The example of the file I'm trying to compute is this:
45 WORDTAG O cortex
2 WORDTAG I-GENE cdc33
4 WORDTAG O PPRE
4 WORDTAG O How
44 WORDTAG O if
I am trying to compute the sum of word[0] (column 1) for each category (e.g. I-GENE, and likewise O).
In this example:
The sum of words with category of I-GENE is 2
and the sum of words with category of O is 97
MY CODE:
import os
def reading_files (path):
    counter = 0
    for root, dirs, files in os.walk(path):
        for file in files:
            if file != ".DS_Store":
                if file == "gene.counts":
                    open_file = open(root+file, 'r', encoding = "ISO-8859-1")
                    for line in open_file:
                        tmp = line.split(' ')
                        for words in tmp:
                            for word in words:
                                if (words[2]=='I-GENE'):
                                    sum = sum + int(words[0])
                                if (words[2] == 'O'):
                                    sum = sum + int(words[0])
                                else:
                                    print('Nothing')
    print(sum)
I think you should delete the word loop, since you don't use it:
for word in words:
I would use a dictionary for this if you want to solve it generally.
While you read the file, fill a dictionary as follows:
- If the key is already in the dict, increase its value.
- If it is a new key, add it to the dict with the line's count as its value.
def reading_files (path):
    freqDict = dict()
    ...
    for words in tmp:
        if words[2] not in freqDict:
            freqDict[words[2]] = 0
        freqDict[words[2]] += int(words[0])
Once you have built the dictionary, you can return it and look up a keyword in it, or you can pass a keyword to the function and return or print just that value.
I prefer the first option: do as few file I/O operations as possible and work with the collected data in memory.
For this solution I wrote a wrapper:
def getValue(fDict, key):
    if key not in fDict:
        return "Nothing"
    return str(fDict[key])
So it will behave like your example.
It is not necessary, but it is good practice to close the file when you are no longer using it.
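Putting the dictionary idea together, here is a self-contained sketch that sums the counts per tag. It assumes each line splits on whitespace into count, the literal WORDTAG, the tag, and the word, as in the sample from the question; tag_totals is a hypothetical name, and the lines are passed in as a list so the example runs without a file.

```python
from collections import defaultdict

def tag_totals(lines):
    """Sum the first column per tag (third column) over WORDTAG lines."""
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Skip blank or malformed lines instead of failing on them.
        if len(fields) < 4 or fields[1] != "WORDTAG":
            continue
        totals[fields[2]] += int(fields[0])
    return dict(totals)

sample = [
    "45 WORDTAG O cortex",
    "2 WORDTAG I-GENE cdc33",
    "4 WORDTAG O PPRE",
    "4 WORDTAG O How",
    "44 WORDTAG O if",
]
totals = tag_totals(sample)
print(totals)
```

For a real file, the same function can be called on the open file object inside a with block, which also closes the file automatically.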

Selecting a random value from a dictionary in python

Here is my function:
def evilSetup():
    words = setUp()
    result = {}
    char = input('Please enter your one letter guess: ')
    for word in words:
        key = ' '.join(char if c == char else '-' for c in word)
        if key not in result:
            result[key] = []
        result[key].append(word)
    return max(result.items(), key=lambda keyValue: len(keyValue[1]))
from collections import defaultdict
import random
words= evilSetup()#list of words from which to choose
won, lost = 0,0 #accumulators for games won, and lost
while True:
    wrongs=0 # accumulator for wrong guesses
    secretWord = words
    print(secretWord) #for testing purposes
    guess= len(secretWord)*'_'
    print('Secret Word:' + ' '.join(guess))
    while wrongs < 8 and guess != secretWord:
        wrongs, guess = playRound(wrongs, guess)
    won, lost = endRound(wrongs,won,lost)
    if askIfMore()== 'N':
        break
printStats(won, lost)
The function will take a list of words, and sort them into a dictionary based on the position of the guessed letter. As of now, it returns the key,value pair that is the largest. What I would like it to ultimately return is a random word from the biggest dictionary entry. The values as of now are in the form of a list.
For example {- - -, ['aah', 'aal', 'aas']}
Ideally I would grab a random word from this list to return. Any help is appreciated. Thanks in advance.
If you have a list lst, then you can simply do:
random_word = random.choice(lst)
to get a random entry of the list. So here, you will want something like:
return random.choice(max(result.items(), key=lambda kv: len(kv[1]))[1])
#      ^^^^^^^^^^^^^^                                              ^^^^
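As a self-contained sketch of the same idea, the grouping and the random pick can be separated into two small functions. The names largest_family and pick_word and the sample word list are made up for this illustration, and the guessed letter is passed as an argument instead of being read with input().

```python
import random

def largest_family(words, char):
    """Group words by where char appears and return the biggest group."""
    families = {}
    for word in words:
        key = ' '.join(char if c == char else '-' for c in word)
        families.setdefault(key, []).append(word)
    # Only the lists matter here, so compare the values by length.
    return max(families.values(), key=len)

def pick_word(words, char):
    """Return a random word from the largest family."""
    return random.choice(largest_family(words, char))

sample_words = ['aah', 'aal', 'aas', 'bad', 'cab']
print(pick_word(sample_words, 'a'))
```

Taking max over families.values() avoids the trailing [1] index, because the key of the winning entry is never needed.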
