I am attempting to create a program in Python that asks a user to input a string (preferably in lower-case) and then converts that string into sentence case. However, a boolean I am using to check whether the next letter needs to be capitalised is never set to False, even though the conditions of the 'if' statement are met.
class SentenceCaseProgram(object):
    def __init__(self, isPunctuation, isSpace, sentence, new_sentence):
        self.isPunctuation = isPunctuation
        self.sentence = sentence
        self.new_sentence = new_sentence
        self.isSpace = isSpace
        self.count = 0
    def Input(self):
        self.sentence = str(input("Type in a sentence (with punctuation) entirely in lowercase. "))
    def SentenceCase(self):
        for letter in self.sentence:
            print(self.isPunctuation)
            if self.count == 0:
                letter = letter.capitalize()
            if letter is ' ':
                self.isSpace = True
            if (self.isPunctuation == True) and (letter in 'abcdefghijklmnopqrstuvwxyz'):
                letter = letter.capitalize()
                self.isPunctuation = letter is '.' or '!' or '?' or ')'
            if letter is 'i' and self.isSpace is True:
                letter = letter.capitalize()
                self.isSpace = False
            self.count += 1
            if letter == '.' or '!' or '?' or ')':
                self.isPunctuation = True
            else:
                self.isPunctuation = False
            self.new_sentence += letter
    def Print(self):
        print("Your sentence in sentence case is '%s'" % self.new_sentence)
    def Main(self):
        self.__init__(False, False, "", "")
        self.Input()
        self.SentenceCase()
        self.Print()
app = SentenceCaseProgram(False, False, "", "")
app.Main()
When I run the program, the program asks for an input, and then capitalizes every single letter in the sentence, and the self.isPunctuation boolean will constantly be set to True, apart from with the first loop.
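Don't use is to compare strings; use ==. The is operator tests object identity, not equality.
letter is '.' or '!' or '?' or ')' is evaluated as (letter is '.') or '!' or '?' or ')'. A non-empty string such as '!' is always truthy, so the whole condition is always true. It should be
letter == '.' or letter == '!' or letter == '?' or letter == ')'
or
letter in '.!?)'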
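To make the difference concrete, here is a minimal sketch that compares the original condition with the membership test and uses the latter in a small loop. The function names are hypothetical and are not part of the question's class; the sketch also leaves out the special handling of a standalone 'i'.

```python
def is_sentence_end_buggy(letter):
    # Parsed as (letter == '.') or '!' or '?' or ')'.
    # '!' is a non-empty string, so the result is truthy for any letter.
    return bool(letter == '.' or '!' or '?' or ')')


def is_sentence_end(letter):
    # Membership test against the intended punctuation characters.
    # Assumes letter is a single character, as it is when iterating a string.
    return letter in '.!?)'


def sentence_case(text):
    result = []
    capitalise_next = True  # the first letter of the text starts a sentence
    for letter in text:
        if is_sentence_end(letter):
            capitalise_next = True
        elif letter.isalpha():
            if capitalise_next:
                letter = letter.upper()
            capitalise_next = False
        result.append(letter)
    return ''.join(result)


print(is_sentence_end_buggy('a'))
print(is_sentence_end('a'))
print(sentence_case('hello there. how are you? fine!'))
```

Keeping the flag set until the next alphabetic character means spaces after the punctuation do not reset it.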
I'm parsing a German text with many hyphens in it. To check whether a word is a proper German word (and was only separated by a hyphen because it fell at the end of a line) or needs the hyphen because that is how it is actually written, I am currently extending a collection of lemmatized words that I found here:
https://github.com/michmech/lemmatization-lists
Can you point me to a way of doing this with nltk?
What I do: when my parser encounters a word with a hyphen, I check spelling without hyphen (i.e. if it is contained in my list with lemmatized words). If it is not contained in my list (currently some 420,000 words) I will check myself if it should be added to my list or written with hyphen.
This is the function that does the work:
import re

def function(sents, german_words, hyphened_words):
    # german_stopwords and _is_url_or_mail_address are defined elsewhere
    # raw string so that \s and the \u escapes are handled by the re module
    clutter = r'[*!?,.;:_\s()\u201C\u201D\u201E\u201F\u2033\u2036\u0022]'
    sentences = list()
    new_hyphened_words = list()
    new_german_words = list()
    skip = False
    for i, sentence in enumerate(sents):
        if skip:
            # this line was already merged into the previous one
            skip = False
            continue
        new_sentence = ''
        words = sentence.split(' ')
        words = list(filter(None, words))
        new_words = list() # words to make a correct sentence
        last_word = words[-1]
        last_word = last_word.strip()
        if last_word[-1] == '-':
            try:
                next_sentence = sents[i+1]
            except IndexError as e:
                raise e
            next_words = next_sentence.split(' ')
            next_words = list(filter(None, next_words))
            first_word = next_words[0]
            new_word = last_word[:-1] + first_word
            new_word = re.sub(clutter, '', new_word)  # was re.sub(..., word): NameError
            # no continue in the branches below, so the merged sentence
            # falls through and is appended at the end of the loop body
            if _is_url_or_mail_address(new_word):
                new_words = words[:-1] + [new_word] + next_words[1:]
                skip = True
            elif new_word in german_stopwords:
                new_words = words[:-1] + [new_word] + next_words[1:]
                skip = True
            elif new_word in german_words:
                new_words = words[:-1] + [new_word] + next_words[1:]
                skip = True
            else:
                new_word = last_word + first_word # now with hyphen!
                new_word = re.sub(clutter, '', new_word)
                if new_word in hyphened_words:
                    new_words = words[:-1] + [new_word] + next_words[1:]
                    skip = True
                else: # found neither with nor without hyphen
                    with_hyphen = re.sub(clutter, '', last_word + first_word)
                    without_hyphen = re.sub(clutter, '', last_word[:-1] + first_word)
                    print(f'1: {with_hyphen}, 2: {without_hyphen}')
                    choose = input('1 or 2, or . if correction')
                    if choose == '1':
                        new_hyphened_words.append(with_hyphen)
                        new_words = words[:-1] + [last_word+first_word] + next_words[1:]
                        skip = True
                    elif choose == '2':
                        new_german_words.append(without_hyphen)
                        new_words = words[:-1] + [last_word[:-1]+first_word] +\
                            next_words[1:]
                        skip = True
                    else:
                        corrected_word = input('Corrected word: ')
                        print()
                        new_german_words.append(corrected_word)
                        print(f'Added to dict: "{corrected_word}"')
                        ok = input('Also add to speech? ./n')
                        if ok == 'n':
                            speech_word = input('Speech word: ')
                            new_words = words[:-1] + [speech_word] + next_words[1:]
                            skip = True
                        else:
                            new_words = words[:-1] + [corrected_word] + next_words[1:]
                            skip = True
        else:
            new_words = words
        new_sentence = ' '.join(w for w in new_words)
        sentences.append(new_sentence)
    return sentences
The lists "german_words" and "hyphened_words" get updated every now and again so they contain the new words from the sessions before.
What I do works, however it is slow work. I have been searching for ways to do this with nltk but I seem to have looked at the wrong places. Can you point me to a way that trains an nltk collection of words or that uses a more efficient way of processing this?
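Independently of nltk, the lookup described above (try the word without the hyphen first, then with it) can be sketched on its own. This is only an illustration under assumptions: resolve_hyphen and the sample words are hypothetical, and the word collections are assumed to be Python sets rather than lists, since a set membership test does not scan all 420,000 entries.

```python
def resolve_hyphen(last_word, first_word, german_words, hyphened_words):
    """Return the joined word if it is known, otherwise None.

    last_word is assumed to end with '-'; both collections are sets.
    """
    without_hyphen = last_word[:-1] + first_word
    if without_hyphen in german_words:
        return without_hyphen
    with_hyphen = last_word + first_word
    if with_hyphen in hyphened_words:
        return with_hyphen
    return None  # unknown: this is where the manual check would happen


# Hypothetical sample data; the real collections would be loaded from files.
german_words = {"Haustür", "Fenster"}
hyphened_words = {"E-Mail"}

print(resolve_hyphen("Haus-", "tür", german_words, hyphened_words))
print(resolve_hyphen("E-", "Mail", german_words, hyphened_words))
print(resolve_hyphen("Foo-", "bar", german_words, hyphened_words))
```

An existing list can be converted once with set(german_words) before the loop starts.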
I'm trying to store substrings of letters in 's' that are in alphabetical order in a list
s = 'azcbobobegghakl'
string = ''
List = []
i = -1
for letter in s:
    if letter == s[0]:
        string += letter
    elif letter >= s[i]:
        string += letter
    elif letter < s[i]:
        List.append(string)
        string = letter
    i += 1
print(List)
My expected result:
['az', 'c', 'bo', 'bo', 'beggh', 'akl']
And my actual Output:
['az', 'c', 'bo', 'bo']
Firstly, your first if statement is incorrect. It should be if i == -1:. Because letter == s[0] is also true for every later 'a', the second 'a' in s is appended to the current substring instead of starting a new one.
Secondly, when the loop ends you never add what is left in string to List, so the final substring is lost.
As such, the following is what you want:
s = 'azcbobobegghakl'
string = ''
List = []
i = -1
for letter in s:
    if i == -1:
        string += letter
    elif letter >= s[i]:
        string += letter
    elif letter < s[i]:
        List.append(string)
        string = letter
    i += 1
List.append(string)
print(List)
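The same idea can also be written without the separate index by comparing each letter with the last character of the current run. This is a sketch of an alternative, not the code from the answer; the function name alphabetical_runs is hypothetical.

```python
def alphabetical_runs(s):
    """Split s into maximal substrings whose letters are in non-decreasing order."""
    runs = []
    current = ''
    for letter in s:
        if current and letter < current[-1]:
            # Order is broken: store the finished run and start a new one.
            runs.append(current)
            current = letter
        else:
            current += letter
    if current:
        # Do not forget the run that is still open when the loop ends.
        runs.append(current)
    return runs


print(alphabetical_runs('azcbobobegghakl'))
```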
VOWELS = ['a', 'e', 'i', 'o', 'u']
BEGINNING = ["th", "st", "qu", "pl", "tr"]
def pig_latin2(word):
    # word is a string to convert to pig-latin
    string = word
    string = string.lower()
    # get first letter in string
    test = string[0]
    if test not in VOWELS:
        # remove first letter from string skip index 0
        string = string[1:] + string[0]
        # add characters to string
        string = string + "ay"
    if test in VOWELS:
        string = string + "hay"
    print(string)
def pig_latin(word):
    string = word
    transfer_word = word
    string.lower()
    test = string[0] + string[1]
    if test not in BEGINNING:
        pig_latin2(transfer_word)
    if test in BEGINNING:
        string = string[2:] + string[0] + string[1] + "ay"
    print(string)
When I un-comment the code below and replace print(string) with return string in the two functions above, it only works for words handled by pig_latin(). As soon as a word should be passed to pig_latin2(), I get a value of None for every such word and the program crashes.
# def start_program():
#     print("Would you like to convert words or sentence into pig latin?")
#     answer = input("(y/n) >>>")
#     print("Only have words with spaces, no punctuation marks!")
#     word_list = ""
#     if answer == "y":
#         words = input("Provide words or sentence here: \n>>>")
#         new_words = words.split()
#         for word in new_words:
#             word = pig_latin(word)
#             word_list = word_list + " " + word
#         print(word_list)
#     elif answer == "n":
#         print("Goodbye")
#         quit()
# start_program()
You're not capturing the return value of the pig_latin2 function. So whatever that function does, you're discarding its output.
Fix this line in the pig_latin function:
    if test not in BEGINNING:
        string = pig_latin2(transfer_word) # <----------- forgot 'string =' here
When fixed thusly, it works for me. Having said that, there would still be a bunch of stuff to clean up.
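As one possible cleanup, both functions can return their result directly, which makes it harder to drop a return value by accident. This is a sketch under the question's own rules (vowel-initial words get "hay", the five listed two-letter beginnings move as a unit), not a complete pig latin implementation.

```python
VOWELS = "aeiou"
BEGINNING = ("th", "st", "qu", "pl", "tr")


def pig_latin2(word):
    # Handles words that do not start with one of the BEGINNING clusters.
    word = word.lower()
    if word[0] in VOWELS:
        return word + "hay"
    return word[1:] + word[0] + "ay"


def pig_latin(word):
    word = word.lower()  # str.lower() returns a new string, so reassign it
    if word[:2] in BEGINNING:
        return word[2:] + word[:2] + "ay"
    # Pass the result of the helper on to the caller.
    return pig_latin2(word)


sentence = "the quick apple"
print(" ".join(pig_latin(w) for w in sentence.split()))
```

Joining with " ".join also avoids the leading space that word_list = word_list + " " + word produces.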
I am making a program to take in a sentence, convert each word to pig latin, and then spit it back out as a sentence. I have no idea where I have messed up. When I input a sentence and run it, it prints:
<built-in method lower of str object at 0x03547D40>
s = input("Input an English sentence: ")
s = s[:-1]
string = s.lower
vStr = ("a","e","i","o","u")
def findFirstVowel(word):
    for index in range(len(word)):
        if word[index] in vStr:
            return index
    return -1
def translateWord():
    if(vowel == -1) or (vowel == 0):
        end = (word + "ay")
    else:
        end = (word[vowel:] + word[:vowel]+ "ay")
def pigLatinTranslator(string):
    for word in string:
        vowel = findFirstVowel(word)
        translateWord(vowel)
    return
print (string)
You have used the lower method incorrectly.
You should call it like this: string = s.lower().
The parentheses change everything. Without them, s.lower is a reference to the method object itself, which is why print shows <built-in method lower of str object at ...> instead of the lowercased text.
A function or method only runs when you call it with ().
Here is a corrected version of the code:
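s = input("Input an English sentence: \n").strip()
string = s.lower()  # call the method; s.lower alone is only a reference to it
vStr = ("a","e","i","o","u")
def findFirstVowel(word):
    for idx, ch in enumerate(word):
        if ch in vStr:
            return idx
    return -1
def translateWord(vowel, word):
    # return the translated word so the caller can use it
    if (vowel == -1) or (vowel == 0):
        return word + "ay"
    return word[vowel:] + word[:vowel] + "ay"
def pigLatinTranslator(string):
    # split into words; iterating over the string itself yields single characters
    words = []
    for word in string.split():
        vowel = findFirstVowel(word)
        words.append(translateWord(vowel, word))
    return " ".join(words)
print(pigLatinTranslator(string))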
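The difference between referencing a method and calling it can be checked in isolation. A minimal sketch, with a hypothetical sample string:

```python
s = "Hello World"

method = s.lower    # no parentheses: a reference to the bound method
result = s.lower()  # parentheses: the method is called and returns a new string

print(callable(method))  # the reference is something you can still call later
print(result)
print(method())          # calling the stored reference gives the same string
```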
I have a problem with my Hangman game: when the word contains a repeated letter, as in happy, only one 'p' is appended to the list. Please run my code and tell me what to do.
Check my loops.
import random
import time
File=open("Dict.txt",'r')
Data = File.read()
Word = Data.split("\n")
A = random.randint(0,len(Word)-1)
Dict = Word[A]
print(Dict)
Dash = []
print("\n\n\t\t\t","_ "*len(Dict),"\n\n")
i = 0
while i < len(Dict):
    letter = str(input("\n\nEnter an alphabet: "))
    if letter == "" or letter not in 'abcdefghijklmnopqrstuvwxyz' or len(letter) != 1:
        print("\n\n\t\tPlease Enter Some valid thing\n\n")
        time.sleep(2)
        i = i - 1
    if letter in Dict:
        Dash.append(letter)
    else:
        print("This is not in the word")
        i = i - 1
    for item in Dict:
        if item in Dash:
            print(item, end = " ")
        else:
            print("_", end = " ")
    i = i + 1
The error is with the "break" on Line 25: once you have filled in one space with the letter "p", the loop breaks and will not fill in the second space with "p".
You need to have a flag variable to remember whether any space has been successfully filled in, like this:
success = False
for c in range(len(Dict)):
    if x == Dict[c]:
        Dash[c] = x
        success = True
if not success:
    Lives -= 1
P.S. There's something wrong with the indentation of the code you have posted.
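The flag approach above can be sketched as a small self-contained function. The names reveal, dash and lives, the sample word and the starting number of lives are assumptions for illustration; dash is assumed to be a list with one placeholder per letter of the word.

```python
def reveal(word, dash, guess):
    """Fill every position where guess occurs; return True if any was filled."""
    success = False
    for c in range(len(word)):
        if guess == word[c]:
            dash[c] = guess
            success = True  # remember the hit, but keep scanning the word
    return success


word = "happy"
dash = ["_"] * len(word)
lives = 5  # hypothetical starting value

for guess in ["p", "z"]:
    if not reveal(word, dash, guess):
        lives -= 1

print(" ".join(dash), lives)
```

Because the loop never stops early, both occurrences of 'p' are filled in by a single guess.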