Removing a string that startswith a specific char Python - python-3.x

text='I miss Wonderland #feeling sad #omg'
prefix=('#','#')
for line in text:
    if line.startswith(prefix):
        text=text.replace(line,'')
print(text)
The output should be:
'I miss Wonderland'
But my output is the original string with only the '#' characters removed.

So it seems that you do not in fact want to remove the whole "string" or "line", but rather the word? Then you'll want to split your string into words:
words = text.split(' ')
Now iterate through each element in words, checking whether it starts with the prefix. Lastly, combine the remaining elements back into one string:
result = ""
for word in words:
    if not word.startswith(prefix):
        result += (word + " ")
result = result.rstrip() # drop the trailing space

In your case, for line in text iterates over each character in the text, not each word. So when it reaches, e.g., the '#' in '#feeling', it removes the '#', but 'feeling' remains because none of its other characters is '#'. You can confirm that your code goes character by character by running:
for line in text:
    print(line)
Try the following instead, which does the filtering in a single line:
text = 'I miss Wonderland #feeling sad #omg'
prefix = ('#','#')
words = text.split() # Split the text into a list of its individual words.
# Join only those words that don't start with prefix
print(' '.join([word for word in words if not word.startswith(prefix)]))
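As a hedged sketch, the same filter can be wrapped in a small function. The names drop_prefixed_words and cut_at_first_prefix are hypothetical and do not come from the question. Filtering word by word keeps 'sad', so the result is 'I miss Wonderland sad'. If the goal really is 'I miss Wonderland', one possible reading (an assumption) is that everything from the first '#' onward should be discarded, which the second function illustrates.

```python
def drop_prefixed_words(text, prefixes=("#",)):
    """Return text without the words that start with any of the prefixes."""
    # str.startswith accepts a tuple of prefixes.
    return " ".join(w for w in text.split() if not w.startswith(tuple(prefixes)))


def cut_at_first_prefix(text, prefix="#"):
    """Assumed alternative: discard everything from the first prefix onward."""
    return text.split(prefix, 1)[0].rstrip()


text = "I miss Wonderland #feeling sad #omg"
print(drop_prefixed_words(text))   # I miss Wonderland sad
print(cut_at_first_prefix(text))   # I miss Wonderland
```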

Related

Need to use a 'for loop' for this one. The user has to enter a sentence, and any spaces must be replaced with "%"

The input:
sentence = input("Please enter a sentence:")
The for loop (incorrect here):
for i in sentence:
    print(sentence)
space_loc = sentence.index(" ")
for c in sentence:
    print(space_loc)
for b in range(space_loc):
    print("%")
I'm confused about how to get the answer out.
You can try using string concatenation and slicing for this one.
sentence = input()
After taking the input, store the length of your string:
length = len(sentence)
Then iterate through every character in the string. When you find a " ", split the string into two parts using slicing, one on each side of the space, and join them with a "%":
for i in range(length):
    if sentence[i]==" ":
        sentence = sentence[:i] + "%" + sentence[i+1:]
Here, sentence[:i] is the part of the string before the space and sentence[i+1:] is the part of the string after the space. Because one character replaces another, the length stays the same, so the indices remain valid.
One way of solving your query:
Code
sentence = input("Please enter a sentence:")
ls=sentence.split() #Creating a list of words present in sentence
new_sentence='%'.join(ls) #Joining the list with '%'
print(new_sentence)
Output
Please enter a sentence:Hello there coders!
Hello%there%coders!
EDIT
I do not understand how exactly you want to use the for loop here. If you just want to include a for loop (no restrictions), then you can do this:
Code
ls=[]
a=0
sentence = input("Please enter a sentence:")
# This loop finds the words in the sentence and stores them in a list.
# Words are determined by checking the white space. Each space is replaced with '%'.
for i in range(0,len(sentence)):
    if sentence[i]==' ':
        ls.append(sentence[a:i])
        a=i
        ls.append('%')
ls.append(sentence[a:]) # This is to save the last word
ls1=[]
for i in ls: # Removing any white space inside the list
    j=i.replace(' ','')
    ls1.append(j)
print(''.join(ls1)) # Displaying final output
Again, your question is very open-ended, and this is just one way of using a for loop to get the desired result!
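For comparison, a minimal sketch of a shorter for-loop variant: build a new string one character at a time. The function name replace_spaces and its marker parameter are hypothetical. Unlike the split()/join() version, this keeps one "%" per space, so consecutive spaces are not collapsed.

```python
def replace_spaces(sentence, marker="%"):
    """Return sentence with every space replaced by marker, using a for loop."""
    result = ""
    for ch in sentence:
        # Copy each character, swapping spaces for the marker.
        result += marker if ch == " " else ch
    return result


print(replace_spaces("Hello there coders!"))  # Hello%there%coders!
```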

How to loop through multiple string variables in a for-loop?

I have multiple string variables that store some text data. I want to perform the same set of tasks on all strings. How can I achieve this in Python?
string_1 = "this is string 1"
string_2 = "this is string 2"
string_3 = "this is string 3"
for words in string_1:
    # return the second word
Above is just an example. I want to extract the second word in every string. Can I do something like:
for words in [string_1, string_2, string_3]:
    # return the second word in each string
You can use a list comprehension to collect the second word of each string. split() breaks a sentence into its words by splitting on whitespace.
>>> lines = [string_1, string_2, string_3]
>>> lines[0].split()
['this', 'is', 'string', '1']
>>> [line.split()[1] if len(line.split()) > 1 else None for line in lines]
['is', 'is', 'is']
Edit: added a conditional check to prevent indexing failures.
Yes, you could do:
for sentence in [string_1, string_2, string_3]:
    print(sentence.split(' ')[1]) # Get second word and do something with it
This works assuming that each string has at least two words, separated by single spaces.
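A small sketch that combines both answers: the per-string task lives in a function, and the loop calls it for each string. The helper name second_word is hypothetical, and returning None for strings with fewer than two words is an assumption about the desired behaviour.

```python
def second_word(sentence):
    """Return the second whitespace-separated word, or None if there is none."""
    words = sentence.split()
    return words[1] if len(words) > 1 else None


string_1 = "this is string 1"
string_2 = "this is string 2"
string_3 = "this is string 3"

for sentence in [string_1, string_2, string_3]:
    print(second_word(sentence))  # prints "is" three times
```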

Finding average number of words and sentences in paragraph

I have a text file from which I need to find the average number of words per sentence and the average number of sentences per paragraph, without using regex. A sentence is a sequence of words followed by a full stop, comma, or exclamation mark, which in turn must be followed either by a quotation mark (so the sentence is the end of a quote or spoken utterance) or by white space (a space, tab, or newline character). A paragraph is any number of sentences followed by a blank line or by the end of the text.
I created a list of characters, i.e. [".", ",", "!", "\n", "\t", " "], as the problem describes, and then iterated over the entire text file.
with open("/Users/abhishekabhishek/downloads/l.txt") as f:
    text_lis = f.read()
# print(text_lis)
sentence_count = 0
ens_sentence = [".", ",", "!", "\n", "\t", " "]
for word in ens_sentence:
    if word in text_lis:
        sentence_count += 1
# print(sentence_count)
# sentence_count gave me the wrong output so I tried splitting it
# using text_lis.split(".") so that I can count the sentences
s = text_lis.split(".")
# then for the average number of words per sentence
char_len = 0
for line in s:
    words = line.split(" ")
    for word in words:
        char_len += len(word.split())
average_number_of_words = char_len/len(words)
The actual output must be the average number of sentences per paragraph and the average number of words per sentence. The approach I tried gave me the wrong output because certain words in the file, for example Dr., also contain such punctuation, and when I used text_lis.split(".") those were counted as the end of a sentence too.
Here is the sample text:
I would love to try or hear the sample audio your app can produce. I do not want to purchase, because I've purchased so many apps that say they do something and do not deliver.
Can you please add audio samples with text you've converted? I'd love to see the end results.
Thanks!
THE AUTHOR.
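One possible approach, sketched under the question's own definition and without regex: scan each paragraph character by character and count a terminator only when the next character is a quotation mark or white space, or when it is the last character. The names count_sentences and text_averages are hypothetical, and the sketch assumes paragraphs are separated by exactly one empty line.

```python
SENTENCE_ENDERS = (".", ",", "!")          # terminators named in the question
AFTER_ENDER = ('"', "'", " ", "\t", "\n")  # quotation mark or white space


def count_sentences(paragraph):
    """Count terminators followed by a quote, white space, or the end of text."""
    count = 0
    for i, ch in enumerate(paragraph):
        if ch in SENTENCE_ENDERS:
            at_end = i + 1 == len(paragraph)
            if at_end or paragraph[i + 1] in AFTER_ENDER:
                count += 1
    return count


def text_averages(text):
    """Return (words per sentence, sentences per paragraph)."""
    # Assumption: paragraphs are separated by exactly one empty line.
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    sentences = sum(count_sentences(p) for p in paragraphs)
    words = sum(len(p.split()) for p in paragraphs)
    if not paragraphs or sentences == 0:
        return 0.0, 0.0
    return words / sentences, sentences / len(paragraphs)


sample = "Hello there, friend. How are you!\n\nFine.\n"
print(text_averages(sample))  # (1.75, 2.0)
```

This rule alone does not solve the abbreviation problem: "Dr." followed by a space still counts as a sentence end. Handling that would need an extra check, for example a list of known abbreviations, which is not part of the question's definition.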

Get only one word from line

How can I take only one word from a line in a file and save it in a string variable?
For example, my file has the line "this, line, is, super" and I want to save only the first word ("this") in the variable word. I tried to read it character by character until I reached ",", but when I check it I get the error "Argument of type 'int' is not iterable". How can I do this?
line = file.readline() # reading "this, line, is, super"
if "," in len(line): # checking, if it contains ','
    for i in line:
        if "," not in line[i]: # while character is not ',' -> this is where I get error
            word += line[i] # add it to my string
You can do it like this, using split():
line = file.readline()
if "," in line:
    split_line = line.split(",")
    first_word = split_line[0]
    print(first_word)
split() will create a list where each element is, in your case, a word. Commas will not be included.
At a glance, you are on the right track, but a few things are wrong, and you can spot them if you always consider what data type is stored where. First, your conditional 'if "," in len(line)' doesn't make sense, because it translates to 'if "," in 21'; this is the source of the "not iterable" error. Second, you iterate over each character in line, so your value for i is not what you think. You want the index of the character at that point in the loop so you can check whether "," is there, but line[i] is not something like line[0]; it is actually line['t']. It is easy to assume that i is always an integer index into your string, but for that you need to iterate over a range of integers equal to the length of the line and look up the character at each index. I have reworked your code to behave the way you intended, giving word = "this", with these clarifications in mind. I hope you find this instructive (there are shorter ways and built-in methods to do this, but understanding indices is crucial in programming). Assuming line is the string "this, line, is, super":
word = "" # start with an empty string to collect characters
if "," in line: # checking that the string, not the number 21, has a comma
    for i in range(0, len(line)): # for each index in the range 0 -> 21
        if line[i] != ",": # e.g. if line[0] does not equal comma
            word += line[i] # add character to your string
        else:
            break # break out of loop when encounter first comma, thus storing only first word
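One of the shorter built-in routes mentioned above, as a hedged sketch: str.partition splits at the first separator only, so the rest of the line is never scanned into a list. The function name first_field is hypothetical. If the line contains no separator, the whole line (stripped) is returned, which is an assumption about the desired fallback.

```python
def first_field(line, sep=","):
    """Return the text before the first separator, stripped of white space."""
    head, _, _ = line.partition(sep)
    return head.strip()


print(first_field("this, line, is, super\n"))  # this
```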

parsing words in a document using specific delimiters

I have a document that I'm parsing words from, but I want to consider anything that is not a-z, A-Z, 0-9, or an apostrophe to be white space. How could I do this, given that I am currently using the following bit of code?
ifstream file;
file.open(filePath);
while(file >> word){
    listOfWords.push_back(word); // I want to make sure only words with the stated
                                 // range of characters exist in my list.
}
So, for example, the word hor.se would be two elements in my list, "hor" and "se".
Create a list of "whitespace characters" and then each time you encounter a character, check to see if that character is in the list and if so you've started a new word. This example is written in python, but the concept is the same.
def get_words(whitespace_chars, string):
    words = []
    current_word = ""
    for x in range(0, len(string)):
        if string[x] in whitespace_chars:
            # we hit a delimiter: store the word collected so far, if any.
            if current_word != "":
                words.append(current_word)
                current_word = ""
        else:
            # add current letter to current word.
            current_word += string[x]
    # if the last letter isn't whitespace then the last word won't be added, so add it here.
    if current_word != "":
        words.append(current_word)
    return words
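Since the question defines the characters that belong to a word rather than the delimiters, the check can also be inverted. The following is a sketch in Python, like the answer above, not a C++ solution; the names WORD_CHARS and split_words are hypothetical. Every character outside a-z, A-Z, 0-9 and the apostrophe ends the current word.

```python
import string

# Characters that may appear inside a word, per the question.
WORD_CHARS = set(string.ascii_letters + string.digits + "'")


def split_words(text):
    """Split text into words, treating every non-word character as white space."""
    words = []
    current = []
    for ch in text:
        if ch in WORD_CHARS:
            current.append(ch)
        elif current:
            # A delimiter ends the word collected so far.
            words.append("".join(current))
            current = []
    if current:
        # Flush the last word if the text does not end with a delimiter.
        words.append("".join(current))
    return words


print(split_words("hor.se isn't 42!"))  # ['hor', 'se', "isn't", '42']
```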
