Remove & add split-list using dictionary python [duplicate] - python-3.x

I have the code below. I'm trying to remove a pair of strings from the lists predict_strings and test_strings if one of them is found in the other. The issue is that I have to split each of them up and check whether a "portion" of one string occurs inside the other. If it does, I count that as a match and delete both strings from their lists so they are no longer iterated over.
ValueError: list.remove(x): x not in list
I get the above error though and I am assuming this is because I can't delete the string from test_strings since it is being iterated over? Is there a way around this?
Thanks
for test_string in test_strings[:]:
    for predict_string in predict_strings[:]:
        split_string = predict_string.split('/')
        for string in split_string:
            if (split_string in test_string):
                no_matches = no_matches + 1
                # Found match so remove both
                test_strings.remove(test_string)
                predict_strings.remove(predict_string)
Example input:
test_strings = ['hello/there', 'what/is/up', 'yo/do/di/doodle', 'ding/dong/darn']
predict_strings =['hello/there/mister', 'interesting/what/that/is']
so I want there to be a match between hello/there and hello/there/mister and for them to be removed from the list when doing the next comparison.
After one iteration I expect it to be:
test_strings == ['what/is/up', 'yo/do/di/doodle', 'ding/dong/darn']
predict_strings == ['interesting/what/that/is']
After the second iteration I expect it to be:
test_strings == ['yo/do/di/doodle', 'ding/dong/darn']
predict_strings == []

You should never try to modify an iterable while you're iterating over it, which is still effectively what you're trying to do. Make a set to keep track of your matches, then remove those elements at the end.
Also, your line for string in split_string: isn't really doing anything. You're not using the variable string. Either remove that loop, or change your code so that you're using string.
You can use augmented assignment to increase the value of no_matches.
no_matches = 0
found_in_test = set()
found_in_predict = set()
for test_string in test_strings:
    test_set = set(test_string.split("/"))
    for predict_string in predict_strings:
        split_strings = set(predict_string.split("/"))
        if not split_strings.isdisjoint(test_set):
            no_matches += 1
            found_in_test.add(test_string)
            found_in_predict.add(predict_string)
for element in found_in_test:
    test_strings.remove(element)
for element in found_in_predict:
    predict_strings.remove(element)

From your code it seems likely that two split_strings match the same test_string. The first time through the loop removes test_string, the second time tries to do so but can't, since it's already removed!
You can try breaking out of the inner for loop if it finds a match, or use any instead.
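import itertools

for test_string, predict_string in itertools.product(test_strings[:], predict_strings[:]):
    # Skip pairs whose members were already removed by an earlier match.
    if test_string not in test_strings or predict_string not in predict_strings:
        continue
    if any(s in test_string for s in predict_string.split('/')):
        no_matches += 1  # isn't this counter-intuitive?
        test_strings.remove(test_string)
        predict_strings.remove(predict_string)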
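As an alternative to removing items in place, the matching can be sketched as a function that builds the remaining lists. This is only a sketch under assumptions: the name remove_matches is made up for illustration, a "match" is taken to mean that the two strings share at least one '/'-separated part, and each predict string is consumed by at most one test string, as in the expected output from the question.

```python
def remove_matches(test_strings, predict_strings):
    """Return (remaining_test, remaining_predict, match_count)."""
    remaining_predict = list(predict_strings)  # work on a copy
    remaining_test = []
    matches = 0
    for test_string in test_strings:
        test_parts = set(test_string.split("/"))
        for predict_string in remaining_predict:
            if test_parts & set(predict_string.split("/")):
                # Removing here is safe because we leave the loop right away.
                remaining_predict.remove(predict_string)
                matches += 1
                break
        else:
            # No predict string matched, so the test string is kept.
            remaining_test.append(test_string)
    return remaining_test, remaining_predict, matches


test_strings = ['hello/there', 'what/is/up', 'yo/do/di/doodle', 'ding/dong/darn']
predict_strings = ['hello/there/mister', 'interesting/what/that/is']
left_test, left_predict, count = remove_matches(test_strings, predict_strings)
print(left_test, left_predict, count)
```

The original lists are left untouched, which avoids the ValueError entirely because nothing is removed from a list that is being looped over without an immediate break.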

Related

How do I replace and update a string multiple times in Python?

I'm working on a quiz program and need some help. I'm trying to replace words one at a time, but Python isn't saving the previously replaced string. Here is a mini example of what I mean:
replacedQuiz=""
easyQuiz = """
You can change a string variable to an integer by typing (__1__)
in front of the variable. It also works vice versa, you can change an
integer
variable to a string by typing (__2__). This is important to remember before
you __3__ strings together, or else a TypeError will occur. While adding an
integer to a string, it is important to separate it using a __4__ (use the
symbol). \n"""
def replaceWord(replaced, quiz, numCount):
    if numCount == 1:
        replaced = quiz.replace("__1__", "int")
    if numCount == 2:
        replaced = replaced.replace("__2__", "str")
    if numCount == 3:
        replaced= replaced.replace("__3__", "concatenate")
    if numCount == 4:
        replaced= replaced.replace("__4__", ",")
    print replaced
def easy():
    QCount=1
    print easyQuiz
    while QCount < 5:
        replaceWord(replacedQuiz, easyQuiz, QCount)
        QCount += 1
print easy()
I thought that by making a String called replacedQuiz, it would save the first replacement and then I could continue replacing the words inside the quiz and updating it. Please help! I don't know where I'm going wrong
You seem to have made a slight mistake with the scope of your variable replacedQuiz (I'd certainly suggest that you read up on this topic). Inside replaceWord, the new value is assigned to the local parameter replaced, so the change is only visible within that function call. The global replacedQuiz you defined earlier is never updated. There are several ways to fix this (e.g. the global keyword), but the standard way is to return the new value from your function.
To do so, add the following line to the end of your replaceWord function:
return replaced
This tells Python to hand this value back to the line where the function was called. You can then update replacedQuiz within easy by assigning it the returned value (set replacedQuiz = "" inside easy first, otherwise the assignment makes it a local name that is read before it is set):
replacedQuiz = replaceWord(replacedQuiz, easyQuiz, QCount)
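The return-and-reassign pattern can be sketched as follows. This is a minimal Python 3 illustration rather than the asker's program: the shortened quiz text, the ANSWERS dictionary, and the names replace_word and fill_all are assumptions made for the example.

```python
QUIZ = "Type (__1__) to get an integer and (__2__) to get a string."

# Hypothetical answer table: placeholder number -> replacement text.
ANSWERS = {1: "int", 2: "str"}


def replace_word(text, num):
    """Return a new string with placeholder number num filled in."""
    return text.replace("__%d__" % num, ANSWERS[num])


def fill_all(quiz):
    current = quiz
    for num in sorted(ANSWERS):
        # Keep the returned string so the next replacement builds on it.
        current = replace_word(current, num)
    return current


filled = fill_all(QUIZ)
print(filled)
```

Because strings are immutable, replace never changes the original text. The caller has to hold on to the returned value for the replacements to accumulate.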

Saving ord(characters) from different lines(one string) in different lists

I just can't figure it out.
I have a string with several lines:
qual="abcdefg\nabcedfg\nabcdefg"
I want to convert the characters to their ASCII values and save those values in a separate list for each line:
value=[[1,2,3,4,5,6],[1,2,3,4,5,6],[1,2,3,4,5,6]]
But my code saves them all in one list:
values=[1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6]
First of all my code:
for element in qual:
    qs = ord(element)
    quality_code.append(qs)
I also tried to split() the string but the result is still the same
qual=line#[:-100]
qually=qual.split()
for list in qually:
    for element in list:
        qs = ord(element)
        quality.append(qs)
My next attempt was:
for element in qual:
    qs = ord(element)
    quality_code.append(qs)
for position in range(0, len(quality_code)):
    qual_liste[position].append(quality_code[position])
With this code an IndexError (list index out of range) occurs.
There is probably a way with try and except, but I don't get it.
for element in qual:
    qs = ord(element)
    quality_code.append(qs)
for position in range(0, len(quality_code)):
    try:
        qual_liste[position].append(quality_code[position])
    except IndexError:
        pass
With this code qual_liste stays empty, probably because of the pass, but I don't know what to insert instead of pass.
Thanks a lot for the help. I hope my bad English is excusable :D
Here you go, this should do the trick:
qual="abcdefg\nabcedfg\nabcdefg"
print([[ord(ii) for ii in i] for i in qual.split('\n')])
List comprehension is always the answer.
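For readers who find the nested comprehension hard to follow, the same logic can be written as explicit loops. This is a sketch: the function name ord_per_line and the short sample string are made up for illustration.

```python
def ord_per_line(text):
    """Return one list of character codes per line of text."""
    result = []
    for line in text.split("\n"):
        codes = []  # a fresh inner list for every line
        for char in line:
            codes.append(ord(char))
        result.append(codes)
    return result


sample = "ab\ncd"
nested = ord_per_line(sample)
print(nested)
```

The key difference from the attempts in the question is that a new inner list is created for each line and appended to the outer list once the line is finished.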

Python3 - Advice on a while loop with range?

Good afternoon! I am relatively new to Python - and am working on an assignment for a class.
The goal of this code is to download a file, add a line of data to the file, then create a while loop that iterates through each line of data, and prints out the city name and the highest average temp from the data for that city.
My code is below - I have the output working, no problem. The only issue I am running into is an IndexError: list index out of range - at the end.
I have searched on StackOverflow - as well as digging into the range() function documentation online with Python. I think I just need to figure out how to use range() properly, and I'd be done with it.
If I take out the range, I get the same error - so I tried to change the for/in to - for city in mean_temps:
The result of that was that the output only showed 4 of the 7 cities - skipping every other city.
Any advice would be greatly appreciated!
Here is my code:
!curl https://raw.githubusercontent.com/MicrosoftLearning/intropython/master/world_temp_mean.csv -o mean_temp.txt
mean_temps = open('mean_temp.txt', 'a+')
mean_temps.write("Rio de Janeiro,Brazil,30.0,18.0")
mean_temps.seek(0)
headings = mean_temps.readline().split(',')
print(headings)
while mean_temps:
    range(len(city_temp))
    for city in mean_temps:
        city_temp = mean_temps.readline().split(',')
        print(headings[0].capitalize(),"of", city_temp[0],headings[2], "is", city_temp[2], "Celsius")
mean_temps.close()
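You have used a while loop when you actually want to use a for loop. The condition of your while loop is the file object mean_temps, which is always truthy, so the loop never ends on its own. You should use a for loop in the pattern
for item in items:
    do stuff
In your case, you will want to use
for x in range(len(city_temp)):
for city in mean_temps:
If you iterate over the file directly, use the loop variable city instead of calling readline() again inside the loop. Doing both consumes two lines per pass, which is why every other city was skipped.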
EDIT:
If you have to use a while loop, you could have a variable, x, that is incremented inside the loop. The while loop could run while x is less than len(city_temp).
A basic example is
text = "hi"
counter = 0
while counter < 10:
    print(text)
    counter += 1
EDIT 2:
You also said that they expected you to get out of a while loop. If you want a while loop to run forever unless a condition is met later, you can use the break command to stop a while or for loop.
I've been stuck with this as well with the index error. My original code was:
city_temp = mean_temp.readline().strip(" \n").split(",")
while city_temp:
    print("City of",city_temp[0],headings[2],city_temp[2],"Celsius")
    city_temp = mean_temp.readline().split(",")
So I read the line then, in the loop, print the line, create the list from reading the line and if the list is empty, or false, break. Problem is I was getting the same error as yourself and this is because city_temp is still true after reading the last line. If you add..
print(city_temp)
to your code you will see that at the end of the file city_temp is [''], not an empty list. readline() returns an empty string at the end of the file, and splitting an empty string on "," gives a list containing one empty string. A non-empty list is truthy, so the loop runs once more and city_temp[2] raises the IndexError.
The solution I found was to readline into a string first (or at the end of the whole loop) before creating the list:
city_temp = mean_temp.readline()
while city_temp:
    city_temp = city_temp.split(',')
    print(headings[0].capitalize(),"of",city_temp[0],headings[2],"is",city_temp[2],"Celsius")
    city_temp = mean_temp.readline()
This time the while loop checks city_temp as a string, and the empty string returned at the end of the file is falsy, so the loop stops. Hope this helps, from someone else who struggled with this.
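The read-then-test pattern can be tried without downloading anything by using an in-memory file. This is a sketch under assumptions: io.StringIO stands in for the opened mean_temp.txt, and the header and the two data rows are invented sample data, not the contents of the real file.

```python
import io

# Stand-in for the opened file; the rows below are made-up sample data.
mean_temps = io.StringIO(
    "city,country,month ave: highest high,month ave: lowest low\n"
    "Beijing,China,30.9,-8.4\n"
    "Cairo,Egypt,34.7,1.2\n"
)

headings = mean_temps.readline().strip().split(",")
lines = []

line = mean_temps.readline()
while line:  # readline() returns '' at end of file, which is falsy
    city_temp = line.strip().split(",")
    lines.append(
        f"{headings[0].capitalize()} of {city_temp[0]} "
        f"{headings[2]} is {city_temp[2]} Celsius"
    )
    line = mean_temps.readline()
mean_temps.close()

for text in lines:
    print(text)

# Splitting the empty string still yields a one-element, truthy list.
empty_split = "".split(",")
```

Testing the raw string before splitting is what makes the loop terminate cleanly, since empty_split above shows that the split result would still be truthy.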

Updating dictionary - Python

total=0
line=input()
line = line.upper()
names = {}
(tag,text) = parseLine(line) #initialize
while tag !="</PLAY>": #test
    if tag =='<SPEAKER>':
        if text not in names:
            names.update({text})
I seem to get this far and then draw a blank.. This is what I'm trying to figure out. When I run it, I get:
ValueError: dictionary update sequence element #0 has length 8; 2 is required
Make an empty dictionary
Which I did.
(its keys will be the names of speakers and its values will be how many times s/he spoke)
Within the if statement that checks whether a tag is <SPEAKER>
If the speaker is not in the dictionary, add him to the dictionary with a value of 1
I'm pretty sure I did this right.
If he already is in the dictionary, increment his value
I'm not sure.
You are close, the big issue is on this line:
names.update({text})
You are trying to make a dictionary entry from a string using {text}, but curly brackets around a single value create a set containing that one string. update then iterates over the set and tries to unpack each element into a key and a value. Your string has 8 characters instead of the required 2, hence the error.
To add a new entry do this instead:
names.update({text:1})
This will set the initial value.
Now, it seems like this is homework, but you've put in a bit of effort already, so while I won't answer the question I'll give you some broad pointers.
The next step is checking whether a key already exists in the dictionary. Python dictionaries have a get method that retrieves a value from the dictionary based on the key. For example:
> names = {'romeo': 1}
> print(names.get('romeo'))
1
But get returns None if the key doesn't exist:
> names = {'romeo': 1}
> print(names.get('juliet'))
None
get also takes an optional second argument, which is returned instead of None when the key is missing:
> names = {'romeo': 2}
> print(names.get('juliet', 1))
1
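One way the default argument of get can be used for counting is sketched below. It is an illustration only: the function name count_speakers and the sample list of names are assumptions, and the parsing of the play file is deliberately left out.

```python
def count_speakers(speakers):
    """Count how often each name occurs in the given sequence."""
    counts = {}
    for name in speakers:
        # get returns 0 for a name that has not been seen yet.
        counts[name] = counts.get(name, 0) + 1
    return counts


counts = count_speakers(["ROMEO", "JULIET", "ROMEO"])
print(counts)
```

With a default of 0, the first occurrence and every later occurrence are handled by the same line, so no separate "not in" check is needed.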
Also note that your loop as it stands will never end, as you only set tag once:
(tag,text) = parseLine(line) #initialize
while tag !="</PLAY>": #test
    # you need to set tag in here
    # and have an escape clause if you run out of file
The rest is left as an exercise for the reader...

Python: iterate through list and check for matching sub-string in specific parts of string

For all the strings in a list of strings, if either of the first two characters of one string matches either of the first two characters of another (in any order), then check whether either of the last two characters matches in the same position. If so, I will add an edge between the two vertices in graph G.
Example:
d = ['BEBC', 'ABRC']
since the 'B' in the first two characters and the 'C' in the second two characters match, I will add an edge. I'm fairly new to Python and what I have come up with through previous searches seems overly verbose:
for i in range(0,len(d)-1):
    for j in range(0,len(d)-1):
        if (d[i][0] in d[j+1][:2] or d[i][1] in d[j+1][:2]) and \
           (d[i][2] in d[j+1][2] or d[i][3] in d[j+1][3]):
            G.add_edge(d[i],d[j+1])
The next step on this is to come up with a faster way to iterate through since there will probably only be 1 to 3 edges connecting each node, so 90% of the iteration test will come back false. Suggestions would be welcome!
Since you know that the last character of each list item has to match in the same position, it is less expensive to check for that first. Otherwise the code compares the leading characters even when the pair cannot match. Using timeit you can measure the difference in run time after making a few changes, such as checking the last characters first:
import timeit
d = ['BEBC', 'ABRC']
def test1():
    # Cheap check first: compare the last characters by value (==, not identity).
    if d[0][len(d[0])-1] == d[1][len(d[1])-1]:
        for i in range(0,2):
            if d[0][i] in d[1][:2]:
                return (d[0],d[1])
print(test1())
print(timeit.timeit(stmt=test1, number=1000000))
Result:
('BEBC', 'ABRC')
2.3587113980001959
Original Code:
d = ['BEBC', 'ABRC']
def test2():
    for i in range(0,len(d)-1):
        for j in range(0,len(d)-1):
            if (d[i][0] in d[j+1][:2] or d[i][1] in d[j+1][:2]) and \
               (d[i][2] in d[j+1][2] or d[i][3] in d[j+1][3]):
                return (d[i],d[j+1])
print(test2())
print(timeit.timeit(stmt=test2, number=1000000))
Result:
('BEBC', 'ABRC')
3.1525327970002763
Now let's take the last list value and change it so that the last character C does not match:
d = ['BEBC', 'ABRX']
New Code:
None
0.766526217000318
Original:
None
2.963771982000253
This is where the order of the checks obviously pays off, especially if 90% of the iteration checks come back false.
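The same idea can be sketched for a whole list rather than a fixed pair. The assumptions here are mine: the function name find_edges is made up, the rule follows the question's wording (third or fourth characters equal in position, plus a shared character among the first two), each unordered pair is checked once, and the edges are collected in a plain list instead of being added to a graph.

```python
from itertools import combinations


def find_edges(items):
    """Return the pairs of four-character strings that should be connected."""
    edges = []
    for a, b in combinations(items, 2):
        # Cheap positional test first: most pairs are rejected here.
        if a[2] != b[2] and a[3] != b[3]:
            continue
        # Either of the first two characters shared, in any order.
        if set(a[:2]) & set(b[:2]):
            edges.append((a, b))
    return edges


matching = find_edges(['BEBC', 'ABRC'])
non_matching = find_edges(['BEBC', 'ABRX'])
print(matching, non_matching)
```

Each returned pair could then be passed to G.add_edge. Using combinations also avoids comparing a string with itself.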
