Removing list element while iterating in python3 - python-3.x

I am trying to remove list elements (numeric values) while iterating through the list. I have two examples. Example 1 works but example 2 doesn't, even though both use the same logic.
Example 1: Working
list1=["5","a","6","c","f","9","r"]
print(list1)
for i in list1:
    if str.isnumeric(i):
        list1.remove(i)
print(list1)
Example 2: Not working
list2=["12abc1","45asd"]
for items in list2:
    item_list=list(items)
    print(item_list)
    for i in item_list:
        if str.isnumeric(i):
            item_list.remove(i)
    print(item_list)
I solved example 2 by using (for i in item_list[:]:), but I can't understand why the second example didn't work in the first place.

I can't claim to be an expert in Python, as I'm only slightly familiar with it, but I'll give you an explanation of what I think is happening.
The first example doesn't actually work any better than the second; the data you used to test it just happens to hide the problem. The problem is that you're iterating through the list and modifying it at the same time, so the following happens in the second example:
The program will iterate through its given list:
["1", "2", "a", "b","c", "1"]
The program starts with the first list item. It is numeric, so it is removed. The list is now different:
["2", "a", "b", "c", "1"]
The loop then moves on to the second position. This is problematic, as the second item is now "a" rather than "2", so the "2" is skipped and never removed.
As the numbers in the first example are always separated by at least one other list item, the item skipped after each removal is never a number, so every number is still visited and removed.
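To make the skipping visible, here is a small sketch that records which items the loop actually visits. The function name remove_numeric_in_place and the visited list are mine, added purely for illustration; the loop body is the same remove-while-iterating pattern as in the question.

```python
def remove_numeric_in_place(chars):
    """Deliberately buggy: removes items from the list it is iterating over."""
    visited = []
    for ch in chars:
        visited.append(ch)      # record every item the loop actually sees
        if ch.isnumeric():
            chars.remove(ch)    # shifts all later items one position left
    return visited


chars = list("12abc1")
visited = remove_numeric_in_place(chars)
print(visited)  # ['1', 'a', 'b', 'c', '1']  ("2" was never visited)
print(chars)    # ['2', 'a', 'b', 'c']       (so "2" is still in the list)
```

The "2" never shows up in visited because it slid into position 0 right after the loop had already moved past that position.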
As for the fix you mentioned, I couldn't reproduce it: when I changed list2 to list2[:] and ran the program through PythonTutor's visualizer it didn't seem to work. Note that the slice has to be taken on the inner list (item_list[:]), since that makes a copy to iterate over while the original is being modified.
To fix this, the most obvious solution to me would be to go through the list backwards, starting with the final item and moving towards the start, since removing an item then doesn't shift the positions of the items you have yet to visit.
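A minimal sketch of the backwards approach, plus the common alternative of building a new list instead of modifying the old one. Both function names are made up for this example.

```python
def strip_numeric_backwards(chars):
    """Delete numeric items in place, walking from the last index to the first."""
    for idx in range(len(chars) - 1, -1, -1):
        if chars[idx].isnumeric():
            del chars[idx]  # only shifts items that were already visited
    return chars


def strip_numeric_new_list(chars):
    """Leave the input untouched and return a filtered copy."""
    return [ch for ch in chars if not ch.isnumeric()]


backwards_result = strip_numeric_backwards(list("12abc1"))
new_list_result = strip_numeric_new_list(list("45asd"))
print(backwards_result)  # ['a', 'b', 'c']
print(new_list_result)   # ['a', 's', 'd']
```

The list comprehension version is usually the easier one to read, as long as you don't need the original list object to be changed in place.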
Hope I helped!

Related

Python list append based on substring search - slow performance

In a list of lists, I need to add a list element to each inner list, whenever one or more elements of another list are contained in a fixed position element of the inner list itself.
Here's an example of the lists
list1 = ['AS23X2', '33YK87', 'YY744Q']
list2 = [[0, 1773332, 'some text that may contain 0, 1 or more occurrences of list1 items'], [1, 77666543, 'some other text 33YK87 is here']]
Note that len(list1) is about 95,000 and len(list2) is over 120,000. The requirement is that if more than one item of list1 is found within list2[n][2], they are all appended as a list.
The below code does exactly what is required, but is very slow (takes several minutes). I can't figure out how to improve performance - can anyone suggest a possible solution?
for i in list2:
    i.append([x for x in list1 if x in i[2]])
Please do consider that list2 is derived from a Pandas dataframe:
list2 = df2.values.tolist()
I'm quite confident there's something more efficient that could be achieved using Pandas, but I'm new to it and hope someone already solved a similar question in a better way.
Thanks
I'm just spitballing ideas:
Use a database
Use a multithreading library
Try to do something with a set if the dataset includes many duplicates
Or try using Counter from the collections library to remove duplicates but keep the occurrence counts. I'm not sure if this will be faster given your dataset
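To flesh out the set idea a little, here is a rough sketch. It rests on an assumption that is not stated in the question: that the list1 codes always appear in the text as whole alphanumeric tokens, never glued inside a longer word. If that holds, each text can be split into tokens once and checked against a set, instead of scanning the text once per code. The function name append_matches and the text_index parameter are made up for this example.

```python
import re

TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+")


def append_matches(rows, codes, text_index=2):
    """Append to each row the list of codes found as whole tokens in its text."""
    code_set = set(codes)  # average O(1) membership test per token
    for row in rows:
        tokens = TOKEN_PATTERN.findall(row[text_index])
        # dict.fromkeys drops repeated tokens while keeping first-seen order
        row.append([tok for tok in dict.fromkeys(tokens) if tok in code_set])
    return rows


list1 = ['AS23X2', '33YK87', 'YY744Q']
list2 = [
    [0, 1773332, 'text with no codes in it'],
    [1, 77666543, 'some other text 33YK87 is here'],
]
append_matches(list2, list1)
print(list2[1][3])  # ['33YK87']
```

Two differences from the original loop: matches come back in the order they appear in the text rather than in list1 order, and a code embedded inside a longer token would not be found. If true substring matching is required, this shortcut does not apply.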

String lookup failed on Iteration over a list using pandas dataframe

I have a list of strings, and I am trying to search a pandas DataFrame column for them and delete any rows containing an element of that list.
Here is the code to search a specific column, then remove a row containing a substring of text in quotes. In this case, all rows containing 'dave' in the Owner_Name column would be removed. This works great by itself, exactly as expected.
df = df[~df.Owner_Name.str.contains('dave')]
When I try to automate this over a list of 54 or so elements, it gets hung up and only removes some, but not all. Any idea why?
Here is my simple code for the loop (a mock-up to show what I am doing, not my actual code):
badWords= ['random stuff','code words','secret squirrel','blue','black','dave']
for word in badWords:
    df = df[~df.Owner_Name.str.contains(word)]
    print('Total Rows Left',df.shape[0], word)
I am not getting any errors, but it certainly isn't working the way I want. For example, after the loop, there are still 'dave' elements in the Owner_Name column, even though it supposedly looped through the whole list. I even put in breadcrumbs to print the element being passed, so the loop is running, but it is as if str.contains('') is not removing the rows properly. I made sure the case of my list items matches the case in the df, so that shouldn't be an issue. I am really stumped and can't find anything on Stack about this specific issue.
Adding the answer here which worked:
badWords= ['random stuff','code words','secret squirrel','blue','black','dave']
for word in badWords:
    df = df[~df.Owner_Name.str.contains(word,case=False)]
    print('Total Rows Left',df.shape[0], word)
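As a possible alternative to looping, the words can be joined into a single case-insensitive pattern so the column only has to be scanned once. The sketch below shows the idea with plain Python's re module; owner_names is made-up sample data and build_bad_word_pattern is a name invented for this example.

```python
import re


def build_bad_word_pattern(words):
    """Join the words into one alternation, escaping any regex metacharacters."""
    return "|".join(re.escape(word) for word in words)


bad_words = ['random stuff', 'code words', 'secret squirrel', 'blue', 'black', 'dave']
pattern = build_bad_word_pattern(bad_words)
matcher = re.compile(pattern, flags=re.IGNORECASE)

# Made-up sample data standing in for the Owner_Name column
owner_names = ['Dave Smith', 'DAVE JONES', 'Alice', 'Blue Sky LLC', 'Bob']
kept = [name for name in owner_names if not matcher.search(name)]
print(kept)  # ['Alice', 'Bob']
```

With pandas, the same pattern string could presumably be used in one step as df[~df.Owner_Name.str.contains(pattern, case=False, na=False)], where na=False treats missing names as non-matches. I haven't run that against the asker's data.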

Python: using list comprehension to count first element in list of numbers

I'm trying to teach myself list comprehension in Python, but I find it quite tricky compared to regular loops and it is hard to find good beginner examples of list comprehension.
The basic example below supplies a list of numbers and asks for generated sentences such as "2 numbers start with 1."
my_list = [232, 379, 985, 384, 129, 197]
2 numbers start with 1
1 number starts with 2
2 numbers start with 3
1 number starts with 9
If I was going to do this in a loop, I might bring back the first digit in each like this and then count them and put them in print statements (this just shows how I might start out in a loop):
for x in range(len(my_list)):
    strList = (str(my_list[x]))
    if strList[0]:
        print(strList[0])
I'm so confused about how to bring back element [0] in list comprehension.
I know there is a sum available in list comprehension, so I'm trying to start like this below to create a count (this isn't right though) and I don't know how to retrieve the first elements back out of this so I can piece together sentences like "2 numbers start with 1":
count = [sum(x) for x in my_list if my_list[0]]
print(count,' numbers start with', start_digit)
Thanks for any help with understanding list comprehension. It looks much better than loops in terms of being more concise so I want to learn it.
Perhaps the reason why you're getting confused here is that this particular problem doesn't seem like something that list comprehension would solve.
If you only need to get the first digits of the items, then list comprehension can do the trick:
start_digits = [str(x)[0] for x in my_list]
Getting the occurrences of each item is a completely different story. You can implement it in a variety of ways, and if you're not against importing modules, you can use collections.Counter to get the occurrence counts.
from collections import Counter
Counter(start_digits)
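Putting the two pieces together, one way to get from the counts to the sentences in the question might look like this. The helper name describe is made up, and sorting by digit is just my choice to match the order of the expected output.

```python
from collections import Counter


def describe(digit, count):
    """Build one sentence, switching to the singular form for a count of 1."""
    noun = "number" if count == 1 else "numbers"
    verb = "starts" if count == 1 else "start"
    return f"{count} {noun} {verb} with {digit}"


my_list = [232, 379, 985, 384, 129, 197]
start_digits = [str(x)[0] for x in my_list]   # first character of each number
counts = Counter(start_digits)                # maps digit -> occurrences

sentences = [describe(digit, count) for digit, count in sorted(counts.items())]
for sentence in sentences:
    print(sentence)
# 2 numbers start with 1
# 1 number starts with 2
# 2 numbers start with 3
# 1 number starts with 9
```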

select sublists with items that have multiple occurrences throughout list

I have a nested list of integers ranging from 1 to 5 (not really). I want to ensure that each integer occurs at least once in the list, and if one is missing, to replace a sublist with a list that contains the missing integer. (I have a full set of possible sublists to choose from.) I'm having trouble working out the syntax for ensuring that the removed list contains only integers that have multiple occurrences, so that I don't recreate the missing integer problem I'm attempting to solve. Here's an example:
a = [[2], [4], [1], [1, 2], [1,2,5]]
Notice 3 is missing. If I randomly choose the second or fifth sublist for replacement then either the 4 or 5 will be missing. I need to choose the first, third or fourth sublist, where each of the sublist elements i has a list.count(i) > 1.
Therefore I want to create a new list of viable selection candidates. I believe the solution should look something like this
b = [item for item in a if sum(a.count(i)) > 1 for i in item]
but Python3 is complaining that
UnboundLocalError: local variable 'i' referenced before assignment.
Any suggestions? Note: the algorithm will need to be able to scale to thousands of sublists, but this would rarely happen because the probability of a missing integer in those cases becomes nearly 0.
Thanks for looking!
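One possible way to express the selection described above is to count every integer across all sublists once, then keep the sublists whose elements all occur more than once. This is only a sketch: the function name replaceable_sublists is made up, and it assumes no integer is repeated inside a single sublist (otherwise that sublist's own repeats would inflate the count).

```python
from collections import Counter
from itertools import chain


def replaceable_sublists(nested):
    """Return the sublists whose every element also occurs elsewhere in nested."""
    counts = Counter(chain.from_iterable(nested))  # one pass over all integers
    return [sub for sub in nested if all(counts[i] > 1 for i in sub)]


a = [[2], [4], [1], [1, 2], [1, 2, 5]]
candidates = replaceable_sublists(a)
print(candidates)  # [[2], [1], [1, 2]]
```

Because the counting is done once up front, this should scale to thousands of sublists more comfortably than calling count inside the comprehension.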

How can I use a string to determine the location of an object within a list?

Let's say I have a list of the alphabet
myList=["a","b","c"..."z"]
Now let's say we have a variable within a loop that takes a random letter from the list. Obviously random is imported.
while True:
    ans=myList[random.randint(1,26)]
I want the user to be asked to take a guess at a letter, so within the loop I add
guess=input('Take a guess at a letter from the alphabet')
The user will receive a clue on the whereabouts of the answer
print('The letter locates between x and x.')
Question: how can I determine the position of ans in myList, so that I can pick two random values, one below ans and one above ans, and perhaps assign them to variables?
The range would always be random between these two values, so ans is not always the median of the two values.
P.S. I would put the script together to give a better view of what it looks like, but unfortunately I find the formatting help very confusing, and highlighting pieces of code and pressing Ctrl+K does not work as simply as I expected.
The position is the output of the random call, right?
You can save that to a variable before indexing into myList. Note that valid indices for a 26-item list run from 0 to 25, and random.randint includes both endpoints:
index = random.randint(0, 25)
ans = myList[index]
Use
myList.index(ans)
For the above code to work, ans must be in myList, otherwise it will raise a ValueError.
BTW, this question is similar to "Finding the index of an item given a list containing it in Python".
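Building on the index idea, a clue could be generated roughly like this. It is a sketch, not the asker's script: the function name make_clue and the rng parameter are made up, and the bounds are inclusive, so a bound may land on the answer itself (this keeps "a" and "z" from failing at the edges of the list).

```python
import random
import string


def make_clue(letters, rng=random):
    """Pick a random answer plus one random bound at or below and one at or above it."""
    index = rng.randrange(len(letters))           # valid indices: 0 .. len - 1
    low = rng.randint(0, index)                   # somewhere at or before the answer
    high = rng.randint(index, len(letters) - 1)   # somewhere at or after the answer
    return letters[index], letters[low], letters[high]


my_list = list(string.ascii_lowercase)
ans, low_letter, high_letter = make_clue(my_list)
print(f"The letter is located between {low_letter} and {high_letter}.")
```

Since the two bounds are drawn independently, the answer is generally not the midpoint between them, which matches the requirement in the question.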
