list of numbers to a stack back to a list - python-3.x

Imagine four railroad cars positioned on the input side of the track in the figure above, numbered 1, 2, 3, and 4, respectively. Suppose we perform the following sequence of operations (which is compatible with the direction of the arrows in the diagram and does not require cars to "jump over" other cars):
As a result of these operations the original order of the cars, 1234, has been changed into 2431.
The operations above can be more concisely described by the code SSXSSXXX, where S stands for move a car from the input into the stack, and X stands for move a car from the stack into the output. Some sequences of S's and X's specify meaningless operations, since there may be no cars available on the specified track; for example, the sequence SXXSSXXS cannot be carried out. (Try it to see why.)
Write and test a function that emulates the train car switching:
# [import statements]
import q2_fun
# [constants]
# [rest of program code]
cars = [1, 2, 3, 4]
s_x = input("enter a code with s's and x's to move one stack to another")
list1 = q2_fun.train_swicth(cars, s_x)
print(list1)
from stack_array import Stack
def train_swicth(cars, s_x):
    s = Stack()
    list1 = []
    for i in range(len(s_x)):
        if s_x[i] == "s":
            a = s_x.append()
            s.push(a)
        elif s_x[i] == "x":
            b = s.pop()
            list1.append(b)
    return list1
I keep getting [] as the return and it should be 2431 with ssxssxxx. Can I get some help?

if I understood you right you could do:
def train_swicth(cars, s_x):
    i = 0
    s = []
    out = []
    for c in s_x:
        if c == "s":
            s.append(cars[i])
            i += 1
        elif c == "x":
            out.append(s.pop())
    return out
as lists can be used as stacks with append as push-operation
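A slightly more defensive sketch of the same idea, in case it helps. The name train_switch and the ValueError checks are my own additions (not part of the assignment); they reject sequences such as SXXSSXXS that ask for a car that is not available, and the code is lower-cased so that S and s are treated alike:

```python
def train_switch(cars, code):
    """Apply a string of s/x moves to `cars` and return the output track."""
    stack = []        # a plain list used as a stack
    output = []
    next_car = 0      # index of the next car waiting on the input track
    for step, move in enumerate(code.lower()):
        if move == "s":
            if next_car >= len(cars):
                raise ValueError(f"step {step}: no car left on the input track")
            stack.append(cars[next_car])   # push
            next_car += 1
        elif move == "x":
            if not stack:
                raise ValueError(f"step {step}: the stack is empty")
            output.append(stack.pop())     # pop
        else:
            raise ValueError(f"step {step}: unknown move {move!r}")
    return output


print(train_switch([1, 2, 3, 4], "ssxssxxx"))  # [2, 4, 3, 1]
```

Calling train_switch([1, 2, 3, 4], "sxxssxxs") raises ValueError at the second x, because the stack is empty at that point.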

Why does my For Loop skip over elements in my list? [duplicate]

I'm iterating over a list of tuples in Python, and am attempting to remove them if they meet certain criteria.
for tup in somelist:
    if determine(tup):
        code_to_remove_tup
What should I use in place of code_to_remove_tup? I can't figure out how to remove the item in this fashion.
You can use a list comprehension to create a new list containing only the elements you don't want to remove:
somelist = [x for x in somelist if not determine(x)]
Or, by assigning to the slice somelist[:], you can mutate the existing list to contain only the items you want:
somelist[:] = [x for x in somelist if not determine(x)]
This approach could be useful if there are other references to somelist that need to reflect the changes.
Instead of a comprehension, you could also use itertools. In Python 2:
from itertools import ifilterfalse
somelist[:] = ifilterfalse(determine, somelist)
Or in Python 3:
from itertools import filterfalse
somelist[:] = filterfalse(determine, somelist)
The answers suggesting list comprehensions are almost correct, except that they build a completely new list and then give it the same name as the old list; they do not modify the old list in place. That's different from what you'd be doing by selective removal, as in Lennart's suggestion: it's faster, but if your list is accessed via multiple references, the fact that you're just reseating one of the references and not altering the list object itself can lead to subtle, disastrous bugs.
Fortunately, it's extremely easy to get both the speed of list comprehensions AND the required semantics of in-place alteration—just code:
somelist[:] = [tup for tup in somelist if not determine(tup)]
Note the subtle difference with other answers: this one is not assigning to a barename. It's assigning to a list slice that just happens to be the entire list, thereby replacing the list contents within the same Python list object, rather than just reseating one reference (from the previous list object to the new list object) like the other answers.
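A small sketch of the difference, for illustration only. The determine function and the sample values here are made up; determine returns True for items that should be removed, as in the question:

```python
def determine(x):
    # Hypothetical criterion: remove even numbers.
    return x % 2 == 0


# Rebinding the bare name: the other reference still sees the old list.
original = [1, 2, 3, 4]
alias = original
original = [x for x in original if not determine(x)]
print(original, alias)   # [1, 3] [1, 2, 3, 4]

# Slice assignment: the same list object is updated, so every reference agrees.
shared = [1, 2, 3, 4]
view = shared
shared[:] = [x for x in shared if not determine(x)]
print(shared, view)      # [1, 3] [1, 3]
```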
You need to take a copy of the list and iterate over it first, or the iteration will fail with what may be unexpected results.
For example (depends on what type of list):
for tup in somelist[:]:
    etc....
An example:
>>> somelist = list(range(10))
>>> for x in somelist:
...     somelist.remove(x)
...
>>> somelist
[1, 3, 5, 7, 9]
>>> somelist = list(range(10))
>>> for x in somelist[:]:
...     somelist.remove(x)
...
>>> somelist
[]
for i in range(len(somelist) - 1, -1, -1):
    if some_condition(somelist, i):
        del somelist[i]
You need to go backwards otherwise it's a bit like sawing off the tree-branch that you are sitting on :-)
Python 2 users: replace range by xrange to avoid creating a hardcoded list
Overview of workarounds
Either:
use a linked list implementation/roll your own.
A linked list is the proper data structure to support efficient item removal, and does not force you to make space/time tradeoffs.
A CPython list is implemented with dynamic arrays as mentioned here, which is not a good data type to support removals.
There doesn't seem to be a linked list in the standard library however:
Is there a linked list predefined library in Python?
https://github.com/ajakubek/python-llist
start a new list() from scratch, and .append() back at the end as mentioned at: https://stackoverflow.com/a/1207460/895245
This is time efficient, but less space efficient because it keeps an extra copy of the array around during iteration.
use del with an index as mentioned at: https://stackoverflow.com/a/1207485/895245
This is more space efficient since it dispenses with the array copy, but it is less time efficient, because removal from dynamic arrays requires shifting all following items back by one, which is O(N).
Generally, if you are doing it quick and dirty and don't want to add a custom LinkedList class, you just want to go for the faster .append() option by default unless memory is a big concern.
Official Python 2 tutorial 4.2. "for Statements"
https://docs.python.org/2/tutorial/controlflow.html#for-statements
This part of the docs makes it clear that:
you need to make a copy of the iterated list to modify it
one way to do it is with the slice notation [:]
If you need to modify the sequence you are iterating over while inside the loop (for example to duplicate selected items), it is recommended that you first make a copy. Iterating over a sequence does not implicitly make a copy. The slice notation makes this especially convenient:
>>> words = ['cat', 'window', 'defenestrate']
>>> for w in words[:]:  # Loop over a slice copy of the entire list.
...     if len(w) > 6:
...         words.insert(0, w)
...
>>> words
['defenestrate', 'cat', 'window', 'defenestrate']
Python 2 documentation 7.3. "The for statement"
https://docs.python.org/2/reference/compound_stmts.html#for
This part of the docs says once again that you have to make a copy, and gives an actual removal example:
Note: There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, i.e. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g.,
for x in a[:]:
    if x < 0: a.remove(x)
However, I disagree with this implementation, since .remove() has to iterate the entire list to find the value.
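The skipping described in the quoted note is easy to reproduce. A minimal sketch with made-up values, where two negative numbers sit next to each other:

```python
a = [1, -1, -2, 3]
for x in a:              # iterating over the list that is being modified
    if x < 0:
        a.remove(x)
print(a)                 # [1, -2, 3]: -2 slid into the freed slot and was skipped

b = [1, -1, -2, 3]
for x in b[:]:           # iterating over a slice copy instead
    if x < 0:
        b.remove(x)
print(b)                 # [1, 3]
```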
Could Python do this better?
It seems like this particular Python API could be improved. Compare it, for instance, with:
Java ListIterator::remove which documents "This call can only be made once per call to next or previous"
C++ std::vector::erase which returns a valid iterator to the element after the one removed
both of which make it crystal clear that you cannot modify a list being iterated except with the iterator itself, and give you efficient ways to do so without copying the list.
Perhaps the underlying rationale is that Python lists are assumed to be dynamic array backed, and therefore any type of removal will be time inefficient anyways, while Java has a nicer interface hierarchy with both ArrayList and LinkedList implementations of ListIterator.
There doesn't seem to be an explicit linked list type in the Python stdlib either: Python Linked List
Your best approach for such an example would be a list comprehension
somelist = [tup for tup in somelist if not determine(tup)]
In cases where you're doing something more complex than calling a determine function, I prefer constructing a new list and simply appending to it as I go. For example
newlist = []
for tup in somelist:
    # lots of code here, possibly setting things up for calling determine
    if not determine(tup):
        newlist.append(tup)
somelist = newlist
Iterating over a copy of the list and calling remove on the original might make your code look a little cleaner, as described in one of the answers below. You should definitely not do this for extremely large lists, since this involves first copying the entire list, and also performing an O(n) remove operation for each element being removed, making this an O(n^2) algorithm.
for tup in somelist[:]:
    # lots of code here, possibly setting things up for calling determine
    if determine(tup):
        somelist.remove(tup)
For those who like functional programming:
somelist[:] = filter(lambda tup: not determine(tup), somelist)
or
from itertools import ifilterfalse
somelist[:] = list(ifilterfalse(determine, somelist))
I needed to do this with a huge list, and duplicating the list seemed expensive, especially since in my case the number of deletions would be few compared to the items that remain. I took this low-level approach.
array = [lots of stuff]
arraySize = len(array)
i = 0
while i < arraySize:
    if someTest(array[i]):
        del array[i]
        arraySize -= 1
    else:
        i += 1
What I don't know is how efficient a couple of deletes are compared to copying a large list. Please comment if you have any insight.
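One way to find out is to measure it. The following is only a rough sketch: the list size, the is_rare test and the helper names are assumptions chosen to mimic "few deletions in a large list", and the numbers will vary by machine and by how many items are deleted.

```python
import timeit


def delete_in_place(data, test):
    """Delete matching items one at a time with del (shifts the tail each time)."""
    i = 0
    while i < len(data):
        if test(data[i]):
            del data[i]
        else:
            i += 1
    return data


def rebuild(data, test):
    """Build a filtered copy, then write it back into the same list object."""
    data[:] = [x for x in data if not test(x)]
    return data


def is_rare(x):
    # Hypothetical test that matches only a few items.
    return x % 10_000 == 0


def best_time(func, size=100_000, repeats=5):
    # Each run gets a fresh list; building it is included in the measured time.
    return min(timeit.repeat(lambda: func(list(range(size)), is_rare),
                             number=1, repeat=repeats))


print("del in place:", best_time(delete_in_place))
print("rebuild     :", best_time(rebuild))
```

Each del shifts every later item, so the in-place version does more work as the number of deletions grows, while the rebuild always costs one pass plus one temporary list.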
Most of the answers here want you to create a copy of the list. I had a use case where the list was quite long (110K items) and it was smarter to keep reducing the list instead.
First of all you'll need to replace foreach loop with while loop,
i = 0
while i < len(somelist):
    if determine(somelist[i]):
        del somelist[i]
    else:
        i += 1
The value of i is not changed in the if block because you'll want to get value of the new item FROM THE SAME INDEX, once the old item is deleted.
It might be smart to also just create a new list if the current list item meets the desired criteria.
so:
newList = []
for item in originalList:
    if (item != badValue):
        newList.append(item)
and to avoid having to re-code the entire project with the new lists name:
originalList[:] = newList
note, from Python documentation:
copy.copy(x)
Return a shallow copy of x.
copy.deepcopy(x)
Return a deep copy of x.
This answer was originally written in response to a question which has since been marked as duplicate:
Removing coordinates from list on python
There are two problems in your code:
1) When using remove(), you attempt to remove integers whereas you need to remove a tuple.
2) The for loop will skip items in your list.
Let's run through what happens when we execute your code:
>>> L1 = [(1,2), (5,6), (-1,-2), (1,-2)]
>>> for (a,b) in L1:
...     if a < 0 or b < 0:
...         L1.remove(a,b)
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
TypeError: remove() takes exactly one argument (2 given)
The first problem is that you are passing both 'a' and 'b' to remove(), but remove() only accepts a single argument. So how can we get remove() to work properly with your list? We need to figure out what each element of your list is. In this case, each one is a tuple. To see this, let's access one element of the list (indexing starts at 0):
>>> L1[1]
(5, 6)
>>> type(L1[1])
<type 'tuple'>
Aha! Each element of L1 is actually a tuple. So that's what we need to be passing to remove(). Tuples in python are very easy, they're simply made by enclosing values in parentheses. "a, b" is not a tuple, but "(a, b)" is a tuple. So we modify your code and run it again:
# The remove line now includes an extra "()" to make a tuple out of "a,b"
L1.remove((a,b))
This code runs without any error, but let's look at the list it outputs:
L1 is now: [(1, 2), (5, 6), (1, -2)]
Why is (1,-2) still in your list? It turns out modifying the list while using a loop to iterate over it is a very bad idea without special care. The reason that (1, -2) remains in the list is that the locations of each item within the list changed between iterations of the for loop. Let's look at what happens if we feed the above code a longer list:
L1 = [(1,2),(5,6),(-1,-2),(1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
### Outputs:
L1 is now: [(1, 2), (5, 6), (1, -2), (3, 4), (5, 7), (2, 1), (5, -1), (0, 6)]
As you can infer from that result, every time that the conditional statement evaluates to true and a list item is removed, the next iteration of the loop will skip evaluation of the next item in the list because its values are now located at different indices.
The most intuitive solution is to copy the list, then iterate over the original list and only modify the copy. You can try doing so like this:
L2 = L1
for (a,b) in L1:
    if a < 0 or b < 0 :
        L2.remove((a,b))
# Now, remove the original copy of L1 and replace with L2
print(L2 is L1)
del L1
L1 = L2; del L2
print ("L1 is now: ", L1)
However, the output will be identical to before:
'L1 is now: ', [(1, 2), (5, 6), (1, -2), (3, 4), (5, 7), (2, 1), (5, -1), (0, 6)]
This is because when we created L2, python did not actually create a new object. Instead, it merely referenced L2 to the same object as L1. We can verify this with 'is' which is different from merely "equals" (==).
>>> L2=L1
>>> L1 is L2
True
We can make a true copy using copy.copy(). Then everything works as expected:
import copy
L1 = [(1,2), (5,6),(-1,-2), (1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
L2 = copy.copy(L1)
for (a,b) in L1:
    if a < 0 or b < 0 :
        L2.remove((a,b))
# Now, remove the original copy of L1 and replace with L2
del L1
L1 = L2; del L2
>>> L1 is now: [(1, 2), (5, 6), (3, 4), (5, 7), (2, 1), (0, 6)]
Finally, there is one cleaner solution than having to make an entirely new copy of L1. The reversed() function:
L1 = [(1,2), (5,6),(-1,-2), (1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
for (a,b) in reversed(L1):
    if a < 0 or b < 0 :
        L1.remove((a,b))
print ("L1 is now: ", L1)
>>> L1 is now: [(1, 2), (5, 6), (3, 4), (5, 7), (2, 1), (0, 6)]
reversed() does not copy its argument. When a list is passed to it, it returns a reverse iterator (a 'listreverseiterator' object in Python 2) that walks the same list from the last index down to the first. Removing the current item therefore only shifts items that the loop has already visited, so nothing is skipped. This is the solution I recommend.
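A short sketch of the reversed() approach on a smaller, made-up list. It also prints the type of the object that reversed() returns, which in CPython 3 is an iterator over the original list, not a new list:

```python
points = [(1, 2), (-1, -2), (1, -2), (3, 4)]

rev = reversed(points)
print(type(rev).__name__)      # an iterator type, not 'list'

for a, b in rev:               # visits index 3, 2, 1, 0 of the same list
    if a < 0 or b < 0:
        points.remove((a, b))  # only already-visited items shift down

print(points)                  # [(1, 2), (3, 4)]
```

Note that remove() deletes the first equal element, so with duplicate tuples it may delete a different (but equal) entry than the one currently being visited.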
If you want to delete elements from a list while iterating, use a while-loop so you can alter the current index and end index after each deletion.
Example:
i = 0
length = len(list1)
while i < length:
    if condition:
        del list1[i]  # delete by index; remove() would search for the first equal value
        i -= 1
        length -= 1
    i += 1
The other answers are correct that it is usually a bad idea to delete from a list that you're iterating. Reverse iterating avoids some of the pitfalls, but it is much more difficult to follow code that does that, so usually you're better off using a list comprehension or filter.
There is, however, one case where it is safe to remove elements from a sequence that you are iterating: if you're only removing one item while you're iterating. This can be ensured using a return or a break. For example:
for i, item in enumerate(lst):
    if item % 4 == 0:
        foo(item)
        del lst[i]
        break
This is often easier to understand than a list comprehension when you're doing some operations with side effects on the first item in a list that meets some condition and then removing that item from the list immediately after.
If you want to do anything else during the iteration, it may be nice to get both the index (which guarantees you being able to reference it, for example if you have a list of dicts) and the actual list item contents.
inlist = [{'field1':10, 'field2':20}, {'field1':30, 'field2':15}]
xlist = []  # indices of the items to delete afterwards
for idx, i in enumerate(inlist):
    # do some stuff with i['field1']
    if somecondition:
        xlist.append(idx)
for i in reversed(xlist): del inlist[i]
enumerate gives you access to the item and the index at once. reversed is so that the indices that you're going to later delete don't change on you.
One possible solution, useful if you want not only remove some things, but also do something with all elements in a single loop:
alist = ['good', 'bad', 'good', 'bad', 'good']
i = 0
for x in alist[:]:
    if x == 'bad':
        alist.pop(i)
        i -= 1
    # do something cool with x or just print x
    print(x)
    i += 1
A for loop iterates over a list by index.
Consider a list:
[5, 7, 13, 29, 35, 65, 91]
Suppose you have a list variable called lis, and you remove items from that same list while looping over it.
Your variable:
lis = [5, 7, 13, 29, 35, 65, 91]
       0  1   2   3   4   5   6
During the 5th iteration (index 4), the number 35 is not a prime, so you remove it from the list:
lis.remove(y)
The next value (65) then moves into the freed index:
lis = [5, 7, 13, 29, 65, 91]
       0  1   2   3   4   5
With the iteration for index 4 done, the loop's internal pointer moves on to index 5.
That's why your loop never examines 65: it has moved into the previous index.
Also, don't assign the list to another variable and expect a copy; the new name still references the original list:
ite = lis  # Don't do this; it creates a reference, not a copy
Instead, make a copy of the list using lis[:] and iterate over the copy.
Now you will get:
[5, 7, 13, 29]
The problem is that you removed a value from the list during iteration, which shifts the indices of the remaining items.
So you can try a list comprehension instead, which works with any iterable, such as a list, tuple, dict, or string.
You might also want to use the built-in filter().
For more details check here
You can try for-looping in reverse so for some_list you'll do something like:
list_len = len(some_list)
for i in range(list_len):
    reverse_i = list_len - 1 - i
    cur = some_list[reverse_i]
    # some logic with cur element
    if some_condition:
        some_list.pop(reverse_i)
This way the index is aligned and doesn't suffer from the list updates (regardless whether you pop cur element or not).
I needed to do something similar and in my case the problem was memory - I needed to merge multiple dataset objects within a list, after doing some stuff with them, as a new object, and needed to get rid of each entry I was merging to avoid duplicating all of them and blowing up memory. In my case having the objects in a dictionary instead of a list worked fine:
```
k = range(5)
v = ['a','b','c','d','e']
d = {key:val for key,val in zip(k, v)}
print(d)
for i in range(5):
    print(d[i])
    d.pop(i)
print(d)
```
The most effective method is a list comprehension, as many of the other answers show. Getting an iterator through filter is also a good way.
filter receives a function and a sequence. It applies the function to each element in turn and keeps or discards the element depending on whether the function returns a true or false value.
Here is an example (get the odd numbers in a tuple):
list(filter(lambda x:x%2==1, (1, 2, 4, 5, 6, 9, 10, 15)))
# result: [1, 5, 9, 15]
Caution: You do not have to convert the iterator to a list. Iterators are sometimes better than sequences.
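To illustrate the point about iterators: in Python 3, filter returns a lazy iterator, so items are only tested when they are requested. A small sketch using the same tuple as above:

```python
numbers = (1, 2, 4, 5, 6, 9, 10, 15)

odds = filter(lambda x: x % 2 == 1, numbers)  # nothing is evaluated yet
first = next(odds)                            # tests items only until one matches
rest = list(odds)                             # consumes whatever is left

print(first)  # 1
print(rest)   # [5, 9, 15]
```

Because the iterator is consumed as it is read, convert it to a list if you need to go over the result more than once.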
TLDR:
I wrote a library that allows you to do this:
from fluidIter import FluidIterable
fSomeList = FluidIterable(someList)
for tup in fSomeList:
    if determine(tup):
        # remove 'tup' without "breaking" the iteration
        fSomeList.remove(tup)
        # tup has also been removed from 'someList'
        # as well as 'fSomeList'
It's best to use another method if possible that doesn't require modifying your iterable while iterating over it, but for some algorithms it might not be that straight forward. And so if you are sure that you really do want the code pattern described in the original question, it is possible.
Should work on all mutable sequences not just lists.
Full answer:
Edit: The last code example in this answer gives a use case for why you might sometimes want to modify a list in place rather than use a list comprehension. The first part of the answers serves as tutorial of how an array can be modified in place.
The solution follows on from this answer (for a related question) from senderle, which explains how the array index is updated while iterating through a list that has been modified. The solution below is designed to correctly track the array index even if the list is modified.
Download fluidIter.py from https://github.com/alanbacon/FluidIterator. It is just a single file, so there is no need to install git. There is no installer, so you will need to make sure that the file is on the Python path yourself. The code has been written for Python 3 and is untested on Python 2.
from fluidIter import FluidIterable
l = [0,1,2,3,4,5,6,7,8]
fluidL = FluidIterable(l)
for i in fluidL:
    print('initial state of list on this iteration: ' + str(fluidL))
    print('current iteration value: ' + str(i))
    print('popped value: ' + str(fluidL.pop(2)))
    print(' ')
print('Final List Value: ' + str(l))
This will produce the following output:
initial state of list on this iteration: [0, 1, 2, 3, 4, 5, 6, 7, 8]
current iteration value: 0
popped value: 2
initial state of list on this iteration: [0, 1, 3, 4, 5, 6, 7, 8]
current iteration value: 1
popped value: 3
initial state of list on this iteration: [0, 1, 4, 5, 6, 7, 8]
current iteration value: 4
popped value: 4
initial state of list on this iteration: [0, 1, 5, 6, 7, 8]
current iteration value: 5
popped value: 5
initial state of list on this iteration: [0, 1, 6, 7, 8]
current iteration value: 6
popped value: 6
initial state of list on this iteration: [0, 1, 7, 8]
current iteration value: 7
popped value: 7
initial state of list on this iteration: [0, 1, 8]
current iteration value: 8
popped value: 8
Final List Value: [0, 1]
Above we have used the pop method on the fluid list object. Other common iterable methods are also implemented such as del fluidL[i], .remove, .insert, .append, .extend. The list can also be modified using slices (sort and reverse methods are not implemented).
The only condition is that you must only modify the list in place, if at any point fluidL or l were reassigned to a different list object the code would not work. The original fluidL object would still be used by the for loop but would become out of scope for us to modify.
i.e.
fluidL[2] = 'a' # is OK
fluidL = [0, 1, 'a', 3, 4, 5, 6, 7, 8] # is not OK
If we want to access the current index value of the list we cannot use enumerate, as this only counts how many times the for loop has run. Instead we will use the iterator object directly.
fluidArr = FluidIterable([0,1,2,3])
# get iterator first so can query the current index
fluidArrIter = fluidArr.__iter__()
for i, v in enumerate(fluidArrIter):
    print('enum: ', i)
    print('current val: ', v)
    print('current ind: ', fluidArrIter.currentIndex)
    print(fluidArr)
    fluidArr.insert(0,'a')
    print(' ')
print('Final List Value: ' + str(fluidArr))
This will output the following:
enum: 0
current val: 0
current ind: 0
[0, 1, 2, 3]
enum: 1
current val: 1
current ind: 2
['a', 0, 1, 2, 3]
enum: 2
current val: 2
current ind: 4
['a', 'a', 0, 1, 2, 3]
enum: 3
current val: 3
current ind: 6
['a', 'a', 'a', 0, 1, 2, 3]
Final List Value: ['a', 'a', 'a', 'a', 0, 1, 2, 3]
The FluidIterable class just provides a wrapper for the original list object. The original object can be accessed as a property of the fluid object like so:
originalList = fluidArr.fixedIterable
More examples / tests can be found in the if __name__ is "__main__": section at the bottom of fluidIter.py. These are worth looking at because they explain what happens in various situations, such as replacing large sections of the list using a slice, or using (and modifying) the same iterable in nested for loops.
As I stated to start with: this is a complicated solution that will hurt the readability of your code and make it more difficult to debug. Therefore other solutions such as the list comprehensions mentioned in David Raznick's answer should be considered first. That being said, I have found times where this class has been useful to me and has been easier to use than keeping track of the indices of elements that need deleting.
Edit: As mentioned in the comments, this answer does not really present a problem for which this approach provides a solution. I will try to address that here:
List comprehensions provide a way to generate a new list but these approaches tend to look at each element in isolation rather than the current state of the list as a whole.
i.e.
newList = [i for i in oldList if testFunc(i)]
But what if the result of the testFunc depends on the elements that have been added to newList already? Or the elements still in oldList that might be added next? There might still be a way to use a list comprehension but it will begin to lose its elegance, and for me it feels easier to modify a list in place.
The code below is one example of an algorithm that suffers from the above problem. The algorithm will reduce a list so that no element is a multiple of any other element.
randInts = [70, 20, 61, 80, 54, 18, 7, 18, 55, 9]
fRandInts = FluidIterable(randInts)
fRandIntsIter = fRandInts.__iter__()
# for each value in the list (outer loop)
# test against every other value in the list (inner loop)
for i in fRandIntsIter:
    print(' ')
    print('outer val: ', i)
    innerIntsIter = fRandInts.__iter__()
    for j in innerIntsIter:
        innerIndex = innerIntsIter.currentIndex
        # skip the element that the outloop is currently on
        # because we don't want to test a value against itself
        if not innerIndex == fRandIntsIter.currentIndex:
            # if the test element, j, is a multiple
            # of the reference element, i, then remove 'j'
            if j%i == 0:
                print('remove val: ', j)
                # remove element in place, without breaking the
                # iteration of either loop
                del fRandInts[innerIndex]
            # end if multiple, then remove
        # end if not the same value as outer loop
    # end inner loop
# end outerloop
print('')
print('final list: ', randInts)
The output and the final reduced list are shown below
outer val: 70
outer val: 20
remove val: 80
outer val: 61
outer val: 54
outer val: 18
remove val: 54
remove val: 18
outer val: 7
remove val: 70
outer val: 55
outer val: 9
remove val: 18
final list: [20, 61, 7, 55, 9]
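For comparison, here is a sketch of the same reduction written with plain index loops instead of FluidIterable. This is not how the library works internally; the function name drop_multiples and the manual index bookkeeping are my own, and it assumes positive integers (a zero would cause a division by zero). It shows the bookkeeping that the wrapper hides:

```python
def drop_multiples(values):
    """Remove every element that is a multiple of some other element."""
    items = list(values)
    i = 0                                  # index of the reference element
    while i < len(items):
        j = 0                              # index of the element being tested
        while j < len(items):
            if j != i and items[j] % items[i] == 0:
                del items[j]
                if j < i:
                    i -= 1                 # the reference element shifted left
                # j stays put: the next item has moved into slot j
            else:
                j += 1
        i += 1
    return items


print(drop_multiples([70, 20, 61, 80, 54, 18, 7, 18, 55, 9]))
# [20, 61, 7, 55, 9]
```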
For anything that has the potential to be really big, I use the following.
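import numpy as np
orig_list = np.array([1, 2, 3, 4, 5, 100, 8, 13])
remove_me = [100, 1]
# np.delete expects indices, so filter by value with a boolean mask instead
cleaned = orig_list[~np.isin(orig_list, remove_me)]
print(cleaned)
That should be fast for large arrays, because the filtering is vectorized rather than done in a Python-level loop.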
In some situations, where you're doing more than simply filtering a list one item at time, you want your iteration to change while iterating.
Here is an example where copying the list beforehand is incorrect, reverse iteration is impossible and a list comprehension is also not an option.
""" Sieve of Eratosthenes """
def generate_primes(n):
    """ Generates all primes less than n. """
    primes = list(range(2,n))
    idx = 0
    while idx < len(primes):
        p = primes[idx]
        for multiple in range(p+p, n, p):
            try:
                primes.remove(multiple)
            except ValueError:
                pass #EAFP
        idx += 1
        yield p
I can think of three approaches to solve your problem. As an example, I will create a random list of tuples somelist = [(1,2,3), (4,5,6), (3,6,6), (7,8,9), (15,0,0), (10,11,12)]. The condition that I choose is sum of elements of a tuple = 15. In the final list we will only have those tuples whose sum is not equal to 15.
What I have chosen is a randomly chosen example. Feel free to change the list of tuples and the condition that I have chosen.
Method 1.> Use the framework that you had suggested (where one fills in a code inside a for loop). I use a small code with del to delete a tuple that meets the said condition. However, this method will miss a tuple (which satisfies the said condition) if two consecutively placed tuples meet the given condition.
for tup in somelist:
    if ( sum(tup)==15 ):
        del somelist[somelist.index(tup)]
print somelist
>>> [(1, 2, 3), (3, 6, 6), (7, 8, 9), (10, 11, 12)]
Method 2.> Construct a new list which contains elements (tuples) where the given condition is not met (this is the same thing as removing elements of list where the given condition is met). Following is the code for that:
newlist1 = [somelist[tup] for tup in range(len(somelist)) if(sum(somelist[tup])!=15)]
print newlist1
>>>[(1, 2, 3), (7, 8, 9), (10, 11, 12)]
Method 3.> Find indices where the given condition is met, and then use remove elements (tuples) corresponding to those indices. Following is the code for that.
indices = [i for i in range(len(somelist)) if(sum(somelist[i])==15)]
newlist2 = [tup for j, tup in enumerate(somelist) if j not in indices]
print newlist2
>>>[(1, 2, 3), (7, 8, 9), (10, 11, 12)]
Method 1 and method 2 are faster than method 3. Method2 and method3 are more efficient than method1. I prefer method2. For the aforementioned example, time(method1) : time(method2) : time(method3) = 1 : 1 : 1.7
If you will use the new list later, you can simply set the element to None and then skip it in the later loop, like this:
for idx, elem in enumerate(li):
    if determine(elem):
        li[idx] = None  # assigning to the loop variable alone would not change the list
for elem in li:
    if elem is None:
        continue
In this way, you don't need to copy the list, and it's easier to understand.

Find all cycles with at least 3 nodes in a directed graph using dictionary data structure

The above graph was drawn using LaTeX: https://www.overleaf.com/read/rxhpghzbkhby
The above graph is represented as a dictionary in Python.
graph = {
'A' : ['B','D', 'C'],
'B' : ['C'],
'C' : [],
'D' : ['E'],
'E' : ['G'],
'F' : ['A', 'I'],
'G' : ['A', 'K'],
'H' : ['F', 'G'],
'I' : ['H'],
'J' : ['A'],
'K' : []
}
I have a large graph of about 3,378,546 nodes.
Given the above directed graph, I am trying to find cycles with at least 3 and fewer than 5 different nodes, and output the first 3 cycles.
I spent 1 day and a half on this problem. I looked in Stackoverflow and even tried to follow this Detect Cycle in a Directed Graph tutorial but couldn't come up with a solution.
In this example, the output is a tab-delimited text file where each line has a cycle.
0 A, D, E, G
1 F, I, H
0 and 1 are indexes.
Also, there is no order in the alphabet of the graph nodes.
I tried this, from the How to implement depth-first search in Python tutorial:
visited = set()
def dfs(visited, graph, node):
    if node not in visited:
        print (node)
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)
dfs(visited, graph, 'A')
But this doesn't help. I also tried this Post
Here is a commented code that would print the array containing the cycles found. Not much more would be necessary I believe to adjust the return value to the desired format (CSV in your case I think).
It could be that with 3M nodes, this turns out to be slow. I would then suggest going the dynamic programming way and caching/memoize the results of some recursions in order not to repeat them.
I hope this solves your problem or at least helps.
def cycles_rec(root, current_node, graph, depth, visited, min_depth, max_depth):
    depth += 1
    # First part our stop conditions
    if current_node in visited or current_node not in graph:
        return ''
    if depth >= max_depth:
        return ''
    visited.append(current_node)
    if root in graph[current_node] and depth >= min_depth:
        return current_node
    # The recursive part
    # for each connection we try to find recursively one that would cycle back to our root
    # (iterate over the neighbour list directly, so multi-character node names also work)
    for connection in graph[current_node]:
        result = cycles_rec(root, connection, graph, depth, visited, min_depth, max_depth)
        # If a match was found, it would "bubble up" here, we can return it along with the
        # current connection that "found it"
        if result != '':
            return current_node + ' ' + result
    # If we are here we found no cycle
    return ''
def cycles(graph, min_depth = 3, max_depth = 5):
    cycles = {}
    for node, connections in graph.items():
        for connection in connections:
            visited = []
            # Let the recursion begin here
            result = cycles_rec(node, connection, graph, 1, visited, min_depth, max_depth)
            if result == '':
                continue
            # Here we found a cycle.
            # Fingerprint is only necessary in order to not repeat the cycles found in the results
            # It could be ignored if repeating them is not important
            # It's based on the fact that nodes are all represented as letters here
            # It could be it's own function returning a hash for example if nodes have a more
            # complex representation
            fingerprint = ''.join(sorted(list(node + ' ' + result)))
            if fingerprint not in cycles.keys():
                cycles[fingerprint] = node + ' ' + result
    return list(cycles.values())
So, assuming the graph variable you declared in your example:
print(cycles(graph, 3, 5))
Would print out
['A D E G', 'F I H']
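For comparison, here is a minimal sketch of a bounded depth-first search that reports each simple cycle once, as a list of nodes. The name bounded_cycles and the min_len/max_len parameters are my own assumptions and not part of the code above; "fewer than 5 nodes" is expressed as max_len=4. I have only checked it against the small example graph, so treat it as a starting point rather than a tuned solution for millions of nodes.

```python
def bounded_cycles(graph, min_len=3, max_len=4):
    """Yield each simple cycle once, as a list of min_len..max_len nodes."""
    # Rank nodes by dictionary order. A cycle is only reported from its
    # lowest-ranked node, so rotations of the same cycle are not repeated.
    rank = {node: i for i, node in enumerate(graph)}

    def extend(path, on_path):
        for nxt in graph.get(path[-1], ()):
            if nxt == path[0]:
                # Back at the start node: the current path is a cycle.
                if len(path) >= min_len:
                    yield list(path)
            elif (nxt in rank and rank[nxt] > rank[path[0]]
                  and nxt not in on_path and len(path) < max_len):
                path.append(nxt)
                on_path.add(nxt)
                yield from extend(path, on_path)
                on_path.discard(path.pop())

    for start in graph:
        yield from extend([start], {start})


graph = {'A': ['B', 'D', 'C'], 'B': ['C'], 'C': [], 'D': ['E'], 'E': ['G'],
         'F': ['A', 'I'], 'G': ['A', 'K'], 'H': ['F', 'G'], 'I': ['H'],
         'J': ['A'], 'K': []}
found = list(bounded_cycles(graph))
print(found)  # [['A', 'D', 'E', 'G'], ['F', 'I', 'H']]
```

Because it is a generator, you can stop after the first 3 cycles (for example with itertools.islice) instead of enumerating everything.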
NOTE: This solution extends the one described above. I applied it to the original graph with ~3 million nodes, looking for all cycles that have at least 3 and fewer than 40 nodes, and storing the first 3 cycles in a file.
I came up with the following solution.
# implementation of Johnson's cycle finding algorithm
# Original paper: Donald B Johnson. "Finding all the elementary circuits of a directed graph." SIAM Journal on Computing. 1975.
from collections import defaultdict
import networkx as nx
from networkx.utils import not_implemented_for, pairwise
#not_implemented_for("undirected")
def findCycles(G):
"""Find simple cycles of a directed graph.
A `simple cycle` is a closed path where no node appears twice.
Two elementary circuits are distinct if they are not cyclic permutations of each other.
This is iterator/generator version of Johnson's algorithm [1]_.
There may be better algorithms for some cases [2]_ [3]_.
Parameters
----------
G : NetworkX DiGraph
A directed graph
Returns
-------
cycle_generator: generator
A generator that produces elementary cycles of the graph.
Each cycle is represented by a list of nodes along the cycle.
Examples
--------
>>> graph = {'A' : ['B','D', 'C'],
'B' : ['C'],
'C' : [],
'D' : ['E'],
'E' : ['G'],
'F' : ['A', 'I'],
'G' : ['A', 'K'],
'H' : ['F', 'G'],
'I' : ['H'],
'J' : ['A'],
'K' : []
}
>>> G = nx.DiGraph()
>>> G.add_nodes_from(graph.keys())
>>> for keys, values in graph.items():
...     G.add_edges_from([(keys, node) for node in values])
>>> list(findCycles(G))
[['F', 'I', 'H'], ['G', 'A', 'D', 'E']]
Notes
-----
The implementation follows pp. 79-80 in [1]_.
The time complexity is $O((n+e)(c+1))$ for $n$ nodes, $e$ edges and $c$
elementary circuits.
References
----------
.. [1] Finding all the elementary circuits of a directed graph.
D. B. Johnson, SIAM Journal on Computing 4, no. 1, 77-84, 1975.
https://doi.org/10.1137/0204007
.. [2] Enumerating the cycles of a digraph: a new preprocessing strategy.
G. Loizou and P. Thanish, Information Sciences, v. 27, 163-182, 1982.
.. [3] A search strategy for the elementary cycles of a directed graph.
J.L. Szwarcfiter and P.E. Lauer, BIT NUMERICAL MATHEMATICS,
v. 16, no. 2, 192-204, 1976.
--------
"""
    def _unblock(thisnode, blocked, B):
        stack = {thisnode}
        while stack:
            node = stack.pop()
            if node in blocked:
                blocked.remove(node)
                stack.update(B[node])
                B[node].clear()

    # Johnson's algorithm requires some ordering of the nodes.
    # We assign the arbitrary ordering given by the strongly connected comps
    # There is no need to track the ordering as each node removed as processed.
    # Also we save the actual graph so we can mutate it. We only take the
    # edges because we do not want to copy edge and node attributes here.
    subG = type(G)(G.edges())
    sccs = [scc for scc in nx.strongly_connected_components(subG) if 3 <= len(scc) <= 40]

    # Johnson's algorithm exclude self cycle edges like (v, v)
    # To be backward compatible, we record those cycles in advance
    # and then remove from subG
    for v in subG:
        if subG.has_edge(v, v):
            yield [v]
            subG.remove_edge(v, v)

    while sccs:
        scc = sccs.pop()
        sccG = subG.subgraph(scc)
        # order of scc determines ordering of nodes
        startnode = scc.pop()
        # Processing node runs "circuit" routine from recursive version
        path = [startnode]
        blocked = set()  # vertex: blocked from search?
        closed = set()  # nodes involved in a cycle
        blocked.add(startnode)
        B = defaultdict(set)  # graph portions that yield no elementary circuit
        stack = [(startnode, list(sccG[startnode]))]  # sccG gives comp nbrs
        while stack:
            thisnode, nbrs = stack[-1]
            if nbrs:
                nextnode = nbrs.pop()
                if nextnode == startnode:
                    yield path[:]
                    closed.update(path)
                    # print "Found a cycle", path, closed
                elif nextnode not in blocked:
                    path.append(nextnode)
                    stack.append((nextnode, list(sccG[nextnode])))
                    closed.discard(nextnode)
                    blocked.add(nextnode)
                    continue
            # done with nextnode... look for more neighbors
            if not nbrs:  # no more nbrs
                if thisnode in closed:
                    _unblock(thisnode, blocked, B)
                else:
                    for nbr in sccG[thisnode]:
                        if thisnode not in B[nbr]:
                            B[nbr].add(thisnode)
                stack.pop()
                path.pop()
        # done processing this node
        H = subG.subgraph(scc)  # make smaller to avoid work in SCC routine
        sccs.extend(scc for scc in nx.strongly_connected_components(H) if 3 <= len(scc) <= 40)
import sys, json

def findAllCycles(jsonInputFile, textOutFile):
    """Find simple cycles of a directed graph (jsonInputFile).
    Parameters:
    ----------
    jsonInputFile: a json file that has all concepts
    textOutFile: give a desired name of output file
    Returns:
    ----------
    a .text file (named: {textOutFile}.txt) has the first 3 cycles found in jsonInputFile
    Each cycle is represented by a list of nodes along the cycle
    """
    with open(jsonInputFile) as infile:
        graph = json.load(infile)
    # Convert the json file to a NetworkX directed graph
    G = nx.DiGraph()
    G.add_nodes_from(graph.keys())
    for keys, values in graph.items():
        G.add_edges_from([(keys, node) for node in values])
    # Search for all simple cycles that exist in the graph
    _cycles = list(findCycles(G))
    # Start with an empty list and populate it by looping over all cycles
    # in _cycles that have at least 3 and at most 40 different concepts (nodes)
    cycles = []
    for cycle in _cycles:
        if 3 <= len(cycle) <= 40:
            cycles.append(cycle)
    # Store the cycles under constraint in textOutFile
    # (the with block closes the file, so no explicit close() is needed)
    with open(textOutFile, 'w') as outfile:
        for cycle in cycles[:3]:
            outfile.write(','.join(cycle) + '\n')
    # When the process finishes, return Done!!
    return 'Done!!'

infile = sys.argv[1]
outfile = sys.argv[2]
first_cycles = findAllCycles(infile, outfile)
To run this program, use a command line such as:
>> python3 {program file name}.py graph.json {desired output file name}[.txt][.csv]
where {desired output file name}[.txt][.csv] could be, for example, first_3_cycles_found.txt.
In my case, the graph has 3,378,546 nodes, and it took ~40 minutes to find all cycles using the above code. Thus, the output file will be:
Please contribute to this if you see it needs any improvement or something else to be added.

Python: for loop skipping array elements [duplicate]

I'm iterating over a list of tuples in Python, and am attempting to remove them if they meet certain criteria.
for tup in somelist:
    if determine(tup):
        code_to_remove_tup
What should I use in place of code_to_remove_tup? I can't figure out how to remove the item in this fashion.
You can use a list comprehension to create a new list containing only the elements you don't want to remove:
somelist = [x for x in somelist if not determine(x)]
Or, by assigning to the slice somelist[:], you can mutate the existing list to contain only the items you want:
somelist[:] = [x for x in somelist if not determine(x)]
This approach could be useful if there are other references to somelist that need to reflect the changes.
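To see why the slice form matters, here is a small sketch with a second name bound to the same list. The helper is_odd and the variable names are made up for illustration.

```python
def is_odd(n):
    return n % 2 == 1

numbers = [1, 2, 3, 4, 5]
alias = numbers              # second reference to the same list object

# Rebinding: 'numbers' now names a new list, 'alias' still sees the old one.
numbers = [x for x in numbers if not is_odd(x)]
print(numbers, alias)        # [2, 4] [1, 2, 3, 4, 5]

shared = [1, 2, 3, 4, 5]
view = shared                # second reference again

# Slice assignment: the existing list object is changed in place,
# so every reference sees the filtered contents.
shared[:] = [x for x in shared if not is_odd(x)]
print(shared, view)          # [2, 4] [2, 4]
```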
Instead of a comprehension, you could also use itertools. In Python 2:
from itertools import ifilterfalse
somelist[:] = ifilterfalse(determine, somelist)
Or in Python 3:
from itertools import filterfalse
somelist[:] = filterfalse(determine, somelist)
The answers suggesting list comprehensions are almost correct, except that they build a completely new list and then give it the same name as the old list; they do not modify the old list in place. That's different from what you'd be doing by selective removal, as in Lennart's suggestion: it's faster, but if your list is accessed via multiple references, the fact that you're just reseating one of the references and not altering the list object itself can lead to subtle, disastrous bugs.
Fortunately, it's extremely easy to get both the speed of list comprehensions AND the required semantics of in-place alteration—just code:
somelist[:] = [tup for tup in somelist if determine(tup)]
Note the subtle difference with other answers: this one is not assigning to a barename. It's assigning to a list slice that just happens to be the entire list, thereby replacing the list contents within the same Python list object, rather than just reseating one reference (from the previous list object to the new list object) like the other answers.
You need to take a copy of the list and iterate over it first, or the iteration will fail with what may be unexpected results.
For example (depends on what type of list):
for tup in somelist[:]:
    etc....
An example:
>>> somelist = list(range(10))
>>> for x in somelist:
...     somelist.remove(x)
...
>>> somelist
[1, 3, 5, 7, 9]
>>> somelist = list(range(10))
>>> for x in somelist[:]:
...     somelist.remove(x)
...
>>> somelist
[]
for i in range(len(somelist) - 1, -1, -1):
    if some_condition(somelist, i):
        del somelist[i]
You need to go backwards otherwise it's a bit like sawing off the tree-branch that you are sitting on :-)
Python 2 users: replace range by xrange to avoid building the whole list of indices in memory
Overview of workarounds
Either:
use a linked list implementation/roll your own.
A linked list is the proper data structure to support efficient item removal, and does not force you to make space/time tradeoffs.
A CPython list is implemented with dynamic arrays as mentioned here, which is not a good data type to support removals.
There doesn't seem to be a linked list in the standard library however:
Is there a linked list predefined library in Python?
https://github.com/ajakubek/python-llist
start a new list() from scratch, and .append() back at the end as mentioned at: https://stackoverflow.com/a/1207460/895245
This is time efficient, but less space efficient, because it keeps an extra copy of the array around during iteration.
use del with an index as mentioned at: https://stackoverflow.com/a/1207485/895245
This is more space efficient since it dispenses with the array copy, but it is less time efficient, because removal from a dynamic array requires shifting all following items back by one, which is O(N) per removal.
Generally, if you are doing it quick and dirty and don't want to add a custom LinkedList class, you just want to go for the faster .append() option by default unless memory is a big concern.
Official Python 2 tutorial 4.2. "for Statements"
https://docs.python.org/2/tutorial/controlflow.html#for-statements
This part of the docs makes it clear that:
you need to make a copy of the iterated list to modify it
one way to do it is with the slice notation [:]
If you need to modify the sequence you are iterating over while inside the loop (for example to duplicate selected items), it is recommended that you first make a copy. Iterating over a sequence does not implicitly make a copy. The slice notation makes this especially convenient:
>>> words = ['cat', 'window', 'defenestrate']
>>> for w in words[:]:  # Loop over a slice copy of the entire list.
...     if len(w) > 6:
...         words.insert(0, w)
...
>>> words
['defenestrate', 'cat', 'window', 'defenestrate']
Python 2 documentation 7.3. "The for statement"
https://docs.python.org/2/reference/compound_stmts.html#for
This part of the docs says once again that you have to make a copy, and gives an actual removal example:
Note: There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, i.e. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g.,
for x in a[:]:
    if x < 0: a.remove(x)
However, I disagree with this implementation, since .remove() has to iterate the entire list to find the value.
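If the cost of .remove() is the concern, one alternative is a single pass that compacts the list in place. This is only a sketch: remove_in_place and its parameters are names I made up, and it is not something the Python docs propose.

```python
def remove_in_place(items, should_remove):
    """Drop matching items from the list in one O(n) pass, without copying it."""
    write = 0
    for item in items:
        if not should_remove(item):
            # Keep this item by moving it down to the next free slot.
            items[write] = item
            write += 1
    # Everything from 'write' onwards is leftover and can be cut off at once.
    del items[write:]


a = [3, -1, 4, -1, -5, 9]
remove_in_place(a, lambda x: x < 0)
print(a)  # [3, 4, 9]
```

The write position never gets ahead of the position being read, and the list length does not change until the loop has finished, so the iteration is not disturbed.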
Could Python do this better?
It seems like this particular Python API could be improved. Compare it, for instance, with:
Java ListIterator::remove which documents "This call can only be made once per call to next or previous"
C++ std::vector::erase which returns a valid iterator to the element after the one removed
both of which make it crystal clear that you cannot modify a list being iterated except with the iterator itself, and give you efficient ways to do so without copying the list.
Perhaps the underlying rationale is that Python lists are assumed to be dynamic array backed, and therefore any type of removal will be time inefficient anyways, while Java has a nicer interface hierarchy with both ArrayList and LinkedList implementations of ListIterator.
There doesn't seem to be an explicit linked list type in the Python stdlib either: Python Linked List
Your best approach for such an example would be a list comprehension
somelist = [tup for tup in somelist if determine(tup)]
In cases where you're doing something more complex than calling a determine function, I prefer constructing a new list and simply appending to it as I go. For example
newlist = []
for tup in somelist:
    # lots of code here, possibly setting things up for calling determine
    if determine(tup):
        newlist.append(tup)
somelist = newlist
Copying the list and calling remove might make your code look a little cleaner, as described in one of the answers below. You should definitely not do this for extremely large lists, since it involves first copying the entire list and then performing an O(n) remove operation for each element being removed, making this an O(n^2) algorithm.
for tup in somelist[:]:
    # lots of code here, possibly setting things up for calling determine
    if determine(tup):
        somelist.remove(tup)
For those who like functional programming:
somelist[:] = filter(lambda tup: not determine(tup), somelist)
or
from itertools import ifilterfalse
somelist[:] = list(ifilterfalse(determine, somelist))
I needed to do this with a huge list, and duplicating the list seemed expensive, especially since in my case the number of deletions would be few compared to the items that remain. I took this low-level approach.
array = [lots of stuff]
arraySize = len(array)
i = 0
while i < arraySize:
    if someTest(array[i]):
        del array[i]
        arraySize -= 1
    else:
        i += 1
What I don't know is how efficient a couple of deletes are compared to copying a large list. Please comment if you have any insight.
Most of the answers here want you to create a copy of the list. I had a use case where the list was quite long (110K items) and it was smarter to keep reducing the list instead.
First of all, you'll need to replace the for loop with a while loop:
i = 0
while i < len(somelist):
    if determine(somelist[i]):
        del somelist[i]
    else:
        i += 1
The value of i is not changed in the if block because you'll want to get value of the new item FROM THE SAME INDEX, once the old item is deleted.
It might be smart to instead build a new list containing only the items you want to keep.
so:
newList = []
for item in originalList:
    if item != badValue:
        newList.append(item)
and to avoid having to re-code the entire project with the new list's name:
originalList[:] = newList
note, from Python documentation:
copy.copy(x)
Return a shallow copy of x.
copy.deepcopy(x)
Return a deep copy of x.
This answer was originally written in response to a question which has since been marked as duplicate:
Removing coordinates from list on python
There are two problems in your code:
1) When using remove(), you attempt to remove integers whereas you need to remove a tuple.
2) The for loop will skip items in your list.
Let's run through what happens when we execute your code:
>>> L1 = [(1,2), (5,6), (-1,-2), (1,-2)]
>>> for (a,b) in L1:
...     if a < 0 or b < 0:
...         L1.remove(a,b)
...
Traceback (most recent call last):
File "<stdin>", line 3, in <module>
TypeError: remove() takes exactly one argument (2 given)
The first problem is that you are passing both 'a' and 'b' to remove(), but remove() only accepts a single argument. So how can we get remove() to work properly with your list? We need to figure out what each element of your list is. In this case, each one is a tuple. To see this, let's access one element of the list (indexing starts at 0):
>>> L1[1]
(5, 6)
>>> type(L1[1])
<type 'tuple'>
Aha! Each element of L1 is actually a tuple. So that's what we need to be passing to remove(). Tuples in Python are very easy: they're simply made by enclosing comma-separated values in parentheses. Inside a function call, "a, b" is read as two separate arguments, while "(a, b)" is a single tuple. So we modify your code and run it again:
# The remove line now includes an extra "()" to make a tuple out of "a,b"
L1.remove((a,b))
This code runs without any error, but let's look at the list it outputs:
L1 is now: [(1, 2), (5, 6), (1, -2)]
Why is (1,-2) still in your list? It turns out modifying the list while using a loop to iterate over it is a very bad idea without special care. The reason that (1, -2) remains in the list is that the locations of each item within the list changed between iterations of the for loop. Let's look at what happens if we feed the above code a longer list:
L1 = [(1,2),(5,6),(-1,-2),(1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
### Outputs:
L1 is now: [(1, 2), (5, 6), (1, -2), (3, 4), (5, 7), (2, 1), (5, -1), (0, 6)]
As you can infer from that result, every time that the conditional statement evaluates to true and a list item is removed, the next iteration of the loop will skip evaluation of the next item in the list because its values are now located at different indices.
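To make the skipping visible, here is a small sketch that records which items the loop actually looks at. The names items and visited are made up for this illustration; the data is the shorter list from above.

```python
items = [(1, 2), (5, 6), (-1, -2), (1, -2)]
visited = []

for pair in items:
    visited.append(pair)          # remember every item the loop hands us
    if pair[0] < 0 or pair[1] < 0:
        items.remove(pair)        # shrinks the list while we are iterating

print(visited)  # [(1, 2), (5, 6), (-1, -2)]
print(items)    # [(1, 2), (5, 6), (1, -2)]
```

(1, -2) never shows up in visited: after (-1, -2) is removed it slides into index 2, which the loop has already passed, so the loop ends without testing it.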
The most intuitive solution is to copy the list, then iterate over the original list and only modify the copy. You can try doing so like this:
L2 = L1
for (a,b) in L1:
    if a < 0 or b < 0 :
        L2.remove((a,b))
# Now, remove the original copy of L1 and replace with L2
print(L2 is L1)
del L1
L1 = L2; del L2
print ("L1 is now: ", L1)
However, the output will be identical to before:
'L1 is now: ', [(1, 2), (5, 6), (1, -2), (3, 4), (5, 7), (2, 1), (5, -1), (0, 6)]
This is because when we created L2, python did not actually create a new object. Instead, it merely referenced L2 to the same object as L1. We can verify this with 'is' which is different from merely "equals" (==).
>>> L2=L1
>>> L1 is L2
True
We can make a true copy using copy.copy(). Then everything works as expected:
import copy
L1 = [(1,2), (5,6),(-1,-2), (1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
L2 = copy.copy(L1)
for (a,b) in L1:
    if a < 0 or b < 0 :
        L2.remove((a,b))
# Now, remove the original copy of L1 and replace with L2
del L1
L1 = L2; del L2
>>> L1 is now: [(1, 2), (5, 6), (3, 4), (5, 7), (2, 1), (0, 6)]
Finally, there is one cleaner solution than having to make an entirely new copy of L1. The reversed() function:
L1 = [(1,2), (5,6),(-1,-2), (1,-2),(3,4),(5,7),(-4,4),(2,1),(-3,-3),(5,-1),(0,6)]
for (a,b) in reversed(L1):
    if a < 0 or b < 0 :
        L1.remove((a,b))
print ("L1 is now: ", L1)
>>> L1 is now: [(1, 2), (5, 6), (3, 4), (5, 7), (2, 1), (0, 6)]
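reversed() does not copy the list: when a list is passed to it, it returns a reverse iterator that walks the same list from the last index down to the first. Removing an item only shifts the items at higher indices, and those have already been visited, so nothing is skipped. One caveat is that remove() deletes the first element equal to its argument, so with duplicate tuples it may delete an earlier copy than the one being visited; the remaining copy is then visited again and removed in turn, so the final result is still correct. This is the solution I recommend.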
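A variant of the same idea, sketched with indices instead of values: walking the index range backwards and deleting by position avoids the value search that remove() performs. The variable name pairs is made up; the data is the longer list used above.

```python
pairs = [(1, 2), (5, 6), (-1, -2), (1, -2), (3, 4), (5, 7),
         (-4, 4), (2, 1), (-3, -3), (5, -1), (0, 6)]

# Visit indices from last to first. Deleting index i only shifts items
# at higher indices, and those have already been handled.
for i in reversed(range(len(pairs))):
    a, b = pairs[i]
    if a < 0 or b < 0:
        del pairs[i]

print(pairs)  # [(1, 2), (5, 6), (3, 4), (5, 7), (2, 1), (0, 6)]
```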
If you want to delete elements from a list while iterating, use a while-loop so you can alter the current index and end index after each deletion.
Example:
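i = 0
length = len(list1)
while i < length:
    if condition(list1[i]):  # condition() stands for whatever test you need
        del list1[i]         # delete by index; the next item slides into position i
        length -= 1
    else:
        i += 1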
The other answers are correct that it is usually a bad idea to delete from a list that you're iterating. Reverse iterating avoids some of the pitfalls, but it is much more difficult to follow code that does that, so usually you're better off using a list comprehension or filter.
There is, however, one case where it is safe to remove elements from a sequence that you are iterating: if you're only removing one item while you're iterating. This can be ensured using a return or a break. For example:
for i, item in enumerate(lst):
    if item % 4 == 0:
        foo(item)
        del lst[i]
        break
This is often easier to understand than a list comprehension when you're doing some operations with side effects on the first item in a list that meets some condition and then removing that item from the list immediately after.
If you want to do anything else during the iteration, it may be nice to get both the index (which guarantees you being able to reference it, for example if you have a list of dicts) and the actual list item contents.
inlist = [{'field1':10, 'field2':20}, {'field1':30, 'field2':15}]
xlist = []  # indices to delete once the loop is done
for idx, i in enumerate(inlist):
    # do some stuff with i['field1']
    if somecondition:
        xlist.append(idx)
for i in reversed(xlist):
    del inlist[i]
enumerate gives you access to the item and the index at once. reversed is so that the indices that you're going to later delete don't change on you.
One possible solution, useful if you want not only remove some things, but also do something with all elements in a single loop:
alist = ['good', 'bad', 'good', 'bad', 'good']
i = 0
for x in alist[:]:
    if x == 'bad':
        alist.pop(i)
        i -= 1
    # do something cool with x or just print x
    print(x)
    i += 1
A for loop iterates through a list by index.
Suppose you have a list of numbers and want to remove the ones that are not prime.
You used a list variable called lis, and you remove from that same variable while looping over it:
lis = [5, 7, 13, 29, 35, 65, 91]
#      0  1   2   3   4   5   6
During the 5th iteration (index 4), the number 35 was not a prime, so you removed it from the list:
lis.remove(y)
Then the next value (65) moved down to the previous index:
lis = [5, 7, 13, 29, 65, 91]
#      0  1   2   3   4   5
The iteration at index 4 is done, so the loop pointer moves on to index 5.
That's why your loop doesn't cover 65: it has moved into an index that was already visited.
Assigning the list to another variable does not help, because the new name still references the original list instead of a copy:
ite = lis # Don't do this: it creates a reference, not a copy
So make a copy of the list using lis[:] and iterate over that.
Now you will get:
[5, 7, 13, 29]
The problem is that you removed a value from the list during iteration, which shifts the list indices.
So you can try a list comprehension instead,
which works with any iterable, such as a list, tuple, dict, or string.
You might want to use filter() available as the built-in.
For more details check here
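Both suggestions can be sketched on the list from this answer. The helper is_prime is my own simple trial-division test, added only so the example runs; it is not from the original question.

```python
def is_prime(n):
    """Trial division; fine for small numbers like the ones here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


lis = [5, 7, 13, 29, 35, 65, 91]

# List comprehension: build the filtered list, then store it back in place.
lis[:] = [n for n in lis if is_prime(n)]
print(lis)  # [5, 7, 13, 29]

# filter(): same result; wrap in list() because filter is lazy in Python 3.
also = list(filter(is_prime, [5, 7, 13, 29, 35, 65, 91]))
print(also)  # [5, 7, 13, 29]
```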
You can try for-looping in reverse so for some_list you'll do something like:
list_len = len(some_list)
for i in range(list_len):
    reverse_i = list_len - 1 - i
    cur = some_list[reverse_i]
    # some logic with cur element
    if some_condition:
        some_list.pop(reverse_i)
This way the index is aligned and doesn't suffer from the list updates (regardless whether you pop cur element or not).
I needed to do something similar and in my case the problem was memory - I needed to merge multiple dataset objects within a list, after doing some stuff with them, as a new object, and needed to get rid of each entry I was merging to avoid duplicating all of them and blowing up memory. In my case having the objects in a dictionary instead of a list worked fine:
```
k = range(5)
v = ['a','b','c','d','e']
d = dict(zip(k, v))
print(d)
for i in range(5):
    print(d[i])
    d.pop(i)
print(d)
```
The most effective method is a list comprehension, as many answers here show. Getting an iterator through filter is also a good way.
filter receives a function and a sequence. It applies the function to each element in turn and keeps or discards the element depending on whether the function returns a true or false value.
Here is an example (get the odd numbers from a tuple):
list(filter(lambda x:x%2==1, (1, 2, 4, 5, 6, 9, 10, 15)))
# result: [1, 5, 9, 15]
Note: in Python 3, filter returns a lazy iterator, so you do not have to convert it to a list. Iterators are sometimes better than sequences.
TLDR:
I wrote a library that allows you to do this:
from fluidIter import FluidIterable
fSomeList = FluidIterable(someList)
for tup in fSomeList:
    if determine(tup):
        # remove 'tup' without "breaking" the iteration
        fSomeList.remove(tup)
        # tup has also been removed from 'someList'
        # as well as 'fSomeList'
It's best to use another method if possible that doesn't require modifying your iterable while iterating over it, but for some algorithms it might not be that straight forward. And so if you are sure that you really do want the code pattern described in the original question, it is possible.
Should work on all mutable sequences not just lists.
Full answer:
Edit: The last code example in this answer gives a use case for why you might sometimes want to modify a list in place rather than use a list comprehension. The first part of the answers serves as tutorial of how an array can be modified in place.
The solution follows on from this answer (for a related question) from senderle, which explains how the array index is updated while iterating through a list that has been modified. The solution below is designed to correctly track the array index even if the list is modified.
Download fluidIter.py from https://github.com/alanbacon/FluidIterator. It is just a single file, so there is no need to install git. There is no installer, so you will need to make sure that the file is on the Python path yourself. The code has been written for Python 3 and is untested on Python 2.
from fluidIter import FluidIterable
l = [0,1,2,3,4,5,6,7,8]
fluidL = FluidIterable(l)
for i in fluidL:
    print('initial state of list on this iteration: ' + str(fluidL))
    print('current iteration value: ' + str(i))
    print('popped value: ' + str(fluidL.pop(2)))
    print(' ')
print('Final List Value: ' + str(l))
This will produce the following output:
initial state of list on this iteration: [0, 1, 2, 3, 4, 5, 6, 7, 8]
current iteration value: 0
popped value: 2
initial state of list on this iteration: [0, 1, 3, 4, 5, 6, 7, 8]
current iteration value: 1
popped value: 3
initial state of list on this iteration: [0, 1, 4, 5, 6, 7, 8]
current iteration value: 4
popped value: 4
initial state of list on this iteration: [0, 1, 5, 6, 7, 8]
current iteration value: 5
popped value: 5
initial state of list on this iteration: [0, 1, 6, 7, 8]
current iteration value: 6
popped value: 6
initial state of list on this iteration: [0, 1, 7, 8]
current iteration value: 7
popped value: 7
initial state of list on this iteration: [0, 1, 8]
current iteration value: 8
popped value: 8
Final List Value: [0, 1]
Above we have used the pop method on the fluid list object. Other common iterable methods are also implemented such as del fluidL[i], .remove, .insert, .append, .extend. The list can also be modified using slices (sort and reverse methods are not implemented).
The only condition is that you must only modify the list in place, if at any point fluidL or l were reassigned to a different list object the code would not work. The original fluidL object would still be used by the for loop but would become out of scope for us to modify.
i.e.
fluidL[2] = 'a' # is OK
fluidL = [0, 1, 'a', 3, 4, 5, 6, 7, 8] # is not OK
If we want to access the current index value of the list we cannot use enumerate, as this only counts how many times the for loop has run. Instead we will use the iterator object directly.
fluidArr = FluidIterable([0,1,2,3])
# get iterator first so can query the current index
fluidArrIter = fluidArr.__iter__()
for i, v in enumerate(fluidArrIter):
    print('enum: ', i)
    print('current val: ', v)
    print('current ind: ', fluidArrIter.currentIndex)
    print(fluidArr)
    fluidArr.insert(0,'a')
    print(' ')
print('Final List Value: ' + str(fluidArr))
This will output the following:
enum: 0
current val: 0
current ind: 0
[0, 1, 2, 3]
enum: 1
current val: 1
current ind: 2
['a', 0, 1, 2, 3]
enum: 2
current val: 2
current ind: 4
['a', 'a', 0, 1, 2, 3]
enum: 3
current val: 3
current ind: 6
['a', 'a', 'a', 0, 1, 2, 3]
Final List Value: ['a', 'a', 'a', 'a', 0, 1, 2, 3]
The FluidIterable class just provides a wrapper for the original list object. The original object can be accessed as a property of the fluid object like so:
originalList = fluidArr.fixedIterable
More examples / tests can be found in the if __name__ is "__main__": section at the bottom of fluidIter.py. These are worth looking at because they explain what happens in various situations, such as replacing large sections of the list using a slice, or using (and modifying) the same iterable in nested for loops.
As I stated to start with: this is a complicated solution that will hurt the readability of your code and make it more difficult to debug. Therefore other solutions such as the list comprehensions mentioned in David Raznick's answer should be considered first. That being said, I have found times where this class has been useful to me and has been easier to use than keeping track of the indices of elements that need deleting.
Edit: As mentioned in the comments, this answer does not really present a problem for which this approach provides a solution. I will try to address that here:
List comprehensions provide a way to generate a new list but these approaches tend to look at each element in isolation rather than the current state of the list as a whole.
i.e.
newList = [i for i in oldList if testFunc(i)]
But what if the result of the testFunc depends on the elements that have been added to newList already? Or the elements still in oldList that might be added next? There might still be a way to use a list comprehension, but it will begin to lose its elegance, and for me it feels easier to modify a list in place.
The code below is one example of an algorithm that suffers from the above problem. The algorithm will reduce a list so that no element is a multiple of any other element.
randInts = [70, 20, 61, 80, 54, 18, 7, 18, 55, 9]
fRandInts = FluidIterable(randInts)
fRandIntsIter = fRandInts.__iter__()
# for each value in the list (outer loop)
# test against every other value in the list (inner loop)
for i in fRandIntsIter:
    print(' ')
    print('outer val: ', i)
    innerIntsIter = fRandInts.__iter__()
    for j in innerIntsIter:
        innerIndex = innerIntsIter.currentIndex
        # skip the element that the outloop is currently on
        # because we don't want to test a value against itself
        if not innerIndex == fRandIntsIter.currentIndex:
            # if the test element, j, is a multiple
            # of the reference element, i, then remove 'j'
            if j%i == 0:
                print('remove val: ', j)
                # remove element in place, without breaking the
                # iteration of either loop
                del fRandInts[innerIndex]
            # end if multiple, then remove
        # end if not the same value as outer loop
    # end inner loop
# end outerloop
print('')
print('final list: ', randInts)
The output and the final reduced list are shown below
outer val: 70
outer val: 20
remove val: 80
outer val: 61
outer val: 54
outer val: 18
remove val: 54
remove val: 18
outer val: 7
remove val: 70
outer val: 55
outer val: 9
remove val: 18
final list: [20, 61, 7, 55, 9]
For anything that has the potential to be really big, I use the following.
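import numpy as np
orig_list = np.array([1, 2, 3, 4, 5, 100, 8, 13])
remove_me = [100, 1]
# np.delete expects indices, not values, so filter with a boolean mask instead
cleaned = orig_list[~np.isin(orig_list, remove_me)]
print(cleaned)
For large numeric arrays this is likely to be much faster than removing items from a Python list one by one.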
In some situations, where you're doing more than simply filtering a list one item at a time, you want the list to change while you iterate over it.
Here is an example where copying the list beforehand is incorrect, reverse iteration is impossible and a list comprehension is also not an option.
""" Sieve of Eratosthenes """
def generate_primes(n):
    """ Generates all primes less than n. """
    primes = list(range(2,n))
    idx = 0
    while idx < len(primes):
        p = primes[idx]
        for multiple in range(p+p, n, p):
            try:
                primes.remove(multiple)
            except ValueError:
                pass #EAFP
        idx += 1
        yield p
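For comparison, here is a sketch of the more common flag-based sieve, which never removes items from a list while walking it. The name primes_below is chosen here for illustration and is not part of the answer above.

```python
def primes_below(n):
    """Return all primes less than n using a list of boolean flags."""
    if n < 3:
        return []  # there are no primes below 2
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p < n:
        if is_prime[p]:
            # cross out multiples of p; nothing is deleted, so indices stay valid
            for multiple in range(p * p, n, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


print(primes_below(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Marking instead of removing sidesteps the modify-while-iterating question entirely, at the cost of keeping one flag per candidate.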
I can think of three approaches to solve your problem. As an example, I will create a random list of tuples somelist = [(1,2,3), (4,5,6), (3,6,6), (7,8,9), (15,0,0), (10,11,12)]. The condition that I choose is sum of elements of a tuple = 15. In the final list we will only have those tuples whose sum is not equal to 15.
The list and the condition are arbitrary examples. Feel free to change both.
Method 1.> Use the framework that you suggested (filling in code inside a for loop). A small piece of code with del deletes each tuple that meets the condition. However, this method will miss a tuple that satisfies the condition if it directly follows another tuple that was just deleted.
for tup in somelist:
    if sum(tup) == 15:
        del somelist[somelist.index(tup)]
print(somelist)
>>> [(1, 2, 3), (3, 6, 6), (7, 8, 9), (10, 11, 12)]
Method 2.> Construct a new list containing only the tuples for which the condition is not met (this is the same as removing the tuples for which it is met). Following is the code for that:
newlist1 = [tup for tup in somelist if sum(tup) != 15]
print(newlist1)
>>>[(1, 2, 3), (7, 8, 9), (10, 11, 12)]
Method 3.> Find the indices where the condition is met, and then drop the tuples at those indices. Following is the code for that.
indices = [i for i in range(len(somelist)) if sum(somelist[i]) == 15]
newlist2 = [tup for j, tup in enumerate(somelist) if j not in indices]
print(newlist2)
>>>[(1, 2, 3), (7, 8, 9), (10, 11, 12)]
Methods 1 and 2 are faster than method 3, but method 1 can skip elements, as noted above, while methods 2 and 3 always give the correct result. I prefer method 2. For the example above, time(method1) : time(method2) : time(method3) = 1 : 1 : 1.7
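If the original list object has to be updated (for example because other code holds a reference to it), the comprehension from method 2 can be combined with slice assignment. This is a small sketch using the same sample data; the name alias is only there to show that the list object itself is changed.

```python
somelist = [(1, 2, 3), (4, 5, 6), (3, 6, 6), (7, 8, 9), (15, 0, 0), (10, 11, 12)]
alias = somelist  # a second reference to the same list object

# build the filtered list first, then replace the contents in place
somelist[:] = [tup for tup in somelist if sum(tup) != 15]

print(somelist)           # [(1, 2, 3), (7, 8, 9), (10, 11, 12)]
print(alias is somelist)  # True
```

Because the right-hand side is fully built before the assignment happens, no element is skipped the way it can be in method 1.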
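If you will use the list again later, you can simply set the unwanted elements to None and skip them in the later loop, like this:
for idx, elem in enumerate(li):
    # assigning to the loop variable would not change the list, so assign by index
    if should_remove(elem):  # should_remove is a placeholder for your own test
        li[idx] = None
for elem in li:
    if elem is None:
        continue
    # process elem here
This way you don't need to copy the list, and it's easier to understand.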

Recursion happens too many times and list is not iterable

I'm trying to make a secret santa program. The input is a list of names of people, e.g. ["John", "Bob", "Alice"], and a list of emails ["John@gmail.com", "Bob@gmail.com", "Alice@outlook.com"]. I need to generate pairs of an email address and a random name which doesn't belong to that email address. For this I have written the function compare.
def compare(list_of_names, list_of_emails):
    zipped_lists = zip(list_of_emails, list_of_names)
    random.shuffle(list_of_emails)
    zipped_shuffled_lists = zip(list_of_emails, list_of_names)
    for pair in zipped_lists:
        for shuffle_pair in zipped_shuffled_lists:
            if shuffle_pair == pair:
                return compare(list_of_names, list_of_emails)
    return zipped_shuffled_lists
But instead of shuffling like it should, it just recurses endlessly, and I still can't find out why. After a finite amount of time it should create two different lists that work. Also, the shuffled_list_of_emails is not iterable, why?
EDIT: changed the code to use shuffle, because it works in place.
zip is lazy!
I'm not sure why, but I'm too excited about this right now, so the answer might be a bit messy. Feel free to ask for clarification.
Let's step through your code:
def compare(list_of_names, list_of_emails):
    # the `zip` object doesn't actually iterate over any of its arguments until you attempt to iterate over `zipped_lists`
    zipped_lists = zip(list_of_emails, list_of_names)
    # modify this IN-PLACE; but the `zip` object above has a pointer to this SAME list
    random.shuffle(list_of_emails)
    # since the very first `zip` object has `list_of_emails` as its argument, AND SO DOES THE ONE BELOW, they both point to the very same, SHUFFLED (!) list
    zipped_shuffled_lists = zip(list_of_emails, list_of_names)
    # now you're iterating over identical `zip` objects
    for pair in zipped_lists:
        for shuffle_pair in zipped_shuffled_lists:
            # obviously, this is always true
            if shuffle_pair == pair:
                # say "hello" to infinite recursion, then!
                return compare(list_of_names, list_of_emails)
    return zipped_shuffled_lists
Let's recreate this in the Python interpreter!
>>> List = list(range(5))
>>> List
[0, 1, 2, 3, 4]
>>> zipped_1 = zip(List, range(5))
>>> import random
>>> random.shuffle(List)
>>> zipped_2 = zip(List, range(5))
>>> print(List)
[4, 2, 3, 0, 1]
>>> zipped_1, zipped_2 = list(zipped_1), list(zipped_2)
>>> zipped_1 == zipped_2
True
You see, two different zip objects applied to the same list at different times (before and after that list is modified in-place) produce the exact same result! Because zip doesn't do the zipping once you do zip(a, b), it will produce the zipped... uh, stuff... on-the-fly, while you're iterating over it!
So, to fix the issue, do not shuffle the original list, shuffle its copy:
list_of_emails_copy = list_of_emails.copy()
random.shuffle(list_of_emails_copy)
zipped_shuffled_lists = zip(list_of_emails_copy, list_of_names)
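The difference between a lazy zip object and a materialized list of pairs can be seen in a small deterministic sketch. The sample names and addresses are made up, and reverse() stands in for random.shuffle only so that the result is predictable.

```python
emails = ["a@example.com", "b@example.com", "c@example.com"]
names = ["Ann", "Bob", "Cy"]

lazy_pairs = zip(emails, names)         # nothing is read from the lists yet
eager_pairs = list(zip(emails, names))  # pairs are materialized right now

emails.reverse()  # in-place change, standing in for random.shuffle

lazy_pairs = list(lazy_pairs)  # only now does zip read the (reversed) list

print(eager_pairs[0])  # ('a@example.com', 'Ann')
print(lazy_pairs[0])   # ('c@example.com', 'Ann')
```

The eager list keeps the original pairing, while the lazy zip sees whatever the list contains at the moment it is consumed.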
There's already a correct answer from @ForceBru, but I will contribute a little.
You should avoid zip's lazy evaluation and unfold the zips with, for example, list:
def compare(list_of_names, list_of_emails):
    zipped_lists = list(zip(list_of_emails, list_of_names)) # eager evaluation instead of lazy
    random.shuffle(list_of_emails) # shuffle lists
    zipped_shuffled_lists = list(zip(list_of_emails, list_of_names)) # eager again
    for pair in zipped_lists:
        for shuffle_pair in zipped_shuffled_lists:
            if shuffle_pair == pair:
                return compare(list_of_names, list_of_emails)
    return zipped_shuffled_lists
But I guess you need no recursion and can achieve your task easier:
def compare(list_of_names, list_of_emails):
    zipped_lists = list(zip(list_of_emails, list_of_names))
    random.shuffle(zipped_lists) # shuffle list of emails and names
    result = []
    shuffled_emails = [i[0] for i in zipped_lists]
    for i, _ in enumerate(shuffled_emails):
        result.append(zipped_lists[i-1][1]) # shift email relatively one position to the right
    return list(zip(result, shuffled_emails))
This code links each email with the name of the previous person in the randomly shuffled order, so a name is guaranteed never to be paired with its own email.
There's no recursion, and it works fine for lists with two or more elements.
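The same rotate-by-one idea can be sketched on shuffled indices, together with a quick check that nobody is paired with their own address. The function name assign_santas and the sample addresses are hypothetical, and the check assumes the names are distinct.

```python
import random


def assign_santas(names, emails, rng=random):
    """Pair each email with the name of a different person (needs >= 2 people)."""
    if len(names) != len(emails) or len(names) < 2:
        raise ValueError("need two equally long lists with at least 2 entries")
    order = list(range(len(names)))
    rng.shuffle(order)
    # person order[k] is paired with person order[k+1], wrapping around at the end
    return [(emails[giver], names[receiver])
            for giver, receiver in zip(order, order[1:] + order[:1])]


names = ["John", "Bob", "Alice"]
emails = ["john@example.com", "bob@example.com", "alice@example.com"]
pairs = assign_santas(names, emails)

owner = dict(zip(emails, names))
# every email must be paired with somebody else's name
print(all(owner[email] != name for email, name in pairs))  # True
```

Since the shuffled order is a single cycle, every person gives and receives exactly once, and no retry loop is needed.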

How to give points for each indices of list

def voting_borda(rank_ballots):
    '''(list of list of str) -> tuple of (str, list of int)
    The parameter is a list of 4-element lists that represent rank ballots for a single riding.
    The Borda Count is determined by assigning points according to ranking. A party gets 3 points for each first-choice ranking, 2 points for each second-choice ranking and 1 point for each third-choice ranking. (No points are awarded for being ranked fourth.) For example, the rank ballot shown above would contribute 3 points to the Liberal count, 2 points to the Green count and 1 point to the CPC count. The party that receives the most points wins the seat.
    Return a tuple where the first element is the name of the winning party according to Borda Count and the second element is a four-element list that contains the total number of points for each party. The order of the list elements corresponds to the order of the parties in PARTY_INDICES.
    #>>> voting_borda([['GREEN','NDP', 'LIBERAL', 'CPC'], ['GREEN','CPC','LIBERAL','NDP'],
    ['LIBERAL','NDP', 'CPC', 'GREEN']])
    #('GREEN',[4, 6, 5, 3])
    '''
    list_of_party_order = []
    for sublist in rank_ballots:
        for party in sublist[0]:
            if party == 'GREEN':
                GREEN_COUNT += 3
            elif party == 'NDP':
                NDP_COUNT += 3
            elif party == 'LIBERAL':
                LIBERAL_COUNT += 3
            elif party == 'CPC':
                CPC_COUNT += 3
        for party in sublist[1]:
            if party == 'GREEN':
                GREEN_COUNT += 2
            elif party == 'NDP':
                NDP_COUNT += 2
            elif party == 'LIBERAL':
                LIBERAL_COUNT += 2
            elif party == 'CPC':
                CPC_COUNT += 2
        for party in sublist[2]:
            if party == 'GREEN':
                GREEN_COUNT += 1
            elif party == 'NDP':
                NDP_COUNT += 1
            elif party == 'LIBERAL':
                LIBERAL_COUNT += 1
            elif party == 'CPC':
                CPC_COUNT += 1
I don't know how I would give points for each indices of the list MORE SIMPLY.
Can someone please help me? Without being too complicated. Thank you!
This does not do exactly what you asked for: it returns a tuple of the winner's name and a dictionary mapping every party to its points, instead of a list of points only. In my opinion this is better in almost every case, and if you don't like it, you can convert the dictionary to a list.
It also takes the ballots as separate arguments instead of one list, but you can change that by simply removing the * from *args.
Note, however, that if you care about speed rather than short code, this is not the best way to do it. It does work, though.
It also improves on your code in that the function does not use any of the party names, which makes it possible to add, rename or remove parties (the points per rank, 3-i, are still fixed).
def voting_borda(*args):
    results = {}
    for sublist in args:
        for i in range(0, 3):
            if sublist[i] in results:
                results[sublist[i]] += 3-i
            else:
                results[sublist[i]] = 3-i
    winner = max(results, key=results.get)
    return winner, results
print(voting_borda(
    ['GREEN','NDP', 'LIBERAL', 'CPC'],
    ['GREEN','CPC','LIBERAL','NDP'],
    ['LIBERAL','NDP', 'CPC', 'GREEN']
))
This prints (the order of the dictionary keys may differ between Python versions): ('GREEN', {'LIBERAL': 5, 'NDP': 4, 'GREEN': 6, 'CPC': 3})
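If the list form from the assignment is needed, a dictionary of totals can be converted with a fixed party order. This is a minimal sketch: the name totals_as_list is hypothetical, and PARTY_ORDER is an assumption inferred from the expected result [4, 6, 5, 3] in the question, since PARTY_INDICES itself is not shown.

```python
# assumed order of parties; adjust to match PARTY_INDICES from the assignment
PARTY_ORDER = ['NDP', 'GREEN', 'LIBERAL', 'CPC']


def totals_as_list(results, order=PARTY_ORDER):
    """Turn a {party: points} dict into a list following the given party order."""
    # parties that received no points are missing from the dict, so default to 0
    return [results.get(party, 0) for party in order]


totals = {'LIBERAL': 5, 'NDP': 4, 'GREEN': 6, 'CPC': 3}
print(totals_as_list(totals))  # [4, 6, 5, 3]
```

Using results.get with a default keeps the output length fixed even when a party never appears in the first three ranks of any ballot.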
I found your voting simulation assignment online and added a few of the constants from it here to simplify the code for solving the problem a little (although it may not look like it with their definitions at the beginning).
The first element of the returned tuple is not exactly in the requested format: it is a list rather than a single value, to deal with the quite real possibility of tie votes, as illustrated by the sample rank_ballots used below. Even if there is no tie, the element returned is a singleton list, which is usually easier to deal with than having its type vary depending on whether there is more than one winner.
PARTY_NAMES = ['NDP', 'GREEN', 'LIBERAL', 'CPC']
NAME_TO_INDEX = {party:PARTY_NAMES.index(party) for party in PARTY_NAMES}
INDEX_TO_NAME = {PARTY_NAMES.index(party):party for party in PARTY_NAMES}
def voting_borda(rank_ballots):
    results = [0 for _ in PARTY_NAMES]
    MAX_POINTS = len(PARTY_NAMES)-1
    for ballot in rank_ballots:
        for i,party in enumerate(ballot):
            results[NAME_TO_INDEX[party]] += MAX_POINTS-i
    highest_rank = max(results)
    winners = [INDEX_TO_NAME[i] for i,total in enumerate(results) if total == highest_rank]
    return winners, results
rank_ballots = [['GREEN','NDP', 'LIBERAL', 'CPC'],
                ['GREEN','CPC','LIBERAL','NDP'],
                ['LIBERAL', 'GREEN', 'NDP', 'CPC'],
                ['LIBERAL','NDP', 'CPC', 'GREEN'],]
print(voting_borda(rank_ballots))
Output:
(['GREEN', 'LIBERAL'], [5, 8, 8, 3])
