Infinite Loop when Iterating through a Linked List Python 3 - python-3.x

I am trying to write a function that removes all pdf files from a linked list; however, after running it, I quickly realized that it becomes an infinite loop. My first while loop is supposed to catch all pdf files at the beginning of the linked list. My second while loop is supposed to iterate through the linked list as many times as it takes to get rid of the pdf files. I guess my logic for the while not loop is incorrect.
def remove_all(lst):
    ptr = lst
    while ptr['data'][0] == 'pdf':
        ptr = ptr['next']
    lst = ptr
    all_removed = True
    while not all_removed:
        all_removed = False
        while ptr['next'] != None:
            if ptr['next']['data'][0] == 'pdf':
                ptr['next'] = ptr['next']['next']
                all_removed = True
                ptr = ptr['next']
    return lst
I am getting the error that NoneType is not subscriptable for the second while loop, which confuses me since it is supposed to stop when ptr['next'] is None.
My linked list looks like this:
{'data': ['pdf', 2, 4], 'next': {'data': ['csv', 1, 1], 'next': {'data': ['pdf', 234, 53], 'next':
{'data': ['xml', 1, 2], 'next': {'data': ['pdf', 0, 1], 'next': None}}}}}

First, try:
ptr['next'] = ptr['next']['next']
instead of:
ptr['next'] == ptr['next']['next']
Second, since your structure contains nodes such as {'data': ['xml', 1, 2], ...} (with xml and csv data, not pdf), the execution goes into the nested while loop:
while ptr['next'] != None:
and since the if condition if ptr['next']['data'][0] == 'pdf': evaluates to False, ptr is never advanced and the execution gets stuck in that loop infinitely.
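A minimal sketch of one possible fix, keeping the question's dict-based node layout: advance ptr only when the next node is kept, and guard both loops against running off the end of the list, so no while not flag is needed at all.

def remove_all(lst):
    # Drop pdf nodes from the head of the list.
    while lst is not None and lst['data'][0] == 'pdf':
        lst = lst['next']
    # Single pass: unlink pdf nodes; only advance when the next node is kept.
    ptr = lst
    while ptr is not None and ptr['next'] is not None:
        if ptr['next']['data'][0] == 'pdf':
            ptr['next'] = ptr['next']['next']  # unlink, do not advance
        else:
            ptr = ptr['next']
    return lst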

Given that I do not fully understand while not, while true loops, I resorted to recursion to answer my question.
def remove(lst):
    ptr = lst
    while ptr['data'][0] == 'pdf':
        ptr = ptr['next']
    lst = ptr
    while ptr['next'] != None:
        if ptr['next']['data'][0] == 'pdf':
            ptr['next'] = ptr['next']['next']
            return remove(lst)
        ptr = ptr['next']
    return lst
If there are any pdfs at the start of the list, they are removed; then, whenever a pdf is encountered later, it is removed and the function calls itself again, just in case there are adjacent pdfs.
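For example, applied to the linked list from the question, this leaves only the csv and xml nodes:

lst = {'data': ['pdf', 2, 4], 'next': {'data': ['csv', 1, 1], 'next': {'data': ['pdf', 234, 53], 'next':
      {'data': ['xml', 1, 2], 'next': {'data': ['pdf', 0, 1], 'next': None}}}}}
print(remove(lst))
# {'data': ['csv', 1, 1], 'next': {'data': ['xml', 1, 2], 'next': None}}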


python3 holding on to data after recursion

I wrote some code that finds the shortest combination of numbers from a list that adds up to a target. For example, bestSum(3, [800, 1, 3]) would return [3], because the best way of reaching 3 (the first argument) with the numbers provided is simply using 3. Code:
def bestSum(target, lst, mochi = {}):
    if target in mochi:
        return mochi[target]
    if target == 0:
        return []
    if target < 0:
        return None
    shortestCombination = None
    for i in lst:
        remainderCombination = bestSum(target - i, lst, mochi)
        if remainderCombination is not None:
            remainderCombination = [*remainderCombination, i]
            if shortestCombination is None or len(remainderCombination) < len(shortestCombination):
                shortestCombination = remainderCombination
    mochi[target] = shortestCombination
    return shortestCombination
I ran into this issue where data is saved between calls. For example, if I run just
print(bestSum(8, [4]))
it returns
[4, 4]
However if I run
print(bestSum(8, [2, 3, 5]))
print(bestSum(8, [4]))
it returns:
[5, 3]
[5, 3]
Am I doing something wrong here? Is this potentially a security vulnerability? Is there a way to make this return correctly? What would cause python to do something like this?
This is documented behavior when using mutables as default arguments (see "Default parameter values are evaluated from left to right when the function definition is executed.").
As discussed in the documentation, "A way around this is to use None as the default, and explicitly test for it in the body of the function".
[While documented, I only learned about it here on SO a couple of days ago]
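Applied to the function above, a minimal sketch of that workaround; the memo now starts fresh on every top-level call:

def bestSum(target, lst, mochi=None):
    if mochi is None:  # create a new memo per top-level call instead of sharing one dict
        mochi = {}
    if target in mochi:
        return mochi[target]
    if target == 0:
        return []
    if target < 0:
        return None
    shortestCombination = None
    for i in lst:
        remainderCombination = bestSum(target - i, lst, mochi)
        if remainderCombination is not None:
            remainderCombination = [*remainderCombination, i]
            if shortestCombination is None or len(remainderCombination) < len(shortestCombination):
                shortestCombination = remainderCombination
    mochi[target] = shortestCombination
    return shortestCombination

print(bestSum(8, [2, 3, 5]))  # [5, 3]
print(bestSum(8, [4]))        # [4, 4]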

How to search an unordered list for a key using reduce?

I have a basic reduce function and I want to reduce a list in order to check if an item is in the list. I have defined the function below where f is a comparison function, id_ is the item I am searching for, and a is the list. For example, reduce(f, 2, [1, 6, 2, 7]) would return True since 2 is in the list.
def reduce(f, id_, a):
    if len(a) == 0:
        return id_
    elif len(a) == 1:
        return a[0]
    else:
        # can call these in parallel
        res = f(reduce(f, id_, a[:len(a)//2]),
                reduce(f, id_, a[len(a)//2:]))
        return res
I tried passing it a comparison function:
def isequal(x, element):
    if x == True:      # if element has already been found in list -> True
        return True
    if x == element:   # if key is equal to element -> True
        return True
    else:              # o.w. -> False
        return False
I realize this does not work because x is not the key I am searching for. I get how reduce works with summing and products, but I am failing to see how this function would even know what the key is to check if the next element matches.
I apologize, I am a bit new to this. Thanks in advance for any insight, I greatly appreciate it!
Based on your example, the problem you seem to be trying to solve is determining whether a value is or is not in a list. In that case reduce is probably not the best way to go about that. To check if a particular value is in a list or not, Python has a much simpler way of doing that:
my_list = [1, 6, 2, 7]
print(2 in my_list)
print(55 in my_list)
True
False
Edit: Given OP's comment that they were required to use reduce to solve the problem, the code below will work, but I'm not proud of it. ;^) To see how reduce is intended to be used, here is a good source of information.
Example:
from functools import reduce

def test_match(match_params, candidate):
    pattern, found_match = match_params
    if not found_match and pattern == candidate:
        match_params = (pattern, True)
    return match_params

num_list = [1,2,3,4,5]

_, found_match = reduce(test_match, num_list, (2, False))
print(found_match)

_, found_match = reduce(test_match, num_list, (55, False))
print(found_match)
Output:
True
False
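If the exercise really does require the divide-and-conquer reduce from the question, one workaround (a sketch, not part of the original answer) is to map each element to a boolean first, so the reduction only ever combines values of one type with or; False acts as the identity element:

def reduce(f, id_, a):
    # the question's divide-and-conquer reduce, unchanged
    if len(a) == 0:
        return id_
    elif len(a) == 1:
        return a[0]
    else:
        return f(reduce(f, id_, a[:len(a)//2]),
                 reduce(f, id_, a[len(a)//2:]))

def contains(key, a):
    flags = [x == key for x in a]  # map step: turn each element into found / not found
    return reduce(lambda left, right: left or right, False, flags)

print(contains(2, [1, 6, 2, 7]))   # True
print(contains(55, [1, 6, 2, 7]))  # False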

Why is a list variable sometimes not impacted by changes in function as I thought python3 works on pass by reference with list variables?

For python3, I originally needed to extract odd and even positions from a list and assign them to new lists, then clear the original list. I thought lists were impacted by a function call through "pass by reference". Testing some scenarios, it works sometimes. Could someone please explain how exactly python3 works here?
Case 1: empty list is populated with string as expected.
def func1(_in):
    _in.append('abc')

mylist = list()
print(f"Before:\nmylist = {mylist}")
func1(mylist)
print(f"After:\nmylist = {mylist}")
Output case 1:
Before:
mylist = []
After:
mylist = ['abc']
Case 2: middle list element is replaced with string as expected.
def func2(_in):
    _in[1] = 'abc'

mylist = list(range(3))
print(f"Before:\nmylist = {mylist}")
func2(mylist)
print(f"After:\nmylist = {mylist}")
Output case 2:
Before:
mylist = [0, 1, 2]
After:
mylist = [0, 'abc', 2]
Case 3: why is the list not empty after function call?
def func3(_in):
    _in = list()

mylist = list(range(3))
print(f"Before:\nmylist = {mylist}")
func3(mylist)
print(f"After:\nmylist = {mylist}")
Output case 3:
Before:
mylist = [0, 1, 2]
After:
mylist = [0, 1, 2]
Case 4: working exactly as expected, but note I have returned all three lists from function.
def func4_with_ret(_src, _dest1, _dest2):
    _dest1 = [val for val in _src[0:len(_src):2]]
    _dest2 = [val for val in _src[1:len(_src):2]]
    _src = list()
    return _src, _dest1, _dest2

source = list(range(6))
evens, odds = list(), list()
print(f"Before function call:\nsource = {source}\nevens = {evens}\nodds = {odds}")
source, evens, odds = func4_with_ret(source, evens, odds)
print(f"\nAfter function call:\nsource = {source}\nevens = {evens}\nodds = {odds}")
Output case 4:
Before function call:
source = [0, 1, 2, 3, 4, 5]
evens = []
odds = []
After function call:
source = []
evens = [0, 2, 4]
odds = [1, 3, 5]
Case 5: why no impact on the variables outside the function if I do not explicitly return from function call?
def func5_no_ret(_src, _dest1, _dest2):
    _dest1 = [val for val in _src[0:len(_src):2]]
    _dest2 = [val for val in _src[1:len(_src):2]]
    _src = list()

source = list(range(6))
evens, odds = list(), list()
print(f"Before function call:\nsource = {source}\nevens = {evens}\nodds = {odds}")
func5_no_ret(source, evens, odds)
print(f"\nAfter function call:\nsource = {source}\nevens = {evens}\nodds = {odds}")
Output case 5:
Before function call:
source = [0, 1, 2, 3, 4, 5]
evens = []
odds = []
After function call:
source = [0, 1, 2, 3, 4, 5]
evens = []
odds = []
Thank you.
Your ultimate problem is confusing (in-place) mutation with rebinding (also referred to somewhat less precisely as "reassignment").
In all the cases where the change isn't visible outside the function, you rebound the name inside the function. When you do:
name = val
it does not matter what used to be in name; it's rebound to val, and the reference to the old object is thrown away. When it's the last reference, this leads to the object being cleaned up; in your case, the argument used to alias an object also bound to a name in the caller, but after rebinding, that aliasing association is lost.
Aside for C/C++ folks: Rebinding is like assigning to a pointer variable, e.g. int *px = pfoo; (initial binding), followed later by px = pbar; (rebinding), where both pfoo and pbar are themselves pointers to int. When the px = pbar; assignment occurs, it doesn't matter that px used to point to the same thing as pfoo, it points to something new now, and following it up with *px = 1; (mutation, not rebinding) only affects whatever pbar points to, leaving the target of pfoo unchanged.
By contrast, mutation doesn't break aliasing associations, so:
name[1] = val
does rebind name[1] itself, but it doesn't rebind name; it continues to refer to the same object as before, it just mutates that object in place, leaving all aliasing intact (so all names aliasing the same object see the result of the change).
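A tiny illustrative sketch of the difference (the function and variable names here are made up for the example):

def rebind(x):
    x = ['new']       # rebinding: the local name x now points at a new list

def mutate(x):
    x[:] = ['new']    # mutation: the caller's list object is changed in place

a = ['old']
rebind(a)
print(a)  # ['old']  -- the caller is unaffected
mutate(a)
print(a)  # ['new']  -- the caller sees the change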
For your specific case, you could fix the "broken" functions by replacing the rebinding with slice assignment/deletion or other forms of in-place mutation, which preserve the aliasing, e.g.:
def func3(_in):
    # _in = list()    # BAD, rebinds
    _in.clear()        # Good, method mutates in place
    del _in[:]         # Good, equivalent to clear
    _in[:] = list()    # Acceptable; needlessly creates empty list, but closest to original
                       # code, and has same effect

def func5_no_ret(_src, _dest1, _dest2):
    # BAD, all rebinding to new lists, not changing contents of original lists
    #_dest1 = [val for val in _src[0:len(_src):2]]
    #_dest2 = [val for val in _src[1:len(_src):2]]
    #_src = list()

    # Acceptable (you should just use multiple return values, not modify caller arguments;
    # this isn't C where multiple returns are a PITA)
    _dest1[:] = _src[::2]   # Removed slice components where defaults equivalent
    _dest2[:] = _src[1::2]  # and dropped pointless listcomp; if _src might not be a list,
                            # list(_src[::2]) is still better than a no-op listcomp
    _src.clear()

    # Best (though clearing _src is still weird)
    retval = _src[::2], _src[1::2]
    _src.clear()
    return retval

    # Perhaps overly clever to avoid a named temporary:
    try:
        return _src[::2], _src[1::2]
    finally:
        _src.clear()

Is the rear item in a Queue the last item added or the item at the end of a Queue?

My professor wrote a Queue class that uses arrays. I was running multiple test cases against it and got confused by one specific part. I want to figure out if the last item added is the rear of the queue. Let's say I enqueued 8 elements:
[1, 2, 3, 4, 5, 6, 7, 8]
Then I dequeued. And now:
[None, 2, 3, 4, 5, 6, 7, 8]
I enqueued 9 onto the Queue and it goes to the front of the underlying array. However, when I called my method that returns the rear item of the queue, q.que_rear, it returned 8. I thought the rear item would be 9, since it was the last item added.
Here is how I tested it in case anyone is confused:
>>> q = ArrayQueue()
>>> q.enqueue(1)
>>> q.enqueue(2)
>>> q.enqueue(3)
>>> q.enqueue(4)
>>> q.data
[1, 2, 3, 4, None, None, None, None]
>>> q.dequeue()
1
>>> q.enqueue(5)
>>> q.enqueue(6)
>>> q.enqueue(7)
>>> q.enqueue(8)
>>> q.data
[None, 2, 3, 4, 5, 6, 7, 8]
>>> q.enqueue(9)
>>> q.data
[9, 2, 3, 4, 5, 6, 7, 8]
>>> q.que_rear()
Rear item is 8
EDIT
I just want to know what is supposed to be the “rear of the Queue”: the last element added, or the element at the end of the underlying list? In the case I showed, is it supposed to be 8 or 9?
Here is my code:
class ArrayQueue:
    INITIAL_CAPACITY = 8

    def __init__(self):
        self.data = [None] * ArrayQueue.INITIAL_CAPACITY
        self.rear = ArrayQueue.INITIAL_CAPACITY - 1
        self.num_of_elems = 0
        self.front_ind = None

    # O(1) time
    def __len__(self):
        return self.num_of_elems

    # O(1) time
    def is_empty(self):
        return len(self) == 0

    # Amortized worst case running time is O(1)
    def enqueue(self, elem):
        if self.num_of_elems == len(self.data):
            self.resize(2 * len(self.data))
        if self.is_empty():
            self.data[0] = elem
            self.front_ind = 0
            self.num_of_elems += 1
        else:
            back_ind = (self.front_ind + self.num_of_elems) % len(self.data)
            self.data[back_ind] = elem
            self.num_of_elems += 1

    def dequeue(self):
        if self.is_empty():
            raise Exception("Queue is empty")
        elem = self.data[self.front_ind]
        self.data[self.front_ind] = None
        self.front_ind = (self.front_ind + 1) % len(self.data)
        self.num_of_elems -= 1
        if self.is_empty():
            self.front_ind = None
        # As with dynamic arrays, we shrink the underlying array (by half)
        # if we are using less than 1/4 of the capacity
        elif len(self) < len(self.data) // 4:
            self.resize(len(self.data) // 2)
        return elem

    # O(1) running time
    def first(self):
        if self.is_empty():
            raise Exception("Queue is empty")
        return self.data[self.front_ind]

    def que_rear(self):
        if self.is_empty():
            print("Queue is empty")
        print("Rear item is", self.data[self.rear])

    # Resizing takes time O(n) where n is the number of elements in the queue
    def resize(self, new_capacity):
        old_data = self.data
        self.data = [None] * new_capacity
        old_ind = self.front_ind
        for new_ind in range(self.num_of_elems):
            self.data[new_ind] = old_data[old_ind]
            old_ind = (old_ind + 1) % len(old_data)
        self.front_ind = 0
The que_rear function seems to have been added post-hoc in an attempt to understand how the internal circular queue operates. But notice that self.rear (the variable que_rear uses to determine what the "rear" is) is a meaningless garbage variable, in spite of its promising name. In the initializer, it's set to the last index of the initial internal array (INITIAL_CAPACITY - 1) and never touched again, so it's pure luck if it prints out the rear or anything remotely related to the rear.
The true rear is actually the variable back_ind, which is computed on the spot whenever enqueue is called, the only time it matters where the back is. Typically, queue data structures don't permit access to the rear element (if they did, that would make them closer to a deque, or double-ended queue), so all of this is irrelevant and implementation-specific from the perspective of the client (the code which uses the class as a black box, without caring how it works).
Here's a function that gives you the actual rear. Unsurprisingly, it's pretty much a copy of part of enqueue:
def queue_rear(self):
    if self.is_empty():
        raise Exception("Queue is empty")
    back_ind = (self.front_ind + self.num_of_elems - 1) % len(self.data)
    return self.data[back_ind]
Also, I understand this class is likely for educational purposes, but I'm obliged to mention that in a real application, use collections.deque for all your queueing needs (unless you need a synchronized queue).
Interestingly, CPython doesn't use a circular array to implement the deque, but Java does in its ArrayDeque class, which is worth a read.
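For reference, a rough sketch of the same sequence of operations with collections.deque; the rear (the last item added) is simply q[-1]:

from collections import deque

q = deque([1, 2, 3, 4, 5, 6, 7, 8])  # enqueue 1..8
q.popleft()                          # dequeue -> 1
q.append(9)                          # enqueue 9; it is now the rear
print(q[0])                          # front of the queue -> 2
print(q[-1])                         # rear of the queue -> 9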

Python 3.x - function args type-testing

I started learning Python 3.x some time ago and I wrote a very simple piece of code which adds numbers or concatenates lists, tuples and dicts:
X = 'sth'

def adder(*vargs):
    if (len(vargs) == 0):
        print('No args given. Stopping...')
    else:
        L = list(enumerate(vargs))
        for i in range(len(L) - 1):
            if (type(L[i][1]) != type(L[i + 1][1])):
                global X
                X = 'bad'
                break
        if (X == 'bad'):
            print('Args have different types. Stopping...')
        else:
            if type(L[0][1]) == int: #num
                temp = 0
                for i in range(len(L)):
                    temp += L[i][1]
                print('Sum is equal to:', temp)
            elif type(L[0][1]) == list: #list
                A = []
                for i in range(len(L)):
                    A += L[i][1]
                print('List made is:', A)
            elif type(L[0][1]) == tuple: #tuple
                A = []
                for i in range(len(L)):
                    A += list(L[i][1])
                print('Tuple made is:', tuple(A))
            elif type(L[0][1]) == dict: #dict
                A = L[0][1]
                for i in range(len(L)):
                    A.update(L[i][1])
                print('Dict made is:', A)

adder(0, 1, 2, 3, 4, 5, 6, 7)
adder([1,2,3,4], [2,3], [5,3,2,1])
adder((1,2,3), (2,3,4), (2,))
adder(dict(a = 2, b = 433), dict(c = 22, d = 2737))
My main issue with this is the way I am getting out of the function, via the X global, when the args have different types. I thought a while about it, but I can't see an easier way of doing this (I can't simply put the else under the for, because the results would be printed a few times; probably I'm messing something up with my continue and break usage).
I'm sure I'm missing an easy way to do this, but I can't get it.
Thank you for any replies. If you have any advice about any other code piece here, I would be very grateful for additional help. I probably have a lot of bad non-Pythonic habits coming from earlier C++ coding.
Here are some changes I made that I think clean it up a bit and get rid of the need for the global variable.
def adder(*vargs):
    if len(vargs) == 0:
        return None  # could raise ValueError
    mytype = type(vargs[0])
    if not all(type(x) == mytype for x in vargs):
        raise ValueError('Args have different types.')
    if mytype is int:
        print('Sum is equal to:', sum(vargs))
    elif mytype is list or mytype is tuple:
        out = []
        for item in vargs:
            out += item
        if mytype is list:
            print('List made is:', out)
        else:
            print('Tuple made is:', tuple(out))
    elif mytype is dict:
        out = {}
        for i in vargs:
            out.update(i)
        print('Dict made is:', out)

adder(0, 1, 2, 3, 4, 5, 6, 7)
adder([1,2,3,4], [2,3], [5,3,2,1])
adder((1,2,3), (2,3,4), (2,))
adder(dict(a = 2, b = 433), dict(c = 22, d = 2737))
I also made some other improvements that I think are a bit more 'pythonic'. For instance
for item in list:
    print(item)
instead of
for i in range(len(list)):
    print(list[i])
In a function like this, if there are illegal arguments you would commonly short-circuit and just raise a ValueError.
if bad_condition:
    raise ValueError('Args have different types.')
Just for contrast, here is another version that feels more pythonic to me (reasonable people might disagree with me, which is OK by me).
The principal differences are that a) type clashes are left to the operator combining the arguments, b) no assumptions are made about the types of the arguments, and c) the result is returned instead of printed. This allows combining different types in the cases where that makes sense (e.g., combine({}, zip('abcde', range(5)))).
The only assumption is that the operator used to combine the arguments is either add or a member function of the first argument's type named update.
I prefer this solution because it does minimal type checking, and uses duck-typing to allow valid but unexpected use cases.
from functools import reduce
from operator import add

def combine(*args):
    if not args:
        return None
    out = type(args[0])()
    return reduce((getattr(out, 'update', None) and (lambda d, u: [d.update(u), d][1]))
                  or add, args, out)

print(combine(0, 1, 2, 3, 4, 5, 6, 7))
print(combine([1,2,3,4], [2,3], [5,3,2,1]))
print(combine((1,2,3), (2,3,4), (2,)))
print(combine(dict(a = 2, b = 433), dict(c = 22, d = 2737)))
print(combine({}, zip('abcde', range(5))))
