How to track down the mistake in this merge function? - python-3.x

I can't figure out what is wrong with this merge sort implementation. I've confirmed that the problem is in the merge function rather than in merge_sort: when I replace merge with an equivalent function from examples found online, the sort works fine. However, I can't find the mistake in my own implementation.
Expected result: list sorted in order from smallest to largest.
Actual result: left side of list modified (not in order) and right side unmodified.
I've tried adding print statements at various points in the program and it looks like the problem is related to rightList not being created properly but I can't figure out why.
What can I do to track down the cause of this?
Code:
def merge_sort(toSort, left, right):
    # check if we have more than one remaining element
    if left >= right:
        return
    # get middle of array, note the result needs to be an int
    mid = (left + right) // 2
    # call merge sort on the left and right sides of the list
    merge_sort(toSort, left, mid)
    merge_sort(toSort, mid+1, right)
    # merge the results
    merge(toSort, left, right, mid)

# merge function taking a list along with the positions
# of the start, middle and end
def merge(toSort, left, right, mid):
    # split the list into two separate lists based on the mid position
    leftList = toSort[left:mid+1]
    rightList = toSort[mid+1:right+1]
    # variables to track position in left and right lists and the sorted list
    lIndex = 0
    rIndex = 0
    sIndex = lIndex
    # while there are remaining elements in both lists
    while lIndex < len(leftList) and rIndex < len(rightList):
        # if the left value is less than or equal to the right value add it to position sIndex in toSort
        # and move lIndex to next position
        if leftList[lIndex] <= rightList[rIndex]:
            toSort[sIndex] = leftList[lIndex]
            lIndex = lIndex + 1
        # otherwise set sIndex to right value and move rIndex to next position
        else:
            toSort[sIndex] = rightList[rIndex]
            rIndex = rIndex + 1
        sIndex = sIndex + 1
    # add the remaining elements from either leftList or rightList
    while lIndex < len(leftList):
        toSort[sIndex] = leftList[lIndex]
        lIndex = lIndex + 1
        sIndex = sIndex + 1
    while rIndex < len(rightList):
        toSort[sIndex] = rightList[rIndex]
        rIndex = rIndex + 1
        sIndex = sIndex + 1

unsorted = [33, 42, 9, 37, 8, 47, 5, 29, 49, 31, 4, 48, 16, 22, 26]
print(unsorted)
merge_sort(unsorted, 0, len(unsorted) - 1)
print(unsorted)
Output:
[33, 42, 9, 37, 8, 47, 5, 29, 49, 31, 4, 48, 16, 22, 26]
[16, 22, 26, 49, 31, 4, 48, 29, 49, 31, 4, 48, 16, 22, 26]
Edit
Link to example of code in colab: https://colab.research.google.com/drive/1z5ouu_aD1QM0unthkW_ZGkDlrnPNElxm?usp=sharing

sIndex, the index variable into toSort, should be initialized to left instead of 0. It is currently set to lIndex, which is 0, so every merge writes its result starting at the beginning of the list rather than at the start of the sub-range being merged.
Also note that it would be more readable and consistent to pass right as 1 + the index of the last element of the slice. This matches the slice notation in Python and removes the +1/-1 adjustments scattered through the code. The convention where right is included is taught in Java classes, but it is error prone and does not allow for empty slices.
Using simpler index variable names would also help readability, especially since it leaves more room for indentation.
Here is a modified version:
# sort the elements in toSort[left:right]
def merge_sort(toSort, left, right):
    # check if we have more than one remaining element
    if right - left < 2:
        return
    # get middle of array, note: the result needs to be an int
    mid = (left + right) // 2
    # call merge sort on the left and right sides of the list
    merge_sort(toSort, left, mid)
    merge_sort(toSort, mid, right)
    # merge the results
    merge(toSort, left, mid, right)

# merge function taking a list along with the positions
# of the start, middle and end, in this order.
def merge(toSort, left, mid, right):
    # split the list into two separate lists based on the mid position
    leftList = toSort[left : mid]
    rightList = toSort[mid : right]
    # variables to track position in left and right lists and the sorted list
    i = 0
    j = 0
    k = left
    # while there are remaining elements in both lists
    while i < len(leftList) and j < len(rightList):
        # if the left value is less than or equal to the right value add it to position k in toSort
        # and move i to next position
        if leftList[i] <= rightList[j]:
            toSort[k] = leftList[i]
            i += 1
        # otherwise set it to right value and move j to next position
        else:
            toSort[k] = rightList[j]
            j += 1
        k += 1
    # add the remaining elements from either leftList or rightList
    while i < len(leftList):
        toSort[k] = leftList[i]
        i += 1
        k += 1
    while j < len(rightList):
        toSort[k] = rightList[j]
        j += 1
        k += 1

unsorted = [33, 42, 9, 37, 8, 47, 5, 29, 49, 31, 4, 48, 16, 22, 26]
print(unsorted)
merge_sort(unsorted, 0, len(unsorted))
print(unsorted)
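As for how to track down this kind of bug: one approach is to test merge on its own, on a sub-range that does not start at index 0, and compare the result against Python's built-in sorted. The sketch below assumes the half-open convention used in the modified version; the helper name check_merge and the snake_case names are made up for this illustration. A merge that writes from position 0 instead of left fails this check immediately, because the element before the range gets overwritten.

```python
def merge(to_sort, left, mid, right):
    """Merge the sorted runs to_sort[left:mid] and to_sort[mid:right] in place."""
    left_run = to_sort[left:mid]
    right_run = to_sort[mid:right]
    i = j = 0
    k = left  # the write position starts at `left`, not at 0
    while i < len(left_run) and j < len(right_run):
        if left_run[i] <= right_run[j]:
            to_sort[k] = left_run[i]
            i += 1
        else:
            to_sort[k] = right_run[j]
            j += 1
        k += 1
    # at most one of the two runs still has elements; copy them over
    to_sort[k:right] = left_run[i:] + right_run[j:]


def check_merge(data, left, mid, right):
    """Hypothetical helper: run merge on a copy and compare with sorted()."""
    expected = data[:left] + sorted(data[left:right]) + data[right:]
    actual = list(data)
    merge(actual, left, mid, right)
    return actual == expected, actual, expected


# the sorted runs [1, 5] and [2, 6] sit at positions 1..4;
# the 9 in front and the 0 behind must stay untouched
ok, actual, expected = check_merge([9, 1, 5, 2, 6, 0], 1, 3, 5)
print(ok, actual)
```

Testing with left > 0 matters here: a call with left == 0 would pass even with the original bug, which is why the left half of the output looked partially processed while the right half was never touched.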

Related

what is wrong with my code? leetcode - 189. Rotate Array

The code works perfectly fine for the first test case but gives wrong answer for the second one. Why is that?
arr = [1,2,3,4,5,6,7]
arr2 = [-1,-100,3,99]
def reverse(array, start, end):
    while start < end:
        array[start], array[end] = array[end], array[start]
        start += 1
        end -= 1
    return array
def rotate(array, k):
    reverse(array, 0, k)
    reverse(array, k+1, len(array)-1)
    reverse(array, 0, len(array)-1)
    return array
print(rotate(arr, 3)) # output: [5, 6, 7, 1, 2, 3, 4]
# print(reverse(arr, 2, 4))
rotate(arr2, 2)
print(arr2) # output: [99, -1, -100, 3] (should be [3, 99, -1, -100])
Your existing logic moves the first k + 1 items from the front of the list to the back.
The solution, however, needs to move k elements from the back of the list to the front. Put another way, it needs to move len(array) - k elements from the front to the back.
To do so, two changes are required in rotate():
Update k to len(array) - k.
Change your logic to move k elements, instead of k + 1, from front to back.
So your rotate() needs to be changed to the following:
def rotate(array, k):
    k = len(array) - k
    reverse(array, 0, k-1)
    reverse(array, k, len(array)-1)
    reverse(array, 0, len(array)-1)
    return array
I think there are better ways to solve this but with your logic this should solve the problem.
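One case the version above does not cover is k being larger than the list length, where len(array) - k becomes negative. A minimal sketch of the same three-reversal idea that reduces k modulo the length first, assuming such inputs can occur (the variable names n and split are introduced here for illustration):

```python
def reverse(array, start, end):
    """Reverse array[start..end] (both ends inclusive) in place."""
    while start < end:
        array[start], array[end] = array[end], array[start]
        start += 1
        end -= 1


def rotate(array, k):
    """Rotate array to the right by k steps, in place."""
    n = len(array)
    if n == 0:
        return array
    k %= n          # rotating by n is a no-op, so only k mod n matters
    split = n - k   # number of elements that move from the front to the back
    reverse(array, 0, split - 1)
    reverse(array, split, n - 1)
    reverse(array, 0, n - 1)
    return array


print(rotate([1, 2, 3, 4, 5, 6, 7], 3))   # [5, 6, 7, 1, 2, 3, 4]
print(rotate([-1, -100, 3, 99], 2))       # [3, 99, -1, -100]
print(rotate([1, 2, 3], 4))               # [3, 1, 2]
```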

Coin Change problem using Memoization (Amazon interview question)

def rec_coin_dynam(target,coins,known_results):
    '''
    INPUT: This function takes in a target amount and a list of possible coins to use.
    It also takes a third parameter, known_results, indicating previously calculated results.
    The known_results parameter should be started with [0] * (target+1)
    OUTPUT: Minimum number of coins needed to make the target.
    '''
    # Default output to target
    min_coins = target
    # Base Case
    if target in coins:
        known_results[target] = 1
        return 1
    # Return a known result if it happens to be greater than 0
    elif known_results[target] > 0:
        return known_results[target]
    else:
        # for every coin value that is <= target
        for i in [c for c in coins if c <= target]:
            # Recursive call, note how we include the known results!
            num_coins = 1 + rec_coin_dynam(target-i,coins,known_results)
            # Reset Minimum if we have a new minimum
            if num_coins < min_coins:
                min_coins = num_coins
                # Reset the known result
                known_results[target] = min_coins
    return min_coins
This runs perfectly fine, but I have a few questions about it.
We give it the following input to run:
target = 74
coins = [1,5,10,25]
known_results = [0]*(target+1)
rec_coin_dynam(target,coins,known_results)
Why are we initialising known_results with zeros and a length of target+1? Why can't we just write
known_results = []
Notice that the code contains lines such as:
known_results[target] = 1
return known_results[target]
known_results[target] = min_coins
Now, let me demonstrate the difference between [] and [0]*something in the python interactive shell:
>>> a = []
>>> b = [0]*10
>>> a
[]
>>> b
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
>>>
>>> a[3] = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list assignment index out of range
>>>
>>> b[3] = 1
>>>
>>> a
[]
>>> b
[0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
The exception IndexError: list assignment index out of range was raised because we tried to access cell 3 of list a, but a has size 0; there is no cell 3. We could put a value in a using a.append(1), but then the 1 would be at position 0, not at position 3.
There was no exception when we accessed cell 3 of list b, because b has size 10, so any index between 0 and 9 is valid.
Conclusion: if you know in advance the size that your array will have, and this size never changes during the execution of the algorithm, then you might as well begin with an array of the appropriate size, rather than with an empty array.
What is the size of known_results? The algorithm needs results for values ranging from 0 to target. How many results is that? Exactly target+1. For instance, if target = 2, then the algorithm will deal with results for 0, 1 and 2; that's 3 different results. Thus known_results must have size target+1. Note that in python, just like in almost every other programming language, a list of size n has n elements, indexed 0 to n-1. In general, in an integer interval [a, b], there are b-a+1 integers. For instance, there are three integers in interval [8, 10] (those are 8, 9 and 10).
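If starting from an empty container is the goal, a dict is the usual alternative to a pre-sized list, because assigning to a missing key creates it instead of raising IndexError. The following is only a sketch of that idea, not the code from the question: the function name min_coins and the use of float("inf") for unreachable amounts are choices made for this example.

```python
def min_coins(target, coins, known_results=None):
    """Minimum number of coins needed to make `target`, memoized in a dict."""
    if known_results is None:
        known_results = {}          # starts empty; keys are added on demand
    if target == 0:
        return 0
    if target in known_results:     # already computed for this amount
        return known_results[target]
    best = float("inf")             # stays inf if the amount cannot be made
    for coin in coins:
        if coin <= target:
            best = min(best, 1 + min_coins(target - coin, coins, known_results))
    known_results[target] = best    # no IndexError: a dict grows as needed
    return best


print(min_coins(74, [1, 5, 10, 25]))  # 8
```

The trade-off is that a dict lookup hashes the key, whereas the list version indexes directly; for the sizes in the question either works.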

How to check a list of lists against a list of lists and count their overlaps? (Python)

I have one list a containing 100 lists and one list x containing 4 lists (all of equal length). I want to test the lists in a against those in x. My goal is to find out how often numbers in a "touch" those in x. Stated differently, all the lists are points on a line and the lines in a should not touch (or cross) those in x.
EDIT
In the code, I am testing each line in a (e.g. a1, a2 ... a100) first against x1, then against x2, x3 and x4. A condition and a counter check whether the a's touch the x's. Note: I am not interested in counting how many items in a1, for example, touch x1. Once a1 and x1 touch, I count that and can move on to a2, and so on.
However, the counter does not update properly. It seems that the code does not test a against every list in x. Any suggestions on how to solve this? Here is my code.
EDIT
I have updated the code so that the problem is easier to replicate.
import numpy as np

x = [[10, 11, 12], [14, 15, 16]]
a = [[11, 10, 12], [15, 17, 20], [11, 14, 16]]
def touch(e, f):
    e = np.array(e)
    f = np.array(f)
    lastitems = []
    counter = 0
    for lst in f:
        if np.all(e < lst): # This is the condition
            lastitems.append(lst[-1]) # This allows checking the end values
        else:
            counter += 1
    c = counter
    return c
touch = touch(x, a)
print(touch)
The result I get is:
2
But I expect this:
1
2
I'm unsure exactly what result you expect; your example and description are still not clear. Anyway, this is what I guess you want. If you want more details, you can uncomment some of the lines that start with #.
i = 0
for j in x:
    print("")
    #print(j)
    counter = 0
    for k in a:
        inters = set(j).intersection(k)
        #print(k)
        #print(inters)
        if inters:
            counter += 1
            #print("yes", counter)
        #else:
            #print("nope", counter)
    print(i, counter)
    i += 1
which prints
0 2
1 2
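The same loop can be wrapped in a function that returns one count per list in x, which is easier to reuse and test. This sketch keeps the guess made above, namely that two lists "touch" when they share at least one value; the function name count_touches is made up for the example.

```python
def count_touches(x, a):
    """For each list in x, count how many lists in a share a value with it."""
    counts = []
    for x_line in x:
        x_values = set(x_line)
        # a list in a is counted once, no matter how many values it shares
        touching = sum(1 for a_line in a if x_values.intersection(a_line))
        counts.append(touching)
    return counts


x = [[10, 11, 12], [14, 15, 16]]
a = [[11, 10, 12], [15, 17, 20], [11, 14, 16]]
print(count_touches(x, a))  # [2, 2]
```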

I am having a Quick Sort error with infinite recursion

I am having problems with my quicksort function: it keeps recursing endlessly when it uses the best of three function. I don't know why it is doing that and I need help. I am trying to practice this for my coding class next semester; this is one of the assignments my friend had last year, and I'm lost when it comes to this error.
This is my quicksort function:
def quick_sort ( alist, function ):
    if len(alist) <= 1:
        return alist + []
    pivot, index = function(alist)
    #print("Pivot:",pivot)
    left = []
    right = []
    for value in range(len(alist)):
        if value == index:
            continue
        if alist[value] <= pivot:
            left.append(alist[value])
        else:
            right.append(alist[value])
    print("left:", left)
    print("right:", right)
    sortedleft = quick_sort( left, function )
    print("sortedleft", sortedleft)
    sortedright = quick_sort( right, function )
    print("sortedright", sortedright)
    completeList = sortedleft + [pivot] + sortedright
    return completeList
#main
alist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
x = quick_sort(alist, best_of_three)
print(x)
this is my best of three function:
def best_of_three( bNlist, nine = False ):
    rightindex = 2
    middleindex = 1
    if nine == False:
        left = blist[0]
        rightindex = int(len(blist) - 1)
        rightvalue = int(blist[rightindex])
        middleindex = int((len(blist) - 1)/2)
        middlevalue = int(blist[middleindex])
        bNlist.append(left)
        bNlist.append(middlevalue)
        bNlist.append(rightvalue)
    BN = bNlist
    print("Values:",BN)
    left = bNlist[0]
    middle = bNlist[1]
    right = bNlist[2]
    if left <= middle <= right:
        return middle , middleindex
    elif left >= middle >= right:
        return middle, middleindex
    elif middle <= right <= left:
        return right, rightindex
    elif middle >= right >= left:
        return right, rightindex
    else:
        return left, 0
#main
bNlist = []
print('Best of Three')
blist = [54,26,93,17,77,31,44,55]
print("")
print( "List: [54,26,93,17,77,31,44,55]" )
x, index = best_of_three(bNlist)
print("Pivot: ",x)
print("----------------------------")
I really don't know why it keeps recursing infinitely.
There is also a third function called ninther:
def ninther( bNlist ):
    stepsize = int(len(blist) / 9)
    left = 0
    middle = left + 2
    right = left + 2 * stepsize
    blist[left]
    blist[middle]
    blist[right]
    leftvalue = blist[left]
    rightvalue = blist[right]
    middlevalue = blist[middle]
    left2 = right + stepsize
    middle2 = left2 + 2
    right2 = left2 + 2 * stepsize
    blist[left2]
    blist[middle2]
    blist[right2]
    left2value = blist[left2]
    middle2value = blist[middle2]
    right2value = blist[right2]
    left3 = right2 + stepsize
    middle3 = left3 + 2
    right3 = left3 + 2 * stepsize
    blist[left3]
    blist[middle3]
    blist[right3]
    left3value = blist[left3]
    middle3value = blist[middle3]
    right3value = blist[right3]
    bN3list = []
    bN2list = []
    bNlist = []
    bNlist.append(leftvalue)
    bNlist.append(middlevalue)
    bNlist.append(rightvalue)
    bN2list.append(left2value)
    bN2list.append(middle2value)
    bN2list.append(right2value)
    bN3list.append(left3value)
    bN3list.append(middle3value)
    bN3list.append(right3value)
    BN3 = bN3list
    BN2 = bN2list
    BN = bNlist
    print("Ninter ")
    print("Group 1:", BN)
    print("Group 2:", BN2)
    print("Group 3:", BN3)
    x = best_of_three(bNlist, True)[0]
    c = best_of_three(bN2list, True)[0]
    d = best_of_three(bN3list, True)[0]
    print("Median 1:", x)
    print("Median 2:", c)
    print("Median 3:", d)
    bN4list = [x,c,d]
    print("All Medians:", bN4list)
    z = best_of_three(bN4list, True)
    return z[0], z[1]
#main
blist = [2, 6, 9, 7, 13, 4, 3, 5, 11, 1, 20, 12, 8, 10, 32, 16, 14, 17, 21, 46]
Y = ninther(blist)
print("Pivot", Y)
print("----------------------------")
I have looked everywhere in it and I can't figure out where the problem is when calling best of three.
Summary: The main error causing infinite recursion is that you don't deal with the case where best_of_three receives a length 2 list. A secondary error is that best_of_three modifies the list you send to it. If I correct these two errors, as below, your code works.
The details: best_of_three([1, 2]) returns (2, 3), implying a pivot value of 2 at the third index, which is wrong. This would give a left list of [1, 2], which then causes exactly the same behavior at the next recursive quick_sort(left, function) call.
More generally, the problem is that the very idea of choosing the best index out of three possible values is impossible for a length 2 list, and you haven't chosen how to deal with that special case.
If I add this special case code to best_of_three, it deals with the length 2 case:
if len(bNlist) == 2:
return bNlist[1], 1
The function best_of_three also modifies bNlist. I have no idea why you have the lines of the form bNlist.append(left) in that function.
L = [15, 17, 17, 17, 17, 17, 17]
best_of_three(L)
print(L) # prints [15, 17, 17, 17, 17, 17, 17, 54, 17, 55]
I removed the append lines, since having best_of_three modify bNlist is unlikely to be what you want, and I have no idea why those lines are there. However, you should ask yourself why they are there to begin with. There might be some reason I don't know about. When I do that, there are a couple of quantities you compute that are never used, so I remove the lines that compute those also.
Then I notice you have the code
rightindex = 2
middleindex = 1
if nine == False:
    rightindex = int(len(blist) - 1)
    middleindex = int((len(blist) - 1)/2)
left = bNlist[0]
middle = bNlist[1]
right = bNlist[2]
This doesn't seem to make any sense, since you set rightindex and middleindex to other values, but then you still access values using the old indices (2 and 1 respectively). So I removed the if nine == False block. Again, ask yourself why you had this code to begin with, maybe there's some other way you should modify this to account for something I don't know about.
The result is the following for best_of_three:
def best_of_three(bNlist):
    print(bNlist)
    if len(bNlist) == 2:
        return bNlist[1], 1
    rightindex = 2
    middleindex = 1
    left = bNlist[0]
    middle = bNlist[1]
    right = bNlist[2]
    if left <= middle <= right:
        return middle , middleindex
    elif left >= middle >= right:
        return middle, middleindex
    elif middle <= right <= left:
        return right, rightindex
    elif middle >= right >= left:
        return right, rightindex
    else:
        return left, 0
If I use this, your code does not recurse infinitely, and it sorts.
I don't know why you mentioned ninther at all, since it seems to have nothing to do with your question. You should probably edit it to remove that code.
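For comparison, here is a sketch of a pivot chooser that looks at the first, middle and last elements of the list it is given, rather than the first three, and that needs no special case for short lists. This is a different technique from the corrected function above (the usual median-of-three), and the names median_of_three and choose_pivot are made up for this example. It does not modify its argument and does not depend on any global list.

```python
def median_of_three(values):
    """Return (pivot_value, pivot_index) from the first, middle and last items."""
    last = len(values) - 1
    indices = [0, last // 2, last]          # may repeat for lists of length 1 or 2
    indices.sort(key=lambda i: values[i])   # order the candidate indices by value
    pivot_index = indices[1]                # the middle candidate is the median
    return values[pivot_index], pivot_index


def quick_sort(values, choose_pivot=median_of_three):
    """Return a new sorted list; the input list is left unchanged."""
    if len(values) <= 1:
        return list(values)
    pivot, pivot_index = choose_pivot(values)
    left, right = [], []
    for i, value in enumerate(values):
        if i == pivot_index:
            continue                        # the pivot itself goes in the middle
        if value <= pivot:
            left.append(value)
        else:
            right.append(value)
    return quick_sort(left, choose_pivot) + [pivot] + quick_sort(right, choose_pivot)


print(median_of_three([54, 26, 93, 17, 77, 31, 44, 55]))  # (54, 0)
print(quick_sort([54, 26, 93, 17, 77, 31, 44, 55]))
```

Because the returned index always refers to a real position in the list that was passed in, the pivot is always excluded from left and right, so each recursive call receives a strictly shorter list and the recursion terminates.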

Dynamic programming table - Finding the minimal cost to break a string

A certain string-processing language offers a primitive operation
which splits a string into two pieces. Since this operation involves
copying the original string, it takes n units of time for a string of
length n, regardless of the location of the cut. Suppose, now, that
you want to break a string into many pieces.
The order in which the breaks are made can affect the total running
time. For example, suppose we wish to break a 20-character string (for
example "abcdefghijklmnopqrst") after characters at indices 3, 8, and
10 to obtain four substrings: "abcd", "efghi", "jk" and "lmnopqrst". If
the breaks are made in left-right order, then the first break costs 20
units of time, the second break costs 16 units of time and the third
break costs 11 units of time, for a total of 47 steps. If the breaks
are made in right-left order, the first break costs 20 units of time,
the second break costs 11 units of time, and the third break costs 9
units of time, for a total of only 40 steps. However, the optimal
solution is 38 (and the order of the cuts is 10, 3, 8).
The input is the length of the string and an ascending-sorted array with the cut indexes. I need to design a dynamic programming table to find the minimal cost to break the string and the order in which the cuts should be performed.
I can't figure out how the table structure should look (certain cells should be the answer to certain sub-problems and should be computable from other entries etc.). Instead, I've written a recursive function to find the minimum cost to break the string: b0, b1, ..., bK are the indexes for the cuts that have to be made to the (sub)string between i and j.
totalCost(i, j, {b0, b1, ..., bK}) = j - i + 1 + min {
    totalCost(b0 + 1, j, {b1, b2, ..., bK}),
    totalCost(i, b1, {b0}) + totalCost(b1 + 1, j, {b2, b3, ..., bK}),
    totalCost(i, b2, {b0, b1}) + totalCost(b2 + 1, j, {b3, b4, ..., bK}),
    ...
    totalCost(i, bK, {b0, b1, ..., b(K - 1)})
} if K + 1 (the number of cuts) > 1,
j - i + 1 otherwise.
Please help me figure out the structure of the table, thanks!
For example, suppose we have a string of length n = 20 that we need to break at positions cuts = [3, 8, 10]. First, add two fake cuts to the array, -1 and n - 1, to avoid edge cases; now cuts = [-1, 3, 8, 10, 19]. Then fill a table M, where M[i, j] is the minimum number of time units needed to make all the breaks strictly between the i-th and j-th cuts. When j = i + 1 there is nothing to break, so M[i, i + 1] = 0. Otherwise the rule is M[i, j] = (cuts[j] - cuts[i]) + min(M[i, k] + M[k, j]), taken over all k with i < k < j. Here cuts[j] - cuts[i] is the length of the piece between the two cuts, which is what the first break of that piece costs wherever it is made. The minimum time to make all the cuts ends up in the cell M[0, len(cuts) - 1]. Full code in Python:
# input
n = 20
cuts = [3, 8, 10]
# add fake cuts
cuts = [-1] + cuts + [n - 1]
cuts_num = len(cuts)
# init table with zeros
table = []
for i in range(cuts_num):
    table += [[0] * cuts_num]
# fill table
for diff in range(2, cuts_num):
    for start in range(0, cuts_num - diff):
        end = start + diff
        table[start][end] = 1e9
        for mid in range(start + 1, end):
            table[start][end] = min(table[start][end],
                                    table[start][mid] + table[mid][end])
        table[start][end] += cuts[end] - cuts[start]
# print result: 38
print(table[0][cuts_num - 1])
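The question also asks for the order in which the cuts should be made. One way to get it, sketched below under the same recurrence, is to remember for every table cell which intermediate cut gave the minimum and then walk those choices from the top cell down. The function name min_break_cost, the choice table and the collect helper are names introduced for this illustration.

```python
def min_break_cost(n, cuts):
    """Return (minimal total cost, order of cuts) for a string of length n."""
    points = [-1] + sorted(cuts) + [n - 1]   # same fake cuts as above
    m = len(points)
    cost = [[0] * m for _ in range(m)]
    choice = [[None] * m for _ in range(m)]  # best first cut for each cell
    for diff in range(2, m):
        for start in range(m - diff):
            end = start + diff
            # pick the intermediate cut with the cheapest two sub-problems
            best_mid = min(range(start + 1, end),
                           key=lambda mid: cost[start][mid] + cost[mid][end])
            choice[start][end] = best_mid
            cost[start][end] = (cost[start][best_mid] + cost[best_mid][end]
                                + points[end] - points[start])

    order = []

    def collect(start, end):
        mid = choice[start][end]
        if mid is None:                      # no cut left inside this piece
            return
        order.append(points[mid])            # this cut is made first ...
        collect(start, mid)                  # ... then the left piece
        collect(mid, end)                    # ... then the right piece

    collect(0, m - 1)
    return cost[0][m - 1], order


print(min_break_cost(20, [3, 8, 10]))  # (38, [10, 3, 8])
```

Where several cuts tie for the minimum, min keeps the first one, so other equally cheap orders may exist.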
In case you find it easier to follow when everything is 1-based (as in problem 6.9 of the DPV Dasgupta Algorithm book, and as in the UdaCity Graduate Algorithm course initiated by GaTech), the following Python code does the same thing as the previous Python code by Jemshit and Aleksei. It follows the chain multiply (binary tree) pattern taught in the video lecture.
import numpy as np
# n is string len, P is of size m where P[i] is the split pos that split string into [1,i] and [i+1,n] (1-based)
def spliting_cost(P, n):
    P = [0,] + P + [n,] # make sure pos list contains both ends of string
    m = len(P)
    P = [0,] + P # both C and P are 1-base indexed for easy reading
    C = np.full((m+1,m+1), np.inf)
    for i in range(1, m+1): C[i, i:i+2] = 0 # any segment <= 2 does not need split so is zero cost
    for s in range(2, m): # s is split string len
        for i in range(1, m-s+1):
            j = i + s
            for k in range(i, j+1):
                C[i,j] = min(C[i,j], P[j] - P[i] + C[i,k] + C[k,j])
    return C[1,m]
spliting_cost([3, 5, 10, 14, 16, 19], 20)
The output answer is 55, same as that with split points [2, 4, 9, 13, 15, 18] in the previous algorithm.
