How to choose indices of subintervals in binary search? - python-3.x

I am implementing an iterative binary search. I wrote the algorithm in two different ways; the only change is high = len(data) versus high = len(data) - 1. In both cases the algorithm seems to run fine, but most sites show high = len(data) - 1 as the correct way. Is using -1 better, and why?
1st code)
def iterative_binary_search(data, target):
    low = 0
    high = len(data) # this line is where I need help
    while low <= high:
        mid = (low + high) // 2
        if target == data[mid]:
            return True
        elif target < data[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return False
2nd code)
def iterative_binary_search(data, target):
    low = 0
    high = len(data) -1 # this line is where I need help
    while low <= high:
        mid = (low + high) // 2
        if target == data[mid]:
            return True
        elif target < data[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return False

One of the two versions does not actually run fine.
Calling the first one ibs1 (with high = len(data)) and the second one ibs2 (with high = len(data) - 1), I get:
>>> haystack = [0,1,2,3,4,5,6,7,8,9]
>>> ibs2(haystack, 11)
False
>>> ibs1(haystack, 11)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in ibs1
IndexError: list index out of range
How to decide between len(data) and len(data) - 1
You need to decide what low and high stand for, and make it very clear in your mind. When low=3 and high=6, what does it mean? Does it mean we are searching between list indices 3 and 6 included? Or excluded? That's up to you to decide. If it's included, then you should use high = len(data) - 1, because that is the index of the highest element of the array. If it's excluded, you should use high = len(data), because that is one past the index of the highest element in the array.
Both decisions are fine, but the decision must then be reflected in the logic of the rest of the code: the loop condition and the way high is updated.
Hence, this code would be correct as well:
def ibs3(haystack, needle):
    low = 0
    high = len(haystack)
    while low < high:
        mid = (low + high) // 2
        if needle == haystack[mid]:
            return True
        elif needle < haystack[mid]:
            high = mid
        else:
            low = mid + 1
    return False
Note that in Python, the convention is most often to include low and exclude high. For instance, print(list(range(7, 10))) prints [7, 8, 9]: no number 10 in there!
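The standard library follows the same include-low, exclude-high convention: bisect.bisect_left searches the range from 0 up to, but not including, len(a) by default. As a minimal sketch (the helper name contains is made up for this example), a membership test on a sorted list could look like this:

```python
from bisect import bisect_left

def contains(haystack, needle):
    # bisect_left searches the half-open range [0, len(haystack)) by default
    # and returns the leftmost position where needle could be inserted.
    i = bisect_left(haystack, needle)
    # i can equal len(haystack) when needle is larger than every element,
    # so check the bound before indexing.
    return i < len(haystack) and haystack[i] == needle

haystack = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(contains(haystack, 11))
print(contains(haystack, 9))
```

Because the returned index may be one past the last element, the bound check has to come before the comparison, which is the same care the excluded-high version of the loop needs.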

Related

Search optimisation in Python

def CountingVallys(PathsTaken):
    #Converts the Strings U and D into 1 and -1 respectively
    Separate_Paths = [i for i in PathsTaken]
    for index, i in enumerate(Separate_Paths):
        if i == "D":
            Separate_Paths[index] = -1
        else:
            Separate_Paths[index] = 1
    Total_travels = [sum(Separate_Paths[0:i+1]) for i in range(len(Separate_Paths))]
    #ValleyDistance shows the indexes where the traveller is below sea level and Valley Depth shows the depth at those
    #Indexes
    ValleyDistance = []
    ValleyDepth = []
    for Distance, Depth in enumerate(Total_travels):
        if Depth < 0:
            ValleyDistance.append(Distance)
            ValleyDepth.append(Depth)
    #Checks the distance between each index to shows if the valley ends (Difference > 1)
    NumberOfValleys = []
    DistanceOfValleys = []
    TempDistance = 1
    for index, Distance in enumerate(ValleyDistance):
        # Check if final value, if so, check if the valley is distance 1 or 2 and append the final total of valleys
        if ValleyDistance[index] == ValleyDistance[-1]:
            if ValleyDistance[index] - ValleyDistance[index - 1] == 1:
                TempDistance = TempDistance + 1
                DistanceOfValleys.append(TempDistance)
                NumberOfValleys.append(1)
            elif ValleyDistance[index] - ValleyDistance[index - 1] > 1:
                DistanceOfValleys.append(TempDistance)
                NumberOfValleys.append(1)
        #For all indexes apart from the final index
        if ValleyDistance[index] - ValleyDistance[index-1] == 1:
            TempDistance = TempDistance + 1
        elif ValleyDistance[index] - ValleyDistance[index-1] > 1:
            DistanceOfValleys.append(TempDistance)
            NumberOfValleys.append(1)
            TempDistance = 1
    NumberOfValleys = sum(NumberOfValleys)
    return NumberOfValleys
if __name__ == "__main__":
    Result = CountingVallys("DDUDUUDUDUDUD")
    print(Result)
An avid hiker keeps meticulous records of their hikes. Hikes always start and end at sea level, and each step up (U) or down (D) represents a unit change in altitude. We define the following terms:
A valley is a sequence of consecutive steps below sea level, starting with a step down from sea level and ending with a step up to sea level.
Find and print the number of valleys walked through.
My submission for this question was flagged because its execution took too long, and I'm wondering if there are any clear optimisations I could make to speed it up. I believe the use of for-loops is to blame, but I'm not sure of any other way to carry out these steps.
Total_travels = [sum(Separate_Paths[0:i+1]) for i in range(len(Separate_Paths))]
In the code above, why repeat a computation you have already performed? Each prefix sum can be built from the previous one:
sum(Separate_Paths[0:i+1]) = sum(Separate_Paths[0:i]) + Separate_Paths[i]
Recomputing every prefix from scratch makes that line quadratic in the length of the path. Building the list Total_travels incrementally makes it linear, which should take care of the long execution time of your program.
>>> a
[2, 6, 4, 9, 10, 3]
>>> cumulative_sum = []
>>> sum_till_now = 0
>>> for x in a:
...     sum_till_now += x
...     cumulative_sum.append(sum_till_now)
...
>>> cumulative_sum
[2, 8, 12, 21, 31, 34]
>>>
numpy has a built-in cumsum which I believe to be overkill for your problem.
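If pulling in numpy feels like too much, the standard library's itertools.accumulate produces the same running sums. The sketch below applies it to the hiking problem as stated (a valley ends with a step up to sea level); the function name count_valleys is made up for this example, and it counts only completed valleys, so it is not a drop-in copy of the code in the question.

```python
from itertools import accumulate

def count_valleys(path):
    # Map each step to +1 (up) or -1 (down).
    steps = [1 if step == "U" else -1 for step in path]
    valleys = 0
    previous = 0  # the hike starts at sea level
    # accumulate yields the altitude after each step in a single pass.
    for altitude in accumulate(steps):
        # Arriving at sea level from below closes a valley.
        if altitude == 0 and previous < 0:
            valleys += 1
        previous = altitude
    return valleys

print(count_valleys("UDDDUDUU"))
```

Each step is visited once, so the running time grows linearly with the length of the path.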
Here is an alternative solution.
Using a stack makes it easy to find the valleys:
For every new step of the same type ("D" or "U") as the one on top of the stack, push it onto the stack.
Before pushing, check the last item: if it is the opposite of the current step, pop it instead (if(stack[-1]!=path[i]): stack.pop()) and remember the current step (last_path=path[i]).
If the stack is empty, the path has just crossed sea level (the end of a valley or of a mountain). If the last step was "U", it was a valley, so we count it (if(len(stack)==0): if(last_path=='U'): valleys+=1).
The path may also reach sea level on its very last step, so the same check is needed once more outside the loop (if(len(stack)==0 and last_path=='U'): valleys+=1).
Here is the code:
def countingValleys(steps, path):
    stack=[]
    valleys=0
    last_path=None
    for i in range(len(path)):
        if(len(stack)==0):
            if(last_path=='U'):
                valleys+=1
            stack.append(path[i])
        else:
            if(stack[-1]!=path[i]):
                stack.pop()
                last_path=path[i]
            else:
                stack.append(path[i])
    if(len(stack)==0 and last_path=='U'):
        valleys+=1
    return valleys
print(countingValleys(12,"DUDUDUUUUDDDDU"))

How does a value from a previous level in recursion go back up?

I'm trying to write a recursive function that returns the minimum number of coins needed to make change, but I think my understanding of what each layer of the stack returns is wrong. What I want is for the coin count to be passed back up when the recursion reaches its base case, but looking at the debugger, the coin count decreases on the way back up.
I've already looked at solutions to this problem, but they all seem to use dynamic programming. I know that is more efficient in terms of complexity, but I want to figure out how to do the recursion before adding the dynamic programming portion.
def min_coin(coin_list, value, counter = 0):
    if value == 0:
        return 0
    else:
        for coin in coin_list:
            if coin <= value:
                sub_result = value - coin
                min_coin(coin_list, sub_result, counter)
                counter +=1
                return counter
    #counter += 1 #should add returning out from,
    #return counter
coin_list = [5, 2, 1]
value = 8
print(min_coin(coin_list,value))
I want an output of 3, but the actual output is 1 no matter the value
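You need to increment the counter before calling min_coin(), return the counter in the base case, and return the result of the recursive call so that the value travels back up the stack.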
def min_coin(coin_list, value, counter = 0):
    if value == 0:
        return counter
    else:
        for coin in coin_list:
            if coin <= value:
                sub_result = value - coin
                return min_coin(coin_list, sub_result, counter+1)
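Note that the version above always takes the first coin that fits, which is a greedy choice. It gives 3 for coins [5, 2, 1] and value 8, but a greedy choice is not guaranteed to be minimal for every coin set. As a rough sketch of plain recursion without dynamic programming (the name min_coins and the use of None for "no solution" are choices made for this example), each level can try every coin and hand its own best count back to its caller:

```python
def min_coins(coin_list, value):
    # Base case: nothing left to pay, so zero coins are needed.
    if value == 0:
        return 0
    best = None  # None means "no way to make this value"
    for coin in coin_list:
        if coin <= value:
            # The recursive call returns the best count for the remainder.
            sub_result = min_coins(coin_list, value - coin)
            if sub_result is not None:
                candidate = sub_result + 1  # add the coin used at this level
                if best is None or candidate < best:
                    best = candidate
    # This return is how the value travels back up to the caller.
    return best

print(min_coins([5, 2, 1], 8))
print(min_coins([4, 3, 1], 6))
```

The second call is a case where always taking the largest coin first would use 4 + 1 + 1, while trying every coin finds 3 + 3. Without memoization the running time grows quickly with value, which is where dynamic programming comes in.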
You can also solve your task without recursion. This answer is from geekforcoders:
# Python 3 program to find minimum
# number of denominations
def findMin(V):
    # All denominations of Indian Currency
    deno = [1, 2, 5, 10, 20, 50,
            100, 500, 1000]
    n = len(deno)
    # Initialize Result
    ans = []
    # Traverse through all denomination
    i = n - 1
    while(i >= 0):
        # Find denominations
        while (V >= deno[i]):
            V -= deno[i]
            ans.append(deno[i])
        i -= 1
    # Print result
    for i in range(len(ans)):
        print(ans[i], end = " ")
# Driver Code
if __name__ == '__main__':
    n = 93
    print("Following is minimal number",
          "of change for", n, ": ", end = "")
    findMin(n)

Maximum Number in Mountain Sequence

Given a mountain sequence of n integers, which first increase and then decrease, find the mountain top.
Example
Given nums = [1, 2, 4, 8, 6, 3] return 8
Given nums = [10, 9, 8, 7], return 10
class Solution:
    """
    @param nums: a mountain sequence which first increases and then decreases
    @return: the mountain top
    """
    def mountainSequence(self, nums):
        # write your code here
        if nums == []:
            return None
        if len(nums) <= 1:
            return nums[0]
        elif len(nums) <= 2:
            return max(nums[0], nums[1])
        for i in range(len(nums) -2):
            if nums[i] >= nums[i + 1]:
                return nums[i]
        return nums[-1]
It fails on [3,5,3]. Based on my analysis, it goes wrong after running the for loop, but I cannot figure out why the for loop fails.
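One likely cause: range(len(nums) - 2) stops one pair early, so for [3,5,3] only the pair (3, 5) is compared and the code falls through to return nums[-1]. A minimal sketch of the same linear scan with the range extended to cover the last pair (the function name mountain_top_linear is made up for this example):

```python
def mountain_top_linear(nums):
    if not nums:
        return None
    # range(len(nums) - 1) visits every adjacent pair, including the last one.
    for i in range(len(nums) - 1):
        # The first element that is not smaller than its successor is the top.
        if nums[i] >= nums[i + 1]:
            return nums[i]
    # No descent found: the sequence only increases, so the top is the last element.
    return nums[-1]

print(mountain_top_linear([3, 5, 3]))
```

With the full range, the separate special cases for lists of length 1 and 2 are no longer needed.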
This should be more efficient than your approach. It is a binary search customized for your use case:
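def top(lst):
    low = 0
    high = len(lst) - 1  # last valid index, so lst[i+1] never goes out of range
    while low != high:
        i = (high+low)//2
        if lst[i] < lst[i+1]:
            low = i+1
        else:
            high = i
    return low  # index of the top; the value is lst[low]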
It starts in the middle of the list and checks whether the series is still increasing there. If it is, it sets low and ignores all indices below low for the rest of the algorithm. If the series is already decreasing, high is set to the current index and all the elements above it are ignored, and so on. When high == low the algorithm terminates, and low is the index of the top.
If the list contains two or more equal neighbouring elements (a plateau), the algorithm is not guaranteed to find the top.
I also skipped the tests for empty lists or lists of length 1.
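For completeness, here is a sketch of the same binary search with the skipped checks filled in. It assumes an inclusive upper bound (high = len(nums) - 1) so that nums[mid + 1] always stays inside the list, and it returns the value rather than the index. The name mountain_top is made up for this example.

```python
def mountain_top(nums):
    # Guard for the empty list; a single element is handled by the loop
    # condition below, which is simply never entered.
    if not nums:
        return None
    low, high = 0, len(nums) - 1  # both bounds are inclusive
    while low < high:
        mid = (low + high) // 2   # mid < high, so mid + 1 is a valid index
        if nums[mid] < nums[mid + 1]:
            low = mid + 1         # still climbing: the top is to the right
        else:
            high = mid            # descending (or at the top): keep mid
    return nums[low]

print(mountain_top([1, 2, 4, 8, 6, 3]))
print(mountain_top([10, 9, 8, 7]))
```

Like the original, this sketch assumes a strictly increasing and then strictly decreasing sequence; plateaus are not handled.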
This will get all triplets from your input, keep those whose middle element is higher than both its left and right neighbours, and return the one that is highest overall:
def get_mountain_top(seq):
    triplets = zip(seq, seq[1:], seq[2:])
    tops = list(filter(lambda x: x[0] < x[1] > x[2], triplets))
    if tops:
        # max not allowed, leverage sorted
        return sorted(tops, key = lambda x:x[1])[-1]
        # return max(tops,key = lambda x:x[1])
    return None
print(get_mountain_top([1,2,3,4,3,2,3,4,5,6,7,6,5]))
print(get_mountain_top([1,1,1]))
Output:
(6, 7, 6)
None
It does not handle plateaus.
Documentation:
zip(), filter() and max()

How can I optimize this python code for sorting large input?

I am trying to solve this problem on HackerRank which requires you to sort a list of integers and find how many times a number was moved in order to place in the correct ascending order (bribes within the context of the problem).
My code passes 8 of the 12 test cases but fails when the input is too large with a timeout error. This seems to be a common indicator on HackerRank that the code is too slow for the problem at hand. So is there a way to optimize this code so that it runs faster on larger data sets?
def minimum_bribes(queue):
    """Returns the minimum number of bribes people in a queue accepted."""
    # Variable to keep track of bribes
    bribes = 0
    # Check if queue is too chaotic
    for i in queue:
        index = queue.index(i)
        if i - index > 3:
            return "Too chaotic"
    # Use a bubble sort to find number of bribes
    for i in range(len(queue) - 1):
        for j in range(len(queue) - 1 - i):
            if queue[j] > queue[j + 1]:
                queue[j], queue[j + 1] = queue[j + 1], queue[j]
                bribes += 1
    return bribes
# Number of test cases
t = int(input())
results = []
for _ in range(t):
    # Number of people
    n = int(input())
    # Final State of queue
    q = list(map(int, input().rstrip().split()))
    # Add bribe counts to results array
    results.append(minimum_bribes(q))
# Print results
for result in results:
    print(result)
I would recommend using a while loop that tracks whether a swap happened: if there was no swap in the previous pass, the queue is already sorted and there is no need to run another pass.
def minimumBribes(queue):
    for i in queue:
        index = queue.index(i)
        if (i - index) > 3:
            print("Too chaotic")
            return
    n = len(queue)
    swap =0
    swapped = True
    j =0
    while swapped:
        j+=1
        swapped = False
        for i in range(n-j):
            if queue[i] > queue[i+1]:
                queue[i], queue[i+1] = queue[i+1], queue[i]
                swap +=1
                swapped = True
    print(swap)
    return swap
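Both versions above still call queue.index(i) inside a loop and fall back on a bubble sort, so the worst case stays quadratic. As a rough sketch of a different approach, assuming (as the threshold in the chaos check suggests) that people are labelled 1..n by their starting position and nobody can move forward more than two places, the bribes can be counted without sorting. The function name minimum_bribes_fast is made up for this example.

```python
def minimum_bribes_fast(queue):
    bribes = 0
    for pos, person in enumerate(queue):
        # person started at index person - 1; moving forward more than
        # two places is impossible under the assumed rule.
        if (person - 1) - pos > 2:
            return "Too chaotic"
        # Anyone who bribed this person has a larger label and, under the
        # same rule, cannot stand further forward than index person - 2.
        for j in range(max(0, person - 2), pos):
            if queue[j] > person:
                bribes += 1
    return bribes

print(minimum_bribes_fast([2, 1, 5, 3, 4]))
print(minimum_bribes_fast([2, 5, 1, 3, 4]))
```

enumerate supplies each position directly, and the inner loop inspects only a short window in front of each person instead of repeatedly sweeping the whole queue.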

Binary Search implementation in Python

I am trying to implement a solution using binary search. I have a list of numbers
list = [1, 2, 3, 4, 6]
value to be searched = 2
I have written something like this
def searchBinary(list, sval):
    low = 0
    high = len(list)
    while low < high:
        mid = low + math.floor((high - low) / 2)
        if list[mid] == sval:
            print("found : ", sval)
        elif l2s[mid] > sval:
            high = mid - 1
        else:
            low = mid + 1
But when I run this, I get an error like: index out of range. Please help me identify the issue.
A few things.
Your naming is inconsistent. Also, do not use list as a variable name; you are shadowing the built-in.
The loop condition should be while low <= high. This is important.
You do not return when you find the value. This will result in an infinite loop.
def searchBinary(l2s, sval): # do not use 'list' as a variable
    low = 0
    high = len(l2s) - 1 # index of the last element, so l2s[mid] is always valid
    while low <= high: # this is the main issue. As long as low is not greater than high, the while loop must run
        mid = (high + low) // 2
        if l2s[mid] == sval:
            print("found : ", sval)
            return
        elif l2s[mid] > sval:
            high = mid - 1
        else:
            low = mid + 1
And now,
list_ = [1, 2, 3, 4, 6]
searchBinary(list_, 2)
Output:
found :  2
UPDATE: changed to high = len(l2s) - 1, per the comments.
Three issues:
You used l2s instead of list (the actual name of the parameter).
Your while condition should be low <= high, not low < high.
You should presumably return the index when the value is found, or None (or perhaps -1?) if it's not found.
A couple other small changes I made:
It's a bad idea to hide the built-in list. I renamed the parameter to lst, which is commonly used in Python in this situation.
mid = (low + high) // 2 is a simpler form of finding the midpoint.
Python convention is to use snake_case, not camelCase, so I renamed the function.
Fixed code:
def binary_search(lst, sval):
    low = 0
    high = len(lst) - 1
    while low <= high:
        mid = (low + high) // 2
        if lst[mid] == sval:
            return mid
        elif lst[mid] > sval:
            high = mid - 1
        else:
            low = mid + 1
    return None
print(binary_search([1, 2, 3, 4, 6], 2)) # 1
