Wrong output binary search program - python-3.x

I've written the following binary search program:
def binary_search(array, x, low=0, high=None):
    if high is None:
        high = len(array)
    while low < high:
        mid = (low+high)//2
        midval = array[mid]
        if midval < x:
            low = mid+1
        elif midval > x:
            high = mid
        else:
            return mid
    return -1
When I put the following:
binary_search([1,2,2,3],2)
the output given by the program is
2
However, I would like the program to give as output the index of the first integer 'x' it finds. So in the previous example it would be '1' instead of '2'. Any idea on how I can alter this?

You need to remove the early out (the final else condition), replacing the prior elif with a straight else; only when the loop terminates do you test for equality and choose to return the index found or -1 if it wasn't found:
def binary_search(array, x, low=0, high=None):
    if high is None:
        high = len(array)
    while low < high:
        mid = (low+high)//2
        if array[mid] < x:
            low = mid+1
        else:
            high = mid
    return -1 if low >= len(array) or array[low] != x else low
It's also a good idea to behave this way because, in general, you don't want to perform multiple comparisons per loop iteration (< and > would each invoke a comparison, which could be expensive depending on the type). Cutting that down to exactly one element comparison per iteration saves time, and sticking to < often helps as well: Python libraries often hand-implement < and ==, and use wrappers to implement the other comparators in terms of < and ==, making them slower.
This is actually what bisect.bisect_left does in its pure Python implementation; it's otherwise nearly identical to your code. Dropping the early out does make the search take longer when the value is present, because it always runs the full log(n) steps to identify the leftmost instance of a value, but the incremental cost is usually going to be small unless your input has many repeated values.
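As a rough sketch of the same idea using the standard library: bisect.bisect_left returns the leftmost insertion point, so the only extra work is checking that the value is really there. The helper name index_of_first is made up for this illustration.

```python
from bisect import bisect_left

def index_of_first(array, x):
    """Return the index of the first occurrence of x in a sorted list, or -1."""
    # bisect_left gives the leftmost position where x could be inserted
    # while keeping the list sorted.
    i = bisect_left(array, x)
    # x is present only if that position is in range and actually holds x.
    if i < len(array) and array[i] == x:
        return i
    return -1

print(index_of_first([1, 2, 2, 3], 2))  # 1
```

Like the hand-written loop, this assumes the input list is already sorted.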

Given [1,2,2,3] the line mid = (low+high)//2 is evaluated for the first run as:
mid = (0 + 4) // 2 ==> 2
Then array[mid] falls through to the else branch (its value is exactly 2), and the index returned is mid (2).
if midval < x:
    ...
elif midval > x:
    ...
else:
    return mid

Related

A strategy-proof method of finding the time complexity of complex algorithms?

I have a question regarding time complexity (big-O) in Python. I want to understand the general method I would need to follow when trying to find the big-O of a complex algorithm. I understand the reasoning behind calculating the time complexity of simple algorithms, such as a for loop iterating over a list of n elements being O(n), or two nested for loops each iterating over a list of n elements being O(n**2). But for more complex algorithms that combine multiple if-elif-else statements with for loops, I would like to know whether there is a strategy for determining the big-O of my code, step by step and based only on the code, using simple heuristics (such as ignoring constant-time if statements, always squaring the n upon going over a for loop, or doing something specific when encountering an else statement).
I have created a battleship game, for which I would like to find the time complexity, using such an aforementioned strategy.
from random import randint
class Battle:
    def __init__(self):
        self.my_grid = [[False,False,False,False,False,False,False,False,False,False],[False,False,False,False,False,False,False,False,False,False],[False,False,False,False,False,False,False,False,False,False],[False,False,False,False,False,False,False,False,False,False],[False,False,False,False,False,False,False,False,False,False],[False,False,False,False,False,False,False,False,False,False],[False,False,False,False,False,False,False,False,False,False],[False,False,False,False,False,False,False,False,False,False],[False,False,False,False,False,False,False,False,False,False],[False,False,False,False,False,False,False,False,False,False]]
    def putting_ship(self,x,y):
        breaker = False
        while breaker == False:
            r1=x
            r2=y
            element = self.my_grid[r1][r2]
            if element == True:
                continue
            else:
                self.my_grid[r1][r2] = True
                break
    def printing_grid(self):
        return self.my_grid
    def striking(self,r1,r2):
        element = self.my_grid[r1][r2]
        if element == True:
            print("STRIKE!")
            self.my_grid[r1][r2] = False
            return True
        elif element == False:
            print("Miss")
            return False
def game():
    battle_1 = Battle()
    battle_2 = Battle()
    score_player1 = 0
    score_player2 = 0
    turns = 5
    counter_ships = 2
    while True:
        input_x_player_1 = input("give x coordinate for the ship, player 1\n")
        input_y_player_1 = input("give y coordinate for the ship, player 1\n")
        battle_1.putting_ship(int(input_x_player_1),int(input_y_player_1))
        input_x_player_2 = randint(0,9)
        input_y_player_2 = randint(0,9)
        battle_2.putting_ship(int(input_x_player_2),int(input_y_player_2))
        counter_ships -= 1
        if counter_ships == 0:
            break
    while True:
        input_x_player_1 = input("give x coordinate for the ship\n")
        input_y_player_1 = input("give y coordinate for the ship\n")
        my_var = battle_1.striking(int(input_x_player_1),int(input_y_player_1))
        if my_var == True:
            score_player1 += 1
            print(score_player1)
        input_x_player_2 = randint(0,9)
        input_y_player_2 = randint(0,9)
        my_var_2 = battle_2.striking(int(input_x_player_2),int(input_y_player_2))
        if my_var_2 == True:
            score_player2 += 1
            print(score_player2)
        counter_ships -= 1
        if counter_ships == 0:
            break
    print("the score for player 1 is",score_player1)
    print("the score for player 2 is",score_player2)
print(game())
If it's just nested for loops and if/else statements, you can take the approach ibonyun has suggested - assume all if/else cases are covered and look at the deepest loops (being aware that some operations like sorting, or copying an array, might hide loops of their own.)
However, your code also has while loops. In this particular example it's not too hard to replace them with fors, but for code containing nontrivial whiles there is no general strategy that will always give you the complexity - this is a consequence of the halting problem.
For example:
def collatz(n):
    n = int(abs(n))
    steps = 0
    while n != 1:
        if n%2 == 1:
            n=3*n+1
        else:
            n=n//2
        steps += 1
        print(n)
    print("Finished in",steps,"steps!")
So far nobody has been able to prove that this will even finish for every positive n, let alone find an upper bound on the run time.
Side note: instead of the screen-breaking
self.my_grid = [[False,False,...False],[False,False,...,False],...,[False,False,...False]]
consider something like:
grid_size = 10
self.my_grid = [[False for i in range(grid_size)] for j in range(grid_size)]
which is easier to read and check.
Empirical:
You could do some time trials while increasing n (so maybe increasing the board size?) and plot the resulting data. You could tell by the curve/slope of the line what the time complexity is.
Theoretical:
Parse the script and keep track of the biggest O() you find for any given line or function call. Any comparison-based sorting operation will typically give you O(n log n). A for loop inside a for loop will give you O(n^2) (assuming they're both iterating over the input data), etc. Time complexity is about the broad strokes. O(n) and O(3n) are both linear time, and that's what really matters. I don't think you need to worry about the minutiae of all your if-elif-else logic. Maybe just focus on the worst-case scenario?
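The empirical approach described above could look roughly like the sketch below. It assumes the code under test can be called with an input of adjustable size; nested_pairs is a hypothetical stand-in workload, and time_growth is a made-up helper name, neither comes from the question.

```python
import timeit

def nested_pairs(data):
    """Hypothetical workload: visit every pair of elements (quadratic)."""
    count = 0
    for _ in data:
        for _ in data:
            count += 1
    return count

def time_growth(func, sizes, repeats=3):
    """Return a list of (n, best_seconds) for each input size n."""
    results = []
    for n in sizes:
        data = list(range(n))
        # Take the best of a few runs to reduce timing noise.
        best = min(timeit.repeat(lambda: func(data), number=1, repeat=repeats))
        results.append((n, best))
    return results

for n, seconds in time_growth(nested_pairs, [200, 400, 800]):
    print(n, seconds)
```

If doubling n roughly quadruples the time, that suggests quadratic growth; if it roughly doubles, linear. Timings are noisy, so treat the result as a hint rather than a proof.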

find the first occurrence of a number greater than k in a sorted array

For the given sorted list, the program should return the index of the first number in the list that is greater than the number given as input.
When I run the code to check whether it works, I get two outputs: one is the value and the other is None.
For example, if I give an input of 3 to the code below, the expected output is the index of 20, i.e. 1; instead I get 1 followed by None.
If I give a value that is greater than every number in the list, I get the correct output, i.e. "The entered number is greater than the numbers in the list".
num_to_find = int(input("Enter the number to be found"))
a=[2,20,30]
def occur1(a,num_to_find):
    j = i = 0
    while j==0:
        if a[len(a)-1] > num_to_find:
            if num_to_find < a[i]:
                j=1
                print(i)
                break
            else:
                i = i + 1
        else:
            ret_state = "The entered number is greater than the numbers in the list"
            return ret_state
print(occur1(a,num_to_find))
This code is difficult to reason about due to extra variables, poor variable names (j is typically used as an index, not a bool flag), usage of break, nested conditionals and side effects. It's also inefficient because it needs to visit each element in the list in the worst-case scenario and fails to take advantage of the sorted nature of the list. However, it appears to work.
Your first misunderstanding is likely that print(i) is printing the index of the next largest element rather than the element itself. In your example call of occur1([2, 20, 30], 3), 1 is where 20 lives in the array.
Secondly, once the found element is printed, the function returns None after it breaks from the loop, and print dutifully prints None. Hopefully this explains your output--you can use return a[i] in place of break to fix your immediate problem and meet your expectations.
Having said that, Python has a builtin module for this: bisect. Here's an example:
from bisect import bisect_right
a = [1, 2, 5, 6, 8, 9, 15]
index_of_next_largest = bisect_right(a, 6)
print(a[index_of_next_largest]) # => 8
If the next number greater than k is out of bounds, you can try/except that or use a conditional to report the failure as you see fit. This function takes advantage of the fact that the list is sorted using a binary search algorithm, which cuts the search space in half on every step. The time complexity is O(log(n)), which is very fast.
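A minimal sketch of the conditional variant, returning the index (which is what the question asks for) and None when nothing in the list is greater. The helper name index_of_next_greater is invented for this example.

```python
from bisect import bisect_right

def index_of_next_greater(a, k):
    """Index of the first element of sorted list a that is > k, or None."""
    i = bisect_right(a, k)
    # bisect_right returns len(a) when no element is greater than k.
    return i if i < len(a) else None

a = [2, 20, 30]
print(index_of_next_greater(a, 3))   # 1
print(index_of_next_greater(a, 30))  # None
```

Returning None instead of a message string leaves it to the caller to decide how to report the failure.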
If you do wish to stick with a linear algorithm similar to your solution, you can simplify your logic to:
def occur1(a, num_to_find):
    for n in a:
        if n > num_to_find:
            return n
# test it...
a = [2, 5, 10]
for i in range(11):
    print(i, " -> ", occur1(a, i))
Output:
0 -> 2
1 -> 2
2 -> 5
3 -> 5
4 -> 5
5 -> 10
6 -> 10
7 -> 10
8 -> 10
9 -> 10
10 -> None
Or, if you want the index of the next largest number:
def occur1(a, num_to_find):
    for i, n in enumerate(a):
        if n > num_to_find:
            return i
But I want to stress that the binary search is far superior to the linear search here. For a list of a billion elements, the binary search will make about 30 comparisons in the worst case (log2 of a billion is roughly 30), where the linear version will make a billion comparisons. The only reason not to use it is if the list can't be guaranteed to be pre-sorted, which isn't the case here.
To make this more concrete, you can play with this program (but use the builtin module in practice):
import random
def bisect_right(a, target, lo=0, hi=None, cmps=0):
    if hi is None:
        hi = len(a)
    mid = (hi - lo) // 2 + lo
    cmps += 1
    if lo <= hi and mid < len(a):
        if a[mid] < target:
            return bisect_right(a, target, mid + 1, hi, cmps)
        elif a[mid] > target:
            return bisect_right(a, target, lo, mid - 1, cmps)
        else:
            return cmps, mid + 1
    return cmps, mid + 1
def linear_search(a, target, cmps=0):
    for i, n in enumerate(a):
        cmps += 1
        if n > target:
            return cmps, i
    return cmps, i
if __name__ == "__main__":
    random.seed(42)
    trials = 10**3
    list_size = 10**4
    binary_search_cmps = 0
    linear_search_cmps = 0
    for n in range(trials):
        test_list = sorted([random.randint(0, list_size) for _ in range(list_size)])
        test_target = random.randint(0, list_size)
        res = bisect_right(test_list, test_target)[0]
        binary_search_cmps += res
        linear_search_cmps += linear_search(test_list, test_target)[0]
    binary_search_avg = binary_search_cmps / trials
    linear_search_avg = linear_search_cmps / trials
    s = "%s search made %d comparisons across \n%d searches on random lists of %d elements\n(found the element in an average of %d comparisons\nper search)\n"
    print(s % ("binary", binary_search_cmps, trials, list_size, binary_search_avg))
    print(s % ("linear", linear_search_cmps, trials, list_size, linear_search_avg))
Output:
binary search made 12820 comparisons across
1000 searches on random lists of 10000 elements
(found the element in an average of 12 comparisons
per search)
linear search made 5013525 comparisons across
1000 searches on random lists of 10000 elements
(found the element in an average of 5013 comparisons
per search)
The more elements you add, the worse the situation looks for the linear search.
I would do something along the lines of:
num_to_find = int(input("Enter the number to be found"))
a=[2,20,30]
def occur1(a, num_to_find):
    for i in a:
        if not i <= num_to_find:
            return a.index(i)
    return "The entered number is greater than the numbers in the list"
print(occur1(a, num_to_find))
Which gives the output of 1 (when inputting 3).
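The reason yours gives you 2 outputs is that your code has 2 print calls: the print(i) inside the function, and the print(occur1(...)) around the call, which prints the function's return value (None after the break).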

Can someone help me with my Goldbach code? When I call the function with 1,000,000 it doesn't produce any output and I don't know why.

I need to reach 1,000,000, but for some reason the function doesn't give me any output for 1,000,000, and I don't know what I am doing wrong. Every time I put in a small number like 500 it gives me the correct output, but as soon as I put in 999,999 or 1,000,000 it just doesn't produce any output. When I do a keyboard interrupt, it says it stopped at break, but I need that break so that each value only appears once.
bachslst=[]
primeslst=[]
q=[]
newlst=[]
z=[]
def goldbach(limit):
    primes = dict()
    for i in range(2, limit+1):
        primes[i] = True
    for i in primes:
        factors = range(i, limit+1, i)
        for f in factors[1:]:
            primes[f] = False
    for i in primes:
        if primes[i]==True:
            z.append(i)
    for num in range(4,limit+1,2):
        for k in range(len(z)):
            for j in z:
                if (k + j ) == num :
                    x=(str(k),str(j))
                    q.append(x)
                    newlst.append([x,[num]])
                    break
        bachslst.append(num)
    print(bachslst,'\n')
    return newlst
The break that is referred to is not the break in the code, it is the break caused by the keyboard interrupt.
If you want to get your result in less than 1.5 hours, try to reduce the amount of computing that you are doing. There are many implementations for testing the Goldbach Conjecture if you do a search. Some are in other languages, but you can still use them to influence your algorithm.
I have not looked at it, but here is another implementation in Python: https://codereview.stackexchange.com/questions/99161/function-to-find-two-prime-numbers-that-sum-up-to-a-given-even-number
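As one possible way to cut down the computing, here is a sketch that builds the prime table once and then, for each even number, stops at the first prime p for which num - p is also prime. The names sieve and goldbach_pairs are made up for this illustration, and the sketch assumes limit >= 4. (It also adds actual primes; in the question's inner loops, k appears to be an index into z rather than a prime.)

```python
from math import isqrt

def sieve(limit):
    """Sieve of Eratosthenes: flags[i] is 1 when i is prime, else 0."""
    flags = bytearray([1]) * (limit + 1)
    flags[0] = 0
    flags[1] = 0
    for i in range(2, isqrt(limit) + 1):
        if flags[i]:
            # Cross out the multiples of i, starting at i*i.
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return flags

def goldbach_pairs(limit):
    """Map each even num in 4..limit to one pair (p, q) with p + q == num."""
    is_prime = sieve(limit)
    primes = [i for i in range(2, limit + 1) if is_prime[i]]
    pairs = {}
    for num in range(4, limit + 1, 2):
        for p in primes:
            if p > num // 2:
                break  # no pair found with p <= q
            if is_prime[num - p]:
                pairs[num] = (p, num - p)
                break  # one pair per even number is enough
    return pairs

print(goldbach_pairs(20))
```

The lookup is_prime[num - p] is constant time, so the work per even number is proportional to how many small primes must be tried, not to the size of the whole prime list squared.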

Is there any trick or method to count recursion calls on paper(with larger numbers)?

Hi everyone, this is my first question here! I would like to ask about tricks for counting recursive calls on paper, without using a computer. The language in the example is Python 3.x. In this example, if I get a larger number like 11, how can I "easily" count the number of stars printed?
def func(numb):
    print('*', end='')
    if numb <= 1:
        return False
    for i in range(numb):
        if func(i) and numb%i == 0:
            return False
    return True
func(11)
I find it too inefficient to write out everything the program does as it runs; on a test especially, it is too time-consuming.
Thank you for helping!
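There are several methods of counting recursive calls; the one that fits here is iteration (unrolling the recurrence). Let T(n) be the number of stars printed by func(n). Each call prints one star and then calls func(0), func(1), func(2), ... until the loop ends, so
T(n) = 1 + T(0) + T(1) + ... + T(m), with T(0) = T(1) = 1 // m is the last i the loop reaches
Substitution leads to the same result and the master theorem does not apply here, so that's the best you can do. In the worst case (numb is prime) the loop never returns early and m = n - 1; for a composite numb it stops at its smallest prime factor. Working up from the base cases:
T(2) = 3, T(3) = 6, T(4) = 6, T(5) = 18, T(6) = 6, T(7) = 42, T(8) = 6, T(9) = 12, T(10) = 6, T(11) = 108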
But you can actually reduce your recursive calls by testing divisibility first (the loop then has to start at 2, because numb % 0 raises ZeroDivisionError):
if numb%i == 0 and func(i): # func(i) is not reached when numb % i != 0
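To check a hand count, one option is to instrument the question's function so that it counts calls instead of printing stars. This is only a verification sketch; count_calls is a made-up name, and the logic is copied from the original func.

```python
def count_calls(numb):
    """Return (result, calls): func's return value and how many stars it prints."""
    calls = 1  # the current call prints one star
    if numb <= 1:
        return False, calls
    for i in range(numb):
        result, sub_calls = count_calls(i)
        calls += sub_calls  # stars printed by the nested call
        # Same short-circuit test as the original: numb % i is only
        # evaluated when the nested call returned True.
        if result and numb % i == 0:
            return False, calls
    return True, calls

for n in range(12):
    print(n, count_calls(n)[1])  # the last line printed is "11 108"
```

On paper, the same table can be built bottom-up: write the counts for 0, 1, 2, ... in order and reuse them, rather than expanding every nested call again.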
Please check these functions.
def recursion(numb):
    if numb < 1:
        return False
    print('*', end='')  # Python 3: stay on the same line
    numb -= 1
    recursion(numb)
recursion(11)
print('')
def recursion1(numb):
    if numb < 1:
        return False
    for i in range(numb):
        print('*', end='')
    print('')
    numb -= 1
    recursion1(numb)
recursion1(11)

11+ digit ints not working

I'm using python 3 for a small extra credit assignment to write an RSA cracker. The teacher has given us a fairly large (large enough to require more than 32 bits) int and the public key. My code works for primes < 32 bits. One of the reasons I chose python 3 is because I heard it can handle arbitrarily large integers. In the python terminal I tested this by doing small things such as 2**35 and factorial(70). This stuff worked fine.
Now that I've written the code, I'm running into problems with overflow errors etc. Why is it that operations on large numbers seem to work in the terminal but won't work in my actual code? The errors state that they cannot be converted to their C types, so my first guess would be that for some reason the stuff in the Python interpreter is not being converted to C types while the coded stuff is. Is there any way to get this working?
As a first attempt, I tried calculating a list of all primes between 1 and n (the large number). This sort of worked until I realized that the list indexers [ ] only accept ints and explode if the number is higher than int. Also, creating an array that is n in length won't work if n > 2**32. (not to mention the memory this would take up)
Because of this, I switched to using a function I found that could give a very accurate guess as to whether or not a number was prime. These methods are pasted below.
As you can see, I am only doing +, *, /, and % operations. All of these seem to work in the interpreter but I get "cannot convert to c-type" errors when used with this code.
def power_mod(a,b,n):
    if b < 0:
        return 0
    elif b == 0:
        return 1
    elif b % 2 == 0:
        return power_mod(a*a, b/2, n) % n
    else:
        return (a * power_mod(a,b-1,n)) % n
Those last 3 lines are where the cannot convert to c-type appears.
The below function estimates with a very high degree of certainty that a number is prime. As mentioned above, I used this to avoid creating massive arrays.
def rabin_miller(n, tries = 7):
    if n == 2:
        return True
    if n % 2 == 0 or n < 2:
        return False
    p = primes(tries**2)
    if n in p:
        return True
    s = n - 1
    r = 0
    while s % 2 == 0:
        r = r+1
        s = s/2
    for i in range(tries):
        a = p[i]
        if power_mod(a,s,n) == 1:
            continue
        else:
            for j in range(0,r):
                if power_mod(a, (2**j)*s, n) == n - 1:
                    break
            else:
                return False
            continue
    return True
Perhaps I should be more specific by pasting the error:
line 19, in power_mod
return (a * power_mod(a,b-1,n)) % n
OverflowError: Python int too large to convert to C double
This is the type of error I get when performing arithmetic. Int errors occur when trying to create incredibly large lists, sets etc
Your problem (I think) is that you are converting to floating point by using the / operator. Change it to // and you should stay in the int domain.
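A sketch of what that change might look like in power_mod, keeping the question's structure. The only edits are b // 2 in place of b / 2 and reducing a*a modulo n before recursing so the intermediate values stay small; the same / to // change would apply to s = s/2 in rabin_miller.

```python
def power_mod(a, b, n):
    """Compute (a ** b) % n by repeated squaring, using only int arithmetic."""
    if b < 0:
        return 0  # kept from the question: negative exponents are not handled
    elif b == 0:
        return 1
    elif b % 2 == 0:
        # Floor division keeps b an int; b / 2 would produce a float.
        return power_mod((a * a) % n, b // 2, n) % n
    else:
        return (a * power_mod(a, b - 1, n)) % n

print(power_mod(3, 10**30, 10**9 + 7) == pow(3, 10**30, 10**9 + 7))  # True
```

Python's built-in three-argument pow(a, b, n) performs the same modular exponentiation and could replace the function entirely.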
Many C routines still have C int limitations. Do your work using Python routines instead.
