What is the time complexity of this division function (no divide or multiply operators used)? - python-3.x

I solved this leetcode question https://leetcode.com/problems/divide-two-integers/ . The goal is to get the quotient of the division of dividend by divisor without using a multiplication or division operator. Here is my solution:
def divide(dividend, divisor):
    """
    :type dividend: int
    :type divisor: int
    :rtype: int
    """
    sign = [1,-1][(dividend < 0) != (divisor < 0)]
    dividend, divisor = abs(dividend), abs(divisor)
    res = 0
    i = 0
    Q = divisor
    while dividend >= divisor:
        dividend = dividend - Q
        Q <<= 1
        res += (1 << i)
        i+=1
        if dividend < Q:
            Q = divisor
            i = 0
    if sign == -1:
        res = -res
    if res < -2**31 or res > 2**31 -1:
        return 2**31 - 1
    return res
I am having trouble analyzing the time complexity of this solution. I know it should be O(log(something)). Usually we say an algorithm is O(log(n)) when the input gets divided by 2 at each iteration, but here I multiply the divisor by 2 (Q <<= 1) at each iteration, so at each step I take a bigger step towards the solution. Obviously, for the same dividend, my algorithm will be faster with a bigger divisor. Similarly, for the same divisor, a bigger dividend gives a slower run time.
My guess is that the runtime of this algorithm is basically of the form O(dividend/divisor) (duh, that's division), with some logs in there to account for me multiplying Q by 2 at each step (Q <<= 1)... I can't figure out what exactly.
EDIT:
When I first posted the question, the algorithm I posted was the one below; Alain Merigot's answer is based on that algorithm. The difference is that in the version above my dividend never goes below 0, resulting in a faster run time.
def divide(dividend, divisor):
    """
    :type dividend: int
    :type divisor: int
    :rtype: int
    """
    sign = [1,-1][(dividend < 0) != (divisor < 0)]
    dividend, divisor = abs(dividend), abs(divisor)
    res = 0
    i = 0
    tmp_divisor = divisor
    while dividend >= divisor:
        old_dividend, old_res = dividend, res
        dividend = dividend - tmp_divisor
        tmp_divisor <<= 1
        res += (1 << i)
        i+=1
        if dividend < 0:
            dividend = old_dividend
            res = old_res
            tmp_divisor >>= 2
            i -= 2
    if sign == -1:
        res = -res
    if res < -2**31 or res > 2**31 -1:
        return 2**31 - 1
    return res

Your algorithm is O(m^2) in the worst-case, where m is the number of bits in the result. In terms of the inputs, it would be O(log(dividend/divisor) ^ 2).
To see why, consider what your loop does. Let a=dividend, b=divisor. The loop subtracts b, 2b, 4b, 8b, ... from a as long as it's big enough, then repeats this sequence again and again until a<b.
It can be equivalently written as two nested loops:
while dividend >= divisor:
    Q = divisor
    i = 0
    while Q <= dividend:
        dividend = dividend - Q
        Q <<= 1
        res += (1 << i)
        i+=1
For each iteration of the outer loop, the inner loop will perform fewer iterations because dividend is smaller. In the worst case, the inner loop will do only one iteration fewer for each iteration of the outer loop. This happens when the result is 1+3+7+15+...+(2^n-1) for some n. In this case, it can be shown that n = O(log(result)), but the total number of inner loop iterations is O(n^2), i.e. quadratic in the size of the result.
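One way to check this empirically is to put a counter on the loop from the question. The sketch below is only an illustration: count_iterations and worst_case_dividend are hypothetical helper names that do not appear in the original code, and the divisor is fixed at 1 so that the quotient equals the dividend.

```python
def count_iterations(dividend, divisor):
    """Run the question's loop on non-negative inputs (divisor > 0) and count its iterations."""
    q, steps = divisor, 0
    while dividend >= divisor:
        dividend -= q
        q <<= 1
        steps += 1
        if dividend < q:
            q = divisor  # restart the 1, 2, 4, ... sequence
    return steps


def worst_case_dividend(n):
    """Quotient of the form 1 + 3 + 7 + ... + (2^n - 1), to be used with divisor 1."""
    return sum((1 << k) - 1 for k in range(1, n + 1))


for n in (4, 8, 16, 32):
    d = worst_case_dividend(n)
    # The bit length grows like n, the step count like n^2 / 2.
    print(n, d.bit_length(), count_iterations(d, 1))
```

For these inputs the step count is n(n+1)/2, while the quotient itself has only about n+1 bits.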
To improve this to be linear in the size of the result, first calculate the largest needed values of Q and i. Then work backwards from that, subtracting 1 from i and shifting Q right each iteration. This guarantees no more than 2n iterations total.
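A minimal sketch of that two-phase idea, assuming the same sign handling and 32-bit clamping as the code in the question. The name divide_linear is made up for this example, and the explicit zero check is an addition that the original code does not have.

```python
def divide_linear(dividend, divisor):
    """Quotient truncated toward zero, using only shifts, additions and subtractions."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")  # assumption: not handled in the question
    negative = (dividend < 0) != (divisor < 0)
    dividend, divisor = abs(dividend), abs(divisor)

    # Phase 1: find the largest Q = divisor << i that still fits into dividend.
    q, i = divisor, 0
    while (q << 1) <= dividend:
        q <<= 1
        i += 1

    # Phase 2: walk back down, deciding one quotient bit per step.
    res = 0
    while i >= 0:
        if dividend >= q:
            dividend -= q
            res += 1 << i
        q >>= 1
        i -= 1

    if negative:
        res = -res
    # Same overflow rule as in the question.
    if res < -2**31 or res > 2**31 - 1:
        return 2**31 - 1
    return res
```

Each phase runs at most once per bit of the quotient, so the total is roughly 2n loop iterations for an n-bit result.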

Worst case complexity is easy to find.
Every iteration generates a bit of the result, and the number of iterations is equal to the number of bits in the quotient.
When divisor=1, quotient=dividend, and in that case the number of iterations is equal to the number of bits in dividend after the leading (most significant) 1. It is maximized when dividend=2^(n-1)+k, where n is the number of bits and k is any number such that 1≤k<2^(n-1). This will obviously be the worst case.
After the first iteration, dividend=dividend-divisor(=dividend-1) and divisor=2^1.
After iteration m, divisor=2^m and dividend=dividend-(1+2^1+...+2^(m-1))=dividend-(2^m-1).
Iterations stop when dividend is <0. As dividend=2^(n-1)+k, with k>0, this happens for m=n.
Hence, the number of steps in the worst case is n, and the complexity is linear in the number of bits of the dividend.

Related

Maximum Sum of XOR operation on a selected element with array elements with an optimize approach

Problem: Choose an element from the array that maximizes the sum obtained by XORing it with every element of the array.
Input for problem statement:
N=3
A=[15,11,8]
Output:
11
Approach:
(15^15)+(15^11)+(15^8)=11
My Code for brute force approach:
def compute(N,A):
    ans=0
    for i in A:
        xor_sum=0
        for j in A:
            xor_sum+=(i^j)
        if xor_sum>ans:
            ans=xor_sum
    return ans
The above approach gives the correct answer, but I want to optimize it to run in O(n) time. Please help me with this.
If you have integers with a fixed (constant) number c of bits, then it should be possible, because O(c) = O(1). For simplicity I assume unsigned integers and n to be odd. If n is even, then we sometimes have to check both paths in the tree (see the solution below). You can adapt the algorithm to cover even n and negative numbers.
1. Find max in the array of length n. O(n)
2. If max == 0, return 0 (the array holds only 0s).
3. Find the position p of the most significant bit of max. O(c) = O(1)
   p = -1
   while (max != 0)
       p++
       max /= 2
   So 1 << p gives a mask for the highest set bit.
4. Build a tree where the leaves are the numbers and every level stands for a bit position. If there is an edge to the left from the root, then there is a number that has bit p set, and if there is an edge to the right, there is a number that has bit p not set. On the next level we have an edge to the left if there is a number with bit p - 1 set and an edge to the right if bit p - 1 is not set, and so on. This can be done in O(cn) = O(n).
5. Go through the array and count how many times the bit at position i (i from 0 to p) is set => sum array. O(cn) = O(n)
6. Assign the root of the tree to node x.
7. Now for each i from p to 0 do the following:
   if x has only one edge => x becomes its only child node
   else if sum[i] > n / 2 => x becomes its right child node
   else x becomes its left child node
   In this step we choose the best path through the tree, the one that gives us the most ones when XORing. O(cn) = O(n)
8. XOR all the elements in the array with the value of x and sum them up to get the result. Actually, you could have built the result already in the previous step by adding sum[i] * (1 << i) to the result when going left and (n - sum[i]) * (1 << i) when going right. O(n)
All the sequential steps are O(n) and therefore overall the algorithm is also O(n).
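As a simpler variation, the per-bit counts (the sum array in the steps above) are already enough to score every candidate element directly in O(c) each, which gives O(cn) overall without building a tree. This is a sketch of that variation, not the tree walk described above. The names max_xor_sum and bits are hypothetical, and the code assumes non-negative integers below 2**bits.

```python
def max_xor_sum(a, bits=32):
    """Largest value of sum(x ^ y for y in a) over all x chosen from a."""
    n = len(a)
    # ones[b] = how many elements have bit b set (the "sum" array).
    ones = [0] * bits
    for v in a:
        for b in range(bits):
            if (v >> b) & 1:
                ones[b] += 1

    best = 0
    for x in a:
        total = 0
        for b in range(bits):
            if (x >> b) & 1:
                # XOR with a set bit turns the zeros into ones.
                total += (n - ones[b]) << b
            else:
                # XOR with a clear bit keeps the existing ones.
                total += ones[b] << b
        best = max(best, total)
    return best


print(max_xor_sum([15, 11, 8]))  # same input as in the question
```

Because every element is scored exactly, this version does not need the odd-n assumption.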

How can I reduce the time complexity of the given python code?

I have this Python program which computes the "Square Free Numbers" of a given number. My problem is its time complexity: I'm getting a "Time Limit Exceeded" error in an online compiler.
number = int(input())
factors = []
perfectSquares = []
count = 0
total_len = 0
# Find All the Factors of the given number
for i in range(1, number):
    if number%i == 0:
        factors.append(i)
# Find total number of factors
total_len = len(factors)
for items in factors:
    for i in range(1,total_len):
        # Eliminate perfect square numbers
        if items == i * i:
            if items == 1:
                factors.remove(items)
                count += 1
            else:
                perfectSquares.append(items)
                factors.remove(items)
                count += 1
# Eliminate factors that are divisible by the perfect squares
for i in factors:
    for j in perfectSquares:
        if i%j == 0:
            count +=1
# Print Total Square Free numbers
total_len -= count
print(total_len)
How can I reduce the time complexity of this program? That is how can I reduce the for loops so the program gets executed with a smaller time complexity?
Algorithmic Techniques for Reducing Time Complexity(TC) of a python code.
In order to reduce time complexity of a code, it's very much necessary to reduce the usage of loops whenever and wherever possible.
I'll divide your code's logic part into 5 sections and suggest optimization in each one of them.
Section 1 - Declaration of Variables and taking input
number = int(input())
factors = []
perfectSquares = []
count = 0
total_len = 0
You can easily omit the declarations of perfectSquares, count and total_len, as they aren't needed, as explained further below. This saves a little time and memory.
Also, you can use fast I/O to speed up input and output.
This is done by using sys.stdin.readline and sys.stdout.write.
Section 2 - Finding All factors
for i in range(1, number):
    if number%i == 0:
        factors.append(i)
Here, you can use a list comprehension to create the factor list, since a list comprehension is usually somewhat faster than an explicit loop with append.
Also, you can iterate only up to the square root of the number, instead of looping up to the number itself, which reduces the time complexity from O(n) to O(sqrt(n)).
The code section above boils down to...
After applying hack '1'
factors = [i for i in range(1, number) if number%i == 0]
After applying hack '2' - Use chain.from_iterable (from itertools) to store more than 1 value per iteration in the list comprehension
from itertools import chain
factors = list( chain.from_iterable(
    (i, number//i) for i in range(2, int(number**0.5)+1)
    if number%i == 0
))
Section 3 - Eliminating Perfect Squares
# Find total number of factors
total_len = len(factors)
for items in factors:
    for i in range(1,total_len):
        # Eliminate perfect square numbers
        if items == i * i:
            if items == 1:
                factors.remove(items)
                count += 1
            else:
                perfectSquares.append(items)
                factors.remove(items)
                count += 1
Actually you can completely omit this part, and just add an additional condition to Section 2, namely int(i**0.5)**2 != i, to eliminate those numbers which have integer square roots, hence being perfect squares themselves.
Implement as follows....
factors = list( chain.from_iterable(
    (i, number//i) for i in range(2, int(number**0.5)+1)
    if number%i == 0 and int(i**0.5)**2 != i
))
Section 4 - I think this Section isn't needed because Square Free Numbers doesn't have such Restriction
Section 5 - Finalizing Count, Printing Count
There's absolutely no need of counter, you can just compute length of factors list, and use it as Count.
OPTIMISED CODES
Way 1 - Little Faster
number = int(input())
# Find Factors of the given number
factors = []
for i in range(2, int(number**0.5)+1):
    # keep i only if it divides number and is not a perfect square
    if number%i == 0 and int(i**0.5)**2 != i:
        factors.extend([i, number//i])
print([1] + factors)
Way 2 - Optimal Programming - Very Fast
from itertools import chain
from sys import stdin, stdout
number = int(stdin.readline())
factors = list( chain.from_iterable(
    (i, number//i) for i in range(2, int(number**0.5)+1)
    if number%i == 0 and int(i**0.5)**2 != i
))
stdout.write(', '.join(map(str, [1] + factors)))
First of all, you only need to check for i in range(1, number//2 + 1):, since number/2 + 1 and greater cannot be factors (apart from number itself).
Second, you can compute the perfect squares that could be factors in sublinear time:
import math
squares = []
# start at 2: the square 1 divides everything and must not be counted
for i in range(2, int(math.sqrt(number/2)) + 1):
    squares.append(i**2)
Third, you can search for factors and when you find one, check that it is not divisible by a square, and only then add it to the list of factors.
This approach will save you all the time of your for items in factors nested loop block, as well as the next block. I'm not sure if it will definitely be faster, but it is less wasteful.
I used the code provided in the answer above, but it didn't give me the correct answer. The version below lists the factors of a number that are not perfect squares.
import math
number = int(input())
factors = [
    i for i in range(2, number//2 + 1)
    if number%i == 0 and int(math.sqrt(i))**2 != i
]
print([1] + factors)
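For comparison, here is a sketch under the usual mathematical definition, where a number is square-free if no perfect square greater than 1 divides it (so 8 and 12 are excluded even though they are not perfect squares themselves). Every square-free divisor is then a product of distinct prime factors, so it is enough to factor the number once in O(sqrt(n)). The name square_free_divisors is made up for this example, and including 1 and the number itself in the result is an assumption about what the task expects.

```python
def square_free_divisors(number):
    """Sorted divisors of number that are not divisible by any square > 1."""
    # Collect the distinct prime factors by trial division up to sqrt(number).
    primes = []
    n, p = number, 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)  # whatever is left is a single prime

    # Each subset of the distinct primes gives exactly one square-free divisor.
    divisors = [1]
    for p in primes:
        divisors += [d * p for d in divisors]
    return sorted(divisors)


print(square_free_divisors(24))  # [1, 2, 3, 6]
```

If only the count is needed, it is 2**k for k distinct prime factors, so the list does not have to be built at all.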

make change in python (maximum recursion depth exceeded in comparison)

So I have a recursive solution to the make change problem that works sometimes. It is:
def change(n, c):
    if (n == 0):
        return 1
    if (n < 0):
        return 0;
    if (c + 1 <= 0 and n >= 1):
        return 0
    return change(n, c - 1) + change(n - coins[c - 1], c);
where coins is my array of coins, for example [1,5,10,25], n is the amount to make change for, for example 1000, and c is the length of the coins array - 1. This solution works in some situations. But when I need it to run in under two seconds and I use:
coins: [1,5,10,25]
n: 1000
I get a:
RecursionError: maximum recursion depth exceeded in comparison
So my question is, what would be the best way to optimize this. Using some sort of flow control? I don't want to do something like.
# Set recursion limit
sys.setrecursionlimit(10000000000)
UPDATE:
I now have something like
def coinss(n, c):
    if n == 0:
        return 1
    if n < 0:
        return 0
    nCombos = 0
    for c in range(c, -1, -1):
        nCombos += coinss(n - coins[c - 1], c)
    return nCombos
but it takes forever. it'd be ideal to have this run under a second.
As suggested in the answers above, you could use DP for a more efficient solution.
Also, your conditional check
if (c + 1 <= 0 and n >= 1)
should be
if (c <= 1 ):
as n will always be >= 1 at that point, and c <= 1 will prevent any calculations if the number of coins is less than or equal to 1.
While using recursion you will always run into this. If you set the recursion limit higher, you may be able to use your algorithm on a bigger number, but you will always be limited. The recursion limit is there to keep you from getting a stack overflow.
The best way to solve this for bigger change amounts would be to switch to an iterative approach. There are algorithms out there, for example on Wikipedia:
https://en.wikipedia.org/wiki/Change-making_problem
Note that you have a bug here:
if (c + 1 <= 0 and n >= 1):
is like
if (c <= -1 and n >= 1):
So c can be 0 and pass to the next step, where you use c-1 as the index. That works because Python doesn't mind negative indexes, but it is still wrong (coins[-1] yields 25), so your solution sometimes counts 1 combination too many.
I've rewritten your algorithm with recursive and stack approaches:
Recursive (fixed, no need for c at init thanks to an internal recursive method, but still overflows the stack):
coins = [1,5,10,25]
def change(n):
    def change_recurse(n, c):
        if n == 0:
            return 1
        if n < 0:
            return 0;
        if c <= 0:
            return 0
        return change_recurse(n, c - 1) + change_recurse(n - coins[c - 1], c)
    return change_recurse(n,len(coins))
iterative/stack approach (not dynamic programming), doesn't recurse, just uses a "stack" to store the computations to perform:
def change2(n):
    def change_iter(stack):
        result = 0
        # continue while the stack isn't empty
        while stack:
            # process one computation
            n,c = stack.pop()
            if n == 0:
                # one solution found, increase counter
                result += 1
            if n > 0 and c > 0:
                # not found, request 2 more computations
                stack.append((n, c - 1))
                stack.append((n - coins[c - 1], c))
        return result
    return change_iter([(n,len(coins))])
Both methods return the same values for low values of n.
for i in range(1,200):
    a,b = change(i),change2(i)
    if a != b:
        print("error",i,a,b)
the code above runs without any error prints.
Now print(change2(1000)) takes a few seconds but prints 142511 without blowing the stack.
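The dynamic-programming approach mentioned in the other answers can be sketched as follows. It is the standard bottom-up table rather than code taken from any of the answers, and count_change is a hypothetical name. ways[v] holds the number of combinations that make the amount v with the coins processed so far, so the work is O(len(coins) * n) and there is no recursion.

```python
def count_change(amount, coins):
    """Number of ways to make `amount` from unlimited copies of `coins` (order ignored)."""
    ways = [0] * (amount + 1)
    ways[0] = 1  # one way to make 0: use no coins
    # Processing one coin type at a time counts each combination exactly once.
    for coin in coins:
        for value in range(coin, amount + 1):
            ways[value] += ways[value - coin]
    return ways[amount]


print(count_change(1000, [1, 5, 10, 25]))
```

With coins [1, 5, 10, 25] and n = 1000 this returns 142511, the same count as change2 above, after only a few thousand additions.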

Project Euler #23 Optimization [Python 3.6]

I'm having trouble getting my code to run quickly for Project Euler Problem 23. The problem is pasted below:
A perfect number is a number for which the sum of its proper divisors is exactly equal to the number. For example, the sum of the proper divisors of 28 would be 1 + 2 + 4 + 7 + 14 = 28, which means that 28 is a perfect number.
A number n is called deficient if the sum of its proper divisors is less than n and it is called abundant if this sum exceeds n.
As 12 is the smallest abundant number, 1 + 2 + 3 + 4 + 6 = 16, the smallest number that can be written as the sum of two abundant numbers is 24. By mathematical analysis, it can be shown that all integers greater than 28123 can be written as the sum of two abundant numbers. However, this upper limit cannot be reduced any further by analysis even though it is known that the greatest number that cannot be expressed as the sum of two abundant numbers is less than this limit.
Find the sum of all the positive integers which cannot be written as the sum of two abundant numbers.
And my code:
import math
import bisect
numbers = list(range(1, 20162))
tot = set()
numberabundance = []
abundant = []
for n in numbers:
    m = 2
    divisorsum = 1
    while m <= math.sqrt(n):
        if n % m == 0:
            divisorsum += m + (n / m)
        m += 1
    if math.sqrt(n) % 1 == 0:
        divisorsum -= math.sqrt(n)
    if divisorsum > n:
        numberabundance.append(1)
    else:
        numberabundance.append(0)
temp = 1
# print(numberabundance)
for each in numberabundance:
    if each == 1:
        abundant.append(temp)
    temp += 1
abundant_set = set(abundant)
print(abundant_set)
for i in range(12, 20162):
    for k in abundant:
        if i - k in abundant_set:
            tot.add(i)
            break
        elif i - k < i / 2:
            break
print(sum(numbers.difference(tot)))
I know the issue lies in the for loop at the bottom, but I'm not quite sure how to fix it. I've tried modeling it after some of the other answers I've seen here, but none of them seem to work. Any suggestions? Thanks.
Your upper bound is incorrect - the question states all integers greater than 28123 can be written ..., not 20162
After changing the bound, generation of abundant is correct, although you could do this generation in a single pass by directly adding to a set abundant, instead of creating the bitmask array numberabundance.
The final loop is also incorrect - as per the question, you must
Find the sum of all the positive integers
whereas your code
for i in range(12, 20162):
will skip numbers below 12 and also doesn't include the correct upper bound.
I'm a bit puzzled about your choice of
elif i - k < i / 2:
Since the abundants are already sorted, I would just check if the inner loop had passed the midpoint of the outer loop:
if k > i / 2:
Also, since we just need the sum of these numbers, I would just keep a running total, instead of having to do a final sum on a collection.
So here's the result after making the above changes:
import math
# range() excludes its end point, so go one past 28123
numbers = list(range(1, 28124))
abundant = set()
for n in numbers:
    m = 2
    divisorsum = 1
    while m <= math.sqrt(n):
        if n % m == 0:
            divisorsum += m + (n // m)
        m += 1
    # a perfect square had its root counted twice in the loop above
    if math.sqrt(n) % 1 == 0:
        divisorsum -= int(math.sqrt(n))
    if divisorsum > n:
        abundant.add(n)
# a set has no guaranteed iteration order, so sort once for the early break below
abundant_sorted = sorted(abundant)
nonabundantsum = 0
for i in numbers:
    issumoftwoabundants = False
    for k in abundant_sorted:
        if k > i / 2:
            break
        if i - k in abundant:
            issumoftwoabundants = True
            break
    if not issumoftwoabundants:
        nonabundantsum += i
print(nonabundantsum)
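The abundant numbers can also be generated without trial division, by adding each d to the divisor sum of all of its multiples (a sieve). This is a sketch of that alternative rather than part of the code above, and abundant_numbers is a hypothetical name.

```python
def abundant_numbers(limit):
    """Sorted list of abundant numbers in the range 1..limit."""
    # divisor_sum[n] accumulates the proper divisors of n.
    divisor_sum = [0] * (limit + 1)
    for d in range(1, limit // 2 + 1):
        # d is a proper divisor of 2d, 3d, 4d, ...
        for multiple in range(2 * d, limit + 1, d):
            divisor_sum[multiple] += d
    return [n for n in range(1, limit + 1) if divisor_sum[n] > n]


print(abundant_numbers(30))  # [12, 18, 20, 24, 30]
```

The result is already sorted, so it can feed the early-exit loop directly, with set(abundant_numbers(28123)) used for the membership test.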

efficiently generating all integers within a binary mask

Suppose I have some binary mask mask. (e.g. 0b101011011101)
Is there an efficient method of computing all integers k such that k & mask == k? (where & is the bitwise AND operator) (alternatively, k & ~mask == 0)
If mask has m ones, then there are exactly 2^m numbers that satisfy this property, so it seems like there should be some kind of process that is O(2^m). Enumerating the integers less than the mask is wasteful (though easy to eliminate values that do not apply).
I figured it out... you can identify all the single-bit patterns as follows, since the least significant 1 bit of any integer k is cleared when calculating k & (k-1):
def onebits(x):
    while x > 0:
        # find least significant 1 bit
        xprev = x
        x &= x-1
        yield x ^ xprev
and then I can use the ruler function to XOR in various combinations of 1 bits to emulate which bits of a counter are toggled each time:
def maskcount(mask):
    maskbits = []
    m = 0
    for ls1 in onebits(mask):
        m ^= ls1
        maskbits.append(m)
    # ruler function modified from
    # http://lua-users.org/wiki/LuaCoroutinesVersusPythonGenerators
    def ruler(k):
        for i in range(k):
            yield i
            for x in ruler(i): yield x
    x = 0
    yield x
    for k in ruler(len(maskbits)):
        x ^= maskbits[k]
        yield x
which looks like this:
>>> for x in maskcount(0xc05):
...     print(format(x, '#016b'))
0b00000000000000
0b00000000000001
0b00000000000100
0b00000000000101
0b00010000000000
0b00010000000001
0b00010000000100
0b00010000000101
0b00100000000000
0b00100000000001
0b00100000000100
0b00100000000101
0b00110000000000
0b00110000000001
0b00110000000100
0b00110000000101
An easy way to solve the problem is to find the bits that are set in mask, and then simply count with i, but then replacing the bits of i with corresponding bits from the mask.
def codes(mask):
    # the individual set bits of mask, lowest first
    bits = list(filter(None, (mask & (1 << i) for i in range(mask.bit_length()))))
    for i in range(1 << len(bits)):
        yield sum(b for j, b in enumerate(bits) if (i >> j) & 1)
print(list(codes(39)))
That gives you O(N) work per iteration, where N is the number of bits set in mask (that is, logarithmic in the number of values returned).
It's possible to be more efficient, and do O(1) work per iteration by counting using gray codes. With gray code counting, only a single bit changes each iteration so it's possible to efficiently update the current value, v. Obviously this is much harder to understand than the simple solution above.
def codes(mask):
    # the individual set bits of mask, lowest first
    bits = list(filter(None, (mask & (1 << i) for i in range(mask.bit_length()))))
    # maps a counter bit (1 << i) to the corresponding mask bit
    blt = dict((1 << i, b) for i, b in enumerate(bits))
    p, v = 0, 0
    for i in range(1 << len(bits)):
        n = i ^ (i >> 1)  # gray code of i
        v ^= blt.get(p^n, 0)
        p = n
        yield v
print(list(codes(39)))
A disadvantage of using gray codes is that the results are not returned in numeric order. But luckily that wasn't a condition in the question!
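If numeric order is wanted, another commonly used trick is to step from one value to the next with k = (k - mask) & mask: the subtraction carries through the bit positions that lie outside the mask, and the AND clears them again. A minimal sketch, with submasks as a made-up name; it relies on Python integers behaving like two's complement numbers under &.

```python
def submasks(mask):
    """Yield every k with k & mask == k, in increasing order, O(1) work per value."""
    k = 0
    while True:
        yield k
        if k == mask:
            break
        # Increment k while skipping all bit positions that are not in mask.
        k = (k - mask) & mask


print([format(k, '#06b') for k in submasks(0b101)])
```

For mask 0b101 this walks through 0b000, 0b001, 0b100 and 0b101.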
