Time complexity required to pop all elements using heapq.heappop (Python 3)

Initially it seemed that it should be O(N log(N)), where N is the number of elements in the heap. But assuming the worst case, it will take log(N) time to sift each element until N/2 nodes have been popped (since that is when the height of the heap decreases by one), and then log(N) - 1 time to sift each element until another N/4 nodes have been popped.
Therefore it becomes a series like
N/2 · log(N) + N/4 · (log(N) - 1) + N/8 · (log(N) - 2) + ... + N/2^log(N) · (log(N) - height of heap)
where the last term is basically N/N · 0 = 0.
I can't figure out the sum of this series. I tried integrating it in its continuous form, the integral of N · (log(N) - x)/2^(x+1) dx from 0 to log(N), but Wolfram gave me a complicated answer.

If you have n items in a heap, then popping the root item has a worst-case complexity of O(log n). You then have n-1 items on the heap, and the complexity of popping the root item is O(log(n-1)). So the series you want to sum is:
log(n) + log(n-1) + log(n-2) + log(n-3) + ... + log(n-n+1)
Or, easier to understand:
log(1) + log(2) + log(3) + ... + log(n)
https://stackoverflow.com/a/21152768/56778 explains how that is O(n log n), as well as Θ(n log n).
Alternatively, log(a) + log(b) is equal to log(a*b). So the summation of logs from 1 to n is equal to log(n!). See https://math.stackexchange.com/questions/589027/whats-the-formula-to-solve-summation-of-logarithms
See also Is log(n!) = Θ(n·log(n))?
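For concreteness, here is a minimal sketch (mine, not from the answer) of the operation being analyzed, heapify once and then pop everything; the per-pop costs are exactly the terms in the sum above:

import heapq

def pop_all(items):
    heap = list(items)
    heapq.heapify(heap)  # O(n)
    out = []
    while heap:
        # popping from a heap of size k costs O(log k)
        out.append(heapq.heappop(heap))
    return out  # total: log(n) + log(n-1) + ... + log(1) = O(n log n)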

Related

Maximum Sum of XOR operation on a selected element with array elements, with an optimized approach

Problem: Choose an element from the array so as to maximize the sum after XORing it with all elements in the array.
Input for problem statement:
N=3
A=[15,11,8]
Output:
11
Approach:
Choosing 15 gives (15^15) + (15^11) + (15^8) = 0 + 4 + 7 = 11.
My Code for brute force approach:
def compute(N, A):
    ans = 0
    for i in A:  # candidate element to XOR with all elements
        xor_sum = 0
        for j in A:
            xor_sum += (i ^ j)
        if xor_sum > ans:
            ans = xor_sum
    return ans
The above approach gives the correct answer, but I want to optimize it to run in O(n) time. Please help me get there.
If you have integers with a fixed (constant) number of c bits, then it is possible, because O(c) = O(1). For simplicity I assume unsigned integers and that n is odd. If n is even, we sometimes have to check both paths in the tree (see the solution below). You can adapt the algorithm to cover even n and negative numbers.
1. Find the max in the array of length n: O(n).
2. If max == 0, return 0 (the array contains only 0s).
3. Find the position p of the most significant bit of max: O(c) = O(1).
   p = -1
   while (max != 0)
       p++
       max /= 2
   So 1 << p gives a mask for the highest set bit.
4. Build a tree where the leaves are the numbers and every level stands for a bit position: there is an edge to the left from the root if some number has bit p set, and an edge to the right if some number has bit p not set. On the next level, an edge to the left means a number with bit p - 1 set, and an edge to the right means a number with bit p - 1 not set, and so on. This can be done in O(cn) = O(n).
5. Go through the array and count how many times the bit at position i is set, for i from 0 to p => a sum array: O(cn) = O(n).
6. Assign the root of the tree to a node x. Now, for each i from p down to 0:
   - if x has only one edge, x becomes its only child node;
   - else if sum[i] > n / 2, x becomes its right child node;
   - else x becomes its left child node.
   In this step we choose the path through the tree that gives us the most ones when XORing: O(cn) = O(n).
7. XOR all the elements in the array with the value of x and sum them up to get the result: O(n). Actually, you could have built the result already in the previous step, by adding (n - sum[i]) * (1 << i) to the result when going left and sum[i] * (1 << i) when going right (going left means x has bit i set, which flips that bit in every element).
All the sequential steps are O(n), and therefore overall the algorithm is also O(n).
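For illustration, here is a minimal Python sketch of the same bit-counting idea; instead of materializing the tree, it uses the sum array to evaluate every candidate element in O(c), for O(cn) = O(n) overall (the name max_xor_sum is mine, not from the answer; non-negative integers assumed):

def max_xor_sum(A):
    n = len(A)
    c = max(A).bit_length()  # number of bit positions to consider
    cnt = [0] * c  # cnt[i] = how many elements have bit i set (the "sum array")
    for a in A:
        for i in range(c):
            if (a >> i) & 1:
                cnt[i] += 1
    best = 0
    for x in A:  # candidate element to XOR with all elements
        total = 0
        for i in range(c):
            # bit i of x ^ a is 1 exactly for the elements whose bit i differs from x's
            ones = (n - cnt[i]) if (x >> i) & 1 else cnt[i]
            total += ones << i
        best = max(best, total)
    return best

print(max_xor_sum([15, 11, 8]))  # 11, matching the example above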

What is the time complexity of this algorithm (that solves leetcode question 650) (question 2)?

Hello, I have been working on https://leetcode.com/problems/2-keys-keyboard/ and came upon this dynamic programming question.
You start with a single 'A' on a blank page, and you are given a number n; when you are done, you should have n 'A's on the page. The catch is that you are allowed only 2 operations: copy (and you can only copy the total amount of 'A's currently on the page) and paste. Find the minimum number of operations to get n 'A's on the page.
I solved this problem, but then found a better solution in the discussion section of leetcode, and I can't figure out its time complexity.
def minSteps(self, n):
    factors = 0
    i = 2
    while i <= n:
        while n % i == 0:
            factors += i
            n //= i  # integer division keeps n an int in Python 3
        i += 1
    return factors
The way this works is that i is never going to be bigger than the biggest prime factor p of n, so the outer loop is O(p), and the inner while loop is basically O(log n), since we divide n by i on each iteration.
But the way I look at it, we are doing O(log n) divisions in total in the inner loop, while the outer loop is O(p), so by aggregate analysis this function is basically O(max(p, log n)). Is this correct?
Any help is welcome.
Your reasoning is correct: O(max(p, log n)) gives the time complexity, assuming that arithmetic operations take constant time. This assumption does not hold for arbitrarily large n that would not fit in the machine's fixed-size number storage, where you would need big-integer operations with non-constant time complexity. But I will ignore that.
It is still odd to express the complexity in terms of p when that is not the input (but derived from it). Your input is only n, so it makes sense to express the complexity in terms of n alone.
Worst Case
Clearly, when n is prime, the algorithm is O(n): the inner loop body runs only once, at the very end when i reaches n.
For a prime n, the algorithm will take more time than for n + 1, as even the smallest factor of n + 1 (i.e. 2) will halve the number of iterations of the outer loop, and yet only add one block of constant work in the inner loop.
So O(n) is the worst case.
Average Case
For the average case, we note that the division of n happens just as many times as n has prime factors (counting duplicates). For example, for n = 12 we have 3 divisions, as 12 = 2·2·3.
The average number of prime factors for 1 < n < x approaches log log n + B, where B is some constant. So we could say the average time complexity for the total execution of the inner loop is O(log log n).
We need to add to that the execution of the outer loop. This corresponds to the average greatest prime factor. For 1 < n < x this average approaches C·n/log n, and so we have:
O(n/log n + log log n)
Now n/log n is the dominant term here, so this simplifies to:
O(n/log n)
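To see the prime worst case concretely, here is a hypothetical instrumented variant (mine, not from the answer) that just counts loop iterations:

def min_steps_iterations(n):
    # counts every outer and inner loop iteration of the minSteps algorithm
    iterations = 0
    i = 2
    while i <= n:
        iterations += 1
        while n % i == 0:
            iterations += 1
            n //= i
        i += 1
    return iterations

print(min_steps_iterations(9973))  # 9973 is prime: 9973 iterations
print(min_steps_iterations(9974))  # 9974 = 2 * 4987: about half as many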

Analyzing the time complexity of Coin changing

We're doing the classic problem of determining the number of ways that we can make change that amounts to Z given a set of coins.
For example, Amount=5 and Coins={1, 2, 3}. One way we can make 5 is {2, 3}.
The naive recursive solution has factorial time complexity:
f(n) = n · f(n-1) = O(n!)
My professor argued that it actually has a time complexity of O(2^n), because we only choose to use a coin or not. That intuitively makes sense. However, how come my recurrence doesn't work out to be O(2^n)?
EDIT:
My recurrence is as follows:
            f(5, {1, 2, 3})
           /               \           .....
    f(4, {2, 3})      f(3, {1, 3})     .....
Notice how the branching factor decreases by 1 at every step.
Formally:
T(n) = n · T(n-1) = n!
The recurrence doesn't work out to what you expect it to work out to because it doesn't reflect the number of operations made by the algorithm.
If the algorithm decides for each coin whether to output it or not, then you can model its time complexity with the recurrence T(n) = 2·T(n-1) + O(1), with T(1) = O(1). The intuition is that for each coin you have two options: output the coin or not. This obviously solves to T(n) = O(2^n).
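For reference, a minimal sketch of the recursion this answer models, where each coin is considered once and either taken or skipped (my own illustration; names are hypothetical):

def count_take_not_take(coins, amount, k=0):
    # base cases: an exact match is one way; running out of coins or overshooting is none
    if amount == 0:
        return 1
    if k == len(coins) or amount < 0:
        return 0
    # two branches per coin: skip coin k, or take it once -> T(n) = 2*T(n-1) + O(1)
    return (count_take_not_take(coins, amount, k + 1)
            + count_take_not_take(coins, amount - coins[k], k + 1))

Note that this variant uses each coin at most once, which is what makes the branching factor exactly 2 per coin; an unlimited-supply version appears in the code further below.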
I too was trying to analyze the time complexity of the brute force approach, which performs a depth-first search:
def countCombinations(coins, n, amount, k=0):
    if amount == 0:
        return 1
    res = 0
    for i in range(k, n):
        if coins[i] <= amount:
            remaining_amount = amount - coins[i]  # considering this coin, try for the remaining sum
            # in the next round this coin may be included again
            res += countCombinations(coins, n, remaining_amount, i)
    return res
But we can see that a coin used in one round is used again in the next round, so at least for the first coin we have n choices at each stage, which is like permutation with repetition: n^r ways to arrange n available items into r positions.
ex: [1, 1, 1, 1]; sum = 4
This generates a recursion tree where, along the first path, we literally have solutions at each diverging subpath until sum = 0. So the time complexity is O(n^sum), i.e. at each stage on the path towards sum we have n different subpaths.
Note, however, that there is another algorithm which uses the take/not-take approach, where each node of the recursion tree has at most 2 branches. Hence the time complexity of that algorithm is O(2^(n·m)).
ex: say coins = [1, 1] and sum = 2. There are 11 nodes to visit in the recursion tree, for 6 paths (leaves), and the complexity is at most 2^(2·2) = 2^4 = 16. (Visiting 11 nodes against a maximum of 16 possibilities is correct, but the upper bound is a little loose.)
def get_count(coins, n, sum):
    if n == 0:  # no coins left to try a combination that matches the sum
        return 0
    if sum == 0:  # no more sum left to match: we have exactly matched our trial
        return 1  # (success)
    # exclude the last coin from the sum calc: leave it and try the rest
    excluded = get_count(coins, n - 1, sum)
    included = 0
    if coins[n - 1] <= sum:
        # include the last coin in the sum calc, reducing the sum by its value;
        # n stays the same, i.e. the supply is unlimited (we can choose the same coin again and again)
        included = get_count(coins, n, sum - coins[n - 1])
    return included + excluded
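As a quick sanity check (mine, not from the answer): get_count([1, 2, 3], 3, 5) returns 5, counting {1,1,1,1,1}, {1,1,1,2}, {1,2,2}, {1,1,3} and {2,3}.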

What would be the big O notation for the function?

I know that big O notation is a measure of how efficient a function is, but I don't really get how to calculate it.
def method(n):
    sum = 0
    for i in range(85):
        sum += i * n
    return sum
Would the answer be O(f(85)) ?
The complexity of this function is O(1).
In the RAM model, basic mathematical operations occur in constant time. The dominant term in this function is the loop:
for i in range(85):
Since 85 is a constant, the complexity is represented by O(1).
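As a side note (my own sketch, not part of the answer), the loop even has a closed form, which makes the constant-time behavior explicit:

def method_closed_form(n):
    # sum(i * n for i in range(85)) == n * sum(range(85)) == n * (84 * 85 // 2)
    return 3570 * n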
You have a function with 4 "actions"; to calculate its big O we need to calculate the big O of each action and take the max:
1. sum = 0 - constant time, O(1)
2. for i in range(85) - a constant number of iterations (85), so O(1 × complexity of #3)
3. sum += i * n - we can say constant time, but multiplication actually depends on the bit lengths of i and n, so we can say either O(1) or O(max(lenI, lenN))
4. return sum - constant time, O(1)
So the dominant term is #2, which is 1 × O(#3). Since lenI and lenN are constant (usually 32 or 64 bits), max(lenI, lenN) → 32/64, so the total complexity of your function is O(1 × 1) = O(1).
If we have big math, i.e. the bit length of n can be very, very long, then we can say the complexity is O(bit length of n).
NOTE: the bit length of n is actually log2(n).
In theory, the complexity is O(log n). As n grows, reading the number and performing the multiplication takes longer.
However, in practice, the value of n is constrained (there's a maximum value) and thus it can be read and operations can be performed on it in O(1) time. Since we repeat an O(1) operation a fixed amount of times, the complexity is still O(1).
Note that O(1) means constant time - O(85) doesn't really mean anything different. If you perform multiple constant time operations in a sequence, the result is still O(1) unless the length of the sequence depends on the size of the input. Doing a O(1) operation 1000 times is still O(1), but doing it n times is O(n).
If you want to really play it safe, just say O(∞); that's definitely a correct answer. CS teachers tend not to appreciate it in practice, though.
When talking about complexity, it should always be stated which operations are considered constant time (the initial agreement). Here the integer multiplication can be considered either constant or not. Either way, the time complexity of the example is better than O(n). But it is the teacher's trick against the students -- kind of. :)

What is the time complexity for repeatedly doubling a string?

Consider the following piece of C++ code:
string s = "a";
for (int i = 0; i < n; i++) {
s = s + s; // Concatenate s with itself.
}
Usually, when analyzing the time complexity of a piece of code, we would determine how much work the inner loop does, then multiply it by the number of times the outer loop runs. However, in this case, the amount of work done by the inner loop varies from iteration to iteration, since the string being built up gets longer and longer.
How would you analyze this code to get the big-O time complexity?
The time complexity of this function is Θ(2^n). To see why this is, let's look at what the function does, then see how to analyze it.
For starters, let's trace through the loop for n = 3. Before iteration 0, the string s is the string "a". Iteration 0 doubles the length of s to make s = "aa". Iteration 1 doubles the length of s to make s = "aaaa". Iteration 2 then doubles the length of s to make s = "aaaaaaaa".
If you'll notice, after k iterations of the loop, the length of the string s is 2^k. This means that each iteration of the loop will take longer and longer to complete, because it will take more and more work to concatenate the string s with itself. Specifically, the k-th iteration of the loop will take time Θ(2^k) to complete, because the loop iteration constructs a string of size 2^(k+1).
One way that we could analyze this function would be to multiply the worst-case time complexity of the inner loop by the number of loop iterations. Since each loop iteration takes time O(2^n) to finish and there are n loop iterations, we would get that this code takes time O(n · 2^n) to finish.
However, it turns out that this analysis is not very good, and in fact will overestimate the time complexity of this code. It is indeed true that this code runs in time O(n · 2^n), but remember that big-O notation gives an upper bound on the runtime of a piece of code. This means that the growth rate of this code's runtime is no greater than the growth rate of n · 2^n, but it doesn't mean that this is a precise bound. In fact, if we look at the code more precisely, we can get a better bound.
Let's begin by trying to do some better accounting for the work done. The work in this loop can be split apart into two smaller pieces:
The work done in the header of the loop, which increments i and tests whether the loop is done.
The work done in the body of the loop, which concatenates the string with itself.
Here, when accounting for the work in these two spots, we will account for the total amount of work done across all iterations, not just in one iteration.
Let's look at the first of these - the work done by the loop header. This will run exactly n times. Each time, this part of the code will do only O(1) work incrementing i, testing it against n, and deciding whether to continue with the loop. Therefore, the total work done here is Θ(n).
Now let's look at the loop body. As we saw before, iteration k creates a string of length 2^(k+1), which takes time roughly 2^(k+1). If we sum this up across all iterations, we get that the work done is (roughly speaking)
2^1 + 2^2 + 2^3 + ... + 2^(n+1).
So what is this sum? Previously, we got a bound of O(n · 2^n) by noting that
2^1 + 2^2 + 2^3 + ... + 2^(n+1)
< 2^(n+1) + 2^(n+1) + 2^(n+1) + ... + 2^(n+1)
= n · 2^(n+1) = 2(n · 2^n) = Θ(n · 2^n)
However, this is a very weak upper bound. If we're more observant, we can recognize the original sum as the sum of a geometric series, where a = 2 and r = 2. Given this, the sum of these terms can be worked out to be exactly
2^(n+2) - 2 = 4(2^n) - 2 = Θ(2^n)
In other words, the total work done by the body of the loop, across all iterations, is Θ(2^n).
The total work done by the loop is given by the work done in the loop maintenance plus the work done in the body of the loop. This works out to Θ(2^n) + Θ(n) = Θ(2^n). Therefore, the total work done by the loop is Θ(2^n). This grows very quickly, but nowhere near as rapidly as O(n · 2^n), which is what our original analysis gave us.
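To sanity-check the geometric-series bound, here is a small Python sketch (mine, not from the answer) that counts the characters written across all concatenations instead of actually building the string; its accounting differs from the one above by a constant factor, but the Θ(2^n) conclusion is the same:

def total_copy_work(n):
    length, work = 1, 0
    for _ in range(n):
        length *= 2     # s = s + s doubles len(s)
        work += length  # the concatenation writes 'length' characters
    return work  # 2^1 + 2^2 + ... + 2^n = 2^(n+1) - 2, i.e. Θ(2^n)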
In short, when analyzing a loop, you can always get a conservative upper bound by multiplying the number of iterations of the loop by the maximum work done on any one iteration of that loop. However, doing a more precise analysis can often give you a much better bound.
Hope this helps!
