Proof by induction that every non-empty binary tree of height h contains fewer than 2^(h+1) nodes

I am stuck on the induction case of a problem.
The problem:
Define the height of a tree as the maximum number of edges between the
root and any leaf. We consider the height of an empty tree to be -1, and
the height of a tree consisting of a single node to be 0. Prove by induction that every non-empty binary tree of height h contains
fewer than 2^(h+1) nodes.
So I started:
Base case: h = 0 (since a non-empty tree consists of a single node
or more, the first case is a tree with a single node)
2^(0+1) = 2^1 = 2
When the height is 0 the tree consists of a single node, so yes, 1 node is
fewer than 2 nodes.
Inductive step: h greater than 0
This is where I am stuck... I know that the statement is true, since
the height will always be 1 less than the number of nodes, I just
don't know how to prove it algebraically.
Thanks in advance.

Suppose you have a tree with the height of n+1.
Both its left subtree and its right subtree have height at most n.
By (strong) induction, each non-empty subtree has fewer than 2^(n+1) nodes, meaning at most 2^(n+1) - 1 nodes; an empty subtree has 0 nodes, which also satisfies this bound.
Since we have two subtrees, we have at most 2 * (2^(n+1) - 1 ) = 2^(n+2) - 2.
Add one for the root, and the tree of height n+1 has at most 2^(n+2) - 1, which is less than 2^(n+2), as required.
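The recurrence behind this argument can be checked numerically. The following is a minimal sketch, not part of the proof: `max_nodes` is a hypothetical helper that assumes the largest tree of height h consists of a root plus two largest-possible subtrees of height h-1.

```python
def max_nodes(h):
    """Maximum node count of a binary tree of height h (empty tree: height -1)."""
    if h == -1:
        return 0  # the empty tree has no nodes
    # One root plus two subtrees, each as large as a height h-1 tree can be.
    return 1 + 2 * max_nodes(h - 1)


for h in range(0, 10):
    # The bound from the proof: at most 2^(h+1) - 1, hence fewer than 2^(h+1).
    assert max_nodes(h) == 2 ** (h + 1) - 1
    assert max_nodes(h) < 2 ** (h + 1)
```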

Related

Get position of disk if moving in reverse direction (Python)

I have 3 towers [A, B, C] which I will label in numbers as [1,2,3].
I want to find the position a disk will be in if it moves n times in the reverse direction. So if it starts at B or 2 position and moves back 7 times, it will be in A or 1 position.
Is there a general formula I can use to compute this for any given starting position and any number of moves n? To move forward, I just use
(start_pos + n) % 3.
However, I am not sure about the reverse direction.
In Python, the result of % has the same sign as the divisor, so a negative number modulo a positive number is non-negative. You can therefore use the same general formula and subtract the number of steps instead of adding them:
(start_pos - n) % 3
Or, alternatively, define the number of steps moved backwards as a negative number of steps.
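One detail to watch: with the labels [1,2,3] from the question, a plain `% 3` returns 0 for the third tower. A minimal sketch, assuming 1-based labels (`move_back` and its `towers` parameter are hypothetical names, not from the question):

```python
def move_back(start_pos, n, towers=3):
    """Return the 1-based tower label reached after n backward moves."""
    # Shift to 0-based, subtract the steps, wrap with %, shift back to 1-based.
    return (start_pos - 1 - n) % towers + 1


# The example from the question: start at B (2), move back 7 times, end at A (1).
print(move_back(2, 7))
```

If the positions are stored 0-based instead, `(start_pos - n) % 3` works as written.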

Bitmap pattern recognition

Describe and analyze an algorithm that finds the maximum-area rectangular
pattern that appears more than once in a given bitmap. Specifically, given
a two-dimensional array M[1 .. n, 1 .. n] of bits as input, your algorithm
should output the area of the largest repeated rectangular pattern in M.
For example, given the bitmap shown on the left in the figure below, your
algorithm should return the integer 195, which is the area of the 15 x 13
doggo. (Although it doesn’t happen in this example, the two copies of the
repeated pattern might overlap.)
[Figure: example bitmap containing the repeated 15 x 13 pattern]
I use the term rectangle instead of sub-matrix.
Brute Force Approach O(n^8)
Take two distinct positions A and B in the matrix. Consider all rectangles in the matrix with A as top-left corner, and similarly all rectangles with B as top-left corner. Suppose we have a rectangle with top-left corner A of length l and breadth b; then take the corresponding rectangle with top-left corner B, length l and breadth b (assuming both rectangles exist). If every bit of both rectangles matches, then we have a repeated pattern of area lb, and if lb is greater than the previously seen maximum area, we update the maximum area.
Repeat this procedure for all pairs of A and B.
Time Complexity: To keep the analysis simple, assume that for every point P in the matrix, the maximum possible length of a rectangle with P as top-left corner is n (this is not true in general; consider, for example, a point in the last column of the matrix). Make the same assumption for the maximum possible breadth.
Then from product rule in counting, there are n^2 possible rectangles with point P as their top left corner.
This estimate is not so bad: according to it, there are n^4 rectangles in total, as there are n^2 points in the matrix and n^2 rectangles per point.
Actual answer is ( (n+1) Choose 2 ) * ( (n+1) Choose 2 ), which is in the order of n^4
So in total, for each pair, we'd compare n^2 rectangles. And similar to above, let us estimate that there are n^2 points in each rectangle for easy calculations. So for a pair of points A and B, we'll have O(n^4) comparisons because for each rectangle, we need to check all of their corresponding points.
And there are ( n^2 Choose 2 ) pairs of points in total, which is of order n^4. So we'd have overall time complexity of O(n^8).
Avoiding Repeated Calculations O(n^6)
We are repeatedly checking the same area again and again. For example, consider two rectangles with top-left corners at A and B respectively, with length l and breadth b. When we later check the corresponding rectangles with top-left corners at A and B with length (l+1) and breadth b, we again check the rectangle of length l and breadth b.
So we use memoization to avoid repeated calculations.
Assume the length of a rectangle is measured horizontally and the breadth vertically.
Consider rectangles of length l+1 and breadth b with top-left corners at A and B. We need to check whether all corresponding points in the two rectangles match. Let A_r and B_r be the points immediately to the right of A and B respectively. The two rectangles match exactly when the rectangles of length l and breadth b with top-left corners A and B match, and the rectangles of length l and breadth b with top-left corners A_r and B_r also match.
Consider below figure :
Time Complexity: With the above procedure, it takes O(1) time to compare two rectangles. Compared to the brute-force case, the time complexity is reduced by a factor of n^2 (the time previously required to compare two rectangles), giving O(n^6) time.
Let (P, l, b) denote the rectangle with top-left corner P, length l and breadth b.
Removing unnecessary calculations O(n^5)
In the above approach, even if we know that rectangles (A, l, b) and (B, l, b) do not match, we still compare the rectangles (A, l+1, b) and (B, l+1, b).
So suppose now we have rectangles with top-left corners A and B each of length l, then what is maximum possible breadth of the rectangles?
If the corresponding elements in the top row of each rectangle do not all match, then the answer is 0. But if the top rows match, then the answer is 1 + (max. breadth of rectangles of length l with top-left corners directly below A and B).
As in the above approach, this calculation takes O(1) time, since we only need the result for length l-1 at the pair (A, B) and the result for length l at the points directly below A and B.
Refer to this answer Largest Square Block for more clarity.
Time Complexity: For each pair of points in the matrix, we have to store the max. breadth for lengths 1, 2, ..., n. There are on the order of n^4 such pairs, and we can obtain each value in O(1) time, so the overall time complexity is O(n^5). Finding the max. area over all lengths for each pair is then straightforward.
Further Optimization O(n^4)
Imagine you have two copies of bitmaps. One is glued to the ground and other can move on top the glued one. Let us call fixed one as base and other one as moving bitmap.
Now the top-left corner of the moving bitmap can be on one of (n^2 - 1) points of the base, excluding the case where the moving bitmap sits exactly on top of the base. In each case some points are left out, i.e. for some points on the base bitmap there is no point of the moving one on top of them, and vice versa. Assume the top-left corner of the moving bitmap needs to have an element of the base bitmap below it.
Now take one instance of these (n^2 - 1) configurations. And for all points on moving bitmap which have a point of base bitmap below them, let us construct a new matrix such that it contains "Y" if top and bottom element of both the bitmaps are same, else it'd contain "N". And remember, the size of matrix is same as no of elements on moving bitmap which have an element of base bitmap below them.
Then the maximum area of a repeated pattern for those portions of the base and moving bitmaps is the area of the largest solid block containing only Y's, which can be found in O(n^2) time.
Our required answer is maximum of answers in all these (n^2 - 1) configurations.
Refer to Largest rectangle containing all Y's for further clarity
Time Complexity For each configuration, constructing new matrix of "Y" and "N"'s would take O(n^2) time and maximum area calculation also O(n^2) time.
And there are (n^2- 1) such configurations, so overall O(n^4) time.
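The sliding idea can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, the bitmap is a square list of lists, and the largest all-Y block is found with the standard histogram-and-stack technique. The sketch also tries shifts to the left (negative column offsets) so that copies lying up-right or down-left of each other are covered, which goes slightly beyond the (n^2 - 1) placements described above.

```python
def largest_rectangle_in_histogram(heights):
    """Largest rectangle area under a histogram, using a monotonic stack."""
    stack = []  # indices of bars with strictly increasing heights
    best = 0
    bars = list(heights) + [0]  # sentinel 0 flushes the stack at the end
    for i, h in enumerate(bars):
        while stack and bars[stack[-1]] >= h:
            height = bars[stack.pop()]
            left = stack[-1] + 1 if stack else 0
            best = max(best, height * (i - left))
        stack.append(i)
    return best


def largest_repeated_rectangle(bitmap):
    """Area of the largest rectangular pattern that appears at least twice."""
    n = len(bitmap)
    best = 0
    for dr in range(n):                    # row shift of the moving copy
        for dc in range(-(n - 1), n):      # column shift of the moving copy
            if dr == 0 and dc <= 0:
                continue  # zero shift, or mirror image of a shift already tried
            c_lo, c_hi = max(0, -dc), min(n, n - dc)
            heights = [0] * (c_hi - c_lo)  # run of consecutive "Y" cells per column
            for r in range(n - dr):
                for j, c in enumerate(range(c_lo, c_hi)):
                    if bitmap[r][c] == bitmap[r + dr][c + dc]:
                        heights[j] += 1    # "Y": cell equals its shifted partner
                    else:
                        heights[j] = 0     # "N": the run is broken
                best = max(best, largest_rectangle_in_histogram(heights))
    return best
```

For a 3 x 3 bitmap of all zeros this returns 6, since a 3 x 2 block repeats (with overlap) one column to the right.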
Further Optimization O(n^3 polylog(n))
Consider in above bitmap :
For length 1, a rectangle with length 1 and breadth 3 is
repeating twice (the 1st and 2nd columns of the bitmap are the same). So
maxBreadth(1) is 3.
For length 2, a rectangle with length 2 and breadth 2 is
repeating twice. So maxBreadth(2) is 2.
The rectangle is:
0 0
0 0
For length 3, a rectangle with length 3 and breadth 1 is
repeating twice. So maxBreadth(3) is 1.
For length 4, no rectangle with length 4 is repeating in the
bitMap, so maxBreadth(4) is 0.
Suppose you have a method named maxBreadth which takes a bitmap matrix and a length L as input and returns the maximum breadth B for which a rectangle of length L and breadth B repeats in the bitmap.
Using this, can you find the area of the largest repeating rectangle in a bitmap? Iterate over each length. For each length, we know that there is a rectangle repeating in the bitmap with area length * maxBreadth(bitMap, length). Update the maximum area encountered so far.
So now let's focus on maxBreadth method.
Observations :
If a rectangle of length 3 and breadth 5 is not repeating in given bitmap then a rectangle of length 3 and breadth 6 definitely will not be repeating in the bitmap. Generally if a rectangle of length l and breadth b does not repeat in bitMap, any rectangle of length l and breadth > b does not repeat in bitMap. Same goes for length also.
So based on above observation, you can do binary search to find maxBreadth of given length.
If a rectangle of length L and breadth B is
  repeating, then check a larger breadth (for example 2B);
  not repeating, then check a smaller breadth (for example B/2).
OK, now how do we check whether any rectangle of dimensions (l, b) repeats in a bitMap? There are on the order of n^2 such rectangles, so would you compare each with all the others?
How would you efficiently check whether an array of numbers has a repeated element? The answer is to sort them and check adjacent elements.
So take all the n^2 rectangles, sort them, and check whether any two adjacent ones are equal. To compare two 2D rectangles, divide each rectangle into 4 quadrants and compare them individually. In this way we only need to store sorted indices for breadths 1, 2, 4, 8, ..., n and for lengths 1, 2, 4, 8, ..., n.
Each sorting takes O(n^2 log(n)) time for given (length, breadth)
And for each length we perform this operation log(n) times, so the maxBreadth operation takes O((n log n)^2) time in total.
Finally we call the method maxBreadth n times, so the overall time complexity is O(n^3 log(n)^2) and the space complexity is O((n log n)^2).
Note : This O(n^3 log(n)^2) method not only works for bitMaps but also for matrices containing any number of distinct arbitrary numbers to search for maximum repeating sub-matrix
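The overall structure (a monotone repeat check, a binary search over the breadth, and a loop over lengths) can be sketched as below. To stay short, this sketch swaps in a plain set of tuples for the sorted-quadrant comparison described above, so it does not achieve the O(n^3 log(n)^2) bound; the names `has_repeat`, `max_breadth` and `largest_repeated_area` are hypothetical.

```python
def has_repeat(bitmap, length, breadth):
    """True if some rectangle of the given length (columns) and breadth (rows) occurs twice."""
    n = len(bitmap)
    seen = set()
    for r in range(n - breadth + 1):
        for c in range(n - length + 1):
            # Materialise the block as a hashable tuple of rows (simple, not fast).
            block = tuple(tuple(bitmap[r + i][c:c + length]) for i in range(breadth))
            if block in seen:
                return True
            seen.add(block)
    return False


def max_breadth(bitmap, length):
    """Largest breadth for which a rectangle of this length repeats (0 if none)."""
    lo, hi = 0, len(bitmap)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if has_repeat(bitmap, length, mid):
            lo = mid        # repeats: try a larger breadth
        else:
            hi = mid - 1    # does not repeat: larger breadths cannot repeat either
    return lo


def largest_repeated_area(bitmap):
    """Best length * maxBreadth(length) over all lengths."""
    n = len(bitmap)
    return max(length * max_breadth(bitmap, length) for length in range(1, n + 1))
```

The binary search is valid only because of the monotonicity observation: once a breadth fails for a given length, every larger breadth fails too.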

We have an urn with n balls, numbered from 1 to n. What's the chance of getting the ball with number k exactly after k-1 draws?

So, the problem is:
We have an urn with n balls, numbered from 1 to n.
We keep drawing without replacement, and the question is: what's the chance that the ball with number k (k can be anything) will be drawn exactly after k-1 draws?
It should be 1/n.
Approach 1: You only care about the kth draw, which must be one specific ball. By symmetry, each of the n balls is equally likely to be the kth one drawn, so the probability is 1/n; the other draws are unconstrained.
Approach 2: (grind the raw probability)
For i = 1, ..., k-1, the ith draw avoids ball k with probability (n-i)/(n-i+1); multiplied together, these factors telescope to (n-k+1)/n
The kth draw probability is 1/(n-k+1)
(n-k+1)/n * 1/(n-k+1) = 1/n
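For small n the result can be confirmed by brute force. The sketch below (with a hypothetical helper name) enumerates every drawing order, assuming all n! orders are equally likely, and counts those in which ball k is the kth ball drawn.

```python
from fractions import Fraction
from itertools import permutations


def prob_ball_k_on_draw_k(n, k):
    """Exact probability that ball k is the k-th ball drawn without replacement."""
    orders = list(permutations(range(1, n + 1)))  # every possible drawing order
    hits = sum(1 for order in orders if order[k - 1] == k)
    return Fraction(hits, len(orders))


# Every k gives the same answer, 1/n.
assert all(prob_ball_k_on_draw_k(5, k) == Fraction(1, 5) for k in range(1, 6))
```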

DP sum all node combinations

Given a tree and a value for each node, how can we get the total sum for each possible path?
A
|
B
In the above tree there will be 4 such paths: A, B, A-B, B-A.
Each node will have a value assigned to it: A: 3, B: 2
The expected output should be: 3+2+(3+2)+(2+3)
A naive solution for this problem is to do a DFS from source to target (for each possible combination) and add up the DFS results. I believe this problem can be solved more efficiently with DP, but I don't have much experience with DP.
There is a nice solution for this one which doesn't involve (much) dynamic programming:
You can compute how often each node appears in paths. Let's just take an example with n=5 nodes:
    A
    |
    B
   / \
  C   D
      |
      E
The computation for the leaves A, C and E is very easy. They only appear in paths that start or end with this node. There are 2n - 1 = 9 such paths (-1 because the path A starts and ends with A and is therefore counted twice in 2*n).
For the inner nodes it gets a bit more tricky. Let's look at the node D first. Of course D appears in all paths that start or end in D, so we again have 2n - 1 = 9 paths. But D can also appear somewhere in the middle of a path, e.g. in the path A-B-D-E. This can only happen if the path starts somewhere in the subtree ABC and ends in the subtree E, or the other way around. Combinatorics tells us that there are size(ABC)*size(E) + size(E)*size(ABC) = 2*size(ABC)*size(E) = 2*3*1 = 6 such paths. So D appears in exactly 9 + 6 = 15 paths.
For node B it still gets a bit more tricky. There are again 2n - 1 = 9 paths starting or ending from B (This is true for each node). But again B can appear somewhere in the middle of a path. For this to happen a path must start in one of the subtrees A, C or DE and end in a different one. So there are 2*size(A)*size(C) + 2*size(C)*size(DE) + 2*size(DE)*size(A) = 2*1*1 + 2*1*2 + 2*2*1 = 2 + 4 + 4 = 10 possible paths. With a little bit of math you can see that this is identical to (n-1)^2 - size(A)^2 - size(C)^2 - size(DE)^2. So in total the node appears in 9 + 10 = 19 paths.
And the value you want to compute is 9*value(A) + 19*value(B) + 9*value(C) + 15*value(D) + 9*value(E).
With one depth-first-search you can compute the sizes of all subtrees with dynamic programming and compute the number of appearances for each node using the two formulas.
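A minimal sketch of this computation, assuming nodes are numbered 0..n-1 and the tree is given as an edge list (the function name and input format are hypothetical). For each node, the pieces left after removing it are its child subtrees plus the rest of the tree on the parent side.

```python
def total_path_sum(n, edges, values):
    """Sum of node values over all ordered paths (single-node paths included)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative DFS from node 0; parents appear before their children in `order`.
    parent = [-1] * n
    seen = [False] * n
    seen[0] = True
    order, stack = [], [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if not seen[w]:
                seen[w] = True
                parent[w] = u
                stack.append(w)

    # Subtree sizes, filled bottom-up.
    size = [1] * n
    for u in reversed(order):
        if parent[u] != -1:
            size[parent[u]] += size[u]

    total = 0
    for u in range(n):
        parts = [size[w] for w in adj[u] if w != parent[u]]
        if parent[u] != -1:
            parts.append(n - size[u])  # everything on the parent's side
        through = (n - 1) ** 2 - sum(s * s for s in parts)  # u strictly inside the path
        total += (2 * n - 1 + through) * values[u]
    return total


# The two-node example from the question: 3 + 2 + (3+2) + (2+3) = 15.
print(total_path_sum(2, [(0, 1)], [3, 2]))
```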

How to understand segmented binomial heaps described in <Purely Functional Data Structures>

Section 6.3.1 of the thesis Purely Functional Data Structures says:
Then, whenever we create a new tree from a new element and a segment
of trees of ranks 0... r-1, we simply compare the new element with the
first root in the segment (i.e.,the root of the rank 0 tree). The
smaller element becomes the new root and the larger element becomes
the rank 0 child of the root.
1. T0' is the new tree, which has rank 0
2. T0..T(r-1) are the original trees of rank 0 to r-1
3. The smaller element becomes the new root and the larger element becomes the rank 0 child of the root
The question is that step 3 seems to result in two rank 1 trees, which conflicts with the structure of binomial heaps.
Am I misunderstanding?
We are creating a tree of rank r. The structure of a tree of rank r is a root node with r children of ranks 0..r-1.
What the part you quoted means is this.
When we get a new element x we compare it to the element in T0
We create a new tree T0' of rank 0 containing the greater of the two compared elements
We create a new node T containing the lesser of the two compared elements and with T0',T1,T2...T(r-1) as children
Now T is a binomial tree of rank r and it is in heap order.
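The steps above can be sketched in code. This is only an illustration: the `Tree` class and `make_tree` function are hypothetical and not taken from the thesis, and heap order in the result relies on the assumption that the root of T0 is no larger than the roots of T1..T(r-1).

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    root: int
    children: list = field(default_factory=list)  # children[i] has rank i

    @property
    def rank(self):
        return len(self.children)


def make_tree(x, segment):
    """Build a rank r tree from element x and trees T0..T(r-1) of ranks 0..r-1."""
    t0, rest = segment[0], segment[1:]
    lesser, greater = min(x, t0.root), max(x, t0.root)
    new_t0 = Tree(greater)                   # T0': rank 0 tree holding the greater element
    return Tree(lesser, [new_t0] + rest)     # T: children of ranks 0..r-1


# A rank 2 tree built from element 3 and a segment of ranks 0 and 1.
t = make_tree(3, [Tree(5), Tree(6, [Tree(9)])])
print(t.root, t.rank, [c.rank for c in t.children])
```

Only one rank 0 tree ends up among the children: the old T0 is replaced by T0', so no two children share a rank.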
