find the most divisible number in an array? - dynamic-programming

I got this question in an interview: find the number in an array that is divisible by the most numbers in the array. For example, in [2,4,8], 8 is divisible by 3 numbers, so it is the answer. I have an O(N^2) solution, but is there anything better than O(N^2)?
I think something like quicksort would make sense, but I haven't found a solution yet. For example, if a % b == 0 and b % c == 0, then a % c == 0, but divisibility does not let me compare every pair of numbers the way > does.

You can do this with a tree in O(N log N).
The fastest algorithm I can think of is to build a heap-like tree (a balanced tree such as an AVL tree is the best choice) using this rule:
For input x: look for a node[i] where x % node[i] == 0 or node[i] % x == 0.
If you find such a node, either add x to node[i]'s children collection, or replace node[i] with x and add node[i] to x's children collection.
Otherwise, add x as a child of the root node.
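For comparison, here is a minimal sketch of a different technique from the tree described above: sieve-style counting over multiples. It assumes the array holds positive integers whose maximum value M is small enough to allocate a list of size M+1, and that a number counts as divisible by itself (as in the [2,4,8] example). The name most_divisible is made up for illustration.

```python
from collections import Counter


def most_divisible(nums):
    """Return the value in nums that is divisible by the most elements of nums.

    Assumes nums is a non-empty list of positive integers.
    """
    counts = Counter(nums)            # how many times each value occurs
    limit = max(nums)
    divisor_count = [0] * (limit + 1)

    # For every distinct value d, credit each multiple of d up to the maximum.
    for d, occurrences in counts.items():
        for multiple in range(d, limit + 1, d):
            divisor_count[multiple] += occurrences

    # Only values actually present in the array are candidates.
    return max(counts, key=lambda value: divisor_count[value])


print(most_divisible([2, 4, 8]))
```

Under those assumptions the work is roughly O(N + M log M), so it only beats O(N^2) when the values are not huge compared to N.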

Related

Guess the number: Dynamic Programming -- Identifying Subproblems

I am working on the following problem and having a hell of a time at the moment.
We are playing the Guessing Game. The game will work as follows:
I pick a number between 1 and n.
You guess a number.
If you guess the right number, you win the game.
If you guess the wrong number, then I will tell you whether the number I picked is higher or lower, and you will continue guessing.
Every time you guess a wrong number x, you will pay x dollars. If you run out of money, you lose the game.
Given a particular n, return the minimum amount of money you need to guarantee a win regardless of what number I pick.
So, what do I know? Clearly this is a dynamic programming problem. I have two choices, break things up recursively or go ahead and do things bottom up. Bottom up seems like a better choice to me (though technically the max recursion depth would be 100 as we are guaranteed n<=100). The question then is: What do the sub-problems look like?
Well, I think we could start by thinking about subarrays (though possibly we need subsequences here), so what is the worst case for each possible subdivision? That is:
[1,2,3,4,5,6]
[[1],[2],[3],[4],[5],[6]] -> 21
[[1,2],[3,4],[5,6]] -> 9
[[1,2,3],[4,5,6]] -> 7
...
but I don't think I quite have the idea yet. So, to get succinct since this post is kind of long: How are we breaking this up? What is the sub-problem we are trying to solve here?
Relevant Posts:
Binary Search Doesn't work in this case?
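One common way to frame the subproblem, offered as a sketch and not as an answer from this thread: let cost(lo, hi) be the money needed to guarantee a win when the number is known to lie in [lo, hi]. Guessing x costs x plus the worse of the two remaining ranges, and we pick the x that minimizes that. The function names below are made up for illustration.

```python
from functools import lru_cache


def min_money_to_guarantee_win(n):
    """Minimum money needed to guarantee a win for numbers 1..n."""

    @lru_cache(maxsize=None)
    def cost(lo, hi):
        # Zero or one candidate left: no wrong guess is needed.
        if lo >= hi:
            return 0
        # Try every guess x; the adversary picks the more expensive side.
        return min(
            x + max(cost(lo, x - 1), cost(x + 1, hi))
            for x in range(lo, hi + 1)
        )

    return cost(1, n)


print(min_money_to_guarantee_win(10))
```

There are O(n^2) ranges and each tries O(n) guesses, so this is O(n^3), which is small for n <= 100.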

Reduce Time complexity

Question at hand : Complete the function minimumSwaps in the editor below. It must return an integer representing the minimum number of swaps to sort the array.
My Approach:
def minimumSwaps(arr):
    count = 0
    temp = [None]*len(arr)
    res1=sorted(arr)
    while(res1!=arr):
        for i in range(int(len(arr))):
            if(res1[i]!=arr[i]):
                y=res1.index(arr[i])
                arr[y] , arr[i]=arr[i] , arr[y]
                count = count +1
    return count
The code gives the required output for the majority of cases, but fails a few with a "time limit exceeded" error. Could someone suggest a few changes to reduce the time complexity and make the code more efficient? If possible, please try not to change the code in its entirety; I want to learn how to make code more efficient rather than trying a whole new approach.
Link to one of the huge test cases
To me, this is a graph problem. Maybe it's possible with a more simple solution, but I don't think so.
You can observe that to get the minimum swaps necessary, you'd just have to move every element into its sorted position. You can figure out where they're supposed to be by sorting and having an array indexed by element (or dictionary, for that matter) to the index.
Now, build a graph by making each item its own node, and connecting with a directed edge to the place it needs to be. We can observe that for a cycle of length k, we will need k-1 swaps to solve it. This is because we just need to swap each item forward, but the last swap actually solves two items rather than one. Thus, the answer is the sum of k-1 for each cycle, which can be reduced to n-c where c is the number of cycles.
To see why this works, consider the case of [2,3,1]. The sorted version of this array is [1,2,3]. Now, build the graph, where index 0 points to index 1 (since 2 needs to be in index 1), index 1 points to index 2, and index 2 points to index 0. We can run a search algorithm through the graph and find the number of cycles or components, and find that there is 1 cycle of length 3. So, the answer we produce is 3-1 = 2. As we can observe, this is indeed correct.
The problem gets a little more complicated if the array can contain duplicates, but it's not so bad; you'd just have to think a little harder. Maybe this isn't the intended solution, but it'll certainly work: counting the cycles is O(n), and the sort used to find each element's target position is O(n log n). Best of luck!
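The cycle-counting idea above can be sketched as follows. This is a minimal version that assumes all elements are distinct (the duplicate case mentioned above is not handled); the name minimum_swaps and the helper dictionary target are made up for illustration.

```python
def minimum_swaps(arr):
    """Minimum number of swaps to sort arr, assuming distinct elements."""
    n = len(arr)
    # target[value] = index where that value belongs in the sorted array.
    target = {value: idx for idx, value in enumerate(sorted(arr))}

    visited = [False] * n
    cycles = 0
    for start in range(n):
        if visited[start]:
            continue
        # Follow edges index -> index where its element needs to go.
        cycles += 1
        i = start
        while not visited[i]:
            visited[i] = True
            i = target[arr[i]]

    # A cycle of length k needs k - 1 swaps, so the total is n - cycles.
    return n - cycles


print(minimum_swaps([2, 3, 1]))
```

Unlike the original loop, this never calls list.index inside a loop, which is where most of the time in the question's code goes.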

Bitwise operations Python

This is a first run-in with not only bitwise ops in python, but also strange (to me) syntax.
for i in range(2**len(set_)//2):
    parts = [set(), set()]
    for item in set_:
        parts[i&1].add(item)
        i >>= 1
For context, set_ is just a list of 4 letters.
There's a bit to unpack here. First, I've never seen [set(), set()]. I must be using the wrong keywords, as I couldn't find it in the docs. It looks like it creates a matrix in pythontutor, but I cannot say for certain. Second, while parts[i&1] is a slicing operation, I'm not entirely sure why a bitwise operation is required. For example, 0&1 should be 1 and 1&1 should be 0 (carry the one), so binary 10 (or 2 in decimal)? Finally, the last bitwise operation is completely bewildering. I believe a right shift is the same as dividing by two (I hope), but why i>>=1? I don't know how to interpret that. Any guidance would be sincerely appreciated.
[set(), set()] creates a list consisting of two empty sets.
0&1 is 0, 1&1 is 1. There is no carry in bitwise operations. parts[i&1] therefore refers to the first set when i is even, the second when i is odd.
i >>= 1 shifts right by one bit (which is indeed the same as dividing by two), then assigns the result back to i. It's the same basic concept as using i += 1 to increment a variable.
The effect of the inner loop is to partition the elements of set_ into two subsets, based on the bits of i. If the limit in the outer loop had been simply 2 ** len(set_), the code would generate every possible such partitioning. But since that limit was divided by two, only half of the possible partitions get generated. I couldn't guess what the point of that might be without more context.
I've never seen [set(), set()]
This isn't anything interesting, just a list with two new sets in it. So you have seen it, because it's not new syntax. Just a list and constructors.
parts[i&1]
This tests the least significant bit of i and selects either parts[0] (if the lsb was 0) or parts[1] (if the lsb was 1). Nothing fancy like slicing, just plain old indexing into a list. The thing you get out is a set, .add(item) does the obvious thing: adds something to whichever set was selected.
but why i>>=1? I don't know how to interpret that
Take the bits in i and move them one position to the right, dropping the old lsb and keeping the sign. Sort of like this, in an 8-bit picture: 00010110 becomes 00001011.
Except of course that in Python you have arbitrary-precision integers, so it's however long it needs to be instead of 8 bits.
For positive numbers, the part about copying the sign is irrelevant.
You can think of right shift by 1 as a flooring division by 2 (this is different from truncation, negative numbers are rounded towards negative infinity, eg -1 >> 1 = -1), but that interpretation is usually more complicated to reason about.
Anyway, the way it is used here is just a way to loop through the bits of i, testing them one by one from low to high, but instead of changing which bit it tests it moves the bit it wants to test into the same position every time.
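To see the loop from the question in action, here is a small sketch that wraps it in a function and collects the results. The function name two_way_partitions and the local variable bits (used so the loop counter is not overwritten) are additions for illustration and are not in the original snippet.

```python
def two_way_partitions(items):
    """Collect the [set, set] splits produced by the loop in the question."""
    result = []
    for i in range(2 ** len(items) // 2):
        parts = [set(), set()]
        bits = i
        for item in items:
            # The lowest bit picks parts[0] or parts[1] for this item...
            parts[bits & 1].add(item)
            # ...then the next bit is shifted into the lowest position.
            bits >>= 1
        result.append(parts)
    return result


for parts in two_way_partitions(["a", "b", "c", "d"]):
    print(parts)
```

Because i stays below 2 ** (len(items) - 1), the bit tested for the last item is always 0, so the last item always lands in parts[0]. One plausible reading of the halved limit, then, is that it avoids generating each split twice with the two sets swapped, though that is a guess without more context.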

Find Median of AVL tree

I've searched a bit and found a related post: Get median from AVL tree?
but I'm not too satisfied with the response.
My thoughts on solving this problem:
If the balance factor is 0, return root
else keep removing the root until the tree is completely balanced, and calculate the median of the roots you just removed
Assuming the AVL tree will keep the balance(by definition?)
I've seen some answers suggesting an in-order traversal to find the median, but I think that will require more space and time.
Can someone confirm or correct my ideas? thanks!
There are two problems in your suggested approach:
You destroy your tree in the process (or take up twice as much memory for a "backup" copy)
In the worst case, you need quite a lot of root removals to get a completely balanced tree (I think in the worst-case, it would be close to 2^(n-1)-1 removals)... and you'd still need to calculate the median from that.
The answer in your linked question is right and optimal. The usual way to solve this is to construct an order statistic tree (by storing the number of elements in the left and right subtrees of each node). Do note that you have to update these counts accordingly whenever an AVL rotation happens.
See IVlad's answer here. Since an AVL tree guarantees an O(log n) Search operation and IVlad's algorithm is essentially a Search operation, you can find the k-th smallest element in O(log n) time and O(1) space (not counting the space for the tree itself).
Assuming your tree is indexed from 0 and has n elements, find the median in the following way:
if n is odd: Find the (n-1)/2-th element and return it
if n is even: Find the n/2-th and the (n/2)-1-th elements and return their average
Also, if changing the tree (left/right element counts) is not an option, see the second part of the answer you linked to.
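The k-th smallest lookup and the median rule above can be sketched like this. It is a minimal illustration, not a full AVL tree: there is no insertion or rotation code, each node stores one subtree size instead of separate left/right counts, and the helper build_balanced just builds a balanced tree from sorted keys so the example can run. All names here are made up for illustration.

```python
def size(node):
    """Number of nodes in the subtree rooted at node (0 for None)."""
    return node.size if node else 0


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 + size(left) + size(right)


def select(node, k):
    """Return the k-th smallest key (0-indexed) in O(height) time."""
    while node:
        left = size(node.left)
        if k < left:
            node = node.left
        elif k == left:
            return node.key
        else:
            k -= left + 1          # skip the left subtree and this node
            node = node.right
    raise IndexError("k out of range")


def median(root):
    n = size(root)
    if n == 0:
        raise ValueError("empty tree")
    if n % 2:
        return select(root, (n - 1) // 2)
    return (select(root, n // 2 - 1) + select(root, n // 2)) / 2


def build_balanced(sorted_keys):
    """Stand-in for a real AVL tree: build a balanced BST from sorted keys."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2
    return Node(sorted_keys[mid],
                build_balanced(sorted_keys[:mid]),
                build_balanced(sorted_keys[mid + 1:]))


print(median(build_balanced([1, 2, 3, 4, 5])))
```

In a real AVL tree the size field would have to be recomputed for the nodes touched by each insertion, deletion, and rotation.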

Looking for ideas: lexicographically sorted suffix array of many different strings compute efficiently an LCP array

I don't want a direct solution to the problem that's the source of this question but it's this one link:
So I take in the strings and add them to a suffix array which is implemented as a sorted set internally, what I obtain then is a lexicographically sorted list of the two given strings.
S1 = "banana"
S2 = "panama"
SuffixArray.add S1, S2
To make searching for the k-th smallest substring efficient, I preprocess this sorted set to add information about the longest common prefix between a suffix and its predecessor, as well as keeping tabs on a cumulative substring count. So I know that a given k greater than the cumulative substring count of the last item is an invalid query.
This works really well for small inputs as well as random large inputs within the constraints given in the problem definition, which is at most 50 strings of length 2000. I am able to pass 4 out of 7 cases and was pretty surprised I didn't get them all.
So I went searching for the bottleneck and it hit me. Given large number of inputs like these
anananananananana.....ananana
bkbkbkbkbkbkbkbkb.....bkbkbkb
The queries for k-th smallest substrings are still fast as expected, but not the way I preprocess the sorted set. The way I calculate the longest common prefix between the elements of the set is not efficient: it is linear, O(m), per pair. I did the most naïve thing, expecting it to be good enough:
m = anananan
n = anananana
Start at 0 and find the point where `m[i] != n[i]`
It is like this because a suffix and its predecessor might not be related (i.e. they may come from different input strings), so I thought I couldn't help but use brute force.
Here is the question, then, and what I ended up reducing the problem to. Given a list of lexicographically sorted suffixes like the one I described above (made up of multiple strings):
What is an efficient way of computing the longest common prefix array?
The subquestion would then be, am I completely off the mark in my approach? Please propose further avenues of investigation if that's the case.
Footnote: I do not want to be shown an implemented algorithm, and I don't mind being told to go read such-and-such book or resource on the subject, as that is what I do anyway while attempting these challenges.
The accepted answer will be something that guides me onto the right path or, failing that, something that teaches me how to solve these types of problems in a broader sense, such as a book.
READING
I would recommend this tutorial pdf from Stanford.
This tutorial explains a simple O(n log^2 n) algorithm with O(n log n) space to compute the suffix array and a matrix of intermediate results. The matrix of intermediate results can be used to compute the longest common prefix between two suffixes in O(log n).
HINTS
If you wish to try to develop the algorithm yourself, the key is to sort the strings based on their 2^k long prefixes.
From the tutorial:
Let's denote by A(i,k) be the subsequence of A of length 2^k starting at position i.
The position of A(i,k) in the sorted array of A(j,k) subsequences (j=1,n) is kept in P(k,i).
and
Using matrix P, one can iterate descending from the biggest k down to 0 and check whether A(i,k) = A(j,k). If the two prefixes are equal, a common prefix of length 2^k had been found. We only have left to update i and j, increasing them both by 2^k and check again if there are any more common prefixes.
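For readers who do want to see the idea in code, here is a rough sketch of the P matrix and the descending-k LCP query quoted above. It assumes a single input string; for several strings one option would be to concatenate them with separator characters that appear nowhere else, which is not shown here. It uses Python's built-in sort at each doubling step, and the names build_rank_table and lcp are made up for illustration.

```python
def build_rank_table(s):
    """table[k][i] = rank of the length-2^k substring starting at i."""
    n = len(s)
    rank = [ord(c) for c in s]
    table = [rank[:]]
    length = 1
    while length < n:
        # Pair each position's rank with the rank `length` further on;
        # -1 marks "past the end" and sorts before any real rank.
        keys = [(rank[i], rank[i + length] if i + length < n else -1)
                for i in range(n)]
        order = sorted(range(n), key=keys.__getitem__)
        new_rank = [0] * n
        for pos in range(1, n):
            prev, cur = order[pos - 1], order[pos]
            new_rank[cur] = new_rank[prev] + (keys[cur] != keys[prev])
        rank = new_rank
        table.append(rank[:])
        length *= 2
    return table


def lcp(table, i, j):
    """Longest common prefix of the suffixes starting at i and j."""
    n = len(table[0])
    if i == j:
        return n - i
    result = 0
    for k in range(len(table) - 1, -1, -1):
        # Equal ranks mean the next 2^k characters match: jump over them.
        if i < n and j < n and table[k][i] == table[k][j]:
            result += 1 << k
            i += 1 << k
            j += 1 << k
    return result


table = build_rank_table("banana")
print(lcp(table, 1, 3))
```

Once the suffixes are sorted by the last row of the table, the LCP array is just lcp applied to each pair of neighbours, at O(log n) per pair.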
