Introduction
I have written code that generates a set of numbers in a '36 by q' format (1 <= q <= 36), subject to the following conditions:
Each row must use numbers from 1 to 36.
No number must repeat itself in a column.
Method
The first row is generated randomly. Each number in every subsequent row is checked against the above conditions. If a number fails to satisfy one of them, it doesn't get picked again for that specific place in that specific row. If the algorithm runs out of acceptable values, it starts over again.
Problem
Low q values are fast (q=15, say, takes less than a second to compute), but my main objective is q=36. The run for q=36 has been going for more than 24 hours on my PC.
Questions
Can I predict the time required by it using the data I have from lower q values? How?
Is there any better algorithm to perform this in less time?
How can I calculate the average number of cycles it requires? (using combinatorics or otherwise).
Can I predict the time required by it using the data I have from lower q values? How?
Usually, you should be able to determine the running time of your algorithm in terms of its input. Refer to big O notation.
If I understood your question correctly, you shouldn't need to spend hours computing a 36x36 matrix satisfying your conditions. Most probably you are stuck in an infinite loop or something similar. It would be clearer if you could share a code snippet.
Is there any better algorithm to perform this in less time?
Well, I tried to do what you described and it works in O(q) (assuming that number of rows is constant).
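import random

def rotate(arr):
    # Move the last element to the front (cyclic shift by one).
    return arr[-1:] + arr[:-1]

y = set([i for i in range(1, 37)])
n = 36
q = 36
res = []
i = 0
while i < n:
    # Draw q distinct numbers at random to form the base row.
    x = []
    for j in range(q):
        if y:
            el = random.choice(list(y))
            y.remove(el)
            x.append(el)
    res.append(x)
    # Each rotation of the base row becomes one of the following rows.
    for j in range(q-1):
        x = rotate(x)
        res.append(x)
        i += 1
    i += 1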
Basically, I draw random numbers without repetition from the set {1..36} to build one row, then rotate that row q-1 times and append each rotation as one of the next q-1 rows.
This guarantees both conditions you have mentioned.
How can I calculate the average number of cycles it requires?( Using combinatorics or otherwise).
If you cannot calculate the computation time in terms of the input (because the code is too complex), then fitting a curve seems to be the right approach.
Or you could create an ML model with iterations as data and time for each iteration as label and perform linear regression. But that seems to be overkill in your example.
Graph q vs time
Fit a curve
Extrapolate to q = 36.
You might want to also graph q vs log(time) as that may give an easier fitted curve.
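The graph, fit and extrapolate steps can be sketched as below. This is only an illustration: it assumes the running time grows roughly exponentially in q, so that log(time) is close to a straight line, which may well not hold for a restart-based search. The function names fit_log_linear and predict_time are made up for this sketch, and the timings are placeholder values generated from a known formula, not real measurements.

```python
import math

def fit_log_linear(qs, times):
    """Least-squares fit of log(time) = a + b*q; returns (a, b)."""
    ys = [math.log(t) for t in times]
    n = len(qs)
    mean_q = sum(qs) / n
    mean_y = sum(ys) / n
    slope_num = sum((q - mean_q) * (y - mean_y) for q, y in zip(qs, ys))
    slope_den = sum((q - mean_q) ** 2 for q in qs)
    b = slope_num / slope_den
    a = mean_y - b * mean_q
    return a, b

def predict_time(a, b, q):
    """Extrapolate the fitted line back to a time estimate."""
    return math.exp(a + b * q)

# Placeholder data, NOT real measurements: generated from 0.001 * 1.5**q.
# Replace with your own (q, seconds) pairs from the fast runs.
qs = [5, 10, 15, 20]
times = [0.001 * 1.5 ** q for q in qs]

a, b = fit_log_linear(qs, times)
estimate = predict_time(a, b, 36)
print(f"growth factor per unit of q: {math.exp(b):.3f}")
print(f"estimated seconds for q=36: {estimate:.1f}")
```

If the points of q vs log(time) do not fall near a straight line, the exponential assumption is wrong and the extrapolation should not be trusted.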
I came across a solution to the Coin Change problem here : Coin Change. I was able to understand the first recursive method and the second method, which uses DP with a 2D array. But I am not able to understand the logic behind the third solution.
As far as I can tell, the last method works for problems in which the sequence of coins used in the change is considered. Am I correct? Can anyone please explain if I am wrong?
Well I figured it out myself!
This can be easily proved using induction. Let table[k] denote the number of ways change can be given for a total of k. The algorithm consists of two loops: the outer one, controlled by i, iterates through the array S containing all the different coin values, and the inner one, controlled by j, updates the elements of table for a given i. Now suppose that for a fixed i we have calculated the number of ways change can be given for all values from 1 to n using only the coins S[1]....S[i], and that these values are stored in table[1] to table[n]. When the i-controlled loop moves on to i+1, table[j] for an arbitrary j is incremented by table[j-S[i + 1]], which is nothing but the number of ways we can create j using at least one coin with value S[i + 1]. Thus the total value in table[j] equals the number of ways we can make change for j with coins of value S[1]....S[i] (this was already stored before) plus the value table[j-S[i + 1]]. This is the same optimal substructure that the recursive algorithm uses.
// size is assumed to be a constant larger than the largest sum
int arr[size];                  // arr[j] = number of ways to make sum j
memset(arr, 0, sizeof(arr));
int n;                          // number of coin denominations
cin >> n;
int sum;                        // target amount
cin >> sum;
int a[size];                    // a[i] = value of the i-th coin
for (int i = 0; i < n; i++)
    cin >> a[i];
arr[0] = 1;                     // one way to make 0: use no coins
for (int i = 0; i < n; i++)
    for (int j = a[i]; j <= sum; j++)
        arr[j] += arr[j - a[i]];
cout << arr[sum];
The array arr is initialised to 0 to show that the number of ways a sum of i can be represented is zero (that is, not yet computed). However, the number of ways in which a sum of 0 can be represented is 1 (use no coins).
Further, we take each coin and update each position in the array, starting from the coin denomination.
arr[j]+=arr[j-a[i]] means that we are incrementing the number of ways to represent the sum j by the number of ways already found for the remaining amount (j-a[i]).
In the end, we output arr[sum].
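On the question of whether the 1D method counts sequences: with the coin loop on the outside it counts combinations, where order does not matter. A minimal Python sketch, with function names invented for this illustration, contrasts that with the swapped loop order, which counts ordered sequences instead.

```python
def count_combinations(coins, total):
    """Coin loop outside: each multiset of coins is counted once."""
    table = [0] * (total + 1)
    table[0] = 1  # one way to make 0: use no coins
    for coin in coins:
        for j in range(coin, total + 1):
            table[j] += table[j - coin]
    return table[total]

def count_sequences(coins, total):
    """Amount loop outside: different orderings are counted separately."""
    table = [0] * (total + 1)
    table[0] = 1
    for j in range(1, total + 1):
        for coin in coins:
            if coin <= j:
                table[j] += table[j - coin]
    return table[total]

print(count_combinations([1, 2, 3], 4))  # 4: 1+1+1+1, 1+1+2, 2+2, 1+3
print(count_sequences([1, 2, 3], 4))     # 7: e.g. 1+3 and 3+1 both count
```

Processing one coin at a time fixes the order in which denominations may appear, which is why no combination is counted twice.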
Was trying to solve this popular interview question - http://www.careercup.com/question?id=3406682
There are 2 approaches to this that I was able to grasp -
Brian Kernighan's algo -
Bits counting algorithm (Brian Kernighan) in an integer time complexity
Lookup table.
I assume when people say use a lookup table, they mean a Hashmap with the Integer as key, and the count of number of set bits as value.
How does one construct this lookup table? Do we use Brian's algo to count the number of bits the first time we encounter an integer, put it in the hashtable, and retrieve the value from the hashtable the next time we encounter that integer?
PS: I am aware of the hardware and software api's available to perform popcount (Integer.bitCount()), but in context of this interview question, we are not allowed to use those methods.
I looked for an answer everywhere but could not find a satisfactory explanation.
Let's start by understanding the concept of left shifting. When we shift a number left by one we multiply it by 2, and shifting right by one divides it by 2.
For example, if we want to generate the number 20 (binary 10100) from the number 10 (01010), we have to shift 10 to the left by one. We can see that the number of set bits in 10 and 20 is the same; the bits in 20 are just shifted one position to the left compared to 10. From this we can conclude that the number of set bits in n is the same as the number of set bits in n/2 (if n is even).
In the case of odd numbers, like 21 (10101), all bits are the same as in 20 except for the last bit, which is set to 1 in 21, resulting in one extra set bit for the odd number.
Let's generalize this formula:
number of set bits in n is number of set bits in n/2 if n is even
number of set bits in n is number of set bits in n/2 + 1 if n is odd (as in the case of an odd number the last bit is set).
A more generic formula would be:
BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
where BitsSetTable256 is the table we are building for the bit count. For the base case we can set BitsSetTable256[0] = 0; the rest of the table can be computed with the above formula in a bottom-up approach.
Integers can directly be used to index arrays;
e.g. you have just a simple array of unsigned 8-bit integers containing the set-bit count for 0x0001, 0x0002, 0x0003... and do a lookup with array[number_to_test].
You don't need to implement a hash function to map a 16-bit integer to something that you can order just to have a lookup function!
To answer your question about how to compute this table:
int table[256]; /* For 8 bit lookup */
table[0] = 0; /* base case: zero has no set bits */
for (int i=1; i<256; i++) {
    table[i] = table[i/2] + (i&1);
}
Look up each byte of the given integer in this table and sum the values obtained.
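The build step and the byte-by-byte lookup can be put together as in the following Python sketch. It assumes non-negative integers, and the names TABLE and popcount are chosen for this illustration only.

```python
# TABLE[i] holds the number of set bits in the byte value i.
TABLE = [0] * 256
for i in range(1, 256):
    # Bits of i = bits of i with the last bit dropped, plus the last bit.
    TABLE[i] = TABLE[i >> 1] + (i & 1)

def popcount(x):
    """Count set bits of a non-negative integer one byte at a time."""
    count = 0
    while x:
        count += TABLE[x & 0xFF]  # look up the lowest byte
        x >>= 8                   # move on to the next byte
    return count

print(popcount(20))    # 20 is 10100 in binary
print(popcount(0xFF))  # all eight bits of one byte set
```

A 256-entry array costs one lookup per byte, so a 32-bit integer needs four lookups; a 65536-entry table would halve that at the cost of more memory.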
I've been trying out the dp tutorials on Topcoder. One of the problems given for practice was MiniPaint . I think I've got the solution partly- find the minimum no. of mispaints for a given no. of strokes, for each row and then compute for the entire picture (again using dp, similar to the knapsack problem). However, I'm not sure how to compute the min. no for each row.
P.S I later found the match editorial, but the code for finding the min. no. of mispaintings for each row seems wrong. Could someone explain exactly what they've done in the code?
The stripScore() function returns the minimum number of mispaintings for a row given the number of strokes available to paint it. Although I'm not sure the rowid argument is used correctly, the idea is that it works from position start in a particular row, given the needed number of strokes still available and the colour of the region directly before it.
The key to this algorithm is that the best score for the area to the right of the kth region is uniquely determined by the number of strokes needed and the colour used to paint the (k-1)th region.
Intuition
I have been bashing my head against this problem for 3 days straight, not realising that it requires two consecutive uses of dynamic programming logic. My approaches, in contrast to the ones available from topcoder, are bottom up.
To start with, instead of calculating the minimum number of mispaints I can achieve, I will instead calculate the maximum number of cells I can paint with maxStrokes strokes. The result can easily be calculated by subtracting my findings from the total cells of my matrix. But how can I really do that? The initial observation has to be the fact that each row can yield me some painted cells in exchange for a number of strokes. This does not depend on the rest of the rows. That means that, for each row, I can calculate the maximum number of cells I can paint on that specific row, with a certain number of strokes.
Example
Input=['BBWWB','WBWWW'], maxStrokes=3
Let's now look at the first row BBWWB, and let C denote the maximum number of cells I can paint with Q strokes
Q  C
0  0 (I can't paint with 0 strokes)
1  3 (BBWWB)
2  4 (BBWWB)
3  5 (BBWWB)
We could easily represent the above results with an array of length 4 that stores for each index (stroke) the maximum number of cells that can be painted, namely [0,3,4,5]
It's easy to see that the second row in the same manner would have an array [0,4,4,5].
The result can now easily be calculated from these two arrays alone, as what we're looking for is a combination of two choices, one from each calculated array, that yields the highest number of cells I can paint with 3 strokes. What are my choices though? Each item of an array represents the maximum number of cells I can paint with index strokes. So, for the first array a choice would be to paint 4 cells with 2 strokes.
I could then combine that choice with the second array's 1-st item 4, which means I can paint 4 cells with 1 stroke. My final result would be 4+4=8 cells with 2+1=3 strokes, which happens to be the best I can get. The output would then trivially be 2*5-8=2 minimum mispaints. However, we need to find an optimal way to calculate the different combinations of items from each row and what sums they can yield me.
The Process
The first part of my algorithm populates two very important tables. Let us denote with N, M the dimensions of the matrix I'm given. The first table, dp is a N*M*maxStrokes matrix. dp[i][j][k] represents the maximum number of cells I can paint from the 0-th cell up until the j-th cell of the i-th row with k strokes. As for the maxPainted table, that is a N*maxStrokes matrix. maxPainted[i][k] stores the maximum number of cells I can paint in the i-th row with k strokes and is identical to the arrays calculated in the above example. In order to calculate the latter, I need to calculate dp first. The formula is the following:
dp[i][j][k] = MAX(1, dp[i][r][k]+1 (if A[i][j]==A[i][r]), dp[i][r][k-1]+1 (if A[i][j]!=A[i][r])), for every 0<=r<j
Which can be translated as: The maximum number of cells I can paint up to the j-th cell of the i-th row with k strokes is the maximum of:
1, because I can just ignore all the previous cells, and paint this cell alone
dp[i][r][k]+1, because when A[i][j]==A[i][r], I can extend that color with no extra strokes
dp[i][r][k-1]+1, because when A[i][j]!=A[i][r], I have to use a new stroke to paint A[i][j]
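As a quick check of the recurrence on the example rows, here is a minimal Python sketch for a single row. The function name max_painted_row is made up for this illustration, and the row index i is dropped because each row is handled independently.

```python
def max_painted_row(row, max_strokes):
    """best[k] = most cells of `row` painted correctly with k strokes."""
    m = len(row)
    # dp[j][k]: best count for cells 0..j when cell j itself is painted
    # and k strokes are available.
    dp = [[0] * (max_strokes + 1) for _ in range(m)]
    best = [0] * (max_strokes + 1)
    for k in range(1, max_strokes + 1):
        for j in range(m):
            dp[j][k] = 1  # paint cell j alone
            for r in range(j):
                if row[r] == row[j]:
                    # same colour: extend the stroke that painted cell r
                    dp[j][k] = max(dp[j][k], dp[r][k] + 1)
                else:
                    # different colour: a new stroke is spent on cell j
                    dp[j][k] = max(dp[j][k], dp[r][k - 1] + 1)
            best[k] = max(best[k], dp[j][k])
    return best

print(max_painted_row("BBWWB", 3))  # [0, 3, 4, 5]
print(max_painted_row("WBWWW", 3))  # [0, 4, 4, 5]
```

Both printed arrays match the ones worked out by hand in the example.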
It is now evident, that the dp table needs to be calculated in order to acquire the best possible scenarios for each row, that is the maximum number of cells I can paint with every possible number of strokes available. But how can I utilize the maxPainted table once I have calculated it in order to get to my result?
The second part of my approach uses a variation of the 0-1 Knapsack problem in order to calculate the biggest number of cells I can paint with maxStrokes strokes available. What really made this challenging, is that, in contrast to the classical Knapsack, I am only allowed to pick 1 item out of every row, and then calculate all the possible combinations that do not surpass the required stroke constraint. In order to achieve that, I will firstly create a new array of length N*M +1 , called possSums. Let us denote with possSums[S] the MINIMUM number of strokes needed to reach sum S. My goal is to calculate each row's contribution to this array. Let us demonstrate with our previous example.
So I had a 2*5 input, therefore the possSums array would consist of 10+1 elements, which we initialise to Infinity, as we're trying to minimize the strokes needed to reach each sum.
So, possSums=[0,∞,∞,∞,∞,∞,∞,∞,∞,∞,∞], with the first item being 0 because I can paint 0 cells with 0 strokes. What we're now looking to do is calculate each row's contribution to possSums. That means that for every row of my maxPainted array, each element needs to make a specific sum available, which simulates it being chosen. As previously demonstrated, maxPainted[0]=[0,3,4,5]. This row's contribution has to allow 0, 3, 4 and 5 as achievable sums in possSums, with 0, 1, 2 and 3 strokes used respectively. possSums is then transformed to possSums=[0,∞,∞,1,2,3,∞,∞,∞,∞,∞].
The next row is maxPainted[1]=[0,4,4,5], which once again has to alter possSums to allow the combinations made possible by selecting each of its items. Notice that each alteration needs to be independent of the others in the same row. For example, if we first allow sum=4 by picking the 1st item of maxPainted[1], sum=9 must not then be allowed by additionally picking the 3rd item of that same array; in other words, combinations of items from the same row cannot be considered. To ensure that no such cases are considered, for each row I create a clone of possSums and make the modifications to the clone, while reading only from the original array.
After considering all of the items within maxPainted[1], possSums looks like this: possSums=[0,∞,∞,1,1,3,∞,2,3,4,6]. The largest sum reachable with up to 3 strokes sits at index 8 (sum=8). Therefore my output would be 2*5-8=2.
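The merging step on its own can be reproduced with a short Python sketch, assuming the per-row arrays have already been computed. The function name merge_rows is invented for this illustration; it returns the final possSums array together with the minimum number of mispaints.

```python
import math

def merge_rows(max_painted, total_cells, max_strokes):
    """Pick one entry per row; track the fewest strokes for each sum."""
    poss = [math.inf] * (total_cells + 1)
    poss[0] = 0  # 0 cells painted with 0 strokes
    for row in max_painted:
        temp = poss[:]  # write to the clone, read from the original
        for strokes, cells in enumerate(row):
            for s in range(total_cells - cells + 1):
                if poss[s] != math.inf:
                    temp[s + cells] = min(temp[s + cells], poss[s] + strokes)
        poss = temp
    # Largest sum that stays within the stroke budget.
    best = max(s for s, w in enumerate(poss) if w <= max_strokes)
    return poss, total_cells - best

poss, mispaints = merge_rows([[0, 3, 4, 5], [0, 4, 4, 5]], 10, 3)
print(poss)       # [0, inf, inf, 1, 1, 3, inf, 2, 3, 4, 6]
print(mispaints)  # 2
```

Reading from poss while writing to temp is what prevents two entries of the same row from being stacked into one sum.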
var minipaint=(A,maxStrokes)=>{
    let n=A.length,m=A[0].length
    , maxPainted=[...Array(n)].map(d=>[...Array(maxStrokes+1)].map(d=>0))
    , dp=[...Array(n)].map(d=>[...Array(m)].map(d=>[...Array(maxStrokes+1)].map(d=>0)))
    for (let k = 1; k <=maxStrokes; k++)
        for (let i = 0; i <n; i++)
            for (let j = 0; j <m; j++) {
                dp[i][j][k]=1 // I can always just paint this cell alone
                // for every previous cell of this row,
                // consider painting it and then painting my current cell j
                for (let p = 0; p <j; p++)
                    if(A[i][p]===A[i][j]) // if the cells are the same, I don't need to use an extra stroke
                        dp[i][j][k]=Math.max(dp[i][p][k]+1,dp[i][j][k])
                    else // however, if they differ, I'm using an extra stroke (going from k-1 to k)
                        dp[i][j][k]=Math.max(dp[i][p][k-1]+1,dp[i][j][k])
                maxPainted[i][k]=Math.max(maxPainted[i][k],dp[i][j][k]) // store the maximum cells I can paint with k strokes
            }
    // This is where the knapsack VARIANT happens:
    // essentially I want to maximize the sum of my selection of strokes.
    // For each row, I can pick a maximum of 1 item. I also have a constraint on my total
    // strokes used, so I create an array possSums whose indexes represent the sum I want to reach, and whose values represent the MINIMUM strokes needed to reach that very sum.
    // So possSums[K] = minimum number of strokes needed to reach sum K
    let result=0,possSums=[...Array(n*m+1)].map(d=>Infinity)
    // base case: I can paint 0 cells with 0 strokes
    possSums[0]=0
    for (let i = 0; i < n; i++) {
        let curr=maxPainted[i],
            temp=[...possSums] // I create a clone of my possSums,
        // which, for each row, I alter instead of the original array
        // in order to avoid cases where two items from the same row contribute to
        // the same sum, which of course is incorrect.
        for (let stroke = 0; stroke <=maxStrokes; stroke++) {
            let maxCells=curr[stroke]
            // so the way this happens is:
            for (let sum = 0; sum <=n*m-maxCells; sum++) {
                let oldWeight=possSums[sum] // consider whether, UP until now, the sum was possible
                if(oldWeight==Infinity) // if it wasn't possible, I can't extend it with my maxCells
                    continue;
                // THE KEY STEP THAT ALLOWS 1 PICK PER ROW
                let minWeight=temp[sum+maxCells] // now, consider extending it to sum+maxCells
                // ALTERING THE TEMP ARRAY INSTEAD SO MY POTENTIAL RESULTS ARE NOT AFFECTED BY THE
                // SUMS THAT WERE ALLOWED DURING THE SAME ROW
                temp[sum+maxCells]=Math.min(minWeight,oldWeight+stroke)
                if(temp[sum+maxCells]<=maxStrokes)
                    result=Math.max(result,sum+maxCells)
            }
        }
        possSums=temp
    }
    return n*m-result // returning the total number of cells minus the maximum I can paint with maxStrokes
}
Might be a quite stupid question and I'm not sure if it belongs here or to math.
My problem:
I have several elements of type X which have a boolean attribute Y.
To calculate the percentage of elements where Y is true, I count all X where Y is true and divide it by the number of elements.
But I don't want to iterate over all elements every time just to update that percentage value.
My idea was:
If I had 33% for 3 elements, and am adding a fourth one where Y is true:
(0.33 * 3 + 1) / 4 = 0.4975
Obviously that does not work well because of the 0.33.
Is there any way for getting an accurate solution without iteration or saving the number of items where Y is true?
Keep a count of the total number of elements and of the "true" ones. Global vars, object member variables, whatever. I assume that sometime back when the program is starting, you have zero elements. Every time an element is added, removed, or its boolean attribute changes, increment or decrement those counts as appropriate. You'll never have to iterate over the list (except maybe for testing) but at the cost of every change to the list having to include fiddling with those variables.
Your idea doesn't work because 0.33 does not equal 1/3. It's an approximation. If you take the exact value, you get the right answer:
(1/3 * 3 + 1) / 4 = (1 + 1) / 4 = 1/2
My question is, if you can store the value of 33% without iterating, why not just store the values 1 and 3 and calculate from them? That is, just keep a running total of the number of true values and the number of objects. Increment when you get new ones. Calculate on demand. It's not necessary to iterate every time this way.
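A minimal Python sketch of the two-counter idea follows. The class name TrueRatio and its method names are invented for this illustration; the only state kept is the two integers, so the ratio is always exact up to the final division.

```python
class TrueRatio:
    """Tracks how many elements exist and how many have Y == True."""

    def __init__(self):
        self.total = 0
        self.true_count = 0

    def add(self, y):
        # Call whenever an element is added to the collection.
        self.total += 1
        if y:
            self.true_count += 1

    def remove(self, y):
        # Call whenever an element is removed.
        self.total -= 1
        if y:
            self.true_count -= 1

    def change(self, old_y, new_y):
        # Call whenever an element's Y attribute flips.
        self.true_count += int(bool(new_y)) - int(bool(old_y))

    def ratio(self):
        # Computed on demand; defined as 0.0 for an empty collection.
        return self.true_count / self.total if self.total else 0.0

tracker = TrueRatio()
for y in (True, False, False):
    tracker.add(y)
tracker.add(True)       # the fourth element from the question
print(tracker.ratio())  # 0.5
```

Every update is constant time, at the cost of routing all additions, removals and attribute changes through the tracker.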