Recurrence equation for dynamic programming - dynamic-programming

I have a situation that is very similar to the knapsack problem, and I just want to confirm that my recurrence equation is the same as the knapsack one.
We have at most M dollars to invest and N different investments, each with a cost m(i) and a profit g(i). We want to find the recurrence equation that maximizes the profit.
Here is my answer:
g(i,j) = max{g(i-1,j), g_i + g(i-1,j-m_i)} if j-m_i >= 0
g(i,j) = g(i-1,j) if j-m_i < 0
I hope my explanation is clear.
Thank you and have a nice day!
Bobby

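Your recurrence equation is correct. The problem is the same as the traditional knapsack problem. You can also optimize the space complexity by keeping a single one-dimensional array. Here is the C++ code.
#include <algorithm> // max
#include <cstring>   // memset
using namespace std;
// M, N, m[] and g[] are assumed to be defined elsewhere; M must be a compile-time constant.
int dp[M + 10];
int DP() {
    memset(dp, 0, sizeof(dp));
    for (int i = 0; i < N; ++i)
        for (int j = M; j >= m[i]; --j) // pay attention: j runs downward so each investment is used at most once
            dp[j] = max(dp[j], dp[j - m[i]] + g[i]);
    int ret = 0;
    for (int i = 0; i <= M; ++i) ret = max(ret, dp[i]);
    return ret;
}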
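To check that the one-dimensional loop really computes the same thing as the recurrence in the question, here is a minimal self-contained sketch that implements both. The function names maxProfit2D and maxProfit1D and the vector-based signatures are my own choices for illustration, not part of the original code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Direct translation of the recurrence: best[i][j] is the maximum profit
// using only the first i investments with a budget of j dollars.
int maxProfit2D(const std::vector<int>& cost, const std::vector<int>& profit, int M) {
    const std::size_t N = cost.size();
    std::vector<std::vector<int>> best(N + 1, std::vector<int>(M + 1, 0));
    for (std::size_t i = 1; i <= N; ++i) {
        for (int j = 0; j <= M; ++j) {
            best[i][j] = best[i - 1][j];            // skip investment i
            if (j >= cost[i - 1])                   // take it when affordable
                best[i][j] = std::max(best[i][j],
                                      profit[i - 1] + best[i - 1][j - cost[i - 1]]);
        }
    }
    return best[N][M];
}

// Space-optimized version: one row, with j iterated downward so that
// dp[j - cost[i]] still holds the value from the previous investment.
int maxProfit1D(const std::vector<int>& cost, const std::vector<int>& profit, int M) {
    std::vector<int> dp(M + 1, 0);
    for (std::size_t i = 0; i < cost.size(); ++i)
        for (int j = M; j >= cost[i]; --j)
            dp[j] = std::max(dp[j], dp[j - cost[i]] + profit[i]);
    return dp[M];
}
```

With costs {3, 4, 5}, profits {4, 5, 6} and M = 7, both functions pick the first two investments for a profit of 9. Because dp starts at zero, dp[M] is already the best value for any budget up to M, so no final scan over the array is needed in this variant.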

Related

Book Shop Question (same logic but 2 different implementations)

You are in a book shop which sells n different books. You know the price and number of pages of each book.
You have decided that the total price of your purchases will be at most x. What is the maximum number of pages you can buy? You can buy each book at most once.
So I figured that it was an example of the 0-1 knapsack problem.
In my first approach I created a dp array as dp[i][j] which tells us the maximum pages using i money and first j books.
int n,budget;
cin>>n>>budget;
vector<int> price(n),pages(n);
for (int &v : price)
cin >> v;
for (int &v : pages)
cin >> v;
vector<vector<int> > dp(budget+1 , vector<int>(n+1,0));
for(int i=1 ; i<budget+1 ; i++){
for(int j=1; j<n+1 ; j++){
if(i-price[j-1] >= 0){
dp[i][j] = max(dp[i][j-1] , dp[i-price[j-1]][j-1] + pages[j-1]);
}
else{
dp[i][j] = dp[i][j-1];
}
}
}
cout<<dp[budget][n];
But the problem is that this solution exceeds the time limit.
The solution posted on the site had the same logic, but the rows and columns of the dp vector were flipped. That is, dp[i][j] gives the maximum number of pages using j money and the first i books. That solution was much faster.
int n,x;
cin>>n>>x;
vector<int> price(n), pages(n);
for (int &v : price)
cin >> v;
for (int &v : pages)
cin >> v;
vector<vector<int> > dp(n + 1, vector<int>(x + 1, 0));
for (int i = 1; i <= n; i++)
{
for (int j = 1; j <= x; j++)
{
if (j - price[i - 1] >= 0)
{
dp[i][j] = max(dp[i-1][j], dp[i - 1][j - price[i - 1]] + pages[i - 1]);
}
else{
dp[i][j] = dp[i-1][j];
}
}
}
cout << dp[n][x] << endl;
I don't understand why there is a difference in running time between the two implementations above. I am new to competitive programming, so please clarify my doubt.
Thanks in advance.
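A likely explanation, offered as an assumption rather than a measured result, is memory access order. Each inner vector of a vector<vector<int>> is contiguous, so the second version's inner loop walks along rows i and i-1, while the first version reads dp[i-price[j-1]][j-1], which lands in a different row on almost every step. The sketch below avoids the layout question altogether by using a single row. The function name maxPages is hypothetical, and this is the standard one-dimensional 0-1 knapsack loop, not the solution posted on the site.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// dp[j] = maximum number of pages obtainable with a total price of at most j.
int maxPages(const std::vector<int>& price, const std::vector<int>& pages, int x) {
    std::vector<int> dp(x + 1, 0);
    for (std::size_t i = 0; i < price.size(); ++i)
        // j runs downward so that book i is bought at most once
        for (int j = x; j >= price[i]; --j)
            dp[j] = std::max(dp[j], dp[j - price[i]] + pages[i]);
    return dp[x];
}
```

For prices {4, 8, 5, 3}, pages {5, 12, 8, 1} and a budget of 10, the best choice is the books priced 4 and 5, giving 13 pages.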

I'm not able to understand the logic of the coin changing problem in O(sum) space complexity

I'm facing difficulty in understanding O(sum) complexity solution of coin changing problem.
The problem statement is:
You are given a set of coins A. In how many ways can you make sum B assuming you have infinite amount of each coin in the set.
NOTE:
Coins in set A will be unique. Expected space complexity of this problem is O(B).
The solution is:
int count( int S[], int m, int n )
{
int table[n+1];
memset(table, 0, sizeof(table));
table[0] = 1;
for(int i=0; i<m; i++)
for(int j=S[i]; j<=n; j++)
table[j] += table[j-S[i]];
return table[n];
}
Can someone explain this code to me?
First, let's identify the parameters and variables used in the function:
Parameters:
S contains the denominations of all m coins, i.e. each element holds the value of one coin.
m represents the number of coin denominations. Essentially, it's the length of array S.
n represents the sum B to be achieved.
Variables:
table: Element i in array table contains the number of ways sum i can be achieved with the given coins. table[0] = 1 because there is a single way to achieve a sum of 0 (not using any coin).
i loops through each coin.
Logic:
The number of ways to achieve a sum j = sum of the following:
number of ways to achieve a sum of j - S[0]
number of ways to achieve a sum of j - S[1]
...
number of ways to achieve a sum of j - S[m-1] (S[m-1] is the value of the mth coin)
I did not completely decipher nor validate the rest of the code, but I hope this is a step in the right direction.
Added comments to code:
#include <stdio.h>
#include <string.h>
int count( int S[], int m, int n )
{
int table[n+1];
memset(table, 0, sizeof(table));
table[0] = 1;
for(int i=0; i<m; i++) // Loop through all of the coins
for(int j=S[i]; j<=n; j++) // Achieve sum j between the value of S[i] and n.
table[j] += table[j-S[i]]; // Add to the number of ways to achieve sum j the number of ways to achieve sum j - S[i]
return table[n];
}
int main() {
int S[] = {1, 2};
int m = 2;
int n = 3;
int c = count(S, m, n);
printf("%d\n", c);
}
Notes:
The code avoids repeats: 3 = 1+1+1 or 1+2 (2 ways, instead of 3 if 2+1 were counted separately).
The result does not depend on the order of the coins in terms of value.
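The reason repeats are avoided is the loop order: the coin loop is the outer one. A small sketch makes this visible by putting the two loop orders side by side. The function names countCombinations and countOrderedSequences are made up for this illustration.

```cpp
#include <vector>

// Coins in the outer loop: each coin is fully processed before the next one,
// so a multiset of coins such as {1, 2} is counted exactly once.
long long countCombinations(const std::vector<int>& coins, int target) {
    std::vector<long long> table(target + 1, 0);
    table[0] = 1; // one way to make 0: use no coins
    for (int c : coins)
        for (int j = c; j <= target; ++j)
            table[j] += table[j - c];
    return table[target];
}

// Sums in the outer loop: every coin may be the last one added to reach j,
// so 1+2 and 2+1 are counted as two different ways.
long long countOrderedSequences(const std::vector<int>& coins, int target) {
    std::vector<long long> table(target + 1, 0);
    table[0] = 1;
    for (int j = 1; j <= target; ++j)
        for (int c : coins)
            if (j >= c)
                table[j] += table[j - c];
    return table[target];
}
```

With coins {1, 2} and a target of 3, countCombinations returns 2 and countOrderedSequences returns 3, which matches the note above about 2+1.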

Is it possible to parallelize or unroll this loop?

I am trying to see if I can improve the performance of the following loop in C++, which uses two-dimensional vectors (_external and _Table) and has a loop-carried dependency on the previous iteration. Additionally, the innermost loop uses a computed index, which makes the access to _Table on the right-hand side non-sequential.
int N = 8000;
int M = 400;
int P = 100;
for(int i = 1; i <= N; i++){
for(int j = 0; j < M; j++){
for(int k =0; k < P; k++){
int index = _external.at(j).at(k);
_Table.at(j).at(i) += _Table.at(index).at(i-1);
}
}
}
What can I do to improve the performance of a loop like this?
Well it looks to me like the order in which these statements:
int index = _external.at(j).at(k);
_Table.at(j).at(i) += _Table.at(index).at(i-1);
are executed is critical to correctness. (That is, if the iteration order for i, j, k changes, then the results will be different ... and incorrect.)
So I think you are only left with micro-optimizations, like hoisting the expressions _Table.at(j).at(i) and _external.at(j) out of the innermost loop.
Consider this:
for(int k =0; k < P; k++){
int index = _external.at(j).at(k);
_Table.at(j).at(i) += _Table.at(index).at(i-1);
}
This loop is repeatedly adding numbers to _Table.at(j).at(i). Since (by inspection) _Table.at(index).at(i-1) must be reading from a different cell of the table (because of i-1 versus i), you could do this:
int temp = 0;
for(int k =0; k < P; k++){
int index = _external.at(j).at(k);
temp += _Table.at(index).at(i-1);
}
_Table.at(j).at(i) += temp;
This will reduce the number of calls to at, and may also improve cache performance a bit.
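Here is a self-contained sketch of that idea next to the original loop, so the two can be compared on the same data. The Grid alias and the function names are hypothetical, and the hoisted version also swaps at() for operator[], which assumes every value stored in external is a valid row index (operator[] does no bounds checking).

```cpp
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Original form: four bounds-checked at() calls per innermost iteration.
void updateNaive(Grid& table, const Grid& external, int N) {
    for (int i = 1; i <= N; ++i)
        for (std::size_t j = 0; j < external.size(); ++j)
            for (std::size_t k = 0; k < external[j].size(); ++k) {
                int index = external.at(j).at(k);
                table.at(j).at(i) += table.at(index).at(i - 1);
            }
}

// Hoisted form: the row of external is looked up once per j, and the sum is
// accumulated in a local before a single write to table[j][i].
// Safe because the reads use column i-1 and the write uses column i.
void updateHoisted(Grid& table, const Grid& external, int N) {
    for (int i = 1; i <= N; ++i)
        for (std::size_t j = 0; j < external.size(); ++j) {
            const std::vector<int>& row = external[j];
            int temp = 0;
            for (int index : row)
                temp += table[index][i - 1];
            table[j][i] += temp;
        }
}
```

The outer loop over i still has to run in order, since column i depends on column i-1. For a fixed i, however, each j only reads column i-1 and writes its own cell in column i, so the j iterations do not depend on each other.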

Finding similar/related texts algorithms

I searched a lot on Stack Overflow and Google but I didn't find a good answer for this.
I'm going to develop a news reader system that crawls and collects news from the web (with a crawler), and then I want to find similar or related news across websites (in order to avoid showing duplicated news on the website).
I think the best live example of that is Google News: it collects news from the web, then categorizes it and finds related news and articles. This is what I want to do.
What's the best algorithm for doing this?
A relatively simple solution is to compute a tf-idf vector (en.wikipedia.org/wiki/Tf*idf) for each document, then use the cosine distance (en.wikipedia.org/wiki/Cosine_similarity) between these vectors as an estimate for semantic distance between articles.
This will probably capture semantic relationships better than Levenshtein distance and is much faster to compute.
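As a rough sketch of the tf-idf plus cosine approach, assuming whitespace tokenization and the plain weighting count * log(N / document frequency): real systems usually also lowercase, strip punctuation, drop stop words and use a smoothed idf. All names below (SparseVec, tokenize, tfidf, cosineSimilarity) are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using SparseVec = std::map<std::string, double>;

// Split on whitespace only (an assumption made to keep the sketch short).
std::vector<std::string> tokenize(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> words;
    for (std::string w; in >> w;) words.push_back(w);
    return words;
}

// One tf-idf vector per document: raw term count times log(N / df).
std::vector<SparseVec> tfidf(const std::vector<std::string>& docs) {
    std::vector<SparseVec> vecs(docs.size());
    std::map<std::string, int> docFreq;
    for (std::size_t d = 0; d < docs.size(); ++d) {
        for (const std::string& w : tokenize(docs[d])) vecs[d][w] += 1.0;
        for (const auto& entry : vecs[d]) ++docFreq[entry.first];
    }
    const double n = static_cast<double>(docs.size());
    for (SparseVec& v : vecs)
        for (auto& entry : v)
            entry.second *= std::log(n / docFreq[entry.first]);
    return vecs;
}

// Cosine of the angle between two sparse vectors; 0 if either one is all zeros.
double cosineSimilarity(const SparseVec& a, const SparseVec& b) {
    double dot = 0.0, normA = 0.0, normB = 0.0;
    for (const auto& e : a) {
        normA += e.second * e.second;
        auto it = b.find(e.first);
        if (it != b.end()) dot += e.second * it->second;
    }
    for (const auto& e : b) normB += e.second * e.second;
    if (normA == 0.0 || normB == 0.0) return 0.0;
    return dot / std::sqrt(normA * normB);
}
```

Two articles whose similarity is above some threshold can then be treated as duplicates or related. The threshold has to be tuned on real data. Note that with this weighting a word appearing in every document gets weight 0, so it does not contribute to the similarity at all.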
This is one: http://en.wikipedia.org/wiki/Levenshtein_distance
public static SqlInt32 ComputeLevenstheinDistance(SqlString firstString, SqlString secondString)
{
int n = firstString.Value.Length;
int m = secondString.Value.Length;
int[,] d = new int[n + 1,m + 1];
// Step 1
if (n == 0)
{
return m;
}
if (m == 0)
{
return n;
}
// Step 2
for (int i = 0; i <= n; d[i, 0] = i++)
{
}
for (int j = 0; j <= m; d[0, j] = j++)
{
}
// Step 3
for (int i = 1; i <= n; i++)
{
//Step 4
for (int j = 1; j <= m; j++)
{
// Step 5
int cost = (secondString.Value[j - 1] == firstString.Value[i - 1]) ? 0 : 1;
// Step 6
d[i, j] = Math.Min(Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
}
}
// Step 7
return d[n, m];
}
This is handy for the task at hand: http://code.google.com/p/boilerpipe/
Also, if you need to reduce the number of words to analyze, try this: http://ots.codeplex.com/
I have found the OTS VERY useful in sentiment analysis, whereby I can reduce the number of sentences into a small list of common phrases and/or words and calculate the overall sentiment based on this. The same should work for similarity.

A Dynamic Programming problem in USACO

In section 2.2, a problem called "subset sum" requires you to calculate in how many ways the set of integers from 1 to n can be partitioned into two sets whose sums are identical.
I know the recurrence is:
f[i][j]: number of ways to reach a sum of j using the numbers 1...i
f[i][j] = f[i-1][j] + f[i-1][j-i]
If the initial condition is:
f[1][1] = 1; // all others are zero, main loop starts from 2
OR:
f[0][0] = 1; // all others are zero, main loop starts from 1
the answers are both f[n][n*(n+1)/4]. Does this mean the initial condition doesn't affect the answer?
But if I use a one-dimensional array, say f[N]:
with f[0] = 1 and the loop starting from 1 (so f[0] is in fact f[0][0]), the answer is f[n]/2,
or with f[1] = 1 and the loop starting from 2 (f[1] is f[1][1]), the answer is f[n].
I am so confused...
I don't know if you are still stuck on this problem, but here's a solution for anyone else who stumbles onto this problem.
Let ways[i] be the number of ways you can get a sum of i using a subset of the numbers 1...N.
Then it becomes a variant of the 0-1 knapsack algorithm:
base case: ways[0] = 1
for (int i = 1; i <= N; i++) {
for (int j = sum - i; j >= 0; --j) { //sum is n*(n+1)/2
ways[j + i] += ways[j];
}
}
Your answer is located at ways[sum/2]/2.
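Put together as a runnable function, the loop above could look like the following sketch. The name countEqualPartitions is made up, and the early return for an odd total is an added guard that the snippet above does not show.

```cpp
#include <vector>

// Number of ways to split {1, ..., n} into two sets with equal sums.
long long countEqualPartitions(int n) {
    const int sum = n * (n + 1) / 2;
    if (sum % 2 != 0) return 0; // an odd total cannot be split evenly
    std::vector<long long> ways(sum + 1, 0);
    ways[0] = 1; // base case: the empty subset
    for (int i = 1; i <= n; ++i)
        // j runs downward so that the number i is used at most once
        for (int j = sum - i; j >= 0; --j)
            ways[j + i] += ways[j];
    // every partition is counted twice, once for each of its two halves
    return ways[sum / 2] / 2;
}
```

For n = 7 this returns 4. The division by 2 is needed because ways[sum/2] counts subsets, and each partition {A, B} shows up once as A and once as B.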

Resources