How to dynamic programming with a LARGE input

How to dynamic programming with a LARGE input - dynamic-programming

Let's assume that we're now solving a typical problem which can be solved by dynamic programming - get the number of possible coin combinations for a change.
Memorizer mem = new Memorizer();
int[] coins = { 100000, 8534, 5935, 291, 76, 51, 30, 29, 7, 5 }
......
int getNum(int money, int idx) {
if(idx == coins.length - 1) {
if(money % coins[idx] == 0)
return 1;
else
return 0;
}
int found = mem.find(money, idx);
if(found != null)
return found;
int num = 0;
for(int i = 0; i <= (money / coins[idx]); i++)
num += getNum(money - i * coins[idx], idx + 1);
mem.remember(money, idx, num);
return num;
}
But if the money is very large, something like 2,000,000,000, it's very hard to memorize all the intermediate results. How can I solve the problem with a very large input? Please help me. Thank you.

There is something wrong with this dp, it should be:
if(money % coins[idx] == 0)
return money/coins[idx];
That being said, It is bad idea to loop on one coin value till you reach the money, rather, loop through all the values of coin. This way, num[x] = min(num[x-coin[i]]+1, num[x]).
In this code, we could use memorization on part of array, i.e. last {maximum value of coin} elements, because, you only need to remember for x-coin[i]. Try figuring out how to implement this if it is what you are looking for.

Related

(DP) Memoization - How to know if it starts from the top or bottom?

It hasn't been long since I started studying algorithm coding tests, and I found it difficult to find regularity in Memoization.
Here are two problems.
Min Cost Climbing Stairs
You are given an integer array cost where cost[i] is the cost of ith step on a staircase. Once you pay the cost, you can either climb one or two steps.
You can either start from the step with index 0, or the step with index 1.
Return the minimum cost to reach the top of the floor.
Min Cost Climbing Stairs
Recurrence Relation Formula:
minimumCost(i) = min(cost[i - 1] + minimumCost(i - 1), cost[i - 2] + minimumCost(i - 2))
House Robber
You are a professional robber planning to rob houses along a street. Each house has a certain amount of money stashed, the only constraint stopping you from robbing each of them is that adjacent houses have security systems connected and it will automatically contact the police if two adjacent houses were broken into on the same night.
Given an integer array nums representing the amount of money of each house, return the maximum amount of money you can rob tonight without alerting the police.
House Robber
Recurrence Relation Formula:
robFrom(i) = max(robFrom(i + 1), robFrom(i + 2) + nums(i))
So as you can see, first problem consist of the previous, and second problem consist of the next.
Because of this, when I try to make recursion function, start numbers are different.
Start from n
int rec(int n, vector<int>& cost)
{
if(memo[n] == -1)
{
if(n <= 1)
{
memo[n] = 0;
} else
{
memo[n] = min(rec(n-1, cost) + cost[n-1], rec(n-2, cost) + cost[n-2]);
}
}
return memo[n];
}
int minCostClimbingStairs(vector<int>& cost) {
const int n = cost.size();
memo.assign(n+1,-1);
return rec(n, cost); // Start from n
}
Start from 0
int getrob(int n, vector<int>& nums)
{
if(how_much[n] == -1)
{
if(n >= nums.size())
{
return 0;
} else {
how_much[n] = max(getrob(n + 1, nums), getrob(n + 2, nums) + nums[n]);
}
}
return how_much[n];
}
int rob(vector<int>& nums) {
how_much.assign(nums.size() + 2, -1);
return getrob(0, nums); // Start from 0
}
How can I easily know which one need to be started from 0 or n? Is there some regularity?
Or should I just solve a lot of problems and increase my sense?

Your question is right, but somehow examples are not correct. Both the problems you shared can be done in both ways : 1. starting from top & 2. starting from bottom.
For example: Min Cost Climbing Stairs : solution that starts from 0.
int[] dp;
public int minCostClimbingStairs(int[] cost) {
int n = cost.length;
dp = new int[n];
for(int i=0; i<n; i++) {
dp[i] = -1;
}
rec(0, cost);
return Math.min(dp[0], dp[1]);
}
int rec(int in, int[] cost) {
if(in >= cost.length) {
return 0;
} else {
if(dp[in] == -1) {
dp[in] = cost[in] + Math.min(rec(in+1, cost), rec(in+2, cost));
}
return dp[in];
}
}
However, there are certain set of problems where this is not easy. Their structure is such that if you start in reverse, the computation could get complicated or mess up the future results:
Example: Reaching a target sum from numbers in an array using an index at max only 1 time. Reaching 10 in {3, 4, 6, 5, 2} : {4,6} is one answer but not {6, 2, 2} as you are using index (4) 2 times.
This can be done easily in top down way:
int m[M+10];
for(i=0; i<M+10; i++) m[i]=0;
m[0]=1;
for(i=0; i<n; i++)
for(j=M; j>=a[i]; j--)
m[j] |= m[j-a[i]];
If you try to implement in bottom up way, you will end up using a[i] multiple times. You can definitely do it bottom up way if you figure a out a way to tackle this messing up of states. Like using a queue to only store reached state in previous iterations and not use numbers reached in current iterations. Or even check if you keep a count in m[j] instead of just 1 and only use numbers where count is less than that of current iteration count. I think same thing should be valid for all DP.

find the number of ways you can form a string on size N, given an unlimited number of 0s and 1s

The below question was asked in the atlassian company online test ,I don't have test cases , this is the below question I took from this link
find the number of ways you can form a string on size N, given an unlimited number of 0s and 1s. But
you cannot have D number of consecutive 0s and T number of consecutive 1s. N, D, T were given as inputs,
Please help me on this problem,any approach how to proceed with it
My approach for the above question is simply I applied recursion and tried for all possiblity and then I memoized it using hash map
But it seems to me there must be some combinatoric approach that can do this question in less time and space? for debugging purposes I am also printing the strings generated during recursion, if there is flaw in my approach please do tell me
#include <bits/stdc++.h>
using namespace std;
unordered_map<string,int>dp;
int recurse(int d,int t,int n,int oldd,int oldt,string s)
{
if(d<=0)
return 0;
if(t<=0)
return 0;
cout<<s<<"\n";
if(n==0&&d>0&&t>0)
return 1;
string h=to_string(d)+" "+to_string(t)+" "+to_string(n);
if(dp.find(h)!=dp.end())
return dp[h];
int ans=0;
ans+=recurse(d-1,oldt,n-1,oldd,oldt,s+'0')+recurse(oldd,t-1,n-1,oldd,oldt,s+'1');
return dp[h]=ans;
}
int main()
{
int n,d,t;
cin>>n>>d>>t;
dp.clear();
cout<<recurse(d,t,n,d,t,"")<<"\n";
return 0;
}

You are right, instead of generating strings, it is worth to consider combinatoric approach using dynamic programming (a kind of).
"Good" sequence of length K might end with 1..D-1 zeros or 1..T-1 of ones.
To make a good sequence of length K+1, you can add zero to all sequences except for D-1, and get 2..D-1 zeros for the first kind of precursors and 1 zero for the second kind
Similarly you can add one to all sequences of the first kind, and to all sequences of the second kind except for T-1, and get 1 one for the first kind of precursors and 2..T-1 ones for the second kind
Make two tables
Zeros[N][D] and Ones[N][T]
Fill the first row with zero counts, except for Zeros[1][1] = 1, Ones[1][1] = 1
Fill row by row using the rules above.
Zeros[K][1] = Sum(Ones[K-1][C=1..T-1])
for C in 2..D-1:
Zeros[K][C] = Zeros[K-1][C-1]
Ones[K][1] = Sum(Zeros[K-1][C=1..T-1])
for C in 2..T-1:
Ones[K][C] = Ones[K-1][C-1]
Result is sum of the last row in both tables.
Also note that you really need only two active rows of the table, so you can optimize size to Zeros[2][D] after debugging.

This can be solved using dynamic programming. I'll give a recursive solution to the same. It'll be similar to generating a binary string.
States will be:
i: The ith character that we need to insert to the string.
cnt: The number of consecutive characters before i
bit: The character which was repeated cnt times before i. Value of bit will be either 0 or 1.
Base case will: Return 1, when we reach n since we are starting from 0 and ending at n-1.
Define the size of dp array accordingly. The time complexity will be 2 x N x max(D,T)
#include<bits/stdc++.h>
using namespace std;
int dp[1000][1000][2];
int n, d, t;
int count(int i, int cnt, int bit) {
if (i == n) {
return 1;
}
int &ans = dp[i][cnt][bit];
if (ans != -1) return ans;
ans = 0;
if (bit == 0) {
ans += count(i+1, 1, 1);
if (cnt != d - 1) {
ans += count(i+1, cnt + 1, 0);
}
} else {
// bit == 1
ans += count(i+1, 1, 0);
if (cnt != t-1) {
ans += count(i+1, cnt + 1, 1);
}
}
return ans;
}
signed main() {
ios_base::sync_with_stdio(false), cin.tie(nullptr);
cin >> n >> d >> t;
memset(dp, -1, sizeof dp);
cout << count(0, 0, 0);
return 0;
}

Profit Maximization based on dynamix programming

I have been trying to solve this problem :
" You have to travel to different villages to make some profit.
In each village, you gain some profit. But the catch is, from a particular village i, you can only move to a village j if and only if and the profit gain from village j is a multiple of the profit gain from village i.
You have to tell the maximum profit you can gain while traveling."
Here is the link to the full problem:
https://www.hackerearth.com/practice/algorithms/dynamic-programming/introduction-to-dynamic-programming-1/practice-problems/algorithm/avatar-and-his-quest-d939b13f/description/
I have been trying to solve this problem for quite a few hours. I know this is a variant of the longest increasing subsequence but the first thought that came to my mind was to solve it through recursion and then memoize it. Here is a part of the code to my approach. Please help me identify the mistake.
static int[] dp;
static int index;
static int solve(int[] p) {
int n = p.length;
int max = 0;
for(int i = 0;i<n; i++)
{
dp = new int[i+1];
Arrays.fill(dp,-1);
index = i;
max = Math.max(max,profit(p,i));
}
return max;
}
static int profit(int[] p, int n)
{
if(dp[n] == -1)
{
if(n == 0)
{
if(p[index] % p[n] == 0)
dp[n] = p[n];
else
dp[n] = 0;
}
else
{
int v1 = profit(p,n-1);
int v2 = 0;
if(p[index] % p[n] == 0)
v2 = p[n] + profit(p,n-1);
dp[n] = Math.max(v1,v2);
}
}
return dp[n];
}

I have used extra array to get the solution, my code is written in Java.
public static int getmaxprofit(int[] p, int n){
// p is the array that contains all the village profits
// n is the number of villages
// used one extra array msis, that would be just a copy of p initially
int i,j,max=0;
int msis[] = new int[n];
for(i=0;i<n;i++){
msis[i]=p[i];
}
// while iteraring through p, I will check in backward and find all the villages that can be added based on criteria such previous element must be smaller and current element is multiple of previous.
for(i=1;i<n;i++){
for(j=0;j<i;j++){
if(p[i]>p[j] && p[i]%p[j]==0 && msis[i] < msis[j]+p[i]){
msis[i] = msis[j]+p[i];
}
}
}
for(i=0;i<n;i++){
if(max < msis[i]){
max = msis[i];
}
}
return max;
}

Optimal algorithm for this string decompression

I have been working on an exercise from google's dev tech guide. It is called Compression and Decompression you can check the following link to get the description of the problem Challenge Description.
Here is my code for the solution:
public static String decompressV2 (String string, int start, int times) {
String result = "";
for (int i = 0; i < times; i++) {
inner:
{
for (int j = start; j < string.length(); j++) {
if (isNumeric(string.substring(j, j + 1))) {
String num = string.substring(j, j + 1);
int times2 = Integer.parseInt(num);
String temp = decompressV2(string, j + 2, times2);
result = result + temp;
int next_j = find_next(string, j + 2);
j = next_j;
continue;
}
if (string.substring(j, j + 1).equals("]")) { // Si es un bracket cerrado
break inner;
}
result = result + string.substring(j,j+1);
}
}
}
return result;
}
public static int find_next(String string, int start) {
int count = 0;
for (int i = start; i < string.length(); i++) {
if (string.substring(i, i+1).equals("[")) {
count= count + 1;
}
if (string.substring(i, i +1).equals("]") && count> 0) {
count = count- 1;
continue;
}
if (string.substring(i, i +1).equals("]") && count== 0) {
return i;
}
}
return -111111;
}
I will explain a little bit about the inner workings of my approach. It is a basic solution involves use of simple recursion and loops.
So, let's start from the beggining with a simple decompression:
DevTech.decompressV2("2[3[a]b]", 0, 1);
As you can see, the 0 indicates that it has to iterate over the string at index 0, and the 1 indicates that the string has to be evaluated only once: 1[ 2[3[a]b] ]
The core here is that everytime you encounter a number you call the algorithm again(recursively) and continue where the string insides its brackets ends, that's the find_next function for.
When it finds a close brackets, the inner loop breaks, that's the way I choose to make the stop sign.
I think that would be the main idea behind the algorithm, if you read the code closely you'll get the full picture.
So here are some of my concerns about the way I've written the solution:
I could not find a more clean solution to tell the algorithm were to go next if it finds a number. So I kind of hardcoded it with the find_next function. Is there a way to do this more clean inside the decompress func ?
About performance, It wastes a lot of time by doing the same thing again, when you have a number bigger than 1 at the begging of a bracket.
I am relatively to programming so maybe this code also needs an improvement not in the idea, but in the ways It's written. So would be very grateful to get some suggestions.
This is the approach I figure out but I am sure there are a couple more, I could not think of anyone but It would be great if you could tell your ideas.
In the description it tells you some things that you should be awared of when developing the solutions. They are: handling non-repeated strings, handling repetitions inside, not doing the same job twice, not copying too much. Are these covered by my approach ?
And the last point It's about tets cases, I know that confidence is very important when developing solutions, and the best way to give confidence to an algorithm is test cases. I tried a few and they all worked as expected. But what techniques do you recommend for developing test cases. Are there any softwares?
So that would be all guys, I am new to the community so I am open to suggestions about the how to improve the quality of the question. Cheers!

Your solution involves a lot of string copying that really slows it down. Instead of returning strings that you concatenate, you should pass a StringBuilder into every call and append substrings onto that.
That means you can use your return value to indicate the position to continue scanning from.
You're also parsing repeated parts of the source string more than once.
My solution looks like this:
public static String decompress(String src)
{
StringBuilder dest = new StringBuilder();
_decomp2(dest, src, 0);
return dest.toString();
}
private static int _decomp2(StringBuilder dest, String src, int pos)
{
int num=0;
while(pos < src.length()) {
char c = src.charAt(pos++);
if (c == ']') {
break;
}
if (c>='0' && c<='9') {
num = num*10 + (c-'0');
} else if (c=='[') {
int startlen = dest.length();
pos = _decomp2(dest, src, pos);
if (num<1) {
// 0 repetitions -- delete it
dest.setLength(startlen);
} else {
// copy output num-1 times
int copyEnd = startlen + (num-1) * (dest.length()-startlen);
for (int i=startlen; i<copyEnd; ++i) {
dest.append(dest.charAt(i));
}
}
num=0;
} else {
// regular char
dest.append(c);
num=0;
}
}
return pos;
}

I would try to return a tuple that also contains the next index where decompression should continue from. Then we can have a recursion that concatenates the current part with the rest of the block in the current recursion depth.
Here's JavaScript code. It takes some thought to encapsulate the order of operations that reflects the rules.
function f(s, i=0){
if (i == s.length)
return ['', i];
// We might start with a multiplier
let m = '';
while (!isNaN(s[i]))
m = m + s[i++];
// If we have a multiplier, we'll
// also have a nested expression
if (s[i] == '['){
let result = '';
const [word, nextIdx] = f(s, i + 1);
for (let j=0; j<Number(m); j++)
result = result + word;
const [rest, end] = f(s, nextIdx);
return [result + rest, end]
}
// Otherwise, we may have a word,
let word = '';
while (isNaN(s[i]) && s[i] != ']' && i < s.length)
word = word + s[i++];
// followed by either the end of an expression
// or another multiplier
const [rest, end] = s[i] == ']' ? ['', i + 1] : f(s, i);
return [word + rest, end];
}
var strs = [
'2[3[a]b]',
'10[a]',
'3[abc]4[ab]c',
'2[2[a]g2[r]]'
];
for (const s of strs){
console.log(s);
console.log(JSON.stringify(f(s)));
console.log('');
}

Is it possible to do a Levenshtein distance in Excel without having to resort to Macros?

Let me explain.
I have to do some fuzzy matching for a company, so ATM I use a levenshtein distance calculator, and then calculate the percentage of similarity between the two terms. If the terms are more than 80% similar, Fuzzymatch returns "TRUE".
My problem is that I'm on an internship, and leaving soon. The people who will continue doing this do not know how to use excel with macros, and want me to implement what I did as best I can.
So my question is : however inefficient the function may be, is there ANY way to make a standard function in Excel that will calculate what I did before, without resorting to macros ?
Thanks.

If you came about this googling something like
levenshtein distance google sheets
I threw this together, with the code comment from milot-midia on this gist (https://gist.github.com/andrei-m/982927 - code under MIT license)
From Sheets in the header menu, Tools -> Script Editor
Name the project
The name of the function (not the project) will let you use the func
Paste the following code
function Levenshtein(a, b) {
if(a.length == 0) return b.length;
if(b.length == 0) return a.length;
// swap to save some memory O(min(a,b)) instead of O(a)
if(a.length > b.length) {
var tmp = a;
a = b;
b = tmp;
}
var row = [];
// init the row
for(var i = 0; i <= a.length; i++){
row[i] = i;
}
// fill in the rest
for(var i = 1; i <= b.length; i++){
var prev = i;
for(var j = 1; j <= a.length; j++){
var val;
if(b.charAt(i-1) == a.charAt(j-1)){
val = row[j-1]; // match
} else {
val = Math.min(row[j-1] + 1, // substitution
prev + 1, // insertion
row[j] + 1); // deletion
}
row[j - 1] = prev;
prev = val;
}
row[a.length] = prev;
}
return row[a.length];
}
You should be able to run it from a spreadsheet with
=Levenshtein(cell_1,cell_2)

While it can't be done in a single formula for any reasonably-sized strings, you can use formulas alone to compute the Levenshtein Distance between strings using a worksheet.
Here is an example that can handle strings up to 15 characters, it could be easily expanded for more:
https://docs.google.com/spreadsheet/ccc?key=0AkZy12yffb5YdFNybkNJaE5hTG9VYkNpdW5ZOWowSFE&usp=sharing
This isn't practical for anything other than ad-hoc comparisons, but it does do a decent job of showing how the algorithm works.

looking at the previous answers to calculating Levenshtein distance, I think it would be impossible to create it as a formula.
Take a look at the code here

Actually, I think I just found a workaround. I was adding it in the wrong part of the code...
Adding this line
} else if(b.charAt(i-1)==a.charAt(j) && b.charAt(i)==a.charAt(j-1)){
val = row[j-1]-0.33; //transposition
so it now reads
if(b.charAt(i-1) == a.charAt(j-1)){
val = row[j-1]; // match
} else if(b.charAt(i-1)==a.charAt(j) && b.charAt(i)==a.charAt(j-1)){
val = row[j-1]-0.33; //transposition
} else {
val = Math.min(row[j-1] + 1, // substitution
prev + 1, // insertion
row[j] + 1); // deletion
}
Seems to fix the problem. Now 'biulding' is 92% accurate and 'bilding' is 88%. (whereas with the original formula 'biulding' was only 75%... despite being closer to the correct spelling of building)

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How to dynamic programming with a LARGE input - dynamic-programming

Related

(DP) Memoization - How to know if it starts from the top or bottom?

find the number of ways you can form a string on size N, given an unlimited number of 0s and 1s

Profit Maximization based on dynamix programming

Optimal algorithm for this string decompression

Is it possible to do a Levenshtein distance in Excel without having to resort to Macros?

Categories

Resources