Dynamic programming: 4×n matrix, how to find max sum in O(n)

I have a few restrictions. I can't pick values from cells that share a side (diagonal is OK). I can pick multiple elements from each column.
For example, I have this matrix:
1 3 6 10 3
4 2 6 7 9
5 1 8 3 3
9 2 9 1 1
If I pick 1 and 5 in the first column, I have to pick 2 and 2 in the second, 6 and 8 in the third, and so on.
The best I can come up with is brute force, but that is O(n^4). Any ideas?
I have looked at the Hungarian algorithm, but it does not run in O(n) either.
Thanks!

If the matrix is restricted to 4 × n, we can just run a dynamic program. The choices for any column can only be one of the following patterns (each pattern lists the four rows from top to bottom; x marks a picked cell):
a) x - - -
b) - x - -
c) - - x -
d) - - - x
e) x - x -
f) - x - x
g) x - - x
h) - - - -
We can define f(i, p) to represent the maximum sum up to and including the ith column when we choose pattern p for our element selections. Then:
f(i, a) = sum ith column elements pattern a + max(
f(i - 1, b),
f(i - 1, c),
f(i - 1, d),
f(i - 1, f),
f(i - 1, h)
)
f(i, b) = sum ith column elements pattern b + max(
f(i - 1, a),
f(i - 1, c),
f(i - 1, d),
f(i - 1, e),
f(i - 1, g),
f(i - 1, h)
)
etc.
Since p is restricted to 8 choices, the search space is O(n * 8) = O(n).
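The recurrence above can be sketched in Python as follows. This is a minimal sketch, not the answerer's own code: the name max_sum and the bitmask encoding of the eight patterns are assumptions made for illustration, and the matrix is assumed to be given as a list of 4 rows.

```python
# Row subsets of a 4-row column with no two vertically adjacent rows picked.
# Bit r set means row r is picked; this yields the 8 patterns a) to h).
PATTERNS = [m for m in range(16) if m & (m << 1) == 0]


def max_sum(matrix):
    """Return the best total for a 4 x n matrix given as a list of 4 rows."""
    n = len(matrix[0]) if matrix else 0
    # prev[q]: best total so far when the previous column used pattern q.
    # Before the first column, only the empty pattern (0) exists.
    prev = {0: 0}
    for i in range(n):
        cur = {}
        for p in PATTERNS:
            gain = sum(matrix[r][i] for r in range(4) if (p >> r) & 1)
            # A previous pattern q is compatible when it shares no row with p.
            cur[p] = gain + max(v for q, v in prev.items() if p & q == 0)
        prev = cur
    return max(prev.values())
```

Each column does a constant amount of work (8 patterns times at most 8 predecessors), so the whole pass stays linear in n.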

Related

String manipulation with dynamic programming

I have a problem where I have a string of length N, where (1 ≤ N ≤ 10^5). This string will only have lower case letters.
We have to rewrite the string so that it has a series of "streaks", where the same letter is included at least K (1 ≤ K ≤ N) times in a row.
It costs a_ij to change a single specific letter in the string from i to j. There are M different possible letters you can change each letter to.
Example: "abcde" is the input string. N = 5 (length of "abcde"), M = 5 (letters are A, B, C, D, E), and K = 2 (each letter must be repeated at least 2 times) Then we are given a M×M matrix of values a_ij, where a_ij is an integer in the range 0…1000 and a_ii = 0 for all i.
0 1 4 4 4
2 0 4 4 4
6 5 0 3 2
5 5 5 0 4
3 7 0 5 0
Here, it costs 0 to change from A to A, 1 to change from A to B, 4 to change from A to C, and so on. It costs 2 to change from B to A.
The optimal solution in this example is to change the a into b, change the d into e, and then change both e’s into c’s. This will take 1 + 4 + 0 + 0 = 5 moves, and the final combo string will be "bbccc".
It becomes complicated as it might take less time to switch from using button i to an intermediate button k and then from button k to button j rather than from i to j directly (or more generally, there may be a path of changes starting with i and ending with j that gives the best overall cost for switching from button i ultimately to button j).
To solve for this issue, I am treating the matrix as a graph, and then performing Floyd Warshall to find the fastest time to switch letters. This will take O(M^3) which is only 26^3.
My next step is to perform dynamic programming on each additional letter to find the answer. If someone could give me advice on how to do this, I would be thankful!
Here are some untested ideas. I'm not sure if this is efficient enough (or completely worked out) but it looks like 26 * 3 * 10^5. The recurrence could be converted to a table, although with higher Ks, memoisation might be more efficient because of reduced state possibilities.
Assume we've recorded 26 prefix arrays for conversion of the entire list to each of the characters using the best conversion schedule, using a path-finding method. This lets us calculate the cost of a conversion of a range in the string in O(1) time, using a function, cost.
A letter in the result can be one of three things: either it's the kth instance of character c, or it's before the kth, or it's after the kth. This leads to a general recurrence:
f(i, is_kth, c) ->
cost(i - k + 1, i, c) + A
where
A = min(
f(i - k, is_kth, c'),
f(i - k, is_after_kth, c')
) forall c'
A takes constant time since the alphabet is constant, assuming earlier calls to f have been tabled.
f(i, is_before_kth, c) ->
cost(i, i, c) + A
where
A = min(
f(i - 1, is_before_kth, c),
f(i - 1, is_kth, c'),
f(i - 1, is_after_kth, c')
) forall c'
Again A is constant time since the alphabet is constant.
f(i, is_after_kth, c) ->
cost(i, i, c) + A
where
A = min(
f(i - 1, is_after_kth, c),
f(i - 1, is_kth, c)
)
A is constant time in the latter. We would seek the best result of the recurrence applied to each character at the end of the string with either state is_kth or state is_after_kth.
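One way to make these ideas concrete is the Python sketch below. It is a simplification under stated assumptions rather than the exact recurrence above: it keeps only the "kth" and "after the kth" cases (the first k letters of a streak are paid for as one block, so no separate "before the kth" state is stored), and it assumes the string uses only the first M lowercase letters. The names min_streak_cost, pre, end and best are illustrative.

```python
def min_streak_cost(s, k, a):
    """s: input string; k: minimum streak length; a: M x M change-cost matrix."""
    m, n = len(a), len(s)
    INF = float("inf")
    # Floyd-Warshall: cheapest chain of changes from letter i to letter j.
    d = [row[:] for row in a]
    for via in range(m):
        for i in range(m):
            for j in range(m):
                if d[i][via] + d[via][j] < d[i][j]:
                    d[i][j] = d[i][via] + d[via][j]
    # pre[c][i]: cost of converting the prefix s[:i] entirely to letter c.
    pre = [[0] * (n + 1) for _ in range(m)]
    for c in range(m):
        for i, ch in enumerate(s):
            pre[c][i + 1] = pre[c][i] + d[ord(ch) - ord("a")][c]
    # end[i][c]: best cost for s[:i] when it ends in a streak of c of length >= k.
    # best[i]: minimum of end[i][c] over all c; best[0] = 0 for the empty prefix.
    end = [[INF] * m for _ in range(n + 1)]
    best = [INF] * (n + 1)
    best[0] = 0
    for i in range(1, n + 1):
        for c in range(m):
            # "after the kth": extend a streak of c that is already long enough.
            extend = end[i - 1][c] + pre[c][i] - pre[c][i - 1]
            # "the kth": the last k letters form a fresh block of c.
            start = best[i - k] + pre[c][i] - pre[c][i - k] if i >= k else INF
            end[i][c] = min(extend, start)
        best[i] = min(end[i])
    return best[n]
```

With the example from the question ("abcde", K = 2 and the 5 × 5 matrix), this returns 5, matching the cost stated there. The table has N × M entries, each filled in constant time.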

How to unequally distribute random numbers in MS Excel?

Whenever I use RANDBETWEEN(Value1,Value2), it distributes the random numbers almost equally.
How can I generate random numbers with an unequal distribution?
Example -
The above randbetween formula distributed both "Yes" & "No" equally.
And I want more of "Yes" than "No"
You can skew your randbetween values in your favour with the following:
=IF(RANDBETWEEN(1,10)>2,"YES","NO")
You can change the >2 threshold to any number between 1 and 10 to control the skew. With >2, eight of the ten equally likely values give "YES", so roughly 80% of the results are "YES".
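To see the effect of the threshold outside Excel, here is a small Python simulation of the same idea. It is only a sketch: skewed_answer is a hypothetical helper that mimics the formula, and a fixed seed is used so the run is repeatable.

```python
import random


def skewed_answer(rng, threshold=2):
    """Mimic =IF(RANDBETWEEN(1,10)>threshold,"YES","NO")."""
    # randint is inclusive on both ends, like RANDBETWEEN.
    return "YES" if rng.randint(1, 10) > threshold else "NO"


rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
draws = [skewed_answer(rng) for _ in range(10_000)]
yes_share = draws.count("YES") / len(draws)
```

With the default threshold, yes_share should land close to 0.8; raising the threshold lowers the share of "YES".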
Use inverse functions to get different distributions. The function below shows how I combined multiple inverse functions into one:
Dist = the distribution type
a, b, c = parameters of the distribution, such as minimum, mode, maximum
Prob = rand()
If you pass many random values (between 0 and 1), the results from the function will take the shape of the distribution you've selected.
Function DistInv(Dist, a, b, c, Prob) As Single
    If Dist = "Single" Then
        ' this is a single value to be used
        DistInv = a
    ElseIf Dist = "Binomial" Then
        ' a single coin flip: returns 1 when Prob <= a, otherwise 0. 'a' determines the cut off point
        If Abs(Prob) > a Then
            DistInv = 0
        Else
            DistInv = 1
        End If
    ElseIf Dist = "Random" Then
        ' uniform distribution between 0% and 100%
        DistInv = Prob
    ElseIf Dist = "Rand Between" Then
        ' uniform distribution between the given parameters
        DistInv = Prob * (b - a) + a
    ElseIf Dist = "Triangular" Then
        ' Triangular distribution with a = lowest value, b = most likely value and c = highest value
        ' rising side of the CDF: F = a1 * x ^ 2 + b1 * x + C1
        a1 = 1 / ((b - a) * (c - a))
        b1 = -2 * a / ((b - a) * (c - a))
        C1 = a ^ 2 / ((b - a) * (c - a))
        ' falling side of the CDF: F = a2 * x ^ 2 + b2 * x + C2
        a2 = -1 / ((c - b) * (c - a))
        b2 = 2 * c / ((c - b) * (c - a))
        C2 = ((c - b) * (c - a) - c ^ 2) / ((c - b) * (c - a))
        ' solve the rising side first; if the result lies past the mode, solve the falling side
        DistInv = ((-4 * a1 * C1 + 4 * a1 * Prob + b1 ^ 2) ^ (1 / 2) - b1) / (2 * a1)
        If DistInv > b Then
            DistInv = ((-4 * a2 * C2 + 4 * a2 * Prob + b2 ^ 2) ^ (1 / 2) - b2) / (2 * a2)
        End If
    ElseIf Dist = "Norm Between" Then
        ' normal distribution between the given parameters
        ' (a and b sit about 1.645 standard deviations either side of the mean, since 3.29 = 2 * 1.645)
        DistInv = WorksheetFunction.NormInv(Prob, (a + b) / 2, (b - a) / 3.29)
    ElseIf Dist = "Norm Mean Dev" Then
        ' Normal distribution with mean a and standard deviation b
        DistInv = WorksheetFunction.NormInv(Prob, a, b)
    ElseIf Dist = "Weibull" Then
        ' Weibull distribution of probability
        '
        ' inverse of Cumulative Weibull Function
        ' for a cumulative Weibull distribution F = 1- exp(-((x-c)/b)^a)
        ' where a is the shape parameter
        ' b is the scale parameter and
        ' c is the offset
        '
        ' then solving for x
        '
        ' x = c + b * (-log(1-Prob))^(1/a)
        DistInv = c + b * (-Log(1 - Prob)) ^ (1 / a)
    End If
End Function
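The underlying technique is inverse transform sampling: feed uniform random numbers through the inverse of a cumulative distribution function. As an illustration, here is a Python sketch of the triangular case only. It uses the closed-form inverse, which should be algebraically equivalent to the quadratic solution in the "Triangular" branch; the name triangular_inv is an assumption, and a < c is assumed.

```python
import math
import random


def triangular_inv(p, a, b, c):
    """Inverse CDF of a triangular distribution (a = min, b = mode, c = max)."""
    split = (b - a) / (c - a)  # value of the CDF at the mode
    if p <= split:
        # rising side: F(x) = (x - a)^2 / ((b - a) * (c - a))
        return a + math.sqrt(p * (b - a) * (c - a))
    # falling side: F(x) = 1 - (c - x)^2 / ((c - b) * (c - a))
    return c - math.sqrt((1 - p) * (c - b) * (c - a))


rng = random.Random(1)  # fixed seed, an arbitrary choice for repeatability
samples = [triangular_inv(rng.random(), 0.0, 1.0, 2.0) for _ in range(10_000)]
```

A histogram of samples should show the triangular shape, peaking near the mode b.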
To get twice as many "Yes" as "No":
=CHOOSE(RANDBETWEEN(1,3),"Yes","Yes","No")
If you want more Yes than No, make the formula in the Yes cell RANDBETWEEN(Value1,Value2)+RANDBETWEEN(Value3,Value4)
Try ROUND(RANDBETWEEN(RAND(),2),0)... there will be more values in the 1-2 interval than 0-1

How to automatically delimit a column, then copy and pasting into new row? Excel

my data set looks like this.
Column 1 - Column 2 - Column 3 - Column 4
X - Y - A,B,C - F
H - J - E,O,P - L
I want it to look like
Column 1 - Column 2 - Column 3 - Column 4
X - Y - A - F
X - Y - B - F
X - Y - C - F
H - J - E - L
H - J - O - L
H - J - P - L
My current process is very manual: I delimit column 3 by hand, paste the row three times, and then delete the columns I don't need.
Please let me know if there is a way to do this more automatically! I would prefer a formula that I can use, since I've never used VBA, but I can attempt to understand a VBA Macro as well!
Best,
How many lines do you have? Is this a constant number of columns and row?
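For reference, the reshaping described in the question can be sketched outside Excel. The Python below only illustrates the logic (split column 3 on commas and repeat the other columns once per piece); it is not an Excel formula or a VBA macro, and the name explode_rows and the list-of-lists layout are assumptions.

```python
def explode_rows(rows, col, sep=","):
    """Return one output row per delimited item found in column index col."""
    out = []
    for row in rows:
        for part in row[col].split(sep):
            # Copy the row, replacing the delimited cell with a single item.
            out.append(row[:col] + [part.strip()] + row[col + 1:])
    return out


data = [
    ["X", "Y", "A,B,C", "F"],
    ["H", "J", "E,O,P", "L"],
]
result = explode_rows(data, 2)  # column 3 has index 2
```

Because each input row is handled independently, this works for any number of rows and any number of items per cell.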

given rank of a string, find all substrings in a given string with the given rank

Suppose the rank pattern of a string is obtained by numbering its characters from 1 to k, where k is the number of distinct characters in it
(an order on the characters is provided; here we assume A < B < C ... and so on, but in general it can be any order that tells us whether char(X) <, = or > char(Y) in O(1)).
Then we assign those numbers to the corresponding indices where each character occurs.
eg-
string - C D B C B, here B - 1, C - 2, D - 3
rank - 2 3 1 2 1 creating the corresponding rank array
eg 2 -
string - D E G B C D here B - 1, C - 2, D - 3, E - 4, G - 5
rank - 3 4 5 1 2 3
(in short, we assign the smallest character to value 1, then next greater to 2, and so on )
Now, the question is this:
Given a string S of size m and a rank pattern in the form of an array P of size n, find the number of substrings of S that have the given rank pattern.
(m will be greater than n to make multiple possible solutions in a string S)
eg.-
S is A C B A D C A E
P is 1 3 2 1 4
there are two substrings in S that conform to the pattern P
here, in the substring [0, 4] i.e A C B A D
rank 1 3 2 1 4
again in the substring [3, 7] i.e A D C A E
rank 1 3 2 1 4
so, the answer is 2 here..
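To pin down the definition, here is a naive Python baseline that recomputes the rank pattern of every window of length n. It is only a sketch for checking small cases and is not meant to beat the complexities mentioned in the question; the names rank_pattern and count_matches are illustrative, and the usual character order (A < B < C ...) is assumed.

```python
def rank_pattern(chars):
    """Number the distinct characters 1..k in sorted order and map each position."""
    order = {ch: r for r, ch in enumerate(sorted(set(chars)), start=1)}
    return [order[ch] for ch in chars]


def count_matches(s, p):
    """Count the windows of s whose rank pattern equals p."""
    n = len(p)
    target = list(p)
    return sum(
        1
        for i in range(len(s) - n + 1)
        if rank_pattern(s[i:i + n]) == target
    )
```

Each window costs a sort of its distinct characters, so the total is about O(m · n log n) for a string of size m and a pattern of size n.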
I know of an O|SR| soln. and an O|S(logR + RlogR)| soln.
Can we do better ?? If so, can someone tell me how??

Is there a way to optimise this program in Haskell?

I am doing project euler question 224. And whipped up this list comprehension in Haskell:
prob39 = length [ d | d <- [1..75000000], c <- [1..37500000], b <-[1..c], a <- [1..b], a+b+c == d, a^2 + b^2 == (c^2 -1)]
I compiled it with GHC and it has been running with above average kernel priority for over an hour without returning a result. What can I do to optimise this solution? It seems I am getting better at finding brute force solutions in a naive manner. Is there anything I can do about this?
EDIT: I am also unclear about the definition of 'integral length', does this just mean the side length has a magnitude which falls in the positive set of integers, i.e: 1,2,3,4,5... ?
My Haskell isn't amazing, but I think this is going to be n^4 as written, since there are four nested generators.
It looks like you're saying for each n from 1 to 75 million, check every "barely obtuse" triangle with a perimeter less than or equal to 75 million to see if it has perimeter n.
Also I'm not certain if list comprehensions are smart enough to stop looking once the current value of c^2 -1 is greater than a^2 + b^2.
A simple refactor should be
prob39 = length [ (a, b, c) | c <- [1..37500000], b <-[1..c], a <- [1..b], a^2 + b^2 == (c^2 -1), (a + b + c) <= 75000000]
You can make it better, but that should literally be 75 million times faster.
Less certain about this refactoring, but it should also speed things up considerably:
prob39 = length [ (a, b, c) | a <- [1..25000000], b <-[a..(75000000 - 2*a)], c <- [b..(75000000 - a - b)], a^2 + b^2 == (c^2 -1)]
Syntax may not be 100% there. The idea is that a can only be 1 to 25 million (since a <= b <= c and a + b + c <= 75 million). b can only be between a and halfway from a to 75 million (since b <= c) and c can only be from b to 75 million - (a + b), otherwise the perimeter would be over 75 million.
Edit: updated code snippets, there were a couple of bugs in there.
Another quick suggestion, you can replace c <- [b..(75000000 - a - b)] with something along the lines of c <- [b..min((75000000 - a - b), sqrt(a*a + b*b) + 1)]. There's no need to bother checking any values of c greater than the ceiling of the square root of (a^2 + b^2). Can't remember if those are the correct min/sqrt function names in Haskell though.
Getting OCD on this one, I have a couple more suggestions.
1) you can set the upper bound on b to be the min of the current upper bound and a^2 * 2 + 1. This is based on the principle that (x+1)^2 - x^2 = 2x + 1. b cannot be so much larger than a that we can guarantee that (a^2) + (b^2) < (b+1)^2.
2) set the lower bound of c to be max of b + 1 and floor(sqrt(a^2 + b^2) - 1). Just like the upper limit on C, no need to test values which couldn't possibly be correct.
Along with the suggestions given by @patros, I would like to share my observations on this problem.
If we print the values of a, b and c for some perimeter, say 100000, we can observe that a and b always take even values and c always takes odd values. So if we optimize our code with these restrictions, almost half of the checks can be skipped.
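The bounding suggestions above can be taken one step further by deriving c from a and b instead of looping over it. The Python sketch below illustrates that idea for small limits only: it is still quadratic in the limit, so it would not finish for 75 million, and the name count_barely_obtuse is an assumption.

```python
from math import isqrt


def count_barely_obtuse(limit):
    """Count triples a <= b <= c with a^2 + b^2 = c^2 - 1 and a + b + c <= limit."""
    count = 0
    # a <= b <= c and a + b + c <= limit imply 3 * a <= limit.
    for a in range(1, limit // 3 + 1):
        # c > b, so a + 2 * b < limit bounds b from above.
        for b in range(a, (limit - a) // 2 + 1):
            target = a * a + b * b + 1
            c = isqrt(target)  # the only candidate value for c
            if a + b + c > limit:
                break  # c grows with b, so larger b cannot fit either
            if c * c == target:
                count += 1
    return count
```

The smallest triple found this way is (2, 2, 3), since 4 + 4 = 9 - 1, which agrees with the observation that a and b are even and c is odd.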
