Dynamic Programming: top down versus bottom up comparison

Dynamic Programming: top down versus bottom up comparison - dynamic-programming

Can you point me to some dynamic programming problem statements where bottom up is more beneficial than top down? (i.e. simple DP works more naturally but memoization would be harder to implement?)
I find recursion with memoization much easier, and want to solve problems where bottom up is a better/perhaps only feasible approach.
I understand that theoretically both are equivalent, so even something like ease of implementation would count as a benefit.

You will apply bottom up with memoization OR top down recursion with memoization depending on the problem at hand .
For example, if you have to find the minimum weight independent path of a path graph, you will use the bottom up approach as you have to solve all the subproblems that are possible.
But if you have to solve the knapsack problem , you may want to use recursive top down with memoization as you have to solve a limited number of subproblems. Approaching the knapsack problem bottom up will cause the algo to solve a lot of redundant problems that are not used in the original subproblem.

Two things to consider when deciding which algorithm to use
Time Complexity. Both approaches have the same time complexity in general, but because for loop is cheaper than recursive function calls, bottom-up can be faster if measured in machine time.
Space Complexity. (without considering extra call stack allocations during top-down) Usually both approaches need to build a table for all sub-solutions, but bottom-up is following a topological order, its cost of auxiliary space can be sometimes reduced to the size of problem's immediate dependencies. For example: fibonacci(n) = fibonacci(n-1) + fibonacci(n-2), we only need to store the past two calculations
That being said, bottom-up is not always the best choice, I will try to illustrate with examples:
(mentioned by #Nikunj Banka) top-down only solves sub-problems used by your solution whereas bottom-up might waste time on redundant sub-problems. A silly example would be 0-1 knapsack with 1 item...run time difference is O(1) vs O(weight)
you might need to perform extra work to get topological order for bottm-up. In Longest Increasing Path in Matrix, if we want to do sub-problems after their dependencies, we would have to sort all entries of the matrix in descending order, that's extra nmlog(nm) pre-processing time before DP

Related

Running time - Dynamic programming algorithm

Any dynamic programming algorithm that solves N sub problems in the process of computing it's final answer must run in Ω(N) time.
Is this statement true? I am thinking that it is indeed true as i need to compute every sub problem. Please let me know if i am wrong

The short answer is no. Dynamic programming is more of a strategy to boost up performance/shorten runtime complexity than an actual algorithm. Without knowing the actual algorithm for a specific problem, it's not possible to say anything about time complexity.
The idea of DP is to use memoization(by consuming some space) to speed up exisiting algorithm. Moreover, every algorithm that you can apply DP may speed up in different ways. Without re-computing the same subtask multiple time, you will have to store intermediate results in another data structure. If the result is needed again in your data strucutre, you will directly return intermediate results you've stored
With that being said, the time complexity of DP problems is the number of unique states/subproblems * time taken per state.
Here's one example when DP solves N sub problems and computation is not Ω(N).
let's assume your DP requires O(n) subproblems and evaluating each subproblem costs an O(logn) binary search plus constant time operations.
Then the overall algorithm would take O(n*logn).

Complexities of the following -insertion sort, selection sort ,merge sort , radix sort and explain which one is best sorting algorithm and why?

Complexities of the following -insertion sort, selection sort ,merge sort , radix sort and explain which one is best sorting algorithm and why?

I don't believe in the 'best' sorting algorithm. It depends on what you want to do. For instance, bubble sort is really easy to implement and would be the best if you just want a quick and dirty way of sorting a short array. On the other hand, for larger arrays, the time complexity will really come into play and you will notice considerable runtime difference. If you really value memory, then you probably want to evaluate space complexities of these.
So the sort answer is: IMHO, there's no best sorting algorithm. I'll leave the following table for you to evaluate for yourself what you want to use.
Sorting AlgorithmAvg Time ComplextitySpace Complexity
Quicksort O(nlog(n)) O(log(n))
Mergesort O(nlog(n)) O(n)
Insertionsort O(n^2) O(1)
Selectionsort O(n^2) O(1)
Radixsort O(nk) O(n+k)

Do you need to sort inputs for dynamic programming knapsack

In every single example I've found for a 1/0 Knapsack problem using dynamic programming where the items have weights(costs) and profits, it never explicitly says to sort the items list, but in all the examples they are sorted by increasing both weight and profit (higher weights have higher profits in the examples). So my question is when adding items in the matrix from the item array/list, can I add them in any order, or do I add the one with the smallest weight or profit? Because from multiple examples I found I'm not sure if its just a coincidence or you do in fact need to put the smallest weight/profit into the matrix each time

The dynamic programming solution is nothing but choosing all the possibilities (using brute force) in an efficient way (just by saving the values for future reference).
Note: We consider all the subsets. Even if the list is sorted or not the total number of subsets will be the same. So in the end all the subsets will get considered.

No, you don't need to sort the weights because every row gives the maximum possible value under the weight limit of that row. The maximum will come in the last column of that row.

Maybe you are looking at bottom-up Dynamic solutions. And there is one characteristic in Dynamic solution when you solve using the bottom-up method.
The second approach is the bottom-up method. This approach typically depends
on some natural notion of the “size” of a subproblem, such that solving any particular
subproblem depends only on solving “smaller” subproblems. We sort the
subproblems by size and solve them in size order, smallest first. When solving a
particular subproblem, we have already solved all of the smaller subproblems its
solution depends upon, and we have saved their solutions. We solve each subproblem
only once, and when we first see it, we have already solved all of its
prerequisite subproblems.
From: Introduction to Algorithm, CORMEN (3rd edition)

The "smaller problem" in this case is just smaller in terms of the number of available items to choose, not about the profits or weights of these items. If given a 3 item list, the sub-problems will be 2 items, and 1 item to choose from.
The smallest problem is first hardcoded (base case), then at each stage of moving from a smaller to bigger problem, the best profit is enumerated, and the max is chosen. At the end, all 2^n combinations would have been considered and the various stages of repeated max will bubble up the largest solution.
Changing the input order, or putting dominated items into the input (like higher weight and lower profit) may just change which argument of max() won at each stage, but the final max result will come from the same selection of items, albeit being selected at different stages in the algorithm for different sort orders or input characteristics.

The answer could be found by some random shuffle experiments.
which I found is: better in ascending order. correct me if i'm wrong.
gist: https://gist.github.com/whille/39cf7bf8cf5dcf6ac933063735ae54de
Problem described in "Algorithm Design", ISBN: 9780321295354, chapter 6.4.
Two methods could be used:
as the chapter used, a M cache to pre-calculated, in which not sub answer is needed.
recursive function, which is simple to understand and test. I found functools.cache of python3.5+ could be use to check how many sub caculation is need, as my gist show: ascending order of test_random() is the smallest in currsize, So it's most efficient, and could extend to float value.
results for 10 random weights(1~100) and 200 knapack:
[(13.527716157276256, 18.371888775465692), (16.18632175987168, 206.88043031085252), (20.14117982372607, 81.52793937986635), (33.28606671929836, 298.8676699147799), (49.12968642850187, 22.037638580809592), (55.279973594800225, 377.3715225559507), (56.56103181962746, 460.9161412820592), (60.38456825749498, 10.721915577913244), (67.98836121062645, 63.47478755362385), (86.49436333909377, 208.06767811169286)]: reverse: False
CacheInfo(hits=0, misses=832, maxsize=None, currsize=832)
[(86.49436333909377, 208.06767811169286), (67.98836121062645, 63.47478755362385), (60.38456825749498, 10.721915577913244), (56.56103181962746, 460.9161412820592), (55.279973594800225, 377.3715225559507), (49.12968642850187, 22.037638580809592), (33.28606671929836, 298.8676699147799), (20.14117982372607, 81.52793937986635), (16.18632175987168, 206.88043031085252), (13.527716157276256, 18.371888775465692)]: reverse: True
CacheInfo(hits=0, misses=1120, maxsize=None, currsize=1120)
Note for method2:
if capacity is much larger, all random orders have equal currsize.
callstack overflow should be avoided for large N, so recurse method should be transformed. Generally a two step method could be used, first map subproblem dependency, and calc. I'ill try later in gist.

I guess, sorting might be required in certain types of knapsack problem. For example consider the problem "Maximum Earnings From Taxi". Here, the input has to be sorted by the starting point of the riders, else we won't get the optimal result.
For example, consider the below input for the above problem:-
9
[[2,3,1],[2,9,2], [3,6,7],[2,3,6]]
If you apply a typical knapsack implementation in recursive approach, without sorting the input we won't get the optimal solution.

Does the most efficient solution to some problems require mutable data?

I've been dabbling in Haskell - so still very much a beginner.
I'm been thinking about the counting the frequency of items in a list. In languages with mutable data structures, this is typically solved using a hash table - a dict in Python or a HashMap in Java for example. The complexity of such a solution is O(n) - assuming the hash table can fit entirely in memory.
In Haskell, there seem to be two (mainstream) choices - to sort the data then group and count it or use a Data.Map. If a sort is used, it dominates the run-time of the solution, so the complexity is O(n log n). Likewise, Data.Map uses a balanced tree, so inserting n elements into it will also have complexity O(n log n).
If my analysis is correct, then I assume that this particular problem is most efficiently solved by resorting to a mutable data structure. Are there other types of problems where this is also true? How in general do people using Haskell approach something like this?

The question whether we can implement any algorithm with optimal complexity in a pure language is currently unknown. Nicholas Pippenger has proven that there is a problem that must necessarily have a log(n) penalty in a pure strict language compared to the optimal algorithm. However, there is a followup paper which shows that this problem have an optimal solution in a lazy language. So at the end of the day we really don't know. Though it seems that most people think that there is an inherent log(n) penalty for some problems, even for lazy languages.

Why mergesort is not Dynamic programming

I have read these words:
There are two key attributes that a problem must have in order for dynamic programming to be applicable: optimal substructure and overlapping subproblems. If a problem can be solved by combining optimal solutions to non-overlapping subproblems, the strategy is called "divide and conquer". This is why mergesort and quicksort are not classified as dynamic programming problems.
I have the 3 questions:
Why mergesort and quicksort is not Dynamic programming？
I think mergesort also can be divided small problems and small problems then do the same thing and so on.
Is Dijkstra Algorithm using dynamic algorithm?
Are there applied examples of using Dynamic programming?

The key words here are "overlapping subproblems" and "optimal substructure". When you execute quicksort or mergesort, you are recursively breaking down your array into smaller pieces that do not overlap. You never operate over the same elements of the original array twice during any given level of the recursion. This means there is no opportunity to re-use previous calculations. On the other hand, many problems DO involve performing the same calculations over overlapping subsets, and have the useful characteristic that an optimal solution to a subproblem can be re-used when computing the optimal solution to a larger problem.
Dijkstra's algorithm is a classic example of dynamic programming, as it re-uses prior computations to discover the shortest path between two nodes A and Z. Say that A's immediate neighbors are B and C. We can find the shortest path from A to Z by summing the distance between A and B with our computed shortest path from B to Z; and do similarly for finding the shortest path from C to Z. Then the shortest path from A to Z will be the shorter of these two paths. The key insight here is that we can re-use the shortest path computations for paths of length 2 when computing the shortest paths of length 3, and so on. Doing so results in a much more efficient algorithm.
Dynamic programming can be used to solve many types of problems -- see http://en.wikipedia.org/wiki/Dynamic_programming#Examples:_Computer_algorithms for some examples.

For dynamic programming to be applicable to a problem, there should be
i. An optimal structure in the subproblems:
This means that when you break down your problem into smaller units, those smaller units also need to be broken down into yet smaller units for an optimal solution. For example, in merge sort, a array of numbers can get sorted if we divide it into two subarrays, get them sorted and combine them. While sorting these two subarrays, repeat the same process you followed in the previous sentence. So an optimal solution (a sorted array) is got when we find an optimal solution to its subproblems (we sort the subarrays and combine them). This requirement is fulfilled for merge sort. Also the subproblems must be independent for them to follow an optimal structure. This is also fulfilled by merge sort as the subproblems' solutions do not get affected by each others' solutions. For example, the solutions to the two parts of an array are not affected by each other's sortedness.
ii. Overlapping subproblems:
This means that while solving for the solution, the subproblems you formulate get repeated, and hence need only be solved once. In the case of merge sort, this requirement will be met only rarely in the normal case. An array of numbers like 2 1 3 4 9 4 2 1 3 1 9 4 may be a good candidate for overlapping subproblems for merge sort. In this case, the solution to the subproblem sort(2 1 3) can be stored in a table to be reused, because it will be called twice during the computation. But as you can see, there is a very slim chance that a random array of numbers will have this kind of a repeated contrivance. So it would only be inefficient if we used a dynamic programming technique like memoization for an algorithm like merge sort.
Yes. Dijkstra's algorithm uses dynamic programming as mentioned by #Alan in the comment. link
Yes. If I may quote Wikipedia here,
"Dynamic programming is widely used in bioinformatics for the tasks such as sequence alignment, protein folding, RNA structure prediction and protein-DNA binding." 1
1 https://en.wikipedia.org/wiki/Dynamic_programming

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string