Is optimal substructure always means overlapping problems?

Is optimal substructure always means overlapping problems? - dynamic-programming

So I'm currently studying about "Dynamic programming", one thing I'm wondering is that "Is optimal substructure always means overlapping problems?" Can you guys show me an optimization problem that don't have overlapping problems but still have optimal substructure?

Related

Sequence of Smaller Sub-Problems is Known to be Which Design Approach?

To solve a problem, it is broken in to a sequence of smaller sub-problems, till a stage that the sub-problem can be easily solved. What is this design approach called?
Is It a,
Top-Down approach,
Bottom-up approach,
Procedural Programming,
Dynamic programming,
Divide & Conquer
i'm Confused as the Question is asking for "Breaking into Sequence of Smaller Sub-Problems"
Procedural programming break down a programming task into Modules.
Dynamic Programming breaks problem into subproblems.
Divide & Conquer also means the same.
Top down approach is also a stepwise approach.
What would be the Exact Answer for this ?

this design paradigm is called divide and conquer

If the term sequence is to be taken seriously, then I would include pipeline processing in the description of the design approach.

What is the difference between dynamic programming and branch and bound?

I only know that by branch and bound, one can REDUCE the procedure to obtain a solution, but that only helps for problems which have a solution space tree.

Dynamic Programming
Dynamic programming (usually referred to as DP ) is a very powerful technique to solve a particular class of problems. It demands very elegant formulation of the approach and simple thinking and the coding part is very easy. The idea is very simple, If you have solved a problem with the given input, then save the result for future reference, so as to avoid solving the same problem again. Shortly 'Remember your Past' .
If the given problem can be broken up in to smaller sub-problems and these smaller subproblems are in turn divided in to still-smaller ones, and in this process, if you observe some over-lappping subproblems, then its a big hint for DP. Also, the optimal solutions to the subproblems contribute to the optimal solution of the given problem ( referred to as the Optimal Substructure Property ).
There are two ways of doing this.
1.) Top-Down : Start solving the given problem by breaking it down. If you see that the problem has been solved already, then just return the saved answer. If it has not been solved, solve it and save the answer. This is usually easy to think of and very intuitive. This is referred to as Memoization.
2.) Bottom-Up : Analyze the problem and see the order in which the sub-problems are solved and start solving from the trivial subproblem, up towards the given problem. In this process, it is guaranteed that the subproblems are solved before solving the problem. This is referred to as Dynamic Programming.
Branch And Bound
A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root.
The algorithm explores branches of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm.
For more Information About Branch and Bound Please refer this link :
http://www.cs.umsl.edu/~sanjiv/classes/cs5130/lectures/bb.pdf

Dynamic programming requires a recursive structure (a.k.a., optimal substructure in CRLS). That is, at a given state, one can characterize the optimal decision based on partial solutions.
Branch and bound is a more general and is used to solve more difficul problems via implicit enumerations of the solution space.

Suffix tries vs dynamic programming for string algorithms

It seems that many difficult string algorithms can be solved both using suffix tries(trees) and Dynamic Programming.
But I am not sure which approach is best to use and when.
Additionally which approach is better to master on the specific area of algorithms and have it in your arsenal in the area of job interviews? I assume it would be the one that would be used more frequently by a programmer in any task or something like that?
This is more of which algorithmic technique is more useful to master as most frequent to use in your job than simply comparing asymptotic notations

Think of a problem requiring the Lexicographically nth substring of a given string : A suffix array is just what you need...and it is easy to learn the bare essentials for solving most problems involving suffix arrays..
On the other hand DP is an algorithmic technique..MASTER IT and you will be able to solve a HUGE number of problems..not only strings.
For an interview though i will take DP anyday...for interviewers, a DP problem lets them make it knotty that is almost impossible to solve without DP (within given constraints) but the solution would mean that you give them a basic recursion and how DP helps you solve it.If it were a suffix-array-only-problem that would mean that they are assessing you over a single data structure( easy once learned) rather than an more general technique which requires mastery.
PS: I had put off learning DP until recently when i got fed up trying to solve problems (that require DP ) using any advanced data structures and would invariably fail ( Case in point : UVA 1394 -- simple problem now that i know how to solve it using DP but instead went on to study segment trees and achieved a O(nlgn) whereas DP gave me O(n). So final advice : if one hasn't studied DP drop everything else and go for it.

honestly, for job interview, no suffix tree is needed. that's too difficult and beyond the scope. however, DP is widely used in interviews for some famous companies like google and facebook.
suffix tree has limitation for solving problems compared with DP. usually it is used to solve string related problems. but DP can solve many different areas.

What's the opposite of "embarrassingly parallel"?

According to Wikipedia, an "embarrassingly parallel" problem is one for which little or no effort is required to separate the problem into a number of parallel tasks. Raytracing is often cited as an example because each ray can, in principle, be processed in parallel.
Obviously, some problems are much harder to parallelize. Some may even be impossible. I'm wondering what terms are used and what the standard examples are for these harder cases.
Can I propose "Annoyingly Sequential" as a possible name?

Inherently sequential.
Example: The number of women will not reduce the length of pregnancy.

There's more than one opposite of an "embarrassingly parallel" problem.
Perfectly sequential
One opposite is a non-parallelizable problem, that is, a problem for which no speedup may be achieved by utilizing more than one processor. Several suggestions were already posted, but I'd propose yet another name: a perfectly sequential problem.
Examples: I/O-bound problems, "calculate f1000000(x0)" type of problems, calculating certain cryptographic hash functions.
Communication-intensive
Another opposite is a parallelizable problem which requires a lot of parallel communication (a communication-intensive problem). An implementation of such a problem will scale properly only on a supercomputer with high-bandwidth, low-latency interconnect. Contrast this with embarrassingly parallel problems, implementations of which run efficiently even on systems with very poor interconnect (e.g. farms).
Notable example of a communication-intensive problem: solving A x = b where A is a large, dense matrix. As a matter of fact, an implementation of the problem is used to compile the TOP500 ranking. It's a good benchmark, as it emphasizes both the computational power of individual CPUs and the quality of interconnect (due to intensity of communication).
In more practical terms, any mathematical model which solves a system of partial differential equations on a regular grid using discrete time stepping (think: weather forecasting, in silico crash tests), is parallelizable by domain decomposition. That means, each CPU takes care of a part of the grid, and at the end of each time step the CPUs exchange their results on region boundaries with "neighbour" CPUs. These exchanges render this class of problems communication-intensive.

Im having a hard time to not post this... cause I know it don't add anything to the discussion.. but for all southpark fans out there
"Super serial!"

"Stubbornly serial"?

The opposite of embarassingly parallel is Amdahl's Law, which says that some tasks cannot be parallel, and that the minimum time a perfectly parallel task will require is dictated by the purely sequential portion of that task.

"standard examples" of sequential processes:
making a baby: “Crash programs fail because they are based on theory that, with nine women pregnant, you can get a baby a month.” -- attributed to Werner von Braun
calculating pi, e, sqrt(2), and other irrational numbers to millions of digits: most algorithms sequential
navigation: to get from point A to point Z, you must first go through some intermediate points B, C, D, etc.
Newton's method: you need each approximation in order to calculate the next, better approximation
challenge-response authentication
key strengthening
hash chain
Hashcash

P-complete (but that's not known for sure yet).

I use "Humiliatingly Sequential"
Paul

If ever one should speculate what it would be like to have natural, incorrigibly sequential problems, try
blissfully sequential
to counter 'embarrassingly parallel'.

"Gladdengly Sequential"

It all has to do with data dependencies. Embarrassingly parallel problems are ones for which the solution is made up of many independent parts. Problems with the opposite of this nature would be ones that have massive data dependencies, where there is little to nothing that can be done in parallel. Degeneratively dependent?

The term I've heard most often is "tightly-coupled", in that each process must interact and communicate often in order to share intermediate data. Basically, each process depends on others to complete their computation.
For example, matrix processing often involves sharing boundary values at the edges of each array partition.
This is in contrast to embarassingly parallel (or loosely-coupled) problems where each part of the problem is completely self-contained, and no (or very little) IPC is needed. Think master/worker parallelism.

Boastfully sequential.

I've always preferred 'sadly sequential' ala the partition step in quicksort.

"Completely serial?"
It shouldn't really surprise you that scientists think more about what can be done than what cannot be done. Especially in this case, where the alternative to parallelizing is doing everything as one normally would.

Completely non-parallelizable?
Pessimally parallelizable?

The opposite is "disconcertingly serial".

taking into acount that parallelism is the act of doing many jobs in the same time step t. the opposite could be time-sequential problems

An example inherently sequential problem.
This is common in CAD packages and some kinds of engineering analysis.
Tree traversal with data dependencies between nodes.
Imagine traversing a graph and adding up weights of nodes.
You just can't parallelise it.
CAD software represents parts as a tree, and to render to object you have to traverse the tree.
For this reason, cad workstations use less cores and faster, rather than many cores.
Thanks for reading.

You could of course, however I think that both 'names' are a non-issue.
From a functional programming perspective you could say that the 'annoyingly sequential' part is the smallest more or less independent part of an algorithm.
While the 'embarrassingly parallel' if not indeed taking into a parallel approach is bad coding practice.
Thus I don't see a point in given these things a name if good coding practice is always to brake up your solution into independent pieces, even if you at that moment don't take advantage of parallelism.

canonical problems list

Does anyone known of a a good reference for canonical CS problems?
I'm thinking of things like "the sorting problem", "the bin packing problem", "the travailing salesman problem" and what not.
edit: websites preferred

You can probably find the best in an algorithms textbook like Introduction to Algorithms. Though I've never read that particular book, it's quite renowned for being thorough and would probably contain most of the problems you're likely to encounter.

"Computers and Intractability: A guide to the theory of NP-Completeness" by Garey and Johnson is a great reference for this sort of thing, although the "solved" problems (in P) are obviously not given much attention in the book.
I'm not aware of any good on-line resources, but Karp's seminal paper Reducibility among Combinatorial Problems (1972) on reductions and complexity is probably the "canonical" reference for Hard Problems.

Have you looked at Wikipedia's Category:Computational problems and Category:NP Complete Problems pages? It's probably not complete, but they look like good starting points. Wikipedia seems to do pretty well in CS topics.

I don't think you'll find the answers to all those problems in only one book. I've never seen any decent, comprehensive website on algorithms, so I'd recommend you to stick to the books. That said, you can always get some introductory material on canonical algorithm texts (there are always three I usually recommend: CLRS, Manber, Aho, Hopcroft and Ullman (this one is a bit out of date in some key topics, but it's so formal and well-written that it's a must-read). All of them contain important combinatorial problems that are, in some sense, canonical problems in computer science. After learning some fundamentals in graph theory you'll be able to move to Network Flows and Linear Programming. These comprise a set of techniques that will ultimately solve most problems you'll encounter (linear programming with the variables restricted to integer values is NP-hard). Network flows deals with problems defined on graphs (with weighted/capacitated edges) with very interesting applications in fields that seemingly have no relationship to graph theory whatsoever. THE textbook on this is Ahuja, Magnanti and Orlin's. Linear programming is some kind of superset of network flows, and deals with optimizing a linear function on variables subject to restrictions in the form of a linear system of equations. A book that emphasizes the relationship to network flows is Bazaraa's. Then you can move on to integer programming, a very valuable tool that presents many natural techniques for modelling problems like bin packing, task scheduling, the knapsack problem, and so on. A good reference would be L. Wolsey's book.

You definitely want to look at NIST's Dictionary of Algorithms and Data Structures. It's got the traveling salesman problem, the Byzantine generals problem, the dining philosophers' problem, the knapsack problem (= your "bin packing problem", I think), the cutting stock problem, the eight queens problem, the knight's tour problem, the busy beaver problem, the halting problem, etc. etc.
It doesn't have the firing squad synchronization problem (I'm surprised about that omission) or the Jeep problem (more logistics than computer science).
Interestingly enough there's a blog on codinghorror.com which talks about some of these in puzzle form. (I can't remember whether I've read Smullyan's book cited in the blog, but he is a good compiler of puzzles & philosophical musings. Martin Gardner and Douglas Hofstadter and H.E. Dudeney are others.)
Also maybe check out the Stony Brook Algorithm Repository.
(Or look up "combinatorial problems" on google, or search for "problem" in Wolfram Mathworld or look at Hilbert's problems, but in all these links many of them are more pure-mathematics than computer science.)

#rcreswick those sound like good references but fall a bit shy of what I'm thinking of. (However, for all I know, it's the best there is)
I'm going to not mark anything as accepted in hopes people might find a better reference.
Meanwhile, I'm going to list a few problems here, fell free to add more
The sorting problem Find an order for a set that is monotonic in a given way
The bin packing problem partition a set into a minimum number of sets where each subset is "smaller" than some limit
The travailing salesman problem Find a Hamiltonian cycle in a weighted graph with the minimum total weight

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string