Optimal substructure in Dynamic Programming

I have been trying to understand Dynamic Programming, and what I understand is that there are two parts to DP:
Optimal substructures
Overlapping subproblems
I understand the second one, but I am not able to understand the first one.

Optimal substructure means that any optimal solution to a problem of size n is based on an optimal solution to the same problem when considering n' < n elements.
That means, when building your solution for a problem of size n, you split the problem into smaller problems, one of them of size n'. Now, you need only consider the optimal solution to n', and not all possible solutions to it, based on the optimal substructure property.
An example is the knapsack problem:
D(i,k) = max { D(i-1,k), D(i-1,k-weight(i)) + value(i) }
The optimal substructure assumption here is that D(i,k) needs to consider only optimal solutions to D(i-1,k) and D(i-1,k-weight(i)); non-optimal solutions to the subproblems never have to be examined.
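For concreteness, here is a minimal bottom-up Python sketch of that recurrence (assuming integer weights and the value-maximising 0/1 knapsack; the function name is mine):

    def knapsack(weights, values, capacity):
        # D[i][k]: best value achievable using the first i items with capacity k
        n = len(weights)
        D = [[0] * (capacity + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for k in range(capacity + 1):
                D[i][k] = D[i - 1][k]              # skip item i
                if weights[i - 1] <= k:            # or take it
                    D[i][k] = max(D[i][k],
                                  D[i - 1][k - weights[i - 1]] + values[i - 1])
        return D[n][capacity]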
An example where this does not hold is the Vertex Cover problem.
If you have a graph G=(V,E), assume you have an optimal solution to a subgraph G'=(V', E ∩ (V'×V')) such that V' ⊆ V: the optimal solution for G does not have to contain the optimal solution for G'.

Another good example is the difference between finding a shortest simple path between every pair of vertices in a graph, and finding a longest simple path between each of these pairs. ("Simple" means that no vertex on a path can be visited twice; if we don't put this constraint in for the "longest" version of the problem, then we can get infinitely long paths whenever the graph contains a cycle.)
The Floyd-Warshall algorithm can compute the answer to the first problem efficiently by exploiting the fact that, if a path from u to v is shortest-possible, then for any vertex x on this path, it must be that the subpath from u to x, and the subpath from x to v, are also shortest-possible. (Suppose to the contrary that there was a vertex x on the "shortest possible" path from u to v such that the subpath from u to x was not shortest-possible: then it's possible to find some other, shorter path from u to x -- and this can also be used to make the overall path from u to v shorter by the same amount, so the original u-to-v path could not have been shortest-possible after all.) That means that when looking for the shortest u-to-v path, the algorithm only needs to consider building it out of shortest-possible (that is, optimal) subpaths between other pairs of vertices -- not out of the much larger number of all such subpaths.
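For concreteness, here is a minimal Floyd-Warshall sketch in Python (the dense-matrix input format is an assumption of this example):

    def floyd_warshall(dist):
        # dist[u][v]: weight of edge u->v, float('inf') if absent, 0 when u == v;
        # the matrix is updated in place to hold shortest-path lengths.
        n = len(dist)
        for x in range(n):          # allow x as an intermediate vertex
            for u in range(n):
                for v in range(n):
                    # optimal substructure: a shortest u-to-v path through x
                    # is a shortest u-to-x path plus a shortest x-to-v path
                    if dist[u][x] + dist[x][v] < dist[u][v]:
                        dist[u][v] = dist[u][x] + dist[x][v]
        return dist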
In contrast, consider the problem of determining the longest simple path between any two vertices in a graph. Is it likewise true that, if the longest path from u to v goes through some vertex x, then the subpaths from u to x, and from x to v, are necessarily also longest-possible? Unfortunately not: It may well be that the longest path from u to x uses some vertices in its interior that are also needed by the longest path from x to v, meaning that we can't simply glue these two paths together to get a longest simple path from u to v.
As a general rule, we can always "get around" this problem by choosing to use a sufficiently detailed definition of the subproblem to be solved: In this case, instead of asking for the longest path between two given vertices u and v, we can ask for the longest path between two given vertices u and v which uses only vertices from a given set S. Where previously we could build a function shortest(u, v) that takes two parameters, we must now build a function longest(u, v, S) that takes three; the overall longest path between 2 vertices u and v could then be computed using longest(u, v, V), where V is the entire vertex set of the graph. With this new definition, it's now once again possible to produce optimal solutions by combining only optimal solutions to subproblems, because we can ensure that we only try gluing together paths that result from subproblems whose S sets are disjoint. We can now correctly determine the longest path from u to v that uses only vertices in S, namely longest(u, v, S), by calculating the maximum, over all vertices x in S, and all ways of partitioning S-{x} into two subsets A and B, of longest(u, x, A) + longest(x, v, B).
Unfortunately, there are now an exponential number of subproblems to be solved, because a set of n vertices can be partitioned in 2^(n-1) different ways. (The algorithm just described is not the most efficient possible DP for this problem, but even the most efficient known DP still has this exponential factor in its running time.) The challenge in designing a DP algorithm is always to find a way to define subproblems that results in few enough different subproblems (ideally, only polynomially many) while still maintaining the two properties of overlapping subproblems and optimal substructure.
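To make the refined subproblem definition concrete, here is a brute-force Python sketch of longest(u, v, S). One detail the prose leaves open is whether S contains the endpoints; the sketch treats S as the set of allowed intermediate vertices (excluding u and v), which guarantees the glued path stays simple:

    from functools import lru_cache
    from itertools import combinations

    def longest_simple_path(graph, u, v, allowed):
        # graph maps (a, b) -> weight of the directed edge a->b.
        # Returns the length of the longest simple u-to-v path whose interior
        # vertices all lie in `allowed` (which must not contain u or v),
        # or -inf if no such path exists.
        @lru_cache(maxsize=None)
        def rec(u, v, S):
            best = graph.get((u, v), float('-inf'))   # the direct edge, if any
            for x in S:
                rest = sorted(S - {x})
                # try every way of splitting the remaining allowed vertices
                # between the u-to-x and x-to-v subpaths
                for r in range(len(rest) + 1):
                    for A in combinations(rest, r):
                        A = frozenset(A)
                        best = max(best, rec(u, x, A) + rec(x, v, S - {x} - A))
            return best
        return rec(u, v, frozenset(allowed) - {u, v})

As noted above, the number of distinct (u, v, S) subproblems is exponential, so this sketch is only usable on very small graphs.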

In simple words: the principle of optimality states that, while solving an optimization problem, one has to solve sub-problems, and the solutions of those sub-problems become part of the solution to the overall problem. If a problem can be solved from optimal solutions to its sub-problems, it has optimal substructure.
Example: say that in a graph the source vertex is s and the destination is d.
We have to find shortest(s,d).
The graph (its drawing is not reproduced here) has vertices s, a, b, c, d, e, f, g, h, i, with edge lengths including:
length(s,a)=14
length(s,b)=10
length(s,c)=1
length(s,d)=6
length(c,b)=1
Note : No direct edge for (s,e) or (s,f).
While thinking about an algorithm for this, suppose we write a priority-queue structure that always expands the vertex with the least total path length.
We assign each vertex a path length from the source vertex,
and we keep assigning a new path length to an adjacent vertex whenever the new path length is less than the existing one.
Example: Len(s,b) > Len(s,c) + Len(c,b), so we reset len(s,b) = 2.
The nodes adjacent to s build up minimal path lengths irrespective of the destination node, because they form the substructure that leads to the overall solution.
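A minimal Python sketch of this priority-queue idea (this is essentially Dijkstra's algorithm; the adjacency-list format is an assumption of the example):

    import heapq

    def dijkstra(adj, s):
        # adj maps each vertex to a list of (neighbour, edge_length) pairs
        dist = {s: 0}
        pq = [(0, s)]
        while pq:
            d, u = heapq.heappop(pq)
            if d > dist.get(u, float('inf')):
                continue                    # stale queue entry, skip it
            for v, w in adj[u]:
                if d + w < dist.get(v, float('inf')):
                    dist[v] = d + w         # shorter path found: reset it
                    heapq.heappush(pq, (dist[v], v))
        return dist

On the edges above, dijkstra({'s': [('a', 14), ('b', 10), ('c', 1)], 'a': [], 'b': [], 'c': [('b', 1)]}, 's')['b'] returns 2, found via c exactly as in the reset step.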

Related

Quickest way to find closest set of points

I have three arrays of points:
A=[[5,2],[1,0],[5,1]]
B=[[3,3],[5,3],[1,1]]
C=[[4,2],[9,0],[0,0]]
I need the most efficient way to find the three points (one for each array) that are closest to each other (within one pixel in each axis).
What I'm doing right now is taking one point as reference, let's say A[0], and cycling through all other B and C points looking for a solution. If A[0] gives me no result I move the reference to A[1] and so on. This approach has a huge problem: if I increase the number of points per array and/or the number of arrays, it sometimes takes too much time to converge, especially if the solution lies in the last members of the arrays. So I'm wondering if there is any way to do this without a reference point, or any way quicker than just looping over all the elements.
The rules that I must follow are the following:
the final solution has to be made by only one element from each array like: S=[A[n],B[m],C[j]]
each selected element has to be within 1 pixel in X and Y from ALL the other members of the solution (so |Xi-Xj|<=1 and |Yi-Yj|<=1 for every pair of members of the solution).
For example, in this simplified case the solution would be: S=[A[1],B[2],C[1]] (using 1-based indices, i.e. the points [5,2], [5,3] and [4,2]).
To clarify the problem further: what I wrote above is just a simplified example to explain what I need. In my real case I don't know a priori the length of the lists nor the number of lists I have to work with; it could be A,B,C, or A,B,C,D,E... (each with a different number of points), etc. So I also need to find a way to make it as general as possible.
This requirement:
each selected element has to be within 1 pixel in X and Y from ALL the other members of the solution (so |Xi-Xj|<=1 and |Yi-Yj|<=1 for every pair of members of the solution).
massively simplifies the problem, because it means that for any given (xi, yi), there are only nine possible choices of (xj, yj).
So I think the best approach is as follows:
Copy B and C into sets of tuples.
Iterate over A. For each point (xi, yi):
    Iterate over the values of x from xi−1 to xi+1 and the values of y from yi−1 to yi+1. For each resulting point (xj, yj):
        Check if (xj, yj) is in B. If so:
            Iterate over the values of x from max(xi, xj)−1 to min(xi, xj)+1 and the values of y from max(yi, yj)−1 to min(yi, yj)+1. For each resulting point (xk, yk):
                Check if (xk, yk) is in C. If so, we're done!
If we get to the end without having a match, that means there isn't one.
This requires roughly O(len(A) + len(B) + len(C)) time and O(len(B) + len(C)) extra space.
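A direct Python transcription of these steps (a sketch; the function name is mine, and integer pixel coordinates are assumed):

    def find_close_triple(A, B, C):
        # Returns one point from each list, all pairwise within 1 pixel
        # on each axis, or None if no such triple exists.
        bset = {tuple(p) for p in B}
        cset = {tuple(p) for p in C}
        for xi, yi in A:
            for xj in range(xi - 1, xi + 2):
                for yj in range(yi - 1, yi + 2):
                    if (xj, yj) not in bset:
                        continue
                    for xk in range(max(xi, xj) - 1, min(xi, xj) + 2):
                        for yk in range(max(yi, yj) - 1, min(yi, yj) + 2):
                            if (xk, yk) in cset:
                                return (xi, yi), (xj, yj), (xk, yk)
        return None

On the arrays from the question this returns ((5, 2), (5, 3), (4, 2)), matching the triple given there; note that more than one valid triple may exist.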
Edited to add (due to a follow-up question in the comments): if you have N lists instead of just 3, then instead of nesting N loops deep (which gives time exponential in N), you'll want to do something more like this:
Copy B, C, etc., into sets of tuples, as above.
Iterate over A. For each point (xi, yi):
    Create a set containing (xi, yi) and its eight neighbors.
    For each of the lists B, C, etc.:
        For each element in the set of nine points, see if it's in the current list.
        Update the set to remove any points that aren't in the current list and don't have any neighbors in the current list.
    If the set still has at least one element, then — great, each list contained a point that's within one pixel of that element (with all of those points also being within one pixel of each other). So, we're done!
If we get to the end without having a match, that means there isn't one.
which is much more complicated to implement, but is linear in N instead of exponential in N.
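A sketch of this generalised version (again with a hypothetical helper name; for brevity it reports whether a solution exists rather than reconstructing the chosen points):

    def has_close_tuple(lists):
        # lists[0] plays the role of A; the rest are turned into sets.
        sets = [{tuple(p) for p in lst} for lst in lists[1:]]
        for x, y in lists[0]:
            # (x, y) and its eight neighbours
            cands = {(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)}
            for s in sets:
                # keep a candidate only if it is in the current list
                # or has a neighbour in the current list
                cands = {p for p in cands
                         if p in s or any((p[0] + dx, p[1] + dy) in s
                                          for dx in (-1, 0, 1)
                                          for dy in (-1, 0, 1))}
                if not cands:
                    break
            else:
                return True   # every list had a point near a surviving candidate
        return False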
Currently, you are finding the solution with a brute-force algorithm which has O(n^2) complexity. If your lists contain 1000 items, your algorithm will need 1,000,000 iterations to run... (It's even O(n^3), as tobias_k pointed out.)
As you can see here: https://en.wikipedia.org/wiki/Closest_pair_of_points_problem, you could improve on it by using a divide-and-conquer algorithm, which would run in O(n log n) time.
You should search for Delaunay triangulation and/or Voronoi diagram implementations.
NB: if you can use external libs, you should also consider taking a look at the scipy lib: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.Delaunay.html

Optimizations for Raycasting

I've been wanting to build a 3D engine starting from scratch as a coding challenge with the end objective of implementing it on a fantasy console.
The best (i.e. simplest?) way I found was raytracing/raycasting. I haven't found much by looking online for raycasting algorithms, only point-in-polygon problems (which only tell me whether a ray intersects a polygon or not; that's not quite my interest, since I wouldn't have info about the first intersection, nor would I have the intersection points).
The only solution I could think of is brute-forcing the ray by moving it forward at small intervals and checking each time whether that point is occupied by something or not (which would require having filled shapes and wouldn't let me have 2D shapes, since they would never be rendered, although neither of those is a problem). Still, it looks far too costly performance-wise.
As far as I know, most of those problems are solved using linear algebra, but I'm not quite as competent as to build up a solution on my own. Does this problem have a practical solution?
EDIT: I just found an algebraic solution in 2D which could maybe be extended to 3D. The idea is:
For each edge, check whether one of the two vertices is in the field of view (i.e. if O is the origin of every ray and P is the vertex, you first check that the point is within the far limit of sight, and then that the angle with the forward vector is less than the angle of vision). If at least one of the two vertices is inside the field of view, add the edge to an array E.
If we have an array R of rays to shoot and an array of arrays I of info about hit points, we can loop for each ray in R and for each edge in E and then store f(ray, edge) in I, where f is a function that gives us info on whether the ray and the edge collided and where they did.
f uses basic linear algebra: both the ray and the edge are, for all purposes, segments, and two segments are just parts of two lines. Say the edge has vertices A and B (AB is the vector that goes from A to B) and the far point is called P (OP is the vector that goes from O to P). We can create two lines, r and s, defined by A + ηAB and O + λOP. After we check whether r and s are parallel (check if the absolute value of the dot product of AB and OP is equal to the norm of AB times the norm of OP), it's trivial to get the values of η and λ.
Now, if η < 0 or η > 1 (and likewise if λ is outside [0, 1]), the two segments are not colliding.
After we've done this for every ray and every edge, we compare every element in each array i in I to see which one had the lowest λ. The lowest λ carries the first collision and hence the data to show on screen.
Everything here is linear algebra, though I fear that it might still be computationally heavy, since there's a lot going on, and it's still only 2D.
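A Python sketch of the f(ray, edge) function described above; it uses the 2D cross product to solve for η and λ directly (a zero cross product is equivalent to the dot-product parallelism test):

    def ray_segment_hit(O, P, A, B, eps=1e-9):
        # Intersect the segment O->P (the ray cut off at the far point P)
        # with the edge A->B.  Returns (lam, point), where lam in [0, 1]
        # is the fraction of the way from O to P, or None if no collision.
        dx, dy = P[0] - O[0], P[1] - O[1]          # OP
        ex, ey = B[0] - A[0], B[1] - A[1]          # AB
        denom = dx * ey - dy * ex                  # cross(OP, AB)
        if abs(denom) < eps:                       # parallel (or degenerate)
            return None
        fx, fy = A[0] - O[0], A[1] - O[1]          # A - O
        lam = (fx * ey - fy * ex) / denom          # position along the ray
        eta = (fx * dy - fy * dx) / denom          # position along the edge
        if 0 <= lam <= 1 and 0 <= eta <= 1:
            return lam, (O[0] + lam * dx, O[1] + lam * dy)
        return None

Looping this over every ray and every edge in E and keeping the hit with the lowest λ per ray reproduces the comparison step described above.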

Generate test cases for levenshtein distance implementation with quickCheck

As part of learning about QuickCheck I want to build a test generator for a Levenshtein edit distance implementation. The obvious approach, I think, is to start with two equal strings and a random non-reducible series of insert/delete/transpose actions, apply that to one of the strings, and assert that the Levenshtein distance is the length of the random series.
I am quite stuck with this; can someone help?
Getting "non-reducible" right sounds pretty hard. I would try to find a larger number of less complicated invariants. Here are some ideas:
The edit distance between any string and itself is 0.
No two strings have a negative edit distance.
For an arbitrary string x, if you apply exactly one change to it, producing y, the edit distance between x and y should be 1.
Given two strings x and y, compute the distance d between them. Then, change y, yielding y', and compute its distance from x: it should differ from d by at most 1.
After applying n edits to a string x, the distance between the edited string and x should be at most n. Note that case (1) is a special case of this, where n=0, so you could omit that one for concision if you like. Or, keep it around, since case (1) may generate simpler counterexamples.
The function should be symmetric: the edit distance from x to y should be the same as from y to x.
If you have another, known-good implementation of the algorithm to test against, you could compare to that, and assert that you always get the same answer as it does.
The above were all just things that appealed to me without any research. You could do more: for example, encode the lower and upper bounds as defined by Wikipedia.
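The question is about Haskell's QuickCheck; to keep the code on this page in one language, here is a sketch of properties (1), (3) and (6) using Hypothesis, Python's QuickCheck analogue, with a naive recursive implementation standing in for the code under test:

    from hypothesis import given
    from hypothesis import strategies as st

    def levenshtein(a, b):
        # naive reference implementation (exponential; fine for tiny inputs)
        if not a:
            return len(b)
        if not b:
            return len(a)
        return min(levenshtein(a[1:], b) + 1,                   # delete a[0]
                   levenshtein(a, b[1:]) + 1,                   # insert b[0]
                   levenshtein(a[1:], b[1:]) + (a[0] != b[0]))  # substitute

    @given(st.text(max_size=6))
    def test_self_distance_is_zero(x):
        assert levenshtein(x, x) == 0

    @given(st.text(max_size=6), st.text(max_size=6))
    def test_symmetric(x, y):
        assert levenshtein(x, y) == levenshtein(y, x)

    @given(st.text(max_size=6), st.characters(), st.integers(min_value=0))
    def test_one_insertion_costs_one(x, c, i):
        i %= len(x) + 1
        y = x[:i] + c + x[i:]       # apply exactly one change (an insertion)
        assert levenshtein(x, y) == 1

Each test_* function can be called directly (Hypothesis supplies the random examples) or collected by pytest.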

Directed graph linear algorithm

I would like to know the best way to calculate the length of the shortest path between vertex s and every other vertex of the graph in linear time using dynamic programming.
The graph is a weighted DAG.
What you can hope for is an algorithm linear in the number of edges and vertices, i.e. O(|E| + |V|), which also works correctly in the presence of negative weights.
This is done by first computing a topological order and then 'exploring' the graph in the order given by this topological order.
Some notation: let's call d'(s,v) the shortest distance from s to v and d(u,v) the length/weight of the arc from u to v (if it exists).
Then, for a node v that is currently being visited, the shortest path from s to v is the minimum of d'(s,u)+d(u,v) for each in-neighbour u of v.
In principle, this is very similar to Dijkstra's algorithm except that we already know in which order to traverse the vertices.
The topological sorting ensures that all in-neighbours of v have already been visited and will not be updated again. So, whenever a node has been visited, the distance it is assigned is the correct shortest path from s to v. Therefore, you end up with a shortest s-v-path for each v.
A full description and implementation can be found here, which links to these lecture notes. I'm not sure where the algorithmic idea for this DAG algorithm was originally published in the literature.
This algorithm works for DAGs, even in the presence of negative weights/distances.
While a typical implementation of this algorithm will most likely not be done using dynamic programming explicitly, it can still be interpreted as such since the problem of finding a shortest path to a node v is computed using the shortest paths to the in-neighbours of v.
For further discussion on if/how this type of algorithm counts as dynamic programming, let me refer you to this question.
It's possible that what you're looking for is the Bellman-Ford algorithm, which is O(|V||E|) in terms of time complexity (not really linear).
Not sure if some witty dynamic-programming approach could improve on that though.
As hauron said, Bellman-Ford will give you what you're looking for in time O(|V||E|). This works even if your graph contains negative weighted edges, and Bellman-Ford uses dynamic programming at its core.
However, I must add that if your weights are non-negative, you can do Dijkstra from your vertex s in time O(|E| log |E|).
Initialize d[s] = 0.
For every other vertex v, taken in topological order, calculate:
d[v] = min { d[u] + w(u,v) | (u,v) is an edge }
d[v] = ∞ if v has no incoming edges.
(The algorithm always halts since the graph is acyclic.)
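A self-contained Python sketch combining the two steps (a topological order via Kahn's algorithm, then relaxation in that order; the function signature is an assumption of the example):

    from collections import defaultdict

    def dag_shortest_paths(n, edges, s):
        # n vertices labelled 0..n-1; edges is a list of (u, v, w) arcs.
        # Runs in O(|V| + |E|); negative weights are fine since there are no cycles.
        adj = defaultdict(list)
        indeg = [0] * n
        for u, v, w in edges:
            adj[u].append((v, w))
            indeg[v] += 1
        # Kahn's algorithm: repeatedly remove a vertex with in-degree 0
        order = []
        stack = [v for v in range(n) if indeg[v] == 0]
        while stack:
            u = stack.pop()
            order.append(u)
            for v, _ in adj[u]:
                indeg[v] -= 1
                if indeg[v] == 0:
                    stack.append(v)
        # Relax each arc once, in topological order
        d = [float('inf')] * n
        d[s] = 0
        for u in order:
            if d[u] == float('inf'):
                continue            # unreachable from s
            for v, w in adj[u]:
                d[v] = min(d[v], d[u] + w)
        return d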

How to find the cubes passed through by a triangle

Given a triangle with vertices A, B and C in the 3D world, and an axis-aligned bounding cuboid with length × width × height = nd × md × ld (n, m, l are integers and d is a float) containing it, partition the cuboid into n·m·l cubes; how do we find the cubes the triangle passes through?
There are many algorithms to detect whether a triangle and a cube intersect, so by looping over all cubes the problem can be solved. However, the complexity of this approach is O(n*m*l), or O(n^3). Is there an approach with complexity O(n^2), or even O(n log n)?
You cannot improve upon O(n m l) for the following reason: select m=1 and l=1.
Then one has a planar arrangement of n cubes, and your triangle could intersect every one. If you need to report each cube intersected, you would have to report all n cubes.
But clearly this is just a flaw in your problem statement. What you should ask is the situation where n=m=l. So now you have an n x n x n set of cubes, and one triangle can only intersect O(n^2) of them.
In this case, certainly a triangle might intersect Ω(n^2) cubes, so one cannot improve upon quadratic complexity. This rules out O(n log n).
So the question becomes: Is there a subcubic algorithm for identifying the O(n^2) cubes intersected by a triangle? (And one may replace "triangle" with "plane.")
I believe the answer is Yes. One method is to construct an octree representing the cubes. Searches for "voxels" and "octree intersection" may lead you to explicit algorithms.
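For the "plane" variant mentioned above, the per-cube test itself is cheap. Here is a standard plane-vs-axis-aligned-box check in Python (a sketch; a full triangle-vs-box test would add the separating-axis checks for the triangle's edges):

    def plane_intersects_box(normal, offset, lo, hi):
        # Plane {x : normal . x = offset} versus the box [lo, hi] (3D corners).
        # Compare the signed distances of the two box corners that are
        # extreme along the plane normal.
        pmin, pmax = [], []
        for i in range(3):
            if normal[i] >= 0:
                pmin.append(lo[i]); pmax.append(hi[i])
            else:
                pmin.append(hi[i]); pmax.append(lo[i])
        dmin = sum(normal[i] * pmin[i] for i in range(3)) - offset
        dmax = sum(normal[i] * pmax[i] for i in range(3)) - offset
        return dmin <= 0 <= dmax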
