I'm trying to implement a search for a personal project in which exploring a node is relatively expensive. I was undecided between DFS (Dijkstra's Forward Search) and A*.
My question is, will there be a case where A* explores more nodes than DFS?
Dijkstra's original algorithm does not use a min-priority queue and runs in time O(|V|^2) (where |V| is the number of nodes). The implementation based on a min-priority queue implemented by a Fibonacci heap and running in O(|E|+|V|\log |V|) (where |E| is the number of edges) is due to Fredman & Tarjan (1984). This is asymptotically the fastest known single-source shortest-path algorithm for arbitrary directed graphs with unbounded non-negative weights. However, specialized cases (such as bounded/integer weights, directed acyclic graphs, etc.) can indeed be improved further.
The time complexity of A* depends on the heuristic. In the worst case of an unbounded search space, the number of nodes expanded is exponential in the depth of the solution (the shortest path) d: O(b^d), where b is the branching factor (the average number of successors per state). This assumes that a goal state exists at all, and is reachable from the start state; if it is not, and the state space is infinite, the algorithm will not terminate.
The A* algorithm is a generalization of Dijkstra's algorithm that cuts down on the size of the subgraph that must be explored, if additional information is available that provides a lower bound on the "distance" to the target. This approach can be viewed from the perspective of linear programming: there is a natural linear program for computing shortest paths, and solutions to its dual linear program are feasible if and only if they form a consistent heuristic (speaking roughly, since the sign conventions differ from place to place in the literature). This feasible dual / consistent heuristic defines a non-negative reduced cost and A* is essentially running Dijkstra's algorithm with these reduced costs. If the dual satisfies the weaker condition of admissibility, then A* is instead more akin to the Bellman–Ford algorithm.
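To make the "reduced cost" remark concrete, here is a short derivation under one common sign convention (the symbols w, w', s, t and g' are introduced here for illustration and are not from the text above). Consistency of h is exactly the statement that every reduced edge cost is non-negative, and along any path the reduced costs telescope:

```latex
% Consistency of h for every edge (u, v) with weight w(u, v):
%   h(u) \le w(u, v) + h(v)
\begin{align*}
w'(u,v) &= w(u,v) - h(u) + h(v) \;\ge\; 0, \\
\sum_{(u,v) \in P} w'(u,v) &= \sum_{(u,v) \in P} w(u,v) - h(s) + h(n)
  \quad \text{for any path } P \text{ from } s \text{ to } n, \\
g'(n) &= g(n) + h(n) - h(s) \;=\; f(n) - h(s).
\end{align*}
```

So Dijkstra's algorithm run on the reduced costs orders nodes by g'(n), which differs from the A* key f(n) = g(n) + h(n) only by the constant h(s), and the two expand nodes in the same order (up to tie-breaking).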
Worst case performance O(|E|)=O(b^{d})
and
Worst case space complexity O(|V|)=O(b^{d})
Dijkstra's algorithm can be viewed as a special case of A* where h(x)=0 for all x.
It should be noted, however, that Dijkstra's algorithm can be implemented more efficiently without including a h(x) value at each node.
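As a minimal sketch of the h(x)=0 relationship, the following Python function runs A* and counts node expansions; calling it with the default zero heuristic gives Dijkstra's algorithm. The function name, the grid helper and the 7x7 example are assumptions made for illustration, and the closed-set shortcut relies on the heuristic being consistent.

```python
import heapq

def a_star(neighbors, start, goal, h=lambda n: 0):
    """Return (path_cost, expanded_count); h = 0 reduces to Dijkstra.

    neighbors(node) yields (neighbor, edge_weight) pairs.
    Assumes non-negative weights and a consistent heuristic h.
    """
    open_heap = [(h(start), 0, start)]   # entries are (f, g, node)
    best_g = {start: 0}
    closed = set()
    expanded = 0
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue                     # stale duplicate entry
        if node == goal:
            return g, expanded
        closed.add(node)
        expanded += 1                    # this is the "expensive" step
        for nbr, weight in neighbors(node):
            new_g = g + weight
            if nbr not in closed and new_g < best_g.get(nbr, float("inf")):
                best_g[nbr] = new_g
                heapq.heappush(open_heap, (new_g + h(nbr), new_g, nbr))
    return None, expanded

def grid_neighbors(size):
    """4-connected unit-cost grid of size x size (illustrative helper)."""
    def neighbors(node):
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < size and 0 <= nc < size:
                yield (nr, nc), 1
    return neighbors

start, goal = (3, 3), (3, 6)
manhattan = lambda n: abs(n[0] - goal[0]) + abs(n[1] - goal[1])

dijkstra_cost, dijkstra_expanded = a_star(grid_neighbors(7), start, goal)
astar_cost, astar_expanded = a_star(grid_neighbors(7), start, goal, manhattan)
print(dijkstra_cost, dijkstra_expanded, astar_cost, astar_expanded)
```

Both runs return the same path cost, but the run with the Manhattan heuristic expands far fewer nodes on this grid. With a consistent heuristic, A* only expands nodes with f(n) no larger than the optimal cost, so apart from tie-breaking among equally good nodes it should not expand more nodes than Dijkstra.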
Related
I am solving general black-box optimization problems like:
x*: f(x) -> min, where x is a permutation of length N (N = 50, for example, so brute-force search is not possible). The objective function f(x) is given by stand-alone computer code, and x represents the configuration of a complex system whose response is simulated by f(x).
I have learned that many heuristic methods can be used in this case. However, most of them rely on some kind of local search, which requires a suitable distance metric on the search space (the space of permutations x in my case). By a suitable distance metric I mean one that fulfils the "locality" property, i.e. a small change in the permutation x produces a small change in the objective function f(x). In my case no such metric is known, so any kind of local search is close to random search.
I have a few questions:
Are there any heuristic black-box combinatorial optimization methods that do not use local search and/or a distance metric on the search space? I need to overcome the low "locality" of the problem, or simply the fact that no suitable distance metric on the search space is known.
Is the "locality" property really so rare in combinatorial optimization in general? Maybe I am missing something, but most real-world black-box combinatorial problems seem to have low or very low "locality", because the common permutation distance metrics (Hamming, Kendall, etc.) are not suitable in general.
Is there a general method for finding a distance metric on the search space that satisfies "locality" at least approximately?
Additional remarks:
In reality, the black-box function f(x) is realized by stand-alone deterministic simulation code, where x plays the role of a discrete configuration of the simulated physical system. So f(x) definitely has well-defined properties, but these properties are so complicated that they cannot simply be exploited.
Because of these complicated internal properties of f(x), it is not possible to find a distance metric d(x,x') on the search space that fulfils "locality" (x and x' that are similar under the metric produce similar responses f(x) and f(x')).
So, finally, I am looking for optimization heuristics that can find acceptable sub-optimal solutions using only the information available from the values of f(x) in fitness space, like EDAs (Estimation of Distribution Algorithms), for example.
The main point of this question is: what types of optimization heuristics are suitable for this kind of problem?
I was reading about the variants of the A* search algorithm and I came across dynamic weighting. As I understand it, a weight is applied to the heuristic term of the evaluation function, and this weight decreases as the search gets closer to the goal node. I was specifically looking at this article: http://theory.stanford.edu/~amitp/GameProgramming/Variations.html
Can anyone tell me what the advantages of this would be? Why would you not care what nodes you expand at the start? Is it to help searches that don't necessarily have a good heuristic?
Thanks
For the TL;DR crowd:
Dynamic weighting sacrifices solution optimality to speed up the search. The larger the weight, the more greedy the search.
For my fellow scholars:
Motivation
From the Wikipedia A-star article:
A-star's admissibility criterion guarantees an optimal solution path, but it also means that A* must examine all equally meritorious paths to find the optimal path. We can speed up the search at the expense of optimality by relaxing the admissibility criterion to obtain an approximate solution. Oftentimes we want to bound this relaxation, so that we can guarantee that the solution path is no worse than (1 + ε) times the optimal solution path. This new guarantee is referred to as ε-admissible.
Static Weighting
Before we talk about dynamic weighting, let's compare A-star to the simplest ε-admissible relaxation: static-weighted A-star.
In static-weighted A-star, f(n) = g(n) + w·h(n), with w=(1+ε) for some ε>0. To illustrate the effect on optimality and search speed, compare the number of nodes expanded in each of the following illustrations. Empty circles represent nodes in the open set; filled-in circles are in the closed set.
A-star (left) vs. Weighted A-star with ε=4 (right)
As you can see, weighted A-star expanded far fewer nodes and completed about 3x as fast. However, since we used ε=4, weighted A-star could theoretically return a solution that is (1+ε)=(1+4)=5 times as long as the optimal path.
Dynamic Weighting
Dynamic Weighting is a technique that makes the heuristic weight a function of the search state, i.e. f(n) = g(n) + w(n)·h(n), where w(n) = 1 + ε - (ε·d(n))/N, d(n) is the depth of node n in the search and N is an upper bound on the search depth. The weight therefore falls from 1+ε at depth 0 to 1 at depth N.
In this way, dynamic-weight A-Star initially behaves very much like a Greedy Best First search, but as the search depth (i.e. the number of hops in the graph) increases, the algorithm takes a more conservative approach, behaving more like the traditional A-star algorithm.
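The two weighting schemes above can be sketched in Python as one search routine with a pluggable weight function. Everything named here (weighted_a_star, static_weight, dynamic_weight, the 10x10 grid with a wall) is a hypothetical illustration, not code from the linked article, and stale queue entries are simply skipped so that nodes may be re-expanded when a cheaper path to them is found.

```python
import heapq

def weighted_a_star(neighbors, start, goal, h, weight):
    """Best-first search on f(n) = g(n) + weight(depth(n)) * h(n).

    Returns (path_cost, expanded_count). weight(depth) should stay
    within [1, 1 + eps] for the (1 + eps) bound to apply.
    """
    inf = float("inf")
    best_g = {start: 0}
    heap = [(weight(0) * h(start), 0, 0, start)]   # (f, g, depth, node)
    expanded = 0
    while heap:
        _, g, depth, node = heapq.heappop(heap)
        if g > best_g.get(node, inf):
            continue                               # stale entry
        if node == goal:
            return g, expanded
        expanded += 1
        for nbr, cost in neighbors(node):
            new_g = g + cost
            if new_g < best_g.get(nbr, inf):
                best_g[nbr] = new_g
                f = new_g + weight(depth + 1) * h(nbr)
                heapq.heappush(heap, (f, new_g, depth + 1, nbr))
    return None, expanded

def static_weight(eps):
    """w = 1 + eps at every depth."""
    return lambda depth: 1 + eps

def dynamic_weight(eps, max_depth):
    """w(n) = 1 + eps - eps * d(n) / N, clamped once d(n) reaches N."""
    return lambda depth: 1 + eps - eps * min(depth, max_depth) / max_depth

# Illustrative 10x10 unit-cost grid with a wall in column 5, rows 0..7.
SIZE = 10
WALL = {(r, 5) for r in range(8)}

def neighbors(node):
    r, c = node
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nxt = (r + dr, c + dc)
        if 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE and nxt not in WALL:
            yield nxt, 1

start, goal = (0, 0), (0, 9)
h = lambda n: abs(n[0] - goal[0]) + abs(n[1] - goal[1])
eps = 4

optimal_cost, _ = weighted_a_star(neighbors, start, goal, h, lambda depth: 1.0)
static_cost, _ = weighted_a_star(neighbors, start, goal, h, static_weight(eps))
dynamic_cost, _ = weighted_a_star(neighbors, start, goal, h,
                                  dynamic_weight(eps, SIZE * SIZE))
print(optimal_cost, static_cost, dynamic_cost)
```

Passing a constant weight of 1 recovers plain A*, which gives a reference cost to compare the two relaxed searches against; both should stay within (1 + ε) times that cost.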
Amit Patel's page says
With dynamic weighting, you assume that at the beginning of your
search, it’s more important to get (anywhere) quickly; at the end of
the search, it’s more important to get to the goal.
He is correct, but I would say that with dynamic weighting, you assume that at the beginning of your search, it's more important to follow your heuristic; at the end of the search, it becomes equally important to consider the length of the path, too.
Additional Materials and Links:
Asst. Prof. Ira Pohl -- The Avoidance of (Relative) Catastrophe, Heuristic Competence, Genuine Dynamic Weighting and Computational Issues in Heuristic Problem Solving
Dynamic Weighting on Amit Patel's Variants of A*
Wikipedia -- Bounded Relaxation for the A* Search Algorithm
I'm wondering how you can quantify the results of the Needleman-Wunsch algorithm (typically used for aligning nucleotide/protein sequences).
Consider some fixed scoring scheme and two sequences of varying length S1 and S2. Say we calculate every possible alignment of S1 and S2 by brute force, and the highest scoring alignment has a score x. And of course, this has considerably higher complexity than the Needleman-Wunsch approach.
When using the Needleman-Wunsch algorithm to find a sequence alignment, say that it has a score y.
Consider r to be the score generated via Needleman-Wunsch for two random sequences R1 and R2.
How does x compare to y? Is y always greater than r for two sequences of known homology?
In general, I do understand that we use the Needleman-Wunsch algorithm to significantly speed up sequence alignment (vs a brute-force approach), but don't understand the cost in accuracy (if any) that comes with it. I had a go at reading the original paper (Needleman & Wunsch, 1970) but am still left with this question.
Needleman-Wunsch always produces an optimal answer - it's much faster than brute force and doesn't sacrifice accuracy in the process. The key insight it uses is that it's not actually necessary to generate all possible alignments, since most of them contain bad sub-alignments and couldn't possibly be optimal. The Needleman-Wunsch algorithm works by instead building up optimal alignments for fragments of the original strands and then growing those smaller alignments into larger alignments, using the guarantee that any optimal alignment must contain an optimal alignment for a slightly smaller case.
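As a rough check of the optimality claim, the sketch below compares the Needleman-Wunsch score with an exhaustive recursion over every alignment. The scoring values (match +1, mismatch -1, linear gap -1) and the function names are assumptions chosen for illustration, not part of the question.

```python
def nw_score(s1, s2, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score via the Needleman-Wunsch table."""
    rows, cols = len(s1) + 1, len(s2) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap            # s1 prefix aligned to gaps only
    for j in range(1, cols):
        score[0][j] = j * gap            # s2 prefix aligned to gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align two letters
                              score[i - 1][j] + gap,       # gap in s2
                              score[i][j - 1] + gap)       # gap in s1
    return score[-1][-1]

def brute_force_score(s1, s2, match=1, mismatch=-1, gap=-1):
    """Best score over all alignments, enumerated without memoization."""
    if not s1:
        return len(s2) * gap
    if not s2:
        return len(s1) * gap
    sub = match if s1[0] == s2[0] else mismatch
    return max(
        sub + brute_force_score(s1[1:], s2[1:], match, mismatch, gap),
        gap + brute_force_score(s1[1:], s2, match, mismatch, gap),
        gap + brute_force_score(s1, s2[1:], match, mismatch, gap),
    )

pairs = [("GATTACA", "GCATGCG"), ("ACGT", "AGT"), ("AAAA", "TT")]
for a, b in pairs:
    print(a, b, nw_score(a, b), brute_force_score(a, b))
```

The two functions make the same three-way choice at each step; the table version just reuses each sub-result instead of recomputing it, which is why the scores agree while the running time drops to the product of the sequence lengths.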
I think your question boils down to whether dynamic programming finds the optimal solution, i.e., guarantees that y = x (y can never exceed x, since x is the maximum over all alignments). For a discussion on this I would refer to people who are likely smarter than me:
https://cs.stackexchange.com/questions/23599/how-is-dynamic-programming-different-from-brute-force
Basically, it says that dynamic programming produces the optimal result, i.e. the same as brute force, but only for problems that satisfy the Bellman principle of optimality.
According to the Wikipedia page for Needleman-Wunsch, the problem does satisfy the Bellman principle of optimality:
https://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm
Specifically:
The Needleman–Wunsch algorithm is still widely used for optimal global
alignment, particularly when the quality of the global alignment is of
the utmost importance. However, the algorithm is expensive with
respect to time and space, proportional to the product of the length
of two sequences and hence is not suitable for long sequences.
There is also mention of optimality elsewhere in the same Wikipedia page.
Is there a term or expression for a heuristic function (as in pathfinding, state space, or combinatorial search) which can estimate the distance between any two nodes (goal or non-goal nodes)?
Furthermore, is there a term for such a function which never overestimates the aforementioned distance?
In the context of the A* search algorithm, a heuristic estimate which never overestimates the distance is called "admissible".
Other than "heuristic" and "estimate", I don't think there's a consistent, distinguished term for the function itself.
In a graph, when we know the depth at which the goal node lies, which graph search algorithm is fastest to use: BFS or DFS?
And how would you define "best" ?
If you know that the goal node is at depth n from the root node (the node from which you begin the search), BFS will ensure that the search never visits nodes with depth > n.
That said, DFS might still "choose" a route that turns out to be faster (visits fewer nodes) than BFS.
So to sum up, I don't think that you can define "best" in such a scenario.
As I mentioned in the comments, if the solution is at a known depth d, you can use depth-limited search (DLS) instead of DFS. For all three methods (BFS, DFS and DLS), the algorithmic complexity is linear in the number of nodes and links in your state space graph in the worst case, i.e. O(|V|+|E|).
In practice, depending on d, DLS can be faster though, because BFS requires developing the search tree down to depth d-1, and possibly a part of depth d (so almost the whole tree). With DLS, this happens only in the worst cases.
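A minimal sketch of depth-limited search in Python, assuming the state space is given as a function returning the successors of a node; the function name and the small example tree are made up for illustration.

```python
def depth_limited_search(neighbors, start, is_goal, limit):
    """Depth-first search that never descends more than `limit` edges.

    Returns the path from start to a goal as a list, or None if no
    goal is found within the limit.
    """
    path = [start]

    def recurse(node, remaining):
        if is_goal(node):
            return True
        if remaining == 0:
            return False                 # cut off: do not go deeper
        for nbr in neighbors(node):
            if nbr in path:
                continue                 # avoid cycles along the current path
            path.append(nbr)
            if recurse(nbr, remaining - 1):
                return True
            path.pop()                   # backtrack
        return False

    return list(path) if recurse(start, limit) else None

# Illustrative tree: the goal "G" sits at depth 2 below the root "A".
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
successors = lambda node: tree.get(node, [])

print(depth_limited_search(successors, "A", lambda n: n == "G", limit=2))
print(depth_limited_search(successors, "A", lambda n: n == "G", limit=1))
```

With the limit set to the known goal depth, the search keeps the low memory use of DFS while never wandering below the level where the goal can be.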