What does the star in the A* algorithm mean?

I'm quite sure the * (star) in the A* algorithm means that the algorithm is admissible, i.e. it is guaranteed that it finds the shortest path in the graph if this path exists (when the heuristic employed is optimistic).
Am I right? I was looking for information about this but couldn't find any reference. Hopefully, more experienced users in this community know more about the history of A* than I do.
By the way, I think that other algorithms like IDA*, D*, SMA*, MOA*, NAMOA*, ... that are based on A* follow the same naming convention.

The reason is that scientists first came up with an improved version of Dijkstra's algorithm that they called A1. Later on, the inventors of A* discovered an improvement of A1 that they called A2. They then managed to prove that A2 was actually optimal under some assumptions on the heuristic in use. Because A2 was optimal, it was renamed A*. In science, and in optimisation in particular, a " * " symbol is often used to denote optimal solutions. Some also interpret the " * " as meaning "any version number", since it was proven impossible to build an "A3" algorithm that would outperform A2/A*.
By the way, in this context, "optimal" doesn't mean that it reaches the optimal solution, but that it does so while exploring the minimum number of nodes. Of course, A* is also complete, and with an admissible heuristic it reaches the optimal solution.

Related

Trivial Heuristic for A* essentially makes it Equivalent to Uniform Cost Search?

As the title says, this is what I'm thinking. A trivial heuristic, say estimate = 0 for every node, essentially makes it count only the current cost. Is there still a difference between the two algorithms if the heuristic is trivial?
You can, as you say, use a heuristic of 0 for every node and functionally they will be very close.
There is one subtle difference between A* and Uniform Cost Search (UCS): UCS knows that all edges are cost 1, so it can terminate when the goal is generated, while A* can only terminate when the goal is expanded.
The other difference is the complexity of data structures needed. A* might have arbitrary edge costs and so it needs more complicated priority queues to be efficient.
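To make the comparison concrete, here is a minimal sketch (illustrative names, not from any particular library): the same priority-queue loop implements both algorithms, and passing the trivial heuristic h(n) = 0 makes A* behave exactly like uniform-cost search.

```python
import heapq

def astar(graph, start, goal, h=lambda n: 0):
    """A* over a dict graph {node: [(neighbor, cost), ...]}.
    With the trivial heuristic h(n) == 0 the priority reduces to g(n),
    which is exactly uniform-cost search (Dijkstra)."""
    open_heap = [(h(start), 0, start)]            # entries are (f = g + h, g, node)
    best_g = {start: 0}
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if g > best_g.get(node, float("inf")):
            continue                              # stale queue entry
        if node == goal:
            return g                              # terminate on expansion, not generation
        for neighbor, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                heapq.heappush(open_heap, (new_g + h(neighbor), new_g, neighbor))
    return None                                   # goal unreachable

# With h omitted (all zeros) this is plain uniform-cost search:
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1)], "C": []}
print(astar(graph, "A", "C"))                     # -> 2
```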

Proof of optimal efficiency of A* Search

It is mentioned in Norvig's Artificial Intelligence that A* Search is optimally efficient. However, I could not figure out why, nor could I find the proof on the web. Does anyone happen to have a proof?
I hope I'm not doing your homework ;). I'll only sketch the proof here.
The first thing you need to see is that A* is optimal, that is, it returns the shortest path according to your cost function g. This proof is straightforward under the assumption that the heuristic h does not overestimate the cost of the solution. If this didn't hold, optimal efficiency would be meaningless, as A* wouldn't be optimal.
Optimal efficiency: among all optimal algorithms starting from the same start node and using the same heuristic, A* expands the fewest nodes.
Let's assume an algorithm B does not expand some node n that is expanded by A*. By definition, for this node g(n) + h(n) <= f, where f is the cost of the shortest path. Now consider a second problem in which all heuristic values are the same as in the original problem, but there is a new path through n to a new goal with total cost smaller than f.
The assumed algorithm B would not expand n and hence would never reach this new goal, so B wouldn't find this optimal path. Therefore, our original assumption that B is optimal is violated.
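For readers who prefer it in compact notation, here is the same argument restated as a sketch (C* denotes the optimal solution cost; this is not a full rigorous proof):

```latex
\begin{itemize}
  \item Let $C^{*}$ be the cost of an optimal solution. With an admissible heuristic $h$,
        A* expands every node $n$ with $f(n) = g(n) + h(n) < C^{*}$.
  \item Suppose an optimal algorithm $B$, using the same heuristic, skips such a node $n$.
  \item Construct a second problem with identical heuristic values in which a new goal is
        reachable only through $n$, at total cost $g(n) + h(n) < C^{*}$.
  \item $B$ cannot tell the two problems apart without expanding $n$, so it misses the new
        goal and returns a solution of cost $C^{*}$, which is now suboptimal.
  \item This contradicts the optimality of $B$, so every optimal algorithm must expand at
        least the nodes that A* expands.
\end{itemize}
```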

Monotonicity and A*. Is it optimal?

Is A* an optimal search algorithm (i.e. will it find the best solution) even if the heuristic is non-monotonic? Why or why not?
A* is an optimal search algorithm as long as the heuristic is admissible.
But, if the heuristic is inconsistent, you will need to re-expand nodes to ensure optimality. (That is, if you find a shorter path to a node on the closed list, you need to update the g-cost and put it back on the open list.)
This re-expansion can introduce exponential overhead in state spaces with exponentially growing edge costs. There are variants of A* which reduce this to polynomial overhead by essentially interleaving a Dijkstra search in between each A* node expansion.
This paper has an overview of recent work, and particularly citations of other work by Martelli and Mero who detail these worst-case graphs and suggest optimizations to improve A*.
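As a concrete illustration of the re-expansion rule described above, here is a minimal A* sketch (graph and heuristic are plain dicts, purely illustrative) that updates the g-cost and reopens a closed node whenever a cheaper path to it is found:

```python
import heapq

def astar_reopening(graph, h, start, goal):
    """A* that reopens closed nodes when a cheaper path to them is found.
    This re-expansion preserves optimality when h is admissible but
    inconsistent. graph: {node: [(neighbor, cost), ...]}, h: {node: estimate}."""
    open_heap = [(h[start], 0, start)]        # entries are (f, g, node)
    g_cost = {start: 0}
    closed = set()
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if g > g_cost[node]:
            continue                          # stale entry; a cheaper path was found later
        if node == goal:
            return g
        closed.add(node)
        for nbr, cost in graph.get(node, []):
            new_g = g + cost
            if new_g < g_cost.get(nbr, float("inf")):
                g_cost[nbr] = new_g
                if nbr in closed:
                    closed.discard(nbr)       # cheaper path to a closed node: reopen it
                heapq.heappush(open_heap, (new_g + h[nbr], new_g, nbr))
    return None                               # goal not reachable
```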
A* is an optimal search algorithm if and only if the heuristic is both admissible and monotonic (also referred to as consistent). For a discussion of the distinction between these criteria, see this question. Effectively, the consistency requirement means that the heuristic cannot overestimate the distance between any pair of nodes in the graph being searched. This is necessary because any overestimate could result in the search ignoring a good path in favor of a path that is actually worse.
Let's walk through an example of why this is true. Consider two nodes, A and B, with heuristic estimates of 5 and 6, respectively. If the heuristic is admissible and consistent, the shortest path that could possibly exist goes through A and is no shorter than 5. But what if the heuristic isn't consistent? For a heuristic to be admissible but inconsistent, it must never overestimate the distance to the goal, yet there need not be any coherent relationship between the heuristic estimates at different nodes in the graph. For a concrete example, see my answer to this question. In that example, the heuristic function randomly chooses between two other functions. If the heuristic estimates at A and B were not calculated with the same function, we don't actually know which of them currently has the potential to lead to a shorter path. In effect, we aren't using the same scale to measure them. So we could choose A when B was actually the better option. This might result in us finding the goal through a sub-optimal path.

Connection between A-star search and integer programming, extending A-star

Does anyone have a good reference for the connections between A-star search and more general integer programming formulations for a Euclidean shortest path problem?
In particular, I'm interested in how one modifies A-star to cope with additional (perhaps path-dependent) constraints, whether it makes sense to use a general-purpose LP/IP solver to tackle constrained shortest path problems like this, or whether something more specialised is required to achieve the same kind of performance obtained by A-star together with a good heuristic.
I'm not afraid of maths, but most of the references I'm finding for more complex shortest path problems aren't very explicit about how they relate to heuristic-guided algorithms like A* (perhaps because 'A*' is hard to google for...).
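For concreteness, the unconstrained baseline I have in mind is something like the textbook arc-flow formulation of the shortest s-t path problem (notation is just illustrative), to which the extra path constraints would then be attached as additional linear constraints:

```latex
% Shortest s-t path on a digraph G = (V, E) with arc costs c_{ij}, as an integer program.
\begin{align*}
  \min_{x} \quad & \sum_{(i,j) \in E} c_{ij}\, x_{ij} \\
  \text{s.t.} \quad
  & \sum_{j : (i,j) \in E} x_{ij} \;-\; \sum_{j : (j,i) \in E} x_{ji} =
    \begin{cases} 1 & i = s \\ -1 & i = t \\ 0 & \text{otherwise} \end{cases}
    \qquad \forall i \in V, \\
  & x_{ij} \in \{0, 1\} \qquad \forall (i,j) \in E.
\end{align*}
```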
You might want to look into constraint optimization, specifically soft-arc consistency, and constraint satisfaction, specifically arc-consistency, or other types of consistency such as i-consistency. Here are some references about constraint optimization:
[1] Thomas Schiex. Soft constraint Processing. http://www.inra.fr/mia/T/schiex/Export/Ecole.pdf
[2] Dechter, Rina. Constraint Processing, 1st ed. Morgan Kaufmann, San Francisco, CA 94104-3205, 2003.
[3] Kask, K., and Dechter, R. Mini-Bucket Heuristics for Improved Search. In Proc. UAI-1999 (San Francisco, CA, 1999), Morgan Kaufmann, pp. 314–323.
[3] might be especially interesting because it deals with combining A* with a heuristic of the type you seem to be interested in.
I'm not sure whether this helps you. Here's how I got the idea that it might:
Constraint optimization is a generalization of SAT towards optimization and variables with more than two values. A set of soft-constraints, i.e. partial cost functions, and a set of discrete variables define your problem. Typically a branch-and-bound algorithm is used to traverse the search tree that this problem implies. Soft-arc consistency refers to a set of heuristics that use local soft-constraints to compute the approximate distance to the goal node in that search tree, from your current position. These heuristics are used within the branch-and-bound search, much like heuristics are used within A* search.
Branch-and-bound relates to A* over trees much the same way that depth-first search relates to breadth-first search. So, apart from the fact that a DFS-like algorithm (branch-and-bound) is used in this case, and that it is a tree instead of a graph, it looks like (soft)-arc consistency or other types of consistency is what you are looking for.
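To make the analogy concrete, here is a tiny branch-and-bound sketch (all names hypothetical; the bound is a stand-in for a soft-arc-consistency heuristic) where a lower bound prunes subtrees in much the same way that g(n) + h(n) guides A*:

```python
# Hypothetical problem: assign values to discrete variables so that the sum of
# soft-constraint costs is minimised. `soft_cost` and `lower_bound` stand in for
# the partial cost functions and a heuristic estimate of the remaining cost.

def branch_and_bound(variables, domains, soft_cost, lower_bound):
    """Depth-first branch-and-bound: prune a branch as soon as
    cost-so-far + heuristic lower bound >= best cost found so far."""
    best = {"cost": float("inf"), "assignment": None}

    def recurse(assignment, cost):
        if cost + lower_bound(assignment) >= best["cost"]:
            return                                    # prune this subtree
        if len(assignment) == len(variables):
            best["cost"], best["assignment"] = cost, dict(assignment)
            return
        var = variables[len(assignment)]              # next unassigned variable
        for value in domains[var]:
            assignment[var] = value
            recurse(assignment, cost + soft_cost(var, value, assignment))
            del assignment[var]

    recurse({}, 0)
    return best["assignment"], best["cost"]

# Toy usage: two binary variables, with a soft constraint preferring them to differ.
variables = ["x", "y"]
domains = {"x": [0, 1], "y": [0, 1]}
soft_cost = lambda var, val, a: 1 if var == "y" and a.get("x") == val else 0
lower_bound = lambda a: 0                             # trivial (admissible) bound
print(branch_and_bound(variables, domains, soft_cost, lower_bound))
```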
Unfortunately, while you can in principle use A* in place of branch-and-bound, it is not clear yet (as far as I know) how in general you could combine A* with soft-arc consistency. Going from a tree to a graph might further complicate things, but I don't know that.
So, no final answer, just some stuff to look at as a starter, maybe :).

O(n^2) (or O(n^2 lg n)?) algorithm to calculate the longest common subsequence (LCS) of two 'ring' strings

This is a problem that appeared in today's Pacific NW Region Programming Contest; no one solved it during the contest. It is problem B and the complete problem set is here: http://www.acmicpc-pacnw.org/icpc-statements-2011.zip. There is a well-known O(n^2) algorithm for the LCS of two strings using dynamic programming. But when these strings are extended to rings I have no idea...
P.S. note that it is subsequence rather than substring, so the elements do not need to be adjacent to each other
P.S. It might not be O(n^2) but O(n^2 lg n), or anything that can give the result within 5 seconds on a common computer.
Searching the web, this appears to be covered by section 4.3 of the paper "Incremental String Comparison", by Landau, Myers, and Schmidt at cost O(ne) < O(n^2), where I think e is the edit distance. This paper also references a previous paper by Maes giving cost O(mn log m) with more general edit costs - "On a cyclic string to string correcting problem". Expecting a contestant to reproduce either of these papers seems pretty demanding to me - but as far as I can see the question does ask for the longest common subsequence on cyclic strings.
You can double the first and second string and then use the ordinary method, and later wrap the positions around.
It is a good idea to "double" the strings and apply the standard dynamic programming algorithm. The problem is that to get the optimal cyclic LCS one then has to start the algorithm from multiple initial conditions. Just one initial condition (e.g. setting all Lij variables to 0 at the boundaries) will not do in general. In practice it turns out that the number of initial states needed is O(N) (they span a diagonal), so one gets back to an O(N^3) algorithm.
However, the approach does have some virtue, as it can be used to design efficient O(N^2) heuristics (not exact but near exact) for CLCS.
I do not know if a true O(N^2) algorithm exists, and would be very interested if someone knows one.
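To make the doubling idea concrete, here is a brute-force sketch with the same O(N^3) complexity (illustrative code, not the multiple-initial-conditions formulation itself): each rotation of the first string is a length-n window of the doubled string, and one standard O(n^2) LCS DP is run per rotation. It assumes, as is usual for cyclic LCS, that it suffices to rotate only one of the two strings.

```python
def lcs_len(a, b):
    """Standard O(len(a)*len(b)) dynamic-programming LCS length."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def cyclic_lcs_len(a, b):
    """Exact cyclic LCS by brute force over rotations of one string:
    each rotation is a length-n window of the doubled string a + a,
    giving n runs of the O(n^2) DP, i.e. O(n^3) overall."""
    n = len(a)
    doubled = a + a
    return max(lcs_len(doubled[i:i + n], b) for i in range(n))

print(cyclic_lcs_len("aabb", "bbaa"))   # -> 4
```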
The CLCS problem has quite interesting "periodicity" properties: the length of the CLCS of p-times repeated strings is p times the CLCS of the original strings. This can be proved by adopting a geometric view of the problem.
Also, the problem has some additional interesting properties: it can be shown that if Lc(N) denotes the average CLCS length of two random strings of length N, then |Lc(N) - CN| is O(sqrt(N)), where C is the Chvatal-Sankoff constant. For the average length L(N) of the standard LCS, the only rate result I know of says that |L(N) - CN| is O(sqrt(N log N)). There could be a nice way to compare Lc(N) with L(N), but I don't know it.
Another question: it is clear that the CLCS length is not superadditive, unlike the LCS length. By this I mean it is not true that CLCS(X1X2,Y1Y2) is always greater than CLCS(X1,Y1)+CLCS(X2,Y2) (it is very easy to find counterexamples with a computer).
But it seems possible that the averaged length Lc(N) is superadditive (Lc(N1+N2) greater than Lc(N1)+Lc(N2)) - though if there is a proof I don't know it.
One modest interest in this question is that the values Lc(N)/N for the first few values of N would then provide good bounds to the Chvatal-Sankoff constant (much better than L(N)/N).
As a followup to mcdowella's answer, I'd like to point out that the O(n^2 lg n) solution presented in Maes' paper is the intended solution to the contest problem (check http://www.acmicpc-pacnw.org/ProblemSet/2011/solutions.zip). The O(ne) solution in Landau et al's paper does NOT apply to this problem, as that paper is targeted at edit distance, not LCS. In particular, the solution to cyclic edit distance only applies if the edit operations (add, delete, replace) all have unit (1, 1, 1) cost. LCS, on the other hand, is equivalent to edit distances with (add, delete, replace) costs (1, 1, 2). These are not equivalent to each other; for example, consider the input strings "ABC" and "CXY" (for the acyclic case; you can construct cyclic counterexamples similarly). The LCS of the two strings is "C", but the minimum unit-cost edit is to replace each character in turn.
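To see the cost equivalence concretely, here is a quick check with plain DP sketches (nothing taken from the cited papers):

```python
def lcs_len(a, b):
    """Standard O(len(a)*len(b)) LCS length."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if ca == cb else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def edit_distance(a, b, ins=1, delete=1, rep=1):
    """Weighted edit distance with the given insert/delete/replace costs."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i * delete
    for j in range(len(b) + 1):
        dp[0][j] = j * ins
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            dp[i][j] = min(dp[i - 1][j] + delete,
                           dp[i][j - 1] + ins,
                           dp[i - 1][j - 1] + (0 if ca == cb else rep))
    return dp[-1][-1]

a, b = "ABC", "CXY"
print(lcs_len(a, b))                                 # 1 ("C")
print(edit_distance(a, b))                           # 3: unit costs, replace each character
print(edit_distance(a, b, ins=1, delete=1, rep=2))   # 4 == len(a) + len(b) - 2 * lcs_len(a, b)
```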
At 110 lines but no complex data structures, Maes' solution falls towards the upper end of what is reasonable to implement in a contest setting. Even if Landau et al's solution could be adapted to handle cyclic LCS, the complexity of the data structure makes it infeasible in a contest setting.
Last but not least, I'd like to point out that an O(n^2) solution DOES exist for CLCS, described here: http://arxiv.org/abs/1208.0396. At 60 lines, with no complex data structures and only 2 arrays, this solution is quite reasonable to implement in a contest setting. Arriving at the solution might be a different matter, though.
