How to improve the A* algorithm to support multi-target search in a maze

If I have an A* function that finds the optimal path from a starting point to a target in a maze, how should I modify the heuristic function so that it remains admissible and the search still returns the optimal result when there are multiple targets?

Assuming that the problem is to visit only one target:
The first solution that comes to mind is to loop over all possible targets, compute the admissible heuristic value for each of them, and return the minimum of those values as the final heuristic value. That way you are sure that the heuristic is still admissible (it does not overestimate the distance to any of the targets).
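For a grid maze, a minimal Python sketch of that idea (assuming Manhattan distance is your usual admissible single-target heuristic; the function names are illustrative):

```python
def manhattan(a, b):
    # Standard single-target admissible heuristic on a 4-connected grid.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def multi_target_heuristic(node, targets):
    # Admissible for "reach any one target": the true cost to the
    # nearest target can never be smaller than the smallest estimate.
    return min(manhattan(node, t) for t in targets)
```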
EDIT: Now assuming that the problem is to visit all targets:
In this case, A* may not even be the best algorithm to use. You can use A* with the heuristic described above to first find the shortest path to the closest target. Then, upon reaching that first (closest) target, you can run it again from there to find the next closest target, and so on.
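A sketch of that greedy strategy, assuming a hypothetical astar(maze, position, targets) that returns the path to whichever of the given targets it reaches first (not a real library call):

```python
def greedy_multi_target(maze, start, targets, astar):
    # Repeatedly run A* from the current position to the nearest
    # remaining target. Fast, but not guaranteed optimal overall
    # (see the caveat below).
    remaining = set(targets)
    position, full_path = start, [start]
    while remaining:
        leg = astar(maze, position, remaining)  # path ending at the closest target
        full_path.extend(leg[1:])               # drop the duplicated start node
        position = leg[-1]
        remaining.discard(position)
    return full_path
```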
But this may not give an optimal overall solution. It is possible that it is beneficial to visit the second-closest target first (for example), because in some cases that enables a cheaper follow-up path to the remaining targets.
If you want an optimal overall solution, you will probably want to look at a different approach altogether. Your problem is similar to the Travelling Salesman Problem (though not exactly the same, since I think you are allowed to visit the same point multiple times, for example). This link may also be able to help you.

Related

Any-goal bidirectional A* pathfinding reference

(reposted from cs.stackexchange since I got no answers or comments)
I want to solve the problem of finding the shortest path on a directed weighted graph from a certain node to any of a specified set of destination nodes (preferably the closest one, but that is not so important). The standard (I believe) way to do this with the A* algorithm is to use a distance-to-closest-goal heuristic (which is admissible) and exit as soon as any of the goal nodes is reached.
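Concretely, that standard forward search is only a small change to plain A*; a sketch, where neighbors(node) yields (successor, edge_weight) pairs and h is the distance-to-closest-goal estimate (both placeholders for the problem-specific parts):

```python
import heapq
from itertools import count

def multi_goal_astar(start, goals, neighbors, h):
    # h(n) must not overestimate the distance from n to the *closest*
    # goal, e.g. the minimum over per-goal admissible estimates.
    tie = count()  # tie-breaker so the heap never compares node objects
    frontier = [(h(start), 0, next(tie), start)]
    best_g = {start: 0}
    while frontier:
        f, g, _, node = heapq.heappop(frontier)
        if node in goals:
            return g                   # the first goal popped is the closest one
        if g > best_g.get(node, float("inf")):
            continue                   # stale heap entry
        for nxt, w in neighbors(node):
            ng = g + w
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(frontier, (ng + h(nxt), ng, next(tie), nxt))
    return None                        # no goal is reachable
```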
However, in my scenario (which is game AI, if that matters) some (or all) of the goals might be unreachable; furthermore, the set of nodes reachable from such goals is typically quite small (or, at least, I want to optimize in that particular case). For the case of a single goal, bidirectional search sounds promising: the reverse search direction would quickly exhaust all reachable nodes and conclude that no path exists. These slides by Andrew Goldberg et al. describe the bidirectional A* algorithm with proper conditions on the heuristics, as well as stopping conditions.
My question is: is there a way to combine these two approaches, i.e. to perform bidirectional A* to find path to any of a specified set of goal nodes? I'm not sure what heuristic function to choose for the reverse search direction, what are the stopping conditions, etc. Googling for anything on this topic didn't get me anywhere either.

Compare Depth first branch and bound and IDA* search algorithm

I want to compare and understand the precise differences between depth-first branch and bound and the IDA* algorithm. I browsed the internet but I am unable to find clear explanations. Please help!
IDA* does an f-cost-limited depth-first search, pruning paths that are more expensive (according to the lower-bound heuristic) than the current cost bound. It gradually increases the bound until a solution is found.
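A compact Python sketch of that contour loop (is_goal, neighbors, and h stand in for the problem-specific parts):

```python
from math import inf

def ida_star(start, is_goal, neighbors, h):
    # Iterative deepening on the f = g + h cost bound.
    bound = h(start)
    path = [start]

    def search(g, bound):
        node = path[-1]
        f = g + h(node)
        if f > bound:
            return f                    # prune; report the smallest overshoot
        if is_goal(node):
            return True
        minimum = inf
        for nxt, w in neighbors(node):
            if nxt in path:             # avoid cycles along the current path
                continue
            path.append(nxt)
            t = search(g + w, bound)
            if t is True:
                return True             # leave the solution on `path`
            minimum = min(minimum, t)
            path.pop()
        return minimum

    while True:
        t = search(0, bound)
        if t is True:
            return path                 # root-to-goal path
        if t == inf:
            return None                 # no solution exists
        bound = t                       # raise the bound to the next f-contour
```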
DFBnB searches through a tree while keeping track of the best solution found so far, gradually decreasing the cost of that incumbent solution until it is optimal. DFBnB also uses a lower-bound heuristic to prune any paths that are more expensive than the current best solution.
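And a matching sketch of DFBnB's incumbent-based pruning, assuming a finite search tree (so no cycle checking is needed):

```python
from math import inf

def dfbnb(start, is_goal, neighbors, h):
    # Depth-first branch and bound: keep the cheapest solution found so
    # far (the incumbent) and prune any path whose lower bound g + h(n)
    # already meets or exceeds it.
    best = [inf]

    def visit(node, g):
        if g + h(node) >= best[0]:
            return                      # prune against the incumbent
        if is_goal(node):
            best[0] = g                 # new, cheaper incumbent
            return
        for nxt, w in neighbors(node):
            visit(nxt, g + w)

    visit(start, 0)
    return best[0]                      # inf if no solution was found
```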
Some algorithms, like Budgeted Tree Search, do both kinds of pruning, using both the current cost bound and the best solution found so far.

Hypothesis search tree

I have an object with many fields. Each field has a different range of values. I want to use Hypothesis to generate different instances of this object.
Is there a limit to the number of combinations of field values Hypothesis can handle? And what does the search tree Hypothesis creates look like? I don't need all the combinations, but I want to make sure that I get a fair number of them, testing many different values for each field. I want to make sure Hypothesis is not doing a DFS until it hits the maximum number of examples to generate.
TLDR: don't worry, this is a common use-case and even a naive strategy works very well.
The actual search process used by Hypothesis is complicated (as in, "lead author's PhD topic"), but it's definitely not a depth-first search! Briefly, it's a uniform distribution layered on a pseudo-random number generator, with a coverage-guided fuzzer biasing that towards less-explored code paths, and strategy-specific heuristics on top of that.
In general, I trust this process to pick good examples far more than I trust my own judgement, or that of anyone without years of experience in QA or testing research!
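To make the "naive strategy" concrete: give each field its own strategy and let st.builds combine them. A sketch with made-up field names and ranges:

```python
from dataclasses import dataclass

from hypothesis import given, strategies as st

@dataclass
class Order:          # an illustrative object with several fields
    quantity: int
    price: float
    note: str

order_strategy = st.builds(
    Order,
    quantity=st.integers(min_value=1, max_value=1_000),
    price=st.floats(min_value=0.0, max_value=1e6, allow_nan=False),
    note=st.text(max_size=50),
)

@given(order_strategy)
def test_order_invariants(order):
    # Hypothesis draws diverse combinations of the fields, up to
    # max_examples (100 by default), rather than enumerating them.
    assert 1 <= order.quantity <= 1_000
```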

How to turn TSP into a minimum Hamiltonian path?

I'm trying to solve this problem http://coj.uci.cu/24h/problem.xhtml?abb=1368.
After a lot of research and a lot of time spent, I was able to implement a branch-and-bound algorithm for TSP, which finds a tour that passes through all points and returns to the start.
I was thinking that removing the longest edge from that tour would give the answer, but just as I finished my algorithm I discovered that this isn't true in all cases, after reading this question: Minimal Distance Hamiltonian Path Javascript.
I've found some answers saying that adding a dummy point with zero distance to every other point, and then removing it, solves the problem, but I don't know the specifics. I've already added that dummy point, and instead of 26.01 I now get 16.23 as the answer. I haven't removed the dummy point yet, because I don't understand "the whole point of adding the dummy point".
Can you guide me through solving this? Or is it better to take another approach instead of TSP?
The dummy point allows the connection between the two ends of the path to be at an arbitrarily large distance, since it costs zero. In the TSP, the two ends would otherwise have to lie very close to each other in order to minimize the total distance. In your path problem this requirement does not exist, so a plain TSP optimum is subject to a constraint that is not valid for your problem and thus may not be an optimum for your path problem.
If you introduce a dummy point (or think of it as a shortcut, a wormhole), your ends may lie far apart without affecting your total distance.
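A sketch of the whole transformation on a distance matrix, assuming a hypothetical tsp_solve(matrix) that returns an optimal tour as a list of city indices:

```python
def shortest_hamiltonian_path(dist, tsp_solve):
    # dist: n x n distance matrix. Add a dummy city (index n) at zero
    # distance to every real city, solve the TSP, then cut the tour
    # open at the dummy: the two cities adjacent to it become the free
    # endpoints of the minimum Hamiltonian path.
    n = len(dist)
    extended = [row + [0] for row in dist] + [[0] * (n + 1)]
    tour = tsp_solve(extended)         # e.g. [3, 0, n, 2, 1] (cyclic)
    i = tour.index(n)
    return tour[i + 1:] + tour[:i]     # rotate the dummy to the front, drop it
```

The two zero-cost edges into the dummy are exactly the tour's "free" connection between the path's endpoints, which would explain why your answer dropped from 26.01 to 16.23 once you added it: the dummy's edges contribute nothing, and removing the dummy from the tour leaves the path itself.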

String Matching Algorithms

I have a Python app with a database of businesses and I want to be able to search for businesses by name (for autocomplete purposes).
For example, consider the names "best buy", "mcdonalds", "sony" and "apple".
I would like "app" to return "apple", as well as "appel" and "ple".
"Mc'donalds" should return "mcdonalds".
"bst b" and "best-buy" should both return "best buy".
Which algorithm am I looking for, and does it have a Python implementation?
Thanks!
The Levenshtein distance should do.
Look around - there are implementations in many languages.
Levenshtein distance will do this.
Note: this is a distance, so you have to calculate it against every string in your database, which can be a big problem if you have a lot of entries.
If you have this problem, then record all the typos the users make (typo = no direct match) and build an offline correction database containing all the typo->fix mappings. Some companies do this even more cleverly, e.g. Google watches how users correct their own typos and learns the mappings from that.
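For reference, a textbook dynamic-programming implementation of the distance (packages such as python-Levenshtein provide much faster C versions):

```python
def levenshtein(a, b):
    # Classic row-by-row DP; prev[j] is the edit distance between the
    # current prefix of `a` and the first j characters of `b`.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free on a match)
            ))
        prev = cur
    return prev[-1]

# levenshtein("appel", "apple") == 2; rank candidates by this score.
```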
Soundex or Metaphone might work.
I think what you are looking for belongs to the huge field of Data Quality and Data Cleansing. I doubt you will find a ready-made Python implementation for this, since it usually means cleansing a considerable amount of data in a database, which tends to be of business value.
Levenshtein distance goes in the right direction but only half the way. There are several tricks to get it to use partial matches as well.
One would be to use a subsequence dynamic time warping (DTW is actually a generalization of Levenshtein distance). For this you relax the start and end cases when calculating the cost matrix. If you only relax one of the conditions you can get autocompletion with spell checking. I am not sure if there is a Python implementation available, but if you want to implement it yourself it should not be more than 10-20 LOC.
The other idea would be to use a trie for speed-up, which can run DTW/Levenshtein on multiple entries simultaneously (a huge speedup if your database is large). There is an IEEE paper on Levenshtein on tries, so you can find the algorithm there. Again, for this you would need to relax the final boundary condition so you get partial matches. However, since you step down the trie, you just need to check when you have fully consumed the input and then return all leaves.
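To make the relaxed final boundary concrete for autocomplete: score the query against the best-matching prefix of each candidate by taking the minimum over the last DP column instead of only its bottom cell, so trailing candidate characters cost nothing. A sketch:

```python
def prefix_levenshtein(query, candidate):
    # Edit distance from `query` to the closest *prefix* of `candidate`.
    m = len(query)
    prev = list(range(m + 1))       # row for the empty candidate prefix
    best = prev[m]
    for i, cc in enumerate(candidate, 1):
        cur = [i]
        for j, qc in enumerate(query, 1):
            cur.append(min(
                prev[j] + 1,                 # skip a candidate char
                cur[j - 1] + 1,              # skip a query char
                prev[j - 1] + (cc != qc),    # substitute (free on a match)
            ))
        best = min(best, cur[m])    # candidate prefix of length i
        prev = cur
    return best

# prefix_levenshtein("app", "apple") == 0    -> "app" autocompletes to "apple"
# prefix_levenshtein("bst b", "best buy") == 1
```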
Check out difflib (http://docs.python.org/library/difflib.html); it should help you.
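A quick example of what difflib gives you out of the box:

```python
import difflib

businesses = ["best buy", "mcdonalds", "sony", "apple"]

# Ratio-based fuzzy matching from the standard library; `cutoff` is the
# minimum similarity (0..1) a candidate must reach to be returned.
print(difflib.get_close_matches("appel", businesses, n=3, cutoff=0.6))
# -> ['apple']
print(difflib.get_close_matches("mc'donalds", businesses, n=3, cutoff=0.6))
# -> ['mcdonalds']
```

Note that get_close_matches compares whole strings, so for very short or heavily misspelled autocomplete queries you may need to lower the cutoff or combine it with a prefix index.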
