Any-goal bidirectional A* pathfinding reference

(reposted from cs.stackexchange since I got no answers or comments)
I want to solve the problem of finding a shortest path on a directed weighted graph from a certain node to any of a specified set of destination nodes (preferably the closest one, but that's not that important). The standard (I believe) way to do this with the A* algorithm is to use a distance-to-closest-goal heuristic (which is admissible) and exit as soon as any of the goal nodes is reached.
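For concreteness, that standard approach looks roughly like this (the graph encoding and all names here are illustrative, not from any particular library):

    import heapq

    def astar_any_goal(graph, start, goals, h):
        # graph: {node: [(neighbour, edge_cost), ...]}, goals: set of nodes,
        # h(n, g): admissible estimate of the distance from n to goal g.
        def h_any(n):
            # The minimum over per-goal heuristics stays admissible.
            return min(h(n, g) for g in goals)

        dist = {start: 0.0}
        frontier = [(h_any(start), start)]
        while frontier:
            f, node = heapq.heappop(frontier)
            if node in goals:
                # With a consistent heuristic, the first goal popped is the closest.
                return dist[node], node
            if f > dist[node] + h_any(node):
                continue  # stale queue entry
            for nbr, cost in graph.get(node, ()):
                nd = dist[node] + cost
                if nd < dist.get(nbr, float("inf")):
                    dist[nbr] = nd
                    heapq.heappush(frontier, (nd + h_any(nbr), nbr))
        return None  # no goal is reachable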
However, in my scenario (which is game AI, if that matters) some (or all) of the goals might be unreachable; furthermore, the set of nodes reachable from such goals is typically quite small (or, at least, I want to optimize in that particular case). For the case of a single goal, bidirectional search sounds promising: the reverse search direction would quickly exhaust all reachable nodes and conclude that no path exists. These slides by Andrew Goldberg et al. describe the bidirectional A* algorithm with proper conditions on the heuristics, as well as stopping conditions.
My question is: is there a way to combine these two approaches, i.e. to perform bidirectional A* to find a path to any of a specified set of goal nodes? I'm not sure what heuristic function to choose for the reverse search direction, what the stopping conditions are, etc. Googling for anything on this topic didn't get me anywhere either.

Related

How does DenseLayout in qiskit's transpiler work?

I'm looking for an explanation of the Dense Layout algorithm used by qiskit's transpiler.
I've looked at the source code, but I still don't understand what """Choose a Layout by finding the most connected subset of qubits""" means!
Is there a paper about this kind of mapping algorithm, or another resource I can learn about it from?
It does a breadth-first search for a connected subset starting at each qubit. The subset with the most connectivity is selected. Due to symmetry there are many subsets with the same connectivity; however, it also looks at the noise in the device and picks the subset with the least amount of noise. Finally, that set is run through a reverse Cuthill-McKee traversal to reorder the qubits in the set for a lower degree.
There is no paper on it as I came up with it to solve a bug in earlier versions of the Qiskit swap mapper.
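In sketch form, the idea is roughly the following (this is not the actual Qiskit implementation, and it omits the noise-aware tie-breaking step):

    from collections import deque
    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import reverse_cuthill_mckee

    def dense_subset(coupling_edges, num_physical, num_needed):
        adj = [[] for _ in range(num_physical)]
        for u, v in coupling_edges:
            adj[u].append(v)
            adj[v].append(u)

        best_subset, best_edges = None, -1
        for start in range(num_physical):
            # Breadth-first search until num_needed qubits are collected.
            seen, order, queue = {start}, [], deque([start])
            while queue and len(order) < num_needed:
                q = queue.popleft()
                order.append(q)
                for nbr in adj[q]:
                    if nbr not in seen:
                        seen.add(nbr)
                        queue.append(nbr)
            if len(order) < num_needed:
                continue
            subset = set(order)
            # Connectivity = number of coupling edges inside the subset.
            inside = sum(1 for u, v in coupling_edges if u in subset and v in subset)
            if inside > best_edges:
                best_edges, best_subset = inside, order

        # Reverse Cuthill-McKee reordering of the chosen subset.
        n = len(best_subset)
        idx = {q: i for i, q in enumerate(best_subset)}
        rows, cols = [], []
        for u, v in coupling_edges:
            if u in idx and v in idx:
                rows += [idx[u], idx[v]]
                cols += [idx[v], idx[u]]
        mat = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))
        perm = reverse_cuthill_mckee(mat)
        return [best_subset[i] for i in perm]

    # Example: pick 4 qubits on a 2x3 grid coupling map.
    print(dense_subset([(0, 1), (1, 2), (0, 3), (1, 4), (2, 5), (3, 4), (4, 5)], 6, 4))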

Hypothesis search tree

I have an object with many fields. Each field has a different range of values. I want to use Hypothesis to generate different instances of this object.
Is there a limit to the number of combinations of field values Hypothesis can handle? And what does the search tree Hypothesis creates look like? I don't need all the combinations, but I want to make sure I get a fair number of combinations that test many different values for each field. I want to make sure Hypothesis is not doing a DFS until it hits the max number of examples to generate.
TLDR: don't worry, this is a common use-case and even a naive strategy works very well.
The actual search process used by Hypothesis is complicated (as in, "lead author's PhD topic"), but it's definitely not a depth-first search! Briefly, it's a uniform distribution layered on a pseudo-random number generator, with a coverage-guided fuzzer biasing that towards less-explored code paths, and strategy-specific heuristics on top of that.
In general, I trust this process to pick good examples far more than I trust my own judgement, or that of anyone without years of experience in QA or testing research!
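For instance, a naive strategy for such an object could look like this (the Config class and its field ranges are made up for the example):

    from dataclasses import dataclass
    from hypothesis import given, strategies as st

    @dataclass
    class Config:
        retries: int
        timeout: float
        name: str

    config_strategy = st.builds(
        Config,
        retries=st.integers(min_value=0, max_value=10),
        timeout=st.floats(min_value=0.1, max_value=60.0),
        name=st.text(min_size=1, max_size=20),
    )

    @given(config_strategy)
    def test_config_invariants(cfg):
        # Hypothesis varies all fields jointly rather than enumerating
        # combinations depth-first.
        assert 0 <= cfg.retries <= 10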

How to adapt the A* algorithm to support searching for multiple targets in a maze

If I have an A* function that finds the optimal path from a starting point to a target in a maze, how should I modify the heuristic function so that it remains admissible and the search still returns the optimal result when there are multiple targets?
Assuming that the problem is to visit only one target:
The first solution that comes to mind is to loop over all possible targets, compute the admissible heuristic value for each of them, and then finally return the minimum of those values as the final heuristic value. That way you're sure that the heuristic is still admissible (does not overestimate the distance to any of the targets).
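For example, on a grid maze where Manhattan distance is the single-target admissible heuristic, this looks like:

    def multi_target_heuristic(node, targets):
        # Admissible as long as each per-target estimate is admissible.
        x, y = node
        return min(abs(x - tx) + abs(y - ty) for tx, ty in targets)

    print(multi_target_heuristic((0, 0), [(3, 4), (1, 1)]))  # -> 2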
EDIT: Now assuming that the problem is to visit all targets:
In this case, A* may not even be the best algorithm to use. You can use A* with the heuristic described above to first find the shortest path to the closest target. Then, upon reaching that first (closest) target, you can run it again in the same way to find the next closest target, and so on.
But this may not give an optimal overall solution. It is possible that it is beneficial to visit the second-closest target first, because in some cases that may enable a cheaper follow-up path to the remaining targets.
If you want an optimal overall solution, you'll probably want to look at a different overall approach. Your problem is similar to the Travelling Salesman Problem (though not exactly the same, since, for example, I think you're allowed to visit the same point multiple times). This link may also be able to help you.

Close-Enough TSP implementation

I'm looking for a solution to a Close-Enough Traveling Salesman Problem (CETSP), where I have a set of nodes, each of which I need to pass within a certain distance of, ideally along a route that is as close to optimal as possible. I've found a couple of sources on approaches to this TSP variant, but was unable to find a solver or an algorithm that I could easily use.
Do you have any suggestions for how I can go about getting a solution to my CETSP problem, whether by implementing something myself or by using an existing solver?
You can try using UFFLP. They have an example where you can find the optimal coordinates the salesman should pass through, given a predetermined sequence. So you can generate thousands of sequences and choose the best one (just a simple heuristic).
Have a look at http://www.gapso.com.br/en/ufflp-en/
You will find useful information.
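If you want to prototype the sequence-sampling idea yourself, here is a rough sketch (UFFLP is not used here, and the midpoint-projection refinement is just one simple way to place the touch points):

    import math
    import random

    def clamp_to_disk(px, py, cx, cy, r):
        # Closest point of the disk (cx, cy, r) to (px, py).
        dx, dy = px - cx, py - cy
        d = math.hypot(dx, dy)
        if d <= r:
            return px, py
        return cx + dx * r / d, cy + dy * r / d

    def tour_length(points):
        return sum(math.dist(points[i], points[(i + 1) % len(points)])
                   for i in range(len(points)))

    def refine(disks, order, iters=50):
        # For a fixed visiting order, nudge each touch point toward the
        # midpoint of its tour neighbours, clamped onto its disk.
        pts = [(disks[i][0], disks[i][1]) for i in order]
        for _ in range(iters):
            for k, i in enumerate(order):
                prev_pt, next_pt = pts[k - 1], pts[(k + 1) % len(pts)]
                mid = ((prev_pt[0] + next_pt[0]) / 2, (prev_pt[1] + next_pt[1]) / 2)
                pts[k] = clamp_to_disk(*mid, *disks[i])
        return tour_length(pts)

    def cetsp_heuristic(disks, samples=1000):
        # Sample many random sequences and keep the best refined tour.
        best = float("inf")
        for _ in range(samples):
            order = random.sample(range(len(disks)), len(disks))
            best = min(best, refine(disks, order))
        return best

    # Example: three disks given as (center_x, center_y, radius).
    print(cetsp_heuristic([(0, 0, 1), (10, 0, 1), (5, 8, 1)], samples=200))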

How do I classify this value using a decision tree

Basically my decision tree can't classify a value using the normal algorithm.
I get to a node, and there are two options (say, sunny and windy), but at this node my value is different (for example, rainy).
Are there any methods to deal with this, e.g. change the tree or just estimate based on other data?
I was thinking of assigning the most common value at that node but this is just a guess.
Have you considered fuzzy logic for the rich/poor continuum? As for things that can't be expressed as a continuum, I can't think of a way it can be done. Rainy weather, for example, is so fundamentally different from sunny and windy weather in how we experience and react to it that I'm not sure how you expect a computer (or whatever it is you're writing your decision tree for) to figure out what to do. (Aside from simply having an "I don't know what to do" output state, but I'm assuming you wanted something more meaningful than that.)
The whole point of decision trees is that the options are complete and (hopefully) mutually exclusive.
If they are not, you'll get into trouble. Redefine poor and rich to cover everything (all incomes, all states of mind...).
But honestly, interpret such weather examples as what they are: just examples of a concept, not the holy grail of meteorology.
The issue here is that you've learned a decision tree from data different from the data you are using for classification. More specifically, your decision tree knows only two values (i.e., sunny and windy) for the attribute Weather, but your classification data also allows the value rainy.
Since your decision tree has no observations where the weather was rainy, this value is useless to it. In other words, you have to eliminate this value from your classification data.
The only solution is to do data cleaning before using the decision tree as a classifier.
You have several options (see the sketch after this list):
1. Remove all observations/instances with Weather="rainy" from your data set, because you can't classify them. The disadvantage is that those instances are never classified.
2. For all observations/instances with Weather="rainy", remove the value, i.e. set it to unknown/null. If your decision tree can handle null values, it can then classify your whole data set. If not, you still have a problem; in that case you should go for option 3.
3. Relearn your decision tree with Weather={sunny, windy, rainy}.
4. Replace "rainy" with either "sunny" or "windy"; there are different heuristics for that. (In your case, though, this is not really an option.)
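A hypothetical pandas sketch of options 1 and 2 (column and value names follow the weather example above):

    import pandas as pd

    df = pd.DataFrame({"Weather": ["sunny", "windy", "rainy"],
                       "Play":    ["yes",   "no",    "yes"]})

    # Option 1: drop the instances the tree cannot classify.
    option1 = df[df["Weather"] != "rainy"]

    # Option 2: keep the instances but mark the unseen value as missing.
    option2 = df.assign(Weather=df["Weather"].replace("rainy", pd.NA))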
You are talking about the "normal algorithm", which is a rather vague statement. I assume you are using a strictly binary rooted decision tree, where each internal node makes a binary split of the data: the condition at each internal node evaluates to a Boolean, sending data to the left child (true) or the right child (false). In your case, the categorical variable weather has two possible values in the training data, which allows only two possible split conditions: weather==sunny or weather==windy. Hence, the rainy samples will always end up in the right (false) branch, as rainy is neither sunny nor windy.
In such a tree, the rainy samples will be classified as "not sunny, not windy".
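A toy illustration of that fall-through behaviour (the class labels here are made up):

    def classify(weather):
        if weather == "sunny":        # split 1: weather == sunny?
            return "play outside"
        if weather == "windy":        # split 2: weather == windy?
            return "fly a kite"
        # Any unseen value, e.g. "rainy", falls through every False branch.
        return "default leaf: not sunny, not windy"

    for w in ("sunny", "windy", "rainy"):
        print(w, "->", classify(w))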
