How to implement a best-first search in Haskell?

Is there a way to do a best-first search efficiently in Haskell? I want to maintain a dynamic tree. I have a heuristic function which can compute a number for each node and a successor function which will return a list of children of a node.
At every step, I want to take the leaf-node with the best heuristic and replace it with its children. This process is repeated until I get a node which is 'good enough'.
For this, I need to maintain a priority queue of the leaves. But is there a way by which I can go from a leaf in the priority queue to its position in the tree efficiently so that I can modify the tree?

One possible answer is to use Haskell's sneakiest form of mutability: laziness. If you lazily generate the entire tree (even if it's infinite) and then repeatedly view different points in the tree according to your priority queue, you will only ever force as much of the tree as is needed to perform your best-first search.
You'll still pay for repeated traversals of the lower branches of the tree, but perhaps you can change the structure of your search somehow to do better.
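A small sketch of this idea (the names `bestFirst` and `demoTree` are made up for illustration): the tree is generated lazily with `unfoldTree`, and a sorted list stands in for a real priority queue, so only the parts of the tree the search actually visits are ever forced.

```haskell
import Data.List (insertBy)
import Data.Ord (comparing)
import Data.Tree (Tree (..), unfoldTree)

-- Search a lazily generated (possibly infinite) tree, always expanding
-- the frontier node with the smallest heuristic value first. The sorted
-- list is a stand-in for a proper priority queue.
bestFirst :: Ord h => (a -> h) -> (a -> Bool) -> Tree a -> Maybe a
bestFirst heur done = go . (: [])
  where
    go [] = Nothing
    go (Node x kids : rest)
      | done x    = Just x
      | otherwise = go (foldr insert rest kids)
    insert = insertBy (comparing (heur . rootLabel))

-- An infinite binary tree of numbers, n -> (2n, 2n+1); the search
-- below only forces the fragment of it that it actually visits.
demoTree :: Tree Int
demoTree = unfoldTree (\n -> (n, [2 * n, 2 * n + 1])) 1
```

For example, `bestFirst (\n -> abs (1000 - n)) (\n -> n `mod` 1000 == 0) demoTree` explores only a few thousand nodes of the infinite tree before finding 1000.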

You shouldn't need an explicit tree for a best first search. Just the priority queue should do. Make the nodes of the queue carry as much context as necessary to compute the successor nodes and cost function. If you really need the tree, then as others have said, zippers are the way to go. You'd keep zippers in your priority queue rather than just tree nodes. Since the priority queue holds all the leaves, you shouldn't need to modify a shared tree.
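To sketch the tree-free version (the name `search` is illustrative): each queue entry carries the state itself, which is all the context the successor and heuristic functions need. `Data.Set` of (priority, state) pairs serves as a simple priority queue and deduplicates repeated entries for free.

```haskell
import qualified Data.Set as Set

-- Best-first search with no explicit tree: the frontier holds
-- (priority, state) pairs, and each state generates its own successors.
search :: (Ord c, Ord s)
       => (s -> c)      -- heuristic: smaller is better
       -> (s -> Bool)   -- goal test
       -> (s -> [s])    -- successor function
       -> s             -- start state
       -> Maybe s
search heur goal next start = go (Set.singleton (heur start, start))
  where
    go frontier = case Set.minView frontier of
      Nothing -> Nothing
      Just ((_, s), rest)
        | goal s    -> Just s
        | otherwise -> go (foldr ins rest (next s))
    ins s = Set.insert (heur s, s)
```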

Check out the weighted-search package, which implements a priority search monad. It allows you to express the solutions you want, adding weight along the way to indicate when you know that a solution has increased in expense, and the package will find the least-weight solutions first. In your case, create the whole tree (possibly infinite) you want to search, and then use weighted-search to construct all the paths through that tree and give weights to them. Then you will get the least-weight paths first.

Related

How would you implement a heuristic search in Haskell?

In Haskell or some other functional programming language, how would you implement a heuristic search?
Take as an example search space the nine-puzzle, that is, a 3x3 grid with 8 tiles and 1 hole, where you move tiles into the hole until you have correctly assembled a picture. The heuristic is the "Manhattan heuristic", which evaluates a board position by adding up the distance of each tile from its target position, where a tile's distance is the number of squares horizontally plus the number of squares vertically it needs to be moved to reach its correct location.
I have been reading John Hughes's paper on pretty-printing, as I know that the pretty printer back-tracks to find better solutions. I am trying to understand how to generalise heuristic search along these lines.
Note that my ultimate aim here is not to write a solver for the 9-puzzle, but to learn some general techniques for writing efficient heuristic searches in FP languages. I am also interested to learn if there is code that can be generalised and re-used across a wider class of such problems, rather than solving any specific problem.
For example, a search space can be characterised by a function that maps a State to a List of States together with some 'operation' that describes how one state is transitioned into another. There could also be a goal function, mapping a State to Bool, indicating when a goal State has been reached. And of course, the heuristic function mapping a State to a Number reflecting how well it is estimated to score. Other descriptions of the search are possible.
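That characterisation translates directly into a Haskell record, shown here with an illustrative Manhattan-distance function for the nine-puzzle (all names are made up; a real state type would also track which tile is where):

```haskell
-- A search problem, as characterised above: a successor function,
-- a goal test, and a heuristic estimate over some state type s.
data Problem s = Problem
  { successors :: s -> [s]
  , isGoal     :: s -> Bool
  , heuristic  :: s -> Int
  }

-- For the nine-puzzle, a state could include each tile's (row, column)
-- position; the Manhattan heuristic sums per-tile grid distances.
manhattan :: [(Int, Int)] -> [(Int, Int)] -> Int
manhattan current target =
  sum [ abs (r1 - r2) + abs (c1 - c2)
      | ((r1, c1), (r2, c2)) <- zip current target ]
```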
I don't think it's necessarily very specific to FP or Haskell (unless you utilize lists as "multiple possibility" monads, as in Learn You A Haskell For Great Good).
One way to do it would be by writing a recursive function taking the following:
the current state (that is the board configuration)
possibly some path metadata, e.g., the number of steps from the initial configuration (which is just the recursion depth), or a memoization-map of all the states already considered
possibly some decision metadata, e.g., a pseudo-random number generator
Within each recursive call, the function would take the state and check whether it is the required result. If not, it would:
if it uses a memoization map, check if a choice was already considered
If it uses a recursive-step count, check whether to pursue the choices further
If it decides to recursively call itself on the possible choices emanating from this state (e.g., if there are different tiles which can be pushed into the hole), it could do so in the order based on the heuristic (or possibly pseudo-randomly based on the order based on the heuristic)
The function would return whether it succeeded, and, if they are used, updated versions of the memoization map and/or pseudo-random number generator.
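A sketch of the recursive function just described, in Haskell (the name `solve` and the plumbing choices are illustrative; for simplicity the visited set is threaded down each branch rather than returned back up, so states are not shared across sibling branches as a full memoization map would allow):

```haskell
import Data.List (sortOn)
import qualified Data.Set as Set

-- Depth-limited recursive search: memoise states seen on the current
-- path, and try children in heuristic order (best first).
solve :: Ord s
      => (s -> [s])     -- possible moves from a state
      -> (s -> Int)     -- heuristic: smaller is better
      -> (s -> Bool)    -- goal test
      -> Int            -- maximum recursion depth
      -> s
      -> Bool
solve moves heur goal = go Set.empty
  where
    go _ 0 s = goal s                      -- depth budget exhausted
    go seen depth s
      | goal s              = True
      | s `Set.member` seen = False        -- already considered
      | otherwise =
          any (go (Set.insert s seen) (depth - 1))
              (sortOn heur (moves s))
```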

Efficient algorithm for grouping array of strings by prefixes

I wonder what is the best way to group an array of strings according to a list of prefixes (of arbitrary length).
For example, if we have this:
prefixes = ['GENERAL', 'COMMON', 'HY-PHE-NATED', 'UNDERSCORED_']
Then
tasks = ['COMMONA', 'COMMONB', 'GENERALA', 'HY-PHE-NATEDA', 'UNDERSCORED_A', 'HY-PHE-NATEDB']
Should be grouped this way:
[['GENERALA'], ['COMMONA', 'COMMONB'], ['HY-PHE-NATEDA', 'HY-PHE-NATEDB'], ['UNDERSCORED_A']]
The naïve approach is to loop through all the tasks and inner loop through prefixes (or vice versa, whatever) and test each task for each prefix.
Can one give me a hint how to make this in a more efficient way?
It depends a bit on the size of your problem, of course, but your naive approach should be okay if you sort both your prefixes and your tasks and then build your sub-arrays by traversing both sorted lists only forwards.
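The sort-then-walk idea can be sketched in Haskell (the name `groupByPrefixes` is illustrative; this assumes, as in the example, that no prefix is itself a prefix of another, and note that the output groups come back in sorted prefix order rather than the original order):

```haskell
import Data.List (isPrefixOf, sort)

-- Sort both lists, then consume the sorted tasks in one forward pass:
-- for each prefix, skip tasks that sort before it, then take the
-- contiguous run of tasks that match it. Non-matching tasks are dropped.
groupByPrefixes :: [String] -> [String] -> [[String]]
groupByPrefixes prefixes tasks = go (sort prefixes) (sort tasks)
  where
    go [] _ = []
    go (p : ps) ts =
      let ts'          = dropWhile (< p) ts
          (here, rest) = span (p `isPrefixOf`) ts'
      in here : go ps rest
```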
There are a few options, but you might be interested in looking into the trie data structure.
http://en.wikipedia.org/wiki/Trie
The trie data structure is easy to understand and implement and works well for this type of problem. If you find that it works for your situation, you can also look at Patricia tries, which achieve similar performance characteristics but typically have better memory utilization. They are a little more involved to implement, but not overly complex.
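A minimal trie over the prefix set might look like this in Haskell (names like `insertT` and `matchT` are illustrative): walking a task through the trie character by character finds its matching prefix in time proportional to the task's length, independent of how many prefixes there are.

```haskell
import qualified Data.Map as Map

-- A trie node: an optional end-of-prefix marker (carrying the full
-- prefix that ends here) and a map from characters to child nodes.
data Trie = Trie { end :: Maybe String, kids :: Map.Map Char Trie }

emptyT :: Trie
emptyT = Trie Nothing Map.empty

insertT :: String -> Trie -> Trie
insertT w = go w
  where
    go [] t = t { end = Just w }
    go (c : cs) t =
      let child = Map.findWithDefault emptyT c (kids t)
      in t { kids = Map.insert c (go cs child) (kids t) }

-- Which prefix (if any) does this task start with?
matchT :: Trie -> String -> Maybe String
matchT t _ | Just p <- end t = Just p
matchT _ [] = Nothing
matchT t (c : cs) = case Map.lookup c (kids t) of
  Nothing -> Nothing
  Just t' -> matchT t' cs
```

Grouping is then one trie build plus one `matchT` per task.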

How to define a tree-like DAG in Haskell

What is the best way to define a directed acyclic graph (DAG) of strings, with a single root, in Haskell?
I especially need to apply the following two functions on this data structure as fast as possible:
Find all (direct and indirect) ancestors of one element (including the parents of the parents etc.).
Find all (direct) children of one element.
I thought of [(String,[String])] where each pair is one element of the graph consisting of its name (String) and a list of strings ([String]) containing the names of (direct) parents of this element. The problem with this implementation is that it's hard to do the second task.
You could also use [(String,[String])] again while the list of strings ([String]) contain the names of the (direct) children. But here again, it's hard to do the first task.
What can I do? What alternatives are there? Which is the most efficient way?
EDIT: One more remark: I'd also like it to be defined easily. I have to define the instance of this data type myself "by hand", so I'd like to avoid unnecessary repetitions.
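One lightweight option, sketched here with illustrative names: write only the child relation by hand (avoiding the repetition you mention) and derive the inverted parent map from it, so both directions are a map lookup away.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Store the child relation once; derive the parent relation from it.
type Dag = Map.Map String [String]   -- node -> direct children

parentsOf :: Dag -> Map.Map String [String]
parentsOf dag =
  Map.fromListWith (++)
    [ (child, [parent]) | (parent, cs) <- Map.toList dag, child <- cs ]

children :: Dag -> String -> [String]
children dag n = Map.findWithDefault [] n dag

-- All direct and indirect ancestors, walking the inverted map.
ancestors :: Dag -> String -> Set.Set String
ancestors dag = go Set.empty
  where
    parents = parentsOf dag
    go seen n = foldl step seen (Map.findWithDefault [] n parents)
    step seen p
      | p `Set.member` seen = seen
      | otherwise           = go (Set.insert p seen) p
```

If `ancestors` is called often, precompute `parentsOf dag` once and pass it in instead.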
Have you looked at the graph implementation in Martin Erwig's Functional Graph Library? Each node is represented as a context containing both its children and its parents. See the graph type class for how to access this. It might not be as easy as you requested, but it is already there, well-tested and easy to use. I have used it for more than a decade in a large project.

Iterative octree traversal

I am not able to figure out the procedure for iterative octree traversal though I have tried approaching it in the way of binary tree traversal. For my problem, I have octree nodes having child and parent pointers and I would like to iterate and only store the leaf nodes in the stack.
Also, is going for iterative traversal faster than recursive traversal?
It is indeed like binary tree traversal, but you need to store a bit of intermediate information. A recursive algorithm will not be slower per se, but it uses a bit more stack space for the O(log8 N) recursive calls (about 10 levels for 1 billion elements in the octree).
Iterative algorithms need the same amount of space to be efficient, but you can place it on the heap if you are afraid that your stack might overflow.
Recursively you would do (pseudocode):
function traverse_rec (octree):
    collect value // if there are values in your intermediate nodes
    for child in children:
        traverse_rec (child)
The easiest way to arrive at an iterative algorithm is to use a stack or queue for depth-first or breadth-first traversal:
function traverse_iter_dfs(octree):
    stack = empty
    push_stack(root_node)
    while not empty (stack):
        node = pop(stack)
        collect value(node)
        for child in children(node):
            push_stack(child)
Replace the stack with a queue and you get breadth-first search. However, we are storing something in the region of O(7*(log8 N)) nodes which we have yet to traverse. If you think about it, that's the lesser evil though, unless you need to traverse really big trees. The only other way is to use the parent pointers: when you are done in a child, you need to select the next sibling somehow.
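Since the question asks specifically for collecting only the leaf nodes, here is the explicit-stack traversal above in Haskell (the `Octree` type is a minimal stand-in: a node with no children is a leaf, and a real octree node would have up to eight children):

```haskell
-- Iterative depth-first traversal with an explicit stack,
-- collecting only the leaf values.
data Octree a = Node a [Octree a]

leaves :: Octree a -> [a]
leaves root = go [root] []
  where
    go [] acc = reverse acc
    go (Node v [] : stack) acc = go stack (v : acc)   -- leaf: collect it
    go (Node _ cs : stack) acc = go (cs ++ stack) acc -- push children
```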
If you don't store in advance the index of the current node (with respect to its siblings), you can only search all the nodes of the parent in order to find the next sibling, which essentially doubles the amount of work to be done (for each node you don't just loop through the children but also through the siblings). Also, it looks like you at least need to remember which nodes you visited already, for it is in general undecidable whether to descend farther down or return back up the tree otherwise (prove me wrong somebody).
All in all I would recommend against searching for such a solution.
Depends on what your goal is. Are you trying to find whether a node is visible, if a ray will intersect its bounding box, or if a point is contained in the node?
Let's assume you are doing the last one: checking whether a point is/should be contained in the node. I would add a method to the Octnode that takes a point and checks whether it lies within the bounding box of the Octnode: if it does, return true; else false. From here, call a drill-down method that starts at your head node and checks each child with a simple for loop to see which Octnode the point lies in; it can be at most one.
Here is where your iterative vs recursive algorithm comes into play. If you want iterative, just store the pointer to the current node, and swap this pointer from the head node to the one containing your point. Then just keep drilling down till you reach maximal depth or don't find an Octnode containing it. If you want a recursive solution, then you will call this drill down method on the Octnode that you found the point in.
I wouldn't say that iterative versus recursive has much performance difference in terms of speed, but it could have a difference in terms of memory performance. Each time you recurse you add another call depth onto the stack. If you have a large Octree this could result in a large number of calls, possibly blowing your stack.

Walking a huge game tree

I have a game tree that is too big to walk in its entirety.
How can I write a function that will evaluate the tree until a time limit or depth limit is reached?
It would help to have a bit more detail, I think. Also, you raise two entirely separate issues: do you want both limits applied simultaneously, or are you looking for how to do each independently? That said, in rough terms:
Time limit: This is clearly impossible without using IO, to start with. Assuming your game tree traversal function is largely pure you'd probably prefer not to intertwine it with a bunch of time-tracking that determines control flow. I think the simplest thing here is probably to have the traversal produce a stream of progressively better results, place each "best result so far" into an MVar or such, and run it on a separate thread. If the time limit is reached, just kill the thread and take the current value from the MVar.
Depth limit: The most thorough way to do this would be to simply perform a breadth-first search, wouldn't it? If that's not viable for whatever reason, I don't think there's any better solution than the obvious one of simply keeping a counter to indicate the current depth and not continuing deeper when the maximum is reached. Note that this is a case where the code can potentially be tidied up using a Reader-style monad, where each recursive call is wrapped in something like local (subtract 1).
The timeout function in the base package allows you to kill a computation after a certain period. Interleaving timeout with a stream of increasingly deeper results, such that the most recent result is stored in an MVar, is a relatively common trick for search problems in Haskell.
You can also use a lazy writer monad for your traversal, generating a list of improving answers. Now you've simplified your problem somewhat, to just taking the first "good enough" or "best so far" result from the list by some criteria. On top of that you can use the timeout trick that dons described, or any other approach you think appropriate...
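The timeout trick in miniature (the name `bestWithin` is illustrative): force a lazy list of progressively better results under a time budget, remembering the last one seen. An IORef suffices here because producer and consumer run in the same thread; the MVar-plus-separate-thread version described above follows the same shape.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)
import System.Timeout (timeout)

-- Walk a lazy stream of improving results for at most the given number
-- of microseconds; return the best (i.e. most recent) result seen.
bestWithin :: Int -> [a] -> IO (Maybe a)
bestWithin micros results = do
  best <- newIORef Nothing
  _ <- timeout micros $
         mapM_ (\x -> x `seq` writeIORef best (Just x)) results
  readIORef best
```

The `seq` matters: it forces each result before recording it, so the expensive evaluation happens inside the timed region and can be interrupted.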