Path-finding an unexplored maze - search

I have a tile-based maze. In the beginning I am aware of my coordinates, the coordinates of the target tile and at each point I can tell which of the four adjacent tiles I can move onto. This means that I have to explore the maze as I attempt to solve it to learn more about it.
What would be an appropriate algorithm to solve this graph taking into account the maze-has-to-be-explored twist?

I'd recommend looking into the A* algorithm, Dijkstra's algorithm, the D* algorithm, depth-first search, breadth-first search, and jump point search.
A good demo is found here:
http://qiao.github.io/PathFinding.js/visual/
If you need any help with the implementation, just ask; I primarily do C++ but also Java, C# and Python.
You don't need any knowledge about the maze/grid prior to pathfinding, as you will only be checking the immediate neighbours:
B C C
B A C
C C B
where A = Agent, B = Blocked and C = Clear.
If you were using an A* implementation, you would have two lists:
Open: add all nodes you encounter which are movable (in my example, the C tiles are movable nodes).
Closed: once you decide on your next move, add that node to the Closed list.
Cost is evaluated as F = G + H, where:
G = the actual cost so far (it accumulates as you add each move), e.g. +10 for a vertical/horizontal move and +15 for a diagonal.
H = your heuristic; generally you use something like the Manhattan distance, which is the distance from agent A to destination D taking only horizontal and vertical movement into consideration.
So, as you can see, you have no need to know the map specifics.
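To make the Open/Closed bookkeeping concrete, here is a minimal 4-directional grid A* sketch in Python. The `passable(x, y)` callback is a stand-in for whatever "can I step onto this adjacent tile?" check your maze exposes; the names and the unit step cost are illustrative, not taken from any particular library.

```python
import heapq

def a_star(start, goal, passable):
    """Minimal grid A* with a Manhattan-distance heuristic.
    `passable(x, y)` is assumed to answer "can I move onto this tile?",
    which is the only map knowledge the search needs."""
    def h(p):  # Manhattan distance to the goal
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_heap = [(h(start), 0, start, None)]   # entries: (F = G + H, G, node, parent)
    came_from = {}                             # node -> parent; a node in here is "closed"
    while open_heap:
        f, g, node, parent = heapq.heappop(open_heap)
        if node in came_from:                  # already expanded, skip stale entry
            continue
        came_from[node] = parent
        if node == goal:                       # walk the parents back to the start
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if passable(*nxt) and nxt not in came_from:
                heapq.heappush(open_heap, (g + 1 + h(nxt), g + 1, nxt, node))
    return None                                # goal unreachable
```

For the 3x3 example above, `passable` would return False for the B tiles and True for the C tiles. Since the maze is only revealed as you move, you would re-run the search (or incrementally repair it, which is what D* is designed for) whenever a tile turns out to be blocked.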

How to trace the path that visits all nodes in bfs/dfs

This is similar to How to trace the path in a Breadth-First Search?, but the method described in the answers in that post doesn't work for my case, it seems.
By path here, I essentially mean a sequence of connected nodes to get from a beginning to end state.
Consider an undirected graph with vertices V={a,b,c} and edges = {{a,b},{a,c}}, and assume that we must traverse the successors alphabetically. We start at node a and the end state is to visit all 3 nodes.
Breadth first search would first visit the edge a->b, and then the edge a->c. So the solution path is a->b->a->c. Since there is no edge between b & c, we must go back through a (so we must traverse the edge b->a). In the answer in the above linked post, the accepted solution would only output a->c.
I can't think of a way to modify the conventional bfs algorithm to do this. I have the same question for dfs, but I'd like to start with bfs for now.
It seems strange to want to do this. It's certainly simpler with depth-first search (DFS), which always either follows an edge or backtracks along that edge. In contrast, breadth-first search (BFS) generally does not visit (or backtrack to) a node adjacent to the previous one visited.
Specifically, this part of your question is wrong, and reveals a misconception:
Since there is no edge between b & c, we must go back through a (so we must traverse the edge b -> a).
BFS will never traverse the edge back from b to a in your example. It finishes visiting b, then polls c from the queue and visits c immediately, without "travelling" there via a.
For an analogy, it makes sense to think of DFS as tracing out a path; if you are in a maze, you could use breadcrumbs to mark places you've "visited", and therefore solve the maze by DFS. In contrast, a human cannot solve a maze by BFS because humans cannot have a queue of places they know how to get to, and "teleport" to the next place in the queue. BFS does not meaningfully trace out a path that you could follow by travelling along edges in the graph.
That said, if you really wanted to, you can construct a path visiting the nodes of the graph, such that each node is visited for the first time in the same order as BFS. The simplest way to do this would be to do a BFS to build a list of the nodes in "BFS order", while also building the "BFS tree". Then for each node in BFS order, you can get to it from the previous node going via their lowest common ancestor in the BFS tree. This path only goes via nodes that have already been visited earlier in BFS order.
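A rough sketch of that construction in Python (the names are mine; `graph` is an adjacency-list dict, and successors are visited in sorted order to match the question's alphabetical convention):

```python
from collections import deque

def bfs_order_walk(graph, start):
    """Build a walk along graph edges that visits every node for the first
    time in BFS order, by routing between consecutive nodes via their
    lowest common ancestor in the BFS tree."""
    parent = {start: None}
    order = []
    queue = deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in sorted(graph[u]):           # alphabetical successors
            if v not in parent:
                parent[v] = u
                queue.append(v)

    def ancestors(v):                        # v, v's parent, ..., root
        chain = []
        while v is not None:
            chain.append(v)
            v = parent[v]
        return chain

    walk = [start]
    for prev, nxt in zip(order, order[1:]):
        up = ancestors(prev)
        down = ancestors(nxt)
        lca = next(a for a in up if a in down)
        walk += up[1:up.index(lca) + 1]                 # climb from prev to the LCA
        walk += list(reversed(down[:down.index(lca)]))  # descend from the LCA to nxt
    return walk

# The example from the question: edges {a,b} and {a,c}.
graph = {'a': ['b', 'c'], 'b': ['a'], 'c': ['a']}
print(bfs_order_walk(graph, 'a'))            # ['a', 'b', 'a', 'c']
```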

What is the difference between overlapping subproblems and optimal substructure?

I understand the target approach for both the methods where Optimal Substructure calculates the optimal solution based on an input n while Overlapping Subproblems targets all the solutions for the range of input say from 1 to n.
Take the Rod Cutting Problem, for example. While finding the optimal cut, do we consider each possible cut (so that it can be treated as an Overlapping Subproblems case) and work bottom-up, or do we consider the optimal cut for a given input n and work top-down?
Hence, while they both deal with optimality in the end, what are the exact differences between the two approaches?
I tried referring to this Overlapping Subproblem, Optimal Substructure and this page as well.
On a side note as well, does this relate to the solving approaches of Tabulation(top-down) and Memoization(bottom-up)?
This thread makes a valid point, but I'm hoping it can be broken down into something easier to follow.
To answer your main question: overlapping subproblems and optimal substructure are two different concepts/properties; a problem that has both of these properties can be solved via Dynamic Programming. To understand the difference between them, you first need to understand what each of these terms means in the context of Dynamic Programming.
I understand the target approach for both the methods where Optimal Substructure calculates the optimal solution based on an input n while Overlapping Subproblems targets all the solutions for the range of input say from 1 to n.
This is a poorly worded statement, and you need to familiarize yourself with the basics of Dynamic Programming. Hopefully the following explanation will help you get started.
Let's start by defining what each of these terms, Optimal Substructure & Overlapping Subproblems, means.
Optimal Substructure: if the optimal solution to a problem S of size n can be calculated by looking JUST at the optimal solutions of subproblems of size < n (and NOT at all solutions to those subproblems), and doing so also yields an optimal solution for S, then problem S is considered to have optimal substructure.
Example (Shortest Path Problem): consider an undirected graph with vertices a, b, c, d, e and edges (a,b), (a,e), (b,c), (c,d), (d,a) & (e,b). A shortest path between a & c is a -- b -- c, and this problem can be broken down into finding the shortest path between a & b and then the shortest path between b & c, and combining them gives us a valid solution. Note that we have two ways of reaching b from a:
a -- b (Shortest path)
a -- e -- b
The Longest Path Problem does not have optimal substructure. The longest path between a & d is a -- e -- b -- c -- d, but combining the longest paths between a & c (a -- e -- b -- c) and between c & d (c -- b -- e -- a -- d) won't give us a valid (non-repeating vertices) longest path between a & d.
Overlapping Subproblems: if you look at the recursion-tree diagram for fib(5) from the link you shared, you can see that the subproblem fib(1) is 'overlapping' across multiple branches, and thus fib(5) has overlapping subproblems (fib(1), fib(2), etc.).
On a side note as well, does this relate to the solving approaches of Tabulation(top-down) and Memoization(bottom-up)?
This again is a poorly worded question; note that it is tabulation that is the bottom-up approach and memoization that is the top-down one, not the other way around. Top-down (recursive) and bottom-up (iterative) approaches are two different ways of solving a DP problem by caching subproblem results. From the Wikipedia article on Memoization:
In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
For the given Fibonacci example, if we store fib(1) in a table after it is encountered the first time, we don't need to recompute it when we see it the next time; we can reuse the stored result, which saves us a lot of computation.
When we implement an iterative solution, "table" is usually an array (or array of arrays) and when we implement a recursive solution, "table" is usually a dynamic data structure, a hashmap (dictionary).
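As a tiny illustration of the two styles described above (a sketch of the standard Fibonacci example, nothing more):

```python
from functools import lru_cache

# Top-down (recursive) with memoization: the "table" is a cache keyed by the input.
@lru_cache(maxsize=None)
def fib_memo(n):
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

# Bottom-up (iterative) tabulation: the "table" is an array filled in order.
def fib_table(n):
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

print(fib_memo(10), fib_table(10))  # both print 55
```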
You can further read this link for better understanding of these two approaches.

Proof of optimal efficiency of A* Search

It is mentioned in Russell and Norvig's Artificial Intelligence: A Modern Approach that A* Search is optimally efficient. However, I could not figure out why, nor could I find the proof on the web. Does anyone happen to have a proof?
I hope I'm not doing your homework ;) I'll only sketch the proof here.
First, you need to see that A* is optimal, that is, it returns the shortest path according to your cost function g. The proof of this is straightforward under the assumption that the heuristic h never overestimates the cost to the goal. If this didn't hold, optimal efficiency would be meaningless, as A* itself wouldn't be optimal.
Optimal efficiency: among all optimal algorithms starting from the same start node and using the same heuristic, A* expands the fewest nodes.
Let's assume an optimal algorithm B does not expand some node n which is expanded by A*. By definition, for this node g(n) + h(n) <= f, where f is the cost of the shortest path. Now consider a second problem for which all heuristic values are the same as in the original problem; however, there is a new path through n to a new goal with total cost smaller than f.
The assumed algorithm B would again not expand n and hence would never reach this new goal, so B wouldn't find this optimal path. Therefore, our original assumption that B is optimal is violated.
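In symbols, the key fact the contradiction rests on can be written as follows (this is my restatement, with f* denoting the cost of the optimal solution and h assumed admissible):

```latex
% Restatement of the key step of the argument above.
f(n) \;=\; g(n) + h(n) \;\le\; f^{*}
\quad\Longrightarrow\quad
\text{as far as the search can tell, $n$ may lie on an optimal solution,}
\text{ so an optimal algorithm using the same $h$ cannot safely skip $n$.}
```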

Drawing minimal DFA for the given regular expression

What is a direct and easy approach to draw the minimal DFA that accepts the same language as a given Regular Expression (RE)?
I know it can be done by:
Regex ---to----► NFA ---to-----► DFA ---to-----► minimized DFA
But is there any shortcut? E.g. for (a+b)*ab
Regular Expression to DFA
Although there is NO algorithmic shortcut to draw a DFA from a Regular Expression (RE), a shortcut technique is possible by analysis, not by derivation, and it can save you time when drawing a minimized DFA. But of course the technique can only be learned through practice. I'll take your example to show my approach:
(a + b)*ab
First, think about the language of the regular expression. If it's difficult to state the language description on the first attempt, then find the smallest possible string that can be generated in the language, then the second smallest, and so on.
Memorize the solutions for some basic regular expressions. For example, I have written here some basic ideas for writing left-linear and right-linear grammars directly from a regular expression. Similarly, you can do the same for constructing minimized DFAs.
In the RE (a + b)*ab, the smallest possible string is ab, because (a + b)* can generate the NULL (^) string. The second smallest string can be either aab or bab. One thing we can easily notice about the language is that any string in it always ends with ab (the suffix), whereas the prefix can be any string over a and b, including ^.
Also, if the current symbol is a, then one possibility is that the next symbol is b and the string ends. Thus in the DFA we require a transition such that whenever a b comes after an a, we move to some final state of the DFA.
Next, if a new symbol arrives while we are in a final state, we should move to some non-final state, because any symbol after b can only occur in the middle of a string of the language, as every string in the language terminates with the suffix 'ab'.
With this knowledge, at this stage we can draw an incomplete transition diagram like the one below:
--►(Q0)---a---►(Q1)---b----►((Qf))
Now at this point you need to understand that every state has a meaning, for example:
(Q0) means = start state
(Q1) means = the last symbol was 'a', and with one more 'b' we can shift to a final state
(Qf) means = the last two symbols were 'ab'
Now think about what happens if a symbol a arrives in the final state: just move to state Q1, because that state means the last symbol was a. (Updated transition diagram:)
--►(Q0)---a---►(Q1)---b----►((Qf))
                ▲-----a--------|
But suppose that instead of symbol a, a symbol b arrives in the final state. Then we should move from the final state to some non-final state. In the present transition graph, in this situation we should move from the final state Qf to the initial state (as we again need ab in the string for acceptance):
--►(Q0)---a---►(Q1)---b----►((Qf))
    ▲           ▲-----a--------|
    |----------------b--------|
This graph is still incomplete, because there is no outgoing edge for symbol a from Q1. For symbol a on state Q1, a self-loop is required, because Q1 means the last symbol was an a:
                a-
                ||
                ▼|
--►(Q0)---a---►(Q1)---b----►((Qf))
    ▲           ▲-----a--------|
    |----------------b--------|
Now all possible outgoing edges are present from Q1 & Qf in the above graph. The one missing edge is an outgoing edge from Q0 for symbol b, and it must be a self-loop at state Q0, because we again need a sequence ab before the string can be accepted (a shift from Q0 to Qf is possible with ab):
    b-          a-
    ||          ||
    ▼|          ▼|
--►(Q0)---a---►(Q1)---b----►((Qf))
    ▲           ▲-----a--------|
    |----------------b--------|
Now DFA is complete!
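If you want to sanity-check the finished DFA, here it is written out as a transition table with a tiny simulator (the state names are the ones from the diagrams above; the code is only an illustration, not part of the construction method):

```python
# DFA for (a + b)*ab, using the states from the diagram above:
# Q0 = start, Q1 = "last symbol was 'a'", Qf = "last two symbols were 'ab'" (accepting).
delta = {
    ('Q0', 'a'): 'Q1', ('Q0', 'b'): 'Q0',
    ('Q1', 'a'): 'Q1', ('Q1', 'b'): 'Qf',
    ('Qf', 'a'): 'Q1', ('Qf', 'b'): 'Q0',
}

def accepts(s):
    state = 'Q0'
    for ch in s:
        state = delta[(state, ch)]
    return state == 'Qf'

print(accepts("ab"), accepts("babab"), accepts("aba"))  # True True False
```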
Of course, the method might look difficult for the first few tries, but if you learn to draw DFAs this way, you will observe an improvement in your analytical skills, and you will find this method a quick and objective way to draw a DFA.
* In the link I gave, I described some more regular expressions; I would highly encourage you to study them and try to make DFAs for those regular expressions too.

Quadtree object movement

So I need some help brainstorming, from a theoretical standpoint. Right now I have some code that just draws some objects. The objects lie in the leaves of a quadtree. Now as the objects move I want to keep them placed in the correct leaf of the quadtree.
Right now I am just reconstructing the quadtree on the objects after I change their position. I was trying to figure out a way to correct the tree without rebuilding it completely. All I can think of is having a bunch of pointers to adjacent leaf nodes.
Does anyone have an idea of how to figure out the node into which an object moves without just having a ton of pointers everywhere or a link to articles on this? All I could find was different ways to build the quadtree, nothing about updating it.
If I understand your question correctly, you want some way of mapping between spatial coordinates and leaves of the quadtree.
Here's one possible solution I've been looking at:
For simplicity, let's do the 1D case first, and let's assume we have 32 gridpoints in x. Every grid point then corresponds to some leaf of a tree of depth five (depth 0 = the whole grid, depth 1 = 2 points, depth 2 = 4 points, ..., depth 5 = 32 points).
Each leaf could be represented by the branch indices leading to the leaf. At each level there are two branches we can label A and B. So, a particular leaf might be labeled BBAAB, which would mean, go down the B branch, then the B branch, then the A branch, then the B branch and then the B branch.
So, how do you map e.g. BBABB to an x grid point between 0..31? Just convert it to binary, so that BBABB->11011 = 27. Thus, the mapping from gridpoint to leaf-node is simply a matter of translating the letters A and B into 0s and 1s and then interpreting the result as a binary number.
For the 2D case, it's only slightly more complicated. Now we have four branches from each node, so we can label each branch path using a four-letter alphabet, e.g. starting from the root and taking the third branch, then the fourth, then the first, then the second, and then the second again, we would generate the string CDABB.
Now, to convert the string (e.g. 'CDABB') into a pair of grid values (x, y):
Let's assume A is lower-left, B is lower right, C is upper left and D is upper right. Then, symbolically, we could write, A.x=0, A.y=0 / B.x=1, B.y=0 / C.x=0, C.y=1 / D.x=1, D.y=1.
Taking the example CDABB, we first look at its x values (CDABB).x = (01011), which gives us the x grid point. And similarly for y.
Finally, if you want to find out e.g. the node immediately to the right of CDABB, then simply convert it to a pair of binary numbers in x and y, add +1 to the x value and convert the new pair of binary numbers back into a string.
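Here is a small sketch of that mapping (the letter-to-quadrant convention is the lower-left/lower-right/upper-left/upper-right one assumed above; the function names are just illustrative):

```python
# Branch letters -> (x bit, y bit), using the convention from the text:
# A = lower-left, B = lower-right, C = upper-left, D = upper-right.
BITS = {'A': (0, 0), 'B': (1, 0), 'C': (0, 1), 'D': (1, 1)}
LETTER = {bits: letter for letter, bits in BITS.items()}

def path_to_xy(path):
    """Convert a branch string such as 'CDABB' into (x, y) grid coordinates."""
    x = y = 0
    for ch in path:
        bx, by = BITS[ch]
        x = (x << 1) | bx
        y = (y << 1) | by
    return x, y

def xy_to_path(x, y, depth):
    """Convert (x, y) grid coordinates back into a branch string of the given depth."""
    letters = []
    for level in reversed(range(depth)):
        letters.append(LETTER[((x >> level) & 1, (y >> level) & 1)])
    return ''.join(letters)

x, y = path_to_xy('CDABB')         # -> (11, 24) on a depth-5 (32 x 32) grid
print(xy_to_path(x + 1, y, 5))     # leaf immediately to the right: 'CDBAA'
```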
I'm sure this has all been discovered, but I haven't yet found this information on the web.
If you have the spatial data necessary to insert an element into the quad-tree in the first place (ex: its point or rectangle), then you have the same data needed to remove it.
An easy way is before you move an element, remove it from the quad-tree using the same data you used to originally insert it, then move it, then re-insert.
Removal from the quad-tree can first remove the element from the leaf node(s), then if the leaf nodes become empty, remove them from their parents. If the parents become empty, remove them from their parents, and so forth.
This simple method is efficient enough for a complex world of objects moving every frame as long as you implement the quad-tree efficiently (ex: use a free list for the nodes). There shouldn't have to be a heap allocation on a per-node basis to insert it, nor a heap deallocation involved in removing every single node. Most node allocations/deallocations should be a simple constant-time operation just involving, say, the manipulation of a couple of integers or pointers.
You can also make this a little more complex if you like. You can start off storing the previous position of an object and then move it. If the new position occupies nodes other than the previous position, then remove the object from the nodes it no longer occupies and insert it to the new ones. Otherwise just keep it in the same node(s).
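As a rough sketch of that remove/move/re-insert strategy (the quadtree interface here is hypothetical; substitute whatever insert/remove operations your tree actually exposes):

```python
def move_object(quadtree, obj, new_position):
    """Sketch of the simple per-frame update described above.
    `quadtree.remove` and `quadtree.insert` are assumed to take the same
    spatial data (point or rectangle) used when the object was first inserted."""
    quadtree.remove(obj, obj.position)    # remove using the OLD position
    obj.position = new_position
    quadtree.insert(obj, obj.position)    # re-insert using the NEW position
    # Optional refinement: first check whether the new position still maps to
    # the same leaf node(s); if it does, skip the remove/insert entirely.
```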
Update
I usually try to avoid linking my previous answers, but in this case I ended up doing a pretty comprehensive write up on the topic which would be hard to replicate anywhere else. Here it is: https://stackoverflow.com/a/48330314/4842163
