Pacman pathfinding heuristic - search

How do I go about implementing an admissible heuristic function for a Pacman game such that it finds the shortest path from a given location through multiple goals (all remaining dots)? Currently I'm using an A* search with Manhattan distances as the heuristic: I take the sum of the Manhattan distances from a node to every remaining dot that has not yet been eaten, and that is my h(n). The algorithm takes extremely long to complete, and I'm not really sure how to tie-break.

Well, I'm assuming you're taking the edX course in Artificial Intelligence.
Taking the sum of the distances between your current position and each food pellet is not admissible, because it ignores the fact that moving towards one pellet may also bring you closer to other pellets, so the sum can overestimate the true remaining cost.
Depending on the size of the grid and how sparse it is, you can instead run a BFS from Pacman's current location to find the nearest pellet. That maze distance is an admissible heuristic.
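A minimal sketch of that BFS heuristic, assuming the maze is given as a set of wall coordinates and the remaining pellets as a collection of grid positions (walls, pellets and start are placeholder names, not the course's actual API):

    from collections import deque

    def nearest_pellet_distance(start, pellets, walls):
        # Breadth-first search over the grid: the first pellet reached gives
        # the true maze distance to the closest pellet, which can never
        # overestimate the cost of eating all remaining pellets.
        if not pellets:
            return 0
        pellets = set(pellets)
        frontier = deque([(start, 0)])
        visited = {start}
        while frontier:
            (x, y), dist = frontier.popleft()
            if (x, y) in pellets:
                return dist
            for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nxt not in walls and nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, dist + 1))
        return 0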

Related

How to find farthest neighbors in Euclidean space?

Given a set P of n points in 2D, for any point x in P, what is the fastest way to find out the farthest neighbor of x? By farthest neighbor, we mean a point in P which has the maximum Euclidean distance to x.
To the best of my knowledge, the current standard kNN search algorithm for various trees (R-Trees, quadtrees, kd-trees) was developed by:
G. R. Hjaltason and H. Samet., "Distance browsing in spatial
databases.", ACM TODS 24(2):265--318. 1999
The algorithm traverses the tree using a priority queue of nearest nodes/entries. One key insight is that it also works for farthest-neighbor search.
The basic algorithm uses a priority queue. The queue can contain tree nodes as well as data entries, all sorted by their distance to your search point.
As an initial step, it adds the root node to the priority queue. Then repeat the following until k entries have been found:
1. Take the first element from the queue. If it is an entry, return it. If it is a node, add all elements in the node to the priority queue.
2. Repeat step 1.
The paper describes an implementation for R-Trees, but they claim it can be applied to most tree-like structures. I have implemented the nearest neighbor version myself for R-Trees and PH-Trees (a special type of quadtree), both in Java. I think I know how to do it efficiently for KD-Trees but I believe it is somewhat complicated.
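A rough sketch of that traversal in Python, assuming each tree node exposes children (child nodes) and entries (stored data points), and that the caller supplies the two distance functions. For nearest-neighbour browsing dist_to_node should be a lower bound on the distance to anything inside the node; for farthest-neighbour browsing it should be an upper bound, and the priority sign is flipped:

    import heapq

    def distance_browse(root, query, dist_to_node, dist_to_entry, k, farthest=False):
        # The queue holds (priority, tiebreak, is_entry, item); the tiebreak
        # counter keeps heapq from ever comparing nodes or entries directly.
        sign = -1 if farthest else 1
        counter = 0
        heap = [(sign * dist_to_node(query, root), counter, False, root)]
        results = []
        while heap and len(results) < k:
            _, _, is_entry, item = heapq.heappop(heap)
            if is_entry:
                results.append(item)      # entries pop out in (reverse) distance order
                continue
            for child in item.children:   # subtree nodes
                counter += 1
                heapq.heappush(heap, (sign * dist_to_node(query, child), counter, False, child))
            for entry in item.entries:    # data points stored in this node
                counter += 1
                heapq.heappush(heap, (sign * dist_to_entry(query, entry), counter, True, entry))
        return results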

Pacman AI project - suitable combination of step cost and heuristic

As part of a project, I am trying to implement A* within the context of a Pacman game (see the UC Berkeley Pacman AI project). There are no ghosts or capsules, only a maze and the 'fruit'. I am having trouble, however, understanding the relationship between my heuristic function and my cost function.
As per the project, when defining the search problem, we need to specify a step cost that derives from:
score = -NbSteps + 10*NbOfEatenDots + 200*NbOfEatenGhosts + (-500*isLoss) + (500*isWin)
This cost is supposed to be always positive, so for simplicity I have decided to take: 1.5 - (0.5*AteAFoodDot). I have ignored ghosts and capsules since they do not exist, and I have given a preferential score to moves that end up eating a dot. I have also ignored steps that result in a loss (since they do not exist) and steps that result in a win state.
Now as far as the A* algorithm itself is concerned, we have to implement a cost function and a heuristic function of our own:
As a cost function I have chosen Cost = sum(step costs to current state), and as a heuristic: h = Manhattan distance from Pacman to the closest dot + Manhattan distance from that dot to the dot farthest away from it, as long as such a dot exists, which is an admissible heuristic. I have also implemented this heuristic using real maze distances instead of Manhattan distances, but that seemed too time-consuming for mazes with many food dots.
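For concreteness, here is a minimal sketch of that heuristic, with pacman_pos and dots standing in for whatever the project's state object actually exposes:

    def manhattan(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def food_heuristic(pacman_pos, dots):
        # Distance to the closest dot, plus the distance from that dot to the
        # dot farthest away from it (when one exists); both legs are Manhattan
        # lower bounds on real maze distances, which is what keeps it admissible.
        if not dots:
            return 0
        closest = min(dots, key=lambda d: manhattan(pacman_pos, d))
        h = manhattan(pacman_pos, closest)
        others = [d for d in dots if d != closest]
        if others:
            h += max(manhattan(closest, d) for d in others)
        return h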
Now, if I have understood correctly, with g(n) as my cost function and h(n) as my heuristic, I must always have h(n) <= the true cost from n to the goal so that A* always returns an optimal path, and the closer h(n) is to that true cost, the fewer nodes will be expanded.
In this respect, is it not in my interest to ignore how the score is computed, ignore whether or not a step results in eating a food dot, and simply take step_cost = 1 for all steps?
This is how I obtain the best results with respect to computation time and nodes expanded, but ignoring the cost function of the game seems wrong.
Could someone clarify this for me? Is it a matter of preference/choice, or is there an objectively correct answer/best approach?

Calculating the distance between each pair of a set of points

So I'm working on simulating a large number of n-dimensional particles, and I need to know the distance between every pair of points. Allowing for some error, and given that the distance isn't relevant at all if it exceeds some threshold, are there any good ways to accomplish this? I'm pretty sure that if I want dist(A,C) and already know dist(A,B) and dist(B,C), I can bound it by [|dist(A,B) - dist(B,C)|, dist(A,B) + dist(B,C)], and then store the results in a sorted array, but I'd like not to reinvent the wheel if there's something better.
I don't think the number of dimensions should greatly affect the logic, but maybe for some solutions it will. Thanks in advance.
If the problem were simply to calculate the distances between all pairs, it would be an O(n^2) problem with no chance of a better solution. However, you are saying that if the distance is greater than some threshold D, then you are not interested in it. This opens up the opportunity for a better algorithm.
For example, in 2D case you can use the sweep-line technique. Sort your points lexicographically, first by y then by x. Then sweep the plane with a stripe of width D, bottom to top. As that stripe moves across the plane new points will enter the stripe through its top edge and exit it through its bottom edge. Active points (i.e. points currently inside the stripe) should be kept in some incrementally modifiable linear data structure sorted by their x coordinate.
Now, every time a new point enters the stripe, you have to check the currently active points to the left and to the right no farther than D (measured along the x axis). That's all.
The purpose of this algorithm (as is typically the case with the sweep-line approach) is to push the practical complexity away from O(n^2) and towards O(m), where m is the number of interactions we are actually interested in. Of course, the worst-case performance is still O(n^2).
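Here is a rough sketch of that 2D sweep in Python, using a plain sorted list (via bisect) in place of a properly incremental structure, and Euclidean distance for the final check:

    import bisect
    import math

    def close_pairs_2d(points, D):
        # Sweep bottom to top: keep the points whose y lies within D of the
        # current point ("active" points) sorted by x, and only test the
        # candidates whose x also lies within D of the new point.
        pairs = []
        active = []                                # (x, y) tuples, kept sorted by x
        for x, y in sorted(points, key=lambda p: (p[1], p[0])):
            active = [p for p in active if y - p[1] <= D]   # drop points below the stripe
            lo = bisect.bisect_left(active, (x - D, -math.inf))
            hi = bisect.bisect_right(active, (x + D, math.inf))
            for cx, cy in active[lo:hi]:
                if math.hypot(x - cx, y - cy) <= D:
                    pairs.append(((cx, cy), (x, y)))
            bisect.insort(active, (x, y))
        return pairs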
The above applies to the 2-dimensional case. For the n-dimensional case I'd say you'll be better off with a different technique. Some sort of space partitioning should work well here, i.e. exploit the fact that if the distance between two partitions is known to be greater than D, then there's no reason to compare the specific points in those partitions against each other.
If the distance beyond a certain threshold is not relevant, and this threshold is not too large, there are common techniques to make this more efficient: limit the search for neighbouring points using space-partitioning data structures. Possible options are:
Binning.
Trees: quadtrees (2D), kd-trees.
Binning with spatial hashing.
Also, since the distance from point A to point B is the same as the distance from point B to point A, each distance should only be computed once. Thus, you should use the following loop:
for i in range(n):
    for j in range(i + 1, n):
        distance(points[i], points[j])
Combining these two techniques is very common in n-body simulation, for example, where particles affect each other if they are close enough. Here are some fun examples of that in 2D: http://forum.openframeworks.cc/index.php?topic=2860.0
Here's an explanation of binning (and hashing): http://www.cs.cornell.edu/~bindel/class/cs5220-f11/notes/spatial.pdf
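As a concrete illustration, a minimal binning sketch in 2D Python, assuming Euclidean distance and a cell size equal to the cutoff D, so only a cell and its immediate neighbours ever need scanning:

    import math
    from collections import defaultdict

    def neighbour_pairs(points, D):
        # Hash each point into a square cell of side D; any pair closer than D
        # must lie in the same cell or in adjacent cells.
        cells = defaultdict(list)
        for idx, (x, y) in enumerate(points):
            cells[(math.floor(x / D), math.floor(y / D))].append(idx)

        # Half-neighbourhood offsets, so every pair of cells is visited once.
        offsets = [(0, 0), (1, 0), (0, 1), (1, 1), (1, -1)]
        pairs = []
        for (cx, cy), members in cells.items():
            for dx, dy in offsets:
                others = members if (dx, dy) == (0, 0) else cells.get((cx + dx, cy + dy), [])
                for i in members:
                    for j in others:
                        if (dx, dy) == (0, 0) and j <= i:
                            continue              # within a cell, count each pair once
                        if math.dist(points[i], points[j]) <= D:
                            pairs.append((i, j))
        return pairs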

Given a polygon and a point in 2D, how can one find the feature (vertex or edge) of the polygon closest to the point?

A naive approach is to find, for each edge in the polygon, the point on that edge closest to the given point, and then take the one that's closest. Is there a faster algorithm? My goal is to implement a 2D Super Mario Galaxy-style platformer.
Apparently this can be done with Voronoi regions, as in this video: http://www.youtube.com/watch?v=Ldh2YKobuWo
However, I can't find any Voronoi algorithms that deal with edges as well as points. Ideas?
Calculate the point-to-segment distance for each of the edges, then pick the shortest one. There is no shortcut. This site has a good explanation and even implementations in various languages.
However, finding "the point on that edge closest to the given point" is a computationally unnecessary intermediate result.
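For reference, a small sketch of the per-edge test the naive approach relies on, working with squared distances so no square root is needed and the closest point itself is never returned:

    def point_segment_distance_sq(p, a, b):
        # Squared distance from point p to the segment a-b.
        px, py = p
        ax, ay = a
        bx, by = b
        abx, aby = bx - ax, by - ay
        apx, apy = px - ax, py - ay
        ab_len_sq = abx * abx + aby * aby
        if ab_len_sq == 0:
            return apx * apx + apy * apy       # degenerate edge: a == b
        # Clamp the projection of p onto the line through a and b to the segment.
        t = max(0.0, min(1.0, (apx * abx + apy * aby) / ab_len_sq))
        dx, dy = px - (ax + t * abx), py - (ay + t * aby)
        return dx * dx + dy * dy

    def closest_edge(polygon, p):
        # polygon is a list of vertices in order; the last edge wraps around.
        edges = list(zip(polygon, polygon[1:] + polygon[:1]))
        return min(edges, key=lambda e: point_segment_distance_sq(p, *e))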
If the polygon is convex, then the overhead of the Voronoi calculation far exceeds that of the naive approach.
If this is run many times and the point changes only slightly each time, you only need to check three segments: as you move around (assuming many checks), the closest edge will only change to an adjacent edge.

Insert a point into a finite 2D region with maximum distance to existing points

I have a set of 2D points inside a finite 2D region of space (let's say a world-aligned rectangle to keep things simple for now). What would be an exceedingly efficient way to insert a new point into the set that has a relatively large distance to its new closest neighbour?
I could slowly build a Delaunay triangulation and limit my search to the largest triangles only, but I was hoping someone has a different (better) idea.
Goodwill,
David
Edit:
Forgot to mention that I need to do this thousands of times, every time taking all the previous points into account. I'm looking for an algorithm that doesn't slow down to a crawl as my point set grows.
Use Bowyer-Watson or another incremental algorithm to maintain the Voronoi diagram. The vertices of the Voronoi diagram are candidate points; keep all the candidate points in a priority queue ordered by their distance to the existing source points. That should be pretty fast, and optimal (at least, optimal at each step).
Were you looking for something faster and less optimal?
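As a rough illustration of the candidate-vertex idea, here is a SciPy sketch that recomputes the Voronoi diagram from scratch instead of maintaining it incrementally as Bowyer-Watson would (simpler, but slower over thousands of insertions), and that ignores the places where Voronoi edges cross the rectangle boundary; it assumes a handful of points already exist:

    import numpy as np
    from scipy.spatial import Voronoi, cKDTree

    def next_point(points, xmin, xmax, ymin, ymax):
        # Candidate locations: Voronoi vertices inside the rectangle plus the
        # rectangle's corners; pick the one whose nearest existing point is
        # farthest away.
        pts = np.asarray(points, dtype=float)
        vor = Voronoi(pts)
        inside = [(x, y) for x, y in vor.vertices
                  if xmin <= x <= xmax and ymin <= y <= ymax]
        corners = [(xmin, ymin), (xmin, ymax), (xmax, ymin), (xmax, ymax)]
        candidates = np.array(inside + corners)
        dists, _ = cKDTree(pts).query(candidates)
        return tuple(candidates[int(np.argmax(dists))])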
