Uniform Cost Graph Search opening too many nodes - search

I'm working on an assignment from an archived AI course from 2014.
The parameter "problem" refers to an object that has different cost functions chosen at runtime (sometimes it is 1 cost per move; sometimes moves are more expensive depending on which side of the Pacman board they are made on).
As written below, I get the right behavior, but I open more search nodes than expected (about 2x what the assignment expects).
If I make the cost variable negative, I get the right behavior on the 1-unit-cost case AND a really low node count, but the behavior is the opposite for the cases with higher costs on one side of the board.
So basically my question is: does it seem like I'm opening any nodes unnecessarily in the context of a uniform cost search?
def uniformCostSearch(problem):
    """Search the node of least total cost first."""
    def UCS(problem, start):
        q = util.PriorityQueue()
        for node in problem.getSuccessors(start):  # each successor is a tuple ((x,y), 'North', 1)
            q.push([node], node[2])  # push successors one at a time, so queue entries are paths
        while not q.isEmpty():
            pathToCheck = q.pop()  # pops the lowest-priority (cheapest) path off the queue
            lastNode = pathToCheck[-1][0]  # coordinates of the last node in that path
            if problem.isGoalState(lastNode):  # checks whether those coordinates are the goal
                return pathToCheck  # if goal, return the path that was checked
            else:  # otherwise, get successors of that node and put them on the queue
                nodesVisited = [edge[0] for edge in pathToCheck]  # just the coordinates already on this path
                for successor in problem.getSuccessors(lastNode):
                    if successor[0] not in nodesVisited:  # only checks THIS particular path (to avoid cycles)
                        newPath = pathToCheck + [successor]  # extend the path with this successor
                        cost = problem.getCostOfActions([x[1] for x in newPath])
                        q.update(newPath, cost)  # push the extended path onto the queue for later retrieval
        return None

    start = problem.getStartState()  # starting point of the search
    successorList = UCS(problem, start)
    directions = []
    for step in successorList:
        directions += step  # flatten each (coords, action, cost) tuple
    return directions[1::3]  # every third element starting at index 1 is an action

I figured it out.
Basically, while I was checking that I don't revisit nodes within a given path, I wasn't checking whether a node had already been reached via another path on the queue. I can fix that by keeping a nodesVisited list that records every node ever expanded and checking THAT for duplicate visits.
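For reference, here is a minimal sketch of that fix: uniform cost search with a global set of already-expanded states. It assumes the usual Berkeley Pacman problem interface (getStartState, isGoalState, getSuccessors returning (state, action, stepCost) tuples), but uses heapq instead of util.PriorityQueue so the snippet stands alone:

import heapq

def uniformCostSearch(problem):
    """UCS with a global 'expanded' set so each state is expanded at most once."""
    frontier = []   # heap entries: (cost, counter, state, actions)
    counter = 0     # tie-breaker so heapq never tries to compare states directly
    heapq.heappush(frontier, (0, counter, problem.getStartState(), []))
    expanded = set()  # states whose successors we have already generated

    while frontier:
        cost, _, state, actions = heapq.heappop(frontier)
        if state in expanded:   # a cheaper path already expanded this state
            continue
        if problem.isGoalState(state):
            return actions
        expanded.add(state)
        for nextState, action, stepCost in problem.getSuccessors(state):
            if nextState not in expanded:
                counter += 1
                heapq.heappush(frontier, (cost + stepCost, counter, nextState, actions + [action]))
    return []

Tracking expanded states globally (rather than per path) is what turns this into graph search: each state is expanded at most once, at the cost of the cheapest path to it, so the node count drops without breaking correctness for non-negative step costs.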


Why a quadtree sometimes need a max number to hold in a node?

I am working on a computational geometry problem that uses TriangularMeshQuadtree from a C# library, and some of its constructors are documented as follows (from metadata, so I cannot see the implementation details):
constructor 1:
// Summary:
// Constructor to use if you are going to store the objects in x/y space, and there
// is a smallest node size because you don't want the nodes to be smaller than a
// group of pixels.
//
// Parameters:
// xMax:
// eastern border of node coverage.
//
// xMin:
// western border of node coverage.
//
// yMax:
// northern border of node coverage.
//
// yMin:
// southern border of node coverage.
//
// maxItems:
// number of items to hold in a node before splitting itself into four branch and
// redispensing the items into them.
public TriangularMeshQuadtree(double xMax, double xMin, double yMax, double yMin, int maxItems);
constructor 2:
//
// Summary:
// Gets quad tree of a list of triangular surface in the plane with normal of dir
//
// Parameters:
// surfaces:
// A list of triangular surface
//
// dir:
// The normal of plane on which quad tree is projected
//
// maxItemNumber:
// The maximum number of items in each node of quad tree
//
// transformator:
// Coordinate transformator
//
// Returns:
// A quad tree
public static TriangularMeshQuadtree GetQuadTree(List<SubTSurf> surfaces, Vector3 dir, int maxItemNumber, out CoordinateTransformator transformator);
My understanding of a quadtree is that it divides a set of points recursively into 4 sections until every point is alone in its own section. I don't understand the definition of maxItems in the above code and how it works with a quadtree.
Your understanding "... until every point is unique in one section" is not quite correct. It describes a very special kind of quadtree that is usually used as an example to explain the concept.
In general, a quadtree can hold many more items per node. This is often done to reduce the number of nodes (if we hold more entries per node, we need fewer nodes). The benefits of reducing the node count are:
Reduced memory usage (every node adds some memory overhead)
Usually faster search (every node adds an indirection, which is slow), i.e. a very 'deep' tree is slow to traverse.
maxItems should not be too large, because inside a node the points are usually stored in a linear list. A linear list obviously requires linear search, which slows things down if the list is too large. In my experience, sensible values for maxItems are between 10 and 100.
Another parameter that is often given is maxDepth. This parameter limits the depth of the tree, which is equal to the number of parents of a given node. The idea is that a bad dataset can result in a tree that is very 'deep', which makes it expensive to traverse. Instead, if a node is at depth=maxDepth, it is prevented from splitting, even if it exceeds maxItems entries.
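To make the splitting rule concrete, here is a minimal point-quadtree sketch in Python (my own illustration of the idea above, not the C# library's actual implementation). A leaf keeps its points in a plain list and splits into four children only when it exceeds max_items and its depth is still below max_depth:

class QuadtreeNode:
    def __init__(self, x_min, y_min, x_max, y_max, max_items=16, depth=0, max_depth=8):
        self.bounds = (x_min, y_min, x_max, y_max)
        self.max_items = max_items
        self.depth = depth
        self.max_depth = max_depth
        self.points = []      # linear list -> keep max_items modest
        self.children = None  # four sub-quadrants after a split

    def insert(self, x, y):
        if self.children is not None:
            self._child_for(x, y).insert(x, y)
            return
        self.points.append((x, y))
        # Split only if over capacity AND the depth limit allows it.
        if len(self.points) > self.max_items and self.depth < self.max_depth:
            self._split()

    def _split(self):
        x_min, y_min, x_max, y_max = self.bounds
        mx, my = (x_min + x_max) / 2, (y_min + y_max) / 2
        self.children = [
            QuadtreeNode(x_min, y_min, mx, my, self.max_items, self.depth + 1, self.max_depth),
            QuadtreeNode(mx, y_min, x_max, my, self.max_items, self.depth + 1, self.max_depth),
            QuadtreeNode(x_min, my, mx, y_max, self.max_items, self.depth + 1, self.max_depth),
            QuadtreeNode(mx, my, x_max, y_max, self.max_items, self.depth + 1, self.max_depth),
        ]
        points, self.points = self.points, []
        for px, py in points:  # redistribute the stored points into the children
            self._child_for(px, py).insert(px, py)

    def _child_for(self, x, y):
        x_min, y_min, x_max, y_max = self.bounds
        mx, my = (x_min + x_max) / 2, (y_min + y_max) / 2
        return self.children[(1 if x >= mx else 0) + (2 if y >= my else 0)]

Searching such a tree descends through at most max_depth indirections and then linearly scans at most roughly max_items points, which is exactly the trade-off described above.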
Having said all the above, there are useful real-world quadtree-type structures that allow at most one entry per quadrant. One example is the PH-Tree (disclaimer: self-advertisement). It uses other techniques to limit the depth to 32 or 64. It takes a while to explain and it wasn't part of the question, so I just reference the documentation here.

Changing the speed of a periodically appearing object in grid

I'm working in the minigrid environment. I currently have an object moving across the screen at one grid space per step count (from left to right). The object appears periodically: currently it spends half the time on screen and half off. I'd like to slow the object down so it moves across the screen more slowly, without losing the periodic-appearance behavior. Current code is below:
idx = (self.step_count + 2) % (2 * self.width)  # 2 is the ratio of appear to not-appear
if idx < self.width:
    try:
        self.put_obj(self.obstacles[i_obst], idx, old_pos[1])
        self.grid.set(*old_pos, None)  # deletes old obstacle
    except:
        pass
else:
    self.grid.set(*old_pos, None)  # deletes old obstacle
I got something to work. The snippet below introduces an integer slow_factor that reduces the speed while still keeping idx useful for its original purpose:
idx = (self.step_count + 2) // slow_factor % (2 * self.width)
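As a quick sanity check of how slow_factor interacts with the period, here is a standalone sketch with made-up values (width = 5 and slow_factor = 3 are illustrative, not from the environment): the object should advance one cell every 3 steps and still spend half of each period off screen (idx >= width):

width, slow_factor = 5, 3  # hypothetical values for illustration
for step_count in range(30):
    idx = (step_count + 2) // slow_factor % (2 * width)
    on_screen = idx < width  # the object is drawn only while idx < width
    print(step_count, idx, "on" if on_screen else "off")

Note that // and % have equal precedence and associate left to right, so the expression divides first and wraps second, which is what preserves the half-on/half-off duty cycle.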

How to calculate the maximum branching factor for the towers of hanoi

I am modeling the towers of Hanoi problem with n discs and k pegs, and I am trying to find its maximum branching factor. The problem is that, since both the number of discs and the number of pegs are variable, so is the number of actions possible at each node. How can I find a generic way of assessing the maximum branching factor depending on k and n?
In general the smallest disc can move to any other peg: k-1 options.
The second smallest top disc (the disc at the top of a stack on some peg; it might not be the second smallest overall) can move onto any peg except the one holding the smallest disc: k-2 options.
This continues down to the largest top disc, which can't move anywhere (assuming n > k and every peg is occupied, which is the worst case).
So the maximum branching factor is: (k-1) + (k-2) + (k-3) + ... + 2 + 1 = k(k-1)/2
The only time you won't get this is when one of the pegs contains no discs. If n >> k this will rarely happen. But it means that if you are searching from random states to a goal state, you should consider searching backwards, because the standard goal state has the lowest branching factor, since only one peg has any discs.
The n < k case can be analyzed similarly, except that you stop after n discs and subtract the terms counted the first time around that aren't available now:
k(k-1)/2 - (k-n)(k-n-1)/2
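As a sanity check, here is a small Python sketch (my own illustration, not part of the original answer) that computes the closed form and verifies it by brute force against a maximally-branching state, in which the min(n, k) smallest discs sit on distinct pegs:

def max_branching(n, k):
    """Closed form: (k-1) + (k-2) + ... + (k-m), with m = min(n, k-1) movable top discs."""
    m = min(n, k - 1)
    return k * (k - 1) // 2 - (k - m) * (k - m - 1) // 2

def count_moves(state, k):
    """Count legal moves: a peg's top disc may move to any peg whose top disc (if any) is larger."""
    tops = {p: s[-1] for p, s in state.items() if s}
    moves = 0
    for p, d in tops.items():
        for q in range(k):
            if q != p and tops.get(q, float("inf")) > d:
                moves += 1
    return moves

def busiest_state(n, k):
    """A maximally-branching state: discs m..n stacked on peg 0, discs 1..m-1 alone on their own pegs."""
    m = min(n, k)
    state = {p: [] for p in range(k)}
    state[0] = list(range(n, m - 1, -1))  # discs n..m on peg 0, largest at the bottom
    for d in range(1, m):
        state[d] = [d]  # disc d alone on peg d
    return state

for n in range(1, 8):
    for k in range(3, 7):
        assert count_moves(busiest_state(n, k), k) == max_branching(n, k)
print("closed form matches brute force")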

tqdm notebook: increase bar width

I have a task whose progress I'd like to monitor; it's a brute-force NP problem running in a while loop.
For the first x (unknown) iterations of the loop it discovers an unknown number of additional future combinations (many per loop). Eventually it reaches a point where it is solving puzzles (one solution per loop) faster than it is finding new possible puzzles, and it finally solves the last puzzle it found (100%).
I've created some fake growth to provide a repeatable example:
from tqdm import tqdm_notebook as tqdm

growthFactorA = 19
growthFactorB = 2
prog = tqdm(total=50, dynamic_ncols=True)
done = []
todo = [1]
while len(todo) > 0:
    current = todo.pop(0)
    if current < growthFactorA:
        todo.extend(range(current + 1, growthFactorA + growthFactorB))
    done.append(current)
    prog.total = len(todo) + len(done)
    prog.update()
You'll see the total eventually stops at 389814; at first it grows much faster than the loop solves puzzles, but at some point the system stops growing.
It is impossible to calculate the number of iterations before running the algorithm.
The blue bar is confined to the original total given at initialization. My goal is something similar to what I'd get if the initial total had been set to 389814; it's fine if, during the growth period (early in the run), the progress bar appears to move backwards or stall as the total increases.
As posted in https://github.com/tqdm/tqdm/issues/883#issuecomment-575873544 for now you could do:
prog.container.children[0].max = prog.total (after setting the new prog.total).
This is even more annoying when writing code to run on both notebooks and the CLI (from tqdm.auto import tqdm), where you'll first have to check hasattr(prog, 'container').
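Putting the workaround together with the fake-growth example above, a sketch might look like this (a sketch, not official tqdm API guidance; the children[0] index comes from the linked issue and may differ between tqdm versions):

from tqdm.auto import tqdm  # notebook widget in Jupyter, plain bar on the CLI

growthFactorA = 19
growthFactorB = 2
prog = tqdm(total=50, dynamic_ncols=True)
done = []
todo = [1]
while todo:
    current = todo.pop(0)
    if current < growthFactorA:
        todo.extend(range(current + 1, growthFactorA + growthFactorB))
    done.append(current)
    prog.total = len(todo) + len(done)
    if hasattr(prog, 'container'):  # only the notebook flavour has a widget container
        prog.container.children[0].max = prog.total  # rescale the widget's bar
    prog.update()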

linearK - large time difference between empirical and acceptance envelopes in spatstat

I am interested in the correlation between points at distances of 0 to 2 km on a linear network. I am using the following statement for the empirical data; it completes in 2 minutes.
obs<-linearK(c, r=seq(0,2,by=0.20))
Now I want to check for acceptance of randomness, so I used envelopes over the same r range.
acceptance_enve<-envelope(c, linearK, nsim=19, fix.n = TRUE, funargs = list(r=seq(0,2,by=0.20)))
But this shows an estimated time of a little less than 3 hours. I just want to ask whether this large time difference is normal. Is my syntax correct in the call to envelope for passing the extra r argument as a sequence?
Is there some efficient way to shorten this 3-hour execution time for the envelopes?
I have the road network of a whole city, so it is quite large, and I have checked that there are no disconnected subgraphs.
> c
Point pattern on linear network
96 points
Linear network with 13954 vertices and 19421 lines
Enclosing window: rectangle = [559.653, 575.4999] x [4174.833, 4189.85] Km
Thank you.
EDIT AFTER COMMENT
> system.time({s <- runiflpp(npoints(c), as.linnet(c));
+ linearK(s, r=seq(0,2,by=0.20))})
   user  system elapsed
343.047 104.428 449.650
EDIT 2
I made some really small changes, deleting some peripheral network segments that seemed to have little or no effect on the overall network. This also split some long segments into smaller ones. But now, on this modified network with a different point pattern, the estimated time is even longer:
> month1envelope=envelope(months[[1]], linearK ,nsim = 39, r=seq(0,2,0.2))
Generating 39 simulations of CSR ...
1, 2, [etd 12:03:43]
The new network is
> months[[1]]
Point pattern on linear network
310 points
Linear network with 13642 vertices and 18392 lines
Enclosing window: rectangle = [560.0924, 575.4999] x [4175.113, 4189.85] Km
System Config: MacOS 10.9, 2.5Ghz, 16GB, R 3.3.3, RStudio Version 1.0.143
You don't need to use funargs in this context. Arguments can be passed directly through the ... argument. So I suggest
acceptance_enve <- envelope(c, linearK, nsim=19,
                            fix.n=TRUE, r=seq(0,2,by=0.20))
Please try this to see if it accelerates the execution.
