Is there any case where we can say alpha-beta pruning is inefficient? For example, say we have a game where you have to reach 27 to win, and you and your opponent may only add 1, 2, or 5 on each turn. Is alpha-beta pruning efficient here? Isn't it a little confusing to evaluate it that way, especially at the beginning of the game, where there are a lot of possibilities we don't really care about?
I feel like I can explain this, but I can't! Help.
For this game, it might turn out that it can be reduced to a mathematical formula, in which case tree search and alpha-beta pruning would be overkill.
But let's say it is not possible. You have a game with two or three outcomes: LOSS(-1), WIN(1) and possibly DRAW(0), and no meaningful evaluation of intermediate positions. Then you would need to search to the end of each variation, and so e.g. iterative deepening would be pointless.
However, alpha-beta pruning could be very efficient: if beta = -1 (meaning the opponent has already found a win), you can just return -1 right away, without even searching a principal variation (PV). If beta = 0, the only time you would need to search all child nodes is when all moves (except possibly the last) lose.
The condition for alpha-beta to be sufficiently efficient is, of course, that the complete tree is small enough to traverse in reasonable time.
EDIT: I forgot to mention that for your particular example, remembering evaluations would have much, much greater effect than alpha-beta pruning with regard to the number of nodes traversed (from 2688332 to 77).
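To make the two ideas concrete, here is a minimal sketch of my own (not the code behind the numbers above), assuming the rule is that whoever brings the running total to 27 or beyond wins; adjust the terminal test if your variant requires hitting 27 exactly:

from functools import lru_cache

TARGET = 27
MOVES = (1, 2, 5)

def alphabeta(total, alpha=-1, beta=1):
    """Negamax with alpha-beta; returns +1 if the side to move can force a win, else -1."""
    if total >= TARGET:
        return -1                          # the previous player just reached 27, so we have lost
    best = -1
    for m in MOVES:
        score = -alphabeta(total + m, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, best)
        if alpha >= beta:                  # a winning reply exists: prune the remaining moves
            break
    return best

@lru_cache(maxsize=None)                   # "remembering evaluations": each total is solved once
def solve(total):
    """Plain negamax with memoization, keyed only on the running total."""
    if total >= TARGET:
        return -1
    return max(-solve(total + m) for m in MOVES)

print(alphabeta(0), solve(0))              # +1 means the first player can force a win, -1 means they cannot

Note that the memoized version is keyed only on the running total, which is why it collapses the search so dramatically: a total reached by many different move orders is only ever solved once.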
There are many ways people describe the differences between tabulation and memoization in dynamic programming, but I will summarize what is normally said.
Memoization is where we add caching to a function, so that repeated recursive calls take less computation. It is typically used on recursive functions for a top-down solution that starts with the initial problem and then recursively calls itself on smaller subproblems.
Tabulation uses a table to keep track of subproblem results and works in a bottom-up manner, solving the smallest subproblems before larger ones, iteratively.
Well, my question is: what's the difference? Sometimes I look at different situations and the line is super blurred. Also, with memoization working in a "top-down" fashion, isn't that really just referring to the stack nature of the recursion? In that sense it still goes down to the base case, i.e. the bottom, and then uses those results to build up to the final result, so how is that really different from tabulation going bottom-up until it's done? Or is it a situational thing, where tabulation approaches don't involve recursion, and the fact that a dynamic programming solution uses recursion is what differentiates the two methods? If someone knowledgeable could offer their thoughts, it would be much appreciated.
You're right that they're just two implementation methods for the same computation. A recursive formulation with memoization will fill in the memo cache with the same entries that an explicit tabular formulation will put in its table.
Explicit tabular formulations are strictly less useful, however. This is because they need more information about the problem in advance. They start by enumerating all possibly useful base cases and putting those in the table. (So what's "possibly useful"? That's the rub!) Then they enumerate the new "layer" of all possible problem versions that can be solved with the base cases. Then a layer of others that can be solved with those, etc. This continues until the "top level" problem turns up in a layer.
For the kinds of problems typically seen in textbooks and coding interviews, determining all useful base cases is deliberately easy. The problem parameters are 2 or 3 "dense" natural numbers, so the table of solutions can be a 2d or 3d array with all elements containing useful values. In many of these, you can prove that the current layer only depends on a few (possibly one) previous layer, so all the rest can be discarded, which saves memory.
Practical problems aren't often so nice. The parameter sets aren't small or aren't natural numbers, or - even when they are - they're sparse so that filling in all entries of an array would be a waste.
In these cases, memoization is the only reasonable choice. The top-down recursion determines the sub-problems (on down to the base cases) that need solving as they occur. Sparseness doesn't matter because the memo cache can store parameter sets as explicit keys. When the current layer doesn't need more than K previous ones, various strategies can still be applied to discard the others.
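To make the distinction concrete, here is a tiny sketch of my own (using Fibonacci purely as a stand-in for any recurrence): the memoized version fills its cache on demand, top-down, while the tabulated version fills every entry from 0 up to n, bottom-up.

from functools import lru_cache

@lru_cache(maxsize=None)        # memoization: top-down, cache filled only for subproblems actually needed
def fib_memo(n):
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

def fib_tab(n):                 # tabulation: bottom-up, every entry 0..n is filled iteratively
    table = [0] * (n + 1)
    if n >= 1:
        table[1] = 1
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

assert fib_memo(30) == fib_tab(30) == 832040

Note that the memo cache can just as easily be keyed on tuples or strings, which is what makes it the practical choice for the sparse parameter sets described above.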
I want to compare and understand the precise differences between the depth-first branch and bound (DFBnB) and IDA* algorithms. I browsed the internet but I am unable to find clear explanations. Please help!
IDA* does an f-cost limited depth-first search, pruning paths that are more expensive (according to the lower-bound heuristic) than the current cost bound. It gradually increases the bound until a solution is found.
DFBnB searches through a tree keeping track of the best solution found thus far, gradually decreasing the cost of the best solution until it is optimal. DFBnB also uses a lower-bound heuristic to prune any paths that are more expensive than the current best solution.
Some algorithms, like Budgeted Tree Search, do both types of pruning - using both current cost bounds and the best found solution thus far.
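For concreteness, here is a small sketch of my own (on a toy four-node graph with a deliberately simple admissible heuristic) showing the two pruning rules side by side; it is an illustration of the idea, not production code:

import math

# Toy weighted graph, purely illustrative; HEURISTIC gives admissible lower bounds to the goal G.
GRAPH = {"S": {"A": 1, "B": 4}, "A": {"G": 5}, "B": {"G": 1}, "G": {}}
HEURISTIC = {"S": 2, "A": 1, "B": 1, "G": 0}

def successors(n): return GRAPH[n].items()          # yields (child, step_cost) pairs
def h(n): return HEURISTIC[n]
def is_goal(n): return n == "G"

def ida_star(start):
    bound = h(start)                                # f-cost limit starts at the root's heuristic
    while True:
        t = _dfs(start, 0, bound)
        if isinstance(t, tuple):
            return t                                # (optimal_cost, path)
        if t == math.inf:
            return None                             # no solution at all
        bound = t                                   # raise the limit to the smallest pruned f-cost

def _dfs(node, g, bound):
    f = g + h(node)
    if f > bound:
        return f                                    # IDA*: prune paths whose lower bound exceeds the limit
    if is_goal(node):
        return (g, [node])
    minimum = math.inf
    for child, step in successors(node):
        t = _dfs(child, g + step, bound)
        if isinstance(t, tuple):
            return (t[0], [node] + t[1])
        minimum = min(minimum, t)
    return minimum

def dfbnb(node, g=0, best=math.inf):
    if g + h(node) >= best:
        return best                                 # DFBnB: prune against the best solution found so far
    if is_goal(node):
        return g
    for child, step in successors(node):
        best = dfbnb(child, g + step, best)
    return best

print(ida_star("S"))                                # (5, ['S', 'B', 'G'])
print(dfbnb("S"))                                   # 5

The key difference is visible in the two prune tests: IDA* compares g + h against a global cost bound that grows between iterations, while DFBnB compares it against the cost of the best complete solution found so far, which shrinks.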
I have seen some MCTS implementations online and how they are used in a game.
A best move is calculated for each turn based on the state at that moment.
If you have a sequence of moves in a game between human and computer like:
turn_h1,turn_c1,turn_h2,turn_c2,turn_h3,turn_c3,....turn_hn,turn_cn
where turn_h(i) is the human's move, turn_c(i) is the computer's move, and i is the index of that player's i-th move.
For each of the computer's turns i there is a corresponding state that is used to determine the i-th best move with MCTS.
Question: should the tree built for the (i-1)-th turn (best move) be used for the i-th turn (MCTS best move)?
I mean, should the tree which resulted from the best-move search at the (i-1)-th state be used as input for determining the best move at the i-th state?
In other words, can I re-use already constructed tree nodes from previous turns' best-move calculations, so that I do not need to build the whole tree again?
I have created a sequence of turns in pseudo-code just to make clear what I mean by using the (i-1)-th state (tree) to feed the next MCTS best-move calculation (of course, in the real world the logic below would be implemented as a loop):
#start game
initial_game_state.board= initialize_board()
#turn 1
#human play
new_game_state_1 = initial_game_state.board.make_move(human_move_1)
#computer play
computer_move_1 = MCTS.determine_bestmove(new_game_state_1)
new_game_state_2 = new_game_state_1.board.make_move(computer_move_1)
#turn 2
#human play
new_game_state_3 = new_game_state_2.board.make_move(human_move_2)
#computer play
computer_move_2 = MCTS.determine_bestmove(new_game_state_3)
new_game_state_4 = new_game_state_3.board.make_move(computer_move_2)
#turn 3
# ....
Yes, you can do this. It is commonly referred to as "tree reuse" (at least, that's what I usually call it).
You would start out your MCTS call (except for the very first one, in which there is no "previous tree" yet) by navigating from the root node to the node that corresponds to the one you have actually reached in the "real" game.
Note that, in a two-player alternating-move game, this involves not only a move that your MCTS agent made, but also a move made by the opponent. Due to how MCTS works, if the opponent "surprised" your MCTS agent by selecting a move that MCTS didn't predict, it is likely that this leads to a subtree of the previous tree that had relatively few visits. In this case, tree reuse won't have much effect. But in cases where the opponent doesn't surprise you, and plays exactly what MCTS already predicted during the previous search, you may end up getting a relatively large subtree to initialise your new search with.
As for whether you "should" do this, as per the literal wording in your question... you don't have to. There are many MCTS implementations out there which don't do this. I'd generally recommend it anyway. It's not too difficult to implement. It generally won't give a big boost in performance (because the playing strength of MCTS tends to scale sub-linearly with increases in "thinking time"), but it definitely shouldn't hurt either, and may give a small boost in playing strength.
Note that, in nondeterministic games, if you implement an "open-loop" variant of MCTS (without explicit chance nodes), the part of the subtree that you're "re-using" will be partially based on outdated information. In such games, it may be beneficial to discount all the statistics gathered in your previous search (i.e. multiply all your visit counts and accumulated scores by a number between 0 and 1) before starting the new search process.
Important implementation detail: when re-using the previous tree, if your new root node (which used to be a node in the middle of your previous tree) has a reference/pointer back to its parent node, make sure to set it to null. If you forget about this, all search trees of all your previous searches will fully persist in memory throughout an entire game, and you'll likely run out of memory quickly.
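Here is a minimal sketch of how tree reuse might look, assuming a hypothetical Node class with a children dictionary keyed by move and a parent pointer (your data structures will differ); the variable names in the usage comment refer to the pseudo-code in the question:

from dataclasses import dataclass, field

@dataclass
class Node:                        # hypothetical MCTS node; your implementation will differ
    state: object = None
    parent: "Node" = None
    children: dict = field(default_factory=dict)   # keyed by move
    visits: int = 0
    total_score: float = 0.0

def reuse_subtree(previous_root, computer_move, human_move):
    """Return the node matching the current real position, or None if it was never expanded."""
    node = previous_root.children.get(computer_move)
    if node is not None:
        node = node.children.get(human_move)
    if node is None:
        return None                # the opponent's reply was never expanded: start a fresh tree
    node.parent = None             # detach, so the rest of the old tree can be garbage-collected
    return node

# Hypothetical use inside the game loop, reusing the tree from the previous search:
# root = reuse_subtree(previous_root, computer_move_1, human_move_2) or Node(state=new_game_state_3)
# move = mcts_search(root)         # run the usual MCTS iterations starting from this root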
I've written a GA to model a handful of stocks (4) over a period of time (5 years). It's impressive how quickly the GA can find an optimal solution for the training data, but I am also aware that this is mainly due to its tendency to over-fit in the training phase.
However, I still thought I could take a few precautions and get some kind of prediction on a set of unseen test stocks from the same period.
One precaution I took was:
When multiple stocks can be bought on the same day the GA only buys one from the list and it chooses this one randomly. I thought this randomness might help to avoid over-fitting?
Even if over-fitting is still occurring, shouldn't it be absent in the initial generations of the GA, since it hasn't had a chance to over-fit yet?
As a note, I am aware of the no-free-lunch theorem, which demonstrates (I believe) that there is no perfect set of parameters that will produce an optimal output for two different datasets. If we take this further, does the no-free-lunch theorem also prohibit generalization?
The graph below illustrates this.
-> The blue line is the GA output.
-> The red line is the training data (slightly different because of the aforementioned randomness).
-> The yellow line is the stubborn test data, which shows no generalization. In fact, this is the most flattering graph I could produce.
The y-axis is profit; the x-axis is the trading strategies sorted from worst to best (left to right) according to their respective profits (on the y-axis).
Some of the best advice I've received so far (thanks seaotternerd) is to focus on the earlier generations and increase the number of training examples. The graph below has 12 training stocks rather than just 4, and shows only the first 200 generations (instead of 1,000). Again, it's the most flattering chart I could produce, this time with medium selection pressure. It certainly looks a little bit better, but not fantastic either. The red line is the test data.
The problem with over-fitting is that, within a single data-set it's pretty challenging to tell over-fitting apart from actually getting better in the general case. In many ways, this is more of an art than a science, but here are some general guidelines:
A GA will learn to do exactly what you attach fitness to. If you tell it to get really good at predicting one series of stocks, it will do that. If you keep swapping in different stocks to predict, though, you might be more successful at getting it to generalize. There are a few ways to do this. The one that has had perhaps the most promising results for reducing over-fitting is imposing spatial structure on the population and evaluating on different test cases in different cells, as in the SCALP algorithm. You could also switch out the test cases on a time basis, but I've had more mixed results with that sort of an approach.
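As a rough illustration of the "swap the training cases" idea, here is a hypothetical sketch of my own (evaluate and select_and_breed stand in for whatever fitness function and selection scheme your GA already uses):

import random

def evolve(population, all_stocks, evaluate, select_and_breed,
           generations=200, sample_size=4):
    """Hypothetical GA loop that re-samples the training stocks every generation,
    so no single price series can be memorised across the whole run."""
    for _ in range(generations):
        batch = random.sample(all_stocks, sample_size)           # fresh training cases this generation
        fitness = [evaluate(individual, batch) for individual in population]
        population = select_and_breed(population, fitness)
    return population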
You are correct that over-fitting should be less of a problem early on. Generally, the longer you run a GA, the more over-fitting will be possible. Typically, people tend to assume that the general rules will be learned first, before the rote memorization of over-fitting takes place. However, I don't think I've actually ever seen this studied rigorously - I could imagine a scenario where over-fitting was so much easier than finding general rules that it happens first. I have no idea how common that is, though. Stopping early will also reduce the ability of the GA to find better general solutions.
Using a larger data-set (four stocks isn't that many) will make your GA less susceptible to over-fitting.
Randomness is an interesting idea. It will definitely hurt the GA's ability to find general rules, but it should also reduce over-fitting. Without knowing more about the specifics of your algorithm, it's hard to say which would win out.
That's a really interesting thought about the no-free-lunch theorem. I'm not 100% sure, but I think it does apply here to some extent - better fitting some data will make your results fit other data worse, by necessity. However, as wide as the range of possible stock behaviors is, it is much narrower than the range of all possible time series in general. This is why it is possible to have optimization algorithms at all - a given problem that we are working with tends to produce data that cluster relatively closely together, relative to the entire space of possible data. So, within that set of inputs that we actually care about, it is possible to get better. There is generally an upper limit of some sort on how well you can do, and it is possible that you have hit that upper limit for your data-set. But generalization is possible to some extent, so I wouldn't give up just yet.
Bottom line: I think that varying the test cases shows the most promise (although I'm biased, because that's one of my primary areas of research), but it is also the most challenging solution, implementation-wise. So as a simpler fix you can try stopping evolution sooner or increasing your data-set.
In game tree search there are many algorithms to find the optimal move, like the minimax algorithm. I have started learning how to solve this problem with minimax, and the algorithm itself is clear, but I'm confused about the tree itself. In games like tic-tac-toe the number of nodes is not very large, but in games like chess there are a huge number of nodes, and I think this would need a lot of memory. So are there any algorithms that evaluate and build the tree at the same time?
A tree of game states is not normally built as a complete data structure. Instead, states are evaluated as they are created, and most are discarded in the process. Often, a linked-list from the state being evaluated back to the current state of the game is maintained. But if one move is shown to be much better than another, then the entire line for the poor move will be discarded, so it will occupy no space in memory.
One simple way to search the state space for a game like chess is to do the search recursively to a given depth. In that case, very few game states actually exist at one time, and those that do exist are simply referenced on the call-stack. More sophisticated algorithms will create a larger tree, but (especially for chess) none will maintain a tree of all possible states. For chess, a breadth-first search may be better, using a queue rather than a stack, and this will maintain only states at a certain depth in the tree. Even better would be a priority queue in which the best states are stored for further evaluation, and the worst states are discarded completely.
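As a sketch of the first approach (my own illustration, assuming hypothetical legal_moves, evaluate, apply and undo helpers on your state object): a recursive depth-limited negamax only ever keeps the current line of play alive, on the call stack, and discards each child position as soon as it has been scored.

def negamax(state, depth):
    """Depth-limited negamax; only the positions along the current line exist at any moment."""
    if depth == 0 or state.is_terminal():
        return evaluate(state)             # heuristic value at the horizon, exact value at a terminal
    best = float("-inf")
    for move in legal_moves(state):
        state.apply(move)                  # create the child position in place
        best = max(best, -negamax(state, depth - 1))
        state.undo(move)                   # and discard it again: nothing is stored permanently
    return best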