MAXMAX递归如何工作? [英] How exactly does minimax recursion work?

查看:262
本文介绍了MAXMAX递归如何工作?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

所以我在寻找一个Tic-Tac-Toe游戏的Mini-max,但不知道递归是如何工作的?好吧,所以基本上这里是我的问题:

So I was looking up Mini-max for a Tic-Tac-Toe Game, but couldn't understand how the recursion worked? Okay, so basically here are my questions:


  1. minimax如何知道他的轮到呢?

  2. 如何产生可能的举动?

  3. 你如何知道什么时候在终端节点,以及如何生成终端节点?

例如在此伪代码

function integer minimax(node, depth)
if node is a terminal node or depth <= 0:
    return the heuristic value of node
α = -∞
for child in node: # evaluation is identical for both players
    α = max(α, -minimax(child, depth-1))
return α

A 节点正确?而且深度有多少层的代码在递归中下降?还有什么是 max 函数,从哪里生成的节点?

A node is a board correct? And is the depth how many plies the code has to go down in recursion? Also what is the max function and where are the nodes being generated from?

现在,

class Board{
    public:
        Board();
        ~Board(){};
    public: // The board
        // In the board, 1 is x, 2 is o, 0 is empty square.
        int board[3][3];
};

但我怎么知道是谁?

推荐答案

我们将使用你的tic-tac-toe作为例子

We'll use your tic-tac-toe as an example first.


  • 最小化算法适用于玩家交替轮转的游戏,但可以适用于玩家每回合可以进行多次移动的游戏。为了简单起见,我们假设前者。在这种情况下,您不需要为每个节点存储X移动或O移动,因为这只能由节点深度的奇偶性确定(无论我是偶数步长还是奇数

  • 从每个位置生成可能的移动要求您知道它的移动是什么(可以像以前一样确定),以及合法移动的规则特定位置。对于像井字游戏这样简单的游戏,给定位置,可以枚举所有状态,其包括当前位置的副本加上属于当前播放器的新片段,依次放置在每个空的正方形。对于像奥赛罗的游戏,你还必须检查每个位置,以确保它遵循规则,并根据规则的后果更新最终位置(对于奥赛罗,翻转一堆的颜色)。一般来说,从您正在追踪的每个有效位置,您都会枚举新作品的所有可能排列,并检查规则集允许哪些排列。

  • 一般来说,生成整个树,因为游戏树大小可以轻易超过地球的存储容量。您始终设置最大迭代深度。然后,终端节点仅仅是最大深度处的节点,或者是不存在合法移动的节点(对于井字,填充每个方块的板)。您不预先生成终端节点;它们在游戏树构建期间自然生成。井字游戏非常简单,您可以 生成整个游戏树,但不要尝试使用井字游戏代码。

  • A minimax algorithm works best for games where players alternate turns, but can be adapted to games where players may make multiple moves per turn. We'll assume the former, for simplicity. In that case, you need not store 'X to move' or 'O to move' with each node, because that can just be determined by the parity of the node depth (whether I'm an even number of steps, or an odd number of steps, from the top).
  • Generating possible moves from each position requires that you know whose move it is (which can be determined as before), and the rules for legal moves from a particular position. For a simple game like tic-tac-toe, given a position, it suffices to enumerate all the states that consist of a copy of the current position plus a new piece, belonging to the current player, placed at each empty square in turn. For games like Othello, you must also check each placement to ensure that it follows the rules, and update the final position according to the consequences of the rule (for Othello, flipping the colors of a bunch of pieces). In general, from each valid position you're tracking, you enumerate all the possible placings of a new piece and check to see which ones are allowed by the ruleset.
  • In general, you NEVER generate the entire tree, since game tree sizes can easily exceed the storage capacity of Earth. You always set a maximum depth of iteration. A terminal node, then, is simply a node at the maximum depth, or a node from which no legal moves exist (for tic-tac-toe, a board with every square filled). You don't generate the terminal nodes beforehand; they get generated naturally during game tree construction. Tic-tac-toe is simple enough that you can generate the entire game tree, but then don't try to use your tic-tac-toe code for e.g. Othello.

查看您的伪代码:


  • max(a,b)是返回 a b

  • depth 是您要搜索的最大深度。

  • 您计算的启发式值是一些描述电路板值的数值。对于像tic-tac-toe这样的游戏,你可以枚举整个游戏树,你可以指定 1 为玩家赢得的棋盘位置对于其他玩家获胜的棋盘位置,分析 -1 ,对于任何不确定的位置, 0 一般来说,您必须自行制作启发式方法,或使用已被广泛接受的方法。

  • 您可以根据其父节点在分析期间即时生成节点。

  • max(a, b) is any function that returns the larger of a or b. This is usually provided by a math library or similar.
  • The depth is the maximum depth to which you will search.
  • The heuristic value you're computing is some numerical value that describes the value of the board. For a game like tic-tac-toe, which is simple enough that you CAN enumerate the entire game tree, you can designate 1 for a board position that wins for the player doing the analysis, -1 for a board position that wins for the other player, and 0 for any inconclusive position. In general, you'll have to cook up a heuristic yourself, or use a well-accepted one.
  • You generate the nodes on the fly during your analysis based on their parent nodes. Your root node is always the position from which you're doing analysis.

如果你还没有使用图形或树,你的根节点总是从中进行分析的位置。我建议你先这样做;

If you haven't worked with graphs or trees yet, I suggest you do so first; the tree primitive, in particular, is essential to this problem.

作为对这个问题的答案,在这个线程中的一个注释要求一个例子,确定它的一个给定的节点的转弯,我提供这个伪Python:

As an answer to a comment in this thread asking for an example of determining whose turn it is for a given node, I offer this pseudo-Python:

who_started_first = None

class TreeNode:
    def __init__(self, board_position = EMPTY_BOARD, depth = 0):
        self.board_position = board_position
        self.children = []
        self.depth = depth
    def construct_children(self, max_depth):
        # call this only ONCE per node!
        # even better, modify this so it can only ever be called once per node
        if max_depth > 0:

            ### Here's the code you're actually interested in.
            if who_started_first == COMPUTER:
                to_move = (COMPUTER if self.depth % 2 == 0 else HUMAN)
            elif who_started_first == HUMAN:
                to_move = (HUMAN if self.depth % 2 == 0 else COMPUTER)
            else:
                raise ValueError('who_started_first invalid!')

            for position in self.board_position.generate_all(to_move):
                # That just meant that we generated all the valid moves from the
                # currently stored position. Now we go through them, and...
                new_node = TreeNode(position, self.depth + 1)
                self.children.append(new_node)
                new_node.construct_children(max_depth - 1)

每个节点都能够跟踪其从根节点的绝对深度。当我们尝试确定如何为下一步移动产生板块位置时,我们检查它的移动是基于我们深度的奇偶性( self.depth%2 )和我们的第一个移动的记录。

Each node is capable of keeping track of its absolute depth from the 'root' node. When we try to determine how we should generate board positions for the next move, we check to see whose move it is based on the parity of our depth (the result of self.depth % 2) and our record of who moved first.

这篇关于MAXMAX递归如何工作?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆