Minimax 为一个白痴解释 [英] Minimax explained for an idiot

查看:36
本文介绍了Minimax 为一个白痴解释的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经浪费了我一整天的时间试图使用极小极大算法来制作无与伦比的 tictactoe AI.一路上我错过了一些东西(脑残).

我不是在这里寻找代码,只是更好地说明我哪里出错了.

这是我当前的代码(由于某种原因,minimax 方法总是返回 0):

from copy import deepcopy类 Square(对象):def __init__(self, player=None):self.player = 玩家@财产定义为空(自我):返回 self.player 是 None类板(对象):win_combos = ([0, 1, 2], [3, 4, 5], [6, 7, 8], [0, 3, 6], [1, 4, 7], [2, 5, 8],[0, 4, 8], [2, 4, 6],)def __init__(self, squares={}):self.squares = 正方形对于范围内的 i (9):如果 self.squares.get(i) 是 None:self.squares[i] = Square()@财产def available_moves(self):return [k for k, v in self.squares.iteritems() if v.empty]@财产定义完成(自我):对于 self.spiring_combos 中的组合:组合可用 = 真对于 pos 组合:如果不是在 self.available_moves 中的位置:组合可用 = 假如果组合可用:返回 self.winner 不是 None返回真@财产def player_won(self):返回 self.winner == 'X'@财产def computer_won(self):返回 self.winner == 'O'@财产定义绑定(自我):返回 self.complete == True 和 self.winner 是 None@财产定义获胜者(自己):对于 ('X', 'O') 中的玩家:位置 = self.get_squares(玩家)对于 self.spiring_combos 中的组合:赢 = 真对于 pos 组合:如果 pos 不在位置:胜利 = 错误如果赢了:回归球员返回无@财产def启发式(自我):如果 self.player_won:返回-1elif self.tied:返回 0elif self.computer_won:返回 1def get_squares(self, player):return [k for k,v in self.squares.iteritems() if v.player == player]def make_move(self, position, player):self.squares[position] = Square(玩家)def minimax(self, node, player):如果 node.complete:返回 node.heuristica = -1e10000在 node.available_moves 中移动:child = deepcopy(节点)child.make_move(移动,玩家)a = max([a, -self.minimax(child, get_enemy(player))])返回一个def get_enemy(玩家):如果玩家=='X':返回'O'返回X"

解决方案

您的完整功能未按预期工作,导致游戏在任何事情发生之前都被声明为绑定.例如,考虑以下设置:

<代码>>>o获胜 = {1:正方形('X'),3:正方形('O'),4:正方形('X'),6:正方形('O'),8:正方形('X'),}>>nb = Board(oWinning)>>完整版真的>>绑定真的

这应该是计算机下一步的胜利.相反,它表示游戏已绑定.

问题是您的逻辑现在完整地检查组合中的所有方块是否都是空闲的.如果其中任何一个不是,则假定无法赢得该组合.它需要做的是检查该组合中是否有任何位置被占用,只要所有这些组合都是 None 或同一玩家,该组合就应被视为仍然可用.

例如

def available_combos(self, player):返回 self.available_moves + self.get_squares(player)@财产定义完成(自我):对于 ('X', 'O') 中的玩家:对于 self.spiring_combos 中的组合:组合可用 = 真对于 pos 组合:如果不是在 self.available_combos(player) 中的位置:组合可用 = 假如果组合可用:返回 self.winner 不是 None返回真

现在我用更新的代码正确地测试了这个,我在这个测试用例上得到了预期的结果:

<预><代码>>>>nb.minimax(nb, 'O')-1>>>nb.minimax(nb, 'X')1

I've wasted my entire day trying to use the minimax algorithm to make an unbeatable tictactoe AI. I missed something along the way (brain fried).

I'm not looking for code here, just a better explanation of where I went wrong.

Here is my current code (the minimax method always returns 0 for some reason):

from copy import deepcopy


class Square(object):
    def __init__(self, player=None):
        self.player = player
    
    @property
    def empty(self):
        return self.player is None


class Board(object):
    winning_combos = (
        [0, 1, 2], [3, 4, 5], [6, 7, 8], [0, 3, 6], [1, 4, 7], [2, 5, 8],
        [0, 4, 8], [2, 4, 6],
    )
    
    def __init__(self, squares={}):
        self.squares = squares
        for i in range(9):
            if self.squares.get(i) is None:
                self.squares[i] = Square()
    
    @property
    def available_moves(self):
        return [k for k, v in self.squares.iteritems() if v.empty]
    
    @property
    def complete(self):
        for combo in self.winning_combos:
            combo_available = True
            for pos in combo:
                if not pos in self.available_moves:
                    combo_available = False
            if combo_available:
                return self.winner is not None
        return True
    
    @property
    def player_won(self):
        return self.winner == 'X'
    
    @property
    def computer_won(self):
        return self.winner == 'O'
    
    @property
    def tied(self):
        return self.complete == True and self.winner is None
    
    @property
    def winner(self):
        for player in ('X', 'O'):
            positions = self.get_squares(player)
            for combo in self.winning_combos:
                win = True
                for pos in combo:
                    if pos not in positions:
                        win = False
                if win:
                    return player
        return None
    
    @property
    def heuristic(self):
        if self.player_won:
            return -1
        elif self.tied:
            return 0
        elif self.computer_won:
            return 1
    
    def get_squares(self, player):
        return [k for k,v in self.squares.iteritems() if v.player == player]
    
    def make_move(self, position, player):
        self.squares[position] = Square(player)
    
    def minimax(self, node, player):
        if node.complete:
            return node.heuristic
        a = -1e10000
        for move in node.available_moves:
            child = deepcopy(node)
            child.make_move(move, player)
            a = max([a, -self.minimax(child, get_enemy(player))])
        return a


def get_enemy(player):
    if player == 'X':
        return 'O'
    return 'X'

解决方案

Your complete function is not working as expected, causing games to be declared tied before anything can happen. For instance, consider this setup:

>> oWinning = {
 1: Square('X'),
 3: Square('O'), 4: Square('X'),
 6: Square('O'), 8: Square('X'),
}
>> nb = Board(oWinning)
>> nb.complete
True
>> nb.tied
True

This should be a win for the computer on the next move. Instead, it says the game is tied.

The problem is that your logic in complete, right now, checks to see if all of the squares in a combo are free. If any of them are not, it presumes that that combo can't be won with. What it needs to do is check if any positions in that combo are occupied, and so long as all of those combos are either None or the same player, that combo should be considered still available.

e.g.

def available_combos(self, player):
    return self.available_moves + self.get_squares(player)

@property
def complete(self):
    for player in ('X', 'O'):
        for combo in self.winning_combos:
            combo_available = True
            for pos in combo:
                if not pos in self.available_combos(player):
                   combo_available = False
            if combo_available:
                return self.winner is not None
    return True

Now that I properly tested this with the updated code I'm getting the expected result on this test case:

>>> nb.minimax(nb, 'O')
-1
>>> nb.minimax(nb, 'X')
1

这篇关于Minimax 为一个白痴解释的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆