极大极小算法的实现错误在哪里? [英] Where's error in implementation of minimax algorithm?

查看:29
本文介绍了极大极小算法的实现错误在哪里?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

它在 Tic-Tac-Toe 游戏中的实现存在一个小问题.对于以下组合:

['x', 'o', 'e','o', 'e', 'e','e', 'e', 'e']

最好的选择是

['x', 'o', 'e','o', 'x', 'e','e', 'e', 'e']

但它返回我认为最合适的一个:

['x', 'o', 'x','o', 'e', 'e','e', 'e', 'e']

在这种情况下,AI 输了.代码如下:

var board = ['x', 'o', 'e', 'o', 'e', 'e', 'e', 'e', 'e'];var signPlayer = 'o';var signAI = (signPlayer === 'x') ?'o' : 'x';游戏 = {结束:功能(板){for (var i = 0; i  best_score){best_score = 分数;bestMove = 新游戏;}}返回最佳移动;}功能分钟(董事会){如果(游戏结束(棋盘)){退货板;}var newGame = [];var 最差移动 = [];无功分数;变量最差得分 = +无穷大;var moveArray = game.possible_moves(board, signPlayer);for (var i = 0; i 

解决方案

存在以下问题:

  • over 方法为某些板提供错误的输出,例如对于此板:

    ['e', 'e', 'e', 'o', 'o', 'o', 'x', 'x', 'e']

    当它在前三个元素中找到三个 e 值并返回 false 后,它实际上将停止查找,即它没有看到第二行的胜利o.要修复,请更改此行:

    return board[i] !== 'e' ?板[i]:假;

    到:

    if (board[i] !== 'e') return board[i];

    如果在一行中找到三个 e,这将使函数继续进行其他检查.在其他循环中也需要类似的修复(最后一个除外).

  • 虽然极大极小算法成功访问了搜索树中的节点,但它并没有将找到的叶分数(0、-100 或 100)带回搜索树中.相反,您仅通过查看静态棋盘配置来重新计算每个位置的分数,忽略您可以从递归调用中获得的最佳/最差分数.为了解决这个问题,让 minmax 函数不仅返回最佳移动,而且还返回与之相关的分数.所以替换这个:

    return bestMove;

    与:

    return [best_score, bestMove];

    然后你从递归调用中获取分数,如果你替换这个:

    score = moveScore(min(newGame));

    与:

    score = min(newGame)[0];

    对于游戏结束的情况,您需要进行类似的更改.替换这个:

    if (game.over(board)) {退货板;}

    与:

    if (game.over(board)) {返回 [moveScore(board), []];}

    请注意,这是调用moveScore的唯一合适时机.数组的第二个元素应该是最好的移动,但由于没有移动,最好只使用一个空数组.

  • 这是一个小问题:您实际上并没有使用从主要调用中获得的最佳移动到 max.通过上述更改,您可以在主调用中获得最佳移动:

    [score, nextBoard] = max(board);

这是您更正的代码,最后还有一些额外的代码,允许通过点击 3x3 网格来玩游戏.为此,我将代码 e 更改为空格,因为它在印制板上看起来更好:

var board = ['x', 'o', ' ', 'o', ' ', ' ', ' ', '', ' '];var signPlayer = 'o';var signAI = (signPlayer === 'x') ?'o' : 'x';var 游戏 = {结束:功能(板){for (var i = 0; i  best_score){best_score = 分数;bestMove = 新游戏;}}//返回最佳移动;返回 [best_score, bestMove];}功能分钟(董事会){//if (game.over(board)) {//返回板;//}如果(游戏结束(棋盘)){返回 [moveScore(board), []];}var newGame = [];变量最差移动 = [];无功分数;变量最差得分 = +无穷大;var moveArray = game.possible_moves(board, signPlayer);for (var i = 0; i  tds[i].textContent = v);span.textContent = 分数;}显示(板);table.onclick = 函数 (e) {var i = tds.indexOf(e.target);if (i == -1 || board[i] !== ' ' || game.over(board)) 返回;板[i] = signPlayer;显示(板);[得分,棋盘] = 最大值(棋盘,1);显示(板,分数);}

td { border: 1px solid;宽度:20px;文本对齐:居中;光标:手 }tr { 高度:25px;v-align: 中间 }

<tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr>
<div>得分:<span></span>

最后说明

我刚刚进行了更正以使其工作,但请注意有几种方法可以提高效率.为此,您可以使用 alpha-beta 搜索、跟踪已评估板的分数,同时通过翻译(旋转、镜像)和变异板来映射类似板,而不是每次移动时都创建一个新板.

There is one little problem with it's implementation for a Tic-Tac-Toe game. For the following combination:

['x', 'o', 'e',  
 'o', ' e', 'e',  
 'e', ' e', 'e']  

the best choice would be

['x', 'o', 'e',  
 'o', ' x', 'e',  
 'e', ' e', 'e']  

but it returns as I suppose the nearest suitable one:

['x', 'o', 'x',  
 'o', ' e', 'e',  
 'e', ' e', 'e']   

And in this case AI loses. Here is the code:

var board = ['x', 'o', 'e', 'o', 'e', 'e', 'e', 'e', 'e'];
var signPlayer = 'o';
var signAI = (signPlayer === 'x') ? 'o' : 'x';

game = {
    over: function(board) {
        for (var i = 0; i < board.length; i += 3) {
            if (board[i] === board[i + 1] && board[i + 1] === board[i + 2]) {
                return board[i] !== 'e' ? board[i] : false;
            }
        }
        for (var j = 0; j < 3; j++) {
            if (board[j] === board[j + 3] && board[j + 3] === board[j + 6]) {
                return board[j] !== 'e' ? board[j] : false;
            }
        }
        if ((board[4] === board[0] && board[4] === board[8]) || 
        (board[4] === board[2] && board[4] === board[6])) {
            return board[4] !== 'e' ? board[4] : false;
        }
        return ( board.every(function(element) {
            return element !== 'e';
        })) 
    },
    winner: function(board) {
        return game.over(board);
    },
    possible_moves: function(board, sign) {
        var testBoard = [], 
        nextBoard;
        for (var i = 0; i < board.length; i++) {
            nextBoard = board.slice();
            if (nextBoard[i] === 'e') {
                nextBoard[i] = sign;
                testBoard.push(nextBoard);
            }
        }
        return testBoard;
    }
}

function moveScore(board) {
    var result = game.winner(board);

    if (result === signPlayer) {
        return -100;
    }
    if (result === signAI) {
        return +100;
    }
    return 0;
    //Game is a draw
}

function max(board) {

    if (game.over(board)) {
        return board;
    }
    var newGame = [];
    var bestMove = [];
    var score;
    var best_score = -Infinity;
    var movesArray = game.possible_moves(board, signAI);

    for (var i = 0; i < movesArray.length; i++) {
        newGame = movesArray[i].slice();
        score = moveScore(min(newGame));
        if (score > best_score) {
            best_score = score;
            bestMove = newGame;
        }
    }
    return bestMove;
}

function min(board) {

    if (game.over(board)) {
        return board;
    }
    var newGame = [];
    var worstMove = [];
    var score;
    var worst_score = +Infinity;
    var movesArray = game.possible_moves(board, signPlayer);

    for (var i = 0; i < movesArray.length; i++) {
        newGame = movesArray[i].slice();
        score = moveScore(max(newGame));
        if (score < worst_score) {
            worst_score = score;
            worstMove = newGame;
        }
    }
    return worstMove;
}

max(board);

解决方案

There are the following issues:

  • The over method gives wrong output for some boards, like for instance for this board:

    ['e', 'e', 'e', 'o', 'o', 'o', 'x', 'x', 'e']
    

    It will actually stop looking after it finds the three e values in the first three elements and return false, i.e. it does not see the win on the second row for o. To fix, change this line:

    return board[i] !== 'e' ? board[i] : false;
    

    to:

    if (board[i] !== 'e') return board[i];
    

    This will make the function continue with the other checks if it finds three e in a row. Similar fixes are necessary in the other loops (except the very last one).

  • Although the minimax algorithm visits the nodes in the search tree succesfully, it does not carry the found leaf-score (0, -100 or 100) back up in the search tree. Instead you recalculate each position's score by just looking at a static board configuration, ignoring the best/worst score you could get from the recursive call. To fix this, let the min and max function not only return the best move, but also the score associated with it. So replace this:

    return bestMove;
    

    with:

    return [best_score, bestMove];
    

    And then you pick up the score from the recursive call, if you replace this:

    score = moveScore(min(newGame));
    

    with:

    score = min(newGame)[0];
    

    You need to do a similar change for the case where the game is over. Replace this:

    if (game.over(board)) {
        return board;
    }
    

    with:

    if (game.over(board)) {
        return [moveScore(board), []];
    }
    

    Note that this is the only right moment to call moveScore. The second element of the array should be the best move, but as there is no move, it is better to just use an empty array for that.

  • This is a minor issue: you don't actually use the best move you get from the main call to max. With the above change, you could get both the best move and its score in the main call:

    [score, nextBoard] = max(board);
    

Here is your corrected code, with some additional code at the end to allow a game to be played by clicking on a 3x3 grid. For that purpose I have changed the code e to a space, as it looks better on a printed board:

var board = ['x', 'o', ' ', 'o', ' ', ' ', ' ', ' ', ' '];
var signPlayer = 'o';
var signAI = (signPlayer === 'x') ? 'o' : 'x';

var game = {
    over: function(board) {
        for (var i = 0; i < board.length; i += 3) {
            if (board[i] === board[i + 1] && board[i + 1] === board[i + 2]) {
                //return board[i] !== ' ' ? board[i] : false;
                if (board[i] !== ' ') return board[i];
            }
        }
        for (var j = 0; j < 3; j++) {
            if (board[j] === board[j + 3] && board[j + 3] === board[j + 6]) {
                //return board[j] !== ' ' ? board[j] : false;
                if (board[j] !== ' ') return board[j];
            }
        }
        if ((board[4] === board[0] && board[4] === board[8]) || 
                (board[4] === board[2] && board[4] === board[6])) {
            //return board[4] !== ' ' ? board[4] : false;
            if (board[4] !== ' ') return board[4];
        }
        return ( board.every(function(element) {
            return element !== ' ';
        })) 
    },
    winner: function(board) {
        return game.over(board);
    },
    possible_moves: function(board, sign) {
        var testBoard = [], 
        nextBoard;
        for (var i = 0; i < board.length; i++) {
            nextBoard = board.slice();
            if (nextBoard[i] === ' ') {
                nextBoard[i] = sign;
                testBoard.push(nextBoard);
            }
        }
        return testBoard;
    }
}

function moveScore(board) {
    var result = game.winner(board);

    if (result === signPlayer) {
        return -100;
    }
    if (result === signAI) {
        return +100;
    }
    return 0;
    //Game is a draw
}

function max(board) {
    //if (game.over(board)) {
    //    return board;
    //}
    if (game.over(board)) {
        return [moveScore(board), []];
    }
    var newGame = [];
    var bestMove = [];
    var score;
    var best_score = -Infinity;
    var movesArray = game.possible_moves(board, signAI);
    for (var i = 0; i < movesArray.length; i++) {
        newGame = movesArray[i].slice();
        //score = moveScore(min(newGame));
        score = min(newGame)[0];
        if (score > best_score) {
            best_score = score;
            bestMove = newGame;
        }
    }
    //return bestMove;
    return [best_score, bestMove];
}

function min(board) {
    //if (game.over(board)) {
    //    return board;
    //}
    if (game.over(board)) {
        return [moveScore(board), []];
    }
    var newGame = [];
    var worstMove = [];
    var score;
    var worst_score = +Infinity;
    var movesArray = game.possible_moves(board, signPlayer);

    for (var i = 0; i < movesArray.length; i++) {
        newGame = movesArray[i].slice();
        //score = moveScore(max(newGame));
        score = max(newGame)[0];
        if (score < worst_score) {
            worst_score = score;
            worstMove = newGame;
        }
    }
    //return worstMove;
    return [worst_score, worstMove];
}

// Extra code for adding a simple GUI

var board = [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' '];
var score = null;

var tds = Array.from(document.querySelectorAll('td'));
var table = document.querySelector('table');
var span = document.querySelector('span');

function display(board, score) {
    board.forEach( (v, i) => tds[i].textContent = v );
    span.textContent = score;
}
display(board);

table.onclick = function (e) {
    var i = tds.indexOf(e.target);
    if (i == -1 || board[i] !== ' ' || game.over(board)) return;
    board[i] = signPlayer;
    display(board);
    [score, board] = max(board, 1);
    display(board, score);
}

td { border: 1px solid; width: 20px; text-align: center; cursor: hand }
tr { height: 25px; v-align: middle }

<table>
<tr><td></td><td></td><td></td></tr>
<tr><td></td><td></td><td></td></tr>
<tr><td></td><td></td><td></td></tr>
</table>
<div>
Score: <span></span>
</div>

Final note

I have just made the corrections to make it work, but note there are several ways to improve the efficiency. This you can do by using an alpha-beta search, tracking scores for already evaluated boards, while mapping similar boards by translations (turning, mirroring), and mutating boards instead of creating a new board each time you play a move.

这篇关于极大极小算法的实现错误在哪里?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆