Conversion of minimax with alpha beta pruning to negamax

Problem description

I've written a minimax algorithm with alpha-beta pruning for the game of Checkers, and now I'm trying to rewrite it using the negamax approach. I expect the two to be equivalent, since negamax is just a different way of writing minimax. But for some reason the two algorithms behave differently: when I run them on the same input, the negamax version seems to evaluate more states, so I suspect something is wrong with its alpha-beta pruning.
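
For reference, the equivalence I'm relying on is the zero-sum identity min(x, y) = -max(-x, -y): every min node can be rewritten as a negated max over the negated values of its children, with the child's (a, b) window becoming (-b, -a). That is exactly the rewrite my negamax function attempts.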

The code below shows both algorithms (the minimax and negamax functions) and, at the bottom, the play function from which I call them. The evaluate function is the basic heuristic I use to score states in both algorithms.

Any help with spotting the error would be much appreciated.

#include "player.hpp"
#include <algorithm>
#include <limits>
#include <cstdlib>

namespace checkers
{

int evaluatedStates = 0;

int evaluate(const GameState &state)
{
    // FIXME: Improve heuristics.
    int redScore = 0;
    int whiteScore = 0;
    int piece = 0;
    for (int i = 1; i <= 32; ++i)
    {
        piece = state.at(i);
        if (piece & CELL_RED) {
            ++redScore;
            if (piece & CELL_KING)
                redScore += 2;   // King bonus.
        } else if (piece & CELL_WHITE) {
            ++whiteScore;
            if (piece & CELL_KING)
                whiteScore += 2; // King bonus.
        }
    }
    return state.getNextPlayer() == CELL_RED ? whiteScore - redScore : redScore - whiteScore;
}

int minimax(const GameState &state, int depth, int a, int b, bool max)
{
    if (depth == 0 || state.isEOG()) {
        ++evaluatedStates;
        return evaluate(state);
    }
    std::vector<GameState> possibleMoves;
    state.findPossibleMoves(possibleMoves);
    if (max) {
        for (const GameState &move : possibleMoves) {
            a = std::max(a, minimax(move, depth - 1, a, b, false));
            if (b <= a)
                return b; // β cutoff.
        }
        return a;
    } else {
        for (const GameState &move : possibleMoves) {
            b = std::min(b, minimax(move, depth - 1, a, b, true));
            if (b <= a)
                return a; // α cutoff.
        }
        return b;
    }
}

int negamax(const GameState &state, int depth, int a, int b)
{
    if (depth == 0 || state.isEOG()) {
        ++evaluatedStates;
        return evaluate(state);
    }
    std::vector<GameState> possibleMoves;
    state.findPossibleMoves(possibleMoves);
    for (const GameState &move : possibleMoves) {
        a = std::max(a, -negamax(move, depth - 1, -b, -a));
        if (b <= a)
            return b; // β cutoff.
    }
    return a;
}

GameState Player::play(const GameState &pState, const Deadline &pDue)
{
    GameState bestMove(pState, Move());

    std::vector<GameState> possibleMoves;
    pState.findPossibleMoves(possibleMoves);

    int a = -std::numeric_limits<int>::max();
    int b = std::numeric_limits<int>::max();
    for (const GameState &move : possibleMoves) {
        int v = negamax(move, 10, a, b);
        //int v = minimax(move, 10, a, b, false);
        if (v > a) {
            a = v;
            bestMove = move;
        }
    }
    std::cerr << "Evaluated states: " << evaluatedStates << std::endl;
    return bestMove;
}

/*namespace checkers*/ }

Solution

Your minimax() and negamax() functions are correct. I assume that state.getNextPlayer() returns the player who has to move next. That means that your evaluate() and negamax() functions return a score from the perspective of that player.

However, minimax() returns a score from the perspective of the maximizing player. So if you uncomment the minimax() call in your play() function, you get a bug:

//int v = negamax(move, 10, a, b);
int v = minimax(move, 10, a, b, false); // assumes perspective of min player
                                ^^^^^

if (v > a) {                            // assumes perspective of max player
    a = v;
    bestMove = move;
}

Replacing the last argument of the minimax() call with true should solve it:

int v = minimax(move, 10, a, b, true); // assumes perspective of max player
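
As a side note, if you would rather keep a single evaluation function written from one fixed side's point of view (instead of flipping it by whose turn it is), the usual negamax variant passes an explicit color factor down the search and negates the static score at the leaves. The sketch below is only meant to illustrate that bookkeeping: fixedEvaluate() is a hypothetical evaluation that always scores the position from red's point of view (it is not the evaluate() above), while GameState, findPossibleMoves() and isEOG() are assumed to behave as in your code.

// Minimal sketch: negamax with fail-hard alpha-beta and an explicit color factor.
// fixedEvaluate() is hypothetical and always scores from red's point of view;
// color is +1 when red is to move and -1 when white is to move.
int negamaxColored(const GameState &state, int depth, int a, int b, int color)
{
    if (depth == 0 || state.isEOG())
        return color * fixedEvaluate(state); // flip the fixed-perspective score
    std::vector<GameState> possibleMoves;
    state.findPossibleMoves(possibleMoves);
    for (const GameState &move : possibleMoves) {
        a = std::max(a, -negamaxColored(move, depth - 1, -b, -a, -color));
        if (b <= a)
            return b; // β cutoff.
    }
    return a;
}

At the root you would pass the color of the side to move in the state you hand it, e.g. +1 if red is to move.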
