MiniMax with Alpha Beta Pruning for Othello not working


Problem description

I have the following implementation of an alpha-beta minimax for an Othello (Reversi) game. Somehow, it never returns the proper action to take. It seems to return the default action I put in the function, (0, 0), and the secondary value of -32768, which means it got pruned in the MAX subroutine. Any tips on what I can improve here, and how can I fix this problem?

Note: I've verified that the successors are returned correctly for the most part. The max depth for now is 8. The computer player's pn (player number) is 1 and the human player's is 0. The first stage, 0, is MINIMAX_MAX. Alpha and beta are initially set to INT_MIN and INT_MAX respectively.

mm_out minimax(Grid& G, int alpha, int beta, Action& A, uint pn, uint depth, bool stage) {
    if (G.check_terminal_state() || depth == MAX_DEPTH) {
#ifdef DEBUG
        cout << "best action: (" << A.get_x() << ", " << A.get_y() << ")\n";
#endif
        return mm_out(A, G.get_utility(pn));
    }

    // add end game score total here

#ifdef DEBUG
    if (stage == MINIMAX_MAX) {
        cout << "max " << alpha << " " << beta << "\n";
    }
    else {
        cout << "min " << alpha << " " << beta << "\n";
    }
#endif

    set<Action> succ_temp = G.get_successors(pn);
    for (Action a : succ_temp) {

#ifdef DEBUG
        cout << a.get_x() << " " << a.get_y() << '\n';
#endif

        Grid gt(G);
        a.evaluate(gt);
    }
    set<Action, action_greater> successors(succ_temp.begin(), succ_temp.end());

#ifdef DEBUG
    Player p(0, "minimaxtest");
    G.display(p);
    int test;
    cin >> test;
#endif

    // if no successor, that player passes
    if (successors.size()) {
        for (auto a = successors.begin(); a != successors.end(); ++a) {
            Grid gt(G);
            gt.do_move(pn, a->get_x(), a->get_y(), !PRINT_ERR);
            Action at = *a;
            mm_out mt = minimax(gt, alpha, beta, at, pn ^ 1, depth + 1, !stage);
            int temp = mt.val;
//          A = mt.best_move;

            if (stage == MINIMAX_MAX) {
                if (alpha < temp) {
                    alpha = temp;
                    A = *a;
#ifdef DEBUG
                    cout << "Current action: (" << A.get_x() << ", " << A.get_y() << ") alpha = " << alpha << "\n";
#endif
                }
                if (alpha >= beta) {
#ifdef DEBUG
                    cout << "pruned at max\n";
#endif
                    return mm_out(A, beta);
                }
            }
            else {
                if (beta > temp) {
                    beta = temp;
                    A = *a;
#ifdef DEBUG
                    cout << "Current action: (" << A.get_x() << ", " << A.get_y() << ") beta = " << beta << "\n";
#endif
                }
                if (alpha >= beta) {
#ifdef DEBUG
                    cout << "pruned at min\n";
#endif
                    return mm_out(A, alpha);
                }
            }
        }
        return mm_out(A, (stage == MINIMAX_MAX) ? alpha : beta);
    }
    else {
        cout << "no successor\n";
        return mm_out(A, (stage == MINIMAX_MAX) ? (std::numeric_limits<int>::max() - 1) : (std::numeric_limits<int>::min() + 1));
    }
}

Utility function:

int Grid::get_utility(uint pnum) const {
    if (pnum)
        return wcount - bcount;
    return bcount - wcount;
}


Recommended answer

You should pass the alpha / beta parameters by value (not by reference):

mm_out minimax(Grid& G, int alpha, int beta, Action& A, uint pn, uint depth, bool stage)

Each node passes its alpha and beta values to its children. Each child then updates its own copy of alpha or beta, depending on whose turn it is, and returns the final evaluation of that node. The parent uses that returned value to update its own alpha or beta. If the parameters were references, a child's tightened bounds would leak back into the parent and its remaining siblings, which can trigger cutoffs that are not justified.
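To make the by-value flow concrete, here is a minimal sketch on a toy game tree rather than on the Othello `Grid`. The names `Node`, `Result`, `leaf`, `branch`, `alphabeta` and `example_tree` are hypothetical and exist only for this illustration; leaf scores are assumed to be from the maximizing player's point of view. It is one common way to write the search, not the asker's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical toy tree: a node with no children is a leaf whose score is
// given from the maximizing player's point of view.
struct Node {
    int score = 0;
    std::vector<Node> children;
};

struct Result {
    int value;       // minimax value of the node
    int best_child;  // index of the best child, or -1 for a leaf
};

Node leaf(int s) { Node n; n.score = s; return n; }
Node branch(std::vector<Node> kids) { Node n; n.children = std::move(kids); return n; }

// alpha and beta are taken BY VALUE: each call works on its own copies,
// so bounds tightened inside a child never leak back to the parent.
Result alphabeta(const Node& n, int alpha, int beta, bool maximizing) {
    assert(alpha < beta);  // holds at the root and for every unpruned child
    if (n.children.empty()) return {n.score, -1};

    int best = maximizing ? std::numeric_limits<int>::min()
                          : std::numeric_limits<int>::max();
    int best_child = -1;
    for (int i = 0; i < static_cast<int>(n.children.size()); ++i) {
        // The child receives copies of the current bounds.
        int v = alphabeta(n.children[i], alpha, beta, !maximizing).value;
        if (maximizing) {
            if (v > best) { best = v; best_child = i; }
            alpha = std::max(alpha, best);  // parent updates its own alpha
        } else {
            if (v < best) { best = v; best_child = i; }
            beta = std::min(beta, best);    // parent updates its own beta
        }
        if (alpha >= beta) break;  // remaining siblings cannot matter
    }
    return {best, best_child};
}

// A small MAX root with three MIN children, used as a worked example.
Node example_tree() {
    return branch({
        branch({leaf(3), leaf(12), leaf(8)}),
        branch({leaf(2), leaf(4), leaf(6)}),
        branch({leaf(14), leaf(5), leaf(2)}),
    });
}
```

Calling `alphabeta(example_tree(), std::numeric_limits<int>::min(), std::numeric_limits<int>::max(), true)` yields value 3 with the first child as the best move. The sketch also tracks the best value separately from alpha and beta, so the move recorded at a node is always one of that node's own children.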
