井字游戏AI Bug [英] Tic Tac Toe AI Bugs

查看:76
本文介绍了井字游戏AI Bug的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试为Tic Tac Toe实施AI,该AI足够聪明,永不丢失.我尝试了两种不同的算法,但AI仍然会犯错误.

我从 minimax alpha-beta修剪算法.这是一个现场演示: http://iioengine.com/ttt/minimax.htm

它运行没有错误,但是如果您先选择左下角,然后选择底行的其他两个正方形之一-AI不会看到它.我确定这不是minimax算法中的缺陷-有人能在我的源代码中看到错误吗?您可以检查演示页面以查看所有内容,但这是主要的AI功能:

 功能bestMove(板,深度,低,高,对手){var best = new Move(null,-iio.maxInt);var p;对于(var c = 0; c< grid.C; c ++)for(var r = 0; r< grid.R; r ++){如果(board [c] [r] =='_'){var nuBoard = board.clone();nuBoard [c] [r] = getTypeChar(opponent);如果(checkWin(nuBoard,getTypeChar(对手)))p = new Move([c,r],-evaluateBoard(board,getTypeChar(opponent))* 10000);否则如果(checkScratch(nuBoard))p = new Move([c,r],0);否则(depth == 0)p = new Move([c,r],-evaluateBoard(board,getTypeChar(opponent)));别的 {p = bestMove(nuBoard,depth-1,-high,-low,!opponent);}如果(p.score> best.score){最好= p;如果(最佳分数>低)低=最佳分数;如果(best.score> =高)返回最佳;}}}最好的回报} 

如果您对negamax更为熟悉,我也尝试过使用它.我直接从此页面提出了逻辑.这是一个现场演示: http://iioengine.com/ttt/negamax.htm

一旦达到获胜状态,该冻结状态就会消失,但是您已经可以看到AI相当愚蠢.代码集成有问题吗?

如果您发现我的代码中存在阻止这些算法运行不正常的缺陷,请告诉我.谢谢.

使用代码更新:

 功能checkWin(board,type){对于(var i = 0; i< 3; i ++)如果(evaluateRow(board,[i,0,i,1,i,2],type)> = WIN_SCORE|| evaluateRow(board,[0,i,1,i,2,i],type)> = WIN_SCORE)返回true;if(evaluateRow(board,[0,0,1,1,2,2],type)> = WIN_SCORE|| evaluateRow(board,[2,0,1,1,0,2],type)> = WIN_SCORE)返回true;返回false;}函数validateBoard(board,type){var moveTotal = 0;对于(var i = 0; i< 3; i ++){moveTotal + = evaluateRow(board,[i,0,i,1,i,2],type);moveTotal + = evaluateRow(board,[0,i,1,i,2,i],type);}moveTotal + = evaluateRow(board,[0,0,1,1,2,2],type);moveTotal + = evaluateRow(board,[2,0,1,1,0,2],type);返回moveTotal;} 

解决方案

问题出在您的 evaluateBoard()函数中.评估函数是minimax/negamax算法的核心.如果您的AI表现不佳,问题通常出在每一步的评估板上.

在评估董事会时,您需要考虑三件事:获胜的举动,阻挡的举动以及导致分叉的举动.

制胜法宝

静态评估功能需要知道移动是否会导致当前玩家获胜.如果此举对当前玩家造成损失,则需要返回一个非常低的负数(低于任何常规举动).如果此举导致当前玩家获胜,则需要返回非常高的正数(大于任何常规举动).

要记住的重要一点是,相对于AI轮到的玩家,此评估必须 .如果AI当前正在预测人类玩家的移动方向,则评估必须从人类玩家的角度看待董事会.轮到AI时,评估必须从Computer Player的角度看待董事会.

阻止动作

当您运行评估功能时,AI实际上并不认为阻止人类玩家是有好处的.您的评估功能看起来像是只计算可用的移动次数并返回结果.相反,您需要为可帮助AI获胜的举动返回更高的正数.

要考虑阻塞问题,您需要确定玩家是否在开放行,列或对角线中有2个其记号,然后将阻塞正方形的得分高于其他任何正方形.因此,如果轮到计算机移动了,并且人类玩家在开放行中有2个令牌,则该行中的第3个正方形必须具有较高的正数(但不如获胜正方形大).这将导致计算机优先于该正方形.

仅考虑获胜动作和阻止动作,您将拥有一台运行良好的计算机.

分叉

前叉移动会导致计算机出现问题.主要问题是计算机本身就太聪明"了.由于它假设人类玩家每次都会始终做出最佳举动,因此它将发现所有可能做出的举动最终都会因此而蒙受损失的情况,因此,它只会选择棋盘上的第一个举动,因为其他都没关系.

如果我们通过您的示例,我们可以看到这种情况:人类玩家在左下角,计算机在中上角.

  |O |--- + --- +-||--- + --- +-X || 

当人类玩家移动到右下角时,计算机会发现,如果尝试阻止该动作,人类玩家将采取的最佳移动是占据中间的正方形,从而产生分叉并赢得胜利对于人类(尽管由于人类易犯错误,即使是什至时间都不会发生,但计算机并不知道这一点.)

  |O |--- + --- +-|X |--- + --- +-X |O |X 

由于无论是阻止还是不阻止人类获胜,计算机都会输掉,因此阻止人类实际上会冒出可能的最低分数(因为这会导致计算机蒙受损失).这意味着计算机将获得最佳成绩-中间方格.

您将必须找出处理此类情况的最佳方法,因为每个人的使用方式都不相同.这只是要注意的事情.

I'm trying to implement an AI for Tic Tac Toe that is smart enough to never lose. I've tried two different algorithms but the AI still makes mistakes.

I started with this minimax alpha-beta pruning algorithm. Here's a live demo: http://iioengine.com/ttt/minimax.htm

It runs without error, but if you take the bottom left corner first, then either of the other two squares on the bottom row - the AI doesn't see that coming. I'm sure this isn't a flaw in the minimax algorithm - can anyone see an error in my source? you can inspect the demo page to see everything but here is the primary ai function:

function bestMove(board,depth,low,high,opponent){
      var best=new Move(null,-iio.maxInt);
      var p;
      for (var c=0;c<grid.C;c++)
        for(var r=0;r<grid.R;r++){
          if (board[c][r]=='_'){
            var nuBoard=board.clone();
            nuBoard[c][r]=getTypeChar(opponent);
            if(checkWin(nuBoard,getTypeChar(opponent)))
              p=new Move([c,r],-evaluateBoard(board,getTypeChar(opponent))*10000);
            else if (checkScratch(nuBoard))
              p=new Move([c,r],0);
            else if (depth==0)
              p=new Move([c,r],-evaluateBoard(board,getTypeChar(opponent)));
            else {
              p=bestMove(nuBoard,depth-1,-high,-low,!opponent);
            }
            if (p.score>best.score){
              best=p;
              if (best.score > low)
                low=best.score;
              if (best.score >= high) return best;
            }
          }
        }
      return best;
    }

If you are more familiar with negamax, I tried that one too. I lifted the logic straight from this page. Here is a live demo: http://iioengine.com/ttt/negamax.htm

That one freezes up once you reach a win state, but you can already see that the AI is pretty stupid. Is something wrong with the code integration?

Please let me know if you find a flaw in my code that prevents these algrothims from running properly. Thnx.

Update with code:

function checkWin(board,type){
      for (var i=0;i<3;i++)
        if (evaluateRow(board,[i,0,i,1,i,2],type) >= WIN_SCORE
          ||evaluateRow(board,[0,i,1,i,2,i],type) >= WIN_SCORE)
          return true;
      if(evaluateRow(board,[0,0,1,1,2,2],type) >= WIN_SCORE
       ||evaluateRow(board,[2,0,1,1,0,2],type) >= WIN_SCORE)
        return true;
      return false;
    }

function evaluateBoard(board,type){
      var moveTotal=0;
      for (var i=0;i<3;i++){
        moveTotal+=evaluateRow(board,[i,0,i,1,i,2],type);
        moveTotal+=evaluateRow(board,[0,i,1,i,2,i],type);
      }
      moveTotal+=evaluateRow(board,[0,0,1,1,2,2],type);
      moveTotal+=evaluateRow(board,[2,0,1,1,0,2],type);
      return moveTotal;
    }

解决方案

The problem lies in your evaluateBoard() function. The evaluation function is the heart of a minimax/negamax algorithm. If your AI is behaving poorly, the problem usually lies in the evaluation of the board at each move.

For the evaluation of the board, you need to take into consideration three things: winning moves, blocking moves, and moves that result in a fork.

Winning Moves

The static evaluation function needs to know if a move results in a win or a loss for the current player. If the move results in a loss for the current player, it needs to return a very low negative number (lower than any regular move). If the move results in a win for the current player, it needs to return a very high positive number (larger than any regular move).

What is important to remember is that this evaluation has to be relative to the player whose turn the AI is making. If the AI is currently predicting where the Human player will move, then the evaluation must look at the board from the point of view of the Human player. When it's the AI's turn move, the evaluation must look at the board from the point of view of the Computer player.

Blocking Moves

When you run your evaluation function, the AI actually doesn't think blocking the Human player is beneficial. Your evaluation function looks like it just counts the number of available moves and returns the result. Instead, you need to return a higher positive number for moves that will help the AI win.

To account for blocking, you need to figure out if a player has 2 of their tokens in an open row, column, or diagonal, and then score the blocking square higher than any other square. So if it is the Computer's turn to move, and the Human player has 2 tokens in an open row, the 3rd square in the row needs to have a high positive number (but not as high as a winning square). This will cause the computer to favor that square over any others.

By just accounting for Winning moves and Blocking moves, you will have a Computer that plays fairly well.

Forking Moves

Forking moves cause problems for the Computer. The main problem is that the Computer is 'too smart' for it's own good. Since it assumes that the Human player will always make the best move every time, it will find situations where all moves that it could make will eventually end in a loss for it, so it will just pick the first move on the board it can since nothing else matters.

If we go through your example, we can see this happen: Human player plays bottom left, Computer plays top middle.

   | O |
---+---+---
   |   |
---+---+---
 X |   |   

When the Human player makes a move to the bottom right corner, the Computer sees that if it tries to block that move, the best move the Human player would make is to take the middle square, resulting in a fork and a win for the Human (although this won't happen even time since Humans are fallible, the Computer doesn't know that).

   | O |
---+---+---
   | X |
---+---+---
 X | O | X  

Because the computer will lose whether it blocks or doesn't block the Human from winning, blocking the Human will actually bubble up the lowest possible score (since it results in a loss for the Computer). This means that the Computer will take the best score it can - the middle square.

You'll have to figure out what is the best way to handle such situations, since everyone would play it differently. It's just something to be aware of.

这篇关于井字游戏AI Bug的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆