如何实现连接4的转置表? [英] How to implement a transposition table for connect 4?

查看:139
本文介绍了如何实现连接4的转置表?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在用python创建connect 4 AI,为此我使用了带有迭代加深和alpha beta修剪的minimax.对于更大的深度,它仍然很慢,因此我想实现一个转置表.在阅读了它之后,我认为我有了大致的想法,但是我还没有完全能够使它起作用.这是我的代码的一部分:( minimax的最大化部分):

I'm making a connect 4 AI in python, and I'm using minimax with iterative deepening and alpha beta pruning for this. For greater depths it's still quite slow, so I wanted to implement a transposition table. After reading up on it I think i get the general idea but i haven't been able to quite make it work. Here's part of my code: (the maximizing part of the minimax):

    if(isMaximizing):
    maxEval = -99999999999
    bestMove = None
    # cache.get(hash(board)) Here's where i'd check to see if the hash is already in the table 
    # if so i searched for the best move that was given to that board before.

    # loop through possible moves
    for move in [3,2,4,1,5,0,6]:
        if moves[move] > -1:
            # check if time limit has been reached for iterative deepening
            if startTime - time.time() <= -10:
                timeout = True
                return (maxEval, bestMove, timeout)

            if timeout == False:
                board = makeMove((moves[move],move), True, board) # make the move 
                eval = minimax(depth - 1, board, False, alpha, beta, cache, zobTable, startTime, timeout)[0]

                if eval > maxEval:
                    maxEval = eval
                    bestMove = (moves[move]+1,move)

                board[moves[move] + 1][move] = '_'  # undo the move on the board
                moves[move] = moves[move] + 1 # undo the move in the list of legal moves

                alpha = max(alpha, maxEval)
                if alpha >= beta:
                    break
                # cache.set(hash(board), (eval, value)) Here's where i would set the value and bestmove for the current boardstate
    return (maxEval, bestMove, timeout)

现在,我正在使用zobrist哈希方法对板进行哈希处理,并且我正在使用有序词典将哈希板添加到其中.在此哈希键中,我添加了该板的值和该板的bestMove.不幸的是,这似乎使算法选择了不好的动作(它曾经起作用过),有人知道您应该将板态放在缓存中,又应该从缓存中获取它们吗?

Right now i'm hashing the board with the zobrist hashing method, and i'm using an ordered dict to add the hashed boards to. To this hashkey i've added the value for the board and the bestMove for that board. Unfortunately this seems to make the algorithm pick bad moves (it worked before), does anyone know where you should put the boardstates in the cache, and where you should get them from the cache?

推荐答案

您的方法有几点:

  1. 如果您希望事情变得更快,那么用C或C ++编写高效的代码将比python快得多.通过从python切换到良好的C/C ++实现,我已经看到这种搜索代码的性能提高了10到100倍.无论哪种方式,您都应尝试编写避免在搜索过程中分配内存的代码,因为这非常昂贵.也就是说,与添加转置表相比,更有效的编码可以带来更好的回报.

  1. If you want things to be fast, writing efficient code in C or C++ is going to be much faster than python. I've seen 10-100x improvements in performance in this sort of search code by switching away from python and to a good C/C++ implementation. Either way you should try to write code that avoids allocating memory during search, as this is very expensive. That is to say, you could see better returns from coding more efficiently than from adding a transposition table.

在游戏树搜索中将Zobrist哈希用于换位表时,通常不会显式存储状态.您仅检查哈希值是否相等.尽管发生错误的可能性很小,但是仅存储哈希就需要更少的内存,而对于64位哈希,对于您正在执行的搜索类型,发生冲突的可能性可能会很小.(产生错误的机会更低.)

When using Zobrist hashing for a transposition table in game tree search, you typically do not store the state explicitly. You only check to see if the hashes are equal. While there is a small chance of error, it requires far less memory to store just the hash, and with a 64-bit hash the chance of collisions are probably vanishingly small for the types of searches you are doing. (The chances of errors resulting are even lower.)

当将值存储在换位表中时,还需要存储在搜索过程中使用的alpha和beta范围.当您在节点中间搜索中获得一个值时,它要么是真实值的上限(因为value = beta),要么是真实值的下限(因为value = alpha),或者是节点的实际值(alpha< value< beta).您需要将其存储在转置表中.然后,当您要重用该值时,必须检查是否可以在给定当前alpha和beta边界的情况下使用该值.(您可以通过在换位表中找到该值之后进行实际搜索来验证这一点,以查看是否从搜索中获得与该表中相同的值.)

When you store values in the transposition table, you also need to store the alpha and beta bounds used during the search. When you get a value back at a node mid-search it is either an upper bound on the true value (because value = beta), a lower bound on the true value (because value = alpha) or the actual value of the node (alpha < value < beta). You need to store this in your transposition table. Then, when you want to re-use the value, you have to check that you can use the value given your current alpha and beta bounds. (You can validate this by actually doing the search after finding the value in the transposition table to see if you get the same value from search that you got in the table.)

这篇关于如何实现连接4的转置表?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆