How to use transposition tables with MTD(f)


Question

I'm writing an AI for a card game, and after some testing I've found that running MTD(f) (a series of zero-window searches) on top of my alpha-beta algorithm is faster than using alpha-beta by itself.

The MTD(f) algorithm is described well here: http://people.csail.mit.edu/plaat/mtdf.html
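For reference, the driver loop of MTD(f) can be sketched roughly as below. This is only an illustration and not the asker's code: the toy tree of nested lists, the function names `alphabeta` and `mtdf`, and the assumption of integer scores are all made up for the example. The search here has no transposition table, so each pass starts from scratch.

```python
import math

def alphabeta(node, alpha, beta, maximizing=True):
    """Fail-soft alpha-beta on a toy tree: ints are leaves, lists are inner nodes."""
    if isinstance(node, int):
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, best)
            if best >= beta:      # fail high: best is only a lower bound
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True))
        beta = min(beta, best)
        if best <= alpha:         # fail low: best is only an upper bound
            break
    return best

def mtdf(root, first_guess=0):
    """Home in on the minimax value with repeated zero-window searches.

    Assumes integer scores, so a window (beta - 1, beta) contains no value.
    """
    g = first_guess
    lower, upper = -math.inf, math.inf
    while lower < upper:
        beta = g + 1 if g == lower else g
        g = alphabeta(root, beta - 1, beta)
        if g < beta:
            upper = g             # search failed low: true value <= g
        else:
            lower = g             # search failed high: true value >= g
    return g

tree = [[3, 5], [2, 9], [7, 4]]   # hypothetical game tree, root is a max node
print(mtdf(tree))                 # minimax value of this toy tree
```

Every pass through the `while` loop re-searches the same tree with a different window, which is why the choice of what to remember between passes matters.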

The problem I have is that on each pass of the MTD(f) search (that is, for each guess) I don't reuse any of the positions I stored on previous passes, even though the write-up at that link suggests that I should. In fact, clearing the table between iterations speeds up my algorithm.

My problem is that when I store a position and its value in my transposition table, I also store the alpha and beta values for which it is valid. A second pass through the tree with a different guess (and therefore a different alpha and beta) then can't possibly reuse any information. Is this to be expected, or am I missing something fundamental here?

For instance, if a search with alpha=3 and beta=4 returns 7 (obviously a cutoff), should I store that in the table as valid for alpha=3 to beta=6? Or beta=7?

Recommended answer

Your problem comes down to a conceptual understanding of how to use a transposition table alongside an alpha-beta search. This was a huge problem for me as well, and after looking around I found this discussion, which explained the concept to me more naturally than any paper I had read on the topic.

Basically, you cannot treat all alpha-beta results the same, because when a cutoff occurs the result only represents a bound and not the true minimax value. It has been proven that using bounds will still always give you the same best next state, though possibly without the exact score. When you store the state from a cutoff, you need to treat it as a bound and try to improve on it in the next pass. This will often evaluate the same node multiple times, but it will keep tightening the score as needed.
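One way to picture this is to store a (lower bound, upper bound) pair per position instead of a value tagged with the alpha and beta it was searched with. The sketch below is a minimal illustration under assumptions, not the code from either linked source: the function name `ab_memory`, the toy tree of nested tuples, the dictionary used as the table, and the integer scores are all hypothetical.

```python
import math

def ab_memory(node, alpha, beta, maximizing, table):
    """Fail-soft alpha-beta that stores bounds, not window-specific values.

    Toy tree (an assumption for this sketch): ints are leaves, tuples are
    inner nodes. The table maps (node, maximizing) -> (lower, upper).
    """
    if isinstance(node, int):
        return node
    key = (node, maximizing)
    lower, upper = table.get(key, (-math.inf, math.inf))
    if lower >= beta:             # stored lower bound already causes a cutoff
        return lower
    if upper <= alpha:            # stored upper bound already fails low
        return upper
    alpha, beta = max(alpha, lower), min(beta, upper)   # narrow the window
    if maximizing:
        best, a = -math.inf, alpha
        for child in node:
            best = max(best, ab_memory(child, a, beta, False, table))
            a = max(a, best)
            if best >= beta:
                break
    else:
        best, b = math.inf, beta
        for child in node:
            best = min(best, ab_memory(child, alpha, b, True, table))
            b = min(b, best)
            if best <= alpha:
                break
    if best <= alpha:             # fail low: the true value is at most best
        upper = best
    elif best >= beta:            # fail high: the true value is at least best
        lower = best
    else:                         # inside the window: exact value
        lower = upper = best
    table[key] = (lower, upper)
    return best

# The question's case: window (3, 4), and the search comes back with 7.
root = ((7, 9), (2, 1))           # hypothetical tree, root is a max node
table = {}
first = ab_memory(root, 3, 4, True, table)    # fails high with 7
again = ab_memory(root, 6, 7, True, table)    # answered from the table alone
```

In this sketch the fail-high result of 7 is stored as "the value is at least 7", with no reference to alpha=3 or beta=4. Any later pass whose beta is 7 or less is answered straight from the table, and a pass with a higher beta still starts from the raised lower bound, so the table no longer needs to be cleared between guesses.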

Here is a good example of a more complete implementation of the concepts listed in the previously linked article. Scroll to page 14.
