Why does Monte Carlo Tree Search reset the tree?

Question

I have a small, possibly stupid question about Monte Carlo Tree Search (MCTS). I understand most of it, but while looking at some implementations I noticed that after MCTS is run for a given state and a best move is returned, the tree is thrown away. So for the next move, we have to run MCTS from scratch on the new state to get the next best move.

I was just wondering why we don't retain some of the information from the old tree. It seems like there is valuable information about the states in the old tree, especially given that the best move is the one MCTS has explored most. Is there any particular reason we can't use this old information in some useful way?

Recommended answer

Some implementations do indeed retain the information.

For example, the AlphaGo Zero paper says:


The search tree is reused at subsequent time-steps: the child node corresponding to the played action becomes the new root node; the subtree below this child is retained along with all its statistics, while the remainder of the tree is discarded
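
The re-rooting step described in that quote could look roughly like the sketch below. This is only an illustration, not code from the paper or from any particular library: the `Node` class, its fields (`visits`, `total_value`, `parent`, `children`) and the `advance_root` helper are all hypothetical names chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Dict, Hashable, Optional


@dataclass(eq=False)
class Node:
    """One search-tree node. Field names are illustrative assumptions."""
    visits: int = 0
    total_value: float = 0.0
    parent: Optional[Node] = field(default=None, repr=False)
    children: Dict[Hashable, Node] = field(default_factory=dict)


def advance_root(root: Node, action: Hashable) -> Node:
    """Return the root to use after `action` has been played.

    The child reached by `action` becomes the new root and keeps its whole
    subtree and statistics. Everything else in the old tree is dropped.
    """
    child = root.children.get(action)
    if child is None:
        # The played move was never expanded, so there is nothing to reuse.
        return Node()
    # Detach from the old root so the discarded siblings can be freed.
    child.parent = None
    return child


# Tiny hand-built tree: root -> {"a", "b"}, and "a" -> {"c"}.
root = Node(visits=10)
root.children["a"] = Node(visits=7, total_value=4.0, parent=root)
root.children["b"] = Node(visits=3, total_value=1.0, parent=root)
root.children["a"].children["c"] = Node(
    visits=5, total_value=3.0, parent=root.children["a"]
)

new_root = advance_root(root, "a")
```

In a two-player game the same helper would presumably be called twice per turn, once for our own move and once for the opponent's reply, so the next search starts from whatever statistics were already gathered for that line of play.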
