什么是游戏2048的最佳算法? [英] What is the optimal algorithm for the game 2048?

查看:315
本文介绍了什么是游戏2048的最佳算法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我最近偶然发现了游戏 2048 。您可以通过移动它们在任何的四个方向,使做大砖合并同类砖。每次移动后,新的互动出现在以价值随机空位或者 2 4 。游戏结束时,所有的箱子都装满,也没有动作,可以合并的瓷砖,或者创建具有值瓷砖 2048

I have recently stumbled upon the game 2048. You merge similar tiles by moving them in any of the four directions to make "bigger" tiles. After each move, a new tile appears at random empty position with value of either 2 or 4. The game terminates when all the boxes are filled and there are no moves that can merge tiles, or you create a tile with a value of 2048.

一,我需要遵循一个明确的策略来达到目标​​。所以,我想编写一个程序吧。

One, I need to follow a well-defined strategy to reach the goal. So, I thought of writing a program for it.

我目前的算法:

while (!game_over) {
    for each possible move:
        count_no_of_merges_for_2-tiles and 4-tiles
    choose the move with large number of merges
}

我在做什么是在任何时候,我会尝试合并与价值观的瓷砖 2 4 ,那就是,我努力 2 4 瓷砖,尽可能最小。如果我尝试这种方式,其他所有的瓷砖都自动得到合并,战略似乎不错。

What I am doing is at any point, I will try to merge the tiles with values 2 and 4, that is, I try to have 2 and 4 tiles, as minimum as possible. If I try it this way, all other tiles were automatically getting merged and the strategy seems good.

但是,当我真正使用这种算法,我只得到约4000分的比赛结束了。最大点AFAIK略超过20000点,这是比我目前的成绩更大。难道还有比上述更好的算法?

But, when I actually use this algorithm, I only get around 4000 points before the game terminates. Maximum points AFAIK is slightly more than 20,000 points which is way larger than my current score. Is there a better algorithm than the above?

推荐答案

我公司开发的2048 AI使用的 expectimax 的。人工智能简单地执行最大化在所有可能的动作,随后期望对所有可能的瓦片鱼卵(由瓦片的概率加权,即10%的4和90%的2)。据我所知,这是不可能的修剪expectimax优化(除非删除是极其不可能的分支机构),因此使用的算法是一个精心优化的蛮力搜索。

I developed a 2048 AI using expectimax optimization, instead of the minimax search used by @ovolve's algorithm. The AI simply performs maximization over all possible moves, followed by expectation over all possible tile spawns (weighted by the probability of the tiles, i.e. 10% for a 4 and 90% for a 2). As far as I'm aware, it is not possible to prune expectimax optimization (except to remove branches that are exceedingly unlikely), and so the algorithm used is a carefully optimized brute force search.

人工智能在其默认的配置(8最大搜索深度),需要从10毫秒到200毫秒的任何地方执行一动,这取决于板位置的复杂性。在测试中,AI达到每秒5-10移动在整个游戏过程中的平均移动速度。如果搜索的深度限制在6移动,AI可以轻松执行每秒20+的举动,这使得一些有趣的看

The AI in its default configuration (max search depth of 8) takes anywhere from 10ms to 200ms to execute a move, depending on the complexity of the board position. In testing, the AI achieves an average move rate of 5-10 moves per second over the course of an entire game. If the search depth is limited to 6 moves, the AI can easily execute 20+ moves per second, which makes for some interesting watching.

要评估的AI的得分表现,我跑了AI的100倍(通过远程控制连接到浏览器的游戏)。对于每个瓦片,这里是游戏中的瓷砖达到至少一次的比例:

To assess the score performance of the AI, I ran the AI 100 times (connected to the browser game via remote control). For each tile, here are the proportions of games in which that tile was achieved at least once:

2048: 100%
4096: 100%
8192: 100%
16384: 94%
32768: 36%

最低得分超过所有运行的是124024;最大比分取得了794076.平均得分是387222.人工智能再也没能获得2048瓦(所以它从来没有在100场比赛输掉了比赛甚至一度);事实上,它在每一个至少运行一次实现在 8192 瓷砖!

下面是最佳运行的截图:

Here's the screenshot of the best run:

本场比赛耗时27830移动超过96分钟,平均每秒4.8的动作。

This game took 27830 moves over 96 minutes, or an average of 4.8 moves per second.

我的方法连接codeS整板(16项)作为一个单独的64位整数(其中,地砖是半字节,即4位数据块)。在64位的机器,这使得整个电路板在一台机器寄存器被传来传去。

My approach encodes the entire board (16 entries) as a single 64-bit integer (where tiles are the nybbles, i.e. 4-bit chunks). On a 64-bit machine, this enables the entire board to be passed around in a single machine register.

位移运算用于提取单个的行和列。一个单一的行或列是一个16位的数量,规模65536可连接$ C $,所以C的单一一行或一列运行的转换表。例如,移动被实现为4查找到pcomputed动效应表,它描述了每个举动如何影响一个单一的行或列(例如,右移表中包含的条目$ P $1122 - > 0023描述如何对行[2,2,4,4]变为行[0,0,4,8]当移动到右侧)。

Bit shift operations are used to extract individual rows and columns. A single row or column is a 16-bit quantity, so a table of size 65536 can encode transformations which operate on a single row or column. For example, moves are implemented as 4 lookups into a precomputed "move effect table" which describes how each move affects a single row or column (for example, the "move right" table contains the entry "1122 -> 0023" describing how the row [2,2,4,4] becomes the row [0,0,4,8] when moved to the right).

评分也做得使用表查找。这些表包含上计算所有可能的行/列启发式的分数,并且将所得的分数一个板是简单地跨越每行和列中的表值的总和。

Scoring is also done using table lookup. The tables contain heuristic scores computed on all possible rows/columns, and the resultant score for a board is simply the sum of the table values across each row and column.

此板重新presentation,以及用于运动和得分查表方法中,允许的AI来搜索一个巨大的游戏状态数在很短的时间周期(超过千万每秒游戏状态上的一个芯我的2011年中期的笔记本电脑)。

This board representation, along with the table lookup approach for movement and scoring, allows the AI to search a huge number of game states in a short period of time (over 10,000,000 game states per second on one core of my mid-2011 laptop).

该expectimax搜索本身是codeD作为递归搜索期望措施之间交替(测试所有可能的瓷砖产卵的地点和价值观,并通过各种可能性的概率加权的优化分数),而最大化步骤(测试所有可能的行动,并选择一个最好的成绩)。树搜索结束时,看到一个previously见过的位置(使用置换表),当它到达一个predefined深度限制,或当它到达一个板状态是极不可能的(例如,它是由从起始位置得到64瓦片连续到达)。典型的搜索深度为4-8移动。

The expectimax search itself is coded as a recursive search which alternates between "expectation" steps (testing all possible tile spawn locations and values, and weighting their optimized scores by the probability of each possibility), and "maximization" steps (testing all possible moves and selecting the one with the best score). The tree search terminates when it sees a previously-seen position (using a transposition table), when it reaches a predefined depth limit, or when it reaches a board state that is highly unlikely (e.g. it was reached by getting 6 "4" tiles in a row from the starting position). The typical search depth is 4-8 moves.

若干启发式用于引导朝有利位置的优化算法。的precise选择启发式的对算法的性能有巨大的影响。各种启发式经过加权,并组合成一个位置得分,这决定了好给定板位置。优化搜索然后将致力于最大限度的平均得分所有可能的董事会职务。实际的得分,如图所示的游戏,是的没有的用于计算板得分,因为实在是太沉重了有利于合并的瓷砖时(延迟合并可能产生较大收益)的。

Several heuristics are used to direct the optimization algorithm towards favorable positions. The precise choice of heuristic has a huge effect on the performance of the algorithm. The various heuristics are weighted and combined into a positional score, which determines how "good" a given board position is. The optimization search will then aim to maximize the average score of all possible board positions. The actual score, as shown by the game, is not used to calculate the board score, since it is too heavily weighted in favor of merging tiles (when delayed merging could produce a large benefit).

起初,我用了两个很简单的启发式,发放奖金的开放式广场和上边缘有较大的值。这些启发式执行pretty的好,经常达到16384,但从来没有得到到32768。

Initially, I used two very simple heuristics, granting "bonuses" for open squares and for having large values on the edge. These heuristics performed pretty well, frequently achieving 16384 but never getting to 32768.

切赫Morávek(@xificurk)把我的AI和增加了两个新的启发。第一试探是点球用于具有非单调的行和列从而增加作为行列的增加,确保非单调排小数目不会强烈地影响得分,但非单调的行大量伤得分基本上。第二启发式计数潜在合并的数目(相邻相等的值),除了开放空间。这两种启发式送达推向单调板(这是比较容易合并)的算法,以及对董事会职位有很多合并的(鼓励其调整合并在可能获得更好的效果)。

Petr Morávek (@xificurk) took my AI and added two new heuristics. The first heuristic was a penalty for having non-monotonic rows and columns which increased as the ranks increased, ensuring that non-monotonic rows of small numbers would not strongly affect the score, but non-monotonic rows of large numbers hurt the score substantially. The second heuristic counted the number of potential merges (adjacent equal values) in addition to open spaces. These two heuristics served to push the algorithm towards monotonic boards (which are easier to merge), and towards board positions with lots of merges (encouraging it to align merges where possible for greater effect).

此外,切赫也进行了优化使用元优化策略(使用的算法称为 CMA-ES <启发式权/一>),其中权重本身进行了调整,以获得可能的最高平均得分

Furthermore, Petr also optimized the heuristic weights using a "meta-optimization" strategy (using an algorithm called CMA-ES), where the weights themselves were adjusted to obtain the highest possible average score.

这些变化的影响是非常显著。从达到16384瓦约13%的时间内超过90%的时间实现它的算法去,算法开始达到32768以上的时间的1/3(而旧的试探从来没有一次产生了32768瓦)

The effect of these changes are extremely significant. The algorithm went from achieving the 16384 tile around 13% of the time to achieving it over 90% of the time, and the algorithm began to achieve 32768 over 1/3 of the time (whereas the old heuristics never once produced a 32768 tile).

我认为还是有改进的余地的启发。该算法肯定是没有最佳,但我觉得它越来越pretty的接近。

I believe there's still room for improvement on the heuristics. This algorithm definitely isn't yet "optimal", but I feel like it's getting pretty close.

这是人工智能达到超过它的比赛第三个是一个巨大的里程碑32768片;我会很惊讶地听到,如果任何人的球员在正式比赛达到32768(即不使用工具,如savestates或撤销)。我认为65536瓦是指日可待!

That the AI achieves the 32768 tile in over a third of its games is a huge milestone; I will be surprised to hear if any human players have achieved 32768 on the official game (i.e. without using tools like savestates or undo). I think the 65536 tile is within reach!

您可以尝试AI自己。在code可在 https://github.com/nneonneo/2048-ai

You can try the AI for yourself. The code is available at https://github.com/nneonneo/2048-ai.

这篇关于什么是游戏2048的最佳算法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆