游戏2048的最优算法是什么? [英] What is the optimal algorithm for the game 2048?

查看:38
本文介绍了游戏2048的最优算法是什么?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我最近偶然发现了游戏 )时终止,当它达到预定义的深度限制时,或者当它到达一个极不可能的棋盘状态时(例如,它是通过从起始位置连续获得 6 个4"块来达到的).典型的搜索深度为 4-8 次移动.

启发式

几种启发式方法用于将优化算法引导至有利位置.启发式的精确选择对算法的性能有巨大的影响.各种启发式方法被加权并组合成一个位置分数,它确定给定的棋盘位置有多好".优化搜索的目标是最大化所有可能的棋盘位置的平均分数.游戏显示的实际分数用于计算棋盘分数,因为它的权重太大,不利于合并瓷砖(当延迟合并可能会产生很大的好处时).>

最初,我使用了两个非常简单的启发式方法,为开放的正方形和在边缘具有大值的人提供奖励".这些启发式算法的表现相当不错,经常达到 16384,但从未达到 32768.

Petr Morávek (@xificurk) 使用我的 AI 并添加了两个新的启发式方法.第一个启发式是对非单调行和列随着排名的增加而增加的惩罚,确保小数字的非单调行不会强烈影响分数,但大数字的非单调行会大大损害分数.第二个启发式计算除开放空间之外的潜在合并(相邻相等值)的数量.这两种启发式方法有助于将算法推向单调板(更容易合并)和具有大量合并的板位置(鼓励它在可能的情况下对齐合并以获得更大的效果).

此外,Petr 还使用元优化"策略(使用名为 CMA-ES),其中权重本身被调整以获得可能的最高平均分.

这些变化的影响极其显着.算法从大约 13% 的时间实现 16384 块到超过 90% 的时间,并且算法开始在 1/3 的时间内实现 32768(而旧的启发式算法从未产生过 32768 块).

我相信启发式方法仍有改进的余地.这个算法肯定还不是最佳",但我觉得它已经很接近了.

<小时>

人工智能在超过三分之一的游戏中达到 32768 块是一个巨大的里程碑;听说是否有人类玩家在官方游戏中达到了 32768(即不使用 savestates 或 undo 等工具),我会感到惊讶.我觉得65536瓷砖触手可及!

您可以自己尝试 AI.代码可在 https://github.com/nneonneo/2048-ai 获得.

I have recently stumbled upon the game 2048. You merge similar tiles by moving them in any of the four directions to make "bigger" tiles. After each move, a new tile appears at random empty position with a value of either 2 or 4. The game terminates when all the boxes are filled and there are no moves that can merge tiles, or you create a tile with a value of 2048.

One, I need to follow a well-defined strategy to reach the goal. So, I thought of writing a program for it.

My current algorithm:

while (!game_over) {
    for each possible move:
        count_no_of_merges_for_2-tiles and 4-tiles
    choose the move with a large number of merges
}

What I am doing is at any point, I will try to merge the tiles with values 2 and 4, that is, I try to have 2 and 4 tiles, as minimum as possible. If I try it this way, all other tiles were automatically getting merged and the strategy seems good.

But, when I actually use this algorithm, I only get around 4000 points before the game terminates. Maximum points AFAIK is slightly more than 20,000 points which is way larger than my current score. Is there a better algorithm than the above?

解决方案

I developed a 2048 AI using expectimax optimization, instead of the minimax search used by @ovolve's algorithm. The AI simply performs maximization over all possible moves, followed by expectation over all possible tile spawns (weighted by the probability of the tiles, i.e. 10% for a 4 and 90% for a 2). As far as I'm aware, it is not possible to prune expectimax optimization (except to remove branches that are exceedingly unlikely), and so the algorithm used is a carefully optimized brute force search.

Performance

The AI in its default configuration (max search depth of 8) takes anywhere from 10ms to 200ms to execute a move, depending on the complexity of the board position. In testing, the AI achieves an average move rate of 5-10 moves per second over the course of an entire game. If the search depth is limited to 6 moves, the AI can easily execute 20+ moves per second, which makes for some interesting watching.

To assess the score performance of the AI, I ran the AI 100 times (connected to the browser game via remote control). For each tile, here are the proportions of games in which that tile was achieved at least once:

2048: 100%
4096: 100%
8192: 100%
16384: 94%
32768: 36%

The minimum score over all runs was 124024; the maximum score achieved was 794076. The median score is 387222. The AI never failed to obtain the 2048 tile (so it never lost the game even once in 100 games); in fact, it achieved the 8192 tile at least once in every run!

Here's the screenshot of the best run:

This game took 27830 moves over 96 minutes, or an average of 4.8 moves per second.

Implementation

My approach encodes the entire board (16 entries) as a single 64-bit integer (where tiles are the nybbles, i.e. 4-bit chunks). On a 64-bit machine, this enables the entire board to be passed around in a single machine register.

Bit shift operations are used to extract individual rows and columns. A single row or column is a 16-bit quantity, so a table of size 65536 can encode transformations which operate on a single row or column. For example, moves are implemented as 4 lookups into a precomputed "move effect table" which describes how each move affects a single row or column (for example, the "move right" table contains the entry "1122 -> 0023" describing how the row [2,2,4,4] becomes the row [0,0,4,8] when moved to the right).

Scoring is also done using table lookup. The tables contain heuristic scores computed on all possible rows/columns, and the resultant score for a board is simply the sum of the table values across each row and column.

This board representation, along with the table lookup approach for movement and scoring, allows the AI to search a huge number of game states in a short period of time (over 10,000,000 game states per second on one core of my mid-2011 laptop).

The expectimax search itself is coded as a recursive search which alternates between "expectation" steps (testing all possible tile spawn locations and values, and weighting their optimized scores by the probability of each possibility), and "maximization" steps (testing all possible moves and selecting the one with the best score). The tree search terminates when it sees a previously-seen position (using a transposition table), when it reaches a predefined depth limit, or when it reaches a board state that is highly unlikely (e.g. it was reached by getting 6 "4" tiles in a row from the starting position). The typical search depth is 4-8 moves.

Heuristics

Several heuristics are used to direct the optimization algorithm towards favorable positions. The precise choice of heuristic has a huge effect on the performance of the algorithm. The various heuristics are weighted and combined into a positional score, which determines how "good" a given board position is. The optimization search will then aim to maximize the average score of all possible board positions. The actual score, as shown by the game, is not used to calculate the board score, since it is too heavily weighted in favor of merging tiles (when delayed merging could produce a large benefit).

Initially, I used two very simple heuristics, granting "bonuses" for open squares and for having large values on the edge. These heuristics performed pretty well, frequently achieving 16384 but never getting to 32768.

Petr Morávek (@xificurk) took my AI and added two new heuristics. The first heuristic was a penalty for having non-monotonic rows and columns which increased as the ranks increased, ensuring that non-monotonic rows of small numbers would not strongly affect the score, but non-monotonic rows of large numbers hurt the score substantially. The second heuristic counted the number of potential merges (adjacent equal values) in addition to open spaces. These two heuristics served to push the algorithm towards monotonic boards (which are easier to merge), and towards board positions with lots of merges (encouraging it to align merges where possible for greater effect).

Furthermore, Petr also optimized the heuristic weights using a "meta-optimization" strategy (using an algorithm called CMA-ES), where the weights themselves were adjusted to obtain the highest possible average score.

The effect of these changes are extremely significant. The algorithm went from achieving the 16384 tile around 13% of the time to achieving it over 90% of the time, and the algorithm began to achieve 32768 over 1/3 of the time (whereas the old heuristics never once produced a 32768 tile).

I believe there's still room for improvement on the heuristics. This algorithm definitely isn't yet "optimal", but I feel like it's getting pretty close.


That the AI achieves the 32768 tile in over a third of its games is a huge milestone; I will be surprised to hear if any human players have achieved 32768 on the official game (i.e. without using tools like savestates or undo). I think the 65536 tile is within reach!

You can try the AI for yourself. The code is available at https://github.com/nneonneo/2048-ai.

这篇关于游戏2048的最优算法是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆