遗传算法在游戏代理启发式评估函数中的优化 [英] Genetic algorithm for optimization in game playing agent heuristic evaluation function

本文介绍了遗传算法在游戏代理启发式评估函数中的优化的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是对以下问题的回答: 如何为游戏创建良好的评估功能?,特别是@David(这是第一个答案).

This is in response to an answer given in this question: How to create a good evaluation function for a game?, particularly by @David (it is the first answer).

背景:我正在使用一种遗传算法来优化使用minimax/alpha beta修剪(带有迭代加深)的游戏代理中的超参数.特别是,我想使用遗传算法来优化启发式(评估)功能参数.我使用的评估功能是:

Background: I am using a genetic algorithm to optimize the hyper parameters in a game playing agent that is using minimax / alpha beta pruning (with iterative deepening). In particular, I would like to optimize the heuristic (evaluation) function parameters using a genetic algorithm. The evaluation function I use is:

f(w)= w * num_my_moves-(1-w)* num_opponent_moves

f(w) = w * num_my_moves - (1-w) * num_opponent_moves

要优化的唯一参数是[0,1]中的w.

The only parameter to optimize is w in [0,1].

这是我编程遗传算法的方式:

  1. 创建一个随机种群,例如100名代理商
  2. 让他们随机替换玩1000场比赛.
  3. 让父母成为表现最佳的媒介,并混入一些表现较差的媒介,以实现遗传多样性.
  4. 随意育种一些父母来创造孩子. *繁殖过程:我们将孩子定义为其父母体重的平均值. 即childWeight = 0.5 (father.w + mother.w)
  5. 新人口由父母和新创建的孩子组成.
  6. 随机突变1%的人口,如下所示: newWeight = agent.x + random.uniform(-0.01,0.01)并说明平凡的边界案例(即小于零且大于一的情况).
  7. 进化10次(即对新人口重复一次)
  1. Create a random population of say 100 agents
  2. Let them play 1000 games at random with replacement.
  3. Let the parents be the top performing agents with some poorer performing agents mixed in for genetic diversity.
  4. Randomly breed some parents to create children. * Breeding process: We define a child to be an average of the weights of its parents. i.e. childWeight = 0.5(father.w+ mother.w)
  5. The new population is formed by the parents and the newly created children.
  6. Randomly mutate 1% of the population as follows: newWeight = agent.x + random.uniform(-0.01,0.01) and account for trivial border cases (i.e. less than zero and greater than one, appropriately).
  7. Evolve 10 times (i.e. repeat for new population)

我的问题:请评估以上粗体字.特别是,没有人有更好的繁殖方法(而不是简单地平均父母权重),还有没有人有更好的变异方法,而不仅仅是添加random.uniform(-0.01,0.01)?

My question: Please evaluate the bold points above. In particular, does anyone have a better way to breed (rather than trivially averaging the parent weights), and does anyone have a better way to mutate, rather than just adding random.uniform(-0.01,0.01)?

推荐答案

看起来您实际上并没有将遗传算法应用于代理,而是直接对表型/权重进行了简单的进化.我建议您尝试引入您的体重的遗传表示,然后进化该基因组.一个例子是将您的权重表示为二进制字符串,并对字符串的每一位应用演化,这意味着每一位都有被突变的可能性.这称为点突变.您还可以应用其他许多突变,但这只是一个开始.

It looks like you're not actually applying a genetic-algorithm to your agents, but rather just simple evolution directly on the phenotype/weights. I suggest you try introducing a genetic representation of your weights, and evolve this genome instead. An example would be to represent your weights as a binary string, and apply evolution on each bit of the string, meaning there is a likelihood that each bit gets mutated. This is called point mutations. There are many other mutations you can apply, but it would do as a start.

您会注意到,您的特工不会受制于局部极小值,因为有时微小的遗传变化会极大地改变表型/权重.

What you will notice is that your agents don't get stuck in local minima as much because sometimes a small genetic change can vastly change the phenotype/weights.

好吧,这听起来可能很复杂,但事实并非如此.让我举一个例子:

Ok, that might sound complicated, it's not really. Let me give you an example:

假设基数为10的权重为42.二进制形式为101010.现在,您已对二进制表示的每一位实现了1%的突变率.假设最后一位被翻转了.然后,我们用二进制101011或十进制43.没有太大的变化.另一方面,对第二个位进行相同的操作将使您得到111010二进制或58十进制.注意大跃进.这就是我们想要的,并且可以让您的座席人群更快地搜索解决方案空间的很大一部分.

Say you have a weight of 42 in base 10. This would be 101010 in binary. Now you have implemented a 1% mutation rate on each bit of the binary representation. Let's say the last bit is flipped. Then we have 101011 in binary, or 43 in decimal. Not such a big change. Doing the same with the second bit on the other hand gives you 111010 in binary or 58 decimal. Notice the big jump. This is what we want, and lets your agent population search a larger part of the solution space faster.

关于繁殖.您可以尝试交叉.假设您有许多权重,每个权重都带有遗传编码.如果将整个基因组(所有二进制数据)表示为一个较长的二进制字符串,则可以组合两个亲本基因组的各个部分.再举个例子.以下是父亲"和母亲"的基因组和表型:

With regard to breeding. You can try crossover. Lets assume you have many weights each with a genetic encoding. If you represent the whole genome (all the binary data) as one long binary string you can combine sections of the two parents genome. Example, again. The following is the "father" and "mother" genome and phenotype:

Weight Name:          W1     W2     W3     W4     W5
Father Phenotype:     43     15     34     17     14
Father Genome:    101011 001111 100010 010001 001110
Mother Genome:    100110 100111 011001 010100 101000
Mother Phenotype:     38     39     25     20     40

您可以做的是在同一个地方通过两个基因组绘制任意线条,然后将片段任意分配给孩子.这是交叉的一个版本.

What you can do is draw arbitrary lines through both genomes at the same place, and assign the segments arbitrarily to the child. This is a version of crossover.

Weight Name:          W1     W2     W3     W4     W5
Father Genome:    101011 00.... ...... .....1 001110
Mother Genome:    ...... ..0111 011001 01010. ......
Child Genome:     101011 000111 011001 010101 001110
Child Phenotype:      43      7     25     21     14

这里的前8位和后7位来自父亲,中间来自母亲.请注意,权重W1和W5完全来自父亲,而W3完全来自母亲.而W2和W4是组合. W4几乎没有任何变化,而W2发生了巨大变化.

Here the first 8 and the last 7 bits come from the father, and the middle comes from the mother. Notice how weight W1 and W5 are entirely from the father, and W3 is entirely from the mother. While W2 and W4 are combinations. W4 had hardly any change, while W2 has changed drastically.

我希望这会给您一些有关如何执行遗传算法的见识.也就是说,除非您要学习,否则我建议您使用现代库而不是自己实现.

I hope this gives you some insight in how to do genetic algorithms. That said, I recommend using a modern library instead of implementing it yourself, unless you are doing it to learn.

编辑:有关处理权重/二进制表示的更多信息:

Edit: More on handling the weights/binary representation:

  • 如果需要分数,可以通过将分子和分母分开为不同的权重,或者将其中的一个作为常数,例如4210给出4.2.)
  • 大于0的约束免费.要真正获得负数,您需要抵消权重.
  • 通过将权重除以该位字符串长度的最大可能值,可以获得少于1个约束.在上面的示例中,您有6位,最多可以变为63位.如果您在突变后在基数10中获得10101042的二进制字符串,则执行42/63得到0.667,并且只能获得最高为1.0,即63/63.
  • 两个权重的总和等于1?如果对于W1W2获得101010001000,则给出42和8,则可以转到W1_scaled = W1 / (W1 + W2) = 0.84W2_scaled = W2 / (W1 + W2) = 0.16.这应该总是给您W1_scaled + W2_scaled = 1.
  • If you need fractions, you can do this by separating the numerator and denominator as different weights, or have one of them as a constant, e.g., 42 and 10 gives 4.2.)
  • Larger than 0 constraints come free. To actually get negative numbers you need to negate your weights.
  • Less than 1 constraint you can get by dividing the weight by the maximum possible value for that bit string length. In the examples above you have 6 bits, which can become a maximum of 63. If you then after mutation get a binary string of 101010 or 42 in base 10, you do 42/63 getting 0.667 and can only ever get as high as 1.0, when 63/63.
  • Two weights' sum equal to 1? If you get 101010 and 001000 for W1 and W2, it gives 42 and 8, then you can go W1_scaled = W1 / (W1 + W2) = 0.84 and W2_scaled = W2 / (W1 + W2) = 0.16. This should give you W1_scaled + W2_scaled = 1 always.

这篇关于遗传算法在游戏代理启发式评估函数中的优化的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆