AI如何为战列舰建立遗传规划模型 [英] AI How to model genetic programming for Battleships

查看:115
本文介绍了AI如何为战列舰建立遗传规划模型的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个关于遗传编程的问题。我将为一个



正如您所看到的那样,它的平均分数非常低,但学得很快。但是,船的方向不断变化,除随机分量外还会产生一些噪音。一次又一次的突变使得这个群体变得有点混乱,但随着整个群体的改善而变得越来越少。



挑战,以及许多论文确定的原因,是让更多的东西变得可变,特别是以建设性的方式。例如,镜头数量可能是可变的。或者,根据最后一次拍摄是命中还是错过分支树替换拍摄列表可能会改进,但很难说。这就是'决策'逻辑考虑因素的来源。有一个随机镜头列表或树根据先前的镜头决定采取哪个分支更好吗?更高层次的挑战包括预测哪些变化将使小组学得更快,并且不易受到不良突变的影响。



最后,考虑可能有多个团体,一组是战舰猎人,一组是潜艇猎人。每个组虽然由相同的代码组成,但可以进化不同的内部遗传学,使他们能够专门完成他们的任务。



无论如何,一如既往,从某处开始简单,随便学习,直到你恢复阅读论文为止。



PS>也需要这样:

 公共类位置{
int x;
int y;
Position(int x,int y){this.x = x; this.y = y;}

@Override
public boolean equals(Object m){
return(((Position)m).x == x&& ((位置)M).Y == y)基
}
}

UDATE:已添加发货类,修正了一些错误:

 公共类船舶{
列表<位置>位置;

//测试是否有点击
public boolean madeHit(Position shot){
for(Position p:positions){
if(p.equals(拍)))返回true;
}
返回false;
}

//进行新的定位
public int newOrientation(){
positions = new ArrayList< Position>(3);
//随机发货。
int shipInX = 0,oShipInX = 0,shipInY = 0,oShipInY = 0;

int orient =(int)(Math.random()* 4);
if(orient == 0){
oShipInX = 1;
shipInX =(int)(Math.random()* 3)-3;
}
else if(orient == 1){
oShipInX = -1;
shipInX =(int)(Math.random()* 3);
}
else if(orient == 2){
oShipInY = 1;
shipInY =(int)(Math.random()* 3)-3;
}
else if(orient == 3){
oShipInY = -1;
shipInY =(int)(Math.random()* 3);
}

//使船舶的位置为
for(int i = 0; i< 3; ++ i){
positions.add(新职位(shipInX,shipInY));
if(orient == 2 || orient == 3)
shipInY = shipInY + oShipInY;
else
shipInX = shipInX + oShipInX;
}
返回东方;
}

public int getSize(){
return positions.size();
}
}


I have a question regarding Genetic Programming. I am going to work on a genetic algorithm for a game called Battleships.

My question is: How would I decide upon a "decision" model for the AI to evolve? And how does that work?

I have read multiple papers and multiple answers that just speak about using different models, but could not find something specific, which, unfortunately, I apparently need to wrap my head around the problem.

I want it to evolve over multiple iterations and "learn" what works best, but not sure how to save these "decisions" (I know to a file, but "encoded" how?) in a good way, so it will learn to take a stance to previous actions and base off info from the current board state.

I have been contemplating a "Tree structure" for the AI to base decisions on, but I don't actually know how to get started.

If someone could either point me in the right direction (a link? Some pseudo-code? Something like that), that'd be very much appreciated, I tried to google as much as possible, watch multiple youtube videos about the subject, but I think I just need that little nudge in the right direction.

I may also just not know what exactly to search for, and this is why I come up blank with results on what and how I implement this.

解决方案

ANSWER PART I: The basis for a genetic algorithm is a having a group of actors, some of which reproduce. The fittest are chosen for reproduction and the offspring are copies of the parents that are slightly mutated. It's a pretty simple concept, but to program it you have to have actions that can be randomly chosen and dynamically modified. For the battleship simulation I created a class called a Shooter because it 'shoots' at a position. The assumption here is that the first position has been hit, and the shooter is now trying to sink the battleship.

public class Shooter implements Comparable<Shooter> {
    private static final int NUM_SHOTS = 100;
    private List<Position> shots;
    private int score;

    // Make a new set of random shots.
    public Shooter newShots() {
        shots = new ArrayList<Position>(NUM_SHOTS);
        for (int i = 0; i < NUM_SHOTS; ++i) {
            shots.add(newShot());
        }
        return this;
    }
    // Test this shooter against a ship
    public void testShooter(Ship ship) {
        score = shots.size();
        int hits = 0;
        for (Position shot : shots) {
            if (ship.madeHit(shot)) {
                if (++hits >= ship.getSize())
                    return;
            } else {
                score = score - 1;
            }
        }
    }

    // get the score of the testShotr operation
    public int getScore() {
        return score;
    }
    // compare this shooter to other shooters.
    @Override
    public int compareTo(Shooter o) {
        return score - o.score;
    }
    // getter
    public List<Position> getShots() {
        return shots;
    }
    // reproduce this shooter
    public Shooter reproduce() {
        Shooter offspring = new Shooter();
        offspring.mutate(shots);
        return offspring;
    }
    // mutate this shooter's offspring
    private void mutate(List<Position> pShots) {
        // copy parent's shots (okay for shallow)
        shots = new ArrayList<Position>(pShots);
        // 10% new mutations, in random locations
        for (int i = 0; i < NUM_SHOTS / 10; i++) {
            int loc = (int) (Math.random() * 100);
            shots.set(loc, newShot());
        }
    }
    // make a new random move
    private Position newShot() {
        return new Position(((int) (Math.random() * 6)) - 3, ((int) (Math.random() * 6)) - 3);
    }
}

The idea here is that a Shooter has up to 100 shots, randomly chosen between +-3 in the X and +- 3 in the Y. Yea, 100 shots is overkill, but hey, whatever. Pass a Ship to this Shooter.testShooter and it will score itself, 100 being the best score, 0 being the worst.

This Shooter actor has reproduce and mutate methods that will return an offspring that has 10% of its shots randomly mutated. The general idea is that the best Shooters have 'learned' to shoot their shots in a cross pattern ('+') as quickly as possible, since a ship is oriented in one of four ways (North, South, East, West).

The program that runs the simulation, ShooterSimulation, is pretty simple:

public class ShooterSimulation {
    private int NUM_GENERATIONS = 1000;
    private int NUM_SHOOTERS = 20;
    private int NUM_SHOOTERS_NEXT_GENERATION = NUM_SHOOTERS / 10;

    List<Shooter> shooters = new ArrayList<Shooter>(NUM_SHOOTERS);
    Ship ship;

    public static void main(String... args) {
        new ShooterSimulation().run();
    }

    // do the work
    private void run() {
        firstGeneration();
        ship = new Ship();
        for (int gen = 0; gen < NUM_GENERATIONS; ++gen) {
            ship.newOrientation();
            testShooters();
            Collections.sort(shooters);
            printAverageScore(gen, shooters);
            nextGeneration();
        }
    }

    // make the first generation
    private void firstGeneration() {
        for (int i = 0; i < NUM_SHOOTERS; ++i) {
            shooters.add(new Shooter().newShots());
        }
    }

    // test all the shooters
    private void testShooters() {
        for (int mIdx = 0; mIdx < NUM_SHOOTERS; ++mIdx) {
            shooters.get(mIdx).testShooter(ship);
        }
    }

    // print the average score of all the shooters
    private void printAverageScore(int gen, List<Shooter> shooters) {
        int total = 0;
        for (int i = 0, j = shooters.size(); i < j; ++i) {
            total = total + shooters.get(i).getScore();
        }
        System.out.println(gen + " " + total / shooters.size());
    }

    // throw away the a tenth of old generation
    // replace with offspring of the best fit
    private void nextGeneration() {
        for (int l = 0; l < NUM_SHOOTERS_NEXT_GENERATION; ++l) {
            shooters.set(l, shooters.get(NUM_SHOOTERS - l - 1).reproduce());
        }
    }
}

The code reads as pseudo-code from the run method: make a firstGeneration then iterate for a number of generations. For each generation, set a newOrientation for the ship, then do testShooters, and sort the results of the test with Collections.sort. printAverageScore of the test, then build the nextGeneration. With the list of average scores you can, cough cough, do an 'analysis'.

A graph of the results looks like this:

As you can see it starts out with pretty low average scores, but learns pretty quickly. However, the orientation of the ship keeps changing, causing some noise in addition to the random component. Every now and again a mutation messes up the group a bit, but less and less as the group improves overall.

Challenges, and the reason for many papers to be sure, is to make more things mutable, especially in a constructive way. For example, the number of shots could be mutable. Or, replacing the list of shots with a tree that branches depending on whether the last shot was a hit or miss might improve things, but it's difficult to say. That's where the 'decision' logic considerations come in. Is it better to have a list of random shots or a tree that decides which branch to take depending on the prior shot? Higher level challenges include predicting what changes will make the group learn faster and be less susceptible to bad mutations.

Finally, consider that there could be multiple groups, one group a battleship hunter and one group a submarine hunter for example. Each group, though made of the same code, could 'evolve' different internal 'genetics' that allow them to specialize for their task.

Anyway, as always, start somewhere simple and learn as you go until you get good enough to go back to reading the papers.

PS> Need this too:

public class Position {
    int x;
    int y;
    Position(int x, int y ) {this.x=x; this.y=y;}

    @Override
    public boolean equals(Object m) {
        return (((Position)m).x==x && ((Position)m).y==y);
    }
}

UDATE: Added Ship class, fixed a few bugs:

public class Ship {
    List<Position> positions;

    // test if a hit was made
    public boolean madeHit(Position shot) {
        for (Position p: positions) {
            if ( p.equals(shot)) return true;
        }
        return false;
    }

    // make a new orientation
    public int newOrientation() {
        positions = new ArrayList<Position>(3);
        // make a random ship direction.
        int shipInX=0, oShipInX=0 , shipInY=0, oShipInY=0;

        int orient = (int) (Math.random() * 4);
        if( orient == 0 ) {
            oShipInX = 1;
            shipInX = (int)(Math.random()*3)-3;
        }
        else if ( orient == 1 ) {
            oShipInX = -1;
            shipInX = (int)(Math.random()*3);
        }
        else if ( orient == 2 ) {
            oShipInY = 1;
            shipInY = (int)(Math.random()*3)-3;
        }
        else if ( orient == 3 ) {
            oShipInY = -1;
            shipInY = (int)(Math.random()*3);
        }

        // make the positions of the ship
        for (int i = 0; i < 3; ++i) {
            positions.add(new Position(shipInX, shipInY));
            if (orient == 2 || orient == 3)
                shipInY = shipInY + oShipInY;
            else
                shipInX = shipInX + oShipInX;
        }
        return orient;
    }

    public int getSize() {
        return positions.size();
    }
}

这篇关于AI如何为战列舰建立遗传规划模型的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆