将学生分成小组的最快启发式算法是什么? [英] What's the fastest heuristic algorithm to split students into groups?

查看:152
本文介绍了将学生分成小组的最快启发式算法是什么?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有X个学生,其中X是6的倍数。我现在想将学生分成6组。



我有一个函数用来衡量6人一组的好程度(现在可以说它是一个黑盒子,并且在恒定时间内运行)。通过拆分学生,然后在每个组上调用我的函数来衡量其良好性,然后总结每个组的优度,我就能衡量出一组特定组的良好程度。



我正在尝试创建一种算法,该算法将以某种方式对学生进行分组,以使所有组的总体优势最大化,并且没有一个组的个人优势低于某个值y 。换句话说,在所有组的优度均高于y的约束下,将学生分为6组,以使总体优度最大化。



学生人数(X)I预计可以在大约36上运行该算法。



问题似乎是NP-Complete,所以我可以接受启发式算法。我对此没有太多经验,但是我认为某种遗传算法,模拟退火甚至贪婪算法都可能起作用,但是我不确定从哪里开始研究。



有人可以指出我正确的方向吗?我已经进行了一些研究,并且该问题似乎与Traveling Salesman问题几乎相同(问题空间是学生/节点的所有排列),但是我认为我不能将TSP算法应用于此,因为节点数(大约36岁)对于任何有效的东西来说都是很大的。

解决方案

让我们以36名学生分配为6名学生为例组。检查所有组合是不切实际的,因为有3,708,580,189,773,818,399,040。但是,通过检查学生在成对的小组之间的每个分布来进行反复改进的策略应该是可行的。



有462种方法可以将12个学生分成两组,因此,找到最佳的12→ 2分布仅需要924个调用组质量功能。 6个小组中有15个可能的小组配对,因此有13,860个电话将显示配对小组和在小组之间重新分配学生以取得最大进步的最佳方法。





从随机初始分布开始,该算法计算所有15组配对的最佳分布: AB,CD,EF,BC,DE,FA,AC,BD,CE,DF,EA,FB,AD,BE,CF





然后比较所有15个得分对的组合,以找到总得分最高的组合,例如 DE + AC + FB





然后重新分配学生并返回新的总体得分。这是一个改进步骤。然后重复此过程多次,直到找不到更多的改进或时间用完为止。从不同的随机初始分布开始,多次运行该算法可能也很有用。



该算法可以在配对阶段和配对组合阶段进行微调。优化一对组时,您必须选择例如与将两组均增加的分布相比,将两组中的学生分布提高一个组的分数增加+4但将另一组的分数降低-1的情况(综合改善为+3)是否更好?他们的得分提高了+1,而综合得分只有+2。



再次在配对组合阶段,您必须决定是否需要对所有三个配对进行改进,或者是否选择组合最高的组合改善。



我认为,如果某个小组在某个步骤之后获得较低的分数,则可以提高整体分数,这将使学生在小组之间有更多的移动,并且可能导致探索更多组合。






为了能够编写代码来测试此策略,需要一个虚拟的组质量功能,因此,我将从1到36的学生编号,并使用将相邻学生编号之间的距离相乘的函数。所以例如组 [2,7,15,16,18,30] 的得分为 5 * 8 * 1 * 2 * 12 = 960 。如果您认为编号是学生能力的排名,那么高素质的团队就意味着混合能力的团队。最佳分布为:



 
组A:[1、7、13、19、25、31]
组B:[2、8、14、20、26、32、32]
组C:[3、9、15、21、27、33、33]
组D:[4, 10、16、22、28、34] E组
:[5、11、17、23、29、35] F组
F:[6、12、18、24、30、36]



每组得分 6 * 6 * 6 * 6 * 6 = 7776 ,总得分为 46656 。在实践中,我发现使用 Log(score)会产生更好的结果,因为相对于对一两个小组的大改进,它倾向于对所有小组进行小小的改进。 (需要对多个组或质量最低的组进行改进,或者只是选择最佳的总体改进,这是您必须微调至特定的组质量功能的部分。)






令我惊讶的是,该算法总是设法以4至7个步骤找到最佳解决方案,这意味着进行了超过100,000个组质量功能调用。我使用的分组质量算法当然很简单,因此您必须与实际情况进行核对,以评估这种方法在特定情况下的用处。但是很明显,该算法仅需几个步骤就可以彻底重新分配分布。



(为简单起见,下面的代码示例针对36个学生和6个小组的情况进行了硬编码。每组中的学生排序都是为了简化质量函数。)



  function Improvement(groups){变量对= [[0,1],[0,2],[0,3],[0,4],[0,5],[1,2],[1,3],[1,4 ],[1,5],[2,3],[2,4],[2,5],[3,4],[3,5],[4,5]]; var combi = [[0,9,14],[0,10,13],[0,11,12],[1,6,14],[1,7,13],[1,8,12 ],[2,5,14],[2,7,11],[2,8,10],[3,5,13],[3,6,11],[3,8,9], [4,5,12],[4,6,10],[4,7,9]]; //查找所有组对的最佳分布var optim = []; for(var i = 0; i< 15; i ++){optim [i] = optimise(groups [pairs [i] [0]],groups [pairs [i] [1]]); } //查找配对的最佳组合var best,成绩= -1; for(var i = 0; i< 15; i ++){var current = optim [combi [i] [0]]。score + optim [combi [i] [1]]。score + optim [combi [i] [2] .score;如果(当前>得分){得分=当前;最好=我; }} //将学生重新分配到组中并返回新的得分组[0] = optim [combi [best] [0]]。group1.slice(); groups [1] = optim [combi [best] [0]]。group2.slice(); groups [2] = optim [combi [best] [1]]。group1.slice(); groups [3] = optim [combi [best] [1]]。group2.slice(); groups [4] = optim [combi [best] [2]]。group1.slice(); groups [5] = optim [combi [best] [2]]。group2.slice();返回得分;} //查找对成对的最优分配函数函数optimise(group1,group2){var optim = {group1:[],group2:[],得分:-1}; var set = group1.concat(group2).sort(function(a,b){return a-b}); var distr = [0,0,0,0,0,1,1,1,1,1,1,1]; // //尝试进行每个组合{//将第一个组中的第一个学生保留为避免对称组合var groups = [[set [0]],[]]; //根据(VAR j = 0; j< 11; j ++)的二进制数组将学生分配到组0或1中{groups [distr [j]]。push(set [j + 1]); } //如果更好,请检查组和存储的分数var score = quality(groups [0])+ quality(groups [1]); if(得分&optim.score){optim.group1 = groups [0] .slice(); optim.group2 = groups [1] .slice(); optim.score =得分; }} while(increment(distr));回归乐观//生成二进制数组的下一个置换函数增量(数组){var digit = array.length,count = 0; while(--digit> = 0){if((array [digit] == 1)++ count else if(count){array [digit] = 1; for(var i = array.length-1; i> digit; i--){array [i] = --count> 0? 1:0; }返回true;返回false; }} //为一个小组评分;范围:0〜8.958797346140275函数质量(组){//对所有组的对数小改进,而对一个组的大改进不返回return Math.log((group [5]-group [4])*(group [4]-group [3])*(组[3]-组[2])*(组[2]-组[1])*(组[1]-组[0]));} //所有分数的总和6组;范围:0〜53.75278407684165功能全面质量(组){变量得分= 0;对于(var i = 0; i< 6; i ++)得分+ = quality(groups [i]);返回分数;} //准备随机测试数据var学生= [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19 ,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36]; var groups = [[],[],[],[ ],[],[]]; for(var i = 5; i> = 0; i--){for(var j = 5; j> = 0; j--){var pick = Math。 floor(Math.random()*(i * 6 + j)); groups [i] .push(students [pick]);学生[pick] =学生[i * 6 + j]; } groups [i] .sort(function(a,b){return a-b});} //显示初始得分和分布var score = totalQuality(groups); document.write(< PRE> Initial: + score.toFixed(2)+ + JSON.stringify(groups)+< BR>); //改善分发,直到不再增加得分var prev,step = 0; do {prev = score;分数=提高(组); document.write( Step + ++ step +: + score.toFixed(2)+ + JSON.stringify(groups)+< BR>);} while(score> prev& &得分< 53.75278407684165); if(得分> = 53.75278407684165)document.write(达到最佳解决方案。< / PRE>));  



注意:在选择了最佳组合后,将学生重新分配到这些组中之后,您当然知道现在这三对拥有最佳的学生分布。因此,您可以在接下来的步骤中跳过对这三对的检查,并将它们的当前分数用作最佳分数。


I have X number of students, where X is a multiple of 6. I now want to split up the students into groups of 6.

I have a function that measures how "good" a group of 6 is (lets say it's a black box that runs in constant time for now). By splitting up the students, and then calling my function on each group to measure it's goodness, and then summing up the goodness of each group, I'm able to measure how "good" a certain set of groups is.

I'm trying to create an algorithm that will group the students in a way so that the total goodness of all the groups is maximized, and no group has an individual goodness below some value y. In other words, group the students into groups of 6 to maximize total goodness under the constraint that all groups have a goodness above y.

The number of students (X) I expect to run this algorithm on is about ~36.

The problem appears to be NP-Complete, so I'm okay with settling for a heuristic algorithm. I don't have a lot of experience with this, but some sort of genetic algorithm or simulated annealing or even a greedy algorithm might work I would think, but I'm not sure where to start my research.

Could someone point me in the right direction please? I've done some research, and the problem seems almost identical to the Travelling Salesman Problem (the problem space is all permutations of the students/nodes) but I don't think I can apply TSP algorithms to this because the number of "nodes" (around 36) would be quite large for anything to be effective.

解决方案

Let's take the example of 36 students distributed into 6 groups. Checking all combinations is impractical, because there are 3,708,580,189,773,818,399,040. However, a strategy which makes repeated improvements by checking every distribution of students between pairs of groups should be feasible.

There are 462 ways to split 12 students into 2 groups, so finding the optimal 12→2 distribution takes only 924 calls to the "group quality" function. There are 15 possible pairings of groups among 6 groups, so 13,860 calls will reveal the best way to pair the groups and redistribute the students between the pairs to get the most improvement.

Starting with a random initial distribution, the algorithm calculates the optimal distribution for all 15 pairings of groups: AB,CD,EF,BC,DE,FA,AC,BD,CE,DF,EA,FB,AD,BE,CF.

It then compares the scores for all 15 combinations of pairs, to find the combination with the highest overal score, e.g. DE+AC+FB.

It then redistributes the students, and returns the new overall score. This constitutes one improvement step. This process is then repeated a number of times, until no more improvement can be found, or until you run out of time. It may also be useful to run the algorithm several times, starting with different random initial distributions.

This algorithm can be fine-tuned in both the pairing and the combination of pairings phase. When optimizing a pair of groups, you'll have to choose e.g. whether a distribution of the students over the two groups that increases the score of the one group by +4 but decreases the score of the other group by -1, for a combined improvement of +3, is preferable over a distribution where both groups increase their score by +1, for a combined improvement of only +2.

And again in the combinations of pairs phase, you'll have to decide whether an improvement of all three pairs is required, or whether you choose the combinations with the highest combined improvement.

I assume that allowing a group to have a lower score after a step if that improves the overall score, will allow for more movement of the students between the groups, and may lead to more combinations being explored.


To be able to write code to test this strategy, a dummy "group quality" function is needed, so I'm numbering the students from 1 to 36 and using a function which multiplies the distance between adjacent students' numbers. So e.g. the group [2,7,15,16,18,30] would have score 5*8*1*2*12 = 960. If you imagine the numbering to be a ranking of the students' ability, then a high-quality group means a mixed-ability group. The optimal distribution is:

group A: [1,  7, 13, 19, 25, 31]
group B: [2,  8, 14, 20, 26, 32]
group C: [3,  9, 15, 21, 27, 33]
group D: [4, 10, 16, 22, 28, 34]
group E: [5, 11, 17, 23, 29, 35]
group F: [6, 12, 18, 24, 30, 36]

with every group scoring 6*6*6*6*6 = 7776 and a total score of 46656. In practice I found that using Log(score) gave better results, because it favours small improvements across all groups over large improvements to one or two groups. (Favouring improvements to several groups, or to the lowest-quality groups, or just choosing the best overall improvement, is the part you'll have to fine-tune to your specific "group quality" function.)


To my surprise, the algorithm always manages to find the optimal solution, and in just 4 to 7 steps, which means that less than 100,000 "group quality" function calls are made. The "group quality" algorithm I'm using is of course quite simple, so you'd have to check it with the real thing to gauge the usefulness of this approach in your specific case. But it's clear that this algorithm manages to thoroughly rearrange the distribution in just a few steps.

(The code example below is hard-coded for the case of 36 students and 6 groups for simplicity. The sorting of students in each group is done to simplify the quality function.)

function improve(groups) {
    var pairs = [[0,1],[0,2],[0,3],[0,4],[0,5],[1,2],[1,3],[1,4],[1,5],[2,3],[2,4],[2,5],[3,4],[3,5],[4,5]];
    var combi = [[0,9,14],[0,10,13],[0,11,12],[1,6,14],[1,7,13],[1,8,12],[2,5,14],[2,7,11],[2,8,10],[3,5,13],[3,6,11],[3,8,9],[4,5,12],[4,6,10],[4,7,9]];
    // FIND OPTIMAL DISTRIBUTION FOR ALL PAIRS OF GROUPS
    var optim = [];
    for (var i = 0; i < 15; i++) {
        optim[i] = optimise(groups[pairs[i][0]], groups[pairs[i][1]]);
    }
    // FIND BEST COMBINATION OF PAIRS
    var best, score = -1;
    for (var i = 0; i < 15; i++) {
        var current = optim[combi[i][0]].score + optim[combi[i][1]].score + optim[combi[i][2]].score;
        if (current > score) {
            score = current;
            best = i;
        }
    }
    // REDISTRIBUTE STUDENTS INTO GROUPS AND RETURN NEW SCORE
    groups[0] = optim[combi[best][0]].group1.slice();
    groups[1] = optim[combi[best][0]].group2.slice();
    groups[2] = optim[combi[best][1]].group1.slice();
    groups[3] = optim[combi[best][1]].group2.slice();
    groups[4] = optim[combi[best][2]].group1.slice();
    groups[5] = optim[combi[best][2]].group2.slice();
    return score;
}

// FIND OPTIMAL DISTRIBUTION FOR PAIR OF GROUPS
function optimise(group1, group2) {
    var optim = {group1: [], group2: [], score: -1};
    var set = group1.concat(group2).sort(function(a, b) {return a - b});
    var distr = [0,0,0,0,0,1,1,1,1,1,1];
    // TRY EVERY COMBINATION
    do {
        // KEEP FIRST STUDENT IN FIRST GROUP TO AVOID SYMMETRIC COMBINATIONS
        var groups = [[set[0]], []];
        // DISTRIBUTE STUDENTS INTO GROUP 0 OR 1 ACCORDING TO BINARY ARRAY
        for (var j = 0; j < 11; j++) {
            groups[distr[j]].push(set[j + 1]);
        }
        // CHECK SCORE OF GROUPS AND STORE IF BETTER
        var score = quality(groups[0]) + quality(groups[1]);
        if (score > optim.score) {
            optim.group1 = groups[0].slice();
            optim.group2 = groups[1].slice();
            optim.score = score;
        }
    } while (increment(distr));
    return optim;

    // GENERATE NEXT PERMUTATION OF BINARY ARRAY
    function increment(array) {
        var digit = array.length, count = 0;
        while (--digit >= 0) {
            if (array[digit] == 1) ++count
            else if (count) {
                array[digit] = 1;
                for (var i = array.length - 1; i > digit; i--) {
                    array[i] = --count > 0 ? 1 : 0;
                }
                return true;
            }
        }
        return false;
    }
}

// SCORE FOR ONE GROUP ; RANGE: 0 ~ 8.958797346140275
function quality(group) {
    // LOGARITHM FAVOURS SMALL IMPROVEMENTS TO ALL GROUPS OVER LARGE IMPROVEMENT TO ONE GROUP
    return Math.log((group[5] - group[4]) * (group[4] - group[3]) * (group[3] - group[2]) * (group[2] - group[1]) * (group[1] - group[0]));
}

// SUM OF SCORES FOR ALL 6 GROUPS ; RANGE: 0 ~ 53.75278407684165
function overallQuality(groups) {
    var score = 0;
    for (var i = 0; i < 6; i++) score += quality(groups[i]);
    return score;
}

// PREPARE RANDOM TEST DATA
var students = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36];
var groups = [[],[],[],[],[],[]];
for (var i = 5; i >=0; i--) {
    for (var j = 5; j >= 0; j--) {
        var pick = Math.floor(Math.random() * (i * 6 + j));
        groups[i].push(students[pick]);
        students[pick] = students[i * 6 + j];
    }
    groups[i].sort(function(a, b) {return a - b});
}

// DISPLAY INITIAL SCORE AND DISTRIBUTION
var score = overallQuality(groups);
document.write("<PRE>Initial: " + score.toFixed(2) + " " + JSON.stringify(groups) + "<BR>");

// IMPROVE DISTRIBUTION UNTIL SCORE NO LONGER INCREASES
var prev, step = 0;
do {
    prev = score;
    score = improve(groups);
    document.write("Step " + ++step + " : " + score.toFixed(2) + " " + JSON.stringify(groups) + "<BR>");
} while (score > prev && score < 53.75278407684165);
if (score >= 53.75278407684165) document.write("Optimal solution reached.</PRE>");

Note: after having chosen the best combination of pairs and having redistributed the students in those pairs of groups, you of course know that those three pairs now have their optimal distribution of students. So you can skip checking those three pairs in the following step, and use their current score as the optimal score.

这篇关于将学生分成小组的最快启发式算法是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆