Randomly reassign participants to groups such that participants originally from the same group don't end up in the same group
Problem description
I'm trying to run a Monte Carlo-style analysis in which I randomly reassign the participants in my experiment to new groups and then reanalyze the data using those random new groups. Here's what I want to do:
Participants are originally grouped into eight groups of four. I want to randomly reassign each participant to a new group, with the constraint that no participant ends up in a new group with another participant from their original group.
Here's how far I've gotten:
import random
import pandas as pd
import itertools as it

data = list(it.product(range(8), range(4)))
test_df = pd.DataFrame(data=data, columns=['group', 'partid'])
test_df['new_group'] = None

for idx, row in test_df.iterrows():
    start_group = row['group']
    takens = test_df.query('group == @start_group')['new_group'].values
    fulls = test_df.groupby('new_group').count().query('partid >= 4').index.values
    possibles = [x for x in test_df['group'].unique() if (x not in takens)
                 and (x not in fulls)]
    test_df.loc[idx, 'new_group'] = random.choice(possibles)
The basic idea is to randomly reassign each participant to a new group subject to two constraints: (a) the new group doesn't already contain one of their original group partners, and (b) the new group doesn't already have 4 or more participants reassigned to it.
The problem with this approach is that, many times, by the time we try to reassign the last group, the only remaining slots are all in the same group. I could re-randomize whenever it fails until it succeeds, but that feels silly. Also, I want to make 100 random reassignments, so that approach could get very slow.
So there must be a smarter way to do this. I also feel there should be a simpler way to solve this, given how simple the goal seems (though I realize that can be misleading).
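To get a feel for how often the greedy approach described above runs out of slots, here is a minimal sketch that re-implements the same two constraints with plain lists and counts the dead ends. The function names `greedy_reassign` and `dead_end_rate` are hypothetical and not part of the original code, and the measured rate will vary with the seed and the number of trials.

```python
import random

def greedy_reassign(num_groups=8, num_members=4, rng=None):
    """Greedy reassignment; returns the list of new groups, or None on a dead end."""
    if rng is None:
        rng = random.Random()
    counts = [0] * num_groups      # participants already placed in each new group
    new_groups = []
    for _old_group in range(num_groups):
        taken = set()              # new groups already used by this original group
        for _member in range(num_members):
            possibles = [g for g in range(num_groups)
                         if g not in taken and counts[g] < num_members]
            if not possibles:      # no legal slot left: the greedy pass is stuck
                return None
            choice = rng.choice(possibles)
            taken.add(choice)
            counts[choice] += 1
            new_groups.append(choice)
    return new_groups

def dead_end_rate(trials=1000, seed=0):
    """Fraction of greedy passes that get stuck before everyone is placed."""
    rng = random.Random(seed)
    failures = sum(greedy_reassign(rng=rng) is None for _ in range(trials))
    return failures / trials

rate = dead_end_rate(trials=200)
```

A nonzero `rate` shows the dead ends are real; a retry loop around `greedy_reassign` would still terminate eventually, but the number of retries is not bounded in advance.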
Recommended answer
Better solution
After sleeping on it, I've found a significantly better solution that runs in roughly O(numGroups).
import random
import numpy as np
import pandas as pd
import itertools as it
np.random.seed(0)
numGroups=4
numMembers=4
data = list(it.product(range(numGroups),range(numMembers)))
df = pd.DataFrame(data=data,columns=['group','partid'])
Solution
g = np.repeat(range(numGroups),numMembers).reshape((numGroups,numMembers))
In [95]: g
Out[95]:
array([[0, 0, 0, 0],
[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3]])
g = np.random.permutation(g)
In [102]: g
Out[102]:
array([[2, 2, 2, 2],
[3, 3, 3, 3],
[1, 1, 1, 1],
[0, 0, 0, 0]])
g = np.tile(g,(2,1))
In [104]: g
Out[104]:
array([[2, 2, 2, 2],
[3, 3, 3, 3],
[1, 1, 1, 1],
[0, 0, 0, 0],
[2, 2, 2, 2],
[3, 3, 3, 3],
[1, 1, 1, 1],
[0, 0, 0, 0]])
Notice the diagonals.
array([[2, -, -, -],
[3, 3, -, -],
[1, 1, 1, -],
[0, 0, 0, 0],
[-, 2, 2, 2],
[-, -, 3, 3],
[-, -, -, 1],
[-, -, -, -]])
Take the diagonals from top to bottom:
newGroups = []
for i in range(numGroups):
    newGroups.append(np.diagonal(g[i:i+numMembers]))
In [106]: newGroups
Out[106]:
[array([2, 3, 1, 0]),
array([3, 1, 0, 2]),
array([1, 0, 2, 3]),
array([0, 2, 3, 1])]
newGroups = np.ravel(newGroups)
df["newGroups"] = newGroups
In [110]: df
Out[110]:
group partid newGroups
0 0 0 2
1 0 1 3
2 0 2 1
3 0 3 0
4 1 0 3
5 1 1 1
6 1 2 0
7 1 3 2
8 2 0 1
9 2 1 0
10 2 2 2
11 2 3 3
12 3 0 0
13 3 1 2
14 3 2 3
15 3 3 1
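The tile-and-diagonal steps above amount to giving member j of original group i the new group perm[(i + j) % numGroups], where perm is the shuffled group order. The sketch below writes that out directly, adds a validity check, and draws 100 reassignments for 8 groups of 4. The names `diagonal_assignment` and `is_valid` are hypothetical, and the per-row shuffle is an added assumption rather than part of the answer: without it, the member at position 0 of group i and the member at position 1 of group i-1 always land in the same new group. Even with the shuffle, this probably does not sample uniformly from all valid reassignments.

```python
import numpy as np

def diagonal_assignment(perm, num_members):
    """Row i holds the new groups for the members of original group i."""
    perm = np.asarray(perm)
    num_groups = len(perm)
    if num_members > num_groups:
        raise ValueError("need at least as many groups as members per group")
    rows = np.arange(num_groups)[:, None]
    cols = np.arange(num_members)[None, :]
    # Same values as np.diagonal(g[i:i+num_members]) on the tiled array
    return perm[(rows + cols) % num_groups]

def is_valid(assignment):
    """No repeated new group within a row, and every new group is exactly full."""
    num_groups, num_members = assignment.shape
    rows_distinct = all(len(np.unique(row)) == num_members for row in assignment)
    counts = np.bincount(assignment.ravel(), minlength=num_groups)
    return rows_distinct and bool(np.all(counts == num_members))

rng = np.random.default_rng(0)
draws = []
for _ in range(100):
    a = diagonal_assignment(rng.permutation(8), 4)
    # Assumption, not in the original answer: shuffle within each original
    # group so a member's position does not fix which offset they receive.
    a = np.array([rng.permutation(row) for row in a])
    draws.append(a)
```

Each element of `draws` can be flattened with `ravel()` and assigned to a DataFrame column in the same way as `newGroups` above.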
Old solution: brute-force method
Turned out to be a lot harder than I thought...
I have a brute-force method that basically guesses different permutations of groups until it finds one where no two members of the same original group share a new group. The benefit of this approach over what you've shown is that it doesn't suffer from "running out of groups at the end".
It can potentially get slow, but for 8 groups and 4 members per group it's fast.
import random
import numpy as np
import pandas as pd
import itertools as it
random.seed(0)
numGroups=4
numMembers=4
data = list(it.product(range(numGroups),range(numMembers)))
df = pd.DataFrame(data=data,columns=['group','partid'])
Solution
g = np.repeat(range(numGroups),numMembers).reshape((numGroups,numMembers))
In [4]: g
Out[4]:
array([[0, 0, 0, 0],
[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3]])
def reArrange(g):
    g = np.transpose(g)
    g = [np.random.permutation(x) for x in g]
    return np.transpose(g)

# check to see if any members in each old group have duplicate new groups
# if so repeat
while np.any(np.apply_along_axis(lambda x: len(np.unique(x))<numMembers,1,g)):
    g = reArrange(g)
df["newGroup"] = g.ravel()
In [7]: df
Out[7]:
group partid newGroup
0 0 0 2
1 0 1 3
2 0 2 1
3 0 3 0
4 1 0 0
5 1 1 1
6 1 2 2
7 1 3 3
8 2 0 1
9 2 1 0
10 2 2 3
11 2 3 2
12 3 0 3
13 3 1 2
14 3 2 0
15 3 3 1
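Since the number of retries is random, it can help to cap the loop and report how many attempts were needed. The following is a minimal sketch of the same shuffle-each-column-and-retry idea wrapped in a function; `has_clash`, `brute_force_assignment`, and the `max_attempts` parameter are hypothetical names introduced here, and the attempt count depends on the seed.

```python
import numpy as np

def has_clash(g):
    """True if any row (original group) contains the same new group twice."""
    s = np.sort(g, axis=1)
    return bool(np.any(s[:, 1:] == s[:, :-1]))

def brute_force_assignment(num_groups, num_members, rng, max_attempts=1_000_000):
    """Shuffle each column independently until no row has a repeated group."""
    if num_members > num_groups:
        raise ValueError("need at least as many groups as members per group")
    for attempt in range(1, max_attempts + 1):
        # Each column is a permutation, so every new group ends up exactly full
        g = np.column_stack([rng.permutation(num_groups)
                             for _ in range(num_members)])
        if not has_clash(g):
            return g, attempt
    raise RuntimeError("no valid assignment found within max_attempts")

rng = np.random.default_rng(0)
g, attempts = brute_force_assignment(8, 4, rng)
```

Here `g.ravel()` has the same layout as the `newGroup` column above, and `attempts` gives a rough sense of how expensive 100 repetitions would be.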