Efficiently generating multiple instances of numpy.random.choice without replacement
Problem description
I'm new to Python. While reading, please mention any other suggestions for improving my Python code.
Question: How do I generate an 8xN array in Python containing random numbers? The constraint is that each column of this array must contain 8 draws without replacement from the integer set [1,8]. More specifically, when N = 10, I want something like this.
[[ 6. 2. 3. 4. 7. 5. 5. 7. 8. 4.]
[ 1. 4. 5. 5. 4. 4. 8. 5. 7. 5.]
[ 7. 3. 8. 8. 3. 8. 7. 3. 6. 7.]
[ 3. 6. 7. 1. 5. 6. 2. 1. 5. 1.]
[ 8. 1. 4. 3. 8. 2. 3. 4. 3. 3.]
[ 5. 8. 1. 7. 1. 3. 6. 8. 1. 6.]
[ 4. 5. 2. 6. 2. 1. 1. 6. 4. 2.]
[ 2. 7. 6. 2. 6. 7. 4. 2. 2. 8.]]
To do this I use the following approach:
import numpy.random
import numpy
def rand_M(N):
    M = numpy.zeros(shape = (8, N))
    for i in range (0, N):
        M[:, i] = numpy.random.choice(8, size = 8, replace = False) + 1
    return M
In practice N will be ~1e7. The algorithm above is O(N) in time and takes roughly 0.38 secs when N=1e3. The time when N = 1e7 is therefore ~1 hr (i.e. 3800 secs). There has to be a much more efficient way.
Timing function
from timeit import Timer
t = Timer(lambda: rand_M(1000))
print(t.timeit(5))
0.3863314103162543
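One detail about the measurement above: `Timer.timeit(number)` returns the total time for all `number` executions, not the time per call, so the printed figure covers five calls to `rand_M(1000)`. A minimal sketch of reporting the per-call time as well (the names `n_runs`, `total` and `per_call` are illustrative and not from the original post):

```python
from timeit import Timer

import numpy as np


def rand_M(N):
    """Loop-based version from the question: one np.random.choice call per column."""
    M = np.zeros(shape=(8, N))
    for i in range(N):
        M[:, i] = np.random.choice(8, size=8, replace=False) + 1
    return M


n_runs = 5  # illustrative repeat count, matching t.timeit(5) above
total = Timer(lambda: rand_M(1000)).timeit(n_runs)  # total seconds for all n_runs calls
per_call = total / n_runs  # average seconds for a single rand_M(1000) call
print(f"total: {total:.4f} s, per call: {per_call:.4f} s")
```

Extrapolating from the per-call figure rather than the total would shrink the estimate for N = 1e7 by the same factor of five, though the loop is still far too slow at that size.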
Recommended answer
Create a random array of the specified shape and then argsort it along the axis on which the constraint must hold. This gives a vectorized and very efficient solution. It is based on this smart answer to "MATLAB randomly permuting columns differently". The implementation is the one-line sortbased_rand8 function used in the runtime test below.
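The reason the trick works is that a column of independent uniform random floats has distinct values with probability 1, so the argsort of that column is a random permutation of the row indices 0..7. Below is a minimal sketch of the idea with a sanity check added; the check and the `expected` helper array are illustrative additions, not part of the original answer.

```python
import numpy as np


def sortbased_rand8(N):
    """Return an 8xN integer array whose columns are each a permutation of 1..8."""
    # argsort along axis 0 ranks the random floats within each column,
    # yielding a permutation of 0..7 per column; +1 shifts it to 1..8.
    return np.argsort(np.random.rand(8, N), axis=0) + 1


M = sortbased_rand8(10)

# Sanity check: sorting each column must give exactly 1..8.
expected = np.arange(1, 9)[:, None]  # column vector 1..8, shape (8, 1)
assert np.array_equal(np.sort(M, axis=0), np.broadcast_to(expected, M.shape))
print(M)
```

Note that the result has an integer dtype, unlike the float array produced by filling `numpy.zeros` in the question.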
Sample run -
In [122]: N = 10
In [123]: np.argsort(np.random.rand(8,N),axis=0)+1
Out[123]:
array([[7, 3, 5, 1, 1, 5, 2, 4, 1, 4],
[8, 4, 3, 2, 2, 8, 5, 5, 6, 2],
[1, 2, 4, 6, 5, 4, 4, 3, 4, 7],
[5, 6, 2, 5, 8, 2, 7, 8, 5, 8],
[2, 8, 6, 3, 4, 7, 1, 1, 2, 6],
[6, 7, 7, 8, 6, 6, 3, 2, 7, 3],
[4, 1, 1, 4, 3, 3, 8, 6, 8, 1],
[3, 5, 8, 7, 7, 1, 6, 7, 3, 5]], dtype=int64)
Runtime test -
In [124]: def sortbased_rand8(N):
     ...:     return np.argsort(np.random.rand(8,N),axis=0)+1
     ...:
     ...: def rand_M(N):
     ...:     M = np.zeros(shape = (8, N))
     ...:     for i in range (0, N):
     ...:         M[:, i] = np.random.choice(8, size = 8, replace = False) + 1
     ...:     return M
     ...:
In [125]: N = 5000
In [126]: %timeit sortbased_rand8(N)
100 loops, best of 3: 1.95 ms per loop
In [127]: %timeit rand_M(N)
1 loops, best of 3: 233 ms per loop
Thus, a 120x speedup awaits!
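As a possible alternative that is not part of the original answer: newer NumPy versions (1.20 and later, which is an assumption about your environment) provide `Generator.permuted`, which shuffles each slice along an axis independently. The sketch below uses it to permute every column of an array whose columns all start as 1..8; the function name `permuted_rand8` and the `seed` parameter are hypothetical, and no timing comparison is claimed here.

```python
import numpy as np


def permuted_rand8(N, seed=None):
    """Return an 8xN integer array whose columns are independent permutations of 1..8."""
    rng = np.random.default_rng(seed)  # seed=None draws fresh entropy from the OS
    base = np.tile(np.arange(1, 9)[:, None], (1, N))  # every column is 1..8
    # permuted(axis=0) shuffles along axis 0, i.e. within each column, independently
    return rng.permuted(base, axis=0)


print(permuted_rand8(10))
```

Passing a fixed `seed` makes the output reproducible, which can help when testing code that consumes the array.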