Efficiently generating multiple instances of numpy.random.choice without replacement

Problem description

I'm new to Python. While reading, please mention any other suggestions regarding ways to improve my Python code.

Question: How do I generate an 8xN array in Python containing random numbers? The constraint is that each column of this array must contain 8 draws without replacement from the integers 1 to 8, i.e. each column is a random permutation of 1, ..., 8. More specifically, when N = 10, I want something like this:

[[ 6.  2.  3.  4.  7.  5.  5.  7.  8.  4.]
 [ 1.  4.  5.  5.  4.  4.  8.  5.  7.  5.]
 [ 7.  3.  8.  8.  3.  8.  7.  3.  6.  7.]
 [ 3.  6.  7.  1.  5.  6.  2.  1.  5.  1.]
 [ 8.  1.  4.  3.  8.  2.  3.  4.  3.  3.]
 [ 5.  8.  1.  7.  1.  3.  6.  8.  1.  6.]
 [ 4.  5.  2.  6.  2.  1.  1.  6.  4.  2.]
 [ 2.  7.  6.  2.  6.  7.  4.  2.  2.  8.]]

To do this I use the following approach:

import numpy.random
import numpy
def rand_M(N):
    M = numpy.zeros(shape = (8, N))
    for i in range (0, N):
        M[:, i] = numpy.random.choice(8, size = 8, replace = False) + 1 
    return M

In practice N will be ~1e7. The algorithm above is O(N) in time, and it takes roughly 0.38 s when N = 1e3. Scaling linearly, the time when N = 1e7 is therefore about 3800 s (~1 hr). There has to be a much more efficient way.

Timing the function

from timeit import Timer 
t = Timer(lambda: rand_M(1000))
print(t.timeit(5))
0.3863314103162543
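
One detail worth keeping in mind when reading the number above: `Timer.timeit(number)` returns the total time for all `number` calls, not the time per call. The sketch below repeats the same measurement and also reports the per-call average. The names `runs`, `total`, and `per_call` are introduced here for illustration only, and actual timings will differ by machine.

```python
from timeit import Timer

import numpy as np


def rand_M(N):
    # Same loop-based approach as in the question: one permutation of 1..8 per column.
    M = np.zeros(shape=(8, N))
    for i in range(N):
        M[:, i] = np.random.choice(8, size=8, replace=False) + 1
    return M


runs = 5
total = Timer(lambda: rand_M(1000)).timeit(runs)  # total seconds for all 5 calls
per_call = total / runs                           # average seconds per call
print(f"total: {total:.4f} s, per call: {per_call:.4f} s")
```

Extrapolating to N = 1e7 from the per-call figure rather than the total gives a smaller estimate, though the loop is still far slower than a vectorized approach.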

Recommended answer

Create a random array of the specified shape, then argsort along the axis within which values must not repeat (here axis 0, so each column becomes a permutation of the row indices). This gives a vectorized and very efficient solution. It is based on this smart answer to "MATLAB randomly permuting columns differently". The implementation is the one-liner used in the sample run and in sortbased_rand8 below.

Sample run:

In [121]: import numpy as np

In [122]: N = 10

In [123]: np.argsort(np.random.rand(8,N),axis=0)+1
Out[123]: 
array([[7, 3, 5, 1, 1, 5, 2, 4, 1, 4],
       [8, 4, 3, 2, 2, 8, 5, 5, 6, 2],
       [1, 2, 4, 6, 5, 4, 4, 3, 4, 7],
       [5, 6, 2, 5, 8, 2, 7, 8, 5, 8],
       [2, 8, 6, 3, 4, 7, 1, 1, 2, 6],
       [6, 7, 7, 8, 6, 6, 3, 2, 7, 3],
       [4, 1, 1, 4, 3, 3, 8, 6, 8, 1],
       [3, 5, 8, 7, 7, 1, 6, 7, 3, 5]], dtype=int64)
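
To see why this satisfies the constraint: `argsort` along axis 0 returns, for each column, a permutation of the indices 0..7, and because the random floats in a column are independent and identically distributed, every ordering is equally likely. Here is a minimal sketch that checks the property. It assumes the newer `np.random.default_rng` generator in place of `np.random.rand`; the optional `rng` argument and the helper `columns_are_permutations` are hypothetical names added for illustration.

```python
import numpy as np


def sortbased_rand8(N, rng=None):
    # rng is an optional argument (an assumption, not in the original answer),
    # added only so that results can be made reproducible.
    rng = np.random.default_rng() if rng is None else rng
    # argsort of each column is a permutation of 0..7; adding 1 shifts it to 1..8.
    return np.argsort(rng.random((8, N)), axis=0) + 1


def columns_are_permutations(M):
    # Sorting every column must reproduce 1..8 exactly.
    expected = np.arange(1, 9)[:, None]
    return bool(np.all(np.sort(M, axis=0) == expected))


M = sortbased_rand8(10, np.random.default_rng(0))
print(M.shape)
print(columns_are_permutations(M))
```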

Runtime test:

In [124]: def sortbased_rand8(N):
     ...:     return np.argsort(np.random.rand(8,N),axis=0)+1
     ...: 
     ...: def rand_M(N):
     ...:     M = np.zeros(shape = (8, N))
     ...:     for i in range (0, N):
     ...:         M[:, i] = np.random.choice(8, size = 8, replace = False) + 1 
     ...:     return M
     ...: 

In [125]: N = 5000

In [126]: %timeit sortbased_rand8(N)
100 loops, best of 3: 1.95 ms per loop

In [127]: %timeit rand_M(N)
1 loops, best of 3: 233 ms per loop

That is roughly a 120x speedup!
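
For N around 1e7, memory may matter as much as speed: the argsort approach holds an 8xN float64 array and an 8xN int64 result, about 640 MB each. One possible alternative, sketched below, is `Generator.permuted`, which shuffles slices along an axis independently and is available in NumPy 1.20 and later. This is a swapped-in technique, not part of the original answer; the function name `permuted_rand8` and the choice of `uint8` are assumptions made here, and whether it beats the argsort version in speed is something to time on your own machine.

```python
import numpy as np


def permuted_rand8(N, rng=None):
    # rng is an optional argument (hypothetical) for reproducibility.
    rng = np.random.default_rng() if rng is None else rng
    # Start with every column equal to 1..8, stored as uint8 (1 byte per entry).
    base = np.tile(np.arange(1, 9, dtype=np.uint8)[:, None], (1, N))
    # Shuffle along axis 0: each column is shuffled independently of the others.
    return rng.permuted(base, axis=0)


M = permuted_rand8(10, np.random.default_rng(0))
print(M.shape, M.dtype)
```

With `uint8` storage the result for N = 1e7 takes about 80 MB. If floats are needed downstream, as in the original `rand_M`, the result can be converted with `M.astype(float)` at the cost of the larger footprint.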
