Fastest Way to generate 1,000,000+ random numbers in python


Problem Description

I am currently writing an app in Python that needs to generate a large amount of random numbers, FAST. Currently I have a scheme going that uses numpy to generate all of the numbers in a giant batch (about ~500,000 at a time). While this seems to be faster than Python's built-in implementation, I still need it to go faster. Any ideas? I'm open to writing it in C and embedding it in the program, or doing whatever it takes.

Constraints on the random numbers:

  • A set of 7 numbers that can all have different bounds:
    • e.g. [0-X1, 0-X2, 0-X3, 0-X4, 0-X5, 0-X6, 0-X7]
    • Currently I am generating a list of 7 numbers with random values from [0-1), then multiplying by [X1..X7]
  • A set of 13 numbers that sum to one:
    • Currently I am just generating 13 numbers and then dividing by their sum

Any ideas? Would pre-calculating these numbers and storing them in a file make this faster?
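The batch scheme described above can be sketched with NumPy broadcasting (the bounds array and batch size below are hypothetical stand-ins for the app's real X1..X7 values):

```python
import numpy as np

# Hypothetical bounds X1..X7 (placeholders; the real values come from the app).
bounds = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])

# Seven numbers per row, each in [0, X_i): draw in [0, 1), then scale per column.
samples = np.random.random((500_000, 7)) * bounds

# Thirteen numbers per row that sum to one: draw, then divide by the row sums.
raw = np.random.random((500_000, 13))
normalized = raw / raw.sum(axis=1, keepdims=True)
```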

Thanks!

Answer

You can speed things up a bit from what mtrw posted above just by doing what you initially described (generating a bunch of random numbers and multiplying and dividing accordingly)...

Also, you probably already know this, but be sure to do the operations in-place (*=, /=, +=, etc.) when working with large-ish numpy arrays. It makes a huge difference in memory usage with large arrays, and it will give a considerable speed increase, too.
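A minimal illustration of the difference (out-of-place `x * 2` allocates a second million-element array for the result, while in-place `x *= 2` reuses the existing buffer):

```python
import numpy as np

x = np.random.random(1_000_000)

y = x * 2   # out-of-place: allocates a new ~8 MB array for the result
x *= 2      # in-place: overwrites x's own buffer, no new allocation
```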

In [52]: import numpy as np

In [53]: def rand_row_doubles(row_limits, num):
   ....:     ncols = len(row_limits)
   ....:     x = np.random.random((num, ncols))
   ....:     x *= row_limits
   ....:     return x
   ....:

In [59]: %timeit rand_row_doubles(np.arange(7) + 1, 1000000)
10 loops, best of 3: 187 ms per loop

Compared to:

In [66]: %timeit ManyRandDoubles(np.arange(7) + 1, 1000000)
1 loops, best of 3: 222 ms per loop

It's not a huge difference, but if you're really worried about speed, it's something.

Just to show that it's correct:

In [68]: x.max(0)
Out[68]:
array([ 0.99999991,  1.99999971,  2.99999737,  3.99999569,  4.99999836,
        5.99999114,  6.99999738])

In [69]: x.min(0)
Out[69]:
array([  4.02099599e-07,   4.41729377e-07,   4.33480302e-08,
         7.43497138e-06,   1.28446819e-05,   4.27614385e-07,
         1.34106753e-05])

Likewise, for your "rows sum to one" part...

In [70]: def rand_rows_sum_to_one(nrows, ncols):
   ....:     x = np.random.random((ncols, nrows))
   ....:     y = x.sum(axis=0)
   ....:     x /= y
   ....:     return x.T
   ....:

In [71]: %timeit rand_rows_sum_to_one(1000000, 13)
1 loops, best of 3: 455 ms per loop

In [72]: x = rand_rows_sum_to_one(1000000, 13)

In [73]: x.sum(axis=1)
Out[73]: array([ 1.,  1.,  1., ...,  1.,  1.,  1.])

Honestly, even if you re-implement things in C, I'm not sure you'll be able to beat numpy by much on this one... I could be very wrong, though!
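As an aside beyond the original answer: newer NumPy versions (1.17+) ship `np.random.Generator`, whose default PCG64 bit generator is generally faster than the legacy `RandomState` used above, and whose `random` method accepts an `out=` argument to fill a preallocated buffer and skip the allocation on each batch. A sketch, assuming a recent NumPy:

```python
import numpy as np

rng = np.random.default_rng()   # PCG64-backed Generator (NumPy >= 1.17)
buf = np.empty((1_000_000, 7))  # preallocate once, reuse across batches
rng.random(out=buf)             # fill the existing buffer in place
buf *= np.arange(1, 8)          # scale column i to [0, i+1)
```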
