Canonical way to generate random numbers in Cython


Question

What is the best way to generate pseudo uniform random numbers (a double in [0, 1)) that is:

  1. Cross platform (ideally with the same sample sequence)
  2. Thread safe (explicit passing of the mutated state of the prng or using a thread-local state internally)
  3. Without GIL lock
  4. Easily wrappable in Cython

There was a similar post over 3 years ago about this but a lot of the answers don't meet all criteria. For example, drand48 is POSIX-specific.

The only method I'm aware of that seems (though I'm not sure) to meet some of the criteria is:

from libc.stdlib cimport rand, RAND_MAX

random = rand() / (RAND_MAX + 1.0)

Note @ogrisel asked the same question about 3 years ago.

Edit

Calling rand is not thread safe. Thanks for pointing that out @DavidW.

Answer

I think the easiest way to do this is to use the C++11 standard library, which provides nicely encapsulated random number generators and ways to use them. This is of course not the only option, and you could wrap pretty much any suitable C/C++ library (one good option might be to use whatever library numpy uses, since that's most likely already installed).

My general advice is to only wrap the bits you need and not bother with the full hierarchy and all the optional template parameters. By way of example I've shown one of the default generators, fed into a uniform float distribution.

# distutils: language = c++
# distutils: extra_compile_args = -std=c++11

cdef extern from "<random>" namespace "std":
    cdef cppclass mt19937:
        mt19937() # we need to define this constructor to stack allocate classes in Cython
        mt19937(unsigned int seed) # not worrying about matching the exact int type for seed

    cdef cppclass uniform_real_distribution[T]:
        uniform_real_distribution()
        uniform_real_distribution(T a, T b)
        T operator()(mt19937 gen) # ignore the possibility of using other classes for "gen"

def test():
    cdef:
        mt19937 gen = mt19937(5)
        uniform_real_distribution[double] dist = uniform_real_distribution[double](0.0,1.0)
    return dist(gen)

(The -std=c++11 at the start is for GCC. For other compilers you may need to tweak this. Increasingly, C++11 or later is the default anyway, so you can often drop it.)
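
For reference, the Cython `test` function above boils down to ordinary `<random>` calls. Below is a minimal C++ sketch of the equivalent logic. The helper names `draw_uniform` and `same_seed_same_draw` are hypothetical and only there for illustration; they are not part of the original answer.

```cpp
#include <random>

// Hypothetical helper: draw one double from a caller-owned generator.
// The generator is taken by reference so its state advances between calls.
double draw_uniform(std::mt19937 &gen) {
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    return dist(gen);
}

// Two generators seeded identically produce the same first draw
// (within one standard library implementation).
bool same_seed_same_draw(unsigned int seed) {
    std::mt19937 a(seed);
    std::mt19937 b(seed);
    return draw_uniform(a) == draw_uniform(b);
}
```

Passing the generator explicitly like this matches the "explicit passing of the mutated state" option from the question.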

With reference to your criteria:

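  1. Cross platform on anything that supports C++. The output of the mt19937 engine for a given seed is specified by the C++ standard, so the raw sequence is repeatable. Note that the standard does not specify the algorithm used by distributions such as uniform_real_distribution, so the resulting doubles can differ between standard library implementations.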
  2. Thread safe, since the state is stored entirely within the mt19937 object (each thread should have its own mt19937).
  3. No GIL - it's C++, with no Python parts
  4. Reasonably easy.
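
The first two criteria can be checked directly in C++. The sketch below relies on one fact from the C++ standard (the 10000th output of a default-constructed `mt19937` must be 4123659995) and on the engine keeping all of its state inside the object. The function names are hypothetical, chosen only for this illustration.

```cpp
#include <random>

// The C++ standard fixes mt19937's output: the 10000th value from a
// default-constructed engine must be 4123659995 on every conforming platform.
bool engine_matches_standard() {
    std::mt19937 gen;      // default seed
    gen.discard(9999);     // skip the first 9999 outputs
    return gen() == 4123659995u;
}

// All state lives inside the engine object: interleaving draws from an
// unrelated engine does not change what a given engine produces.
bool engines_are_independent(unsigned int seed_a, unsigned int seed_b, int n) {
    std::mt19937 a1(seed_a);
    std::mt19937 a2(seed_a);
    std::mt19937 b(seed_b);
    for (int i = 0; i < n; ++i) {
        auto alone = a1();        // a1 is drawn from on its own
        b();                      // advance an unrelated engine in between
        auto interleaved = a2();  // a2's draws are interleaved with b's
        if (alone != interleaved) {
            return false;
        }
    }
    return true;
}
```

Because there is no hidden global state, giving each thread its own engine (with its own seed) avoids any need for locking. A single engine shared between threads would still need external synchronisation.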


Edit: about using discrete_distribution.

This is a bit harder because it is less obvious how to wrap the constructors for discrete_distribution (they involve iterators). I think the easiest thing to do is to go via a C++ vector, since support for that is built into Cython and it is readily convertible to/from a Python list.

# use Cython's built in wrapping of std::vector
from libcpp.vector cimport vector

cdef extern from "<random>" namespace "std":
    # mt19937 as before

    cdef cppclass discrete_distribution[T]:
        discrete_distribution()
        # The following constructor is really a more generic template class
        # but tell Cython it only accepts vector iterators
        discrete_distribution(vector.iterator first, vector.iterator last)
        T operator()(mt19937 gen)

# an example function
def test2():
    cdef:
        mt19937 gen = mt19937(5)
        vector[double] values = [1,3,3,1] # autoconvert vector from Python list
        discrete_distribution[int] dd = discrete_distribution[int](values.begin(),values.end())
    return dd(gen)

Obviously that's a bit more involved than the uniform distribution, but it's not impossibly complicated (and the nasty bits could be hidden inside a Cython function).
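
To see what the wrapped `discrete_distribution` does with the weights, here is a small C++ sketch. The weights `1, 3, 3, 1` are normalised by the distribution, so indices 0 to 3 should come up with probabilities 1/8, 3/8, 3/8 and 1/8. The helper `sample_counts` is a hypothetical name used only for this illustration.

```cpp
#include <random>
#include <vector>

// Hypothetical helper: sample n indices using the given weights and count
// how often each index is drawn.
std::vector<int> sample_counts(const std::vector<double> &weights,
                               int n, unsigned int seed) {
    std::mt19937 gen(seed);
    // Same iterator-pair constructor that the Cython declaration wraps.
    std::discrete_distribution<int> dd(weights.begin(), weights.end());
    std::vector<int> counts(weights.size(), 0);
    for (int i = 0; i < n; ++i) {
        ++counts[dd(gen)];  // dd returns an index in [0, weights.size())
    }
    return counts;
}
```

With a large `n`, the counts for the two middle indices should come out roughly three times those of the outer ones.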
