为什么梅森捻线机比线性同余发生器更快? [英] Why is Mersenne twister faster than the linear congruential generator?

查看:190
本文介绍了为什么梅森捻线机比线性同余发生器更快?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用gcc C ++标准库的Mersenne twister实现进行了测试.它的性能优于线性同余生成器和C rand,后者很可能是LCG. 增强文档似乎也提供了类似的内容结果,但更青睐Mersenne捻线机.谁能解释一下?

I tested with the gcc C++ standard library's Mersenne twister implementation. It outperforms both the linear congruential generator and the C rand, which is most likely an LCG. A boost documentation also seems to give a similar result, but favoring Mersenne twister even more. Can anyone explain this?

#include <cstdlib>
#include <iostream>
#include <chrono>
#include <random>

class Timer
{
private:
  std::chrono::high_resolution_clock::time_point start_time;
  std::chrono::high_resolution_clock::time_point stop_time;

public:
  void start()
  {
    start_time = std::chrono::high_resolution_clock::now();
  }

  void stop()
  {
    stop_time = std::chrono::high_resolution_clock::now();
  }

  double measure()
  {
    using namespace std::chrono;
    return duration_cast<microseconds>
    (stop_time - start_time).count() / 1000000.0;
  }
};

template<typename T>
class Random
{
private:
  T generator;

public:
  Random()
  : generator
  (std::chrono::high_resolution_clock::now().time_since_epoch().count())
  {
  }

  int generate_integer(int begin, int end)
  {
    return std::uniform_int_distribution<int>(begin, end - 1)(generator);
  }
};

int main()
{
  constexpr int n = 300000000;
  Random<std::minstd_rand> mr;
  Random<std::mt19937> mt;
  Timer t;
  for (int j = 0; j < 3; ++j)
  {
    t.start();
    for (int i = 0; i < n; ++i)
    {
      static_cast<volatile void>(mr.generate_integer(0, 10));
    }
    t.stop();
    std::cout << "minstd " << t.measure() << std::endl;
    t.start();
    for (int i = 0; i < n; ++i)
    {
      static_cast<volatile void>(mt.generate_integer(0, 10));
    }
    t.stop();
    std::cout << "mersenne " << t.measure() << std::endl;
    t.start();
    for (int i = 0; i < n; ++i)
    {
      static_cast<volatile void>(std::rand() % 10);
    }
    t.stop();
    std::cout << "rand " << t.measure() << std::endl;
  }
}

结果

minstd 4.70876
mersenne 1.55853
rand 4.11873
minstd 4.53199
mersenne 1.55928
rand 4.15159
minstd 4.5374
mersenne 1.55667
rand 4.13715

推荐答案

Mersenne Twister算法并不像看起来那样复杂.或者,更确切地说,几乎所有复杂部分的执行频率都不足以严重影响长期平均速度.

The Mersenne Twister algorithm isn't as complicated as it looks. Or, more precisely, nearly all of the complicated part isn't executed often enough to seriously impact the long-term average speed.

如果您查看Wikipedia上的伪代码实现,则绝大多数呼叫只会执行extract_number()函数的后半部分;其余的非初始化代码(大多数在twist()函数中)仅在625(最常见的版本)中的一个调用中运行.每次运行的部分都非常简单,只需少量的移位和其他按位运算,就可以预期在大多数处理器上它们会非常快. extract_number()开头的测试几乎总是错误的,因此可以通过分支预测轻松地对其进行优化.

If you look at the pseudocode implementation on Wikipedia, the vast majority of calls only exercise the second half of the extract_number() function; the rest of the non-initialization code (mostly in the twist() function) only runs in one call in 625 (in the most common version). The part that runs every time is very simple, just a handful of shifts and other bitwise operations, which can be expected to be very fast on most processors. The test at the beginning of extract_number() is nearly always false and so can be easily optimized by branch prediction.

将其与线性同余算法进行对比,在该算法中,每次调用都会进行整数乘(昂贵)和模除(非常),除非您使用2模的幂作弊,这会影响您的随机数的质量). LC和MT算法涉及的算术是如此不同,以至于如果它们的相对性能在一个系统之间变化,我并不感到惊讶,但是我毫不怀疑地相信,至少在某些情况下,MT会更快.

Contrast this with the linear congruential algorithm, in which every call does an integer multiply (expensive) and modulo division (very expensive, unless you cheat by using a power of 2 modulus, which impacts the quality of your random numbers). The arithmetic involved in the LC and MT algorithms is so different that I'm not surprised if their relative performance varies from one system to another, but I have no trouble believing that MT is faster in at least some cases.

(如果仔细看一下MT算法,乍一看似乎在twist()中每次迭代都要进行几次模运算,但是形式很容易优化.)

(If you look closely at the MT algorithm, it appears at first glance to be doing several modulo operations per iteration in twist(), but those are in forms that are easy to optimize away.)

至于普通的rand(),其实现方式差异很大,因此不应期望它们在系统之间保持一致.许多实现使用16位算法,并且自然会比32或64位算法快.

As for plain old rand(), implementations of this vary widely and shouldn't be expected to be consistent across systems. Many implementations use 16-bit arithmetic and would naturally be faster than 32 or 64-bit algorithms.

这篇关于为什么梅森捻线机比线性同余发生器更快?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆