g++ c++11 constexpr evaluation performance


Question

g++ (4.7.2) and similar versions seem to evaluate constexpr functions surprisingly fast during compile time. On my machine it is in fact much faster than the compiled program at runtime.

Is there a reasonable explanation for that behavior? Are there optimization techniques involved which are only applicable at compile-time, that can be executed quicker than actual compiled code? If so, which?

Here's my test program and the observed results.

#include <iostream>

// McCarthy 91 function: returns n - 10 for n > 100, otherwise 91
constexpr int mc91(int n)
{
    return (n > 100) ? n - 10 : mc91(mc91(n + 11));
}

// Useless helper with Fibonacci-like complexity, returning values roughly between 1 and 100
constexpr double foo(double n)
{
    return (n > 2) ? (0.9999) * ((unsigned int)(foo(n - 1) + foo(n - 2)) % 100) : 1;
}

// Ackermann function
constexpr unsigned ack(unsigned m, unsigned n)
{
    return m == 0
        ? n + 1
        : n == 0
        ? ack(m - 1, 1)
        : ack(m - 1, ack(m, n - 1));
}

constexpr unsigned slow91(int n) {
    return mc91(mc91(foo(n)) % 100);
}

int main(void)
{
    constexpr unsigned int compiletime_ack = ack(3, 14);
    constexpr int compiletime_91 = slow91(49);
    static_assert(compiletime_ack == 131069, "Must be evaluated at compile-time");
    static_assert(compiletime_91  == 91,     "Must be evaluated at compile-time");
    std::cout << compiletime_ack << std::endl;
    std::cout << compiletime_91  << std::endl;
    std::cout << ack(3, 14) << std::endl;
    std::cout << slow91(49) << std::endl;
    return 0;
}

compiletime:

time g++ constexpr.cpp -std=c++11 -fconstexpr-depth=10000000 -O3 

real    0m0.645s
user    0m0.600s
sys     0m0.032s

runtime:

time ./a.out 

131069
91
131069
91

real    0m43.708s
user    0m43.567s
sys     0m0.008s

Here mc91 is the usual McCarthy 91 function (as can be found on Wikipedia), which returns 91 for every n <= 100 and n - 10 otherwise, and foo is just a useless function returning real values between about 1 and 100, with Fibonacci-like runtime complexity.

Both the slow calculation of 91 and the Ackermann function get evaluated with the same arguments by the compiler and by the compiled program.

Surprisingly, just generating the code and running it through the compiler is faster than executing the compiled code itself.

Solution

At compile time, redundant (identical) constexpr calls can be memoized, while the run-time recursion does not get this benefit.
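
To see the same effect in isolation, here is a minimal sketch (not part of the original post) using a naive Fibonacci function: the recursion produces exponentially many identical calls, so a compiler that caches constexpr results, as described above, answers most of them from its cache, while the run-time evaluation actually performs every call unless the optimizer happens to fold it away.

#include <iostream>

// Naive Fibonacci: exponentially many identical sub-calls.
constexpr unsigned long long fib(unsigned n)
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

int main()
{
    // Forced compile-time evaluation; fast if the compiler memoizes identical calls.
    constexpr unsigned long long ct = fib(40);
    static_assert(ct == 102334155ULL, "Must be evaluated at compile-time");
    std::cout << ct << std::endl;

    // Run-time call: every recursive invocation is actually executed
    // (assuming the optimizer does not constant-fold it, as with ack(3,14) above).
    std::cout << fib(40) << std::endl;
    return 0;
}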

If you change every recursive function such as...

constexpr unsigned slow91(int n) {
   return mc91(mc91(foo(n))%100);
}

... to a form that isn't constexpr, but does remember past calculations at runtime:

#include <unordered_map>
#include <boost/optional.hpp>

// key: the function's parameter, value: the memoized result (empty until computed)
std::unordered_map< int, boost::optional<unsigned> > results4;

unsigned slow91(int n) {
     boost::optional<unsigned> &ret = results4[n];
     if ( !ret )
     {
         ret = mc91(mc91(foo(n)) % 100);
     }
     return *ret;
}

You will get less surprising results.

compiletime:

time g++ test.cpp -std=c++11 -O3

real    0m1.708s
user    0m1.496s
sys     0m0.176s

runtime:

time ./a.out

131069
91
131069
91

real    0m0.097s
user    0m0.064s
sys     0m0.032s
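
The answer above only shows slow91 rewritten. As a hypothetical illustration (not from the original answer; the names ack_cache and ack_memo are made up), the same treatment can be applied to ack by memoizing on the (m, n) argument pair:

#include <map>
#include <utility>

// Cache keyed on the (m, n) argument pair.
std::map< std::pair<unsigned, unsigned>, unsigned > ack_cache;

unsigned ack_memo(unsigned m, unsigned n)
{
    auto it = ack_cache.find(std::make_pair(m, n));
    if (it != ack_cache.end())
        return it->second;

    unsigned result = m == 0
        ? n + 1
        : n == 0
        ? ack_memo(m - 1, 1)
        : ack_memo(m - 1, ack_memo(m, n - 1));

    ack_cache[std::make_pair(m, n)] = result;
    return result;
}

std::map is used here instead of std::unordered_map only because std::pair has no standard hash; an unordered_map with a custom hash for the key pair would work just as well.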
