Optimisation of division in gcc


Question

Here's some code (full program follows later in the question):

template <typename T>
T fizzbuzz(T n) {
    T count(0);
    #if CONST
        const T div(3);
    #else
        T div(3);
    #endif
    for (T i(0); i <= n; ++i) {
        if (i % div == T(0)) count += i;
    }
    return count;
}

Now, if I call this template function with int, then I get a factor of 6 performance difference according to whether I define CONST or not:

$ gcc --version
gcc (GCC) 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)

$ make -B wrappedint CPPFLAGS="-O3 -Wall -Werror -DWRAP=0 -DCONST=0" &&
 time ./wrappedint
g++  -O3 -Wall -Werror -DWRAP=0 -DCONST=0   wrappedint.cpp   -o wrappedint
484573652

real    0m2.543s
user    0m2.059s
sys     0m0.046s

$ make -B wrappedint CPPFLAGS="-O3 -Wall -Werror -DWRAP=0 -DCONST=1" &&
 time ./wrappedint
g++  -O3 -Wall -Werror -DWRAP=0 -DCONST=1   wrappedint.cpp   -o wrappedint
484573652

real    0m0.655s
user    0m0.327s
sys     0m0.046s

Examining the disassembly shows that in the fast (const) case, the modulo has been turned into a multiply-and-shift sequence, whereas in the slow (non-const) case it's using idivl.
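For readers who have not seen the multiply-and-shift form, here is a rough sketch of the idea for a signed 32-bit divisor of 3. The helper names (div3_by_multiply, mod3_by_multiply, self_check) are made up for illustration, and the exact instructions gcc emits may differ. The sketch also assumes that right-shifting a negative 64-bit value is an arithmetic shift, which gcc does and which C++20 guarantees.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: truncating n / 3 using a multiply and a shift
// instead of a divide instruction.
inline std::int32_t div3_by_multiply(std::int32_t n) {
    // Magic constant (2^32 + 2) / 3 = 0x55555556.
    const std::int64_t magic = 0x55555556LL;
    // Take the high 32 bits of the 64-bit product.
    // Assumption: >> on a negative value is an arithmetic shift.
    std::int32_t q = static_cast<std::int32_t>((n * magic) >> 32);
    // For negative n the step above rounds down; add the sign bit (0 or 1)
    // so the quotient rounds toward zero, as n / 3 does.
    q += static_cast<std::int32_t>(static_cast<std::uint32_t>(n) >> 31);
    return q;
}

// Hypothetical helper: n % 3 rebuilt from the quotient.
inline std::int32_t mod3_by_multiply(std::int32_t n) {
    return n - div3_by_multiply(n) * 3;
}

// Compares the sketch against the built-in operator over a small range.
inline void self_check() {
    for (std::int32_t n = -1000; n <= 1000; ++n) {
        assert(mod3_by_multiply(n) == n % 3);
    }
}
```

The point is only that once the divisor is known at compile time, the compiler can pick the constant and replace idivl with cheaper instructions. With a runtime divisor it has no constant to work from.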

Even worse, if I try to wrap my integer in a class, then the optimisation doesn't happen whether I use const or not. The code always uses idivl and runs slowly:

#include <iostream>

struct WrappedInt {
    int v;
    explicit WrappedInt(const int &val) : v(val) {}
    bool operator<=(const WrappedInt &rhs) const { return v <= rhs.v; }
    bool operator==(const WrappedInt &rhs) const { return v == rhs.v; }
    WrappedInt &operator++() { ++v; return *this; }
    WrappedInt &operator+=(const WrappedInt &rhs) { v += rhs.v; return *this; }
    WrappedInt operator%(const WrappedInt &rhs) const 
        { return WrappedInt(v%rhs.v); }
};

std::ostream &operator<<(std::ostream &s, WrappedInt w) {
    return s << w.v;
}

template <typename T>
T fizzbuzz(T n) {
    T count(0);
    #if CONST
        const T div(3);
    #else
        T div(3);
    #endif
    for (T i(0); i <= n; ++i) {
        if (i % div == T(0)) count += i;
    }
    return count;
}

int main() {
    #if WRAP
        WrappedInt w(123456789);
        std::cout << fizzbuzz(w) << "\n";
    #else
        std::cout << fizzbuzz<int>(123456789) << "\n";
    #endif
}

My questions are:

1) Is there a simple principle of C++ itself, or gcc's optimisation, which explains why this happens, or is it just a case of "various heuristics run, this is the code you get"?

2) Is there any way to make the compiler realise that my locally-declared and never-referenced const WrappedInt can be treated as a compile-time const value? I want this thing to be a straight replacement for int in templates.

3) Is there a known way of wrapping an int such that the compiler can discard the wrapping when optimising? The goal is that WrappedInt will be a policy-based template. But if a "do-nothing" policy results in essentially arbitrary 6x speed penalties over int, I'm better off special-casing that situation and using int directly.

Answer

I'm guessing it's just the severely old GCC version you are running. The oldest compiler I have on my machine, gcc-4.1.2, performs the fast way with both the non-const and the wrap versions (and does so at only -O1).
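If upgrading the compiler is not possible, one thing that might be worth trying is to pass the divisor as a template argument, so that the right-hand side of % is a literal constant rather than a local object the optimiser has to see through. This is an untested assumption about old compilers, not something measured on gcc 3.4.4. The names fizzbuzz_fixed and fizzbuzz_fixed_check are hypothetical, and the sketch returns long long (unlike the original, which returns T) to keep the sum from overflowing.

```cpp
#include <cassert>

// Hypothetical variant of fizzbuzz: the divisor is a template argument,
// so `i % Div` always has a compile-time constant on the right.
template <int Div>
long long fizzbuzz_fixed(int n) {
    long long count = 0;
    for (int i = 0; i <= n; ++i) {
        if (i % Div == 0) count += i;  // sums the multiples of Div in [0, n]
    }
    return count;
}

// Small sanity check: the multiples of 3 up to 10 are 0, 3, 6 and 9.
inline void fizzbuzz_fixed_check() {
    assert(fizzbuzz_fixed<3>(10) == 18);
}
```

Whether this actually avoids idivl on a given compiler still has to be confirmed by looking at the disassembly, as in the question.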
