The program runs 3 times slower when compiled with g++ 5.3.1 than the same program compiled with g++ 4.8.4 with the same command

Problem Description

Recently, I've started using Ubuntu 16.04 with g++ 5.3.1 and noticed that my program runs 3 times slower. Before that I used Ubuntu 14.04 with g++ 4.8.4. I built it with the same flags: CFLAGS = -std=c++11 -Wall -O3.

My program contains loops filled with math calls (sin, cos, exp). You can find it here.

I've tried compiling with different optimization flags (O0, O1, O2, O3, Ofast), but the problem reproduces in all cases (with Ofast both variants run faster, but the first still runs 3 times slower).

In my program I use libtinyxml-dev and libgslcblas. They have the same versions in both cases and, according to the code and callgrind profiling, don't account for any significant part of the runtime.

I've performed profiling, but it doesn't give me any idea about why this happens. Kcachegrind comparison (left is slower). I've only noticed that the program now uses libm-2.23, compared to libm-2.19 on Ubuntu 14.04.
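
(As a quick check, assuming the binary is called ./program: ldd shows which libm a dynamically linked executable resolves to.)

ldd ./program | grep libm

On Ubuntu 16.04 this typically points at /lib/x86_64-linux-gnu/libm.so.6, which is a symlink to libm-2.23.so.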

My processor is an i7-5820 (Haswell).

I have no idea why it became slower. Do you have any ideas?

P.S. Below you can find the most time-consuming function:

void InclinedSum::prepare3D()
{
    double buf1, buf2;
    double sum_prev1 = 0.0, sum_prev2 = 0.0;
    int break_idx1, break_idx2;
    int arr_idx;

    for(int seg_idx = 0; seg_idx < props->K; seg_idx++)
    {
        const Point& r = well->segs[seg_idx].r_bhp;

        for(int k = 0; k < props->K; k++)
        {
            arr_idx = seg_idx * props->K + k;
            F[arr_idx] = 0.0;

            break_idx2 = 0;

            // Double series in m and l; each term requires several cos/tan calls
            for(int m = 1; m <= props->M; m++)
            {
                break_idx1 = 0;

                for(int l = 1; l <= props->L; l++)
                {
                    buf1 = ((cos(M_PI * (double)(m) * well->segs[k].r1.x / props->sizes.x - M_PI * (double)(l) * well->segs[k].r1.z / props->sizes.z) -
                                cos(M_PI * (double)(m) * well->segs[k].r2.x / props->sizes.x - M_PI * (double)(l) * well->segs[k].r2.z / props->sizes.z)) /
                            ( M_PI * (double)(m) * tan(props->alpha) / props->sizes.x + M_PI * (double)(l) / props->sizes.z ) +
                                (cos(M_PI * (double)(m) * well->segs[k].r1.x / props->sizes.x + M_PI * (double)(l) * well->segs[k].r1.z / props->sizes.z) -
                                cos(M_PI * (double)(m) * well->segs[k].r2.x / props->sizes.x + M_PI * (double)(l) * well->segs[k].r2.z / props->sizes.z)) /
                            ( M_PI * (double)(m) * tan(props->alpha) / props->sizes.x - M_PI * (double)(l) / props->sizes.z )
                                ) / 2.0;

                    buf2 = sqrt((double)(m) * (double)(m) / props->sizes.x / props->sizes.x + (double)(l) * (double)(l) / props->sizes.z / props->sizes.z);

                    // Sum over mirror images along y; exp, sin and cos dominate the cost
                    for(int i = -props->I; i <= props->I; i++)
                    {
                        F[arr_idx] += buf1 / well->segs[k].length / buf2 *
                            ( exp(-M_PI * buf2 * fabs(r.y - props->r1.y + 2.0 * (double)(i) * props->sizes.y)) -
                            exp(-M_PI * buf2 * fabs(r.y + props->r1.y + 2.0 * (double)(i) * props->sizes.y)) ) *
                            sin(M_PI * (double)(m) * r.x / props->sizes.x) *
                            cos(M_PI * (double)(l) * r.z / props->sizes.z);
                    }

                    // Truncate the l-series once two consecutive partial sums agree within tolerance
                    if( fabs(F[arr_idx] - sum_prev1) > F[arr_idx] * EQUALITY_TOLERANCE )
                    {
                        sum_prev1 = F[arr_idx];
                        break_idx1 = 0;
                    } else
                        break_idx1++;

                    if(break_idx1 > 1)
                    {
                        //std::cout << "l=" << l << std::endl;
                        break;
                    }
                }

                // Same early-exit criterion for the m-series
                if( fabs(F[arr_idx] - sum_prev2) > F[arr_idx] * EQUALITY_TOLERANCE )
                {
                    sum_prev2 = F[arr_idx];
                    break_idx2 = 0;
                } else
                    break_idx2++;

                if(break_idx2 > 1)
                {
                    std::cout << "m=" << m << std::endl;
                    break;
                }
            }
        }
    }
}

Further investigation. I wrote the following simple program:

#include <cmath>
#include <iostream>
#include <chrono>

#define CYCLE_NUM 1E+7

using namespace std;
using namespace std::chrono;

int main()
{
    double sum = 0.0;

    auto t1 = high_resolution_clock::now();
    // Start at i = 1 to avoid dividing by zero; sin() dominates the loop body
    for(int i = 1; i < CYCLE_NUM; i++)
    {
        sum += sin((double)(i)) / (double)(i);
    }
    auto t2 = high_resolution_clock::now();

    microseconds::rep t = duration_cast<microseconds>(t2-t1).count();

    cout << "sum = " << sum << endl;
    cout << "time = " << (double)(t) / 1.E+6 << endl;   // elapsed time in seconds

    return 0;
}

I am really wondering why this simple sample program runs 2.5 times faster under g++ 4.8.4 with libc-2.19 (libm-2.19) than under g++ 5.3.1 with libc-2.23 (libm-2.23).

The compile command is:

g++ -std=c++11 -O3 main.cpp -o sum

Using other optimization flags doesn't change the ratio.

How can I tell whether it is gcc or libc that slows down the program?
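
(One way to separate the two, as a sketch: assuming both compiler packages, e.g. g++-4.8 and g++-5 on Ubuntu, are installed side by side, both binaries then run against the same libc.)

g++-4.8 -std=c++11 -O3 main.cpp -o sum48
g++-5 -std=c++11 -O3 main.cpp -o sum53
time ./sum48
time ./sum53

If both binaries are similarly slow on the newer system, the shared libc/libm is implicated rather than the compiler.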

Answer

This is a bug in glibc that affects version 2.23 (used in Ubuntu 16.04) and early versions of 2.24 (Fedora and Debian, for example, already ship patched versions that are no longer affected; Ubuntu 16.10 and 17.04 do not yet include the fix).
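
(The installed glibc version can be checked with, for example:)

ldd --version | head -n1

On an affected Ubuntu 16.04 system this prints something like "ldd (Ubuntu GLIBC 2.23-0ubuntu3) 2.23".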

The slowdown stems from the SSE to AVX register transition penalty. Roughly speaking, on affected versions the dynamic linker's lazy-binding trampoline (_dl_runtime_resolve_avx) saves and restores the argument registers using 256-bit AVX instructions; this leaves the upper halves of the vector registers in a dirty state, so the SSE code inside libm then pays transition penalties. See the glibc bug report here: https://sourceware.org/bugzilla/show_bug.cgi?id=20495

Oleg Strikov wrote up a quite extensive analysis in his Ubuntu bug report: https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1663280

Without the patch, there are various possible workarounds: you can compile your program statically (i.e. add -static), or you can disable lazy binding by setting the environment variable LD_BIND_NOW during the program's execution. Again, there are more details in the above bug reports.
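
(For example, using the test program from the question; both variants bypass the lazy-binding code path that triggers the penalty:)

g++ -std=c++11 -O3 -static main.cpp -o sum
LD_BIND_NOW=1 ./sum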
