Does redeclaring variables in C++ cost anything?

Question

For readability, I think the first code block below is better. But is the second code block faster?

First block:

for (int i = 0; i < 5000; i++){
    int number = rand() % 10000 + 1;
    string fizzBuzz = GetStringFromFizzBuzzLogic(number);
}

Second block:

int number;
string fizzBuzz;
for (int i = 0; i < 5000; i++){
    number = rand() % 10000 + 1;
    fizzBuzz = GetStringFromFizzBuzzLogic(number);
}

Does redeclaring variables in C++ cost anything?

Recommended answer

I benchmarked this particular code, and even WITHOUT optimisation, both variants came to almost the same runtime. As soon as the lowest level of optimisation is turned on, the results are very close to identical (give or take a bit of noise in the time measurement).

Edit: The analysis of the generated assembler code below shows that it's hard to guess which form is faster. The answer most people would probably give is func2, but it turns out this function is a tiny bit slower, at least when compiling with clang++ and -O2. That is good evidence that "write code, benchmark, change code, benchmark" is the correct way to deal with performance, rather than guessing based on reading the code. And remember what someone once told me: optimising is a bit like taking an onion apart in layers. Once you optimise one part, you end up looking at something very similar, just a little smaller... ;)

However, my initial analysis made func1 significantly slower. That turns out to be because the compiler, for some bizarre reason, doesn't optimise the rand() % 10000 + 1 in func1 but does in func2 when optimisation is turned off, which leaves func1 with the slower modulo in that build. Once optimisation is enabled, though, both functions get a "fast" modulo.
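To make the "fast" modulo concrete, here is a minimal sketch of what the optimised code computes, written back in C++. The constants 1759218605 (0x68DB8BAD), 44 and 10000 are taken from the assembly listing further down in this answer; the function name fast_mod_10000 is made up for illustration. It assumes that right-shifting a negative signed value is an arithmetic shift, which is guaranteed from C++20 and is what mainstream compilers do in earlier standards.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the original answer): computes x % 10000
// with a multiply and shifts instead of a division instruction, mirroring the
// imulq / shrq / sarq / addl sequence in the compiler output.
std::int32_t fast_mod_10000(std::int32_t x)
{
    // 0x68DB8BAD is 2^44 / 10000 rounded up.
    const std::int64_t magic = 1759218605;
    const std::int64_t product = static_cast<std::int64_t>(x) * magic;

    // 1 if the product is negative, 0 otherwise (the shrq $63 step).
    const std::int64_t sign =
        static_cast<std::int64_t>(static_cast<std::uint64_t>(product) >> 63);

    // ASSUMPTION: >> on a negative signed value is an arithmetic shift.
    // Adding the sign bit turns the floored result into a truncated
    // quotient, matching how C++ integer division rounds towards zero.
    const std::int64_t quotient = (product >> 44) + sign;

    // remainder = x - quotient * 10000 (the imull / negl / leal steps).
    return static_cast<std::int32_t>(x - quotient * 10000);
}
```

Calling fast_mod_10000(n) should give the same value as n % 10000 for any 32-bit int, so rand() % 10000 + 1 corresponds to fast_mod_10000(rand()) + 1. In practice there is no reason to write this by hand, since the optimiser already does it.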

Using the Linux performance tool perf shows that with clang++ and -O2 we get the following for func1:

  15.76%  a.out    libc-2.20.so         free
  12.31%  a.out    libstdc++.so.6.0.20  std::string::_S_construct<char cons
  12.29%  a.out    libc-2.20.so         _int_malloc
  10.05%  a.out    a.out                func1
   7.26%  a.out    libc-2.20.so         __random
   6.36%  a.out    libc-2.20.so         malloc
   5.46%  a.out    libc-2.20.so         __random_r
   5.01%  a.out    libstdc++.so.6.0.20  std::basic_string<char, std::char_t
   4.83%  a.out    libstdc++.so.6.0.20  std::string::_Rep::_S_create
   4.01%  a.out    libc-2.20.so         strlen

And for func2:

  17.88%  a.out    libc-2.20.so         free
  10.73%  a.out    libc-2.20.so         _int_malloc                    
   9.77%  a.out    libc-2.20.so         malloc
   9.03%  a.out    a.out                func2                        
   7.63%  a.out    libstdc++.so.6.0.20  std::string::_S_construct<char con
   6.96%  a.out    libstdc++.so.6.0.20  std::string::_Rep::_S_create
   4.48%  a.out    libc-2.20.so         __random  
   4.39%  a.out    libc-2.20.so         __random_r
   4.10%  a.out    libc-2.20.so         strlen 

There are some subtle differences, but I would put those down to the relatively short runtime of the benchmark rather than to differences in the actual code generated by the compiler.

This is with the following code:

#include <iostream>
#include <string>
#include <cstdlib>

#define N 500000

extern std::string GetStringFromFizzBuzzLogic(int number);

void func1()
{
    for (int i = 0; i < N; i++){
        int number = rand() % 10000 + 1;
        std::string fizzBuzz = GetStringFromFizzBuzzLogic(number);
    }
}

void func2()
{
    int number;
    std::string fizzBuzz;
    for (int i = 0; i < N; i++){
        number = rand() % 10000 + 1;
        fizzBuzz = GetStringFromFizzBuzzLogic(number);
    }
}

static __inline__ unsigned long long rdtsc(void)
{
    unsigned hi, lo;
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
}

int main(int argc, char **argv)
{

    void (*f)();

    if (argc == 1)
        f = func1;
    else
        f = func2;

    for(int i = 0; i < 5; i++)
    {
        unsigned long long t1 = rdtsc();

        f();
        t1 = rdtsc() - t1;

        std::cout << "time=" << t1 << std::endl;
    }
}

And in a separate file:

#include <string>

std::string GetStringFromFizzBuzzLogic(int number)
{
    return "SomeString";
}

Running with func1:

./a.out
time=876016390
time=824149942
time=826812600
time=825266315
time=826151399

Running with func2 (selected by passing any command-line argument):

./a.out
time=905721532
time=895393507
time=886537634
time=879836476
time=883887384

This is with another 0 added to N, so a 10 times longer runtime. func2 seems fairly consistently a little SLOWER, but only by a few percent, which is probably within the noise. In wall-clock time, the whole benchmark takes around 1.30-1.39 seconds.
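One way to judge whether a gap of a few percent is noise is to compare it with the spread between repeated runs of the same function. The following is a minimal sketch of that idea, assuming std::chrono::steady_clock as a portable stand-in for the rdtsc helper above; the names time_runs and relative_spread are made up for illustration and are not part of the benchmark that produced the numbers in this answer.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs `work` `repeats` times and returns the duration
// of each run in nanoseconds, measured with a monotonic clock.
template <typename Work>
std::vector<long long> time_runs(Work work, std::size_t repeats)
{
    assert(repeats > 0);
    std::vector<long long> samples;
    samples.reserve(repeats);
    for (std::size_t i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
                .count());
    }
    return samples;
}

// Hypothetical helper: (max - min) / min over a set of samples, as a rough
// measure of run-to-run noise. Returns 0 if the minimum is not positive.
double relative_spread(const std::vector<long long>& samples)
{
    assert(!samples.empty());
    const auto mm = std::minmax_element(samples.begin(), samples.end());
    const long long lowest = *mm.first;
    const long long highest = *mm.second;
    return lowest > 0
               ? static_cast<double>(highest - lowest) /
                     static_cast<double>(lowest)
               : 0.0;
}
```

Applied to the five func1 samples listed above, relative_spread comes out at roughly 6%, while the best func2 run is roughly 7% slower than the best func1 run. The two figures are of similar size, which fits the reading that the difference is close to the noise level.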

Edit: Looking at the assembly code of the actual loop [this is only a portion of the loop, but the rest is identical in terms of what the code actually does]

Func1:

.LBB0_1:                                # %for.body
    callq   rand
    movslq  %eax, %rcx
    imulq   $1759218605, %rcx, %rcx # imm = 0x68DB8BAD
    movq    %rcx, %rdx
    shrq    $63, %rdx
    sarq    $44, %rcx
    addl    %edx, %ecx
    imull   $10000, %ecx, %ecx      # imm = 0x2710
    negl    %ecx
    leal    1(%rax,%rcx), %esi
    movq    %r15, %rdi
    callq   _Z26GetStringFromFizzBuzzLogici
    movq    (%rsp), %rax
    leaq    -24(%rax), %rdi
    cmpq    %rbx, %rdi
    jne .LBB0_2
.LBB0_7:                                # %_ZNSsD2Ev.exit
    decl    %ebp
    jne .LBB0_1

Func2:

.LBB1_1:
    callq   rand
    movslq  %eax, %rcx
    imulq   $1759218605, %rcx, %rcx # imm = 0x68DB8BAD
    movq    %rcx, %rdx
    shrq    $63, %rdx
    sarq    $44, %rcx
    addl    %edx, %ecx
    imull   $10000, %ecx, %ecx      # imm = 0x2710
    negl    %ecx
    leal    1(%rax,%rcx), %esi
    movq    %rbx, %rdi
    callq   _Z26GetStringFromFizzBuzzLogici
    movq    %r14, %rdi
    movq    %rbx, %rsi
    callq   _ZNSs4swapERSs
    movq    (%rsp), %rax
    leaq    -24(%rax), %rdi
    cmpq    %r12, %rdi
    jne .LBB1_4
.LBB1_9:                                # %_ZNSsD2Ev.exit19
    incl    %ebp
    cmpl    $5000000, %ebp          # imm = 0x4C4B40

So, as can be seen, the func2 version contains an extra function call:

    callq   _ZNSs4swapERSs

which translates to std::basic_string<char, std::char_traits<char>, std::allocator<char> >::swap(std::basic_string<char, std::char_traits<char>, std::allocator<char> >&), or std::string::swap(std::string&). This is presumably the result of calling the move assignment operator std::string::operator=(std::string&&) on the temporary returned by GetStringFromFizzBuzzLogic. This would explain why func2 is slightly slower than func1.
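The extra work in the func2 pattern can be made visible with an instrumented type. The sketch below is an illustration only: Tracker, Counts, make_tracker, declared_inside and declared_outside are hypothetical names, and Tracker stands in for std::string. It counts constructions, move assignments and destructions for a variable declared inside the loop versus one declared outside it and assigned in each iteration.

```cpp
#include <cstddef>

// Hypothetical counters for the special member functions of Tracker.
struct Counts {
    std::size_t constructed = 0;
    std::size_t move_assigned = 0;
    std::size_t destroyed = 0;
};

// Hypothetical stand-in for std::string that records what happens to it.
struct Tracker {
    static Counts counts;
    int value = 0;

    Tracker() { ++counts.constructed; }
    explicit Tracker(int v) : value(v) { ++counts.constructed; }
    Tracker(const Tracker& o) : value(o.value) { ++counts.constructed; }
    Tracker(Tracker&& o) noexcept : value(o.value) { ++counts.constructed; }
    Tracker& operator=(const Tracker& o) { value = o.value; return *this; }
    Tracker& operator=(Tracker&& o) noexcept
    {
        value = o.value;
        ++counts.move_assigned;
        return *this;
    }
    ~Tracker() { ++counts.destroyed; }
};
Counts Tracker::counts;

// Plays the role of GetStringFromFizzBuzzLogic: returns a fresh object.
Tracker make_tracker(int n) { return Tracker(n); }

// func1 pattern: the variable is constructed from the result each iteration.
Counts declared_inside(int n)
{
    Tracker::counts = Counts{};
    for (int i = 0; i < n; ++i) {
        Tracker t = make_tracker(i);
        (void)t;  // silence unused-variable warnings
    }
    return Tracker::counts;
}

// func2 pattern: one variable outlives the loop; each iteration creates a
// temporary, move-assigns it into the variable, then destroys the temporary.
Counts declared_outside(int n)
{
    Tracker::counts = Counts{};
    {
        Tracker t;
        for (int i = 0; i < n; ++i) {
            t = make_tracker(i);
        }
    }
    return Tracker::counts;
}
```

With copy elision in effect, declared_inside(5000) should report 5000 constructions and no move assignments, while declared_outside(5000) should report one additional default construction plus 5000 move assignments. A temporary is still created and destroyed on every iteration in both versions, so hoisting the declaration saves nothing here and adds the assignment on top.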

I'm sure it is possible to find cases where constructing and destroying an object in a loop takes a significant amount of time, but in general it will make little or no difference, and having clearer code will actually help the reader. It will also often help the compiler with "life-time analysis", since there is less code to "walk" to find out whether the variable is used later (in this case the code is short anyway, but that's obviously not always the case in real-life examples).
