Efficiency of postincrement vs. preincrement in C++

Question

I have generally assumed that preincrement is more efficient than postincrement in C++. But when I recently read the book Game Engine Architecture (2nd ed.), I found a section saying that postincrement is preferred over preincrement in for loops. The reason, and I quote, is that "preincrement introduces a data dependency into your code -- the CPU must wait for the increment operation to be completed before its value can be used in the expression." Is this true? (It really overturns what I thought I knew about this problem.)

Here is the quote from the section in case you are interested:


5.3.2.1 Preincrement versus Postincrement

Notice in the above example that we are using C++’s postincrement operator, p++, rather than the preincrement operator, ++p. This is a subtle but sometimes important optimization. The preincrement operator increments the contents of the variable before its (now modified) value is used in the expression. The postincrement operator increments the contents of the variable after it has been used. This means that writing ++p introduces a data dependency into your code -- the CPU must wait for the increment operation to be completed before its value can be used in the expression. On a deeply pipelined CPU, this introduces a stall. On the other hand, with p++ there is no data dependency. The value of the variable can be used immediately, and the increment operation can happen later or in parallel with its use. Either way, no stall is introduced into the pipeline.

Of course, within the "update" expression of a for loop (for(init_expr; test_expr; update_expr) { ... }), there should be no difference between pre- and postincrement. This is because any good compiler will recognize that the value of the variable isn’t used in update_expr. But in cases where the value is used, postincrement is superior because it doesn’t introduce a stall in the CPU’s pipeline. Therefore, it’s good to get in the habit of always using postincrement, unless you absolutely need the semantics of preincrement.

Edit: adding the example referenced above.

void processArray(int container[], int numElements)
{
    int* pBegin = &container[0];
    int* pEnd = &container[numElements];
    for (int* p = pBegin; p != pEnd; p++)
    {
        int element = *p;
        // process element...
    }
}

void processList(std::list<int>& container)
{
    std::list<int>::iterator pBegin = container.begin();
    std::list<int>::iterator pEnd = container.end();
    std::list<int>::iterator p;
    for (p = pBegin; p != pEnd; p++)
    {
        int element = *p;
        // process element...
    }
}


Recommended answer



"preincrement introduces a data dependency into your code -- the CPU must wait for the increment operation to be completed before its value can be used in the expression."

Is this true?


It is true - although perhaps overly strict. Pre increment doesn't necessarily introduce a data dependency - but it can.

a = b++ * 2;

Here, the increment can be executed in parallel with the multiplication. The operands of both the increment and the multiplication are immediately available and do not depend on the result of either operation.

Another example:

a = ++b * 2;

Here, the multiplication must be executed after the increment, because one of the operands of the multiplication depends on the result of the increment.

Of course, these statements do slightly different things, so the compiler might not always be able to transform the program from one form to the other while keeping the semantics the same - which is why using the post increment might make a slight difference in performance.
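
To make the difference in semantics concrete, here is a minimal sketch of the two statements wrapped in functions. The names scale_post, scale_pre and demo are hypothetical and are not part of the original answer; the sketch only shows which value of b the multiplication reads, not how any particular CPU schedules the instructions.

```cpp
#include <cassert>

// Post-increment: the multiplication reads the OLD value of b,
// so it does not depend on the result of the increment.
int scale_post(int& b) {
    return b++ * 2;
}

// Pre-increment: the multiplication reads the NEW value of b,
// so it can only run after the increment has produced its result.
int scale_pre(int& b) {
    return ++b * 2;
}

// Runs both variants from the same starting value.
void demo() {
    int b1 = 5;
    int b2 = 5;
    int a1 = scale_post(b1);  // uses 5, then b1 becomes 6
    int a2 = scale_pre(b2);   // b2 becomes 6, then uses 6
    assert(a1 == 10 && b1 == 6);
    assert(a2 == 12 && b2 == 6);
}
```

Both functions leave b incremented by one, but they return different results, which is why a compiler cannot simply swap one form for the other.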

A practical example, using a loop:

for(int i= 0; arr[i++];)
    count++;

for(int i=-1; arr[++i];) // more typically: for(int i=0; arr[i]; ++i)
    count++;

One might think that the latter is necessarily faster, reasoning that "post-increment makes a copy", which would be very true for non-fundamental types. However, due to the data dependency (and because int is a fundamental type with no overloaded increment operators), the former can theoretically be more efficient. Whether it actually is depends on the CPU architecture and the ability of the optimizer.

For what it's worth - in a trivial program, on x86 arch, using g++ compiler with optimization enabled, the above loops had identical assembly output, so they are perfectly equivalent in that case.
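
For anyone who wants to repeat that comparison, the loops can be placed in small functions and the generated assembly inspected (for example with the compiler's -S option). This is only a sketch: the function names are hypothetical, and it assumes arr points to a zero-terminated array, as the loop conditions above imply.

```cpp
#include <cassert>

// Counts the elements before the first zero, post-increment in the condition.
int count_post(const int* arr) {
    int count = 0;
    for (int i = 0; arr[i++];)
        count++;
    return count;
}

// Same loop, pre-increment in the condition (note the start index of -1).
int count_pre(const int* arr) {
    int count = 0;
    for (int i = -1; arr[++i];)
        count++;
    return count;
}

// The more typical form, with the increment in the update expression.
int count_typical(const int* arr) {
    int count = 0;
    for (int i = 0; arr[i]; ++i)
        count++;
    return count;
}

// Checks that all three forms agree on a small zero-terminated array.
void check_counts() {
    const int data[] = {3, 1, 4, 0};
    assert(count_post(data) == 3);
    assert(count_pre(data) == 3);
    assert(count_typical(data) == 3);
}
```

All three functions return the same count; whether they also compile to the same instructions depends on the compiler, the optimization level and the target.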

Rule of thumb:

If the counter is a fundamental type and the result of increment is not used, then it makes no difference whether you use post/pre increment.

If the counter is not a fundamental type and the result of the increment is not used and optimizations are disabled, then pre increment may be more efficient. With optimizations enabled, there is usually no difference, provided the compiler can see and inline the increment operators.

If the counter is a fundamental type and the result of increment is used, then post increment can theoretically be marginally more efficient - in some cpu architecture - in some context - using some compiler.

If the counter is a complex type and the result of the increment is used, then pre increment is typically faster than post increment. Also see R Sahu's answer regarding this case.
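
The cases involving non-fundamental types come down to the copy that a conventional post-increment operator has to make. The sketch below uses a hypothetical Counter type (not from the original answer) that records how often its copy constructor runs. Because that bookkeeping is an observable side effect, the optimizer cannot discard the copy here; for a simple iterator whose copy has no side effects, an optimizing compiler can usually remove an unused copy.

```cpp
#include <cassert>

// Hypothetical counter type that records how many copies are made.
struct Counter {
    int value = 0;
    static int copies;  // number of copy-constructor calls so far

    Counter() = default;
    Counter(const Counter& other) : value(other.value) { ++copies; }

    // Pre-increment: modifies in place and returns a reference, no copy.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Post-increment: must save the old state first, which costs a copy.
    Counter operator++(int) {
        Counter old(*this);
        ++value;
        return old;
    }
};

int Counter::copies = 0;

// Number of copies made by a single pre-increment.
int copies_made_by_pre() {
    Counter::copies = 0;
    Counter c;
    ++c;
    return Counter::copies;
}

// Number of copies made by a single post-increment (at least one;
// a second copy is possible if the return value is not elided).
int copies_made_by_post() {
    Counter::copies = 0;
    Counter c;
    c++;
    return Counter::copies;
}
```

In this sketch copies_made_by_pre() returns 0, while copies_made_by_post() returns at least 1, even though the result of the increment is thrown away in both cases.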
