In practice, why would different compilers compute different values of int x = ++i + ++i;?


Question

Consider this code:

int i = 1;
int x = ++i + ++i;

We have some guesses for what a compiler might do for this code, assuming it compiles.

  1. both ++i return 2, resulting in x=4.
  2. one ++i returns 2 and the other returns 3, resulting in x=5.
  3. both ++i return 3, resulting in x=6.

To me, the second seems most likely. One of the two ++ operators is executed with i = 1, the i is incremented, and the result 2 is returned. Then the second ++ operator is executed with i = 2, the i is incremented, and the result 3 is returned. Then 2 and 3 are added together to give 5.

However, I ran this code in Visual Studio, and the result was 6. I'm trying to understand compilers better, and I'm wondering what could possibly lead to a result of 6. My only guess is that the code could be executed with some "built-in" concurrency. The two ++ operators were called, each incremented i before the other returned, and then they both returned 3. This would contradict my understanding of the call stack, and would need to be explained away.

What (reasonable) things could a C++ compiler do that would lead to a result of 4 or a result of 6?

Note

This example appeared as an example of undefined behavior in Bjarne Stroustrup's Programming: Principles and Practice using C++ (C++ 14).
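
For contrast, here is a minimal sketch of the same intent written with defined behavior, by giving each increment its own statement. The names `Result` and `sum_of_two_increments` are made up for this illustration and do not come from the book or the question; the sketch assumes C++14 or later so that the function can be `constexpr`.

```cpp
// Hypothetical helper, not from the original post: each increment is its own
// full expression, so the two modifications of i are sequenced one after the
// other and the result no longer depends on the compiler.
struct Result {
    int i;  // final value of i
    int x;  // sum of the two incremented values
};

constexpr Result sum_of_two_increments(int start) {
    int i = start;
    int a = ++i;  // i becomes start + 1, a holds that value
    int b = ++i;  // i becomes start + 2, b holds that value
    return Result{i, a + b};
}

// Starting from i = 1 this is always 2 + 3 (the asker's option 2).
static_assert(sum_of_two_increments(1).x == 5, "2 + 3");
static_assert(sum_of_two_increments(1).i == 3, "i incremented twice");
```

Written this way, options 1 and 3 from the question cannot occur, because nothing is left unsequenced for the compiler to rearrange.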

See cinnamon's comment.

Solution

The compiler takes your code, splits it into very simple instructions, and then recombines and arranges them in a way that it thinks optimal.

The code

int i = 1;
int x = ++i + ++i;

consists of the following instructions:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
5. read i as tmp2
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x
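
As a rough model of the list above, the eleven steps can be written out as ordinary, well-defined C++ with explicit temporaries. This is only a sketch: `ListedState` and `run_in_listed_order` are hypothetical names, and the code assumes C++14 or later. Executed strictly in the listed order, the steps give 5.

```cpp
// Hypothetical model of the eleven steps, one statement per step.
struct ListedState {
    int i;
    int x;
};

constexpr ListedState run_in_listed_order() {
    int i = 0, x = 0;
    int tmp1 = 0, tmp2 = 0, tmp3 = 0, tmp4 = 0, tmp5 = 0;

    i = 1;               // 1.  store 1 in i
    tmp1 = i;            // 2.  read i as tmp1
    tmp1 += 1;           // 3.  add 1 to tmp1
    i = tmp1;            // 4.  store tmp1 in i
    tmp2 = i;            // 5.  read i as tmp2
    tmp3 = i;            // 6.  read i as tmp3
    tmp3 += 1;           // 7.  add 1 to tmp3
    i = tmp3;            // 8.  store tmp3 in i
    tmp4 = i;            // 9.  read i as tmp4
    tmp5 = tmp2 + tmp4;  // 10. add tmp2 and tmp4, as tmp5
    x = tmp5;            // 11. store tmp5 in x

    return ListedState{i, x};
}

// In this order tmp2 is 2 and tmp4 is 3.
static_assert(run_in_listed_order().x == 5, "2 + 3");
static_assert(run_in_listed_order().i == 3, "i incremented twice");
```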

But despite this being a numbered list the way I wrote it, there are only a few ordering dependencies here: 1->2->3->4->5->10->11 and 1->6->7->8->9->10->11 must stay in their relative order. Other than that the compiler can freely reorder, and perhaps eliminate redundancy.
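
One way to make the two dependency chains concrete is a small checker that takes a sequence of step numbers and verifies that each chain's steps still appear in their relative order. This is an illustrative sketch, not something a compiler literally runs; all names in it are invented here. It assumes C++14 or later and skips steps that are missing from the sequence, on the assumption that they were eliminated.

```cpp
#include <cstddef>

// Index of `step` in `order`, or N if the step is not present.
template <std::size_t N>
constexpr std::size_t position_of(const int (&order)[N], int step) {
    for (std::size_t k = 0; k < N; ++k) {
        if (order[k] == step) return k;
    }
    return N;
}

// True if the steps of `chain` that are present in `order` appear in the
// same relative order as in `chain`.
template <std::size_t N, std::size_t M>
constexpr bool keeps_chain(const int (&order)[N], const int (&chain)[M]) {
    std::size_t previous = 0;
    bool have_previous = false;
    for (std::size_t k = 0; k < M; ++k) {
        const std::size_t pos = position_of(order, chain[k]);
        if (pos == N) continue;  // step was eliminated, nothing to check
        if (have_previous && pos <= previous) return false;
        previous = pos;
        have_previous = true;
    }
    return true;
}

// The two chains named in the text.
constexpr int chain_a[] = {1, 2, 3, 4, 5, 10, 11};
constexpr int chain_b[] = {1, 6, 7, 8, 9, 10, 11};

template <std::size_t N>
constexpr bool is_allowed_order(const int (&order)[N]) {
    return keeps_chain(order, chain_a) && keeps_chain(order, chain_b);
}

constexpr int listed_order[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
constexpr int interleaved_order[] = {1, 2, 6, 3, 7, 4, 8, 5, 9, 10, 11};
constexpr int swapped_order[] = {1, 3, 2, 4, 5, 6, 7, 8, 9, 10, 11};

static_assert(is_allowed_order(listed_order), "original order");
static_assert(is_allowed_order(interleaved_order), "both chains intact");
static_assert(!is_allowed_order(swapped_order), "3 before 2 breaks a chain");
```

The checker only looks at ordering. Whether a given step may be dropped or merged is a separate question that it does not try to answer.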

For example, you could order the list like this:

1. store 1 in i
2. read i as tmp1
6. read i as tmp3
3. add 1 to tmp1
7. add 1 to tmp3
4. store tmp1 in i
8. store tmp3 in i
5. read i as tmp2
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

Why can the compiler do this? Because there's no sequencing to the side effects of the increment. But now the compiler can simplify: for example, there's a dead store in 4: the value is immediately overwritten. Also, tmp2 and tmp4 are really the same thing.

1. store 1 in i
2. read i as tmp1
6. read i as tmp3
3. add 1 to tmp1
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

And now everything to do with tmp1 is dead code: it's never used. And the re-read of i can be eliminated too:

1. store 1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
10. add tmp3 and tmp3, as tmp5
11. store tmp5 in x

Look, this code is much shorter. The optimizer is happy. The programmer is not, because i was only incremented once. Oops.
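
To check the arithmetic of this first path, here is a sketch that writes out both the interleaved ordering and the final six-step version as well-defined C++. The names are hypothetical and the code assumes C++14 or later. It models the reasoning in this answer, not the output of any particular compiler.

```cpp
// Hypothetical model of the first optimization path.
struct PathOneState {
    int i;
    int x;
};

// The interleaved ordering: 1, 2, 6, 3, 7, 4, 8, 5, 9, 10, 11.
constexpr PathOneState path_one_reordered() {
    int i = 0, x = 0;
    int tmp1 = 0, tmp2 = 0, tmp3 = 0, tmp4 = 0, tmp5 = 0;
    i = 1;               // 1
    tmp1 = i;            // 2
    tmp3 = i;            // 6  (reads the same old value as step 2)
    tmp1 += 1;           // 3
    tmp3 += 1;           // 7
    i = tmp1;            // 4  (dead store, overwritten by step 8)
    i = tmp3;            // 8
    tmp2 = i;            // 5
    tmp4 = i;            // 9
    tmp5 = tmp2 + tmp4;  // 10
    x = tmp5;            // 11
    return PathOneState{i, x};
}

// The simplified version: 1, 6, 7, 8, 10, 11.
constexpr PathOneState path_one_simplified() {
    int i = 0, x = 0;
    int tmp3 = 0, tmp5 = 0;
    i = 1;               // 1
    tmp3 = i;            // 6
    tmp3 += 1;           // 7
    i = tmp3;            // 8
    tmp5 = tmp3 + tmp3;  // 10
    x = tmp5;            // 11
    return PathOneState{i, x};
}

// Both versions agree: x is 2 + 2 and i ends at 2.
static_assert(path_one_reordered().x == 4, "2 + 2");
static_assert(path_one_simplified().x == 4, "2 + 2");
static_assert(path_one_reordered().i == 2, "only one increment survives");
static_assert(path_one_simplified().i == 2, "only one increment survives");
```

The simplification steps themselves change nothing. The result moves from 5 to 4 at the reordering, which is only permitted because the two increments are unsequenced.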

Let's look at something else the compiler can do instead: let's go back to the original version.

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
5. read i as tmp2
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

The compiler could reorder it like this:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

and then notice again that i is read twice, so eliminate one of them:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

That's nice, but it can go further: it can reuse tmp1:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp1
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Then it can eliminate the re-read of i in 6:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Now 4 is a dead store:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

and now 3 and 7 can be merged into one instruction:

1. store 1 in i
2. read i as tmp1
3+7. add 2 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Eliminate the last temporary:

1. store 1 in i
2. read i as tmp1
3+7. add 2 to tmp1
8. store tmp1 in i
10. add tmp1 and tmp1, as tmp5
11. store tmp5 in x

And now you get the result that Visual C++ is giving you.
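
The second path can be checked the same way. The sketch below writes out the reordered list and the final six-step list as well-defined C++, using hypothetical names and assuming C++14 or later. It illustrates the reasoning above and is not a claim about the exact instructions Visual C++ generates.

```cpp
// Hypothetical model of the second optimization path.
struct PathTwoState {
    int i;
    int x;
};

// The reordered list: 1, 2, 3, 4, 6, 7, 8, 5, 9, 10, 11.
constexpr PathTwoState path_two_reordered() {
    int i = 0, x = 0;
    int tmp1 = 0, tmp2 = 0, tmp3 = 0, tmp4 = 0, tmp5 = 0;
    i = 1;               // 1
    tmp1 = i;            // 2
    tmp1 += 1;           // 3
    i = tmp1;            // 4
    tmp3 = i;            // 6
    tmp3 += 1;           // 7
    i = tmp3;            // 8
    tmp2 = i;            // 5  (now reads i after both increments)
    tmp4 = i;            // 9
    tmp5 = tmp2 + tmp4;  // 10
    x = tmp5;            // 11
    return PathTwoState{i, x};
}

// The final list: 1, 2, 3+7, 8, 10, 11.
constexpr PathTwoState path_two_final() {
    int i = 0, x = 0;
    int tmp1 = 0, tmp5 = 0;
    i = 1;               // 1
    tmp1 = i;            // 2
    tmp1 += 2;           // 3+7
    i = tmp1;            // 8
    tmp5 = tmp1 + tmp1;  // 10
    x = tmp5;            // 11
    return PathTwoState{i, x};
}

// Both versions agree: x is 3 + 3 and i ends at 3.
static_assert(path_two_reordered().x == 6, "3 + 3");
static_assert(path_two_final().x == 6, "3 + 3");
static_assert(path_two_reordered().i == 3, "both increments applied");
static_assert(path_two_final().i == 3, "both increments applied");
```

Here both increments happen, but both operands of the addition are read only after the second one, which is how 6 comes out without any concurrency.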

Note that both optimization paths preserved the important ordering dependencies, except where an instruction was removed because it did nothing.
