In C++11, does `i += ++i + 1` exhibit undefined behavior?


Question


This question came up while I was reading (the answers to) So why is i = ++i + 1 well-defined in C++11?

I gather that the subtle explanation is that (1) the expression ++i returns an lvalue but + takes prvalues as operands, so a conversion from lvalue to prvalue must be performed; this involves obtaining the current value of that lvalue (rather than one more than the old value of i) and must therefore be sequenced after the side effect from the increment (i.e., updating i); (2) the LHS of the assignment is also an lvalue, so its value evaluation does not involve fetching the current value of i; while this value computation is unsequenced w.r.t. the value computation of the RHS, this poses no problem; (3) the value computation of the assignment itself involves updating i (again), but is sequenced after the value computation of its RHS, and hence after the previous update to i; no problem.
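To make the ordering in points (1) to (3) concrete, here is a minimal sketch that spells out i = ++i + 1 one full-expression at a time. The function name and the local names are hypothetical, and the decomposition is only one ordering consistent with the argument above (step (2) is unsequenced relative to the RHS in the real expression; placing it first changes nothing, because it reads no value).

```cpp
// Hypothetical step-by-step rendering of `i = ++i + 1` under the C++11 reasoning above.
int simple_assignment_steps() {
    int i = 0;
    int& lhs = i;               // (2) LHS value computation: identity of i only, no read
    int& incremented = ++i;     // side effect of ++: i becomes 1; the result is an lvalue
    int rhs = incremented + 1;  // (1) lvalue-to-rvalue conversion reads i after the increment
    lhs = rhs;                  // (3) the store happens after the RHS value is known
    return i;                   // i is now 2
}
```

Starting from i == 0, the function returns 2, which is the value the well-defined expression leaves in i.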

Fine, so there is no UB there. Now my question is what if one changed the assignment operator from = to += (or a similar operator).

Does the evaluation of the expression i += ++i + 1 lead to undefined behavior?

As I see it, the standard seems to contradict itself here. Since the LHS of += is still an lvalue (and its RHS still a prvalue), the same reasoning as above applies as far as (1) and (2) are concerned; there is no undefined behavior in the evaluation of the operands of +=. As for (3), the operation of the compound assignment += (more precisely the side effect of that operation; its value computation, if needed, is in any case sequenced after its side effect) now must both fetch the current value of i, and then (obviously sequenced after it, even if the standard does not say so explicitly, or otherwise the evaluation of such operators would always invoke undefined behavior) add the RHS and store the result back into i. Both these operations would have given undefined behavior if they were unsequenced w.r.t. the side effect of the ++, but as argued above (the side effect of the ++ is sequenced before the value computation of + giving the RHS of the += operator, which value computation is sequenced before the operation of that compound assignment), that is not the case.

But on the other hand the standard also says that E += F is equivalent to E = E + F, except that (the lvalue) E is evaluated only once. Now in our example the value computation of i (which is what E is here) as lvalue does not involve anything that needs to be sequenced w.r.t. other actions, so doing it once or twice makes no difference; our expression should be strictly equivalent to E = E + F. But here's the problem; it is pretty obvious that evaluating i = i + (++i + 1) would give undefined behaviour! What gives? Or is this a defect of the standard?

Added. I have slightly modified my discussion above, to do more justice to the proper distinction between side effects and value computations, and using "evaluation" (as does the standard) of an expression to encompass both. I think my main question is not just whether behavior is defined or not in this example, but how one must read the standard in order to decide this. Notably, should one take the equivalence of E op= F to E = E op F as the ultimate authority for the semantics of the compound assignment operation (in which case the example clearly has UB), or merely as an indication of what mathematical operation is involved in determining the value to be assigned (namely the one identified by op, with the lvalue-to-rvalue converted LHS of the compound assignment operator as left operand and its RHS as right operand)? The latter option makes it much harder to argue for UB in this example, as I have tried to explain. I admit that it is tempting to make the equivalence authoritative (so that compound assignments become a kind of second-class primitives, whose meaning is given by rewriting in terms of first-class primitives; thus the language definition would be simplified), but there are rather strong arguments against this:

  • The equivalence is not absolute, because of the "E is evaluated only once" exception. Note that this exception is essential to avoid turning any use in which the evaluation of E involves a side effect into undefined behavior, for instance in the fairly common a[i++] += b; usage. In fact I think no absolutely equivalent rewriting to eliminate compound assignments is possible; using a fictive ||| operator to designate unsequenced evaluations, one might try to define E op= F; (with int operands for simplicity) as equivalent to { int& L=E ||| int R=F; L = L + R; }, but then the example no longer has UB. In any case the standard gives us no rewriting recipe.

  • The standard does not treat compound assignments as second-class primitives for which no separate definition of semantics is necessary. For instance in 5.17 (emphasis mine)

    The assignment operator (=) and the compound assignment operators all group right-to-left. [...] In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation.

  • If the intention were to let compound assignments be mere shorthands for simple assignments, there would be no reason to include them explicitly in this description. The final phrase even directly contradicts what would be the case if the equivalence was taken to be authoritative.
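The role of the "E is evaluated only once" exception in the a[i++] += b case can be sketched as follows. The helper name and the fixed array size are assumptions made for illustration. The sketch sequences the evaluation of E before the use of b, so it is one possible ordering and not the unsequenced ||| rewriting mentioned in the first bullet.

```cpp
#include <cstddef>

// Hypothetical helper: performs `a[i++] += b` by binding the lvalue E exactly once.
int add_at_and_advance(int (&a)[3], std::size_t& i, int b) {
    int& element = a[i++];  // E is evaluated once, so i is incremented once
    element = element + b;  // fetch and store go through the same object
    return element;
}
```

Expanding literally to a[i++] = a[i++] + b would increment i twice and modify i in two unsequenced subexpressions, which is exactly what the exception clause rules out.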

If one admits that compound assignments have a semantics of their own, then the point arises that their evaluation involves (apart from the mathematical operation) more than just a side effect (the assignment) and a value evaluation (sequenced after the assignment), but also an unnamed operation of fetching the (previous) value of the LHS. This would normally be dealt with under the heading of "lvalue-to-rvalue conversion", but doing so here is hard to justify, since there is no operator present that takes the LHS as an rvalue operand (though there is one in the expanded "equivalent" form). It is precisely this unnamed operation whose potential unsequenced relation with the side effect of ++ would cause UB, but this unsequenced relation is nowhere explicitly stated in the standard, because the unnamed operation is not. It is hard to justify UB using an operation whose very existence is only implicit in the standard.

Answer

There is no clear case for Undefined Behavior here

Sure, an argument leading to UB can be given, as I indicated in the question, and it has been repeated in the answers given so far. However this involves a strict reading of 5.17:7 that is both self-contradictory and in contradiction with explicit statements in 5.17:1 about compound assignment. With a weaker reading of 5.17:7 the contradictions disappear, as does the argument for UB. Hence my conclusion is neither that there is UB here, nor that there is clearly defined behaviour, but that the text of the standard is inconsistent, and should be modified to make clear which reading prevails (and I suppose this means a defect report should be written). Of course one might invoke here the fall-back clause in the standard (the note in 1.3.24) that evaluations for which the standard fails to define the behavior [unambiguously and self-consistently] are Undefined Behavior, but that would make any use of compound assignments (including prefix increment/decrement operators) into UB, something that might appeal to certain implementors, but certainly not to programmers.

Instead of arguing for the given problem, let me present a slightly modified example that brings out the inconsistency more clearly. Assume one has defined

int& f (int& a) { return a; }

a function that does nothing and returns its (lvalue) argument. Now modify the example to

n += f(++n) + 1;

Note that while some extra conditions about sequencing of function calls are given in the standard, this would at first glance not seem to affect the example, since there are no side effects at all from the function call (not even locally inside the function), as the incrementation happens in the argument expression for f, whose evaluation is not subject to those extra conditions. Indeed, let us apply the Crucial Argument for Undefined Behavior (CAUB), namely 5.17:7 which says that the behavior of such a compound assignment is equivalent to that of (in this case)

n = n + f(++n) + 1;

except that n is evaluated only once (an exception that makes no difference here). The evaluation of the statement I just wrote clearly has UB (the value computation of the first (prvalue) n in the RHS is unsequenced w.r.t. the side effect of the ++ operation, which involves the same scalar object (1.9:15) and you're dead).

So the evaluation of n += f(++n) + 1 has undefined behavior, right? Wrong! Read in 5.17:1 that

With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation. [ Note: Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single compound assignment operator. — end note ]

This language is far from as precise as I would like it to be, but I don't think it is a stretch to assume that "indeterminately-sequenced" should mean "with respect to that operation of a compound assignment". The (non normative, I know) note makes it clear that the lvalue-to-rvalue conversion is part of the operation of the compound assignment. Now is the call of f indeterminately-sequenced with respect to the operation of the compound assignment of +=? I'm unsure, because the 'sequenced' relation is defined for individual value computations and side effects, not complete evaluations of operators, which may involve both. In fact the evaluation of a compound assignment operator involves three items: the lvalue-to-rvalue conversion of its left operand, the side effect (the assignment proper), and the value computation of the compound assignment (which is sequenced after the side effect, and returns the original left operand as lvalue). Note that the existence of the lvalue-to-rvalue conversion is never explicitly mentioned in the standard except in the note cited above; in particular, the standard makes no (other) statement at all regarding its sequencing relative to other evaluations. It is pretty clear that in the example the call of f is sequenced before the side effect and value computation of += (since the call occurs in the value computation of the right operand to +=), but it might be indeterminately-sequenced with respect to the lvalue-to-rvalue conversion part. I recall from my question that since the left operand of += is an lvalue (and necessarily so), one cannot construe the lvalue-to-rvalue conversion to have occurred as part of the value computation of the left operand.

However, by the principle of the excluded middle, the call to f must either be indeterminately-sequenced with respect to the operation of the compound assignment of +=, or not indeterminately-sequenced with respect to it; in the latter case it must be sequenced before it because it cannot possibly be sequenced after it (the call of f being sequenced before the side effect of +=, and the relation being anti-symmetric). So first assume it is indeterminately-sequenced with respect to the operation. Then the cited clause says that w.r.t. the call of f the evaluation of += is a single operation, and the note explains that it means the call should not intervene between the lvalue-to-rvalue conversion and the side effect associated with +=; it should either be sequenced before both, or after both. But being sequenced after the side effect is not possible, so it should be before both. This makes (by transitivity) the side effect of ++ sequenced before the lvalue-to-rvalue conversion, exit UB. Next assume the call of f is sequenced before the operation of +=. Then it is in particular sequenced before the lvalue-to-rvalue conversion, and again by transitivity so is the side effect of ++; no UB in this branch either.
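Both branches of this case analysis arrive at the same ordering: the side effect of ++, then the call of f, then the fetch belonging to +=, then the store. A minimal sketch of that ordering, written as separate statements so that nothing is left unsequenced, could look like this. The name single_evaluation_reading and the local names are hypothetical; f is the function defined earlier in this answer, repeated so the block stands alone.

```cpp
int& f(int& a) { return a; }

// Hypothetical rendering of `n += f(++n) + 1` under the "single evaluation" reading of 5.17:1.
int single_evaluation_reading(int n) {
    int& arg = ++n;        // argument expression: the side effect of ++ completes first
    int rhs = f(arg) + 1;  // call of f, then lvalue-to-rvalue conversion of its result
    int fetched = n;       // the fetch that is part of +=, sequenced after the call
    n = fetched + rhs;     // the side effect of +=
    return n;
}
```

With n == 0 on entry this returns 3 (the increment gives 1, the right operand is 2, and the fetch sees 1). This only illustrates the reading argued for here; it says nothing about what a given compiler does with the original expression.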

Conclusion: 5.17:1 contradicts 5.17:7 if the latter is taken (CAUB) to be normative for questions of UB resulting from unsequenced evaluations by 1.9:15. As I said, CAUB is self-contradictory as well (by arguments indicated in the question), but this answer is getting too long, so I'll leave it at this for now.

Three problems, and two proposals for resolving them

Trying to understand what the standard writes about these matters, I distinguish three aspects in which the text is hard to interpret; they all are of a nature that the text is insufficiently clear about what model its statements are referring to. (I cite the texts at the end of the numbered items, since I do not know the markup to resume a numbered item after a quote)

  1. The text of 5.17:7 is of an apparent simplicity that, although the intention is easy to grasp, gives us little hold when applied to difficult situations. It makes a sweeping claim (equivalent behavior, apparently in all aspects) whose application is thwarted by the exception clause. What if the behavior of E1 = E1 op E2 is undefined? Well then that of E1 op= E2 should be as well. But what if the UB was due to E1 being evaluated twice in E1 = E1 op E2? Then evaluating E1 op= E2 should presumably not be UB, but if so, then defined as what? This is like saying "the youth of the second twin was exactly like that of the first, except that he did not die at childbirth." Frankly, I think this text, which has evolved little since the C version "A compound assignment of the form E1 op= E2 differs from the simple assignment expression E1 = E1 op E2 only in that the lvalue E1 is evaluated only once.", might be adapted to match the changes in the standard.

    (5.17) 7 The behavior of an expression of the form E1 op= E2 is equivalent to E1 = E1 op E2 except that E1 is evaluated only once. [...]

  2. It is not so clear what precisely the actions (evaluations) are between which the 'sequenced' relation is defined. It is said (1.9:12) that evaluation of an expression includes value computations and initiation of side effects. Though this appears to say that an evaluation may have multiple (atomic) components, the sequenced relation is actually mostly defined (e.g. in 1.9:14,15) for individual components, so that it might be better to read this as saying that the notion of "evaluation" encompasses both value computations and (initiation of) side effects. However in some cases the 'sequenced' relation is defined for the (entire) execution of an expression or statement (1.9:15) or for a function call (5.17:1), even though a passage in 1.9:15 avoids the latter by referring directly to executions in the body of a called function.

    (1.9) 12 Evaluation of an expression (or a sub-expression) in general includes both value computations (...) and initiation of side effects. [...] 13 Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread [...] 14 Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated. [...] 15 When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [...] Every evaluation in the calling function (including other function calls) ... is indeterminately sequenced with respect to the execution of the called function [...] (5.2.6, 5.17) 1 ... With respect to an indeterminately-sequenced function call, ...

  3. The text should more clearly acknowledge that a compound assignment involves, in contrast to a simple assignment, the action of fetching the value previously assigned to its left operand; this action is like lvalue-to-rvalue conversion, but does not happen as part of the value computation of that left operand, since it is not a prvalue; indeed it is a problem that 1.9:12 only acknowledges such action for prvalue evaluation. In particular the text should be more clear about which 'sequenced' relations are given for that action, if any.

    (1.9) 12 Evaluation of an expression... includes... value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation)

The second point is the least directly related to our concrete question, and I think it can be solved simply by choosing a clear point of view and reformulating passages that seem to indicate a different point of view. Given that one of the main purposes of the old sequence points, and now the 'sequenced' relation, was to make clear that the side effect of postfix-increment operators is unsequenced w.r.t. actions sequenced after the value computation of that operator (thus giving e.g. i = i++ UB), the point of view must be that individual value computations and (initiation of) individual side effects are "evaluations" for which "sequenced before" may be defined. For pragmatic reasons I would also include two more kinds of (trivial) "evaluations": function entry (so that the language of 1.9:15 may be simplified to: "When calling a function..., every value computation and side effect associated with any of its argument expressions, or with the postfix expression designating the called function, is sequenced before entry of that function") and function exit (so that any action in the function body gets by transitivity sequenced before anything that requires the function value; this used to be guaranteed by a sequence point, but the C++11 standard seems to have lost such guarantee; this might make calling a function ending with return i++; potentially UB where this is not intended, and used to be safe). Then one can also be clear about the "indeterminately sequenced" relation of function calls: for every function call, and every evaluation that is not (directly or indirectly) part of evaluating that call, that evaluation shall be sequenced (either before or after) w.r.t. both entry and exit of that function call, and it shall have the same relation in both cases (so that in particular such external actions cannot be sequenced after function entry but before function exit, as is clearly desirable within a single thread).

Now to resolve points 1. and 3., I can see two paths (each affecting both points), which have different consequences for the defined or not behavior of our example:

Compound assignments with two operands, and three evaluations

Compound operations have their two usual operands, an lvalue left operand and a prvalue right operand. To settle the lack of clarity noted in 3., it is included in 1.9:12 that fetching the value previously assigned to an object also may occur in compound assignments (rather than only for prvalue evaluation). The semantics of compound assignments are defined by changing 5.17:7 to

In a compound assignment op=, the value previously assigned to the object referred to by the left operand is fetched, the operator op is applied with this value as left operand and the right operand of op= as right operand, and the resulting value replaces that of the object referred to by the left operand.

(That gives two evaluations, the fetch and the side effect; a third evaluation is the trivial value computation of the compound operator, sequenced after both other evaluations.)

For clarity, state clearly in 1.9:15 that value computations in operands are sequenced before all value computations associated with the operator (rather than just those for the result of the operator), which ensures that evaluating the lvalue left operand is sequenced before fetching its value (one can hardly imagine otherwise), and also sequences the value computation of the right operand before that fetch, thus excluding UB in our example. While at it, I see no reason not to also sequence value computations in operands before any side effects associated with the operator (as they clearly must); this would make mentioning this explicitly for (compound) assignments in 5.17:1 superfluous. On the other hand do mention there that the value fetching in a compound assignment is sequenced before its side effect.
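As an illustration only, the first proposal can be modelled by a function whose body performs the fetch after both operands have been evaluated. The names compound_assign_fetch_late and first_proposal_example are hypothetical, and a function call is merely a stand-in for the proposed operator semantics. The right operand is computed in its own statement so that the sketch itself contains nothing unsequenced.

```cpp
#include <functional>

// Hypothetical model of the two-operand proposal: operands first, then fetch, then store.
template <class T, class Op>
T& compound_assign_fetch_late(T& left, const T& right_value, Op op) {
    T fetched = left;                 // evaluation 1: fetch the previously assigned value
    left = op(fetched, right_value);  // evaluation 2: the side effect (the store)
    return left;                      // evaluation 3: trivial value computation, an lvalue
}

// Models `i += ++i + 1` with the right operand evaluated before the fetch.
int first_proposal_example() {
    int i = 0;
    int right_value = ++i + 1;  // i becomes 1, right operand is 2
    return compound_assign_fetch_late(i, right_value, std::plus<int>());
}
```

Under this model the example has the defined result 3: the fetch sees the incremented value 1 and adds the right operand 2.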

Compound assignments with three operands, and two evaluations

In order to obtain that the fetch in a compound assignment will be unsequenced with respect to the value computation of the right operand, making our example UB, the clearest way seems to be to give compound operators an implicit third (middle) operand, a prvalue, not represented by a separate expression, but obtained by lvalue-to-rvalue conversion from the left operand (this three-operand nature corresponds to the expanded form of compound assignments, but by obtaining the middle operand from the left operand, it is ensured that the value is fetched from the same object to which the result will be stored, a crucial guarantee that is only vaguely and implicitly given in the current formulation through the "except that E1 is evaluated only once" clause). The difference with the previous solution is that the fetch is now a genuine lvalue-to-rvalue conversion (since the middle operand is a prvalue) and is performed as part of the value computation of the operands to the compound assignment, which makes it naturally unsequenced with the value computation of the right operand. It should be stated somewhere (in a new clause that describes this implicit operand) that the value computation of the left operand is sequenced before this lvalue-to-rvalue conversion (it clearly must). Now 1.9:12 can be left as it is, and in place of 5.17:7 I propose

In a compound assignment op= with left operand a (an lvalue), and middle and right operands b and c respectively (both prvalues), the operator op is applied with b as left operand and c as right operand, and the resulting value replaces that of the object referred to by a.

(That gives one evaluation, the side effect, with as second evaluation the trivial value computation of the compound operator, sequenced after it.)
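The second proposal can be modelled in the same spirit, again only as a sketch with hypothetical names: the fetched value becomes an explicit by-value middle parameter, so the fetch happens during operand evaluation and not inside the operation. The example call deliberately uses a right operand that does not touch the left object, since the problematic call is the one that would be undefined.

```cpp
#include <functional>

// Hypothetical model of the three-operand proposal: the fetch is part of operand evaluation.
template <class T, class Op>
T& compound_assign_three_operands(T& a, T b, T c, Op op) {
    a = op(b, c);  // evaluation 1: the side effect (the store)
    return a;      // evaluation 2: trivial value computation, an lvalue
}

// Models `x += 2`: the middle operand is the value fetched from x itself.
int second_proposal_example() {
    int x = 5;
    return compound_assign_three_operands(x, x, 2, std::plus<int>());
}
```

Modelling the original example as compound_assign_three_operands(i, i, ++i + 1, ...) would read i for the middle argument unsequenced relative to the ++i in the right argument under the C++11 rules for function arguments, which mirrors the UB this proposal is meant to produce; that call is intentionally not written out here.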

The changes to 1.9:15 and 5.17:1 suggested in the previous solution could still be applied, but would not give our original example defined behavior. However the modified example at the top of this answer would still have defined behavior, unless the part of 5.17:1 saying that a compound assignment is a single operation is scrapped or modified (there is a similar passage in 5.2.6 for postfix increment/decrement). The existence of those passages would suggest that detaching the fetch and store operations within a single compound assignment or postfix increment/decrement (and by extension making our example UB) was not the intention of those who wrote the current standard, but this of course is mere guesswork.
