Does this code from "The C++ Programming Language" 4th edition section 36.3.6 have well-defined behavior?

Problem description


    In Bjarne Stroustrup's The C++ Programming Language, 4th edition, section 36.3.6 STL-like Operations, the following code is used as an example of chaining:

    void f2()
    {
        std::string s = "but I have heard it works even if you don't believe in it" ;
        s.replace(0, 4, "" ).replace( s.find( "even" ), 4, "only" )
            .replace( s.find( " don't" ), 6, "" );
    
        assert( s == "I have heard it works only if you believe in it" ) ;
    }
    

    The assert fails with gcc and Visual Studio, but it does not fail with Clang.

    Why am I getting different results? Are any of these compilers incorrectly evaluating the chaining expression or does this code exhibit some form of unspecified or undefined behavior?

    Solution

    The code exhibits unspecified behavior due to the unspecified order of evaluation of sub-expressions. It does not invoke undefined behavior, because all side effects are done within functions, which introduces a sequencing relationship between the side effects in this case.

    This example is mentioned in the proposal N4228: Refining Expression Evaluation Order for Idiomatic C++ which says the following about the code in the question:

    [...]This code has been reviewed by C++ experts world-wide, and published (The C++ Programming Language, 4th edition.) Yet, its vulnerability to unspecified order of evaluation has been discovered only recently by a tool[...]

    Details

    It may be obvious to many that arguments to functions have an unspecified order of evaluation, but it is probably less obvious how this behavior interacts with chained function calls. It was not obvious to me when I first analyzed this case, and apparently not to all the expert reviewers either.

    At first glance it may appear that, since each replace has to be evaluated from left to right, the corresponding groups of function arguments must be evaluated as groups from left to right as well.

    This is incorrect. Function arguments have an unspecified order of evaluation. Chaining function calls does introduce a left-to-right evaluation order for the calls themselves, but the arguments of each call are only sequenced before the member function call they are part of. In particular, this impacts the following calls:

    s.find( "even" )
    

    and:

    s.find( " don't" )
    

    which are indeterminately sequenced with respect to:

    s.replace(0, 4, "" )
    

    The two find calls could be evaluated before or after this replace. That matters because the replace has a side effect on s that alters the result of find: it changes the length of s. So the result differs depending on when that replace is evaluated relative to the two find calls.
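The two possible orders can be reproduced deterministically by splitting the chain into separate statements, since each statement is sequenced before the next. The following is only a minimal sketch: the helper name `run_with_order` and its flag are made up for illustration and appear neither in the book nor in the proposal.

```cpp
#include <string>

// Hypothetical helper (not from the book): performs the three replaces with
// the find calls placed explicitly before or after the first replace.
std::string run_with_order(bool finds_before_first_replace)
{
    std::string s = "but I have heard it works even if you don't believe in it";

    if (finds_before_first_replace) {
        // Both positions are computed on the original, unmodified string.
        std::string::size_type even_pos = s.find("even");    // 26
        std::string::size_type dont_pos = s.find(" don't");  // 37
        s.replace(0, 4, "");
        s.replace(even_pos, 4, "only");
        s.replace(dont_pos, 6, "");
    } else {
        // Each find runs after the first replace has shortened the string.
        s.replace(0, 4, "");
        s.replace(s.find("even"), 4, "only");   // find returns 22
        s.replace(s.find(" don't"), 6, "");     // find returns 33
    }
    return s;
}
```

Calling `run_with_order(false)` yields the string the book's assert expects, while `run_with_order(true)` yields the mangled string. Writing the original chain as three separate statements, as in the `else` branch, is one way to make the intended order explicit.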

    If we look at the chaining expression and examine the evaluation order of some of the sub-expressions:

    s.replace(0, 4, "" ).replace( s.find( "even" ), 4, "only" )
    ^ ^       ^  ^  ^    ^        ^                 ^  ^
    A B       |  |  |    C        |                 |  |
              1  2  3             4                 5  6
    

    and:

    .replace( s.find( " don't" ), 6, "" );
     ^        ^                   ^  ^
     D        |                   |  |
              7                   8  9
    

    Note, we are ignoring the fact that 4 and 7 can be further broken down into more sub-expressions. So:

    • A is sequenced before B which is sequenced before C which is sequenced before D
    • 1 to 9 are indeterminately sequenced with respect to the other sub-expressions, with the exceptions listed below
      • 1 to 3 are sequenced before B
      • 4 to 6 are sequenced before C
      • 7 to 9 are sequenced before D

    The key to this issue is that:

    • 4 to 9 are indeterminately sequenced with respect to B
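The ordering constraints listed above can be observed with a small tracing sketch. Everything here (`trace_log`, `traced_arg`, `Chain`, `index_of`, `run_chain`) is a hypothetical stand-in invented for illustration: `Chain::call` plays the role of `replace`, and `traced_arg` plays the role of an argument such as `s.find( "even" )`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tracing aid: every evaluation appends its label here.
std::vector<std::string> trace_log;

// Stands in for an argument expression such as s.find("even").
int traced_arg(const char* label)
{
    trace_log.push_back(label);
    return 0;
}

// Stands in for std::string with a chainable member like replace.
struct Chain {
    Chain& call(const char* label, int /*unused*/)
    {
        trace_log.push_back(label);
        return *this;
    }
};

// Returns the position of label in the log, or the log size if absent.
std::size_t index_of(const std::vector<std::string>& log, const std::string& label)
{
    for (std::size_t i = 0; i < log.size(); ++i)
        if (log[i] == label) return i;
    return log.size();
}

// Runs a chain shaped like the one in the question and returns the trace.
std::vector<std::string> run_chain()
{
    trace_log.clear();
    Chain c;
    c.call("B", traced_arg("1")).call("C", traced_arg("4")).call("D", traced_arg("7"));
    return trace_log;
}
```

In the returned trace, "1" always precedes "B", "4" precedes "C", "7" precedes "D", and "B", "C", "D" appear in that order. Where "4" and "7" land relative to "B" is the part that may differ between compilers under the C++11 rules discussed here. As far as I know, C++17 and later sequence the postfix expression before the arguments, which would pin the trace down to 1 B 4 C 7 D.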

    The potential order of evaluation choice for 4 and 7 with respect to B explains the difference in results between clang and gcc when evaluating f2(). In my tests clang evaluates B before evaluating 4 and 7 while gcc evaluates it after. We can use the following test program to demonstrate what is happening in each case:

    #include <iostream>
    #include <string>
    
    std::string::size_type my_find( std::string s, const char *cs )
    {
        std::string::size_type pos = s.find( cs ) ;
        std::cout << "position " << cs << " found in complete expression: "
            << pos << std::endl ;
    
        return pos ;
    }
    
    int main()
    {
       std::string s = "but I have heard it works even if you don't believe in it" ;
       std::string copy_s = s ;
    
       std::cout << "position of even before s.replace(0, 4, \"\" ): " 
             << s.find( "even" ) << std::endl ;
       std::cout << "position of  don't before s.replace(0, 4, \"\" ): " 
             << s.find( " don't" ) << std::endl << std::endl;
    
       copy_s.replace(0, 4, "" ) ;
    
       std::cout << "position of even after s.replace(0, 4, \"\" ): " 
             << copy_s.find( "even" ) << std::endl ;
       std::cout << "position of  don't after s.replace(0, 4, \"\" ): "
             << copy_s.find( " don't" ) << std::endl << std::endl;
    
       s.replace(0, 4, "" ).replace( my_find( s, "even" ) , 4, "only" )
            .replace( my_find( s, " don't" ), 6, "" );
    
       std::cout << "Result: " << s << std::endl ;
    }
    

    Result for gcc:

    position of even before s.replace(0, 4, "" ): 26
    position of  don't before s.replace(0, 4, "" ): 37
    
    position of even after s.replace(0, 4, "" ): 22
    position of  don't after s.replace(0, 4, "" ): 33
    
    position  don't found in complete expression: 37
    position even found in complete expression: 26
    
    Result: I have heard it works evenonlyyou donieve in it
    

    Result for clang:

    position of even before s.replace(0, 4, "" ): 26
    position of  don't before s.replace(0, 4, "" ): 37
    
    position of even after s.replace(0, 4, "" ): 22
    position of  don't after s.replace(0, 4, "" ): 33
    
    position even found in complete expression: 22
    position  don't found in complete expression: 33
    
    Result: I have heard it works only if you believe in it
    

    Result for Visual Studio:

    position of even before s.replace(0, 4, "" ): 26
    position of  don't before s.replace(0, 4, "" ): 37
    
    position of even after s.replace(0, 4, "" ): 22
    position of  don't after s.replace(0, 4, "" ): 33
    
    position  don't found in complete expression: 37
    position even found in complete expression: 26
    Result: I have heard it works evenonlyyou donieve in it
    

    Details from the standard

    We know that, unless specified otherwise, the evaluations of sub-expressions are unsequenced. This is from the draft C++11 standard, section 1.9 Program execution, which says:

    Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.[...]

    We also know that a function call introduces a sequenced-before relationship between the function call's postfix expression and arguments on one side and the function body on the other, from section 1.9:

    [...]When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function.[...]

    We also know that class member access, and therefore chaining, will evaluate from left to right, from section 5.2.5 Class member access, which says:

    [...]The postfix expression before the dot or arrow is evaluated; the result of that evaluation, together with the id-expression, determines the result of the entire postfix expression.

    Note, in the case where the id-expression ends up being a non-static member function it does not specify the order of evaluation of the expression-list within the () since that is a separate sub-expression. The relevant grammar from 5.2 Postfix expressions:

    postfix-expression:
        postfix-expression ( expression-list_opt )       // function call
        postfix-expression . template_opt id-expression  // class member access, ends
                                                         // up as a postfix-expression
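One way to picture this grammar is to rewrite the chain as nested calls of a free function that takes the object as its first parameter. This is only a rough analogy, and `replace_in` and `nested_form` are hypothetical names introduced here for illustration. In the nested form it is easier to see that each inner call and the neighbouring `find` are sibling arguments of the same outer call.

```cpp
#include <cstddef>
#include <string>

// Hypothetical free-function wrapper around std::string::replace.
std::string& replace_in(std::string& str, std::size_t pos, std::size_t len, const char* with)
{
    return str.replace(pos, len, with);
}

std::string nested_form()
{
    std::string s = "but I have heard it works even if you don't believe in it";
    // The innermost replace_in call and s.find("even") are both arguments of
    // the middle call, so their relative order is unspecified. The same holds
    // for s.find(" don't") as an argument of the outermost call.
    replace_in(
        replace_in(
            replace_in(s, 0, 4, ""),
            s.find("even"), 4, "only"),
        s.find(" don't"), 6, "");
    return s;
}
```

Because the order of evaluation of function arguments is unspecified, the exact text returned by `nested_form` may differ between compilers, just as with the chained version. Only its length is the same in every order, since the three replaces always remove ten characters and insert four.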
    
