Does applying post-decrement on a pointer already addressing the base of an array invoke undefined behavior?

Question

After hunting for a related or duplicate question concerning the following to no avail (I can only do marginal justice to describe the sheer number of pointer-arithmetic and post-decrement questions tagged with C, but suffice it to say "boatloads" does a grave injustice to that result set count) I toss this in the ring in hopes of clarification or a referral to a duplicate that eluded me.

If the post-decrement operator is applied to a pointer such as below, a simple reverse-iteration of an array sequence, does the following code invoke undefined behavior?

#include <stdio.h>
#include <string.h>

int main()
{
    char s[] = "some string";
    const char *t = s + strlen(s);

    while(t-->s)
        fputc(*t, stdout);
    fputc('\n', stdout);

    return 0;
}

It was recently proposed to me that 6.5.6p8 (Additive operators), in conjunction with 6.5.2.4 (Postfix increment and decrement operators), specifies that even performing a post-decrement upon t when it already contains the base address of s invokes undefined behavior, regardless of whether the resulting value of t (not the t-- expression result) is evaluated or not. I simply want to know if that is indeed the case.

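The portions of the standard cited were:

6.5.6 Additive operators

  8. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.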

and its nearly tightly coupled relationship with...

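6.5.2.4 Postfix increment and decrement operators

Constraints

  1. The operand of the postfix increment or decrement operator shall have atomic, qualified, or unqualified real or pointer type, and shall be a modifiable lvalue.

Semantics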

  2. The result of the postfix ++ operator is the value of the operand. As a side effect, the value of the operand object is incremented (that is, the value 1 of the appropriate type is added to it). See the discussions of additive operators and compound assignment for information on constraints, types, and conversions and the effects of operations on pointers. The value computation of the result is sequenced before the side effect of updating the stored value of the operand. With respect to an indeterminately-sequenced function call, the operation of postfix ++ is a single evaluation. Postfix ++ on an object with atomic type is a read-modify-write operation with memory_order_seq_cst memory order semantics.98)

The postfix -- operator is analogous to the postfix ++ operator, except that the value of the operand is decremented (that is, the value 1 of the appropriate type is subtracted from it).

Forward references: additive operators (6.5.6), compound assignment (6.5.16.2).

The very reason for using the post-decrement operator in the posted sample is to avoid evaluating an eventually-invalid address value against the base address of the array. For example, the code above was a refactor of the following:

#include <stdio.h>
#include <string.h>

int main() 
{
    char s[] = "some string";

    size_t len = strlen(s);    
    char *t = s + len - 1;
    while(t >= s) 
    {
        fputc(*t, stdout);
        t = t - 1;
    }
    fputc('\n', stdout);
}

Forgetting for a moment that this has a non-zero-length string for s, this general algorithm clearly has issues (perhaps not as clearly to some). If s[] were instead "", then t would be assigned a value of s-1, which itself is not in the valid range of s through its one-past address, and the evaluation for comparison against s that ensues is no good. If s has non-zero length, that addresses the initial s-1 problem, but only temporarily, as eventually this still counts on that value (whatever it is) being valid for comparison against s to terminate the loop. It could be worse; it could have naively been:

    size_t len = strlen(s) - 1;
    char *t = s + len;

This has disaster written all over it if s were a zero-length string. The refactored code this question opened with was intended to address all of these issues. But...
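To make the zero-length case concrete, here is a minimal sketch. The helper names `last_index_naive` and `last_index_checked` are hypothetical and not part of the question. Because `size_t` is unsigned, `strlen(s) - 1` on an empty string wraps around to `SIZE_MAX`; that subtraction is itself well defined, but adding the result to `s` afterwards is not.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: computes the "last index" the naive way.
 * For an empty string strlen(s) is 0, and 0 - 1 wraps to SIZE_MAX
 * because size_t arithmetic is performed modulo SIZE_MAX + 1. */
size_t last_index_naive(const char *s)
{
    return strlen(s) - 1;
}

/* Guarded variant: returns 1 and stores the last index in *out when
 * the string is non-empty; returns 0 and leaves *out untouched otherwise. */
int last_index_checked(const char *s, size_t *out)
{
    size_t len = strlen(s);

    if (len == 0)
        return 0;
    *out = len - 1;
    return 1;
}
```

With this in hand, `last_index_naive("")` yields `SIZE_MAX`, so `s + len` in the naive snippet would form a pointer far outside the array.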

My paranoia may be getting to me, but it isn't paranoia if they're really all out to get you. So, per the standard (these sections, or perhaps others), does the original code (scroll to the top of this novel if you forgot what it looks like by now) indeed invoke undefined behavior or not?

Recommended answer

I am pretty certain that the result of the post-decrement in this case is indeed undefined behaviour. The post-decrement clearly subtracts one from a pointer to the beginning of an object, so the result does not point to an element of the same array, and by the definition of pointer arithmetic (§6.5.6/8, as cited in the OP) that's undefined behaviour. The fact that you never use the resulting pointer is irrelevant.

What's wrong with:

char *t = s + strlen(s);
while (t > s) fputc(*--t, stdout);
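Fleshed out into a complete function, the suggestion above might look like the following sketch. The name `reverse_copy` and the choice to write into a caller-supplied buffer instead of `stdout` are assumptions made so the result is easy to check; the loop shape is the one from the answer. The pointer is compared first and decremented only while `t > s`, so it never moves below `s`, and an empty string needs no special case.

```c
#include <string.h>

/* Hypothetical helper: copies s into out in reverse order.
 * The caller must provide room for strlen(s) + 1 bytes in out. */
void reverse_copy(const char *s, char *out)
{
    const char *t = s + strlen(s);   /* one past the last character: valid */

    while (t > s)                    /* t stays within [s, s + strlen(s)] */
        *out++ = *--t;               /* pre-decrement runs only while t > s */
    *out = '\0';
}
```

For example, calling `reverse_copy("some string", buf)` with a sufficiently large `buf` leaves `"gnirts emos"` in `buf`.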

Interesting but irrelevant fact: The implementation of reverse iterators in the standard C++ library usually holds in the reverse iterator a pointer to one past the target element. This allows the reverse iterator to be used normally without ever involving a pointer to "one before the beginning" of the container, which would be UB, as above.
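The same convention can be sketched in C rather than C++. The `rev_iter` type and its functions below are hypothetical illustrations, not any library's API: the iterator stores a pointer one past the element it refers to, so the end of reverse iteration is simply the array's base address and no pointer before the array is ever formed.

```c
#include <stddef.h>

/* Hypothetical reverse iterator: base points one past the current element. */
struct rev_iter {
    const char *base;
};

/* Start of reverse iteration over arr[0..n-1]: refers to the last element. */
struct rev_iter rev_begin(const char *arr, size_t n)
{
    struct rev_iter it = { arr + n };   /* one past the end: a valid pointer */
    return it;
}

/* Iteration is finished once base has reached the first element's address. */
int rev_done(struct rev_iter it, const char *arr)
{
    return it.base == arr;
}

/* Element the iterator refers to; call only while !rev_done(). */
char rev_deref(struct rev_iter it)
{
    return it.base[-1];
}

/* Step towards the front of the array; call only while !rev_done(). */
struct rev_iter rev_next(struct rev_iter it)
{
    it.base--;
    return it;
}
```

A loop of the form `for (it = rev_begin(a, n); !rev_done(it, a); it = rev_next(it))` then visits the elements from last to first.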
