Does (p+x)-x always result in p for pointer p and integer x in gcc linux x86-64 C++

Views: 114 Published: 2020/11/13 0:20:49 Tags: c++ linux gcc x86-64 pointer-arithmetic

Question

Suppose we have:

char* p;
int   x;

As recently discussed in another question, arithmetic (including comparison operations) on invalid pointers can generate unexpected behavior in gcc linux x86-64 C++. This new question is specifically about the expression (p+x)-x: can it generate unexpected behavior (i.e., a result that is not p) in any existing GCC version running on x86-64 linux?

Note that this question is just about pointer arithmetic; there is absolutely no intention to access the location designated by *(p+x), which obviously would be unpredictable in general.

The practical interest here is non-zero-based arrays. Note that (p+x) and the subtraction by x happen in different places in the code in these applications.

If recent GCC versions on x86-64 can be shown to never generate unexpected behavior for (p+x)-x then these versions can be certified for non-zero-based arrays, and future versions generating unexpected behavior could be modified or configured to support this certification.

Update

For the practical case described above, we could also assume p itself is a valid pointer and p != NULL.

Recommended answer

Yes, for gcc5.x and later specifically, that specific expression is optimized very early to just p, even with optimization disabled, regardless of any possible runtime UB.

This happens even with a static array and compile-time constant size. gcc -fsanitize=undefined doesn't insert any instrumentation to look for it either. There are also no warnings at -Wall -Wextra -Wpedantic.

int *add(int *p, long long x) {
    return (p+x) - x;
}

int *visible_UB(void) {
    static int arr[100];
    return (arr+200) - 200;
}

Using gcc -fdump-tree-original to dump its internal representation of program logic before any optimization passes shows that this optimization happens even before that point in gcc5.x and newer. (And it happens even at -O0.)

;; Function int* add(int*, long long int) (null)
;; enabled by -tree-original

return <retval> = p;

;; Function int* visible_UB() (null)
;; enabled by -tree-original

{
  static int arr[100];

    static int arr[100];
  return <retval> = (int *) &arr;
}

That's from the Godbolt compiler explorer with gcc8.3 with -O0.

The x86-64 asm output is just:

; g++8.3 -O0
add(int*, long long):
    mov     QWORD PTR [rsp-8], rdi
    mov     QWORD PTR [rsp-16], rsi    # spill args
    mov     rax, QWORD PTR [rsp-8]     # reload only the pointer
    ret
visible_UB():
    mov     eax, OFFSET FLAT:_ZZ10visible_UBvE3arr
    ret

-O3 output for add is of course just mov rax, rdi.

gcc4.9 and earlier only do this optimization in a later pass, and not at -O0: the tree dump still includes the subtract, and the x86-64 asm is:

# g++4.9.4 -O0
add(int*, long long):
    mov     QWORD PTR [rsp-8], rdi
    mov     QWORD PTR [rsp-16], rsi
    mov     rax, QWORD PTR [rsp-16]
    lea     rdx, [0+rax*4]             # RDX = x*4 = x*sizeof(int)
    mov     rax, QWORD PTR [rsp-16]
    sal     rax, 2
    neg     rax                        # RAX = -(x*4)
    add     rdx, rax                   # RDX = x*4 + (-(x*4)) = 0
    mov     rax, QWORD PTR [rsp-8]
    add     rax, rdx                   # p += x + (-x)
    ret
visible_UB():       # but constants still optimize away at -O0
    mov     eax, OFFSET FLAT:_ZZ10visible_UBvE3arr
    ret

This does match the -fdump-tree-original output:

return <retval> = p + ((sizetype) ((long unsigned int) x * 4) + -(sizetype) ((long unsigned int) x * 4));

If x*4 overflows, you'll still get the right answer. In practice I can't think of a way to write a function that would lead to the UB causing an observable change in behaviour.
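The overflow is harmless because the scaled offsets in the gcc4.9 tree dump are computed in an unsigned type (sizetype), where arithmetic wraps modulo 2^64. A minimal sketch of that cancellation, assuming std::uint64_t as a stand-in for sizetype on x86-64; the names scaled_offset and net_byte_offset are hypothetical and exist only for this illustration:

```cpp
#include <cstdint>
#include <limits>

// Stand-in for (sizetype)((long unsigned int) x * 4) from the tree dump.
// std::uint64_t is an assumption for illustration; unsigned multiplication
// wraps modulo 2^64 instead of overflowing.
constexpr std::uint64_t scaled_offset(long long x) {
    return static_cast<std::uint64_t>(x) * 4u;
}

// Same shape as the dumped expression: x*4 + -(x*4), all in unsigned math.
// The negation also wraps, so the sum is 0 for every x.
constexpr std::uint64_t net_byte_offset(long long x) {
    return scaled_offset(x) + (std::uint64_t{0} - scaled_offset(x));
}

// Compile-time checks, including values where x*4 does not fit in 64 bits.
static_assert(net_byte_offset(0) == 0, "zero offset");
static_assert(net_byte_offset(-1) == 0, "negative x");
static_assert(net_byte_offset(std::numeric_limits<long long>::max()) == 0,
              "x*4 wraps but still cancels");
```

This only models the integer part of the computation; it says nothing about whether forming p+x is valid in the abstract machine.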

As part of a larger function, a compiler would be allowed to infer some range info, like that p[x] is part of the same object as p[0], so reading memory in between / out that far is allowed and won't segfault. e.g. allowing auto-vectorization of a search loop.

But I doubt that gcc even looks for that, let alone takes advantage of it.

(Note that your question title was specific to gcc targeting x86-64 on Linux, not about whether similar things are safe in gcc, e.g. if done in separate statements. I mean yes probably safe in practice, but won't be optimized away almost immediately after parsing. And definitely not about C++ in general.)

I highly recommend not doing this. Use uintptr_t to hold pointer-like values that aren't actual valid pointers, like you're doing in the updates to your answer on "C++ gcc extension for non-zero-based array pointer allocation?".
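As a rough illustration of the uintptr_t approach, here is a minimal sketch of a non-zero-based array view. The class OffsetArray and its members are hypothetical (they do not come from the question or the linked answer), and the code assumes the pointer/integer conversions behave as a plain address round trip, which is implementation-defined in C++ but is how GCC on x86-64 behaves when the final pointer refers to the original object:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: a view over `size` elements whose valid indices are
// [base, base + size). The biased address is kept as an integer, so no
// out-of-bounds pointer value is ever formed.
template <typename T>
class OffsetArray {
public:
    OffsetArray(T* data, std::ptrdiff_t base, std::size_t size)
        : biased_(reinterpret_cast<std::uintptr_t>(data)
                  - static_cast<std::uintptr_t>(base) * sizeof(T)),
          base_(base), size_(size) {}

    T& operator[](std::ptrdiff_t i) const {
        assert(i >= base_ && static_cast<std::size_t>(i - base_) < size_);
        // Unsigned wraparound cancels the bias, so addr lies inside the
        // original array before it is converted back to a pointer.
        const std::uintptr_t addr =
            biased_ + static_cast<std::uintptr_t>(i) * sizeof(T);
        return *reinterpret_cast<T*>(addr);
    }

private:
    std::uintptr_t biased_;  // address of data minus base*sizeof(T), as an integer
    std::ptrdiff_t base_;    // first valid index
    std::size_t size_;       // number of elements
};
```

For example, OffsetArray<int> a(storage, 100, 5) over an int storage[5] would make a[100] refer to storage[0] and a[104] refer to storage[4]. The subtraction and the later addition can live in different parts of the code, as in the question, because only integers are carried between them.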
