How do I stop GCC from optimizing this byte-for-byte copy into a memcpy call?

Views: 302  Posted: 2020/10/6 23:29:47  Tags: c gcc compiler-optimization


Question

I have this code for memcpy as part of my implementation of the standard C library which copies memory from src to dest one byte at a time:

#include <stddef.h>  /* size_t */

void *memcpy(void *restrict dest, const void *restrict src, size_t len)
{
    char *dp = (char *restrict)dest;
    const char *sp = (const char *restrict)src;

    /* Copy one byte at a time until len bytes have been moved. */
    while( len-- )
    {
        *dp++ = *sp++;
    }

    return dest;
}

With gcc -O2, the code generated is reasonable:

memcpy:
.LFB0:
        movq    %rdi, %rax
        testq   %rdx, %rdx
        je      .L2
        xorl    %ecx, %ecx
.L3:
        movzbl  (%rsi,%rcx), %r8d
        movb    %r8b, (%rax,%rcx)
        addq    $1, %rcx
        cmpq    %rdx, %rcx
        jne     .L3
.L2:
        ret
.LFE0:

However, at gcc -O3, GCC optimizes this naive byte-for-byte copy into a memcpy call:

memcpy:
.LFB0:
        testq   %rdx, %rdx
        je      .L7
        subq    $8, %rsp
        call    memcpy
        addq    $8, %rsp
        ret
.L7:
        movq    %rdi, %rax
        ret
.LFE0:

This won't work (memcpy unconditionally calls itself), and it causes a segfault.
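
Read back into C, the -O3 listing above behaves roughly like the sketch below. The function is renamed to a hypothetical memcpy_at_O3 so the snippet does not clash with a real memcpy; everything else mirrors the assembly, including the early return for a zero length.

```c
#include <stddef.h>

/* Hypothetical name: stands in for the memcpy that GCC emitted at -O3. */
void *memcpy_at_O3(void *dest, const void *src, size_t len)
{
    if (len == 0) {
        return dest;    /* the .L7 path: nothing to copy */
    }

    /* The "call memcpy" in the listing: the function re-enters itself
       with identical arguments, so len never shrinks. */
    return memcpy_at_O3(dest, src, len);
}
```

In the listing every nested call pushes a return address and adjusts %rsp, so any non-zero length keeps consuming stack until the process faults, which matches the segfault described above.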

I've tried passing -fno-builtin-memcpy and -fno-loop-optimizations, and the same thing occurs.

I'm using GCC version 8.3.0:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-cros-linux-gnu/8.3.0/lto-wrapper
Target: x86_64-cros-linux-gnu
Configured with: ../configure --prefix=/usr/local --libdir=/usr/local/lib64 --build=x86_64-cros-linux-gnu --host=x86_64-cros-linux-gnu --target=x86_64-cros-linux-gnu --enable-checking=release --disable-multilib --enable-threads=posix --disable-bootstrap --disable-werror --disable-libmpx --enable-static --enable-shared --program-suffix=-8.3.0 --with-arch-64=x86-64
Thread model: posix
gcc version 8.3.0 (GCC)

How do I disable the optimization that causes the copy to be transformed into a memcpy call?

Solution

One thing that seems to be sufficient here: instead of -fno-builtin-memcpy, pass -fno-builtin, and do so only when compiling the translation unit that defines memcpy.

An alternative would be to pass -fno-tree-loop-distribute-patterns, though this might be brittle, as it forbids the compiler from first reorganizing the loop code and then replacing parts of it with calls to mem* functions.
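
If changing the flags for a whole translation unit is too coarse, the same option can be scoped to a single function with GCC's optimize attribute. The sketch below is an assumption layered on top of the answer, not something the answer itself proposes: byte_copy and NO_LOOP_TO_LIBCALL are hypothetical names, and the attribute is GCC-specific, so it is compiled out for other compilers. GCC's documentation describes the optimize attribute as meant for debugging rather than production use, so it is worth checking the generated assembly for your compiler version.

```c
#include <stddef.h>

/* Hypothetical macro: applies -fno-tree-loop-distribute-patterns to one
   function only. Expands to nothing on compilers other than GCC. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_TO_LIBCALL \
    __attribute__((optimize("-fno-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_TO_LIBCALL
#endif

/* Hypothetical name; in a real C library this would be memcpy itself. */
NO_LOOP_TO_LIBCALL
void *byte_copy(void *restrict dest, const void *restrict src, size_t len)
{
    char *dp = dest;
    const char *sp = src;

    /* Plain byte-for-byte copy, same loop as in the question. */
    while (len--) {
        *dp++ = *sp++;
    }

    return dest;
}
```

A call such as byte_copy(dst, src, n) behaves like the loop in the question; the attribute only affects how GCC is allowed to transform the loop, not what it copies.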

Or, since you cannot rely on anything in the C library, perhaps using -ffreestanding could be in order.
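
One way to confirm that a translation unit really was built in freestanding mode is the standard __STDC_HOSTED__ macro, which is 1 in a hosted build and 0 under -ffreestanding. The sketch below is only a build check under that assumption; LIBC_BUILD_MODE and libc_build_mode are hypothetical names, and it says nothing about whether the loop transformation is suppressed on a given GCC version.

```c
/* Hypothetical helper: records at compile time whether this file was
   built hosted or freestanding, based on the standard macro. */
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 0
#define LIBC_BUILD_MODE "freestanding"
#else
#define LIBC_BUILD_MODE "hosted"
#endif

const char *libc_build_mode(void)
{
    return LIBC_BUILD_MODE;
}
```

Compiled normally, libc_build_mode() returns "hosted"; compiled with -ffreestanding, it returns "freestanding".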
