How do I stop GCC from optimizing this byte-for-byte copy into a memcpy call?

Question

As part of my own implementation of the standard C library, I have this code for memcpy, which copies memory from src to dest one byte at a time:

void *memcpy(void *restrict dest, const void *restrict src, size_t len)
{
    char *dp = (char *restrict)dest;
    const char *sp = (const char *restrict)src;

    while( len-- )
    {
        *dp++ = *sp++;
    }

    return dest;
}

With gcc -O2, the code generated is reasonable:

memcpy:
.LFB0:
        movq    %rdi, %rax
        testq   %rdx, %rdx
        je      .L2
        xorl    %ecx, %ecx
.L3:
        movzbl  (%rsi,%rcx), %r8d
        movb    %r8b, (%rax,%rcx)
        addq    $1, %rcx
        cmpq    %rdx, %rcx
        jne     .L3
.L2:
        ret
.LFE0:

However, with gcc -O3, GCC optimizes this naive byte-for-byte copy into a memcpy call:

memcpy:
.LFB0:
        testq   %rdx, %rdx
        je      .L7
        subq    $8, %rsp
        call    memcpy
        addq    $8, %rsp
        ret
.L7:
        movq    %rdi, %rax
        ret
.LFE0:

This won't work: whenever len is nonzero, memcpy calls itself, so it recurses until the stack overflows and the program segfaults.

I've tried passing -fno-builtin-memcpy and -fno-loop-optimizations, and the same thing occurs.

I'm using GCC version 8.3.0:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-cros-linux-gnu/8.3.0/lto-wrapper
Target: x86_64-cros-linux-gnu
Configured with: ../configure --prefix=/usr/local --libdir=/usr/local/lib64 --build=x86_64-cros-linux-gnu --host=x86_64-cros-linux-gnu --target=x86_64-cros-linux-gnu --enable-checking=release --disable-multilib --enable-threads=posix --disable-bootstrap --disable-werror --disable-libmpx --enable-static --enable-shared --program-suffix=-8.3.0 --with-arch-64=x86-64
Thread model: posix
gcc version 8.3.0 (GCC) 

How do I disable the optimization that causes the copy to be transformed into a memcpy call?

Solution

One thing that seems to be sufficient here: instead of -fno-builtin-memcpy, use -fno-builtin, and only when compiling the translation unit that defines memcpy.

An alternative would be to pass -fno-tree-loop-distribute-patterns, though this might be brittle: it forbids the compiler from first reorganizing the loop code and then replacing parts of it with calls to mem* functions, and that is the transformation that produces the memcpy call here.
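The same switch can also be scoped to a single function instead of a whole translation unit. The sketch below assumes GCC's optimize function attribute (glibc appears to use a similar construct for its own string functions). The names my_memcpy, NO_LOOP_TO_LIBCALL and my_memcpy_selftest are hypothetical, chosen only so the example can be built and run next to a hosted C library; in a real libc the function would be named memcpy. GCC's documentation recommends the optimize attribute mainly for debugging, so treat this as something to verify against the generated assembly rather than as a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GCC-specific: turn off loop-to-libcall pattern recognition for one
   function. Other compilers get an empty macro (assumption: they either
   lack this attribute or do not need it). */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_TO_LIBCALL \
    __attribute__((optimize("no-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_TO_LIBCALL
#endif

/* Hypothetical stand-in for the libc's own memcpy. */
NO_LOOP_TO_LIBCALL
void *my_memcpy(void *restrict dest, const void *restrict src, size_t len)
{
    char *dp = dest;
    const char *sp = src;

    while (len--) {
        *dp++ = *sp++;   /* plain byte-for-byte copy */
    }
    return dest;
}

/* Hosted-only sanity check: the copy must reproduce the source bytes
   and return the destination pointer. */
void my_memcpy_selftest(void)
{
    const char src[] = "byte-for-byte";
    char dst[sizeof src] = {0};

    void *ret = my_memcpy(dst, src, sizeof src);
    assert(ret == dst);
    assert(memcmp(dst, src, sizeof src) == 0);
}
```

Compiling this with gcc -O3 -S and checking that the body of my_memcpy contains no call to memcpy is the real test of whether the attribute took effect on a given GCC version.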

Or, since you cannot rely on anything in the C library, perhaps using -ffreestanding would be in order.
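If changing build flags is not an option, a source-level workaround is sometimes used instead. It is not part of the answer above: the sketch below places an empty asm statement with a "memory" clobber inside the loop, on the assumption that GCC will then no longer recognize the loop as a plain copy. The name my_memcpy_barrier and its self-test are hypothetical, and the barrier also prevents vectorization of the loop, so the copy stays one byte at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the libc's own memcpy. */
void *my_memcpy_barrier(void *restrict dest, const void *restrict src,
                        size_t len)
{
    char *dp = dest;
    const char *sp = src;

    while (len--) {
        *dp++ = *sp++;
#if defined(__GNUC__)
        /* Emits no instructions, but the optimizer must assume it may
           read or write any memory, so the loop is no longer a pure
           copy pattern (assumption: this is enough to block the
           loop-to-memcpy transformation on the GCC version in use). */
        __asm__ volatile("" ::: "memory");
#endif
    }
    return dest;
}

/* Hosted-only sanity check for the copy itself. */
void my_memcpy_barrier_selftest(void)
{
    const char src[] = "no libcall";
    char dst[sizeof src] = {0};

    void *ret = my_memcpy_barrier(dst, src, sizeof src);
    assert(ret == dst);
    assert(memcmp(dst, src, sizeof src) == 0);
}
```

As with the flag-based approaches, inspecting the -O3 assembly for a call to memcpy is the way to confirm the workaround holds for a particular compiler.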
