How to prevent gcc optimization breaking rep movsb code?

Question

I tried to create my memcpy code with rep movsb instruction. It works perfectly with any size when the optimization is disabled. But, when I enable optimization, it does not work as expected.

  1. How do I prevent gcc optimization from breaking rep movsb code?
  2. Is there something wrong with my code that leads to undefined behavior?

Motivation to create my own memcpy:

I read about enhanced movsb for memcpy in the Intel® 64 and IA-32 Architectures Optimization Reference Manual, section 3.7.6. I then looked at the libc source code and saw that the default memcpy from libc uses SSE instead of movsb.

Hence, I want to compare the performance of SSE instructions and rep movsb for memcpy. But now I find that something is wrong with my code.

#include <stdio.h>
#include <string.h>

inline static void *my_memcpy(
  register void *dest,
  register const void *src,
  register size_t n
) {
  __asm__ volatile(
    "mov %0, %%rdi;"
    "mov %1, %%rsi;"
    "mov %2, %%rcx;"
    "rep movsb;"
    :
    : "r"(dest), "r"(src), "r"(n)
    : "rdi", "rsi", "rcx"
  );
  return dest;
}

#define to_boolean_str(A) ((A) ? "true" : "false")

int main()
{
  char src[32];
  char dst[32];

  memset(src, 'a', 32);
  memset(dst, 'b', 32);

  my_memcpy(dst, src, 1);
  printf("%s\n", to_boolean_str(!memcmp(dst, src, 1)));

  my_memcpy(dst, src, 2);
  printf("%s\n", to_boolean_str(!memcmp(dst, src, 2)));

  my_memcpy(dst, src, 3);
  printf("%s\n", to_boolean_str(!memcmp(dst, src, 3)));

  return 0;
}

Compile and run

ammarfaizi2@integral:~$ gcc --version
gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

ammarfaizi2@integral:~$ gcc -O0 test.c -o test && ./test
true
true
true
ammarfaizi2@integral:~$ gcc -O1 test.c -o test && ./test
false
true
true
ammarfaizi2@integral:~$ gcc -O2 test.c -o test && ./test
false
true
true
ammarfaizi2@integral:~$ gcc -O3 test.c -o test && ./test
false
true
true
ammarfaizi2@integral:~$ 

Summary

my_memcpy(dst, src, 1); results in wrong behavior if optimizations are enabled.

Recommended answer

As written, your asm constraints do not reflect that the asm statement can modify memory, so the compiler can freely reorder it with respect to operations that read or write the memory at dest or src. You need to add "memory" to the clobber list.
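A minimal sketch of that fix, assuming x86-64 and GCC or Clang. The function name my_memcpy_clobber is hypothetical; the body is the question's my_memcpy with "memory" added to the clobber list and nothing else changed:

```c
#include <stddef.h>

/* Hypothetical name. Same asm as the question's my_memcpy; the only change
 * is the "memory" clobber, which tells the compiler that the asm reads and
 * writes memory it cannot see through the operands. */
static inline void *my_memcpy_clobber(void *dest, const void *src, size_t n)
{
  __asm__ volatile(
    "mov %0, %%rdi;"
    "mov %1, %%rsi;"
    "mov %2, %%rcx;"
    "rep movsb;"
    :
    : "r"(dest), "r"(src), "r"(n)
    : "rdi", "rsi", "rcx", "memory"
  );
  return dest;
}
```

With the clobber in place, the compiler may no longer assume that dst still holds the bytes written by the earlier memset, so the memcmp after each call has to read the buffer as it is after the copy.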

As others have noted, you should also edit the constraints to avoid the mov instructions. If you do so, you'll also need to represent in the constraints the fact that the asm now modifies its arguments (e.g. make them all dual input/output) and back up the value of dest so you can return it. So you might skip this improvement until you've gotten the code working to begin with and until you understand how constraints work.
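One way the mov-free variant could look, again assuming x86-64 and GCC or Clang. The name my_memcpy_constrained and the local ret are hypothetical. The "D", "S" and "c" constraints pin the operands to RDI, RSI and RCX, and the "+" modifier marks each one as both read and written:

```c
#include <stddef.h>

/* Hypothetical name. The operands are placed directly in RDI/RSI/RCX by the
 * constraints, so no mov instructions are needed inside the asm. */
static inline void *my_memcpy_constrained(void *dest, const void *src, size_t n)
{
  void *ret = dest;  /* rep movsb advances RDI, so keep the original pointer */
  __asm__ volatile(
    "rep movsb"
    : "+D"(dest), "+S"(src), "+c"(n)  /* all three are modified by the asm */
    :
    : "memory"                        /* the copy itself touches memory */
  );
  return ret;
}
```

Because the registers are now declared as outputs, they no longer appear in the clobber list; only "memory" remains there.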
