How to get `gcc` to generate `bts` instruction for x86-64 from standard C?


Question

Inspired by a recent question (http://stackoverflow.com/questions/2039592/effectiveness-of-gcc-optmization-on-bit-operations), I'd like to know if anyone knows how to get gcc to generate the x86-64 bts instruction (bit test and set) on Linux x86-64 platforms, without resorting to inline assembly or to nonstandard compiler intrinsics.

Related questions:


  • Why doesn't gcc do this for a simple |= operation where the right-hand side has exactly 1 bit set? (http://stackoverflow.com/questions/2039592/effectiveness-of-gcc-optmization-on-bit-operations)

  • How to get bts using compiler intrinsics or the asm directive (http://stackoverflow.com/questions/1983303/using-bts-assembly-instruction-with-gcc-compiler)

Portability is more important to me than bts, so I won't use an asm directive, and if there's another solution, I prefer not to use compiler intrinsics.

EDIT: The C source language does not support atomic operations, so I'm not particularly interested in getting an atomic test-and-set (even though that's the original reason for test-and-set to exist in the first place). If I want something atomic, I know I have no chance of doing it with standard C source: it has to be an intrinsic, a library function, or inline assembly. (I have implemented atomic operations in compilers that support multiple threads.)
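For reference, the non-atomic operation the question describes can be written in standard C roughly as follows. This is only a sketch: the function name `test_and_set_bit` is hypothetical, and whether gcc turns it into a `bts` instruction depends on the compiler version, optimization level, and target tuning, so the generated assembly (for example from `gcc -O2 -S`) has to be inspected case by case.

```c
/* Hypothetical helper, not from the original post.
 * Sets bit `bit` in *word and returns its previous value (0 or 1).
 * Not atomic. Assumes `bit` is smaller than the width of unsigned long. */
int test_and_set_bit(unsigned long *word, unsigned int bit)
{
    unsigned long mask = 1UL << bit;   /* single-bit mask */
    int was_set = (*word & mask) != 0; /* the "test" half */
    *word |= mask;                     /* the "set" half */
    return was_set;
}
```

The test and the set are written as separate statements; it is up to the optimizer to recognize them as one bit-test-and-set.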

Recommended answer

This is covered in the first answer to the first link: how much does it matter in the grand scheme of things? The only places where you test bits are:


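  • Low-level drivers. However, if you are writing one, you probably know asm, the code is already tied to the system, and most of the delay is probably in I/O.

  • Testing for flags. This usually happens either during initialization (once, at the beginning) or in some shared computation (which takes much more time).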

The overall impact on the performance of applications and macrobenchmarks is likely to be minimal, even if microbenchmarks show an improvement.

Regarding the EDIT: using bts alone does not guarantee that the operation is atomic. All it guarantees is that it is atomic on this core (as is an or done on memory). On multi-processor systems (uncommon) or multi-core systems (very common) you still have to synchronize with the other processors.

As synchronization is much more expensive, I believe that the difference between:

asm("lock bts %1, %0" : "+m" (*array) : "r" (bit));

asm("lock or %1, %0" : "+m" (*array) : "r" (1 << bit));

is minimal. And the second form:


  • Can set several flags at once

  • Has a nice __sync_fetch_and_or (array, 1 << bit) form (works with gcc and the Intel compiler, as far as I remember).
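A minimal sketch of the `__sync_fetch_and_or` form, assuming a compiler that provides the GCC `__sync` builtins. These are compiler builtins rather than standard C, and the helper names `atomic_test_and_set_bit` and `atomic_set_flags` are hypothetical:

```c
/* Hypothetical helpers built on the GCC builtin __sync_fetch_and_or,
 * which atomically ORs a value into *word and returns the old contents. */

/* Atomically set bit `bit` and return its previous value (0 or 1).
 * Assumes `bit` is smaller than the width of unsigned int. */
int atomic_test_and_set_bit(unsigned int *word, unsigned int bit)
{
    unsigned int mask = 1u << bit;
    unsigned int old = __sync_fetch_and_or(word, mask);
    return (old & mask) != 0;
}

/* Atomically set several flags at once and return the previous word. */
unsigned int atomic_set_flags(unsigned int *word, unsigned int flags)
{
    return __sync_fetch_and_or(word, flags);
}
```

The second helper shows the point about setting several flags in one operation, which a single bts cannot do.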

