How undefined are __builtin_ctz(0) or __builtin_clz(0)?

Question

For a long time, gcc has been providing a number of builtin bit-twiddling functions, in particular the number of trailing and leading 0-bits (also for long unsigned and long long unsigned, which have suffixes l and ll):

— Built-in Function: int __builtin_clz (unsigned int x)

Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.

— Built-in Function: int __builtin_ctz (unsigned int x)

Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.

On every online (disclaimer: only x64) compiler I tested, however, the result has been that both clz(0) and ctz(0) return the number of bits of the underlying builtin type, e.g.

#include <iostream>
#include <limits>

int main()
{
    // prints 32 32 32 on most systems
    std::cout << std::numeric_limits<unsigned>::digits << " " << __builtin_ctz(0) << " " << __builtin_clz(0);    
}

Live example.

The latest Clang SVN trunk in -std=c++1y mode has made all these functions relaxed C++14 constexpr, which makes them candidates for use in a SFINAE expression for a wrapper function template around the three ctz / clz builtins for unsigned, unsigned long, and unsigned long long:

template<class T> // wrapper class specialized for u, ul, ull (not shown)
constexpr int ctznz(T x) { return wrapper_class_around_builtin_ctz<T>()(x); }

// overload for platforms where ctznz returns size of underlying type
// (T(0) rather than a plain 0 keeps the check dependent on T, so it probes
// the wrapper for the type actually being instantiated)
template<class T>
constexpr auto ctz(T x) 
-> typename std::enable_if<ctznz(T(0)) == std::numeric_limits<T>::digits, int>::type
{ return ctznz(x); }

// overload for platforms where ctznz does something else
template<class T>
constexpr auto ctz(T x) 
-> typename std::enable_if<ctznz(T(0)) != std::numeric_limits<T>::digits, int>::type
{ return x ? ctznz(x) : std::numeric_limits<T>::digits; }

The gain from this hack is that platforms that give the required result for ctz(0) can omit the extra conditional that tests for x==0. That might seem like a micro-optimization, but when you are already down to the level of builtin bit-twiddling functions, it can make a big difference.
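For comparison, the always-checked fallback that the hack tries to avoid can be written without any SFINAE. This is only a minimal sketch: the name ctz_safe is made up for illustration, and it assumes a GCC-compatible compiler that provides __builtin_ctz, __builtin_ctzl and __builtin_ctzll. Because the builtin is only ever reached with a non-zero argument, the undefined case never arises.

```cpp
#include <limits>

// Hypothetical helper (name chosen for illustration, not from the question):
// one overload per builtin, each guarding the undefined x == 0 case.
inline int ctz_safe(unsigned x)
{
    // Return the bit width for 0; otherwise defer to the builtin.
    return x ? __builtin_ctz(x) : std::numeric_limits<unsigned>::digits;
}

inline int ctz_safe(unsigned long x)
{
    return x ? __builtin_ctzl(x) : std::numeric_limits<unsigned long>::digits;
}

inline int ctz_safe(unsigned long long x)
{
    return x ? __builtin_ctzll(x) : std::numeric_limits<unsigned long long>::digits;
}
```

The cost is exactly the extra x == 0 test discussed above; whether the compiler can remove it depends on the target.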

How undefined is the family of builtin functions clz(0) and ctz(0)?

  • can they throw an std::invalid_argument exception?
  • for x64, will they return the size of the underlying type with the current gcc distro?
  • are the ARM/x86 platforms any different (I have no access to those to test them)?
  • is the above SFINAE trick a well-defined way to separate such platforms?

Recommended answer

Unfortunately, even x86-64 implementations can differ. According to Intel's instruction set reference, BSF and BSR with a source operand value of 0 leave the destination undefined and set the ZF (zero flag). So the behaviour may not be consistent between micro-architectures or, say, between AMD and Intel. (I believe AMD leaves the destination unmodified.)

The newer LZCNT and TZCNT instructions are not ubiquitous. Both are present only as of the Haswell architecture (for Intel).
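One way to benefit from TZCNT where it exists is to select it at compile time and keep the guarded builtin everywhere else. The sketch below is an assumption-laden illustration, not part of the original answer: it assumes GCC or Clang, where the __BMI__ macro is predefined when BMI1 code generation is enabled (for example with -mbmi) and _tzcnt_u32 is declared in <immintrin.h>. The function name ctz_or_width is hypothetical.

```cpp
#include <limits>
#if defined(__BMI__)
#include <immintrin.h>  // declares _tzcnt_u32 when BMI1 is enabled
#endif

// Hypothetical helper: count trailing zero bits, returning the bit width for 0.
inline int ctz_or_width(unsigned x)
{
#if defined(__BMI__)
    // TZCNT is defined for a zero source: it yields the operand size in bits.
    return static_cast<int>(_tzcnt_u32(x));
#else
    // Without BMI1, never hand 0 to the builtin; its result would be undefined.
    return x ? __builtin_ctz(x) : std::numeric_limits<unsigned>::digits;
#endif
}
```

Both branches give the same answer for every input, so callers do not need to know which instruction was selected.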