How undefined are __builtin_ctz(0) or __builtin_clz(0)?
Question
For a long time, gcc has provided a number of builtin bit-twiddling functions, in particular ones that count trailing and leading 0-bits (also available for long unsigned and long long unsigned, with the suffixes l and ll):
Built-in Function: int __builtin_clz (unsigned int x)
Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.
Built-in Function: int __builtin_ctz (unsigned int x)
Returns the number of trailing 0-bits in x, starting at the least significant bit position. If x is 0, the result is undefined.
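Because the documentation leaves the zero case undefined, the portable approach is to test for zero before calling the builtin. The following is a minimal sketch for GCC or Clang; the names safe_ctz and safe_clz are hypothetical and not part of any library, and returning the bit width for a zero input is a convention chosen here, not something the builtins guarantee.

```cpp
#include <limits>

// Hypothetical helpers (names are assumptions, not a real API).
// They never pass 0 to the builtin, so the undefined case is not reached.
constexpr int safe_ctz(unsigned x)
{
    // Convention chosen for this sketch: report the bit width for x == 0.
    return x ? __builtin_ctz(x) : std::numeric_limits<unsigned>::digits;
}

constexpr int safe_clz(unsigned x)
{
    return x ? __builtin_clz(x) : std::numeric_limits<unsigned>::digits;
}
```

With these, safe_ctz(8u) yields 3, and a zero argument yields the number of value bits in unsigned on every platform, at the cost of one extra comparison.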
On every online compiler I tested (disclaimer: x64 only), however, both clz(0) and ctz(0) return the number of bits of the underlying builtin type, e.g.
#include <iostream>
#include <limits>

int main()
{
    // prints 32 32 32 on most systems
    std::cout << std::numeric_limits<unsigned>::digits << " " << __builtin_ctz(0) << " " << __builtin_clz(0);
}
Live example.
The latest Clang SVN trunk in -std=c++1y mode has made all these functions relaxed C++14 constexpr, which makes them candidates for use in a SFINAE expression for a wrapper function template around the 3 ctz / clz builtins for unsigned, unsigned long, and unsigned long long:
template<class T> // wrapper class specialized for u, ul, ull (not shown)
constexpr int ctznz(T x) { return wrapper_class_around_builtin_ctz<T>()(x); }

// overload for platforms where ctznz returns size of underlying type
template<class T>
constexpr auto ctz(T x)
    -> typename std::enable_if<ctznz(0) == std::numeric_limits<T>::digits, int>::type
{ return ctznz(x); }

// overload for platforms where ctznz does something else
template<class T>
constexpr auto ctz(T x)
    -> typename std::enable_if<ctznz(0) != std::numeric_limits<T>::digits, int>::type
{ return x ? ctznz(x) : std::numeric_limits<T>::digits; }
The gain from this hack is that platforms that give the required result for ctz(0) can omit an extra conditional to test for x==0 (which might seem a micro-optimization, but when you are already down to the level of builtin bit-twiddling functions, it can make a big difference).
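The wrapper class is not shown in the question. One plausible shape for it, purely as an assumption about what the author had in mind, is a class template specialized for the three unsigned types, each forwarding to the builtin with the matching suffix:

```cpp
// Assumed sketch of the wrapper that the question leaves out.
// The primary template is declared but not defined, so unsupported
// types fail to compile instead of silently picking a wrong builtin.
template<class T>
struct wrapper_class_around_builtin_ctz;

template<>
struct wrapper_class_around_builtin_ctz<unsigned>
{
    constexpr int operator()(unsigned x) const { return __builtin_ctz(x); }
};

template<>
struct wrapper_class_around_builtin_ctz<unsigned long>
{
    constexpr int operator()(unsigned long x) const { return __builtin_ctzl(x); }
};

template<>
struct wrapper_class_around_builtin_ctz<unsigned long long>
{
    constexpr int operator()(unsigned long long x) const { return __builtin_ctzll(x); }
};

// Same signature as in the question; the result is undefined for x == 0.
template<class T>
constexpr int ctznz(T x) { return wrapper_class_around_builtin_ctz<T>()(x); }
```

Under this sketch, ctznz(8u) selects __builtin_ctz and ctznz(8ull) selects __builtin_ctzll. Note that a call such as ctznz(0) deduces T as int, for which no specialization exists here, so the argument would need an explicit unsigned type.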
How undefined is the family of builtin functions clz(0) and ctz(0)?
- can they throw an std::invalid_argument exception?
- for x64, will they for the current gcc distro return the size of the underlying type?
- are the ARM/x86 platforms any different (I have no access to those to test them)?
- is the above SFINAE trick a well-defined way to separate such platforms?
Recommended answer
Unfortunately, even x86-64 implementations can differ. According to Intel's instruction set reference, BSF and BSR with a source operand value of 0 leave the destination undefined and set the ZF (zero flag). So the behaviour may not be consistent between micro-architectures or between, say, AMD and Intel. (I believe AMD leaves the destination unmodified.)
The newer LZCNT and TZCNT instructions are not ubiquitous. Both are present only as of the Haswell architecture (for Intel).
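One way to act on this, sketched here under assumptions rather than as a definitive recipe, is to use the TZCNT intrinsic only when the compiler has been told the target supports BMI1, and to fall back to an explicit zero test otherwise. The sketch assumes GCC or Clang, which define the __BMI__ macro when BMI1 code generation is enabled (for example with -mbmi) and provide _tzcnt_u32 in <immintrin.h>. The function name ctz_or_width is hypothetical.

```cpp
#include <limits>

#if defined(__BMI__)
#include <immintrin.h>  // _tzcnt_u32, available when BMI1 is enabled
#endif

// Hypothetical helper: trailing zero count that returns the bit width for 0.
inline int ctz_or_width(unsigned x)
{
#if defined(__BMI__)
    // TZCNT is specified to return the operand size for a zero input,
    // so no extra branch is needed on targets built with BMI1 support.
    return static_cast<int>(_tzcnt_u32(x));
#else
    // Without BMI1 the builtin may lower to BSF, whose result for 0 is
    // undefined, so the zero case is handled explicitly.
    return x ? __builtin_ctz(x) : std::numeric_limits<unsigned>::digits;
#endif
}
```

The choice is made at compile time from the target flags, not from the CPU the program happens to run on, so a binary built with BMI1 enabled should only be run on hardware that actually has the instruction.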