如何计算32位无符号整数中的前导零 [英] How to count leading zeros in a 32 bit unsigned integer

查看:355
本文介绍了如何计算32位无符号整数中的前导零的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有人可以告诉我,在C编程中,有什么有效的算法可以计算32位无符号整数中前导零的数目?

Could anyone tell me please what is an efficient algorithm to count the number of leading zeroes in a 32-bit unsigned integer in C programming?

推荐答案

此讨论假设您的编译器不支持该操作,或者未产生足够好的汇编。请注意,如今这两种情况都不大可能发生,因此我建议您仅使用 __ builtin_clz 用于编译器上的gcc或等效版本。

This discussion assumes that your compiler either doesn't support the operation or that it doesn't produce good enough assembly. Note that both of these are unlikely nowadays so I'd recommend just using __builtin_clz for gcc or equivalent on your compiler.

请注意,只有您才能确定哪个是最佳 clz算法。现代处理器是复杂的野兽,这些算法的性能将在很大程度上取决于您在其上运行的平台,在其上投放的数据以及使用该代码的代码。确保唯一的方法是测量,测量和测量更多。如果您不能分辨出差异,那么您可能就不在看瓶颈,而您的时间会花在其他地方更好。

Note that determining which is the "best" clz algo can only be done by you. Modern processors are complicated beasts and the performance of these algorithm will heavily depend on the platform you run it on, the data you throw at it and the code that uses it. The only way to be sure is to measure, measure and measure some more. If you can't tell the difference then you're probably not looking at your bottleneck and your time will be better spent elsewhere.

现在,无聊的免责声明已被淘汰顺便说一句,让我们看看 Hacker's Delight 对这个问题要说些什么。快速调查显示,所有算法都依赖某种描述的二进制搜索。这是一个简单的示例:

Now that the boring disclaimers are out of the way, let's take a look at what Hacker's Delight has to say about the problem. A quick survey shows that all algorithms rely on a binary search of some description. Here's a straightforward example:

int n = 32;
unsigned y;

y = x >>16; if (y != 0) { n = n -16; x = y; }
y = x >> 8; if (y != 0) { n = n - 8; x = y; }
y = x >> 4; if (y != 0) { n = n - 4; x = y; }
y = x >> 2; if (y != 0) { n = n - 2; x = y; }
y = x >> 1; if (y != 0) return n - 2;
return n - x;

请注意,这适用于32个整数,并且如果需要,也可以将其转换为迭代版本。不幸的是,该解决方案并没有太多的指令级并行性,并且具有相当多的分支,因此并不能很好地调整算法。请注意,上面的代码有一个免费的分支版本,但是它更加冗长,因此在这里不再赘述。

Note that this works on 32 ints and that it can also be transformed into an iterative version if needed. Unfortunately that solution doesn't have a whole lot of instruction-level parallelism and has quite a few branches which doesn't make for a very good bit twiddling algo. Note that a branch free version of the above code exists but it's much more verbose so I won't reproduce here.

因此,让我们使用pop指令来改进解决方案(计算位数):

So let's improve on the solution by using the pop instruction (counts the number of bits):

x = x | (x >> 1);
x = x | (x >> 2);
x = x | (x >> 4);
x = x | (x >> 8);
x = x | (x >>16);
return pop(~x);

那么这是怎么工作的呢?关键是最后的 pop(〜x)指令,该指令计算 x 中的零个数。为了使零计数有意义,我们首先需要消除所有非前导的0。我们通过使用二进制算法正确传播1来做到这一点。尽管我们仍然没有太多的指令级并行性,但我们确实摆脱了所有分支,并且比以前的解决方案使用了更少的周期。好多了。

So how does this work? The key is the pop(~x) instruction at the end which counts the numbers of zeros in x. For the count of zeros to be meaningful we first need to get rid of all 0s which aren't leading. We do this by right propagating the 1s using a binary algorithm. While we still don't have much instruction-level parallelism, we did get rid of all the branches and it uses fewer cycles then the previous solution. Much better.

那条流行指令怎么样,不是在作弊吗?大多数体系结构都有1周期弹出指令,可通过编译器内置函数(例如gcc的 __ builtin_pop )进行访问。否则,存在基于表的解决方案,但是在权衡缓存访问的周期时,即使表完全保留在L1缓存中,也必须要小心。

So how about that pop instruction, isn't that cheating? Most architecture have a 1 cycle pop instruction which can be accessed via compiler builtins (eg. gcc's __builtin_pop). Otherwise there exist table based solutions but care must be taken when trading off cycles for cache accesses even if the table is kept entirely in the L1 cache.

最后,通常为了让黑客高兴,我们开始在陌生地区徘徊。让我们用浮点数为我们计算一些前导零:

Finally, as is usual for hacker's delight, we start wandering in strange territories. Let's count us some leading zeroes using floating point numbers:

union {
    unsigned asInt[2];
    double asDouble;
};
asDouble = (double)k + 0.5;
return 1054 - (asInt[LE] >> 20);

首先,有一个警告:请勿使用此算法。就标准而言,这会触发不确定的行为。复制的目的是为了娱乐因素,而不是任何实际用途。后果自负。

First off, a little warning: DON'T USE THIS ALGORITHM. This triggers undefined behaviour as far as the standard is concerned. This was reproduce for the fun factor more then any practical uses. Use at your own peril.

现在,免责声明已被取消,它如何工作?它首先将int转换为double,然后继续提取double的指数成分。整洁的东西。如果在小字节序的计算机上执行,则LE常数应为1,而在大字节序的计算机上,则应为0。

Now that the disclaimers are out of the way, how does it work? It first converts the int into a double and goes on to extract the exponent component of the double. Neat stuff. The LE constant should be 1 if executed on a little-endian machine and 0 on a big-endian machine.

这应为您简要介绍各种比特摆动算法对于这个问题。请注意,这本书有几种变体,可以进行各种折衷,但是我会让您自己发现那些变体。

This should give you a brief survey of various bit twiddling algorithms for this problem. Note that the book has several variations of these which makes various tradeoffs but I'll let you discover those on your own.

这篇关于如何计算32位无符号整数中的前导零的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆