编码挑战:计算一个数字中的一位。 [英] Coding challenge: count the one bits in a number.
问题描述
给定一个32位整数值作为参数,写一个函数或方法来返回数字中1位的数量。
所以如果你通过42,它应该返回3 - 42十进制是2A十六进制,或101010二进制 - 因此31位。
应特别注意让它尽可能高效:循环是一个糟糕的主意!
我尝试过:
有史以来第一次设置挑战!
休息室 [ ^ ]
Given a thirty-two bit integer value as a parameter, write a function or method to return the number of "1" bits in the number.
So if You get passed 42, it should return 3 - 42 decimal is 2A hex, or 101010 binary - hence 3 "1" bits.
Special care should be taken to make it as efficient as possible: loops are a poor idea!
What I have tried:
Setting a Challenge for the first time ever!
The Lounge[^]
推荐答案
使用x86 CPU(Intel Nehalem和AMD Barcelona或更高版本)的Visual C ++的快速和脏污:
Quick and dirty for Visual C++ with x86 CPUs (Intel Nehalem and AMD Barcelona or later):
#include <iostream>
#include <stdlib.h>
#include <intrin.h>
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
if (2 != argc)
{
cout << "Usage: Bitcount <32-bit value>" << std::endl;
return 1;
}
TCHAR *stop;
unsigned long u = _tcstoul(argv[1], &stop, 0);
if (0 != *stop || u > _UI32_MAX)
{
cout << "Usage: Bitcount <32-bit value>" << std::endl;
return 1;
}
cout << "The value " << u << " has " << __popcnt(u) << " bits set" << endl;
return 0;
}
__ popcnt()
函数是一个调用x86 <$ c $的内部函数c> POPCNT 指令。
使用GCC __ builtin_popcount( )
而是用 -mpopcnt
进行编译。
[/ EDIT]
The __popcnt()
function is an intrinsic function that calls the x86 POPCNT
instruction.
With GCC use __builtin_popcount()
instead and compile with -mpopcnt
.
[/EDIT]
显然,很难击败处理器指令popcnt
。
在高级语言中,主要有两种选择:
- 计数位1乘1,如果数据大小加倍,则需要两倍的时间。
Obviously, it will be difficult to beat the processor instructionpopcnt
.
In high level languages, there is mainly 2 options:
- Counting bits 1 by 1, if size of data double, it takes twice the time.
int BitsCnt(unsigned long int Bits) {
int Cnt=0;
do {
Cnt += Bits && 1;
Bits = Bits >> 1;
}
while (Bits != 0)
return Cnt;
}
Variant可以展开循环。
Variant也可以利用处理器标志。
子变量:使用查找表,可以处理大于1位的块。查找表的问题在于它倾向于废弃处理器缓存。
- 使用穷人的并行编程计数位,如果数据大小加倍,则只需1更多步骤。
Variant can unroll the loop.
Variant can also take advantage of processor flags.
Subvariant: With a lookup table, one can treat in chunks bigger than 1 bit. problem with lookup tables is that it tend to trash processor cache.
- Counting bits using the parallel programming of the poor, if size of data double, it takes only 1 more step.
int BitsCnt(unsigned long int Bits) {
Bits= (Bits && 0x55555555) + ((Bits >> 1) && 0x55555555);
Bits= (Bits && 0x33333333) + ((Bits >> 2) && 0x33333333);
Bits= (Bits && 0x0f0f0f0f) + ((Bits >> 4) && 0x0f0f0f0f);
Bits= (Bits && 0x00ff00ff) + ((Bits >> 8) && 0x00ff00ff);
Bits= Bits + (Bits >>16);
return Bits&& 0xff;
}
一个C
一个(我知道它与Richard的CountSetBits4非常相似) )。
AC
one (I am aware it is very similar to Richard's CountSetBits4).
uint8_t bits(uint32_t w)
{
const uint8_t m[] =
{
0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
};
uint8_t * p = (uint8_t *) & w;
uint8_t b = m[*p++];
b += m[*p++];
b += m[*p++];
b += m[*p++];
b += m[*p++];
return b;
}
这篇关于编码挑战:计算一个数字中的一位。的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!