如何找到的唯一的集位的有效利用位操作64位值的位置? [英] How to find the position of the only-set-bit in a 64-bit value using bit manipulation efficiently?

查看:249
本文介绍了如何找到的唯一的集位的有效利用位操作64位值的位置?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

只是说我有 uint64_t中看作是八位字节序列(1个字节= 8位)类型的值。在 uint64_t中值是已知的只含有一个的设置位的在MSB位置。因此, uint64_t中值可以在下面的二进制重新presentations之一:

Just say I have a value of type uint64_t seen as sequence of octets (1 octet = 8-bit). The uint64_t value is known containing only one set bit at a MSB position. Thus, the uint64_t value can be in one of the following binary representations:

00000000 00000000 00000000 00000000 00000000 00000000 00000000 10000000  pos = 7
00000000 00000000 00000000 00000000 00000000 00000000 10000000 00000000  pos = 15
00000000 00000000 00000000 00000000 00000000 10000000 00000000 00000000  pos = 23
00000000 00000000 00000000 00000000 10000000 00000000 00000000 00000000  pos = 31
00000000 00000000 00000000 10000000 00000000 00000000 00000000 00000000  pos = 39
00000000 00000000 10000000 00000000 00000000 00000000 00000000 00000000  pos = 47
00000000 10000000 00000000 00000000 00000000 00000000 00000000 00000000  pos = 55
10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000  pos = 63

我需要一个快速的函数,返回的设置位的地位,但如果没有位被设置返回0。

I need a fast function that returns the set bit position, but returns 0 if there is no bit that is set.

如果有可能,我希望它没有既不循环也不分支。

If possible, I want it without neither looping nor branching.

推荐答案

按乘以值一个精心设计的64位不变,然后屏蔽掉高4位。对于具有快速64位乘法的CPU,这可能是因为最佳的,因为你可以得到的。

Multiply the value by a carefully designed 64-bit constant, then mask off the upper 4 bits. For any CPU with fast 64-bit multiplication, this is probably as optimal as you can get.

int field_set(uint64_t input) {
    uint64_t field = input * 0x20406080a0c0e1ULL;
    return (field >> 60) & 15;
}

// field_set(0x0000000000000000ULL) = 0
// field_set(0x0000000000000080ULL) = 1
// field_set(0x0000000000008000ULL) = 2
// field_set(0x0000000000800000ULL) = 3
// field_set(0x0000000080000000ULL) = 4
// field_set(0x0000008000000000ULL) = 5
// field_set(0x0000800000000000ULL) = 6
// field_set(0x0080000000000000ULL) = 7
// field_set(0x8000000000000000ULL) = 8

铛三x86_64的指令实现了这个,不计算框架的设置和清除:

clang implements this in three x86_64 instructions, not counting the frame setup and cleanup:

_field_set:
    push   %rbp
    mov    %rsp,%rbp
    movabs $0x20406080a0c0e1,%rax
    imul   %rdi,%rax
    shr    $0x3c,%rax
    pop    %rbp
    retq

请注意,对于任何其它的输入结果将是pretty很多随机的。 (所以不要做。)

Note that the results for any other input will be pretty much random. (So don't do that.)

我不认为有延长这种方法在7..63范围内直接返回值(不变的结构不允许它)任何可行的方法,但是你可以通过结果转换为范围7结果乘以。

I don't think there's any feasible way to extend this method to return values in the 7..63 range directly (the structure of the constant doesn't permit it), but you can convert the results to that range by multiplying the result by 7.

对于如何不断设计:我开始与以下意见:

With regard to how this constant was designed: I started with the following observations:


  • 无符号乘法是在大多数CPU的快速操作,并且可具有有用的效果。我们应该使用它。 :)

  • 在零乘零结果什么。由于这与无位设置输入所需的结果一致,我们做得很好至今。

  • 乘以 1ULL&LT任何东西;其中p 63 (即,你的POS = 63的值)只能有可能导致相同的值,或者为零。 (它不可能有任何较低位设置,并且没有较高位改变)。因此,我们必须找到某种方式为这个值作为正确的结果进行处理。

  • 使这个价值是自己的正确结果的一个简便的方法是通过右移它由60位。这种转移下来到8,这是一个方便的足够重presentation。我们可以通过7进行连接code中的其它输出为1。

  • 由每个其他比特字段乘法我们的常数等同于左移它由若干等于其位置的位。由60位右移导致仅4位到给定的位置,以显示在结果中的左侧。因此,我们可以创建所有的情况下的除了一个的如下:

  • Unsigned multiplication is a fast operation on most CPUs, and can have useful effects. We should use it. :)
  • Multiplying anything by zero results in zero. Since this matches with the desired result for a no-bits-set input, we're doing well so far.
  • Multiplying anything by 1ULL<<63 (i.e, your "pos=63" value) can only possibly result in the same value, or zero. (It cannot possibly have any lower bits set, and there are no higher bits to change.) Therefore, we must find some way for this value to be treated as the correct result.
  • A convenient way of making this value be its own correct result is by right-shifting it by 60 bits. This shifts it down to "8", which is a convenient enough representation. We can proceed to encode the other outputs as 1 through 7.
  • Multiplying our constant by each of the other bit fields is equivalent to left-shifting it by a number of bits equal to its "position". The right-shift by 60 bits causes only the 4 bits to the left of a given position to appear in the result. Thus, we can create all of the cases except for one as follows:

 uint64_t constant = (
      1ULL << (60 - 7)
    | 2ULL << (60 - 15)
    | 3ULL << (60 - 23)
    | 4ULL << (60 - 31)
    | 5ULL << (60 - 39)
    | 6ULL << (60 - 47)
    | 7ULL << (60 - 55)
 );


到目前为止,不变的是 0x20406080a0c0e0ULL 。然而,这并没有给出 POS = 63 正确的结果;这个常数是偶数,所以通过输入乘以它给为零。我们必须设置最低位(即恒| = 1ULL )来获取这种情况下的工作,使我们的最终值 0x20406080a0c0e1ULL

So far, the constant is 0x20406080a0c0e0ULL. However, this doesn't give the right result for pos=63; this constant is even, so multiplying it by that input gives zero. We must set the lowest bit (i.e, constant |= 1ULL) to get that case to work, giving us the final value of 0x20406080a0c0e1ULL.

请注意,上面的结构可以被不同地修改,以连接code中的结果。然而,输出 8 如上所述是固定的,并且所有其它输出必须适合4比特(即,0到15)。

Note that the construction above can be modified to encode the results differently. However, the output of 8 is fixed as described above, and all other output must fit into 4 bits (i.e, 0 to 15).

这篇关于如何找到的唯一的集位的有效利用位操作64位值的位置?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆