Gather bits at specific positions into a new value

Question

I have a bit-mask of N chars in size, which is statically known (i.e. can be calculated at compile time, but it's not a single constant, so I can't just write it down), with bits set to 1 denoting the "wanted" bits. And I have a value of the same size, which is only known at runtime. I want to collect the "wanted" bits from that value, in order, into the beginning of a new value. For simplicity's sake let's assume the number of wanted bits is <= 32.

Completely unoptimized reference code which hopefully has the correct behaviour:

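#include <climits>   // CHAR_BIT
#include <cstdlib>   // std::abort

template<int N, const char mask[N]>
unsigned gather_bits(const char* val)
{
    unsigned result   = 0;
    char*    result_p = (char*)&result;   // result is filled in byte by byte
    int      pos      = 0;                // next free bit position in result
    for (int i = 0; i < N * CHAR_BIT; i++)
    {
        if (mask[i/CHAR_BIT] & (1 << (i % CHAR_BIT)))       // bit i is wanted
        {
            if (val[i/CHAR_BIT] & (1 << (i % CHAR_BIT)))    // bit i is set in val
            {
                if (pos < int(sizeof(unsigned) * CHAR_BIT))
                {
                    result_p[pos/CHAR_BIT] |= 1 << (pos % CHAR_BIT);
                }
                else
                {
                    std::abort();   // more wanted bits than fit in the result
                }
            }
            pos += 1;
        }
    }
    return result;
}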

I'm not sure whether that formulation actually allows access to the contents of the mask at compile time, but in any case the mask is available for use; maybe a constexpr function or something similar would be a better idea. I'm not looking here for the necessary C++ wizardry (I'll figure that out), just the algorithm.

An example of input/output, with 16-bit values and imaginary binary notation for clarity:

mask   = 0b0011011100100110
val    = 0b0101000101110011
--
wanted = 0b__01_001__1__01_ // retain only those bits which are set in the mask
result = 0b0000000001001101 // bring them to the front
                   ^ gathered bits begin here
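
The worked example above can be checked mechanically. The following is a minimal sketch for the single-word case only; `gather_bits_naive` is a hypothetical name that does not appear in the original post, and the sketch assumes C++14 or later (for binary literals and loops in constexpr functions).

```cpp
#include <cstdint>

// Hypothetical helper (not from the original post): gathers the bits of `val`
// selected by `mask` into the low-order bits of the result, one bit at a time.
constexpr std::uint32_t gather_bits_naive(std::uint32_t val, std::uint32_t mask)
{
    std::uint32_t result = 0;
    unsigned pos = 0;                       // next free bit position in result
    for (unsigned i = 0; i < 32; ++i) {
        if (mask & (std::uint32_t{1} << i)) {       // bit i is wanted
            if (val & (std::uint32_t{1} << i))      // bit i is set in val
                result |= std::uint32_t{1} << pos;
            ++pos;
        }
    }
    return result;
}

// The 16-bit example from the question, checked at compile time.
static_assert(gather_bits_naive(0b0101000101110011u, 0b0011011100100110u)
                  == 0b0000000001001101u,
              "worked example from the question");
```

Because the function is constexpr, the same code can also run at compile time when both arguments are constants, although in the question only the mask is known statically.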

My questions are:

  • What's the most performant way to do this? (Are there any hardware instructions that can help?)

  • What if both the mask and the value are restricted to unsigned, i.e. a single word, instead of an unbounded char array? Can it then be done with a fixed, short sequence of instructions?

Recommended answer

The pext (parallel bit extract) instruction, part of the BMI2 extension and available from Intel Haswell onward, does exactly what you want. I don't know what the performance of that instruction is, though it is probably better than the alternatives. This operation is also known as "compress-right" or simply "compress"; the implementation from Hacker's Delight is this:

unsigned compress(unsigned x, unsigned m) {
   unsigned mk, mp, mv, t; 
   int i; 

   x = x & m;           // Clear irrelevant bits. 
   mk = ~m << 1;        // We will count 0's to right. 

   for (i = 0; i < 5; i++) {
      mp = mk ^ (mk << 1);             // Parallel prefix. 
      mp = mp ^ (mp << 2); 
      mp = mp ^ (mp << 4); 
      mp = mp ^ (mp << 8); 
      mp = mp ^ (mp << 16); 
      mv = mp & m;                     // Bits to move. 
      m = m ^ mv | (mv >> (1 << i));   // Compress m. 
      t = x & mv; 
      x = x ^ t | (t >> (1 << i));     // Compress x. 
      mk = mk & ~mp; 
   } 
   return x; 
} 
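
For comparison, here is a hedged sketch of how pext might be used from C++ through the `_pext_u32` intrinsic in `<immintrin.h>`. It assumes a compiler that defines `__BMI2__` when BMI2 code generation is enabled (for example GCC or Clang with `-mbmi2`); otherwise it falls back to a simple portable loop. The names `gather_word`, `popcount32` and `gather_bytes` are hypothetical and not part of the original answer. `gather_bytes` is one possible way to extend the single-word operation to the N-byte mask from the question, assuming at most 32 mask bits are set in total.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__BMI2__)
#include <immintrin.h>   // _pext_u32
#endif

// Gather the bits of `val` selected by `mask` into the low bits of the result.
inline std::uint32_t gather_word(std::uint32_t val, std::uint32_t mask)
{
#if defined(__BMI2__)
    return _pext_u32(val, mask);                    // one pext instruction
#else
    // Portable fallback: handle one set mask bit per iteration, lowest first.
    std::uint32_t result = 0;
    for (std::uint32_t bit = 1; mask != 0; bit <<= 1) {
        std::uint32_t lowest = mask & (~mask + 1);  // isolate lowest set bit
        if (val & lowest) result |= bit;
        mask &= mask - 1;                           // clear lowest set bit
    }
    return result;
#endif
}

// Number of set bits in x (portable; clears one set bit per iteration).
inline unsigned popcount32(std::uint32_t x)
{
    unsigned n = 0;
    for (; x != 0; x &= x - 1) ++n;
    return n;
}

// Assumption: at most 32 mask bits are set in total, as in the question.
// Each byte is compressed separately and appended at the running position.
inline std::uint32_t gather_bytes(const unsigned char* val,
                                  const unsigned char* mask, std::size_t n)
{
    std::uint32_t result = 0;
    unsigned pos = 0;                               // bits gathered so far
    for (std::size_t i = 0; i < n; ++i) {
        std::uint32_t m = mask[i];
        if (m == 0) continue;                       // nothing wanted here
        result |= gather_word(val[i], m) << pos;
        pos += popcount32(m);
    }
    return result;
}
```

Since the mask is known statically, the per-byte population counts (and therefore every shift amount) could in principle be computed at compile time, leaving only the extractions and ORs for run time. Note that this sketch builds the result with integer shifts rather than writing through a char pointer as the reference code does, so the gathered bits always start at the least significant bit regardless of byte order.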
