快速位置换 [英] Fast bit permutation

查看:29
本文介绍了快速位置换的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要存储和应用 16 位整数的排列.我想出的最佳解决方案是将置换存储为 64 位整数,其中每 4 位对应第 i 位的新位置,应用程序将如下所示:

I need to store and apply permutations to 16-bit integers. The best solution I came up with is to store permutation as 64-bit integer where each 4 bits correspond to the new position of i-th bit, the application would look like:

int16 permute(int16 bits, int64 perm)
{
   int16 result = 0;
   for(int i = 0; i < 16; ++i)
      result |= ((bits >> i) & 1) * (1 << int( (perm >> (i*4))&0xf ));
   return result;
}

有没有更快的方法来做到这一点?谢谢.

is there a faster way to do this? Thank you.

推荐答案

有替代方案.

任何排列都可以由 Beneš 网络处理,并编码为掩码这是多路复用器的输入以应用混洗.这也可以在软件中相当有效地完成(不是很好但还可以),这只是一堆蝴蝶排列.掩码计算起来有点棘手,但应用起来可能比单独移动每一位更快,尽管这取决于您要处理的位有多少,而 16 并不是很多.

Any permutation can be handled by a Beneš network, and encoded as the masks that are the inputs to the multiplexers to apply the shuffle. This can be done reasonably efficiently in software too (not great but OK), it's just a bunch of butterfly permutations. The masks are a bit tricky to compute, but probably faster to apply than moving every bit on its own, though that depends on how many bits you're dealing with and 16 is not a lot.

一些较小类别的 shuffle 可以由更简单(更快)的网络处理,您也可以在该页面上找到.

Some smaller categories of shuffles can be handled by simpler (faster) networks, which you can also find on that page.

最后在实践中,在现代 x86 硬件上,有高度通用的 pshufb 函数,它可以在(通常)单个周期中将排列(但可能包括重复和零)应用于 16 个字节.在字节上分配位是有点尴尬,但是一旦你在那里,它只需要一个 pshufb 进行置换,使用 pmovmskb 将其压缩回 16 位.

Finally in practice, on modern x86 hardware, there is the highly versatile pshufb function which can apply a permutation (but may include dupes and zeroes) to 16 bytes in (typically) a single cycle. It is slightly awkward to distribute the bits over the bytes, but once you're there it only takes a pshufb to permute and a pmovmskb to compress it back down to 16 bits.

这篇关于快速位置换的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆