什么是重新present短位串的最好方法? [英] What's the best way to represent a short bit string?

查看:183
本文介绍了什么是重新present短位串的最好方法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想重新present高达约120位的串,并且速度是至关重要的。我需要能够通过反复 snoc建立一个位串操作,然后反复 uncons 操作来使用它。一种想法是从数据DWORD窃取 Word128 实施和使用这样的构建:

I want to represent a string of up to around 120 bits, and speed is critical. I need to be able to build a bitstring by repeated snoc operations, and then to consume it with repeated uncons operations. One idea is to steal the implementation of Word128 from data-dword and use something like this to build:

empty = 1
snoc xs x = (xs `shiftL` 1) .|. x

但unconsing似乎变得有点难看,必先 countLeadingZeros 键,左移位,以消除他们能够读出由移位和屏蔽高的元素之前位。

But the unconsing seems to get a bit ugly, having to first countLeadingZeros and shift left to eliminate them before being able to read off the elements by shifting and masking the high bits.

有一些这至少快更愉快的方式,或者说没有太多更难听一些更快捷的方式?

Is there some more pleasant way that's at least as fast, or some faster way that's not too much more unpleasant?

菲尔Ruffwind曾在提出了一个版本镜头的 Data.Map中,但所有的实现迄今为止比幼稚的做法时,键比较便宜镜头目前使用的慢得多。如果我能产生路径的非常便宜的再presentation一个条目,同时寻找它,然后用或特殊版本插入非常有效地使用它删除,那么也许我可以做这项有意义的。

Phil Ruffwind has proposed a version of lens's at for Data.Map, but all implementations thus far are substantially slower than the naive implementation lens currently uses when key comparison is cheap. If I could produce a very cheap representation of the path to an entry while looking it up, and then consume it very efficiently with a specialized version of insert or delete, then maybe I could make this worthwhile.

推荐答案

我不知道,如果这资格。我担心,我重新实现 countLeadingZeros 以某种形式...

I am not sure if this qualifies. I fear that I'm re-implementing countLeadingZeros in some form...

反正想法是从左侧snoc位右移。然后,我们可以算用x X-1 和XOR的的尾随零。 计数的结果是一个面具00..01..11的,大致是尾随零一元复presentation。我们因为我们没有必要不要将此一元转换为二进制:一些位层面的工作,我们可以uncons

Anyway, the idea is to snoc bits from the left, shifting right. Then, we can "count" the trailing zeros of x using x-1 and a XOR. The result of the "count" is a mask "00..01..11" which, roughly, is a unary representation of the trailing zeros. We do not convert this unary to binary since we have no need to: with some bit-level work, we can uncons.

未经测试和未经证实的code如下:

Untested and unproven code follows.

import Data.Word
import Data.Bits
import Text.Printf

type T = Word64     -- can be adapted to any WordN

-- for pretty printing
pr :: T -> String
pr x = printf "%064b\n" x

empty :: T
empty = shiftL 1 63

snoc :: T -> T -> T
snoc x xs = shiftR xs 1 .|. (shiftL x 63)

-- returns (head, tail)
-- head is not normalized (0 or 1), only (0 or /=0)
uncons :: T -> (T, T)
uncons xs = 
   let -- example
       -- 0101001100000000000   xs  
       y = (xs `xor` (xs - 1))
       -- 0000000111111111111   y
       z = shiftR y 1 + 1
       -- 0000000100000000000   z
       z' = shiftL z 1
       -- 0000001000000000000   z'
   in (xs .&. z' , (xs .&. complement z) .|. z' )

这篇关于什么是重新present短位串的最好方法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆