可变长度整数编码 [英] Variable length integer encoding

查看:975
本文介绍了可变长度整数编码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图逆向工程LZ1 / LZ77解压缩算法。要输出的解码缓冲器/窗口的区域的长度在文件中被编码为可变长度整数。我已经阅读尽可能多的可变长度整数编码,在这种情况下使用的方法看起来不像任何其他我看到的。也许是为了避免专利问题或者只是模糊。所包含的代码可能不是很完整,但它至少在几个文件在这一点上工作。

I am attempting to reverse engineer an LZ1/LZ77 decompression algorithm. The length of an area of the decode buffer/window to be output is encoded in the file as a variable length integer. I've read as much as I can about variable length integer encoding and the method being used in this case does not appear to be like any others I have seen. Perhaps to avoid patent issues or maybe just to obfuscate. The included code might not be quite complete but it is working on at least several files at this point.

我看不到,如果有的话,下面使用的公式可以简化为更简单的东西。大多数可变长度整数编码算法使用某种循环,但是对于这一个,我不能这样做,因为公式在评估每个半字节时似乎不一致。

I cannot see how, if at all, the formulas being used below could be reduced into something simpler. Most of the variable length integer encoding algorithms use some sort of loop but for this one, I have not been able to do that because the formula just doesn't seem to be consistent when evaluating each nibble.

非常感谢您的建议。

private static int getLength(BitReader bitStream)
{
    const int minSize = 2;

    int length = 0;

    byte nibble3, nibble2, nibble1;

    nibble3 = bitStream.ReadNibble();

    if (nibble3 >= 0xc)
    {
        nibble2 = bitStream.ReadNibble();
        nibble1 = bitStream.ReadNibble();

        if (nibble3 == 0xF & nibble2 == 0xF & nibble1 == 0xF) return -1;

        if ((nibble3 & 2) != 0)
        {
            length = (((((nibble3 & 7) + 3) << 6) + 8)) + 
                ((nibble2 & 7) << 3) + nibble1 + minSize;
        }
        else if ((nibble3 & 1) != 0)
        {
            length = (((nibble3 & 7) << 6) + 8) + 
                ((((nibble2 & 7)) + 1) << 3) + nibble1 + minSize;
        }
        else
        {
            length = ((((nibble3 & 7) << 4) + 8)) + 
                ((nibble2 & 7) << 4) + nibble1 + minSize;
        }
    }
    else if ((nibble3 & 8) != 0)
    {
        nibble1 = bitStream.ReadNibble();

        length = ((((nibble3 & 7) << 1) + 1) << 3) + nibble1 + minSize;
    }
    else
    {
        length = nibble3 + minSize;
    }

    return length;
}


推荐答案

使用的长整数编码算法与 Dlugosz可变长整型编码非常相似方法。实际上,需要多个计算,而不是单个公式。

It turns out that the variable length integer encoding algorithm being used is very similar to the Dlugosz' Variable-Length Integer Encoding method. Indeed, there are multiple calculations that are required, rather than a single formula.

基于这一点,我重写了如下的代码。我仍然试图找出使用前导0xFFF的机制的确切格式。

Based on that, I re-wrote the code as follows. I am still trying to figure out the exact format of the mechanism where a leading 0xFFF is used.

    private static int getLength(BitReader bitStream)
    {
        const int minSize = 2;
        int length = 0;
        byte nibble3, nibble2, nibble1;
        byte nibble;
        nibble = bitStream.ReadNibble();
        if (nibble == 0xF)
        {
            nibble2 = bitStream.ReadNibble();
            nibble1 = bitStream.ReadNibble();
            if (nibble2 == 0xf && nibble1 == 0xF)
            {
                //The next nibble specifies the number of nibbles to be read, maybe.
                byte nibblesToRead = (byte) (bitStream.ReadNibble()) ;
                //The Dlugosz' mechanism would use a mask on the value but that doesn't appear to be the case here.
                //nibblesToRead &= 7;
                //switch (nibblesToRead & 7){
                //    case 0: nibblesToRead = 5; break;
                //    case 1: nibblesToRead = 8; break;
                //    case 2: nibblesToRead = 16; break;                           
                //}
                byte value=0;
                byte[] values = new byte[nibblesToRead];
                bool c=true;
                for (int i = 0; i < nibblesToRead; i++)
                {
                    value = bitStream.ReadNibble();
                    //values[i] = value;
                    length += (((value << 1) | 1) << 3);
                }
                value = bitStream.ReadNibble();
                length += value;
            }
        }
        else if((nibble >= 0xC)){
           nibble2 = bitStream.ReadNibble();
           nibble1 = bitStream.ReadNibble();
           length = ((((((nibble & 1) <<1)|1))<< 3) + ((nibble2<<1)|1)<<3)+nibble1;
        }
        else if ((nibble & 8)!=0){
            nibble1 = bitStream.ReadNibble();
            length = ((((nibble & 3)<<1) | 1) << 3) + nibble1;
        }
        else{
            length=nibble;
        }
        return length + minSize;
      };

这篇关于可变长度整数编码的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆