可变长整数编​​码 [英] Variable length integer encoding

查看:188
本文介绍了可变长整数编​​码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试逆向工程LZ1 / LZ77解压缩算法。要输出的解码缓冲器/窗口的区域的长度被编码在文件中作为可变长度的整数。我已经阅读了尽可能多的可变长度整数编码,并且在这种情况下使用的方法似乎不像我看到的任何其他。也许是为了避免专利问题或者只是混淆。包含的代码可能不是很完整,但现在至少在几个文件上工作。



我看不到如果在下面使用的公式如何可以简化成一些东西。大多数可变长度的整数编码算法使用某种循环,但是对于这个循环,我无法做到这一点,因为在评估每个半字节时,该公式似乎并不一致。



非常感谢您的建议。

  private static int getLength(BitReader bitStream)
{
const int minSize = 2;

int length = 0;

字节nibble3,nibble2,nibble1;

nibble3 = bitStream.ReadNibble();

if(nibble3> = 0xc)
{
nibble2 = bitStream.ReadNibble();
nibble1 = bitStream.ReadNibble();

if(nibble3 == 0xF& nibble2 == 0xF& nibble1 == 0xF)return -1;

if((nibble3& 2)!= 0)
{
length =((((((nibble3& 7)+ 3)< 6) + 8))+
((nibble2& 7)< 3)+ nibble1 + minSize;
}
else if((nibble3& 1)!= 0)
{
length =(((nibble3& 7)< 6)+ 8) +
((((nibble2& 7))+ 1)< 3)+ nibble1 + minSize;
}
else
{
length =((((nibble3& 7)< 4)+ 8))+
((nibble2& 7)<< 4)+ nibble1 + minSize;
}
}
else if((nibble3& 8)!= 0)
{
nibble1 = bitStream.ReadNibble();

length =((((nibble3& 7)< 1)+ 1)< 3)+ nibble1 + minSize;
}
else
{
length = nibble3 + minSize;
}

返回长度;
}


解决方案

事实证明,正在使用的长整型编码算法非常类似于 Dlugosz'可变长度整数编码方法。实际上,需要多个计算,而不是单个公式。



根据这一点,我重新编写了如下代码。我仍然试图找出使用前导0xFFF的机制的确切格式。

  private static int getLength(BitReader bitStream)
{
const int minSize = 2;
int length = 0;
字节nibble3,nibble2,nibble1;
字节半字节;
nibble = bitStream.ReadNibble();
if(nibble == 0xF)
{
nibble2 = bitStream.ReadNibble();
nibble1 = bitStream.ReadNibble();
if(nibble2 == 0xf&& nibble1 == 0xF)
{
//下一个半字节指定要读取的半字节数,也许。
byte nibblesToRead =(byte)(bitStream.ReadNibble());
// Dlugosz的机制将使用一个掩码的值,但这似乎并非如此。
// nibblesToRead& = 7;
// switch(nibblesToRead& 7){
// case 0:nibblesToRead = 5;打破;
// case 1:nibblesToRead = 8;打破;
// case 2:nibblesToRead = 16;打破;
//}
byte value = 0;
byte [] values = new byte [nibblesToRead];
bool c = true; (int i = 0; i< nibblesToRead; i ++)
{
value = bitStream.ReadNibble();
// values [i] = value;
length + =(((value<< 1)| 1)< 3);
}
value = bitStream.ReadNibble();
length + = value;
}
}
else if((nibble> = 0xC)){
nibble2 = bitStream.ReadNibble();
nibble1 = bitStream.ReadNibble();
length =((((((nibble& 1)< 1)| 1))< 3)+((nibble2< 1)| 1)< 3 )+ nibble1;
}
else if((nibble& 8)!= 0){
nibble1 = bitStream.ReadNibble();
length =((((nibble& 3)<< 1)| 1)< 3)+ nibble1;
}
else {
length = nibble;
}
返回长度+ minSize;
};


I am attempting to reverse engineer an LZ1/LZ77 decompression algorithm. The length of an area of the decode buffer/window to be output is encoded in the file as a variable length integer. I've read as much as I can about variable length integer encoding and the method being used in this case does not appear to be like any others I have seen. Perhaps to avoid patent issues or maybe just to obfuscate. The included code might not be quite complete but it is working on at least several files at this point.

I cannot see how, if at all, the formulas being used below could be reduced into something simpler. Most of the variable length integer encoding algorithms use some sort of loop but for this one, I have not been able to do that because the formula just doesn't seem to be consistent when evaluating each nibble.

Suggestions are greatly appreciated.

private static int getLength(BitReader bitStream)
{
    const int minSize = 2;

    int length = 0;

    byte nibble3, nibble2, nibble1;

    nibble3 = bitStream.ReadNibble();

    if (nibble3 >= 0xc)
    {
        nibble2 = bitStream.ReadNibble();
        nibble1 = bitStream.ReadNibble();

        if (nibble3 == 0xF & nibble2 == 0xF & nibble1 == 0xF) return -1;

        if ((nibble3 & 2) != 0)
        {
            length = (((((nibble3 & 7) + 3) << 6) + 8)) + 
                ((nibble2 & 7) << 3) + nibble1 + minSize;
        }
        else if ((nibble3 & 1) != 0)
        {
            length = (((nibble3 & 7) << 6) + 8) + 
                ((((nibble2 & 7)) + 1) << 3) + nibble1 + minSize;
        }
        else
        {
            length = ((((nibble3 & 7) << 4) + 8)) + 
                ((nibble2 & 7) << 4) + nibble1 + minSize;
        }
    }
    else if ((nibble3 & 8) != 0)
    {
        nibble1 = bitStream.ReadNibble();

        length = ((((nibble3 & 7) << 1) + 1) << 3) + nibble1 + minSize;
    }
    else
    {
        length = nibble3 + minSize;
    }

    return length;
}

解决方案

It turns out that the variable length integer encoding algorithm being used is very similar to the Dlugosz' Variable-Length Integer Encoding method. Indeed, there are multiple calculations that are required, rather than a single formula.

Based on that, I re-wrote the code as follows. I am still trying to figure out the exact format of the mechanism where a leading 0xFFF is used.

    private static int getLength(BitReader bitStream)
    {
        const int minSize = 2;
        int length = 0;
        byte nibble3, nibble2, nibble1;
        byte nibble;
        nibble = bitStream.ReadNibble();
        if (nibble == 0xF)
        {
            nibble2 = bitStream.ReadNibble();
            nibble1 = bitStream.ReadNibble();
            if (nibble2 == 0xf && nibble1 == 0xF)
            {
                //The next nibble specifies the number of nibbles to be read, maybe.
                byte nibblesToRead = (byte) (bitStream.ReadNibble()) ;
                //The Dlugosz' mechanism would use a mask on the value but that doesn't appear to be the case here.
                //nibblesToRead &= 7;
                //switch (nibblesToRead & 7){
                //    case 0: nibblesToRead = 5; break;
                //    case 1: nibblesToRead = 8; break;
                //    case 2: nibblesToRead = 16; break;                           
                //}
                byte value=0;
                byte[] values = new byte[nibblesToRead];
                bool c=true;
                for (int i = 0; i < nibblesToRead; i++)
                {
                    value = bitStream.ReadNibble();
                    //values[i] = value;
                    length += (((value << 1) | 1) << 3);
                }
                value = bitStream.ReadNibble();
                length += value;
            }
        }
        else if((nibble >= 0xC)){
           nibble2 = bitStream.ReadNibble();
           nibble1 = bitStream.ReadNibble();
           length = ((((((nibble & 1) <<1)|1))<< 3) + ((nibble2<<1)|1)<<3)+nibble1;
        }
        else if ((nibble & 8)!=0){
            nibble1 = bitStream.ReadNibble();
            length = ((((nibble & 3)<<1) | 1) << 3) + nibble1;
        }
        else{
            length=nibble;
        }
        return length + minSize;
      };

这篇关于可变长整数编​​码的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆