可变长度编码的整数 [英] Variable length integer encoding

查看:177
本文介绍了可变长度编码的整数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图反向工程LZ1 / LZ77 DECOM pression算法。去code缓冲区/窗口的区域的长度为输出连接的文件作为一个可变长度整数codeD。我读过的尽我所能关于可变长度编码的整数和方法在这种情况下使用不会出现像我见过的任何人。也许是为了避免专利问题或者只是混淆。所包含的code可能没有比较完备,但在这一点上至少几个文件工作。

我看不出如何,如果在所有被下面使用的公式可以还原为简单的东西。大多数可变长度整数编码算法使用某种形式的循环,但为这一个,我没有能够做到这一点,因为公式似乎没有评估每个蚕食的时候要一致。

建议是大大AP preciated。

 私有静态诠释的getLength(BitReader比特流)
{
    const int的minSize属性= 2;    INT长度= 0;    字节nibble3,nibble2,nibble1;    nibble3 = bitStream.ReadNibble();    如果(nibble3> =位于0xC)
    {
        nibble2 = bitStream.ReadNibble();
        nibble1 = bitStream.ReadNibble();        如果(nibble3 ==&0xF的放大器; nibble2 ==&0xF的放大器; nibble1 == 0xF的)返回-1;        如果((nibble3&放大器;!2)= 0)
        {
            长度=(((((nibble3和7)+ 3)&所述; 6;)+ 8))+
                ((nibble2和7)所述; 3;)+ nibble1 + minSize属性;
        }
        否则如果((nibble3&放大器;!1)= 0)
        {
            长度=(((nibble3和7)所述; 6;)+ 8)+
                ((((nibble2和7))+ 1) - ; 3;)+ nibble1 + minSize属性;
        }
        其他
        {
            长度=((((nibble3和7)所述; 4;)+ 8))+
                ((nibble2和7)所述; 4;)+ nibble1 + minSize属性;
        }
    }
    否则如果((nibble3&放大器;!8)= 0)
    {
        nibble1 = bitStream.ReadNibble();        长度=((((nibble3和7)所述;&所述; 1)+ 1) - ; 3;)+ nibble1 + minSize属性;
    }
    其他
    {
        长度= nibble3 + minSize属性;
    }    返回的长度;
}


解决方案

原来,正在使用的可变长度整数编码算法是非常相似的Dlugosz'可变长度编码整数方法。事实上,有所需的,而不是一个单一的公式多个计算

在此基础上

,我重新写了code如下。我仍然试图找出其中使用了领先的0xFFF的机制的确切格式。

 私有静态诠释的getLength(BitReader比特流)
    {
        const int的minSize属性= 2;
        INT长度= 0;
        字节nibble3,nibble2,nibble1;
        字节蚕食;
        半字节= bitStream.ReadNibble();
        如果(四位== 0xF的)
        {
            nibble2 = bitStream.ReadNibble();
            nibble1 = bitStream.ReadNibble();
            如果(nibble2 ==&0xF的放大器;&安培; nibble1 == 0xF的)
            {
                //下半字节用于指定要读取的半字节的数目,可能。
                字节nibblesToRead =(字节)(bitStream.ReadNibble());
                //该Dlugosz'机制将使用上的值的掩模,但不会出现在这里是这样的。
                // nibblesToRead&安培; = 7;
                //开关(nibblesToRead和7){
                //情况下0:nibblesToRead = 5;打破;
                //案例1:nibblesToRead = 8;打破;
                //案例2:nibblesToRead = 16;打破;
                //}
                字节值= 0;
                字节[] =值新字节[nibblesToRead]
                布尔C =真实的;
                的for(int i = 0; I< nibblesToRead;我++)
                {
                    值= bitStream.ReadNibble();
                    //值[I] =价值;
                    长+ =(((值下;&所述; 1)| 1) - ; 3;);
                }
                值= bitStream.ReadNibble();
                长度+ =价值;
            }
        }
        否则,如果((四位> =位于0xC)){
           nibble2 = bitStream.ReadNibble();
           nibble1 = bitStream.ReadNibble();
           长度=((((((半字节&放大器; 1) - ;&所述; 1)| 1))≤; 3;)+((nibble2&所述;&所述; 1)| 1) - ; 3;)+ nibble1;
        }
        否则如果((半字节和放大器;!8)= 0){
            nibble1 = bitStream.ReadNibble();
            长度=((((半字节和3)所述;&所述; 1)| 1) - ; 3;)+ nibble1;
        }
        其他{
            长度=蚕食;
        }
        返回长度+ minSize属性;
      };

I am attempting to reverse engineer an LZ1/LZ77 decompression algorithm. The length of an area of the decode buffer/window to be output is encoded in the file as a variable length integer. I've read as much as I can about variable length integer encoding and the method being used in this case does not appear to be like any others I have seen. Perhaps to avoid patent issues or maybe just to obfuscate. The included code might not be quite complete but it is working on at least several files at this point.

I cannot see how, if at all, the formulas being used below could be reduced into something simpler. Most of the variable length integer encoding algorithms use some sort of loop but for this one, I have not been able to do that because the formula just doesn't seem to be consistent when evaluating each nibble.

Suggestions are greatly appreciated.

private static int getLength(BitReader bitStream)
{
    const int minSize = 2;

    int length = 0;

    byte nibble3, nibble2, nibble1;

    nibble3 = bitStream.ReadNibble();

    if (nibble3 >= 0xc)
    {
        nibble2 = bitStream.ReadNibble();
        nibble1 = bitStream.ReadNibble();

        if (nibble3 == 0xF & nibble2 == 0xF & nibble1 == 0xF) return -1;

        if ((nibble3 & 2) != 0)
        {
            length = (((((nibble3 & 7) + 3) << 6) + 8)) + 
                ((nibble2 & 7) << 3) + nibble1 + minSize;
        }
        else if ((nibble3 & 1) != 0)
        {
            length = (((nibble3 & 7) << 6) + 8) + 
                ((((nibble2 & 7)) + 1) << 3) + nibble1 + minSize;
        }
        else
        {
            length = ((((nibble3 & 7) << 4) + 8)) + 
                ((nibble2 & 7) << 4) + nibble1 + minSize;
        }
    }
    else if ((nibble3 & 8) != 0)
    {
        nibble1 = bitStream.ReadNibble();

        length = ((((nibble3 & 7) << 1) + 1) << 3) + nibble1 + minSize;
    }
    else
    {
        length = nibble3 + minSize;
    }

    return length;
}

解决方案

It turns out that the variable length integer encoding algorithm being used is very similar to the Dlugosz' Variable-Length Integer Encoding method. Indeed, there are multiple calculations that are required, rather than a single formula.

Based on that, I re-wrote the code as follows. I am still trying to figure out the exact format of the mechanism where a leading 0xFFF is used.

    private static int getLength(BitReader bitStream)
    {
        const int minSize = 2;
        int length = 0;
        byte nibble3, nibble2, nibble1;
        byte nibble;
        nibble = bitStream.ReadNibble();
        if (nibble == 0xF)
        {
            nibble2 = bitStream.ReadNibble();
            nibble1 = bitStream.ReadNibble();
            if (nibble2 == 0xf && nibble1 == 0xF)
            {
                //The next nibble specifies the number of nibbles to be read, maybe.
                byte nibblesToRead = (byte) (bitStream.ReadNibble()) ;
                //The Dlugosz' mechanism would use a mask on the value but that doesn't appear to be the case here.
                //nibblesToRead &= 7;
                //switch (nibblesToRead & 7){
                //    case 0: nibblesToRead = 5; break;
                //    case 1: nibblesToRead = 8; break;
                //    case 2: nibblesToRead = 16; break;                           
                //}
                byte value=0;
                byte[] values = new byte[nibblesToRead];
                bool c=true;
                for (int i = 0; i < nibblesToRead; i++)
                {
                    value = bitStream.ReadNibble();
                    //values[i] = value;
                    length += (((value << 1) | 1) << 3);
                }
                value = bitStream.ReadNibble();
                length += value;
            }
        }
        else if((nibble >= 0xC)){
           nibble2 = bitStream.ReadNibble();
           nibble1 = bitStream.ReadNibble();
           length = ((((((nibble & 1) <<1)|1))<< 3) + ((nibble2<<1)|1)<<3)+nibble1;
        }
        else if ((nibble & 8)!=0){
            nibble1 = bitStream.ReadNibble();
            length = ((((nibble & 3)<<1) | 1) << 3) + nibble1;
        }
        else{
            length=nibble;
        }
        return length + minSize;
      };

这篇关于可变长度编码的整数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆