Python: Inflate and Deflate implementations


Problem description

I am interfacing with a server that requires data sent to it to be compressed with the Deflate algorithm (Huffman encoding + LZ77), and that also sends back data which I need to Inflate.

I know that Python includes Zlib, and that the C libraries in Zlib support calls to Inflate and Deflate, but these apparently are not provided by the Python Zlib module. It does provide Compress and Decompress, but when I make a call such as the following:

result_data = zlib.decompress( base64_decoded_compressed_string )

I receive the following error:

Error -3 while decompressing data: incorrect header check

Gzip does no better; when making a call such as:

result_data = gzip.GzipFile( fileobj = StringIO.StringIO( base64_decoded_compressed_string ) ).read()

I receive the error:

IOError: Not a gzipped file

which makes sense as the data is a Deflated file not a true Gzipped file.

Now I know that there is a Deflate implementation available (Pyflate), but I do not know of an Inflate implementation.

It seems that there are a few options:
1. Find an existing implementation (ideal) of Inflate and Deflate in Python
2. Write my own Python extension to the zlib C library that includes Inflate and Deflate
3. Call something else that can be executed from the command line (such as a Ruby script, since Inflate/Deflate calls in zlib are fully wrapped in Ruby)
4. ?

I am seeking a solution, but lacking a solution I will be thankful for insights, constructive opinions, and ideas.

Additional information: The result of deflating (and encoding) a string should, for the purposes I need, give the same result as the following snippet of C# code, where the input parameter is an array of UTF bytes corresponding to the data to compress:

public static string DeflateAndEncodeBase64(byte[] data)
{
    if (null == data || data.Length < 1) return null;
    string compressedBase64 = "";

    //write into a new memory stream wrapped by a deflate stream
    using (MemoryStream ms = new MemoryStream())
    {
        using (DeflateStream deflateStream = new DeflateStream(ms, CompressionMode.Compress, true))
        {
            //write byte buffer into memorystream
            deflateStream.Write(data, 0, data.Length);
            deflateStream.Close();

            //rewind memory stream and write to base 64 string
            byte[] compressedBytes = new byte[ms.Length];
            ms.Seek(0, SeekOrigin.Begin);
            ms.Read(compressedBytes, 0, (int)ms.Length);
            compressedBase64 = Convert.ToBase64String(compressedBytes);
        }
    }
    return compressedBase64;
}

Running this .NET code for the string "deflate and encode me" gives the result "7b0HYBxJliUmL23Ke39K9UrX4HShCIBgEyTYkEAQ7MGIzeaS7B1pRyMpqyqBymVWZV1mFkDM7Z28995777333nvvvfe6O51OJ/ff/z9cZmQBbPbOStrJniGAqsgfP358Hz8iZvl5mbV5mi1nab6cVrM8XeT/Dw=="

When "deflate and encode me" is run through Python's zlib.compress() and then base64 encoded, the result is "eJxLSU3LSSxJVUjMS1FIzUvOT0lVyE0FAFXHB6k=".

It is clear that zlib.compress() is not an implementation of the same algorithm as the standard Deflate algorithm.

More Information:

The first 2 bytes of the .NET deflate data ("7b0HY..."), after b64 decoding are 0xEDBD, which does not correspond to Gzip data (0x1f8b), BZip2 (0x425A) data, or Zlib (0x789C) data.

The first 2 bytes of the Python compressed data ("eJxLS..."), after b64 decoding are 0x789C. This is a Zlib header.
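
The header check described above can be automated. The following is a minimal sketch, not part of the original question: `guess_container` is a hypothetical helper name, and the zlib test assumes the RFC 1950 rule that the low nibble of the first byte is 8 and that the first two bytes, read as a big-endian 16-bit number, are a multiple of 31.

```python
import base64


def guess_container(data: bytes) -> str:
    """Guess the container format from the leading magic bytes."""
    if data[:2] == b"\x1f\x8b":
        return "gzip"
    if data[:2] == b"BZ":  # 0x42 0x5A
        return "bzip2"
    # zlib (RFC 1950): compression method 8 in the low nibble of byte 0,
    # and the 16-bit header value must be divisible by 31.
    if len(data) >= 2 and data[0] & 0x0F == 8 and ((data[0] << 8) | data[1]) % 31 == 0:
        return "zlib"
    return "unknown (possibly raw deflate)"


# Output of zlib.compress() quoted in the question
python_output = base64.b64decode("eJxLSU3LSSxJVUjMS1FIzUvOT0lVyE0FAFXHB6k=")
# First three bytes of the .NET output quoted in the question (0xED 0xBD 0x07)
dotnet_prefix = base64.b64decode("7b0H")

print(guess_container(python_output))  # zlib
print(guess_container(dotnet_prefix))  # unknown (possibly raw deflate)
```

A raw deflate stream has no magic number, so the last branch can only say that no known header was found.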

SOLVED

To handle raw deflate and inflate, without header and checksum, the following things needed to happen:

On deflate/compress: strip the first two bytes (header) and the last four bytes (checksum).
On inflate/decompress: there is a second argument for window size. If this value is negative, it suppresses headers.

Here are my methods currently, including the base64 encoding/decoding, and they work properly:

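import base64
import zlib


def decode_base64_and_inflate(b64string):
    decoded_data = base64.b64decode(b64string)
    # A negative window size (-15) tells zlib to expect a raw deflate
    # stream with no header and no checksum.
    return zlib.decompress(decoded_data, -15)


def deflate_and_base64_encode(string_val):
    # string_val must be bytes on Python 3, e.g. text.encode("utf-8")
    zlibbed_str = zlib.compress(string_val)
    # Strip the 2-byte zlib header and the 4-byte Adler-32 checksum
    compressed_string = zlibbed_str[2:-4]
    return base64.b64encode(compressed_string)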
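
As an alternative to slicing off the header and checksum, the same negative window size can be given to the compressor so that it emits a raw deflate stream directly. This is a sketch under that assumption, not the asker's code: `raw_deflate` and `raw_inflate` are hypothetical names, and the example only checks a round trip within Python, not byte-for-byte equality with the .NET output.

```python
import base64
import zlib


def raw_deflate(data: bytes, level: int = -1) -> bytes:
    """Compress to a raw deflate stream (no zlib header, no Adler-32)."""
    # wbits=-15 selects a raw stream with a 32 KiB window.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def raw_inflate(data: bytes) -> bytes:
    """Decompress a raw deflate stream."""
    return zlib.decompress(data, -15)


payload = "deflate and encode me".encode("utf-8")
encoded = base64.b64encode(raw_deflate(payload))
restored = raw_inflate(base64.b64decode(encoded))
print(restored == payload)  # True
```

Two different deflate implementations may produce different compressed bytes for the same input, so interoperability is best checked by inflating the other side's output rather than by comparing base64 strings.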

Solution

This is an add-on to MizardX's answer, giving some explanation and background.

See http://www.chiramattel.com/george/blog/2007/09/09/deflatestream-block-length-does-not-match.html

According to RFC 1950, a zlib stream constructed in the default manner is composed of:

  • a 2-byte header (e.g. 0x78 0x9C)
  • a deflate stream -- see RFC 1951
  • an Adler-32 checksum of the uncompressed data (4 bytes)
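
The three parts listed above can be pulled apart and checked in a few lines. This is an illustrative sketch: `split_zlib_stream` is a hypothetical helper, and it assumes a stream without a preset dictionary, so the header is exactly 2 bytes.

```python
import struct
import zlib


def split_zlib_stream(stream: bytes):
    """Split a zlib stream into (header, raw deflate body, Adler-32 value)."""
    header, body, trailer = stream[:2], stream[2:-4], stream[-4:]
    # RFC 1950 stores the Adler-32 checksum most significant byte first.
    (adler,) = struct.unpack(">I", trailer)
    return header, body, adler


data = b"deflate and encode me"
header, body, adler = split_zlib_stream(zlib.compress(data))

# The middle part is a raw deflate stream and inflates on its own.
print(zlib.decompress(body, -15) == data)  # True
# The trailer matches the Adler-32 of the uncompressed data.
print(adler == zlib.adler32(data))  # True
```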

The C# DeflateStream works on (you guessed it) a deflate stream. MizardX's code is telling the zlib module that the data is a raw deflate stream.

Observations:
(1) One hopes that the C# "deflation" method produces a longer string only for short input.
(2) Using the raw deflate stream without the Adler-32 checksum? A bit risky, unless it is replaced with something better.

Updates

Error message: "Block length does not match with its complement"

If you are trying to inflate some compressed data with the C# DeflateStream and you get that message, then it is quite possible that you are giving it a zlib stream, not a deflate stream.
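
The same mix-up can occur on the Python side when it is not known in advance which of the two formats a peer sends. One tolerant approach, sketched here as an assumption rather than a recommendation from the answer, is to try the zlib format first and fall back to raw deflate; `inflate_zlib_or_raw` is a hypothetical name.

```python
import zlib


def inflate_zlib_or_raw(data: bytes) -> bytes:
    """Inflate data that is either a zlib stream or a raw deflate stream."""
    try:
        # Default wbits: expects the zlib header and Adler-32 trailer.
        return zlib.decompress(data)
    except zlib.error:
        # Fall back to a raw deflate stream (no header, no checksum).
        return zlib.decompress(data, -15)


original = b"deflate and encode me"
zlib_stream = zlib.compress(original)
raw_stream = zlib_stream[2:-4]

print(inflate_zlib_or_raw(zlib_stream) == original)  # True
print(inflate_zlib_or_raw(raw_stream) == original)   # True
```

The fallback path has no checksum to verify, so corrupted raw input may go undetected.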

See "How do you use a DeflateStream on part of a file?"

Also copy/paste the error message into a Google search and you will get numerous hits (including the one up the front of this answer) saying much the same thing.

The Java Deflater ... used by "the website" ... C# DeflateStream "is pretty straightforward and has been tested against the Java implementation". Which of the following possible Java Deflater constructors is the website using?

public Deflater(int level, boolean nowrap)

Creates a new compressor using the specified compression level. If 'nowrap' is true then the ZLIB header and checksum fields will not be used in order to support the compression format used in both GZIP and PKZIP.

public Deflater(int level)

Creates a new compressor using the specified compression level. Compressed data will be generated in ZLIB format.

public Deflater()

Creates a new compressor with the default compression level. Compressed data will be generated in ZLIB format.

A one-line deflater after throwing away the 2-byte zlib header and the 4-byte checksum:

uncompressed_string.encode('zlib')[2:-4] # does not work in Python 3.x

or

zlib.compress(uncompressed_string)[2:-4]
