Why does gzip/deflate compressing a small file result in many trailing zeroes?

Question

I'm using the following code to compress a small (~4kB) HTML file in C#.

byte[] fileBuffer = ReadFully(inFile, ResponsePacket.maxResponsePayloadLength); // Read the entire requested HTML file into a memory buffer
inFile.Close();                                                                 // Close the requested HTML file

byte[] payload;
using (MemoryStream compMS = new MemoryStream())                                       // Create a new memory stream to hold the compressed HTML data
{
    using (GZipStream gzip = new GZipStream(compMS, CompressionMode.Compress))            // Create a new GZip object pointing to the empty memory stream
    {
        gzip.Write(fileBuffer, 0, fileBuffer.Length);                                   // Compress the file buffer and write it to the empty memory stream
        gzip.Close();                                                                   // Close the GZip object
    }
    payload = compMS.GetBuffer();                                            // Write the compressed file buffer data in the memory stream to a byte buffer
}

The resulting compressed data is about 2k, but about half of it is just zeroes. This is for a very bandwidth sensitive application (which is why I'm bothering to compress 4kB in the first place), so the extra 1kB of zeroes is wasted valuable space. My best guess would be that the compression algorithm is padding out the data to a block boundary. If so, is there any way to override this behavior or change the block size? I get the same results with vanilla .NET GZipStream and zlib's GZipStream, as well as DeflateStream.

Recommended answer

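You're calling the wrong MemoryStream method. GetBuffer() returns the stream's underlying buffer, which is always at least as large as the data written to the stream; the unused tail of that buffer is what shows up as trailing zeroes. It is very efficient, because no copy needs to be made, but it is not a compression artifact and has nothing to do with block padding.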

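What you need here is the ToArray() method, which returns a copy containing only the bytes actually written. Alternatively, keep GetBuffer() and use the Length property to know how many bytes of the buffer are valid.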
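
As a minimal sketch of both options (the class name, method name, and sample data below are made up for illustration and are not from the question): `Compress` uses ToArray(), while `Main` shows the GetBuffer()/Length variant. The sketch assumes the `GZipStream` constructor overload with a `leaveOpen` argument, so that the MemoryStream is still open when Length is read; Length throws once the stream has been closed, whereas ToArray() still works.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class GzipTrailingZeroesDemo
{
    // Compresses the input and returns only the bytes that were actually written.
    public static byte[] Compress(byte[] input)
    {
        using (var compMS = new MemoryStream())
        {
            // Disposing the GZipStream flushes the final block and the gzip trailer.
            using (var gzip = new GZipStream(compMS, CompressionMode.Compress))
            {
                gzip.Write(input, 0, input.Length);
            }
            // ToArray() copies exactly Length bytes and works even after the stream is closed.
            return compMS.ToArray();
        }
    }

    static void Main()
    {
        // Hypothetical sample data standing in for the ~4 kB HTML file.
        var sb = new StringBuilder();
        for (int i = 0; i < 300; i++) sb.Append("<p>hello</p>");
        byte[] html = Encoding.UTF8.GetBytes(sb.ToString());

        byte[] payload = Compress(html);
        Console.WriteLine("ToArray length: " + payload.Length);

        // No-copy alternative: keep the MemoryStream open and pair GetBuffer() with Length.
        using (var compMS = new MemoryStream())
        {
            using (var gzip = new GZipStream(compMS, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(html, 0, html.Length);
            }
            byte[] buffer = compMS.GetBuffer();   // may be larger than the data
            int validBytes = (int)compMS.Length;  // number of bytes that are real data
            Console.WriteLine("Buffer capacity: " + buffer.Length + ", valid bytes: " + validBytes);
        }
    }
}
```

With the second variant, only the first `validBytes` bytes of `buffer` should be sent; the rest is unused capacity.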
