In zlib programming, will the CHUNK size affect the compressed file size?


Question

I am programming in C on Linux. Following the zlib usage example on the official zlib website (http://www.zlib.net/zlib_how.html), I wrote a compression program. Note that my output format is gzip, which means I call deflateInit2() instead of deflateInit().

According to the zlib website, "CHUNK is simply the buffer size for feeding data to and pulling data from the zlib routines. Larger buffer sizes would be more efficient, especially for inflate(). If the memory is available, buffers sizes on the order of 128K or 256K bytes should be used." So I assumed that the larger the CHUNK, the smaller the compressed file and the faster the compression.

But when I tested my program, I found that the compressed file size is the same whether CHUNK is 16384 or 1 (16384 is the value used in the official zlib example). The only difference is that compression is much slower when CHUNK is 1.

This result confuses me. I expected compression to be ineffective when CHUNK is 1: in this routine, each input CHUNK is processed and written directly to the compressed file, and I do not think 1 byte of data can be compressed.

So my question is: why does the CHUNK size affect only the compression speed, and not the compression ratio?

Here is my program:

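#include <assert.h>   /* assert */
#include <stdio.h>    /* FILE, fread, fwrite, feof, ferror */
#include <zlib.h>     /* z_stream, deflateInit2, deflate, deflateEnd */

#define CHUNK 16384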
int def(FILE *source, FILE *dest, int level, int memLevel)
{
    int ret, flush;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];

    /* allocate deflate state */
    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    ret = deflateInit2(&strm, level, Z_DEFLATED, MAX_WBITS + 16, memLevel, Z_DEFAULT_STRATEGY);
    if (ret != Z_OK)
        return ret;

    /* compress until end of file */
    do {
        strm.avail_in = fread(in, 1, CHUNK, source);
        if (ferror(source)) {
            (void)deflateEnd(&strm);
            return Z_ERRNO;
        }
        flush = feof(source) ? Z_FINISH : Z_NO_FLUSH;
        strm.next_in = in;

        /* run deflate() on input until output buffer not full, finish
           compression if all of source has been read in */
        do {
            strm.avail_out = CHUNK;
            strm.next_out = out;
            ret = deflate(&strm, flush);    /* no bad return value */
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
            have = CHUNK - strm.avail_out;
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)deflateEnd(&strm);
                return Z_ERRNO;
            }
        } while (strm.avail_out == 0);
        assert(strm.avail_in == 0);     /* all input will be used */

        /* done when last data in file processed */
    } while (flush != Z_FINISH);
    assert(ret == Z_STREAM_END);        /* stream will be complete */

    /* clean up and return */
    (void)deflateEnd(&strm);
    return Z_OK;
}

Recommended answer

Because deflate buffers the data internally for compression. Regardless of how you feed the data to deflate, it accumulates and compresses bytes until it has enough to emit a deflate block.

You are correct that you cannot compress a single byte. If you would like to see how true that is, change flush from Z_NO_FLUSH to Z_FULL_FLUSH and feed deflate one byte at a time. Then deflate will indeed attempt to compress each byte of input separately.
