Uncompressed file size using zlib's gzip file access function


Question

Using the Linux command-line tool gzip, I can get the uncompressed size of a compressed file with gzip -l.

I couldn't find any function like that in the zlib manual section "gzip File Access Functions".

At this link, http://www.abeel.be/content/determine-uncompressed-size-gzip-file, I found a solution that involves reading the last 4 bytes of the file, but I am avoiding it for now because I would prefer to use the library's functions.

Recommended answer

There is no reliable way to get the uncompressed size of a gzip file without decompressing it, or at least decoding the whole thing. There are three reasons.

First, the only information about the uncompressed length is four bytes at the end of the gzip file (stored in little-endian order). By necessity, that is the length modulo 2^32. So if the uncompressed length is 4 GB or more, you won't know what the length is. You can only be certain that the uncompressed length is less than 4 GB if the compressed length is less than something like 2^32 / 1032 + 18, or around 4 MB. (1032 is the maximum compression factor of deflate.)
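As a rough illustration of the first point, here is a minimal Python sketch that reads the four-byte little-endian trailer from an in-memory gzip stream and works out the "around 4 MB" bound. The helper name trailer_isize and the sample payload are made up for this example; Python's standard gzip and struct modules stand in for whatever tooling you actually use.

```python
import gzip
import struct


def trailer_isize(data: bytes) -> int:
    """Return the last four bytes of a gzip stream as a little-endian integer."""
    (isize,) = struct.unpack("<I", data[-4:])
    return isize


# Hypothetical sample data: a single, well-formed gzip stream.
payload = b"hello, gzip" * 1000
compressed = gzip.compress(payload)

# For one small stream the trailer matches the real length (modulo 2**32).
reported = trailer_isize(compressed)
assert reported == len(payload) % 2**32

# Bound quoted in the answer: a compressed file smaller than this cannot
# expand to 4 GB or more, given a maximum deflate ratio of 1032.
safe_limit = 2**32 // 1032 + 18  # 4161808 bytes, roughly 4 MB
```

The trailer is only trustworthy here because the input is known to be one small stream with nothing after it, which is exactly the knowledge you lack for an arbitrary file.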

Second, and this is worse, a gzip file may actually be a concatenation of multiple gzip streams. Other than decoding, there is no way to find where each gzip stream ends in order to look at the four-byte uncompressed length of that piece. (Which may be wrong anyway due to the first reason.)
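The concatenation problem can be reproduced in a few lines. The sketch below, again in Python and with hypothetical names (member_sizes, two_members), joins two gzip streams and shows that the final four bytes describe only the last one. Finding the boundary between the streams requires decoding, which is done here with zlib.decompressobj in gzip mode.

```python
import gzip
import struct
import zlib


def member_sizes(data: bytes) -> list:
    """Decode each gzip member in turn and return its uncompressed size.

    Assumes the input holds only well-formed gzip members with no
    trailing junk; anything else raises zlib.error.
    """
    sizes = []
    while data:
        decoder = zlib.decompressobj(31)  # 31 = 16 + 15: expect a gzip wrapper
        out = decoder.decompress(data) + decoder.flush()
        sizes.append(len(out))
        data = decoder.unused_data  # bytes that follow this member
    return sizes


# Hypothetical file contents: two independent gzip streams back to back.
two_members = gzip.compress(b"a" * 300) + gzip.compress(b"b" * 50)

(trailer,) = struct.unpack("<I", two_members[-4:])  # length of last member only
sizes = member_sizes(two_members)
total = sum(sizes)
```

In this example the trailer reports 50 while the data actually expands to 350 bytes, so a tool that trusts the last four bytes under-reports the size.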

Third, gzip files will sometimes have junk after the end of the gzip stream (usually zeros). Then the last four bytes are not the length.
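A small Python sketch of the trailing-junk case follows. The eight zero bytes of padding are an assumption chosen for illustration; real files may carry a different amount or kind of junk.

```python
import gzip
import struct
import zlib

payload = b"x" * 1000
# Hypothetical padding: eight zero bytes appended after the gzip stream.
padded = gzip.compress(payload) + b"\x00" * 8

# Reading the last four bytes now picks up padding, not the length field.
(last_four,) = struct.unpack("<I", padded[-4:])

# Decoding still finds the true end of the stream and the real length.
decoder = zlib.decompressobj(31)  # 31 = 16 + 15: expect a gzip wrapper
decoded = decoder.decompress(padded) + decoder.flush()
junk = decoder.unused_data  # whatever follows the end of the stream
```

Here last_four comes out as 0 even though the stream holds 1000 bytes, and the padding only becomes visible as leftover data once the stream has been decoded to its end.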

So gzip -l doesn't really work anyway. As a result, there is no point in providing that function in zlib.

pigz has an option to actually decode the entire input in order to get the real uncompressed length: pigz -lt, which guarantees the right answer. pigz -l does what gzip -l does, which may be wrong.
