Parsing zlib header
Question
I spent a few days reading the zlib (and gzip and deflate) RFCs, and I can say they are kind of rubbish. Quite a few details are missing, so I'm opening this question.
I'm trying to parse zlib data and I need to know some details about the header.
First of all, the RFC says there will be 2 bytes, CMF and FLG.
CMF is divided into two 4-bit sections. The first one is CM and the second one is CINFO.
What are the possible values of CM? The RFC says that 8 means deflate and that 15 is reserved, but what about the rest of the possible values?
CINFO, on the other hand, should always be 8, if I understand the RFC correctly (please correct me if I'm wrong).
Skipping FLG and the possible FDICT, we get to the Compressed data section. This part of the RFC says:
For compression method 8, the compressed data is stored in the
deflate compressed data format as described in the document
"DEFLATE Compressed Data Format Specification" by L. Peter
Deutsch. (See reference [3] in Chapter 3, below)
What does this mean? Should I assume that CM will always be 8? If yes, then why does the entire CM thing exist?
Last, I'm a bit confused. I always believed zlib could wrap both deflate and gzip, but reading this RFC I can't see where gzip compressed data fits in here. Is there anything that I'm missing about this?
Recommended answer
What are the possible values of CM? The RFC says that 8 means deflate and that 15 is reserved, but what about the rest of the possible values?
...
Should I assume that CM will always be 8? If yes, then why does the entire CM thing exist?
CM is there for future use and to allow other (non-standard) compression methods:
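Other compressed data formats are not specified in this version of the zlib specification. (RFC 1950, "ZLIB Compressed Data Format Specification version 3.3")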
You should NOT assume that it's always 8. Instead, you should check it and, if it's not 8, throw a "not supported" error.
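The check described above could look roughly like the following. This is a minimal sketch, not zlib's own code: the `zhdr_*` names and the status enum are made up for illustration, and the only facts relied on are that CM occupies the low 4 bits of CMF and that 8 is the only standardized value.

```c
#include <stdint.h>

/* Hypothetical result codes for this sketch; they are not part of zlib's API. */
enum zhdr_status { ZHDR_OK = 0, ZHDR_UNSUPPORTED_METHOD = -1 };

/* CM is stored in bits 0-3 (the low nibble) of CMF, the first header byte. */
static unsigned zhdr_cm(uint8_t cmf)
{
    return cmf & 0x0Fu;
}

/* Accept only CM = 8 (deflate); report every other value as unsupported
 * instead of assuming the stream is deflate. */
static enum zhdr_status zhdr_check_method(uint8_t cmf)
{
    return zhdr_cm(cmf) == 8 ? ZHDR_OK : ZHDR_UNSUPPORTED_METHOD;
}
```

With the common first byte 0x78, `zhdr_cm` yields 8 and the check passes; a byte such as 0x7F (CM = 15, reserved) is rejected.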
CINFO, on the other hand, should always be 8, if I understand the RFC correctly (please correct me if I'm wrong).
No, the meaning of CINFO depends on CM. If CM is 8 (the only meaningful standardized value), then:
CINFO is the base-2 logarithm of the LZ77 window size, minus eight (CINFO=7 indicates a 32K window size). Values of CINFO above 7 are not allowed in this version of the specification. (RFC 1950, "ZLIB Compressed Data Format Specification version 3.3")
In fact, CINFO cannot be 8.
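To make the CINFO rule concrete, here is a small sketch that turns CMF into a window size, assuming CM has already been checked to be 8. The function names are hypothetical; the arithmetic follows directly from the RFC 1950 wording (window size = 2^(CINFO + 8), CINFO at most 7).

```c
#include <stdint.h>

/* CINFO is stored in bits 4-7 (the high nibble) of CMF. */
static unsigned zhdr_cinfo(uint8_t cmf)
{
    return (cmf >> 4) & 0x0Fu;
}

/* Returns the LZ77 window size in bytes for a CM = 8 header,
 * or 0 when CINFO is above 7, which RFC 1950 does not allow. */
static uint32_t zhdr_window_size(uint8_t cmf)
{
    unsigned cinfo = zhdr_cinfo(cmf);
    if (cinfo > 7) {
        return 0;
    }
    return (uint32_t)1 << (cinfo + 8);
}
```

For CMF = 0x78 this gives CINFO = 7 and a 32768-byte window; CMF = 0x88 would mean CINFO = 8 and is rejected.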
Skipping FLG and the possible FDICT, we get to the Compressed data section. This part of the RFC says:
For compression method 8, the compressed data is stored in the
deflate compressed data format as described in the document
"DEFLATE Compressed Data Format Specification" by L. Peter
Deutsch. (See reference [3] in Chapter 3, below)
What does this mean?
It means that the details of the DEFLATE encoding are not specified in this standard, but are described elsewhere, at ftp://ftp.uu.net/pub/archiving/zip/zlib/.
If you prefer, DEFLATE has its own RFC, that is RFC 1951, "DEFLATE Compressed Data Format Specification version 1.3".
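For a parser, the practical question is where the DEFLATE stream starts. The sketch below is one way to find it. It relies on details taken from RFC 1950 rather than from the discussion above: FDICT is bit 5 of FLG, a set FDICT is followed by a 4-byte DICTID, and CMF*256 + FLG must be a multiple of 31. The function name and the "0 means invalid" convention are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the offset of the first DEFLATE byte inside a zlib stream,
 * or 0 if the header is too short, uses an unsupported method, or
 * fails the FCHECK test. (Hypothetical helper, not a zlib function.) */
static size_t zhdr_payload_offset(const uint8_t *buf, size_t len)
{
    if (len < 2) {
        return 0;
    }
    uint8_t cmf = buf[0];
    uint8_t flg = buf[1];

    if ((cmf & 0x0Fu) != 8) {
        return 0;                     /* only CM = 8 (deflate) is handled */
    }
    if (((((unsigned)cmf) << 8) | flg) % 31u != 0) {
        return 0;                     /* FCHECK: CMF*256 + FLG must divide by 31 */
    }

    size_t offset = 2;                /* CMF + FLG */
    if (flg & 0x20u) {
        offset += 4;                  /* FDICT set: skip the 4-byte DICTID */
    }
    return offset <= len ? offset : 0;
}
```

Everything from the returned offset up to the last 4 bytes (the Adler-32 checksum) is raw DEFLATE data as described in RFC 1951.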
Last, I'm a bit confused. I always believed zlib could wrap both deflate and gzip, but reading this RFC I can't see where gzip compressed data fits in here. Is there anything that I'm missing about this?
No, zlib can't wrap gzip. gzip and zlib are different wrappers for deflate data (as is the zip format, the PNG format, the PDF format, etc.)
Gzip uses DEFLATE:
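The format presently uses the DEFLATE method of compression but can be easily extended to use other compression methods. (RFC 1952, "GZIP file format specification version 4.3")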
CM = 8 denotes the "deflate" compression method with a window size up to 32K. This is the method used by gzip and PNG. (RFC 1950, "ZLIB Compressed Data Format Specification version 3.3")
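Because gzip and zlib are separate wrappers, a parser can usually tell them apart from the first two bytes. The following is a rough sketch under stated assumptions: the gzip magic bytes 0x1F 0x8B come from RFC 1952, the zlib test reuses the CM, CINFO and FCHECK rules from RFC 1950, and the enum and function names are invented for this example. A raw DEFLATE stream has no header at all, so it ends up in the "unknown" bucket here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification used only in this sketch. */
enum container_kind { CONTAINER_UNKNOWN, CONTAINER_ZLIB, CONTAINER_GZIP };

static enum container_kind guess_container(const uint8_t *buf, size_t len)
{
    if (len < 2) {
        return CONTAINER_UNKNOWN;
    }
    /* gzip members start with the fixed ID bytes 0x1F 0x8B. */
    if (buf[0] == 0x1F && buf[1] == 0x8B) {
        return CONTAINER_GZIP;
    }
    /* zlib: CM = 8, CINFO <= 7, and CMF*256 + FLG divisible by 31. */
    unsigned cm = buf[0] & 0x0Fu;
    unsigned cinfo = (buf[0] >> 4) & 0x0Fu;
    unsigned check = (((unsigned)buf[0]) << 8) | buf[1];
    if (cm == 8 && cinfo <= 7 && check % 31u == 0) {
        return CONTAINER_ZLIB;
    }
    return CONTAINER_UNKNOWN;
}
```

The two tests cannot both match: a first byte of 0x1F has CM = 15, so a gzip stream never passes the zlib check.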
If you find the RFC unclear or difficult to understand, consider looking into the source code for an implementation of zlib. While some implementations may be non-standard, looking at the source may help you solve some of your doubts.
Here's an excerpt from the source code of zlib from zlib.net that answers one of your questions:
#define Z_DEFLATED 8
/* ... */
if (BITS(4) != Z_DEFLATED) {
    strm->msg = (char *)"unknown compression method";
    state->mode = BAD;
    break;
}