Parsing zlib header

Question

I spent a few days reading the zlib (and gzip and deflate) RFCs, and I can say they are kind of rubbish. Quite a few details are missing, so I'm opening this question.

I'm trying to parse zlib data and I need to know some details about the header.

First of all, RFC says there will be 2 bytes, CMF and FLG.

CMF is divided into two 4-bit sections. The first one is CM and the second one is CINFO.

What are the possible values of CM? RFC says that 8 means deflate and that 15 is reserved, but what about the rest of the possible values?

CINFO, on the other hand, should always be 8, if I understand the RFC correctly (please correct me if I'm wrong).

Skipping FLG and the possible FDICT, we get to the Compressed data section. This part of the RFC says:

For compression method 8, the compressed data is stored in the
deflate compressed data format as described in the document
"DEFLATE Compressed Data Format Specification" by L. Peter
Deutsch. (See reference [3] in Chapter 3, below)

What does this mean? Should I assume that CM will always be 8? If yes, then why does the entire CM thing exist?

Last, I'm a bit confused. I always believed zlib could wrap both deflate and gzip, but reading this RFC I can't see where gzip compressed data fits in. Is there anything that I'm missing here?

Recommended answer

What are the possible values of CM? RFC says that 8 means deflate and that 15 is reserved, but what about the rest of the possible values?

...

Should I assume that CM will always be 8? If yes, then why does the entire CM thing exist?

CM is there for future use and to allow other (non-standard) compression methods:

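Other compressed data formats are not specified in this version of the zlib specification. (RFC 1950, "ZLIB Compressed Data Format Specification version 3.3")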

You should NOT assume that it's always 8. Instead, you should check it and, if it's not 8, throw a "not supported" error.

CINFO, on the other hand, should always be 8, if I understand the RFC correctly (please correct me if I'm wrong).

No, the meaning of CINFO depends on CM. If CM is 8 (the only meaningful standardized value), then:

CINFO is the base-2 logarithm of the LZ77 window size, minus eight (CINFO=7 indicates a 32K window size). Values of CINFO above 7 are not allowed in this version of the specification. (RFC 1950, "ZLIB Compressed Data Format Specification version 3.3")
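
As a rough illustration of the rule quoted above, the two nibbles of CMF could be split and checked along these lines. This is only a sketch: `zlib_cmf` and `parse_cmf` are hypothetical names made up for this example, not part of zlib's API, and it assumes CM sits in the low four bits of CMF and CINFO in the high four bits, as RFC 1950 lays them out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type for this example; not part of zlib. */
typedef struct {
    unsigned cm;          /* compression method: low 4 bits of CMF */
    unsigned cinfo;       /* compression info: high 4 bits of CMF */
    uint32_t window_size; /* LZ77 window in bytes, set only on success */
} zlib_cmf;

/* Splits CMF into CM and CINFO.
 * Returns 0 on success, -1 if the method or window size is not supported. */
int parse_cmf(uint8_t cmf, zlib_cmf *out)
{
    assert(out != NULL);

    out->cm = cmf & 0x0Fu;
    out->cinfo = (cmf >> 4) & 0x0Fu;
    out->window_size = 0;

    if (out->cm != 8)
        return -1; /* only "deflate" (CM = 8) is specified by RFC 1950 */
    if (out->cinfo > 7)
        return -1; /* CINFO above 7 is not allowed for CM = 8 */

    /* CINFO = log2(window size) - 8, so window size = 2^(CINFO + 8). */
    out->window_size = (uint32_t)1 << (out->cinfo + 8);
    return 0;
}
```

For the very common first byte 0x78, this gives CM = 8 and CINFO = 7, that is, a 32K window.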

So, in fact, CINFO cannot be 8.

Skipping FLG and the possible FDICT, we get to the Compressed data section. This part of the RFC says:

For compression method 8, the compressed data is stored in the
deflate compressed data format as described in the document
"DEFLATE Compressed Data Format Specification" by L. Peter
Deutsch. (See reference [3] in Chapter 3, below)

What does this mean?

It means that the details of the DEFLATE encoding are not specified in this standard, but are described elsewhere, at ftp://ftp.uu.net/pub/archiving/zip/zlib/.

If you prefer, DEFLATE has its own RFC, that is RFC 1951, "DEFLATE Compressed Data Format Specification version 1.3".
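
To reach the DEFLATE stream that RFC 1951 describes, a parser first has to step over the zlib header. A minimal sketch of that step follows. `zlib_data_offset` is a hypothetical helper written for this example. It relies on two details from RFC 1950: CMF*256 + FLG must be a multiple of 31 (the FCHECK bits exist to make that true), and when the FDICT bit (bit 5 of FLG) is set, a 4-byte dictionary identifier follows the two header bytes. The CM check is deliberately left out here and would be done separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for this example; not part of zlib.
 * Returns the offset of the first DEFLATE byte in buf, or -1 if the
 * header is truncated or fails the FCHECK test. */
long zlib_data_offset(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);

    if (len < 2)
        return -1; /* CMF and FLG are both required */

    unsigned cmf = buf[0];
    unsigned flg = buf[1];

    /* FCHECK: CMF and FLG, read as a 16-bit big-endian value,
     * must be a multiple of 31. */
    if ((cmf * 256u + flg) % 31u != 0)
        return -1;

    size_t offset = 2;
    if (flg & 0x20u)
        offset += 4; /* FDICT set: skip the 4-byte DICTID */

    if (len < offset)
        return -1; /* header claims a DICTID that is not there */

    return (long)offset;
}
```

With the common header bytes 0x78 0x9C, the check passes and the compressed data starts at offset 2.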

Last, I'm a bit confused. I always believed zlib could wrap both deflate and gzip, but reading this RFC I can't see where gzip compressed data fits in. Is there anything that I'm missing here?

No, zlib can't wrap gzip. gzip and zlib are different wrappers for deflate data (as is the zip format, the PNG format, the PDF format, etc.)
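
Because the two wrappers start differently, the first two bytes are usually enough to tell them apart. The sketch below is a heuristic, not a definitive test. `wrapper_kind` and `guess_wrapper` are hypothetical names for this example. It assumes the gzip magic bytes 0x1F 0x8B from RFC 1952 and the CM, CINFO, and FCHECK rules from RFC 1950. Raw DEFLATE data has no header at all, so it falls into the "unknown" case (and could, by chance, begin with bytes that look like a zlib header).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for this example; not part of zlib. */
typedef enum { WRAP_UNKNOWN, WRAP_ZLIB, WRAP_GZIP } wrapper_kind;

/* Guesses the wrapper format from the first two bytes of buf. */
wrapper_kind guess_wrapper(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);

    if (len < 2)
        return WRAP_UNKNOWN;

    /* gzip members start with the magic bytes 0x1F 0x8B. */
    if (buf[0] == 0x1F && buf[1] == 0x8B)
        return WRAP_GZIP;

    /* zlib: CM must be 8, CINFO at most 7, and CMF*256 + FLG
     * must be a multiple of 31. */
    unsigned cm = buf[0] & 0x0Fu;
    unsigned cinfo = buf[0] >> 4;
    unsigned check = ((unsigned)buf[0] << 8) | buf[1];
    if (cm == 8 && cinfo <= 7 && check % 31u == 0)
        return WRAP_ZLIB;

    return WRAP_UNKNOWN;
}
```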

Gzip uses DEFLATE:

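The format presently uses the DEFLATE method of compression but can be easily extended to use other compression methods. (RFC 1952, "GZIP file format specification version 4.3")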

CM = 8 denotes the "deflate" compression method with a window size up to 32K. This is the method used by gzip and PNG (RFC 1950, "ZLIB Compressed Data Format Specification version 3.3")

If you find the RFC unclear or difficult to understand, consider looking into the source code for an implementation of zlib. While some implementations may be non-standard, looking at the source may help you solve some of your doubts.

Here's an excerpt from the source code of zlib from zlib.net that answers one of your questions:

#define Z_DEFLATED   8
/* ... */
if (BITS(4) != Z_DEFLATED) { 
    strm->msg = (char *)"unknown compression method";
    state->mode = BAD;
    break;
}
