Compressed output differs between the Go and Ruby implementations

Problem description

I'm implementing a program that deflates a file into a Git blob and stores it appropriately.

I have a Ruby reference implementation that's based on an article from the Git book.

I'm attempting to implement the same thing in Go.
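For context, the blob storage steps described in the Git book can be sketched in Go roughly as follows. This is not the asker's code: the helper names `blobStore`, `blobHash`, and `deflateBlob` are hypothetical, and the sample content is only an illustration.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"crypto/sha1"
	"fmt"
)

// blobStore returns the bytes Git hashes and compresses for a blob:
// the header "blob <size>\x00" followed by the raw content.
// (Hypothetical helper name, for illustration only.)
func blobStore(content []byte) []byte {
	header := fmt.Sprintf("blob %d\x00", len(content))
	return append([]byte(header), content...)
}

// blobHash returns the hex-encoded SHA-1 of the header plus content.
func blobHash(content []byte) string {
	return fmt.Sprintf("%x", sha1.Sum(blobStore(content)))
}

// deflateBlob zlib-compresses the header plus content with Go's
// default compression level.
func deflateBlob(content []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	if _, err := w.Write(blobStore(content)); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	content := []byte("what is up, doc?")

	compressed, err := deflateBlob(content)
	if err != nil {
		panic(err)
	}

	// Git stores the object under objects/<first 2 hex chars>/<remaining 38>.
	hash := blobHash(content)
	fmt.Printf("object path: .git/objects/%s/%s\n", hash[:2], hash[2:])
	fmt.Printf("compressed size: %d bytes\n", len(compressed))
}
```

The object name depends only on the uncompressed header and content, so it is the same regardless of which zlib implementation produced the stored bytes.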

However, I'm running into an issue where the stored compressed data differs slightly between the two implementations.

vbindiff shows that the first two bytes are identical (as run from this test script), if I'm reading it right. Per RFC 1950 (https://tools.ietf.org/html/rfc1950), these bytes are CMF (compression method and info) and FLG (flags), respectively. The difference begins at the third byte, which is either the dictionary ID or the start of the compressed data. The data then remains similar until near the end of the file. I'm assuming this is probably the difference in the ADLER32 checksum.
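The two header bytes can be decoded directly to see whether a dictionary ID follows. The sketch below follows the field layout in RFC 1950; the `zlibHeader` type and `parseZlibHeader` function are hypothetical names introduced for illustration, not part of any library.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
)

// zlibHeader holds the fields of the two-byte zlib header (RFC 1950).
// (Hypothetical type, for illustration only.)
type zlibHeader struct {
	Method     byte // CM: low 4 bits of CMF; 8 means deflate
	WindowBits int  // CINFO + 8: base-2 log of the LZ77 window size
	Level      byte // FLEVEL: top 2 bits of FLG, informational only
	HasDict    bool // FDICT: bit 5 of FLG; a 4-byte DICTID follows if set
	CheckOK    bool // CMF*256 + FLG must be a multiple of 31
}

// parseZlibHeader decodes the first two bytes of a zlib stream.
func parseZlibHeader(data []byte) (zlibHeader, error) {
	if len(data) < 2 {
		return zlibHeader{}, fmt.Errorf("need at least 2 bytes, got %d", len(data))
	}
	cmf, flg := data[0], data[1]
	check := uint16(cmf)<<8 | uint16(flg)
	return zlibHeader{
		Method:     cmf & 0x0f,
		WindowBits: int(cmf>>4) + 8,
		Level:      flg >> 6,
		HasDict:    flg&0x20 != 0,
		CheckOK:    check%31 == 0,
	}, nil
}

func main() {
	// Compress a small sample with Go's default settings.
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	if _, err := w.Write([]byte("blob 16\x00what is up, doc?")); err != nil {
		panic(err)
	}
	if err := w.Close(); err != nil {
		panic(err)
	}

	h, err := parseZlibHeader(buf.Bytes())
	if err != nil {
		panic(err)
	}
	fmt.Printf("header bytes: % x\n", buf.Bytes()[:2])
	fmt.Printf("%+v\n", h)
}
```

If `HasDict` is false, the third byte is already part of the deflate stream, so a difference there reflects the compressor's choices rather than a dictionary ID.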

It seems that neither the Go nor the Ruby implementation of zlib passes a dictionary to zlib by default (as per the Go zlib source and the Ruby zlib source).

The data appears identical.

I'm not sure if there's an implementation error in the libraries or if I'm just missing something.

Why are these outputs different?

Solution

The deflate algorithm defined in RFC 1951 (which is used in the zlib format defined by RFC 1950 and also in the gzip format defined by RFC 1952) leaves room for variation between implementations, so different compressors can produce different output for the same input. All of these outputs still decompress to the same value. This freedom allows a tradeoff between compression time and compression ratio, and it is what makes programs like zopfli possible, which achieve better compression than the original zlib library at the cost of significantly longer compression time.

Go uses its own implementation of the deflate algorithm, written in Go, while Ruby uses the zlib library. This is why your examples produce different compressed output for the same input. But if you take the output of either the Go or the Ruby program and decompress it (with Ruby, Go, or any other standard-conforming implementation), you will get exactly the same value.
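One way to convince yourself of this without involving Ruby at all is to compress the same input at several levels in Go and compare the results. This is only a sketch that uses different compression levels as a stand-in for different implementations; the `compress` and `decompress` helpers and the sample input are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// compress returns data as a zlib stream produced at the given level.
func compress(data []byte, level int) ([]byte, error) {
	var buf bytes.Buffer
	w, err := zlib.NewWriterLevel(&buf, level)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress inflates a zlib stream back into the original bytes.
func decompress(data []byte) ([]byte, error) {
	r, err := zlib.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	input := bytes.Repeat([]byte("what is up, doc? "), 64)

	levels := []int{zlib.NoCompression, zlib.BestSpeed, zlib.BestCompression}
	for _, level := range levels {
		out, err := compress(input, level)
		if err != nil {
			panic(err)
		}
		back, err := decompress(out)
		if err != nil {
			panic(err)
		}
		// The last 4 bytes of a zlib stream are the Adler-32 checksum of the
		// uncompressed data, so they depend only on the input.
		fmt.Printf("level %d: %d bytes, trailer % x, round trip ok: %v\n",
			level, len(out), out[len(out)-4:], bytes.Equal(back, input))
	}
}
```

Because the Adler-32 trailer is computed over the uncompressed data, it should match between any two correct compressors given the same input; differences near the end of the file are therefore more likely to come from the deflate stream itself than from the checksum.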
