What is the best compression algorithm for small 4 KB files?

Question

I am trying to compress TCP packets, each about 4 KB in size. The packets can contain any byte value (from 0 to 255). All of the compression benchmarks I found were based on larger files. I did not find anything that compares the compression ratio of different algorithms on small files, which is what I need. The algorithm must be open source so that it can be implemented in C++, so RAR, for example, is out. What algorithm can be recommended for small files of about 4 kilobytes in size? LZMA? HACC? ZIP? gzip? bzip2?

Recommended answer

Choose the algorithm that is the quickest, since you probably care about doing this in real time. Generally for smaller blocks of data, the algorithms compress about the same (give or take a few bytes) mostly because the algorithms need to transmit the dictionary or Huffman trees in addition to the payload.
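One way to check this claim against your own traffic is to measure it directly. The following is a minimal sketch in Python, whose standard-library `zlib`, `bz2` and `lzma` modules wrap the same underlying C libraries. The payload is a made-up, highly repetitive 4 KB block used only for illustration; real packets will give different numbers, so substitute captured data before drawing conclusions.

```python
import bz2
import lzma
import zlib

# Hypothetical 4 KB payload (an assumption for illustration only):
# a repeated HTTP-like request, truncated to exactly 4096 bytes.
payload = (b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n" * 100)[:4096]

# Each entry pairs a one-shot compress function with its decompressor.
codecs = {
    "zlib (Deflate)": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma (xz)": (lzma.compress, lzma.decompress),
}

sizes = {}
for name, (compress, decompress) in codecs.items():
    packed = compress(payload)
    # Verify the round trip before trusting the size.
    assert decompress(packed) == payload
    sizes[name] = len(packed)
    print(f"{name:15s} {len(payload)} -> {len(packed)} bytes")
```

The sizes include each format's container overhead (headers and checksums), which matters more at 4 KB than it does on the large files used in most benchmarks.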

I highly recommend Deflate (used by zlib and Zip) for a number of reasons. The algorithm is quite fast and well tested, zlib is distributed under a permissive, BSD-like license (the zlib license), and Deflate is the only compression method that Zip implementations are required to support (as per the Zip APPNOTE). Beyond the basics, when the compressor determines that the compressed output would be larger than the uncompressed input, it can fall back to a STORE mode, which adds only 5 bytes per block of data (a stored block holds at most 65,535 bytes). Aside from the STORE mode, Deflate supports two different types of Huffman tables (or dictionaries): dynamic and fixed. A dynamic table means the Huffman tree is transmitted as part of the compressed data; it is the most flexible option (for varying types of nonrandom data). The advantage of a fixed table is that the table is known by all decoders and thus doesn't need to be contained in the compressed stream. The decompression (or Inflate) code is relatively easy to implement. I've written both Java and JavaScript versions based directly on zlib, and they perform rather well.
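The STORE fallback and the fixed-table option can be observed with a short experiment. This is a sketch under stated assumptions: it uses Python's `zlib` module, a hypothetical helper named `raw_deflate`, and made-up sample data. Level 0 asks zlib to emit stored blocks, and the `Z_FIXED` strategy asks it to use only the fixed Huffman tables; the default strategy lets zlib choose the block type itself.

```python
import os
import zlib

# Z_FIXED is zlib's "fixed Huffman codes only" strategy (value 4 in zlib.h).
# The getattr fallback is an assumption for Python builds that lack the constant.
Z_FIXED = getattr(zlib, "Z_FIXED", 4)

def raw_deflate(data, level=9, strategy=zlib.Z_DEFAULT_STRATEGY):
    """Hypothetical helper: compress to a raw Deflate stream (no zlib/gzip wrapper)."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15, zlib.DEF_MEM_LEVEL, strategy)
    return comp.compress(data) + comp.flush()

# Incompressible sample: 4 KB of random bytes, stored without compression.
random_block = os.urandom(4096)
stored = raw_deflate(random_block, level=0)
stored_overhead = len(stored) - len(random_block)
print("stored-mode overhead in bytes:", stored_overhead)

# Compressible sample (made up): repeated key/value text, 4 KB.
text_block = (b"sensor=42;status=ok;" * 300)[:4096]
auto = raw_deflate(text_block)                     # zlib picks the block type
fixed = raw_deflate(text_block, strategy=Z_FIXED)  # fixed Huffman tables only
print("default strategy:", len(auto), "bytes; fixed tables:", len(fixed), "bytes")
```

Running both strategies on representative packets shows whether transmitting a dynamic table pays for itself at this block size.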

The other compression algorithms mentioned have their merits. I prefer Deflate because of its runtime performance in both the compression step and, particularly, the decompression step.

A point of clarification: Zip is not a compression type, it is a container. For doing packet compression, I would bypass Zip and just use the deflate/inflate APIs provided by zlib.
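As a rough illustration of bypassing the container, the sketch below compresses and decompresses one packet as a raw Deflate stream using Python's `zlib` bindings. The function names `compress_packet` and `decompress_packet` are hypothetical. A negative window-bits value tells zlib to omit its own header and checksum; in C or C++ the corresponding calls would be `deflateInit2`/`deflate`/`deflateEnd` and `inflateInit2`/`inflate`/`inflateEnd` with the same negative `windowBits`.

```python
import zlib

def compress_packet(payload: bytes, level: int = 6) -> bytes:
    """Compress one packet as a raw Deflate stream (no Zip, gzip, or zlib framing)."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
    return comp.compress(payload) + comp.flush()

def decompress_packet(blob: bytes) -> bytes:
    """Inflate a raw Deflate stream produced by compress_packet."""
    decomp = zlib.decompressobj(-zlib.MAX_WBITS)
    return decomp.decompress(blob) + decomp.flush()

# Example packet (made up): every byte value 0..255, repeated to 4 KB.
packet = bytes(range(256)) * 16
wire = compress_packet(packet)
assert decompress_packet(wire) == packet
print(len(packet), "->", len(wire), "bytes on the wire")
```

A raw stream carries no checksum of its own, which is usually acceptable here because TCP already protects the data in transit; keep the zlib wrapper (positive window bits) if you want the extra Adler-32 check.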
