Fast compression in Java?

Question

Is there a very fast compression library for Java? The standard gzip library is slower than I would like. I'm looking for something similar to http://www.oberhumer.com/opensource/lzo/ that's native Java code that provides fast compression and decompression. Thanks!

A few other fast compression libraries for future reference:

QuickLZ - C/C#/Java - GPL or commercial http://www.quicklz.com/

libLZF - C - BSD style license http://oldhome.schmorp.de/marc/liblzf.html

FastLZ - C - MIT style license http://fastlz.org/

LZO - C - GPL or commercial http://www.oberhumer.com/opensource/lzo/

zlib - C / Java (GZIP and deflate) - Commercial friendly license http://zlib.net/

Hadoop-LZO integration (JNI): http://github.com/kevinweil/hadoop-lzo

Snappy-Java (JNI): https://github.com/xerial/snappy-java

Benchmarks from the QuickLZ folks: http://www.quicklz.com/bench.html

Recommended answer

You could use DeflaterOutputStream and InflaterInputStream from java.util.zip. Both use the DEFLATE algorithm (LZ77 plus Huffman coding, not LZW). They ship with the JDK, so you need no additional library.
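As a minimal sketch of that suggestion, the following round trip compresses a byte array with DeflaterOutputStream and restores it with InflaterInputStream. The class name DeflateRoundTrip, the helper names compress and decompress, and the sample CSV-like data are illustrative assumptions, not part of the original answer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class DeflateRoundTrip {

    /** Compresses data with DEFLATE at the BEST_SPEED level (hypothetical helper). */
    public static byte[] compress(byte[] data) throws IOException {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink, deflater)) {
            out.write(data);
        } finally {
            // The stream does not end a Deflater it did not create itself.
            deflater.end();
        }
        return sink.toByteArray();
    }

    /** Restores the original bytes from compress() output (hypothetical helper). */
    public static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                sink.write(buffer, 0, n);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Build some repetitive sample data (an assumption for the demo).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append(i).append(",alpha,beta,gamma\n");
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);

        byte[] packed = compress(original);
        byte[] restored = decompress(packed);

        System.out.println("original bytes:   " + original.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip ok:    " + Arrays.equals(original, restored));
    }
}
```

If you need the gzip container rather than the raw zlib stream, GZIPOutputStream and GZIPInputStream from the same package follow the same pattern.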

EDIT: Real-time performance is usually measured in terms of latency, but you quote numbers in terms of throughput. Could you clarify what you mean by real-time?

For latency, using Deflater.BEST_SPEED, each call took 220 ns + 13 ns/byte on average.
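To make that figure concrete, here is a small sketch that plugs block sizes into the quoted model (220 ns fixed cost plus 13 ns per byte). The class and method names are made up for illustration, and the model itself is only the average reported above, so treat the results as rough estimates for that machine. For a 5 KB block it gives 220 + 13 × 5120 = 66,780 ns.

```java
public class LatencyEstimate {

    // Figures quoted in the answer; they are measurements from one machine,
    // not constants of the DEFLATE implementation.
    static final long FIXED_NS = 220;
    static final long PER_BYTE_NS = 13;

    /** Estimated time in nanoseconds to compress one block of blockSize bytes. */
    public static long estimateNanos(int blockSize) {
        return FIXED_NS + PER_BYTE_NS * blockSize;
    }

    public static void main(String[] args) {
        for (int size : new int[] {64, 1024, 5 * 1024}) {
            long ns = estimateNanos(size);
            // 1 byte per ns corresponds to 1000 MB/s (with 1 MB = 10^6 bytes).
            double mbPerSec = size * 1000.0 / ns;
            System.out.printf("%5d bytes: %6d ns, about %.1f MB/s%n", size, ns, mbPerSec);
        }
    }
}
```

The fixed cost matters mostly for small blocks; for larger blocks the per-byte term dominates and the implied throughput approaches 1000 / 13, or roughly 77 MB/s.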

Note: in low-latency situations you often get many times the latency you would measure when the CPU is running "hot", so you have to perform the timing in a realistic situation.

EDIT: These are the compression rates I got with Java 6 update 21:

Raw OutputStream.write() - 2485 MB/s

Deflater.NO_COMPRESSION - 99 MB/s

Deflater.BEST_SPEED - 85 MB/s

Deflater.FILTERED - 77 MB/s

Deflater.HUFFMAN_ONLY - 79 MB/s

Deflater.DEFAULT_COMPRESSION - 30 MB/s

Deflater.BEST_COMPRESSION - 14 MB/s

Note: I am not sure why the default setting is faster than the "best speed" setting. I can only assume the former has been optimised.

The output buffer size was 4 KB; you might find that a different size works better for you.
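For reference, the settings listed above are selected in two different ways in java.util.zip: NO_COMPRESSION, BEST_SPEED, DEFAULT_COMPRESSION and BEST_COMPRESSION are compression levels, while FILTERED and HUFFMAN_ONLY are strategies set with setStrategy. The sketch below shows both, together with the stream buffer size. The class name DeflaterSettings, the helper compressedSize, and the sample data are assumptions for illustration; it reports compressed sizes only and does not reproduce the timings above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

public class DeflaterSettings {

    /**
     * Compresses data with the given level, strategy and stream buffer size
     * and returns the number of compressed bytes (hypothetical helper).
     */
    public static int compressedSize(byte[] data, int level, int strategy, int bufferSize)
            throws IOException {
        Deflater deflater = new Deflater(level);
        deflater.setStrategy(strategy);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink, deflater, bufferSize)) {
            out.write(data);
        } finally {
            deflater.end(); // release native resources
        }
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        // Repetitive sample data (an assumption for the demo).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append(i % 50).append(",some,repeated,text\n");
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.UTF_8);
        int buffer = 4 * 1024; // same buffer size as in the answer; try other values

        System.out.println("input: " + data.length + " bytes");
        System.out.println("NO_COMPRESSION:      "
                + compressedSize(data, Deflater.NO_COMPRESSION, Deflater.DEFAULT_STRATEGY, buffer));
        System.out.println("BEST_SPEED:          "
                + compressedSize(data, Deflater.BEST_SPEED, Deflater.DEFAULT_STRATEGY, buffer));
        System.out.println("DEFAULT_COMPRESSION: "
                + compressedSize(data, Deflater.DEFAULT_COMPRESSION, Deflater.DEFAULT_STRATEGY, buffer));
        System.out.println("BEST_COMPRESSION:    "
                + compressedSize(data, Deflater.BEST_COMPRESSION, Deflater.DEFAULT_STRATEGY, buffer));
        System.out.println("FILTERED strategy:   "
                + compressedSize(data, Deflater.DEFAULT_COMPRESSION, Deflater.FILTERED, buffer));
        System.out.println("HUFFMAN_ONLY:        "
                + compressedSize(data, Deflater.DEFAULT_COMPRESSION, Deflater.HUFFMAN_ONLY, buffer));
    }
}
```

The buffer size changes how often compressed bytes are handed to the underlying stream, not the compressed result, so it is worth tuning separately from the level.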

EDIT: For a large CSV file, the code below prints the following. The latency is for a 5 KB block.

Average latency 48532 ns. Bandwidth 91.0 MB/s.
Average latency 52560 ns. Bandwidth 83.0 MB/s.
Average latency 47602 ns. Bandwidth 93.0 MB/s.
Average latency 51099 ns. Bandwidth 86.0 MB/s.
Average latency 47695 ns. Bandwidth 93.0 MB/s.

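import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

public class Main {
    public static void main(String... args) throws IOException {
        final String filename = args[0];
        final File file = new File(filename);
        // Read the whole file into memory so disk I/O is not part of the timing.
        byte[] bytes = new byte[(int) file.length()];
        DataInputStream dis = new DataInputStream(new FileInputStream(file));
        try {
            dis.readFully(bytes);
        } finally {
            dis.close();
        }
        // One unprinted warm-up run, then five measured runs.
        test(bytes, false);
        for (int i = 0; i < 5; i++)
            test(bytes, true);
    }

    // Assumes the file is larger than one 5 KB block; otherwise count stays 0.
    private static void test(byte[] bytes, boolean print) throws IOException {
        OutputStream out = new ByteArrayOutputStream(bytes.length);
        Deflater def = new Deflater(Deflater.BEST_SPEED);
        DeflaterOutputStream dos = new DeflaterOutputStream(out, def, 4 * 1024);
        long start = System.nanoTime();
        int count = 0;
        int size = 5 * 1024;
        // Write and flush one 5 KB block at a time; a trailing partial block is skipped.
        for (int i = 0; i < bytes.length - size; i += size, count++) {
            dos.write(bytes, i, size);
            dos.flush();
        }
        dos.close();
        long time = System.nanoTime() - start;
        def.end(); // the stream does not end a Deflater that was passed in
        long latency = time / count;
        // 1 byte per ns = 1000 MB/s.
        long bandwidth = ((long) count * size * 1000L) / time;
        if (print)
            System.out.println("Average latency " + latency + " ns. Bandwidth " + bandwidth + " MB/s.");
    }
}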
