Java 8: performance of Streams vs Collections

Question

I'm new to Java 8. I still don't know the API in depth, but I've made a small informal benchmark to compare the performance of the new Streams API vs the good old Collections.

The test consists of filtering a list of Integer and, for each even number, calculating the square root and storing it in a result List of Double.

Here is the code:

    public static void main(String[] args) {
        //Calculating square root of even numbers from 1 to N       
        int min = 1;
        int max = 1000000;

        List<Integer> sourceList = new ArrayList<>();
        for (int i = min; i < max; i++) {
            sourceList.add(i);
        }

        List<Double> result = new LinkedList<>();


        //Collections approach
        long t0 = System.nanoTime();
        long elapsed = 0;
        for (Integer i : sourceList) {
            if(i % 2 == 0){
                result.add(Math.sqrt(i));
            }
        }
        elapsed = System.nanoTime() - t0;       
        System.out.printf("Collections: Elapsed time:	 %d ns 	(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));


        //Stream approach
        Stream<Integer> stream = sourceList.stream();       
        t0 = System.nanoTime();
        result = stream.filter(i -> i%2 == 0).map(i -> Math.sqrt(i)).collect(Collectors.toList());
        elapsed = System.nanoTime() - t0;       
        System.out.printf("Streams: Elapsed time:		 %d ns 	(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));


        //Parallel stream approach
        stream = sourceList.stream().parallel();        
        t0 = System.nanoTime();
        result = stream.filter(i -> i%2 == 0).map(i -> Math.sqrt(i)).collect(Collectors.toList());
        elapsed = System.nanoTime() - t0;       
        System.out.printf("Parallel streams: Elapsed time:	 %d ns 	(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));      
    }

And here are the results for a dual core machine:

    Collections: Elapsed time:        94338247 ns   (0,094338 seconds)
    Streams: Elapsed time:           201112924 ns   (0,201113 seconds)
    Parallel streams: Elapsed time:  357243629 ns   (0,357244 seconds)

For this particular test, streams are about twice as slow as collections, and parallelism doesn't help (or am I using it the wrong way?).

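Questions:

  • Is this test fair? Have I made any mistake?
  • Are streams slower than collections? Has anyone made a good formal benchmark on this?
  • Which approach should I strive for?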

Updated results.

I ran the test 1k times after JVM warmup (1k iterations) as advised by @pveentjer:

    Collections: Average time:      206884437,000000 ns     (0,206884 seconds)
    Streams: Average time:           98366725,000000 ns     (0,098367 seconds)
    Parallel streams: Average time: 167703705,000000 ns     (0,167704 seconds)

In this case streams are more performant. I wonder what would be observed in an app where the filtering function is only called once or twice during runtime.
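The warmup-then-measure procedure behind the updated numbers can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `WarmupTiming` and its methods are hypothetical names, not the code that was actually run, and the list size and run counts in `main` are smaller than the ones in the post so that it finishes quickly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical helper class, not from the original post.
final class WarmupTiming {

    // Runs the task untimed warmupRuns times, then returns the mean time in ns over measuredRuns.
    static double averageNanos(Supplier<?> task, int warmupRuns, int measuredRuns) {
        long sink = 0; // derive a value from every result so the work is not trivially dead code
        for (int i = 0; i < warmupRuns; i++) {
            sink += System.identityHashCode(task.get());
        }
        long total = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long t0 = System.nanoTime();
            Object result = task.get();
            total += System.nanoTime() - t0;
            sink += System.identityHashCode(result);
        }
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // practically never true; only keeps sink observable
        }
        return (double) total / measuredRuns;
    }

    // Plain for-each version of the test: square roots of the even numbers.
    static List<Double> viaLoop(List<Integer> source) {
        List<Double> result = new ArrayList<>();
        for (Integer i : source) {
            if (i % 2 == 0) {
                result.add(Math.sqrt(i));
            }
        }
        return result;
    }

    // Stream version of the same computation.
    static List<Double> viaStream(List<Integer> source) {
        return source.stream()
                .filter(i -> i % 2 == 0)
                .map(i -> Math.sqrt(i))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>();
        for (int i = 1; i < 100_000; i++) { // assumption: smaller than the 1,000,000 used in the post
            source.add(i);
        }
        System.out.printf("Loop:   %.0f ns%n", averageNanos(() -> viaLoop(source), 200, 200));
        System.out.printf("Stream: %.0f ns%n", averageNanos(() -> viaStream(source), 200, 200));
    }
}
```

A hand-rolled loop like this is still exposed to JIT and GC effects that are hard to control, so its numbers should be read as rough indications only.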

Recommended answer

  1. Stop using LinkedList for anything but heavy removal from the middle of the list using an iterator.

  2. Stop writing benchmarking code by hand; use JMH.
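As an illustration of the one case in point 1 where LinkedList is a reasonable choice, here is a minimal sketch of removing elements from the middle of a list through its iterator. The class and method names are hypothetical and only serve the example.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical example class, not from the original answer.
final class IteratorRemoval {

    // Removes every odd element in place using the list's own iterator.
    static void removeOdds(List<Integer> list) {
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 != 0) {
                // On a LinkedList this unlinks one node in constant time;
                // on an ArrayList each removal shifts the remaining tail.
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new LinkedList<>();
        for (int i = 1; i <= 10; i++) {
            numbers.add(i);
        }
        removeOdds(numbers);
        System.out.println(numbers);
    }
}
```

For append-only workloads such as the result list in the question, an ArrayList is the usual choice.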

Proper benchmarks:

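import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OperationsPerInvocation;
import org.openjdk.jmh.annotations.OutputTimeUnit;

@OutputTimeUnit(TimeUnit.NANOSECONDS)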
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(StreamVsVanilla.N)
public class StreamVsVanilla {
    public static final int N = 10000;

    static List<Integer> sourceList = new ArrayList<>();
    static {
        for (int i = 0; i < N; i++) {
            sourceList.add(i);
        }
    }

    @Benchmark
    public List<Double> vanilla() {
        List<Double> result = new ArrayList<>(sourceList.size() / 2 + 1);
        for (Integer i : sourceList) {
            if (i % 2 == 0){
                result.add(Math.sqrt(i));
            }
        }
        return result;
    }

    @Benchmark
    public List<Double> stream() {
        return sourceList.stream()
                .filter(i -> i % 2 == 0)
                .map(Math::sqrt)
                .collect(Collectors.toCollection(
                    () -> new ArrayList<>(sourceList.size() / 2 + 1)));
    }
}

Result:

Benchmark                   Mode   Samples         Mean   Mean error    Units
StreamVsVanilla.stream      avgt        10       17.588        0.230    ns/op
StreamVsVanilla.vanilla     avgt        10       10.796        0.063    ns/op

Just as I expected, the stream implementation is noticeably slower (about 17.6 ns/op versus 10.8 ns/op). The JIT is able to inline all the lambda stuff, but it doesn't produce code as perfectly concise as the vanilla version.

Generally, Java 8 streams are not magic. They can't speed up things that are already well implemented (with, probably, plain iterations or Java 5's for-each statements, replaced where appropriate with Iterable.forEach() and Collection.removeIf() calls). Streams are more about coding convenience and safety. A convenience versus speed tradeoff is at work here.
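For readers who have not used the two Java 8 methods mentioned above, the following sketch shows the same even-number square-root computation written with Collection.removeIf() and Iterable.forEach() instead of explicit loops. The class and method names are hypothetical, and this is an illustration of the API rather than a performance recommendation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not from the original answer.
final class ForEachRemoveIf {

    // Returns the square roots of the even numbers in source, leaving source unchanged.
    static List<Double> evenSquareRoots(List<Integer> source) {
        List<Integer> evens = new ArrayList<>(source); // copy so the caller's list is untouched
        evens.removeIf(i -> i % 2 != 0);               // stands in for an iterator-removal loop
        List<Double> result = new ArrayList<>(evens.size());
        evens.forEach(i -> result.add(Math.sqrt(i)));  // stands in for a for-each statement
        return result;
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            source.add(i);
        }
        System.out.println(evenSquareRoots(source));
    }
}
```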
